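In its SDXL 1.0 release announcement, Stability AI says SDXL can produce good-looking images from simple prompts. But how does it actually do on photo-style images? First, Stability AI's own words: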
More intelligent with simpler language
SDXL requires only a few words to create complex, detailed, and aesthetically pleasing images. Users no longer need to invoke qualifier terms like “masterpiece” to get high-quality images. Furthermore, SDXL can understand the differences between concepts like “The Red Square” (a famous place) vs a “red square” (a shape).
In short: a few words of prompt should be enough, quality tags such as "masterpiece" are no longer needed, and the model can tell "The Red Square" (the place) from "a red square" (the shape).
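So I generated a set of 9 images with the SDXL 1.0 Base model plus the Refiner. I still put some quality-related terms in the prompt, such as masterpiece, and copied a few negative prompt terms from the recommendations of some models.
The results really do have a Midjourney feel. As for realism, though, they still look rather "2.5D", and hands remain a good place to spot an AI image. Compared with Stability AI's "promo images", did I perhaps download the wrong model?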
Generation parameters:
beautiful young female, qipao, long sleeves, collar around neck,forest garden with dappled sunlight, standing alone, freckles,
(masterpiece,best quality, ultra realistic,32k,RAW photo,detail skin, 8k uhd, dslr,high quality, film grain:1.5),
Negative prompt: (worst quality, low quality, illustration, 3d, 2d, painting, cartoons, sketch), tooth, open mouth,
Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 6, Seed: 2410888868, Face restoration: CodeFormer, Size: 800x1200, Model hash: 31e35c80fc, Model: sd_xl_base_1.0, Version: v1.5.1
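A note on the `(...:1.5)` syntax in the second prompt line. These parameter dumps look like AUTOMATIC1111 WebUI output (an assumption based on the format), and in that syntax a parenthesized group ending in `:1.5` weights the whole group by 1.5, not just the last tag. Here is a minimal Python sketch that pulls the group apart. `split_weighted_group` is a hypothetical helper written for this post, not part of any WebUI API, and it only handles a single un-nested group:

```python
import re

# Matches one un-nested "(tag, tag, ...:weight)" group.
_GROUP = re.compile(r"\(([^():]+):([0-9.]+)\)")

def split_weighted_group(prompt):
    """Return (tags, weight) for the first weighted group, or ([], 1.0)."""
    match = _GROUP.search(prompt)
    if match is None:
        return [], 1.0
    tags = [t.strip() for t in match.group(1).split(",") if t.strip()]
    return tags, float(match.group(2))

# The quality line from the parameters above, copied verbatim.
quality = ("(masterpiece,best quality, ultra realistic,32k,RAW photo,"
           "detail skin, 8k uhd, dslr,high quality, film grain:1.5),")
tags, weight = split_weighted_group(quality)
print(weight, tags)
```

So every one of the ten quality tags carries the 1.5 weight, which is part of why deleting that line changes the output so much.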
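Still using the SDXL 1.0 Base model plus the Refiner, I changed the prompt by deleting the negative prompt and the quality terms such as masterpiece. With the same seed, the 9 images are below. (Note: in a batch, the seeds of these 9 images increase by 1 from one image to the next.)
The images do come out different, and they take a step back in realism, but thanks to the Refiner the detail is still very rich.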
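To make the seed note concrete, here is a small Python sketch that lists the seed of each image in a batch of 9, assuming the first image uses the seed shown in the parameters and each later one adds 1:

```python
BASE_SEED = 2410888868  # seed from the parameter lines in this post
BATCH_SIZE = 9

# Image i in the batch (counting from 0) uses BASE_SEED + i.
seeds = [BASE_SEED + i for i in range(BATCH_SIZE)]
for index, seed in enumerate(seeds, start=1):
    print(f"image {index}: seed {seed}")
```

This is handy if you want to regenerate just one image from a grid: read off its position and use the matching seed with a batch size of 1.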
Generation parameters:
beautiful young female, qipao, long sleeves, collar around neck,forest garden with dappled sunlight, standing alone, freckles,
Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 6, Seed: 2410888868, Face restoration: CodeFormer, Size: 800x1200, Model hash: 31e35c80fc, Model: sd_xl_base_1.0, Version: v1.5.1
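Judging from this example, SDXL 1.0 still needs some work on prompt optimization. But how does it compare with SD 1.5? Next I ran the first prompt set on a fine-tuned SD 1.5 model. The 9 images are below:
I nearly died laughing. Seen this way, SDXL 1.0 really is a clear step forward.
Generation parameters: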
beautiful young female, qipao, long sleeves, collar around neck,forest garden with dappled sunlight, standing alone, freckles,
(masterpiece,best quality, ultra realistic,32k,RAW photo,detail skin, 8k uhd, dslr,high quality, film grain:1.5),
Negative prompt: (worst quality, low quality, illustration, 3d, 2d, painting, cartoons, sketch), tooth, open mouth,
Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 6, Seed: 2410888868, Face restoration: CodeFormer, Size: 512x768, Model hash: bc2f30f4ad, Model: beautifulRealistic_v60, Denoising strength: 0.5, Hires upscale: 2, Hires upscaler: R-ESRGAN 4x+, Version: v1.5.1
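One thing to keep in mind when comparing the two setups is that the SD 1.5 run renders at 512x768 and then upscales by 2, while the SDXL runs render at 800x1200 directly. Here is a minimal Python sketch that reads the settings line and works out the final size. `parse_settings` and `final_size` are hypothetical helpers written for this post, and the parser assumes no value contains ", ", which holds for the lines here:

```python
def parse_settings(line):
    """Split a 'Key: value, Key: value' settings line into a dict."""
    settings = {}
    for part in line.split(", "):
        key, _, value = part.partition(": ")
        settings[key.strip()] = value.strip()
    return settings

def final_size(settings):
    """Output size in pixels after the optional hires upscale."""
    width, height = (int(n) for n in settings["Size"].split("x"))
    scale = float(settings.get("Hires upscale", 1))
    return round(width * scale), round(height * scale)

sd15 = parse_settings(
    "Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 6, Seed: 2410888868, "
    "Face restoration: CodeFormer, Size: 512x768, Model hash: bc2f30f4ad, "
    "Model: beautifulRealistic_v60, Denoising strength: 0.5, "
    "Hires upscale: 2, Hires upscaler: R-ESRGAN 4x+, Version: v1.5.1"
)
sdxl = parse_settings(
    "Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 6, Seed: 2410888868, "
    "Face restoration: CodeFormer, Size: 800x1200, Model hash: 31e35c80fc, "
    "Model: sd_xl_base_1.0, Version: v1.5.1"
)
print(final_size(sd15))  # (1024, 1536)
print(final_size(sdxl))  # (800, 1200)
```

So the SD 1.5 images actually end up larger than the SDXL ones, although the extra pixels come from the upscaler rather than from the first pass.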
So what happens if those negative prompt and quality terms are removed? The 9 images below are good for another laugh.
Generation parameters:
beautiful young female, qipao, long sleeves, collar around neck,forest garden with dappled sunlight, standing alone, freckles,
Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 6, Seed: 2410888868, Face restoration: CodeFormer, Size: 512x768, Model hash: bc2f30f4ad, Model: beautifulRealistic_v60, Denoising strength: 0.5, Hires upscale: 2, Hires upscaler: R-ESRGAN 4x+, Version: v1.5.1
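Then again, these quality and negative prompt terms were a simplified set chosen with SDXL 1.0 in mind. What if I use the kind of prompt terms commonly used with SD 1.5? I tried it, and the 9 images are below:
Here it is clear that a fine-tuned SD 1.5 model with suitable prompts beats the SDXL 1.0 results above on realism. The SDXL images are more pleasing to look at, they just fall a little short on realism. Then I remembered this line from Stability AI's model description on HuggingFace:
The model does not achieve perfect photorealism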
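Perhaps it will take a standout fine-tuned merge model, like Chilloutmix was for SD 1.5, to get there. Here I used an SDXL 1.0 model from Civitai with the first prompt set to generate 9 more images:
The overall image feels very well balanced. The hands are not perfect, but they are fairly believable.
Generation parameters: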
beautiful young female, qipao, long sleeves, collar around neck,forest garden with dappled sunlight, standing alone,freckles,
(masterpiece,best quality, ultra realistic,32k,RAW photo,detail skin, 8k uhd, dslr,high quality, film grain:1.5),
Negative prompt: (worst quality, low quality, illustration, 3d, 2d, painting, cartoons, sketch), tooth, open mouth,
Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 6, Seed: 2410888868, Face restoration: CodeFormer, Size: 800x1200, Model hash: 565af52b8e, Model: sdxlUnstableDiffusers_xlV4Grimorium, Version: v1.5.1
But once the negative prompt and quality terms are removed, the results drift even further from a realistic photo.
Generation parameters:
beautiful young female, qipao, long sleeves, collar around neck,forest garden with dappled sunlight, standing alone, freckles,
Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 6, Seed: 2410888868, Face restoration: CodeFormer, Size: 800x1200, Model hash: 565af52b8e, Model: sdxlUnstableDiffusers_xlV4Grimorium, Version: v1.5.1
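Seen this way, SDXL 1.0 does produce higher-quality images than SD 1.5 models from simple prompts. But in this example, in the niche of realistic photos, I have not yet worked out how to get the best from it. With its large pool of fine-tuned models and the experience others have already built up, SD 1.5 still holds up well. The open question is how to use SDXL 1.0 to get better results.
One last addition: you must pick the right VAE. The 4 images below come from an SD 1.5 model run with the SDXL VAE, and they look downright spooky.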
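(That's the end of this personal write-up. For the full-size images, I'll see about posting them as a note. Follow me or check my profile page. Thanks)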
The author declares no conflict of interest in this article.