AI Image Generation Daily, Part 13: How Much Better SDXL 1.0 Understands Simple Prompts

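Stability AI's release announcement for SDXL 1.0 says that SDXL can produce attractive images from simple prompts. But how does that hold up in practice when generating photos? First, here is Stability AI in its own words: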

More intelligent with simpler language
SDXL requires only a few words to create complex, detailed, and aesthetically pleasing images. Users no longer need to invoke qualifier terms like “masterpiece” to get high-quality images. Furthermore, SDXL can understand the differences between concepts like “The Red Square” (a famous place) vs a “red square” (a shape).

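So I generated a set of nine images with the SDXL 1.0 Base model plus the Refiner. The prompt still includes some quality-related terms, such as masterpiece, and the negative prompt borrows a few entries from the recommendations of some other models.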

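The results really do have a Midjourney-like feel, but as far as realism goes they still look rather 2.5D, and hands remain a good place to spot an AI-generated image. Compared with Stability AI's own "promotional images", did I perhaps download the wrong model?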

Generation parameters:
beautiful young female, qipao, long sleeves, collar around neck,forest garden with dappled sunlight, standing alone, freckles,
(masterpiece,best quality, ultra realistic,32k,RAW photo,detail skin, 8k uhd, dslr,high quality, film grain:1.5),
Negative prompt: (worst quality, low quality, illustration, 3d, 2d, painting, cartoons, sketch), tooth, open mouth,
Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 6, Seed: 2410888868, Face restoration: CodeFormer, Size: 800x1200, Model hash: 31e35c80fc, Model: sd_xl_base_1.0, Version: v1.5.1
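
The last line of each parameter block is a flat list of `Key: value` pairs, which makes the runs in this post easy to compare programmatically. Below is a minimal Python sketch of reading that line. `parse_settings_line` is a hypothetical helper written for this post, not part of any web UI, and it assumes that no value contains a comma, which holds for every block shown here.

```python
def parse_settings_line(line: str) -> dict[str, str]:
    """Split a 'Key: value, Key: value' settings line into a dict of strings.

    Assumption: values never contain ', ' themselves (true for this post).
    """
    settings = {}
    for field in line.split(", "):
        key, sep, value = field.partition(": ")
        if sep:  # skip fragments that are not 'Key: value' pairs
            settings[key.strip()] = value.strip()
    return settings


line = (
    "Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 6, Seed: 2410888868, "
    "Face restoration: CodeFormer, Size: 800x1200, Model hash: 31e35c80fc, "
    "Model: sd_xl_base_1.0, Version: v1.5.1"
)
settings = parse_settings_line(line)

# The size is stored as 'WIDTHxHEIGHT'; split it into two integers.
width, height = (int(n) for n in settings["Size"].split("x"))
print(settings["Model"], settings["Seed"], width, height)
```

Every value stays a string, so numeric fields such as Steps or CFG scale need an explicit conversion before they are compared.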

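Still using the SDXL 1.0 Base model plus the Refiner, I changed the prompt by removing the negative prompt and the quality terms such as masterpiece. With the same seed, the nine images come out as follows. (Note: in a batch run, the seeds of the nine images increase by 1 from one image to the next.)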

The compositions change noticeably and the photorealistic style takes a step back, but with the Refiner's help the detail is still very rich.

Generation parameters:
beautiful young female, qipao, long sleeves, collar around neck,forest garden with dappled sunlight, standing alone, freckles,
Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 6, Seed: 2410888868, Face restoration: CodeFormer, Size: 800x1200, Model hash: 31e35c80fc, Model: sd_xl_base_1.0, Version: v1.5.1
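
Because the seeds in a batch increase by 1 per image, the Seed value in each parameter block only identifies the first of the nine images. A small Python sketch of recovering the rest, assuming exactly the +1 rule described above (`batch_seeds` is a hypothetical helper name):

```python
def batch_seeds(first_seed: int, count: int) -> list[int]:
    """Return the seeds of a batch in which each image uses the previous seed + 1."""
    return [first_seed + i for i in range(count)]


seeds = batch_seeds(2410888868, 9)
for index, seed in enumerate(seeds, start=1):
    print(f"image {index}: seed {seed}")
```

To regenerate a single image from a set, use the seed printed for that image with a batch of one; the ninth image here corresponds to seed 2410888876.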

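Going by this example, SDXL 1.0 still needs careful prompt tuning. But how does it compare with SD 1.5? Next I ran the first set of prompts on a fine-tuned SD 1.5 model. The nine images are as follows: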

I nearly died laughing. Seen this way, SDXL 1.0 is a very clear step forward.

Generation parameters:
beautiful young female, qipao, long sleeves, collar around neck,forest garden with dappled sunlight, standing alone, freckles,
(masterpiece,best quality, ultra realistic,32k,RAW photo,detail skin, 8k uhd, dslr,high quality, film grain:1.5),
Negative prompt: (worst quality, low quality, illustration, 3d, 2d, painting, cartoons, sketch), tooth, open mouth,
Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 6, Seed: 2410888868, Face restoration: CodeFormer, Size: 512x768, Model hash: bc2f30f4ad, Model: beautifulRealistic_v60, Denoising strength: 0.5, Hires upscale: 2, Hires upscaler: R-ESRGAN 4x+, Version: v1.5.1
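
Note that the SD 1.5 run starts at 512x768 and applies Hires upscale 2, so its final images are not the same size as the 800x1200 SDXL images. A quick Python check of the arithmetic, assuming the upscale factor simply multiplies each side (`hires_size` is a hypothetical helper; the web UI may round differently for other factors):

```python
def hires_size(width: int, height: int, upscale: float) -> tuple[int, int]:
    """Final image size after an upscale pass, assuming plain multiplication."""
    return round(width * upscale), round(height * upscale)


sd15_final = hires_size(512, 768, 2)   # SD 1.5 run: 512x768 with Hires upscale 2
sdxl_final = (800, 1200)               # SDXL runs: generated directly at this size

sd15_pixels = sd15_final[0] * sd15_final[1]
sdxl_pixels = sdxl_final[0] * sdxl_final[1]

print(sd15_final, sd15_pixels)
print(sdxl_final, sdxl_pixels)
print(f"pixel ratio: {sd15_pixels / sdxl_pixels:.2f}")
```

Both sizes share the same 2:3 aspect ratio, but the SD 1.5 output has about 1.64 times as many pixels, which is worth keeping in mind when comparing fine detail between the two sets.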

And what if the negative prompt and the quality terms are removed? The nine images below are good for another laugh.

Generation parameters:
beautiful young female, qipao, long sleeves, collar around neck,forest garden with dappled sunlight, standing alone, freckles,
Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 6, Seed: 2410888868, Face restoration: CodeFormer, Size: 512x768, Model hash: bc2f30f4ad, Model: beautifulRealistic_v60, Denoising strength: 0.5, Hires upscale: 2, Hires upscaler: R-ESRGAN 4x+, Version: v1.5.1
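
The simplified prompts in this comparison are the full prompts with the weighted quality group dropped (and the negative prompt left empty). A minimal Python sketch of that edit, assuming the quality terms always sit inside one parenthesized group as they do in this post; `strip_weighted_groups` is a hypothetical helper, not a web UI feature:

```python
import re


def strip_weighted_groups(prompt: str) -> str:
    """Remove every non-nested (...) group, then tidy the remaining comma-separated tags."""
    without_groups = re.sub(r"\([^()]*\)", "", prompt)
    tags = [tag.strip() for tag in without_groups.split(",")]
    return ", ".join(tag for tag in tags if tag)


full_prompt = (
    "beautiful young female, qipao, long sleeves, collar around neck,"
    "forest garden with dappled sunlight, standing alone, freckles,\n"
    "(masterpiece,best quality, ultra realistic,32k,RAW photo,detail skin, "
    "8k uhd, dslr,high quality, film grain:1.5),"
)
simple_prompt = strip_weighted_groups(full_prompt)
print(simple_prompt)
```

The helper also normalizes comma spacing and drops the trailing comma, so its output differs cosmetically from the prompt text listed in the parameter blocks.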

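Then again, these quality terms and negative prompts are the simplified set I chose for SDXL 1.0. What happens with the kind of prompts commonly used with SD 1.5? I tried it, and the nine images are as follows: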

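Here it is clear that a fine-tuned SD 1.5 model paired with suitable prompts gives better realism than the SDXL 1.0 results above. The SDXL images are more pleasing to look at, but they fall a little short on realism. That reminded me of this line in Stability AI's notes on HuggingFace: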

The model does not achieve perfect photorealism

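Perhaps it will take a standout fine-tuned merge model, like ChilloutMix was for SD 1.5, to get there. As one more test, I used an SDXL 1.0 model from Civitai with the first set of prompts. The nine images are as follows: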

The overall picture feels very harmonious. The hands are not perfect, but they are fairly dependable.

Generation parameters:
beautiful young female, qipao, long sleeves, collar around neck,forest garden with dappled sunlight, standing alone,freckles,
(masterpiece,best quality, ultra realistic,32k,RAW photo,detail skin, 8k uhd, dslr,high quality, film grain:1.5),
Negative prompt: (worst quality, low quality, illustration, 3d, 2d, painting, cartoons, sketch), tooth, open mouth,
Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 6, Seed: 2410888868, Face restoration: CodeFormer, Size: 800x1200, Model hash: 565af52b8e, Model: sdxlUnstableDiffusers_xlV4Grimorium, Version: v1.5.1

However, if the negative prompt and the quality terms are removed, the results drift even further from a realistic photo.

Generation parameters:
beautiful young female, qipao, long sleeves, collar around neck,forest garden with dappled sunlight, standing alone, freckles,
Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 6, Seed: 2410888868, Face restoration: CodeFormer, Size: 800x1200, Model hash: 565af52b8e, Model: sdxlUnstableDiffusers_xlV4Grimorium, Version: v1.5.1

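Seen this way, SDXL 1.0 really does produce higher-quality images from simple prompts than SD 1.5 models do. In this example, though, in the specific niche of photorealistic images, I have not yet worked out how to use it well. With its large pool of fine-tuned models and the accumulated experience of earlier users, SD 1.5 is still quite good. The open question is how to use SDXL 1.0 to get better results.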

One last addition: you must pick the right VAE. The four images below were made by an SD 1.5 model with the SDXL VAE, and they look downright spooky.

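(That wraps up this hands-on write-up. For the full-size images, I will see about posting them as a note; please follow me or check my profile page. Thanks)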