Contents
- Background
- Installation
- Method 1
- Method 2
- Testing
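Background
StabilityAI recently open-sourced Stable Diffusion 3 Medium (SD3 for short), a model with 2 billion parameters. Its main features are:
- Improved overall image quality and realism
- Three text encoders that can be combined, which makes it possible to trade off performance against efficiency, along with better spatial reasoning, composition, action, and style understanding
- Better text rendering, with fewer errors in spelling, kerning, letter formation, and spacing
- A small VRAM footprint and efficient resource use, suitable for consumer GPUs
- Fine-grained fine-tuning on small datasets, which suits model customization
Here is a set of official sample images to give a feel for the results.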
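Installation
If ComfyUI is already installed, open Git Bash in the ComfyUI root directory and run git pull to update it. The latest ComfyUI already supports running SD3.
The SD3 model files can be downloaded from: https://huggingface.co/stabilityai/stable-diffusion-3-medium/tree/main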
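As an alternative to downloading through the browser, the following is a minimal sketch that fetches one checkpoint with the huggingface_hub package. It rests on several assumptions that are not stated in this article: ComfyUI lives in a folder named ComfyUI (a placeholder path), checkpoints belong in ComfyUI's default models/checkpoints folder, and the Hugging Face account in use has already been granted access to the repository if it requires accepting a license.

```python
from pathlib import Path

REPO_ID = "stabilityai/stable-diffusion-3-medium"
COMFYUI_ROOT = Path("ComfyUI")  # placeholder: adjust to the real install location


def checkpoint_dir(comfyui_root: Path = COMFYUI_ROOT) -> Path:
    """Return the folder where ComfyUI looks for checkpoints by default."""
    return comfyui_root / "models" / "checkpoints"


def download_checkpoint(filename: str, comfyui_root: Path = COMFYUI_ROOT) -> str:
    """Download one file from the SD3 Medium repository into the checkpoint folder."""
    # Imported here so the helper above works without the package installed.
    from huggingface_hub import hf_hub_download  # pip install huggingface_hub

    target = checkpoint_dir(comfyui_root)
    target.mkdir(parents=True, exist_ok=True)
    # Returns the local path of the downloaded file.
    return hf_hub_download(repo_id=REPO_ID, filename=filename, local_dir=str(target))


# Example call (starts a multi-gigabyte download):
# download_checkpoint("sd3_medium_incl_clips.safetensors")
```

If the download is refused, logging in first with huggingface-cli login is the usual remedy.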
So how should these models be used?
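Method 1
If you would rather not load the CLIP text encoders separately in ComfyUI, download one of the following: sd3_medium_incl_clips_t5xxlfp8.safetensors (fp8 precision), sd3_medium_incl_clips_t5xxlfp16.safetensors (fp16 precision), or sd3_medium_incl_clips.safetensors. All three embed the VAE and the text encoders, so they can be used directly after download.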
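Method 2
To use the text-to-image model and the text encoders separately, download only the text-to-image model sd3_medium.safetensors together with the encoder models clip_g.safetensors, clip_l.safetensors, t5xxl_fp8_e4m3fn.safetensors (optional), and t5xxl_fp16.safetensors (optional). Place the downloaded encoder models in ComfyUI's models\clip directory. To keep them in Stable Diffusion WebUI's models\clip directory instead, edit ComfyUI's configuration file extra_model_paths.yaml and add Stable Diffusion WebUI's models\clip directory to it. ComfyUI must be restarted after the configuration file is changed.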
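Then add a TripleCLIPLoader node in ComfyUI to load the text encoders clip_g.safetensors, clip_l.safetensors, and whichever optional T5XXL file was downloaded (t5xxl_fp8_e4m3fn.safetensors or t5xxl_fp16.safetensors).
The t5xxl text encoder helps the model understand the prompt better.
Note also that sd3_medium_incl_clips.safetensors does not embed a t5xxl encoder.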
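Before opening ComfyUI it can help to confirm that the encoder files for Method 2 are in the expected place. The sketch below is only an illustration: the function name check_encoders is made up for this example, and the directory passed to it should be whichever clip folder ComfyUI is configured to read.

```python
from pathlib import Path

# File names as given in this article.
REQUIRED = ("clip_g.safetensors", "clip_l.safetensors")
OPTIONAL_T5 = ("t5xxl_fp8_e4m3fn.safetensors", "t5xxl_fp16.safetensors")


def check_encoders(clip_dir):
    """Return (missing required encoders, T5XXL encoders found) for a clip folder."""
    clip_dir = Path(clip_dir)
    missing = [name for name in REQUIRED if not (clip_dir / name).is_file()]
    t5_found = [name for name in OPTIONAL_T5 if (clip_dir / name).is_file()]
    return missing, t5_found


# Example call (the path is a placeholder):
# missing, t5_found = check_encoders("ComfyUI/models/clip")
# print("missing:", missing, "| T5XXL available:", t5_found)
```

An empty first list means both CLIP encoders are present, and the second list shows which T5XXL file, if any, can be selected in the loader node.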
Testing
The tests use StabilityAI's official test prompts. The workflow is as follows:
The negative prompt is the same for every test:
bad quality, poor quality, doll, disfigured, jpg, toy, bad anatomy, missing limbs, missing fingers, 3d, cgi
- a female character with long, flowing hair that appears to be made of ethereal, swirling patterns resembling the Northern Lights or Aurora Borealis. The background is dominated by deep blues and purples, creating a mysterious and dramatic atmosphere. The character’s face is serene, with pale skin and striking features. She wears a dark-colored outfit with subtle patterns. The overall style of the artwork is reminiscent of fantasy or supernatural genres
- Digital art, portrait of an anthropomorphic roaring Tiger warrior with full armor, close up in the middle of a battle, behind him there is a banner with the text “Open Source”.
- photo of a dog and a cat both standing on a red box, with a blue ball in the middle with a parrot standing on top of the ball. The box has the text “SD3”
- selfie photo of a wizard with long beard and purple robes, he is apparently in the middle of Tokyo. Probably taken from a phone.
- A vibrant street wall covered in colorful graffiti, the centerpiece spells “SD3 MEDIUM”, in a storm of colors
- photo of a young woman with long, wavy brown hair tied in a bun and glasses. She has a fair complexion and is wearing subtle makeup, emphasizing her eyes and lips. She is dressed in a black top. The background appears to be an urban setting with a building facade, and the sunlight casts a warm glow on her face.
- anime art of a steampunk inventor in their workshop, surrounded by gears, gadgets, and steam. He is holding a blue potion and a red potion, one in each hand
- photo of picturesque scene of a road surrounded by lush green trees and shrubs. The road is wide and smooth, leading into the distance. On the right side of the road, there’s a blue sports car parked with the license plate spelling “SD32B”. The sky above is partly cloudy, suggesting a pleasant day. The trees have a mix of green and brown foliage. There are no people visible in the image. The overall composition is balanced, with the car serving as a focal point.
- photo of young man in a black suit, white shirt, and black tie. He has a neatly styled haircut and is looking directly at the camera with a neutral expression. The background consists of a textured wall with horizontal lines. The photograph is in black and white, emphasizing contrasts and shadows. The man appears to be in his late twenties or early thirties, with fair skin and short, dark hair.
- photo of a woman on the beach, shot from above. She is facing the sea, while wearing a white dress. She has long blonde hair
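Running the prompts above one by one in the UI gets tedious, so here is a rough sketch of queuing them through ComfyUI's HTTP endpoint instead. This is not the workflow used in the article. It assumes a workflow exported from ComfyUI in API format, a server at the default local address 127.0.0.1:8188, and that the caller knows which node ids hold the positive and negative CLIPTextEncode nodes; the ids "6" and "7" and the file name in the example are placeholders.

```python
import copy
import json
import urllib.request

NEGATIVE_PROMPT = (
    "bad quality, poor quality, doll, disfigured, jpg, toy, bad anatomy, "
    "missing limbs, missing fingers, 3d, cgi"
)


def with_prompts(workflow, positive_id, negative_id, positive, negative=NEGATIVE_PROMPT):
    """Return a copy of an API-format workflow with both prompt texts replaced."""
    wf = copy.deepcopy(workflow)  # leave the loaded workflow untouched
    wf[positive_id]["inputs"]["text"] = positive
    wf[negative_id]["inputs"]["text"] = negative
    return wf


def queue_prompt(workflow, server="http://127.0.0.1:8188"):
    """POST the workflow to a running ComfyUI server and return its JSON reply."""
    data = json.dumps({"prompt": workflow}).encode("utf-8")
    request = urllib.request.Request(
        f"{server}/prompt", data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


# Example (placeholder file name and node ids, needs a running server):
# with open("sd3_workflow_api.json", encoding="utf-8") as f:
#     base = json.load(f)
# queue_prompt(with_prompts(base, "6", "7", "photo of a woman on the beach"))
```

Keeping the prompt substitution separate from the network call makes it possible to inspect the modified workflow before anything is sent.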