推荐:write_own_pipeline.ipynb - Colab (google.com)
为您的任务选择一个 AutoPipeline
首先选择一个检查点。例如,如果您对使用 runwayml/stable-diffusion-v1-5 检查点的文本到图像感兴趣,请使用 AutoPipelineForText2Image:
from diffusers import AutoPipelineForText2Image import torch pipeline = AutoPipelineForText2Image.from_pretrained( "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16, use_safetensors=True ).to("cuda") prompt = "peasant and dragon combat, wood cutting style, viking era, bevel with rune" image = pipeline(prompt, num_inference_steps=25).images[0] image
在引擎盖下,AutoPipelineForText2Image:
- 自动检测 model_index.json 文件中的类
"stable-diffusion"
- 根据类名加载对应的文本到图像的 StableDiffusionPipeline
"stable-diffusion"
同样,对于图像到图像,AutoPipelineForImage2Image 会从文件中检测检查点,并将在后台加载相应的 StableDiffusionImg2ImgPipeline。还可以传递特定于管道类的任何其他参数,例如 ,它确定添加到输入图像的噪声或变化量:
"stable-diffusion"
model_index.json
strength
from diffusers import AutoPipelineForImage2Image import torch import requests from PIL import Image from io import BytesIO pipeline = AutoPipelineForImage2Image.from_pretrained( "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16, use_safetensors=True, ).to("cuda") prompt = "a portrait of a dog wearing a pearl earring" url = "https://upload.wikimedia.org/wikipedia/commons/thumb/0/0f/1665_Girl_with_a_Pearl_Earring.jpg/800px-1665_Girl_with_a_Pearl_Earring.jpg" response = requests.get(url) image = Image.open(BytesIO(response.content)).convert("RGB") image.thumbnail((768, 768)) image = pipeline(prompt, image, num_inference_steps=200, strength=0.75, guidance_scale=10.5).images[0] image
原图:
生图:
如果要进行修复,则 AutoPipelineForInpainting 会以相同的方式加载基础 StableDiffusionInpaintPipeline 类:
from diffusers import AutoPipelineForInpainting from diffusers.utils import load_image import torch pipeline = AutoPipelineForInpainting.from_pretrained( "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, use_safetensors=True ).to("cuda") img_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png" mask_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png" init_image = load_image(img_url).convert("RGB") mask_image = load_image(mask_url).convert("RGB") prompt = "A majestic tiger sitting on a bench" image = pipeline(prompt, image=init_image, mask_image=mask_image, num_inference_steps=50, strength=0.80).images[0] image
原图:
原掩码图:
生成图像:
好像,,震惊