[linux-sd-webui] API: DreamBooth training

2025/1/13 2:50:40

https://gitee.com/leeguandong/dreambooth-for-diffusion
https://zhuanlan.zhihu.com/p/584736850

This repository is built on the diffusers library. Current training code is mostly either a hybrid written around kohya-ss/sd-scripts or code that uses diffusers directly.

1.install

torch
torchvision
huggingface_hub==0.14.1
tokenizers==0.13.3
transformers==4.25.1
diffusers==0.16.0
accelerate==0.15.0

Upgrade libstdc++.so.6 and glibc.

2.ckpt2diffusers

from diffusers.pipelines.stable_diffusion.safety_checker import StableDiffusionSafetyChecker

At lines 705 and 817 of the conversion script, point the loaders to a local copy of the OpenAI CLIP weights:

if os.path.exists(default_model_path):
    text_model = CLIPTextModel.from_pretrained(os.path.join(default_model_path, "clip-vit-large-patch14"))
else:
    text_model = CLIPTextModel.from_pretrained("/home/imcs/local_disk/dreambooth-for-diffusion-main/tools/clip-vit-large-patch14")

if os.path.exists(default_model_path):
    tokenizer = CLIPTokenizer.from_pretrained(os.path.join(default_model_path, "clip-vit-large-patch14"))
else:
    tokenizer = CLIPTokenizer.from_pretrained("/home/imcs/local_disk/dreambooth-for-diffusion-main/tools/clip-vit-large-patch14")
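
The two edits above repeat the same path logic, so one possible way to keep them in sync is a small helper. This is only a sketch: `resolve_clip_dir` and `load_clip` are hypothetical names that do not exist in the repository, and the fallback directory is whatever local path holds your copy of clip-vit-large-patch14. Unlike the original snippet, the helper also tolerates `default_model_path` being `None`.

```python
import os


def resolve_clip_dir(default_model_path, fallback_dir, name="clip-vit-large-patch14"):
    """Prefer <default_model_path>/<name> when default_model_path exists, else fallback_dir."""
    if default_model_path and os.path.exists(default_model_path):
        return os.path.join(default_model_path, name)
    return fallback_dir


def load_clip(default_model_path, fallback_dir):
    """Load the CLIP text encoder and tokenizer from the same resolved directory."""
    # Imported lazily so the path logic can be used without transformers installed.
    from transformers import CLIPTextModel, CLIPTokenizer

    clip_dir = resolve_clip_dir(default_model_path, fallback_dir)
    text_model = CLIPTextModel.from_pretrained(clip_dir)
    tokenizer = CLIPTokenizer.from_pretrained(clip_dir)
    return text_model, tokenizer
```

Both call sites could then use `load_clip(default_model_path, "<your local clip-vit-large-patch14 directory>")`, so the hard-coded path appears in only one place.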

3.train_object.sh

Training a specific person or object (3 to 5 images of the subject in a consistent style are recommended):

# accelerate launch tools/train_dreambooth.py \
python -m torch.distributed.launch --nproc_per_node=4 --nnodes=1 --node_rank=0 --master_addr=localhost   --master_port=22222 --use_env "tools/train_dreambooth.py" \
  --train_text_encoder \
  --pretrained_model_name_or_path=$MODEL_NAME  \
  --instance_data_dir=$INSTANCE_DIR \
  --instance_prompt="a photo of <xxx> building" \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --class_prompt="a photo of building" \
  --class_data_dir=$CLASS_DIR \
  --num_class_images=200 \
  --output_dir=$OUTPUT_DIR \
  --logging_dir=$LOG_DIR \
  --center_crop \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 --gradient_checkpointing \
  --use_8bit_adam \
  --learning_rate=2e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --auto_test_model \
  --test_prompts_file=$TEST_PROMPTS_FILE \
  --test_seed=123 \
  --test_num_per_prompt=3 \
  --max_train_steps=1000 \
  --save_model_every_n_steps=500 
#     --mixed_precision="fp16" \
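
To get a feel for what these flags add up to, the arithmetic can be sketched as below. It assumes the script follows the usual diffusers/accelerate convention, in which each launched process consumes its own batch and `--max_train_steps` counts optimizer updates, and that a checkpoint is written whenever the step count is a multiple of `--save_model_every_n_steps`. None of this has been checked against `tools/train_dreambooth.py`, and `training_schedule` is a hypothetical helper.

```python
def training_schedule(num_processes, per_device_batch, grad_accum, max_steps, save_every):
    """Summarize a run, assuming max_steps counts optimizer updates."""
    # Each process contributes per_device_batch samples per micro-step,
    # and grad_accum micro-steps are combined into one optimizer update.
    effective_batch = num_processes * per_device_batch * grad_accum
    return {
        "effective_batch_size": effective_batch,
        "samples_seen": effective_batch * max_steps,
        "checkpoint_steps": list(range(save_every, max_steps + 1, save_every)),
    }


if __name__ == "__main__":
    # Values taken from the command above: 4 processes, batch size 1,
    # 1 accumulation step, 1000 training steps, save every 500 steps.
    print(training_schedule(4, 1, 1, 1000, 500))
```

Under those assumptions the command processes 4 samples per update and 4000 samples in total, and would save at steps 500 and 1000. The total covers both instance and class images, so with only 3 to 5 instance images each one is revisited many times.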

4.train_style.sh

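Fine-tuning your own base model (3000+ images are recommended, covering as much diversity as possible; the data determines the quality of the trained model):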
