一. AdaAttN-Revisit Attention Mechanism in Arbitrary Neural Style Transfer(ICCV2021)
- 下载vgg_normalised.pth
- 打开visdom
python -m visdom.server
- 在 train_adaattn.sh 中配置 content_path、style_path 和 image_encoder_path,分别表示训练内容图像、训练样式图像和 "vgg_normalised.pth "文件夹的路径。
python train.py --content_path F:\RefDayDataset\KAIST_256\trainA --style_path F:\RefDayDataset\KAIST_256\trainB --name AdaAttN_kaist --model adaattn --dataset_mode unaligned --no_dropout --load_size 286 --crop_size 256 --image_encoder_path C:\Users\64883\Desktop\AdaAttN-main\models\vgg_normalised.pth --gpu_ids 0 --batch_size 1 --n_epochs 2 --n_epochs_decay 3 --display_freq 1 --display_port 8097 --display_env AdaAttN --lambda_local 3 --lambda_global 10 --lambda_content 0 --shallow_layer --skip_connection_3
问题1
OSError: [WinError 1455] 页面文件太小,无法完成操作。 Error loading "D:\Anaconda3\envs\paddlepaddle\lib\site-packages\torch\lib\cudnn_cnn_infer64_8.dll" or one of its dependencies.
self._popen = self._Popen(self)
File "D:\Anaconda3\envs\paddlepaddle\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "D:\Anaconda3\envs\paddlepaddle\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "D:\Anaconda3\envs\paddlepaddle\lib\multiprocessing\popen_spawn_win32.py", line 89, in __init__
reduction.dump(process_obj, to_child)
File "D:\Anaconda3\envs\paddlepaddle\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
BrokenPipeError: [Errno 32] Broken pipe
解决方法
parser.add_argument('--num_threads', default=4, type=int, help='# threads for loading data')
修改为
parser.add_argument('--num_threads', default=0, type=int, help='# threads for loading data')
问题2
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 4.08 GiB (GPU 0; 8.00 GiB total capacity; 134.76 MiB already allocated; 4.94 GiB free; 748.00 MiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for M
emory Management and PYTORCH_CUDA_ALLOC_CONF
解决方法:降低分辨率
问题3 输出频率太频繁了
解决方法
--display_freq 1
更改为
--display_freq 1000
问题4 内容损失始终为0
解决方法
--lambda_content 0
修改为
--lambda_content 10
问题5 训练轮次过少
解决方法
--n_epochs 2 --n_epochs_decay 3
修改为
--n_epochs 100 --n_epochs_decay 100
二. ArtFlow- Unbiased Image Style Transfer via Reversible Neural Flows(CVPR2021)
- 下载VGG模型,创建models文件夹,将模型移动到models文件夹下
- 修改训练代码
创建experiments文件夹
python -u train.py --content_dir F:/RefDayDataset/KAIST_256/trainA --style_dir F:/RefDayDataset/KAIST_256/trainB --save_dir ./experiments/ArtFlow-AdaIN --n_flow 8 --n_block 2 --batch_size 4 --operator adain
问题1
Traceback (most recent call last):
File "train.py", line 152, in <module>
content_dataset = FlatFolderDataset(args.content_dir, content_tf)
File "train.py", line 37, in __init__
self.paths = os.listdir(self.root)
OSError: [WinError 123] 文件名、目录名或卷标语法不正确。: "'F:\\RefDayDataset\\KAIST_256\\trainA'"
解决方法:把单引号删除
问题2
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
解决方法
parser.add_argument('--n_threads', type=int, default=8)
修改为
parser.add_argument('--n_threads', type=int, default=0)
问题3
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 36.00 MiB (GPU 0; 8.00 GiB total capacity; 7.42 GiB already allocated; 0 bytes free; 7.47 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memor
y Management and PYTORCH_CUDA_ALLOC_CONF
解决方法:降低batchsize,降低分辨率
--batch_size 4
修改为
--batch_size 1
三. IEST- Artistic Style Transfer with Internal-external Learning and Contrastive Learning(NeurIPS2021)
- 下载VGG模型,并移动到models文件夹下
- 修改训练代码
python train.py --content_dir F:/RefDayDataset/KAIST_256/trainA --style_dir F:/RefDayDataset/KAIST_256/trainB
问题1
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
解决方法
parser.add_argument('--n_threads', type=int, default=16)
修改为
parser.add_argument('--n_threads', type=int, default=0)
问题2
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 36.00 MiB (GPU 0; 8.00 GiB total capacity; 7.42 GiB already allocated; 0 bytes free; 7.47 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memor
y Management and PYTORCH_CUDA_ALLOC_CONF
解决方法:降低batchsize,降低分辨率
parser.add_argument('--batch_size', type=int, default=12)
修改为
parser.add_argument('--batch_size', type=int, default=2)
问题3
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
解决方法:试试在另外一张卡,或者改变num_workers
四. CAST- Domain Enhanced Arbitrary Image Style Transfer via Contrastive Learning(SIGGRAPH2022)
- 下载pretrained style classification model和pretrained content encoder
- 修改训练代码
python train.py --dataroot F:/RefDayDataset/KAIST_256 --name cast
问题1
File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
File "<frozen importlib._bootstrap>", line 983, in _find_and_load
File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 728, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "C:\Users\64883\Desktop\CAST_pytorch-main\models\cast_model.py", line 11, in <module>
import kornia.augmentation as K
ModuleNotFoundError: No module named 'kornia'
解决方法
pip install kornia
问题2
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=8097): Max retries exceeded with url: /env/main (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x00000230810E0588>: Failed to establish a new connection: [WinError 10061] 由于目标计算机积极拒绝,无法连接。')
)
[WinError 10061] 由于目标计算机积极拒绝,无法连接。
on_close() takes 1 positional argument but 3 were given
Visdom python client failed to establish socket to get messages from the server. This feature is optional and can be disabled by initializing Visdom with `use_incoming_socket=False`, which will prevent waiting for this request to timeout.
Traceback (most recent call last):
File "D:\Anaconda3\envs\paddlepaddle\lib\site-packages\urllib3\util\connection.py", line 85, in create_connection
sock.connect(sa)
ConnectionRefusedError: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
During handling of the above exception, another exception occurred:
解决方法
python -m visdom.server
问题3
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "D:\Anaconda3\envs\paddlepaddle\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "D:\Anaconda3\envs\paddlepaddle\lib\multiprocessing\spawn.py", line 115, in _main
self = reduction.pickle.load(from_parent)
EOFError: Ran out of input
解决方法
parser.add_argument('--num_threads', default=4, type=int, help='# threads for loading data')
修改为
parser.add_argument('--num_threads', default=0, type=int, help='# threads for loading data')
问题4
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 4.08 GiB (GPU 0; 8.00 GiB total capacity; 751.44 MiB already allocated; 4.37 GiB free; 1.30 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Mem
ory Management and PYTORCH_CUDA_ALLOC_CONF
解决方法:降低batchsize,降低分辨率
五. StyTr2- Image Style Transfer with Transformers(CVPR2022)
- 下载VGG模型,移动到models文件夹下
- 修改训练代码
python train.py --content_dir F:/RefDayDataset/KAIST_256/trainA --style_dir F:/RefDayDataset/KAIST_256/trainB --save_dir experiments/ --batch_size 1
问题1
ImportError: cannot import name '_new_empty_tensor' from 'torchvision.ops' (D:\python\lib\site-packages\torchvision\ops\__init__.py)
解决方法
import torchvision
if float(torchvision.__version__[:3]) < 0.7:
from torchvision.ops import _new_empty_tensor
from torchvision.ops.misc import _output_size
修改为
import torchvision
if float(torchvision.__version__[2:4]) < 7:
from torchvision.ops import _new_empty_tensor
from torchvision.ops.misc import _output_size
问题2
ImportError: cannot import name 'container_abcs' from 'torch._six' (D:\Anaconda3\envs\paddlepaddle\lib\site-packages\torch\_six.py)
解决方法
from torch._six import container_abcs
修改为
import collections.abc as container_abcs
问题3
File "D:\Anaconda3\envs\paddlepaddle\lib\site-packages\torch\_utils.py", line 577, in <lambda>
return [_get_device_attr(lambda m: m.get_device_properties(i)) for i in device_ids]
File "D:\Anaconda3\envs\paddlepaddle\lib\site-packages\torch\cuda\__init__.py", line 374, in get_device_properties
raise AssertionError("Invalid device id")
AssertionError: Invalid device id
解决方法
train中116行注释掉
# network = nn.DataParallel(network, device_ids=[0,1])
问题4
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
解决方法
parser.add_argument('--n_threads', type=int, default=16)
修改为
parser.add_argument('--n_threads', type=int, default=0)
问题5
Traceback (most recent call last):
File "train.py", line 135, in <module>
{'params': network.module.transformer.parameters()},
File "D:\Anaconda3\envs\paddlepaddle\lib\site-packages\torch\nn\modules\module.py", line 1270, in __getattr__
type(self).__name__, name))
AttributeError: 'StyTrans' object has no attribute 'module'
这个错误通常在使用 PyTorch 的多 GPU 训练时出现。在多 GPU 训练中,模型通常会被包装在 nn.DataParallel 或 nn.parallel.DistributedDataParallel 中,以实现并行计算。这会导致模型对象的属性访问发生变化。
解决方法
optimizer = torch.optim.Adam([
{'params': network.module.transformer.parameters()},
{'params': network.module.decode.parameters()},
{'params': network.module.embedding.parameters()},
], lr=args.lr)
更改为
optimizer = torch.optim.Adam([
{'params': network.transformer.parameters()},
{'params': network.decode.parameters()},
{'params': network.embedding.parameters()},
], lr=args.lr)