GitHub - littlebeen/Cloud-removal-model-collection: A collection of the existing end-to-end cloud removal model
readme
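Diffusion Enhancement for Cloud Removal
A diffusion enhancement algorithm, built on ADM, for cloud removal in ultra-resolution remote sensing imagery.
Several traditional CR models are available at https://github.com/littlebeen/Cloud-removal-model-collection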
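Usage
Training
Pure diffusion: respace.py: gaussian_diffusion; unet.py: UnetModel
Locked diffusion + train WA: gaussian_diffusion_enhance; unet.py: UnetModel256; the lock is at line 74 of train_util.py
Train everything: change line 74 of train_util.py
Testing
Put the pre-trained model into `pre_train`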
python super_res_sample.py
Weights: the models pre-trained on RICE2 with mn and mdsa have been uploaded to Baidu Netdisk (extraction code: bean).
CUHK-CR
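A new multispectral cloud removal dataset
Download link: Baidu Netdisk (extraction code: bean)
-CUHK-CR1 (RGB images of the thin-cloud dataset CUHK-CR1)
-CUHK-CR2 (RGB images of the thick-cloud dataset CUHK-CR2)
-NIR (near-infrared images of CUHK-CR1 and CUHK-CR2)
If you need 4-band images (RGB + near-infrared), load the images from both the RGB dataset and the near-infrared dataset and combine the 4 channels.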
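The band combination described above can be sketched with NumPy. This is a minimal sketch under assumptions: the RGB image is already loaded as an H x W x 3 array and the near-infrared image as an H x W (or H x W x 1) array of the same size. The function name `stack_rgb_nir` is hypothetical and not part of the repository.

```python
import numpy as np


def stack_rgb_nir(rgb, nir):
    """Combine an (H, W, 3) RGB array and an (H, W) or (H, W, 1) NIR array
    into one (H, W, 4) array. Hypothetical helper, not from the repository."""
    rgb = np.asarray(rgb)
    nir = np.asarray(nir)
    if nir.ndim == 2:
        nir = nir[..., np.newaxis]  # add a trailing channel axis
    if rgb.ndim != 3 or rgb.shape[2] != 3:
        raise ValueError(f"expected RGB of shape (H, W, 3), got {rgb.shape}")
    if nir.shape != rgb.shape[:2] + (1,):
        raise ValueError(f"NIR shape {nir.shape} does not match RGB {rgb.shape}")
    # Channel order in the result: R, G, B, NIR.
    return np.concatenate([rgb, nir], axis=2)
```

The explicit shape checks are there because a near-infrared file saved as a 3-channel image would otherwise be concatenated silently into a 6-channel array.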
File "super_res_train.py", line 124, in <module>
main()
File "super_res_train.py", line 27, in main
dist_util.setup_dist()
File "D:\learn\txhf\DDPM-Enhancement-for-Cloud-Removal-main\guided_diffusion\dist_util.py", line 42, in setup_dist
dist.init_process_group(backend=backend, init_method="env://")
File "D:\an\anaconda\envs\inpaint\lib\site-packages\torch\distributed\distributed_c10d.py", line 602, in init_process_group
default_pg = _new_process_group_helper(
File "D:\an\anaconda\envs\inpaint\lib\site-packages\torch\distributed\distributed_c10d.py", line 727, in _new_process_group_helper
raise RuntimeError("Distributed package doesn't have NCCL " "built in")
RuntimeError: Distributed package doesn't have NCCL built in
First you need a CUDA environment; then install the required packages:
pip install blobfile
pip install mpi4py
Run: there are two scripts, one for training and one for testing. Run training first.
python super_res_train.py
python super_res_sample.py
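Runtime error
ImportError: DLL load failed while importing MPI: The specified module could not be found.
The machine has no MPI runtime installed. Download the installer from https://www.microsoft.com/en-us/download/details.aspx?id=57467 and install it to the default location on drive C; it is small.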
Runtime error
RuntimeError: No CUDA GPUs are available
In super_res_train.py, change os.environ["CUDA_VISIBLE_DEVICES"] = "1" to "0"; on a single-GPU machine the only CUDA device has index 0.
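A short sketch of the fix above, assuming a machine with a single GPU. `CUDA_VISIBLE_DEVICES` lists the physical GPU indices the process may use, so a value of "1" on a single-GPU machine hides the only device (index 0). The variable has to be in place before CUDA is initialized, which in practice means before importing torch. The helper name `use_gpu` is hypothetical.

```python
import os


def use_gpu(index=0, env=os.environ):
    """Expose only the GPU with the given physical index to this process.

    Hypothetical helper; call it before importing torch so the setting
    is in place when CUDA is initialized.
    """
    if index < 0:
        raise ValueError("GPU index must be non-negative")
    env["CUDA_VISIBLE_DEVICES"] = str(index)
    return env["CUDA_VISIBLE_DEVICES"]


use_gpu(0)  # single-GPU machine: the only device is index 0
```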
Runtime error
raise RuntimeError("Distributed package doesn't have NCCL " "built in")
RuntimeError: Distributed package doesn't have NCCL built in
Windows does not support the NCCL backend; the original code was probably written for Linux. Add the following to super_res_train.py:
import os
os.environ["PL_TORCH_DISTRIBUTED_BACKEND"] = "gloo"
Also, at the line that raised the error in guided_diffusion/dist_util.py, change
dist.init_process_group(backend=backend, init_method="env://")
to
dist.init_process_group(backend='gloo', init_method="env://")
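Instead of hard-coding 'gloo', the backend can be chosen per platform so that the same script still uses NCCL on Linux. This is a minimal sketch, assuming the result is passed as the `backend` argument of `dist.init_process_group`. `choose_backend` is a hypothetical helper, and in practice the availability flag would come from `torch.distributed.is_nccl_available()`.

```python
import sys


def choose_backend(platform=sys.platform, nccl_available=False):
    """Return 'nccl' only where it can work, otherwise fall back to 'gloo'.

    Hypothetical helper. PyTorch builds for Windows ship without NCCL,
    so Windows always gets 'gloo'.
    """
    if platform.startswith("win"):
        return "gloo"
    return "nccl" if nccl_available else "gloo"
```

On the Windows machine from these notes this returns 'gloo', which matches the manual edit above.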
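Runtime error
FileNotFoundError: [Errno 2] No such file or directory: './guided_diffusion/cloudnet/mn/pretrain/mn2.pth'
The pre-trained weights cannot be found. Download them from Baidu Netdisk, place them under this path, and rename the file (model + dataset) so that the path matches './guided_diffusion/cloudnet/mn/pretrain/mn2.pth'.
Alternatively, as I did, edit guided_diffusion/diff/gaussian_diffusion_enhance.py and replace
self.cloudnet.load_state_dict(th.load('./guided_diffusion/cloudnet/'+model+'/pretrain/'+model+'2.pth'),strict=True)
with a raw string (r'...') holding the path to the weight file:
self.cloudnet.load_state_dict(th.load(r'weight/ema_0.9999_010000.pt.pth'),strict=True)
The .pt and .pth extensions are essentially the same, so the file can be renamed freely.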
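Before editing the load call, it can help to check where the relative path actually resolves, since './guided_diffusion/...' is interpreted relative to the current working directory. This is a minimal sketch using only the standard library. `resolve_checkpoint` is a hypothetical helper, and the path layout is taken from the error message above.

```python
from pathlib import Path


def resolve_checkpoint(model, root="."):
    """Return the absolute path of
    <root>/guided_diffusion/cloudnet/<model>/pretrain/<model>2.pth.

    Hypothetical helper: raises a FileNotFoundError that names the full
    resolved path, which makes a wrong working directory easy to spot.
    """
    path = (
        Path(root) / "guided_diffusion" / "cloudnet" / model
        / "pretrain" / f"{model}2.pth"
    )
    path = path.resolve()
    if not path.is_file():
        raise FileNotFoundError(f"checkpoint not found: {path}")
    return path
```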
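Runtime error
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for MPRNet:
According to the official documentation of load_state_dict, the strict parameter defaults to True; it controls whether the keys in state_dict must exactly match the keys returned by the module's state_dict() function. In guided_diffusion/diff/gaussian_diffusion_enhance.py, where the error is raised, change strict=True to strict=False: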
if(data=='RICE1'):
    # self.cloudnet.load_state_dict(th.load('./guided_diffusion/cloudnet/'+model+'/pretrain/'+model+'rice2.pth'),strict=True)
    self.cloudnet.load_state_dict(
        th.load('./guided_diffusion/cloudnet/' + 'pretrain/' + model + '_rice2.pth'),
        strict=True)  # change to False
    print(model+'1 is load')
elif(data=='RICE2'):
    # self.cloudnet.load_state_dict(th.load('./guided_diffusion/cloudnet/'+model+'/pretrain/'+model+'rice2.pth'),strict=True)
    self.cloudnet.load_state_dict(
        th.load('./guided_diffusion/cloudnet/' + 'pretrain/' + model + '_rice2.pth'),
        strict=True)  # change to False
    print(model+'2 is load')
Changing the parameter to False is enough.
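With `strict=False`, keys that do not match are skipped without an error, so it is worth checking which keys differ before trusting the loaded weights. Below is a plain-Python sketch of that comparison, assuming the key lists come from `model.state_dict().keys()` and from the loaded checkpoint dict. `diff_state_dict_keys` is a hypothetical helper, and the example key names are made up.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, each sorted.

    missing:    keys the model expects but the checkpoint lacks
    unexpected: keys in the checkpoint that the model does not have
    Hypothetical helper mirroring what a strict load would complain about.
    """
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Made-up key names, for illustration only.
missing, unexpected = diff_state_dict_keys(
    ["conv1.weight", "conv1.bias"],
    ["conv1.weight", "module.conv1.bias"],
)
print(missing)     # ['conv1.bias']
print(unexpected)  # ['module.conv1.bias']
```

If most keys end up in `missing`, the checkpoint probably belongs to a different network, and a non-strict load would leave those parameters at their initial values.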
Neither the two downloaded models nor the dataset could be loaded, so I replaced them all with other ones.
FileNotFoundError: The system cannot find the path specified: './pre_train'
Create a pre_train folder under the DDPM-Enhancement-for-Cloud-Removal-main directory.
RuntimeError: a leaf Variable that requires grad is being used in an in-place operation.
The following error appears:
work.wait()
RuntimeError: a leaf Variable that requires grad is being used in an in-place operat
This may be a multi-GPU issue. Comment out the call that raises the error, in guided_diffusion/train_util.py:
self._load_and_sync_parameters()
The dataset cannot be found; fix the dataset path.
File "D:\an\anaconda\envs\inpaint\lib\site-packages\blobfile\_context.py", line 353, in scandir
raise FileNotFoundError(f"The system cannot find the path specified: '{path}'")
FileNotFoundError: The system cannot find the path specified: '../data/RICE2/train/cloud'
Under the DDPM-Enhancement-for-Cloud-Removal-main root directory, create a data folder and place the data in it following the expected layout.
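The two missing-folder errors above can be avoided by creating the folders up front. This is a minimal sketch, assuming only the two paths that appear in the error messages (`pre_train` and `data/RICE2/train/cloud`); any other subfolders the dataset loader expects are not listed here. Note that the error message refers to '../data/...', so the right root depends on the working directory the script is started from. `ensure_dirs` is a hypothetical helper.

```python
import os


def ensure_dirs(root, relative_paths):
    """Create each relative path under root if it is missing.

    Returns the full paths, whether they were newly created or already existed.
    """
    full_paths = []
    for rel in relative_paths:
        full = os.path.join(root, rel)
        os.makedirs(full, exist_ok=True)  # no error if it already exists
        full_paths.append(full)
    return full_paths


# Example call (left commented out so that importing this file has no side effects):
# ensure_dirs(".", ["pre_train", os.path.join("data", "RICE2", "train", "cloud")])
```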
The code now runs normally.
According to the paper,