GCANet: Gated Context Aggregation Network for Image Dehazing and Deraining

2019; USTC + HKUST; code available

Chen D, He M, Fan Q, et al. Gated context aggregation network for image dehazing and deraining[C]//2019 IEEE winter conference on applications of computer vision (WACV). IEEE, 2019: 1375-1383.

GitHub - cddlyf/GCANet: Implementation of “Gated Context Aggregation Network for Image Dehazing and Deraining”

1、Abstract

Proposes an end-to-end gated context aggregation network (?)
Uses smoothed dilation to help remove the gridding artifacts
Uses a gated sub-network to fuse the features from different levels
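
To make the gridding issue concrete, here is a minimal 1-D sketch in plain Python. It assumes the smoothing step is a small shared filter of width 2r-1 applied before the dilated convolution; the function names, the fixed averaging weights (a real network would learn them), and the impulse test are illustrative assumptions, not the authors' code.

```python
def conv1d(signal, kernel, dilation=1):
    """Same-size 1-D correlation with zero padding and optional dilation."""
    center = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = i + (j - center) * dilation
            if 0 <= idx < len(signal):
                acc += weight * signal[idx]
        out.append(acc)
    return out


def dilated(signal, kernel, r):
    """Plain dilated convolution with dilation rate r."""
    return conv1d(signal, kernel, dilation=r)


def smoothed_dilated(signal, kernel, r):
    """Assumed smoothing: average over 2r-1 neighbours, then dilate."""
    width = 2 * r - 1
    smooth = [1.0 / width] * width  # fixed here; learned in a real network
    return conv1d(conv1d(signal, smooth), kernel, dilation=r)


def influenced_positions(op, length, src, kernel, r):
    """Output positions that react to a single impulse placed at index src."""
    impulse = [0.0] * length
    impulse[src] = 1.0
    return [i for i, v in enumerate(op(impulse, kernel, r)) if v != 0.0]


KERNEL = [1.0, 1.0, 1.0]
plain_hits = influenced_positions(dilated, 11, 5, KERNEL, 2)
smooth_hits = influenced_positions(smoothed_dilated, 11, 5, KERNEL, 2)
print(plain_hits)   # every other position only: the "grid"
print(smooth_hits)  # a contiguous neighbourhood
```

With plain dilation at r=2, an input sample only reaches outputs of the same parity, so neighbouring outputs are computed from disjoint inputs. The pre-smoothing step mixes neighbours first, which is the intuition behind the reduced gridding.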

BR: what does "gated" mean here?

2、Related Work

DehazeNet: [3] presents an end-to-end network to estimate the intermediate transmission map.
AODNet: [22] reformulates the atmospheric scattering model to predict the final clean image through a light-weight CNN.
[32] creates three different derived input images from the original hazy image and fuses the dehazed results out of these derived inputs.
[42] incorporates the physical model in Equation (1) into the network design and uses two sub-networks to regress the transmission map and atmospheric light respectively.
[3, 31, 22, 24, 42, 44]; [5] tackles video dehazing.

BR: These notes are incomplete; I will fill them in as I keep reading.

3、Method

Given a hazy input image, we first encode it into feature maps by the encoder part, then enhance them by aggregating more context information and fusing the features of different levels without downsampling. Specifically, the smoothed dilated convolution and an extra gate sub-network are leveraged. The enhanced feature maps will be finally decoded back to the original image space to get the target haze residue. By adding it onto the input hazy image, we will get the final haze-free image.

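BR: Pages 3-4 cover the implementation details of the method; I will read them more closely when I need to go deeper. For now it is enough to know what each component does and why it was designed that way.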

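3.1 Datasets

1、How earlier methods built hazy datasets: take an existing dataset that has depth information and apply a physical degradation model to synthesize hazy images.

2、[23] proposes a benchmark dataset for hazy images, RESIDE: a large-scale dataset of hazy image pairs synthesized from depth and stereo datasets.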
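
As a rough illustration of "depth + physical degradation model", the sketch below applies the standard atmospheric scattering model, I = J·t + A·(1 − t) with t = exp(−β·d), to a tiny grayscale image. The function name and the values of `beta` and `airlight` are hypothetical choices for the example; the notes do not record the parameters used to build RESIDE.

```python
import math


def synthesize_haze(clean, depth, beta=1.0, airlight=0.8):
    """Apply I = J*t + A*(1 - t), with t = exp(-beta * depth), per pixel.

    clean and depth are H x W nested lists; beta and airlight are
    illustrative values, not taken from any dataset.
    """
    hazy = []
    for clean_row, depth_row in zip(clean, depth):
        row = []
        for j_val, d_val in zip(clean_row, depth_row):
            t = math.exp(-beta * d_val)  # transmission drops with depth
            row.append(j_val * t + airlight * (1.0 - t))
        hazy.append(row)
    return hazy


clean = [[0.2, 0.2], [0.2, 0.2]]
depth = [[0.0, 1.0], [2.0, 5.0]]
hazy = synthesize_haze(clean, depth)
for row in hazy:
    print([round(v, 3) for v in row])
```

Pixels at zero depth stay unchanged, and farther pixels drift toward the airlight value, which is the haze effect the synthetic pairs rely on.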

3.2 Loss function

A plain mean squared error (MSE) loss on the residual is enough to reach SOTA results.
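
A minimal sketch of that loss, assuming the target residue is the clean image minus the hazy input (consistent with adding the residue back onto the hazy image, as described in the Method section). The function names and the toy 2x2 data are illustrative only.

```python
def residual_mse_loss(pred_residue, hazy, clean):
    """MSE between the predicted residue and the target residue (clean - hazy)."""
    total = 0.0
    count = 0
    for r_row, i_row, j_row in zip(pred_residue, hazy, clean):
        for r, i_val, j_val in zip(r_row, i_row, j_row):
            target = j_val - i_val
            total += (r - target) ** 2
            count += 1
    return total / count


def restore(hazy, pred_residue):
    """Final output: hazy input plus the predicted residue."""
    return [[i + r for i, r in zip(i_row, r_row)]
            for i_row, r_row in zip(hazy, pred_residue)]


hazy = [[0.6, 0.7], [0.8, 0.9]]
clean = [[0.2, 0.3], [0.5, 0.4]]
perfect = [[j - i for i, j in zip(i_row, j_row)]
           for i_row, j_row in zip(hazy, clean)]
zero = [[0.0, 0.0], [0.0, 0.0]]

loss_perfect = residual_mse_loss(perfect, hazy, clean)
loss_zero = residual_mse_loss(zero, hazy, clean)
restored = restore(hazy, perfect)
print(loss_perfect, loss_zero)
```

Predicting a zero residue (returning the hazy image unchanged) gives a positive loss, while the exact residue gives zero loss and restores the clean image.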

4、Thinking

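1、According to the authors, the method works for both dehazing and deraining. If that holds, it is really strong.

2、The ablation study does not seem thorough, and I cannot make sense of it yet: there are 1+3+3+1=8 possible combinations, but only four groups were run. √

3、Before deep learning, ill-posed problems seem to have been handled mostly by using image priors as constraints for restoration. I wonder what actually guides the deep learning route once the earlier mathematical/physical principles are set aside. It solves the problem, but does it move away from the truth, and does it make people lazier? TBD

4、What is smoothed dilated convolution, and how does it differ from the original dilated convolution? (P3) √

5、How is the "gated" mechanism actually realized? (P3) √

6、I now basically understand the authors' overall architecture and experimental logic. The next step is to read the code in detail and train it myself with adjusted input data (20231122).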

5、Reading the Figures

[Image]
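1、Network structure: an encoder (three convolution blocks), context aggregation plus gridding-artifact reduction (seven smoothed dilated convolution layers), and a decoder (a deconvolution followed by two convolution blocks).

2、Gated fusion: takes the low-, mid-, and high-level features from the dilated convolutions as input and outputs the corresponding weights, which are then used for a weighted sum.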
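
The weighted sum in gated fusion can be sketched as below. This is a simplified single-channel illustration: in the real network the weight maps would come from a learned gate sub-network, whereas here they are passed in as fixed numbers, and the function name and toy values are assumptions made for the example.

```python
def gated_fusion(features, gates):
    """Per-pixel weighted sum of feature maps.

    features: three H x W maps (low, mid, high level).
    gates:    three H x W weight maps, one per level (fixed here; a real
              gate sub-network would predict them from the features).
    """
    height, width = len(features[0]), len(features[0][0])
    fused = [[0.0] * width for _ in range(height)]
    for feat, gate in zip(features, gates):
        for y in range(height):
            for x in range(width):
                fused[y][x] += gate[y][x] * feat[y][x]
    return fused


low = [[1.0, 2.0]]
mid = [[10.0, 20.0]]
high = [[100.0, 200.0]]
gates = [[[1.0, 0.0]], [[0.0, 1.0]], [[0.5, 0.5]]]

fused = gated_fusion([low, mid, high], gates)
print(fused)
```

Because the weights vary per pixel, each location can lean on a different feature level, which is what distinguishes this from a fixed skip connection or plain summation.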

[Image]
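1、Dataset: the SOTS indoor dataset from RESIDE.

2、Were the quantitative results here obtained by reproducing the seven compared methods? If the numbers were simply taken from the original papers, their credibility is further discounted.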

[Image]
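1、The authors say experiments show that instance norm is a better fit than batch norm, but the table does not show this, so the text and the experiments are not fully consistent.

2、After understanding gated fusion better, I see why the ablation analysis does not have 8 groups: gated fusion presupposes the smoothed dilation layers, so two combinations drop out. But why is there no group that uses only smoothed dilation + instance norm?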
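
The counting argument in these notes can be checked with a short enumeration. The three component names come from the notes; the rule that gated fusion requires smoothed dilation is the notes' own reasoning, encoded here as an assumption rather than something confirmed by the paper.

```python
from itertools import product

COMPONENTS = ("smoothed_dilation", "gated_fusion", "instance_norm")

# All on/off combinations of the three components: 2^3 = 8.
all_configs = [dict(zip(COMPONENTS, flags))
               for flags in product([False, True], repeat=3)]


def is_valid(cfg):
    """Assumed dependency: gated fusion only makes sense with smoothed dilation."""
    return cfg["smoothed_dilation"] or not cfg["gated_fusion"]


valid_configs = [cfg for cfg in all_configs if is_valid(cfg)]

# Number of configurations with 0, 1, 2, 3 components enabled.
by_size = [sum(1 for cfg in all_configs if sum(cfg.values()) == k)
           for k in range(4)]

print(by_size, len(all_configs), len(valid_configs))
```

Under that assumption six configurations remain, and "smoothed dilation + instance norm without gated fusion" is one of them, so the question raised above still stands.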

[Image]
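1、Running this experiment is great and worth learning from. It gives a direct visual comparison of how different algorithms perform on the same image. Since the comparison is being made anyway, it would be even better to also report PSNR, SSIM, and runtime for each.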
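
For reference, the PSNR part of such a per-image report could be computed as in the sketch below. It assumes single-channel images stored as nested lists with values in [0, 1]; the function name and the toy data are illustrative, and SSIM and timing are left out to keep the example short.

```python
import math


def psnr(reference, restored, max_value=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(max_value^2 / MSE)."""
    squared_error = 0.0
    count = 0
    for ref_row, out_row in zip(reference, restored):
        for ref, out in zip(ref_row, out_row):
            squared_error += (ref - out) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


reference = [[0.5, 0.5], [0.5, 0.5]]
restored = [[0.6, 0.4], [0.5, 0.5]]

score = psnr(reference, restored)
print(round(score, 2))
```

Higher values mean the restored image is closer to the reference, so listing this number next to each visual result would make the side-by-side comparison easier to judge.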