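In several earlier posts I have used attention mechanisms to improve the performance of stock detection models. This post does the same, with one difference: attention is added in two places at once. First, CBAM is integrated into the stock C3 module, so attention already takes effect during feature extraction. Second, a CBAM module is placed in front of the Detect head (on the P5 branch in the YAML below) to strengthen the representation of the fused features. Without further ado, here is the result first:
The YAML file of the modified model is as follows: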
# Parameters
nc: 4  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32
# Backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3CBAM, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3CBAM, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3CBAM, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 3, C3CBAM, [1024]],
   [-1, 1, SPPF, [1024, 5]],  # 9
  ]
# Head
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3CBAM, [512, False]],  # 13
   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3CBAM, [256, False]],  # 17 (P3/8-small)
   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 14], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3CBAM, [512, False]],  # 20 (P4/16-medium)
   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 10], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3CBAM, [1024, False]],  # 23 (P5/32-large)
   [-1, 1, CBAM, [1024]],
   [[17, 20, 24], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
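The `from` column in the YAML refers to layers by position, so it helps to number the rows explicitly. The sketch below copies the rows into Python lists, numbers them, and checks which layers feed Detect. The two scaling helpers are assumed to follow the rule that stock YOLOv5's `parse_model` uses for `depth_multiple` and `width_multiple`. Whether the custom `C3CBAM` and `CBAM` modules are registered for that scaling in this project is an assumption, since that code is not shown here.

```python
import math

# Rows copied from the YAML: (from, number, module, args).
BACKBONE = [
    (-1, 1, "Conv", [64, 6, 2, 2]),
    (-1, 1, "Conv", [128, 3, 2]),
    (-1, 3, "C3CBAM", [128]),
    (-1, 1, "Conv", [256, 3, 2]),
    (-1, 6, "C3CBAM", [256]),
    (-1, 1, "Conv", [512, 3, 2]),
    (-1, 9, "C3CBAM", [512]),
    (-1, 1, "Conv", [1024, 3, 2]),
    (-1, 3, "C3CBAM", [1024]),
    (-1, 1, "SPPF", [1024, 5]),
]
HEAD = [
    (-1, 1, "Conv", [512, 1, 1]),
    (-1, 1, "nn.Upsample", [None, 2, "nearest"]),
    ([-1, 6], 1, "Concat", [1]),
    (-1, 3, "C3CBAM", [512, False]),
    (-1, 1, "Conv", [256, 1, 1]),
    (-1, 1, "nn.Upsample", [None, 2, "nearest"]),
    ([-1, 4], 1, "Concat", [1]),
    (-1, 3, "C3CBAM", [256, False]),
    (-1, 1, "Conv", [256, 3, 2]),
    ([-1, 14], 1, "Concat", [1]),
    (-1, 3, "C3CBAM", [512, False]),
    (-1, 1, "Conv", [512, 3, 2]),
    ([-1, 10], 1, "Concat", [1]),
    (-1, 3, "C3CBAM", [1024, False]),
    (-1, 1, "CBAM", [1024]),
    ([17, 20, 24], 1, "Detect", ["nc", "anchors"]),
]
DEPTH_MULTIPLE, WIDTH_MULTIPLE = 0.33, 0.50


def scaled_repeats(n):
    """Repeat count after depth scaling (assumed parse_model rule)."""
    return max(round(n * DEPTH_MULTIPLE), 1) if n > 1 else n


def scaled_channels(c):
    """Output channels after width scaling, rounded up to a multiple of 8."""
    return math.ceil(c * WIDTH_MULTIPLE / 8) * 8


layers = BACKBONE + HEAD  # layer index = position in this combined list
detect_inputs = [layers[i][2] for i in layers[-1][0]]

for index, (src, number, module, args) in enumerate(layers):
    print(index, src, scaled_repeats(number), module, args)
print("Detect inputs:", detect_inputs)
```

Under these assumptions Detect reads from layers 17 and 20 (both `C3CBAM`) and from layer 24 (the standalone `CBAM`), so the extra CBAM module only precedes the P5 branch.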
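The changes are concentrated in two parts, the Backbone and the Head.
First, the Backbone part:
Next, the Head part:
There are quite a few changes here, but once the logic is clear you can work through them one by one against the YAML.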
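For reference, this is what a CBAM block computes according to the original CBAM paper (Woo et al., 2018). It is a summary of the published formulation, not of this project's code: where exactly the block sits inside `C3CBAM`, and which reduction ratio or kernel size is used, is not stated here. Given a feature map $F \in \mathbb{R}^{C \times H \times W}$, a channel attention map $M_c \in \mathbb{R}^{C \times 1 \times 1}$ is applied first, then a spatial attention map $M_s \in \mathbb{R}^{1 \times H \times W}$:

```latex
\begin{aligned}
M_c(F)  &= \sigma\big(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F))\big) \\
F'      &= M_c(F) \otimes F \\
M_s(F') &= \sigma\big(f^{7 \times 7}([\mathrm{AvgPool}(F');\ \mathrm{MaxPool}(F')])\big) \\
F''     &= M_s(F') \otimes F'
\end{aligned}
```

Here $\sigma$ is the sigmoid, $\otimes$ is element-wise multiplication with broadcasting, the MLP is shared between the two pooled descriptors, the pooling in $M_c$ runs over the spatial dimensions, and the pooling in $M_s$ runs along the channel axis before the $7 \times 7$ convolution $f^{7 \times 7}$.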
A quick look at the dataset:
The YOLO-format annotation data looks like this:
A sample annotation file:
3 0.754861 0.383951 0.140278 0.180247
2 0.454167 0.150617 0.080556 0.088889
2 0.370139 0.104938 0.068056 0.076543
1 0.7625 0.892593 0.180556 0.214815
1 0.848611 0.785185 0.161111 0.296296
1 0.665278 0.624691 0.144444 0.237037
3 0.578472 0.520988 0.140278 0.182716
3 0.350694 0.497531 0.134722 0.224691
1 0.439583 0.522222 0.101389 0.160494
2 0.416667 0.660494 0.147222 0.22963
0 0.355556 0.897531 0.194444 0.204938
2 0.280556 0.848148 0.127778 0.17037
2 0.284028 0.732099 0.106944 0.17037
2 0.186806 0.781481 0.1125 0.219753
0 0.165278 0.633333 0.113889 0.08642
0 0.194444 0.597531 0.158333 0.128395
2 0.247917 0.490123 0.095833 0.135802
2 0.211111 0.404938 0.127778 0.182716
2 0.215972 0.307407 0.106944 0.106173
2 0.068056 0.191358 0.075 0.081481
2 0.130556 0.237037 0.127778 0.083951
1 0.119444 0.28642 0.088889 0.108642
0 0.029861 0.367901 0.056944 0.148148
0 0.095833 0.407407 0.088889 0.083951
2 0.113194 0.469136 0.123611 0.11358
1 0.053472 0.644444 0.101389 0.330864
1 0.094444 0.782716 0.152778 0.316049
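Each row above is `class cx cy w h`, with the box center and size normalized by the image width and height. A minimal sketch of turning one row back into a pixel box follows. The function name `yolo_to_pixel_box` is made up for illustration, and the 720x405 image size is an assumption, since the size of the image behind this label file is not given.

```python
def yolo_to_pixel_box(row, img_w, img_h):
    """Convert 'class cx cy w h' (normalized) to (class, xmin, ymin, xmax, ymax) in pixels."""
    cls, cx, cy, w, h = row.split()
    cx, w = float(cx) * img_w, float(w) * img_w
    cy, h = float(cy) * img_h, float(h) * img_h
    return (
        int(cls),
        round(cx - w / 2),
        round(cy - h / 2),
        round(cx + w / 2),
        round(cy + h / 2),
    )


# First row of the sample; 720x405 is an assumed image size.
pixel_box = yolo_to_pixel_box("3 0.754861 0.383951 0.140278 0.180247", 720, 405)
print(pixel_box)
```

With the assumed size, the box width and height come out very close to whole pixels, which is consistent with that size but does not prove it.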
The VOC-format annotation data looks like this:
A sample annotation file:
<annotation>
<folder>UWA</folder>
<filename>00b27110-7b71-4c33-aced-a73527a36cd9.jpg</filename>
<source>
<database>The UWA Database</database>
<annotation>UWA</annotation>
<image>UWA</image>
</source>
<owner>
<name>YSHC</name>
</owner>
<size>
<width>720</width>
<height>405</height>
<depth>3</depth>
</size>
<segmented>0</segmented>
<object>
<name>echinus</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>0</xmin>
<ymin>128</ymin>
<xmax>71</xmax>
<ymax>248</ymax>
</bndbox>
</object>
<object>
<name>echinus</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>82</xmin>
<ymin>154</ymin>
<xmax>160</xmax>
<ymax>248</ymax>
</bndbox>
</object>
<object>
<name>echinus</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>145</xmin>
<ymin>186</ymin>
<xmax>225</xmax>
<ymax>284</ymax>
</bndbox>
</object>
<object>
<name>echinus</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>70</xmin>
<ymin>284</ymin>
<xmax>196</xmax>
<ymax>406</ymax>
</bndbox>
</object>
<object>
<name>echinus</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>200</xmin>
<ymin>290</ymin>
<xmax>280</xmax>
<ymax>406</ymax>
</bndbox>
</object>
<object>
<name>echinus</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>271</xmin>
<ymin>287</ymin>
<xmax>351</xmax>
<ymax>368</ymax>
</bndbox>
</object>
<object>
<name>echinus</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>324</xmin>
<ymin>136</ymin>
<xmax>401</xmax>
<ymax>219</ymax>
</bndbox>
</object>
<object>
<name>echinus</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>294</xmin>
<ymin>73</ymin>
<xmax>360</xmax>
<ymax>127</ymax>
</bndbox>
</object>
<object>
<name>echinus</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>194</xmin>
<ymin>71</ymin>
<xmax>266</xmax>
<ymax>140</ymax>
</bndbox>
</object>
<object>
<name>echinus</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>186</xmin>
<ymin>7</ymin>
<xmax>236</xmax>
<ymax>68</ymax>
</bndbox>
</object>
<object>
<name>echinus</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>632</xmin>
<ymin>84</ymin>
<xmax>694</xmax>
<ymax>156</ymax>
</bndbox>
</object>
<object>
<name>scallop</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>533</xmin>
<ymin>220</ymin>
<xmax>579</xmax>
<ymax>270</ymax>
</bndbox>
</object>
<object>
<name>scallop</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>576</xmin>
<ymin>124</ymin>
<xmax>625</xmax>
<ymax>184</ymax>
</bndbox>
</object>
</annotation>
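The two formats carry the same boxes, so one can be derived from the other. The sketch below converts a VOC annotation into YOLO rows using only the standard library. The function name `voc_to_yolo` and the class order `["echinus", "scallop"]` are hypothetical: the actual class-to-index mapping for the four classes is not listed here. Boxes are clipped to the image, because the sample above contains `ymax` values of 406 in a 405-pixel-high image.

```python
import xml.etree.ElementTree as ET

# Two objects taken from the sample annotation, trimmed to the fields used here.
SAMPLE_XML = """
<annotation>
  <size><width>720</width><height>405</height><depth>3</depth></size>
  <object>
    <name>echinus</name>
    <bndbox><xmin>0</xmin><ymin>128</ymin><xmax>71</xmax><ymax>248</ymax></bndbox>
  </object>
  <object>
    <name>echinus</name>
    <bndbox><xmin>70</xmin><ymin>284</ymin><xmax>196</xmax><ymax>406</ymax></bndbox>
  </object>
</annotation>
"""


def voc_to_yolo(xml_text, class_names):
    """Return YOLO rows 'class cx cy w h' (normalized) for one VOC annotation."""
    root = ET.fromstring(xml_text)
    img_w = int(root.findtext("size/width"))
    img_h = int(root.findtext("size/height"))
    rows = []
    for obj in root.findall("object"):
        cls = class_names.index(obj.findtext("name"))
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        # Clip to the image bounds so normalized values stay within [0, 1].
        xmin, xmax = max(xmin, 0.0), min(xmax, float(img_w))
        ymin, ymax = max(ymin, 0.0), min(ymax, float(img_h))
        cx, cy = (xmin + xmax) / 2 / img_w, (ymin + ymax) / 2 / img_h
        w, h = (xmax - xmin) / img_w, (ymax - ymin) / img_h
        rows.append(f"{cls} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}")
    return rows


# Hypothetical class order, for illustration only.
rows = voc_to_yolo(SAMPLE_XML, ["echinus", "scallop"])
print("\n".join(rows))
```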
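Training runs for 100 epochs by default. The log output is as follows:
Once training finishes, the results directory looks like this:
Label visualization:
F1 curve and PR curve:
Confusion matrix:
Batch detection examples:
A GUI was developed for visual inference:
Upload an image:
Run detection: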