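💡💡💡 This article walks through the full underwater organism detection workflow: the dataset, model training, visual analysis of the results, and how to optimize the model for better detection performance.
💡💡💡 Adding the self-developed attention module MSAM raises mAP@0.5 from the baseline 0.522 to 0.534.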
1. Underwater organism detection dataset
Detection classes:
0: echinus
1: holothurian
2: scallop
3: starfish
4: waterweeds
Dataset size: 1000 images
Sample images:
2. Underwater organism detection with YOLOv8
2.1 Edit fish.yaml
path: ./data/fish # dataset root dir
train: train.txt # train image list (relative to 'path')
val: val.txt # val image list (relative to 'path')

# number of classes
nc: 5

# class names
names:
  0: echinus
  1: holothurian
  2: scallop
  3: starfish
  4: waterweeds
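fish.yaml points to train.txt and val.txt, which list image paths one per line. The sketch below shows one way such lists could be produced. The `images` subfolder, the `.jpg` extension, the 80/20 split and the helper names `split_image_list` and `write_lists` are all assumptions made for illustration, not details of the original dataset setup.

```python
import random
from pathlib import Path


def split_image_list(image_paths, val_ratio=0.2, seed=0):
    """Shuffle image paths deterministically and split them into (train, val)."""
    paths = sorted(str(p) for p in image_paths)
    random.Random(seed).shuffle(paths)
    n_val = int(len(paths) * val_ratio)
    return paths[n_val:], paths[:n_val]


def write_lists(root, val_ratio=0.2, seed=0):
    """Write train.txt and val.txt under `root`.

    Assumed (hypothetical) layout: root/images/*.jpg
    """
    root = Path(root)
    # Absolute paths keep the lists valid regardless of the working directory.
    images = [p.resolve() for p in (root / "images").glob("*.jpg")]
    train, val = split_image_list(images, val_ratio, seed)
    (root / "train.txt").write_text("\n".join(train) + "\n")
    (root / "val.txt").write_text("\n".join(val) + "\n")
    return len(train), len(val)
```

Calling `write_lists("./data/fish")` would then create the two list files next to the images. The fixed seed keeps the split reproducible between runs.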
2.2 Start training
import warnings
warnings.filterwarnings('ignore')
from ultralytics import YOLO

if __name__ == '__main__':
    # Build the model from its YAML definition
    model = YOLO('ultralytics/cfg/models/v8/lyolo/yolov8n-lyolo.yaml')
    # model.load('yolov8n.pt')  # optionally load pretrained weights
    model.train(data='data/fish/fish.yaml',
                cache=False,
                imgsz=640,
                epochs=200,
                batch=16,
                close_mosaic=10,  # disable mosaic augmentation for the last 10 epochs
                workers=0,
                device='0',
                optimizer='SGD',  # using SGD
                project='runs/train',
                name='exp',
                )
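3. Visual analysis of the results
F1_curve.png: the F1 score plotted against the confidence threshold (x-axis). F1 is a classification metric, the harmonic mean of precision and recall. It lies in [0, 1], and higher is better.
TP: actual positive, predicted positive;
FN: actual positive, predicted negative;
FP: actual negative, predicted positive;
TN: actual negative, predicted negative;
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
F1 = 2 * (Precision * Recall) / (Precision + Recall)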
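The three formulas above can be sketched in a few lines of Python. The function names are illustrative, and the TP/FP/FN counts in the usage note are made-up numbers, not measurements from this dataset. The precision 0.612 and recall 0.516 are the "all" row of the validation results reported in section 4.3.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from raw counts; 0.0 when a denominator is empty."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Made-up counts, only to exercise the formulas.
p, r = precision_recall(tp=80, fp=20, fn=40)
print(f"precision={p:.3f} recall={r:.3f} f1={f1_score(p, r):.3f}")

# Overall precision and recall taken from the validation results in section 4.3.
print(f"overall f1={f1_score(0.612, 0.516):.3f}")
```

With the reported overall precision and recall, F1 comes out at roughly 0.56. Because F1 is a harmonic mean, it sits closer to the lower of the two values than an arithmetic mean would.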
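PR_curve.png: the precision-recall curve, where P stands for precision and R for recall. It shows how precision trades off against recall.
R_curve.png: recall plotted against the confidence threshold.
results.png
mAP_0.5:0.95 is the mAP averaged over IoU thresholds from 0.5 to 0.95 in steps of 0.05.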
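The averaging behind mAP_0.5:0.95 can be sketched as follows. The helper names and the per-threshold values in the example are hypothetical. Real values come from the evaluator, which also averages over classes.

```python
def iou_thresholds(start=0.5, stop=0.95, step=0.05):
    """IoU thresholds 0.50, 0.55, ..., 0.95 (ten values with the defaults)."""
    count = int(round((stop - start) / step)) + 1
    return [round(start + i * step, 2) for i in range(count)]


def map_50_95(map_by_iou):
    """Average the mAP measured at each IoU threshold.

    `map_by_iou` maps an IoU threshold to the mAP obtained at that threshold.
    """
    thresholds = iou_thresholds()
    return sum(map_by_iou[t] for t in thresholds) / len(thresholds)


# Hypothetical per-threshold values: mAP usually falls as the IoU threshold tightens.
example = {t: 1.0 - t for t in iou_thresholds()}
print(iou_thresholds())
print(f"mAP@0.5:0.95 = {map_50_95(example):.3f}")
```

Because stricter thresholds demand tighter box overlap, mAP_0.5:0.95 is typically well below mAP_0.5, which matches the gap between the mAP50 and mAP50-95 columns in the results.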
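Prediction results:
4. How to optimize the model
4.1 Add the self-developed attention module
MSAM (an upgraded CBAM): the channel attention is made multi-scale by using multi-branch depthwise convolutions to better extract multi-scale features, and it is then combined efficiently with spatial attention.
4.2 yolov8_MSAM.yaml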
# Ultralytics YOLO 🚀, AGPL-3.0 license
# YOLOv8 object detection model with P3-P5 outputs. For Usage examples see https://docs.ultralytics.com/tasks/detect

# Parameters
nc: 80 # number of classes
scales: # model compound scaling constants, i.e. 'model=yolov8n.yaml' will call yolov8.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.33, 0.25, 1024] # YOLOv8n summary: 225 layers, 3157200 parameters, 3157184 gradients, 8.9 GFLOPs
  s: [0.33, 0.50, 1024] # YOLOv8s summary: 225 layers, 11166560 parameters, 11166544 gradients, 28.8 GFLOPs
  m: [0.67, 0.75, 768] # YOLOv8m summary: 295 layers, 25902640 parameters, 25902624 gradients, 79.3 GFLOPs
  l: [1.00, 1.00, 512] # YOLOv8l summary: 365 layers, 43691520 parameters, 43691504 gradients, 165.7 GFLOPs
  x: [1.00, 1.25, 512] # YOLOv8x summary: 365 layers, 68229648 parameters, 68229632 gradients, 258.5 GFLOPs

# YOLOv8.0n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]] # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]] # 1-P2/4
  - [-1, 3, C2f, [128, True]]
  - [-1, 1, Conv, [256, 3, 2]] # 3-P3/8
  - [-1, 6, C2f, [256, True]]
  - [-1, 1, Conv, [512, 3, 2]] # 5-P4/16
  - [-1, 6, C2f, [512, True]]
  - [-1, 1, Conv, [1024, 3, 2]] # 7-P5/32
  - [-1, 3, C2f, [1024, True]]
  - [-1, 1, SPPF, [1024, 5]] # 9
  - [-1, 1, MSAM, [1024]] # 10

# YOLOv8.0n head
head:
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 6], 1, Concat, [1]] # cat backbone P4
  - [-1, 3, C2f, [512]] # 13

  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 4], 1, Concat, [1]] # cat backbone P3
  - [-1, 3, C2f, [256]] # 16 (P3/8-small)

  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 13], 1, Concat, [1]] # cat head P4
  - [-1, 3, C2f, [512]] # 19 (P4/16-medium)

  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 10], 1, Concat, [1]] # cat head P5
  - [-1, 3, C2f, [1024]] # 22 (P5/32-large)

  - [[16, 19, 22], 1, Detect, [nc]] # Detect(P3, P4, P5)
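Inserting MSAM at index 10 shifts every head layer by one compared with the stock yolov8.yaml, where the last Concat reads from layer 9 and Detect reads from [15, 18, 21]. The sketch below checks that the indices in yolov8_MSAM.yaml point at the intended layers. The `LAYERS` list is hand-copied from the config above, and `resolve_sources` is a hypothetical helper, not part of Ultralytics.

```python
# (from, module) pairs copied from yolov8_MSAM.yaml: backbone first, then head.
LAYERS = [
    (-1, "Conv"), (-1, "Conv"), (-1, "C2f"), (-1, "Conv"), (-1, "C2f"),
    (-1, "Conv"), (-1, "C2f"), (-1, "Conv"), (-1, "C2f"), (-1, "SPPF"),
    (-1, "MSAM"),  # index 10, the added attention layer
    (-1, "nn.Upsample"), ([-1, 6], "Concat"), (-1, "C2f"),
    (-1, "nn.Upsample"), ([-1, 4], "Concat"), (-1, "C2f"),
    (-1, "Conv"), ([-1, 13], "Concat"), (-1, "C2f"),
    (-1, "Conv"), ([-1, 10], "Concat"), (-1, "C2f"),
    ([16, 19, 22], "Detect"),
]


def resolve_sources(layers):
    """Map each layer index to (module name, [(source index, source module), ...])."""
    resolved = {}
    for idx, (src, name) in enumerate(layers):
        sources = src if isinstance(src, list) else [src]
        # Negative values are relative to the current layer, as in the YAML.
        absolute = [idx + s if s < 0 else s for s in sources]
        resolved[idx] = (
            name,
            [(s, layers[s][1] if s >= 0 else "input") for s in absolute],
        )
    return resolved


for idx, (name, sources) in resolve_sources(LAYERS).items():
    if name in ("Concat", "Detect"):
        print(idx, name, sources)
```

The listing shows that the P5 Concat (layer 21) takes the MSAM output rather than the raw SPPF output, and that Detect reads the three C2f outputs at layers 16, 19 and 22.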
4.3 Analysis of experimental results
mAP@0.5 rises from the baseline 0.522 to 0.534.
YOLOv8_MSAM summary: 238 layers, 3105185 parameters, 0 gradients, 8.2 GFLOPs
      Class  Images  Instances  Box(P      R  mAP50  mAP50-95): 100%|██████████| 9/9 [00:21<00:00, 2.40s/it]
        all     270       1602  0.612  0.516  0.534      0.26
    echinus     270        976  0.798  0.724  0.769      0.38
holothurian     270        265  0.436  0.287  0.269     0.116
    scallop     270         92  0.542  0.413  0.426     0.219
   starfish     270        269  0.673  0.639  0.672     0.325
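To see where the model is weakest, the validation rows above can be parsed and ranked. This is a minimal sketch: the field names and the `parse_results` helper are illustrative choices, while the numbers are copied from the log and the 0.522 baseline is the figure quoted in this article.

```python
RESULTS = """\
all 270 1602 0.612 0.516 0.534 0.26
echinus 270 976 0.798 0.724 0.769 0.38
holothurian 270 265 0.436 0.287 0.269 0.116
scallop 270 92 0.542 0.413 0.426 0.219
starfish 270 269 0.673 0.639 0.672 0.325
"""

FIELDS = ("images", "instances", "precision", "recall", "map50", "map50_95")
BASELINE_MAP50 = 0.522  # baseline mAP@0.5 quoted in the text


def parse_results(text):
    """Turn whitespace-separated result rows into {class_name: {field: value}}."""
    rows = {}
    for line in text.strip().splitlines():
        name, *values = line.split()
        rows[name] = dict(zip(FIELDS, map(float, values)))
    return rows


rows = parse_results(RESULTS)
per_class = {name: row for name, row in rows.items() if name != "all"}

# Rank classes from weakest to strongest by mAP@0.5.
ranking = sorted(per_class, key=lambda name: per_class[name]["map50"])
gain = rows["all"]["map50"] - BASELINE_MAP50

print("classes by mAP50, weakest first:", ranking)
print(f"absolute mAP50 gain: {gain:.3f}, relative: {gain / BASELINE_MAP50:.1%}")
```

The gain is about 0.012 in absolute terms, roughly 2.3% relative to the baseline. holothurian and scallop are the weakest classes. The four listed classes already account for all 1602 instances, which is consistent with waterweeds having no instances in the validation split.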
5. Series
Series part 1: self-developed attention BSAM
Series part 2: MSAM (an upgraded CBAM)