Underwater Creature Detection with YOLOv8 and Several Optimization Methods: the Self-Developed BSAM Attention Improves Accuracy (Part 1)

news2025/4/13 17:36:11

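💡💡💡Main content: a detailed walk through the whole underwater creature detection process, from the dataset to model training to visual analysis of the results, and how to optimize the model to improve detection performance.

💡💡💡With the self-developed BSAM attention added, mAP@0.5 rises from the original 0.522 to 0.553.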

1. Underwater creature detection dataset

Detection classes:

  0: echinus
  1: holothurian
  2: scallop
  3: starfish
  4: waterweeds

Dataset size: 1000 images

Sample images:

2. Underwater creature detection with YOLOv8

2.1 Modify fish.yaml

path: ./data/fish  # dataset root dir
train: train.txt  # train images (relative to 'path')
val: val.txt  # val images (relative to 'path')

# number of classes
nc: 5

# class names
names:
  0: echinus
  1: holothurian
  2: scallop
  3: starfish
  4: waterweeds
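
fish.yaml points at train.txt and val.txt, which list one image path per line. The article does not show how these files were produced, so the following is only a minimal sketch of one way to build them. The helper names, the images/ folder layout, and the 0.27 validation fraction (chosen to match the 270 validation images in the log further down) are assumptions, not the author's procedure.

```python
import random
from pathlib import Path


def split_image_list(image_paths, val_fraction=0.27, seed=0):
    """Shuffle image paths deterministically and split them into (train, val)."""
    paths = sorted(str(p) for p in image_paths)
    random.Random(seed).shuffle(paths)
    n_val = round(len(paths) * val_fraction)
    return paths[n_val:], paths[:n_val]


def write_list(paths, txt_file):
    """Write one image path per line."""
    Path(txt_file).write_text("\n".join(paths) + "\n", encoding="utf-8")


# In-memory demo with hypothetical file names (nothing is written to disk).
demo_images = [f"images/img_{i:04d}.jpg" for i in range(1000)]
train, val = split_image_list(demo_images)
print(len(train), len(val))  # 730 270

# Assumed usage on a real dataset laid out as data/fish/images/*.jpg:
#   root = Path("data/fish")
#   train, val = split_image_list(sorted((root / "images").glob("*.jpg")))
#   write_list(train, root / "train.txt")
#   write_list(val, root / "val.txt")
```

A fixed seed keeps the split reproducible, so that later experiments (for example with and without BSAM) are compared on the same validation images.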

2.2 Start training

import warnings
warnings.filterwarnings('ignore')
from ultralytics import YOLO

if __name__ == '__main__':
    model = YOLO('ultralytics/cfg/models/v8/lyolo/yolov8n-lyolo.yaml')
    #model.load('yolov8n.pt') # loading pretrain weights
    model.train(data='data/fish/fish.yaml',
                cache=False,
                imgsz=640,
                epochs=200,
                batch=16,
                close_mosaic=10,
                workers=0,
                device='0',
                optimizer='SGD', # using SGD
                project='runs/train',
                name='exp',
                )

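3. Visual analysis of the results

F1_curve.png: the relationship between the F1 score and the confidence threshold (x-axis). The F1 score is a classification metric, the harmonic mean of precision and recall. It lies between 0 and 1, and higher is better.

TP: actually positive, predicted positive;

FN: actually positive, predicted negative;

FP: actually negative, predicted positive;

TN: actually negative, predicted negative;

Precision = TP / (TP + FP)

Recall = TP / (TP + FN)

F1 = 2 * (Precision * Recall) / (Precision + Recall)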
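
The three formulas above can be checked with a few lines of Python. This is a small illustrative sketch; the counts (8 true positives, 2 false positives, 4 false negatives) are made up for the example and do not come from the experiment.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from raw detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up counts: 8 correct detections, 2 false alarms, 4 missed objects.
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=4)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
# precision=0.800 recall=0.667 f1=0.727
```

Because F1 is a harmonic mean, it is pulled toward the smaller of the two values, which is why the F1 curve peaks at the confidence threshold where precision and recall are best balanced.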

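PR_curve.png: in the PR curve, P stands for precision and R for recall; the curve shows the relationship between the two.

R_curve.png: the relationship between recall and confidence.

results.png

mAP_0.5:0.95 is the mAP averaged over IoU thresholds from 0.5 to 0.95 in steps of 0.05.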
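
To make the averaging concrete, here is a minimal sketch that lists the ten IoU thresholds and averages one mAP value per threshold. The function names and the sample values are hypothetical; real values come from the evaluation run.

```python
def iou_thresholds(start=0.5, stop=0.95, step=0.05):
    """IoU thresholds 0.50, 0.55, ..., 0.95 (ten values)."""
    n = round((stop - start) / step) + 1
    return [round(start + i * step, 2) for i in range(n)]


def map_50_95(map_per_threshold):
    """Average the per-threshold mAP values, one per IoU threshold."""
    thresholds = iou_thresholds()
    if len(map_per_threshold) != len(thresholds):
        raise ValueError("expected one mAP value per IoU threshold")
    return sum(map_per_threshold) / len(thresholds)


print(iou_thresholds())
# [0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95]

# Hypothetical values that fall as the IoU requirement gets stricter.
sample = [0.55, 0.50, 0.45, 0.40, 0.33, 0.26, 0.18, 0.10, 0.04, 0.01]
print(round(map_50_95(sample), 3))
```

Stricter thresholds demand tighter boxes, so mAP_0.5:0.95 is normally well below mAP@0.5.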

Prediction results:

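4. How to optimize the model

4.1 Add the self-developed attention

A novel attention module, BSAM (BiLevel Spatial Attention Module), is proposed. The author presents it as highly original, well suited to research work, and far more effective than CBAM: CBAM's Channel Attention + Spatial Attention is upgraded to BiLevel Attention + Spatial Attention.

4.2 Corresponding yaml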

# Ultralytics YOLO 🚀, AGPL-3.0 license
# YOLOv8 object detection model with P3-P5 outputs. For Usage examples see https://docs.ultralytics.com/tasks/detect
 
# Parameters
nc: 80  # number of classes
scales: # model compound scaling constants, i.e. 'model=yolov8n.yaml' will call yolov8.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.33, 0.25, 1024]  # YOLOv8n summary: 225 layers,  3157200 parameters,  3157184 gradients,   8.9 GFLOPs
  s: [0.33, 0.50, 1024]  # YOLOv8s summary: 225 layers, 11166560 parameters, 11166544 gradients,  28.8 GFLOPs
  m: [0.67, 0.75, 768]   # YOLOv8m summary: 295 layers, 25902640 parameters, 25902624 gradients,  79.3 GFLOPs
  l: [1.00, 1.00, 512]   # YOLOv8l summary: 365 layers, 43691520 parameters, 43691504 gradients, 165.7 GFLOPs
  x: [1.00, 1.25, 512]   # YOLOv8x summary: 365 layers, 68229648 parameters, 68229632 gradients, 258.5 GFLOPs
 
# YOLOv8.0n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]]  # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]]  # 1-P2/4
  - [-1, 3, C2f, [128, True]]
  - [-1, 1, Conv, [256, 3, 2]]  # 3-P3/8
  - [-1, 6, C2f, [256, True]]
  - [-1, 1, Conv, [512, 3, 2]]  # 5-P4/16
  - [-1, 6, C2f, [512, True]]
  - [-1, 1, Conv, [1024, 3, 2]]  # 7-P5/32
  - [-1, 3, C2f, [1024, True]]
  - [-1, 1, SPPF, [1024, 5]]  # 9
  - [-1, 1, BSAM, [1024]]  # 10
 
# YOLOv8.0n head
head:
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 6], 1, Concat, [1]]  # cat backbone P4
  - [-1, 3, C2f, [512]]  # 13
 
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 4], 1, Concat, [1]]  # cat backbone P3
  - [-1, 3, C2f, [256]]  # 16 (P3/8-small)
 
  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 13], 1, Concat, [1]]  # cat head P4
  - [-1, 3, C2f, [512]]  # 19 (P4/16-medium)
 
  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 10], 1, Concat, [1]]  # cat head P5
  - [-1, 3, C2f, [1024]]  # 22 (P5/32-large)
 
  - [[16, 19, 22], 1, Detect, [nc]]  # Detect(P3, P4, P5)
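
The channel counts and repeat counts written in the yaml are the unscaled values; for the 'n' model they are shrunk by the [depth, width, max_channels] row in scales. The sketch below reproduces that arithmetic as I understand the Ultralytics model parser to apply it (round channels up to a multiple of 8, scale repeats only when they exceed 1). The function names are my own, and whether the BSAM argument [1024] is scaled the same way depends on how the custom module is registered, which the article does not show.

```python
import math


def make_divisible(x, divisor=8):
    """Round x up to the nearest multiple of divisor."""
    return math.ceil(x / divisor) * divisor


def scale_channels(c_out, width, max_channels):
    """Scaled output channels for one layer."""
    return make_divisible(min(c_out, max_channels) * width, 8)


def scale_repeats(n, depth):
    """Scaled repeat count; single-repeat layers are left unchanged."""
    return max(round(n * depth), 1) if n > 1 else n


depth, width, max_channels = 0.33, 0.25, 1024  # the 'n' row of scales
for c in (64, 128, 256, 512, 1024):
    print(c, "->", scale_channels(c, width, max_channels))
# 64 -> 16, 128 -> 32, 256 -> 64, 512 -> 128, 1024 -> 256
for n in (1, 3, 6):
    print(n, "->", scale_repeats(n, depth))
# 1 -> 1, 3 -> 1, 6 -> 2
```

Under these assumptions, the SPPF output that feeds BSAM at layer 10 has 256 channels in the 'n' model rather than the 1024 written in the yaml.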

4.3 Analysis of the experimental results

mAP@0.5 improves from the original 0.522 to 0.553.

YOLOv8_BSAM summary (fused): 183 layers, 3272449 parameters, 0 gradients, 8.3 GFLOPs
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 9/9 [00:23<00:00,  2.58s/it]
                   all        270       1602      0.637      0.522      0.553      0.275
               echinus        270        976      0.787      0.744      0.786      0.387
           holothurian        270        265      0.521      0.264      0.286      0.139
               scallop        270         92      0.516       0.38      0.412      0.215
              starfish        270        269      0.723        0.7      0.729       0.36
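
The table above can be read programmatically to see where the model is weakest and how large the reported gain is. The following is only a sketch: the rows are copied from the log above with the spacing normalized, the baseline 0.522 is the figure quoted in this article, and the helper name is hypothetical.

```python
RESULTS = """\
all          270  1602  0.637  0.522  0.553  0.275
echinus      270   976  0.787  0.744  0.786  0.387
holothurian  270   265  0.521  0.264  0.286  0.139
scallop      270    92  0.516  0.38   0.412  0.215
starfish     270   269  0.723  0.7    0.729  0.36
"""


def parse_results(text):
    """Parse 'class images instances P R mAP50 mAP50-95' rows into a dict."""
    rows = {}
    for line in text.strip().splitlines():
        name, images, instances, p, r, map50, map50_95 = line.split()
        rows[name] = {
            "images": int(images),
            "instances": int(instances),
            "P": float(p),
            "R": float(r),
            "mAP50": float(map50),
            "mAP50-95": float(map50_95),
        }
    return rows


rows = parse_results(RESULTS)
per_class = {k: v for k, v in rows.items() if k != "all"}
weakest = min(per_class, key=lambda k: per_class[k]["mAP50"])

baseline, improved = 0.522, 0.553  # mAP@0.5 before and after adding BSAM
abs_gain = improved - baseline
rel_gain = abs_gain / baseline

print(f"weakest class: {weakest}")
print(f"mAP@0.5 gain: {abs_gain:.3f} absolute, {rel_gain:.1%} relative")
# weakest class: holothurian
# mAP@0.5 gain: 0.031 absolute, 5.9% relative
```

In this run holothurian has the lowest mAP50 (0.286), mainly because of its low recall (0.264), so it is the class with the most room for further optimization.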