A Phone-Call Detection and Recognition System Based on YOLOv6m

news2025/7/7 0:33:37

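In real-world project work, YOLO has been my mainstay for object detection. I have followed the series from v3 through v7, built many projects, and processed a lot of data, accumulating some results and lessons of my own along the way. This post uses the less commonly used yolov6m family of models to build a detector for phone-call behavior. First, a look at the results: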

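If you are not familiar with the yolov6 project, see my earlier tutorial:

"Bare Soil Detection with the YOLOv6 Model Newly Open-Sourced by the Meituan Technical Team"

I won't repeat the tutorial here.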

The model used here is from the higher-accuracy yolov6m family, configured as follows:

# YOLOv6m6 model
model = dict(
    type='YOLOv6m6',
    pretrained='weights/yolov6m6.pt',
    depth_multiple=0.60,
    width_multiple=0.75,
    backbone=dict(
        type='CSPBepBackbone_P6',
        num_repeats=[1, 6, 12, 18, 6, 6],
        out_channels=[64, 128, 256, 512, 768, 1024],
        csp_e=float(2)/3,
        fuse_P2=True,
        ),
    neck=dict(
        type='CSPRepBiFPANNeck_P6',
        num_repeats=[12, 12, 12, 12, 12, 12],
        out_channels=[512, 256, 128, 256, 512, 1024],
        csp_e=float(2)/3,
        ),
    head=dict(
        type='EffiDeHead',
        in_channels=[128, 256, 512, 1024],
        num_layers=4,
        anchors=1,
        strides=[8, 16, 32, 64],
        atss_warmup_epoch=4,
        iou_type='giou',
        use_dfl=True,
        reg_max=16, #if use_dfl is False, please set reg_max to 0
        distill_weight={
            'class': 1.0,
            'dfl': 1.0,
        },
    )
)

solver = dict(
    optim='SGD',
    lr_scheduler='Cosine',
    lr0=0.0032,
    lrf=0.12,
    momentum=0.843,
    weight_decay=0.00036,
    warmup_epochs=2.0,
    warmup_momentum=0.5,
    warmup_bias_lr=0.05
)

data_aug = dict(
    hsv_h=0.0138,
    hsv_s=0.664,
    hsv_v=0.464,
    degrees=0.373,
    translate=0.245,
    scale=0.898,
    shear=0.602,
    flipud=0.00856,
    fliplr=0.5,
    mosaic=1.0,
    mixup=0.243,
)
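
To get a feel for the `solver` settings above, the sketch below computes a cosine learning-rate curve from `lr0` and `lrf`. It assumes the cosine-decay form commonly used in YOLO-family training code, where the rate falls from `lr0` to `lr0 * lrf` over training, and it ignores warmup. The function name `cosine_lr` is made up for illustration and is not part of the YOLOv6 API. The 100 epochs match the default run reported in this post.

```python
import math


def cosine_lr(epoch, epochs=100, lr0=0.0032, lrf=0.12):
    """Hypothetical helper: cosine decay from lr0 down to lr0 * lrf (warmup ignored)."""
    # The factor moves smoothly from 1.0 at epoch 0 to lrf at the final epoch.
    factor = ((1 - math.cos(epoch * math.pi / epochs)) / 2) * (lrf - 1) + 1
    return lr0 * factor


if __name__ == "__main__":
    for epoch in (0, 25, 50, 75, 100):
        print(f"epoch {epoch:3d}: lr = {cosine_lr(epoch):.6f}")
```

Under this assumed schedule the rate starts at `lr0` and ends at `lr0 * lrf`, so a small `lrf` such as 0.12 means the last epochs train with roughly an eighth of the initial step size.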

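Next, a look at the dataset:

All privacy-sensitive information in the dataset has been anonymized.

The annotations in YOLO format look like this:

A sample annotation: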

1 0.5875 0.419444 0.165625 0.394444
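
Each line in a YOLO label file holds a class index followed by the box center x, center y, width, and height, all normalized by the image size. The sketch below converts such a line back to pixel corners. The helper name `yolo_to_corners` is hypothetical, and the 960x540 image size is an assumption borrowed from the VOC sample in this post; the post does not say which image this label belongs to.

```python
def yolo_to_corners(line, img_w, img_h):
    """Hypothetical helper: YOLO label line -> (class_id, xmin, ymin, xmax, ymax) in pixels."""
    parts = line.split()
    class_id = int(parts[0])
    xc, yc, w, h = (float(v) for v in parts[1:5])
    # Undo the normalization: x values scale by width, y values by height.
    xc, w = xc * img_w, w * img_w
    yc, h = yc * img_h, h * img_h
    return class_id, xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2


if __name__ == "__main__":
    # The image size is an assumption, see the note above.
    print(yolo_to_corners("1 0.5875 0.419444 0.165625 0.394444", 960, 540))
```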

The annotations in VOC format look like this:

A sample annotation:

<?xml version="1.0" ?><annotation>
    <folder>3</folder>
    <filename>3dd213ba-4491-4425-90f8-6a7012f9b9f5.jpg</filename>
    <path>3dd213ba-4491-4425-90f8-6a7012f9b9f5.jpg</path>
    <source>
        <database>Unknown</database>
    </source>
    <size>
        <width>960</width>
        <height>540</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>person</name>
        <pose>Unspecified</pose>
        <truncated>1</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>908</xmin>
            <ymin>97</ymin>
            <xmax>960</xmax>
            <ymax>132</ymax>
        </bndbox>
    </object>
    <object>
        <name>person</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>2</xmin>
            <ymin>228</ymin>
            <xmax>55</xmax>
            <ymax>292</ymax>
        </bndbox>
    </object>
    <object>
        <name>person</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>30</xmin>
            <ymin>92</ymin>
            <xmax>73</xmax>
            <ymax>139</ymax>
        </bndbox>
    </object>
    <object>
        <name>person</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>190</xmin>
            <ymin>32</ymin>
            <xmax>230</xmax>
            <ymax>65</ymax>
        </bndbox>
    </object>
    <object>
        <name>person</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>130</xmin>
            <ymin>61</ymin>
            <xmax>167</xmax>
            <ymax>97</ymax>
        </bndbox>
    </object>
    <object>
        <name>person</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>214</xmin>
            <ymin>59</ymin>
            <xmax>263</xmax>
            <ymax>94</ymax>
        </bndbox>
    </object>
    </annotation>
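
Going the other way, a VOC file like the one above can be converted to YOLO lines by normalizing each `bndbox` by the image size. This is a minimal sketch using only the standard library. The function name `voc_to_yolo` and the class mapping `{"person": 0}` are assumptions for illustration, since the post does not list the dataset's class indices.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample above, keeping only the first object.
SAMPLE_XML = """<annotation>
    <size><width>960</width><height>540</height><depth>3</depth></size>
    <object>
        <name>person</name>
        <bndbox><xmin>908</xmin><ymin>97</ymin><xmax>960</xmax><ymax>132</ymax></bndbox>
    </object>
</annotation>"""


def voc_to_yolo(xml_text, class_to_id):
    """Hypothetical helper: turn one VOC annotation into a list of YOLO label lines."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = float(size.find("width").text)
    img_h = float(size.find("height").text)
    lines = []
    for obj in root.iter("object"):
        name = obj.find("name").text
        if name not in class_to_id:
            continue  # skip classes missing from the (assumed) mapping
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        # YOLO stores the box center and size, normalized to [0, 1].
        xc = (xmin + xmax) / 2 / img_w
        yc = (ymin + ymax) / 2 / img_h
        w = (xmax - xmin) / img_w
        h = (ymax - ymin) / img_h
        lines.append(f"{class_to_id[name]} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}")
    return lines


if __name__ == "__main__":
    for line in voc_to_yolo(SAMPLE_XML, {"person": 0}):
        print(line)
```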

Training ran for the default 100 epochs. The detailed results are as follows:

Label visualization: