Adding RFEM to YOLO

2024/11/16 11:31:21

Paper: https://arxiv.org/pdf/2208.02019.pdf

Code: Krasjet-Yu/YOLO-FaceV2 on GitHub (YOLO-FaceV2: A Scale and Occlusion Aware Face Detector)

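In short, RFEM (Receptive Field Enhancement Module) exploits the receptive field of the feature map: branches with different dilation rates capture multi-scale information and dependencies over different ranges. Because the branches share their convolution weights, the design reduces the parameter count, lowers the risk of overfitting, and makes full use of each sample.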
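To make the receptive-field argument concrete, here is a small framework-free sketch of the standard dilated-convolution arithmetic. The helper names are my own, not from the paper or the repository. The kernel size 3 and the dilation/padding values 1, 2, 3 match the TridentBlock defaults in the common.py code later in this post.

```python
def effective_kernel(k, d):
    """Span (in pixels) covered by a k-tap kernel with dilation d."""
    return d * (k - 1) + 1


def conv_out_size(n, k, s, p, d):
    """Output length along one axis, using the usual Conv2d shape formula."""
    return (n + 2 * p - d * (k - 1) - 1) // s + 1


# One 3x3 kernel, three dilation rates, padding equal to the dilation.
for d in (1, 2, 3):
    span = effective_kernel(3, d)
    out = conv_out_size(20, k=3, s=1, p=d, d=d)
    print(f"dilation={d}: kernel spans {span}x{span}, 20x20 input -> {out}x{out}")
```

With padding equal to the dilation, every branch keeps the spatial size of its input, which is what allows the three branch outputs to be summed element-wise.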

1. yolov7-tiny

Create the config file yolov7-tiny-RFEM.yaml:

# parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

activation: nn.LeakyReLU(0.1)
# anchors
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# yolov7-tiny backbone
backbone:
  # [from, number, module, args] c2, k=1, s=1, p=None, g=1, act=True
  [[-1, 1, Conv, [32, 3, 2, None, 1]],  # 0-P1/2
  
   [-1, 1, Conv, [64, 3, 2, None, 1]],  # 1-P2/4
   
   [-1, 1, Conv, [32, 1, 1, None, 1]],
   [-2, 1, Conv, [32, 1, 1, None, 1]],
   [-1, 1, Conv, [32, 3, 1, None, 1]],
   [-1, 1, Conv, [32, 3, 1, None, 1]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [64, 1, 1, None, 1]],  # 7
   
   [-1, 1, MP, []],  # 8-P3/8
   [-1, 1, Conv, [64, 1, 1, None, 1]],
   [-2, 1, Conv, [64, 1, 1, None, 1]],
   [-1, 1, Conv, [64, 3, 1, None, 1]],
   [-1, 1, Conv, [64, 3, 1, None, 1]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [128, 1, 1, None, 1]],  # 14
   
   [-1, 1, MP, []],  # 15-P4/16
   [-1, 1, Conv, [128, 1, 1, None, 1]],
   [-2, 1, Conv, [128, 1, 1, None, 1]],
   [-1, 1, Conv, [128, 3, 1, None, 1]],
   [-1, 1, Conv, [128, 3, 1, None, 1]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1, None, 1]],  # 21
   
   [-1, 1, MP, []],  # 22-P5/32
   [-1, 1, Conv, [256, 1, 1, None, 1]],
   [-2, 1, Conv, [256, 1, 1, None, 1]],
   [-1, 1, Conv, [256, 3, 1, None, 1]],
   [-1, 1, Conv, [256, 3, 1, None, 1]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [512, 1, 1, None, 1]],  # 28
  ]

# yolov7-tiny head
head:
  [[-1, 1, Conv, [256, 1, 1, None, 1]],
   [-2, 1, Conv, [256, 1, 1, None, 1]],
   [-1, 1, SP, [5]],
   [-2, 1, SP, [9]],
   [-3, 1, SP, [13]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1, None, 1]],
   [[-1, -7], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1, None, 1]],  # 37
   [-1, 1, RFEM, [256]],
  
   [-1, 1, Conv, [128, 1, 1, None, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [21, 1, Conv, [128, 1, 1, None, 1]], # route backbone P4
   [[-1, -2], 1, Concat, [1]],
   
   [-1, 1, Conv, [64, 1, 1, None, 1]],
   [-2, 1, Conv, [64, 1, 1, None, 1]],
   [-1, 1, Conv, [64, 3, 1, None, 1]],
   [-1, 1, Conv, [64, 3, 1, None, 1]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [128, 1, 1, None, 1]],  # 48
  
   [-1, 1, Conv, [64, 1, 1, None, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [14, 1, Conv, [64, 1, 1, None, 1]], # route backbone P3
   [[-1, -2], 1, Concat, [1]],
   
   [-1, 1, Conv, [32, 1, 1, None, 1]],
   [-2, 1, Conv, [32, 1, 1, None, 1]],
   [-1, 1, Conv, [32, 3, 1, None, 1]],
   [-1, 1, Conv, [32, 3, 1, None, 1]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [64, 1, 1, None, 1]],  # 58
   
   [-1, 1, Conv, [128, 3, 2, None, 1]],
   [[-1, 48], 1, Concat, [1]],
   
   [-1, 1, Conv, [64, 1, 1, None, 1]],
   [-2, 1, Conv, [64, 1, 1, None, 1]],
   [-1, 1, Conv, [64, 3, 1, None, 1]],
   [-1, 1, Conv, [64, 3, 1, None, 1]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [128, 1, 1, None, 1]],  # 66
   
   [-1, 1, Conv, [256, 3, 2, None, 1]],
   [[-1, 37], 1, Concat, [1]],
   
   [-1, 1, Conv, [128, 1, 1, None, 1]],
   [-2, 1, Conv, [128, 1, 1, None, 1]],
   [-1, 1, Conv, [128, 3, 1, None, 1]],
   [-1, 1, Conv, [128, 3, 1, None, 1]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1, None, 1]],  # 74
      
   [58, 1, Conv, [128, 3, 1, None, 1]],
   [66, 1, Conv, [256, 3, 1, None, 1]],
   [74, 1, Conv, [512, 3, 1, None, 1]],

   [[75,76,77], 1, Detect, [nc, anchors]],   # Detect(P3, P4, P5)
  ]
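Inserting RFEM as layer 38 pushes every later layer down by one, so absolute `from` indices at or above 38 have to be bumped. The sketch below replays that bookkeeping. The "stock" values are what I recall from the upstream yolov7-tiny.yaml and should be treated as an assumption; the helper name is hypothetical.

```python
INSERT_AT = 38  # index the new RFEM layer takes in yolov7-tiny-RFEM.yaml


def shift_ref(ref, insert_at=INSERT_AT):
    """Return the absolute layer index after one layer is inserted at insert_at.

    Negative (relative) references pass through unchanged; this sketch does not
    handle relative references that reach back across the inserted layer.
    """
    return ref + 1 if ref >= insert_at else ref


# Absolute references in the stock head (assumed values, check your own copy).
stock = {
    "route_p4": [21],
    "route_p3": [14],
    "concat_p4": [-1, 47],
    "concat_p5": [-1, 37],
    "out_convs": [57, 65, 73],
    "detect": [74, 75, 76],
}

shifted = {name: [shift_ref(r) for r in refs] for name, refs in stock.items()}
for name, refs in shifted.items():
    print(name, refs)
```

The shifted values agree with the config above. Note that the P5 Concat still reads layer 37, i.e. the features before RFEM; pointing it at 38 would feed it the RFEM output instead, which is a design choice the source does not discuss.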

Add the following to common.py:

# RFEM (relies on torch, nn and Conv, which common.py already imports/defines)
class TridentBlock(nn.Module):
    def __init__(self, c1, c2, stride=1, c=False, e=0.5, padding=[1, 2, 3], dilate=[1, 2, 3], bias=False):
        super(TridentBlock, self).__init__()
        self.stride = stride
        self.c = c
        c_ = int(c2 * e)
        self.padding = padding
        self.dilate = dilate
        self.share_weightconv1 = nn.Parameter(torch.Tensor(c_, c1, 1, 1))
        self.share_weightconv2 = nn.Parameter(torch.Tensor(c2, c_, 3, 3))

        self.bn1 = nn.BatchNorm2d(c_)
        self.bn2 = nn.BatchNorm2d(c2)

        # self.act = nn.SiLU()
        self.act = Conv.default_act

        nn.init.kaiming_uniform_(self.share_weightconv1, nonlinearity="relu")
        nn.init.kaiming_uniform_(self.share_weightconv2, nonlinearity="relu")

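        # The same bias tensor is passed to both shared convs below, so bias=True
        # only works when c_ == c2 (e.g. e=1.0); the default bias=False avoids this.
        if bias:
            self.bias = nn.Parameter(torch.Tensor(c2))
        else:
            self.bias = None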

        if self.bias is not None:
            nn.init.constant_(self.bias, 0)

    def forward_for_small(self, x):
        residual = x
        out = nn.functional.conv2d(x, self.share_weightconv1, bias=self.bias)
        out = self.bn1(out)
        out = self.act(out)

        out = nn.functional.conv2d(out, self.share_weightconv2, bias=self.bias, stride=self.stride,
                                   padding=self.padding[0],
                                   dilation=self.dilate[0])
        out = self.bn2(out)
        out += residual
        out = self.act(out)

        return out

    def forward_for_middle(self, x):
        residual = x
        out = nn.functional.conv2d(x, self.share_weightconv1, bias=self.bias)
        out = self.bn1(out)
        out = self.act(out)

        out = nn.functional.conv2d(out, self.share_weightconv2, bias=self.bias, stride=self.stride,
                                   padding=self.padding[1],
                                   dilation=self.dilate[1])
        out = self.bn2(out)
        out += residual
        out = self.act(out)

        return out

    def forward_for_big(self, x):
        residual = x
        out = nn.functional.conv2d(x, self.share_weightconv1, bias=self.bias)
        out = self.bn1(out)
        out = self.act(out)

        out = nn.functional.conv2d(out, self.share_weightconv2, bias=self.bias, stride=self.stride,
                                   padding=self.padding[2],
                                   dilation=self.dilate[2])
        out = self.bn2(out)
        out += residual
        out = self.act(out)

        return out

    def forward(self, x):
        xm = x
        base_feat = []
        if self.c is not False:
            x1 = self.forward_for_small(x)
            x2 = self.forward_for_middle(x)
            x3 = self.forward_for_big(x)
        else:
            x1 = self.forward_for_small(xm[0])
            x2 = self.forward_for_middle(xm[1])
            x3 = self.forward_for_big(xm[2])

        base_feat.append(x1)
        base_feat.append(x2)
        base_feat.append(x3)

        return base_feat


class RFEM(nn.Module):
    def __init__(self, c1, c2, n=1, e=0.5, stride=1):
        super(RFEM, self).__init__()
        c = True
        layers = []
        layers.append(TridentBlock(c1, c2, stride=stride, c=c, e=e))
        c1 = c2
        for i in range(1, n):
            layers.append(TridentBlock(c1, c2))
        self.layer = nn.Sequential(*layers)
        # self.cv = Conv(c2, c2)
        self.bn = nn.BatchNorm2d(c2)
        # self.act = nn.SiLU()
        self.act = Conv.default_act

    def forward(self, x):
        out = self.layer(x)
        out = out[0] + out[1] + out[2] + x
        out = self.act(self.bn(out))
        return out


class C3RFEM(nn.Module):
    def __init__(self, c1, c2, n=1, shortcut=True, e=0.5):  # ch_in, ch_out, number, shortcut, expansion
        super().__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c1, c_, 1, 1)
        self.cv3 = Conv(2 * c_, c2, 1)  # act=FReLU(c2)
        # self.m = nn.Sequential(*(Bottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)))
        # self.rfem = RFEM(c_, c_, n)
        self.m = nn.Sequential(*[RFEM(c_, c_, n=1, e=e) for _ in range(n)])
        # self.m = nn.Sequential(*[CrossConv(c_, c_, 3, 1, g, 1.0, shortcut) for _ in range(n)])

    def forward(self, x):
        return self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), dim=1))

Of these, C3RFEM corresponds to the original authors' implementation.
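The parameter saving comes from weight sharing: TridentBlock creates one 1x1 and one 3x3 weight tensor and reuses them in all three dilation branches. A rough count, convolution weights only (BatchNorm ignored), can be sketched as follows. The function name is hypothetical, and the "unshared" figure is a comparison I am adding for illustration, not a number from the paper.

```python
def trident_conv_params(c1, c2, e=0.5, branches=3, shared=True):
    """Count conv weights in a TridentBlock-style layer.

    One branch owns a (c_, c1, 1, 1) and a (c2, c_, 3, 3) weight tensor,
    with c_ = int(c2 * e) as in the TridentBlock constructor.
    """
    c_ = int(c2 * e)
    per_branch = c_ * c1 * 1 * 1 + c2 * c_ * 3 * 3
    return per_branch if shared else branches * per_branch


shared = trident_conv_params(256, 256, shared=True)
unshared = trident_conv_params(256, 256, shared=False)
print(shared, unshared)
```

For the 256-channel RFEM used in the yolov7-tiny config this gives 327,680 shared weights against 983,040 if each branch had its own copy.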

In yolo.py, add RFEM and C3RFEM to the module sets:

        n = n_ = max(round(n * gd), 1) if n > 1 else n  # depth gain
        if m in {
            Conv, GhostConv, Bottleneck, GhostBottleneck, SPP, SPPF, DWConv, MixConv2d, Focus, CrossConv,
            BottleneckCSP, C3, C3TR, C3SPP, C3Ghost, nn.ConvTranspose2d, DWConvTranspose2d, C3x, StemBlock,
            BlazeBlock, DoubleBlazeBlock, ShuffleV2Block, MobileBottleneck, InvertedResidual, ConvBNReLU,
            RepVGGBlock, SEBlock, RepBlock, SimCSPSPPF, C3_P, SPPCSPC, RepConv, RFEM, C3RFEM}:
            c1, c2 = ch[f], args[0]
            if c2 != no:  # if not output
                c2 = make_divisible(c2 * gw, 8)
            if m == InvertedResidual:
                c2 = make_divisible(c2 * gw, 4 if gw == 0.1 else 8)
                
            args = [c1, c2, *args[1:]]
            if m in {BottleneckCSP, C3, C3TR, C3Ghost, C3x, C3_P, C3RFEM}:
                args.insert(2, n)  # number of repeats
                n = 1

Run yolo.py to check that the model builds.
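To see what the yolo.py change does with the YAML arguments, here is a simplified, standalone imitation of that branch. It is not the real parse_model: it skips the `c2 != no` and InvertedResidual special cases, and `build_args` and `REPEAT_AWARE` are names I made up for the sketch.

```python
import math


def make_divisible(x, divisor):
    """Round x up to the nearest multiple of divisor."""
    return math.ceil(x / divisor) * divisor


# Modules that take the repeat count n as a constructor argument (subset).
REPEAT_AWARE = {"C3", "C3RFEM"}


def build_args(module, c1, args, n, gd, gw):
    """Turn a YAML row into constructor args, mimicking the snippet above."""
    n = max(round(n * gd), 1) if n > 1 else n  # depth gain
    c2 = make_divisible(args[0] * gw, 8)       # width gain
    out = [c1, c2, *args[1:]]
    if module in REPEAT_AWARE:
        out.insert(2, n)                       # number of repeats
    return out


# yolov7-tiny row: [-1, 1, RFEM, [256]] with both multiples at 1.0
print(build_args("RFEM", 256, [256], n=1, gd=1.0, gw=1.0))

# yolov5s row: [-1, 3, C3RFEM, [1024, False]] with gd=0.33, gw=0.50;
# the preceding SPPF outputs 512 channels after width scaling.
print(build_args("C3RFEM", 512, [1024, False], n=3, gd=0.33, gw=0.50))
```

So RFEM is constructed as RFEM(256, 256), and C3RFEM as C3RFEM(512, 512, 1, False), which is why C3RFEM has to be in the second set as well: without it the `False` would land in the `n` slot.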

2. yolov5

Create the config file yolov5s-RFEM.yaml:

# YOLOv5 🚀 by Ultralytics, GPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# YOLOv5 v6.0 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 3, C3, [1024]],
   [-1, 1, SPPF, [1024, 5]],  # 9
   [-1, 3, C3RFEM, [1024, False]],  # 10
  ]

# YOLOv5 v6.0 head
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 14

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 18 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 15], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3, [512, False]],  # 21 (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 11], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [1024, False]],  # 24 (P5/32-large)

   [[18, 21, 24], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
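Because C3RFEM is added as layer 10, the absolute indices in the head are easy to get wrong. A quick way to check them without loading the model is to number the layers and resolve each `from` entry. The sketch below does that with (from, module) pairs transcribed by hand from the config above; `resolve` and `describe` are hypothetical helper names.

```python
# (from, module) pairs copied from yolov5s-RFEM.yaml, backbone then head.
LAYERS = [
    (-1, "Conv"), (-1, "Conv"), (-1, "C3"), (-1, "Conv"), (-1, "C3"),
    (-1, "Conv"), (-1, "C3"), (-1, "Conv"), (-1, "C3"), (-1, "SPPF"),
    (-1, "C3RFEM"),
    (-1, "Conv"), (-1, "nn.Upsample"), ([-1, 6], "Concat"), (-1, "C3"),
    (-1, "Conv"), (-1, "nn.Upsample"), ([-1, 4], "Concat"), (-1, "C3"),
    (-1, "Conv"), ([-1, 15], "Concat"), (-1, "C3"),
    (-1, "Conv"), ([-1, 11], "Concat"), (-1, "C3"),
    ([18, 21, 24], "Detect"),
]


def resolve(i, f):
    """Convert the from-field of layer i into absolute layer indices."""
    froms = f if isinstance(f, list) else [f]
    return [i + x if x < 0 else x for x in froms]


def describe(layers):
    """Return (index, module, [(source index, source module), ...]) per layer."""
    rows = []
    for i, (f, m) in enumerate(layers):
        sources = [(s, layers[s][1] if s >= 0 else "input") for s in resolve(i, f)]
        rows.append((i, m, sources))
    return rows


for i, m, sources in describe(LAYERS):
    if m in ("Concat", "Detect"):
        print(i, m, sources)
```

Detect should list three C3 layers (18, 21, 24), and the two lower Concat layers should pair a stride-2 Conv with the head Conv at 15 and 11 respectively.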

The code changes are the same as above.

Run yolo.py to check that the model builds.
