[YOLOv5/v7 Improvement Series] Replacing the Neck with the Gold-YOLO Feature Fusion Network

I. Introduction

Gold-YOLO is an efficient object detection model that strengthens multi-scale feature fusion through a new Gather-and-Distribute (GD) mechanism, improving detection accuracy while preserving real-time performance. Its main features and innovations are summarized below:

Key features
  1. Gather-and-Distribute (GD) mechanism: the core contribution of Gold-YOLO. GD fuses multi-scale features through convolution and self-attention operations, improving detection performance across object sizes.
  2. Balancing speed and accuracy: Gold-YOLO achieves high detection accuracy while remaining real-time, and holds this balance across model scales.
  3. Pre-training method: Gold-YOLO is the first model in the YOLO series to adopt MAE-style pre-training, letting it benefit from unsupervised learning and further improving performance.
Innovations
  1. Design of the GD mechanism: GD consists of three main modules: the Feature Alignment Module (FAM), the Information Fusion Module (IFM), and the information injection module (Inject). FAM gathers and aligns features from different levels; IFM fuses the aligned features into global information; finally, through a simple attention operation, Inject distributes the global information back to each level, strengthening every detection branch.

    • Low-stage Gather-and-Distribute branch (Low-GD): targets small and medium objects, fusing multi-level features while preserving high-resolution information.
    • High-stage Gather-and-Distribute branch (High-GD): focuses on large-object detection, fusing multi-level features to obtain a larger receptive field.
    • Lightweight Adjacent-Layer Fusion module (LAF): strengthens information exchange between the two branches, further improving performance.
  2. Efficient information fusion: through the GD mechanism, Gold-YOLO avoids the information loss inherent to traditional FPN and PANet fusion, achieving stronger multi-scale feature fusion and solid performance on objects of all sizes.

  3. MAE-style pre-training: Gold-YOLO applies masked-autoencoder (MAE) style pre-training to the backbone, giving the model strong feature representations before supervised training, which accelerates convergence and improves final performance.

Performance evaluation

Gold-YOLO performs well on several fronts, including but not limited to:

  • Speed-accuracy balance: the Gold-YOLO variants (Gold-YOLO-N, Gold-YOLO-S, Gold-YOLO-M, etc.) deliver higher average precision (AP) while matching or exceeding competitors' speed.
  • Consistency across model scales: the small, medium, and large variants all maintain high performance, showing the mechanism is both effective and general.
  • Cross-task extensibility: beyond object detection, the GD mechanism has been applied to instance segmentation and semantic segmentation with clear gains.
Experimental results
  • Object detection: on COCO, the Gold-YOLO variants match or outperform comparable models such as YOLOv6-3.0, YOLOX, and PP-YOLOE in both speed and accuracy.
  • Instance segmentation: applying the GD mechanism to Mask R-CNN yields a clear gain on COCO instance segmentation.
  • Semantic segmentation: swapping the GD mechanism into PointRend likewise improves semantic segmentation results on Cityscapes.

In short, Gold-YOLO's GD mechanism and accompanying optimizations significantly improve detection accuracy while staying efficient, and the mechanism generalizes well beyond detection.

II. Preparation

First, create a new file named goldyolo.py under the models folder of your YOLOv5/v7 project and paste in the following code.

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

from models.common import *  # Conv and other building blocks from the YOLOv5/v7 codebase


# https://arxiv.org/pdf/2309.11331v4

def conv_bn(in_channels, out_channels, kernel_size, stride, padding, groups=1, bias=False):
    '''Basic cell for rep-style block, including conv and bn'''
    result = nn.Sequential()
    result.add_module('conv', nn.Conv2d(in_channels=in_channels, out_channels=out_channels,
                                        kernel_size=kernel_size, stride=stride, padding=padding, groups=groups,
                                        bias=bias))
    result.add_module('bn', nn.BatchNorm2d(num_features=out_channels))
    return result


class RepVGGBlock(nn.Module):
    '''RepVGGBlock is a basic rep-style block, including training and deploy status
    This code is based on https://github.com/DingXiaoH/RepVGG/blob/main/repvgg.py
    '''

    def __init__(self, in_channels, out_channels, kernel_size=3,
                 stride=1, padding=1, dilation=1, groups=1, padding_mode='zeros', deploy=False, use_se=False):
        super(RepVGGBlock, self).__init__()
        """ Initialization of the class.
        Args:
            in_channels (int): Number of channels in the input image
            out_channels (int): Number of channels produced by the convolution
            kernel_size (int or tuple): Size of the convolving kernel
            stride (int or tuple, optional): Stride of the convolution. Default: 1
            padding (int or tuple, optional): Zero-padding added to both sides of
                the input. Default: 1
            dilation (int or tuple, optional): Spacing between kernel elements. Default: 1
            groups (int, optional): Number of blocked connections from input
                channels to output channels. Default: 1
            padding_mode (string, optional): Default: 'zeros'
            deploy: Whether to be deploy status or training status. Default: False
            use_se: Whether to use se. Default: False
        """
        self.deploy = deploy
        self.groups = groups
        self.in_channels = in_channels
        self.out_channels = out_channels

        assert kernel_size == 3
        assert padding == 1

        padding_11 = padding - kernel_size // 2

        self.nonlinearity = nn.ReLU()

        if use_se:
            raise NotImplementedError("se block not supported yet")
        else:
            self.se = nn.Identity()

        if deploy:
            self.rbr_reparam = nn.Conv2d(in_channels=in_channels, out_channels=out_channels, kernel_size=kernel_size,
                                         stride=stride,
                                         padding=padding, dilation=dilation, groups=groups, bias=True,
                                         padding_mode=padding_mode)

        else:
            self.rbr_identity = nn.BatchNorm2d(
                num_features=in_channels) if out_channels == in_channels and stride == 1 else None
            self.rbr_dense = conv_bn(in_channels=in_channels, out_channels=out_channels, kernel_size=kernel_size,
                                     stride=stride, padding=padding, groups=groups)
            self.rbr_1x1 = conv_bn(in_channels=in_channels, out_channels=out_channels, kernel_size=1, stride=stride,
                                   padding=padding_11, groups=groups)

    def forward(self, inputs):
        '''Forward process'''
        if hasattr(self, 'rbr_reparam'):
            return self.nonlinearity(self.se(self.rbr_reparam(inputs)))

        if self.rbr_identity is None:
            id_out = 0
        else:
            id_out = self.rbr_identity(inputs)

        return self.nonlinearity(self.se(self.rbr_dense(inputs) + self.rbr_1x1(inputs) + id_out))

    def get_equivalent_kernel_bias(self):
        kernel3x3, bias3x3 = self._fuse_bn_tensor(self.rbr_dense)
        kernel1x1, bias1x1 = self._fuse_bn_tensor(self.rbr_1x1)
        kernelid, biasid = self._fuse_bn_tensor(self.rbr_identity)
        return kernel3x3 + self._pad_1x1_to_3x3_tensor(kernel1x1) + kernelid, bias3x3 + bias1x1 + biasid

    def _pad_1x1_to_3x3_tensor(self, kernel1x1):
        if kernel1x1 is None:
            return 0
        else:
            return torch.nn.functional.pad(kernel1x1, [1, 1, 1, 1])

    def _fuse_bn_tensor(self, branch):
        if branch is None:
            return 0, 0
        if isinstance(branch, nn.Sequential):
            kernel = branch.conv.weight
            running_mean = branch.bn.running_mean
            running_var = branch.bn.running_var
            gamma = branch.bn.weight
            beta = branch.bn.bias
            eps = branch.bn.eps
        else:
            assert isinstance(branch, nn.BatchNorm2d)
            if not hasattr(self, 'id_tensor'):
                input_dim = self.in_channels // self.groups
                kernel_value = np.zeros((self.in_channels, input_dim, 3, 3), dtype=np.float32)
                for i in range(self.in_channels):
                    kernel_value[i, i % input_dim, 1, 1] = 1
                self.id_tensor = torch.from_numpy(kernel_value).to(branch.weight.device)
            kernel = self.id_tensor
            running_mean = branch.running_mean
            running_var = branch.running_var
            gamma = branch.weight
            beta = branch.bias
            eps = branch.eps
        std = (running_var + eps).sqrt()
        t = (gamma / std).reshape(-1, 1, 1, 1)
        return kernel * t, beta - running_mean * gamma / std

    def switch_to_deploy(self):
        if hasattr(self, 'rbr_reparam'):
            return
        kernel, bias = self.get_equivalent_kernel_bias()
        self.rbr_reparam = nn.Conv2d(in_channels=self.rbr_dense.conv.in_channels,
                                     out_channels=self.rbr_dense.conv.out_channels,
                                     kernel_size=self.rbr_dense.conv.kernel_size, stride=self.rbr_dense.conv.stride,
                                     padding=self.rbr_dense.conv.padding, dilation=self.rbr_dense.conv.dilation,
                                     groups=self.rbr_dense.conv.groups, bias=True)
        self.rbr_reparam.weight.data = kernel
        self.rbr_reparam.bias.data = bias
        for para in self.parameters():
            para.detach_()
        self.__delattr__('rbr_dense')
        self.__delattr__('rbr_1x1')
        if hasattr(self, 'rbr_identity'):
            self.__delattr__('rbr_identity')
        if hasattr(self, 'id_tensor'):
            self.__delattr__('id_tensor')
        self.deploy = True


def onnx_AdaptiveAvgPool2d(x, output_size):
    # Emulate adaptive average pooling with a fixed-kernel AvgPool2d, since
    # adaptive pooling does not export cleanly to ONNX.
    stride_size = np.floor(np.array(x.shape[-2:]) / output_size).astype(np.int32)
    kernel_size = np.array(x.shape[-2:]) - (output_size - 1) * stride_size
    avg = nn.AvgPool2d(kernel_size=list(kernel_size), stride=list(stride_size))
    x = avg(x)
    return x


def get_avg_pool():
    if torch.onnx.is_in_onnx_export():
        avg_pool = onnx_AdaptiveAvgPool2d
    else:
        avg_pool = nn.functional.adaptive_avg_pool2d
    return avg_pool


class SimFusion_3in(nn.Module):
    def __init__(self, in_channel_list, out_channels):
        super().__init__()
        self.cv1 = Conv(in_channel_list[0], out_channels, act=nn.ReLU()) if in_channel_list[
                                                                                0] != out_channels else nn.Identity()
        self.cv2 = Conv(in_channel_list[1], out_channels, act=nn.ReLU()) if in_channel_list[
                                                                                1] != out_channels else nn.Identity()
        self.cv3 = Conv(in_channel_list[2], out_channels, act=nn.ReLU()) if in_channel_list[
                                                                                2] != out_channels else nn.Identity()
        self.cv_fuse = Conv(out_channels * 3, out_channels, act=nn.ReLU())
        self.downsample = nn.functional.adaptive_avg_pool2d

    def forward(self, x):
        N, C, H, W = x[1].shape
        output_size = (H, W)

        if torch.onnx.is_in_onnx_export():
            self.downsample = onnx_AdaptiveAvgPool2d
            output_size = np.array([H, W])

        x0 = self.cv1(self.downsample(x[0], output_size))
        x1 = self.cv2(x[1])
        x2 = self.cv3(F.interpolate(x[2], size=(H, W), mode='bilinear', align_corners=False))
        return self.cv_fuse(torch.cat((x0, x1, x2), dim=1))


class SimFusion_4in(nn.Module):
    def __init__(self):
        super().__init__()
        self.avg_pool = nn.functional.adaptive_avg_pool2d

    def forward(self, x):
        x_l, x_m, x_s, x_n = x
        B, C, H, W = x_s.shape
        output_size = np.array([H, W])

        if torch.onnx.is_in_onnx_export():
            self.avg_pool = onnx_AdaptiveAvgPool2d

        x_l = self.avg_pool(x_l, output_size)
        x_m = self.avg_pool(x_m, output_size)
        x_n = F.interpolate(x_n, size=(H, W), mode='bilinear', align_corners=False)

        out = torch.cat([x_l, x_m, x_s, x_n], 1)
        return out


class IFM(nn.Module):
    def __init__(self, inc, ouc, embed_dim_p=96, fuse_block_num=3) -> None:
        super().__init__()

        self.conv = nn.Sequential(
            Conv(inc, embed_dim_p),
            *[RepVGGBlock(embed_dim_p, embed_dim_p) for _ in range(fuse_block_num)],
            Conv(embed_dim_p, sum(ouc))
        )

    def forward(self, x):
        return self.conv(x)


class h_sigmoid(nn.Module):
    def __init__(self, inplace=True):
        super(h_sigmoid, self).__init__()
        self.relu = nn.ReLU6(inplace=inplace)

    def forward(self, x):
        return self.relu(x + 3) / 6


class InjectionMultiSum_Auto_pool(nn.Module):
    def __init__(
            self,
            inp: int,
            oup: int,
            global_inp: list,
            flag: int
    ) -> None:
        super().__init__()
        self.global_inp = global_inp
        self.flag = flag
        self.local_embedding = Conv(inp, oup, 1, act=False)
        self.global_embedding = Conv(global_inp[self.flag], oup, 1, act=False)
        self.global_act = Conv(global_inp[self.flag], oup, 1, act=False)
        self.act = h_sigmoid()

    def forward(self, x):
        '''
        x_l: local features
        x_g: global features
        '''
        x_l, x_g = x
        B, C, H, W = x_l.shape
        g_B, g_C, g_H, g_W = x_g.shape
        use_pool = H < g_H

        global_info = x_g.split(self.global_inp, dim=1)[self.flag]

        local_feat = self.local_embedding(x_l)

        global_act = self.global_act(global_info)
        global_feat = self.global_embedding(global_info)

        if use_pool:
            avg_pool = get_avg_pool()
            output_size = np.array([H, W])

            sig_act = avg_pool(global_act, output_size)
            global_feat = avg_pool(global_feat, output_size)

        else:
            sig_act = F.interpolate(self.act(global_act), size=(H, W), mode='bilinear', align_corners=False)
            global_feat = F.interpolate(global_feat, size=(H, W), mode='bilinear', align_corners=False)

        out = local_feat * sig_act + global_feat
        return out


def get_shape(tensor):
    shape = tensor.shape
    if torch.onnx.is_in_onnx_export():
        shape = [i.cpu().numpy() for i in shape]
    return shape


class PyramidPoolAgg(nn.Module):
    def __init__(self, inc, ouc, stride, pool_mode='torch'):
        super().__init__()
        self.stride = stride
        if pool_mode == 'torch':
            self.pool = nn.functional.adaptive_avg_pool2d
        elif pool_mode == 'onnx':
            self.pool = onnx_AdaptiveAvgPool2d
        self.conv = Conv(inc, ouc)

    def forward(self, inputs):
        B, C, H, W = get_shape(inputs[-1])
        H = (H - 1) // self.stride + 1
        W = (W - 1) // self.stride + 1

        output_size = np.array([H, W])

        if not hasattr(self, 'pool'):
            self.pool = nn.functional.adaptive_avg_pool2d

        if torch.onnx.is_in_onnx_export():
            self.pool = onnx_AdaptiveAvgPool2d

        out = [self.pool(inp, output_size) for inp in inputs]

        return self.conv(torch.cat(out, dim=1))


def drop_path(x, drop_prob: float = 0., training: bool = False):
    """Drop paths (Stochastic Depth) per sample (when applied in main path of residual blocks).
    This is the same as the DropConnect impl I created for EfficientNet, etc networks, however,
    the original name is misleading as 'Drop Connect' is a different form of dropout in a separate paper...
    See discussion: https://github.com/tensorflow/tpu/issues/494#issuecomment-532968956 ... I've opted for
    changing the layer and argument names to 'drop path' rather than mix DropConnect as a layer name and use
    'survival rate' as the argument.
    """
    if drop_prob == 0. or not training:
        return x
    keep_prob = 1 - drop_prob
    shape = (x.shape[0],) + (1,) * (x.ndim - 1)  # work with diff dim tensors, not just 2D ConvNets
    random_tensor = keep_prob + torch.rand(shape, dtype=x.dtype, device=x.device)
    random_tensor.floor_()  # binarize
    output = x.div(keep_prob) * random_tensor
    return output


class Mlp(nn.Module):
    def __init__(self, in_features, hidden_features=None, out_features=None, drop=0.):
        super().__init__()
        out_features = out_features or in_features
        hidden_features = hidden_features or in_features
        self.fc1 = Conv(in_features, hidden_features, act=False)
        self.dwconv = nn.Conv2d(hidden_features, hidden_features, 3, 1, 1, bias=True, groups=hidden_features)
        self.act = nn.ReLU6()
        self.fc2 = Conv(hidden_features, out_features, act=False)
        self.drop = nn.Dropout(drop)

    def forward(self, x):
        x = self.fc1(x)
        x = self.dwconv(x)
        x = self.act(x)
        x = self.drop(x)
        x = self.fc2(x)
        x = self.drop(x)
        return x


class DropPath(nn.Module):
    """Drop paths (Stochastic Depth) per sample  (when applied in main path of residual blocks).
    """

    def __init__(self, drop_prob=None):
        super(DropPath, self).__init__()
        self.drop_prob = drop_prob

    def forward(self, x):
        return drop_path(x, self.drop_prob, self.training)


class Attention(torch.nn.Module):
    def __init__(self, dim, key_dim, num_heads, attn_ratio=4):
        super().__init__()
        self.num_heads = num_heads
        self.scale = key_dim ** -0.5
        self.key_dim = key_dim
        self.nh_kd = nh_kd = key_dim * num_heads  # num_head key_dim
        self.d = int(attn_ratio * key_dim)
        self.dh = int(attn_ratio * key_dim) * num_heads
        self.attn_ratio = attn_ratio

        self.to_q = Conv(dim, nh_kd, 1, act=False)
        self.to_k = Conv(dim, nh_kd, 1, act=False)
        self.to_v = Conv(dim, self.dh, 1, act=False)

        self.proj = torch.nn.Sequential(nn.ReLU6(), Conv(self.dh, dim, act=False))

    def forward(self, x):  # x (B,N,C)
        B, C, H, W = get_shape(x)

        qq = self.to_q(x).reshape(B, self.num_heads, self.key_dim, H * W).permute(0, 1, 3, 2)
        kk = self.to_k(x).reshape(B, self.num_heads, self.key_dim, H * W)
        vv = self.to_v(x).reshape(B, self.num_heads, self.d, H * W).permute(0, 1, 3, 2)

        attn = torch.matmul(qq, kk)
        attn = attn.softmax(dim=-1)  # dim = k

        xx = torch.matmul(attn, vv)

        xx = xx.permute(0, 1, 3, 2).reshape(B, self.dh, H, W)
        xx = self.proj(xx)
        return xx


class top_Block(nn.Module):

    def __init__(self, dim, key_dim, num_heads, mlp_ratio=4., attn_ratio=2., drop=0.,
                 drop_path=0.):
        super().__init__()
        self.dim = dim
        self.num_heads = num_heads
        self.mlp_ratio = mlp_ratio

        self.attn = Attention(dim, key_dim=key_dim, num_heads=num_heads, attn_ratio=attn_ratio)

        # NOTE: drop path for stochastic depth, we shall see if this is better than dropout here
        self.drop_path = DropPath(drop_path) if drop_path > 0. else nn.Identity()
        mlp_hidden_dim = int(dim * mlp_ratio)
        self.mlp = Mlp(in_features=dim, hidden_features=mlp_hidden_dim, drop=drop)

    def forward(self, x1):
        x1 = x1 + self.drop_path(self.attn(x1))
        x1 = x1 + self.drop_path(self.mlp(x1))
        return x1


class TopBasicLayer(nn.Module):
    def __init__(self, embedding_dim, ouc_list, block_num=2, key_dim=8, num_heads=4,
                 mlp_ratio=4., attn_ratio=2., drop=0., attn_drop=0., drop_path=0.):
        super().__init__()
        self.block_num = block_num

        self.transformer_blocks = nn.ModuleList()
        for i in range(self.block_num):
            self.transformer_blocks.append(top_Block(
                embedding_dim, key_dim=key_dim, num_heads=num_heads,
                mlp_ratio=mlp_ratio, attn_ratio=attn_ratio,
                drop=drop, drop_path=drop_path[i] if isinstance(drop_path, list) else drop_path))
        self.conv = nn.Conv2d(embedding_dim, sum(ouc_list), 1)

    def forward(self, x):
        # token * N
        for i in range(self.block_num):
            x = self.transformer_blocks[i](x)
        return self.conv(x)


class AdvPoolFusion(nn.Module):
    def forward(self, x):
        x1, x2 = x
        if torch.onnx.is_in_onnx_export():
            self.pool = onnx_AdaptiveAvgPool2d
        else:
            self.pool = nn.functional.adaptive_avg_pool2d

        N, C, H, W = x2.shape
        output_size = np.array([H, W])
        x1 = self.pool(x1, output_size)

        return torch.cat([x1, x2], 1)
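
Before wiring these modules into parse_model, it helps to sanity-check them standalone. The snippet below is a minimal shape check, not part of the original tutorial: it assumes the code above is saved as models/goldyolo.py, that you run it from the project root (so models.common resolves), and it mirrors the channel widths of the YOLOv7-tiny config later in this post (64/128/256/256 channels at strides 4/8/16/32 for a 640x640 input).

import torch
from models.goldyolo import SimFusion_4in, IFM, InjectionMultiSum_Auto_pool

# Four backbone feature maps at strides 4/8/16/32.
feats = [torch.randn(1, c, s, s) for c, s in [(64, 160), (128, 80), (256, 40), (256, 20)]]

fuse = SimFusion_4in()               # gather all levels at the stride-16 resolution
fused = fuse(feats)
print(fused.shape)                   # torch.Size([1, 704, 40, 40]); 704 = 64+128+256+256

ifm = IFM(fused.shape[1], [64, 32])  # global information, later split into 64+32 channels
g = ifm(fused)
print(g.shape)                       # torch.Size([1, 96, 40, 40])

inj = InjectionMultiSum_Auto_pool(256, 256, [64, 32], 0)
out = inj([feats[2], g])             # inject the first 64-channel global slice into a 256-channel branch
print(out.shape)                     # torch.Size([1, 256, 40, 40])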

Next, open models/yolo.py in the YOLOv5/v7 project and add the following import near the top of the file:

from models.goldyolo import *

Then search for def parse_model(d, ch) and, inside the chain of existing module-type branches, add the following code:

        elif m is SimFusion_4in: # goldyolo
            c2 = sum(ch[x] for x in f)
        elif m is SimFusion_3in:
            c2 = args[0]
            if c2 != no:  # if not output
                c2 = make_divisible(c2 * gw, 8)
            args = [[ch[f_] for f_ in f], c2]
        elif m is IFM:
            c1 = ch[f]
            c2 = sum(args[0])
            args = [c1, *args]
        elif m is InjectionMultiSum_Auto_pool:
            c1 = ch[f[0]]
            c2 = args[0]
            args = [c1, *args]
        elif m is PyramidPoolAgg:
            c2 = args[0]
            args = [sum([ch[f_] for f_ in f]), *args]
        elif m is AdvPoolFusion:
            c2 = sum(ch[x] for x in f)
        elif m is TopBasicLayer:
            c2 = sum(args[1]) # goldyolo
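
These branches compute each module's output channel count (c2) and rebuild args into constructor arguments. make_divisible is the helper already defined in the project's utils/general.py (do not re-add it); for reference, it simply rounds a width-scaled channel count up to the nearest multiple of the divisor:

import math

def make_divisible(x, divisor):
    # As in YOLOv5/v7 utils/general.py: round x up to the nearest multiple of divisor.
    return math.ceil(x / divisor) * divisor

print(make_divisible(256 * 0.50, 8))  # 128 -> SimFusion_3in width in the yolov5s config below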

III. YOLOv7-tiny Improvements

After completing Section II, create a new file named yolov7-tiny-goldyolo.yaml under the models folder of the YOLOv7 project and paste in the following configuration.

# parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# yolov7-tiny backbone
backbone:
  # [from, number, module, args] c2, k=1, s=1, p=None, g=1, act=True
  [[-1, 1, Conv, [32, 3, 2, None, 1, nn.LeakyReLU(0.1)]],  # 0-P1/2

   [-1, 1, Conv, [64, 3, 2, None, 1, nn.LeakyReLU(0.1)]],  # 1-P2/4

   [-1, 1, Conv, [32, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [32, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [32, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [32, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 7

   [-1, 1, MP, []],  # 8-P3/8
   [-1, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [64, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [64, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [128, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 14

   [-1, 1, MP, []],  # 15-P4/16
   [-1, 1, Conv, [128, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [128, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [128, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [128, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 21

   [-1, 1, MP, []],  # 22-P5/32
   [-1, 1, Conv, [256, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [256, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [256, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [256, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [512, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 28
  ]

# yolov7-tiny head
head:
  [[-1, 1, Conv, [256, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [256, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, SP, [5]],
   [-2, 1, SP, [9]],
   [-3, 1, SP, [13]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, -7], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 37

   [[7, 14, 21, 37], 1, SimFusion_4in, []], # 38
   [-1, 1, IFM, [[64, 32]]], # 39

   [37, 1, Conv, [256, 1, 1]], # 40
   [[14, 21, -1], 1, SimFusion_3in, [256]], # 41
   [[-1, 39], 1, InjectionMultiSum_Auto_pool, [256, [64, 32], 0]], # 42

   [-1, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [64, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [64, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [128, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 48

   [21, 1, Conv, [128, 1, 1]], # 49
   [[7, 21, -1], 1, SimFusion_3in, [128]], # 50
   [[-1, 39], 1, InjectionMultiSum_Auto_pool, [128, [64, 32], 1]], # 51

   [-1, 1, Conv, [32, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [32, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [32, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [32, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 57

   [[57, 48, 37], 1, PyramidPoolAgg, [352, 2]], # 58
   [-1, 1, TopBasicLayer, [352, [64, 128]]], # 59

   [[57, 49], 1, AdvPoolFusion, []], # 60
   [[-1, 59], 1, InjectionMultiSum_Auto_pool, [128, [64, 128], 0]], # 61

   [-1, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [64, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [64, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [128, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 67

   [[-1, 40], 1, AdvPoolFusion, []], # 68
   [[-1, 59], 1, InjectionMultiSum_Auto_pool, [256, [64, 128], 1]], # 69

   [-1, 1, Conv, [128, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [128, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [128, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [128, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 75

   [57, 1, Conv, [128, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [67, 1, Conv, [256, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [75, 1, Conv, [512, 3, 1, None, 1, nn.LeakyReLU(0.1)]],

   [[76,77,78], 1, IDetect, [nc, anchors]],   # Detect(P3, P4, P5)
  ]

                 from  n    params  module                                  arguments                     
  0                -1  1       928  models.common.Conv                      [3, 32, 3, 2, None, 1, LeakyReLU(negative_slope=0.1)]
  1                -1  1     18560  models.common.Conv                      [32, 64, 3, 2, None, 1, LeakyReLU(negative_slope=0.1)]
  2                -1  1      2112  models.common.Conv                      [64, 32, 1, 1, None, 1, LeakyReLU(negative_slope=0.1)]
  3                -2  1      2112  models.common.Conv                      [64, 32, 1, 1, None, 1, LeakyReLU(negative_slope=0.1)]
  4                -1  1      9280  models.common.Conv                      [32, 32, 3, 1, None, 1, LeakyReLU(negative_slope=0.1)]
  5                -1  1      9280  models.common.Conv                      [32, 32, 3, 1, None, 1, LeakyReLU(negative_slope=0.1)]
  6  [-1, -2, -3, -4]  1         0  models.common.Concat                    [1]                           
  7                -1  1      8320  models.common.Conv                      [128, 64, 1, 1, None, 1, LeakyReLU(negative_slope=0.1)]
  8                -1  1         0  models.common.MP                        []                            
  9                -1  1      4224  models.common.Conv                      [64, 64, 1, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 10                -2  1      4224  models.common.Conv                      [64, 64, 1, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 11                -1  1     36992  models.common.Conv                      [64, 64, 3, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 12                -1  1     36992  models.common.Conv                      [64, 64, 3, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 13  [-1, -2, -3, -4]  1         0  models.common.Concat                    [1]                           
 14                -1  1     33024  models.common.Conv                      [256, 128, 1, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 15                -1  1         0  models.common.MP                        []                            
 16                -1  1     16640  models.common.Conv                      [128, 128, 1, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 17                -2  1     16640  models.common.Conv                      [128, 128, 1, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 18                -1  1    147712  models.common.Conv                      [128, 128, 3, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 19                -1  1    147712  models.common.Conv                      [128, 128, 3, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 20  [-1, -2, -3, -4]  1         0  models.common.Concat                    [1]                           
 21                -1  1    131584  models.common.Conv                      [512, 256, 1, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 22                -1  1         0  models.common.MP                        []                            
 23                -1  1     66048  models.common.Conv                      [256, 256, 1, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 24                -2  1     66048  models.common.Conv                      [256, 256, 1, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 25                -1  1    590336  models.common.Conv                      [256, 256, 3, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 26                -1  1    590336  models.common.Conv                      [256, 256, 3, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 27  [-1, -2, -3, -4]  1         0  models.common.Concat                    [1]                           
 28                -1  1    525312  models.common.Conv                      [1024, 512, 1, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 29                -1  1    131584  models.common.Conv                      [512, 256, 1, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 30                -2  1    131584  models.common.Conv                      [512, 256, 1, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 31                -1  1         0  models.common.SP                        [5]                           
 32                -2  1         0  models.common.SP                        [9]                           
 33                -3  1         0  models.common.SP                        [13]                          
 34  [-1, -2, -3, -4]  1         0  models.common.Concat                    [1]                           
 35                -1  1    262656  models.common.Conv                      [1024, 256, 1, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 36          [-1, -7]  1         0  models.common.Concat                    [1]                           
 37                -1  1    131584  models.common.Conv                      [512, 256, 1, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 38   [7, 14, 21, 37]  1         0  models.goldyolo.SimFusion_4in           []                            
 39                -1  1    355392  models.goldyolo.IFM                     [704, [64, 32]]               
 40                37  1     66048  models.common.Conv                      [256, 256, 1, 1]              
 41      [14, 21, -1]  1    230400  models.goldyolo.SimFusion_3in           [[128, 256, 256], 256]        
 42          [-1, 39]  1     99840  models.goldyolo.InjectionMultiSum_Auto_pool[256, 256, [64, 32], 0]       
 43                -1  1     16512  models.common.Conv                      [256, 64, 1, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 44                -2  1     16512  models.common.Conv                      [256, 64, 1, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 45                -1  1     36992  models.common.Conv                      [64, 64, 3, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 46                -1  1     36992  models.common.Conv                      [64, 64, 3, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 47  [-1, -2, -3, -4]  1         0  models.common.Concat                    [1]                           
 48                -1  1     33024  models.common.Conv                      [256, 128, 1, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 49                21  1     33024  models.common.Conv                      [256, 128, 1, 1]              
 50       [7, 21, -1]  1     90880  models.goldyolo.SimFusion_3in           [[64, 256, 128], 128]         
 51          [-1, 39]  1     25344  models.goldyolo.InjectionMultiSum_Auto_pool[128, 128, [64, 32], 1]       
 52                -1  1      4160  models.common.Conv                      [128, 32, 1, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 53                -2  1      4160  models.common.Conv                      [128, 32, 1, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 54                -1  1      9280  models.common.Conv                      [32, 32, 3, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 55                -1  1      9280  models.common.Conv                      [32, 32, 3, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 56  [-1, -2, -3, -4]  1         0  models.common.Concat                    [1]                           
 57                -1  1      8320  models.common.Conv                      [128, 64, 1, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 58      [57, 48, 37]  1    158400  models.goldyolo.PyramidPoolAgg          [448, 352, 2]                 
 59                -1  1   2222528  models.goldyolo.TopBasicLayer           [352, [64, 128]]              
 60          [57, 49]  1         0  models.goldyolo.AdvPoolFusion           []                            
 61          [-1, 59]  1     41728  models.goldyolo.InjectionMultiSum_Auto_pool[192, 128, [64, 128], 0]      
 62                -1  1      8320  models.common.Conv                      [128, 64, 1, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 63                -2  1      8320  models.common.Conv                      [128, 64, 1, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 64                -1  1     36992  models.common.Conv                      [64, 64, 3, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 65                -1  1     36992  models.common.Conv                      [64, 64, 3, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 66  [-1, -2, -3, -4]  1         0  models.common.Concat                    [1]                           
 67                -1  1     33024  models.common.Conv                      [256, 128, 1, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 68          [-1, 40]  1         0  models.goldyolo.AdvPoolFusion           []                            
 69          [-1, 59]  1    165376  models.goldyolo.InjectionMultiSum_Auto_pool[384, 256, [64, 128], 1]      
 70                -1  1     33024  models.common.Conv                      [256, 128, 1, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 71                -2  1     33024  models.common.Conv                      [256, 128, 1, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 72                -1  1    147712  models.common.Conv                      [128, 128, 3, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 73                -1  1    147712  models.common.Conv                      [128, 128, 3, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 74  [-1, -2, -3, -4]  1         0  models.common.Concat                    [1]                           
 75                -1  1    131584  models.common.Conv                      [512, 256, 1, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 76                57  1     73984  models.common.Conv                      [64, 128, 3, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 77                67  1    295424  models.common.Conv                      [128, 256, 3, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 78                75  1   1180672  models.common.Conv                      [256, 512, 3, 1, None, 1, LeakyReLU(negative_slope=0.1)]
 79      [76, 77, 78]  1     17132  models.yolo.IDetect                     [1, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [128, 256, 512]]

Model Summary: 443 layers, 8969932 parameters, 8969932 gradients, 16.1 GFLOPS

If running it prints the table above, the modification succeeded.
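
To reproduce the table without starting a training run, building the model directly also triggers the printout, since parse_model logs each layer as it is constructed. A minimal sketch using the stock YOLOv7 Model class (adjust the cfg path to wherever you saved the yaml; note the table above was printed from a run with nc: 1, so with the default nc: 80 the IDetect parameter count will differ):

import torch
from models.yolo import Model

model = Model('models/yolov7-tiny-goldyolo.yaml')  # prints the layer table while building
model.eval()
with torch.no_grad():
    _ = model(torch.zeros(1, 3, 640, 640))         # forward smoke test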

IV. YOLOv5s Improvements

After completing Section II, create a new file named yolov5s-goldyolo.yaml under the models folder of the YOLOv5 project and paste in the following configuration.

# YOLOv5 🚀 by Ultralytics, AGPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# YOLOv5 v6.0 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 3, C3, [1024]],
   [-1, 1, SPPF, [1024, 5]],  # 9
  ]

# YOLOv5 v6.0 head
head:
  [[[2, 4, 6, 9], 1, SimFusion_4in, []], # 10
   [-1, 1, IFM, [[64, 32]]], # 11

   [9, 1, Conv, [512, 1, 1]], # 12
   [[4, 6, -1], 1, SimFusion_3in, [512]], # 13
   [[-1, 11], 1, InjectionMultiSum_Auto_pool, [512, [64, 32], 0]], # 14
   [-1, 3, C3, [512, False]], # 15

   [6, 1, Conv, [256, 1, 1]], # 16
   [[2, 4, -1], 1, SimFusion_3in, [256]], # 17
   [[-1, 11], 1, InjectionMultiSum_Auto_pool, [256, [64, 32], 1]], # 18
   [-1, 3, C3, [256, False]], # 19

   [[19, 15, 9], 1, PyramidPoolAgg, [352, 2]], # 20
   [-1, 1, TopBasicLayer, [352, [64, 128]]], # 21

   [[19, 16], 1, AdvPoolFusion, []], # 22
   [[-1, 21], 1, InjectionMultiSum_Auto_pool, [256, [64, 128], 0]], # 23
   [-1, 3, C3, [256, False]], # 24

   [[-1, 12], 1, AdvPoolFusion, []], # 25
   [[-1, 21], 1, InjectionMultiSum_Auto_pool, [512, [64, 128], 1]], # 26
   [-1, 3, C3, [512, False]], # 27

   [[19, 24, 27], 1, Detect, [nc, anchors]] # 28
  ]

                 from  n    params  module                                  arguments                     
  0                -1  1      3520  models.common.Conv                      [3, 32, 6, 2, 2]              
  1                -1  1     18560  models.common.Conv                      [32, 64, 3, 2]                
  2                -1  1     18816  models.common.C3                        [64, 64, 1]                   
  3                -1  1     73984  models.common.Conv                      [64, 128, 3, 2]               
  4                -1  2    115712  models.common.C3                        [128, 128, 2]                 
  5                -1  1    295424  models.common.Conv                      [128, 256, 3, 2]              
  6                -1  3    625152  models.common.C3                        [256, 256, 3]                 
  7                -1  1   1180672  models.common.Conv                      [256, 512, 3, 2]              
  8                -1  1   1182720  models.common.C3                        [512, 512, 1]                 
  9                -1  1    656896  models.common.SPPF                      [512, 512, 5]                 
 10      [2, 4, 6, 9]  1         0  models.common.SimFusion_4in             []                            
 11                -1  1    379968  models.common.IFM                       [960, [64, 32]]               
 12                 9  1    131584  models.common.Conv                      [512, 256, 1, 1]              
 13        [4, 6, -1]  1    230400  models.common.SimFusion_3in             [[128, 256, 256], 256]        
 14          [-1, 11]  1    199680  models.common.InjectionMultiSum_Auto_pool[256, 512, [64, 32], 0]       
 15                -1  1    361984  models.common.C3                        [512, 256, 1, False]          
 16                 6  1     33024  models.common.Conv                      [256, 128, 1, 1]              
 17        [2, 4, -1]  1     57856  models.common.SimFusion_3in             [[64, 128, 128], 128]         
 18          [-1, 11]  1     50688  models.common.InjectionMultiSum_Auto_pool[128, 256, [64, 32], 1]       
 19                -1  1     90880  models.common.C3                        [256, 128, 1, False]          
 20       [19, 15, 9]  1    316096  models.common.PyramidPoolAgg            [896, 352, 2]                 
 21                -1  1   2222528  models.common.TopBasicLayer             [352, [64, 128]]              
 22          [19, 16]  1         0  models.common.AdvPoolFusion             []                            
 23          [-1, 21]  1     99840  models.common.InjectionMultiSum_Auto_pool[256, 256, [64, 128], 0]      
 24                -1  1     90880  models.common.C3                        [256, 128, 1, False]          
 25          [-1, 12]  1         0  models.common.AdvPoolFusion             []                            
 26          [-1, 21]  1    330752  models.common.InjectionMultiSum_Auto_pool[384, 512, [64, 128], 1]      
 27                -1  1    361984  models.common.C3                        [512, 256, 1, False]          
 28      [19, 24, 27]  1      9270  models.yolo.Detect                      [1, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [128, 128, 256]]

Model Summary: 455 layers, 9138870 parameters, 9138870 gradients, 20.1 GFLOPs

If running it prints the table above, the modification succeeded.
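
One deployment note, sketched under the assumption that the modules live in models/goldyolo.py as set up in Section II: the RepVGG blocks inside IFM keep their multi-branch training structure by default, and before export or speed benchmarking you can fold each one into a single 3x3 convolution via switch_to_deploy, leaving the outputs unchanged up to floating-point error:

import torch
from models.yolo import Model
from models.goldyolo import RepVGGBlock

model = Model('models/yolov5s-goldyolo.yaml').eval()
for m in model.modules():
    if isinstance(m, RepVGGBlock):
        m.switch_to_deploy()  # merge the 3x3 + 1x1 + identity branches into one conv
with torch.no_grad():
    _ = model(torch.zeros(1, 3, 640, 640))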

V. YOLOv5n Improvements

After completing Section II, create a new file named yolov5n-goldyolo.yaml under the models folder of the YOLOv5 project and paste in the following configuration.

# YOLOv5 🚀 by Ultralytics, AGPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.25  # layer channel multiple
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# YOLOv5 v6.0 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 3, C3, [1024]],
   [-1, 1, SPPF, [1024, 5]],  # 9
  ]

# YOLOv5 v6.0 head
head:
  [[[2, 4, 6, 9], 1, SimFusion_4in, []], # 10
   [-1, 1, IFM, [[64, 32]]], # 11

   [9, 1, Conv, [512, 1, 1]], # 12
   [[4, 6, -1], 1, SimFusion_3in, [512]], # 13
   [[-1, 11], 1, InjectionMultiSum_Auto_pool, [512, [64, 32], 0]], # 14
   [-1, 3, C3, [512, False]], # 15

   [6, 1, Conv, [256, 1, 1]], # 16
   [[2, 4, -1], 1, SimFusion_3in, [256]], # 17
   [[-1, 11], 1, InjectionMultiSum_Auto_pool, [256, [64, 32], 1]], # 18
   [-1, 3, C3, [256, False]], # 19

   [[19, 15, 9], 1, PyramidPoolAgg, [352, 2]], # 20
   [-1, 1, TopBasicLayer, [352, [64, 128]]], # 21

   [[19, 16], 1, AdvPoolFusion, []], # 22
   [[-1, 21], 1, InjectionMultiSum_Auto_pool, [256, [64, 128], 0]], # 23
   [-1, 3, C3, [256, False]], # 24

   [[-1, 12], 1, AdvPoolFusion, []], # 25
   [[-1, 21], 1, InjectionMultiSum_Auto_pool, [512, [64, 128], 1]], # 26
   [-1, 3, C3, [512, False]], # 27

   [[19, 24, 27], 1, Detect, [nc, anchors]] # 28
  ]

                 from  n    params  module                                  arguments                     
  0                -1  1      1760  models.common.Conv                      [3, 16, 6, 2, 2]              
  1                -1  1      4672  models.common.Conv                      [16, 32, 3, 2]                
  2                -1  1      4800  models.common.C3                        [32, 32, 1]                   
  3                -1  1     18560  models.common.Conv                      [32, 64, 3, 2]                
  4                -1  2     29184  models.common.C3                        [64, 64, 2]                   
  5                -1  1     73984  models.common.Conv                      [64, 128, 3, 2]               
  6                -1  3    156928  models.common.C3                        [128, 128, 3]                 
  7                -1  1    295424  models.common.Conv                      [128, 256, 3, 2]              
  8                -1  1    296448  models.common.C3                        [256, 256, 1]                 
  9                -1  1    164608  models.common.SPPF                      [256, 256, 5]                 
 10      [2, 4, 6, 9]  1         0  models.common.SimFusion_4in             []                            
 11                -1  1    333888  models.common.IFM                       [480, [64, 32]]               
 12                 9  1     33024  models.common.Conv                      [256, 128, 1, 1]              
 13        [4, 6, -1]  1     57856  models.common.SimFusion_3in             [[64, 128, 128], 128]         
 14          [-1, 11]  1    134144  models.common.InjectionMultiSum_Auto_pool[128, 512, [64, 32], 0]       
 15                -1  1    123648  models.common.C3                        [512, 128, 1, False]          
 16                 6  1      8320  models.common.Conv                      [128, 64, 1, 1]               
 17        [2, 4, -1]  1     14592  models.common.SimFusion_3in             [[32, 64, 64], 64]            
 18          [-1, 11]  1     34304  models.common.InjectionMultiSum_Auto_pool[64, 256, [64, 32], 1]        
 19                -1  1     31104  models.common.C3                        [256, 64, 1, False]           
 20       [19, 15, 9]  1    158400  models.common.PyramidPoolAgg            [448, 352, 2]                 
 21                -1  1   2222528  models.common.TopBasicLayer             [352, [64, 128]]              
 22          [19, 16]  1         0  models.common.AdvPoolFusion             []                            
 23          [-1, 21]  1     67072  models.common.InjectionMultiSum_Auto_pool[128, 256, [64, 128], 0]      
 24                -1  1     31104  models.common.C3                        [256, 64, 1, False]           
 25          [-1, 12]  1         0  models.common.AdvPoolFusion             []                            
 26          [-1, 21]  1    232448  models.common.InjectionMultiSum_Auto_pool[192, 512, [64, 128], 1]      
 27                -1  1    123648  models.common.C3                        [512, 128, 1, False]          
 28      [19, 24, 27]  1      4662  models.yolo.Detect                      [1, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [64, 64, 128]]

Model Summary: 455 layers, 4657110 parameters, 4657110 gradients, 8.3 GFLOPs

If running it prints the table above, the modification succeeded.
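
As a quick cross-check against the summary line, you can count parameters directly. A small sketch (the exact total depends on nc; the 4,657,110 above comes from a run with nc: 1):

from models.yolo import Model

model = Model('models/yolov5n-goldyolo.yaml')
print(sum(p.numel() for p in model.parameters()))  # ~4.66M for nc: 1, matching the summary above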

More articles are on the way, aiming for concision and accuracy. Feel free to follow me and discuss!
