Integrating GhostModule V3 into RT-DETR, and Related Improvement Ideas



RT-DETR usage tutorial: see the separate RT-DETR usage tutorial post.

RT-DETR improvement round-up: see the RT-DETR update summary post.


《GhostNetV3: Exploring the Training Strategies for Compact Models》

1. Module Introduction

        Paper link: https://arxiv.org/pdf/2404.11202v1

        Code link: https://github.com/huawei-noah/Efficient-AI-Backbones/....

Paper overview:

         Compact neural networks are designed for applications on edge devices, offering faster inference with moderate performance. However, current training strategies for compact models are borrowed from those of conventional models, ignoring the difference in model capacity and thus potentially limiting the performance of compact models. In this paper, by systematically studying the effect of different training ingredients, the authors introduce a strong training strategy tailored to compact models. They find that appropriate designs of re-parameterization and knowledge distillation are crucial for training high-performance compact models, while some data augmentations commonly used for conventional models, such as Mixup and CutMix, actually degrade performance. Experiments on ImageNet-1K show that this specialized training strategy applies to various compact architectures, including GhostNetV2, MobileNetV2, and ShuffleNetV2. Specifically, equipped with this strategy, GhostNetV3 1.3× achieves 79.1% top-1 accuracy with only 269M FLOPs and a latency of 14.46 ms on mobile devices, greatly surpassing its conventionally trained counterpart. The observations also extend to object detection scenarios.

Summary: GhostNetV3 pairs the Ghost module with training-time re-parameterization (plus knowledge distillation during training); the GhostModuleV3 block used below is its re-parameterizable building unit.


2. Adding the Module to RT-DETR

2.1 Create the Script File

        First, create a blocks.py script under the ultralytics->nn path to hold the module code.

2.2 Copy the Code

        Copy the following code into the newly created blocks.py script, as shown below:

import torch
import torch.nn as nn
import torch.nn.functional as F
import math

from typing import Tuple


def _make_divisible(v, divisor, min_value=None):
    """
    This function is taken from the original tf repo.
    It ensures that all layers have a channel number that is divisible by 8
    It can be seen here:
    https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet/mobilenet.py
    """
    if min_value is None:
        min_value = divisor
    new_v = max(min_value, int(v + divisor / 2) // divisor * divisor)
    # Make sure that round down does not go down by more than 10%.
    if new_v < 0.9 * v:
        new_v += divisor
    return new_v


def hard_sigmoid(x, inplace: bool = False):
    if inplace:
        return x.add_(3.).clamp_(0., 6.).div_(6.)
    else:
        return F.relu6(x + 3.) / 6.


class SqueezeExcite(nn.Module):
    def __init__(self, in_chs, se_ratio=0.25, reduced_base_chs=None,
                 act_layer=nn.ReLU, gate_fn=hard_sigmoid, divisor=4, **_):
        super(SqueezeExcite, self).__init__()
        self.gate_fn = gate_fn
        reduced_chs = _make_divisible((reduced_base_chs or in_chs) * se_ratio, divisor)
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.conv_reduce = nn.Conv2d(in_chs, reduced_chs, 1, bias=True)
        self.act1 = act_layer(inplace=True)
        self.conv_expand = nn.Conv2d(reduced_chs, in_chs, 1, bias=True)

    def forward(self, x):
        x_se = self.avg_pool(x)
        x_se = self.conv_reduce(x_se)
        x_se = self.act1(x_se)
        x_se = self.conv_expand(x_se)
        x = x * self.gate_fn(x_se)
        return x


class ConvBnAct(nn.Module):
    def __init__(self, in_chs, out_chs, kernel_size,
                 stride=1, act_layer=nn.ReLU):
        super(ConvBnAct, self).__init__()
        self.conv = nn.Conv2d(in_chs, out_chs, kernel_size, stride, kernel_size // 2, bias=False)
        self.bn1 = nn.BatchNorm2d(out_chs)
        self.act1 = act_layer(inplace=True)

    def forward(self, x):
        x = self.conv(x)
        x = self.bn1(x)
        x = self.act1(x)
        return x


def gcd(a, b):
    if a < b:
        a, b = b, a
    while (a % b != 0):
        c = a % b
        a = b
        b = c
    return b


def MyNorm(dim):
    return nn.GroupNorm(1, dim)


class GhostModuleV3(nn.Module):
    def __init__(self, inp, oup, kernel_size=1, ratio=2, dw_size=3, stride=1, relu=True, mode='ori', args=None):
        super(GhostModuleV3, self).__init__()
        # self.args=args
        # mode = 'ori_shortcut_mul_conv15'
        self.mode = mode
        self.gate_loc = 'before'

        self.inter_mode = 'nearest'
        self.scale = 1.0

        self.infer_mode = False
        self.num_conv_branches = 3
        self.dconv_scale = True
        self.gate_fn = nn.Sigmoid()

        # if args.gate_fn=='hard_sigmoid':
        #     self.gate_fn=hard_sigmoid
        # elif args.gate_fn=='sigmoid':
        #     self.gate_fn=nn.Sigmoid()
        # elif args.gate_fn=='relu':
        #     self.gate_fn=nn.ReLU()
        # elif args.gate_fn=='clip':
        #     self.gate_fn=myclip
        # elif args.gate_fn=='tanh':
        #     self.gate_fn=nn.Tanh()

        if self.mode in ['ori']:
            self.oup = oup
            init_channels = math.ceil(oup / ratio)
            new_channels = init_channels * (ratio - 1)
            if self.infer_mode:
                self.primary_conv = nn.Sequential(
                    nn.Conv2d(inp, init_channels, kernel_size, stride, kernel_size // 2, bias=False),
                    nn.BatchNorm2d(init_channels),
                    nn.ReLU(inplace=True) if relu else nn.Sequential(),
                )
                self.cheap_operation = nn.Sequential(
                    nn.Conv2d(init_channels, new_channels, dw_size, 1, dw_size // 2, groups=init_channels, bias=False),
                    nn.BatchNorm2d(new_channels),
                    nn.ReLU(inplace=True) if relu else nn.Sequential(),
                )
            else:
                self.primary_rpr_skip = nn.BatchNorm2d(inp) \
                    if inp == init_channels and stride == 1 else None
                primary_rpr_conv = list()
                for _ in range(self.num_conv_branches):
                    primary_rpr_conv.append(
                        self._conv_bn(inp, init_channels, kernel_size, stride, kernel_size // 2, bias=False))
                self.primary_rpr_conv = nn.ModuleList(primary_rpr_conv)
                # Re-parameterizable scale branch
                self.primary_rpr_scale = None
                if kernel_size > 1:
                    self.primary_rpr_scale = self._conv_bn(inp, init_channels, 1, 1, 0, bias=False)
                self.primary_activation = nn.ReLU(inplace=True) if relu else None

                self.cheap_rpr_skip = nn.BatchNorm2d(init_channels) \
                    if init_channels == new_channels else None
                cheap_rpr_conv = list()
                for _ in range(self.num_conv_branches):
                    cheap_rpr_conv.append(
                        self._conv_bn(init_channels, new_channels, dw_size, 1, dw_size // 2, groups=init_channels,
                                      bias=False))
                self.cheap_rpr_conv = nn.ModuleList(cheap_rpr_conv)
                # Re-parameterizable scale branch
                self.cheap_rpr_scale = None
                if dw_size > 1:
                    self.cheap_rpr_scale = self._conv_bn(init_channels, new_channels, 1, 1, 0, groups=init_channels,
                                                         bias=False)
                self.cheap_activation = nn.ReLU(inplace=True) if relu else None
                self.in_channels = init_channels
                self.groups = init_channels
                self.kernel_size = dw_size

        elif self.mode in ['ori_shortcut_mul_conv15']:
            self.oup = oup
            init_channels = math.ceil(oup / ratio)
            new_channels = init_channels * (ratio - 1)
            self.short_conv = nn.Sequential(
                nn.Conv2d(inp, oup, kernel_size, stride, kernel_size // 2, bias=False),
                nn.BatchNorm2d(oup),
                nn.Conv2d(oup, oup, kernel_size=(1, 5), stride=1, padding=(0, 2), groups=oup, bias=False),
                nn.BatchNorm2d(oup),
                nn.Conv2d(oup, oup, kernel_size=(5, 1), stride=1, padding=(2, 0), groups=oup, bias=False),
                nn.BatchNorm2d(oup),
            )
            if self.infer_mode:
                self.primary_conv = nn.Sequential(
                    nn.Conv2d(inp, init_channels, kernel_size, stride, kernel_size // 2, bias=False),
                    nn.BatchNorm2d(init_channels),
                    nn.ReLU(inplace=True) if relu else nn.Sequential(),
                )
                self.cheap_operation = nn.Sequential(
                    nn.Conv2d(init_channels, new_channels, dw_size, 1, dw_size // 2, groups=init_channels, bias=False),
                    nn.BatchNorm2d(new_channels),
                    nn.ReLU(inplace=True) if relu else nn.Sequential(),
                )
            else:
                self.primary_rpr_skip = nn.BatchNorm2d(inp) \
                    if inp == init_channels and stride == 1 else None
                primary_rpr_conv = list()
                for _ in range(self.num_conv_branches):
                    primary_rpr_conv.append(
                        self._conv_bn(inp, init_channels, kernel_size, stride, kernel_size // 2, bias=False))
                self.primary_rpr_conv = nn.ModuleList(primary_rpr_conv)
                # Re-parameterizable scale branch
                self.primary_rpr_scale = None
                if kernel_size > 1:
                    self.primary_rpr_scale = self._conv_bn(inp, init_channels, 1, 1, 0, bias=False)
                self.primary_activation = nn.ReLU(inplace=True) if relu else None

                self.cheap_rpr_skip = nn.BatchNorm2d(init_channels) \
                    if init_channels == new_channels else None
                cheap_rpr_conv = list()
                for _ in range(self.num_conv_branches):
                    cheap_rpr_conv.append(
                        self._conv_bn(init_channels, new_channels, dw_size, 1, dw_size // 2, groups=init_channels,
                                      bias=False))
                self.cheap_rpr_conv = nn.ModuleList(cheap_rpr_conv)
                # Re-parameterizable scale branch
                self.cheap_rpr_scale = None
                if dw_size > 1:
                    self.cheap_rpr_scale = self._conv_bn(init_channels, new_channels, 1, 1, 0, groups=init_channels,
                                                         bias=False)
                self.cheap_activation = nn.ReLU(inplace=True) if relu else None
                self.in_channels = init_channels
                self.groups = init_channels
                self.kernel_size = dw_size

    def forward(self, x):
        if self.mode in ['ori']:
            if self.infer_mode:
                x1 = self.primary_conv(x)
                x2 = self.cheap_operation(x1)
            else:
                identity_out = 0
                if self.primary_rpr_skip is not None:
                    identity_out = self.primary_rpr_skip(x)
                scale_out = 0
                if self.primary_rpr_scale is not None and self.dconv_scale:
                    scale_out = self.primary_rpr_scale(x)
                x1 = scale_out + identity_out
                for ix in range(self.num_conv_branches):
                    x1 += self.primary_rpr_conv[ix](x)
                if self.primary_activation is not None:
                    x1 = self.primary_activation(x1)

                cheap_identity_out = 0
                if self.cheap_rpr_skip is not None:
                    cheap_identity_out = self.cheap_rpr_skip(x1)
                cheap_scale_out = 0
                if self.cheap_rpr_scale is not None and self.dconv_scale:
                    cheap_scale_out = self.cheap_rpr_scale(x1)
                x2 = cheap_scale_out + cheap_identity_out
                for ix in range(self.num_conv_branches):
                    x2 += self.cheap_rpr_conv[ix](x1)
                if self.cheap_activation is not None:
                    x2 = self.cheap_activation(x2)

            out = torch.cat([x1, x2], dim=1)
            return out

        elif self.mode in ['ori_shortcut_mul_conv15']:
            res = self.short_conv(F.avg_pool2d(x, kernel_size=2, stride=2))

            if self.infer_mode:
                x1 = self.primary_conv(x)
                x2 = self.cheap_operation(x1)
            else:
                identity_out = 0
                if self.primary_rpr_skip is not None:
                    identity_out = self.primary_rpr_skip(x)
                scale_out = 0
                if self.primary_rpr_scale is not None and self.dconv_scale:
                    scale_out = self.primary_rpr_scale(x)
                x1 = scale_out + identity_out
                for ix in range(self.num_conv_branches):
                    x1 += self.primary_rpr_conv[ix](x)
                if self.primary_activation is not None:
                    x1 = self.primary_activation(x1)

                cheap_identity_out = 0
                if self.cheap_rpr_skip is not None:
                    cheap_identity_out = self.cheap_rpr_skip(x1)
                cheap_scale_out = 0
                if self.cheap_rpr_scale is not None and self.dconv_scale:
                    cheap_scale_out = self.cheap_rpr_scale(x1)
                x2 = cheap_scale_out + cheap_identity_out
                for ix in range(self.num_conv_branches):
                    x2 += self.cheap_rpr_conv[ix](x1)
                if self.cheap_activation is not None:
                    x2 = self.cheap_activation(x2)

            out = torch.cat([x1, x2], dim=1)

            if self.gate_loc == 'before':
                return out[:, :self.oup, :, :] * F.interpolate(self.gate_fn(res / self.scale), size=out.shape[-2:],
                                                               mode=self.inter_mode)  # 'nearest'
            #                 return out*F.interpolate(self.gate_fn(res/self.scale),size=out.shape[-1].item(),mode=self.inter_mode) # 'nearest'
            else:
                return out[:, :self.oup, :, :] * self.gate_fn(
                    F.interpolate(res, size=out.shape[-2:], mode=self.inter_mode))
            #                 return out*self.gate_fn(F.interpolate(res,size=out.shape[-1],mode=self.inter_mode))

    def reparameterize(self):
        """ Following works like `RepVGG: Making VGG-style ConvNets Great Again` -
        https://arxiv.org/pdf/2101.03697.pdf. We re-parameterize multi-branched
        architecture used at training time to obtain a plain CNN-like structure
        for inference.
        """
        if self.infer_mode:
            return
        primary_kernel, primary_bias = self._get_kernel_bias_primary()
        self.primary_conv = nn.Conv2d(in_channels=self.primary_rpr_conv[0].conv.in_channels,
                                      out_channels=self.primary_rpr_conv[0].conv.out_channels,
                                      kernel_size=self.primary_rpr_conv[0].conv.kernel_size,
                                      stride=self.primary_rpr_conv[0].conv.stride,
                                      padding=self.primary_rpr_conv[0].conv.padding,
                                      dilation=self.primary_rpr_conv[0].conv.dilation,
                                      groups=self.primary_rpr_conv[0].conv.groups,
                                      bias=True)
        self.primary_conv.weight.data = primary_kernel
        self.primary_conv.bias.data = primary_bias
        self.primary_conv = nn.Sequential(
            self.primary_conv,
            self.primary_activation if self.primary_activation is not None else nn.Sequential()
        )

        cheap_kernel, cheap_bias = self._get_kernel_bias_cheap()
        self.cheap_operation = nn.Conv2d(in_channels=self.cheap_rpr_conv[0].conv.in_channels,
                                         out_channels=self.cheap_rpr_conv[0].conv.out_channels,
                                         kernel_size=self.cheap_rpr_conv[0].conv.kernel_size,
                                         stride=self.cheap_rpr_conv[0].conv.stride,
                                         padding=self.cheap_rpr_conv[0].conv.padding,
                                         dilation=self.cheap_rpr_conv[0].conv.dilation,
                                         groups=self.cheap_rpr_conv[0].conv.groups,
                                         bias=True)
        self.cheap_operation.weight.data = cheap_kernel
        self.cheap_operation.bias.data = cheap_bias

        self.cheap_operation = nn.Sequential(
            self.cheap_operation,
            self.cheap_activation if self.cheap_activation is not None else nn.Sequential()
        )

        # Delete un-used branches
        for para in self.parameters():
            para.detach_()
        if hasattr(self, 'primary_rpr_conv'):
            self.__delattr__('primary_rpr_conv')
        if hasattr(self, 'primary_rpr_scale'):
            self.__delattr__('primary_rpr_scale')
        if hasattr(self, 'primary_rpr_skip'):
            self.__delattr__('primary_rpr_skip')

        if hasattr(self, 'cheap_rpr_conv'):
            self.__delattr__('cheap_rpr_conv')
        if hasattr(self, 'cheap_rpr_scale'):
            self.__delattr__('cheap_rpr_scale')
        if hasattr(self, 'cheap_rpr_skip'):
            self.__delattr__('cheap_rpr_skip')

        self.infer_mode = True

    def _get_kernel_bias_primary(self) -> Tuple[torch.Tensor, torch.Tensor]:
        """ Method to obtain re-parameterized kernel and bias.
        Reference: https://github.com/DingXiaoH/RepVGG/blob/main/repvgg.py#L83

        :return: Tuple of (kernel, bias) after fusing branches.
        """
        # get weights and bias of scale branch
        kernel_scale = 0
        bias_scale = 0
        if self.primary_rpr_scale is not None:
            kernel_scale, bias_scale = self._fuse_bn_tensor(self.primary_rpr_scale)
            # Pad scale branch kernel to match conv branch kernel size.
            pad = self.kernel_size // 2
            kernel_scale = torch.nn.functional.pad(kernel_scale,
                                                   [pad, pad, pad, pad])

        # get weights and bias of skip branch
        kernel_identity = 0
        bias_identity = 0
        if self.primary_rpr_skip is not None:
            kernel_identity, bias_identity = self._fuse_bn_tensor(self.primary_rpr_skip)

        # get weights and bias of conv branches
        kernel_conv = 0
        bias_conv = 0
        for ix in range(self.num_conv_branches):
            _kernel, _bias = self._fuse_bn_tensor(self.primary_rpr_conv[ix])
            kernel_conv += _kernel
            bias_conv += _bias

        kernel_final = kernel_conv + kernel_scale + kernel_identity
        bias_final = bias_conv + bias_scale + bias_identity
        return kernel_final, bias_final

    def _get_kernel_bias_cheap(self) -> Tuple[torch.Tensor, torch.Tensor]:
        """ Method to obtain re-parameterized kernel and bias.
        Reference: https://github.com/DingXiaoH/RepVGG/blob/main/repvgg.py#L83

        :return: Tuple of (kernel, bias) after fusing branches.
        """
        # get weights and bias of scale branch
        kernel_scale = 0
        bias_scale = 0
        if self.cheap_rpr_scale is not None:
            kernel_scale, bias_scale = self._fuse_bn_tensor(self.cheap_rpr_scale)
            # Pad scale branch kernel to match conv branch kernel size.
            pad = self.kernel_size // 2
            kernel_scale = torch.nn.functional.pad(kernel_scale,
                                                   [pad, pad, pad, pad])

        # get weights and bias of skip branch
        kernel_identity = 0
        bias_identity = 0
        if self.cheap_rpr_skip is not None:
            kernel_identity, bias_identity = self._fuse_bn_tensor(self.cheap_rpr_skip)

        # get weights and bias of conv branches
        kernel_conv = 0
        bias_conv = 0
        for ix in range(self.num_conv_branches):
            _kernel, _bias = self._fuse_bn_tensor(self.cheap_rpr_conv[ix])
            kernel_conv += _kernel
            bias_conv += _bias

        kernel_final = kernel_conv + kernel_scale + kernel_identity
        bias_final = bias_conv + bias_scale + bias_identity
        return kernel_final, bias_final

    def _fuse_bn_tensor(self, branch) -> Tuple[torch.Tensor, torch.Tensor]:
        """ Method to fuse batchnorm layer with preceeding conv layer.
        Reference: https://github.com/DingXiaoH/RepVGG/blob/main/repvgg.py#L95

        :param branch:
        :return: Tuple of (kernel, bias) after fusing batchnorm.
        """
        if isinstance(branch, nn.Sequential):
            kernel = branch.conv.weight
            running_mean = branch.bn.running_mean
            running_var = branch.bn.running_var
            gamma = branch.bn.weight
            beta = branch.bn.bias
            eps = branch.bn.eps
        else:
            assert isinstance(branch, nn.BatchNorm2d)
            if not hasattr(self, 'id_tensor'):
                input_dim = self.in_channels // self.groups
                kernel_value = torch.zeros((self.in_channels,
                                            input_dim,
                                            self.kernel_size,
                                            self.kernel_size),
                                           dtype=branch.weight.dtype,
                                           device=branch.weight.device)
                for i in range(self.in_channels):
                    kernel_value[i, i % input_dim,
                                    self.kernel_size // 2,
                                    self.kernel_size // 2] = 1
                self.id_tensor = kernel_value
            kernel = self.id_tensor
            running_mean = branch.running_mean
            running_var = branch.running_var
            gamma = branch.weight
            beta = branch.bias
            eps = branch.eps
        std = (running_var + eps).sqrt()
        t = (gamma / std).reshape(-1, 1, 1, 1)
        return kernel * t, beta - running_mean * gamma / std

    def _conv_bn(self, in_channels, out_channels, kernel_size, stride, padding, groups=1, bias=False):
        """ Helper method to construct conv-batchnorm layers.

        :param kernel_size: Size of the convolution kernel.
        :param padding: Zero-padding size.
        :return: Conv-BN module.
        """
        mod_list = nn.Sequential()
        mod_list.add_module('conv', nn.Conv2d(in_channels=in_channels,
                                              out_channels=out_channels,
                                              kernel_size=kernel_size,
                                              stride=stride,
                                              padding=padding,
                                              groups=groups,
                                              bias=bias))
        mod_list.add_module('bn', nn.BatchNorm2d(out_channels))
        return mod_list
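
Optionally, you can append a quick self-check at the bottom of blocks.py to confirm that the module runs and that reparameterize() leaves the output numerically unchanged. This is a minimal sketch for verification only, not part of the original module code; the channel count and input size are arbitrary and mirror the backbone usage in section 2.4:

# Optional self-check (illustrative only): run blocks.py directly to execute it.
if __name__ == '__main__':
    m = GhostModuleV3(512, 512, kernel_size=3, mode='ori')
    m.eval()                                   # use BN running stats so the fusion check is exact
    x = torch.randn(1, 512, 20, 20)
    y_train = m(x)                             # multi-branch (training-time) structure
    m.reparameterize()                         # fold branches into plain convolutions
    y_infer = m(x)
    print(y_train.shape, (y_train - y_infer).abs().max().item())  # output shape and max deviation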

2.3 Modify the tasks.py File

       Open ultralytics->nn->tasks.py and add the following import in a blank area near the top of the script:

from ultralytics.nn.blocks import *

        Next, locate the model parsing function parse_model (around line 940 of tasks.py; the exact line may shift between code versions) and add the following parsing branch directly above the last else branch of that function:

        elif m is GhostModuleV3:
            c2 = args[0]
            args = [ch[f], *args]
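
For reference, here is how this branch resolves the backbone entry added in section 2.4 (an illustrative walk-through, not additional code to paste):

# YAML entry:              - [-1, 2, GhostModuleV3, [512, 3]]
# arrives in parse_model:  m is GhostModuleV3, args == [512, 3]
# after this branch:       c2 == 512 (output channels), args == [ch[f], 512, 3]
# constructor call:        GhostModuleV3(inp=ch[f], oup=512, kernel_size=3)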

2.4 Modify the YAML File

For background on the YAML format, see the CSDN post "YOLO series '.yaml' file explained".

       Open the rtdetr-l.yaml file under ultralytics/cfg/models/rt-detr and replace the original module as shown below. (This position only demonstrates that the module can be inserted there; its actual effect has not been evaluated. With limited time, the author only tested secondary-innovation fusions with other modules; the structure diagrams are at the end of the article and the code is updated in the group files.)

# Ultralytics YOLO 🚀, AGPL-3.0 license
# RT-DETR-l object detection model with P3-P5 outputs. For details see https://docs.ultralytics.com/models/rtdetr

# Parameters
nc: 80 # number of classes
scales: # model compound scaling constants, i.e. 'model=yolov8n-cls.yaml' will call yolov8-cls.yaml with scale 'n'
  # [depth, width, max_channels]
  l: [1.00, 1.00, 1024]

backbone:
  # [from, repeats, module, args]
  - [-1, 1, HGStem, [32, 48]] # 0-P2/4
  - [-1, 6, HGBlock, [48, 128, 3]] # stage 1

  - [-1, 1, DWConv, [128, 3, 2, 1, False]] # 2-P3/8
  - [-1, 6, HGBlock, [96, 512, 3]] # stage 2

  - [-1, 1, DWConv, [512, 3, 2, 1, False]] # 4-P3/16
  - [-1, 2, GhostModuleV3, [512, 3]] # oup, kernel_size (input channels are prepended by parse_model)
  - [-1, 6, HGBlock, [192, 1024, 5, True, True]]
  - [-1, 6, HGBlock, [192, 1024, 5, True, True]] # stage 3

  - [-1, 1, DWConv, [1024, 3, 2, 1, False]] # 8-P4/32
  - [-1, 6, HGBlock, [384, 2048, 5, True, False]] # stage 4

head:
  - [-1, 1, Conv, [256, 1, 1, None, 1, 1, False]] # 10 input_proj.2
  - [-1, 1, AIFI, [1024, 8]]
  - [-1, 1, Conv, [256, 1, 1]] # 12, Y5, lateral_convs.0

  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [7, 1, Conv, [256, 1, 1, None, 1, 1, False]] # 14 input_proj.1
  - [[-2, -1], 1, Concat, [1]]
  - [-1, 3, RepC3, [256]] # 16, fpn_blocks.0
  - [-1, 1, Conv, [256, 1, 1]] # 17, Y4, lateral_convs.1

  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [3, 1, Conv, [256, 1, 1, None, 1, 1, False]] # 19 input_proj.0
  - [[-2, -1], 1, Concat, [1]] # cat backbone P4
  - [-1, 3, RepC3, [256]] # X3 (21), fpn_blocks.1

  - [-1, 1, Conv, [256, 3, 2]] # 22, downsample_convs.0
  - [[-1, 17], 1, Concat, [1]] # cat Y4
  - [-1, 3, RepC3, [256]] # F4 (24), pan_blocks.0

  - [-1, 1, Conv, [256, 3, 2]] # 25, downsample_convs.1
  - [[-1, 12], 1, Concat, [1]] # cat Y5
  - [-1, 3, RepC3, [256]] # F5 (27), pan_blocks.1

  - [[21, 24, 27], 1, RTDETRDecoder, [nc]] # Detect(P3, P4, P5)
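
Before a full training run, it can help to confirm that the modified YAML parses and the network builds. A minimal sketch, assuming the config path shown above:

from ultralytics.models import RTDETR

# Build the model from the modified config (no weights) and print a layer summary;
# an error here usually points to the parse_model branch or the YAML entry.
model = RTDETR(model='ultralytics/cfg/models/rt-detr/rtdetr-l.yaml')
model.info()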


2.5 Create the Training Script

       Create a Train_RT.py script for training:

from ultralytics.models import RTDETR
import os
os.environ['KMP_DUPLICATE_LIB_OK'] = 'True'

if __name__ == '__main__':
    model = RTDETR(model='ultralytics/cfg/models/rt-detr/rtdetr-l.yaml')
    # model.load('yolov8n.pt')
    model.train(data='./data.yaml', epochs=2, batch=1, device='0', imgsz=640, workers=2, cache=False,
                amp=True, mosaic=False, project='runs/train', name='exp')

         Fill in the path of the modified YAML file in the training script and run it to start training.
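
GhostModuleV3 trains with its multi-branch structure; for deployment, the branches can be folded into single convolutions via the reparameterize() method defined in blocks.py. The original post does not cover this step, so the following is only a hedged sketch; the weight path is a placeholder for your own training output:

from ultralytics.models import RTDETR
from ultralytics.nn.blocks import GhostModuleV3

model = RTDETR(model='runs/train/exp/weights/best.pt')    # placeholder: your trained weights
for m in model.model.modules():
    if isinstance(m, GhostModuleV3):
        m.reparameterize()                                 # fuse training-time branches for faster inference
results = model.predict(source='ultralytics/assets/bus.jpg', imgsz=640)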

3. Related Improvement Ideas (2024/11/23 group files)

        As shown in the referenced diagrams, this module can be embedded into HGBlock, RepNCSPELAN4, RepC3, and similar blocks; the code is in the group files and the structure diagrams are provided there. The code and YAML files fusing the author's self-developed modules with this module are also in the group files. An illustrative sketch of one possible fusion is given below.
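
The following is a minimal, hypothetical sketch of such a fusion: a RepC3-style block whose RepConv bottlenecks are swapped for GhostModuleV3 in 'ori' mode. It is not the group-file code, and the class name GhostRepC3 is made up for illustration; GhostModuleV3 preserves the channel count here because the hidden width is even.

import torch
import torch.nn as nn

from ultralytics.nn.modules import Conv          # standard Conv-BN-SiLU block from ultralytics
from ultralytics.nn.blocks import GhostModuleV3  # the module added in section 2.2


class GhostRepC3(nn.Module):
    """Hypothetical RepC3-style block whose bottlenecks are GhostModuleV3 ('ori' mode)."""

    def __init__(self, c1, c2, n=3, e=1.0):
        super().__init__()
        c_ = int(c2 * e)  # hidden width; even in practice, so GhostModuleV3 keeps the channel count
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c1, c_, 1, 1)
        self.m = nn.Sequential(*(GhostModuleV3(c_, c_, kernel_size=3) for _ in range(n)))
        self.cv3 = Conv(c_, c2, 1, 1) if c_ != c2 else nn.Identity()

    def forward(self, x):
        # Two 1x1 projections, a Ghost bottleneck stack on one path, and a residual-style sum.
        return self.cv3(self.m(self.cv1(x)) + self.cv2(x))


if __name__ == '__main__':
    block = GhostRepC3(256, 256, n=3)
    print(block(torch.randn(1, 256, 40, 40)).shape)  # expected: torch.Size([1, 256, 40, 40])

To use such a block from a YAML config, it would also need its own branch in parse_model, analogous to the one added in section 2.3.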


