Implementing TensorRT inference for yolov8x-p2


Overview

The initial YOLOv8 release ships models in several sizes: n, s, m, l, and x (model scale grows in that order, controlled by the depth, width, and max_channels parameters). All of them extract image features from three levels: P3, P4, and P5.
A standard YOLOv8 detection model therefore has three output layers (P3, P4, P5). To improve small-object detection, newer releases also include a P2 variant with four output layers. The P2 level passes through fewer downsampling convolutions, so its feature map keeps a higher resolution, which favors small targets: for a 640x640 input, the P2 map is 160x160 (stride 4) versus 80x80 for P3. The backbone is unchanged; only the Neck and Head parts of the model are restructured.



yolov8-p2 yaml

# Ultralytics YOLO 🚀, AGPL-3.0 license
# YOLOv8 object detection model with P2-P5 outputs. For Usage examples see https://docs.ultralytics.com/tasks/detect

# Parameters
nc: 80  # number of classes
scales: # model compound scaling constants, i.e. 'model=yolov8n.yaml' will call yolov8.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.33, 0.25, 1024]
  s: [0.33, 0.50, 1024]
  m: [0.67, 0.75, 768]
  l: [1.00, 1.00, 512]
  x: [1.00, 1.25, 512]

# YOLOv8.0 backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]]  # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]]  # 1-P2/4
  - [-1, 3, C2f, [128, True]]
  - [-1, 1, Conv, [256, 3, 2]]  # 3-P3/8
  - [-1, 6, C2f, [256, True]]
  - [-1, 1, Conv, [512, 3, 2]]  # 5-P4/16
  - [-1, 6, C2f, [512, True]]
  - [-1, 1, Conv, [1024, 3, 2]]  # 7-P5/32
  - [-1, 3, C2f, [1024, True]]
  - [-1, 1, SPPF, [1024, 5]]  # 9

# YOLOv8.0-p2 head
head:
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 6], 1, Concat, [1]]  # cat backbone P4
  - [-1, 3, C2f, [512]]  # 12

  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 4], 1, Concat, [1]]  # cat backbone P3
  - [-1, 3, C2f, [256]]  # 15 (P3/8-small)

  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 2], 1, Concat, [1]]  # cat backbone P2
  - [-1, 3, C2f, [128]]  # 18 (P2/4-xsmall)

  - [-1, 1, Conv, [128, 3, 2]]
  - [[-1, 15], 1, Concat, [1]]  # cat head P3
  - [-1, 3, C2f, [256]]  # 21 (P3/8-small)

  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 12], 1, Concat, [1]]  # cat head P4
  - [-1, 3, C2f, [512]]  # 24 (P4/16-medium)

  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 9], 1, Concat, [1]]  # cat head P5
  - [-1, 3, C2f, [1024]]  # 27 (P5/32-large)

  - [[18, 21, 24, 27], 1, Detect, [nc]]  # Detect(P2, P3, P4, P5)

yolov8-p2 TensorRT implementation

Reference: https://github.com/wang-xinyu/tensorrtx/tree/master/yolov8

  1. Add a buildEngineYolov8x_p2 method to model.cpp.

    Backbone

    • The backbone is identical to stock YOLOv8, so it can be copied over unchanged:

          /*******************************************************************************************************
          *****************************************  YOLOV8 BACKBONE  ********************************************
          *******************************************************************************************************/
          nvinfer1::IElementWiseLayer *conv0 = convBnSiLU(network, weightMap, *data, 80, 3, 2, 1, "model.0");
          nvinfer1::IElementWiseLayer *conv1 = convBnSiLU(network, weightMap, *conv0->getOutput(0), 160, 3, 2, 1, "model.1");
          nvinfer1::IElementWiseLayer *conv2 = C2F(network, weightMap, *conv1->getOutput(0), 160, 160, 3, true, 0.5, "model.2");
          nvinfer1::IElementWiseLayer *conv3 = convBnSiLU(network, weightMap, *conv2->getOutput(0), 320, 3, 2, 1, "model.3");
          nvinfer1::IElementWiseLayer *conv4 = C2F(network, weightMap, *conv3->getOutput(0), 320, 320, 6, true, 0.5, "model.4");
          nvinfer1::IElementWiseLayer *conv5 = convBnSiLU(network, weightMap, *conv4->getOutput(0), 640, 3, 2, 1, "model.5");
          nvinfer1::IElementWiseLayer *conv6 = C2F(network, weightMap, *conv5->getOutput(0), 640, 640, 6, true, 0.5, "model.6");
          nvinfer1::IElementWiseLayer *conv7 = convBnSiLU(network, weightMap, *conv6->getOutput(0), 640, 3, 2, 1, "model.7");
          nvinfer1::IElementWiseLayer *conv8 = C2F(network, weightMap, *conv7->getOutput(0), 640, 640, 3, true, 0.5, "model.8");
          nvinfer1::IElementWiseLayer *conv9 = SPPF(network, weightMap, *conv8->getOutput(0), 640, 640, 5, "model.9");
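
    The hardcoded channel counts above (80, 160, 320, 640) are not arbitrary: they follow from the yaml's x scale (width = 1.25, max_channels = 512). Below is a minimal sketch of the derivation, with a helper name of our own; Ultralytics additionally rounds to a multiple of 8, which happens to be a no-op for these values:

        #include <algorithm>

        // Hypothetical helper mirroring how Ultralytics scales channels: the base
        // count from the yaml is capped at max_channels, then multiplied by width.
        static int scaledChannels(int ch, float width = 1.25f, int maxCh = 512) {
            return static_cast<int>(std::min(ch, maxCh) * width);
        }

        // scaledChannels(64)   == 80   -> model.0
        // scaledChannels(128)  == 160  -> model.1 / model.2
        // scaledChannels(256)  == 320  -> model.3 / model.4
        // scaledChannels(512)  == 640  -> model.5 / model.6
        // scaledChannels(1024) == 640  -> model.7..model.9 (capped at max_channels)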
      

    Head

    • The head goes from three output layers (P3, P4, P5) to four (P2, P3, P4, P5).

      HEAD
          /*******************************************************************************************************
          ******************************************  YOLOV8 HEAD  ***********************************************
          *******************************************************************************************************/
          // 3D CHW tensors (no explicit batch), so three scale entries:
          // keep channels, double H and W on each upsample
          float scale[] = {1.0, 2.0, 2.0};
          nvinfer1::IResizeLayer *upsample10 = network->addResize(*conv9->getOutput(0));
          upsample10->setResizeMode(nvinfer1::ResizeMode::kNEAREST);
          upsample10->setScales(scale, 3);
      
          nvinfer1::ITensor *inputTensor11[] = {upsample10->getOutput(0), conv6->getOutput(0)};
          nvinfer1::IConcatenationLayer *cat11 = network->addConcatenation(inputTensor11, 2);
          nvinfer1::IElementWiseLayer *conv12 = C2F(network, weightMap, *cat11->getOutput(0), 640, 640, 3, false, 0.5, "model.12");
      
          nvinfer1::IResizeLayer *upsample13 = network->addResize(*conv12->getOutput(0));
          upsample13->setResizeMode(nvinfer1::ResizeMode::kNEAREST);
          upsample13->setScales(scale, 3);
          nvinfer1::ITensor *inputTensor14[] = {upsample13->getOutput(0), conv4->getOutput(0)};
          nvinfer1::IConcatenationLayer *cat14 = network->addConcatenation(inputTensor14, 2);
          nvinfer1::IElementWiseLayer *conv15 = C2F(network, weightMap, *cat14->getOutput(0), 320, 320, 3, false, 0.5, "model.15");
      
          nvinfer1::IResizeLayer *upsample16 = network->addResize(*conv15->getOutput(0));
          upsample16->setResizeMode(nvinfer1::ResizeMode::kNEAREST);
          upsample16->setScales(scale, 3);
          nvinfer1::ITensor *inputTensor17[] = {upsample16->getOutput(0), conv2->getOutput(0)};
          nvinfer1::IConcatenationLayer *cat17 = network->addConcatenation(inputTensor17, 2);
          nvinfer1::IElementWiseLayer *conv18 = C2F(network, weightMap, *cat17->getOutput(0), 160, 160, 3, false, 0.5, "model.18");
      
          nvinfer1::IElementWiseLayer *conv19 = convBnSiLU(network, weightMap, *conv18->getOutput(0), 160, 3, 2, 1, "model.19");
          nvinfer1::ITensor *inputTensor20[] = {conv19->getOutput(0), conv15->getOutput(0)};
          nvinfer1::IConcatenationLayer *cat20 = network->addConcatenation(inputTensor20, 2);
          nvinfer1::IElementWiseLayer *conv21 = C2F(network, weightMap, *cat20->getOutput(0), 320, 320, 3, false, 0.5, "model.21");
      
          nvinfer1::IElementWiseLayer *conv22 = convBnSiLU(network, weightMap, *conv21->getOutput(0), 320, 3, 2, 1, "model.22");
          nvinfer1::ITensor *inputTensor23[] = {conv22->getOutput(0), conv12->getOutput(0)};
          nvinfer1::IConcatenationLayer *cat23 = network->addConcatenation(inputTensor23, 2);
          nvinfer1::IElementWiseLayer *conv24 = C2F(network, weightMap, *cat23->getOutput(0), 640, 640, 3, false, 0.5, "model.24");
      
          nvinfer1::IElementWiseLayer *conv25 = convBnSiLU(network, weightMap, *conv24->getOutput(0), 640, 3, 2, 1, "model.25");
          nvinfer1::ITensor *inputTensor26[] = {conv25->getOutput(0), conv9->getOutput(0)};
          nvinfer1::IConcatenationLayer *cat26 = network->addConcatenation(inputTensor26, 2);
          nvinfer1::IElementWiseLayer *conv27 = C2F(network, weightMap, *cat26->getOutput(0), 640, 640, 3, false, 0.5, "model.27");
      
      OUTPUT
          /*******************************************************************************************************
          *********************************************  YOLOV8 OUTPUT  ******************************************
          *******************************************************************************************************/
          // output0
          nvinfer1::IElementWiseLayer *conv28_cv2_0_0 = convBnSiLU(network, weightMap, *conv18->getOutput(0), 64, 3, 1, 1, "model.28.cv2.0.0");
          nvinfer1::IElementWiseLayer *conv28_cv2_0_1 = convBnSiLU(network, weightMap, *conv28_cv2_0_0->getOutput(0), 64, 3, 1, 1, "model.28.cv2.0.1");
          nvinfer1::IConvolutionLayer *conv28_cv2_0_2 = network->addConvolutionNd(*conv28_cv2_0_1->getOutput(0), 64, nvinfer1::DimsHW{1, 1}, weightMap["model.28.cv2.0.2.weight"], weightMap["model.28.cv2.0.2.bias"]);
          conv28_cv2_0_2->setStrideNd(nvinfer1::DimsHW{1, 1});
          conv28_cv2_0_2->setPaddingNd(nvinfer1::DimsHW{0, 0});
      
          nvinfer1::IElementWiseLayer *conv28_cv3_0_0 = convBnSiLU(network, weightMap, *conv18->getOutput(0), 160, 3, 1, 1, "model.28.cv3.0.0");
          nvinfer1::IElementWiseLayer *conv28_cv3_0_1 = convBnSiLU(network, weightMap, *conv28_cv3_0_0->getOutput(0), 160, 3, 1, 1, "model.28.cv3.0.1");
          nvinfer1::IConvolutionLayer *conv28_cv3_0_2 = network->addConvolutionNd(*conv28_cv3_0_1->getOutput(0), kNumClass, nvinfer1::DimsHW{1, 1}, weightMap["model.28.cv3.0.2.weight"], weightMap["model.28.cv3.0.2.bias"]);
          conv28_cv3_0_2->setStrideNd(nvinfer1::DimsHW{1, 1});
          conv28_cv3_0_2->setPaddingNd(nvinfer1::DimsHW{0, 0});
          nvinfer1::ITensor *inputTensor28_0[] = {conv28_cv2_0_2->getOutput(0), conv28_cv3_0_2->getOutput(0)};
          nvinfer1::IConcatenationLayer *cat28_0 = network->addConcatenation(inputTensor28_0, 2); // P2
      
          // output1
          nvinfer1::IElementWiseLayer *conv28_cv2_1_0 = convBnSiLU(network, weightMap, *conv21->getOutput(0), 64, 3, 1, 1, "model.28.cv2.1.0");
          nvinfer1::IElementWiseLayer *conv28_cv2_1_1 = convBnSiLU(network, weightMap, *conv28_cv2_1_0->getOutput(0), 64, 3, 1, 1, "model.28.cv2.1.1");
          nvinfer1::IConvolutionLayer *conv28_cv2_1_2 = network->addConvolutionNd(*conv28_cv2_1_1->getOutput(0), 64, nvinfer1::DimsHW{1, 1}, weightMap["model.28.cv2.1.2.weight"], weightMap["model.28.cv2.1.2.bias"]);
          conv28_cv2_1_2->setStrideNd(nvinfer1::DimsHW{1, 1});
          conv28_cv2_1_2->setPaddingNd(nvinfer1::DimsHW{0, 0});
      
          nvinfer1::IElementWiseLayer *conv28_cv3_1_0 = convBnSiLU(network, weightMap, *conv21->getOutput(0), 160, 3, 1, 1, "model.28.cv3.1.0");
          nvinfer1::IElementWiseLayer *conv28_cv3_1_1 = convBnSiLU(network, weightMap, *conv28_cv3_1_0->getOutput(0), 160, 3, 1, 1, "model.28.cv3.1.1");
          nvinfer1::IConvolutionLayer *conv28_cv3_1_2 = network->addConvolutionNd(*conv28_cv3_1_1->getOutput(0), kNumClass, nvinfer1::DimsHW{1, 1}, weightMap["model.28.cv3.1.2.weight"], weightMap["model.28.cv3.1.2.bias"]);
          conv28_cv3_1_2->setStrideNd(nvinfer1::DimsHW{1, 1});
          conv28_cv3_1_2->setPaddingNd(nvinfer1::DimsHW{0, 0});
      
          nvinfer1::ITensor *inputTensor28_1[] = {conv28_cv2_1_2->getOutput(0), conv28_cv3_1_2->getOutput(0)};
          nvinfer1::IConcatenationLayer *cat28_1 = network->addConcatenation(inputTensor28_1, 2);
      
          // output2
          nvinfer1::IElementWiseLayer *conv28_cv2_2_0 = convBnSiLU(network, weightMap, *conv24->getOutput(0), 64, 3, 1, 1, "model.28.cv2.2.0");
          nvinfer1::IElementWiseLayer *conv28_cv2_2_1 = convBnSiLU(network, weightMap, *conv28_cv2_2_0->getOutput(0), 64, 3, 1, 1, "model.28.cv2.2.1");
          nvinfer1::IConvolutionLayer *conv28_cv2_2_2 = network->addConvolutionNd(*conv28_cv2_2_1->getOutput(0), 64, nvinfer1::DimsHW{1, 1}, weightMap["model.28.cv2.2.2.weight"], weightMap["model.28.cv2.2.2.bias"]);
      
          nvinfer1::IElementWiseLayer *conv28_cv3_2_0 = convBnSiLU(network, weightMap, *conv24->getOutput(0), 160, 3, 1, 1, "model.28.cv3.2.0");
          nvinfer1::IElementWiseLayer *conv28_cv3_2_1 = convBnSiLU(network, weightMap, *conv28_cv3_2_0->getOutput(0), 160, 3, 1, 1, "model.28.cv3.2.1");
          nvinfer1::IConvolutionLayer *conv28_cv3_2_2 = network->addConvolutionNd(*conv28_cv3_2_1->getOutput(0), kNumClass, nvinfer1::DimsHW{1, 1}, weightMap["model.28.cv3.2.2.weight"], weightMap["model.28.cv3.2.2.bias"]);
      
          nvinfer1::ITensor *inputTensor28_2[] = {conv28_cv2_2_2->getOutput(0), conv28_cv3_2_2->getOutput(0)};
          nvinfer1::IConcatenationLayer *cat28_2 = network->addConcatenation(inputTensor28_2, 2);
      
          // output3
          nvinfer1::IElementWiseLayer *conv28_cv2_3_0 = convBnSiLU(network, weightMap, *conv27->getOutput(0), 64, 3, 1, 1, "model.28.cv2.3.0");
          nvinfer1::IElementWiseLayer *conv28_cv2_3_1 = convBnSiLU(network, weightMap, *conv28_cv2_3_0->getOutput(0), 64, 3, 1, 1, "model.28.cv2.3.1");
          nvinfer1::IConvolutionLayer *conv28_cv2_3_2 = network->addConvolutionNd(*conv28_cv2_3_1->getOutput(0), 64, nvinfer1::DimsHW{1, 1}, weightMap["model.28.cv2.3.2.weight"], weightMap["model.28.cv2.3.2.bias"]);
      
          nvinfer1::IElementWiseLayer *conv28_cv3_3_0 = convBnSiLU(network, weightMap, *conv27->getOutput(0), 160, 3, 1, 1, "model.28.cv3.3.0");
          nvinfer1::IElementWiseLayer *conv28_cv3_3_1 = convBnSiLU(network, weightMap, *conv28_cv3_3_0->getOutput(0), 160, 3, 1, 1, "model.28.cv3.3.1");
          nvinfer1::IConvolutionLayer *conv28_cv3_3_2 = network->addConvolutionNd(*conv28_cv3_3_1->getOutput(0), kNumClass, nvinfer1::DimsHW{1, 1}, weightMap["model.28.cv3.3.2.weight"], weightMap["model.28.cv3.3.2.bias"]);
      
          nvinfer1::ITensor *inputTensor28_3[] = {conv28_cv2_3_2->getOutput(0), conv28_cv3_3_2->getOutput(0)};
          nvinfer1::IConcatenationLayer *cat28_3 = network->addConcatenation(inputTensor28_3, 2);
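
    A quick sanity check on the channel counts in these branches: YOLOv8's DFL box head predicts each of the 4 box coordinates as a distribution over 16 bins, so the cv2 regression branch always outputs 4 x 16 = 64 channels regardless of model scale, while the cv3 classification branch outputs kNumClass channels. Each level therefore concatenates to 64 + kNumClass channels, which is exactly the reshape dimension used in the DETECT stage below.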
      
      DETECT
          /*******************************************************************************************************
          *********************************************  YOLOV8 DETECT  ******************************************
          *******************************************************************************************************/
          // P2
          nvinfer1::IShuffleLayer *shuffle28_0 = network->addShuffle(*cat28_0->getOutput(0));
          shuffle28_0->setReshapeDimensions(nvinfer1::Dims2{64 + kNumClass, (kInputH / 4) * (kInputW / 4)});
          nvinfer1::ISliceLayer *split28_0_0 = network->addSlice(*shuffle28_0->getOutput(0), nvinfer1::Dims2{0, 0}, nvinfer1::Dims2{64, (kInputH / 4) * (kInputW / 4)}, nvinfer1::Dims2{1, 1});
          nvinfer1::ISliceLayer *split28_0_1 = network->addSlice(*shuffle28_0->getOutput(0), nvinfer1::Dims2{64, 0}, nvinfer1::Dims2{kNumClass, (kInputH / 4) * (kInputW / 4)}, nvinfer1::Dims2{1, 1});
          nvinfer1::IShuffleLayer *dfl28_0 = DFL(network, weightMap, *split28_0_0->getOutput(0), 4, (kInputH / 4) * (kInputW / 4), 1, 1, 0, "model.28.dfl.conv.weight");
          nvinfer1::ITensor *inputTensor28_dfl_0[] = {dfl28_0->getOutput(0), split28_0_1->getOutput(0)};
          nvinfer1::IConcatenationLayer *cat28_dfl_0 = network->addConcatenation(inputTensor28_dfl_0, 2);
      
          // P3
          nvinfer1::IShuffleLayer *shuffle28_1 = network->addShuffle(*cat28_1->getOutput(0));
          shuffle28_1->setReshapeDimensions(nvinfer1::Dims2{64 + kNumClass, (kInputH / 8) * (kInputW / 8)});
          nvinfer1::ISliceLayer *split28_1_0 = network->addSlice(*shuffle28_1->getOutput(0), nvinfer1::Dims2{0, 0}, nvinfer1::Dims2{64, (kInputH / 8) * (kInputW / 8)}, nvinfer1::Dims2{1, 1});
          nvinfer1::ISliceLayer *split28_1_1 = network->addSlice(*shuffle28_1->getOutput(0), nvinfer1::Dims2{64, 0}, nvinfer1::Dims2{kNumClass, (kInputH / 8) * (kInputW / 8)}, nvinfer1::Dims2{1, 1});
          nvinfer1::IShuffleLayer *dfl28_1 = DFL(network, weightMap, *split28_1_0->getOutput(0), 4, (kInputH / 8) * (kInputW / 8), 1, 1, 0, "model.28.dfl.conv.weight");
          nvinfer1::ITensor *inputTensor28_dfl_1[] = {dfl28_1->getOutput(0), split28_1_1->getOutput(0)};
          nvinfer1::IConcatenationLayer *cat28_dfl_1 = network->addConcatenation(inputTensor28_dfl_1, 2);
      
          // P4
          nvinfer1::IShuffleLayer *shuffle28_2 = network->addShuffle(*cat28_2->getOutput(0));
          shuffle28_2->setReshapeDimensions(nvinfer1::Dims2{64 + kNumClass, (kInputH / 16) * (kInputW / 16)});
          nvinfer1::ISliceLayer *split28_2_0 = network->addSlice(*shuffle28_2->getOutput(0), nvinfer1::Dims2{0, 0}, nvinfer1::Dims2{64, (kInputH / 16) * (kInputW / 16)}, nvinfer1::Dims2{1, 1});
          nvinfer1::ISliceLayer *split28_2_1 = network->addSlice(*shuffle28_2->getOutput(0), nvinfer1::Dims2{64, 0}, nvinfer1::Dims2{kNumClass, (kInputH / 16) * (kInputW / 16)}, nvinfer1::Dims2{1, 1});
          nvinfer1::IShuffleLayer *dfl28_2 = DFL(network, weightMap, *split28_2_0->getOutput(0), 4, (kInputH / 16) * (kInputW / 16), 1, 1, 0, "model.28.dfl.conv.weight");
          nvinfer1::ITensor *inputTensor28_dfl_2[] = {dfl28_2->getOutput(0), split28_2_1->getOutput(0)};
          nvinfer1::IConcatenationLayer *cat28_dfl_2 = network->addConcatenation(inputTensor28_dfl_2, 2);
      
          // P5
          nvinfer1::IShuffleLayer *shuffle28_3 = network->addShuffle(*cat28_3->getOutput(0));
          shuffle28_3->setReshapeDimensions(nvinfer1::Dims2{64 + kNumClass, (kInputH / 32) * (kInputW / 32)});
          nvinfer1::ISliceLayer *split28_3_0 = network->addSlice(*shuffle28_3->getOutput(0), nvinfer1::Dims2{0, 0}, nvinfer1::Dims2{64, (kInputH / 32) * (kInputW / 32)}, nvinfer1::Dims2{1, 1});
          nvinfer1::ISliceLayer *split28_3_1 = network->addSlice(*shuffle28_3->getOutput(0), nvinfer1::Dims2{64, 0}, nvinfer1::Dims2{kNumClass, (kInputH / 32) * (kInputW / 32)}, nvinfer1::Dims2{1, 1});
          nvinfer1::IShuffleLayer *dfl28_3 = DFL(network, weightMap, *split28_3_0->getOutput(0), 4, (kInputH / 32) * (kInputW / 32), 1, 1, 0, "model.28.dfl.conv.weight");
          nvinfer1::ITensor *inputTensor28_dfl_3[] = {dfl28_3->getOutput(0), split28_3_1->getOutput(0)};
          nvinfer1::IConcatenationLayer *cat28_dfl_3 = network->addConcatenation(inputTensor28_dfl_3, 2);
      
          nvinfer1::IPluginV2Layer *yolo = addYoLoLayer(network, std::vector<nvinfer1::IConcatenationLayer *>{cat28_dfl_0, cat28_dfl_1, cat28_dfl_2, cat28_dfl_3});
          yolo->getOutput(0)->setName(kOutputTensorName);
          network->markOutput(*yolo->getOutput(0));
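
      For reference, with a 640x640 input and kNumClass = 80, the four tensors handed to the plugin work out as follows (our arithmetic, not something printed by the repo); the DFL reduces the 64 regression channels to 4 decoded box coordinates per cell:

          // Per-level tensors feeding addYoLoLayer: (4 box coords + kNumClass) rows.
          //   cat28_dfl_0: 84 x 25600   // P2, stride 4,  160x160 cells
          //   cat28_dfl_1: 84 x 6400    // P3, stride 8,   80x80 cells
          //   cat28_dfl_2: 84 x 1600    // P4, stride 16,  40x40 cells
          //   cat28_dfl_3: 84 x 400     // P5, stride 32,  20x20 cells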
      
  2. Modify the forwardGpu method in yololayer.cu:

     void YoloLayerPlugin::forwardGpu(const float *const *inputs, float *output, cudaStream_t stream, int mYoloV8netHeight, int mYoloV8NetWidth, int batchSize)
     {
         int outputElem = 1 + mMaxOutObject * sizeof(Detection) / sizeof(float);
         for (int idx = 0; idx < batchSize; ++idx)
         {
             CUDA_CHECK(cudaMemsetAsync(output + idx * outputElem, 0, sizeof(float), stream));
         }
         int numElem = 0;
         // was: int grids[3][2] = {{mYoloV8netHeight / 8, mYoloV8NetWidth / 8}, {mYoloV8netHeight / 16, mYoloV8NetWidth / 16}, {mYoloV8netHeight / 32, mYoloV8NetWidth / 32}};
         // changed: four detection levels, adding the stride-4 P2 grid
         int grids[4][2] = {{mYoloV8netHeight / 4, mYoloV8NetWidth / 4}, {mYoloV8netHeight / 8, mYoloV8NetWidth / 8}, {mYoloV8netHeight / 16, mYoloV8NetWidth / 16}, {mYoloV8netHeight / 32, mYoloV8NetWidth / 32}};
         // was: int strides[] = { 8, 16, 32 };
         int strides[] = {4, 8, 16, 32};
         // was: for (unsigned int i = 0; i < 3; i++)
         for (unsigned int i = 0; i < 4; i++)
         {
             int grid_h = grids[i][0];
             int grid_w = grids[i][1];
             int stride = strides[i];
             numElem = grid_h * grid_w * batchSize;
             if (numElem < mThreadCount)
                 mThreadCount = numElem;

             CalDetection<<<(numElem + mThreadCount - 1) / mThreadCount, mThreadCount, 0, stream>>>(inputs[i], output, numElem, mMaxOutObject, grid_h, grid_w, stride, mClassCount, outputElem);
         }
     }
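
    One caveat: the loop now hardcodes four levels, so if the same plugin binary must still serve stock three-output engines, the level count is better turned into a plugin parameter. For completeness, here is a minimal host-side sketch of reading the buffer that forwardGpu fills, inferred from the outputElem computation above (one count float followed by fixed-size records); the Detection field layout is an assumption, so check types.h in your checkout:

        #include <algorithm>

        // Hypothetical mirror of the plugin's per-detection record; verify the
        // real struct in the repo's types.h before relying on this layout.
        struct Detection {
            float bbox[4];   // cx, cy, w, h at network input resolution
            float conf;      // confidence score
            float class_id;
        };

        // output points at one batch slice of the plugin's output buffer:
        // output[0] holds the number of kept detections, the records follow.
        static void parseDetections(const float *output, int maxOutObject) {
            int count = std::min(static_cast<int>(output[0]), maxOutObject);
            const Detection *dets = reinterpret_cast<const Detection *>(output + 1);
            for (int i = 0; i < count; ++i) {
                // dets[i].bbox / dets[i].conf / dets[i].class_id -> feed to NMS
            }
        }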
    
  3. Modify serialize_engine in main.cpp to handle an extra sub_type:

      ...
      else if (sub_type == "x-p2")
      {
          serialized_engine = buildEngineYolov8x_p2(builder, config, DataType::kFLOAT, wts_name);
      }
      ...
    
  4. Following the upstream repo (https://github.com/wang-xinyu/tensorrtx/tree/master/yolov8), generate the .wts file, then serialize the engine:
    ./yolov8 -s ./weights/xxx.wts ./weights/xxx.engine x-p2

  5. Run inference to test the engine:
    ./yolov8 -d xxx.engine ../images g



END

  • No P2 pretrained weights are published on the official site, so the model has to be trained on your own dataset.
  • A model trained on your own data also requires updating the matching parameters in config.h; see the sketch after this list.
  • Everything above was typed up by hand; corrections from readers are welcome.
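
  As a rough guide, the entries below are the kind that must match your own training run. The names follow the tensorrtx yolov8 sample, but treat this as a sketch and verify against your checkout:

      // config.h -- values to align with your trained model
      const static int kNumClass = 80;        // your dataset's class count
      const static int kInputH = 640;         // training input height
      const static int kInputW = 640;         // training input width
      const static float kConfThresh = 0.5f;  // confidence threshold at inference
      const static float kNmsThresh = 0.45f;  // NMS IoU threshold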

References:

  • https://github.com/ultralytics/ultralytics/tree/main
  • https://github.com/wang-xinyu/tensorrtx/tree/master/yolov8
