Megvii YOLOX: Custom-Data Training, Validation, and ONNX Export/Inference

Contents

1. Introduction

2. Code

3. Environment

4. Custom Data Layout

5. Experiment Config

6. Training

7. Validation

8. Precision/Recall Evaluation (Confusion-Matrix Style)

9. Exporting to ONNX

10. ONNX Inference

-- Appendix: Docker Environment


1. Introduction

Megvii's YOLOX is a clean codebase and performs well. Its three main innovations are a decoupled head, an anchor-free design, and an advanced label-assignment strategy (SimOTA).

2. Code

As of this writing I am using the main branch; YOLOX\yolox\__init__.py reports version 0.3.0.
GitHub - Megvii-BaseDetection/YOLOX: YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO supported. Documentation: https://yolox.readthedocs.io/ - https://github.com/Megvii-BaseDetection/YOLOX

3. Environment

(1) Just follow requirements.txt; it generally installs without problems.

(2) Or set up a Docker container; see the appendix at the end of this article.

Once everything is installed, run a quick sanity check. By default the inference results are written under YOLOX_outputs, in a folder named yolox_s:

python tools/demo.py image -n yolox-s -c checkpoints/yolox_s.pth --path assets/dog.jpg --conf 0.25 --nms 0.45 --tsize 640 --save_result --device gpu

  • image — inference mode; use video for video input
  • -n — model name
  • -c — path to the weights file
  • --path — path to the test image
  • --conf — confidence threshold
  • --nms — IoU threshold for NMS
  • --tsize — test image size
  • --save_result — save the inference result
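
For example, a video run looks almost the same (the clip path below is just a placeholder):

python tools/demo.py video -n yolox-s -c checkpoints/yolox_s.pth --path assets/test.mp4 --conf 0.25 --nms 0.45 --tsize 640 --save_result --device gpu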

4. Custom Data Layout

See the dataset section of my earlier mmdetection write-up:

记录一次 mmdetection 自定义数据训练和推理 - CSDN博客: https://blog.csdn.net/hzy459176895/article/details/123690217?spm=1001.2014.3001.5502 The core steps: (1) annotate with labelme; (2) convert the labelme output into COCO-style labels; (3) reorganize everything into the layout YOLOX trains from.

You can prepare the data any other way you like; what matters is the final layout YOLOX uses for training. My custom dataset, under the my_data folder, looks as follows (you can also download coco128 and use it as a reference; its README mentions this too):

Only two parts matter: the annotations folder holds the COCO-style JSON label files, while train2017 and val2017 hold the images.

The label JSON is a standard COCO object-detection annotation file. Using instances_val2017.json as an example, it looks like this:

{"info": {"year": 2021, "version": "1.0", "description": "For object detection", "date_created": "2021"}, "images": [{"date_captured": "2021", "file_name": "000000000001.jpg", "id": 1, "height": 480, "width": 640}, {"date_captured": "2021", "file_name": "000000000002.jpg", "id": 2, "height": 426, "width": 640}, {"date_captured": "2021", "file_name": "000000000003.jpg", "id": 3, "height": 428, "width": 640}, {"date_captured": "2021", "file_name": "000000000004.jpg", "id": 4, "height": 425, "width": 640}, {"date_captured": "2021", "file_name": "000000000005.jpg", "id": 5, "height": 640, "width": 481}], "licenses": [{"id": 1, "name": "GNU General Public License v3.0", "url": "https://github.com/zhiqwang/yolov5-rt-stack/blob/master/LICENSE"}], "type": "instances", "annotations": [{"segmentation": [[1.0799999999999272, 187.69008000000002, 612.66976, 187.69008000000002, 612.66976, 473.53008000000005, 1.0799999999999272, 473.53008000000005]], "area": 174816.81699840003, "iscrowd": 0, "image_id": 1, "bbox": [1.0799999999999272, 187.69008000000002, 611.5897600000001, 285.84000000000003], "category_id": 19, "id": 1}, {"segmentation": [[311.73024, 4.310159999999996, 631.0102400000001, 4.310159999999996, 631.0102400000001, 232.99032, 311.73024, 232.99032]], "area": 73013.00148480001, "iscrowd": 0, "image_id": 1, "bbox": [311.73024, 4.310159999999996, 319.28000000000003, 228.68016], "category_id": 50, "id": 2}, {"segmentation": [[249.60032, 229.27031999999997, 565.84032, 229.27031999999997, 565.84032, 474.35015999999996, 249.60032, 474.35015999999996]], "area": 77504.04860159999, "iscrowd": 0, "image_id": 1, "bbox": [249.60032, 229.27031999999997, 316.24, 245.07984], "category_id": 70, "id": 3}, {"segmentation": [[0.00031999999998788553, 13.510079999999988, 434.48032, 13.510079999999988, 434.48032, 388.63008, 0.00031999999998788553, 388.63008]], "area": 162982.13760000002, "iscrowd": 0, "image_id": 1, "bbox": [0.00031999999998788553, 13.510079999999988, 434.48, 375.12], "category_id": 38, "id": 4}, {"segmentation": [[376.2, 40.36008, 451.75007999999997, 40.36008, 451.75007999999997, 86.88983999999999, 376.2, 86.88983999999999]], "area": 3515.3270903807993, "iscrowd": 0, "image_id": 1, "bbox": [376.2, 40.36008, 75.55008, 46.529759999999996], "category_id": 33, "id": 5}, {"segmentation": [[465.77984, 38.97, 523.8496, 38.97, 523.8496, 85.63991999999999, 465.77984, 85.63991999999999]], "area": 2710.1110536191995, "iscrowd": 0, "image_id": 1, "bbox": [465.77984, 38.97, 58.069759999999995, 46.66992], "category_id": 8, "id": 6}, {"segmentation": [[385.70016, 73.65984, 469.71999999999997, 73.65984, 469.71999999999997, 144.16992, 385.70016, 144.16992]], "area": 5924.245639987201, "iscrowd": 0, "image_id": 1, "bbox": [385.70016, 73.65984, 84.01984, 70.51008], "category_id": 62, "id": 7}, {"segmentation": [[364.0496, 2.49024, 458.80992000000003, 2.49024, 458.80992000000003, 73.56, 364.0496, 73.56]], "area": 6734.593199923201, "iscrowd": 0, "image_id": 1, "bbox": [364.0496, 2.49024, 94.76032000000001, 71.06976], "category_id": 45, "id": 8}, {"segmentation": [[385.52992, 60.030002999999994, 600.50016, 60.030002999999994, 600.50016, 357.19013700000005, 385.52992, 357.19013700000005]], "area": 63880.58532441216, "iscrowd": 0, "image_id": 2, "bbox": [385.52992, 60.030002999999994, 214.97024, 297.160134], "category_id": 71, "id": 9}, {"segmentation": [[53.01024000000001, 356.49000599999994, 185.04032, 356.49000599999994, 185.04032, 411.6800099999999, 53.01024000000001, 411.6800099999999]], "area": 7286.7406433203205, 
"iscrowd": 0, "image_id": 2, "bbox": [53.01024000000001, 356.49000599999994, 132.03008, 55.190004], "category_id": 27, "id": 10}, {"segmentation": [[204.86016, 31.019728000000015, 459.74016, 31.019728000000015, 459.74016, 355.13984800000003, 204.86016, 355.13984800000003]], "area": 82611.73618559999, "iscrowd": 0, "image_id": 3, "bbox": [204.86016, 31.019728000000015, 254.88, 324.12012], "category_id": 27, "id": 11}, {"segmentation": [[237.56032, 155.809976, 403.96032, 155.809976, 403.96032, 351.060152, 237.56032, 351.060152]], "area": 32489.6292864, "iscrowd": 0, "image_id": 3, "bbox": [237.56032, 155.809976, 166.4, 195.25017599999998], "category_id": 58, "id": 12}, {"segmentation": [[0.960000000000008, 20.060000000000002, 442.19007999999997, 20.060000000000002, 442.19007999999997, 399.21015, 0.960000000000008, 399.21015]], "area": 167292.451016512, "iscrowd": 0, "image_id": 4, "bbox": [0.960000000000008, 20.060000000000002, 441.23008, 379.15015], "category_id": 19, "id": 13}, {"segmentation": [[0, 50.11967999999999, 457.680158, 50.11967999999999, 457.680158, 480.46975999999995, 0, 480.46975999999995]], "area": 196962.69260971263, "iscrowd": 0, "image_id": 5, "bbox": [0, 50.11967999999999, 457.680158, 430.35008], "category_id": 35, "id": 14}, {"segmentation": [[167.5801595, 162.88991999999993, 478.19023849999996, 162.88991999999993, 478.19023849999996, 628.0796799999999, 167.5801595, 628.0796799999999]], "area": 144492.62810359104, "iscrowd": 0, "image_id": 5, "bbox": [167.5801595, 162.88991999999993, 310.610079, 465.18976000000004], "category_id": 57, "id": 15}], "categories": [{"id": 1, "name": "0", "supercategory": "0"}, {"id": 2, "name": "1", "supercategory": "1"}, {"id": 3, "name": "2", "supercategory": "2"}, {"id": 4, "name": "3", "supercategory": "3"}, {"id": 5, "name": "4", "supercategory": "4"}, {"id": 6, "name": "5", "supercategory": "5"}, {"id": 7, "name": "6", "supercategory": "6"}, {"id": 8, "name": "7", "supercategory": "7"}, {"id": 9, "name": "8", "supercategory": "8"}, {"id": 10, "name": "9", "supercategory": "9"}, {"id": 11, "name": "10", "supercategory": "10"}, {"id": 12, "name": "11", "supercategory": "11"}, {"id": 13, "name": "12", "supercategory": "12"}, {"id": 14, "name": "13", "supercategory": "13"}, {"id": 15, "name": "14", "supercategory": "14"}, {"id": 16, "name": "15", "supercategory": "15"}, {"id": 17, "name": "16", "supercategory": "16"}, {"id": 18, "name": "17", "supercategory": "17"}, {"id": 19, "name": "18", "supercategory": "18"}, {"id": 20, "name": "19", "supercategory": "19"}, {"id": 21, "name": "20", "supercategory": "20"}, {"id": 22, "name": "21", "supercategory": "21"}, {"id": 23, "name": "22", "supercategory": "22"}, {"id": 24, "name": "23", "supercategory": "23"}, {"id": 25, "name": "24", "supercategory": "24"}, {"id": 26, "name": "25", "supercategory": "25"}, {"id": 27, "name": "26", "supercategory": "26"}, {"id": 28, "name": "27", "supercategory": "27"}, {"id": 29, "name": "28", "supercategory": "28"}, {"id": 30, "name": "29", "supercategory": "29"}, {"id": 31, "name": "30", "supercategory": "30"}, {"id": 32, "name": "31", "supercategory": "31"}, {"id": 33, "name": "32", "supercategory": "32"}, {"id": 34, "name": "33", "supercategory": "33"}, {"id": 35, "name": "34", "supercategory": "34"}, {"id": 36, "name": "35", "supercategory": "35"}, {"id": 37, "name": "36", "supercategory": "36"}, {"id": 38, "name": "37", "supercategory": "37"}, {"id": 39, "name": "38", "supercategory": "38"}, {"id": 40, "name": "39", "supercategory": "39"}, {"id": 41, 
"name": "40", "supercategory": "40"}, {"id": 42, "name": "41", "supercategory": "41"}, {"id": 43, "name": "42", "supercategory": "42"}, {"id": 44, "name": "43", "supercategory": "43"}, {"id": 45, "name": "44", "supercategory": "44"}, {"id": 46, "name": "45", "supercategory": "45"}, {"id": 47, "name": "46", "supercategory": "46"}, {"id": 48, "name": "47", "supercategory": "47"}, {"id": 49, "name": "48", "supercategory": "48"}, {"id": 50, "name": "49", "supercategory": "49"}, {"id": 51, "name": "50", "supercategory": "50"}, {"id": 52, "name": "51", "supercategory": "51"}, {"id": 53, "name": "52", "supercategory": "52"}, {"id": 54, "name": "53", "supercategory": "53"}, {"id": 55, "name": "54", "supercategory": "54"}, {"id": 56, "name": "55", "supercategory": "55"}, {"id": 57, "name": "56", "supercategory": "56"}, {"id": 58, "name": "57", "supercategory": "57"}, {"id": 59, "name": "58", "supercategory": "58"}, {"id": 60, "name": "59", "supercategory": "59"}, {"id": 61, "name": "60", "supercategory": "60"}, {"id": 62, "name": "61", "supercategory": "61"}, {"id": 63, "name": "62", "supercategory": "62"}, {"id": 64, "name": "63", "supercategory": "63"}, {"id": 65, "name": "64", "supercategory": "64"}, {"id": 66, "name": "65", "supercategory": "65"}, {"id": 67, "name": "66", "supercategory": "66"}, {"id": 68, "name": "67", "supercategory": "67"}, {"id": 69, "name": "68", "supercategory": "68"}, {"id": 70, "name": "69", "supercategory": "69"}, {"id": 71, "name": "70", "supercategory": "70"}]}

5. Experiment Config

Copy exps\default\yolox_s.py to exps\yolox_s_me.py and change it as follows:

#!/usr/bin/env python3
# -*- coding:utf-8 -*-
# Copyright (c) Megvii, Inc. and its affiliates.

import os

from yolox.exp import Exp as MyExp

"""
自定义数据直接快速训练注意事项:
1.dataset下的annotations下,写成同coco128一致的标签名,方便直接用。 instances_train2017.json 与 instances_val2017.json
2.训练集和验证集文件夹名称也同样写成train2017和val2017[CocoDataset里直接用的这些]
3.yolox\data\datasets\coco_classes.py 下边的 COCO_CLASSES 写成本次业务数据的标签类
"""

from yolox.data.datasets.coco_classes import COCO_CLASSES  # 运行前快速抵达修改类别
DATA_DIR = "datasets/my_data"  # 本次训练业务数据

class Exp(MyExp):
    def __init__(self):
        super(Exp, self).__init__()
        self.depth = 0.33
        self.width = 0.50
        self.exp_name = os.path.split(os.path.realpath(__file__))[1].split(".")[0]

        # Define your own dataset path
        self.data_dir = DATA_DIR
        self.train_ann = "instances_train2017.json"
        self.val_ann = "instances_val2017.json"

        self.num_classes = len(COCO_CLASSES)

        self.max_epoch = 50
        self.data_num_workers = 4
        self.eval_interval = 10  # evaluate every N epochs
        self.print_interval = 10  # log every N iterations
        self.save_history_ckpt = False  # keep only the latest/best checkpoint; set True to save one per epoch

Two notes on the code above.

First, I did not bother renaming the dataset files, because the Exp base class uses the COCO label file names directly.

Second, edit COCO_CLASSES in yolox/data/datasets/coco_classes.py to hold your own class names so the exp file can load them directly, e.g. COCO_CLASSES = ('c1', 'c2').

6. Training

Copy tools\train.py to the repo root, e.g. as my_train.py, and change only the main block (leave everything else alone; just add your exp file and hyperparameters):

if __name__ == "__main__":

    # export CUDA_VISIBLE_DEVICES=1  # run this in the terminal first to train on GPU 1 (the variable name must be uppercase)

    configure_module()
    args = make_parser().parse_args()


    # Train on your own data: mainly the four lines below; results go to YOLOX_outputs/<exp file name> by default
    args.exp_file = "exps/yolox_s_me.py"  # the custom exp file from section 5
    args.devices = 1  # use one GPU (the parser's option is --devices)
    args.batch_size = 16
    args.ckpt = "checkpoints/yolox_s.pth"  # pretrained weights; download per the README (larger models such as yolox_l pair with the corresponding larger exp files)

    exp = get_exp(args.exp_file, args.name)
    exp.merge(args.opts)
    check_exp_value(exp)

    if not args.experiment_name:
        args.experiment_name = exp.exp_name

    num_gpu = get_num_devices() if args.devices is None else args.devices
    assert num_gpu <= get_num_devices()

    if args.cache is not None:
        exp.dataset = exp.get_dataset(cache=True, cache_type=args.cache)

    dist_url = "auto" if args.dist_url is None else args.dist_url
    launch(
        main,
        num_gpu,
        args.num_machines,
        args.machine_rank,
        backend=args.dist_backend,
        dist_url=dist_url,
        args=(exp, args),
    )

After training, the results are under YOLOX_outputs\yolox_s_me: the logs, the best_ckpt.pth model, and so on.
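
With those changes, training is launched from the repo root (assuming the copy was kept as my_train.py as above):

python my_train.py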

7. Validation

Copy tools/eval.py to the repo root as my_eval.py. For a quick check, leave everything else alone and just add your exp file and best_ckpt.pth in the main block; this evaluates the validation set:

if __name__ == "__main__":

    """
    用gpu方法:
    1.终端运行 export CUDA_VISIBLE_DEVICES=1  表示用gpu为1的卡
    2.代码中  args.device = 1 表示用1张卡
    """

    # Custom-data evaluation
    from yolox.data.datasets.coco_classes import COCO_CLASSES  # edit this before running so it matches your classes

    args = make_parser().parse_args()  # kept from the original eval.py main block
    args.exp_file = "exps/yolox_s_me.py"
    args.ckpt = "YOLOX_outputs/yolox_s_me/best_ckpt.pth"

    args.batch_size = 2
    args.devices = 1  # use one GPU (the parser's option is --devices)
    args.conf = 0.5
    args.nms = 0.5
    args.test = False  # no test set here; evaluate the val set only

    exp = get_exp(args.exp_file, args.name)
    exp.merge(args.opts)

    if not args.experiment_name:
        args.experiment_name = exp.exp_name

    num_gpu = torch.cuda.device_count() if args.devices is None else args.devices
    assert num_gpu <= torch.cuda.device_count()

    dist_url = "auto" if args.dist_url is None else args.dist_url
    launch(
        main,
        num_gpu,
        args.num_machines,
        args.machine_rank,
        backend=args.dist_backend,
        dist_url=dist_url,
        args=(exp, args, num_gpu),
    )
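
Then run python my_eval.py from the repo root; it should print the usual COCO-style AP summary for the validation set.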

8. Precision/Recall Evaluation (Confusion-Matrix Style)

Detection metrics are usually mAP, which is awkward to report to management; they just want to know how many targets were detected correctly and how many were not. You can convert the results into classification-style precision and recall (and a confusion-matrix-like summary). [Note: the unmodified demo.py is the image-inference tool. If args.path is a single image it infers that image; if it is a folder it infers every image in it. Results still go to YOLOX_outputs.]

Below I copied tools/demo.py to demo_metric.py and added some logic to evaluate precision and recall:

#!/usr/bin/env python3
# -*- coding:utf-8 -*-
# Copyright (c) Megvii, Inc. and its affiliates.

import argparse
import json
import os
import time

import numpy as np
from loguru import logger

import cv2

import torch

from yolox.data.data_augment import ValTransform
from yolox.exp import get_exp
from yolox.utils import fuse_model, get_model_info, postprocess, vis

IMAGE_EXT = [".jpg", ".jpeg", ".webp", ".bmp", ".png"]



# Calculate precision and recall for object detection
def calculate_precision_recall(detected_boxes, true_boxes, iou_threshold):
    """
    Calculate precision and recall for object detection
    :param detected_boxes: list of detected bounding boxes in format [xmin, ymin, xmax, ymax]
    :param true_boxes: list of true bounding boxes in format [xmin, ymin, xmax, ymax]
    :param iou_threshold: intersection over union threshold for matching detected and true boxes
    :return: precision and recall
    """
    num_true_boxes = len(true_boxes)
    true_positives = 0
    for true_box in true_boxes:
        max_iou = 0
        for detected_box in detected_boxes:
            iou = calculate_iou(detected_box, true_box)
            if iou > max_iou:
                max_iou = iou
            if max_iou >= iou_threshold:
                true_positives += 1
                break
    # clamp at 0 in case several ground-truth boxes match the same detection
    false_positives = max(len(detected_boxes) - true_positives, 0)
    false_negatives = num_true_boxes - true_positives
    # guard against empty detections / empty ground truth
    precision = true_positives / (true_positives + false_positives) if (true_positives + false_positives) > 0 else 0.0
    recall = true_positives / (true_positives + false_negatives) if (true_positives + false_negatives) > 0 else 0.0
    # print('TP: ', true_positives)
    # print('FP: ', false_positives)
    # print('FN: ', false_negatives)
    return precision, recall


def calculate_iou(box1, box2):
    """
    Calculate intersection over union (IoU) between two bounding boxes
    :param box1: bounding box in format [xmin, ymin, xmax, ymax]
    :param box2: bounding box in format [xmin, ymin, xmax, ymax]
    :return: IoU between box1 and box2
    """
    x1 = max(box1[0], box2[0])
    y1 = max(box1[1], box2[1])
    x2 = min(box1[2], box2[2])
    y2 = min(box1[3], box2[3])
    intersection = max(0, x2 - x1) * max(0, y2 - y1)
    area_box1 = (box1[2] - box1[0]) * (box1[3] - box1[1])
    area_box2 = (box2[2] - box2[0]) * (box2[3] - box2[1])
    union = area_box1 + area_box2 - intersection
    iou = intersection / union if union > 0 else 0.0  # guard against zero-area boxes
    return iou





def make_parser():
    parser = argparse.ArgumentParser("YOLOX Demo!")
    # parser.add_argument(
    #     "demo", default="image", help="demo type, eg. image, video and webcam"
    # )
    parser.add_argument(
        "--demo", default="image", help="demo type, eg. image, video and webcam"
    )
    parser.add_argument("-expn", "--experiment-name", type=str, default=None)
    parser.add_argument("-n", "--name", type=str, default=None, help="model name")

    parser.add_argument(
        "--path", default="./assets/dog.jpg", help="path to images or video"
    )
    parser.add_argument("--camid", type=int, default=0, help="webcam demo camera id")
    parser.add_argument(
        "--save_result",
        action="store_true",
        help="whether to save the inference result of image/video",
    )

    # exp file
    parser.add_argument(
        "-f",
        "--exp_file",
        default=None,
        type=str,
        help="please input your experiment description file",
    )
    parser.add_argument("-c", "--ckpt", default=None, type=str, help="ckpt for eval")
    parser.add_argument(
        "--device",
        default="cpu",
        type=str,
        help="device to run our model, can either be cpu or gpu",
    )
    parser.add_argument("--conf", default=0.3, type=float, help="test conf")
    parser.add_argument("--nms", default=0.3, type=float, help="test nms threshold")
    parser.add_argument("--tsize", default=None, type=int, help="test img size")
    parser.add_argument(
        "--fp16",
        dest="fp16",
        default=False,
        action="store_true",
        help="Adopting mix precision evaluating.",
    )
    parser.add_argument(
        "--legacy",
        dest="legacy",
        default=False,
        action="store_true",
        help="To be compatible with older versions",
    )
    parser.add_argument(
        "--fuse",
        dest="fuse",
        default=False,
        action="store_true",
        help="Fuse conv and bn for testing.",
    )
    parser.add_argument(
        "--trt",
        dest="trt",
        default=False,
        action="store_true",
        help="Using TensorRT model for testing.",
    )
    return parser


def get_image_list(path):
    image_names = []
    for maindir, subdir, file_name_list in os.walk(path):
        for filename in file_name_list:
            apath = os.path.join(maindir, filename)
            ext = os.path.splitext(apath)[1]
            if ext in IMAGE_EXT:
                image_names.append(apath)
    return image_names


class Predictor(object):
    def __init__(
        self,
        model,
        exp,
        cls_names,
        trt_file=None,
        decoder=None,
        device="cpu",
        fp16=False,
        legacy=False,
    ):
        self.model = model
        self.cls_names = cls_names
        self.decoder = decoder
        self.num_classes = exp.num_classes
        self.confthre = exp.test_conf
        self.nmsthre = exp.nmsthre
        self.test_size = exp.test_size
        self.device = device
        self.fp16 = fp16
        self.preproc = ValTransform(legacy=legacy)
        if trt_file is not None:
            from torch2trt import TRTModule

            model_trt = TRTModule()
            model_trt.load_state_dict(torch.load(trt_file))

            x = torch.ones(1, 3, exp.test_size[0], exp.test_size[1]).cuda()
            self.model(x)
            self.model = model_trt

    def inference(self, img):
        img_info = {"id": 0}
        if isinstance(img, str):
            img_info["file_name"] = os.path.basename(img)
            img = cv2.imread(img)
        else:
            img_info["file_name"] = None

        height, width = img.shape[:2]
        img_info["height"] = height
        img_info["width"] = width
        img_info["raw_img"] = img

        ratio = min(self.test_size[0] / img.shape[0], self.test_size[1] / img.shape[1])
        img_info["ratio"] = ratio

        img, _ = self.preproc(img, None, self.test_size)
        img = torch.from_numpy(img).unsqueeze(0)
        img = img.float()
        if self.device == "gpu":
            img = img.cuda()
            if self.fp16:
                img = img.half()  # to FP16

        with torch.no_grad():
            t0 = time.time()
            outputs = self.model(img)
            if self.decoder is not None:
                outputs = self.decoder(outputs, dtype=outputs.type())
            outputs = postprocess(
                outputs, self.num_classes, self.confthre,
                self.nmsthre, class_agnostic=True
            )  # (x1, y1, x2, y2, obj_conf, class_conf, class_pred): objectness score, class score, class index
            logger.info("Infer time: {:.4f}s".format(time.time() - t0))
        return outputs, img_info

    def visual(self, output, img_info, cls_conf=0.35):
        ratio = img_info["ratio"]
        img = img_info["raw_img"]
        if output is None:
            return img
        output = output.cpu()

        bboxes = output[:, 0:4]

        # preprocessing: resize
        bboxes /= ratio

        cls = output[:, 6]
        scores = output[:, 4] * output[:, 5]

        vis_res = vis(img, bboxes, scores, cls, cls_conf, self.cls_names)
        return vis_res


def image_demo(predictor, path, args):
    if os.path.isdir(path):
        files = get_image_list(path)
    else:
        files = [path]
    files.sort()

    current_time = time.localtime()
    # inference on a single image (the metric loop below calls this once per image)
    outputs, img_info = predictor.inference(files[0])

    return outputs, img_info


def imageflow_demo(predictor, vis_folder, current_time, args):
    cap = cv2.VideoCapture(args.path if args.demo == "video" else args.camid)
    width = cap.get(cv2.CAP_PROP_FRAME_WIDTH)  # float
    height = cap.get(cv2.CAP_PROP_FRAME_HEIGHT)  # float
    fps = cap.get(cv2.CAP_PROP_FPS)
    if args.save_result:
        save_folder = os.path.join(
            vis_folder, time.strftime("%Y_%m_%d_%H_%M_%S", current_time)
        )
        os.makedirs(save_folder, exist_ok=True)
        if args.demo == "video":
            save_path = os.path.join(save_folder, os.path.basename(args.path))
        else:
            save_path = os.path.join(save_folder, "camera.mp4")
        logger.info(f"video save_path is {save_path}")
        vid_writer = cv2.VideoWriter(
            save_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (int(width), int(height))
        )
    while True:
        ret_val, frame = cap.read()
        if ret_val:
            outputs, img_info = predictor.inference(frame)
            result_frame = predictor.visual(outputs[0], img_info, predictor.confthre)
            if args.save_result:
                vid_writer.write(result_frame)
            else:
                cv2.namedWindow("yolox", cv2.WINDOW_NORMAL)
                cv2.imshow("yolox", result_frame)
            ch = cv2.waitKey(1)
            if ch == 27 or ch == ord("q") or ch == ord("Q"):
                break
        else:
            break


def main(exp, args, COCO_CLASSES_):
    if not args.experiment_name:
        args.experiment_name = exp.exp_name

    file_name = os.path.join(exp.output_dir, args.experiment_name)
    os.makedirs(file_name, exist_ok=True)

    vis_folder = None
    if args.save_result:
        vis_folder = os.path.join(file_name, "vis_res")
        os.makedirs(vis_folder, exist_ok=True)

    if args.trt:
        args.device = "gpu"

    logger.info("Args: {}".format(args))

    if args.conf is not None:
        exp.test_conf = args.conf
    if args.nms is not None:
        exp.nmsthre = args.nms
    if args.tsize is not None:
        exp.test_size = (args.tsize, args.tsize)

    model = exp.get_model()
    logger.info("Model Summary: {}".format(get_model_info(model, exp.test_size)))

    if args.device == "gpu":
        model.cuda()
        if args.fp16:
            model.half()  # to FP16
    model.eval()

    if not args.trt:
        if args.ckpt is None:
            ckpt_file = os.path.join(file_name, "best_ckpt.pth")
        else:
            ckpt_file = args.ckpt
        logger.info("loading checkpoint")
        ckpt = torch.load(ckpt_file, map_location="cpu")
        # load the model state dict
        model.load_state_dict(ckpt["model"])
        logger.info("loaded checkpoint done.")

    if args.fuse:
        logger.info("\tFusing model...")
        model = fuse_model(model)

    if args.trt:
        assert not args.fuse, "TensorRT model is not support model fusing!"
        trt_file = os.path.join(file_name, "model_trt.pth")
        assert os.path.exists(
            trt_file
        ), "TensorRT model is not found!\n Run python3 tools/trt.py first!"
        model.head.decode_in_inference = False
        decoder = model.head.decode_outputs
        logger.info("Using TensorRT to inference")
    else:
        trt_file = None
        decoder = None


    predictor = Predictor(
        model=model, exp=exp, cls_names=COCO_CLASSES_, trt_file=trt_file, decoder=decoder,
        device=args.device, fp16=args.fp16, legacy=args.legacy,
    )

    # run inference on a single image
    outputs, img_info = image_demo(predictor, args.path, args)


    return outputs, img_info


if __name__ == "__main__":

    args = make_parser().parse_args()

    # Custom-data evaluation
    from yolox.data.datasets.coco_classes import COCO_CLASSES  # edit this before running so it matches your classes
    args.exp_file = "exps/yolox_s_me.py"
    args.ckpt = "YOLOX_outputs/yolox_s_me/ckpt_200.pth"
    data_path = "datasets/xxx/val2017"
    val_jsons = "datasets/xxx/annotations/instances_val2017.json"
    COCO_CLASSES_ = COCO_CLASSES
    args.conf = 0.5
    args.nms = 0.5
    args.device = "cpu"  # needed for the numpy conversion below
    args.save_result = False


    # collect all ground-truth boxes from the label file
    with open(val_jsons, "r") as f:
        val_labels = json.load(f)

    res_id_img = {}
    imgs = []
    for image_info in val_labels['images']:
        res_id_img[image_info['id']] = image_info['file_name']
        imgs.append(image_info['file_name'])

    bbox_res = {}
    for ann in val_labels['annotations']:
        bbox_new = [ann['bbox'][0], ann['bbox'][1], ann['bbox'][0]+ann['bbox'][2], ann['bbox'][1]+ann['bbox'][3]]
        if res_id_img[ann['image_id']] not in bbox_res.keys():
            bbox_res[res_id_img[ann['image_id']]] = [bbox_new]
        else:
            tmp_list = bbox_res[res_id_img[ann['image_id']]]
            tmp_list.append(bbox_new)
            bbox_res[res_id_img[ann['image_id']]] = tmp_list


    res_all = {}
    imgs = list(set(imgs))

    # imgs = ["1.jpg"]  # 调试单张

    for img in imgs:
        img_file = os.path.join(data_path, img)
        args.path = img_file
        # inference
        exp = get_exp(args.exp_file, args.name)
        res, img_info = main(exp, args, COCO_CLASSES_)
        res_list = np.array(res[0]).tolist() if res[0] is not None else []  # guard: no detections on this image
        infer_bbox = [[ii[0]/img_info["ratio"], ii[1]/img_info["ratio"], ii[2]/img_info["ratio"], ii[3]/img_info["ratio"]] for ii in res_list]  # img_info carries the resize ratio

        if img not in res_all.keys():
            res_all[img] = [infer_bbox]
        else:
            tmp = res_all[img]
            tmp.append(infer_bbox)
            res_all[img] = tmp

    precision_all = []
    recall_all = []
    print()
    iou_threshold = 0.3  # IoU threshold used for matching when computing precision/recall
    for img in imgs:
        precision, recall = calculate_precision_recall(detected_boxes=res_all[img][0], true_boxes=bbox_res.get(img, []), iou_threshold=iou_threshold)
        print(img, precision, recall)
        precision_all.append(precision)
        recall_all.append(recall)
    print()
    print('val-set mean precision ', float(sum(precision_all)/len(precision_all)))
    print('val-set mean recall ', float(sum(recall_all)/len(recall_all)))
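
Since the goal here is a confusion-matrix-style summary, a minimal sketch of how you might also aggregate TP/FP/FN over the whole validation set (appended to the same main block, reusing the imgs, res_all, bbox_res, iou_threshold, and calculate_iou defined above) could look like this:

    # minimal sketch (assumes it is appended to the main block above): aggregate TP/FP/FN
    # across the whole val set to get a detected-vs-missed confusion-matrix-style summary
    tp_total, fp_total, fn_total = 0, 0, 0
    for img in imgs:
        detected = res_all[img][0]
        truths = bbox_res.get(img, [])
        matched = sum(1 for t in truths if any(calculate_iou(d, t) >= iou_threshold for d in detected))
        tp_total += matched
        fp_total += max(len(detected) - matched, 0)
        fn_total += len(truths) - matched
    print('overall TP / FP / FN:', tp_total, fp_total, fn_total)
    if tp_total + fp_total > 0:
        print('overall precision:', tp_total / (tp_total + fp_total))
    if tp_total + fn_total > 0:
        print('overall recall:', tp_total / (tp_total + fn_total))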

9. Exporting to ONNX

Likewise, copy tools/export_onnx.py out of tools, add the lines below, and run it to export the ONNX model; the outputs end up under the path mentioned earlier:

    args = make_parser().parse_args()
    # add the following right after args in the main block:
    os.makedirs("YOLOX_onnx", exist_ok=True)  # os.mkdir would fail if the folder already exists
    args.output_name = "YOLOX_onnx/yolox_s_me.onnx"
    args.exp_file = "exps/yolox_s_me.py"
    args.ckpt = "YOLOX_outputs/yolox_s_me/best_ckpt.pth"
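
Then run the copied script from the repo root (assuming you saved it as, say, my_export_onnx.py); the exported model is written to YOLOX_onnx/yolox_s_me.onnx:

python my_export_onnx.py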

10. ONNX Inference

The main things to get right are the input preprocessing (YOLOX defaults to 640x640) and the onnxruntime input format. Object-detection inference on images with the exported ONNX model looks like this:

import os
import cv2
import numpy as np
import onnxruntime
from yolox.data.data_augment import preproc as preprocess
from yolox.utils.demo_utils import multiclass_nms, demo_postprocess
from yolox.utils.visualize import vis

"""
yolox_onnx 推理 demo
"""

if __name__ == '__main__':


    COCO_CLASSES = ('c1', 'c2', ...)  # your class names
    data_path = "test_datas"  # folder of test images
    model = "/xxx/yolox_s_me.onnx"  # the ONNX model exported above
    output_dir = "/xxx/xxx/result"  # folder for the visualized inference results
    score_thr = 0.5
    NMS = 0.5

    # create the onnxruntime session once, outside the image loop
    session = onnxruntime.InferenceSession(model)
    input_shape = (640, 640)  # YOLOX default input size

    for img in os.listdir(data_path):
        image_path = os.path.join(data_path, img)
        origin_img = cv2.imread(image_path)
        img, ratio = preprocess(origin_img, input_shape)

        ort_inputs = {session.get_inputs()[0].name: img[None, :, :, :]}  #  ort_input 1 3 640 640
        output = session.run(None, ort_inputs)  # [1 8400 12]
        predictions = demo_postprocess(output[0], input_shape)[0]

        boxes = predictions[:, :4]
        scores = predictions[:, 4:5] * predictions[:, 5:]

        boxes_xyxy = np.ones_like(boxes)
        boxes_xyxy[:, 0] = boxes[:, 0] - boxes[:, 2]/2.
        boxes_xyxy[:, 1] = boxes[:, 1] - boxes[:, 3]/2.
        boxes_xyxy[:, 2] = boxes[:, 0] + boxes[:, 2]/2.
        boxes_xyxy[:, 3] = boxes[:, 1] + boxes[:, 3]/2.
        boxes_xyxy /= ratio
        dets = multiclass_nms(boxes_xyxy, scores, nms_thr=NMS, score_thr=score_thr)

        if dets is not None:
            final_boxes, final_scores, final_cls_inds = dets[:, :4], dets[:, 4], dets[:, 5]
            origin_img = vis(origin_img, final_boxes, final_scores, final_cls_inds,
                             conf=score_thr, class_names=COCO_CLASSES)

        if not os.path.exists(output_dir):
            os.makedirs(output_dir)

        output_path = os.path.join(output_dir, os.path.basename(image_path))
        cv2.imwrite(output_path, origin_img)
        print('one img infer ok.')
    print('all img infer ok !!!')
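
If you want to sanity-check the exported file before running inference, a minimal sketch with the onnx package (assuming pip install onnx and the export path used above) is:

import onnx

# load the exported graph and run the structural checker; raises if the model is malformed
onnx_model = onnx.load("YOLOX_onnx/yolox_s_me.onnx")
onnx.checker.check_model(onnx_model)
print("onnx model check passed")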

-- Appendix: Docker Environment

- Dockerfile: pull a Torch base image from the Aliyun registry, as below.

# https://www.modelscope.cn/docs/环境安装  # GPU image (python3.10)

FROM registry.cn-beijing.aliyuncs.com/modelscope-repo/modelscope:ubuntu22.04-cuda12.1.0-py310-torch2.3.0-tf2.16.1-1.15.0

# FROM registry.cn-hangzhou.aliyuncs.com/modelscope-repo/modelscope:ubuntu22.04-cuda12.1.0-py310-torch2.3.0-tf2.16.1-1.15.0

# FROM registry.us-west-1.aliyuncs.com/modelscope-repo/modelscope:ubuntu22.04-cuda12.1.0-py310-torch2.3.0-tf2.16.1-1.15.0

RUN mkdir /opt/code

WORKDIR /opt/code

- Build the image: docker build -t hxy_base_image .

- Create the container: docker run --name hxy_mmcls -d -p 9528:22 --shm-size=1g hxy_base_image tail -f /dev/null

[-d runs in the background, -p maps the port, --shm-size sets the shared-memory size, and tail -f /dev/null keeps the container alive doing nothing]

- docker run takes other options as well; add them as needed.
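
- For GPU training inside the container you will typically also want to pass --gpus all when creating it (this requires the NVIDIA container toolkit on the host).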

- docker exec -it <container id> /bin/bash drops you into the container.

- docker images and docker ps | grep hezy list the images and containers, etc.

For YOLOX you only need to install one more package, pip install loguru, and then it should be usable.

- Addendum: to expose SSH from the container, do the following.

In vim /etc/ssh/sshd_config, enable these settings:

Port 22

AddressFamily any

ListenAddress 0.0.0.0

PermitRootLogin yes

PermitEmptyPasswords yes

PasswordAuthentication  yes

# restart ssh

service ssh restart

# set the root password: passwd root

From outside you can then log in with root/root and IP:port (for another shell, PyCharm, or other IDEs).
