DeepStream: Testing a ResNet50 Classification Model

news2025/6/30 7:26:30

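ResNet50 is a deep residual network; the 50 refers to its 50 layers. The model can be used for image classification, object detection, and similar tasks.

This post tests the ResNet50 classification model with DeepStream.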

1 Resources

Model: https://github.com/onnx/models/blob/main/vision/classification/resnet/model/resnet50-v2-7.onnx. For model details, see https://github.com/onnx/models/tree/main/vision/classification/resnet.

Label file: https://gist.github.com/yrevar/942d3a0ac09ec9e5eb3a

TensorRT Python sample code: https://github.com/NVIDIA/TensorRT/blob/release/8.6/samples/python/introductory_parser_samples/onnx_resnet50.py

nvinfer configuration file dstest_appsrc_config.txt:

[property]
gpu-id=0

labelfile-path=labels.txt
model-engine-file=resnet50.onnx_b1_gpu0_fp16.engine
onnx-file=resnet50-v2-7.onnx

infer-dims=3;224;224
net-scale-factor=0.01742919
offsets=114.75;114.75;114.75
network-type=1
input-object-min-width=64
input-object-min-height=64
model-color-format=1
#gie-unique-id=2
#operate-on-gie-id=1
#operate-on-class-ids=0
#classifier-async-mode=1
#classifier-threshold=0.51

#force-implicit-batch-dim=1
batch-size=1
network-mode=1
num-detected-classes=1000
interval=0
gie-unique-id=1
output-blob-names=495
#scaling-filter=0
#scaling-compute-hw=0
cluster-mode=2
is-classifier=1

[class-attrs-all]
pre-cluster-threshold=0.2
topk=20
nms-iou-threshold=0.5

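Test image broom.JPG:

To run this model with DeepStream, only the nvinfer configuration file needs to change. The question is how to express the processing done in the Python sample as nvinfer configuration. A few points need attention:

# Normalization

The Python sample normalizes the input with this formula: https://github.com/NVIDIA/TensorRT/blob/release/8.6/samples/python/introductory_parser_samples/onnx_resnet50.py#L73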

# This particular ResNet50 model requires some preprocessing, specifically, mean normalization.
return (image_arr / 255.0 - 0.45) / 0.225

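nvinfer, on the other hand, supports this formula:

y = net-scale-factor * (x - mean)

So the numerator and denominator are both multiplied by 255: (x / 255 - 0.45) / 0.225 = (x - 114.75) / 57.375, which finally gives y = 0.01742919 * (x - 114.75).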
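The conversion above can be checked numerically. The following is a minimal sketch in plain Python; the function names are made up for illustration, and only the constants 0.45, 0.225, and 255 come from the sample and the derivation above.

```python
# Sketch: derive nvinfer's net-scale-factor and offsets from the Python
# sample's normalization (x / 255.0 - 0.45) / 0.225.
# All function names here are illustrative, not part of any library.

MEAN = 0.45   # value subtracted in the TensorRT Python sample
STD = 0.225   # divisor used in the TensorRT Python sample


def python_sample_norm(x):
    """Normalization as written in onnx_resnet50.py."""
    return (x / 255.0 - MEAN) / STD


def nvinfer_params(mean=MEAN, std=STD):
    """Rewrite (x / 255 - mean) / std as scale * (x - offset)."""
    offset = mean * 255.0         # 0.45 * 255 = 114.75
    scale = 1.0 / (std * 255.0)   # 1 / 57.375
    return scale, offset


def nvinfer_norm(x, scale, offset):
    """Normalization in the form y = net-scale-factor * (x - mean)."""
    return scale * (x - offset)


scale, offset = nvinfer_params()
print(f"net-scale-factor={scale:.8f}")
print(f"offsets={offset:.2f};{offset:.2f};{offset:.2f}")

# Both forms should agree for any 8-bit pixel value.
for pixel in (0, 128, 255):
    assert abs(python_sample_norm(pixel) - nvinfer_norm(pixel, scale, offset)) < 1e-9
```

The two printed lines correspond to the net-scale-factor and offsets entries in the configuration file.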

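# The first model is a classification model

In most nvinfer examples the first model is a detection model; in this example the first model is a classification model. This requires the following setting: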

network-type=1

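# Post-processing

The Python sample applies an argmax to the inference output, that is, it picks the most likely of the 1000 results. If nvinfer has no custom parsing function configured (parse-classifier-func-name for a classifier), the plugin uses its built-in classifier parsing, which likewise picks the class with the highest score.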
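As an illustration of the argmax step, here is a minimal sketch in plain Python. It is not nvinfer's actual code; the function name, the threshold parameter, and the toy scores are assumptions chosen only to show the idea of "take the highest score, provided it exceeds a threshold".

```python
# Sketch of the post-processing idea: pick the index with the highest
# score, and report it only if the score exceeds a threshold.
# `top1` and its parameters are hypothetical names for illustration.

def top1(scores, threshold=0.0):
    """Return (index, score) of the best class, or None if nothing passes."""
    if not scores:
        return None
    best_idx = max(range(len(scores)), key=lambda i: scores[i])
    if scores[best_idx] <= threshold:
        return None
    return best_idx, scores[best_idx]


# Toy example with 5 classes instead of 1000.
toy_scores = [0.1, 2.4, -0.3, 1.6, 0.0]
result = top1(toy_scores)
print(result)  # (1, 2.4)
```

With the real model, `scores` would hold the 1000 values of the output layer, and the returned index would be looked up in the label file.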

2 Running

Put the model resnet50-v2-7.onnx, dstest_appsrc_config.txt, and the test image in the same directory, then run:

gst-launch-1.0 filesrc location=broom.JPG ! jpegdec ! videoconvert ! video/x-raw,format=I420 ! nvvideoconvert ! video/x-raw\(memory:NVMM\),format=NV12 ! mux.sink_0 nvstreammux name=mux batch-size=1 width=1280 height=720 ! nvinfer config-file-path=./dstest_appsrc_config.txt ! nvvideoconvert ! video/x-raw\(memory:NVMM\),format=RGBA ! nvdsosd ! nvvideoconvert ! video/x-raw,format=I420 ! jpegenc ! filesink location=out.jpg

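3 Problem

After running the command, the generated image contains no classification label string.

The nvinfer plugin and the underlying nvinfer library are open source; after changing the source code, you need to rebuild and replace the corresponding library. After adding a few print statements to /opt/nvidia/deepstream/deepstream/sources/libs/nvdsinfer/nvdsinfer_context_impl_output_parsing.cpp, it turned out that the classification result is correct: index 462 is broom in the label file. Only the label string for that class was not retrieved. The output is as follows: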

......

probability:2.433594, m_ClassifierThreshold:0.000000
probability:1.642578, m_ClassifierThreshold:0.000000
fd, attr.attributeValue:462
......

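nvinfer's label-parsing function, InferPostprocessor::parseLabelsFile, expects the labels in the file to be separated by semicolons, and this label file is not in that format, so parsing fails. Since the nvinfer code is open source, users can modify this function. The relevant code is: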

NvDsInferStatus
InferPostprocessor::parseLabelsFile(const std::string& labelsFilePath)
{
    std::ifstream labels_file(labelsFilePath);
    std::string delim{';'};
    if (!labels_file.is_open())
......
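An alternative to modifying parseLabelsFile is to convert the label file into the semicolon-separated form. The sketch below assumes the gist's content is a Python-style dict literal such as {0: 'tench, Tinca tinca', 1: ...}, and that one line of semicolon-separated labels is what the classifier output layer needs; both are assumptions to verify against the actual file and the nvinfer source. The function name is made up for illustration.

```python
import ast

# Sketch: convert a dict-style label listing into one line of
# semicolon-separated labels. `convert_labels` is a hypothetical helper.


def convert_labels(dict_text):
    """Turn "{0: 'a', 1: 'b'}" into "a;b", ordered by class index."""
    mapping = ast.literal_eval(dict_text)
    if not isinstance(mapping, dict):
        raise ValueError("expected a dict literal of index -> label")
    labels = [mapping[index] for index in sorted(mapping)]
    for label in labels:
        if ";" in label:
            # A semicolon inside a label would be read as a separator.
            raise ValueError(f"label contains ';': {label!r}")
    return ";".join(labels)


# Small stand-in for the real 1000-entry file.
sample = """{0: 'tench, Tinca tinca',
 1: 'goldfish, Carassius auratus',
 2: 'great white shark'}"""
print(convert_labels(sample))
```

For the real file, the text would be read from the downloaded gist and the result written as a single line to the labels.txt referenced by labelfile-path.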