- Inference steps
[picodet_s_320_coco.yml]
- Infer.py main()->run()
- Trainer.py __init__() self.model = create(cfg.architecture)
- Picodet.py from_config()->__init__()
- When head = create(cfg['head'], **kwargs) runs, it calls:
- Layers.py MultiClassNMS __init__()
- Pico_head.py PicoHead __init__()
- When head = create(cfg['head'], **kwargs) runs, it calls:
- Picodet.py from_config()->__init__()
- Trainer.py predict()
- For each data item: outs = self.model(dat
- Picodet.py get_pred()->_forward()
- Inputs [1,3,320,320]
Backbone output
Neck output
Head output Pico_head.py forward()
post_process output
Calls gfl_head.py post_process()
->decode(): processes each image in the batch, and each feature map within an image
->get_bboxes_single()
->get_single_level_center_point:
bbox_pred is [1,x,y,32]: 32 = 4 * (reg_max + 1), so reg_max + 1 == 8 bins per side (reg_max == 7)
x = F.softmax(x.reshape([-1, self.reg_max + 1]), axis=1)
x = F.linear(x, self.project).reshape([-1, 4])
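Then multiply by stride
Adjust the point coordinates according to the stride of the current feature map
Take the top nms_pre values of cls_score
->bbox_utils.py distance2bbox: from the points [x,y] and the distances (left, top, right, bottom), together with the image size [320,320], computes the top-left and bottom-right bounding box coordinates [x1,y1,x2,y2]
Mlvl_bboxes coordinates are mapped back to the original image using the resize scale of the original image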
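The decode steps above can be sketched in plain Python for a single point. This is a minimal illustration rather than the PaddleDetection code: it assumes reg_max == 7 (8 bins per side, 32 values per point), takes project to be the vector 0..reg_max, and the helper names integral and decode_point are hypothetical.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def integral(logits, reg_max=7):
    """Turn 4 * (reg_max + 1) logits into 4 expected distances (in stride units)."""
    bins = reg_max + 1
    assert len(logits) == 4 * bins
    project = list(range(bins))  # assumed projection vector: 0, 1, ..., reg_max
    distances = []
    for side in range(4):  # left, top, right, bottom
        probs = softmax(logits[side * bins:(side + 1) * bins])
        distances.append(sum(p * k for p, k in zip(probs, project)))
    return distances

def distance2bbox(point, distance, max_shape=None):
    """Convert a point [x, y] and (left, top, right, bottom) into [x1, y1, x2, y2]."""
    x, y = point
    left, top, right, bottom = distance
    box = [x - left, y - top, x + right, y + bottom]
    if max_shape is not None:  # clip to the image size (height, width)
        h, w = max_shape
        box = [min(max(box[0], 0), w), min(max(box[1], 0), h),
               min(max(box[2], 0), w), min(max(box[3], 0), h)]
    return box

def decode_point(point, logits, stride, max_shape=(320, 320)):
    """Hypothetical helper: integral, then multiply by stride, then distance2bbox."""
    distances = [d * stride for d in integral(logits)]
    return distance2bbox(point, distances, max_shape)
```

With all-zero logits every side has a uniform distribution, so each expected distance is 3.5 bins before the stride is applied.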
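->nms():
->layers.py MultiClassNMS __call__()
->ops.py multiclass_nms()
- Keep the bboxes whose score is above score_threshold
- Select the top nms_top_k bboxes
- Apply NMS with an adaptive threshold based on nms_threshold and nms_eta to filter out boxes with high IoU
- Keep the top keep_top_k bboxes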
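The four filtering steps above can be sketched for a single class as follows. This is a simplified illustration, not the multiclass_nms operator itself: the default values are placeholders, the function name nms_single_class is hypothetical, and the adaptive rule (multiply the threshold by nms_eta after each kept box while it is above 0.5) is an assumption that these notes do not spell out.

```python
def iou(a, b):
    """IoU of two boxes given as [x1, y1, x2, y2]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(ix2 - ix1, 0.0) * max(iy2 - iy1, 0.0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def nms_single_class(boxes, scores, score_threshold=0.025, nms_top_k=1000,
                     keep_top_k=100, nms_threshold=0.6, nms_eta=1.0):
    """Return the indices of the boxes that survive the four steps."""
    # 1. Drop boxes whose score is not above score_threshold.
    order = [i for i, s in enumerate(scores) if s > score_threshold]
    # 2. Keep only the nms_top_k highest-scoring candidates.
    order.sort(key=lambda i: scores[i], reverse=True)
    order = order[:nms_top_k]
    # 3. Greedy NMS with an (assumed) adaptive threshold.
    kept = []
    threshold = nms_threshold
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= threshold for j in kept):
            kept.append(i)
            if nms_eta < 1.0 and threshold > 0.5:
                threshold *= nms_eta
    # 4. Keep at most keep_top_k results.
    return kept[:keep_top_k]
```

In the multi-class case the same loop would run once per class, with keep_top_k applied to the merged result.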
- Coco_utils.py get_infer_results
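- json_results.py get_det_res: converts data types and saves bbox and score
- visualizer.py visualize_results: draws boxes on the original image from the bboxes
- visualizer.py draw_bbox: boxes below the threshold are not drawn
- visualizer.py save_result: saves the results to txt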
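The result conversion and the draw threshold can be illustrated roughly as below. The layout is an assumption, not taken from these notes: each prediction row is taken to be [class_id, score, x1, y1, x2, y2], and the saved bbox is taken to use a COCO-style [x, y, w, h] form. The function names are hypothetical, and the mapping from class index to dataset category id is left out.

```python
def to_det_results(image_id, rows):
    """Convert rows of [class_id, score, x1, y1, x2, y2] into plain dict records."""
    results = []
    for class_id, score, x1, y1, x2, y2 in rows:
        results.append({
            "image_id": int(image_id),
            "category_id": int(class_id),  # real code may remap this id
            "bbox": [float(x1), float(y1), float(x2 - x1), float(y2 - y1)],
            "score": float(score),
        })
    return results

def filter_for_drawing(results, draw_threshold=0.5):
    """Keep only the records whose score reaches the draw threshold."""
    return [r for r in results if r["score"] >= draw_threshold]
```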
2. Model analysis
- backbone
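- Based on ShuffleNetV2 and optimized following PP-LCNet; called ESNet (Enhanced ShuffleNet)
- Change 1: add SE to every block; the two SE layers use ReLU and H-Sigmoid activations respectively
- Change 2: when stride is 2, add a depthwise convolution and a pointwise convolution to integrate channel information
- Change 3: when stride is 1, add a Ghost module
- Change 4: channel-wise search for detection backbone, full model [128,256,512], ratio [0.5, 0.675, 0.75, 0.875, 1]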
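Change 1 can be illustrated with a tiny SE (squeeze-and-excitation) block in plain Python. This is a sketch under assumptions, not the ESNet implementation: the feature map is a nested list of shape C x H x W, the two layers are plain fully connected layers with hypothetical weights w1, b1, w2, b2, and H-Sigmoid is taken to be clip(x / 6 + 0.5, 0, 1).

```python
def hard_sigmoid(x):
    """Assumed H-Sigmoid definition: clip(x / 6 + 0.5, 0, 1)."""
    return min(max(x / 6.0 + 0.5, 0.0), 1.0)

def se_block(feature, w1, b1, w2, b2):
    """Squeeze-and-excitation on a C x H x W nested list.

    w1: hidden x C weights, b1: hidden biases (first layer, ReLU)
    w2: C x hidden weights, b2: C biases (second layer, H-Sigmoid)
    """
    # Squeeze: global average pooling per channel.
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
                for ch in feature]
    # Excitation, layer 1: linear + ReLU.
    hidden = [max(sum(w * s for w, s in zip(row, squeezed)) + b, 0.0)
              for row, b in zip(w1, b1)]
    # Excitation, layer 2: linear + H-Sigmoid gives one gate per channel.
    gates = [hard_sigmoid(sum(w * h for w, h in zip(row, hidden)) + b)
             for row, b in zip(w2, b2)]
    # Scale: multiply every value in a channel by that channel's gate.
    return [[[v * g for v in row] for row in ch]
            for ch, g in zip(feature, gates)]
```

A gate of 1.0 leaves a channel unchanged, while a gate of 0.5 halves it.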