[Solved] matrix contains invalid numeric entries: notes on the bug fix

news2025/7/8 1:10:40

Summary

While using DeepSort for tracking, I ran into the following error:

matrix contains invalid numeric entries

The code where the error occurs is shown below:

def min_cost_matching(
        distance_metric, max_distance, tracks, detections, track_indices=None,
        detection_indices=None):
    if track_indices is None:
        track_indices = np.arange(len(tracks))
    if detection_indices is None:
        detection_indices = np.arange(len(detections))

    if len(detection_indices) == 0 or len(track_indices) == 0:
        return [], track_indices, detection_indices  # Nothing to match.
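    # Debug output: print each detection's box (tlwh) and confidence.
    for det in detections:
        print(det.tlwh, det.confidence)
    cost_matrix = distance_metric(
        tracks, detections, track_indices, detection_indices)
    print("distance_metric", cost_matrix)
    # Gate: any cost above max_distance is set just past the threshold.
    # NaN entries are not caught here, because NaN > max_distance is False.
    cost_matrix[cost_matrix > max_distance] = max_distance + 1e-5
    # Workaround discussed below: replace NaN with 0 before the assignment step.
    cost_matrix = np.nan_to_num(cost_matrix)
    row_indices, col_indices = linear_assignment(cost_matrix)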

    matches, unmatched_tracks, unmatched_detections = [], [], []
    for col, detection_idx in enumerate(detection_indices):
        if col not in col_indices:
            unmatched_detections.append(detection_idx)
    for row, track_idx in enumerate(track_indices):
        if row not in row_indices:
            unmatched_tracks.append(track_idx)
    for row, col in zip(row_indices, col_indices):
        track_idx = track_indices[row]
        detection_idx = detection_indices[col]
        if cost_matrix[row, col] > max_distance:
            unmatched_tracks.append(track_idx)
            unmatched_detections.append(detection_idx)
        else:
            matches.append((track_idx, detection_idx))
    return matches, unmatched_tracks, unmatched_detections

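The code comes from an open-source project and has few comments, so the only option was to analyze it step by step.
The analysis showed that cost_matrix contains NaN.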
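
Printing the whole matrix works, but on a larger matrix it is easier to ask NumPy where the bad entries are. The following is a minimal sketch: the helper name `find_invalid_entries` and the example matrix are made up for illustration and are not part of the DeepSort code.

```python
import numpy as np


def find_invalid_entries(cost_matrix):
    """Return (row, col) index pairs of entries that are NaN or +/-inf.

    Hypothetical debugging helper, not part of the DeepSort code base.
    """
    cost_matrix = np.asarray(cost_matrix, dtype=float)
    return np.argwhere(~np.isfinite(cost_matrix))


# Made-up 2x3 cost matrix with a single NaN, for illustration only.
example = np.array([[0.2, np.nan, 0.9],
                    [0.7, 0.1, 0.4]])
bad = find_invalid_entries(example)
print(bad)  # [[0 1]]
```

Each row of the result is one (track row, detection column) pair, which makes it easy to print only the offending track and detection.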

Cause

An initial analysis pointed to the NaN values in cost_matrix, so I converted NaN to 0 by adding the following line to the code above:

cost_matrix=np.nan_to_num(cost_matrix)
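
One caveat with this workaround: `np.nan_to_num` replaces NaN with 0 by default, and in a cost matrix 0 is the best possible cost, so a broken entry turns into a perfect match. A hedged alternative is sketched below. The helper name `sanitize_cost_matrix` and the sample values are assumptions for illustration; the idea is to treat every non-finite entry as "too far" by giving it the same `max_distance + 1e-5` value that the gating line uses.

```python
import numpy as np


def sanitize_cost_matrix(cost_matrix, max_distance):
    """Return a copy in which NaN/inf and over-threshold costs are gated out.

    Hypothetical helper, not part of the DeepSort code base.
    """
    cost_matrix = np.array(cost_matrix, dtype=float)  # np.array makes a copy
    gated = max_distance + 1e-5
    cost_matrix[~np.isfinite(cost_matrix)] = gated
    cost_matrix[cost_matrix > max_distance] = gated
    return cost_matrix


# Made-up costs: one NaN, one inf, one value above the threshold.
raw = np.array([[0.2, np.nan],
                [0.9, np.inf]])

default = np.nan_to_num(raw)                       # NaN becomes 0.0
safe = sanitize_cost_matrix(raw, max_distance=0.7)  # NaN becomes 0.70001
```

With the gated value, the final loop in min_cost_matching sees a cost above max_distance and reports the pair as unmatched instead of accepting it.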

But why is it NaN in the first place? To dig further, I kept adding log output and printed the detections:

    for det in detections:
        print(det.tlwh,det.confidence)

Whenever NaN appeared, two detection boxes almost completely overlapped. That narrowed the bug down to distance_metric:

cost_matrix = distance_metric(
        tracks, detections, track_indices, detection_indices)

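This function is passed in from the caller and can be found in tracker.py.
[Image: screenshot of the relevant code in tracker.py]
From there the trail leads to iou_matching.iou_cost. More logging showed that iou(bbox, candidates) returns NaN, so the iou function itself is at fault.
Stepping into iou, the suspicious line is area_intersection = wh.prod(axis=1): whenever NaN appears, area_intersection is inf.
Printing wh showed an array of whole-number values, so my guess was that the product had overflowed. Strictly speaking, inf can only occur in a floating-point array, so the values were probably whole numbers stored in a narrow float type such as float16 (largest finite value 65504) rather than in a true integer dtype.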
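
The chain from overflow to NaN can be reproduced in isolation. The sketch below assumes the boxes arrive as float16 (for example from half-precision inference), which is not confirmed above, and the box size is made up. A 300 x 400 intersection is 120000, which exceeds the float16 maximum and becomes inf; for two identical boxes the union is then inf + inf - inf, which is NaN.

```python
import numpy as np

# Made-up intersection width/height, stored as float16 (an assumption).
wh = np.array([[300, 400]], dtype=np.float16)

# Silence the overflow/invalid warnings so the sketch runs quietly.
with np.errstate(over="ignore", invalid="ignore"):
    area_intersection = wh.prod(axis=1)        # 120000 does not fit: inf
    # Two identical boxes: both box areas equal the intersection area.
    area_bbox = area_intersection.copy()
    area_candidates = area_intersection.copy()
    union = area_bbox + area_candidates - area_intersection  # inf - inf: NaN
    iou_value = area_intersection / union                     # inf / NaN: NaN

# The same product in float64 is well within range.
wide = wh.astype(float).prod(axis=1)
```

This also fits the observation that NaN only showed up for nearly overlapping boxes: when the boxes do not overlap, the intersection is 0 and the ratio stays finite even if the box areas themselves overflow.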

Solution

Convert the arrays to float before taking the products. The code is as follows:

    wh = np.maximum(0., br - tl).astype(float)
    area_intersection = wh.prod(axis=1)
    area_bbox = bbox[2:].astype(float).prod()
    area_candidates = candidates[:, 2:].astype(float).prod(axis=1)
    return area_intersection / (area_bbox + area_candidates - area_intersection)
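
For context, here is a self-contained sketch of how a whole IoU function might look with the cast applied. The lines that compute `tl` and `br` are not shown above, so they are written here as an assumption based on the (x, y, width, height) layout that `det.tlwh` suggests. The sketch also casts the inputs once at the top instead of at each product, which is a simplification rather than the exact fix above.

```python
import numpy as np


def iou(bbox, candidates):
    """IoU between one box and N candidate boxes, all in (x, y, w, h) form.

    Sketch only: the tl/br computation is assumed, not taken from the post.
    """
    # Cast once to float64 so the area products cannot overflow a narrow dtype.
    bbox = np.asarray(bbox, dtype=np.float64)
    candidates = np.asarray(candidates, dtype=np.float64)

    bbox_tl, bbox_br = bbox[:2], bbox[:2] + bbox[2:]
    cand_tl = candidates[:, :2]
    cand_br = candidates[:, :2] + candidates[:, 2:]

    # Corners of the intersection rectangle, one row per candidate.
    tl = np.maximum(bbox_tl, cand_tl)
    br = np.minimum(bbox_br, cand_br)
    wh = np.maximum(0.0, br - tl)  # clamp to 0 when the boxes do not overlap

    area_intersection = wh.prod(axis=1)
    area_bbox = bbox[2:].prod()
    area_candidates = candidates[:, 2:].prod(axis=1)
    return area_intersection / (area_bbox + area_candidates - area_intersection)


# Made-up float16 inputs: an identical box and a far-away box.
bbox = np.array([10, 20, 300, 400], dtype=np.float16)
candidates = np.array([[10, 20, 300, 400],
                       [1000, 1000, 50, 50]], dtype=np.float16)
scores = iou(bbox, candidates)
```

With these inputs the identical box scores 1 and the distant box scores 0, and no entry is NaN even though the inputs are float16.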
