目前主流的目标跟踪算法都是基于Tracking-by-Detecton策略，即基于目标检测的结果来进行目标跟踪。

跟踪结果中，每个bbox左上角的数字是用来标识某个人的唯一ID号。那么问题就来了，视频中不同时刻的同一个人，位置发生了变化，是如何关联上的呢？答案就是匈牙利算法和卡尔曼滤波

匈牙利算法可以判断当前帧的某个目标，是否与前一帧的某个目标相同。
卡尔曼滤波可以基于目标前一时刻的位置，来预测当前时刻的位置，并且可以比传感器（在目标跟踪中即目标检测器，比如Yolo等）更准确的估计目标的位置。

匈牙利算法（Hungarian Algorithm）

卡尔曼滤波（Kalman Filter）

DeepSort工作流程

三、个人总结

匈牙利算法（Hungarian Algorithm）

匈牙利算法，就是可以将分配问题的代价最小化。

例如现在有三个人A、B、C出现在视频中，目标检测器也检测出来了，但是目标检测器只能判定A、B、C三个目标都是人，不能判定A、B、C谁是谁。在DeepSORT中，匈牙利算法将前一帧的跟踪tracks与当前帧中的检测框detections进行关联，外观信息（appearance information）和马氏距离（Mahalanobis distance），或者IOU来计算代价矩阵，并得出代价最小的分配结果。

源码解读：

#  linear_assignment.py
def min_cost_matching(distance_metric, max_distance, tracks, detections, 
                      track_indices=None, detection_indices=None):
    ...
    
    # 计算代价矩阵
    cost_matrix = distance_metric(tracks, detections, track_indices, detection_indices)
    cost_matrix[cost_matrix > max_distance] = max_distance + 1e-5
    
    # 执行匈牙利算法，得到匹配成功的索引对，行索引为tracks的索引，列索引为detections的索引
    row_indices, col_indices = linear_assignment(cost_matrix)
 
    matches, unmatched_tracks, unmatched_detections = [], [], []
 
    # 找出未匹配的detections
    for col, detection_idx in enumerate(detection_indices):
        if col not in col_indices:
            unmatched_detections.append(detection_idx)
     
    # 找出未匹配的tracks
    for row, track_idx in enumerate(track_indices):
        if row not in row_indices:
            unmatched_tracks.append(track_idx)
    
    # 遍历匹配的(track, detection)索引对
    for row, col in zip(row_indices, col_indices):
        track_idx = track_indices[row]
        detection_idx = detection_indices[col]
        # 如果相应的cost大于阈值max_distance，也视为未匹配成功
        if cost_matrix[row, col] > max_distance:
            unmatched_tracks.append(track_idx)
            unmatched_detections.append(detection_idx)
        else:
            matches.append((track_idx, detection_idx))
 
    return matches, unmatched_tracks, unmatched_detections

如DeepSORT源码所示：

用上一帧的tracks、当前帧的detections、计算出代价矩阵 cost_matrix

执行匈牙利算法，得到匹配成功的索引对，行索引为tracks的索引，列索引为detections的索引

找出未匹配的detections，未匹配的tracks

遍历匹配的（track, detection）索引对，如相应的cost大于阈值max_distance，也视为未匹配成功，筛过后便是匹配成功的

卡尔曼滤波（Kalman Filter）

在目标跟踪中，需要估计track的两个状态：

均值(Mean)：表示目标的位置信息，由bbox的中心坐标 (cx, cy)，宽高比r，高h，以及各自的速度变化值组成，由8维向量表示为 x = [cx, cy, r, h, vx, vy, vr, vh]，各个速度值初始化为0。
协方差(Covariance )：表示目标位置信息的不确定性，由8x8的对角矩阵表示，矩阵中数字越大则表明不确定性越大，可以以任意值初始化。

卡尔曼滤波分为两个阶段：（1）预测track在下一时刻的位置，（2）基于detection来更新预测的位置

预测：

基于track在t-1时刻的状态来预测其在t时刻的状态。

式（1）中，F为状态转移矩阵，x为t-1时刻的均值，x为t时刻的均值。可以把这个理解为新位置=旧位置*状态=旧位置 * （速度*单位时间）。

公式（2）中，P为track在t-1时刻的协方差，Q为系统的噪声矩阵，代表整个系统的可靠程度，一般初始化为很小的值，该公式预测t时刻的P'。

源码：

#  kalman_filter.py
def predict(self, mean, covariance):
    """Run Kalman filter prediction step.
    
    Parameters
    ----------
    mean: ndarray, the 8 dimensional mean vector of the object state at the previous time step.
    covariance: ndarray, the 8x8 dimensional covariance matrix of the object state at the previous time step.
 
    Returns
    -------
    (ndarray, ndarray), the mean vector and covariance matrix of the predicted state. 
     Unobserved velocities are initialized to 0 mean.
    """
    std_pos = [
        self._std_weight_position * mean[3],
        self._std_weight_position * mean[3],
        1e-2,
        self._std_weight_position * mean[3]]
    std_vel = [
        self._std_weight_velocity * mean[3],
        self._std_weight_velocity * mean[3],
        1e-5,
        self._std_weight_velocity * mean[3]]
    
    motion_cov = np.diag(np.square(np.r_[std_pos, std_vel]))  # 初始化噪声矩阵Q
    mean = np.dot(self._motion_mat, mean)  # x' = Fx
    covariance = np.linalg.multi_dot((self._motion_mat, covariance, self._motion_mat.T)) + motion_cov  # P' = FPF(T) + Q
 
    return mean, covariance

更新：

基于t时刻检测到的detection，校正与其关联的track的状态，得到一个更精确的结果。

式（3）计算detection和track的均值误差。

式（4）

式（5）计算卡尔曼增益K，卡尔曼增益用于估计误差的重要程度

式（6）和式（7）得到更新后的均值向量x和协方差矩阵P

源码解读：

#  kalman_filter.py
def project(self, mean, covariance):
    """Project state distribution to measurement space.
        
    Parameters
    ----------
    mean: ndarray, the state's mean vector (8 dimensional array).
    covariance: ndarray, the state's covariance matrix (8x8 dimensional).

    Returns
    -------
    (ndarray, ndarray), the projected mean and covariance matrix of the given state estimate.
    """
    std = [self._std_weight_position * mean[3],
           self._std_weight_position * mean[3],
           1e-1,
           self._std_weight_position * mean[3]]
        
    innovation_cov = np.diag(np.square(std))  # 初始化噪声矩阵R
    mean = np.dot(self._update_mat, mean)  # 将均值向量映射到检测空间，即Hx'
    covariance = np.linalg.multi_dot((
        self._update_mat, covariance, self._update_mat.T))  # 将协方差矩阵映射到检测空间，即HP'H^T
    return mean, covariance + innovation_cov


def update(self, mean, covariance, measurement):
    """Run Kalman filter correction step.

    Parameters
    ----------
    mean: ndarra, the predicted state's mean vector (8 dimensional).
    covariance: ndarray, the state's covariance matrix (8x8 dimensional).
    measurement: ndarray, the 4 dimensional measurement vector (x, y, a, h), where (x, y) is the 
                 center position, a the aspect ratio, and h the height of the bounding box.
    Returns
    -------
    (ndarray, ndarray), the measurement-corrected state distribution.
    """
    # 将mean和covariance映射到检测空间，得到Hx'和S
    projected_mean, projected_cov = self.project(mean, covariance)
    # 矩阵分解（这一步没看懂）
    chol_factor, lower = scipy.linalg.cho_factor(projected_cov, lower=True, check_finite=False)
    # 计算卡尔曼增益K
    kalman_gain = scipy.linalg.cho_solve(
            (chol_factor, lower), np.dot(covariance, self._update_mat.T).T,
            check_finite=False).T
    # z - Hx'
    innovation = measurement - projected_mean
    # x = x' + Ky
    new_mean = mean + np.dot(innovation, kalman_gain.T)
    # P = (I - KH)P'
    new_covariance = covariance - np.linalg.multi_dot((kalman_gain, projected_cov, kalman_gain.T))
        
    return new_mean, new_covariance