1. Preliminaries

[Fig. 1: (a) intersecting back-projected rays; (b) 3D and 2D errors; (c) parallax error]

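- Euclidean norm of a vector $\mathbf{v}$: $\|\mathbf{v}\|$;
- Unit vector: $\hat{\mathbf{v}}=\mathbf{v}/\|\mathbf{v}\|$;
- Angle between two lines $\mathbf{L}_0$ and $\mathbf{L}_1$: $\angle(\mathbf{L}_0,\mathbf{L}_1)\in[0,\pi/2]$;
- $\mathbf{x}_0=[x_0, y_0, z_0]^{\intercal}$ and $\mathbf{x}_1=[x_1, y_1, z_1]^{\intercal}$ are the 3D coordinates of a point in the camera reference frames $C_0$ and $C_1$, respectively;
- $\mathbf{R}$ and $\mathbf{t}$ are the rotation and translation between the two cameras, so that $\mathbf{x}_1=\mathbf{R}\mathbf{x}_0+\mathbf{t}$;
- Camera calibration matrix: $\mathbf{K}$;
- Homogeneous pixel coordinates of the point observed in each frame: $\mathbf{u}_{0}=(u_{0},v_{0},1)^{\intercal}$ and $\mathbf{u}_{1}=(u_{1},v_{1},1)^{\intercal}$;
- Normalized image coordinates: $\mathbf{f}_{0}=\left[x_{0}/z_{0},y_{0}/z_{0},1\right]^{\intercal}$ and $\mathbf{f}_{1}=\left[x_{1}/z_{1},y_{1}/z_{1},1\right]^{\intercal}$, obtained from $\mathbf{f}_0=\mathbf{K}^{-1}\mathbf{u}_0$ and $\mathbf{f}_1=\mathbf{K}^{-1}\mathbf{u}_1$;
- In the ideal case of Fig. 1a, the two back-projected rays intersect and satisfy the epipolar constraint: $\mathbf{f}_1\cdot(\mathbf{t}\times\mathbf{R}\mathbf{f}_0)=0$ (cf. SLAM十四讲, p. 167, where the same constraint is written as $\mathbf{x}_2^{\intercal}\mathbf{t}^{\wedge}\mathbf{R}\mathbf{x}_1=0$ in that book's notation);
- Given the depth scalars $\lambda_0$ and $\lambda_1$ shown in Fig. 1, the intersection point $\mathbf{x}_1$ is given by $\mathbf{x}_{1}=\lambda_0\mathbf{R}\hat{\mathbf{f}}_0+\mathbf{t}$ and $\mathbf{x}_{1}=\lambda_1\hat{\mathbf{f}}_1$. (Here $\hat{\mathbf{f}}_0$ and $\hat{\mathbf{f}}_1$ are ideal, noise-free directions. Because image measurements and the camera model are inaccurate, the rays rarely intersect in practice. Inferring a 3D point from two skew rays is what the methods below address.)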

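1.1 Evaluating the accuracy of a 3D point

Once an estimate $\mathbf{x}_{1}'$ of the 3D point has been obtained with some triangulation method, its accuracy can be evaluated in several ways: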

- The 3D error: $e_{3D}=\|\mathbf{x}_1^\prime-\mathbf{x}_{\mathrm{true}}\|$ (see Fig. 1b);
- The 2D error, i.e. the reprojection error:
$d_i=\|\mathbf{K}\left(\mathbf{f}_i-\mathbf{f}_i'\right)\|=\left\|\mathbf{K}\left(\mathbf{f}_i-\left(\left[0\ 0\ 1\right]\mathbf{x}_i'\right)^{-1}\mathbf{x}_i'\right)\right\|\quad\text{for}\quad i=0,1 \quad(1)$
where $\mathbf{x}'_0=\mathbf{R}^{\intercal}(\mathbf{x}'_1-\mathbf{t})$. The factor $\left(\left[0\ 0\ 1\right]\mathbf{x}_i'\right)^{-1}$ is the reciprocal of the $z$ coordinate; multiplying it by $\mathbf{x}'_i$ gives the coordinates on the normalized image plane (see Fig. 1b).
The 2D error measures the deviation from the measurements, whereas the 3D error measures the deviation from the ground truth. Unlike the 3D error, the 2D error of a 3D point can be evaluated under different norms, for example:
- $L_1$ norm: $d_0+d_1$;
- $L_2$ norm: $\sqrt{d_0^2+d_1^2}$;
- $L_\infty$ norm: $\max(d_0,d_1)$.
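As a rough illustration of Eq. (1) and the three norms, the sketch below computes $d_i$ for one view and then combines $d_0$ and $d_1$. The `V3` and `Intrinsics` types and all function names are hypothetical, and a zero-skew pinhole $\mathbf{K}$ is assumed, in which case the principal point cancels in the difference $\mathbf{f}_i-\mathbf{f}_i'$.

```cpp
#include <algorithm>
#include <cmath>

struct V3 { double x, y, z; };

// Assumption: zero-skew pinhole model; cx and cy cancel in Eq. (1).
struct Intrinsics { double fx, fy, cx, cy; };

// d_i of Eq. (1): pixel distance between the measured normalized point f
// and the projection of the estimate x_est, both expressed in frame C_i.
double reprojectionError(const Intrinsics& K, const V3& f, const V3& x_est) {
  const double du = K.fx * (f.x - x_est.x / x_est.z);
  const double dv = K.fy * (f.y - x_est.y / x_est.z);
  return std::hypot(du, dv);
}

// The three ways of combining d0 and d1 listed above.
double errorL1(double d0, double d1) { return d0 + d1; }
double errorL2(double d0, double d1) { return std::hypot(d0, d1); }
double errorLinf(double d0, double d1) { return std::max(d0, d1); }
```

For view 0 the estimate must first be moved into $C_0$ with $\mathbf{x}'_0=\mathbf{R}^{\intercal}(\mathbf{x}'_1-\mathbf{t})$ before calling `reprojectionError`.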
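Besides the 2D and 3D accuracy, the accuracy of the resulting parallax angle can also be evaluated (see Fig. 1c). The parallax error is defined as follows (the notation is a little confusing: as an argument of $\angle$, $\mathbf{x}_{\mathrm{true}}$ presumably stands for the line through $C_1$ and the true point, and $\mathbf{x}_{\mathrm{true}}-\mathbf{t}$ for the line through $C_0$ and the true point):
$e_{\beta}=|\beta_{\mathrm{true}}-\beta'|=|\angle\left(\mathbf{x}_{\mathrm{true}},\mathbf{x}_{\mathrm{true}}-\mathbf{t}\right)-\angle\left(\mathbf{x}'_1,\mathbf{x}'_1-\mathbf{t}\right)| \quad(2)$
The "raw parallax" is defined as the angle between the raw back-projected rays:
$\beta_{\mathrm{raw}}=\angle\left(\mathbf{R}\mathbf{f}_0,\mathbf{f}_1\right). \quad(3)$
This gives a rough estimate of the parallax angle that is independent of the translation and of the triangulation method.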
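Equations (2) and (3) only need an angle-between-lines helper. The following sketch uses hypothetical names (`V3`, `lineAngle`, `rawParallax`, `parallaxError`) and follows the convention from the preliminaries that $\angle(\cdot,\cdot)\in[0,\pi/2]$, which is why the absolute value of the dot product is taken.

```cpp
#include <cmath>

struct V3 { double x, y, z; };

V3 operator-(const V3& a, const V3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
double dot(const V3& a, const V3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
V3 cross(const V3& a, const V3& b) {
  return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
double norm(const V3& a) { return std::sqrt(dot(a, a)); }

// Angle between the lines spanned by a and b, in [0, pi/2].
// atan2 of (|a x b|, |a . b|) stays accurate for very small angles.
double lineAngle(const V3& a, const V3& b) {
  return std::atan2(norm(cross(a, b)), std::abs(dot(a, b)));
}

// Eq. (3): raw parallax between the two back-projected rays.
double rawParallax(const V3& Rf0, const V3& f1) { return lineAngle(Rf0, f1); }

// Eq. (2): x_true and x_est are expressed in C1, t is the translation.
double parallaxError(const V3& x_true, const V3& x_est, const V3& t) {
  return std::abs(lineAngle(x_true, x_true - t) - lineAngle(x_est, x_est - t));
}
```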

2. Proposed methods

2.1 Generalized weighted midpoint

The Generalized Weighted Midpoint (GWM) method consists of the following three steps:

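- Given two back-projected rays corresponding to the same point, estimate the depths $(\lambda_0, \lambda_1)$ along the two rays with some method;
- Compute the 3D points on the two rays at depths $\lambda_0$ and $\lambda_1$, i.e. $\mathbf{t} + \lambda_0\mathbf{R}\hat{\mathbf{f}}_0$ and $\lambda_1\hat{\mathbf{f}}_1$ in frame $C_1$;
- Obtain the final estimate of the 3D point as a weighted average of these two points.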

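In the classical midpoint method, the two points are the closest pair of points on the two rays, and they are weighted equally. The figure below shows another possible example of a generalized weighted midpoint: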
[Fig. 2: an example of a generalized weighted midpoint]

2.2 Alternative midpoint method

Here an alternative midpoint method belonging to the GWM family is proposed. First consider the case where the two back-projected rays happen to intersect, as in Fig. 1a. The most reasonable solution is then the intersection point, and the corresponding depths along the rays follow from the sine rule:
$\lambda_0=\frac{\sin\left(\angle\left(\mathbf{f}_1,\mathbf{t}\right)\right)}{\sin\left(\angle\left(\mathbf{R}\mathbf{f}_0,\mathbf{f}_1\right)\right)}\|\mathbf{t}\|=\frac{\|\hat{\mathbf{f}}_1\times\mathbf{t}\|}{\|\mathbf{R}\hat{\mathbf{f}}_0\times\hat{\mathbf{f}}_1\|},\quad\lambda_1=\frac{\sin\left(\angle\left(\mathbf{R}\mathbf{f}_0,\mathbf{t}\right)\right)}{\sin\left(\angle\left(\mathbf{R}\mathbf{f}_0,\mathbf{f}_1\right)\right)}\|\mathbf{t}\|=\frac{\|\mathbf{R}\hat{\mathbf{f}}_0\times\mathbf{t}\|}{\|\mathbf{R}\hat{\mathbf{f}}_0\times\hat{\mathbf{f}}_1\|}. \quad(4)$
The same formulas are also used to estimate the depths when the two rays are skew. The 3D points at depths $\lambda_0$ and $\lambda_1$ on the respective rays are:
$\mathbf{t}+\lambda_0\mathbf{R}\hat{\mathbf{f}}_0=\mathbf{t}+\frac{\|\mathbf{f}_1\times\mathbf{t}\|}{\|\mathbf{R}\mathbf{f}_0\times\mathbf{f}_1\|}\mathbf{R}\mathbf{f}_0\quad\text{and}\quad\lambda_1\hat{\mathbf{f}}_1=\frac{\|\mathbf{R}\mathbf{f}_0\times\mathbf{t}\|}{\|\mathbf{R}\mathbf{f}_0\times\mathbf{f}_1\|}\mathbf{f}_1 \quad(5)$
Taking the midpoint of these two points gives:
$\mathbf{x}_1'=\frac{1}{2}\left(\mathbf{t}+\frac{\|\mathbf{f}_1\times\mathbf{t}\|}{\|\mathbf{R}\mathbf{f}_0\times\mathbf{f}_1\|}\mathbf{R}\mathbf{f}_0+\frac{\|\mathbf{R}\mathbf{f}_0\times\mathbf{t}\|}{\|\mathbf{R}\mathbf{f}_0\times\mathbf{f}_1\|}\mathbf{f}_1\right). \quad(6)$
Letting $\mathbf{p}=\mathbf{R}\hat{\mathbf{f}}_0\times\hat{\mathbf{f}}_1$, $\mathbf{q}=\mathbf{R}\hat{\mathbf{f}}_0\times\mathbf{t}$ and $\mathbf{r}=\hat{\mathbf{f}}_1\times\mathbf{t}$, the depth formulas can be rewritten as:
$\lambda_{0}=\frac{\|\mathbf{r}\|}{\|\mathbf{p}\|},\quad\lambda_{1}=\frac{\|\mathbf{q}\|}{\|\mathbf{p}\|} \quad(7)$
These have a form similar to the depths given by the classical midpoint method:
$\lambda_{\text{mid}0}=\frac{\hat{\mathbf{p}}\cdot\mathbf{r}}{\|\mathbf{p}\|},\quad\lambda_{\text{mid}1}=\frac{\hat{\mathbf{p}}\cdot\mathbf{q}}{\|\mathbf{p}\|}. \quad(8)$
The two sets of formulas differ only in the numerators: Eq. (7) uses the magnitudes of $\mathbf{r}$ and $\mathbf{q}$, whereas Eq. (8) projects them onto $\hat{\mathbf{p}}$. Since a projection onto a unit vector can never exceed the magnitude, $\lambda_0\geq \lambda_{\text{mid}0}$ and $\lambda_1\geq \lambda_{\text{mid}1}$ always hold. In most cases this means that the midpoint obtained with this method is farther away than the classical midpoint; Fig. 2 shows such an example. Estimating a point farther from the cameras generally leads to a smaller estimated parallax angle than the traditional triangulation method gives.
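To make the comparison between Eq. (7) and Eq. (8) concrete, the sketch below evaluates both sets of depths for the same pair of rays. All names (`V3`, `Depths`, `alternativeDepths`, `classicalDepths`) are hypothetical; `a` stands for $\mathbf{R}\hat{\mathbf{f}}_0$ and `b` for $\hat{\mathbf{f}}_1$, both assumed to be unit vectors and not parallel.

```cpp
#include <cmath>

struct V3 { double x, y, z; };

double dot(const V3& a, const V3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
V3 cross(const V3& a, const V3& b) {
  return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
double norm(const V3& a) { return std::sqrt(dot(a, a)); }

struct Depths { double lambda0, lambda1; };

// Eq. (7): depths from the magnitudes of r = b x t and q = a x t.
Depths alternativeDepths(const V3& a, const V3& b, const V3& t) {
  const double p_norm = norm(cross(a, b));
  return {norm(cross(b, t)) / p_norm, norm(cross(a, t)) / p_norm};
}

// Eq. (8): classical midpoint depths, using p_hat . r / |p| = p . r / |p|^2.
Depths classicalDepths(const V3& a, const V3& b, const V3& t) {
  const V3 p = cross(a, b);
  const double p_sq = dot(p, p);
  return {dot(p, cross(b, t)) / p_sq, dot(p, cross(a, t)) / p_sq};
}
```

For intersecting rays the two functions agree; for skew rays the depths from `alternativeDepths` are never smaller than the classical ones, and only the classical depths can become negative.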

2.3 Cheirality (in multiple-view geometry, the constraint that 3D points have positive depth)

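[Fig. 3]
A triangulated point with negative depth violates the cheirality (positive-depth) constraint. This can happen for many reasons, for example a wrong point correspondence or noise in image points near the epipole. It is usually not a serious problem, because the depth sign of each point is easy to check and bad points can be discarded. For the classical midpoint method, it suffices to check the signs of the depths given by Eq. (8). For the method proposed here this check does not work, because the depths it produces are always positive. Fig. 3 illustrates the difference between the two methods. With this method, therefore, the depths alone cannot tell whether a triangulation result is reliable.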

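A different test is therefore needed to decide whether a point is bad (a point is bad if it satisfies the inequality below): discard the correspondence if flipping the sign of at least one depth to negative would make the distance between the two points on the rays no larger:
$\|\mathbf{t}+\lambda_0\mathbf{R}\hat{\mathbf{f}}_0-\lambda_1\hat{\mathbf{f}}_1\|^2\geq\min\left(\|\mathbf{t}+\lambda_0\mathbf{R}\hat{\mathbf{f}}_0+\lambda_1\hat{\mathbf{f}}_1\|^2,\|\mathbf{t}-\lambda_0\mathbf{R}\hat{\mathbf{f}}_0-\lambda_1\hat{\mathbf{f}}_1\|^2,\|\mathbf{t}-\lambda_0\mathbf{R}\hat{\mathbf{f}}_0+\lambda_1\hat{\mathbf{f}}_1\|^2\right)\quad(9)$
For the classical midpoint method, setting $\lambda_0=|\lambda_{\text{mid}0}|$ and $\lambda_1=|\lambda_{\text{mid}1}|$ in Eq. (9) effectively gives the same result as the cheirality check (unsurprisingly, since both verify whether the depths are greater than zero). For the proposed method, Eq. (9) holds in the case of Fig. 3a because the two points are closest when $\lambda_0=-|\lambda_{\text{mid}0}|$ and $\lambda_1=-|\lambda_{\text{mid}1}|$.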
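To see what inequality (9) actually compares, it can help to expand the squared norms. The following is a short derivation sketch; the shorthand $\mathbf{a}$, $\mathbf{b}$, $s_0$, $s_1$ and $g$ is introduced here for illustration only.

```latex
\begin{aligned}
&\mathbf{a}=\lambda_0\mathbf{R}\hat{\mathbf{f}}_0,\quad
 \mathbf{b}=\lambda_1\hat{\mathbf{f}}_1,\quad
 s_0,s_1\in\{+1,-1\},\\
&\|\mathbf{t}+s_0\mathbf{a}-s_1\mathbf{b}\|^2
 =\|\mathbf{t}\|^2+\|\mathbf{a}\|^2+\|\mathbf{b}\|^2+2\,g(s_0,s_1),\\
&g(s_0,s_1)=s_0\,\mathbf{t}\cdot\mathbf{a}-s_1\,\mathbf{t}\cdot\mathbf{b}-s_0s_1\,\mathbf{a}\cdot\mathbf{b}.
\end{aligned}
```

The first three terms are the same for all four sign choices, so the bad-point condition reduces to $g(+1,+1)\geq\min\left(g(+1,-1),\,g(-1,+1),\,g(-1,-1)\right)$, where $(s_0,s_1)=(+1,+1)$ corresponds to both depths being positive.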

2.4 Inverse Depth Weighted (IDW) midpoint

The unweighted midpoint given by Eq. (6) often produces disproportionate reprojection errors in the two images; Fig. 3c shows an example. Note that the ray with the smaller depth tends to have the larger reprojection error. To compensate for this imbalance, the paper suggests using the inverse depths as weights:
$\mathbf{x}_1'=\frac{\lambda_0^{-1}\left(\mathbf{t}+\lambda_0\mathbf{R}\hat{\mathbf{f}}_0\right)+\lambda_1^{-1}\left(\lambda_1\hat{\mathbf{f}}_1\right)}{\lambda_0^{-1}+\lambda_1^{-1}}=\frac{\|\mathbf{q}\|}{\|\mathbf{q}\|+\|\mathbf{r}\|}\left(\mathbf{t}+\frac{\|\mathbf{r}\|}{\|\mathbf{p}\|}\left(\mathbf{R}\hat{\mathbf{f}}_0+\hat{\mathbf{f}}_1\right)\right). \quad(10)$
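The step from the weighted average to the closed form is brief, so here is the intermediate algebra as a sketch. It substitutes the depths of Eq. (7), $\lambda_0=\|\mathbf{r}\|/\|\mathbf{p}\|$ and $\lambda_1=\|\mathbf{q}\|/\|\mathbf{p}\|$, and writes $\mathbf{R}\hat{\mathbf{f}}_0$ and $\hat{\mathbf{f}}_1$ for the two unit ray directions.

```latex
\begin{aligned}
\lambda_0^{-1}&=\frac{\|\mathbf{p}\|}{\|\mathbf{r}\|},\qquad
\lambda_1^{-1}=\frac{\|\mathbf{p}\|}{\|\mathbf{q}\|},\qquad
\lambda_0^{-1}+\lambda_1^{-1}
 =\frac{\|\mathbf{p}\|\left(\|\mathbf{q}\|+\|\mathbf{r}\|\right)}{\|\mathbf{q}\|\,\|\mathbf{r}\|},\\
\mathbf{x}_1'&=\frac{\|\mathbf{q}\|\,\|\mathbf{r}\|}{\|\mathbf{p}\|\left(\|\mathbf{q}\|+\|\mathbf{r}\|\right)}
 \left(\frac{\|\mathbf{p}\|}{\|\mathbf{r}\|}\,\mathbf{t}+\mathbf{R}\hat{\mathbf{f}}_0+\hat{\mathbf{f}}_1\right)
 =\frac{\|\mathbf{q}\|}{\|\mathbf{q}\|+\|\mathbf{r}\|}
 \left(\mathbf{t}+\frac{\|\mathbf{r}\|}{\|\mathbf{p}\|}\left(\mathbf{R}\hat{\mathbf{f}}_0+\hat{\mathbf{f}}_1\right)\right).
\end{aligned}
```

Only the three cross-product norms are needed, which is why the implementation never computes the depths explicitly before forming the point.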

3. Implementation

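// Assumption: Vec3 / Mat3 are Eigen aliases (the original snippet does not define them).
#include <algorithm>  // std::min
#include <Eigen/Dense>

using Mat3 = Eigen::Matrix3d;
using Vec3 = Eigen::Vector3d;

// Helper function.
// Computes the relative motion (R, t) between two absolute poses (R0, t0) and (R1, t1),
// so that x_C1 = R * x_C0 + t, and rotates the bearing vector x0 into frame C1 (Rx0).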
inline void AbsoluteToRelative(const Mat3 &R0, const Vec3 &t0, const Mat3 &R1, const Vec3 &t1, const Vec3 &x0, Mat3 &R, Vec3 &t, Vec3 &Rx0)
{
  R = R1 * R0.transpose();
  t = t1 - R * t0;
  Rx0 = R * x0;
}

bool TriangulateIDWMidpoint(const Mat3 & R0, const Vec3 & t0, const Vec3 & x0, const Mat3 & R1, const Vec3 & t1, const Vec3 & x1, Vec3* X_euclidean)
{
  // absolute to relative
  Mat3 R;
  Vec3 t, Rx0;
  AbsoluteToRelative(R0, t0, R1, t1, x0, R, t, Rx0);

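  // Norms of the cross products p, q and r defined in Section 2.2.
  // Eq. (10) is derived for unit bearing vectors, so x0 and x1 are assumed normalized.
  const double p_norm = Rx0.cross(x1).norm();
  const double q_norm = Rx0.cross(t).norm();
  const double r_norm = x1.cross(t).norm();

  // Parallel rays (p = 0) leave the depths of Eq. (7) undefined.
  if (p_norm == 0.0) return false;

  // Eq. (10): inverse-depth-weighted midpoint, expressed in frame C1.
  // Stored as Vec3 rather than auto so the Eigen expression is evaluated here.
  const Vec3 xprime1 = ( q_norm / (q_norm + r_norm) )
    * ( t + (r_norm / p_norm) * (Rx0 + x1) );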

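  // relative to absolute: x_C1 = R1 * X + t1, hence X = R1^T * (x_C1 - t1)
  *X_euclidean = R1.transpose() * (xprime1 - t1);

  // Eq. (7): the points on each ray at depths lambda0 and lambda1
  const Vec3 lambda0_Rx0 = (r_norm / p_norm) * Rx0;
  const Vec3 lambda1_x1 = (q_norm / p_norm) * x1;

  // Eq. (9): cheirality test; returns true when the point passes and should be kept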
  return (t + lambda0_Rx0 - lambda1_x1).squaredNorm()
    <
    std::min(std::min(
      (t + lambda0_Rx0 + lambda1_x1).squaredNorm(),
      (t - lambda0_Rx0 - lambda1_x1).squaredNorm()),
      (t - lambda0_Rx0 + lambda1_x1).squaredNorm());
}
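As a quick sanity check of Eq. (10) that does not depend on Eigen, the core of the function above can be restated in the relative frame with a tiny hand-rolled vector type. This is only a sketch: `V3` and `idwMidpoint` are hypothetical names, the inputs are assumed to be unit ray directions already expressed in $C_1$, and the cheirality test is left out.

```cpp
#include <cmath>

struct V3 { double x, y, z; };

V3 operator+(const V3& a, const V3& b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
V3 operator*(double s, const V3& a) { return {s * a.x, s * a.y, s * a.z}; }
double dot(const V3& a, const V3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
V3 cross(const V3& a, const V3& b) {
  return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
double norm(const V3& a) { return std::sqrt(dot(a, a)); }

// Eq. (10) in the relative frame: Rf0_hat and f1_hat are unit ray
// directions in C1 and t is the relative translation.
V3 idwMidpoint(const V3& Rf0_hat, const V3& f1_hat, const V3& t) {
  const double p_norm = norm(cross(Rf0_hat, f1_hat));
  const double q_norm = norm(cross(Rf0_hat, t));
  const double r_norm = norm(cross(f1_hat, t));
  return (q_norm / (q_norm + r_norm)) *
         (t + (r_norm / p_norm) * (Rf0_hat + f1_hat));
}
```

For two rays that intersect exactly, both points on the rays coincide with the intersection, so the weighted average must return that point. For instance, with `t = (3, 0, 0)`, `Rf0_hat = (-0.6, 0, 0.8)` and `f1_hat = (0, 0, 1)` the rays meet at `(0, 0, 4)` in $C_1$.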