CH6_No1:
1-1. What categories can the LK optical flow methods be divided into?
Answer: forward and inverse, and each of the two further comes in an additive and a compositional variant.
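In Baker–Matthews terminology (my own note, not part of the assignment answer), "additive" and "compositional" refer to how the warp parameters are updated after each Gauss-Newton step:
$$ \text{additive: } p \leftarrow p + \Delta p, \qquad \text{compositional: } W(x;\,p) \leftarrow W(x;\,p)\circ W(x;\,\Delta p). $$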
1-2 In the compositional formulation, why do we sometimes need to warp the original image? What is the physical meaning of this warp?
Answer: my understanding is that the warp here corresponds to the operation in Gao Bo's lecture slides where, within a time dt, an image pixel is moved by dx and dy along the x and y directions; the warped image is then compared with the template image T to build the per-pixel photometric error, i.e.:
$$ \min_{\mathrm{d}x,\,\mathrm{d}y} \sum_{(x,y)\in W} \big\| T(x,y) - I(x+\mathrm{d}x,\ y+\mathrm{d}y) \big\|^2 $$
1-3 What is the difference between forward and inverse?
Answer: the questions later in the assignment PDF already give part of the answer. Compared with the forward method, the inverse method takes a shortcut in computing the image gradient. During feature extraction we can only guarantee that the points extracted in the first image are good corners; once the camera has moved, the tracked location in the later image may no longer be a corner (e.g. because of scale changes), so a reliable gradient at that location in $I_2$ cannot be guaranteed. The inverse method therefore uses the gradient of $I_1$ in place of the gradient of $I_2$; since this gradient stays fixed during the iteration, the Hessian only needs to be computed once, instead of being recomputed after every update as in the forward method, which greatly reduces the computation. This also means the two methods optimize slightly different objective functions. According to the literature and other references, the forward method is generally preferred when the template image $T$ is noisier, while the inverse method is preferred when the input image $I$ is noisier.
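For concreteness, the two objective functions in Baker–Matthews notation (my summary; the translation-only case used in this assignment has $W(x;p) = x + p$ with $p = (\Delta x, \Delta y)$):
$$ \text{forward additive: } \min_{\Delta p} \sum_{x} \big[ I\big(W(x;\,p+\Delta p)\big) - T(x) \big]^2, $$
$$ \text{inverse compositional: } \min_{\Delta p} \sum_{x} \big[ T\big(W(x;\,\Delta p)\big) - I\big(W(x;\,p)\big) \big]^2, \qquad W(x;\,p) \leftarrow W(x;\,p)\circ W(x;\,\Delta p)^{-1}. $$
In the inverse form the linearization is done around the template $T$ (here $I_1$), which is why its gradient, and hence the Hessian, stays fixed over the iterations.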
CH6_No2
2-1 From the least-squares point of view, how is the error of each pixel defined?
Answer: for every pixel $(x,y)$ in the patch $W$ around the feature point, the error is the photometric difference between the template image and the shifted target image,
$$ e(x,y) = I_1(x,y) - I_2(x+\Delta x,\ y+\Delta y), $$
and the least-squares problem is $\min_{\Delta x,\Delta y}\sum_{(x,y)\in W} e(x,y)^2$; this is exactly the error term accumulated in the code below.
2-2 How is the derivative of the error with respect to the variables defined?
Answer: see Link 1 (the partial-derivative equation referred to in the code comments below).
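Written out for the translation-only warp (my summary, matching the forward branch of the code in 2-3): with $e(x,y) = I_1(x,y) - I_2(x+\Delta x,\ y+\Delta y)$, the derivative with respect to the variables $(\Delta x, \Delta y)$ is
$$ J = \frac{\partial e}{\partial (\Delta x, \Delta y)} = -\nabla I_2(x+\Delta x,\ y+\Delta y) = -\begin{bmatrix} \partial I_2/\partial x \\ \partial I_2/\partial y \end{bmatrix}, $$
approximated by central differences in the code, and the Gauss-Newton normal equations over the patch $W$ are
$$ H = \sum_{W} J J^{\mathsf T}, \qquad b = -\sum_{W} e\, J, \qquad H\,\delta = b, $$
where $\delta$ is the increment applied to $(\Delta x, \Delta y)$ in each iteration.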
2-3 forward && inverse coding:
void OpticalFlowSingleLevel( // single-level optical flow (no image pyramid)
        const Mat &img1,
        const Mat &img2,
        const vector<KeyPoint> &kp1,
        vector<KeyPoint> &kp2,
        vector<bool> &success,
        bool inverse
) {
    // parameters
    int half_patch_size = 4;   // patch edge = 2 * half_patch_size, i.e. an 8x8 window
    int iterations = 10;
    bool have_initial = !kp2.empty();
    for (size_t i = 0; i < kp1.size(); i++) {
        auto kp = kp1[i];
        double dx = 0, dy = 0;  // dx, dy need to be estimated
        if (have_initial) {
            dx = kp2[i].pt.x - kp.pt.x;
            dy = kp2[i].pt.y - kp.pt.y;
        }

        double cost = 0, lastCost = 0;
        bool succ = true;  // indicate if this point succeeded

        // Gauss-Newton iterations
        for (int iter = 0; iter < iterations; iter++) {
            Eigen::Matrix2d H = Eigen::Matrix2d::Zero();
            Eigen::Vector2d b = Eigen::Vector2d::Zero();
            cost = 0;
            if (kp.pt.x + dx <= half_patch_size || kp.pt.x + dx >= img1.cols - half_patch_size ||
                kp.pt.y + dy <= half_patch_size || kp.pt.y + dy >= img1.rows - half_patch_size) {
                // patch moved outside the image: reject this point
                succ = false;
                break;
            }
            // compute cost and Jacobian over the patch
            for (int x = -half_patch_size; x < half_patch_size; x++)
                for (int y = -half_patch_size; y < half_patch_size; y++) {
                    // TODO START YOUR CODE HERE (~8 lines)
                    // photometric error between the template patch in img1 and the shifted patch in img2
                    double error = GetPixelValue(img1, kp.pt.x + x, kp.pt.y + y) -
                                   GetPixelValue(img2, kp.pt.x + x + dx, kp.pt.y + y + dy);
                    Eigen::Vector2d J;  // Jacobian
                    if (inverse == false) {
                        // Forward: central-difference gradient of img2 at the shifted position
                        // (the partial-derivative equation from Link 1)
                        J = -1.0 * Eigen::Vector2d(
                                0.5 * (GetPixelValue(img2, kp.pt.x + dx + x + 1, kp.pt.y + dy + y) -
                                       GetPixelValue(img2, kp.pt.x + dx + x - 1, kp.pt.y + dy + y)),
                                0.5 * (GetPixelValue(img2, kp.pt.x + dx + x, kp.pt.y + dy + y + 1) -
                                       GetPixelValue(img2, kp.pt.x + dx + x, kp.pt.y + dy + y - 1))
                        );
                    } else {
                        // Inverse: central-difference gradient of img1 at the original position
                        // (no dx, dy offset). This J does not change when dx, dy are updated,
                        // so H only needs to be accumulated in the first iteration.
                        J = -1.0 * Eigen::Vector2d(
                                0.5 * (GetPixelValue(img1, kp.pt.x + x + 1, kp.pt.y + y) -
                                       GetPixelValue(img1, kp.pt.x + x - 1, kp.pt.y + y)),
                                0.5 * (GetPixelValue(img1, kp.pt.x + x, kp.pt.y + y + 1) -
                                       GetPixelValue(img1, kp.pt.x + x, kp.pt.y + y - 1))
                        );
                    }

                    // compute H, b and set cost
                    if (iter == 0 || inverse == false) {
                        H += J * J.transpose();
                    }
                    b += -J * error;
                    cost += error * error;
                    // TODO END YOUR CODE HERE
                }
            // compute update
            // TODO START YOUR CODE HERE (~1 lines)
            Eigen::Vector2d update;
            update = H.ldlt().solve(b);
            // TODO END YOUR CODE HERE

            if (isnan(update[0])) {
                // sometimes occurred when we have a black or white patch and H is irreversible
                cout << "update is nan" << endl;
                succ = false;
                break;
            }
            // cout << "iteration " << iter << " cost=" << cout.precision(12) << cost << endl;
            if (iter > 0 && cost > lastCost) {
                cout << "cost increased: " << cost << ", " << lastCost << endl;
                break;
            }

            // update dx, dy
            dx += update[0];
            dy += update[1];
            lastCost = cost;
            succ = true;
        }

        success.push_back(succ);

        // set kp2
        if (have_initial) {
            kp2[i].pt = kp.pt + Point2f(dx, dy);
        } else {
            KeyPoint tracked = kp;
            tracked.pt += cv::Point2f(dx, dy);
            kp2.push_back(tracked);
        }
    }
}
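The function above relies on GetPixelValue, the bilinear-interpolation helper provided by the assignment skeleton but not shown in this excerpt. A sketch of what such a helper typically looks like (my reconstruction, assuming a single-channel 8-bit image whose coordinates have already been checked against the border):

// requires <opencv2/opencv.hpp> and <cmath>
inline float GetPixelValue(const cv::Mat &img, float x, float y) {
    // pointer to the top-left of the four neighbouring pixels
    uchar *data = &img.data[int(y) * img.step + int(x)];
    float xx = x - std::floor(x);
    float yy = y - std::floor(y);
    // bilinear interpolation of the four neighbours
    return float(
        (1 - xx) * (1 - yy) * data[0] +
        xx * (1 - yy) * data[1] +
        (1 - xx) * yy * data[img.step] +
        xx * yy * data[img.step + 1]);
}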
Computation times obtained with the forward and the inverse method:
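The actual numbers depend on the machine and the number of keypoints, so only the way the timings can be collected is sketched here; the image file names and GFTT detector parameters below are assumptions, not taken from the excerpt above:

#include <chrono>
#include <opencv2/opencv.hpp>

// OpticalFlowSingleLevel and GetPixelValue as above

int main() {
    cv::Mat img1 = cv::imread("LK1.png", 0);   // assumed file names, grayscale
    cv::Mat img2 = cv::imread("LK2.png", 0);

    // detect GFTT corners in the first image
    std::vector<cv::KeyPoint> kp1;
    cv::Ptr<cv::GFTTDetector> detector = cv::GFTTDetector::create(500, 0.01, 20);
    detector->detect(img1, kp1);

    std::vector<cv::KeyPoint> kp2_forward, kp2_inverse;
    std::vector<bool> succ_forward, succ_inverse;

    auto t1 = std::chrono::steady_clock::now();
    OpticalFlowSingleLevel(img1, img2, kp1, kp2_forward, succ_forward, false);
    auto t2 = std::chrono::steady_clock::now();
    OpticalFlowSingleLevel(img1, img2, kp1, kp2_inverse, succ_inverse, true);
    auto t3 = std::chrono::steady_clock::now();

    std::cout << "forward: "
              << std::chrono::duration_cast<std::chrono::duration<double>>(t2 - t1).count()
              << " s, inverse: "
              << std::chrono::duration_cast<std::chrono::duration<double>>(t3 - t2).count()
              << " s" << std::endl;
    return 0;
}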
2-4 coarse-to-fine LK
void OpticalFlowMultiLevel(
        const Mat &img1,
        const Mat &img2,
        const vector<KeyPoint> &kp1,
        vector<KeyPoint> &kp2,
        vector<bool> &success,
        bool inverse) {

    // parameters
    int pyramids = 4;            // number of pyramid levels
    double pyramid_scale = 0.3;  // each level is scaled by 0.3
    // imagine the pyramid: img1 is the bottom level, and going upward each level is
    // the previous one scaled by 0.3, up to the top
    double scales[] = {1.0, 0.3, 0.09, 0.027};

    // create pyramids
    vector<Mat> pyr1, pyr2;  // image pyramids (note: these are vectors of Mat)
    // TODO START YOUR CODE HERE (~8 lines)
    for (int i = 0; i < pyramids; i++) {
        if (i == 0) {
            pyr1.push_back(img1);
            pyr2.push_back(img2);
        } else {
            Mat img1_pyr, img2_pyr;
            // cv::resize scales the previous level down by pyramid_scale
            cv::resize(pyr1[i - 1], img1_pyr,
                       cv::Size(pyr1[i - 1].cols * pyramid_scale, pyr1[i - 1].rows * pyramid_scale));
            cv::resize(pyr2[i - 1], img2_pyr,
                       cv::Size(pyr2[i - 1].cols * pyramid_scale, pyr2[i - 1].rows * pyramid_scale));
            pyr1.push_back(img1_pyr);
            pyr2.push_back(img2_pyr);
        }
    }
    // TODO END YOUR CODE HERE
    // coarse-to-fine LK tracking in pyramids
    // after the pyramids are built, track from the top level down to the bottom,
    // scaling the keypoints back up by 1/pyramid_scale at each level
    // TODO START YOUR CODE HERE
    vector<KeyPoint> kp1_pyr, kp2_pyr;
    for (auto &kp : kp1) {
        auto kp_top = kp;
        kp_top.pt *= scales[pyramids - 1];  // the top level is 0.027x the bottom level (img1)
        kp1_pyr.push_back(kp_top);
        kp2_pyr.push_back(kp_top);
    }

    for (int level = pyramids - 1; level >= 0; level--) {
        success.clear();
        OpticalFlowSingleLevel(pyr1[level], pyr2[level], kp1_pyr, kp2_pyr, success, inverse);
        if (level > 0) {
            // on every level except the bottom one, enlarge the keypoints by 1/pyramid_scale
            // so they serve as the initial guess for the next (finer) level
            for (auto &kp : kp1_pyr) {
                kp.pt /= pyramid_scale;
            }
            for (auto &kp : kp2_pyr) {
                kp.pt /= pyramid_scale;
            }
        }
    }

    // bottom level (the last one processed): copy the tracked keypoints into kp2
    for (auto &kp : kp2_pyr) {
        kp2.push_back(kp);
    }
    // TODO END YOUR CODE HERE
    // don't forget to set the results into kp2
}
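For the comparison with OpenCV mentioned in the discussion below, the baseline can be obtained with cv::calcOpticalFlowPyrLK; a minimal sketch (variable names are my own):

    // kp1: keypoints detected in img1, as in the single-level test;
    // the OpenCV API works on Point2f rather than KeyPoint
    std::vector<cv::Point2f> pt1, pt2;
    for (auto &kp : kp1) pt1.push_back(kp.pt);

    std::vector<uchar> status;
    std::vector<float> error;
    cv::calcOpticalFlowPyrLK(img1, img2, pt1, pt2, status, error);
    // status[i] != 0 means point i was tracked successfully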
2-5 Discussion
1. Does the patch size make a noticeable difference? Do 16x16 and 8x8 patches change the result?
Answer: yes, there is a difference. I tried patch sizes of 8, 16 and 32 (i.e. half_patch_size = 4, 8, 16 in the code); 8 performed worst, while 16 and 32 gave results comparable to OpenCV (for the single-level LK).
2. How does the number of pyramid levels affect the result? And the scaling factor?
Answer:
Comparison experiments:
2.1 Effect of increasing / decreasing the scaling factor:
Starting from 0.5, the factor is reduced to 0.3 and enlarged to 0.8; result images:
Original factor 0.5:
Factor reduced to 0.3:
Factor enlarged to 0.8:
As can be seen, the best result is obtained at a factor of 0.5. In other words, with the number of levels fixed, a suitably chosen scaling factor gives the best result; the factor should be neither too large nor too small, although this may also depend on the image quality of the camera itself.
2.2 Changing the number of levels with the scaling factor fixed:
Three cases: 2, 4 and 8 levels:
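When re-running OpticalFlowMultiLevel for these 2/4/8-level cases (or for a different scaling factor), the hard-coded scales[] array has to stay consistent with pyramid_scale; one way to keep them in sync (my tweak, not part of the original code):

    // requires <cmath> and <vector>
    int pyramids = 8;              // e.g. the 8-level case
    double pyramid_scale = 0.5;    // e.g. the 0.5 factor
    std::vector<double> scales(pyramids);
    for (int i = 0; i < pyramids; i++)
        scales[i] = std::pow(pyramid_scale, i);   // 1.0, 0.5, 0.25, ...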