arXiv-2020
Table of Contents
- 1 Background and Motivation
- 2 Advantages / Contributions
- 3 Method
- 4 Experiments
- 5 Conclusion(own)
1 Background and Motivation
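Human keypoint estimation is difficult because of a wide variety of poses, numerous degrees of freedom, and occlusions.
This paper does not focus on solving these difficulties; it targets speed instead, aiming for faster inference.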
2 Advantages / Contributions
- a novel body pose tracking solution
- a lightweight body pose estimation neural network
3 Method
The overall inference pipeline combines tracking with keypoint detection.
The tracker predicts
- key-point coordinates
- the presence of the person on the current frame
- the refined region of interest for the current frame
When the tracker indicates that there is no human present, we re-run the detector network on the next frame.
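The detector/tracker hand-off described above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the MediaPipe implementation: `detect_roi`, `track_pose`, `TrackResult`, and the `RoI` tuple layout are hypothetical names standing in for the detector and tracker networks, and carrying the refined RoI over to the next frame is an assumption about how it is reused.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical RoI layout: (center_x, center_y, size, rotation).
RoI = Tuple[float, float, float, float]
Keypoints = List[Tuple[float, float]]


@dataclass
class TrackResult:
    keypoints: Keypoints   # key-point coordinates
    person_present: bool   # presence of the person on the current frame
    refined_roi: RoI       # refined region of interest


def run_pipeline(
    frames: Iterable[object],
    detect_roi: Callable[[object], Optional[RoI]],
    track_pose: Callable[[object, RoI], TrackResult],
) -> List[Optional[Keypoints]]:
    """Run the detector only when there is no RoI to track from."""
    roi: Optional[RoI] = None
    outputs: List[Optional[Keypoints]] = []
    for frame in frames:
        if roi is None:
            # No person is being tracked, so fall back to the detector.
            roi = detect_roi(frame)
            if roi is None:
                outputs.append(None)
                continue
        result = track_pose(frame, roi)
        if result.person_present:
            outputs.append(result.keypoints)
            roi = result.refined_roi  # assumption: reused on the next frame
        else:
            outputs.append(None)
            roi = None  # the detector is re-run on the next frame
    return outputs
```

With this structure the detector runs only on the first frame and on frames that follow a lost track, which is where the speed-up comes from.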
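Note that no person detector is used to find the person. A face detector is used instead: starting from the face, it finds the RoI, the hip midpoint, the shoulder midpoint, and the angle of the line through the hip midpoint and the shoulder midpoint. The person can then be rotated so that this line is parallel to the vertical direction, which aligns the input.
Image source: 简单几行代码玩转实时人体姿态追踪算法BlazePose
The aligned pose resembles Leonardo da Vinci's Vitruvian Man. Aligning the person this way also helps tracking.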
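The alignment step can be illustrated with a small geometric sketch. It assumes image coordinates with the y axis pointing down and that both midpoints are already available; the function names `midpoint`, `incline_angle`, and `align_upright` are hypothetical and not taken from the paper or from MediaPipe.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def midpoint(a: Point, b: Point) -> Point:
    """Midpoint of two points, e.g. of the left and right hip."""
    return ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0)


def incline_angle(hip_mid: Point, shoulder_mid: Point) -> float:
    """Angle (radians) between the hip-to-shoulder line and the upward vertical.

    The image y axis points down, so an upright person has the shoulders
    at a smaller y than the hips and the angle is 0.
    """
    vx = shoulder_mid[0] - hip_mid[0]
    vy = shoulder_mid[1] - hip_mid[1]
    return math.atan2(vx, -vy)


def align_upright(points: List[Point], hip_mid: Point, shoulder_mid: Point) -> List[Point]:
    """Rotate points about the hip midpoint so the torso line becomes vertical."""
    theta = incline_angle(hip_mid, shoulder_mid)
    c, s = math.cos(theta), math.sin(theta)
    aligned = []
    for x, y in points:
        dx, dy = x - hip_mid[0], y - hip_mid[1]
        # Rotation by -theta around the hip midpoint.
        aligned.append((hip_mid[0] + dx * c + dy * s,
                        hip_mid[1] - dx * s + dy * c))
    return aligned
```

For example, a person lying on their side with the shoulders to the right of the hips has an incline of 90 degrees; after `align_upright` the shoulder midpoint sits directly above the hip midpoint.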
The model predicts 33 keypoints.
The keypoints are listed below.
- Nose
- Left eye inner
- Left eye
- Left eye outer
- Right eye inner
- Right eye
- Right eye outer
- Left ear
- Right ear
- Mouth left
- Mouth right
- Left shoulder
- Right shoulder
- Left elbow
- Right elbow
- Left wrist
- Right wrist
- Left pinky #1 knuckle
- Right pinky #1 knuckle
- Left index #1 knuckle
- Right index #1 knuckle
- Left thumb #2 knuckle
- Right thumb #2 knuckle
- Left hip
- Right hip
- Left knee
- Right knee
- Left ankle
- Right ankle
- Left heel
- Right heel
- Left foot index
- Right foot index
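The keypoint prediction model has the following structure.
It combines heatmap-based keypoint prediction (accurate) with direct coordinate regression (fast).
Both are used during training. They share part of the feature maps but not the gradients (the gradients from the regression encoder are not propagated back to the heatmap-trained features). Cutting the gradient not only improves the heatmap predictions but also substantially increases the coordinate regression accuracy.
At inference time, only the regression branch is kept.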
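To make the "shared features, unshared gradients" idea concrete, here is a toy scalar sketch with hand-written gradients. It is an illustration only, not the BlazePose network: the one-weight "backbone" `w`, the head weights `a` and `b`, the squared-error losses, and the function name `training_grads` are all assumptions.

```python
from typing import Tuple


def training_grads(
    w: float, a: float, b: float, x: float,
    t_heat: float, t_reg: float,
    stop_gradient: bool = True,
) -> Tuple[float, float]:
    """Return (gradient w.r.t. shared weight w, gradient w.r.t. regression weight b).

    Toy model: feature = w * x is shared by a heatmap head (a * feature)
    and a regression head (b * feature), each with a squared-error loss.
    """
    feature = w * x
    heat = a * feature
    reg = b * feature

    # The heatmap loss always trains the shared feature.
    grad_w_from_heat = 2.0 * (heat - t_heat) * a * x

    # With stop_gradient, the regression loss treats the feature as a
    # constant, so nothing flows back into the shared weight.
    grad_w_from_reg = 0.0 if stop_gradient else 2.0 * (reg - t_reg) * b * x

    # The regression head's own weight is trained in both cases.
    grad_b = 2.0 * (reg - t_reg) * feature

    return grad_w_from_heat + grad_w_from_reg, grad_b
```

With `stop_gradient=True` the shared weight is shaped only by the heatmap loss, while the regression head still learns on top of those features. In an autodiff framework the same effect is usually obtained with a stop-gradient or detach operation on the shared feature.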
4 Experiments
Datasets
- AR Dataset
- Yoga Dataset
Training
10% scale and shift augmentations, which help tracking
simulate occlusions (random rectangles filled with various colors); each keypoint also gets a probability indicating whether it is visible or accurately predicted
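The occlusion augmentation can be sketched as below. This is a minimal sketch, assuming the image is a nested list of RGB tuples and that keypoints covered by the rectangle are labeled as not visible; the function name `occlude`, the `max_frac` size limit, and the labeling rule are hypothetical and not taken from the paper.

```python
import random
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]
Rect = Tuple[int, int, int, int]  # (x0, y0, width, height)


def occlude(
    image: Image,
    keypoints: List[Tuple[int, int]],
    rng: random.Random,
    max_frac: float = 0.5,
) -> Tuple[Image, List[bool], Rect]:
    """Paint one random solid-color rectangle and flag the keypoints it covers."""
    h, w = len(image), len(image[0])
    rect_w = rng.randint(1, max(1, int(w * max_frac)))
    rect_h = rng.randint(1, max(1, int(h * max_frac)))
    x0 = rng.randint(0, w - rect_w)
    y0 = rng.randint(0, h - rect_h)
    color = (rng.randint(0, 255), rng.randint(0, 255), rng.randint(0, 255))

    # Copy the rows so the input image is left untouched.
    out = [list(row) for row in image]
    for y in range(y0, y0 + rect_h):
        for x in range(x0, x0 + rect_w):
            out[y][x] = color

    # Assumption: a keypoint inside the rectangle counts as occluded.
    visible = [
        not (x0 <= kx < x0 + rect_w and y0 <= ky < y0 + rect_h)
        for kx, ky in keypoints
    ]
    return out, visible, (x0, y0, rect_w, rect_h)
```

The returned visibility flags could serve as training targets for a per-keypoint visibility output, which is one way to read the note above.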
Testing is performed on the 17 COCO keypoints; the results are as follows.
Evaluation metric: the Percent of Correct Points with 20% tolerance (PCK@0.2), where a point is assumed to be detected correctly if the 2D Euclidean error is smaller than 20% of the corresponding person's torso size.
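The metric above can be computed for a single person as in the following sketch. The torso size is passed in as a number because the notes do not say how it is measured; the function name `pck` and its signature are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def pck(pred: List[Point], gt: List[Point], torso_size: float,
        tolerance: float = 0.2) -> float:
    """Fraction of keypoints whose 2D error is below tolerance * torso_size."""
    if len(pred) != len(gt) or not gt:
        raise ValueError("pred and gt must be non-empty and of equal length")
    threshold = tolerance * torso_size
    correct = sum(
        1
        for (px, py), (gx, gy) in zip(pred, gt)
        if math.hypot(px - gx, py - gy) < threshold
    )
    return correct / len(gt)
```

For example, with a torso size of 10 the threshold is 2 pixels, so a prediction that is 1 pixel off counts as correct and one that is 3 pixels off does not.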
Qualitative results
5 Conclusion(own)
https://github.com/google/mediapipe
The Pose solution is 3D.