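My earlier human pose detection work was mostly based on frameworks such as OpenPose or YOLO-Pose. This time I want to do human pose detection with a different open-source implementation. First, a look at the result:
MediaPipe is an open-source project from Google that provides cross-platform solutions for common ML tasks. The project page is here, as shown below:
The GitHub repository is here, as shown below:
The MediaPipe toolkit consists of the framework and the Solutions. The framework is written in C++, Java, and Objective-C, and includes the Calculator API (C++), the Graph construction API (Protobuf), and the Graph Execution API (C++, Java, Obj-C). The Solutions are open-source prebuilt examples based on specific pretrained TensorFlow or TFLite models, and they are built on top of the framework. It currently provides 16 Solutions: face detection, Face Mesh, iris, hands, pose, whole body (Holistic), selfie segmentation, hair segmentation, object detection, Box Tracking, Instant Motion Tracking, 3D object detection, feature matching, AutoFlip, MediaSequence, and YouTube-8M. As shown below:
The goal here is to implement human pose detection with MediaPipe.
The core code is simple, as shown below: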
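import cv2
import mediapipe as mp

mp_pose = mp.solutions.pose
mp_drawing = mp.solutions.drawing_utils
# static_image_mode=True runs person detection on every image independently;
# smooth_landmarks and min_tracking_confidence are ignored in this mode.
pose = mp_pose.Pose(static_image_mode=True,
                    smooth_landmarks=True,
                    min_detection_confidence=0.5,
                    min_tracking_confidence=0.5
                    )
# Load the image with OpenCV (assumption: the path is a placeholder, and
# the original post does not show how img was loaded). OpenCV returns BGR.
img = cv2.imread("test.jpg")
# Pose estimation: MediaPipe expects RGB input, so convert before processing
results = pose.process(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
print("results: ", results.pose_landmarks)
# Visualization: draw the landmarks and their connections on the image
mp_drawing.draw_landmarks(img, results.pose_landmarks, mp_pose.POSE_CONNECTIONS)
look_img(img)  # the author's display helper; its definition is not shown in the post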
The result is shown below:
Let's test a few more images, as shown below:
The results are decent, considering this is an out-of-the-box tool.
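When running over more test images, some of them may contain no detectable person, in which case `results.pose_landmarks` is `None`. The following is a minimal sketch of a guard for that case. The helper name `extract_landmarks` is hypothetical and not part of MediaPipe or the original code, and the `SimpleNamespace` object only stands in for a real result so the snippet runs without a model.

```python
from types import SimpleNamespace


def extract_landmarks(results):
    """Return the landmarks as a list, or an empty list when no pose was found."""
    if results.pose_landmarks is None:
        return []
    return list(results.pose_landmarks.landmark)


# Stand-in for the result of an image with no detected person.
empty = SimpleNamespace(pose_landmarks=None)
print(len(extract_landmarks(empty)))
```

With a real result, `extract_landmarks(results)` would be called right after `pose.process(...)`, before any code that indexes into the landmarks.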
The computed landmarks are printed as well, as shown below:
landmark {
x: 0.8632122278213501
y: 0.39934223890304565
z: -0.0021378363016992807
visibility: 1.0
}
landmark {
x: 0.8844274282455444
y: 0.387251079082489
z: 0.005716356914490461
visibility: 1.0
}
landmark {
x: 0.8930062055587769
y: 0.3897208869457245
z: -0.002450103173032403
visibility: 1.0
}
landmark {
x: 0.9013738632202148
y: 0.39209526777267456
z: 0.0007952205487526953
visibility: 1.0
}
landmark {
x: 0.8574270009994507
y: 0.378071129322052
z: 0.0035106923896819353
visibility: 1.0
}
landmark {
x: 0.8468424081802368
y: 0.37453708052635193
z: 0.003549371613189578
visibility: 1.0
}
landmark {
x: 0.8365076184272766
y: 0.3711674213409424
z: -0.0029587973840534687
visibility: 1.0
}
landmark {
x: 0.8933941125869751
y: 0.3996911644935608
z: -0.0009722764370962977
visibility: 0.9999998807907104
}
landmark {
x: 0.8056524991989136
y: 0.37151792645454407
z: 0.0004327383066993207
visibility: 0.9999991655349731
}
landmark {
x: 0.8579585552215576
y: 0.4187290668487549
z: 6.539761670865119e-05
visibility: 0.9999994039535522
}
landmark {
x: 0.8241762518882751
y: 0.40628987550735474
z: 0.004473073408007622
visibility: 0.9999991655349731
}
landmark {
x: 0.8441612720489502
y: 0.5035648345947266
z: -0.25567159056663513
visibility: 0.9999856948852539
}
landmark {
x: 0.6964893341064453
y: 0.39029890298843384
z: -0.24796368181705475
visibility: 0.9999732971191406
}
landmark {
x: 0.8154499530792236
y: 0.6286311745643616
z: -0.5366763472557068
visibility: 0.9982965588569641
}
landmark {
x: 0.567449688911438
y: 0.37861955165863037
z: -0.5087283253669739
visibility: 0.9612370133399963
}
landmark {
x: 0.8140097856521606
y: 0.5123477578163147
z: -0.5489024519920349
visibility: 0.9991863369941711
}
landmark {
x: 0.6818426847457886
y: 0.3975215554237366
z: -0.6002770066261292
visibility: 0.9928218722343445
}
landmark {
x: 0.8133549690246582
y: 0.4815672039985657
z: -0.5735042691230774
visibility: 0.9980783462524414
}
landmark {
x: 0.7153089642524719
y: 0.4053640067577362
z: -0.6815528273582458
visibility: 0.9833958148956299
}
landmark {
x: 0.822299599647522
y: 0.4749387800693512
z: -0.5469208359718323
visibility: 0.9978579878807068
}
landmark {
x: 0.7317229509353638
y: 0.4011012315750122
z: -0.6464983820915222
visibility: 0.9760293364524841
}
landmark {
x: 0.8098946809768677
y: 0.48704978823661804
z: -0.5442765355110168
visibility: 0.9976638555526733
}
landmark {
x: 0.7142665982246399
y: 0.4025900065898895
z: -0.6355682611465454
visibility: 0.9730454087257385
}
landmark {
x: 0.546131432056427
y: 0.5851266384124756
z: -0.09281446784734726
visibility: 0.9991334080696106
}
landmark {
x: 0.508682131767273
y: 0.48553502559661865
z: 0.04270992800593376
visibility: 0.9998688697814941
}
landmark {
x: 0.5611737966537476
y: 0.7204070091247559
z: -0.24625882506370544
visibility: 0.9888652563095093
}
landmark {
x: 0.42196184396743774
y: 0.3439556956291199
z: -0.3940356969833374
visibility: 0.9972810745239258
}
landmark {
x: 0.5202962756156921
y: 0.8794336915016174
z: -0.15859434008598328
visibility: 0.9323670268058777
}
landmark {
x: 0.31913697719573975
y: 0.17321842908859253
z: -0.18911319971084595
visibility: 0.9997671246528625
}
landmark {
x: 0.49125418066978455
y: 0.9076821208000183
z: -0.21756611764431
visibility: 0.90669184923172
}
landmark {
x: 0.26992592215538025
y: 0.1453598439693451
z: -0.29918187856674194
visibility: 0.99979168176651
}
landmark {
x: 0.570073127746582
y: 0.9116483330726624
z: -0.3478630483150482
visibility: 0.9282693862915039
}
landmark {
x: 0.381407231092453
y: 0.09821030497550964
z: -0.4874095618724823
visibility: 0.9996092915534973
}
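The `x` and `y` values above are normalized to the range [0, 1] by the image width and height, so they have to be scaled back before they can be used as pixel positions. Below is a minimal sketch of that conversion. The function name `to_pixel` and the 640x480 image size are assumptions for illustration and do not come from the original post.

```python
def to_pixel(x_norm, y_norm, image_width, image_height):
    """Scale normalized landmark coordinates to integer pixel coordinates."""
    # Clamp so that values slightly outside [0, 1] still land inside the image.
    px = min(max(int(x_norm * image_width), 0), image_width - 1)
    py = min(max(int(y_norm * image_height), 0), image_height - 1)
    return px, py


# First landmark from the output above, on a hypothetical 640x480 image.
print(to_pixel(0.8632122278213501, 0.39934223890304565, 640, 480))  # (552, 191)
```

Clamping is a design choice here: MediaPipe can report coordinates slightly outside the frame for body parts that are partly out of view, and clamping keeps them drawable.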
To make them easier to show in the interface, I parsed them; the output is as follows:
x: 0.8632, y: 0.3993, z: -0.0021, visibility: 1.0
x: 0.8844, y: 0.3873, z: 0.0057, visibility: 1.0
x: 0.893, y: 0.3897, z: -0.0025, visibility: 1.0
x: 0.9014, y: 0.3921, z: 0.0008, visibility: 1.0
x: 0.8574, y: 0.3781, z: 0.0035, visibility: 1.0
x: 0.8468, y: 0.3745, z: 0.0035, visibility: 1.0
x: 0.8365, y: 0.3712, z: -0.003, visibility: 1.0
x: 0.8934, y: 0.3997, z: -0.001, visibility: 1.0
x: 0.8057, y: 0.3715, z: 0.0004, visibility: 1.0
x: 0.858, y: 0.4187, z: 0.0001, visibility: 1.0
x: 0.8242, y: 0.4063, z: 0.0045, visibility: 1.0
x: 0.8442, y: 0.5036, z: -0.2557, visibility: 1.0
x: 0.6965, y: 0.3903, z: -0.248, visibility: 1.0
x: 0.8154, y: 0.6286, z: -0.5367, visibility: 0.9983
x: 0.5674, y: 0.3786, z: -0.5087, visibility: 0.9612
x: 0.814, y: 0.5123, z: -0.5489, visibility: 0.9992
x: 0.6818, y: 0.3975, z: -0.6003, visibility: 0.9928
x: 0.8134, y: 0.4816, z: -0.5735, visibility: 0.9981
x: 0.7153, y: 0.4054, z: -0.6816, visibility: 0.9834
x: 0.8223, y: 0.4749, z: -0.5469, visibility: 0.9979
x: 0.7317, y: 0.4011, z: -0.6465, visibility: 0.976
x: 0.8099, y: 0.487, z: -0.5443, visibility: 0.9977
x: 0.7143, y: 0.4026, z: -0.6356, visibility: 0.973
x: 0.5461, y: 0.5851, z: -0.0928, visibility: 0.9991
x: 0.5087, y: 0.4855, z: 0.0427, visibility: 0.9999
x: 0.5612, y: 0.7204, z: -0.2463, visibility: 0.9889
x: 0.422, y: 0.344, z: -0.394, visibility: 0.9973
x: 0.5203, y: 0.8794, z: -0.1586, visibility: 0.9324
x: 0.3191, y: 0.1732, z: -0.1891, visibility: 0.9998
x: 0.4913, y: 0.9077, z: -0.2176, visibility: 0.9067
x: 0.2699, y: 0.1454, z: -0.2992, visibility: 0.9998
x: 0.5701, y: 0.9116, z: -0.3479, visibility: 0.9283
x: 0.3814, y: 0.0982, z: -0.4874, visibility: 0.9996
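The parsing code itself is not shown in the post. One way to produce lines in this format is to round each field to four decimal places. This is a sketch under that assumption: the helper name `format_landmark` is hypothetical, and the `SimpleNamespace` object stands in for one entry of `results.pose_landmarks.landmark`.

```python
from types import SimpleNamespace


def format_landmark(lm, ndigits=4):
    """Return one display line for an object with x, y, z and visibility fields."""
    fields = ("x", "y", "z", "visibility")
    return ", ".join(f"{name}: {round(getattr(lm, name), ndigits)}" for name in fields)


# Stand-in for the third landmark in the raw output above.
sample = SimpleNamespace(x=0.8930062055587769, y=0.3897208869457245,
                         z=-0.002450103173032403, visibility=1.0)
print(format_landmark(sample))  # x: 0.893, y: 0.3897, z: -0.0025, visibility: 1.0
```

Using `round` rather than fixed-width formatting drops trailing zeros, which is consistent with values such as `0.893` in the listing.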
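I then built a dedicated visualization interface for visual inference, as shown below:
Because of the limited interface size, only the Top 10 keypoints are shown here. If you are interested, feel free to try it yourself.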
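To make a short list like this easier to read, each keypoint can be paired with its name. The sketch below assumes that "Top 10" means the first ten landmarks in index order, which the post does not state. The name list follows the 33-point MediaPipe Pose layout, and `first_n_named` is a hypothetical helper.

```python
# Landmark names in index order, following MediaPipe Pose's 33-point layout.
POSE_LANDMARK_NAMES = [
    "nose", "left_eye_inner", "left_eye", "left_eye_outer",
    "right_eye_inner", "right_eye", "right_eye_outer",
    "left_ear", "right_ear", "mouth_left", "mouth_right",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_pinky", "right_pinky",
    "left_index", "right_index", "left_thumb", "right_thumb",
    "left_hip", "right_hip", "left_knee", "right_knee",
    "left_ankle", "right_ankle", "left_heel", "right_heel",
    "left_foot_index", "right_foot_index",
]


def first_n_named(points, n=10):
    """Pair the first n points with their landmark names."""
    return list(zip(POSE_LANDMARK_NAMES[:n], points[:n]))


# Two (x, y) pairs taken from the parsed output above.
points = [(0.8632, 0.3993), (0.8844, 0.3873)]
for name, point in first_n_named(points):
    print(name, point)
```

In MediaPipe itself the same names are available through the `mp_pose.PoseLandmark` enum, so the hardcoded list is only needed when the display code should not depend on the library.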