Head Pose Estimation with OpenCV and Dlib


Preface


  • Notes from a task at work, lightly organized
  • This post covers a simple head pose estimation demo based on OpenCV and Dlib
  • Corrections from readers are welcome

"The misty rain of Mount Lu, the tide of Zhejiang — before you have seen them, a thousand regrets will not fade. Once you have been and returned, there is nothing special after all: the misty rain of Mount Lu, the tide of Zhejiang." — Su Shi, "Misty Rain on Mount Lu and the Zhejiang Tide"


https://github.com/LIRUILONGS/Head-posture-detection-dlib-opencv-.git

The demo project has been uploaded to the repository above; just git clone it and install the required Python packages to get started. Note, however, that Dlib's face detector, based on HOG features and an SVM classifier, is fairly weak and misses many faces. In practice, consider using a deep-learning model for keypoint detection before estimating the pose; see the open-source projects linked at the end of this post.

Results

Demo images (omitted here):
  • Original photo
  • Photo with the 68 landmarks marked
  • Pose annotation box
  • Yaw / Pitch / Roll readings for each pose

Steps

There are three main steps:

Face Detection

Face detection: use the detector returned by dlib.get_frontal_face_detector() to find the faces in an image; when several faces are detected, the one with the largest area is selected.

dlib.get_frontal_face_detector() is a function in the dlib library that returns a face detector based on HOG features and an SVM classifier. The returned object can be called on an image to detect the faces it contains.

Specifically, HOG (Histogram of Oriented Gradients) is a feature descriptor widely used in image recognition, and SVM (Support Vector Machine) is a commonly used classifier. Combining HOG features with an SVM classifier yields an effective frontal face detector.

To use it, simply call the object returned by dlib.get_frontal_face_detector() with the image to be examined. A quick demo:

import dlib
import cv2

# Read the image
img = cv2.imread('image.jpg')

# Get the HOG+SVM face detector
detector = dlib.get_frontal_face_detector()

# Detect faces (dlib expects a grayscale or RGB image, while OpenCV reads BGR)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
faces = detector(gray)

# Print the number of faces found
print("Number of faces detected:", len(faces))

Facial Landmark Detection

Facial landmark detection: the pretrained model shape_predictor_68_face_landmarks.dat takes a face image as input and outputs 68 facial landmarks.

shape_predictor_68_face_landmarks.dat is dlib's facial landmark model. Given an image and a face rectangle from the detector above, it uses a regression algorithm to predict the positions of 68 keypoints — eyes, nose, mouth, jawline, and so on — which can then be used for face recognition, expression analysis, pose estimation, and similar applications.

The model file can be downloaded from the official dlib website. Before using it, install the dlib library and load the model file in your program:

predictor = dlib.shape_predictor(r".\shape_predictor_68_face_landmarks.dat")
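A minimal sketch (image and model paths are placeholders) of running the predictor on each detected face and reading out the landmark coordinates:

import dlib
import cv2

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

img = cv2.imread("image.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

for face in detector(gray):
    shape = predictor(gray, face)      # 68 landmarks for this face rectangle
    for i in range(shape.num_parts):   # shape.num_parts == 68
        p = shape.part(i)              # each part exposes .x and .y pixel coordinates
        print(i, p.x, p.y)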

Pose Estimation

Pose estimation: once the 68 facial landmarks are available, a subset of them is matched against a generic 3D head model, and the PnP algorithm computes the Yaw, Pitch, and Roll angles:

    (success, rotation_vector, translation_vector) = cv2.solvePnP(model_points, image_points, camera_matrix,
                                                                  dist_coeffs, flags=cv2.SOLVEPNP_ITERATIVE)
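Here model_points are 3D reference points on a generic head model, image_points are the matching 2D landmark coordinates, and camera_matrix / dist_coeffs describe the camera intrinsics and lens distortion. solvePnP returns a success flag together with the rotation and translation vectors that map model coordinates into camera coordinates.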

Yaw, Pitch, and Roll describe the rotation of an object or camera in 3D space; they are standard terms in pose estimation and attitude control.

  • Yaw (turning left/right): rotation about the vertical axis, also called the heading angle; in this demo it corresponds to rotation about the y axis.
  • Pitch (nodding up/down): rotation about the side-to-side axis, also called the elevation angle; here it corresponds to rotation about the x axis.
  • Roll (tilting sideways): rotation about the front-to-back axis, also called the bank angle; here it corresponds to rotation about the z axis.

These three angles are usually expressed as Euler angles and together describe the pose of the object or camera. In computer vision they appear in face recognition, motion capture, robot control, and similar applications.
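The full demo below converts the rotation vector to Euler angles by way of a quaternion. As an alternative sketch (the function name is illustrative, and the sign conventions may differ from the quaternion route), OpenCV can do the same conversion with cv2.Rodrigues and cv2.RQDecomp3x3:

import cv2

def euler_from_rotation_vector(rotation_vector):
    """Convert a solvePnP rotation vector to (pitch, yaw, roll) in degrees."""
    rotation_matrix, _ = cv2.Rodrigues(rotation_vector)  # 3x1 vector -> 3x3 matrix
    # RQDecomp3x3 returns the three Euler angles (in degrees) as its first result
    angles, *_ = cv2.RQDecomp3x3(rotation_matrix)
    pitch, yaw, roll = angles  # rotations about the x, y, and z axes
    return pitch, yaw, roll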

Full Demo Code

#!/usr/bin/env python
# -*- encoding: utf-8 -*-
"""
@File    :   face_ypr_demo.py
@Time    :   2023/06/05 21:32:45
@Author  :   Li Ruilong
@Version :   1.0
@Contact :   liruilonger@gmail.com
@Desc    :   Head pose estimation from the 68 facial landmarks
"""

# here put the import lib

import cv2
import numpy as np
import dlib
import math
import uuid

# Head pose estimation (dlib + OpenCV)

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor(r".\shape_predictor_68_face_landmarks.dat")
POINTS_NUM_LANDMARK = 68


# shape_predictor_68_face_landmarks.dat is a pretrained facial landmark model that locates 68 keypoints
# (eyes, nose, mouth, ...) and can support face recognition, expression analysis, and pose estimation.
# It is provided by the dlib library and can be downloaded from the official dlib website.

# Pick the largest face
def _largest_face(dets):
    """
    @Time    :   2023/06/05 21:30:37
    @Author  :   liruilonger@gmail.com
    @Version :   1.0
    @Desc    :   From a list of face rectangles detected by dlib, find the largest one
                 and return its index in the list. If there is only one face, return it directly.
                 Args:
                   dets: a list of `dlib.rectangle` objects, one per detected face
                 Returns:
                   index of the largest face
    """
    # With a single face, return immediately
    if len(dets) == 1:
        return 0
    # Compute the area of each face rectangle
    face_areas = [(det.right() - det.left()) * (det.bottom() - det.top()) for det in dets]
    # Find the index of the rectangle with the largest area
    largest_area = face_areas[0]
    largest_index = 0
    for index in range(1, len(dets)):
        if face_areas[index] > largest_area:
            largest_index = index
            largest_area = face_areas[index]
    # Report the index of the largest face and the total face count
    print("largest_face index is {} in {} faces".format(largest_index, len(dets)))

    return largest_index


def get_image_points_from_landmark_shape(landmark_shape):
    """
    @Time    :   2023/06/05 22:30:02
    @Author  :   liruilonger@gmail.com
    @Version :   1.0
    @Desc    :   Extract the point coordinates needed for pose estimation from the dlib detection result
                 Args:
                   landmark_shape: all 68 landmark positions
                 Returns:
                   (ret, image_points): ret is 0 on success, -1 on failure
    """

    if landmark_shape.num_parts != POINTS_NUM_LANDMARK:
        print("ERROR: landmark_shape.num_parts is {}".format(landmark_shape.num_parts))
        return -1, None

    # 2D image points. If you use a different image, these coordinates change accordingly.

    image_points = np.array([
        (landmark_shape.part(17).x, landmark_shape.part(17).y),  # 17 left brow left corner
        (landmark_shape.part(21).x, landmark_shape.part(21).y),  # 21 left brow right corner
        (landmark_shape.part(22).x, landmark_shape.part(22).y),  # 22 right brow left corner
        (landmark_shape.part(26).x, landmark_shape.part(26).y),  # 26 right brow right corner
        (landmark_shape.part(36).x, landmark_shape.part(36).y),  # 36 left eye left corner
        (landmark_shape.part(39).x, landmark_shape.part(39).y),  # 39 left eye right corner
        (landmark_shape.part(42).x, landmark_shape.part(42).y),  # 42 right eye left corner
        (landmark_shape.part(45).x, landmark_shape.part(45).y),  # 45 right eye right corner
        (landmark_shape.part(31).x, landmark_shape.part(31).y),  # 31 nose left corner
        (landmark_shape.part(35).x, landmark_shape.part(35).y),  # 35 nose right corner
        (landmark_shape.part(48).x, landmark_shape.part(48).y),  # 48 mouth left corner
        (landmark_shape.part(54).x, landmark_shape.part(54).y),  # 54 mouth right corner
        (landmark_shape.part(57).x, landmark_shape.part(57).y),  # 57 mouth central bottom corner
        (landmark_shape.part(8).x, landmark_shape.part(8).y),  # 8 chin corner
    ], dtype="double")
    return 0, image_points


def get_image_points(img):
    """
    @Time    :   2023/06/05 22:30:43
    @Author  :   liruilonger@gmail.com
    @Version :   1.0
    @Desc    :   Detect landmarks with dlib and return the point coordinates needed for pose estimation
                 Args:
                   img: input BGR image
                 Returns:
                   (ret, image_points): ret is 0 on success, -1 on failure
    """

    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # convert the image to grayscale for detection

    dets = detector(gray, 0)

    if 0 == len(dets):
        print("ERROR: found no face")
        return -1, None
    largest_index = _largest_face(dets)
    face_rectangle = dets[largest_index]

    landmark_shape = predictor(img, face_rectangle)
    draw = img.copy()
    # Mark all 68 landmarks with small green dots
    for i in range(landmark_shape.num_parts):
        cv2.circle(draw, (landmark_shape.part(i).x, landmark_shape.part(i).y), 2, (0, 255, 0), -1)

    # Highlight the landmarks used for pose estimation in orange
    pose_point_indices = [
        17,  # left brow left corner
        21,  # left brow right corner
        22,  # right brow left corner
        26,  # right brow right corner
        36,  # left eye left corner
        39,  # left eye right corner
        42,  # right eye left corner
        45,  # right eye right corner
        31,  # nose left corner
        35,  # nose right corner
        48,  # mouth left corner
        54,  # mouth right corner
        57,  # mouth central bottom corner
        8,   # chin corner
    ]
    for i in pose_point_indices:
        cv2.circle(draw, (landmark_shape.part(i).x, landmark_shape.part(i).y), 2, (0, 165, 255), -1)

    # Save the image with the landmarks marked
    cv2.imwrite('new_' + "KeyPointDetection.jpg", draw)

    return get_image_points_from_landmark_shape(landmark_shape)


def get_pose_estimation(img_size, image_points):
    """
    @Time    :   2023/06/05 22:31:31
    @Author  :   liruilonger@gmail.com
    @Version :   1.0
    @Desc    :   Solve PnP to obtain the rotation vector and translation vector
                 Args:
                   img_size: shape of the image (height, width, channels)
                   image_points: 2D landmark coordinates matching model_points
                 Returns:
                   (success, rotation_vector, translation_vector, camera_matrix, dist_coeffs)
    """

    # 3D model points.
    model_points = np.array([
        (6.825897, 6.760612, 4.402142),  # 33 left brow left corner
        (1.330353, 7.122144, 6.903745),  # 29 left brow right corner
        (-1.330353, 7.122144, 6.903745),  # 34 right brow left corner
        (-6.825897, 6.760612, 4.402142),  # 38 right brow right corner
        (5.311432, 5.485328, 3.987654),  # 13 left eye left corner
        (1.789930, 5.393625, 4.413414),  # 17 left eye right corner
        (-1.789930, 5.393625, 4.413414),  # 25 right eye left corner
        (-5.311432, 5.485328, 3.987654),  # 21 right eye right corner
        (2.005628, 1.409845, 6.165652),  # 55 nose left corner
        (-2.005628, 1.409845, 6.165652),  # 49 nose right corner
        (2.774015, -2.080775, 5.048531),  # 43 mouth left corner
        (-2.774015, -2.080775, 5.048531),  # 39 mouth right corner
        (0.000000, -3.116408, 6.097667),  # 45 mouth central bottom corner
        (0.000000, -7.415691, 4.070434)  # 6 chin corner
    ])
    # Camera internals
    # Approximation: focal length ≈ image width in pixels, principal point at the image center

    focal_length = img_size[1]
    center = (img_size[1] / 2, img_size[0] / 2)
    camera_matrix = np.array(
        [[focal_length, 0, center[0]],
         [0, focal_length, center[1]],
         [0, 0, 1]], dtype="double"
    )

    dist_coeffs = np.array([7.0834633684407095e-002, 6.9140193737175351e-002, 0.0, 0.0, -1.3073460323689292e+000],
                           dtype="double")  # sample distortion coefficients (k1, k2, p1, p2, k3); use zeros if unknown

    (success, rotation_vector, translation_vector) = cv2.solvePnP(model_points, image_points, camera_matrix,
                                                                  dist_coeffs, flags=cv2.SOLVEPNP_ITERATIVE)

    # print("Rotation Vector:\n {}".format(rotation_vector))
    # print("Translation Vector:\n {}".format(translation_vector))
    return success, rotation_vector, translation_vector, camera_matrix, dist_coeffs


def draw_annotation_box(image, rotation_vector, translation_vector, camera_matrix, dist_coeefs, color=(0, 255, 0),
                        line_width=2):
    """
    @Time    :   2023/06/05 22:09:14
    @Author  :   liruilonger@gmail.com
    @Version :   1.0
    @Desc    :   Draw a 3D box as an annotation of the face's pose
                 Args:
                   image: image to draw on (modified in place)
                 Returns:
                   void
    """
    point_3d = []
    rear_size = 10
    rear_depth = 0
    point_3d.append((-rear_size, -rear_size, rear_depth))
    point_3d.append((-rear_size, rear_size, rear_depth))
    point_3d.append((rear_size, rear_size, rear_depth))
    point_3d.append((rear_size, -rear_size, rear_depth))
    point_3d.append((-rear_size, -rear_size, rear_depth))

    front_size = 10
    # depth of the front face of the box
    front_depth = 10
    point_3d.append((-front_size, -front_size, front_depth))
    point_3d.append((-front_size, front_size, front_depth))
    point_3d.append((front_size, front_size, front_depth))
    point_3d.append((front_size, -front_size, front_depth))
    point_3d.append((-front_size, -front_size, front_depth))
    point_3d = np.array(point_3d, dtype=np.float32).reshape(-1, 3)

    # Map to 2d image points
    (point_2d, _) = cv2.projectPoints(point_3d,
                                      rotation_vector,
                                      translation_vector,
                                      camera_matrix,
                                      dist_coeefs)
    point_2d = np.int32(point_2d.reshape(-1, 2))

    # Draw all the lines
    cv2.polylines(image, [point_2d], True, color, line_width, cv2.LINE_AA)
    cv2.line(image, tuple(point_2d[1]), tuple(
        point_2d[6]), color, line_width, cv2.LINE_AA)
    cv2.line(image, tuple(point_2d[2]), tuple(
        point_2d[7]), color, line_width, cv2.LINE_AA)
    cv2.line(image, tuple(point_2d[3]), tuple(
        point_2d[8]), color, line_width, cv2.LINE_AA)


# Convert a rotation vector to Euler angles
def get_euler_angle(rotation_vector):
    """
    @Time    :   2023/06/05 22:31:52
    @Author  :   liruilonger@gmail.com
    @Version :   1.0
    @Desc    :   Convert a rotation vector to Euler angles
                 Args:
                   rotation_vector: 3x1 rotation vector returned by solvePnP
                 Returns:
                   (0, pitch, yaw, roll, pitch_degree, yaw_degree, roll_degree)
                   angles in radians followed by integer degrees
    """

    # calculate rotation angles
    theta = cv2.norm(rotation_vector, cv2.NORM_L2)

    # convert to a quaternion (w, x, y, z)
    w = math.cos(theta / 2)
    x = math.sin(theta / 2) * rotation_vector[0][0] / theta
    y = math.sin(theta / 2) * rotation_vector[1][0] / theta
    z = math.sin(theta / 2) * rotation_vector[2][0] / theta

    ysqr = y * y
    # pitch (x-axis rotation)
    t0 = 2.0 * (w * x + y * z)
    t1 = 1.0 - 2.0 * (x * x + ysqr)

    # print('t0:{}, t1:{}'.format(t0, t1))
    pitch = math.atan2(t0, t1)

    # yaw (y-axis rotation)
    t2 = 2.0 * (w * y - z * x)
    if t2 > 1.0:
        t2 = 1.0
    if t2 < -1.0:
        t2 = -1.0
    yaw = math.asin(t2)

    # roll (z-axis rotation)
    t3 = 2.0 * (w * z + x * y)
    t4 = 1.0 - 2.0 * (ysqr + z * z)
    roll = math.atan2(t3, t4)

    print('pitch:{}, yaw:{}, roll:{}'.format(pitch, yaw, roll))

    # Unit conversion: radians to degrees
    pitch_degree = int((pitch / math.pi) * 180)
    yaw_degree = int((yaw / math.pi) * 180)
    roll_degree = int((roll / math.pi) * 180)

    return 0, pitch, yaw, roll, pitch_degree, yaw_degree, roll_degree


def get_pose_estimation_in_euler_angle(landmark_shape, im_size):
    try:
        ret, image_points = get_image_points_from_landmark_shape(landmark_shape)
        if ret != 0:
            print('get_image_points failed')
            return -1, None, None, None

        ret, rotation_vector, translation_vector, camera_matrix, dist_coeffs = get_pose_estimation(im_size,
                                                                                                   image_points)
        if ret != True:
            print('get_pose_estimation failed')
            return -1, None, None, None

        # get_euler_angle returns both radians and integer degrees; only the radians are used here
        ret, pitch, yaw, roll, _, _, _ = get_euler_angle(rotation_vector)
        if ret != 0:
            print('get_euler_angle failed')
            return -1, None, None, None

        euler_angle_str = 'Pitch:{}, Yaw:{}, Roll:{}'.format(pitch, yaw, roll)
        print(euler_angle_str)
        return 0, pitch, yaw, roll

    except Exception as e:
        print('get_pose_estimation_in_euler_angle exception:{}'.format(e))
        return -1, None, None, None


def build_img_text_marge(img_, text, height):
    """
    @Time    :   2023/06/01 05:29:09
    @Author  :   liruilonger@gmail.com
    @Version :   1.0
    @Desc    :   Render the text into an image strip and stack it below img_
                 Args:
                   img_: source image; text: text to render; height: height of the text strip
                 Returns:
                   the combined montage image
    """
    import cv2
    from PIL import Image, ImageDraw, ImageFont

    # Canvas size and background color
    width = img_.shape[1]
    background_color = (255, 255, 255)

    # Font face, size, and color (assumes arial.ttf is available on this system)
    font_path = 'arial.ttf'
    font_size = 26
    font_color = (0, 0, 0)

    # Create a blank image
    image = Image.new('RGB', (width, height), background_color)

    # Create a drawing context
    draw = ImageDraw.Draw(image)

    # Load the font
    font = ImageFont.truetype(font_path, font_size)

    # Measure and center the text (textbbox replaces the textsize call removed in Pillow 10)
    left, top, right, bottom = draw.textbbox((0, 0), text, font=font)
    text_width, text_height = right - left, bottom - top
    text_x = (width - text_width) // 2
    text_y = (height - text_height) // 2
    draw.text((text_x, text_y), text, font=font, fill=font_color)

    # Convert the Pillow image to an OpenCV image
    image_cv = cv2.cvtColor(np.array(image), cv2.COLOR_RGB2BGR)

    montage_size = (width, img_.shape[0])
    import imutils
    montages = imutils.build_montages([img_, image_cv], montage_size, (1, 2))

    # Return the combined image
    return montages[0]


if __name__ == '__main__':
    from imutils import paths

    # for imagePath in paths.list_images("W:\\python_code\\deepface\\huge_1.jpg"):
    for imagePath in range(1):
        print(f"Processing image: {imagePath}")
        # Read Image
        im = cv2.imread("image.jpg")
        size = im.shape
        # Downscale large images
        if size[0] > 700:
            h = size[0] / 3
            w = size[1] / 3
            # If the image is taller than 700 px, shrink its height and width to one third
            # using bicubic interpolation, then refresh the recorded size.
            im = cv2.resize(im, (int(w), int(h)), interpolation=cv2.INTER_CUBIC)
            size = im.shape
        # Get the 2D landmark coordinates
        ret, image_points = get_image_points(im)
        if ret != 0:
            print('get_image_points failed')
            continue

        ret, rotation_vector, translation_vector, camera_matrix, dist_coeffs = get_pose_estimation(size, image_points)

        if ret != True:
            print('get_pose_estimation failed')
            continue
        draw_annotation_box(im, rotation_vector, translation_vector, camera_matrix, dist_coeffs)
        cv2.imwrite('new_' + "draw_annotation_box.jpg", im)

        ret, pitch, yaw, roll, pitch_degree, yaw_degree, roll_degree = get_euler_angle(rotation_vector)

        draw = im.copy()
        # Yaw:

        if yaw_degree < 0:
            output_yaw = "left : " + str(abs(yaw_degree)) + " degrees"
        elif yaw_degree > 0:
            output_yaw = "right :" + str(abs(yaw_degree)) + " degrees"
        else:
            output_yaw = "No left or right"
        print(output_yaw)

        # Pitch:
        if pitch_degree > 0:
            output_pitch = "dow :" + str(abs(pitch_degree)) + " degrees"
        elif pitch_degree < 0:
            output_pitch = "up :" + str(abs(pitch_degree)) + " degrees"
        else:
            output_pitch = "No downwards or upwards"
        print(output_pitch)

        # Roll:
        if roll_degree < 0:
            output_roll = "bends to the right: " + str(abs(roll_degree)) + " degrees"
        elif roll_degree > 0:
            output_roll = "bends to the left: " + str(abs(roll_degree)) + " degrees"
        else:
            output_roll = "No bend  right or left."
        print(output_roll)

        # Initial status:
        if abs(yaw) < 0.00001 and abs(pitch) < 0.00001 and abs(roll) < 0.00001:
            cv2.putText(draw, "Initial ststus", (20, 40), cv2.FONT_HERSHEY_SIMPLEX, .5, (0, 255, 0))
            print("Initial ststus")

        # Append the pose readings beneath the corresponding photo
        imgss = build_img_text_marge(im, output_yaw + "\n" + output_pitch + "\n" + output_roll, 200)
        cv2.imwrite('new_' + str(uuid.uuid4()).replace('-', '') + ".jpg", imgss)

References

© The content behind the referenced links remains the copyright of the original authors; please let me know of any infringement. This is an open-source project — if you find it useful, don't be stingy with the stars 😃


https://blog.csdn.net/zhang2gongzi/article/details/124520896

https://github.com/JuneoXIE/

https://github.com/yinguobing/head-pose-estimation


© 2018-2023 liruilonger@gmail.com, All rights reserved. Licensed CC BY-NC-SA 4.0 (Attribution-NonCommercial-ShareAlike).

