[Python] OpenCV: Single Human Pose Estimation

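Contents

1. Human Pose Estimation
2. Model overview
3. Single-person keypoint detection on an image
4. Single-person keypoint detection on video
5. Left/right correction
6. Keypoint smoothing
7. Library functions used
- scipy.signal.savgol_filter
8. References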

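1. Human Pose Estimation

Human pose estimation is a computer-vision technique, today mostly based on deep learning, that automatically detects the pose of a human body in images or video by locating its joints (keypoints) and, across frames, their motion.

I. Definition and categories

Definition: human pose estimation is the problem of localizing human joints in an image or video. It can also be framed as searching for a specific pose in the space of all articulated poses.
Categories:
2D pose estimation: estimate the 2D coordinates (x, y) of each joint from an RGB image.
3D pose estimation: estimate the 3D coordinates (x, y, z) of each joint, from an RGB-D image or, in some methods, from RGB alone.
By application scenario, it can further be divided into single-person pose estimation, multi-person pose estimation, human pose tracking, and so on.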

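II. How it works

Implementation: pose estimation is usually built on deep learning models such as convolutional neural networks (CNNs), sometimes combined with recurrent neural networks (RNNs) for video. The model is first trained on a large set of images annotated with joint positions, from which it learns a feature representation for each joint. At inference time, given an image or a video frame, the model localizes each joint (and, for video, tracks it across frames), and the pose is recovered from those joint positions.
Refinements: to improve robustness and accuracy, researchers have proposed techniques such as incorporating context information, multi-scale feature fusion, and modeling the relationships between joints.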

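III. Applications

Fitness and sports: tracking and correcting exercise form to give more efficient and precise training guidance.
Healthcare: rehabilitation therapy, posture assessment, and similar uses.
Security: behavior analysis, anomaly detection, and similar uses.
Virtual reality (VR) and augmented reality (AR): creating more realistic interactive experiences.
Health monitoring: tracking the physical activity of older adults, for example to reduce the risk of falls.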

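IV. Outlook

As deep learning and computer vision continue to develop, human pose estimation is likely to be used in more fields. Accuracy and real-time performance are expected to improve further, which would make the technique more useful for training, rehabilitation, security, and everyday life. With more data and more compute, models may also adapt better to the needs of different scenarios.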

2. Model overview

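COCO output format

Nose - 0, Neck - 1, Right shoulder - 2, Right elbow - 3, Right wrist - 4, Left shoulder - 5, Left elbow - 6, Left wrist - 7, Right hip - 8, Right knee - 9, Right ankle - 10, Left hip - 11, Left knee - 12, Left ankle - 13, Right eye - 14, Left eye - 15, Right ear - 16, Left ear - 17, Background - 18

MPII output format

Head - 0, Neck - 1, Right shoulder - 2, Right elbow - 3, Right wrist - 4, Left shoulder - 5, Left elbow - 6, Left wrist - 7, Right hip - 8, Right knee - 9, Right ankle - 10, Left hip - 11, Left knee - 12, Left ankle - 13, Chest - 14, Background - 15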
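The two index tables above can be kept as plain Python lists so that a channel index from the network output maps back to a body-part name. This is only a convenience sketch: the names `COCO_PARTS`, `MPII_PARTS`, and `part_name` are illustrative and are not part of OpenCV or of the model files.

```python
# Body-part names in channel order, copied from the tables above.
# The last entry of each list is the background channel.
COCO_PARTS = [
    "Nose", "Neck", "RShoulder", "RElbow", "RWrist",
    "LShoulder", "LElbow", "LWrist", "RHip", "RKnee",
    "RAnkle", "LHip", "LKnee", "LAnkle", "REye",
    "LEye", "REar", "LEar", "Background",
]
MPII_PARTS = [
    "Head", "Neck", "RShoulder", "RElbow", "RWrist",
    "LShoulder", "LElbow", "LWrist", "RHip", "RKnee",
    "RAnkle", "LHip", "LKnee", "LAnkle", "Chest",
    "Background",
]


def part_name(index, mode="COCO"):
    """Return the body-part name for a channel index (hypothetical helper)."""
    parts = COCO_PARTS if mode == "COCO" else MPII_PARTS
    return parts[index]


print(part_name(4))          # right wrist in the COCO layout
print(part_name(14, "MPI"))  # chest in the MPII layout
```

The scripts below only loop over the first 18 (COCO) or 15 (MPII) channels, so the background channel is never used as a keypoint.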

Model with COCO output format

[Image: COCO-format model]

Model with MPII output format

[Image: MPII-format model]

3. Single-person keypoint detection on an image

import cv2
import time
import numpy as np
import argparse

parser = argparse.ArgumentParser(description='Run keypoint detection')
parser.add_argument("--device", default="gpu", help="Device to inference on")
parser.add_argument("--image_file", default="1.jpg", help="Input image")

args = parser.parse_args()


MODE = "COCO"
# MODE = "MPI"


if MODE == "COCO":
    protoFile = "pose/coco/pose_deploy_linevec.prototxt"
    weightsFile = "pose/coco/pose_iter_440000.caffemodel"
    nPoints = 18
    POSE_PAIRS = [ [1,0],[1,2],[1,5],[2,3],[3,4],[5,6],[6,7],[1,8],[8,9],[9,10],[1,11],[11,12],[12,13],[0,14],[0,15],[14,16],[15,17]]

elif MODE == "MPI":
    protoFile = "pose/mpi/pose_deploy_linevec_faster_4_stages.prototxt"
    weightsFile = "pose/mpi/pose_iter_160000.caffemodel"
    nPoints = 15
    POSE_PAIRS = [[0,1], [1,2], [2,3], [3,4], [1,5], [5,6], [6,7], [1,14], [14,8], [8,9], [9,10], [14,11], [11,12], [12,13] ]


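frame = cv2.imread(args.image_file)
# cv2.imread returns None instead of raising when the file cannot be read
if frame is None:
    raise SystemExit("Could not read image: {}".format(args.image_file))
frameCopy = np.copy(frame)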
frameWidth = frame.shape[1]
frameHeight = frame.shape[0]
threshold = 0.1

net = cv2.dnn.readNetFromCaffe(protoFile, weightsFile)

if args.device == "cpu":
    net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)
    print("Using CPU device")
elif args.device == "gpu":
    net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
    net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)
    print("Using GPU device")

t = time.time()
# input image dimensions for the network
inWidth = 368
inHeight = 368
inpBlob = cv2.dnn.blobFromImage(frame, 1.0 / 255, (inWidth, inHeight),
                          (0, 0, 0), swapRB=False, crop=False)

net.setInput(inpBlob)

output = net.forward()
print("time taken by network : {:.3f}".format(time.time() - t))

H = output.shape[2]
W = output.shape[3]

# Empty list to store the detected keypoints
points = []

for i in range(nPoints):
    # confidence map of corresponding body's part.
    probMap = output[0, i, :, :]

    # Find global maxima of the probMap.
    minVal, prob, minLoc, point = cv2.minMaxLoc(probMap)
    
    # Scale the point to fit on the original image
    x = (frameWidth * point[0]) / W
    y = (frameHeight * point[1]) / H

    if prob > threshold : 
        cv2.circle(frameCopy, (int(x), int(y)), 8, (0, 255, 255), thickness=-1, lineType=cv2.FILLED)
        cv2.putText(frameCopy, "{}".format(i), (int(x), int(y)), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2, lineType=cv2.LINE_AA)

        # Add the point to the list if the probability is greater than the threshold
        points.append((int(x), int(y)))
    else :
        points.append(None)

# Draw Skeleton
for pair in POSE_PAIRS:
    partA = pair[0]
    partB = pair[1]

    if points[partA] and points[partB]:
        cv2.line(frame, points[partA], points[partB], (0, 255, 255), 2)
        cv2.circle(frame, points[partA], 8, (0, 0, 255), thickness=-1, lineType=cv2.FILLED)


cv2.imshow(MODE+'-Output-Keypoints', frameCopy)
cv2.imshow(MODE+'-Output-Skeleton', frame)


cv2.imwrite(MODE+'-Output-Keypoints.jpg', frameCopy)
cv2.imwrite(MODE+'-Output-Skeleton.jpg', frame)

print("Total time taken : {:.3f}".format(time.time() - t))

cv2.waitKey(0)
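The keypoint loop above boils down to two steps per channel: find the location of the heatmap maximum, then rescale it from heatmap resolution to image resolution. Below is a minimal NumPy-only sketch of that step, run on a synthetic heatmap rather than on real network output. The function name `heatmap_to_point` and the toy sizes are assumptions made for illustration.

```python
import numpy as np


def heatmap_to_point(prob_map, frame_width, frame_height, threshold=0.1):
    """Return (x, y) in image pixels for the heatmap peak, or None if it is too weak."""
    H, W = prob_map.shape
    # argmax works on the flattened array; unravel_index turns it back into
    # (row, col), i.e. (y, x). cv2.minMaxLoc reports the same peak as (x, y).
    row, col = np.unravel_index(np.argmax(prob_map), prob_map.shape)
    prob = prob_map[row, col]
    if prob <= threshold:
        return None
    # scale from heatmap coordinates to original image coordinates
    x = int(frame_width * col / W)
    y = int(frame_height * row / H)
    return (x, y)


# synthetic 46x46 heatmap with a single confident peak at row 10, column 23
heat = np.zeros((46, 46), dtype=np.float32)
heat[10, 23] = 0.9
print(heatmap_to_point(heat, 920, 460))  # (460, 100)
```

Because only the single global maximum of each channel is used, this approach can return at most one person's keypoints, which is why the scripts here are limited to single-person input.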

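Input image

[Image: input image]

COCO output

[Image: COCO keypoints]

Keypoints connected into a skeleton

[Image: COCO skeleton]

The facial keypoints are not predicted very accurately, especially the nose.

MPII output

[Image: MPII keypoints]

Keypoints connected into a skeleton

[Image: MPII skeleton]

The ankles still show a small flaw.

Another example

Input image

[Image: input image]

COCO output

[Image: COCO keypoints]

[Image: COCO skeleton]

Again the facial keypoints (ears, nose) are not very accurate.

MPII output

[Image: MPII keypoints]

[Image: MPII skeleton]

The MPII result is good.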

4. Single-person keypoint detection on video

import cv2
import time
import numpy as np
import argparse

parser = argparse.ArgumentParser(description='Run keypoint detection')
parser.add_argument("--device", default="gpu", help="Device to inference on")
parser.add_argument("--video_file", default="sample_video.mp4", help="Input Video")

args = parser.parse_args()

# MODE = "MPI"
MODE = "COCO"

if MODE == "COCO":
    protoFile = "pose/coco/pose_deploy_linevec.prototxt"
    weightsFile = "pose/coco/pose_iter_440000.caffemodel"
    nPoints = 18
    POSE_PAIRS = [ [1,0],[1,2],[1,5],[2,3],[3,4],[5,6],[6,7],[1,8],[8,9],[9,10],[1,11],[11,12],[12,13],[0,14],[0,15],[14,16],[15,17]]

elif MODE == "MPI":
    protoFile = "pose/mpi/pose_deploy_linevec_faster_4_stages.prototxt"
    weightsFile = "pose/mpi/pose_iter_160000.caffemodel"
    nPoints = 15
    POSE_PAIRS = [[0,1], [1,2], [2,3], [3,4], [1,5], [5,6], [6,7], [1,14], [14,8], [8,9], [9,10], [14,11], [11,12], [12,13] ]


inWidth = 368
inHeight = 368
threshold = 0.1


input_source = args.video_file
cap = cv2.VideoCapture(input_source)
hasFrame, frame = cap.read()

vid_writer = cv2.VideoWriter(MODE+'-output.avi',cv2.VideoWriter_fourcc('M','J','P','G'), 10, (frame.shape[1],frame.shape[0]))

net = cv2.dnn.readNetFromCaffe(protoFile, weightsFile)
if args.device == "cpu":
    net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)
    print("Using CPU device")
elif args.device == "gpu":
    net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
    net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)
    print("Using GPU device")

while cv2.waitKey(1) < 0:
    t = time.time()
    hasFrame, frame = cap.read()
    # check before touching the frame: at the end of the video it is None
    if not hasFrame:
        cv2.waitKey()
        break
    frameCopy = np.copy(frame)

    frameWidth = frame.shape[1]
    frameHeight = frame.shape[0]

    inpBlob = cv2.dnn.blobFromImage(frame, 1.0 / 255, (inWidth, inHeight),
                              (0, 0, 0), swapRB=False, crop=False)
    net.setInput(inpBlob)
    output = net.forward()

    H = output.shape[2]
    W = output.shape[3]
    # Empty list to store the detected keypoints
    points = []

    for i in range(nPoints):
        # confidence map of corresponding body's part.
        probMap = output[0, i, :, :]

        # Find global maxima of the probMap.
        minVal, prob, minLoc, point = cv2.minMaxLoc(probMap)
        
        # Scale the point to fit on the original image
        x = (frameWidth * point[0]) / W
        y = (frameHeight * point[1]) / H

        if prob > threshold : 
            cv2.circle(frameCopy, (int(x), int(y)), 8, (0, 255, 255), thickness=-1, lineType=cv2.FILLED)
            cv2.putText(frameCopy, "{}".format(i), (int(x), int(y)), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2, lineType=cv2.LINE_AA)

            # Add the point to the list if the probability is greater than the threshold
            points.append((int(x), int(y)))
        else :
            points.append(None)

    # Draw Skeleton
    for pair in POSE_PAIRS:
        partA = pair[0]
        partB = pair[1]

        if points[partA] and points[partB]:
            cv2.line(frame, points[partA], points[partB], (0, 255, 255), 3, lineType=cv2.LINE_AA)
            cv2.circle(frame, points[partA], 8, (0, 0, 255), thickness=-1, lineType=cv2.FILLED)
            cv2.circle(frame, points[partB], 8, (0, 0, 255), thickness=-1, lineType=cv2.FILLED)

    cv2.putText(frame, "time taken = {:.2f} sec".format(time.time() - t), (50, 50), cv2.FONT_HERSHEY_COMPLEX, .8, (255, 50, 0), 2, lineType=cv2.LINE_AA)
    # cv2.putText(frame, "OpenPose using OpenCV", (50, 50), cv2.FONT_HERSHEY_COMPLEX, 1, (255, 50, 0), 2, lineType=cv2.LINE_AA)
    # cv2.imshow('Output-Keypoints', frameCopy)
    cv2.imshow(MODE + '-Output-Skeleton', frame)

    vid_writer.write(frame)

vid_writer.release()
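In the drawing loop, a limb is drawn only when both of its endpoints passed the confidence threshold; keypoints that failed are stored as `None`. That check can be isolated in a small pure-Python helper that returns the line segments to draw. The name `drawable_limbs` is hypothetical and the points are made-up values.

```python
def drawable_limbs(points, pose_pairs):
    """Return [(pointA, pointB), ...] for the pairs whose two keypoints were detected."""
    limbs = []
    for part_a, part_b in pose_pairs:
        # "is not None" states the intent explicitly: None marks a missed keypoint
        if points[part_a] is not None and points[part_b] is not None:
            limbs.append((points[part_a], points[part_b]))
    return limbs


# four keypoints, the third one was not detected
points = [(50, 20), (50, 40), None, (30, 60)]
pairs = [[0, 1], [1, 2], [1, 3]]
print(drawable_limbs(points, pairs))
```

Each returned pair can be passed directly to `cv2.line` as the two endpoints.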

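COCO format output

[Video: COCO-output]

MPII format output

[Video: MPI-output]

Wrists and ankles are the hard cases: left and right get confused, and the jitter is quite noticeable.

The following sections try to mitigate these problems, working with the MPII-format output.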

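5. Left/right correction

When a left/right pair of body parts is predicted on the wrong sides, the two keypoints are swapped, so that the part that belongs on the left of the image always stays on the left. (This only makes sense if the person keeps facing the viewer; if they switch between facing toward and away from the camera, the correction is meaningless.)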
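The rule can be sketched as follows for the MPII layout: for each left/right pair of indices, keep the point with the smaller x coordinate at the lower index. This assumes the person faces the camera throughout. The pair list `LR_PAIRS` and the function name `fix_left_right` are illustrative, and the keypoints are dummy values.

```python
# (lower index, higher index) for each left/right pair in the MPII layout:
# shoulders, elbows, wrists, hips, knees, ankles
LR_PAIRS = [(2, 5), (3, 6), (4, 7), (8, 11), (9, 12), (10, 13)]


def fix_left_right(points):
    """Return a copy of points where, in every pair, the lower index has the smaller x."""
    fixed = list(points)
    for a, b in LR_PAIRS:
        if fixed[a][0] > fixed[b][0]:
            fixed[a], fixed[b] = fixed[b], fixed[a]
    return fixed


# 15 dummy keypoints; the shoulders (2 and 5) are deliberately on the wrong sides
pts = [(100, 10 * i) for i in range(15)]
pts[2], pts[5] = (140, 20), (60, 50)
out = fix_left_right(pts)
print(out[2], out[5])  # (60, 50) (140, 20)
```

A rule this simple also swaps points when limbs legitimately cross (for example, arms folded across the body), so it is a heuristic rather than a general fix.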

First, save the raw predictions: save_pose_data.py writes the detection results to workout.csv.

#!/usr/bin/python3
# -*- encoding: utf-8 -*-
"""
@File    : save_pose_data.py
@Time    : 2021/9/27 16:21
@Author  : David
@Software: PyCharm
"""
import cv2, numpy as np, csv

# https://github.com/opencv/opencv/blob/master/samples/dnn/openpose.py
outfile_path = 'workout.csv'

protoFile = "./pose/mpi/pose_deploy_linevec_faster_4_stages.prototxt"
weightsFile = "./pose/mpi/pose_iter_160000.caffemodel"
net = cv2.dnn.readNetFromCaffe(protoFile, weightsFile)

data, input_width, input_height, threshold, frame_number = [], 368, 368, 0.1, 0

input_source = "sample_video.mp4"
cap = cv2.VideoCapture(input_source)

# use the previous location of the body part if the model is wrong
previous_x, previous_y = [0] * 15, [0] * 15

while True:

    ret, img = cap.read()
    if not ret: break

    # get the image shape
    img_width, img_height = img.shape[1], img.shape[0]

    # get a blob from the image
    inputBlob = cv2.dnn.blobFromImage(img, 1.0 / 255, (input_width, input_height), (0, 0, 0), swapRB=False, crop=False)

    # set the input and perform a forward pass
    net.setInput(inputBlob)
    output = net.forward()

    # get the output shape: output is (1, channels, height, width)
    output_height, output_width = output.shape[2], output.shape[3]

    # Empty list to store the detected keypoints
    x_data, y_data = [], []

    # Iterate through the body parts
    for i in range(15):

        # find probability that point is correct
        _, prob, _, point = cv2.minMaxLoc(output[0, i, :, :])

        # Scale the point to fit on the original image
        x, y = (img_width * point[0]) / output_width, (img_height * point[1]) / output_height

        # Is the point likely to be correct?
        if prob > threshold:
            x_data.append(x)
            y_data.append(y)
            xy = tuple(np.array([x, y], int))
            cv2.circle(img, xy, 5, (25, 0, 255), 5)
        # No? use the location in the previous frame
        else:
            x_data.append(previous_x[i])
            y_data.append(previous_y[i])

    # add these points to the list of data
    data.append(x_data + y_data)
    previous_x, previous_y = x_data, y_data
    frame_number += 1
    # use this break statement to check your data before processing the whole video
    # if frame_number == 300: break
    print(frame_number)

    cv2.imshow('img', img)
    k = cv2.waitKey(1)
    if k == 27: break

# write the data to a .csv file
import pandas as pd

df = pd.DataFrame(data)
df.to_csv(outfile_path, index=False)
print('save complete')
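Because of `data.append(x_data + y_data)`, each row written by the script holds the 15 x coordinates first (columns 0 to 14), followed by the 15 y coordinates (columns 15 to 29). Any script that reads workout.csv has to split the row the same way. Below is a small sketch of the conversion in both directions; the helper names `row_to_points` and `points_to_row` are hypothetical.

```python
def row_to_points(row, n_points=15):
    """Split [x0..x14, y0..y14] into [(x0, y0), ..., (x14, y14)]."""
    xs, ys = row[:n_points], row[n_points:]
    return [(int(x), int(y)) for x, y in zip(xs, ys)]


def points_to_row(points):
    """Inverse of row_to_points: all x values first, then all y values."""
    return [p[0] for p in points] + [p[1] for p in points]


# dummy row: x values 0..14, y values 100..114
row = list(range(15)) + list(range(100, 115))
points = row_to_points(row)
print(points[0], points[14])  # (0, 100) (14, 114)
```

Reading the same row as interleaved pairs, `(row[0], row[1]), (row[2], row[3]), ...`, would pair two x values together and silently scramble the skeleton.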

Next, read the model's predictions and apply the left/right correction: swap_body_parts.py outputs swapped_body_parts.csv.

#!/usr/bin/python3
# -*- encoding: utf-8 -*-
"""
@File    : swap_body_parts.py
@Time    : 2021/9/27 16:52
@Author  : David
@Software: PyCharm
"""
# swap_body_parts.py
import pandas as pd
import numpy as np
import cv2, os
import csv

input_source = "sample_video.mp4"
cap = cv2.VideoCapture(input_source)
frame_number = 0
font, scale, colorText, thick = cv2.FONT_HERSHEY_SIMPLEX, .5, (234, 234, 234), 1
size, color, thickness = 5, (255, 255, 255), 5
# get pose data - data is generated by open pose video
df = pd.read_csv('workout.csv')

# there are 15 points in the skeleton
# 0 head
# 1 neck
# 2, 5 shoulders
# 3, 6 elbows
# 4, 7 hands
# 8, 11 hips
# 9, 12 knees
# 10, 13 ankles
# 14 torso

data = []
while cv2.waitKey(10) < 0 and frame_number < len(df.values) - 2:
    ret, img = cap.read()
    if not ret: break
    try:
        values = df.values[frame_number]
    except IndexError:
        break
    values = np.array(values, int)
    # workout.csv stores the 15 x coordinates first (columns 0-14),
    # followed by the 15 y coordinates (columns 15-29)
    points = [(int(x), int(y)) for x, y in zip(values[:15], values[15:])]

    # start from a copy: head (0), neck (1) and torso (14) never change
    non_swap_points = list(points)

    # for every left/right pair (shoulders, elbows, hands, hips, knees, feet)
    # keep the point with the smaller x at the lower index
    for a, b in [(2, 5), (3, 6), (4, 7), (8, 11), (9, 12), (10, 13)]:
        if points[a][0] >= points[b][0]:
            non_swap_points[a], non_swap_points[b] = points[b], points[a]

    # enumerate gives the right label even when two points coincide
    for idx, point in enumerate(non_swap_points):
        cv2.circle(img, point, 3, (0, 0, 255), 3)
        cv2.putText(img, str(idx), point, font, scale, colorText, thick, cv2.LINE_AA)

    cv2.imshow('Output-Skeleton', img)
    frame_number += 1
    data.append(non_swap_points)
cv2.destroyAllWindows()

# write the rows in the same layout as workout.csv (all x values, then all
# y values) so that smooth_pose_data.py can read either file
with open('swapped_body_parts.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow([str(i) for i in range(30)])
    for frame_points in data:
        writer.writerow([p[0] for p in frame_points] + [p[1] for p in frame_points])

The result is shown in the next section.

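6. Keypoint smoothing

Before the smoothing algorithm can be applied, the pose data for every frame must already be known (we saved it in workout.csv), because the filter's window is centered on each frame and therefore also needs the frames that follow it.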

#!/usr/bin/python3
# -*- encoding: utf-8 -*-
"""
@File    : smooth_pose_data.py
@Time    : 2021/9/27 16:15
@Author  : David
@Software: PyCharm
"""
import pandas as pd
import numpy as np
import cv2
from scipy import signal

circle_color, line_color = (255, 255, 0), (0, 0, 255)
window_length, polyorder = 13, 2

input_source = "sample_video.mp4"

# Get pose data - data is generated by OpenPose
df = pd.read_csv('workout.csv')
# df = pd.read_csv('swapped_body_parts.csv')

cap = cv2.VideoCapture(input_source)

# get width of video
frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))

# get height of video
frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
out = cv2.VideoWriter('smooth_pose.avi',
                      cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'), 30, (frame_width, frame_height))
# There are 15 points in the skeleton
pairs = [[0, 1],  # head
         [1, 2], [1, 5],  # shoulders
         [2, 3], [3, 4], [5, 6], [6, 7],  # arms
         [1, 14], [14, 11], [14, 8],  # hips
         [8, 9], [9, 10], [11, 12], [12, 13]]  # legs

# Smooth it out
for i in range(30):
    df[str(i)] = signal.savgol_filter(df[str(i)], window_length, polyorder)

frame_number = 0
while True:
    print(frame_number)
    ret, img = cap.read()  # 720, 576
    if not ret or frame_number >= len(df):
        break
    # img = np.zeros_like(img)
    values = np.array(df.values[frame_number], int)

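    # horizontal shift in pixels applied to every x; set it to 0 if the
    # points already line up with the body
    lateral_offset = 18
    # columns 0-14 hold x, columns 15-29 hold y; plain Python ints for OpenCV
    points = [(int(x) + lateral_offset, int(y)) for x, y in zip(values[:15], values[15:])]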

    cc = 0
    for point in points:
        cc += 90
        xy = tuple(np.array([point[0], point[1]], int))
        cv2.circle(img, xy, 5, (cc, cc, cc), 5)

    # Draw Skeleton
    for pair in pairs:
        partA = pair[0]
        partB = pair[1]
        cv2.line(img, points[partA], points[partB], line_color, 3, lineType=cv2.LINE_AA)

    cv2.imshow('Output-Skeleton', img)
    out.write(img)
    k = cv2.waitKey(100)
    if k == 27: # Esc
        break
    frame_number += 1
out.release()
cv2.destroyAllWindows()

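First we only smooth, without left/right correction, and look at the result.

[Video: smooth pose direct]

With smooth_pose_data.py the motion is much smoother and the jitter is clearly reduced. However, the keypoints are still not very accurate during large movements, especially at hard spots such as wrists and ankles, and left/right confusion still occurs easily.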
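To put a number on "less jitter", one simple option is the mean absolute frame-to-frame change of each coordinate track. This metric is an assumption of the sketch, not something measured in this post. The example uses a synthetic noisy track and a plain moving average as a stand-in for the smoothing step.

```python
import numpy as np


def jitter(track):
    """Mean absolute difference between consecutive samples of a 1-D track."""
    track = np.asarray(track, dtype=float)
    return float(np.mean(np.abs(np.diff(track))))


rng = np.random.default_rng(0)
t = np.arange(200)
clean = 100 + 0.5 * t                       # slow drift of one keypoint coordinate
noisy = clean + rng.normal(0, 3, t.shape)   # per-frame detection noise

# moving average over 13 frames as a simple stand-in smoother
kernel = np.ones(13) / 13
smoothed = np.convolve(noisy, kernel, mode="valid")

print(jitter(noisy), jitter(smoothed))
```

Applied to the columns of workout.csv before and after filtering, the same function would give a rough per-keypoint comparison; it does not say anything about accuracy, only about frame-to-frame stability.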

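Next we apply the left/right correction first and then smooth, by loading the corrected file with df = pd.read_csv('swapped_body_parts.csv').

[Video: smooth pose]

Something looks off here: the result is worse than smoothing alone. One thing worth checking is the column layout. save_pose_data.py writes 15 x values followed by 15 y values and smooth_pose_data.py reads them that way, so swap_body_parts.py has to read and write the same layout; treating the columns as interleaved (x, y) pairs would scramble the coordinates.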

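7. Library functions used

scipy.signal.savgol_filter

scipy.signal.savgol_filter is the SciPy function that implements the Savitzky-Golay filter for data smoothing. The filter works on one-dimensional data: it fits a low-degree polynomial to each local window by least squares, which smooths the data and removes noise while largely preserving the shape of the signal.

Function signature

scipy.signal.savgol_filter(x, window_length, polyorder, deriv=0, delta=1.0, axis=-1, mode='interp', cval=0.0)

Parameters

x (array_like): the data to be filtered. If x is not a single- or double-precision floating-point array, it is converted to numpy.float64 before filtering.
window_length (int): the length of the filter window, i.e. the number of coefficients; it sets the range over which the data is smoothed. It must be a positive integer, and SciPy releases that predate support for even window lengths require it to be odd, so an odd value is the safe choice. If mode is 'interp', window_length must be less than or equal to the size of x.
polyorder (int): the order of the polynomial fitted to the samples. It must be less than window_length.
deriv (int, optional): the order of the derivative to compute. The default, 0, smooths the data without differentiating; a value greater than 0 returns the derivative of that order.
delta (float, optional): the spacing of the samples. Only used when deriv is greater than 0. Default is 1.0.
axis (int, optional): the axis along which the filter is applied, for multi-dimensional arrays. Default is -1, the last axis.
mode (str, optional): how the signal is handled at the boundaries, where the window extends past the array. One of 'mirror', 'constant', 'nearest', 'wrap' or 'interp'. Default is 'interp'.
'mirror': extend the signal by reflection at the edges.
'constant': pad with the constant value given by cval.
'nearest': pad with the nearest edge value.
'wrap': pad by wrapping around to the other end of the array.
'interp': no padding; a polynomial of degree polyorder is fitted to the first and last window_length values and used to compute the outputs near the edges.
cval (scalar, optional): the value used for padding when mode is 'constant'. Default is 0.0.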
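Here is a short sketch of how `deriv`, `delta`, and `mode` interact, assuming SciPy is installed. A straight line sampled every 0.1 s is used as input because a Savitzky-Golay filter with `polyorder >= 1` reproduces a line exactly, which makes the results easy to check. The sample spacing and slope are arbitrary illustration values.

```python
import numpy as np
from scipy.signal import savgol_filter

dt = 0.1                      # sample spacing in seconds
t = np.arange(50) * dt
x = 3.0 * t + 2.0             # straight line with slope 3

# deriv=0: smoothing only; the line comes back unchanged (up to rounding)
smoothed = savgol_filter(x, window_length=11, polyorder=2)

# deriv=1 with delta=dt: first derivative in units of x per second
velocity = savgol_filter(x, window_length=11, polyorder=2, deriv=1, delta=dt)

# another boundary mode only affects the first and last window_length // 2 samples
nearest = savgol_filter(x, window_length=11, polyorder=2, mode="nearest")

print(np.allclose(smoothed, x), np.allclose(velocity, 3.0))
print(np.allclose(nearest[5:-5], x[5:-5]))
```

For pose data, `deriv=1` with `delta` set to the frame interval would give a smoothed per-keypoint velocity from the same filter.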

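Return value

y (ndarray): the filtered data, with the same shape as x.

Notes

The Savitzky-Golay filter is a local polynomial regression: within each window it finds the polynomial that minimizes the least-squares fitting error and uses it to produce the smoothed value.
Choosing window_length and polyorder well matters, since together they determine how strongly the data is smoothed and how much of the signal is preserved.
An odd window_length lets the window sit symmetrically around its center sample.
With multi-dimensional data, the axis parameter selects the dimension along which the filter is applied.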
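The idea behind the filter can be shown without SciPy: for each sample, fit a polynomial of degree `polyorder` to the surrounding window by least squares and take the fitted value at the window center. The sketch below handles interior samples only and leaves the edges untouched, so SciPy's boundary modes are not reproduced; `local_poly_smooth` is a hypothetical name. It also shows why all frames must be available before smoothing: the window for sample k reaches `window_length // 2` samples into the future.

```python
import numpy as np


def local_poly_smooth(y, window_length=13, polyorder=2):
    """Savitzky-Golay style smoothing of interior samples via np.polyfit."""
    y = np.asarray(y, dtype=float)
    half = window_length // 2
    out = y.copy()
    offsets = np.arange(-half, half + 1)      # window positions relative to the center
    for k in range(half, len(y) - half):
        window = y[k - half:k + half + 1]     # uses past and future samples
        coeffs = np.polyfit(offsets, window, polyorder)
        out[k] = np.polyval(coeffs, 0.0)      # fitted value at the center
    return out


# a parabola is reproduced exactly by a degree-2 local fit
t = np.arange(60, dtype=float)
parabola = 0.02 * t**2 - t + 5
print(np.allclose(local_poly_smooth(parabola), parabola))
```

In practice `savgol_filter` is much faster, because the least-squares fit reduces to a fixed set of convolution coefficients that is computed once and applied to every window.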

Example

import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import savgol_filter

# create sample data: a Gaussian bump plus random noise
x = np.linspace(-4, 4, 500)
y = np.exp(-x**2) + np.random.normal(0, 0.05, x.shape)

# apply the Savitzky-Golay filter
y_filtered = savgol_filter(y, window_length=51, polyorder=3)

# plot the original data and the smoothed data
plt.plot(x, y, label='Noisy data')
plt.plot(x, y_filtered, label='Smoothed data', color='red')
plt.legend()
plt.show()

[Image: plot of the noisy data and the smoothed curve]

The example above shows how to use scipy.signal.savgol_filter to smooth noisy data.

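8. References

This post draws on the following material:

Code and model
Link: https://pan.baidu.com/s/1OoDWEc7bdwKbKBEqQ5oOdA
Extraction code: 123a
OpenCV Advanced (5): Deep-learning-based human pose estimation with OpenCV, single-person edition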
https://learnopencv.com/deep-learning-based-human-pose-estimation-using-opencv-cpp-python/
Cao Z, Simon T, Wei S E, et al. Realtime multi-person 2d pose estimation using part affinity fields[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2017: 7291-7299.
https://arxiv.org/pdf/1611.08050
MPII Human Pose Dataset
http://human-pose.mpi-inf.mpg.de/
Human Pose Evaluator Dataset
https://www.robots.ox.ac.uk/~vgg/data/pose_evaluation/