MidiaPipe +stgcn(时空图卷积网络)实现人体姿态判断(单目标)

news2025/1/4 17:37:26

文章目录

  • 前言
  • Midiapipe关键点检测
  • stgcn 姿态评估
  • 效果

前言

冒个泡,年少无知吹完的牛皮是要还的呀。
那么这里的话要做的一个东西就是一个人体的姿态判断,比如一个人是坐着还是站着还是摔倒了,如果摔倒了我们要做什么操作,之类的。

不过这里比较可惜的就是这个midiapipe 它里面的Pose的话是只有一个pose的也就是单目标的一个检测,所以距离我想要的一个效果是很难受的,不过这个dome还是挺好玩的。

实现效果如下:

在这里插入图片描述

Midiapipe关键点检测

这个dome的核心之一,就是这个检测到人体的一个关键点,
在这里插入图片描述

import time
from collections import deque

import cv2
import numpy as np
import mediapipe as mp

from stgcn.stgcn import STGCN
from PIL import Image, ImageDraw, ImageFont


# 人体关键点检测模块
mp_drawing = mp.solutions.drawing_utils
mp_drawing_styles = mp.solutions.drawing_styles
mp_pose = mp.solutions.pose

# 人脸模块
mpFace = mp.solutions.face_detection
faceDetection = mpFace.FaceDetection(min_detection_confidence=0.5)

KEY_JOINTS = [
    mp_pose.PoseLandmark.NOSE,
    mp_pose.PoseLandmark.LEFT_SHOULDER,
    mp_pose.PoseLandmark.RIGHT_SHOULDER,
    mp_pose.PoseLandmark.LEFT_ELBOW,
    mp_pose.PoseLandmark.RIGHT_ELBOW,
    mp_pose.PoseLandmark.LEFT_WRIST,
    mp_pose.PoseLandmark.RIGHT_WRIST,
    mp_pose.PoseLandmark.LEFT_HIP,
    mp_pose.PoseLandmark.RIGHT_HIP,
    mp_pose.PoseLandmark.LEFT_KNEE,
    mp_pose.PoseLandmark.RIGHT_KNEE,
    mp_pose.PoseLandmark.LEFT_ANKLE,
    mp_pose.PoseLandmark.RIGHT_ANKLE
]

POSE_CONNECTIONS = [(6, 4), (4, 2), (2, 13), (13, 1), (5, 3), (3, 1), (12, 10),
                    (10, 8), (8, 2), (11, 9), (9, 7), (7, 1), (13, 0)]

POINT_COLORS = [(0, 255, 255), (0, 191, 255), (0, 255, 102), (0, 77, 255), (0, 255, 0),  # Nose, LEye, REye, LEar, REar
                (77, 255, 255), (77, 255, 204), (77, 204, 255), (191, 255, 77), (77, 191, 255), (191, 255, 77),  # LShoulder, RShoulder, LElbow, RElbow, LWrist, RWrist
                (204, 77, 255), (77, 255, 204), (191, 77, 255), (77, 255, 191), (127, 77, 255), (77, 255, 127), (0, 255, 255)]  # LHip, RHip, LKnee, Rknee, LAnkle, RAnkle, Neck

LINE_COLORS = [(0, 215, 255), (0, 255, 204), (0, 134, 255), (0, 255, 50), (77, 255, 222),
               (77, 196, 255), (77, 135, 255), (191, 255, 77), (77, 255, 77), (77, 222, 255),
               (255, 156, 127), (0, 127, 255), (255, 127, 77), (0, 77, 255), (255, 77, 36)]


POSE_MAPPING = ["站着","走着","坐着","躺下","站起来","坐下","摔倒"]

POSE_MAPPING_COLOR = [
    (255,255,240),(	245,222,179),(244,164,96),(	210,180,140),
    (255,127,80),(255,165,79),(	255,48,48)
]

# 为了检测动作的准确度,每30帧进行一次检测
ACTION_MODEL_MAX_FRAMES = 30

class FallDetection:
    def __init__(self):
        self.action_model = STGCN(weight_file='./weights/tsstg-model.pth', device='cpu')
        self.joints_list = deque(maxlen=ACTION_MODEL_MAX_FRAMES)

    def draw_skeleton(self, frame, pts):
        l_pair = POSE_CONNECTIONS
        p_color = POINT_COLORS
        line_color = LINE_COLORS

        part_line = {}
        pts = np.concatenate((pts, np.expand_dims((pts[1, :] + pts[2, :]) / 2, 0)), axis=0)
        for n in range(pts.shape[0]):
            if pts[n, 2] <= 0.05:
                continue
            cor_x, cor_y = int(pts[n, 0]), int(pts[n, 1])
            part_line[n] = (cor_x, cor_y)
            cv2.circle(frame, (cor_x, cor_y), 3, p_color[n], -1)
            # cv2.putText(frame, str(n), (cor_x+10, cor_y+10), cv2.FONT_HERSHEY_PLAIN, 1, (0, 0, 255), 1)

        for i, (start_p, end_p) in enumerate(l_pair):
            if start_p in part_line and end_p in part_line:
                start_xy = part_line[start_p]
                end_xy = part_line[end_p]
                cv2.line(frame, start_xy, end_xy, line_color[i], int(1*(pts[start_p, 2] + pts[end_p, 2]) + 3))
        return frame

    def cv2_add_chinese_text(self, img, text, position, textColor=(0, 255, 0), textSize=30):
        if (isinstance(img, np.ndarray)):  # 判断是否OpenCV图片类型
            img = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
        # 创建一个可以在给定图像上绘图的对象
        draw = ImageDraw.Draw(img)
        # 字体的格式,opencv不支持中文,需要指定字体
        fontStyle = ImageFont.truetype(
            "./fonts/MSYH.ttc", textSize, encoding="utf-8")

        draw.text(position, text, textColor, font=fontStyle)

        return cv2.cvtColor(np.asarray(img), cv2.COLOR_RGB2BGR)

    def detect(self):
        cap = cv2.VideoCapture(0)
        # cap.set(3, 540)
        # cap.set(4, 960)
        # cap.set(5,30)
        image_h = cap.get(cv2.CAP_PROP_FRAME_HEIGHT)
        image_w = cap.get(cv2.CAP_PROP_FRAME_WIDTH)
        frame_num = 0
        print(image_h, image_w)

        with mp_pose.Pose(
                min_detection_confidence=0.7,
                min_tracking_confidence=0.5) as pose:
            while cap.isOpened():
                fps_time = time.time()
                frame_num += 1
                success, image = cap.read()
                if not success:
                    print("Ignoring empty camera frame.")
                    continue

                # 提高性能,这里是做那个姿态的一个推理
                image.flags.writeable = False
                image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
                results = pose.process(image)

                if results.pose_landmarks:
                    # 识别骨骼点
                    image.flags.writeable = True
                    image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)


                    landmarks = results.pose_landmarks.landmark
                    joints = np.array([[landmarks[joint].x * image_w,
                                        landmarks[joint].y * image_h,
                                        landmarks[joint].visibility]
                                       for joint in KEY_JOINTS])
                    # 人体框
                    box_l, box_r = int(joints[:, 0].min())-50, int(joints[:, 0].max())+50
                    box_t, box_b = int(joints[:, 1].min())-100, int(joints[:, 1].max())+100

                    self.joints_list.append(joints)

                    # 识别动作
                    action = ''
                    clr = (0, 255, 0)
                    # 30帧数据预测动作类型
                    if len(self.joints_list) == ACTION_MODEL_MAX_FRAMES:
                        pts = np.array(self.joints_list, dtype=np.float32)
                        out = self.action_model.predict(pts, (image_w, image_h))
                        #
                        index = out[0].argmax()
                        action_name = POSE_MAPPING[index]
                        cls = POSE_MAPPING_COLOR[index]
                        action = '{}: {:.2f}%'.format(action_name, out[0].max() * 100)
                        print(action)
                    # 绘制骨骼点和动作类别
                    image = self.draw_skeleton(image, self.joints_list[-1])
                    image = cv2.rectangle(image, (box_l, box_t), (box_r, box_b), (255, 0, 0), 1)
                    image = self.cv2_add_chinese_text(image, f'当前状态:{action}', (box_l + 10, box_t + 10), clr, 40)

                else:
                    image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)

                image = cv2.putText(image, f'FPS: {int(1.0 / (time.time() - fps_time))}',
                                    (50, 50), cv2.FONT_HERSHEY_PLAIN, 3, (0, 255, 0), 2)

                cv2.imshow('Pose', image)

                if cv2.waitKey(1) & 0xFF == ord("q"):
                    break


        cap.release()
        cv2.destroyAllWindows()

if __name__ == '__main__':
    FallDetection().detect()

stgcn 姿态评估

首先的话,他这个时空图神经网络,我是没有研究过的,这玩意就是啥呢,就是把pose传入然后一通运算,然后就可以得到一个动作以及所属类别,也就是说这玩意是一个分类的图网络。这部分的话我不是很熟悉,这是我的盲区,所以我这里就把这个当作黑盒处理了。那么同样的这部分代码也是直接在Github上面cv过来,然后集成到这个项目里面。

是的,算法的运用开发和我们正常的开发其实区别不大,重新训练任务只是调参,适当调整网络模型,以及训练数据即可,颠覆性的改动=重新设计算法。

在这里插入图片描述

这部分代码并不多,我就直接贴出来了:

按顺序从上到下


import torch
import torch.nn as nn
import torch.nn.functional as F
from stgcn.Utils import Graph


class GraphConvolution(nn.Module):
    """The basic module for applying a graph convolution.
    Args:
        - in_channel: (int) Number of channels in the input sequence data.
        - out_channels: (int) Number of channels produced by the convolution.
        - kernel_size: (int) Size of the graph convolving kernel.
        - t_kernel_size: (int) Size of the temporal convolving kernel.
        - t_stride: (int, optional) Stride of the temporal convolution. Default: 1
        - t_padding: (int, optional) Temporal zero-padding added to both sides of
            the input. Default: 0
        - t_dilation: (int, optional) Spacing between temporal kernel elements. Default: 1
        - bias: (bool, optional) If `True`, adds a learnable bias to the output.
            Default: `True`
    Shape:
        - Inputs x: Graph sequence in :math:`(N, in_channels, T_{in}, V)`,
                 A: Graph adjacency matrix in :math:`(K, V, V)`,
        - Output: Graph sequence out in :math:`(N, out_channels, T_{out}, V)`

            where
                :math:`N` is a batch size,
                :math:`K` is the spatial kernel size, as :math:`K == kernel_size[1]`,
                :math:`T_{in}/T_{out}` is a length of input/output sequence,
                :math:`V` is the number of graph nodes.

    """
    def __init__(self, in_channels, out_channels, kernel_size,
                 t_kernel_size=1,
                 t_stride=1,
                 t_padding=0,
                 t_dilation=1,
                 bias=True):
        super().__init__()

        self.kernel_size = kernel_size
        self.conv = nn.Conv2d(in_channels,
                              out_channels * kernel_size,
                              kernel_size=(t_kernel_size, 1),
                              padding=(t_padding, 0),
                              stride=(t_stride, 1),
                              dilation=(t_dilation, 1),
                              bias=bias)

    def forward(self, x, A):
        x = self.conv(x)
        n, kc, t, v = x.size()
        x = x.view(n, self.kernel_size, kc//self.kernel_size, t, v)
        x = torch.einsum('nkctv,kvw->nctw', (x, A))

        return x.contiguous()


class st_gcn(nn.Module):
    """Applies a spatial temporal graph convolution over an input graph sequence.
    Args:
        - in_channels: (int) Number of channels in the input sequence data.
        - out_channels: (int) Number of channels produced by the convolution.
        - kernel_size: (tuple) Size of the temporal convolving kernel and
            graph convolving kernel.
        - stride: (int, optional) Stride of the temporal convolution. Default: 1
        - dropout: (int, optional) Dropout rate of the final output. Default: 0
        - residual: (bool, optional) If `True`, applies a residual mechanism.
            Default: `True`
    Shape:
        - Inputs x: Graph sequence in :math: `(N, in_channels, T_{in}, V)`,
                 A: Graph Adjecency matrix in :math: `(K, V, V)`,
        - Output: Graph sequence out in :math: `(N, out_channels, T_{out}, V)`
            where
                :math:`N` is a batch size,
                :math:`K` is the spatial kernel size, as :math:`K == kernel_size[1]`,
                :math:`T_{in}/T_{out}` is a length of input/output sequence,
                :math:`V` is the number of graph nodes.
    """
    def __init__(self, in_channels, out_channels, kernel_size,
                 stride=1,
                 dropout=0,
                 residual=True):
        super().__init__()
        assert len(kernel_size) == 2
        assert kernel_size[0] % 2 == 1

        padding = ((kernel_size[0] - 1) // 2, 0)

        self.gcn = GraphConvolution(in_channels, out_channels, kernel_size[1])
        self.tcn = nn.Sequential(nn.BatchNorm2d(out_channels),
                                 nn.ReLU(inplace=True),
                                 nn.Conv2d(out_channels,
                                           out_channels,
                                           (kernel_size[0], 1),
                                           (stride, 1),
                                           padding),
                                 nn.BatchNorm2d(out_channels),
                                 nn.Dropout(dropout, inplace=True)
                                 )

        if not residual:
            self.residual = lambda x: 0
        elif (in_channels == out_channels) and (stride == 1):
            self.residual = lambda x: x
        else:
            self.residual = nn.Sequential(nn.Conv2d(in_channels,
                                                    out_channels,
                                                    kernel_size=1,
                                                    stride=(stride, 1)),
                                          nn.BatchNorm2d(out_channels)
                                          )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x, A):
        res = self.residual(x)
        x = self.gcn(x, A)
        x = self.tcn(x) + res

        return self.relu(x)


class StreamSpatialTemporalGraph(nn.Module):
    """Spatial temporal graph convolutional networks.
    Args:
        - in_channels: (int) Number of input channels.
        - graph_args: (dict) Args map of `Actionsrecognition.Utils.Graph` Class.
        - num_class: (int) Number of class outputs. If `None` return pooling features of
            the last st-gcn layer instead.
        - edge_importance_weighting: (bool) If `True`, adds a learnable importance
            weighting to the edges of the graph.
        - **kwargs: (optional) Other parameters for graph convolution units.
    Shape:
        - Input: :math:`(N, in_channels, T_{in}, V_{in})`
        - Output: :math:`(N, num_class)` where
            :math:`N` is a batch size,
            :math:`T_{in}` is a length of input sequence,
            :math:`V_{in}` is the number of graph nodes,
        or If num_class is `None`: `(N, out_channels)`
            :math:`out_channels` is number of out_channels of the last layer.
    """
    def __init__(self, in_channels, graph_args, num_class=None,
                 edge_importance_weighting=True, **kwargs):
        super().__init__()
        # Load graph.
        graph = Graph(**graph_args)
        A = torch.tensor(graph.A, dtype=torch.float32, requires_grad=False)
        self.register_buffer('A', A)

        # Networks.
        spatial_kernel_size = A.size(0)
        temporal_kernel_size = 9
        kernel_size = (temporal_kernel_size, spatial_kernel_size)
        kwargs0 = {k: v for k, v in kwargs.items() if k != 'dropout'}

        self.data_bn = nn.BatchNorm1d(in_channels * A.size(1))
        self.st_gcn_networks = nn.ModuleList((
            st_gcn(in_channels, 64, kernel_size, 1, residual=False, **kwargs0),
            st_gcn(64, 64, kernel_size, 1, **kwargs),
            st_gcn(64, 64, kernel_size, 1, **kwargs),
            st_gcn(64, 64, kernel_size, 1, **kwargs),
            st_gcn(64, 128, kernel_size, 2, **kwargs),
            st_gcn(128, 128, kernel_size, 1, **kwargs),
            st_gcn(128, 128, kernel_size, 1, **kwargs),
            st_gcn(128, 256, kernel_size, 2, **kwargs),
            st_gcn(256, 256, kernel_size, 1, **kwargs),
            st_gcn(256, 256, kernel_size, 1, **kwargs)
        ))

        # initialize parameters for edge importance weighting.
        if edge_importance_weighting:
            self.edge_importance = nn.ParameterList([
                nn.Parameter(torch.ones(A.size()))
                for i in self.st_gcn_networks
            ])
        else:
            self.edge_importance = [1] * len(self.st_gcn_networks)

        if num_class is not None:
            self.cls = nn.Conv2d(256, num_class, kernel_size=1)
        else:
            self.cls = lambda x: x

    def forward(self, x):
        # data normalization.
        N, C, T, V = x.size()
        x = x.permute(0, 3, 1, 2).contiguous()  # (N, V, C, T)
        x = x.view(N, V * C, T)
        x = self.data_bn(x)
        x = x.view(N, V, C, T)
        x = x.permute(0, 2, 3, 1).contiguous()
        x = x.view(N, C, T, V)

        # forward.
        for gcn, importance in zip(self.st_gcn_networks, self.edge_importance):
            x = gcn(x, self.A * importance)

        x = F.avg_pool2d(x, x.size()[2:])
        x = self.cls(x)
        x = x.view(x.size(0), -1)

        return x


class TwoStreamSpatialTemporalGraph(nn.Module):
    """Two inputs spatial temporal graph convolutional networks.
    Args:
        - graph_args: (dict) Args map of `Actionsrecognition.Utils.Graph` Class.
        - num_class: (int) Number of class outputs.
        - edge_importance_weighting: (bool) If `True`, adds a learnable importance
            weighting to the edges of the graph.
        - **kwargs: (optional) Other parameters for graph convolution units.
    Shape:
        - Input: :tuple of math:`((N, 3, T, V), (N, 2, T, V))`
        for points and motions stream where.
            :math:`N` is a batch size,
            :math:`in_channels` is data channels (3 is (x, y, score)), (2 is (mot_x, mot_y))
            :math:`T` is a length of input sequence,
            :math:`V` is the number of graph nodes,
        - Output: :math:`(N, num_class)`
    """
    def __init__(self, graph_args, num_class, edge_importance_weighting=True,
                 **kwargs):
        super().__init__()
        self.pts_stream = StreamSpatialTemporalGraph(3, graph_args, None,
                                                     edge_importance_weighting,
                                                     **kwargs)
        self.mot_stream = StreamSpatialTemporalGraph(2, graph_args, None,
                                                     edge_importance_weighting,
                                                     **kwargs)

        self.fcn = nn.Linear(256 * 2, num_class)

    def forward(self, inputs):
        out1 = self.pts_stream(inputs[0])
        out2 = self.mot_stream(inputs[1])

        concat = torch.cat([out1, out2], dim=-1)
        out = self.fcn(concat)

        return torch.sigmoid(out)


import torch
import numpy as np

from .Models import TwoStreamSpatialTemporalGraph
from .Utils import normalize_points_with_size, scale_pose


class STGCN(object):
    """Two-Stream Spatial Temporal Graph Model Loader.
    Args:
        weight_file: (str) Path to trained weights file.
        device: (str) Device to load the model on 'cpu' or 'cuda'.
    """
    def __init__(self,
                 weight_file='./Models/TSSTG/tsstg-model.pth',
                 device='cuda'):
        self.graph_args = {'strategy': 'spatial'}
        self.class_names = ['Standing', 'Walking', 'Sitting', 'Lying Down',
                            'Stand up', 'Sit down', 'Fall Down']
        self.num_class = len(self.class_names)
        self.device = device

        self.model = TwoStreamSpatialTemporalGraph(self.graph_args, self.num_class).to(self.device)
        self.model.load_state_dict(torch.load(weight_file,  map_location=torch.device(device)))
        self.model.eval()

    def predict(self, pts, image_size):
        """Predict actions from single person skeleton points and score in time sequence.
        Args:
            pts: (numpy array) points and score in shape `(t, v, c)` where
                t : inputs sequence (time steps).,
                v : number of graph node (body parts).,
                c : channel (x, y, score).,
            image_size: (tuple of int) width, height of image frame.
        Returns:
            (numpy array) Probability of each class actions.
        """
        pts[:, :, :2] = normalize_points_with_size(pts[:, :, :2], image_size[0], image_size[1])
        pts[:, :, :2] = scale_pose(pts[:, :, :2])
        pts = np.concatenate((pts, np.expand_dims((pts[:, 1, :] + pts[:, 2, :]) / 2, 1)), axis=1)

        pts = torch.tensor(pts, dtype=torch.float32)
        pts = pts.permute(2, 0, 1)[None, :]

        mot = pts[:, :2, 1:, :] - pts[:, :2, :-1, :]
        mot = mot.to(self.device)
        pts = pts.to(self.device)

        out = self.model((pts, mot))

        return out.detach().cpu().numpy()

### Reference from: https://github.com/yysijie/st-gcn/blob/master/net/utils/graph.py

import os
import torch
import numpy as np


class Graph:
    """The Graph to model the skeletons extracted by the Alpha-Pose.
    Args:
        - strategy: (string) must be one of the follow candidates
            - uniform: Uniform Labeling,
            - distance: Distance Partitioning,
            - spatial: Spatial Configuration,
        For more information, please refer to the section 'Partition Strategies'
            in our paper (https://arxiv.org/abs/1801.07455).
        - layout: (string) must be one of the follow candidates
            - coco_cut: Is COCO format but cut 4 joints (L-R ears, L-R eyes) out.
        - max_hop: (int) the maximal distance between two connected nodes.
        - dilation: (int) controls the spacing between the kernel points.
    """
    def __init__(self,
                 layout='coco_cut',
                 strategy='uniform',
                 max_hop=1,
                 dilation=1):
        self.max_hop = max_hop
        self.dilation = dilation

        self.get_edge(layout)
        self.hop_dis = get_hop_distance(self.num_node, self.edge, max_hop)
        self.get_adjacency(strategy)

    def get_edge(self, layout):
        if layout == 'coco_cut':
            self.num_node = 14
            self_link = [(i, i) for i in range(self.num_node)]
            neighbor_link = [(6, 4), (4, 2), (2, 13), (13, 1), (5, 3), (3, 1), (12, 10),
                             (10, 8), (8, 2), (11, 9), (9, 7), (7, 1), (13, 0)]
            self.edge = self_link + neighbor_link
            self.center = 13
        else:
            raise ValueError('This layout is not supported!')

    def get_adjacency(self, strategy):
        valid_hop = range(0, self.max_hop + 1, self.dilation)
        adjacency = np.zeros((self.num_node, self.num_node))
        for hop in valid_hop:
            adjacency[self.hop_dis == hop] = 1
        normalize_adjacency = normalize_digraph(adjacency)

        if strategy == 'uniform':
            A = np.zeros((1, self.num_node, self.num_node))
            A[0] = normalize_adjacency
            self.A = A
        elif strategy == 'distance':
            A = np.zeros((len(valid_hop), self.num_node, self.num_node))
            for i, hop in enumerate(valid_hop):
                A[i][self.hop_dis == hop] = normalize_adjacency[self.hop_dis ==
                                                                hop]
            self.A = A
        elif strategy == 'spatial':
            A = []
            for hop in valid_hop:
                a_root = np.zeros((self.num_node, self.num_node))
                a_close = np.zeros((self.num_node, self.num_node))
                a_further = np.zeros((self.num_node, self.num_node))
                for i in range(self.num_node):
                    for j in range(self.num_node):
                        if self.hop_dis[j, i] == hop:
                            if self.hop_dis[j, self.center] == self.hop_dis[i, self.center]:
                                a_root[j, i] = normalize_adjacency[j, i]
                            elif self.hop_dis[j, self.center] > self.hop_dis[i, self.center]:
                                a_close[j, i] = normalize_adjacency[j, i]
                            else:
                                a_further[j, i] = normalize_adjacency[j, i]
                if hop == 0:
                    A.append(a_root)
                else:
                    A.append(a_root + a_close)
                    A.append(a_further)
            A = np.stack(A)
            self.A = A
            #self.A = np.swapaxes(np.swapaxes(A, 0, 1), 1, 2)
        else:
            raise ValueError("This strategy is not supported!")


def get_hop_distance(num_node, edge, max_hop=1):
    A = np.zeros((num_node, num_node))
    for i, j in edge:
        A[j, i] = 1
        A[i, j] = 1

    # compute hop steps
    hop_dis = np.zeros((num_node, num_node)) + np.inf
    transfer_mat = [np.linalg.matrix_power(A, d) for d in range(max_hop + 1)]
    arrive_mat = (np.stack(transfer_mat) > 0)
    for d in range(max_hop, -1, -1):
        hop_dis[arrive_mat[d]] = d
    return hop_dis


def normalize_digraph(A):
    Dl = np.sum(A, 0)
    num_node = A.shape[0]
    Dn = np.zeros((num_node, num_node))
    for i in range(num_node):
        if Dl[i] > 0:
            Dn[i, i] = Dl[i]**(-1)
    AD = np.dot(A, Dn)
    return AD


def normalize_undigraph(A):
    Dl = np.sum(A, 0)
    num_node = A.shape[0]
    Dn = np.zeros((num_node, num_node))
    for i in range(num_node):
        if Dl[i] > 0:
            Dn[i, i] = Dl[i]**(-0.5)
    DAD = np.dot(np.dot(Dn, A), Dn)
    return DAD


def normalize_points_with_size(xy, width, height, flip=False):
    """Normalize scale points in image with size of image to (0-1).
    xy : (frames, parts, xy) or (parts, xy)
    """
    if xy.ndim == 2:
        xy = np.expand_dims(xy, 0)
    print(xy[:, :, 1].min(), xy[:, :, 1].max())
    xy[:, :, 0] /= width
    xy[:, :, 1] /= height
    print('preprocess')
    print(xy[:, :, 0].min(), xy[:, :, 0].max())
    print(xy[:, :, 1].min(), xy[:, :, 1].max())
    if flip:
        xy[:, :, 0] = 1 - xy[:, :, 0]
    return xy


def scale_pose(xy):
    """Normalize pose points by scale with max/min value of each pose.
    xy : (frames, parts, xy) or (parts, xy)
    """
    if xy.ndim == 2:
        xy = np.expand_dims(xy, 0)
    xy_min = np.nanmin(xy, axis=1)
    xy_max = np.nanmax(xy, axis=1)
    for i in range(xy.shape[0]):
        xy[i] = ((xy[i] - xy_min[i]) / (xy_max[i] - xy_min[i])) * 2 - 1
    return xy.squeeze()

在这里插入图片描述
这里还有一个权重,这个是人家官方那里下载的哈。
此外这里还有个字体:
在这里插入图片描述
然后就没有了。

效果

最后的话,就可以看到这样的一个效果了:
在这里插入图片描述
这里的话帧数还是比较低的,因为是纯cpu跑的,不过综合效果来看,当每满30帧时,它去判断姿态其实不会由明显卡顿,可能帧数会下降一些,但是就几帧。

当然局限很明显,就是不适合多目标的检测,还是单目标的。

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:http://www.coloradmin.cn/o/392534.html

如若内容造成侵权/违法违规/事实不符,请联系多彩编程网进行投诉反馈,一经查实,立即删除!

相关文章

【模型复现】-alexnet,nn.Sequential顺序结构构建网络

深度卷积神经网络&#xff08;AlexNet&#xff09; 在LeNet提出后的将近20年里&#xff0c;神经网络一度被其他机器学习方法超越&#xff0c;如支持向量机。虽然LeNet可以在早期的小数据集上取得好的成绩&#xff0c;但是在更大的真实数据集上的表现并不尽如人意。一方面&#…

第五章 事务管理

1.事务概念 *什么是事务&#xff1a;事务是数据库操作最基本单元&#xff0c;逻辑上是一组操作&#xff0c;要么都成功&#xff0c;要么都失败 *事务的特性&#xff08;ACID&#xff09;&#xff1a;原子性、隔离性、一致性、持久性 2.搭建事务操作环境 *模拟场景&#xff…

uart串口接收模块

uart串口接收模块 1、UART&#xff08;异步串行接口&#xff09; 串行通信&#xff1a;指利用一条数据线将资料一位位的顺序传输。   异步通信&#xff1a;以一个字符为传输单位&#xff0c;通信中两个字符间的时间间隔是不固定的&#xff0c;然而在同一个字符的两个相邻位代…

【微信小程序】-- 页面事件 - 下拉刷新(二十五)

&#x1f48c; 所属专栏&#xff1a;【微信小程序开发教程】 &#x1f600; 作  者&#xff1a;我是夜阑的狗&#x1f436; &#x1f680; 个人简介&#xff1a;一个正在努力学技术的CV工程师&#xff0c;专注基础和实战分享 &#xff0c;欢迎咨询&#xff01; &…

高盐废水除钙镁的技术解析

高盐废水指含有机物和至少总溶解固体(totaldissolvedsolids&#xff0c;tds)的质量分数大于3.5&#xff05;的废水&#xff0c;具有水量大&#xff0c;无机盐离子k、na、ca2、mg2、cl-、so42-等含量高&#xff0c;水质水量变化大&#xff0c;成分复杂&#xff0c;难生化降解等特…

2023年中职网络安全竞赛——CMS网站渗透解析

需求环境可私信博主 解析如下: CMS网站渗透 任务环境说明: 服务器场景:Server2206(关闭链接) 服务器场景操作系统:未知 1.使用渗透机对服务器信息收集,并将服务器中网站服务端口号作为flag提交; Flag:8089

华为套件生态

华为套件生态前言蓝牙设备华为耳机华为鼠标智慧互联超级终端多屏协同远程访问文件共享华为电脑管家我的设备控制中心前言 华为的手机、平板、电脑、耳机、手环、手表等设备可以组成华为生态。以下分享一些生态体验。 蓝牙设备 华为耳机 快速连接 在手机/电脑附近打开华为耳…

里奇RIDGID管线定位仪/探测仪维修SR-20 SR-24 SR-60

美国里奇SeekTech SR-20管线定位仪对于初次使用定位仪的用户或经验丰富的用户&#xff0c;都同样可以轻易上手使用SR-20。SR-20提供许多设置和参数&#xff0c;使得大多数复杂的定位工作变得很容易。此外&#xff0c;当你在不复杂的环境下完成些基本的定位工作时&#xff0c;这…

软件测试7

一 CS和BS软件架构 CS&#xff1a;客户端-服务器端&#xff0c;BS&#xff1a;浏览器端-服务器端 区别总结&#xff1a; 1.效率&#xff1a;c/s效率高&#xff0c;某些内容已经安装在系统中了&#xff0c;b/s每次都要加载最新的数据 2.升级&#xff1a;b/s无缝升级&#xff0c…

【Maven】(五)Maven模块的继承与聚合 多模块项目组织构建

文章目录1.前言2.模块的继承2.1.可继承的标签2.2.超级POM2.3.手动引入自定义父POM3.模块的聚合3.1.聚合的注意事项3.2.反应堆(reactor)4.依赖管理及属性配置4.1.依赖管理4.2.属性配置5.总结1.前言 本系列文章记录了 Maven 从0开始到实战的过程&#xff0c;Maven 系列历史文章清…

三天吃透SpringMVC面试八股文

本文已经收录到Github仓库&#xff0c;该仓库包含计算机基础、Java基础、多线程、JVM、数据库、Redis、Spring、Mybatis、SpringMVC、SpringBoot、分布式、微服务、设计模式、架构、校招社招分享等核心知识点&#xff0c;欢迎star~ Github地址&#xff1a;https://github.com/…

【操作系统原理实验】命令解释器模拟实现

选择一种高级语言如C/C等&#xff0c;编写一类似于DOS、UNIX中的命令行解释程序。 1)设计系统命名行提示符&#xff1b; 2)自定义命令集&#xff08;8-10个&#xff09;&#xff1b; 3)用户输入help命令以查找命令的帮助&#xff1b; 4)列出命令的功能&#xff0c;区分内部命令…

fuse文件系统调试环境

libfuse源码&#xff1a;GitHub - libfuse/libfuse: The reference implementation of the Linux FUSE (Filesystem in Userspace) interface 一、ubuntu20.04挂载fuse文件系统 1&#xff0c;安装编译工具 apt install ninja-build apt install meson apt install build-ess…

4. C#语法基础

一、cs文件结构 上面程序的各个部分说明如下&#xff1a; 程序的第一行using System; 其中【using】关键字用于在程序中包含 System 命名空间。一个程序一般有多个 using 语句。程序的第七行是 namespace 声明。一个 namespace 是一系列的类&#xff0c;MyFirstWinFormApp 命名…

SQL语句大全(MySQL入门到精通——基础篇)(基础篇——进阶篇——运维篇)

文章目录前言MySQL——基础篇一、SQL分类二、图形化界面工具三、DDL&#xff08;Data Definition Language|数据定义语言&#xff09;1.SQL-DDL-数据库操作2.SQL-DDL-表操作&查询3.SQL-DDL-数据类型3.SQL-DDL-表操作-修改&删除四、DML&#xff08;Data Manipulation La…

经典模型LeNet跑Fashion-MNIST 代码解析

测试6.6. 卷积神经网络&#xff08;LeNet&#xff09; — 动手学深度学习 2.0.0 documentation import torch from torch import nn from d2l import torch as d2lnet nn.Sequential(#输入通道1表示黑白 输出通道6表示6组取不同特征的卷积核 因为卷积核是5*5,原始图片单通道黑…

面向对象设计模式:行为型模式之模板方法模式

一、模板方法引入&#xff1a;泡茶与冲咖啡 泡茶 烧水泡茶倒入杯子加入柠檬 冲咖啡 烧水冲咖啡倒入杯子加入牛奶和糖 二、模板方法&#xff0c;TemplateMethod 2.1 Intent 意图 Define the skeleton of an algorithm in an operation, deferring some steps to lets subclas…

【深度学习】BERT变体—BERT-wwm

1.BERT-wwm 1-1 Whole Word Masking Whole Word Masking (wwm)是谷歌在2019年5月31日发布的一项BERT的升级版本&#xff0c;主要更改了原预训练阶段的训练样本生成策略。 原有基于WordPiece的分词方式会把一个完整的词切分成若干个子词&#xff0c;在生成训练样本时&#xff…

路由传参含对象数据刷新页面数据丢失

目录 一、问题描述 二、 解决办法 一、问题描述 【1】众所周知&#xff0c;在veu项目开发过程中&#xff0c;我们常常会用到通过路由的方式在页面中传递数据。但是用到this.$route.query.ObjectData的页面&#xff0c;刷新后会导致this.$route.query.ObjectData数据丢失。 …

(小甲鱼python)函数笔记合集七 函数(IX)总结 python实现汉诺塔详解

一、基础复习 函数的基本用法 创建和调用函数 函数的形参与实参等等函数的几种参数 位置参数、关键字参数、默认参数等函数的收集参数*args **args 解包参数详解函数中参数的作用域 局部作用域 全局作用域 global语句 嵌套函数 nonlocal语句等详解函数的闭包&#xff08;工厂函…