端到端的全人体关键点检测：手把手实现从YOLOPose到YOLOWhole

news2025/4/20 4:43:23

一、搭建yolopose平台

目标：利用yolov8框架来实现17个点的人体姿态估计；在上一篇博客，我们已经把yolov8搭建出来了；
在原yolov8的基础上下载好相对应的模型，本文下载的是YOLOv8n-pose，在将下载好的模型放入文件夹中，最后在根目录下创建一个执行文件，文件代码如下：

from ultralytics import YOLO
from PIL import Image
import cv2
model = YOLO("yolov8n-pose.pt")  # 选择自己模型位置
model.info()
imgPath = "resource/human1.jpg"#选择自己图片位置
im1 = Image.open(imgPath)
results = model.predict(source=im1, save=True)  # save plotted images

效果展示：
在这里插入图片描述

二、迁移训练任务

从上述内容，可知YOLO是可以做人体姿态估计的，也就是17个点的躯体关键点检测。如何拓展到全人体超过130多个点的关键点检测呢？

全人体关键点范畴：
在这里插入图片描述
COCO wholebody数据集包含了对全体关键点的标注，即 4种检测框 (person box, face box, left-hand box, and right-hand box) 和 133 keypoints (17 for body, 6 for feet, 68 for face and 42 for hands).

2.1 任务拓展

任务：人体框是不变的，只需要把17个点拓展到133个点；

数据准备

以coco wholebody为例子；

COCO whole非常大，每训练一次，可能都需要2-3天，那么是不是每次都需要在整个数据集上训练？
其实，可以从整个大规模的数据集里面构建一个小数据集，称之为miniCOCO;
1、收集miniCOCO
决定将训练集为10000张图片,测试集为1000张图片代码如下:

import json
import time
import shutil
import os
from collections import defaultdict
import json
from pathlib import Path


class COCO:
    def __init__(self, annotation_file=None, origin_img_dir=""):
        """
        Constructor of Microsoft COCO helper class for reading and visualizing annotations.
        :param annotation_file (str): location of annotation file
        :param image_folder (str): location to the folder that hosts images.
        :return:
        """
        # load dataset
        self.origin_dir = origin_img_dir
        self.dataset, self.anns, self.cats, self.imgs = dict(), dict(), dict(), dict()  # imgToAnns　一个图片对应多个注解(mask) 一个类别对应多个图片
        self.imgToAnns, self.catToImgs = defaultdict(list), defaultdict(list)
        if not annotation_file == None:
            print('loading annotations into memory...')
            tic = time.time()
            dataset = json.load(open(annotation_file, 'r'))
            assert type(dataset) == dict, 'annotation file format {} not supported'.format(type(dataset))
            print('Done (t={:0.2f}s)'.format(time.time() - tic))
            self.dataset = dataset
            self.createIndex()

    def createIndex(self):
        # create index　　  给图片->注解,类别->图片建立索引
        print('creating index...')
        anns, cats, imgs = {
   }, {
   }, {
   }
        imgToAnns, catToImgs = defaultdict(list), defaultdict(list)
        if 'annotations' in self.dataset:
            for ann in self.dataset['annotations']:
                imgToAnns[ann['image_id']].append(ann)
                anns[ann['id']] = ann

        if 'images' in self.dataset:
            for img in self.dataset['images']:
                imgs[img['id']] = img

        if 'categories' in self.dataset:
            for cat in self.dataset['categories']:
                cats[cat['id']] = cat

        if 'annotations' in self.dataset and 'categories' in self.dataset:
            for ann in self.dataset['annotations']:
                catToImgs[ann['category_id']].append(ann['image_id'])

        print('index created!')

        # create class members
        self.</