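Table of contents

Resources
Models (also called weights)
Installation
- Installing ultralytics (YOLO)
- Torch
- Test command
CLI
Trying YOLOv8 on the COCO128 dataset
- Labels
- predict
- segment
- Downloading the COCO 2017 dataset
- Val
- Train
Custom dataset
- Annotation
- The labelImg annotation tool
- Analyzing training results
Tips for best training results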

Resources

Docs: https://docs.ultralytics.com
Community: https://community.ultralytics.com
GitHub: https://github.com/ultralytics/ultralytics
COCO dataset: https://cocodataset.org/#download
Kaggle: cloud platform with datasets

Models (also called weights)

Model repository


Installation

Installing ultralytics (YOLO)

https://docs.ultralytics.com/quickstart/#install

Option 1, install from source: https://docs.ultralytics.com/quickstart/
git clone https://github.com/ultralytics/ultralytics
cd ultralytics
pip install -e .
Option 2, install from PyPI:
pip install ultralytics


yolo -v
8.0.105

Torch

Test torch from a Python prompt. If the check prints False, the GPU driver or the CUDA build of torch is not set up yet.

# check CUDA support
import torch
print(torch.cuda.is_available())
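
The check above can be wrapped in a small helper that also covers the case where torch is not installed at all. This is only a minimal sketch: the function name cuda_status is an illustrative assumption, while torch.cuda.is_available() and torch.cuda.get_device_name() are standard PyTorch calls.

```python
import importlib.util


def cuda_status() -> str:
    """Return a short, human-readable summary of the local torch/CUDA setup."""
    # Look the package up first so a missing torch does not raise ImportError.
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"

    import torch

    if torch.cuda.is_available():
        # Device 0 is the first GPU visible to torch.
        return f"CUDA available: {torch.cuda.get_device_name(0)}"
    return f"torch {torch.__version__} installed, but CUDA is not available"


print(cuda_status())
```

If the last message appears, reinstalling torch with a matching cuXXX build is the usual next step.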

https://pytorch.org/get-started/previous-versions/

Install a specific cuXXX build with a command like the following

pip install torch==2.0.0+cu118 torchvision==0.15.1+cu118 torchaudio==2.0.1 --index-url https://download.pytorch.org/whl/cu118

Test command

yolo predict model=yolov8n.pt source='https://ultralytics.com/images/bus.jpg'

Downloading https:\github.com\ultralytics\assets\releases\download\v0.0.0\yolov8n.pt to yolov8n.pt...
100%|█████████████████████████████████████████████████████████████████████████████| 6.23M/6.23M [00:01<00:00, 4.59MB/s]
Ultralytics YOLOv8.0.105  Python-3.10.11 torch-2.0.0+cu118 CUDA:0 (NVIDIA GeForce RTX 3090, 24575MiB)

Downloading https:\ultralytics.com\images\bus.jpg to bus.jpg...
100%|███████████████████████████████████████████████████████████████████████████████| 476k/476k [00:00<00:00, 2.36MB/s]
image 1/1 G:\ultralytics\bus.jpg: 640x480 4 persons, 1 bus, 1 stop sign, 191.7ms
Speed: 18.1ms preprocess, 191.7ms inference, 8.0ms postprocess per image at shape (1, 3, 640, 640)
Results saved to runs\detect\predict

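Analysis

Version info: Ultralytics YOLOv8.0.105, Python-3.10.11, torch-2.0.0+cu118
Model: yolov8n.pt, downloaded automatically if it is not present; the image being predicted is bus.jpg
The default task is detect (object detection)
Computation runs on the GPU, an NVIDIA GeForce RTX 3090
The CLI prediction summary reports 4 persons, 1 bus, 1 stop sign
Results are saved automatically to runs\detect\predict; repeated runs create predict2, predict3, …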
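
The numbering of output directories can be pictured with the sketch below. It is a simplified illustration of the behavior seen in the logs in this article (predict, then predict2), not the library's own code, and the helper name next_run_dir is hypothetical.

```python
import tempfile
from pathlib import Path


def next_run_dir(base: Path) -> Path:
    """Return `base` if it is unused, else the first free `base2`, `base3`, ..."""
    if not base.exists():
        return base
    n = 2
    while True:
        candidate = base.with_name(f"{base.name}{n}")
        if not candidate.exists():
            return candidate
        n += 1


# Simulate three consecutive predict runs inside a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "runs" / "detect"
    created = []
    for _ in range(3):
        run_dir = next_run_dir(root / "predict")
        run_dir.mkdir(parents=True)
        created.append(run_dir.name)

print(created)  # ['predict', 'predict2', 'predict3']
```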

[Figure: GPU vs. CPU performance comparison]
The gap between GPU and CPU performance is substantial.

CLI (command line)

CLI getting-started docs: https://docs.ultralytics.com/usage/cli/

yolo TASK MODE ARGS

Where   TASK (optional) is one of [detect, segment, classify]
        MODE (required) is one of [train, val, predict, export, track]
        ARGS (optional) are any number of custom 'arg=value' pairs like 'imgsz=320' that override defaults.
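
To make the TASK MODE ARGS layout concrete, here is a small sketch that assembles such a command as an argument list. The helper build_yolo_command is a hypothetical name introduced for illustration and is not part of ultralytics; it only formats strings and does not run anything.

```python
import shlex
from typing import Optional


def build_yolo_command(mode: str, task: Optional[str] = None, **args) -> list:
    """Assemble `yolo TASK MODE ARGS` as a list of command-line tokens."""
    command = ["yolo"]
    if task is not None:  # TASK is optional
        command.append(task)
    command.append(mode)  # MODE is required
    # Each override becomes a single 'arg=value' token.
    command.extend(f"{key}={value}" for key, value in args.items())
    return command


cmd = build_yolo_command("predict", task="detect", model="yolov8n.pt", imgsz=320)
print(shlex.join(cmd))  # yolo detect predict model=yolov8n.pt imgsz=320
```

The resulting list could be passed to subprocess.run(cmd) on a machine where the yolo command is installed.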

Trying YOLOv8 on the COCO128 dataset

predict

Run a prediction directly from the command line

yolo detect predict model=yolov8n.pt source='https://ultralytics.com/images/bus.jpg'

Ultralytics YOLOv8.0.105  Python-3.10.11 torch-2.0.0+cu118 CUDA:0 (NVIDIA GeForce RTX 3090, 24575MiB)
YOLOv8n summary (fused): 168 layers, 3151904 parameters, 0 gradients

Found https:\ultralytics.com\images\bus.jpg locally at bus.jpg
image 1/1 G:\ultralytics\bus.jpg: 640x480 4 persons, 1 bus, 1 stop sign, 185.4ms
Speed: 5.7ms preprocess, 185.4ms inference, 7.0ms postprocess per image at shape (1, 3, 640, 640)
Results saved to runs\detect\predict2
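
The Speed line of the log can be turned into a rough per-image total. The sketch below parses the line copied from the output above; parse_speed is an illustrative helper name, and the throughput figure ignores model loading and file I/O.

```python
import re


def parse_speed(line: str) -> dict:
    """Extract per-stage timings in milliseconds from a 'Speed:' log line."""
    return {stage: float(ms) for ms, stage in re.findall(r"([\d.]+)ms (\w+)", line)}


line = (
    "Speed: 5.7ms preprocess, 185.4ms inference, 7.0ms postprocess "
    "per image at shape (1, 3, 640, 640)"
)
timings = parse_speed(line)
total_ms = sum(timings.values())

print(timings)
print(f"total: {total_ms:.1f} ms per image, about {1000 / total_ms:.1f} images/s")
```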

The output image is shown below
[Figure: detection result for bus.jpg]

segment

Note that the model used here is yolov8n-seg

yolo segment predict model=yolov8n-seg.pt source='https://ultralytics.com/images/bus.jpg'

Execution log

(yolo) G:\ultralytics>yolo segment predict model=yolov8n-seg.pt source='https://ultralytics.com/images/bus.jpg'
Downloading https:\github.com\ultralytics\assets\releases\download\v0.0.0\yolov8n-seg.pt to yolov8n-seg.pt...
100%|█████████████████████████████████████████████████████████████████████████████| 6.73M/6.73M [00:01<00:00, 6.63MB/s]
Ultralytics YOLOv8.0.105  Python-3.10.11 torch-2.0.0+cu118 CUDA:0 (NVIDIA GeForce RTX 3090, 24575MiB)
YOLOv8n-seg summary (fused): 195 layers, 3404320 parameters, 0 gradients

Found https:\ultralytics.com\images\bus.jpg locally at bus.jpg
image 1/1 G:\ultralytics\bus.jpg: 640x480 4 persons, 1 bus, 1 skateboard, 245.6ms
Speed: 5.2ms preprocess, 245.6ms inference, 8.1ms postprocess per image at shape (1, 3, 640, 640)
Results saved to runs\segment\predict

The output image is shown below
[Figure: segmentation result for bus.jpg]

Downloading the COCO 2017 dataset

# Download COCO val (notebook cell: the line starting with "!" is a Jupyter/Colab shell escape)
import torch
torch.hub.download_url_to_file('https://ultralytics.com/assets/coco2017val.zip', 'tmp.zip')  # download (780M - 5000 images)
!unzip -q tmp.zip -d datasets && rm tmp.zip  # unzip

Val

Validate the yolov8n.pt model on COCO128; the coco2017val dataset downloaded above is not needed for this step

# Validate YOLOv8n on COCO128 val
yolo val model=yolov8n.pt data=coco128.yaml

Execution log

(yolo) G:\ultralytics>yolo val model=yolov8n.pt data=coco128.yaml
Ultralytics YOLOv8.0.105  Python-3.10.11 torch-2.0.0+cu118 CUDA:0 (NVIDIA GeForce RTX 3090, 24575MiB)
YOLOv8n summary (fused): 168 layers, 3151904 parameters, 0 gradients

Dataset 'coco128.yaml' images not found , missing paths ['G:\\ultralytics\\datasets\\coco128\\images\\train2017']
Downloading https:\ultralytics.com\assets\coco128.zip to G:\ultralytics\datasets\coco128.zip...
 Download failure, retrying 1/3 https://ultralytics.com/assets/coco128.zip...
################################################################################################################ 100.0%-################################################################################################################ 100.0%
Unzipping G:\ultralytics\datasets\coco128.zip to G:\ultralytics\datasets...
Dataset download success  (21.7s), saved to G:\ultralytics\datasets

val: Scanning G:\ultralytics\datasets\coco128\labels\train2017... 126 images, 2 backgrounds, 0 corrupt: 100%|██████████
val: New cache created: G:\ultralytics\datasets\coco128\labels\train2017.cache
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 8/8 [00:06<0
                   all        128        929       0.64      0.537      0.605      0.446
                person        128        254      0.796      0.677      0.764      0.537
               bicycle        128          6      0.513      0.333      0.315      0.263
                   car        128         46      0.814      0.217      0.273      0.168
            ... (truncated)
            toothbrush        128          5      0.745        0.6      0.638      0.374
Speed: 0.8ms preprocess, 9.0ms inference, 0.0ms loss, 2.4ms postprocess per image

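The dataset referenced by coco128.yaml is downloaded automatically; the 128 refers to the 128 images in datasets\coco128\images\train2017

Viewing the validation results

[Figure: validation results]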

Train

Train YOLO on the coco128 dataset (already downloaded automatically above)

yolo train model=yolov8n.pt data=coco128.yaml epochs=3 imgsz=640

Training log

Ultralytics YOLOv8.0.105  Python-3.10.11 torch-2.0.0+cu118 CUDA:0 (NVIDIA GeForce RTX 3090, 24575MiB)
yolo\engine\trainer: task=detect, mode=train, model=yolov8n.pt, data=coco128.yaml, epochs=3, patience=50, batch=16, imgsz=640, save=True, save_period=-1, cache=False, device=None, workers=8, project=None, name=None, exist_ok=False, pretrained=False, optimizer=SGD, verbose=True, seed=0, deterministic=True, single_cls=False, rect=False, cos_lr=False, close_mosaic=0, resume=False, amp=True, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, split=val, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, show=False, save_txt=False, save_conf=False, save_crop=False, show_labels=True, show_conf=True, vid_stride=1, line_width=None, visualize=False, augment=False, agnostic_nms=False, classes=None, retina_masks=False, boxes=True, format=torchscript, keras=False, optimize=False, int8=False, dynamic=False, simplify=False, opset=None, workspace=4, nms=False, lr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=7.5, cls=0.5, dfl=1.5, pose=12.0, kobj=1.0, label_smoothing=0.0, nbs=64, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0, copy_paste=0.0, cfg=None, v5loader=False, tracker=botsort.yaml, save_dir=runs\detect\train

Downloading https:\ultralytics.com\assets\Arial.ttf to C:\Users\Administrator\AppData\Roaming\Ultralytics\Arial.ttf...
100%|███████████████████████████████████████████████████████████████████████████████| 755k/755k [00:00<00:00, 3.64MB/s]

                   from  n    params  module                                       arguments
  0                  -1  1       464  ultralytics.nn.modules.conv.Conv             [3, 16, 3, 2]
  1                  -1  1      4672  ultralytics.nn.modules.conv.Conv             [16, 32, 3, 2]
  2                  -1  1      7360  ultralytics.nn.modules.block.C2f             [32, 32, 1, True]
  3                  -1  1     18560  ultralytics.nn.modules.conv.Conv             [32, 64, 3, 2]
  4                  -1  2     49664  ultralytics.nn.modules.block.C2f             [64, 64, 2, True]
  5                  -1  1     73984  ultralytics.nn.modules.conv.Conv             [64, 128, 3, 2]
  6                  -1  2    197632  ultralytics.nn.modules.block.C2f             [128, 128, 2, True]
  7                  -1  1    295424  ultralytics.nn.modules.conv.Conv             [128, 256, 3, 2]
  8                  -1  1    460288  ultralytics.nn.modules.block.C2f             [256, 256, 1, True]
  9                  -1  1    164608  ultralytics.nn.modules.block.SPPF            [256, 256, 5]
 10                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']
 11             [-1, 6]  1         0  ultralytics.nn.modules.conv.Concat           [1]
 12                  -1  1    148224  ultralytics.nn.modules.block.C2f             [384, 128, 1]
 13                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']
 14             [-1, 4]  1         0  ultralytics.nn.modules.conv.Concat           [1]
 15                  -1  1     37248  ultralytics.nn.modules.block.C2f             [192, 64, 1]
 16                  -1  1     36992  ultralytics.nn.modules.conv.Conv             [64, 64, 3, 2]
 17            [-1, 12]  1         0  ultralytics.nn.modules.conv.Concat           [1]
 18                  -1  1    123648  ultralytics.nn.modules.block.C2f             [192, 128, 1]
 19                  -1  1    147712  ultralytics.nn.modules.conv.Conv             [128, 128, 3, 2]
 20             [-1, 9]  1         0  ultralytics.nn.modules.conv.Concat           [1]
 21                  -1  1    493056  ultralytics.nn.modules.block.C2f             [384, 256, 1]
 22        [15, 18, 21]  1    897664  ultralytics.nn.modules.head.Detect           [80, [64, 128, 256]]
Model summary: 225 layers, 3157200 parameters, 3157184 gradients

Transferred 355/355 items from pretrained weights
AMP: running Automatic Mixed Precision (AMP) checks with YOLOv8n...
AMP: checks passed
optimizer: SGD(lr=0.01) with parameter groups 57 weight(decay=0.0), 64 weight(decay=0.0005), 63 bias
train: Scanning G:\ultralytics\datasets\coco128\labels\train2017.cache... 126 images, 2 backgrounds, 0 corrupt: 100%|██
val: Scanning G:\ultralytics\datasets\coco128\labels\train2017.cache... 126 images, 2 backgrounds, 0 corrupt: 100%|████
Plotting labels to runs\detect\train\labels.jpg...
Image sizes 640 train, 640 val
Using 8 dataloader workers
Logging results to runs\detect\train
Starting training for 3 epochs...

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
        1/3      2.51G      1.182      1.409      1.214        218        640: 100%|██████████| 8/8 [00:04<00:00,  1.89
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 4/4 [00:02<0
                   all        128        929      0.635      0.564      0.626      0.463

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
        2/3       2.5G      1.137      1.319      1.239        205        640: 100%|██████████| 8/8 [00:01<00:00,  7.10
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 4/4 [00:01<0
                   all        128        929       0.67      0.548      0.635      0.472

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
        3/3      2.49G      1.172      1.293      1.219        161        640: 100%|██████████| 8/8 [00:01<00:00,  7.51
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 4/4 [00:02<0
                   all        128        929      0.663       0.59      0.663      0.496

3 epochs completed in 0.006 hours.
Optimizer stripped from runs\detect\train\weights\last.pt, 6.5MB
Optimizer stripped from runs\detect\train\weights\best.pt, 6.5MB

Validating runs\detect\train\weights\best.pt...
Ultralytics YOLOv8.0.105  Python-3.10.11 torch-2.0.0+cu118 CUDA:0 (NVIDIA GeForce RTX 3090, 24575MiB)

Model summary (fused): 168 layers, 3151904 parameters, 0 gradients
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 4/4 [00:02<0
                   all        128        929      0.664      0.589      0.663      0.496
                person        128        254      0.814      0.685      0.767      0.552
               bicycle        128          6      0.658      0.325      0.327       0.25
                   car        128         46      0.782      0.217      0.329      0.194
            motorcycle        128          5      0.685      0.879      0.938      0.747
              airplane        128          6      0.774      0.833      0.903      0.673
              ...
            toothbrush        128          5          1      0.574      0.786      0.485
Speed: 1.0ms preprocess, 2.9ms inference, 0.0ms loss, 1.8ms postprocess per image
Results saved to runs\detect\train

The training output includes the weights and other data

[Figure: training result for a single image]

Custom dataset

See docs\yolov5\tutorials\train_custom_data.md (also available in the online docs); it applies to v8 as well

mkdir -p data/{images,labels}/{train,val}
mkdir -p data/test_images

images: all image data goes here; train holds the training images and val holds the validation images
labels: the annotation files go here
test_images: test data
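
The brace expansion in the mkdir -p commands above is a Unix shell feature and does not work in the Windows cmd prompt used elsewhere in this article. As a cross-platform alternative, the same layout can be created from Python. This is a sketch only, and the function name make_dataset_dirs is an assumption made for illustration.

```python
import tempfile
from pathlib import Path


def make_dataset_dirs(root: Path) -> list:
    """Create data/{images,labels}/{train,val} and data/test_images under root."""
    wanted = []
    for kind in ("images", "labels"):
        for split in ("train", "val"):
            wanted.append(root / "data" / kind / split)
    wanted.append(root / "data" / "test_images")
    for directory in wanted:
        directory.mkdir(parents=True, exist_ok=True)  # same effect as mkdir -p
    return wanted


# Demonstrated in a temporary directory; pass Path(".") to create it in place.
with tempfile.TemporaryDirectory() as tmp:
    dirs = make_dataset_dirs(Path(tmp))
    all_created = all(d.is_dir() for d in dirs)
    layout = [d.relative_to(tmp).as_posix() for d in dirs]

print(layout)
```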

[Figure: dataset directory layout]

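Create dog.yaml. nc is the total number of object classes; it is checked against names, so it must be correct. names lists the class names; here I used husky (hashiqi) and Labrador (labuladuo)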

train: G:\ultralytics\data\images\train
val: G:\ultralytics\data\images\val
# number of classes
nc: 2
# class names
names: ['hashiqi','labuladuo']
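
The consistency requirement between nc and names can be checked before training with a few lines of Python. The sketch below is not the validation that ultralytics performs internally; check_dataset_config is a hypothetical helper, and the dictionary simply mirrors dog.yaml so that no YAML parser is needed.

```python
def check_dataset_config(cfg: dict) -> None:
    """Raise ValueError when a required key is missing or nc disagrees with names."""
    for key in ("train", "val", "nc", "names"):
        if key not in cfg:
            raise ValueError(f"missing key: {key}")
    if len(cfg["names"]) != cfg["nc"]:
        raise ValueError(f"nc={cfg['nc']} but {len(cfg['names'])} names were given")


# Same content as dog.yaml, written as a Python dict.
dog_cfg = {
    "train": r"G:\ultralytics\data\images\train",
    "val": r"G:\ultralytics\data\images\val",
    "nc": 2,
    "names": ["hashiqi", "labuladuo"],
}
check_dataset_config(dog_cfg)  # consistent, so nothing is raised

try:
    check_dataset_config({**dog_cfg, "nc": 3})
except ValueError as err:
    print(err)  # nc=3 but 2 names were given
```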

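Copy ultralytics\yolo\cfg\default.yaml into the data directory, or generate it with yolo copy-cfg

detect: object detection; train: training mode
model: yolov8n, the nano model of YOLO version 8 (small, high performance)
data: data/dog.yaml, the file defined above; it holds the path of the annotated training images, the path of the validation images, the class names, and so on
epochs: start with at least 1k training images; around 500 epochs is usually enough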

task: detect  # YOLO task, i.e. detect, segment, classify, pose
mode: train  # YOLO mode, i.e. train, val, predict, export, track, benchmark

# Train settings -------------------------------------------------------------------------------------------------------
model: yolov8n.pt # path to model file, i.e. yolov8n.pt, yolov8n.yaml
data: data/dog.yaml # path to data file, i.e. coco128.yaml
epochs: 500  # number of epochs to train for
patience: 400  # epochs to wait for no observable improvement for early stopping of training
batch: 128  # number of images per batch (-1 for AutoBatch)
imgsz:  1024 # size of input images as integer or w,h
save_period: 100 # Save checkpoint every x epochs (disabled if < 1); saving once every 20% of the epochs is usually plenty
resume: True  # resume training from last checkpoint
...

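Annotation

One object per line
Each line has the format class x_center y_center width height
Files are *.txt (labelImg can write the YOLO format)
Label path convention: replace /images/ in the image path with /labels/, and give the txt file the same name as the image, as shown below

[Figure: label file path convention]
Understanding the annotation values: x_center, y_center, width, and height are normalized to the image size (values between 0 and 1), and class is a zero-based index into names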
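
The label format and the path convention above can be sketched in a few lines. Both helper names (to_yolo_line and label_path_for) and the example image size, box, and file name are assumptions made for illustration; the path helper swaps the last images directory in the path and fails with ValueError if there is none.

```python
from pathlib import PurePosixPath


def to_yolo_line(class_id, box, image_size) -> str:
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line."""
    x_min, y_min, x_max, y_max = box
    img_w, img_h = image_size
    # Center and size are divided by the image size, giving values in [0, 1].
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def label_path_for(image_path: str) -> str:
    """Swap the last 'images' directory for 'labels' and the extension for .txt."""
    parts = list(PurePosixPath(image_path).parts)
    last = len(parts) - 1 - parts[::-1].index("images")
    parts[last] = "labels"
    return str(PurePosixPath(*parts).with_suffix(".txt"))


# Hypothetical example: a 640x480 image with one box of class 0.
print(to_yolo_line(0, (160, 120, 480, 360), (640, 480)))
print(label_path_for("data/images/train/dog1.jpg"))
```

With these example values the box is centered in the image and covers half of its width and height, so every normalized value is 0.5.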

The labelImg annotation tool

https://github.com/heartexlabs/labelImg/releases , download the Windows build directly


Annotating image data in YOLO format with labelImg
[Figure: labelImg annotation window]

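Training. I only have 24 images here, so I raised epochs to 50000 (normally you would start with at least 1000 images, but annotating is really tedious). After training you get a best model file such as runs\detect\train7\weights\best.pt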

yolo cfg=data/default.yaml

Training log. Early stopping (patience 40000) ended the run at around epoch 42000; the log below reports about 10 hours of training

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
42199/50000      8.08G    0.08059    0.08471     0.7974        163       1024: 100%|██████████| 1/1 [00:00<00:00,  3.21
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<0
                   all          8         16      0.917      0.718      0.849      0.463

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
42200/50000      8.14G    0.09538     0.1015     0.7966        215       1024: 100%|██████████| 1/1 [00:00<00:00,  3.15
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<0
                   all          8         16      0.917      0.718      0.849      0.463
Stopping training early as no improvement observed in last 40000 epochs. Best results observed at epoch 2200, best model saved as best.pt.
To update EarlyStopping(patience=40000) pass a new patience value, i.e. `patience=300` or use `patience=0` to disable EarlyStopping.
42200 epochs completed in 10.187 hours.

Optimizer stripped from runs\detect\train7\weights\last.pt, 6.3MB
Optimizer stripped from runs\detect\train7\weights\best.pt, 6.3MB

Validating runs\detect\train7\weights\best.pt...
Ultralytics YOLOv8.0.105  Python-3.10.11 torch-2.0.0+cu118 CUDA:0 (NVIDIA GeForce RTX 3090, 24575MiB)
Model summary (fused): 168 layers, 3006038 parameters, 0 gradients
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<0
                   all          8         16          1      0.875      0.956      0.791
               hashiqi          8         13          1      0.751      0.916      0.719
             labuladuo          8          3          1      0.998      0.995      0.864
Speed: 1.1ms preprocess, 15.7ms inference, 0.0ms loss, 2.0ms postprocess per image
Results saved to runs\detect\train7
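
The "Stopping training early" message in the log above follows from patience-based early stopping: the best epoch was 2200, the patience was 40000, and 2200 + 40000 = 42200. The sketch below reproduces that arithmetic. It is a simplified illustration rather than the ultralytics implementation, and the fitness curve is synthetic, constructed only so that it peaks at epoch 2200.

```python
class EarlyStopper:
    """Stop once `patience` epochs have passed without a new best fitness."""

    def __init__(self, patience: int):
        self.patience = patience
        self.best_fitness = float("-inf")
        self.best_epoch = 0

    def step(self, epoch: int, fitness: float) -> bool:
        if fitness > self.best_fitness:
            self.best_fitness = fitness
            self.best_epoch = epoch
        # True means training should stop after this epoch.
        return epoch - self.best_epoch >= self.patience


def synthetic_fitness(epoch: int) -> float:
    """Made-up fitness curve whose only peak is at epoch 2200."""
    return 1.0 if epoch == 2200 else 0.5


stopper = EarlyStopper(patience=40000)
stopped_at = None
for epoch in range(1, 50001):
    if stopper.step(epoch, synthetic_fitness(epoch)):
        stopped_at = epoch
        break

print(stopper.best_epoch, stopped_at)  # 2200 42200
```

With only 24 images, most of those 40000 extra epochs bring no improvement, so a much smaller patience value would have ended the run far earlier with the same best.pt.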

Inference: use the best trained model for prediction

yolo detect predict model=runs\detect\train7\weights\best.pt source=data\test_images save=True

Inference output

(yolo) G:\ultralytics>yolo detect predict model=runs\detect\train7\weights\best.pt source=data\test_images save=True
Ultralytics YOLOv8.0.105  Python-3.10.11 torch-2.0.0+cu118 CUDA:0 (NVIDIA GeForce RTX 3090, 24575MiB)
Model summary (fused): 168 layers, 3006038 parameters, 0 gradients

image 1/5 G:\ultralytics\data\test_images\11111.jpg: 1024x1024 2 hashiqis, 2 labuladuos, 10.4ms
image 2/5 G:\ultralytics\data\test_images\111121.jpg: 672x1024 2 hashiqis, 1 labuladuo, 167.6ms
image 3/5 G:\ultralytics\data\test_images\1111211.jpg: 640x1024 (no detections), 145.9ms
image 4/5 G:\ultralytics\data\test_images\11112211.jpg: 800x1024 1 hashiqi, 6 labuladuos, 153.9ms
image 5/5 G:\ultralytics\data\test_images\1d6e8adbb5d18f57.jpg: 704x1024 (no detections), 162.3ms
Speed: 8.1ms preprocess, 128.0ms inference, 2.7ms postprocess per image at shape (1, 3, 1024, 1024)
Results saved to runs\detect\predict6

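Analyzing training results

The figure below is found in runs\detect\train7
[Figure: training result curves]

recall: the higher the recall, the fewer objects are missed, i.e. fewer positives are predicted as negatives and more of the true positives are found. Here it levels off at around 15000 steps
mAP@[0.5:0.95]: the higher the value, the more accurately the predicted boxes match the ground truth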
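
The two metrics rest on simple quantities. Recall is TP / (TP + FN), precision is TP / (TP + FP), and mAP@[0.5:0.95] averages AP over IoU thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below computes these building blocks for made-up counts and boxes; it does not compute AP itself, and none of the numbers come from the training run above.

```python
def precision_recall(tp: int, fp: int, fn: int):
    """Precision and recall from true positive, false positive, false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def iou(box_a, box_b) -> float:
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


# Hypothetical counts: 8 correct detections, 2 false alarms, 2 missed objects.
precision, recall = precision_recall(tp=8, fp=2, fn=2)
overlap = iou((0, 0, 10, 10), (5, 0, 15, 10))
# The ten IoU thresholds behind mAP@[0.5:0.95].
thresholds = [round(0.5 + 0.05 * i, 2) for i in range(10)]

print(precision, recall)
print(f"IoU = {overlap:.3f}")
print(thresholds)
```

A prediction with IoU 0.333 against its ground-truth box would not count as a match at any of the ten thresholds, which is why tighter boxes raise mAP@[0.5:0.95].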

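Tips for best training results

https://docs.ultralytics.com/yolov5/tutorials/tips_for_best_training_results/

Data:

Images per class. ≥ 1500 images per class recommended
Instances per class. ≥ 10000 instances (labeled objects) per class recommended
Image variety. Must be representative of the deployed environment. For real-world use cases we recommend images from different times of day, different seasons, different weather, different lighting, different angles, and different sources (scraped online, collected locally, different cameras), etc.
Label consistency. All instances of all classes in all images must be labeled. Partial labeling will not work.
Label accuracy. Labels must closely enclose each object. No space should exist between an object and its bounding box. No objects should be missing a label.
Background images. Background images are images with no objects that are added to a dataset to reduce false positives (FP). We recommend about 0-10% background images to help reduce FPs (COCO has 1000 background images for reference, 1% of the total). No labels are required for background images.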
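
The per-class instance counts and the background share mentioned above can be measured directly from the YOLO label files. The following is a rough sketch under stated assumptions: audit_labels is a hypothetical helper, an image counts as background when it has no label file or an empty one, and the demo files are made up.

```python
import tempfile
from collections import Counter
from pathlib import Path


def audit_labels(labels_dir: Path, image_stems: set):
    """Return (instances per class id, fraction of images without any label)."""
    instances = Counter()
    labeled = set()
    for label_file in sorted(labels_dir.glob("*.txt")):
        rows = [ln.split() for ln in label_file.read_text().splitlines() if ln.strip()]
        if rows:
            labeled.add(label_file.stem)
        for row in rows:
            instances[int(row[0])] += 1  # first field is the class id
    background = image_stems - labeled
    fraction = len(background) / len(image_stems) if image_stems else 0.0
    return instances, fraction


# Made-up demo: four images, two of them without labels.
with tempfile.TemporaryDirectory() as tmp:
    labels = Path(tmp)
    (labels / "a.txt").write_text("0 0.5 0.5 0.2 0.2\n1 0.3 0.3 0.1 0.1\n")
    (labels / "b.txt").write_text("0 0.4 0.6 0.3 0.3\n")
    counts, background_fraction = audit_labels(labels, {"a", "b", "c", "d"})

print(dict(counts))
print(f"background images: {background_fraction:.0%}")
```

Comparing the counts with the recommended ≥ 10000 instances per class, and the background share with the 0-10% range, gives a quick view of where a dataset falls short.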