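Contents
- Resources
- Models (also called weights)
- Installation
- Installing ultralytics (YOLO)
- Torch
- Test command
- CLI
- Trying YOLOv8 on the COCO128 dataset
- Labels
- predict
- segment
- Downloading the COCO 2017 dataset
- Val
- Train
- Custom dataset
- Annotation
- The labelImg annotation tool
- Analyzing training results
- Tips for best training results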
Resources
- Docs: https://docs.ultralytics.com
- Community: https://community.ultralytics.com
- GitHub: https://github.com/ultralytics/ultralytics
- COCO dataset: https://cocodataset.org/#download
- Kaggle: cloud platform with datasets
Models (also called weights)
Model repository
Installation
Installing ultralytics (YOLO)
https://docs.ultralytics.com/quickstart/#install
Option 1, install from source (https://docs.ultralytics.com/quickstart/):
git clone https://github.com/ultralytics/ultralytics
cd ultralytics
pip install -e .
Option 2, install from PyPI:
pip install ultralytics
yolo -v
8.0.105
Torch
Test torch from a Python prompt. If it prints False, the GPU driver or the CUDA build of torch is not working yet.
# Check CUDA support
import torch
print(torch.cuda.is_available())
To install a build for a specific CUDA version (cuXXX), look up the matching command at https://pytorch.org/get-started/previous-versions/ and run something like:
pip install torch==2.0.0+cu118 torchvision==0.15.1+cu118 torchaudio==2.0.1 --index-url https://download.pytorch.org/whl/cu118
Test command
yolo predict model=yolov8n.pt source='https://ultralytics.com/images/bus.jpg'
Downloading https:\github.com\ultralytics\assets\releases\download\v0.0.0\yolov8n.pt to yolov8n.pt...
100%|█████████████████████████████████████████████████████████████████████████████| 6.23M/6.23M [00:01<00:00, 4.59MB/s]
Ultralytics YOLOv8.0.105 Python-3.10.11 torch-2.0.0+cu118 CUDA:0 (NVIDIA GeForce RTX 3090, 24575MiB)
Downloading https:\ultralytics.com\images\bus.jpg to bus.jpg...
100%|███████████████████████████████████████████████████████████████████████████████| 476k/476k [00:00<00:00, 2.36MB/s]
image 1/1 G:\ultralytics\bus.jpg: 640x480 4 persons, 1 bus, 1 stop sign, 191.7ms
Speed: 18.1ms preprocess, 191.7ms inference, 8.0ms postprocess per image at shape (1, 3, 640, 640)
Results saved to runs\detect\predict
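A quick breakdown of the output:
- Version info: Ultralytics YOLOv8.0.105, Python-3.10.11, torch-2.0.0+cu118
- Model: yolov8n.pt, downloaded automatically if it is not present locally; the image being predicted is bus.jpg
- Default task: detect (object detection)
- Compute device: the GPU, an NVIDIA GeForce RTX 3090
- The prediction reports 4 persons, 1 bus and 1 stop sign
- Results are saved automatically to runs\detect\predict; later runs create predict2, predict3, …
Comparing GPU and CPU performance, the difference is quite noticeable.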
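The run-directory naming seen in the logs (predict on the first run, predict2 on the next, and later predict6 and train7) can be mimicked with a few lines of Python. This is only a sketch of the naming pattern, not code from the ultralytics package; `next_run_dir` is a hypothetical helper name.

```python
import tempfile
from pathlib import Path


def next_run_dir(base: Path) -> Path:
    """Return `base` if it is unused, else the first free base2, base3, ...

    Hypothetical helper mirroring the names observed in the logs;
    it is not taken from the ultralytics source.
    """
    if not base.exists():
        return base
    n = 2
    while base.with_name(f"{base.name}{n}").exists():
        n += 1
    return base.with_name(f"{base.name}{n}")


# Demo inside a throwaway directory so no real runs folder is touched.
with tempfile.TemporaryDirectory() as tmp:
    names = []
    for _ in range(3):
        run_dir = next_run_dir(Path(tmp) / "runs" / "detect" / "predict")
        run_dir.mkdir(parents=True)
        names.append(run_dir.name)
    print(names)  # ['predict', 'predict2', 'predict3']
```

Knowing the pattern helps when a later command has to point at the right weights folder, for example train7 rather than train.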
CLI
CLI getting-started docs: https://docs.ultralytics.com/usage/cli/
yolo TASK MODE ARGS
Where TASK (optional) is one of [detect, segment, classify]
MODE (required) is one of [train, val, predict, export, track]
ARGS (optional) are any number of custom 'arg=value' pairs like 'imgsz=320' that override defaults.
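To make the `yolo TASK MODE ARGS` syntax concrete, here is a minimal sketch that assembles such a command string and rejects unknown tasks or modes. The task and mode lists are copied from the help text above; `build_yolo_command` is a hypothetical helper, not part of the ultralytics package.

```python
# Values copied from the CLI help text quoted above.
TASKS = {"detect", "segment", "classify"}
MODES = {"train", "val", "predict", "export", "track"}


def build_yolo_command(mode, task=None, **args):
    """Assemble a `yolo TASK MODE ARGS` command line as a string."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if task is not None and task not in TASKS:
        raise ValueError(f"unknown task: {task!r}")
    parts = ["yolo"]
    if task is not None:  # TASK is optional
        parts.append(task)
    parts.append(mode)  # MODE is required
    parts.extend(f"{key}={value}" for key, value in args.items())
    return " ".join(parts)


cmd = build_yolo_command("predict", task="detect", model="yolov8n.pt", imgsz=320)
print(cmd)  # yolo detect predict model=yolov8n.pt imgsz=320
```

The resulting string could be passed to a shell or to `subprocess.run(cmd.split())`, assuming none of the argument values contain spaces.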
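Trying YOLOv8 on the COCO128 dataset
Labels
During training and inference the output lists what was recognized, either as class indices 0-79 or as names such as person. What are these and where do they come from? See this file: https://github.com/ultralytics/ultralytics/blob/main/ultralytics/datasets/coco128.yaml
They are the 80 classes of Microsoft's COCO dataset, on which the pretrained models were trained.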
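As a rough illustration of how class indices turn into the names printed in the logs, the sketch below maps a handful of COCO indices to names and builds a summary string in the same style as the log line "4 persons, 1 bus, 1 stop sign". Only a few of the 80 entries are listed, and `summarize` is a hypothetical helper, not the library's own formatting code.

```python
from collections import Counter

# A few entries of the 80-class COCO index -> name mapping (not the full list).
COCO_NAMES = {
    0: "person", 1: "bicycle", 2: "car", 3: "motorcycle",
    4: "airplane", 5: "bus", 11: "stop sign", 79: "toothbrush",
}


def summarize(class_ids):
    """Turn a list of predicted class indices into a short text summary."""
    counts = Counter(class_ids)
    parts = []
    for class_id in sorted(counts):  # report classes in index order
        n = counts[class_id]
        name = COCO_NAMES.get(class_id, f"class {class_id}")
        parts.append(f"{n} {name}{'s' if n > 1 else ''}")
    return ", ".join(parts)


summary = summarize([0, 5, 0, 11, 0, 0])
print(summary)  # 4 persons, 1 bus, 1 stop sign
```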
predict
Run a prediction directly from the command line:
yolo detect predict model=yolov8n.pt source='https://ultralytics.com/images/bus.jpg'
Ultralytics YOLOv8.0.105 Python-3.10.11 torch-2.0.0+cu118 CUDA:0 (NVIDIA GeForce RTX 3090, 24575MiB)
YOLOv8n summary (fused): 168 layers, 3151904 parameters, 0 gradients
Found https:\ultralytics.com\images\bus.jpg locally at bus.jpg
image 1/1 G:\ultralytics\bus.jpg: 640x480 4 persons, 1 bus, 1 stop sign, 185.4ms
Speed: 5.7ms preprocess, 185.4ms inference, 7.0ms postprocess per image at shape (1, 3, 640, 640)
Results saved to runs\detect\predict2
The output image is shown below.
segment
Note that this uses the yolov8n-seg model:
yolo segment predict model=yolov8n-seg.pt source='https://ultralytics.com/images/bus.jpg'
Execution log:
(yolo) G:\ultralytics>yolo segment predict model=yolov8n-seg.pt source='https://ultralytics.com/images/bus.jpg'
Downloading https:\github.com\ultralytics\assets\releases\download\v0.0.0\yolov8n-seg.pt to yolov8n-seg.pt...
100%|█████████████████████████████████████████████████████████████████████████████| 6.73M/6.73M [00:01<00:00, 6.63MB/s]
Ultralytics YOLOv8.0.105 Python-3.10.11 torch-2.0.0+cu118 CUDA:0 (NVIDIA GeForce RTX 3090, 24575MiB)
YOLOv8n-seg summary (fused): 195 layers, 3404320 parameters, 0 gradients
Found https:\ultralytics.com\images\bus.jpg locally at bus.jpg
image 1/1 G:\ultralytics\bus.jpg: 640x480 4 persons, 1 bus, 1 skateboard, 245.6ms
Speed: 5.2ms preprocess, 245.6ms inference, 8.1ms postprocess per image at shape (1, 3, 640, 640)
Results saved to runs\segment\predict
The output image is shown below.
Downloading the COCO 2017 dataset
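# Download COCO val
import os
import zipfile
import torch
torch.hub.download_url_to_file('https://ultralytics.com/assets/coco2017val.zip', 'tmp.zip')  # download (780M - 5000 images)
with zipfile.ZipFile('tmp.zip') as zf:  # portable stand-in for the notebook-only `!unzip -q tmp.zip -d datasets`
    zf.extractall('datasets')
os.remove('tmp.zip')  # delete the archive after extraction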
Val
Validate the yolov8n.pt model on COCO128. The coco2017val dataset is not needed at this step yet.
# Validate YOLOv8n on COCO128 val
yolo val model=yolov8n.pt data=coco128.yaml
Execution log:
(yolo) G:\ultralytics>yolo val model=yolov8n.pt data=coco128.yaml
Ultralytics YOLOv8.0.105 Python-3.10.11 torch-2.0.0+cu118 CUDA:0 (NVIDIA GeForce RTX 3090, 24575MiB)
YOLOv8n summary (fused): 168 layers, 3151904 parameters, 0 gradients
Dataset 'coco128.yaml' images not found , missing paths ['G:\\ultralytics\\datasets\\coco128\\images\\train2017']
Downloading https:\ultralytics.com\assets\coco128.zip to G:\ultralytics\datasets\coco128.zip...
Download failure, retrying 1/3 https://ultralytics.com/assets/coco128.zip...
################################################################################################################ 100.0%-################################################################################################################ 100.0%
Unzipping G:\ultralytics\datasets\coco128.zip to G:\ultralytics\datasets...
Dataset download success (21.7s), saved to G:\ultralytics\datasets
val: Scanning G:\ultralytics\datasets\coco128\labels\train2017... 126 images, 2 backgrounds, 0 corrupt: 100%|██████████
val: New cache created: G:\ultralytics\datasets\coco128\labels\train2017.cache
Class Images Instances Box(P R mAP50 mAP50-95): 100%|██████████| 8/8 [00:06<0
all 128 929 0.64 0.537 0.605 0.446
person 128 254 0.796 0.677 0.764 0.537
bicycle 128 6 0.513 0.333 0.315 0.263
car 128 46 0.814 0.217 0.273 0.168
... (truncated)
toothbrush 128 5 0.745 0.6 0.638 0.374
Speed: 0.8ms preprocess, 9.0ms inference, 0.0ms loss, 2.4ms postprocess per image
The dataset referenced by coco128.yaml is downloaded automatically; the 128 refers to the 128 images in datasets\coco128\images\train2017.
View the validation results:
Train
Train YOLO on the coco128 dataset (already downloaded automatically in the previous step):
yolo train model=yolov8n.pt data=coco128.yaml epochs=3 imgsz=640
Training log:
Ultralytics YOLOv8.0.105 Python-3.10.11 torch-2.0.0+cu118 CUDA:0 (NVIDIA GeForce RTX 3090, 24575MiB)
yolo\engine\trainer: task=detect, mode=train, model=yolov8n.pt, data=coco128.yaml, epochs=3, patience=50, batch=16, imgsz=640, save=True, save_period=-1, cache=False, device=None, workers=8, project=None, name=None, exist_ok=False, pretrained=False, optimizer=SGD, verbose=True, seed=0, deterministic=True, single_cls=False, rect=False, cos_lr=False, close_mosaic=0, resume=False, amp=True, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, split=val, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, show=False, save_txt=False, save_conf=False, save_crop=False, show_labels=True, show_conf=True, vid_stride=1, line_width=None, visualize=False, augment=False, agnostic_nms=False, classes=None, retina_masks=False, boxes=True, format=torchscript, keras=False, optimize=False, int8=False, dynamic=False, simplify=False, opset=None, workspace=4, nms=False, lr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=7.5, cls=0.5, dfl=1.5, pose=12.0, kobj=1.0, label_smoothing=0.0, nbs=64, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0, copy_paste=0.0, cfg=None, v5loader=False, tracker=botsort.yaml, save_dir=runs\detect\train
Downloading https:\ultralytics.com\assets\Arial.ttf to C:\Users\Administrator\AppData\Roaming\Ultralytics\Arial.ttf...
100%|███████████████████████████████████████████████████████████████████████████████| 755k/755k [00:00<00:00, 3.64MB/s]
from n params module arguments
0 -1 1 464 ultralytics.nn.modules.conv.Conv [3, 16, 3, 2]
1 -1 1 4672 ultralytics.nn.modules.conv.Conv [16, 32, 3, 2]
2 -1 1 7360 ultralytics.nn.modules.block.C2f [32, 32, 1, True]
3 -1 1 18560 ultralytics.nn.modules.conv.Conv [32, 64, 3, 2]
4 -1 2 49664 ultralytics.nn.modules.block.C2f [64, 64, 2, True]
5 -1 1 73984 ultralytics.nn.modules.conv.Conv [64, 128, 3, 2]
6 -1 2 197632 ultralytics.nn.modules.block.C2f [128, 128, 2, True]
7 -1 1 295424 ultralytics.nn.modules.conv.Conv [128, 256, 3, 2]
8 -1 1 460288 ultralytics.nn.modules.block.C2f [256, 256, 1, True]
9 -1 1 164608 ultralytics.nn.modules.block.SPPF [256, 256, 5]
10 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
11 [-1, 6] 1 0 ultralytics.nn.modules.conv.Concat [1]
12 -1 1 148224 ultralytics.nn.modules.block.C2f [384, 128, 1]
13 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
14 [-1, 4] 1 0 ultralytics.nn.modules.conv.Concat [1]
15 -1 1 37248 ultralytics.nn.modules.block.C2f [192, 64, 1]
16 -1 1 36992 ultralytics.nn.modules.conv.Conv [64, 64, 3, 2]
17 [-1, 12] 1 0 ultralytics.nn.modules.conv.Concat [1]
18 -1 1 123648 ultralytics.nn.modules.block.C2f [192, 128, 1]
19 -1 1 147712 ultralytics.nn.modules.conv.Conv [128, 128, 3, 2]
20 [-1, 9] 1 0 ultralytics.nn.modules.conv.Concat [1]
21 -1 1 493056 ultralytics.nn.modules.block.C2f [384, 256, 1]
22 [15, 18, 21] 1 897664 ultralytics.nn.modules.head.Detect [80, [64, 128, 256]]
Model summary: 225 layers, 3157200 parameters, 3157184 gradients
Transferred 355/355 items from pretrained weights
AMP: running Automatic Mixed Precision (AMP) checks with YOLOv8n...
AMP: checks passed
optimizer: SGD(lr=0.01) with parameter groups 57 weight(decay=0.0), 64 weight(decay=0.0005), 63 bias
train: Scanning G:\ultralytics\datasets\coco128\labels\train2017.cache... 126 images, 2 backgrounds, 0 corrupt: 100%|██
val: Scanning G:\ultralytics\datasets\coco128\labels\train2017.cache... 126 images, 2 backgrounds, 0 corrupt: 100%|████
Plotting labels to runs\detect\train\labels.jpg...
Image sizes 640 train, 640 val
Using 8 dataloader workers
Logging results to runs\detect\train
Starting training for 3 epochs...
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
1/3 2.51G 1.182 1.409 1.214 218 640: 100%|██████████| 8/8 [00:04<00:00, 1.89
Class Images Instances Box(P R mAP50 mAP50-95): 100%|██████████| 4/4 [00:02<0
all 128 929 0.635 0.564 0.626 0.463
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
2/3 2.5G 1.137 1.319 1.239 205 640: 100%|██████████| 8/8 [00:01<00:00, 7.10
Class Images Instances Box(P R mAP50 mAP50-95): 100%|██████████| 4/4 [00:01<0
all 128 929 0.67 0.548 0.635 0.472
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
3/3 2.49G 1.172 1.293 1.219 161 640: 100%|██████████| 8/8 [00:01<00:00, 7.51
Class Images Instances Box(P R mAP50 mAP50-95): 100%|██████████| 4/4 [00:02<0
all 128 929 0.663 0.59 0.663 0.496
3 epochs completed in 0.006 hours.
Optimizer stripped from runs\detect\train\weights\last.pt, 6.5MB
Optimizer stripped from runs\detect\train\weights\best.pt, 6.5MB
Validating runs\detect\train\weights\best.pt...
Ultralytics YOLOv8.0.105 Python-3.10.11 torch-2.0.0+cu118 CUDA:0 (NVIDIA GeForce RTX 3090, 24575MiB)
Model summary (fused): 168 layers, 3151904 parameters, 0 gradients
Class Images Instances Box(P R mAP50 mAP50-95): 100%|██████████| 4/4 [00:02<0
all 128 929 0.664 0.589 0.663 0.496
person 128 254 0.814 0.685 0.767 0.552
bicycle 128 6 0.658 0.325 0.327 0.25
car 128 46 0.782 0.217 0.329 0.194
motorcycle 128 5 0.685 0.879 0.938 0.747
airplane 128 6 0.774 0.833 0.903 0.673
...
toothbrush 128 5 1 0.574 0.786 0.485
Speed: 1.0ms preprocess, 2.9ms inference, 0.0ms loss, 1.8ms postprocess per image
Results saved to runs\detect\train
The training results include the weights and other data.
Training result for a single image:
Custom dataset
See docs\yolov5\tutorials\train_custom_data.md (also available in the online docs); it applies to v8 as well.
mkdir -p data/{images,labels}/{train,val}
mkdir -p data/test_images
- images: all images used for training go here; train holds the training data and val holds the validation data
- labels: the annotation files go here
- test_images: test data
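The `mkdir -p data/{images,labels}/{train,val}` command relies on shell brace expansion, which is not available in every shell (the Windows command prompt, for instance). A portable sketch in Python that builds the same layout is shown below; `create_dataset_layout` is a hypothetical helper name, and the demo runs in a temporary directory rather than in the real project.

```python
import tempfile
from pathlib import Path


def create_dataset_layout(root: Path) -> list[Path]:
    """Create images/labels x train/val plus test_images under `root`."""
    created = []
    for kind in ("images", "labels"):
        for split in ("train", "val"):
            created.append(root / kind / split)
    created.append(root / "test_images")
    for directory in created:
        directory.mkdir(parents=True, exist_ok=True)  # same effect as mkdir -p
    return created


# Demo in a throwaway directory; use Path("data") for the real project.
with tempfile.TemporaryDirectory() as tmp:
    dirs = create_dataset_layout(Path(tmp) / "data")
    layout = sorted(d.relative_to(tmp).as_posix() for d in dirs)
    all_exist = all(d.is_dir() for d in dirs)
    print(layout)
```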
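Create dog.yaml. nc is the total number of object classes; it is checked against names, so the two must agree. names lists the class names; here I use husky (hashiqi) and labrador (labuladuo).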
train: G:\ultralytics\data\images\train
val: G:\ultralytics\data\images\val
# number of classes
nc: 2
# class names
names: ['hashiqi','labuladuo']
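Since nc has to agree with names, a quick sanity check before training can save a failed run. The sketch below works on a plain dict with the same keys as dog.yaml (YAML parsing is left out to keep it dependency-free); `check_dataset_config` is a hypothetical helper and does not reproduce the library's own validation.

```python
def check_dataset_config(cfg: dict) -> None:
    """Raise ValueError if nc and names disagree or names repeat."""
    names = cfg["names"]
    nc = cfg["nc"]
    if len(names) != nc:
        raise ValueError(f"nc={nc} but {len(names)} names were given: {names}")
    if len(set(names)) != len(names):
        raise ValueError(f"duplicate class names: {names}")


# Same content as dog.yaml above, written as a dict.
dog_cfg = {
    "train": r"G:\ultralytics\data\images\train",
    "val": r"G:\ultralytics\data\images\val",
    "nc": 2,
    "names": ["hashiqi", "labuladuo"],
}
check_dataset_config(dog_cfg)  # passes silently

# A deliberately wrong config: three classes declared, two names given.
try:
    check_dataset_config({**dog_cfg, "nc": 3})
    mismatch_detected = False
except ValueError as err:
    mismatch_detected = True
    print(err)
```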
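Copy ultralytics\yolo\cfg\default.yaml into the data directory, or generate it with yolo copy-cfg.
- task detect means object detection; mode train means training
- model yolov8n: the nano model of YOLO version 8 (small and fast)
- data data/dog.yaml: the dataset file defined above, which holds the training image path (with its annotations), the validation image path, and the class information
- epochs: with at least 1,000 training images, around 500 epochs is usually about right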
task: detect # YOLO task, i.e. detect, segment, classify, pose
mode: train # YOLO mode, i.e. train, val, predict, export, track, benchmark
# Train settings -------------------------------------------------------------------------------------------------------
model: yolov8n.pt # path to model file, i.e. yolov8n.pt, yolov8n.yaml
data: data/dog.yaml # path to data file, i.e. coco128.yaml
epochs: 500 # number of epochs to train for
patience: 400 # epochs to wait for no observable improvement for early stopping of training
batch: 128 # number of images per batch (-1 for AutoBatch)
imgsz: 1024 # size of input images as integer or w,h
save_period: 100 # Save checkpoint every x epochs (disabled if < 1); saving once every 20% of the epochs is usually plenty
resume: True # resume training from last checkpoint
...
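Annotation
- One object per line
- Each line has the format class x_center y_center width height
- The files are *.txt (labelImg can save in YOLO format)
- Label path convention: replace /images/ in the image path with /labels/, and give the txt file the same name as the image, as shown in the figure below
The figure below explains the annotation parameters.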
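The two conventions above can be sketched in code: deriving the label file path from an image path, and converting a pixel-space box into a `class x_center y_center width height` line. In YOLO label files these four values are normalized to the 0-1 range by the image width and height. The function names and the file name dog1.jpg are hypothetical, chosen only for illustration.

```python
from pathlib import PurePosixPath


def label_path_for(image_path: str) -> str:
    """Map .../images/.../name.jpg to .../labels/.../name.txt."""
    path = PurePosixPath(image_path.replace("\\", "/"))
    parts = list(path.parts)
    # Swap the last "images" directory component for "labels".
    idx = len(parts) - 1 - parts[::-1].index("images")
    parts[idx] = "labels"
    return str(PurePosixPath(*parts).with_suffix(".txt"))


def to_yolo_line(class_id, box, image_size):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line."""
    x_min, y_min, x_max, y_max = box
    img_w, img_h = image_size
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


label_file = label_path_for("data/images/train/dog1.jpg")
line = to_yolo_line(0, (100, 200, 300, 400), (1000, 800))
print(label_file)  # data/labels/train/dog1.txt
print(line)        # 0 0.200000 0.375000 0.200000 0.250000
```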
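The labelImg annotation tool
https://github.com/heartexlabs/labelImg/releases , download the Windows build directly
Use labelImg to annotate the image data in YOLO format.
Training: I only have 24 images here, so I raised epochs to 50000 (normally you would start from at least 1,000 images, but annotating really is tedious). When training finishes you get a best-model file such as runs\detect\train7\weights\best.pt. The command: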
yolo cfg=data/default.yaml
Training log. The run did not reach 50000 epochs: early stopping (patience 40000) ended it at epoch 42200, and the log reports 10.187 hours in total.
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
42199/50000 8.08G 0.08059 0.08471 0.7974 163 1024: 100%|██████████| 1/1 [00:00<00:00, 3.21
Class Images Instances Box(P R mAP50 mAP50-95): 100%|██████████| 1/1 [00:00<0
all 8 16 0.917 0.718 0.849 0.463
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
42200/50000 8.14G 0.09538 0.1015 0.7966 215 1024: 100%|██████████| 1/1 [00:00<00:00, 3.15
Class Images Instances Box(P R mAP50 mAP50-95): 100%|██████████| 1/1 [00:00<0
all 8 16 0.917 0.718 0.849 0.463
Stopping training early as no improvement observed in last 40000 epochs. Best results observed at epoch 2200, best model saved as best.pt.
To update EarlyStopping(patience=40000) pass a new patience value, i.e. `patience=300` or use `patience=0` to disable EarlyStopping.
42200 epochs completed in 10.187 hours.
Optimizer stripped from runs\detect\train7\weights\last.pt, 6.3MB
Optimizer stripped from runs\detect\train7\weights\best.pt, 6.3MB
Validating runs\detect\train7\weights\best.pt...
Ultralytics YOLOv8.0.105 Python-3.10.11 torch-2.0.0+cu118 CUDA:0 (NVIDIA GeForce RTX 3090, 24575MiB)
Model summary (fused): 168 layers, 3006038 parameters, 0 gradients
Class Images Instances Box(P R mAP50 mAP50-95): 100%|██████████| 1/1 [00:00<0
all 8 16 1 0.875 0.956 0.791
hashiqi 8 13 1 0.751 0.916 0.719
labuladuo 8 3 1 0.998 0.995 0.864
Speed: 1.1ms preprocess, 15.7ms inference, 0.0ms loss, 2.0ms postprocess per image
Results saved to runs\detect\train7
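The log above shows the effect of patience: the best result was at epoch 2200, and training stopped at epoch 42200, exactly 40000 epochs later. A minimal sketch of that general idea follows. It is not the library's implementation; `PatienceStopper` and the toy fitness curve are assumptions made for illustration.

```python
class PatienceStopper:
    """Stop once `patience` epochs have passed without a new best fitness."""

    def __init__(self, patience):
        self.patience = patience
        self.best_fitness = float("-inf")
        self.best_epoch = 0

    def step(self, epoch, fitness):
        """Record this epoch's fitness; return True when training should stop."""
        if fitness > self.best_fitness:
            self.best_fitness = fitness
            self.best_epoch = epoch
        return epoch - self.best_epoch >= self.patience


# Toy fitness curve: improves up to epoch 3, then stays flat.
stopper = PatienceStopper(patience=5)
stopped_at = None
for epoch in range(20):
    fitness = min(epoch, 3) * 0.1
    if stopper.step(epoch, fitness):
        stopped_at = epoch
        break

print(stopper.best_epoch, stopped_at)  # 3 8
```

With only a few dozen images, a patience this large mostly adds training time: in the run above, no epoch after 2200 improved on the best result.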
Inference: use the best trained model to run predictions.
yolo detect predict model=runs\detect\train7\weights\best.pt source=data\test_images save=True
Inference output:
(yolo) G:\ultralytics>yolo detect predict model=runs\detect\train7\weights\best.pt source=data\test_images save=True
Ultralytics YOLOv8.0.105 Python-3.10.11 torch-2.0.0+cu118 CUDA:0 (NVIDIA GeForce RTX 3090, 24575MiB)
Model summary (fused): 168 layers, 3006038 parameters, 0 gradients
image 1/5 G:\ultralytics\data\test_images\11111.jpg: 1024x1024 2 hashiqis, 2 labuladuos, 10.4ms
image 2/5 G:\ultralytics\data\test_images\111121.jpg: 672x1024 2 hashiqis, 1 labuladuo, 167.6ms
image 3/5 G:\ultralytics\data\test_images\1111211.jpg: 640x1024 (no detections), 145.9ms
image 4/5 G:\ultralytics\data\test_images\11112211.jpg: 800x1024 1 hashiqi, 6 labuladuos, 153.9ms
image 5/5 G:\ultralytics\data\test_images\1d6e8adbb5d18f57.jpg: 704x1024 (no detections), 162.3ms
Speed: 8.1ms preprocess, 128.0ms inference, 2.7ms postprocess per image at shape (1, 3, 1024, 1024)
Results saved to runs\detect\predict6
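Analyzing training results
The figure below can be found in runs\detect\train7.
- recall: the higher the recall, the fewer missed detections, meaning fewer positives are predicted as negatives; in other words, more of the actual positives are found. In this run it levels off at around 15,000 steps
- mAP@[0.5:0.95]: the higher it is, the more accurate the predicted boxes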
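To make these metrics a little more tangible, the sketch below computes IoU for two boxes, precision and recall from raw counts, and lists the ten IoU thresholds (0.5 to 0.95 in steps of 0.05) over which mAP@[0.5:0.95] is averaged. It does not compute a full average precision, and the boxes and counts are made-up illustrative numbers.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# IoU thresholds used by mAP@[0.5:0.95]: 0.50, 0.55, ..., 0.95
IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]

overlap = iou((0, 0, 10, 10), (1, 0, 11, 10))      # 90 / 110, about 0.818
passed = sum(overlap >= t for t in IOU_THRESHOLDS)  # thresholds this box satisfies
precision, recall = precision_recall(tp=14, fp=0, fn=2)

print(round(overlap, 3), passed)  # 0.818 7
print(precision, recall)          # 1.0 0.875
```

A box that overlaps well but not tightly counts as a hit at the loose thresholds and a miss at the strict ones, which is why mAP@[0.5:0.95] rewards precise localization more than mAP50 does.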
Tips for best training results
https://docs.ultralytics.com/yolov5/tutorials/tips_for_best_training_results/
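Dataset:
- Images per class. At least 1500 images per class are recommended
- Instances per class. At least 10000 instances (labeled objects) per class are recommended
- Image variety. The images must be representative of the deployed environment. For real-world use cases we recommend images from different times of day, different seasons, different weather, different lighting, different angles, and different sources (scraped online, collected locally, different cameras), and so on.
- Label consistency. All instances of all classes in all images must be labeled. Partial labeling will not work.
- Label accuracy. Labels must closely enclose each object. There should be no space between an object and its bounding box. No object should be missing a label.
- Background images. Background images are images with no objects that are added to a dataset to reduce false positives (FP). We recommend about 0-10% background images to help reduce FPs (COCO has 1000 background images for reference, 1% of the total). Background images do not need labels.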
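Several of these recommendations are simple counts, so they can be checked before training. The sketch below takes label lines grouped by image and reports images per class, instances per class, and the fraction of background images. `dataset_stats` and the sample data are hypothetical; in practice the lines would be read from the *.txt files under the labels directory.

```python
from collections import Counter


def dataset_stats(labels_by_image):
    """Summarize YOLO label lines given as {image_name: [label lines]}."""
    images_per_class = Counter()
    instances_per_class = Counter()
    backgrounds = 0
    for lines in labels_by_image.values():
        class_ids = [int(line.split()[0]) for line in lines if line.strip()]
        if not class_ids:  # an image with no labels is a background image
            backgrounds += 1
            continue
        instances_per_class.update(class_ids)
        images_per_class.update(set(class_ids))  # count each image once per class
    total = len(labels_by_image)
    return {
        "images_per_class": dict(images_per_class),
        "instances_per_class": dict(instances_per_class),
        "background_fraction": backgrounds / total if total else 0.0,
    }


# Made-up sample: four images, one of them a background image.
sample = {
    "a.jpg": ["0 0.5 0.5 0.2 0.2", "0 0.3 0.3 0.1 0.1"],
    "b.jpg": ["1 0.5 0.5 0.4 0.4", "0 0.2 0.2 0.1 0.1"],
    "c.jpg": [],
    "d.jpg": ["1 0.6 0.6 0.2 0.2"],
}
stats = dataset_stats(sample)
print(stats)
```

Comparing these counts with the recommended 1500 images and 10000 instances per class shows quickly how far a small dataset, such as the 24 images used above, is from the suggested scale.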