Road Driving-View Pedestrian and Vehicle Detection Dataset: 16,000 Annotated Images (VOC, YOLO)

news2026/3/26 9:49:52

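With the rapid development of intelligent driving and driver-assistance systems, multi-object detection from the road driving view has become a key technology for driving safety. Training autonomous vehicles and driver-assistance systems requires large amounts of well-annotated data. This dataset provides annotated data for pedestrian and vehicle detection from the road driving view, to support the development and deployment of automated detection systems.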

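Dataset overview:

Name: Road Driving-View Pedestrian and Vehicle Detection Dataset
Size: 16,000 images in total
Classes: seven object classes. "0" is Motorcycle, "1" is Pedestrian, "2" is Car, "3" is Bicycle, "4" is Bus, "5" is Traffic Light, "6" is Truck
Annotation format: annotation files are provided in both VOC and YOLO formats and can be used directly for model training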
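
To make the class IDs and the YOLO format concrete, here is a minimal sketch that decodes one YOLO label line into a class name and a pixel-space box. It assumes the standard YOLO layout (`class_id x_center y_center width height`, all normalized to the image size); the function name `parse_yolo_line` and the sample line are hypothetical and not taken from the dataset.

```python
# Class IDs in the order given in the dataset overview.
CLASS_NAMES = ["Motorcycle", "Pedestrian", "Car", "Bicycle", "Bus", "Traffic Light", "Truck"]


def parse_yolo_line(line, img_width, img_height):
    """Convert one YOLO label line into (class_name, [xmin, ymin, xmax, ymax]) in pixels."""
    parts = line.split()
    class_id = int(parts[0])
    x_c, y_c, w, h = (float(v) for v in parts[1:5])
    xmin = (x_c - w / 2) * img_width
    ymin = (y_c - h / 2) * img_height
    xmax = (x_c + w / 2) * img_width
    ymax = (y_c + h / 2) * img_height
    return CLASS_NAMES[class_id], [xmin, ymin, xmax, ymax]


# Hypothetical label line for a 1280x720 image.
name, box = parse_yolo_line("2 0.5 0.5 0.25 0.5", img_width=1280, img_height=720)
print(name, box)  # Car [480.0, 180.0, 800.0, 540.0]
```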

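Dataset features:

Coverage: includes the object types most commonly seen from the road driving view, which keeps the dataset varied and practical.
High-quality annotation: every image is annotated in detail for accuracy and reliability.
Broad applicability: both VOC and YOLO annotations are provided, so researchers and developers can use the data directly.
Standard formats: the annotation files use widely adopted formats and can be imported into different detection frameworks.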

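Dataset contents:

Motorcycle: motorcycles travelling on the road.
Pedestrian: pedestrians walking on the road.
Car: cars travelling on the road.
Bicycle: bicycles being ridden on the road.
Bus: buses travelling on the road.
Traffic Light: traffic lights along the road.
Truck: trucks travelling on the road.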

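Dataset uses:

Object detection: training and evaluating deep learning models, in particular for pedestrian and vehicle detection from the road driving view.
Intelligent driving: supporting the perception stack of autonomous vehicles, with the aim of reducing traffic accidents.
Research and education: providing data for research and teaching on pedestrian and vehicle detection from the road driving view.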

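Usage scenarios:

Real-time monitoring: in intelligent transportation systems, models trained on this dataset can detect road objects in real time.
Accident prevention: in accident prevention and safety warning applications, models trained on this dataset can improve detection accuracy and speed.
System development: when developing intelligent driving and driver-assistance systems, the dataset can help improve system reliability and stability.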

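Technical specifications:

Data volume: 16,000 images in total, covering multiple object types.
Data split: whether the dataset is already divided into training, validation, and test sets has to be checked against the actual dataset contents.
Annotation format: VOC and YOLO annotation files are provided for import into different detection frameworks.
Annotation quality: all images are annotated in detail for accuracy and reliability.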
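
If the dataset turns out not to be split, a reproducible split can be produced from the image file stems. The sketch below is one way to do it, assuming an 80/10/10 ratio and a fixed seed; the ratios, the seed, the function name `split_stems`, and the file stems are all illustrative assumptions rather than properties of the dataset.

```python
import random


def split_stems(stems, val_ratio=0.1, test_ratio=0.1, seed=0):
    """Shuffle file stems deterministically and cut them into train/val/test lists."""
    stems = sorted(stems)                 # sort first so the result does not depend on input order
    random.Random(seed).shuffle(stems)    # seeded shuffle for reproducibility
    n_val = int(len(stems) * val_ratio)
    n_test = int(len(stems) * test_ratio)
    return {
        "val": stems[:n_val],
        "test": stems[n_val:n_val + n_test],
        "train": stems[n_val + n_test:],
    }


# Hypothetical file stems standing in for the 16,000 images.
splits = split_stems([f"img_{i:05d}" for i in range(16000)])
print({k: len(v) for k, v in splits.items()})
```

Splitting by stem keeps each image together with its `.xml` or `.txt` annotation file, since both share the same stem.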

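Notes:

Data privacy: when using the data, comply with the applicable laws and regulations and protect personal privacy.
Preprocessing: some preprocessing, such as image normalization, is recommended before use.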
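
As an illustration of the normalization mentioned above, the following sketch scales 8-bit RGB values to the range [0, 1] and optionally standardizes them per channel. It is written in plain Python for clarity; the function name `normalize_pixel` and the default mean and standard deviation are assumptions. Frameworks such as Ultralytics typically handle this scaling internally, so a step like this mainly matters in a custom pipeline.

```python
def normalize_pixel(rgb, mean=(0.0, 0.0, 0.0), std=(1.0, 1.0, 1.0)):
    """Scale an 8-bit RGB triple to [0, 1], then subtract mean and divide by std per channel."""
    return tuple((c / 255.0 - m) / s for c, m, s in zip(rgb, mean, std))


# With the default mean and std this is plain [0, 1] scaling.
scaled = normalize_pixel((0, 128, 255))
print(scaled)
```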

Access:

Download link: see the project homepage for the dataset download link.
License: read the dataset's license agreement carefully before use.

Key code examples:

The examples below cover data loading, model training, detection, and displaying results.

Data loading (VOC format):

import os
import cv2
import xml.etree.ElementTree as ET

# Dataset path
DATASET_PATH = 'path/to/dataset'

# Load every image and its VOC annotations from one split directory.
# Assumption: each split directory contains JPEGImages/ and Annotations/
# subfolders; adjust the paths to the actual dataset layout.
# Note: all images are kept in memory, which is only practical for small subsets.
def load_dataset(directory):
    images_dir = os.path.join(directory, 'JPEGImages')
    annotations_dir = os.path.join(directory, 'Annotations')
    images = []
    annotations = []

    for img_file in sorted(os.listdir(images_dir)):
        if not img_file.lower().endswith(('.jpg', '.png')):
            continue
        img_path = os.path.join(images_dir, img_file)
        annotation_path = os.path.join(annotations_dir, os.path.splitext(img_file)[0] + '.xml')

        image = cv2.imread(img_path)
        # Skip unreadable images and images without an annotation file
        if image is None or not os.path.exists(annotation_path):
            continue

        root = ET.parse(annotation_path).getroot()

        objects = []
        for obj in root.findall('object'):
            name = obj.find('name').text
            bbox = obj.find('bndbox')
            # Some tools write coordinates as floats, so parse via float first
            xmin = int(float(bbox.find('xmin').text))
            ymin = int(float(bbox.find('ymin').text))
            xmax = int(float(bbox.find('xmax').text))
            ymax = int(float(bbox.find('ymax').text))
            objects.append((name, [xmin, ymin, xmax, ymax]))

        images.append(image)
        annotations.append(objects)

    return images, annotations

train_images, train_annotations = load_dataset(os.path.join(DATASET_PATH, 'train'))
val_images, val_annotations = load_dataset(os.path.join(DATASET_PATH, 'val'))
test_images, test_annotations = load_dataset(os.path.join(DATASET_PATH, 'test'))
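
Once annotations are loaded, a quick sanity check is to count how many objects of each class they contain, since detection datasets are often imbalanced. The sketch below assumes the same structure that `load_dataset` returns (one list of `(name, [xmin, ymin, xmax, ymax])` tuples per image); the helper name `count_classes` and the sample data are hypothetical.

```python
from collections import Counter


def count_classes(annotations):
    """Count objects per class name across all images."""
    counts = Counter()
    for objects in annotations:
        for name, _box in objects:
            counts[name] += 1
    return counts


# Hypothetical annotations for three images (the last one has no objects).
sample_annotations = [
    [("Car", [10, 20, 110, 90]), ("Pedestrian", [200, 40, 230, 120])],
    [("Car", [5, 5, 50, 40])],
    [],
]
counts = count_classes(sample_annotations)
print(counts.most_common())
```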

Model training:

from ultralytics import YOLO

# Initialize the YOLOv8 model
model = YOLO('yolov8n.pt')

CLASS_NAMES = ['Motorcycle', 'Pedestrian', 'Car', 'Bicycle', 'Bus', 'Traffic Light', 'Truck']

# Convert VOC boxes to normalized YOLO boxes.
# Each box is normalized by the size of its own image rather than a fixed 640x640.
# The class names in the XML files must match class_names exactly.
def convert_voc_to_yolo(images, annotations, class_names=CLASS_NAMES):
    yolo_annotations = []
    class_map = {name: i for i, name in enumerate(class_names)}

    for image, ann in zip(images, annotations):
        img_h, img_w = image.shape[:2]
        converted = []
        for name, obj in ann:
            class_id = class_map[name]
            x_center = (obj[0] + obj[2]) / 2 / img_w
            y_center = (obj[1] + obj[3]) / 2 / img_h
            width = (obj[2] - obj[0]) / img_w
            height = (obj[3] - obj[1]) / img_h
            converted.append([class_id, x_center, y_center, width, height])
        yolo_annotations.append(converted)
    return yolo_annotations

# Training parameters
EPOCHS = 100
BATCH_SIZE = 16
IMG_SIZE = 640

# Convert the annotations (only needed when starting from the VOC files).
# Ultralytics reads labels from .txt files on disk, so these in-memory results
# have to be written out alongside the images before training.
train_yolo_annots = convert_voc_to_yolo(train_images, train_annotations)
val_yolo_annots = convert_voc_to_yolo(val_images, val_annotations)

results = model.train(data='road_driving_perspective_detection.yaml', epochs=EPOCHS, batch=BATCH_SIZE, imgsz=IMG_SIZE)
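
Because the trainer reads labels from disk, converted annotations need to be saved as one `.txt` file per image. The following is a minimal sketch of that step, assuming the common Ultralytics convention of a `labels/` directory that mirrors `images/` and shares file stems with it; the helper names `voc_box_to_yolo_line` and `write_label_file`, the file stem, and the temporary output directory are hypothetical.

```python
import os
import tempfile

CLASS_NAMES = ["Motorcycle", "Pedestrian", "Car", "Bicycle", "Bus", "Traffic Light", "Truck"]


def voc_box_to_yolo_line(name, box, img_width, img_height):
    """Format one VOC box as a YOLO label line normalized by the image size."""
    class_id = CLASS_NAMES.index(name)
    xmin, ymin, xmax, ymax = box
    x_c = (xmin + xmax) / 2 / img_width
    y_c = (ymin + ymax) / 2 / img_height
    w = (xmax - xmin) / img_width
    h = (ymax - ymin) / img_height
    return f"{class_id} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}"


def write_label_file(labels_dir, stem, objects, img_width, img_height):
    """Write all objects of one image to <labels_dir>/<stem>.txt and return the path."""
    os.makedirs(labels_dir, exist_ok=True)
    path = os.path.join(labels_dir, stem + ".txt")
    with open(path, "w", encoding="utf-8") as f:
        for name, box in objects:
            f.write(voc_box_to_yolo_line(name, box, img_width, img_height) + "\n")
    return path


# Hypothetical example: one car in a 1280x720 frame, written to a temporary directory.
labels_dir = os.path.join(tempfile.mkdtemp(), "labels")
label_path = write_label_file(labels_dir, "frame_0001", [("Car", [480, 180, 800, 540])], 1280, 720)
with open(label_path, encoding="utf-8") as f:
    label_text = f.read()
print(label_text)  # 2 0.500000 0.500000 0.250000 0.500000
```

If the dataset's own YOLO label files are used, this conversion step is unnecessary.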

Detection:

import cv2
from ultralytics import YOLO

# Load the trained model
model = YOLO('best.pt')

CLASS_NAMES = ['Motorcycle', 'Pedestrian', 'Car', 'Bicycle', 'Bus', 'Traffic Light', 'Truck']

# Detect objects in an image and draw the results on it
def detect_road_objects(image):
    results = model.predict(image)
    for result in results:
        for box in result.boxes:
            # box.xyxy, box.conf and box.cls are tensors; convert them to plain Python values
            x1, y1, x2, y2 = (int(v) for v in box.xyxy[0].tolist())
            conf = float(box.conf[0])
            class_id = int(box.cls[0])

            # Draw the bounding box and its label
            cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 2)
            class_name = CLASS_NAMES[class_id]
            cv2.putText(image, f'{class_name}, Conf: {conf:.2f}', (x1, y1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

    return image

# Test image
test_image = cv2.imread('path/to/test_image.jpg')
if test_image is None:
    raise FileNotFoundError('Could not read path/to/test_image.jpg')
result_image = detect_road_objects(test_image)
cv2.imshow('Detected Road Objects', result_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
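
To check a predicted box against a ground-truth box, the usual measure is intersection over union (IoU). The sketch below computes it for two boxes in `[xmin, ymin, xmax, ymax]` form, the same layout used by the VOC annotations and by `box.xyxy`; the function name `iou` and the sample boxes are illustrative.

```python
def iou(box_a, box_b):
    """Intersection over union of two [xmin, ymin, xmax, ymax] boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when the boxes are disjoint
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Two 100x100 boxes overlapping in a 50x50 region: 2500 / 17500.
overlap = iou([0, 0, 100, 100], [50, 50, 150, 150])
print(round(overlap, 4))  # 0.1429
```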

Configuration file `road_driving_perspective_detection.yaml`:

train: path/to/train/images
val: path/to/val/images
test: path/to/test/images

nc: 7  # Number of classes
names: ['Motorcycle', 'Pedestrian', 'Car', 'Bicycle', 'Bus', 'Traffic Light', 'Truck']  # Class names, in class-ID order

# Training parameters (batch size 16, 100 epochs, image size 640) are not dataset settings;
# pass them to model.train() as batch=16, epochs=100, imgsz=640.