Mushroom Classification and Detection Dataset: 21 Mushroom Classes, 8,800 Annotated Images, VOC and YOLO Formats

news2026/2/12 3:50:13

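Introduction to the Mushroom Classification and Detection Dataset

Dataset Name

Mushroom Classification and Detection Dataset

Dataset Overview

This dataset is designed for training and evaluating object detection models from the YOLO family (including YOLOv5, YOLOv6, and YOLOv7). It is intended to help researchers and developers build systems that identify multiple mushroom classes in images. Possible applications include ecological research, food safety monitoring, and outdoor exploration.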

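Dataset Specifications

Total images: about 8,800 (the annotation statistics list 8,858)
- Split: the exact train/validation/test ratio is not specified; a commonly recommended allocation is 70% training, 20% validation, and 10% testing.
Annotation formats:
- VOC format: one XML file per image, containing bounding-box coordinates and class names.
- YOLO format: one TXT file per image, containing class IDs and bounding-box coordinates.
Resolution: image resolution varies; for consistency, resizing all images to a uniform size such as 640x640 or 1280x1280 pixels is recommended.
Classes: 21 common mushroom types, including Clitocybe maxima, Lentinus edodes, and Agaricus bisporus, among others.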
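
The 70/20/10 allocation mentioned above could be produced with a short script. The following is a minimal sketch, not the dataset author's procedure; the function name `split_dataset`, the fixed seed, and the example file stems are assumptions made for illustration.

```python
import random


def split_dataset(stems, ratios=(0.7, 0.2, 0.1), seed=0):
    """Shuffle image stems and split them into train/val/test lists."""
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    shuffled = sorted(stems)  # sort first so the shuffle is reproducible
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * ratios[0])
    n_val = round(len(shuffled) * ratios[1])
    return {
        "train": shuffled[:n_train],
        "val": shuffled[n_train:n_train + n_val],
        "test": shuffled[n_train + n_val:],  # the remainder goes to test
    }


# Hypothetical stems; in practice, collect them from the image folder.
splits = split_dataset([f"{i:06d}" for i in range(1, 101)])
print({name: len(items) for name, items in splits.items()})
```

Splitting by file stem keeps each image together with its same-named label file. A random split like this does not balance classes, which may matter for the rarer classes in this dataset.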

Dataset Structure

mushroom_classification_dataset/
├── images/
│   ├── train/
│   ├── val/
│   └── test/
├── labels/
│   ├── train/
│   ├── val/
│   └── test/
└── data.yaml

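The images/ directory holds the original image files.
The labels/ directory holds the annotation files that correspond to the images. For each image there is a .txt file of the same name with its YOLO-format annotations and an .xml file of the same name with its VOC-format annotations.
The data.yaml file contains the basic dataset information, such as the data paths, the number of classes, and the class names.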
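
Because each image is expected to have a same-named .txt file, a quick consistency check can catch missing labels before training. The sketch below assumes the directory layout shown above; the helper name `find_unlabeled` and the set of image extensions are hypothetical.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed image extensions


def find_unlabeled(root, split):
    """Return stems of images in images/<split> that lack a .txt file in labels/<split>."""
    image_dir = Path(root) / "images" / split
    label_dir = Path(root) / "labels" / split
    image_stems = {
        p.stem for p in image_dir.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
    }
    label_stems = {p.stem for p in label_dir.glob("*.txt")}
    return sorted(image_stems - label_stems)


if __name__ == "__main__":
    root = Path("mushroom_classification_dataset")
    if root.is_dir():  # only run when the dataset is actually present
        for split in ("train", "val", "test"):
            print(split, find_unlabeled(root, split))
```

An image without a label file is not necessarily an error, since YOLO-style training usually treats such images as background, but an unexpected number of them is worth investigating.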

Dataset Configuration File (`data.yaml`)

# Path to the training images
train: path_to_your_train_images
# Path to the validation images
val: path_to_your_val_images
# Path to the test images (if any)
test: path_to_your_test_images

# Number of classes
nc: 21
# Class names
names: [
    'Clitocybe maxima',
    'Lentinus edodes',
    'Agaricus bisporus',
    'Pleurotus eryngii',
    'Coprinus comatus',
    'Cantharellus cibarius',
    'Boletus',
    'Dictyophora indusiata',
    'Pleurotus citrinopileatus',
    'Hypsizygus marmoreus',
    'Pleurotus cystidiosus',
    'Flammulina velutiper',
    'Agrocybe aegerita',
    'Auricularia auricula',
    'Armillaria mellea',
    'Agaricus blazei Murill',
    'Pleurotus ostreatus',
    'Morchella esculenta',
    'Hericium erinaceus',
    'Cordyceps militaris',
    'Collybia albuminosa'
]
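
The position of each name in the list determines its class ID in the YOLO label files, so it helps to check that `nc` and `names` agree. Below is a minimal sketch that hard-codes the list rather than parsing the YAML file; the helper `build_class_index` is hypothetical, and the spelling 'Coprinus comatus' follows the annotation statistics.

```python
NAMES = [
    "Clitocybe maxima", "Lentinus edodes", "Agaricus bisporus",
    "Pleurotus eryngii", "Coprinus comatus", "Cantharellus cibarius",
    "Boletus", "Dictyophora indusiata", "Pleurotus citrinopileatus",
    "Hypsizygus marmoreus", "Pleurotus cystidiosus", "Flammulina velutiper",
    "Agrocybe aegerita", "Auricularia auricula", "Armillaria mellea",
    "Agaricus blazei Murill", "Pleurotus ostreatus", "Morchella esculenta",
    "Hericium erinaceus", "Cordyceps militaris", "Collybia albuminosa",
]


def build_class_index(names, nc):
    """Map each class name to its zero-based ID after basic sanity checks."""
    if len(names) != nc:
        raise ValueError(f"nc is {nc} but {len(names)} names were given")
    if len(set(names)) != len(names):
        raise ValueError("class names must be unique")
    return {name: class_id for class_id, name in enumerate(names)}


class_index = build_class_index(NAMES, nc=21)
print(class_index["Lentinus edodes"])
```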

Annotation Statistics

Clitocybe maxima: 606 images, 1,049 instances
Lentinus edodes: 479 images, 2,690 instances
Agaricus bisporus: 161 images, 521 instances
Pleurotus eryngii: 423 images, 704 instances
Coprinus comatus: 519 images, 1,599 instances
Cantharellus cibarius: 648 images, 1,317 instances
Boletus: 639 images, 1,353 instances
Dictyophora indusiata: 535 images, 1,275 instances
Pleurotus citrinopileatus: 441 images, 531 instances
Hypsizygus marmoreus: 393 images, 583 instances
Pleurotus cystidiosus: 429 images, 711 instances
Flammulina velutiper: 423 images, 550 instances
Agrocybe aegerita: 179 images, 197 instances
Auricularia auricula: 242 images, 408 instances
Armillaria mellea: 200 images, 290 instances
Agaricus blazei Murill: 137 images, 307 instances
Pleurotus ostreatus: 433 images, 549 instances
Morchella esculenta: 433 images, 1,107 instances
Hericium erinaceus: 454 images, 1,299 instances
Cordyceps militaris: 600 images, 1,137 instances
Collybia albuminosa: 493 images, 2,074 instances
Total: 8,858 images, 20,251 instances
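
Per-class image and instance counts like those above can be recomputed from the YOLO label files. The sketch below works on label file contents passed in as strings; the function name `count_classes` and the sample label contents are made up for illustration. An image containing several classes is counted once for each of them, so per-class image counts need not add up to the total number of images.

```python
from collections import Counter


def count_classes(label_texts):
    """Count images and instances per class ID from YOLO label file contents."""
    images = Counter()
    instances = Counter()
    for text in label_texts:
        ids = [int(line.split()[0]) for line in text.splitlines() if line.strip()]
        instances.update(ids)     # every box counts as one instance
        images.update(set(ids))   # each image counts once per class it contains
    return images, instances


# Hypothetical contents of three label files.
sample = [
    "1 0.50 0.50 0.10 0.10\n1 0.20 0.20 0.10 0.10\n",
    "1 0.30 0.30 0.20 0.20\n0 0.70 0.70 0.10 0.10\n",
    "",
]
images, instances = count_classes(sample)
print(dict(images), dict(instances))
```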

Annotation Examples

YOLO Format

For an image containing a single Lentinus edodes, the corresponding .txt file might contain:

1 0.5678 0.3456 0.1234 0.2345

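Here 1 is the class ID for Lentinus edodes (its zero-based position in the names list). The four numbers that follow give the object's position relative to the image, in the order center x, center y, width w, and height h, all normalized to the range [0, 1].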
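
To see what the normalized values mean in pixels, they can be scaled by the image size. This is a minimal sketch; the helper name `yolo_to_pixels` and the 640x640 image size used in the example call are assumptions.

```python
def yolo_to_pixels(line, img_w, img_h):
    """Convert one YOLO label line to (class_id, (xmin, ymin, xmax, ymax)) in pixels."""
    cls_id, cx, cy, w, h = line.split()
    cx, w = float(cx) * img_w, float(w) * img_w   # scale x values by image width
    cy, h = float(cy) * img_h, float(h) * img_h   # scale y values by image height
    box = (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
    return int(cls_id), box


# The example line from above, on an assumed 640x640 image.
print(yolo_to_pixels("1 0.5678 0.3456 0.1234 0.2345", 640, 640))
```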

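VOC Format

For an image containing a single Lentinus edodes, the corresponding .xml file might look like this (the coordinates are illustrative and are not a conversion of the YOLO line above):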

<annotation>
    <folder>images</folder>
    <filename>000001.jpg</filename>
    <size>
        <width>640</width>
        <height>640</height>
        <depth>3</depth>
    </size>
    <object>
        <name>Lentinus edodes</name>
        <bndbox>
            <xmin>180</xmin>
            <ymin>200</ymin>
            <xmax>300</xmax>
            <ymax>400</ymax>
        </bndbox>
    </object>
</annotation>

Here the <name> tag gives the class name (Lentinus edodes), and the <bndbox> tag defines the bounding-box coordinates in pixels.
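
A VOC annotation like the one above can be converted into YOLO label lines by normalizing the box center and size by the image dimensions. The following sketch uses Python's standard `xml.etree.ElementTree`; the function name `voc_to_yolo` and the one-entry class map are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

SAMPLE_XML = """
<annotation>
    <size><width>640</width><height>640</height><depth>3</depth></size>
    <object>
        <name>Lentinus edodes</name>
        <bndbox><xmin>180</xmin><ymin>200</ymin><xmax>300</xmax><ymax>400</ymax></bndbox>
    </object>
</annotation>
"""


def voc_to_yolo(xml_text, class_ids):
    """Return one YOLO label line per <object> in a VOC annotation."""
    root = ET.fromstring(xml_text.strip())
    img_w = float(root.findtext("size/width"))
    img_h = float(root.findtext("size/height"))
    lines = []
    for obj in root.iter("object"):
        cls_id = class_ids[obj.findtext("name")]
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        cx = (xmin + xmax) / 2 / img_w   # normalized box center
        cy = (ymin + ymax) / 2 / img_h
        w = (xmax - xmin) / img_w        # normalized box size
        h = (ymax - ymin) / img_h
        lines.append(f"{cls_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}")
    return lines


print("\n".join(voc_to_yolo(SAMPLE_XML, {"Lentinus edodes": 1})))
# prints: 1 0.375000 0.468750 0.187500 0.312500
```

In practice the class map would cover all 21 names in the order they appear in data.yaml.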

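Usage Instructions

Prepare the environment:
- Make sure the libraries required by your chosen YOLO version are installed. For YOLOv5, for example, the dependencies can be installed from the repository root with: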
```
pip install -r requirements.txt
```
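Preprocess the data:
- Place the images and annotation files under the images/ and labels/ directories, respectively.
- Edit the paths in data.yaml to match the location of your dataset.
- If needed, use a script to convert VOC-format annotations to YOLO format, or the other way around.
Modify the configuration files:
- Check that data.yaml reflects the correct data paths.
- If you use YOLOv5 or another specific YOLO version, you may also need to update the corresponding model configuration file (such as models/yolov5s.yaml), for instance so that its class count matches the 21 classes.
Start training:
- Launch training with the training script provided by the framework. For YOLOv5, for example: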
```
python train.py --img 640 --batch 16 --epochs 100 --data data.yaml --weights yolov5s.pt
```
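Evaluate performance:
- After training, evaluate the model on the validation or test set and check whether metrics such as mAP reach the expected level. For YOLOv5, for example: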
```
python val.py --data data.yaml --weights runs/train/exp/weights/best.pt --img 640
```
Deploy the model:
- Apply the trained model to real scenarios to detect mushrooms automatically. For example, inference can be run with:
```
python detect.py --source path_to_your_test_images --weights runs/train/exp/weights/best.pt --conf 0.4
```