A Summary of Common Object Detection Datasets

Here is a summary of the three common object detection datasets I have gone through:

VOC dataset (annotations are in XML format)

A. Classes in the dataset:

There are 20 classes in total: person, bird, cat, cow, dog, horse, sheep, aeroplane, bicycle, boat, bus, car, motorbike, train, bottle, chair, dining table, potted plant, sofa, tv/monitor.
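
Training code usually needs these class names mapped to integer labels. Below is a minimal sketch, assuming alphabetical order and the compact spellings used in the `<name>` tags of the annotation files (`diningtable`, `pottedplant`, `tvmonitor`). The names `VOC_CLASSES` and `CLASS_TO_INDEX` are my own, chosen for illustration.

```python
# The 20 PASCAL VOC object classes, spelled as they appear in <name> tags.
VOC_CLASSES = [
    "aeroplane", "bicycle", "bird", "boat", "bottle",
    "bus", "car", "cat", "chair", "cow",
    "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]

# Hypothetical lookup table: class name -> 0-based integer label.
CLASS_TO_INDEX = {name: idx for idx, name in enumerate(VOC_CLASSES)}

print(len(VOC_CLASSES))        # 20
print(CLASS_TO_INDEX["dog"])   # 11
```

Many detectors reserve label 0 for the background, in which case every index here would be shifted by one.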

B. Differences between VOC2007 and VOC2012:

(The image came from a blog that I can no longer recall. If any reader knows the source, please let me know and I will add the link.)

VOC2007 contains 9,963 annotated images, split into train/val/test, with 24,640 annotated objects in total.

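For the detection task, the VOC2012 trainval/test sets contain all the corresponding images from 2008 to 2011. The trainval set has 11,540 images with 27,450 objects in total.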

C. Dataset layout:

	.
	├── Annotations          [XML files; each one describes one image in JPEGImages]
	├── ImageSets            [contains four subfolders: Action, Layout, Main, Segmentation]
	│   ├── Action           [human actions, e.g. running, jumping]
	│   ├── Layout           [data with human body parts]
	│   ├── Main             [image object recognition data, 20 classes in total]
	│   └── Segmentation     [data usable for segmentation]
	├── JPEGImages           [all images provided by PASCAL VOC, training and test alike; these are the original, unannotated images used for training, testing, and validation]
	├── SegmentationClass    [images segmented by class; not needed for object detection]
	└── SegmentationObject   [images segmented by object; not needed for object detection]
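
Given this layout, each image ID maps to one file in `JPEGImages` and one in `Annotations`. Here is a rough sketch of that pairing, assuming that a split file such as `ImageSets/Main/trainval.txt` lists one image ID per line. The helper names `voc_paths` and `read_image_ids` and the root path `VOCdevkit/VOC2007` are my own placeholders.

```python
from pathlib import Path


def voc_paths(root, image_id):
    """Return (image_path, annotation_path) for one image ID under a VOC root."""
    root = Path(root)
    return (
        root / "JPEGImages" / f"{image_id}.jpg",
        root / "Annotations" / f"{image_id}.xml",
    )


def read_image_ids(root, split="trainval"):
    """Read image IDs from ImageSets/Main/<split>.txt (assumed: one ID per line)."""
    split_file = Path(root) / "ImageSets" / "Main" / f"{split}.txt"
    # Keep only the first token of each non-empty line, so an extra
    # label column (if present) is ignored.
    return [
        line.split()[0]
        for line in split_file.read_text().splitlines()
        if line.strip()
    ]


# Build the two paths for one ID; nothing is read from disk here.
image_path, annotation_path = voc_paths("VOCdevkit/VOC2007", "000001")
print(image_path)
print(annotation_path)
```

Using the shared ID as the key keeps images and annotations in step, even when a split covers only part of the folder.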

D. Annotations are organized as XML files, as follows:

	<annotation>
		<folder>VOC2007</folder>
		<filename>000001.jpg</filename>  <!-- file name -->
		<source>
			<database>The VOC2007 Database</database>
			<annotation>PASCAL VOC2007</annotation>
			<image>flickr</image>
			<flickrid>341012865</flickrid>
		</source>
		<owner>
			<flickrid>Fried Camels</flickrid>
			<name>Jinky the Fruit Bat</name>
		</owner>
		<size>  <!-- image size; used to normalize the bbox top-left and bottom-right coordinates -->
			<width>353</width>
			<height>500</height>
			<depth>3</depth>
		</size>
		<segmented>0</segmented>  <!-- whether the image is used for segmentation -->
		<object>
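			<name>dog</name>  <!-- object class -->
			<pose>Left</pose>  <!-- viewpoint: front, rear, left, right, unspecified -->
			<truncated>1</truncated>  <!-- whether the object is truncated (e.g. extends beyond the image) or occluded (by more than 15%) -->
			<difficult>0</difficult>  <!-- detection difficulty, judged mainly by object size, lighting changes, and image quality -->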
			<bndbox>
				<xmin>48</xmin>
				<ymin>240</ymin>
				<xmax>195</xmax>
				<ymax>371</ymax>
			</bndbox>
		</object>
		<object>
			<name>person</name>
			<pose>Left</pose>
			<truncated>1</truncated>
			<difficult>0</difficult>
			<bndbox>
				<xmin>8</xmin>
				<ymin>12</ymin>
				<xmax>352</xmax>
				<ymax>498</ymax>
			</bndbox>
		</object>
	</annotation>
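
To make the fields above concrete, here is a small sketch that reads such an annotation with Python's standard `xml.etree.ElementTree` and normalizes each box by the image width and height. The sample string is a trimmed copy of the file above. The function name `parse_voc_annotation` and the output dictionary keys are my own choices, not part of any official toolkit.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the annotation shown above (only the fields used below).
SAMPLE_XML = """
<annotation>
    <filename>000001.jpg</filename>
    <size><width>353</width><height>500</height><depth>3</depth></size>
    <object>
        <name>dog</name>
        <difficult>0</difficult>
        <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
    </object>
    <object>
        <name>person</name>
        <difficult>0</difficult>
        <bndbox><xmin>8</xmin><ymin>12</ymin><xmax>352</xmax><ymax>498</ymax></bndbox>
    </object>
</annotation>
"""


def parse_voc_annotation(xml_text):
    """Return (filename, width, height, objects) from a VOC-style XML string."""
    root = ET.fromstring(xml_text)
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    objects = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        objects.append({
            "name": obj.findtext("name"),
            "difficult": int(obj.findtext("difficult")),
            "bbox": (xmin, ymin, xmax, ymax),
            # Normalize to [0, 1]: x values by width, y values by height.
            "bbox_norm": (xmin / width, ymin / height, xmax / width, ymax / height),
        })
    return root.findtext("filename"), width, height, objects


filename, width, height, objects = parse_voc_annotation(SAMPLE_XML)
for obj in objects:
    print(obj["name"], obj["bbox"], [round(v, 3) for v in obj["bbox_norm"]])
```

Whether to keep or skip objects marked `difficult` is left to the caller; the sketch only exposes the flag.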

E. A partial look at each folder

(1) JPEGImages:

(2) Annotations:

COCO dataset (annotations are in JSON format)

Image source link: click here

A. Total number of classes:

80 classes

B. File description:

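There are 3 annotation types, each stored as JSON files and each with a training set and a validation set:
object instances (i.e. object detection); object keypoints (keypoints on objects); image captions (describing an image in words)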
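
As an illustration of "three types, each with train and val", the sketch below builds the six annotation file names. It assumes the naming scheme of the 2017 release (`<prefix>_<split><year>.json`, with the keypoint files prefixed `person_keypoints`), so please check it against your own download. The names `ANNOTATION_PREFIX` and `annotation_filename` are made up for this example.

```python
# Assumed naming scheme of the 2017 release: <prefix>_<split><year>.json
ANNOTATION_PREFIX = {
    "instances": "instances",         # object instances (detection)
    "keypoints": "person_keypoints",  # object keypoints
    "captions": "captions",           # image captions
}


def annotation_filename(kind, split, year="2017"):
    """Build an annotation file name; kind, split and year are illustrative parameters."""
    if split not in ("train", "val"):
        raise ValueError(f"unknown split: {split!r}")
    return f"{ANNOTATION_PREFIX[kind]}_{split}{year}.json"


for kind in ANNOTATION_PREFIX:
    print([annotation_filename(kind, split) for split in ("train", "val")])
```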

C. Data format:

	{
	    "info": info,
	    "licenses": [license],
	    "images": [image],
	    "annotations": [annotation],
	}
	    
	info{
	    "year": int,
	    "version": str,
	    "description": str,
	    "contributor": str,
	    "url": str,
	    "date_created": datetime,
	}
	license{
	    "id": int,
	    "name": str,
	    "url": str,
	} 
	image{
	    "id": int,
	    "width": int,
	    "height": int,
	    "file_name": str,
	    "license": int,
	    "flickr_url": str,
	    "coco_url": str,
	    "date_captured": datetime,
	}
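
The structure above is plain JSON, so the standard `json` module is enough to read it. The following is a minimal sketch on a tiny made-up document: every value is a placeholder, not real COCO data. It checks the four top-level keys and indexes images by `id`. The variable names `images_by_id` and `license_names` are my own.

```python
import json

# A tiny, made-up document that follows the structure above.
coco_text = json.dumps({
    "info": {"year": 2017, "version": "1.0", "description": "toy example",
             "contributor": "n/a", "url": "", "date_created": "2017-01-01"},
    "licenses": [{"id": 1, "name": "example license", "url": ""}],
    "images": [
        {"id": 1, "width": 640, "height": 480, "file_name": "a.jpg", "license": 1,
         "flickr_url": "", "coco_url": "", "date_captured": "2017-01-01 00:00:00"},
        {"id": 2, "width": 500, "height": 375, "file_name": "b.jpg", "license": 1,
         "flickr_url": "", "coco_url": "", "date_captured": "2017-01-02 00:00:00"},
    ],
    "annotations": [],
})

coco = json.loads(coco_text)

# Check that the four top-level sections are present.
missing = [key for key in ("info", "licenses", "images", "annotations") if key not in coco]
print("missing keys:", missing)

# Index images and licenses by id, so later lookups by id are cheap.
images_by_id = {img["id"]: img for img in coco["images"]}
license_names = {lic["id"]: lic["name"] for lic in coco["licenses"]}

for image_id, img in images_by_id.items():
    print(image_id, img["file_name"], (img["width"], img["height"]),
          license_names[img["license"]])
```

For a real annotation file, replace the `json.dumps(...)` / `json.loads(...)` pair with `json.load` on the opened file.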