前言

本文章实现xml数据格式转yolov5的txt格式，一方面共享读者直接使用，另一方面便于我能快速复制使用，以便yolov5模型快速训练。为此，我将写下本篇文章。主要内容包含yolov5训练数据格式介绍，xml代码解读及xml解读信息转为txt格式，并将所有内容附属整个源码。

提示：以下是本篇文章正文内容，下面案例可供参考

一、yolov5训练数据格式介绍

我以coco数据集转yolov5的txt格式说明。

1、txt的类别对应说明

coco数据集共有80个类别，yolov5将80个类别标签为0到79，其中0表示coco的第一个类别，即person，79表示coco对应的第80个类别，即toothbrush。

coco类别如下：

['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
        'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
        'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
        'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard',
        'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
        'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
        'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
        'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear',
        'hair drier', 'toothbrush']

2、txt的文件说明

一张图对应一个txt文件，其图片与txt名称相同，后缀不同。
假如一张图有7个目标需要检测，对应的txt文件有7行数据记录目标相应的信息。

3、txt文件格式

txt文件每行表示顺序为目标类别、目标中心点x、目标中心点y、目标宽、目标高，其中目标中心点与宽高需要归一化，也就是对应数据除以对应宽高。
假如一张图有2个目标，假设图像宽=1920，高=1080;
格式：类别,box中心点x坐标,中心点y坐标,box宽w,box高h;
目标信息如下：

bicycle,60，80,20,40;
motorcycle,60，100,80,240;

以coco类别对应转txt为:

1,60/1920，80/1080,20/1920,40/1080;
3,60/1920，100/1080,80/1920,240/1080;

3、yolov5训练文件形式

一般是images和labels，当然你也可以自己取名字，但必须保证images与labels里面的文件名相同。
在这里插入图片描述
yolov5调用形式：

path: ../datasets/coco128  # dataset root dir
train: images/train  # train images (relative to 'path') 128 images
val: images/val  # val images (relative to 'path') 128 images
test:  # test images (optional)

以一张图转txt文件格式展示，如下：
在这里插入图片描述

二、xml文件读取代码解读

竟然是xml文件转yolov5的txt文件，最重要是xml文件信息读取，以下直接给代码读取xml信息。

代码如下：


def get_strfile(file_str, pos=-1):
    # 得到file_str / or \\ 的最后一个名称
    endstr_f_filestr = file_str.split('\\')[pos] if '\\' in file_str else file_str.split('/')[pos]
    return endstr_f_filestr


def read_xml(xml_root):
    '''
    :param xml_root: .xml文件
    :return: dict('cat':['cat1',...],'bboxes':[[x1,y1,x2,y2],...],'whd':[w ,h,d],'xml_name':xml_name)
    '''
    xml_name = get_strfile(xml_root)
    dict_info = {'cat': [], 'bboxes': [], 'box_wh': []}
    if os.path.splitext(xml_root)[-1] == '.xml':
        tree = ET.parse(xml_root)  # ET是一个xml文件解析库，ET.parse（）打开xml文件。parse--"解析"
        root = tree.getroot()  # 获取根节点
        whd = root.find('size')
        whd = [float(whd.find('width').text), float(whd.find('height').text), float(whd.find('depth').text)]
        dict_info['xml_name'] = xml_name
        dict_info['whd'] = whd

        for obj in root.findall('object'):  # 找到根节点下所有“object”节点
            cat = str(obj.find('name').text)  # 找到object节点下name子节点的值（字符串）
            bbox = obj.find('bndbox')
            x1, y1, x2, y2 = [float(bbox.find('xmin').text),
                              float(bbox.find('ymin').text),
                              float(bbox.find('xmax').text),
                              float(bbox.find('ymax').text)]
            b_w = x2 - x1 + 1
            b_h = y2 - y1 + 1
            dict_info['cat'].append(cat)
            dict_info['bboxes'].append([x1, y1, x2, y2])
            dict_info['box_wh'].append([b_w, b_h])

    else:
        print('[inexistence]:{} suffix is not xml '.format(xml_root))
    return dict_info

get_strfile函数是将xml_root绝对路径转换，获得xml名称函数。
read_xml(xml_root)函数是直接输入单个xml的绝对路径，输出为dict_info。

dict_info = {'cat': [], 'bboxes': [], 'box_wh': []}

cat与bboxes为列表，一一对应目标，cat为类别名称(一般为英文命名)；bboxes为对应box宽，分别表示左上点坐标x1,y1与右下点坐标x2,y2；box_wh记录该xml的size，为图像宽、高、通道。

三、xml文件转txt文件

1、xml转txt代码解读

在二中获得xml中的信息，这里将xml信息转yolov5的txt信息，其代码如下：

def xml_info2yolotxt(xml_info, classes_lst):
    cat_lst, bboxes_lst, whd, xml_name \
        = xml_info['cat'], xml_info['bboxes'], xml_info['whd'], xml_info['xml_name']
    txt_name = xml_name[:-3] + 'txt'
    txt_info = {'info': [], 'txt_name': txt_name}
    for i, cat in enumerate(cat_lst):
        if cat in classes_lst:  # 判断在classes_lst中才会保存该条信息
            cat_id = list(classes_lst).index(str(cat))
            x1, y1, x2, y2 = bboxes_lst[i]
            x, y, w, h = (x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1
            dw = 1.0 / whd[0]
            dh = 1.0 / whd[1]
            x = x * dw
            w = w * dw
            y = y * dh
            h = h * dh
            txt_info['info'].append([cat_id, x, y, w, h])
    return txt_info

该函数xml_info2yolotxt(xml_info, classes_lst)的xml_info信息来自二中得到的结果，classes_lst表示类别的列表，若为coco128表示一中coco类别。输出为txt_info信息，实际已转为txt文件格式信息。

2、保存txt文件代码解读

保存实际来源上面转txt的输出信息，保存到本地，实现txt文件保存，其中txt格式很重要out_file.write(str(cls_id) + " " + " ".join([str(b) for b in box]) + '\n')，详情代码如下：

def save_txt(txt_info, out_dir):
    txt_name = txt_info['txt_name']
    info = txt_info['info']
    out_file = open(os.path.join(out_dir, txt_name), 'w')  # 转换后的txt文件存放路径
    for a in info:
        cls_id = a[0]
        box = a[1:]
        out_file.write(str(cls_id) + " " + " ".join([str(b) for b in box]) + '\n')

四、完整代码

以下代码读取一个xml转换为txt格式，并保存yolov5可直接使用的txt文件，完整代码如下：

import os
import xml.etree.ElementTree as ET

def get_strfile(file_str, pos=-1):
    # 得到file_str / or \\ 的最后一个名称
    endstr_f_filestr = file_str.split('\\')[pos] if '\\' in file_str else file_str.split('/')[pos]
    return endstr_f_filestr

def read_xml(xml_root):
    '''
    :param xml_root: .xml文件
    :return: dict('cat':['cat1',...],'bboxes':[[x1,y1,x2,y2],...],'whd':[w ,h,d],'xml_name':xml_name)
    '''
    xml_name = get_strfile(xml_root)
    dict_info = {'cat': [], 'bboxes': [], 'box_wh': []}
    if os.path.splitext(xml_root)[-1] == '.xml':
        tree = ET.parse(xml_root)  # ET是一个xml文件解析库，ET.parse（）打开xml文件。parse--"解析"
        root = tree.getroot()  # 获取根节点
        whd = root.find('size')
        whd = [float(whd.find('width').text), float(whd.find('height').text), float(whd.find('depth').text)]
        dict_info['xml_name'] = xml_name
        dict_info['whd'] = whd
        for obj in root.findall('object'):  # 找到根节点下所有“object”节点
            cat = str(obj.find('name').text)  # 找到object节点下name子节点的值（字符串）
            bbox = obj.find('bndbox')
            x1, y1, x2, y2 = [float(bbox.find('xmin').text),
                              float(bbox.find('ymin').text),
                              float(bbox.find('xmax').text),
                              float(bbox.find('ymax').text)]
            b_w = x2 - x1 + 1
            b_h = y2 - y1 + 1
            dict_info['cat'].append(cat)
            dict_info['bboxes'].append([x1, y1, x2, y2])
            dict_info['box_wh'].append([b_w, b_h])

    else:
        print('[inexistence]:{} suffix is not xml '.format(xml_root))
    return dict_info

def save_txt(txt_info, out_dir):
    txt_name = txt_info['txt_name']
    info = txt_info['info']
    out_file = open(os.path.join(out_dir, txt_name), 'w')  # 转换后的txt文件存放路径
    for a in info:
        cls_id = a[0]
        box = a[1:]
        out_file.write(str(cls_id) + " " + " ".join([str(b) for b in box]) + '\n')

def xml_info2yolotxt(xml_info, classes_lst):
    cat_lst, bboxes_lst, whd, xml_name \
        = xml_info['cat'], xml_info['bboxes'], xml_info['whd'], xml_info['xml_name']
    txt_name = xml_name[:-3] + 'txt'
    txt_info = {'info': [], 'txt_name': txt_name}
    for i, cat in enumerate(cat_lst):
        if cat in classes_lst:  # 判断在classes_lst中才会保存该条信息
            cat_id = list(classes_lst).index(str(cat))
            x1, y1, x2, y2 = bboxes_lst[i]
            x, y, w, h = (x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1
            dw = 1.0 / whd[0]
            dh = 1.0 / whd[1]
            x = x * dw
            w = w * dw
            y = y * dh
            h = h * dh
            txt_info['info'].append([cat_id, x, y, w, h])
    return txt_info

def read_txt(file_path):
    with open(file_path, 'r') as f:
        content = f.read().splitlines()
    return content

if __name__ == '__main__':

    xml_root = r'E:\project\data\hncy_data\img_xml_ori\hw_data\images\train\19841.xml'
    classes_lst=['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
        'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
        'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
        'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard',
        'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
        'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
        'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
        'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear',
        'hair drier', 'toothbrush']  # class names
    xml_info = read_xml(xml_root)
    txt_info = xml_info2yolotxt(xml_info, classes_lst)
    save_txt(txt_info, '')