目标检测DOTA数据集

news2024/11/16 19:48:39

前言

之前对于xml格式的YOLO数据集,之前记录过如何用imgaug对其进行数据增强。不过DOTA数据集采用的是txt格式的旋转框标注,因此不能直接套用,只能另辟蹊径。

DOTA数据集简介
DOTA数据集全称:Dataset for Object deTection in Aerial images

DOTA数据集v1.0共收录2806张4000 × 4000的图片,总共包含188282个目标。
在这里插入图片描述

DOTA数据集论文介绍:https://arxiv.org/pdf/1711.10398.pdf

数据集官网:https://captain-whu.github.io/DOTA/dataset.html

DOTA数据集总共有3个版本

DOTAV1.0
类别数目:15
类别名称:plane, ship, storage tank, baseball diamond, tennis court, basketball court, ground track field, harbor, bridge, large vehicle, small vehicle, helicopter, roundabout, soccer ball field , swimming pool
DOTAV1.5
类别数目:16
类别名称:plane, ship, storage tank, baseball diamond, tennis court, basketball court, ground track field, harbor, bridge, large vehicle, small vehicle, helicopter, roundabout, soccer ball field, swimming pool , container crane
DOTAV2.0
类别数目:18
类别名称:plane, ship, storage tank, baseball diamond, tennis court, basketball court, ground track field, harbor, bridge, large vehicle, small vehicle, helicopter, roundabout, soccer ball field, swimming pool, container crane, airport , helipad
DOTAV2.0版本,备份在我的GitHub上。

https://github.com/zstar1003/Dataset

本实验演示选择的是DOTA数据集中的一张图片,原图如下:
在这里插入图片描述

标签如下:

每个对象有10个数值,前8个代表一个矩形框四个角的坐标,第9个表示对象类别,第10个表示识别难易程度,0表示简单,1表示困难。
在这里插入图片描述

使用JDet中的visualization.py将标签可视化,效果如下图所示:

在这里插入图片描述

注:由于可视化代码需要坐标点输入为整数,因此后续给输出对象坐标点进行了取整操作,这会损失一些精度。

数据增强及可视化

调整亮度
这里通过skimage.exposure.adjust_gamma来调整亮度:

调整亮度

def changeLight(img, inputtxt, outputiamge, outputtxt):
    # random.seed(int(time.time()))
    flag = random.uniform(0.5, 1.5)  # flag>1为调暗,小于1为调亮
    label = round(flag, 2)
    (filepath, tempfilename) = os.path.split(inputtxt)
    (filename, extension) = os.path.splitext(tempfilename)
    outputiamge = os.path.join(outputiamge + "/" + filename + "_" + str(label) + ".png")
    outputtxt = os.path.join(outputtxt + "/" + filename + "_" + str(label) + extension)

    ima_gamma = exposure.adjust_gamma(img, 0.5)

    shutil.copyfile(inputtxt, outputtxt)
    cv.imwrite(outputiamge, ima_gamma)

在这里插入图片描述

添加高斯噪声
添加高斯噪声需要先将像素点进行归一化(/255)

添加高斯噪声

def gasuss_noise(image, inputtxt, outputiamge, outputtxt, mean=0, var=0.01):
    '''
    mean : 均值
    var : 方差
    '''
    image = np.array(image / 255, dtype=float)
    noise = np.random.normal(mean, var ** 0.5, image.shape)
    out = image + noise

    if out.min() < 0:
        low_clip = -1.
    else:
        low_clip = 0.
    out = np.clip(out, low_clip, 1.0)
    out = np.uint8(out * 255)

    (filepath, tempfilename) = os.path.split(inputtxt)
    (filename, extension) = os.path.splitext(tempfilename)
    outputiamge = os.path.join(outputiamge + "/" + filename + "_gasunoise_" + str(mean) + "_" + str(var) + ".png")
    outputtxt = os.path.join(outputtxt + "/" + filename + "_gasunoise_" + str(mean) + "_" + str(var) + extension)

    shutil.copyfile(inputtxt, outputtxt)
    cv.imwrite(outputiamge, out)

在这里插入图片描述

调整对比度

调整对比度

def ContrastAlgorithm(rgb_img, inputtxt, outputiamge, outputtxt):
    img_shape = rgb_img.shape
    temp_imag = np.zeros(img_shape, dtype=float)
    for num in range(0, 3):
        # 通过直方图正规化增强对比度
        in_image = rgb_img[:, :, num]
        # 求输入图片像素最大值和最小值
        Imax = np.max(in_image)
        Imin = np.min(in_image)
        # 要输出的最小灰度级和最大灰度级
        Omin, Omax = 0, 255
        # 计算a 和 b的值
        a = float(Omax - Omin) / (Imax - Imin)
        b = Omin - a * Imin
        # 矩阵的线性变化
        out_image = a * in_image + b
        # 数据类型的转化
        out_image = out_image.astype(np.uint8)
        temp_imag[:, :, num] = out_image
    (filepath, tempfilename) = os.path.split(inputtxt)
    (filename, extension) = os.path.splitext(tempfilename)
    outputiamge = os.path.join(outputiamge + "/" + filename + "_contrastAlgorithm" + ".png")
    outputtxt = os.path.join(outputtxt + "/" + filename + "_contrastAlgorithm" + extension)
    shutil.copyfile(inputtxt, outputtxt)
    cv.imwrite(outputiamge, temp_imag)

在这里插入图片描述

旋转
旋转本质上是利用仿射矩阵,这里预留了接口参数,可以自定义旋转角度

旋转

def rotate_img_bbox(img, inputtxt, temp_outputiamge, temp_outputtxt, angle, scale=1):
    nAgree = angle
    size = img.shape
    w = size[1]
    h = size[0]
    for numAngle in range(0, len(nAgree)):
        dRot = nAgree[numAngle] * np.pi / 180
        dSinRot = math.sin(dRot)
        dCosRot = math.cos(dRot)

        nw = (abs(np.sin(dRot) * h) + abs(np.cos(dRot) * w)) * scale
        nh = (abs(np.cos(dRot) * h) + abs(np.sin(dRot) * w)) * scale

        (filepath, tempfilename) = os.path.split(inputtxt)
        (filename, extension) = os.path.splitext(tempfilename)
        outputiamge = os.path.join(temp_outputiamge + "/" + filename + "_rotate_" + str(nAgree[numAngle]) + ".png")
        outputtxt = os.path.join(temp_outputtxt + "/" + filename + "_rotate_" + str(nAgree[numAngle]) + extension)

        rot_mat = cv.getRotationMatrix2D((nw * 0.5, nh * 0.5), nAgree[numAngle], scale)
        rot_move = np.dot(rot_mat, np.array([(nw - w) * 0.5, (nh - h) * 0.5, 0]))
        rot_mat[0, 2] += rot_move[0]
        rot_mat[1, 2] += rot_move[1]
        # 仿射变换
        rotat_img = cv.warpAffine(img, rot_mat, (int(math.ceil(nw)), int(math.ceil(nh))), flags=cv.INTER_LANCZOS4)
        cv.imwrite(outputiamge, rotat_img)

        save_txt = open(outputtxt, 'w')
        f = open(inputtxt)
        for line in f.readlines():
            line = line.split(" ")
            x1 = float(line[0])
            y1 = float(line[1])
            x2 = float(line[2])
            y2 = float(line[3])
            x3 = float(line[4])
            y3 = float(line[5])
            x4 = float(line[6])
            y4 = float(line[7])
            category = str(line[8])
            dif = str(line[9])

            point1 = np.dot(rot_mat, np.array([x1, y1, 1]))
            point2 = np.dot(rot_mat, np.array([x2, y2, 1]))
            point3 = np.dot(rot_mat, np.array([x3, y3, 1]))
            point4 = np.dot(rot_mat, np.array([x4, y4, 1]))
            x1 = round(point1[0], 3)
            y1 = round(point1[1], 3)
            x2 = round(point2[0], 3)
            y2 = round(point2[1], 3)
            x3 = round(point3[0], 3)
            y3 = round(point3[1], 3)
            x4 = round(point4[0], 3)
            y4 = round(point4[1], 3)
            # string = str(x1) + " " + str(y1) + " " + str(x2) + " " + str(y2) + " " + str(x3) + " " + str(y3) + " " + str(x4) + " " + str(y4) + " " + category + " " + dif
            string = str(int(x1)) + " " + str(int(y1)) + " " + str(int(x2)) + " " + str(int(y2)) + " " + str(
                int(x3)) + " " + str(int(y3)) + " " + str(int(x4)) + " " + str(int(y4)) + " " + category + " " + dif
            save_txt.write(string)

在这里插入图片描述

翻转图像
翻转图像使用cv.flip进行

翻转图像

def filp_pic_bboxes(img, inputtxt, outputiamge, outputtxt):
    (filepath, tempfilename) = os.path.split(inputtxt)
    (filename, extension) = os.path.splitext(tempfilename)
    output_vert_flip_img = os.path.join(outputiamge + "/" + filename + "_vert_flip" + ".png")
    output_vert_flip_txt = os.path.join(outputtxt + "/" + filename + "_vert_flip" + extension)
    output_horiz_flip_img = os.path.join(outputiamge + "/" + filename + "_horiz_flip" + ".png")
    output_horiz_flip_txt = os.path.join(outputtxt + "/" + filename + "_horiz_flip" + extension)

    h, w, _ = img.shape
    # 垂直翻转
    vert_flip_img = cv.flip(img, 1)
    cv.imwrite(output_vert_flip_img, vert_flip_img)
    # 水平翻转
    horiz_flip_img = cv.flip(img, 0)
    cv.imwrite(output_horiz_flip_img, horiz_flip_img)
    # ---------------------- 调整boundingbox ----------------------

    save_vert_txt = open(output_vert_flip_txt, 'w')
    save_horiz_txt = open(output_horiz_flip_txt, 'w')
    f = open(inputtxt)
    for line in f.readlines():
        line = line.split(" ")
        x1 = float(line[0])
        y1 = float(line[1])
        x2 = float(line[2])
        y2 = float(line[3])
        x3 = float(line[4])
        y3 = float(line[5])
        x4 = float(line[6])
        y4 = float(line[7])
        category = str(line[8])
        dif = str(line[9])

        vert_string = str(int(round(w - x1, 3))) + " " + str(int(y1)) + " " + str(int(round(w - x2, 3))) + " " + str(
            int(y2)) + " " + str(int(round(w - x3, 3))) + " " + str(int(y3)) + " " + str(
            int(round(w - x4, 3))) + " " + str(int(y4)) + " " + category + " " + dif
        horiz_string = str(int(x1)) + " " + str(int(round(h - y1, 3))) + " " + str(int(x2)) + " " + str(
            int(round(h - y2, 3))) + " " + str(int(x3)) + " " + str(int(round(h - y3, 3))) + " " + str(
            int(x4)) + " " + str(int(round(h - y4, 3))) + " " + category + " " + dif

        save_horiz_txt.write(horiz_string)
        save_vert_txt.write(vert_string)

在这里插入图片描述

平移图像
平移图像本质也是利用仿射矩阵

平移图像

def shift_pic_bboxes(img, inputtxt, outputiamge, outputtxt):
    w = img.shape[1]
    h = img.shape[0]
    x_min = w  # 裁剪后的包含所有目标框的最小的框
    x_max = 0
    y_min = h
    y_max = 0
    f = open(inputtxt)
    for line in f.readlines():
        line = line.split(" ")
        x1 = float(line[0])
        y1 = float(line[1])
        x2 = float(line[2])
        y2 = float(line[3])
        x3 = float(line[4])
        y3 = float(line[5])
        x4 = float(line[6])
        y4 = float(line[7])
        category = str(line[8])
        dif = str(line[9])

        x_min = min(x_min, x1, x2, x3, x4)
        y_min = min(y_min, y1, y2, y3, y4)
        x_max = max(x_max, x1, x2, x3, x4)
        y_max = max(y_max, y1, y2, y3, y4)

    d_to_left = x_min  # 包含所有目标框的最大左移动距离
    d_to_right = w - x_max  # 包含所有目标框的最大右移动距离
    d_to_top = y_min  # 包含所有目标框的最大上移动距离
    d_to_bottom = h - y_max  # 包含所有目标框的最大下移动距离

    x = random.uniform(-(d_to_left - 1) / 3, (d_to_right - 1) / 3)
    y = random.uniform(-(d_to_top - 1) / 3, (d_to_bottom - 1) / 3)

    (filepath, tempfilename) = os.path.split(inputtxt)
    (filename, extension) = os.path.splitext(tempfilename)
    if x >= 0 and y >= 0:
        outputiamge = os.path.join(
            outputiamge + "/" + filename + "_shift_" + str(round(x, 3)) + "_" + str(round(y, 3)) + ".png")
        outputtxt = os.path.join(
            outputtxt + "/" + filename + "_shift_" + str(round(x, 3)) + "_" + str(round(y, 3)) + extension)
    elif x >= 0 and y < 0:
        outputiamge = os.path.join(
            outputiamge + "/" + filename + "_shift_" + str(round(x, 3)) + "__" + str(round(abs(y), 3)) + ".png")
        outputtxt = os.path.join(
            outputtxt + "/" + filename + "_shift_" + str(round(x, 3)) + "__" + str(round(abs(y), 3)) + extension)
    elif x < 0 and y >= 0:
        outputiamge = os.path.join(
            outputiamge + "/" + filename + "_shift__" + str(round(abs(x), 3)) + "_" + str(round(y, 3)) + ".png")
        outputtxt = os.path.join(
            outputtxt + "/" + filename + "_shift__" + str(round(abs(x), 3)) + "_" + str(round(y, 3)) + extension)
    elif x < 0 and y < 0:
        outputiamge = os.path.join(
            outputiamge + "/" + filename + "_shift__" + str(round(abs(x), 3)) + "__" + str(round(abs(y), 3)) + ".png")
        outputtxt = os.path.join(
            outputtxt + "/" + filename + "_shift__" + str(round(abs(x), 3)) + "__" + str(round(abs(y), 3)) + extension)

    M = np.float32([[1, 0, x], [0, 1, y]])  # x为向左或右移动的像素值,正为向右负为向左; y为向上或者向下移动的像素值,正为向下负为向上
    shift_img = cv.warpAffine(img, M, (img.shape[1], img.shape[0]))
    cv.imwrite(outputiamge, shift_img)
    # ---------------------- 平移boundingbox ----------------------
    save_txt = open(outputtxt, "w")
    f = open(inputtxt)
    for line in f.readlines():
        line = line.split(" ")
        x1 = float(line[0])
        y1 = float(line[1])
        x2 = float(line[2])
        y2 = float(line[3])
        x3 = float(line[4])
        y3 = float(line[5])
        x4 = float(line[6])
        y4 = float(line[7])
        category = str(line[8])
        dif = str(line[9])
        shift_str = str(int(round(x1 + x, 3))) + " " + str(int(round(y1 + y, 3))) + " " + str(int(round(x2 + x, 3))) + " " + str(int(round(y2 + y, 3))) + " " + str(int(round(x3 + x, 3))) + " " + str(int(round(y3 + y, 3))) + " " + str(int(round(x4 + x, 3))) + " " + str(int(round(y4 + y, 3))) + " " + category + " " + dif
        save_txt.write(shift_str)

裁剪图像
这里的裁剪图像指的是裁剪目标周围的背景,从外面向内收缩,直到碰到第一个对象框停止

裁剪图像

def crop_img_bboxes(img, inputtxt, outputiamge, outputtxt):
    # ---------------------- 裁剪图像 ----------------------
    w = img.shape[1]
    h = img.shape[0]
    x_min = 0  # 裁剪后的包含所有目标框的最小的框
    x_max = w
    y_min = 0
    y_max = h
    f = open(inputtxt)
    for line in f.readlines():
        line = line.split(" ")
        x1 = float(line[0])
        y1 = float(line[1])
        x2 = float(line[2])
        y2 = float(line[3])
        x3 = float(line[4])
        y3 = float(line[5])
        x4 = float(line[6])
        y4 = float(line[7])
        category = str(line[8])
        dif = str(line[9])

        x_min = min(x_min, x1, x2, x3, x4)
        y_min = min(y_min, y1, y2, y3, y4)
        x_max = max(x_max, x1, x2, x3, x4)
        y_max = max(y_max, y1, y2, y3, y4)

    d_to_left = x_min  # 包含所有目标框的最小框到左边的距离
    d_to_right = w - x_max  # 包含所有目标框的最小框到右边的距离
    d_to_top = y_min  # 包含所有目标框的最小框到顶端的距离
    d_to_bottom = h - y_max  # 包含所有目标框的最小框到底部的距离
    # 随机扩展这个最小框
    crop_x_min = int(x_min - random.uniform(0, d_to_left))
    crop_y_min = int(y_min - random.uniform(0, d_to_top))
    crop_x_max = int(x_max + random.uniform(0, d_to_right))
    crop_y_max = int(y_max + random.uniform(0, d_to_bottom))
    # 随机扩展这个最小框 , 防止别裁的太小
    # crop_x_min = int(x_min - random.uniform(d_to_left//2, d_to_left))
    # crop_y_min = int(y_min - random.uniform(d_to_top//2, d_to_top))
    # crop_x_max = int(x_max + random.uniform(d_to_right//2, d_to_right))
    # crop_y_max = int(y_max + random.uniform(d_to_bottom//2, d_to_bottom))
    # 确保不要越界
    crop_x_min = max(0, crop_x_min)
    crop_y_min = max(0, crop_y_min)
    crop_x_max = min(w, crop_x_max)
    crop_y_max = min(h, crop_y_max)
    crop_img = img[crop_y_min:crop_y_max, crop_x_min:crop_x_max]

    (filepath, tempfilename) = os.path.split(inputtxt)
    (filename, extension) = os.path.splitext(tempfilename)
    outputiamge = os.path.join(outputiamge + "/" + filename + "_crop_" + str(crop_x_min) + "_" +
                               str(crop_y_min) + "_" + str(crop_x_max) + "_" +
                               str(crop_y_max) + ".png")
    outputtxt = os.path.join(outputtxt + "/" + filename + "_crop_" + str(crop_x_min) + "_" +
                             str(crop_y_min) + "_" + str(crop_x_max) + "_" +
                             str(crop_y_max) + extension)
    cv.imwrite(outputiamge, crop_img)
    # ---------------------- 裁剪boundingbox ----------------------
    # 裁剪后的boundingbox坐标计算
    save_txt = open(outputtxt, "w")
    f = open(inputtxt)
    for line in f.readlines():
        line = line.split(" ")
        x1 = float(line[0])
        y1 = float(line[1])
        x2 = float(line[2])
        y2 = float(line[3])
        x3 = float(line[4])
        y3 = float(line[5])
        x4 = float(line[6])
        y4 = float(line[7])
        category = str(line[8])
        dif = str(line[9])
        crop_str = str(int(round(x1 - crop_x_min, 3))) + " " + str(int(round(y1 - crop_y_min, 3))) + " " + str(int(round(x2 - crop_x_min, 3))) + " " + str(int(round(y2 - crop_y_min, 3))) + " " + str(int(round(x3 - crop_x_min, 3))) + " " + str(int(round(y3 - crop_y_min, 3))) + " " + str(int(round(x4 - crop_x_min, 3))) + " " + str(int(round(y4 - crop_y_min, 3))) + " " + category + " " + dif
        save_txt.write(crop_str)

在这里插入图片描述

完整代码
图片放置结构如下图所示:
在这里插入图片描述

import time
import random
import cv2 as cv
import os
import math
import numpy as np
from skimage import exposure
import shutil


# 调整亮度
def changeLight(img, inputtxt, outputiamge, outputtxt):
    # random.seed(int(time.time()))
    flag = random.uniform(0.5, 1.5)  # flag>1为调暗,小于1为调亮
    label = round(flag, 2)
    (filepath, tempfilename) = os.path.split(inputtxt)
    (filename, extension) = os.path.splitext(tempfilename)
    outputiamge = os.path.join(outputiamge + "/" + filename + "_" + str(label) + ".png")
    outputtxt = os.path.join(outputtxt + "/" + filename + "_" + str(label) + extension)

    ima_gamma = exposure.adjust_gamma(img, 0.5)

    shutil.copyfile(inputtxt, outputtxt)
    cv.imwrite(outputiamge, ima_gamma)


# 添加高斯噪声
def gasuss_noise(image, inputtxt, outputiamge, outputtxt, mean=0, var=0.01):
    '''
    mean : 均值
    var : 方差
    '''
    image = np.array(image / 255, dtype=float)
    noise = np.random.normal(mean, var ** 0.5, image.shape)
    out = image + noise

    if out.min() < 0:
        low_clip = -1.
    else:
        low_clip = 0.
    out = np.clip(out, low_clip, 1.0)
    out = np.uint8(out * 255)

    (filepath, tempfilename) = os.path.split(inputtxt)
    (filename, extension) = os.path.splitext(tempfilename)
    outputiamge = os.path.join(outputiamge + "/" + filename + "_gasunoise_" + str(mean) + "_" + str(var) + ".png")
    outputtxt = os.path.join(outputtxt + "/" + filename + "_gasunoise_" + str(mean) + "_" + str(var) + extension)

    shutil.copyfile(inputtxt, outputtxt)
    cv.imwrite(outputiamge, out)


# 调整对比度
def ContrastAlgorithm(rgb_img, inputtxt, outputiamge, outputtxt):
    img_shape = rgb_img.shape
    temp_imag = np.zeros(img_shape, dtype=float)
    for num in range(0, 3):
        # 通过直方图正规化增强对比度
        in_image = rgb_img[:, :, num]
        # 求输入图片像素最大值和最小值
        Imax = np.max(in_image)
        Imin = np.min(in_image)
        # 要输出的最小灰度级和最大灰度级
        Omin, Omax = 0, 255
        # 计算a 和 b的值
        a = float(Omax - Omin) / (Imax - Imin)
        b = Omin - a * Imin
        # 矩阵的线性变化
        out_image = a * in_image + b
        # 数据类型的转化
        out_image = out_image.astype(np.uint8)
        temp_imag[:, :, num] = out_image
    (filepath, tempfilename) = os.path.split(inputtxt)
    (filename, extension) = os.path.splitext(tempfilename)
    outputiamge = os.path.join(outputiamge + "/" + filename + "_contrastAlgorithm" + ".png")
    outputtxt = os.path.join(outputtxt + "/" + filename + "_contrastAlgorithm" + extension)
    shutil.copyfile(inputtxt, outputtxt)
    cv.imwrite(outputiamge, temp_imag)


# 旋转
def rotate_img_bbox(img, inputtxt, temp_outputiamge, temp_outputtxt, angle, scale=1):
    nAgree = angle
    size = img.shape
    w = size[1]
    h = size[0]
    for numAngle in range(0, len(nAgree)):
        dRot = nAgree[numAngle] * np.pi / 180
        dSinRot = math.sin(dRot)
        dCosRot = math.cos(dRot)

        nw = (abs(np.sin(dRot) * h) + abs(np.cos(dRot) * w)) * scale
        nh = (abs(np.cos(dRot) * h) + abs(np.sin(dRot) * w)) * scale

        (filepath, tempfilename) = os.path.split(inputtxt)
        (filename, extension) = os.path.splitext(tempfilename)
        outputiamge = os.path.join(temp_outputiamge + "/" + filename + "_rotate_" + str(nAgree[numAngle]) + ".png")
        outputtxt = os.path.join(temp_outputtxt + "/" + filename + "_rotate_" + str(nAgree[numAngle]) + extension)

        rot_mat = cv.getRotationMatrix2D((nw * 0.5, nh * 0.5), nAgree[numAngle], scale)
        rot_move = np.dot(rot_mat, np.array([(nw - w) * 0.5, (nh - h) * 0.5, 0]))
        rot_mat[0, 2] += rot_move[0]
        rot_mat[1, 2] += rot_move[1]
        # 仿射变换
        rotat_img = cv.warpAffine(img, rot_mat, (int(math.ceil(nw)), int(math.ceil(nh))), flags=cv.INTER_LANCZOS4)
        cv.imwrite(outputiamge, rotat_img)

        save_txt = open(outputtxt, 'w')
        f = open(inputtxt)
        for line in f.readlines():
            line = line.split(" ")
            x1 = float(line[0])
            y1 = float(line[1])
            x2 = float(line[2])
            y2 = float(line[3])
            x3 = float(line[4])
            y3 = float(line[5])
            x4 = float(line[6])
            y4 = float(line[7])
            category = str(line[8])
            dif = str(line[9])

            point1 = np.dot(rot_mat, np.array([x1, y1, 1]))
            point2 = np.dot(rot_mat, np.array([x2, y2, 1]))
            point3 = np.dot(rot_mat, np.array([x3, y3, 1]))
            point4 = np.dot(rot_mat, np.array([x4, y4, 1]))
            x1 = round(point1[0], 3)
            y1 = round(point1[1], 3)
            x2 = round(point2[0], 3)
            y2 = round(point2[1], 3)
            x3 = round(point3[0], 3)
            y3 = round(point3[1], 3)
            x4 = round(point4[0], 3)
            y4 = round(point4[1], 3)
            # string = str(x1) + " " + str(y1) + " " + str(x2) + " " + str(y2) + " " + str(x3) + " " + str(y3) + " " + str(x4) + " " + str(y4) + " " + category + " " + dif
            string = str(int(x1)) + " " + str(int(y1)) + " " + str(int(x2)) + " " + str(int(y2)) + " " + str(
                int(x3)) + " " + str(int(y3)) + " " + str(int(x4)) + " " + str(int(y4)) + " " + category + " " + dif
            save_txt.write(string)

# 翻转图像
def filp_pic_bboxes(img, inputtxt, outputiamge, outputtxt):
    (filepath, tempfilename) = os.path.split(inputtxt)
    (filename, extension) = os.path.splitext(tempfilename)
    output_vert_flip_img = os.path.join(outputiamge + "/" + filename + "_vert_flip" + ".png")
    output_vert_flip_txt = os.path.join(outputtxt + "/" + filename + "_vert_flip" + extension)
    output_horiz_flip_img = os.path.join(outputiamge + "/" + filename + "_horiz_flip" + ".png")
    output_horiz_flip_txt = os.path.join(outputtxt + "/" + filename + "_horiz_flip" + extension)

    h, w, _ = img.shape
    # 垂直翻转
    vert_flip_img = cv.flip(img, 1)
    cv.imwrite(output_vert_flip_img, vert_flip_img)
    # 水平翻转
    horiz_flip_img = cv.flip(img, 0)
    cv.imwrite(output_horiz_flip_img, horiz_flip_img)
    # ---------------------- 调整boundingbox ----------------------
    save_vert_txt = open(output_vert_flip_txt, 'w')
    save_horiz_txt = open(output_horiz_flip_txt, 'w')
    f = open(inputtxt)
    for line in f.readlines():
        line = line.split(" ")
        x1 = float(line[0])
        y1 = float(line[1])
        x2 = float(line[2])
        y2 = float(line[3])
        x3 = float(line[4])
        y3 = float(line[5])
        x4 = float(line[6])
        y4 = float(line[7])
        category = str(line[8])
        dif = str(line[9])

        vert_string = str(int(round(w - x1, 3))) + " " + str(int(y1)) + " " + str(int(round(w - x2, 3))) + " " + str(
            int(y2)) + " " + str(int(round(w - x3, 3))) + " " + str(int(y3)) + " " + str(
            int(round(w - x4, 3))) + " " + str(int(y4)) + " " + category + " " + dif
        horiz_string = str(int(x1)) + " " + str(int(round(h - y1, 3))) + " " + str(int(x2)) + " " + str(
            int(round(h - y2, 3))) + " " + str(int(x3)) + " " + str(int(round(h - y3, 3))) + " " + str(
            int(x4)) + " " + str(int(round(h - y4, 3))) + " " + category + " " + dif

        save_horiz_txt.write(horiz_string)
        save_vert_txt.write(vert_string)


# 平移图像
def shift_pic_bboxes(img, inputtxt, outputiamge, outputtxt):
    w = img.shape[1]
    h = img.shape[0]
    x_min = w  # 裁剪后的包含所有目标框的最小的框
    x_max = 0
    y_min = h
    y_max = 0
    f = open(inputtxt)
    for line in f.readlines():
        line = line.split(" ")
        x1 = float(line[0])
        y1 = float(line[1])
        x2 = float(line[2])
        y2 = float(line[3])
        x3 = float(line[4])
        y3 = float(line[5])
        x4 = float(line[6])
        y4 = float(line[7])
        category = str(line[8])
        dif = str(line[9])

        x_min = min(x_min, x1, x2, x3, x4)
        y_min = min(y_min, y1, y2, y3, y4)
        x_max = max(x_max, x1, x2, x3, x4)
        y_max = max(y_max, y1, y2, y3, y4)

    d_to_left = x_min  # 包含所有目标框的最大左移动距离
    d_to_right = w - x_max  # 包含所有目标框的最大右移动距离
    d_to_top = y_min  # 包含所有目标框的最大上移动距离
    d_to_bottom = h - y_max  # 包含所有目标框的最大下移动距离

    x = random.uniform(-(d_to_left - 1) / 3, (d_to_right - 1) / 3)
    y = random.uniform(-(d_to_top - 1) / 3, (d_to_bottom - 1) / 3)

    (filepath, tempfilename) = os.path.split(inputtxt)
    (filename, extension) = os.path.splitext(tempfilename)
    if x >= 0 and y >= 0:
        outputiamge = os.path.join(
            outputiamge + "/" + filename + "_shift_" + str(round(x, 3)) + "_" + str(round(y, 3)) + ".png")
        outputtxt = os.path.join(
            outputtxt + "/" + filename + "_shift_" + str(round(x, 3)) + "_" + str(round(y, 3)) + extension)
    elif x >= 0 and y < 0:
        outputiamge = os.path.join(
            outputiamge + "/" + filename + "_shift_" + str(round(x, 3)) + "__" + str(round(abs(y), 3)) + ".png")
        outputtxt = os.path.join(
            outputtxt + "/" + filename + "_shift_" + str(round(x, 3)) + "__" + str(round(abs(y), 3)) + extension)
    elif x < 0 and y >= 0:
        outputiamge = os.path.join(
            outputiamge + "/" + filename + "_shift__" + str(round(abs(x), 3)) + "_" + str(round(y, 3)) + ".png")
        outputtxt = os.path.join(
            outputtxt + "/" + filename + "_shift__" + str(round(abs(x), 3)) + "_" + str(round(y, 3)) + extension)
    elif x < 0 and y < 0:
        outputiamge = os.path.join(
            outputiamge + "/" + filename + "_shift__" + str(round(abs(x), 3)) + "__" + str(round(abs(y), 3)) + ".png")
        outputtxt = os.path.join(
            outputtxt + "/" + filename + "_shift__" + str(round(abs(x), 3)) + "__" + str(round(abs(y), 3)) + extension)

    M = np.float32([[1, 0, x], [0, 1, y]])  # x为向左或右移动的像素值,正为向右负为向左; y为向上或者向下移动的像素值,正为向下负为向上
    shift_img = cv.warpAffine(img, M, (img.shape[1], img.shape[0]))
    cv.imwrite(outputiamge, shift_img)
    # ---------------------- 平移boundingbox ----------------------
    save_txt = open(outputtxt, "w")
    f = open(inputtxt)
    for line in f.readlines():
        line = line.split(" ")
        x1 = float(line[0])
        y1 = float(line[1])
        x2 = float(line[2])
        y2 = float(line[3])
        x3 = float(line[4])
        y3 = float(line[5])
        x4 = float(line[6])
        y4 = float(line[7])
        category = str(line[8])
        dif = str(line[9])
        shift_str = str(int(round(x1 + x, 3))) + " " + str(int(round(y1 + y, 3))) + " " + str(int(round(x2 + x, 3))) + " " + str(int(round(y2 + y, 3))) + " " + str(int(round(x3 + x, 3))) + " " + str(int(round(y3 + y, 3))) + " " + str(int(round(x4 + x, 3))) + " " + str(int(round(y4 + y, 3))) + " " + category + " " + dif
        save_txt.write(shift_str)


# 裁剪图像
def crop_img_bboxes(img, inputtxt, outputiamge, outputtxt):
    # ---------------------- 裁剪图像 ----------------------
    w = img.shape[1]
    h = img.shape[0]
    x_min = 0  # 裁剪后的包含所有目标框的最小的框
    x_max = w
    y_min = 0
    y_max = h
    f = open(inputtxt)
    for line in f.readlines():
        line = line.split(" ")
        x1 = float(line[0])
        y1 = float(line[1])
        x2 = float(line[2])
        y2 = float(line[3])
        x3 = float(line[4])
        y3 = float(line[5])
        x4 = float(line[6])
        y4 = float(line[7])
        category = str(line[8])
        dif = str(line[9])

        x_min = min(x_min, x1, x2, x3, x4)
        y_min = min(y_min, y1, y2, y3, y4)
        x_max = max(x_max, x1, x2, x3, x4)
        y_max = max(y_max, y1, y2, y3, y4)

    d_to_left = x_min  # 包含所有目标框的最小框到左边的距离
    d_to_right = w - x_max  # 包含所有目标框的最小框到右边的距离
    d_to_top = y_min  # 包含所有目标框的最小框到顶端的距离
    d_to_bottom = h - y_max  # 包含所有目标框的最小框到底部的距离
    # 随机扩展这个最小框
    crop_x_min = int(x_min - random.uniform(0, d_to_left))
    crop_y_min = int(y_min - random.uniform(0, d_to_top))
    crop_x_max = int(x_max + random.uniform(0, d_to_right))
    crop_y_max = int(y_max + random.uniform(0, d_to_bottom))
    # 随机扩展这个最小框 , 防止别裁的太小
    # crop_x_min = int(x_min - random.uniform(d_to_left//2, d_to_left))
    # crop_y_min = int(y_min - random.uniform(d_to_top//2, d_to_top))
    # crop_x_max = int(x_max + random.uniform(d_to_right//2, d_to_right))
    # crop_y_max = int(y_max + random.uniform(d_to_bottom//2, d_to_bottom))
    # 确保不要越界
    crop_x_min = max(0, crop_x_min)
    crop_y_min = max(0, crop_y_min)
    crop_x_max = min(w, crop_x_max)
    crop_y_max = min(h, crop_y_max)
    crop_img = img[crop_y_min:crop_y_max, crop_x_min:crop_x_max]

    (filepath, tempfilename) = os.path.split(inputtxt)
    (filename, extension) = os.path.splitext(tempfilename)
    outputiamge = os.path.join(outputiamge + "/" + filename + "_crop_" + str(crop_x_min) + "_" +
                               str(crop_y_min) + "_" + str(crop_x_max) + "_" +
                               str(crop_y_max) + ".png")
    outputtxt = os.path.join(outputtxt + "/" + filename + "_crop_" + str(crop_x_min) + "_" +
                             str(crop_y_min) + "_" + str(crop_x_max) + "_" +
                             str(crop_y_max) + extension)
    cv.imwrite(outputiamge, crop_img)
    # ---------------------- 裁剪boundingbox ----------------------
    # 裁剪后的boundingbox坐标计算
    save_txt = open(outputtxt, "w")
    f = open(inputtxt)
    for line in f.readlines():
        line = line.split(" ")
        x1 = float(line[0])
        y1 = float(line[1])
        x2 = float(line[2])
        y2 = float(line[3])
        x3 = float(line[4])
        y3 = float(line[5])
        x4 = float(line[6])
        y4 = float(line[7])
        category = str(line[8])
        dif = str(line[9])
        crop_str = str(int(round(x1 - crop_x_min, 3))) + " " + str(int(round(y1 - crop_y_min, 3))) + " " + str(int(round(x2 - crop_x_min, 3))) + " " + str(int(round(y2 - crop_y_min, 3))) + " " + str(int(round(x3 - crop_x_min, 3))) + " " + str(int(round(y3 - crop_y_min, 3))) + " " + str(int(round(x4 - crop_x_min, 3))) + " " + str(int(round(y4 - crop_y_min, 3))) + " " + category + " " + dif
        save_txt.write(crop_str)


if __name__ == '__main__':
    inputiamge = "dataset/ceshi/images"
    inputtxt = "dataset/ceshi/labelTxt"
    outputiamge = "dataset/ceshi_aug/images"
    outputtxt = "dataset/ceshi_aug/labelTxt"
    angle = [30, 60, 90, 120, 150, 180]
    tempfilename = os.listdir(inputiamge)
    for file in tempfilename:
        (filename, extension) = os.path.splitext(file)
        input_image = os.path.join(inputiamge + "/" + file)
        input_txt = os.path.join(inputtxt + "/" + filename + ".txt")

        img = cv.imread(input_image)
        # 图像亮度变化
        changeLight(img,input_txt,outputiamge,outputtxt)
        # 加高斯噪声
        gasuss_noise(img, input_txt, outputiamge, outputtxt, mean=0, var=0.001)
        # 改变图像对比度
        ContrastAlgorithm(img, input_txt, outputiamge, outputtxt)
        # 图像旋转
        rotate_img_bbox(img, input_txt, outputiamge, outputtxt, angle)
        # 图像镜像
        filp_pic_bboxes(img, input_txt, outputiamge, outputtxt)
        # 平移
        shift_pic_bboxes(img, input_txt, outputiamge, outputtxt)
        # 剪切
        crop_img_bboxes(img, input_txt, outputiamge, outputtxt)
    print("finished!")

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:http://www.coloradmin.cn/o/2162943.html

如若内容造成侵权/违法违规/事实不符,请联系多彩编程网进行投诉反馈,一经查实,立即删除!

相关文章

Redis6 多线程模型

优质博文&#xff1a;IT-BLOG-CN 一、单线程的优缺点 对于一个请求操作Redis主要做3件事情&#xff1a;从客户端读取数据/解析、执行Redis命令、回写数据给客户端。所以主线程其实就是把所有操作的这3件事情串行一起执行&#xff0c;因为是基于内存&#xff0c;所以执行速度非…

区间合并算法

区间合并 区间合并就是有两个区间我们把两个区间合并成一个区间 我们来看一道题 Acwing 803 区间合并 1.题目 给定 n nn 个区间 [ l i , r i ] [li,ri][li,ri]&#xff0c;要求合并所有有交集的区间。 注意如果在端点处相交&#xff0c;也算有交集。 输出合并完成后的区间个…

C语言 | Leetcode C语言题解之第434题字符串中的单词数

题目&#xff1a; 题解&#xff1a; int countSegments(char * s){int count 0; //count用来记录单词个数for(int i0; i < strlen(s); i){ //遍历字符串 if((i 0 || s[i-1] ) && s[i] ! ) //一个…

C语言_指针(2)

1.指针与数组的关系 1.1 数组名 先看代码&#xff1a; #include <stdio.h> int main() {int arr[10] { 1,2,3,4,5,6,7,8,9,10 };printf("&arr[0] %p\n", &arr[0]);printf("arr %p\n", arr);return 0;}运行结果是这样的&#xff1a; 我…

数据结构 ——— 数组 nums 包含了从 0 到 n 的所有整数,但是其中缺失了一个,请编写代码找出缺失的整数,并且在O(N)时间内完成

目录 题目要求 代码实现 方法1&#xff08;异或法&#xff09;&#xff1a; 异或算法的时间复杂度&#xff1a; 方法2&#xff08;等差数列公式&#xff09;&#xff1a; 等差数列公式的时间复杂度&#xff1a; 题目要求 整型数组 nums 包含了从 0 到 n 的所有整数&…

【有啥问啥】 Self-Play技术:强化学习中的自我进化之道

Self-Play技术&#xff1a;强化学习中的自我进化之道 在人工智能的快速发展中&#xff0c;强化学习&#xff08;Reinforcement Learning, RL&#xff09;已成为推动智能体自主学习与优化的关键力量。Self-Play技术&#xff0c;作为强化学习领域的一项前沿创新&#xff0c;通过…

Java语法-类和对象(上)

1. 面向对象的初步认识 1.1 什么是面向对象 概念: Java是一门纯面向对象的语言(Object Oriented Program&#xff0c;简称OOP)&#xff0c;在面向对象的世界里&#xff0c;一切皆为对象。 1.2 面向对象VS面向过程 如:洗衣服 面向过程: 注重的是洗衣服的过程,少了一个环节也不…

SPSS26统计分析笔记——3 假设检验

1 假设检验原理 假设检验的基本原理源于“小概率事件”原理&#xff0c;是一种基于概率性质的反证法。其核心思想是小概率事件在一次试验中几乎不会发生。检验的过程首先假设原假设 H 0 {H_0} H0​成立&#xff0c;然后通过统计方法分析样本数据。如果样本数据引发了“小概率事…

《数据压缩入门》笔记-Part 2

一篇文章显得略长&#xff0c;本文对应原书6-10章。序言、前言、第1-5章&#xff0c;请参考Part 1&#xff0c;第11-15章&#xff0c;请参考Part 3。 自适应统计编码 位置对熵的重要性 统计编码有一个问题&#xff1a;在编码开始之前都需要遍历一次数据&#xff0c;以计算出…

Linux:八种重定向详解(万字长文警告)

相关阅读Linuxhttps://blog.csdn.net/weixin_45791458/category_12234591.html?spm1001.2014.3001.5482 本文将讨论Linux中的重定向相关问题&#xff0c;在阅读本文前&#xff0c;强烈建议先学习文件描述符的相关内容Linux&#xff1a;文件描述符详解。 重定向分为两类&#x…

智能感知,主动防御:移动云态势感知为政企安全护航

数字化时代&#xff0c;网络安全已成为企业持续运营和发展的重要基石。随着业务扩展&#xff0c;企业资产的数量急剧增加&#xff0c;且分布日益分散&#xff0c;如何全面、准确地掌握和管理资产成为众多政企单位的难题。同时&#xff0c;传统安全手段又难以有效应对新型、隐蔽…

【unity进阶知识1】最详细的单例模式的设计和应用,继承和不继承MonoBehaviour的单例模式,及泛型单例基类的编写

文章目录 前言一、不使用单例二、普通单例模式1、单例模式介绍实现步骤&#xff1a;单例模式分为饿汉式和懒汉式两种。 2、不继承MonoBehaviour的单例模式2.1、基本实现2.2、防止外部实例化对象2.3、最终代码 3、继承MonoBehaviour的单例模式3.1、基本实现3.2、自动创建和挂载单…

苏轼为何要写石钟山记?时间节点是关键

《石钟山记》不仅是苏轼的旅行笔记&#xff0c;亦是其人生哲学与思想的深邃自省。文中不仅详述了他对石钟山的实地勘察&#xff0c;亦体现了其对历史、自然及人生之独到见解。黄州生涯及其对政治与文化的洞悉&#xff0c;为这篇作品注入了深厚底蕴。 苏轼的黄州岁月 黄州期间…

后端回写前端日期格式化

问题 不进行格式化处理&#xff0c;就会导致传递的字符串很奇怪 解决方案 注解&#xff08;字段&#xff09; <dependency><groupId>com.fasterxml.jackson.core</groupId><artifactId>jackson-databind</artifactId><version>2.9.2</…

9.创新与未来:ChatGPT的新功能和趋势【9/10】

创新与未来&#xff1a;ChatGPT的新功能和趋势 引言 在探讨人工智能的发展历程时&#xff0c;我们可以看到它已经从早期的图灵机和人工神经网络模型&#xff0c;发展到了今天能够模拟人类智能的复杂系统。人工智能的起源可以追溯到20世纪40年代&#xff0c;而它的重要里程碑包…

构建预测睡眠质量模型_相关性分析,多变量分析和聚类分析

数据入口&#xff1a;睡眠质量记录数据集 - Heywhale.com 本数据集目的是探究不同因素是如何影响睡眠质量和整体健康的。 数据说明 字段说明Heart Rate Variability心率变异性&#xff1a;心跳时间间隔的模拟变化Body Temperature体温&#xff1a;以摄氏度为单位的人工生成体…

Multisim简体中文版百度云下载(含安装步骤)

如大家所熟悉的&#xff0c;Multisim是一款基于电路仿真的软件&#xff0c;可用于电子工程师、电子爱好者和学生进行电路设计、分析和调试。Multisim具有完整的电路设计和仿真功能&#xff0c;可支持模拟电路、数字电路&#xff0c;以及混合电路。 Multisim可以模拟不同电路的…

【数据结构】排序算法系列——归并排序(附源码+图解)

归并排序 归并排序从字面上来看&#xff0c;它的大致核心应与归并有关——归并拆分开来&#xff0c;变成归类和合并&#xff0c;归类则是将数组进行有序化&#xff0c;合并则是将两个有序的数组进行合并变成一个有序的数组。 它的特点在于并不是一开始就将整个数组进行归类和…

MODBUS TCP 转 CANOpen

产品概述 SG-TCP-COE-210 网关可以实现将 CANOpen 接口设备连接到 MODBUS TCP 网络中。用户不需要了解具体的 CANOpen 和 Modbus TCP 协议即可实现将CANOpen 设备挂载到 MODBUS TCP 接口的 PLC 上&#xff0c;并和 CANOpen 设备进行数据交互。 产品特点 &#xf…

Qt 构建目录

Qt Creator新建项目时&#xff0c;选择构建套件是必要的一环&#xff1a; 构建目录的默认设置 在Qt Creator中&#xff0c;项目的构建目录通常是默认设置的&#xff0c;位于项目文件夹内的一个子文件夹中&#xff0c;如&#xff1a;build-项目名-Desktop_Qt_版本号_编译器类型_…