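Before training a machine learning or deep learning model, sample images have to be preprocessed, and one of the steps is resizing them to a uniform size. Samples often come from many sources, such as web crawlers or photos taken on different phones, so their sizes and aspect ratios vary. How images with different sizes and aspect ratios are brought to a uniform size has a large effect on subsequent model training. There is plenty of resizing code online, but most of it simply forces the new dimensions, which distorts the image and damages the original information in the sample.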
Method 1 below is the common approach:
Resizing method 1
The code is as follows:
import os

import torchvision.transforms as transforms
from PIL import Image

# Force every image to 600x600, ignoring its original aspect ratio.
transform = transforms.Compose([
    transforms.Resize((600, 600), interpolation=transforms.InterpolationMode.BICUBIC),
    # other transformations...
])


def translate_image(path_image, path_target, picture_name):
    im = Image.open(path_image)
    im = im.convert("RGB")  # normalize to 3-channel RGB
    output = transform(im)  # still a PIL image, since no ToTensor step is applied
    output.save(os.path.join(path_target, picture_name))
Result:
The distortion is clearly visible.
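To put a number on that distortion, the sketch below compares the horizontal and vertical scale factors that a forced resize applies. The function name `stretch_factor` and the sample sizes are illustrative assumptions, not part of the original code.

```python
def stretch_factor(width, height, target_w=600, target_h=600):
    """Return the horizontal scale divided by the vertical scale.

    A value of 1.0 means the aspect ratio is preserved; any other value
    means the content is squeezed or stretched along one axis.
    """
    scale_x = target_w / width
    scale_y = target_h / height
    return scale_x / scale_y


if __name__ == "__main__":
    # Hypothetical 1920x1080 landscape photo forced into 600x600.
    print(stretch_factor(1920, 1080))
    # A square source keeps its proportions.
    print(stretch_factor(800, 800))
```

The further the value is from 1.0, the more the objects in the image are deformed relative to the original.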
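Resizing method 2
To avoid distortion, compute a single scale factor from the image's width and height, scale both dimensions by it, and fill the remaining area with padding pixels (black in the code below). Because both dimensions share one scale factor, the aspect ratio is preserved.
The procedure is as follows: read the image and get its height and width; divide each by the target size and take the larger of the two ratios; scale the image by that ratio so the longer side matches the target size; then pad the shorter side on both sides until the image is square.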
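As a worked example of the arithmetic in these steps, the following sketch computes the scaled size and the padding on each side without touching any image data. The helper name `letterbox_geometry` and the sample dimensions are assumptions made for illustration. It rounds the scaled size and gives any odd leftover pixel to the right or bottom edge, which is one reasonable choice among several.

```python
def letterbox_geometry(width, height, target_size=600):
    """Return (new_w, new_h, left, top, right, bottom) for scale-and-pad."""
    # The larger ratio belongs to the longer side, which must fit exactly.
    ratio = max(width / target_size, height / target_size)
    # Clamp to [1, target_size] to guard against rounding at the extremes.
    new_w = min(target_size, max(1, round(width / ratio)))
    new_h = min(target_size, max(1, round(height / ratio)))
    # Split the leftover pixels; the second side absorbs an odd remainder.
    left = (target_size - new_w) // 2
    top = (target_size - new_h) // 2
    right = target_size - new_w - left
    bottom = target_size - new_h - top
    return new_w, new_h, left, top, right, bottom


if __name__ == "__main__":
    # Hypothetical landscape and portrait inputs.
    print(letterbox_geometry(1600, 1200))
    print(letterbox_geometry(900, 1200))
```

For a landscape image the padding lands above and below; for a portrait image it lands on the left and right.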
The code is as follows:
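import cv2
import os

target_size = 600  # side length of the square output, in pixels


def resize(path_image, out_path):
    im = cv2.imread(path_image)
    if im is None:  # cv2.imread returns None when the file cannot be read
        return None
    height, width = im.shape[:2]  # shape is (height, width, channels)
    ratio_h = height / target_size
    ratio_w = width / target_size
    ratio = max(ratio_h, ratio_w)  # the longer side determines the scale
    # cv2.resize expects the size as (width, height)
    new_w = min(target_size, max(1, round(width / ratio)))
    new_h = min(target_size, max(1, round(height / ratio)))
    # INTER_AREA is area-based resampling (not bilinear); it suits shrinking
    shrink = cv2.resize(im, (new_w, new_h), interpolation=cv2.INTER_AREA)
    BLACK = [0, 0, 0]
    # Split the padding so the result is exactly target_size, even for odd gaps
    left = (target_size - new_w) // 2
    right = target_size - new_w - left
    top = (target_size - new_h) // 2
    bottom = target_size - new_h - top
    constant = cv2.copyMakeBorder(shrink, top, bottom, left, right, cv2.BORDER_CONSTANT, value=BLACK)
    # The compression flag only takes effect when out_path is a PNG file
    cv2.imwrite(out_path, constant, [cv2.IMWRITE_PNG_COMPRESSION, 9])
    return constant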
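Result:
Whether the padding falls above and below or to the left and right, the image is centered automatically and the surrounding area is filled.
Batch implementation of file size normalization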
import cv2
import os

target_size = 600  # side length of the square output, in pixels


def resize(path_image, out_path):
    im = cv2.imread(path_image)
    if im is None:  # cv2.imread returns None when the file cannot be read
        return None
    height, width = im.shape[:2]  # shape is (height, width, channels)
    ratio_h = height / target_size
    ratio_w = width / target_size
    ratio = max(ratio_h, ratio_w)  # the longer side determines the scale
    # cv2.resize expects the size as (width, height)
    new_w = min(target_size, max(1, round(width / ratio)))
    new_h = min(target_size, max(1, round(height / ratio)))
    # INTER_AREA is area-based resampling (not bilinear); it suits shrinking
    shrink = cv2.resize(im, (new_w, new_h), interpolation=cv2.INTER_AREA)
    BLACK = [0, 0, 0]
    # Split the padding so the result is exactly target_size, even for odd gaps
    left = (target_size - new_w) // 2
    right = target_size - new_w - left
    top = (target_size - new_h) // 2
    bottom = target_size - new_h - top
    constant = cv2.copyMakeBorder(shrink, top, bottom, left, right, cv2.BORDER_CONSTANT, value=BLACK)
    # The compression flag only takes effect when out_path is a PNG file
    cv2.imwrite(out_path, constant, [cv2.IMWRITE_PNG_COMPRESSION, 9])
    return constant


if __name__ == '__main__':
    directory_name = "F://1"
    path_target = "F://1-resize"
    os.makedirs(path_target, exist_ok=True)
    for picture_name in os.listdir(directory_name):
        file_name = os.path.join(directory_name, picture_name)  # source folder + file name
        if resize(file_name, os.path.join(path_target, picture_name)) is None:
            print("skipped (not a readable image):", file_name)
            continue
        print(file_name)
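A source folder collected from crawlers or phones may also contain sub-folders or non-image files. One possible way to filter them before calling `resize` is sketched below. The helper name `list_image_pairs` and the extension set are assumptions for illustration and should be adjusted to the formats actually present in the dataset.

```python
from pathlib import Path

# Assumed set of image extensions; extend it as needed.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def list_image_pairs(source_dir, target_dir, extensions=IMAGE_EXTENSIONS):
    """Return sorted (source_path, target_path) string pairs for image files."""
    source = Path(source_dir)
    target = Path(target_dir)
    pairs = []
    for entry in sorted(source.iterdir()):
        # Skip sub-folders and files whose suffix is not a known image type.
        if entry.is_file() and entry.suffix.lower() in extensions:
            pairs.append((str(entry), str(target / entry.name)))
    return pairs
```

Each returned pair can then be passed to the function above as `resize(src, dst)`.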