在做3D分割任务中，多数的方法多采用整体缩放，或裁剪成一个个小的patch操作，这样做的一个主要原因是内存问题。

相较于整体缩放，采用裁剪成patch的方法，对于小目标会更加的鲁棒，这也是大多数3D分割任务中常选取的方式。尤其是针对医学影像的器官分割任务，CT结节诊断等等，对于细节的要求是非常高的。采用缩小的方式，反而会使得目标的像素区域在输入阶段，就损失较多。

后面，就针对2D和3D的图像和MR数据进行有重叠的crop操作和merge操作，帮助对其中的细节进行理解，下面开始吧。

一、2D crop and merge

对于一个[10, 10]大小的示例图像，采用patch大小为[3, 3]的进行裁剪，每次patch与patch之间，在x和y方向重叠1个像素，无法构成一个patch的部分，选择丢弃，如下所示：
示意图

分析过程：

首先，决定行列可以有多少个patch，是左上角第一个patch中右下角的那个红色点，因为他是第一个在水平和竖直方向都会需要重叠的点；
反应在行和列上面，可以移动的区域也就是width - patch_width + 1，和height - patch_height + 1，因为对于左上角第一个patch，只有右下角的坐标，是参与到遍历里面的，可参与遍历的区域就是红色曲线区域；
对于行、列的每一步，都会重叠overlap_size个像素区域，所以可走的步长，是patch_width - overlap_width，和patch_height - overlap_height；
最后以左上角的坐标点为准，x:x+patch_width和y:y+patch_height就表示了一个区块patch的像素范围，被裁剪下来。

1.1、crop

下面是实现上述步骤的代码，主要就是要理解几个点：

在水平和数值方向，能取到多少个patch？
patch的滑动选取，一次可以移动多大的步长？
最后取像素块，就简单了许多。

import numpy as np

def crop_volume(volume, patch_size=[96, 96], overlap_size=[4, 4]):
    """
    Crop a 2D volume into patches with specified size and overlap.
    Args:
        volume (np.ndarray): the 3D volume to be cropped, with shape [width, height]
        patch_size (tuple or list): the size of patch, with format [patch_width, patch_height]
        overlap_size (tuple or list): the size of overlap between adjacent patches, with format [overlap_width, overlap_height]
    Returns:
        np.ndarray: the cropped patches, with shape [num_patches, patch_width, patch_height]
    """
    width, height = volume.shape
    patch_width, patch_height = patch_size

    overlap_width, overlap_height = overlap_size
    patches = []
    # 不够一个patch，就丢弃
    for x in range(0, width - patch_width + 1, patch_width - overlap_width):
        for y in range(0, height - patch_height + 1, patch_height - overlap_height):
            print(x, y)
            patch = volume[x:x+patch_width, y:y+patch_height]
            patches.append(patch)
        print('\n')
    patches = np.asarray(patches)

    return patches


# 生成一个[10, 10]大小的示例图像
imgs = np.random.rand(10, 10)
patch_size=[3, 3]
overlap_size = [1, 1]
# print('img shape:', imgs.shape)

patches = crop_volume(imgs, patch_size, overlap_size)
print('patches shape:', patches.shape)

验证了前面我们的猜想，后面我们直接取一个图片，来验证下我们的思路。如果上述的思路是对的，那么，在裁剪后保存的图像，就该是一个具体部分重叠区域，但是，还能够反映全貌的一个个小图。下面就是：

import os
import itk
import cv2
from matplotlib import pylab as plt
from PIL import Image
path = os.path.join(r'F:\tmp\results2', '10.png')
imgs = cv2.imread(path, 0)
volume_size = imgs.shape
patch_size=[96, 96]
overlap_size = [4, 4]
print('img shape:', imgs.shape)

patches = crop_volume(imgs, patch_size, overlap_size)
print('patches shape:', patches.shape)

for i in range(0, patches.shape[0], 1):
    one_patch = patches[i, :, :]
    print(i, one_patch.shape)
    width_p, height_p = one_patch.shape

    img_Image = Image.fromarray(one_patch)
    img_Image = img_Image.convert("L")

    img_Image.save(r"F:\tmp\results1/" + str(i) + ".png")

如下，是读取的原始图像，和crop后的一个个散落的小图。尽管一个个小图在我们展示的时候，他们之间使用间隙的，但并不影响我们看到他的全貌。

还是一种图的样子，区别在与彩色图像成了灰度图，最右侧和最下侧的像素像是少了一些。这是因为不足一个patch，被丢弃的原因。基于此，我们也能给猜出来，在后续merge阶段，可能会还原回去的图像存在些许的差异。

二维裁剪

1.2、merge

merge是crop的一个逆过程。当时我们怎么拆的，现在就原路给拼接回去。拼接回去的图像尺寸和crop前的图像尺寸是一致的，当时被忽略的部分，都是0。

移动的步长还是一致的，在行列方向分别还是：patch_width - overlap_width，和patch_height - overlap_height；
还要需要知道在行列方向，分别crop了多少次。以行为例，width - patch_width就是把第一个patch块去除掉的行长度，再除以(patch_width - overlap_width)，得到剩余部分可以走多少步，再+1，把第一个块的数量补上，也就得到了在行方向上，有多少个patch了；
以左上角的坐标点为准，x:x+patch_width和y:y+patch_height就表示了一个区块patch了，现在就把对应的patch像素，给赋值给原始图了。
因为overlap重叠区域，被多次覆盖，这部分需要求均值，巧妙的采用了一个新的像素块，专门记录被赋值的次数，最后除以对应的次数，就可能实现求均值的过程了。

为什么先需要把第一个patch块的行列方向都先去掉呢？

因为第一个块是最特殊的，它被重叠的区域，只有overlap_size_w行和overlap_size_h例，而其余的patch块，重叠区域都会是2行和2列，都遵循步长的节奏。

width - patch_width把第一个patch块去除掉后的行长度，还能准确反映有多少个patch吗？

答案是可以的，这是因为减去一个patch，无非就是少了一个overlap_size_h的长度，去掉一个overlap_size_h的长度，如果恰好整除，那么加上这个长度，也是多余的，无法再次构成一个新的patch；即便有剩余，也是无法组成一个新的patch的。

下面是上面图像crop阶段裁剪得到的patch，加上merge后的操作，如下：

def merge_patches(patches, volume_size, overlap_size):
    """
    Merge the cropped patches into a complete 2D volume.
    Args:
        patches (np.ndarray): the cropped patches, with shape [num_patches, patch_width, patch_height]
        volume_size (tuple or list): the size of the complete volume, with format [width, height]
        overlap_size (tuple or list): the size of overlap between adjacent patches, with format [overlap_width, overlap_height]
    Returns:
        np.ndarray: the merged volume, with shape [width, height]
    """
    width, height = volume_size
    patch_width, patch_height = patches.shape[1:]
    overlap_width, overlap_height = overlap_size
    num_patches_x = (width - patch_width) // (patch_width - overlap_width) + 1
    num_patches_y = (height - patch_height) // (patch_height - overlap_height) + 1
    print('merge:', num_patches_x, num_patches_y)

    merged_volume = np.zeros(volume_size)
    weight_volume = np.zeros(volume_size)   # weight_volume的目的是用于记录每个像素在裁剪过程中被遍历的次数，最后用于求平均值
    idx = 0
    for x in range(num_patches_x):
        for y in range(num_patches_y):
            x_start = x * (patch_width - overlap_width)
            y_start = y * (patch_height - overlap_height)
            merged_volume[x_start:x_start+patch_width, y_start:y_start+patch_height] += patches[idx]
            weight_volume[x_start:x_start+patch_width, y_start:y_start+patch_height] += 1
            idx += 1
    merged_volume /= weight_volume
    return merged_volume

path = os.path.join(r'F:\tmp\results2', '10.png')
imgs = cv2.imread(path, 0)
volume_size = imgs.shape
patch_size=[96, 96]
overlap_size = [4, 4]
print('img shape:', imgs.shape)

patches = crop_volume(imgs, patch_size, overlap_size)
print('patches shape:', patches.shape)

merged_volume = merge_patches(patches, volume_size, overlap_size)
print('merged_volume shape:', merged_volume.shape)
merged_volume = Image.fromarray(merged_volume)
merged_volume = merged_volume.convert("L")
merged_volume.save(r"F:\tmp\results2/" + "merged_volume.png")

如下，果然和我们前面猜想的一样，在merge后的图像，相比于原图，在右侧和下侧，都少了部分，这个问题后面在3D时候，再细细的探讨。

nerge

二、3D crop and merge

3D部分相比于2D，无非就是多了一个深度信息，也就是z轴信息。所以在crop阶段和merge阶段，只需要按照2D行列的方式，增加一个维度的信息即可。代码如下所示：

import numpy as np

def crop_volume(volume, patch_size=[96, 96, 96], overlap_size=[4, 4, 4]):
    """
    Crop a 3D volume into patches with specified size and overlap.
    Args:
        volume (np.ndarray): the 3D volume to be cropped, with shape [width, height, depth]
        patch_size (tuple or list): the size of patch, with format [patch_width, patch_height, patch_depth]
        overlap_size (tuple or list): the size of overlap between adjacent patches, with format [overlap_width, overlap_height, overlap_depth]
    Returns:
        np.ndarray: the cropped patches, with shape [num_patches, patch_width, patch_height, patch_depth]
    """
    width, height, depth = volume.shape
    patch_width, patch_height, patch_depth = patch_size

    overlap_width, overlap_height, overlap_depth = overlap_size
    patches = []
    for z in range(0, depth - patch_depth + 1, patch_depth - overlap_depth):
        for y in range(0, height - patch_height + 1, patch_height - overlap_height):
            for x in range(0, width - patch_width + 1, patch_width - overlap_width):
                patch = volume[x:x+patch_width, y:y+patch_height, z:z+patch_depth]
                patches.append(patch)
    patches = np.asarray(patches)
    return patches

def merge_patches(patches, volume_size, overlap_size):
    """
    Merge the cropped patches into a complete 3D volume.
    Args:
        patches (np.ndarray): the cropped patches, with shape [num_patches, patch_width, patch_height, patch_depth]
        volume_size (tuple or list): the size of the complete volume, with format [width, height, depth]
        overlap_size (tuple or list): the size of overlap between adjacent patches, with format [overlap_width, overlap_height, overlap_depth]
    Returns:
        np.ndarray: the merged volume, with shape [width, height, depth]
    """
    width, height, depth = volume_size
    patch_width, patch_height, patch_depth = patches.shape[1:]
    overlap_width, overlap_height, overlap_depth = overlap_size
    num_patches_x = (width - patch_width) // (patch_width - overlap_width) + 1
    num_patches_y = (height - patch_height) // (patch_height - overlap_height) + 1
    num_patches_z = (depth - patch_depth) // (patch_depth - overlap_depth) + 1
    print('merge:', num_patches_x, num_patches_y, num_patches_z)
    merged_volume = np.zeros(volume_size)
    weight_volume = np.zeros(volume_size)   # weight_volume的目的是用于记录每个像素在裁剪过程中被遍历的次数，最后用于求平均值
    idx = 0
    for z in range(num_patches_z):
        for y in range(num_patches_y):
            for x in range(num_patches_x):
                x_start = x * (patch_width - overlap_width)
                y_start = y * (patch_height - overlap_height)
                z_start = z * (patch_depth - overlap_depth)
                merged_volume[x_start:x_start+patch_width, y_start:y_start+patch_height, z_start:z_start+patch_depth] += patches[idx]
                weight_volume[x_start:x_start+patch_width, y_start:y_start+patch_height, z_start:z_start+patch_depth] += 1
                idx += 1
    merged_volume /= weight_volume
    return merged_volume


import os
import itk

nii_path = os.path.join(r'F:\tmp\results2', 'brain.nii.gz')
imgs = itk.array_from_image(itk.imread(nii_path))
imgs /= np.max(imgs)
volume_size = imgs.shape
patch_size = [96, 96, 96]
overlap_size = [32, 32, 32]
print('img shape:', imgs.shape)

# crop
patches = crop_volume(imgs, patch_size, overlap_size)
print('patches shape:', patches.shape)

print(patches.shape)
for d in range(0, patches.shape[0], 1):
    one_patch = patches[d, :, :, :]*255
    print(d, one_patch.shape)
    width_p, height_p, depth_p = one_patch.shape

    one_patch = itk.image_from_array(one_patch)

    itk.imwrite(one_patch, os.path.join(r'F:\tmp\results2\patch', str(d) + ".nii.gz"))  # 保存nii文件

# merge
merged_volume = merge_patches(patches, volume_size, overlap_size)
print('merged_volume shape:', merged_volume.shape)
merged_volume = itk.image_from_array(merged_volume)
itk.imwrite(merged_volume, r'F:\tmp\results2\merged_volume.nii.gz')