  • Preface
  • 1. Convolution Layers
  • 2. Pooling Layers
  • 3. Padding Layers
1. Convolution Layers


nn.Conv1d – Applies a 1D convolution over an input signal composed of several input planes.
nn.Conv2d – Applies a 2D convolution over an input signal composed of several input planes.
nn.Conv3d – Applies a 3D convolution over an input signal composed of several input planes.
nn.ConvTranspose1d – Applies a 1D transposed convolution operator over an input image composed of several input planes.
nn.ConvTranspose2d – Applies a 2D transposed convolution operator over an input image composed of several input planes.
nn.ConvTranspose3d – Applies a 3D transposed convolution operator over an input image composed of several input planes.

torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', device=None, dtype=None)


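  • in_channels (int) – number of channels in the input
  • out_channels (int) – number of channels produced by the convolution
  • kernel_size (int or tuple) – size of the convolution kernel
  • stride (int or tuple, optional) – stride of the convolution (default: 1)
  • padding (int, tuple or str, optional) – padding added to all four sides of the input; either a number of pixels (int or tuple) or a mode string, "same" or "valid" (default: 0)
  • padding_mode (str, optional) – how the padded values are filled: 'zeros', 'reflect', 'replicate' or 'circular' (default: 'zeros')
  • dilation (int or tuple, optional) – spacing between kernel elements, i.e. the hole size of a dilated (atrous) convolution (default: 1)
  • groups (int, optional) – number of groups for grouped convolution; in_channels and out_channels must both be divisible by it (default: 1)
  • bias (bool, optional) – whether to add a learnable bias to the output (default: True)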
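
To make the parameters above concrete, here is a minimal sketch that builds three nn.Conv2d layers and inspects the resulting shapes. The tensor sizes and variable names are arbitrary choices for illustration, and padding="same" assumes a reasonably recent PyTorch release (string padding is not available in old versions).

```python
import torch
from torch import nn

# Dummy batch: N=8 images, C_in=16 channels, 32x32 pixels.
x = torch.randn(8, 16, 32, 32)

# int arguments apply to both height and width.
conv_basic = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3, stride=2, padding=1)
y_basic = conv_basic(x)

# A tuple sets height and width separately; padding="same" keeps H and W
# unchanged and only works with stride=1.
conv_same = nn.Conv2d(16, 32, kernel_size=(3, 5), padding="same")
y_same = conv_same(x)

# groups=4 splits the 16 input channels into 4 groups of 4, so each filter
# only sees 4 channels; bias=False drops the additive bias term.
conv_grouped = nn.Conv2d(16, 32, kernel_size=3, padding=1, groups=4, bias=False)
y_grouped = conv_grouped(x)

print(y_basic.shape)              # torch.Size([8, 32, 16, 16])
print(y_same.shape)               # torch.Size([8, 32, 32, 32])
print(conv_grouped.weight.shape)  # torch.Size([32, 4, 3, 3])
```

The grouped layer's weight has only 4 input channels per filter, which is where grouped convolution saves parameters compared with groups=1.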


  • Input shape: $(N, C_{in}, H_{in}, W_{in})$
  • Output shape: $(N, C_{out}, H_{out}, W_{out})$
  • Relation between the two:
    $$H_{out}=\left\lfloor\frac{H_{in}+2\times\text{padding}[0]-\text{dilation}[0]\times(\text{kernel\_size}[0]-1)-1}{\text{stride}[0]}+1\right\rfloor$$
    $$W_{out}=\left\lfloor\frac{W_{in}+2\times\text{padding}[1]-\text{dilation}[1]\times(\text{kernel\_size}[1]-1)-1}{\text{stride}[1]}+1\right\rfloor$$

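kernel_size may be passed as an int or a tuple. With an int, kernel_size[0] and kernel_size[1] in the formula both take that value; stride, padding and dilation behave the same way.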
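
The output-size formula and the int-or-tuple rule can be checked by hand with a small helper. This is only a sketch: conv2d_output_size and _pair are hypothetical names introduced here, not part of PyTorch, and the helper assumes integer padding (not the "same"/"valid" strings).

```python
import math

def _pair(value):
    """Expand an int to (value, value); pass a 2-tuple through unchanged."""
    return (value, value) if isinstance(value, int) else tuple(value)

def conv2d_output_size(h_in, w_in, kernel_size, stride=1, padding=0, dilation=1):
    """Return (H_out, W_out) from the floor formula (integer padding only)."""
    k, s, p, d = _pair(kernel_size), _pair(stride), _pair(padding), _pair(dilation)

    def one_dim(size, i):
        # Index 0 is the height dimension, index 1 the width dimension.
        return math.floor((size + 2 * p[i] - d[i] * (k[i] - 1) - 1) / s[i] + 1)

    return one_dim(h_in, 0), one_dim(w_in, 1)

print(conv2d_output_size(32, 32, kernel_size=3, stride=2, padding=1))  # (16, 16)
print(conv2d_output_size(28, 28, kernel_size=(3, 5), dilation=2))      # (24, 20)
```

In the second call the dilated 3x5 kernel covers an effective 5x9 window, which is why the width shrinks more than the height.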

2. Pooling Layers


nn.MaxPool2d – Applies a 2D max pooling over an input signal composed of several input planes.
nn.MaxUnpool2d – Computes a partial inverse of MaxPool2d.
nn.AvgPool2d – Applies a 2D average pooling over an input signal composed of several input planes.
nn.FractionalMaxPool2d – Applies a 2D fractional max pooling over an input signal composed of several input planes.
nn.LPPool2d – Applies a 2D power-average pooling over an input signal composed of several input planes.
nn.AdaptiveMaxPool2d – Applies a 2D adaptive max pooling over an input signal composed of several input planes.
nn.AdaptiveAvgPool2d – Applies a 2D adaptive average pooling over an input signal composed of several input planes.


torch.nn.MaxPool2d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)


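  • kernel_size (Union[int, Tuple[int, int]]) – size of the pooling window
  • stride (Union[int, Tuple[int, int]]) – stride of the pooling window (default: kernel_size)
  • padding (Union[int, Tuple[int, int]]) – amount of padding added on both sides
  • dilation (Union[int, Tuple[int, int]]) – spacing between the elements of the window
  • return_indices (bool) – whether to also return the positions of the maxima
  • ceil_mode (bool) – when the window does not divide the input evenly, whether the output size is rounded up (True) or down (False); rounds down by default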
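
As an illustration of ceil_mode and return_indices, the sketch below pools a 7x7 input with a 2x2 window. The input size and variable names are arbitrary choices made for this example.

```python
import torch
from torch import nn

x = torch.randn(1, 3, 7, 7)  # 7 is not divisible by the 2x2 window

# stride defaults to kernel_size, so the windows do not overlap.
pool_floor = nn.MaxPool2d(kernel_size=2)
pool_ceil = nn.MaxPool2d(kernel_size=2, ceil_mode=True)

# return_indices=True returns a (values, indices) pair; the indices are the
# flattened positions of each maximum within its H_in x W_in plane.
pool_idx = nn.MaxPool2d(kernel_size=2, return_indices=True)
values, indices = pool_idx(x)

print(pool_floor(x).shape)  # torch.Size([1, 3, 3, 3])
print(pool_ceil(x).shape)   # torch.Size([1, 3, 4, 4])
print(indices.shape)        # torch.Size([1, 3, 3, 3])
```

The indices are what nn.MaxUnpool2d expects as its second argument when computing the partial inverse.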


  • Input shape: $(N, C, H_{in}, W_{in})$
  • Output shape: $(N, C, H_{out}, W_{out})$; pooling acts on each channel separately, so the channel count does not change
  • Relation between the two:
    $$H_{out}=\left\lfloor\frac{H_{in}+2\times\text{padding}[0]-\text{dilation}[0]\times(\text{kernel\_size}[0]-1)-1}{\text{stride}[0]}+1\right\rfloor$$
    $$W_{out}=\left\lfloor\frac{W_{in}+2\times\text{padding}[1]-\text{dilation}[1]\times(\text{kernel\_size}[1]-1)-1}{\text{stride}[1]}+1\right\rfloor$$
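
For pooling, the same formula can be evaluated per dimension, with the floor replaced by a ceiling when ceil_mode is on. The helper below is a hypothetical sketch (pool_output_size is not a PyTorch function); the extra correction for ceil_mode reflects my understanding of PyTorch's behaviour and should be treated as an assumption.

```python
import math

def pool_output_size(size_in, kernel_size, stride=None, padding=0, dilation=1, ceil_mode=False):
    """Output length along one spatial dimension of a max-pooling layer."""
    stride = kernel_size if stride is None else stride  # MaxPool2d default
    raw = (size_in + 2 * padding - dilation * (kernel_size - 1) - 1) / stride + 1
    size_out = math.ceil(raw) if ceil_mode else math.floor(raw)
    # Assumed PyTorch behaviour: with ceil_mode, drop the last window if it
    # would start entirely inside the right/bottom padding.
    if ceil_mode and (size_out - 1) * stride >= size_in + padding:
        size_out -= 1
    return size_out

print(pool_output_size(7, 2))                             # 3
print(pool_output_size(7, 2, ceil_mode=True))             # 4
print(pool_output_size(5, 2, padding=1, ceil_mode=True))  # 3
```

The last call shows the correction at work: the raw ceiling would be 4, but the fourth window would lie wholly in the padding, so the result is 3.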

3. Padding Layers


nn.ReflectionPad2d – Pads the input tensor using the reflection of the input boundary.
nn.ReplicationPad2d – Pads the input tensor using replication of the input boundary.
nn.ZeroPad2d – Pads the input tensor boundaries with zero.
nn.ConstantPad2d – Pads the input tensor boundaries with a constant value.
nn.CircularPad2d – Pads the input tensor using circular padding of the input boundary.




  • padding (int, tuple) – the size of the padding. If an int, the same padding is used on all four boundaries. If a 4-tuple, it is read as (padding_left, padding_right, padding_top, padding_bottom)


>>> import torch
>>> from torch import nn
>>> m = nn.ZeroPad2d(2)
>>> input = torch.randn(1, 1, 3, 3)
>>> input
tensor([[[[-0.1678, -0.4418,  1.9466],
          [ 0.9604, -0.4219, -0.5241],
          [-0.9162, -0.5436, -0.6446]]]])
>>> m(input)
tensor([[[[ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000,  0.0000,  0.0000],
          [ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000,  0.0000,  0.0000],
          [ 0.0000,  0.0000, -0.1678, -0.4418,  1.9466,  0.0000,  0.0000],
          [ 0.0000,  0.0000,  0.9604, -0.4219, -0.5241,  0.0000,  0.0000],
          [ 0.0000,  0.0000, -0.9162, -0.5436, -0.6446,  0.0000,  0.0000],
          [ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000,  0.0000,  0.0000],
          [ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000,  0.0000,  0.0000]]]])
>>> # using different paddings for different sides
>>> m = nn.ZeroPad2d((1, 1, 2, 0))
>>> m(input)
tensor([[[[ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000],
          [ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000],
          [ 0.0000, -0.1678, -0.4418,  1.9466,  0.0000],
          [ 0.0000,  0.9604, -0.4219, -0.5241,  0.0000],
          [ 0.0000, -0.9162, -0.5436, -0.6446,  0.0000]]]])
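
The other padding layers listed above take the same padding argument and differ only in what they write into the border. A small sketch on a 3x3 tensor with values 0 to 8 makes the difference visible; the input and the fill value 9.0 are arbitrary choices for illustration.

```python
import torch
from torch import nn

# 3x3 plane holding 0..8 so the source of each padded value is easy to trace.
x = torch.arange(9, dtype=torch.float32).reshape(1, 1, 3, 3)

reflect = nn.ReflectionPad2d(1)(x)      # mirrors around the edge, edge value not repeated
replicate = nn.ReplicationPad2d(1)(x)   # repeats the edge value
constant = nn.ConstantPad2d((1, 1, 2, 0), 9.0)(x)  # (left, right, top, bottom), filled with 9.0

print(reflect[0, 0, 0].tolist())    # [4.0, 3.0, 4.0, 5.0, 4.0]
print(replicate[0, 0, 0].tolist())  # [0.0, 0.0, 1.0, 2.0, 2.0]
print(constant.shape)               # torch.Size([1, 1, 5, 5])
```

The first output row of the reflected tensor comes from the second input row (3, 4, 5), whereas replication copies the first input row (0, 1, 2).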







