SViT Experiment Notes


Contents

I. Building the Network

1. Conv Stem

2. Per-Stage Modules

3. 3×3 Convolution

II. Forward Pass

1. Stem

2. The Basic Per-Stage Module: the STT Block

1) The CPE Module

2) The STA Module

Network Structure


I. Building the Network

(Figure: the overall architecture diagram from the paper)

(Figure: the basic building block)

1. Conv Stem

(patch_embed): PatchEmbed(
    (proj): Sequential(
      (0): Conv2d(3, 48, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
      (1): GELU()
      (2): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (3): Conv2d(48, 48, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (4): GELU()
      (5): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (6): Conv2d(48, 96, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
      (7): GELU()
      (8): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (9): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (10): GELU()
      (11): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (pos_drop): Dropout(p=0.0, inplace=False)

2. Per-Stage Modules

ModuleList  >>  BasicLayer  >>  StokenAttentionLayer

In the source code, the basic building block of each stage is this StokenAttentionLayer.

The components in the diagram map to the code as follows (a sketch of the block's forward pass follows this listing):

CPE >> ResDWC

ResDWC(
            (conv): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=96)
          )

LN   >>   LayerNorm2d

STA >>  StokenAttention

StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(96, 288, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(96, 96, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )

BN >> BatchNorm2d

ConvFFN >> Mlp

Mlp(
            (fc1): Conv2d(96, 384, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(384, 96, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384)
            )
          )
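
Assembled, the block follows a pre-norm residual pattern. Below is a minimal sketch of StokenAttentionLayer.forward, pieced together from the printed submodules and the two residual updates quoted in Section II; minor details may differ from the source.

def forward(self, x):
    x = self.pos_embed(x)                             # CPE: ResDWC, i.e. x + dwconv(x)
    x = x + self.drop_path(self.attn(self.norm1(x)))  # STA branch: LayerNorm2d + StokenAttention
    x = x + self.drop_path(self.mlp2(self.norm2(x)))  # ConvFFN branch: BatchNorm2d + Mlp
    return x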

3. 3×3 Convolution

PatchMerging(
        (proj): Sequential(
          (0): Conv2d(96, 192, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
          (1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )

II. Forward Pass

Use a random input:

input_try = torch.rand(1, 3, 512, 512)

The model is SViT-S.

1. Stem

The stem consists of four consecutive Conv2d-GELU-BN groups; no positional encoding is applied at this point. The output tensor has shape

x: (1, 64, 128, 128)

It is followed by a Dropout layer whose drop rate is set by the args.drop argument.
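
As a sanity check, the stem can be rebuilt from the printout alone. A minimal sketch follows (channel widths 32 → 32 → 64 → 64 follow the SViT-S dump at the end of these notes; the trailing pos_drop is omitted).

import torch
import torch.nn as nn

stem = nn.Sequential(
    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.GELU(), nn.BatchNorm2d(32),
    nn.Conv2d(32, 32, 3, stride=1, padding=1), nn.GELU(), nn.BatchNorm2d(32),
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.GELU(), nn.BatchNorm2d(64),
    nn.Conv2d(64, 64, 3, stride=1, padding=1), nn.GELU(), nn.BatchNorm2d(64),
)

input_try = torch.rand(1, 3, 512, 512)
print(stem(input_try).shape)  # torch.Size([1, 64, 128, 128]): two stride-2 convs give a 4x downsample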

2. The Basic Per-Stage Module: the STT Block

1) The CPE Module

import torch
import torch.nn as nn
import torch.nn.functional as F

class ResDWC(nn.Module):
    def __init__(self, dim, kernel_size=3):
        super().__init__()
        
        self.dim = dim
        self.kernel_size = kernel_size
        
        self.conv = nn.Conv2d(dim, dim, kernel_size, 1, kernel_size//2, groups=dim)
                
        self.shortcut = nn.Parameter(torch.eye(kernel_size).reshape(1, 1, kernel_size, kernel_size))
        self.shortcut.requires_grad = False
        
    def forward(self, x):
        return F.conv2d(x, self.conv.weight+self.shortcut, self.conv.bias, stride=1, padding=self.kernel_size//2, groups=self.dim) # equal to x + conv(x)

Its forward pass contains this computation:

F.conv2d(x, self.conv.weight+self.shortcut, self.conv.bias, ......

Here self.shortcut is a fixed, non-trainable kernel (requires_grad = False), so by the linearity of convolution this amounts to

(self.conv.weight + self.shortcut) * x + self.conv.bias → conv(x) + x

i.e. the residual connection x + conv(x) is fused into a single convolution, which corresponds to the CPE computation in the paper.
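
A quick numerical check of this fusion (a minimal sketch, not the repo's code): by linearity, conv2d(x, W + S, b) equals conv2d(x, W, b) plus the convolution of x with S alone. The check uses a delta kernel (1 at the centre, 0 elsewhere) as S, for which that extra term is exactly x; note the module above stores torch.eye(kernel_size) instead.

import torch
import torch.nn.functional as F

dim, k = 4, 3
x = torch.randn(1, dim, 8, 8)
W = torch.randn(dim, 1, k, k)   # depthwise weights (groups=dim)
b = torch.randn(dim)
S = torch.zeros(1, 1, k, k)
S[0, 0, k // 2, k // 2] = 1.0   # delta kernel: the identity under convolution

fused = F.conv2d(x, W + S, b, stride=1, padding=k // 2, groups=dim)
split = F.conv2d(x, W, b, stride=1, padding=k // 2, groups=dim) + x
print(torch.allclose(fused, split, atol=1e-5))  # True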

2) The STA Module

Execution flow:

x = x + self.drop_path(self.attn(self.norm1(x)))

Here self.norm1 is the LayerNorm; the main computation is carried out inside self.attn.

The computation of the super-token grid size in the paper corresponds to

hh, ww = H//h, W//w

Downsampling then yields the initial super tokens S0:

stoken_features = F.adaptive_avg_pool2d(x, (hh, ww))

Equation 5 in the paper corresponds to

stoken_features = self.unfold(stoken_features)  # (B, C*9, hh*ww) = (1, 576, 256): gather the 9 neighbouring super tokens for the association
stoken_features = stoken_features.transpose(1, 2).reshape(B, hh*ww, C, 9)  # (1, 256, 64, 9)
affinity_matrix = pixel_features @ stoken_features * self.scale  # (B, hh*ww, h*w, 9) = (1, 256, 64, 9)
affinity_matrix = affinity_matrix.softmax(-1)  # (B, hh*ww, h*w, 9) = (1, 256, 64, 9): the association map Q in the paper
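
pixel_features is built earlier in the source by grouping the pixels of x per grid cell. Below is a self-contained sketch of the association step under that assumption; the grouping reshape and the scale value are illustrative, and the shapes match the comments above.

import torch
import torch.nn as nn
import torch.nn.functional as F

B, C, H, W = 1, 64, 128, 128
h, w = 8, 8                         # super-token (stoken) size
hh, ww = H // h, W // w             # 16 x 16 grid -> 256 super tokens

x = torch.randn(B, C, H, W)
stoken_features = F.adaptive_avg_pool2d(x, (hh, ww))  # S0: (B, C, hh, ww)

# group pixels by the grid cell they fall in: (B, hh*ww, h*w, C)
pixel_features = x.reshape(B, C, hh, h, ww, w).permute(0, 2, 4, 3, 5, 1).reshape(B, hh*ww, h*w, C)

# compare each pixel with its own super token and the 8 neighbours
unfold = nn.Unfold(kernel_size=3, padding=1)
neighbours = unfold(stoken_features).transpose(1, 2).reshape(B, hh*ww, C, 9)  # (1, 256, 64, 9)

affinity_matrix = (pixel_features @ neighbours) * C ** -0.5  # (B, hh*ww, h*w, 9)
affinity_matrix = affinity_matrix.softmax(-1)                # association map, (1, 256, 64, 9)
print(affinity_matrix.shape)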

The column-normalization step in the paper:

if idx < self.n_iter - 1:  # column-normalization step
    stoken_features = pixel_features.transpose(-1, -2) @ affinity_matrix  # (B, hh*ww, C, 9)

    stoken_features = self.fold(stoken_features.permute(0, 2, 3, 1).reshape(B*C, 9, hh, ww)).reshape(B, C, hh, ww)

    # affinity_matrix_sum is the per-super-token sum of affinities, folded back onto the grid (computed just before this excerpt)
    stoken_features = stoken_features/(affinity_matrix_sum + 1e-12)  # (B, C, hh, ww)

Equation 6:

stoken_features = pixel_features.transpose(-1, -2) @ affinity_matrix

Equation 9:

stoken_features = self.stoken_refine(stoken_features)

The refinement above is implemented with MHSA (the stoken_refine Attention module printed earlier).
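
A single-head sketch of this refinement, consistent with the printed 1×1-conv layout (qkv projects C → 3C over the hh × ww super-token map; the actual module is multi-head, so head splitting is omitted here).

import torch
import torch.nn as nn

B, C, hh, ww = 1, 64, 16, 16
stoken_features = torch.randn(B, C, hh, ww)

qkv = nn.Conv2d(C, 3 * C, 1)   # matches the printed qkv: Conv2d(64, 192, kernel_size=(1, 1))
proj = nn.Conv2d(C, C, 1)

q, k, v = qkv(stoken_features).reshape(B, 3, C, hh * ww).unbind(1)  # each (B, C, N)
attn = (q.transpose(1, 2) @ k) * C ** -0.5                          # (B, N, N)
attn = attn.softmax(-1)
out = (v @ attn.transpose(1, 2)).reshape(B, C, hh, ww)              # weighted sum over tokens
out = proj(out)
print(out.shape)  # torch.Size([1, 64, 16, 16])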

Equation 11:

pixel_features = stoken_features @ affinity_matrix.transpose(-1, -2)

Finally, the ConvFFN residual branch is applied:

x = x + self.drop_path(self.mlp2(self.norm2(x)))
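
For reference, a plausible Mlp (ConvFFN) forward consistent with the printed submodules: the depthwise ResDWC operates on the hidden width (e.g. 256 for C = 64), so it sits between the activation and fc2. The ordering here is an inference from the printout, not verified against the source.

def forward(self, x):            # illustrative ordering
    x = self.fc1(x)              # 1x1 conv: C -> 4C
    x = self.act1(x)             # GELU
    x = self.drop(x)
    x = self.conv(x)             # depthwise ResDWC on the hidden width
    x = self.fc2(x)              # 1x1 conv: 4C -> C
    return self.drop(x)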

Network Structure

SViT-S

STViT(
  (patch_embed): PatchEmbed(
    (proj): Sequential(
      (0): Conv2d(3, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
      (1): GELU()
      (2): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (3): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (4): GELU()
      (5): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (6): Conv2d(32, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
      (7): GELU()
      (8): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (9): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (10): GELU()
      (11): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (pos_drop): Dropout(p=0.0, inplace=False)
  (layers): ModuleList(
    (0): BasicLayer(
      (blocks): ModuleList(
        (0): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=64)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((64,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(64, 192, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): Identity()
          (norm2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=256)
            )
          )
        )
        (1): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=64)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((64,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(64, 192, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.016)
          (norm2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=256)
            )
          )
        )
        (2): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=64)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((64,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(64, 192, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.032)
          (norm2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=256)
            )
          )
        )
      )
      (downsample): PatchMerging(
        (proj): Sequential(
          (0): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
          (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
    )
    (1): BasicLayer(
      (blocks): ModuleList(
        (0): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=128)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((128,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(128, 384, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.047)
          (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=512)
            )
          )
        )
        (1): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=128)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((128,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(128, 384, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.063)
          (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=512)
            )
          )
        )
        (2): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=128)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((128,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(128, 384, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.079)
          (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=512)
            )
          )
        )
        (3): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=128)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((128,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(128, 384, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.095)
          (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=512)
            )
          )
        )
        (4): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=128)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((128,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(128, 384, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.111)
          (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=512)
            )
          )
        )
      )
      (downsample): PatchMerging(
        (proj): Sequential(
          (0): Conv2d(128, 320, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
          (1): BatchNorm2d(320, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
    )
    (2): BasicLayer(
      (blocks): ModuleList(
        (0): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(320, 320, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=320)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((320,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(320, 960, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(320, 320, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.126)
          (norm2): BatchNorm2d(320, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(320, 1280, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(1280, 320, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(1280, 1280, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1280)
            )
          )
        )
        (1): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(320, 320, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=320)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((320,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(320, 960, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(320, 320, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.142)
          (norm2): BatchNorm2d(320, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(320, 1280, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(1280, 320, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(1280, 1280, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1280)
            )
          )
        )
        (2): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(320, 320, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=320)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((320,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(320, 960, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(320, 320, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.158)
          (norm2): BatchNorm2d(320, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(320, 1280, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(1280, 320, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(1280, 1280, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1280)
            )
          )
        )
        (3): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(320, 320, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=320)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((320,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(320, 960, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(320, 320, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.174)
          (norm2): BatchNorm2d(320, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(320, 1280, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(1280, 320, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(1280, 1280, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1280)
            )
          )
        )
        (4): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(320, 320, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=320)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((320,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(320, 960, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(320, 320, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.189)
          (norm2): BatchNorm2d(320, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(320, 1280, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(1280, 320, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(1280, 1280, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1280)
            )
          )
        )
        (5): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(320, 320, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=320)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((320,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(320, 960, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(320, 320, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.205)
          (norm2): BatchNorm2d(320, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(320, 1280, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(1280, 320, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(1280, 1280, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1280)
            )
          )
        )
        (6): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(320, 320, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=320)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((320,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(320, 960, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(320, 320, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.221)
          (norm2): BatchNorm2d(320, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(320, 1280, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(1280, 320, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(1280, 1280, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1280)
            )
          )
        )
        (7): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(320, 320, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=320)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((320,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(320, 960, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(320, 320, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.237)
          (norm2): BatchNorm2d(320, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(320, 1280, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(1280, 320, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(1280, 1280, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1280)
            )
          )
        )
        (8): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(320, 320, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=320)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((320,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(320, 960, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(320, 320, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.253)
          (norm2): BatchNorm2d(320, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(320, 1280, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(1280, 320, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(1280, 1280, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1280)
            )
          )
        )
      )
      (downsample): PatchMerging(
        (proj): Sequential(
          (0): Conv2d(320, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
          (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
    )
    (3): BasicLayer(
      (blocks): ModuleList(
        (0): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=512)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((512,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(512, 1536, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(512, 512, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.268)
          (norm2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(2048, 2048, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=2048)
            )
          )
        )
        (1): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=512)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((512,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(512, 1536, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(512, 512, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.284)
          (norm2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(2048, 2048, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=2048)
            )
          )
        )
        (2): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=512)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((512,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(512, 1536, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(512, 512, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.300)
          (norm2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(2048, 2048, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=2048)
            )
          )
        )
      )
    )
  )
  (proj): Conv2d(512, 1024, kernel_size=(1, 1), stride=(1, 1))
  (norm): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (swish): MemoryEfficientSwish()
  (avgpool): AdaptiveAvgPool2d(output_size=1)
  (head): Linear(in_features=1024, out_features=1000, bias=True)
)

=======================================================================

Below is the SViT-L model, the larger variant.

STViT(
  (patch_embed): PatchEmbed(
    (proj): Sequential(
      (0): Conv2d(3, 48, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
      (1): GELU()
      (2): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (3): Conv2d(48, 48, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (4): GELU()
      (5): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (6): Conv2d(48, 96, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
      (7): GELU()
      (8): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (9): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (10): GELU()
      (11): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (pos_drop): Dropout(p=0.0, inplace=False)
  (layers): ModuleList(
    (0): BasicLayer(
      (blocks): ModuleList(
        (0): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=96)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((96,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(96, 288, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(96, 96, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): Identity()
          (norm2): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(96, 384, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(384, 96, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384)
            )
          )
        )
        (1): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=96)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((96,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(96, 288, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(96, 96, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.003)
          (norm2): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(96, 384, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(384, 96, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384)
            )
          )
        )
        (2): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=96)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((96,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(96, 288, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(96, 96, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.005)
          (norm2): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(96, 384, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(384, 96, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384)
            )
          )
        )
        (3): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=96)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((96,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(96, 288, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(96, 96, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.008)
          (norm2): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(96, 384, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(384, 96, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384)
            )
          )
        )
      )
      (downsample): PatchMerging(
        (proj): Sequential(
          (0): Conv2d(96, 192, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
          (1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
    )
    (1): BasicLayer(
      (blocks): ModuleList(
        (0): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=192)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((192,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(192, 576, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.011)
          (norm2): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(192, 768, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(768, 192, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(768, 768, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=768)
            )
          )
        )
        (1): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=192)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((192,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(192, 576, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.014)
          (norm2): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(192, 768, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(768, 192, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(768, 768, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=768)
            )
          )
        )
        (2): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=192)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((192,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(192, 576, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.016)
          (norm2): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(192, 768, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(768, 192, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(768, 768, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=768)
            )
          )
        )
        (3): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=192)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((192,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(192, 576, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.019)
          (norm2): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(192, 768, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(768, 192, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(768, 768, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=768)
            )
          )
        )
        (4): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=192)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((192,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(192, 576, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.022)
          (norm2): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(192, 768, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(768, 192, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(768, 768, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=768)
            )
          )
        )
        (5): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=192)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((192,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(192, 576, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.024)
          (norm2): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(192, 768, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(768, 192, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(768, 768, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=768)
            )
          )
        )
        (6): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=192)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((192,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(192, 576, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.027)
          (norm2): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(192, 768, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(768, 192, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(768, 768, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=768)
            )
          )
        )
      )
      (downsample): PatchMerging(
        (proj): Sequential(
          (0): Conv2d(192, 448, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
          (1): BatchNorm2d(448, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
    )
    (2): BasicLayer(
      (blocks): ModuleList(
        (0): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(448, 448, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=448)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((448,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(448, 1344, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(448, 448, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.030)
          (norm2): BatchNorm2d(448, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(448, 1792, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(1792, 448, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(1792, 1792, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1792)
            )
          )
        )
        (1): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(448, 448, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=448)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((448,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(448, 1344, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(448, 448, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.032)
          (norm2): BatchNorm2d(448, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(448, 1792, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(1792, 448, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(1792, 1792, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1792)
            )
          )
        )
        (2): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(448, 448, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=448)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((448,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(448, 1344, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(448, 448, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.035)
          (norm2): BatchNorm2d(448, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(448, 1792, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(1792, 448, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(1792, 1792, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1792)
            )
          )
        )
        (3): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(448, 448, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=448)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((448,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(448, 1344, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(448, 448, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.038)
          (norm2): BatchNorm2d(448, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(448, 1792, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(1792, 448, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(1792, 1792, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1792)
            )
          )
        )
        (4): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(448, 448, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=448)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((448,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(448, 1344, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(448, 448, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.041)
          (norm2): BatchNorm2d(448, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(448, 1792, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(1792, 448, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(1792, 1792, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1792)
            )
          )
        )
        (5): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(448, 448, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=448)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((448,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(448, 1344, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(448, 448, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.043)
          (norm2): BatchNorm2d(448, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(448, 1792, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(1792, 448, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(1792, 1792, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1792)
            )
          )
        )
        (6): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(448, 448, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=448)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((448,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(448, 1344, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(448, 448, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.046)
          (norm2): BatchNorm2d(448, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(448, 1792, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(1792, 448, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(1792, 1792, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1792)
            )
          )
        )
        (7): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(448, 448, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=448)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((448,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(448, 1344, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(448, 448, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.049)
          (norm2): BatchNorm2d(448, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(448, 1792, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(1792, 448, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(1792, 1792, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1792)
            )
          )
        )
        (8): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(448, 448, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=448)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((448,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(448, 1344, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(448, 448, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.051)
          (norm2): BatchNorm2d(448, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(448, 1792, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(1792, 448, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(1792, 1792, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1792)
            )
          )
        )
        (9): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(448, 448, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=448)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((448,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(448, 1344, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(448, 448, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.054)
          (norm2): BatchNorm2d(448, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(448, 1792, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(1792, 448, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(1792, 1792, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1792)
            )
          )
        )
        (10): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(448, 448, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=448)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((448,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(448, 1344, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(448, 448, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.057)
          (norm2): BatchNorm2d(448, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(448, 1792, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(1792, 448, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(1792, 1792, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1792)
            )
          )
        )
        (11): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(448, 448, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=448)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((448,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(448, 1344, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(448, 448, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.059)
          (norm2): BatchNorm2d(448, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(448, 1792, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(1792, 448, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(1792, 1792, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1792)
            )
          )
        )
        (12): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(448, 448, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=448)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((448,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(448, 1344, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(448, 448, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.062)
          (norm2): BatchNorm2d(448, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(448, 1792, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(1792, 448, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(1792, 1792, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1792)
            )
          )
        )
        (13): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(448, 448, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=448)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((448,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(448, 1344, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(448, 448, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.065)
          (norm2): BatchNorm2d(448, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(448, 1792, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(1792, 448, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(1792, 1792, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1792)
            )
          )
        )
        (14): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(448, 448, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=448)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((448,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(448, 1344, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(448, 448, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.068)
          (norm2): BatchNorm2d(448, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(448, 1792, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(1792, 448, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(1792, 1792, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1792)
            )
          )
        )
        (15): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(448, 448, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=448)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((448,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(448, 1344, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(448, 448, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.070)
          (norm2): BatchNorm2d(448, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(448, 1792, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(1792, 448, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(1792, 1792, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1792)
            )
          )
        )
        (16): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(448, 448, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=448)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((448,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(448, 1344, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(448, 448, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.073)
          (norm2): BatchNorm2d(448, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(448, 1792, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(1792, 448, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(1792, 1792, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1792)
            )
          )
        )
        (17): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(448, 448, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=448)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((448,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(448, 1344, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(448, 448, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.076)
          (norm2): BatchNorm2d(448, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(448, 1792, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(1792, 448, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(1792, 1792, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1792)
            )
          )
        )
        (18): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(448, 448, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=448)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((448,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(448, 1344, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(448, 448, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.078)
          (norm2): BatchNorm2d(448, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(448, 1792, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(1792, 448, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(1792, 1792, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1792)
            )
          )
        )
      )
      (downsample): PatchMerging(
        (proj): Sequential(
          (0): Conv2d(448, 640, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
          (1): BatchNorm2d(640, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
    )
    (3): BasicLayer(
      (blocks): ModuleList(
        (0): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(640, 640, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=640)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((640,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(640, 1920, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(640, 640, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.081)
          (norm2): BatchNorm2d(640, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(640, 2560, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(2560, 640, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(2560, 2560, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=2560)
            )
          )
        )
        (1): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(640, 640, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=640)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((640,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(640, 1920, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(640, 640, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.084)
          (norm2): BatchNorm2d(640, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(640, 2560, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(2560, 640, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(2560, 2560, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=2560)
            )
          )
        )
        (2): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(640, 640, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=640)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((640,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(640, 1920, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(640, 640, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.086)
          (norm2): BatchNorm2d(640, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(640, 2560, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(2560, 640, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(2560, 2560, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=2560)
            )
          )
        )
        (3): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(640, 640, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=640)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((640,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(640, 1920, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(640, 640, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.089)
          (norm2): BatchNorm2d(640, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(640, 2560, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(2560, 640, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(2560, 2560, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=2560)
            )
          )
        )
        (4): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(640, 640, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=640)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((640,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(640, 1920, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(640, 640, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.092)
          (norm2): BatchNorm2d(640, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(640, 2560, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(2560, 640, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(2560, 2560, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=2560)
            )
          )
        )
        (5): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(640, 640, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=640)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((640,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(640, 1920, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(640, 640, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.095)
          (norm2): BatchNorm2d(640, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(640, 2560, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(2560, 640, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(2560, 2560, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=2560)
            )
          )
        )
        (6): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(640, 640, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=640)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((640,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(640, 1920, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(640, 640, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.097)
          (norm2): BatchNorm2d(640, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(640, 2560, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(2560, 640, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(2560, 2560, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=2560)
            )
          )
        )
        (7): StokenAttentionLayer(
          (pos_embed): ResDWC(
            (conv): Conv2d(640, 640, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=640)
          )
          (norm1): LayerNorm2d(
            (norm): LayerNorm((640,), eps=1e-06, elementwise_affine=True)
          )
          (attn): StokenAttention(
            (unfold): Unfold()
            (fold): Fold()
            (stoken_refine): Attention(
              (qkv): Conv2d(640, 1920, kernel_size=(1, 1), stride=(1, 1))
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Conv2d(640, 640, kernel_size=(1, 1), stride=(1, 1))
              (proj_drop): Dropout(p=0.0, inplace=False)
            )
          )
          (drop_path): DropPath(drop_prob=0.100)
          (norm2): BatchNorm2d(640, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (mlp2): Mlp(
            (fc1): Conv2d(640, 2560, kernel_size=(1, 1), stride=(1, 1))
            (act1): GELU()
            (fc2): Conv2d(2560, 640, kernel_size=(1, 1), stride=(1, 1))
            (drop): Dropout(p=0.0, inplace=False)
            (conv): ResDWC(
              (conv): Conv2d(2560, 2560, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=2560)
            )
          )
        )
      )
    )
  )
  (proj): Conv2d(640, 1024, kernel_size=(1, 1), stride=(1, 1))
  (norm): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (swish): MemoryEfficientSwish()
  (avgpool): AdaptiveAvgPool2d(output_size=1)
  (head): Linear(in_features=1024, out_features=1000, bias=True)
)
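
The module tree above is simply the result of printing the instantiated SViT-S model. Below is a minimal sketch for reproducing the printout and sanity-checking the final output shape with the same random input used throughout this log. The import path and the builder name svit_s are assumptions made for illustration; adjust them to match the actual layout of the source repository.

import torch

# Assumption: the SViT-S builder is exposed as `svit_s` somewhere in the
# authors' code; the import below is illustrative, not confirmed.
from models import svit_s

model = svit_s(num_classes=1000)
model.eval()

print(model)  # produces the module tree shown above

# Same random input used earlier in this log
input_try = torch.rand(1, 3, 512, 512)
with torch.no_grad():
    out = model(input_try)
print(out.shape)  # expected: torch.Size([1, 1000]), matching head = Linear(1024, 1000)

Note how the tail of the printout matches the forward path after the last BasicLayer: for a 512x512 input the final feature map is 640 channels at 16x16 (stem stride 4, then three stride-2 PatchMerging steps). It is projected to 1024 channels by the 1x1 proj convolution, normalized by BatchNorm2d, activated with Swish, pooled to 1x1 by AdaptiveAvgPool2d, and classified by the Linear(1024, 1000) head.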
