PaddleVideo: Porting the SqueezeTime Algorithm


Reference:
PaddleVideo/docs/zh-CN/contribute/add_new_algorithm.md at develop · PaddlePaddle/PaddleVideo
https://github.com/PaddlePaddle/PaddleVideo/blob/develop/docs/zh-CN/contribute/add_new_algorithm.md

1: Add the backbone (I trimmed the network down a bit myself, to make it comparable with ppTSM-v2):

paddlevideo/modeling/backbones/squeezetime.py

from __future__ import absolute_import, division, print_function

import paddle
import paddle.nn as nn
import paddle.nn.functional as F

from ..registry import BACKBONES


def get_inplanes():
    return [64, 128, 256, 512]

class SpatialConv(nn.Layer):
    """
    Inter-temporal Object Interaction Module (IOI)
    """
    def __init__(self, dim_in, dim_out, pos_dim=7):
        super(SpatialConv, self).__init__()

        self.short_conv = nn.Conv2D(dim_in, dim_out, kernel_size=3, stride=1, padding=1, groups=1)

        self.glo_conv = nn.Sequential(
            nn.Conv2D(dim_in, 16, kernel_size=3, stride=1, padding=1, groups=1),
            nn.BatchNorm2D(16), nn.ReLU(),
            nn.Conv2D(16, 16, kernel_size=7, stride=1, padding=3),
            nn.BatchNorm2D(16), nn.ReLU(),
            nn.Conv2D(16, dim_out, kernel_size=3, stride=1, padding=1, groups=1), nn.Sigmoid()
        )

        self.pos_embed = self.create_parameter(shape=[1, 16, pos_dim, pos_dim], default_initializer=nn.initializer.KaimingNormal())

    def forward(self, x, param):
        x_short = self.short_conv(x)
        x = x * param

        for i in range(len(self.glo_conv)):
            if i == 3:
                _, _, H, W = x.shape
                if self.pos_embed.shape[2] != H or self.pos_embed.shape[3] != W:
                    pos_embed = F.interpolate(self.pos_embed, size=(H, W), mode='bilinear', align_corners=True)
                else:
                    pos_embed = self.pos_embed
                x = x + pos_embed

            x = self.glo_conv[i](x)

        return x_short * x

class Conv2d(nn.Layer):
    """
    Channel-Time Learning Module (CTL)
    """
    def __init__(
        self,
        in_channels: int,
        out_channels: int,
        kernel_size: int,
        stride: int = 1,
        padding: int = 0,
        dilation: int = 1,
        groups: int = 1,
        bias: bool = True,
        padding_mode: str = 'zeros',
        pos_dim: int = 7):
        super(Conv2d, self).__init__()

        self.stride = stride

        self.param_conv = nn.Sequential(
            nn.AdaptiveAvgPool2D((1, 1)),
            nn.Conv2D(in_channels, in_channels, 1, stride=1, padding=0, bias_attr=False),
            nn.BatchNorm2D(in_channels),
            nn.ReLU(),
            nn.Conv2D(in_channels, in_channels, 1, bias_attr=False),
            nn.Sigmoid()
        )

        self.temporal_conv = nn.Conv2D(
            in_channels=in_channels, 
            out_channels=out_channels, 
            kernel_size=kernel_size, 
            stride=1, 
            padding=padding, 
            dilation=dilation, 
            groups=groups, 
            bias_attr=bias, 
            padding_mode=padding_mode
        )

        self.spatial_conv = SpatialConv(dim_in=in_channels, dim_out=out_channels, pos_dim=pos_dim)

    def forward(self, x):
        param = self.param_conv(x)
        x = self.temporal_conv(param * x) + self.spatial_conv(x, param)
        return x

def conv3x3x3(in_planes, out_planes, stride=1, pos_dim=7):
    # Naming kept from the 3D-ResNet original; this actually builds a 1x1 CTL Conv2d.
    return Conv2d(in_planes, out_planes, kernel_size=1, stride=stride, padding=0, bias=False, pos_dim=pos_dim)

def conv1x1x1(in_planes, out_planes, stride=1):
    return nn.Conv2D(in_planes, out_planes, kernel_size=1, stride=stride, bias_attr=False)

class BasicBlock(nn.Layer):
    """
    Channel-Time Learning (CTL) Block
    """
    expansion = 1

    def __init__(self, in_planes, planes, stride=1, shortcut_conv=None, pos_dim=7):
        super().__init__()

        self.conv1 = conv3x3x3(in_planes, planes, stride)
        self.bn1 = nn.BatchNorm2D(planes)
        self.relu = nn.ReLU()

        self.conv2 = conv3x3x3(planes, planes, pos_dim=pos_dim)
        self.bn2 = nn.BatchNorm2D(planes)

        self.shortcut_conv = shortcut_conv

        self.stride = stride
        if stride != 1:
            self.downsample = nn.Sequential(
                nn.Conv2D(in_planes, in_planes, kernel_size=2, stride=2, groups=in_planes),
                nn.BatchNorm2D(in_planes)
            )

    def forward(self, x):
        if self.stride != 1:
            x = self.downsample(x)

        residual = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)

        if self.shortcut_conv is not None:
            residual = self.shortcut_conv(x)

        out += residual
        out = self.relu(out)

        return out

class Bottleneck(nn.Layer):
    """
    Channel-Time Learning (CTL) Block
    """
    expansion = 4

    def __init__(self, in_planes, planes, stride=1, shortcut_conv=None, pos_dim=7):
        super().__init__()

        self.conv1 = conv1x1x1(in_planes, planes)
        self.bn1 = nn.BatchNorm2D(planes)

        self.conv2 = conv3x3x3(planes, planes, pos_dim=pos_dim)
        self.bn2 = nn.BatchNorm2D(planes)

        self.conv3 = conv1x1x1(planes, planes * self.expansion)
        self.bn3 = nn.BatchNorm2D(planes * self.expansion)

        self.relu = nn.ReLU()

        self.shortcut_conv = shortcut_conv

        self.stride = stride

        if stride != 1:
            self.downsample = nn.Sequential(
                nn.Conv2D(in_planes, in_planes, kernel_size=2, stride=2, groups=in_planes),
                nn.BatchNorm2D(in_planes)
            )

    def forward(self, x):
        if self.stride != 1:
            x = self.downsample(x)

        residual = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)

        out = self.conv3(out)
        out = self.bn3(out)

        if self.shortcut_conv is not None:
            residual = self.shortcut_conv(x)

        out += residual
        out = self.relu(out)

        return out

class ResNet(nn.Layer):
    def __init__(self,
                 block,
                 layers,
                 block_inplanes,
                 n_input_channels=3,
                 no_max_pool=False,
                 shortcut_type='B',
                 widen_factor=1.0,
                 dropout=0.2, 
                 freeze_bn=False, 
                 spatial_stride=[1,2,2,2], 
                 pos_dim=[64,32,16,8]):
        super().__init__()

        self.freeze_bn = freeze_bn
        block_inplanes = [int(x * widen_factor) for x in block_inplanes]

        self.in_planes = block_inplanes[0]
        self.no_max_pool = no_max_pool
        self.dropout = dropout

        self.conv1 = nn.Conv2D(n_input_channels,
                               self.in_planes,
                               kernel_size=5,
                               stride=2,
                               padding=2,
                               groups=1,
                               bias_attr=False)

        self.bn1 = nn.BatchNorm2D(self.in_planes)
        self.relu = nn.ReLU()
        self.maxpool = nn.MaxPool2D(kernel_size=3, stride=2, padding=1)

        self.layer1 = self._make_layer(block, block_inplanes[0], layers[0],
                                       shortcut_type, stride=spatial_stride[0], pos_dim=pos_dim[0])

        self.layer2 = self._make_layer(block,
                                       block_inplanes[1],
                                       layers[1],
                                       shortcut_type,
                                       stride=spatial_stride[1], pos_dim=pos_dim[1])

        self.layer3 = self._make_layer(block,
                                       block_inplanes[2],
                                       layers[2],
                                       shortcut_type,
                                       stride=spatial_stride[2], pos_dim=pos_dim[2])

        self.layer4 = self._make_layer(block,
                                       block_inplanes[3],
                                       layers[3],
                                       shortcut_type,
                                       stride=spatial_stride[3], pos_dim=pos_dim[3])


    def _downsample_basic_block(self, x, planes, stride):
        out = F.avg_pool2d(x, kernel_size=1, stride=stride)
        # paddle.zeros allocates on the current global device, so no explicit
        # device transfer is needed (the original torch-style CUDA check was a no-op).
        zero_pads = paddle.zeros([out.shape[0], planes - out.shape[1], out.shape[2], out.shape[3]])
        out = paddle.concat([out, zero_pads], axis=1)

        return out

    def _make_layer(self, block, planes, blocks, shortcut_type, stride=1, pos_dim=7):
        shortcut = None
        if self.in_planes != planes * block.expansion:
            shortcut = nn.Sequential(
                conv1x1x1(self.in_planes, planes * block.expansion, stride=1),
                nn.BatchNorm2D(planes * block.expansion)
            )

        layers = []
        layers.append(
            block(in_planes=self.in_planes,
                  planes=planes,
                  stride=stride, shortcut_conv=shortcut, pos_dim=pos_dim)
        )

        self.in_planes = planes * block.expansion

        for i in range(1, blocks):
            layers.append(block(self.in_planes, planes, pos_dim=pos_dim))

        return nn.Sequential(*layers)

    def forward(self, x):
        if len(x.shape) == 3:
            x = paddle.unsqueeze(x, axis=0)
        N, C, H, W = x.shape
        # Fold the 16 sampled frames into the channel dimension:
        # [N*16, C, H, W] -> [N, 16*C, H, W], matching n_input_channels=48.
        x = x.reshape([N // 16, -1, H, W])

        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)

        if not self.no_max_pool:
            x = self.maxpool(x)

        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)

        return x

    def train(self):
        # Note: paddle.nn.Layer.train() takes no `mode` argument, unlike torch.
        super(ResNet, self).train()

        if self.freeze_bn:
            print("Freezing Mean/Var and Weight/Bias of BatchNorm2D.")
            for m in self.sublayers():
                if isinstance(m, nn.BatchNorm2D):
                    m.eval()
                    m.weight.stop_gradient = True
                    m.bias.stop_gradient = True

def SqueezeTime_model(**kwargs):
    model = ResNet(Bottleneck, [2, 2, 2, 2], get_inplanes(), **kwargs)
    return model


@BACKBONES.register()
def SqueezeTime(pretrained=None, use_ssld=False, **kwargs):
    """
    Build the SqueezeTime model.

    `pretrained` and `use_ssld` are accepted for interface parity with other
    PaddleVideo backbones but are ignored here (no pretrained weights are loaded).
    """
    model = SqueezeTime_model(widen_factor=0.5,
                              dropout=0.5,
                              n_input_channels=48,
                              freeze_bn=False,
                              spatial_stride=[1, 2, 2, 2],
                              pos_dim=[64, 32, 16, 8])
    return model
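
A quick sanity check for the port is to push one dummy 16-frame clip through the backbone and confirm the output matches the head's in_channels. A minimal sketch, assuming the PaddleVideo repo root is on PYTHONPATH:

import paddle

from paddlevideo.modeling.backbones.squeezetime import SqueezeTime

model = SqueezeTime()
model.eval()

# The pipeline samples 16 frames per clip, delivered as [N*16, 3, 224, 224].
x = paddle.randn([16, 3, 224, 224])  # one clip of 16 RGB frames
with paddle.no_grad():
    feat = model(x)  # frames are folded to [1, 48, 224, 224] internally
print(feat.shape)  # [1, 1024, 7, 7] -> matches head in_channels=1024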

2: Register the backbone:

paddlevideo/modeling/backbones/__init__.py

# Copyright (c) 2020  PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from .actbert import BertForMultiModalPreTraining
from .adds import ADDS_DepthNet
from .agcn import AGCN
from .asrf import ASRF
from .bmn import BMN
from .cfbi import CFBI
from .movinet import MoViNet
from .ms_tcn import MSTCN
from .resnet import ResNet
from .resnet_slowfast import ResNetSlowFast
from .resnet_slowfast_MRI import ResNetSlowFast_MRI
from .resnet_tsm import ResNetTSM
from .resnet_tsm_MRI import ResNetTSM_MRI
from .resnet_tsn_MRI import ResNetTSN_MRI
from .resnet_tweaks_tsm import ResNetTweaksTSM
from .resnet_tweaks_tsn import ResNetTweaksTSN
from .stgcn import STGCN
from .swin_transformer import SwinTransformer3D
from .transnetv2 import TransNetV2
from .vit import VisionTransformer
from .vit_tweaks import VisionTransformer_tweaks
from .ctrgcn import CTRGCN
from .agcn2s import AGCN2s
from .resnet3d_slowonly import ResNet3dSlowOnly
from .toshift_vit import TokenShiftVisionTransformer
from .pptsm_mv2 import PPTSM_MobileNetV2
from .pptsm_mv3 import PPTSM_MobileNetV3
from .pptsm_v2 import PPTSM_v2
from .yowo import YOWO
from .squeezetime import SqueezeTime

__all__ = [
    'ResNet', 'ResNetTSM', 'ResNetTweaksTSM', 'ResNetSlowFast', 'BMN',
    'ResNetTweaksTSN', 'VisionTransformer', 'STGCN', 'AGCN', 'TransNetV2',
    'ADDS_DepthNet', 'VisionTransformer_tweaks', 'BertForMultiModalPreTraining',
    'ResNetTSN_MRI', 'ResNetTSM_MRI', 'ResNetSlowFast_MRI', 'CFBI', 'MSTCN',
    'ASRF', 'MoViNet', 'SwinTransformer3D', 'CTRGCN',
    'TokenShiftVisionTransformer', 'AGCN2s', 'PPTSM_MobileNetV2',
    'PPTSM_MobileNetV3', 'PPTSM_v2', 'ResNet3dSlowOnly', 'YOWO', 'SqueezeTime'
]
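
Once registered, the backbone can be built from its config name through the registry; this is how the framework instantiates it from the yaml in step 5. A small sketch, assuming PaddleVideo's stock builder helpers in paddlevideo/modeling/builder.py:

from paddlevideo.modeling.builder import build_backbone

cfg = dict(name='SqueezeTime')  # mirrors MODEL.backbone.name in the yaml
backbone = build_backbone(cfg)  # looks the name up in BACKBONES and calls it
print(type(backbone))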

3: Add the head:

paddlevideo/modeling/heads/i2d_head.py

# Copyright (c) 2020  PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import paddle
import paddle.nn as nn
from paddle import ParamAttr

from ..registry import HEADS
from ..weight_init import weight_init_
from .base import BaseHead


@HEADS.register()
class I2DHead(BaseHead):
    """Classification head for I2D.

    Args:
        num_classes (int): Number of classes to be classified.
        in_channels (int): Number of channels in input feature.
        loss_cfg (dict): Config for building loss.
            Default: dict(name='CrossEntropyLoss').
        spatial_type (str): Pooling type in spatial dimension. Default: 'avg'.
        drop_ratio (float): Probability of dropout layer. Default: 0.5.
        std (float): Std value for weight initialization. Default: 0.01.
        kwargs (dict, optional): Any keyword argument to be used to initialize
            the head.
    """
    def __init__(self,
                 num_classes,
                 in_channels,
                 loss_cfg=dict(name='CrossEntropyLoss'),
                 spatial_type='avg',
                 drop_ratio=0.5,
                 std=0.01,
                 **kwargs):

        super().__init__(num_classes, in_channels, loss_cfg, **kwargs)

        self.spatial_type = spatial_type
        self.dropout_ratio = drop_ratio
        self.init_std = std
                     
        if self.dropout_ratio != 0:
            self.dropout = nn.Dropout(p=self.dropout_ratio)
        else:
            self.dropout = None
            
        self.fc_cls = nn.Linear(self.in_channels, self.num_classes)

        if self.spatial_type == 'avg':
            self.avg_pool = nn.AdaptiveAvgPool2D((1, 1))
        else:
            self.avg_pool = nn.AdaptiveMaxPool2D((1, 1))


    def forward(self, x, num_segs=None):
        """Defines the computation performed at every call.

        Args:
            x (Tensor): The input data.

        Returns:
            Tensor: The classification scores for input samples.
        """
        # [N, in_channels, H, W]
        if self.avg_pool is not None:
            x = self.avg_pool(x)

        # [N, in_channels, 1, 1]
        if self.dropout is not None:
            x = self.dropout(x)

        # [N, in_channels]
        x = paddle.reshape(x, [x.shape[0], -1])

        cls_score = self.fc_cls(x)

        # [N, num_classes]
        return cls_score


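As with the backbone, the head can be smoke-tested on a feature map shaped like the backbone output (a minimal sketch; num_classes=2 and in_channels=1024 mirror the yaml in step 5):

import paddle

from paddlevideo.modeling.heads.i2d_head import I2DHead

head = I2DHead(num_classes=2, in_channels=1024)
head.eval()

feat = paddle.randn([4, 1024, 7, 7])  # backbone output for a batch of 4 clips
with paddle.no_grad():
    scores = head(feat)
print(scores.shape)  # [4, 2]
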
4: Register the head:

paddlevideo/modeling/heads/__init__.py

# Copyright (c) 2020  PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from .adds_head import AddsHead
from .asrf_head import ASRFHead
from .attention_lstm_head import AttentionLstmHead, ActionAttentionLstmHead
from .base import BaseHead
from .bbox_head import BBoxHeadAVA
from .cfbi_head import CollaborativeEnsemblerMS
from .i3d_head import I3DHead
from .movinet_head import MoViNetHead
from .ms_tcn_head import MSTCNHead
from .pptimesformer_head import ppTimeSformerHead
from .pptsm_head import ppTSMHead
from .pptsn_head import ppTSNHead
from .roi_head import AVARoIHead
from .single_straight3d import SingleRoIExtractor3D
from .slowfast_head import SlowFastHead
from .stgcn_head import STGCNHead
from .timesformer_head import TimeSformerHead
from .transnetv2_head import TransNetV2Head
from .tsm_head import TSMHead
from .tsn_head import TSNHead
from .ctrgcn_head import CTRGCNHead
from .agcn2s_head import AGCN2sHead
from .token_shift_head import TokenShiftHead
from .i2d_head import I2DHead

__all__ = [
    'BaseHead', 'TSNHead', 'TSMHead', 'ppTSMHead', 'ppTSNHead', 'SlowFastHead',
    'AttentionLstmHead', 'TimeSformerHead', 'STGCNHead', 'TransNetV2Head',
    'I3DHead', 'SingleRoIExtractor3D', 'AVARoIHead', 'BBoxHeadAVA', 'AddsHead',
    'ppTimeSformerHead', 'CollaborativeEnsemblerMS', 'MSTCNHead', 'ASRFHead',
    'MoViNetHead', 'CTRGCNHead', 'TokenShiftHead', 'ActionAttentionLstmHead',
    'AGCN2sHead', 'I2DHead'
]

5: Training configuration file:

configs/recognition/pptsm/v2/md_ppsqt_16frames_uniform.yaml

MODEL: #MODEL field
    framework: "Recognizer2D" #Mandatory, indicate the type of network, associate to the 'paddlevideo/modeling/framework/' .
    backbone: #Mandatory, indicate the type of backbone, associate to the 'paddlevideo/modeling/backbones/' .
        name: "SqueezeTime" #Mandatory, The name of backbone.
    head:
        name: "I2DHead" #Mandatory, indicate the type of head, associate to the 'paddlevideo/modeling/heads'
        #pretrained: "" #Optional, pretrained model path.
        num_classes: 2
        in_channels: 1024

DATASET: #DATASET field
    batch_size: 16  #Mandatory, batch size
    num_workers: 4 #Mandatory, the number of subprocess on each GPU.
    train:
        format: "FrameDataset" #Mandatory, indicate the type of dataset, associate to the 'paddlevidel/loader/dateset'
        data_prefix: "/home/mnt/sdd/Data/data_fights/rawframes" #Mandatory, train data root path
        file_path: "/home/mnt/sdd/Data/data_fights/train_list.txt" #Mandatory, train data index file path
        suffix: 'img_{:06}.jpg'
    valid:
        format: "FrameDataset" #Mandatory, indicate the type of dataset, associate to the 'paddlevidel/loader/dateset'
        data_prefix: "/home/mnt/sdd/Data/data_fights/rawframes" #Mandatory, valid data root path
        file_path: "/home/mnt/sdd/Data/data_fights/test_list.txt" #Mandatory, valid data index file path
        suffix: 'img_{:06}.jpg'
    test:
        format: "FrameDataset" #Mandatory, indicate the type of dataset, associate to the 'paddlevidel/loader/dateset'
        data_prefix: "/home/mnt/sdd/Data/data_fights/rawframes" #Mandatory, valid data root path
        file_path: "/home/mnt/sdd/Data/data_fights/test_list.txt" #Mandatory, valid data index file path
        suffix: 'img_{:06}.jpg'

PIPELINE: #PIPELINE field
    train: #Mandatory, indicate the pipeline to deal with the training data, associate to the 'paddlevideo/loader/pipelines/'
        decode:
            name: "FrameDecoder"
        sample:
            name: "Sampler"
            num_seg: 16
            seg_len: 1
            valid_mode: False
        transform: #Mandatory, image transform operator
            - Scale:
                short_size: 256
            - MultiScaleCrop:
                target_size: 256
            - RandomCrop:
                target_size: 224
            - RandomFlip:
            - Image2Array:
            - Normalization:
                mean: [0.485, 0.456, 0.406]
                std: [0.229, 0.224, 0.225]
    valid: #Mandatory, indicate the pipeline to deal with the validation data, associate to the 'paddlevideo/loader/pipelines/'
        decode:
            name: "FrameDecoder"
        sample:
            name: "Sampler"
            num_seg: 16
            seg_len: 1
            valid_mode: True
        transform:
            - Scale:
                short_size: 256
            - CenterCrop:
                target_size: 224
            - Image2Array:
            - Normalization:
                mean: [0.485, 0.456, 0.406]
                std: [0.229, 0.224, 0.225]
    test:  #Mandatory, indicate the pipeline to deal with the test data, associate to the 'paddlevideo/loader/pipelines/'
        decode:
            name: "FrameDecoder"
        sample:
            name: "Sampler"
            num_seg: 16
            seg_len: 1
            valid_mode: True
        transform:
            - Scale:
                short_size: 256
            - CenterCrop:
                target_size: 224
            - Image2Array:
            - Normalization:
                mean: [0.485, 0.456, 0.406]
                std: [0.229, 0.224, 0.225]

OPTIMIZER: #OPTIMIZER field
  name: 'Momentum'
  momentum: 0.9
  learning_rate:
    iter_step: True
    name: 'CustomWarmupCosineDecay'
    max_epoch: 120
    warmup_epochs: 10
    warmup_start_lr: 0.005
    cosine_base_lr: 0.01
  weight_decay:
    name: 'L2'
    value: 1e-4
  use_nesterov: True

MIX:
    name: "Mixup"
    alpha: 0.2


METRIC:
    name: 'CenterCropMetric'

INFERENCE:
    name: 'ppSQT_Inference_helper'
    num_seg: 16
    target_size: 224

model_name: "ppSQT"
log_interval: 10 #Optional, the interval of logger, default:10
epochs: 120  #Mandatory, total epoch
log_level: "INFO" #Optional, the logger level. default: "INFO"
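
One detail worth spelling out: with num_seg: 16 the sampler yields 16 frames per clip, Recognizer2D flattens them into the batch dimension, and the backbone then folds them back into channels, which is why n_input_channels=48 (16 frames x 3 channels). A small sketch of the implied shape flow:

import paddle

N, T, C, H, W = 2, 16, 3, 224, 224           # 2 clips, 16 frames each
clip = paddle.randn([N, T, C, H, W])         # output of the data pipeline

x = clip.reshape([N * T, C, H, W])           # Recognizer2D: frames into batch
x = x.reshape([x.shape[0] // 16, -1, H, W])  # backbone: frames into channels
print(x.shape)                               # [2, 48, 224, 224] -> conv1 input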

6: Training:

# multi-GPU training
export CUDA_VISIBLE_DEVICES=0,1
python -B -m paddle.distributed.launch --gpus="0,1" --log_dir=./log/log_sqt_frame_16 main.py --validate -c configs/recognition/pptsm/v2/md_ppsqt_16frames_uniform.yaml

7: Results: accuracy comes out roughly 8 points lower than ppTSM-v2. The most likely cause is the lack of pretrained weights.
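
If missing pretrained weights are indeed the cause, a natural next step is converting the official SqueezeTime PyTorch checkpoint to Paddle format. A rough sketch with a hypothetical checkpoint path, assuming parameter names line up with this port one-to-one (conv weights map directly; BatchNorm buffer names differ, and any Linear weights need a transpose):

import paddle
import torch

state = torch.load('squeezetime_k400.pth', map_location='cpu')  # hypothetical file
state = state.get('state_dict', state)

paddle_state = {}
for name, tensor in state.items():
    array = tensor.detach().cpu().numpy()
    # torch stores nn.Linear weights as [out, in]; paddle expects [in, out].
    if name.endswith('fc_cls.weight'):
        array = array.T
    # BatchNorm running stats are named differently in the two frameworks.
    name = name.replace('running_mean', '_mean').replace('running_var', '_variance')
    paddle_state[name] = array

paddle.save(paddle_state, 'squeezetime_k400.pdparams')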
