LONG-TAILED RECOGNITION: A Close Reading


Background

Common approaches to the class-imbalance problem:

  1. Re-sample the data.
  2. Design specific loss functions that better facilitate learning with imbalanced data.
  3. Enhance recognition performance of the tail classes by transferring knowledge from the head classes.
    • To spell this out: improve tail-class recognition by transferring knowledge learned on the head classes. In machine learning and pattern recognition, classes with few samples are called tail classes and classes with many samples are called head classes. Because the tail classes have too few samples, their recognition performance is usually poor; knowledge learned from the head classes can therefore help improve it. This is commonly referred to as knowledge transfer.

Now, straight to the main point: decouple the learning procedure into representation learning and classification,
i.e., split it into two parts: representation learning and the classifier.

  • Representation learning: the model is exposed to the training instances and trained through different sampling strategies or losses.
  • Classification: upon the learned representations, the model recognizes the long-tailed classes through various classifiers.

Sampling strategies

  1. Standard instance-based sampling
    In instance-based (instance-balanced) sampling, every training example is selected with equal probability, so each class is sampled in proportion to its number of instances. Under class imbalance, some classes have far more samples than others, so this default strategy lets the majority (head) classes dominate training and can hurt recognition of the minority (tail) classes.

    To counteract this, the dataset can be re-sampled so that the different classes contribute more equally. Re-sampling usually takes one of two forms: undersampling and oversampling.

    • Undersampling: remove samples from the majority classes until their count is comparable to the minority classes. This can discard information, since majority-class samples are dropped, but it reduces the influence of the majority classes and makes the model pay more attention to the minority classes.
    • Oversampling: duplicate or generate new samples for the minority classes until their count is comparable to the majority classes. This can lead to overfitting, because the duplicated or generated samples may lie too close to the existing minority-class distribution, but it increases the number of minority samples and improves the model's ability to learn them.
  2. Class-balanced sampling
    Class-balanced sampling adjusts the sampling weights so that every class has a similar chance of being selected. This prevents the model from being biased toward the majority classes and improves its ability to learn the minority classes.

  3. A mixture of the two strategies above (the corresponding sampling probabilities are sketched right after this list).
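
All of these strategies can be written as sampling a point from class j with probability p_j = n_j^q / sum_i n_i^q, where n_j is the number of training instances of class j: q = 1 gives instance-balanced sampling, q = 0 gives class-balanced sampling, and intermediate values mix the two (q = 1/2 is the square-root sampling that appears later). A minimal sketch with made-up class counts:

import numpy as np

def class_sampling_probs(class_counts, q=1.0):
    # p_j = n_j**q / sum_i n_i**q
    # q = 1 -> instance-balanced, q = 0 -> class-balanced, q = 0.5 -> square-root
    counts = np.asarray(class_counts, dtype=float)
    weights = counts ** q
    return weights / weights.sum()

counts = [1000, 100, 10]                      # head, medium, tail class sizes
print(class_sampling_probs(counts, q=1.0))    # ~[0.901, 0.090, 0.009]
print(class_sampling_probs(counts, q=0.0))    # [0.333, 0.333, 0.333]
print(class_sampling_probs(counts, q=0.5))    # ~[0.706, 0.223, 0.071]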

Class-balanced Losses

  1. Focal loss
  2. Meta-Weight-Net
  3. re-weighted training
  4. based on Bayesian uncertainty
  5. balancing the classification regions of head and tail classes with an affinity measure that enforces uniformly spaced and equidistant class cluster centers

Transfer learning from head- to tail classes

i.e., transferring features learned from head classes with abundant training instances to under-represented tail classes. Existing approaches include:

  1. transferring the intra-class variance
  2. transferring semantic deep features
  • However, it is usually a non-trivial task to design specific modules (e.g., an external memory) for feature transfer.

Related lines of work:

  3. low-shot recognition: unlike the low-shot setting, long-tailed recognition assumes access to both head and tail classes and a more continuous decrease in the number of instances per class
  4. re-balancing schedules that learn representation and classifier jointly within a two-stage training scheme
  5. OLTR: uses instance-balanced sampling to first learn representations that are fine-tuned in a second stage with class-balanced sampling, together with a memory module
  6. LDAM: a label-distribution-aware margin loss that expands the decision boundaries of few-shot classes

Classifiers

  1. Re-training the parametric linear classifier in a class-balanced manner (i.e., with class-balanced re-sampling), keeping the learned representation fixed (cRT).
  2. The non-parametric Nearest Class Mean (NCM) classifier, which classifies the data based on their closest class-specific mean representations from the training set (implemented as KNNClassifier in the code below).
  3. Normalizing the classifier weights, which adjusts the weight magnitudes directly to be more balanced, adding a temperature parameter τ to modulate the normalization procedure.

The authors' experimental conclusions

  1. Instance-balanced sampling learns the best and most generalizable representations.
  2. It is advantageous to re-adjust the decision boundaries specified by the jointly learned classifier. This can be done either by re-training the classifier with class-balanced sampling on top of the learned representation, or by the simple but effective τ-normalization of the classifier weights; the latter has only a single hyperparameter (the "temperature") and requires no additional training. The experiments show that both achieve the desired adjustment of the decision boundaries.
  3. By applying the decoupled learning scheme to standard networks (e.g., ResNeXt), the authors achieve significantly higher accuracy than well-established state-of-the-art methods (different sampling strategies, new loss designs and other complex modules) on multiple long-tailed recognition benchmark datasets, including ImageNet-LT, Places-LT, and iNaturalist.

Start

Updates to the sampling strategies:

  • Square-root sampling

  • Progressively-balanced sampling

  • Loss re-weighting strategies


  • Classifier Re-training (cRT)

  • Nearest Class Mean classifier (NCM)

  • τ-normalized classifier (τ-normalized)

  • Learnable weight scaling (LWS) (the sampling and classifier-adjustment formulas are sketched right after this list)
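
Square-root sampling is the q = 1/2 case of the sampling formula sketched earlier, and progressively-balanced sampling interpolates between the instance-balanced and class-balanced probabilities over the training epochs, roughly p_j(t) = (1 - t/T) * p_j^IB + (t/T) * p_j^CB with T the total number of epochs. For the classifier adjustments, below is a minimal sketch of τ-normalization and LWS as I read them (the temperature value and the shapes are illustrative assumptions, not taken from the post):

import torch

def tau_normalize(weight, tau=1.0):
    # w_i <- w_i / ||w_i||**tau ; tau = 0 leaves the weights untouched,
    # tau = 1 fully L2-normalizes each class weight vector.
    norms = weight.norm(p=2, dim=1, keepdim=True)
    return weight / norms.pow(tau)

def lws_logits(feats, weight, scales):
    # LWS: the jointly learned weights stay fixed; only one scale per class
    # is learned (with class-balanced sampling) and multiplies each row.
    return feats @ (scales.unsqueeze(1) * weight).t()

W = torch.randn(1000, 2048)                       # assumed 1000 classes, 2048-dim features
W_tau = tau_normalize(W, tau=0.7)                 # tau is the tunable temperature
scales = torch.ones(1000, requires_grad=True)     # LWS per-class scaling factors
logits = lws_logits(torch.randn(4, 2048), W, scales)
print(W_tau.shape, logits.shape)                  # (1000, 2048), (4, 1000)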


Extra Feature

  1. ResNet/ResNeXt + ModulatedAttLayer -> x, feature maps
  • ResNet: the standard residual network; no need to go over it again here.
  • ResNeXt: uses similar residual blocks, but introduces grouped convolution inside the block, increasing the model's representational capacity by widening the convolutional layers into parallel groups.
  • ModulatedAttLayer

This layer is a modulated attention layer used to extract the important information from the input feature map. It computes the affinities between different positions of the input feature map and applies a spatial attention adjustment, which enhances the useful information in the feature map.

  • Concretely, the layer contains three 1x1 convolutions (g, theta, phi) plus one more convolution (conv_mask). It computes three representations of the input feature map (g, theta, phi), uses theta and phi to compute the affinity between positions (map_t_p), aggregates g according to this affinity, and passes the result through conv_mask to obtain a modulation term (mask). This term is multiplied by the spatial attention weights and added back to the input to give the final feature map (final).
  • The layer also contains a fully connected layer (fc_spatial) that computes the spatial attention weights over the feature map (spatial_att). Finally, it returns the output feature map together with the intermediate results (x, spatial_att, mask).
  • Inside this layer the convolution kernel_size is set to 1 in order to keep the spatial dimensions of the feature map unchanged. The layer is concerned with the affinities between positions and with the spatial attention adjustment, not with changing the spatial resolution.

  • With kernel_size = 1 the convolutions therefore operate only on the channel dimension without altering the spatial dimensions, which preserves the spatial structure of the feature map so that subsequent operations can make better use of the information it contains.

import torch
from torch import nn
from torch.nn import functional as F
import pdb

class ModulatedAttLayer(nn.Module):

    def __init__(self, in_channels, reduction = 2, mode='embedded_gaussian'):
        super(ModulatedAttLayer, self).__init__()
        self.in_channels = in_channels
        self.reduction = reduction
        self.inter_channels = in_channels // reduction
        self.mode = mode
        assert mode in ['embedded_gaussian']

        self.g = nn.Conv2d(self.in_channels, self.inter_channels, kernel_size = 1)
        self.theta = nn.Conv2d(self.in_channels, self.inter_channels, kernel_size = 1)
        self.phi = nn.Conv2d(self.in_channels, self.inter_channels, kernel_size = 1)
        self.conv_mask = nn.Conv2d(self.inter_channels, self.in_channels, kernel_size = 1, bias=False)
        self.relu = nn.ReLU(inplace=True)

        self.avgpool = nn.AvgPool2d(7, stride=1)
        self.fc_spatial = nn.Linear(7 * 7 * self.in_channels, 7 * 7)

        self.init_weights()

    def init_weights(self):
        msra_list = [self.g, self.theta, self.phi]
        for m in msra_list:
            nn.init.kaiming_normal_(m.weight.data)
            m.bias.data.zero_()
        self.conv_mask.weight.data.zero_()

    def embedded_gaussian(self, x):
        # embedded-Gaussian self-attention; this alone may not be strong enough
        batch_size = x.size(0)

        g_x = self.g(x).view(batch_size, self.inter_channels, -1)
        g_x = g_x.permute(0, 2, 1)
        theta_x = self.theta(x).view(batch_size, self.inter_channels, -1)
        theta_x = theta_x.permute(0, 2, 1)
        phi_x = self.phi(x).view(batch_size, self.inter_channels, -1)

        map_t_p = torch.matmul(theta_x, phi_x)
        mask_t_p = F.softmax(map_t_p, dim=-1)

        map_ = torch.matmul(mask_t_p, g_x)
        map_ = map_.permute(0, 2, 1).contiguous()
        map_ = map_.view(batch_size, self.inter_channels, x.size(2), x.size(3))
        mask = self.conv_mask(map_)
        
        x_flatten = x.view(-1, 7 * 7 * self.in_channels)

        spatial_att = self.fc_spatial(x_flatten)
        spatial_att = spatial_att.softmax(dim=1)
        
        spatial_att = spatial_att.view(-1, 7, 7).unsqueeze(1)
        spatial_att = spatial_att.expand(-1, self.in_channels, -1, -1)

        final = spatial_att * mask + x

        return final, [x, spatial_att, mask]

    def forward(self, x):
        if self.mode == 'embedded_gaussian':
            output, feature_maps = self.embedded_gaussian(x)
        else:
            raise NotImplementedError("The code has not been implemented.")
        return output, feature_maps
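
A quick shape check of the layer (hypothetical usage; the hard-coded 7 * 7 in fc_spatial implies it is meant for the last-stage feature map of a ResNet/ResNeXt, e.g. 2048 x 7 x 7):

import torch

layer = ModulatedAttLayer(in_channels=2048)
x = torch.randn(2, 2048, 7, 7)                 # batch of 2 feature maps
out, (orig, spatial_att, mask) = layer(x)
print(out.shape)                               # torch.Size([2, 2048, 7, 7]), same as the input

Note that conv_mask is zero-initialized in init_weights, so at the start of training mask is all zeros and the layer acts as an identity mapping; the attention path is learned gradually.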

Classifier

KNNClassifier

"""Copyright (c) Facebook, Inc. and its affiliates.
All rights reserved.

This source code is licensed under the license found in the
LICENSE file in the root directory of this source tree.
"""


import torch
import torch.nn as nn
import numpy as np
import pickle
from os import path

class KNNClassifier(nn.Module):
    def __init__(self, feat_dim=512, num_classes=1000, feat_type='cl2n', dist_type='l2'):
        super(KNNClassifier, self).__init__()
        assert feat_type in ['un', 'l2n', 'cl2n'], "feat_type is wrong!!!"
        assert dist_type in ['l2', 'cos'], "dist_type is wrong!!!"
        self.feat_dim = feat_dim
        self.num_classes = num_classes
        self.centroids = torch.randn(num_classes, feat_dim)
        self.feat_mean = torch.randn(feat_dim)
        self.feat_type = feat_type
        self.dist_type = dist_type
        self.initialized = False
    
    def update(self, cfeats):
        mean = cfeats['mean']
        centroids = cfeats['{}cs'.format(self.feat_type)]

        mean = torch.from_numpy(mean)
        centroids = torch.from_numpy(centroids)
        self.feat_mean.copy_(mean)
        self.centroids.copy_(centroids)
        if torch.cuda.is_available():
            self.feat_mean = self.feat_mean.cuda()
            self.centroids = self.centroids.cuda()
        self.initialized = True

    def forward(self, inputs, *args):
        centroids = self.centroids
        feat_mean = self.feat_mean

        # Feature transforms
        if self.feat_type == 'cl2n':
            inputs = inputs - feat_mean
            #centroids = centroids - self.feat_mean

        if self.feat_type in ['l2n', 'cl2n']:
            norm_x = torch.norm(inputs, 2, 1, keepdim=True)
            inputs = inputs / norm_x

            #norm_c = torch.norm(centroids, 2, 1, keepdim=True)
            #centroids = centroids / norm_c
        
        # Logit calculation
        if self.dist_type == 'l2':
            logit = self.l2_similarity(inputs, centroids)
        elif self.dist_type == 'cos':
            logit = self.cos_similarity(inputs, centroids)
        
        return logit, None

    def l2_similarity(self, A, B):
        # input A: [bs, fd] (batch_size x feat_dim)
        # input B: [nC, fd] (num_classes x feat_dim)
        feat_dim = A.size(1)

        AB = torch.mm(A, B.t())
        AA = (A**2).sum(dim=1, keepdim=True)
        BB = (B**2).sum(dim=1, keepdim=True)
        dist = AA + BB.t() - 2*AB

        return -dist
    
    def cos_similarity(self, A, B):
        feat_dim = A.size(1)
        AB = torch.mm(A, B.t())
        AB = AB / feat_dim
        return AB


def create_model(feat_dim, num_classes=1000, feat_type='cl2n', dist_type='l2',
                 log_dir=None, test=False, *args):
    print('Loading KNN Classifier')
    print(feat_dim, num_classes, feat_type, dist_type, log_dir, test)
    clf = KNNClassifier(feat_dim, num_classes, feat_type, dist_type)

    if log_dir is not None:
        fname = path.join(log_dir, 'cfeats.pkl')
        if path.exists(fname):
            print('===> Loading features from %s' % fname)
            with open(fname, 'rb') as f:
                data = pickle.load(f)
            clf.update(data)
    else:
        print('Random initialized classifier weights.')
    
    return clf


if __name__ == "__main__":
    cens = np.eye(4)
    mean = np.ones(4)
    xs = np.array([
        [0.9, 0.1, 0.0, 0.0],
        [0.2, 0.1, 0.1, 0.6],
        [0.3, 0.3, 0.4, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.25, 0.25, 0.25, 0.25]
    ])
    xs = torch.Tensor(xs)

    classifier = KNNClassifier(feat_dim=4, num_classes=4, 
                               feat_type='un')
    # update() expects a dict; with feat_type='un' the centroid key is 'uncs'
    classifier.update({'mean': mean, 'uncs': cens})
    logits, _ = classifier(xs)
    print(logits)
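
update() expects a dict whose keys follow the pattern '{feat_type}cs' ('uncs', 'l2ncs', 'cl2ncs') plus 'mean'. Below is a hypothetical helper for building such a dict from extracted training features; the exact centroid recipes (raw, L2-normalized, centered-then-normalized) are my guess from the key names and are not spelled out in the post:

import numpy as np

def compute_cfeats(feats, labels, num_classes):
    # feats: [N, feat_dim] array of features, labels: [N] integer class labels.
    # Assumes every class has at least one sample.
    mean = feats.mean(axis=0)

    def class_means(x):
        return np.stack([x[labels == c].mean(axis=0) for c in range(num_classes)])

    l2n = feats / np.linalg.norm(feats, axis=1, keepdims=True)          # L2-normalized
    centered = feats - mean
    cl2n = centered / np.linalg.norm(centered, axis=1, keepdims=True)   # centered + L2-normalized

    return {'mean': mean,
            'uncs': class_means(feats),    # feat_type 'un'
            'l2ncs': class_means(l2n),     # feat_type 'l2n'
            'cl2ncs': class_means(cl2n)}   # feat_type 'cl2n'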

Dot_Classifier

"""Copyright (c) Facebook, Inc. and its affiliates.
All rights reserved.

This source code is licensed under the license found in the
LICENSE file in the root directory of this source tree.
"""


import torch
import torch.nn as nn
from torch.nn.parameter import Parameter

from utils import *
from os import path

class DotProduct_Classifier(nn.Module):
    
    def __init__(self, num_classes=1000, feat_dim=2048, *args):
        super(DotProduct_Classifier, self).__init__()
        # print('<DotProductClassifier> contains bias: {}'.format(bias))
        self.fc = nn.Linear(feat_dim, num_classes)
        self.scales = Parameter(torch.ones(num_classes))
        for param_name, param in self.fc.named_parameters():
            param.requires_grad = False
        
    def forward(self, x, *args):
        x = self.fc(x)
        x *= self.scales
        return x, None
    
def create_model(feat_dim, num_classes=1000, stage1_weights=False, dataset=None, log_dir=None, test=False, *args):
    print('Loading Dot Product Classifier.')
    clf = DotProduct_Classifier(num_classes, feat_dim)

    if not test:
        if stage1_weights:
            assert(dataset)
            print('Loading %s Stage 1 Classifier Weights.' % dataset)
            if log_dir is not None:
                subdir = log_dir.strip('/').split('/')[-1]
                subdir = subdir.replace('stage2', 'stage1')
                weight_dir = path.join('/'.join(log_dir.split('/')[:-1]), subdir)
                # weight_dir = path.join('/'.join(log_dir.split('/')[:-1]), 'stage1')
            else:
                weight_dir = './logs/%s/stage1' % dataset
            print('==> Loading classifier weights from %s' % weight_dir)
            clf.fc = init_weights(model=clf.fc,
                                  weights_path=path.join(weight_dir, 'final_model_checkpoint.pth'),
                                  classifier=True)
        else:
            print('Random initialized classifier weights.')

    return clf
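
Note that, as written, self.fc is frozen (requires_grad = False) and only the per-class scales are trainable, so this snippet corresponds to the Learnable Weight Scaling (LWS) classifier; unfreezing fc and re-training it with class-balanced sampling on top of the Stage-1 representations would instead give classifier re-training (cRT).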

Loss

DiscCentroidsLoss

import torch
import torch.nn as nn
from torch.autograd import Function

class DiscCentroidsLoss(nn.Module):
    def __init__(self, num_classes, feat_dim, size_average=True):
        super(DiscCentroidsLoss, self).__init__()
        self.num_classes = num_classes
        self.centroids = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.disccentroidslossfunc = DiscCentroidsLossFunc.apply
        self.feat_dim = feat_dim
        self.size_average = size_average

    def forward(self, feat, label):
        batch_size = feat.size(0)
        
        # calculate attracting loss

        feat = feat.view(batch_size, -1)
        # To check the dim of centroids and features
        if feat.size(1) != self.feat_dim:
            raise ValueError("Center's dim: {0} should be equal to input feature's \
                            dim: {1}".format(self.feat_dim,feat.size(1)))
        batch_size_tensor = feat.new_empty(1).fill_(batch_size if self.size_average else 1)
        loss_attract = self.disccentroidslossfunc(feat, label, self.centroids, batch_size_tensor).squeeze()
        
        # calculate repelling loss

        distmat = torch.pow(feat, 2).sum(dim=1, keepdim=True).expand(batch_size, self.num_classes) + \
                  torch.pow(self.centroids, 2).sum(dim=1, keepdim=True).expand(self.num_classes, batch_size).t()
        distmat.addmm_(feat, self.centroids.t(), beta=1, alpha=-2)

        classes = torch.arange(self.num_classes, device=feat.device)
        labels_expand = label.unsqueeze(1).expand(batch_size, self.num_classes)
        mask = labels_expand.eq(classes.expand(batch_size, self.num_classes))

        distmat_neg = distmat
        distmat_neg[mask] = 0.0
        # margin = 50.0
        margin = 10.0
        loss_repel = torch.clamp(margin - distmat_neg.sum() / (batch_size * self.num_classes), 0.0, 1e6)

        # loss = loss_attract + 0.05 * loss_repel
        loss = loss_attract + 0.01 * loss_repel

        return loss
 
 
class DiscCentroidsLossFunc(Function):
    @staticmethod
    def forward(ctx, feature, label, centroids, batch_size):
        ctx.save_for_backward(feature, label, centroids, batch_size)
        centroids_batch = centroids.index_select(0, label.long())
        return (feature - centroids_batch).pow(2).sum() / 2.0 / batch_size

    @staticmethod
    def backward(ctx, grad_output):
        feature, label, centroids, batch_size = ctx.saved_tensors
        centroids_batch = centroids.index_select(0, label.long())
        diff = centroids_batch - feature
        # init every iteration
        counts = centroids.new_ones(centroids.size(0))
        ones = centroids.new_ones(label.size(0))
        grad_centroids = centroids.new_zeros(centroids.size())

        counts = counts.scatter_add_(0, label.long(), ones)
        grad_centroids.scatter_add_(0, label.unsqueeze(1).expand(feature.size()).long(), diff)
        grad_centroids = grad_centroids/counts.view(-1, 1)
        return - grad_output * diff / batch_size, None, grad_centroids / batch_size, None


    
def create_loss (feat_dim=512, num_classes=1000):
    print('Loading Discriminative Centroids Loss.')
    return DiscCentroidsLoss(num_classes, feat_dim)
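
A quick sanity check of this loss on random tensors (the features and labels here are placeholders for the backbone output; shapes only):

feats = torch.randn(8, 512)                        # batch of 8 feature vectors
labels = torch.randint(0, 1000, (8,))              # random class labels
criterion = create_loss(feat_dim=512, num_classes=1000)
loss = criterion(feats, labels)
print(loss.item())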

SoftmaxLoss

import torch.nn as nn

def create_loss():
    print('Loading Softmax Loss.')
    return nn.CrossEntropyLoss()

Sampling

  • Class-aware (round-robin) sampling

It cycles through the different classes and draws a specified number of samples from each class. Two iterators are used: one cycles over the classes, the other over the samples within each class. At every step it takes the specified number of samples from the current class's iterator and returns them to the caller. Once all classes have been visited, the class order is randomly reshuffled so that the next round traverses the samples again. A standalone sketch follows.
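
A minimal sketch of such a class-aware sampler (a standalone, hypothetical helper rather than the repository's exact implementation), assuming a list labels that gives the class of every dataset index:

import random
from collections import defaultdict

def class_aware_sampler(labels, num_samples_cls=4):
    # Group dataset indices by class.
    cls_to_indices = defaultdict(list)
    for idx, y in enumerate(labels):
        cls_to_indices[y].append(idx)
    classes = list(cls_to_indices.keys())

    def cycle(indices):
        # Endless iterator over one class, reshuffled after every pass.
        while True:
            random.shuffle(indices)
            yield from indices

    iters = {c: cycle(list(cls_to_indices[c])) for c in classes}
    while True:
        random.shuffle(classes)                   # new class order each round
        for c in classes:
            for _ in range(num_samples_cls):      # fixed number of samples per class
                yield next(iters[c])

Usage would look like sampler = class_aware_sampler(train_labels) followed by batch = [next(sampler) for _ in range(batch_size)], where train_labels and batch_size are up to the caller.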

  • Priority sampling by sample importance
class PriorityTree(object):
    def __init__(self, capacity, init_weights, fixed_weights=None, fixed_scale=1.0,
                 alpha=1.0):
        """
        fixed_weights: weights that won't be updated by self.update()
        """
        assert fixed_weights is None or len(fixed_weights) == capacity
        assert len(init_weights) == capacity
        self.alpha = alpha
        self._capacity = capacity
        self._tree_size = 2 * capacity - 1
        self.fixed_scale = fixed_scale
        self.fixed_weights = np.zeros(self._capacity) if fixed_weights is None \
                             else fixed_weights
        self.tree = np.zeros(self._tree_size)
        self._initialized = False
        self.initialize(init_weights)

    def initialize(self, init_weights):
        """Initialize the tree."""

        # Rescale the fixed_weights if it is not zero
        self.fixed_scale_init = self.fixed_scale
        if self.fixed_weights.sum() > 0 and init_weights.sum() > 0:
            self.fixed_scale_init *= init_weights.sum() / self.fixed_weights.sum()
            self.fixed_weights *= self.fixed_scale * init_weights.sum() \
                                / self.fixed_weights.sum()
        print('FixedWeights: {}'.format(self.fixed_weights.sum()))

        self.update_whole(init_weights + self.fixed_weights)
        self._initialized = True
    
    def reset_adaptive_weights(self, adaptive_weights):
        self.update_whole(self.fixed_weights + adaptive_weights)
    
    def reset_fixed_weights(self, fixed_weights, rescale=False):
        """ Reset the manually designed weights and 
            update the whole tree accordingly.

            @rescale: rescale the fixed_weights such that 
            fixed_weights.sum() = self.fixed_scale * adaptive_weights.sum()
        """

        adaptive_weights = self.get_adaptive_weights()
        fixed_sum = fixed_weights.sum()
        if rescale and fixed_sum > 0:
            # Rescale fixedweight based on adaptive weights
            scale = self.fixed_scale * adaptive_weights.sum() / fixed_sum
        else:
            # Rescale fixedweight based on previous fixedweight
            scale = self.fixed_weights.sum() / fixed_sum
        self.fixed_weights = fixed_weights * scale
        self.update_whole(self.fixed_weights + adaptive_weights)
    
    def update_whole(self, total_weights):
        """ Update the whole tree based on per-example sampling weights """
        if self.alpha != 1:
            total_weights = np.power(total_weights, self.alpha)
        lefti = self.pointer_to_treeidx(0)
        righti = self.pointer_to_treeidx(self.capacity-1)
        self.tree[lefti:righti+1] = total_weights

        # Iteratively find a parent layer
        while lefti != 0 and righti != 0:
            lefti = (lefti - 1) // 2 if lefti != 0 else 0
            righti = (righti - 1) // 2 if righti != 0 else 0
            
            # Assign parent weights from right to left
            for i in range(righti, lefti-1, -1):
                self.tree[i] = self.tree[2*i+1] + self.tree[2*i+2]
    
    def get_adaptive_weights(self):
        """ Get the instance-aware weights, that are not mannually designed"""
        if self.alpha == 1:
            return self.get_total_weights() - self.fixed_weights
        else:
            return self.get_raw_total_weights() - self.fixed_weights
    
    def get_total_weights(self):
        """ Get the per-example sampling weights
            return shape: [capacity]
        """
        lefti = self.pointer_to_treeidx(0)
        righti = self.pointer_to_treeidx(self.capacity-1)
        return self.tree[lefti:righti+1]

    def get_raw_total_weights(self):
        """ Get the per-example sampling weights
            return shape: [capacity]
        """
        lefti = self.pointer_to_treeidx(0)
        righti = self.pointer_to_treeidx(self.capacity-1)
        return np.power(self.tree[lefti:righti+1], 1/self.alpha)

    @property
    def size(self):
        return self._tree_size

    @property
    def capacity(self):
        return self._capacity

    def __len__(self):
        return self.capacity

    def pointer_to_treeidx(self, pointer):
        assert pointer < self.capacity
        return int(pointer + self.capacity - 1)

    def update(self, pointer, priority):
        assert pointer < self.capacity
        tree_idx = self.pointer_to_treeidx(pointer)
        priority += self.fixed_weights[pointer]
        if self.alpha != 1:
            priority = np.power(priority, self.alpha)
        delta = priority - self.tree[tree_idx]
        self.tree[tree_idx] = priority
        while tree_idx != 0:
            tree_idx = (tree_idx - 1) // 2
            self.tree[tree_idx] += delta
    
    def update_delta(self, pointer, delta):
        assert pointer < self.capacity
        tree_idx = self.pointer_to_treeidx(pointer)
        ratio = 1- self.fixed_weights[pointer] / self.tree[tree_idx]
        # delta *= ratio
        if self.alpha != 1:
            # Update delta
            if self.tree[tree_idx] < 0 or \
                np.power(self.tree[tree_idx], 1/self.alpha) + delta < 0:
                import pdb; pdb.set_trace()
            delta = np.power(np.power(self.tree[tree_idx], 1/self.alpha) + delta,
                             self.alpha) \
                  - self.tree[tree_idx]
        self.tree[tree_idx] += delta
        while tree_idx != 0:
            tree_idx = (tree_idx - 1) // 2
            self.tree[tree_idx] += delta

    def get_leaf(self, value):
        assert self._initialized, 'PriorityTree not initialized!!!!'
        assert self.total > 0, 'No priority weights set!!'
        parent = 0
        while True:
            left_child = 2 * parent + 1
            right_child = 2 * parent + 2
            if left_child >= len(self.tree):
                tgt_leaf = parent
                break
            if value < self.tree[left_child]:
                parent = left_child
            else:
                value -= self.tree[left_child]
                parent = right_child
        data_idx = tgt_leaf - self.capacity + 1
        return data_idx, self.tree[tgt_leaf]        # data idx, priority

    @property
    def total(self):
        assert self._initialized, 'PriorityTree not initialized!!!!'
        return self.tree[0]

    @property
    def max(self):
        return np.max(self.tree[-self.capacity:])

    @property
    def min(self):
        assert self._initialized, 'PriorityTree not initialized!!!!'
        return np.min(self.tree[-self.capacity:])
    
    def get_weights(self):
        wdict = {'fixed_weights': self.fixed_weights, 
                 'total_weights': self.get_total_weights()}
        if self.alpha != 1:
            wdict.update({'raw_total_weights': self.get_raw_total_weights(),
                          'alpha': self.alpha})

        return wdict
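
A brief usage sketch of PriorityTree (the weights are made up; in practice they might be per-example losses or other importance scores):

import numpy as np

tree = PriorityTree(capacity=4, init_weights=np.array([1.0, 2.0, 3.0, 4.0]))
print(tree.total)            # 10.0 -- the root stores the sum of all leaf weights

tree.update(0, 5.0)          # raise the priority of example 0 from 1.0 to 5.0
print(tree.total)            # 14.0

# Sample an example index proportionally to its weight: draw a uniform value
# in [0, total) and walk down the sum-tree.
value = np.random.uniform(0, tree.total)
idx, priority = tree.get_leaf(value)
print(idx, priority)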
