Utilizing Transformer Representations Efficiently

Contents

  • Introduction
  • Different Pooling Strategies
    • Pooler Output
    • Last Hidden State Output
    • Hidden States Output
  • More...
  • References

Introduction

  • When fine-tuning a pretrained model, the usual habit is to take the Transformer's last-layer output, pass it through an FC / Bi-LSTM head, and emit the final prediction. In practice, however, each Transformer layer captures linguistic information at a different granularity (surface features in lower layers, syntactic features in middle layers, and semantic features in higher layers), so it is worth choosing a pooling strategy that suits the task at hand.

Given input_ids and attention_mask, a HuggingFace Transformers model returns 2 outputs (3 if configured). The rest of this post discusses pooling strategies that make full use of these outputs; a minimal sketch of how to obtain them follows the list below.

  • pooler output [batch_size, hidden_size] - Last-layer hidden state of the first token of the sequence (the classification token), further processed by a Linear layer and a Tanh activation. The Linear layer weights are trained on the next sentence prediction (classification) objective during pretraining. The pooler output can be disabled by passing add_pooling_layer=False when loading the model.
  • last hidden state [batch_size, seq_len, hidden_size] - The sequence of hidden states at the output of the last layer.
  • hidden states [n_layers, batch_size, seq_len, hidden_size] - Hidden states for all layers and all token ids (e.g. for base models, n_layers is 1 embedding layer + 12 Transformer layers = 13; index 0 is the embedding layer). Note: to get hidden states in the output, output_hidden_states=True must be passed (in the config or in the forward call).
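
A minimal sketch of how these outputs can be obtained with HuggingFace Transformers (bert-base-uncased is just an illustrative checkpoint; any BERT-like encoder behaves the same way). The imports below are also assumed by the later snippets:

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import AutoConfig, AutoModel, AutoTokenizer

model_name = "bert-base-uncased"                  # illustrative checkpoint
config = AutoConfig.from_pretrained(model_name)
config.output_hidden_states = True                # ask for the hidden states of every layer
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, config=config)

inputs = tokenizer(["a short example sentence"], return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

last_hidden_state = outputs.last_hidden_state            # [batch_size, seq_len, hidden_size]
pooler_output = outputs.pooler_output                    # [batch_size, hidden_size]
all_hidden_states = torch.stack(outputs.hidden_states)   # [n_layers, batch_size, seq_len, hidden_size]
attention_mask = inputs["attention_mask"]                 # [batch_size, seq_len]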

Different Pooling Strategies

Pooler Output

  • Pooler output + FC
logits = nn.Linear(config.hidden_size, 1)(pooler_output) # regression head
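
The nn.Linear(config.hidden_size, 1)(…) pattern used in these snippets is shorthand for readability; in an actual fine-tuning model the head would be created once in __init__ and reused across forward passes. A minimal sketch (the class name is illustrative; AutoModel and config as in the sketch above):

class PoolerOutputRegressor(nn.Module):
    def __init__(self, config, model_name="bert-base-uncased"):
        super().__init__()
        self.backbone = AutoModel.from_pretrained(model_name, config=config)
        self.head = nn.Linear(config.hidden_size, 1)   # regression head, created once

    def forward(self, input_ids, attention_mask):
        outputs = self.backbone(input_ids=input_ids, attention_mask=attention_mask)
        return self.head(outputs.pooler_output)        # [batch_size, 1]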

Last Hidden State Output

  • CLS Embeddings: [CLS] Embed + FC
cls_embeddings = last_hidden_state[:, 0]
logits = nn.Linear(config.hidden_size, 1)(cls_embeddings) # regression head
  • Mean Pooling: Last Hidden State Output + Mean Pooling (remember to ignore padding tokens using attention masks)
input_mask_expanded = attention_mask.unsqueeze(-1).expand(last_hidden_state.size()).float()
sum_embeddings = torch.sum(last_hidden_state * input_mask_expanded, 1)
sum_mask = input_mask_expanded.sum(1)
sum_mask = torch.clamp(sum_mask, min=1e-9)
mean_embeddings = sum_embeddings / sum_mask
logits = nn.Linear(config.hidden_size, 1)(mean_embeddings) # regression head
  • Max Pooling: Last Hidden State Output + Max Pooling (remember to ignore padding tokens using attention masks, i.e. set masked token embeddings to a large negative value such as -1e9 so they never win the max)
input_mask_expanded = attention_mask.unsqueeze(-1).expand(last_hidden_state.size()).float()
last_hidden_state = last_hidden_state.clone()
last_hidden_state[input_mask_expanded == 0] = -1e9  # Set padding tokens to large negative value
max_embeddings = torch.max(last_hidden_state, 1)[0]
logits = nn.Linear(config.hidden_size, 1)(max_embeddings) # regression head
  • Mean + Max Pooling: (1) Last Hidden State Output + Max Pooling. (2) Last Hidden State Output + Mean Pooling. (3) Concat to have a final representation that is twice the hidden size.
mean_pooling_embeddings = torch.mean(last_hidden_state, 1)
max_pooling_embeddings, _ = torch.max(last_hidden_state, 1)	# torch.max returns (values, indices); we want the values
mean_max_embeddings = torch.cat((mean_pooling_embeddings, max_pooling_embeddings), 1)
logits = nn.Linear(config.hidden_size*2, 1)(mean_max_embeddings)
  • Conv1D Pooling: Last Hidden State Output + 2 Conv1d layers
cnn1 = nn.Conv1d(768, 256, kernel_size=2, padding=1)
cnn2 = nn.Conv1d(256, 1, kernel_size=2, padding=1)

last_hidden_state = last_hidden_state.permute(0, 2, 1)	# [batch_size, embed_size, seq_len]
cnn_embeddings = F.relu(cnn1(last_hidden_state))	# [batch_size, 256, seq_len]
cnn_embeddings = cnn2(cnn_embeddings)	# [batch_size, 1, seq_len]
logits, _ = torch.max(cnn_embeddings, 2)	# [batch_size, 1]

Hidden States Output

Motivation

  • The output of the last layer may not always be the best representation of the input text during fine-tuning for downstream tasks.
  • For pre-trained language models, including Transformer, the most transferable contextualized representations of input text tend to occur in the middle layers, while the top layers specialize for language modeling. Therefore, the use of the last layer’s output may restrict the power of the pre-trained representation.

  • Layerwise CLS Embeddings: e.g. use the second-to-last layer's CLS Embeddings
all_hidden_states = torch.stack(outputs[2])	# [n_layers, batch_size, seq_len, hidden_size]
cls_embeddings = all_hidden_states[-2, :, 0] # index -2 = second-to-last layer (13 hidden states in total: embedding layer + 12 blocks)
logits = nn.Linear(config.hidden_size, 1)(cls_embeddings) # regression head
  • Concatenate Pooling: Concatenate CLS Embeddings from different layers into one. e.g. Concatenate Last 4 Layers
all_hidden_states = torch.stack(outputs[2])
concatenate_pooling = torch.cat(
    (all_hidden_states[-1], all_hidden_states[-2], all_hidden_states[-3], all_hidden_states[-4]),-1
)
concatenate_pooling = concatenate_pooling[:, 0]
logits = nn.Linear(config.hidden_size*4, 1)(concatenate_pooling) # regression head
  • Weighted Layer Pooling: Token embeddings are the weighted mean of their representations across different hidden layers. The averaged [CLS] embedding can then be used as the final representation. Among the pooling techniques here, weighted layer pooling tends to work best whatever the task.
class WeightedLayerPooling(nn.Module):
    def __init__(self, num_hidden_layers, layer_start: int = 4, layer_weights = None):
        super(WeightedLayerPooling, self).__init__()
        self.layer_start = layer_start
        self.num_hidden_layers = num_hidden_layers
        self.layer_weights = layer_weights if layer_weights is not None \
            else nn.Parameter(
                torch.tensor([1] * (num_hidden_layers+1 - layer_start), dtype=torch.float)
            )

    def forward(self, all_hidden_states):
        all_layer_embedding = all_hidden_states[self.layer_start:, :, :, :]
        weight_factor = self.layer_weights.unsqueeze(-1).unsqueeze(-1).unsqueeze(-1).expand(all_layer_embedding.size())	# [n_layer, batch_size, seq_len, embed_dim]
        weighted_average = (weight_factor*all_layer_embedding).sum(dim=0) / self.layer_weights.sum()
        return weighted_average		# [batch_size, seq_len, embed_dim]
    
layer_start = 9		# use hidden states 9-12, i.e. the last four Transformer layers (index 0 is the embedding layer)
pooler = WeightedLayerPooling(
    config.num_hidden_layers, 
    layer_start=layer_start, layer_weights=None
)
weighted_pooling_embeddings = pooler(all_hidden_states)	# [batch_size, seq_len, embed_dim]
weighted_pooling_embeddings = weighted_pooling_embeddings[:, 0]	# Get CLS Embed. [batch_size, embed_dim]
logits = nn.Linear(config.hidden_size, 1)(weighted_pooling_embeddings)
  • LSTM / GRU Pooling: Use an LSTM network to connect the intermediate [CLS] representations from all layers; the output of the last LSTM cell is used as the final representation.
    $o = h_{\mathrm{LSTM}}^{L} = \mathrm{LSTM}\left(h_{\mathrm{CLS}}^{i}\right),\quad i \in [1, L]$
class LSTMPooling(nn.Module):
    def __init__(self, num_layers, hidden_size, hiddendim_lstm):
        super(LSTMPooling, self).__init__()
        self.num_hidden_layers = num_layers
        self.hidden_size = hidden_size
        self.hiddendim_lstm = hiddendim_lstm
        self.lstm = nn.LSTM(self.hidden_size, self.hiddendim_lstm, batch_first=True)
        self.dropout = nn.Dropout(0.1)
    
    def forward(self, all_hidden_states):
        # Stack the [CLS] embedding of every Transformer layer (index 0 is the embedding layer, so it is skipped)
        hidden_states = torch.stack([all_hidden_states[layer_i][:, 0]
                                     for layer_i in range(1, self.num_hidden_layers + 1)], dim=1)	# [batch_size, num_hidden_layers, hidden_size]
        out, _ = self.lstm(hidden_states, None)
        out = self.dropout(out[:, -1, :])
        return out

hiddendim_lstm = 256
pooler = LSTMPooling(config.num_hidden_layers, config.hidden_size, hiddendim_lstm)
lstm_pooling_embeddings = pooler(all_hidden_states)
logits = nn.Linear(hiddendim_lstm, 1)(lstm_pooling_embeddings) # regression head
  • Attention Pooling: We can use a dot-product attention module to dynamically combine the intermediate [CLS] representations of all layers
    $o = \operatorname{softmax}\left(q\, h_{CLS}^{T}\right) h_{CLS} W_h$, where $h_{CLS} \in \mathbb{R}^{n_l \times d_h}$ stacks the [CLS] embeddings of the $n_l$ layers, and $q \in \mathbb{R}^{1 \times d_h}$, $W_h \in \mathbb{R}^{d_h \times d_{fc}}$ are learnable weights.
class AttentionPooling(nn.Module):
    def __init__(self, num_layers, hidden_size, hiddendim_fc):
        super(AttentionPooling, self).__init__()
        self.num_hidden_layers = num_layers
        self.hidden_size = hidden_size
        self.hiddendim_fc = hiddendim_fc
        self.dropout = nn.Dropout(0.1)

        q_t = np.random.normal(loc=0.0, scale=0.1, size=(1, self.hidden_size))
        self.q = nn.Parameter(torch.from_numpy(q_t).float())     # q: [1, hidden_size]; .float() inside nn.Parameter so it stays a registered parameter
        w_ht = np.random.normal(loc=0.0, scale=0.1, size=(self.hidden_size, self.hiddendim_fc))
        self.w_h = nn.Parameter(torch.from_numpy(w_ht).float())  # W_h: [hidden_size, hiddendim_fc]

    def forward(self, all_hidden_states):
        # Stack the [CLS] embedding of every Transformer layer (skip the embedding layer at index 0)
        hidden_states = torch.stack([all_hidden_states[layer_i][:, 0]
                                     for layer_i in range(1, self.num_hidden_layers + 1)], dim=1)	# [batch_size, num_hidden_layers, hidden_size]
        out = self.attention(hidden_states)
        out = self.dropout(out)
        return out

    def attention(self, h):
        # h: [batch_size, num_hidden_layers, hidden_size]
        v = torch.matmul(self.q, h.transpose(-2, -1)).squeeze(1)      # attention scores over layers, [batch_size, num_hidden_layers]
        v = F.softmax(v, -1)
        v_temp = torch.matmul(v.unsqueeze(1), h).transpose(-2, -1)    # weighted sum of layer embeddings, [batch_size, hidden_size, 1]
        v = torch.matmul(self.w_h.transpose(1, 0), v_temp).squeeze(2) # project to [batch_size, hiddendim_fc]
        return v

hiddendim_fc = 128
pooler = AttentionPooling(config.num_hidden_layers, config.hidden_size, hiddendim_fc)
attention_pooling_embeddings = pooler(all_hidden_states)
logits = nn.Linear(hiddendim_fc, 1)(attention_pooling_embeddings) # regression head
  • WKPooling (Wang, Bin, and C.-C. Jay Kuo. "SBERT-WK: A Sentence Embedding Method by Dissecting BERT-Based Word Models." IEEE/ACM Transactions on Audio, Speech, and Language Processing 28 (2020): 2146–2157). It consists of the following two steps:
    • Determine a unified word representation for each word in a sentence by integrating its representations across layers by examining its alignment and novelty properties.
    • Conduct a weighted average of unified word representations based on the word importance measure to yield the ultimate sentence embedding vector.

Note: SBERT-WK uses QR decomposition. torch QR decomposition is currently extremely slow when run on GPU. Hence, the tensor is first transferred to the CPU before it is applied. This makes this pooling method rather slow.

class WKPooling(nn.Module):
    def __init__(self, layer_start: int = 4, context_window_size: int = 2):
        super(WKPooling, self).__init__()
        self.layer_start = layer_start
        self.context_window_size = context_window_size

    def forward(self, all_hidden_states, attention_mask):
        ft_all_layers = all_hidden_states
        org_device = ft_all_layers.device
        all_layer_embedding = ft_all_layers.transpose(1,0)
        all_layer_embedding = all_layer_embedding[:, self.layer_start:, :, :]  # Start from 4th layers output

        # torch.qr is slow on GPU (see https://github.com/pytorch/pytorch/issues/22573). So compute it on CPU until issue is fixed
        all_layer_embedding = all_layer_embedding.cpu()

        attention_mask = attention_mask.cpu().numpy()
        unmask_num = np.array([sum(mask) for mask in attention_mask]) - 1  # Not considering the last item
        embedding = []

        # One sentence at a time
        for sent_index in range(len(unmask_num)):
            sentence_feature = all_layer_embedding[sent_index, :, :unmask_num[sent_index], :]
            one_sentence_embedding = []
            # Process each token
            for token_index in range(sentence_feature.shape[1]):
                token_feature = sentence_feature[:, token_index, :]
                # 'Unified Word Representation'
                token_embedding = self.unify_token(token_feature)
                one_sentence_embedding.append(token_embedding)

            one_sentence_embedding = torch.stack(one_sentence_embedding)
            sentence_embedding = self.unify_sentence(sentence_feature, one_sentence_embedding)
            embedding.append(sentence_embedding)

        output_vector = torch.stack(embedding).to(org_device)
        return output_vector

    def unify_token(self, token_feature):
        ## Unify Token Representation
        window_size = self.context_window_size

        alpha_alignment = torch.zeros(token_feature.size()[0], device=token_feature.device)
        alpha_novelty = torch.zeros(token_feature.size()[0], device=token_feature.device)

        for k in range(token_feature.size()[0]):
            left_window = token_feature[k - window_size:k, :]
            right_window = token_feature[k + 1:k + window_size + 1, :]
            window_matrix = torch.cat([left_window, right_window, token_feature[k, :][None, :]])
            Q, R = torch.qr(window_matrix.T)  # torch.linalg.qr in newer PyTorch versions

            r = R[:, -1]
            alpha_alignment[k] = torch.mean(self.norm_vector(R[:-1, :-1], dim=0), dim=1).matmul(R[:-1, -1]) / torch.norm(r[:-1])
            alpha_alignment[k] = 1 / (alpha_alignment[k] * window_matrix.size()[0] * 2)
            alpha_novelty[k] = torch.abs(r[-1]) / torch.norm(r)

        # Sum Norm
        alpha_alignment = alpha_alignment / torch.sum(alpha_alignment)  # Normalization Choice
        alpha_novelty = alpha_novelty / torch.sum(alpha_novelty)

        alpha = alpha_novelty + alpha_alignment
        alpha = alpha / torch.sum(alpha)  # Normalize

        out_embedding = torch.mv(token_feature.t(), alpha)
        return out_embedding

    def norm_vector(self, vec, p=2, dim=0):
        ## Implements the normalize() function from sklearn
        vec_norm = torch.norm(vec, p=p, dim=dim)
        return vec.div(vec_norm.expand_as(vec))

    def unify_sentence(self, sentence_feature, one_sentence_embedding):
        ## Unify Sentence By Token Importance
        sent_len = one_sentence_embedding.size()[0]

        var_token = torch.zeros(sent_len, device=one_sentence_embedding.device)
        for token_index in range(sent_len):
            token_feature = sentence_feature[:, token_index, :]
            sim_map = self.cosine_similarity_torch(token_feature)
            var_token[token_index] = torch.var(sim_map.diagonal(-1))

        var_token = var_token / torch.sum(var_token)
        sentence_embedding = torch.mv(one_sentence_embedding.t(), var_token)

        return sentence_embedding
    
    def cosine_similarity_torch(self, x1, x2=None, eps=1e-8):
        x2 = x1 if x2 is None else x2
        w1 = x1.norm(p=2, dim=1, keepdim=True)
        w2 = w1 if x2 is x1 else x2.norm(p=2, dim=1, keepdim=True)
        return torch.mm(x1, x2.t()) / (w1 * w2.t()).clamp(min=eps)

pooler = WKPooling(layer_start=9)
wkpooling_embeddings = pooler(all_hidden_states, attention_mask)	# [batch_size, hidden_size]
logits = nn.Linear(config.hidden_size, 1)(wkpooling_embeddings) # regression head

More…

  • SWA, Apex AMP & Interpreting Transformers in Torch notebook is an implementation of the Stochastic Weight Averaging technique with NVIDIA Apex on transformers using PyTorch. The notebook also shows how to interactively interpret Transformers using LIT (Language Interpretability Tool), a platform for NLP model understanding.
  • On Stability of Few-Sample Transformer Fine-Tuning notebook goes over various remedies that increase few-sample fine-tuning stability and shows a significant performance improvement over simple fine-tuning methods.
  • Speeding up Transformer w/ Optimization Strategies notebook explains 5 optimization strategies in depth, with code. All of these techniques are promising and can improve model performance in terms of both speed and accuracy.
  • Other strategies: Dense Pooling, Word Weight (TF-IDF) Pooling, Async Pooling, Parallel / Hierarchical Aggregation (a sketch of one possible TF-IDF-weighted variant follows below).
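
As an illustration of the word-weight idea, below is a minimal sketch of one possible TF-IDF-style weighted mean pooling. The per-token weights (token_weights) are assumed to be precomputed, e.g. IDF scores looked up per input id from the training corpus; this is just one interpretation of the strategy named above, not a reference implementation.

def weighted_mean_pooling(last_hidden_state, attention_mask, token_weights):
    # last_hidden_state: [batch_size, seq_len, hidden_size]
    # attention_mask:    [batch_size, seq_len]
    # token_weights:     [batch_size, seq_len], e.g. TF-IDF / IDF weight per token (assumed precomputed)
    weights = token_weights * attention_mask                        # zero out padding tokens
    weights = weights / torch.clamp(weights.sum(1, keepdim=True), min=1e-9)
    return torch.sum(last_hidden_state * weights.unsqueeze(-1), 1)  # [batch_size, hidden_size]

# Hypothetical usage (token_weights would come from an IDF table built on the training data):
# weighted_embeddings = weighted_mean_pooling(last_hidden_state, attention_mask, token_weights)
# logits = nn.Linear(config.hidden_size, 1)(weighted_embeddings)  # regression head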

References

  • Utilizing Transformer Representations Efficiently
  • Kaggle competition: https://www.kaggle.com/competitions/feedback-prize-english-language-learning/overview
  • Kaggle code: FB3 single pytorch model [train]
