Bert和LSTM:情绪分类中的表现

news2025/1/24 17:33:20

一、说明

        这篇文章的目的是评估和比较 2 种深度学习算法(BERT 和 LSTM)在情感分析中进行二元分类的性能。评估将侧重于两个关键指标:准确性(衡量整体分类性能)和训练时间(评估每种算法的效率)。

二、数据

        为了实现这一目标,我使用了IMDB数据集,其中包括50,000条电影评论。数据集平均分为 25,000 条正面评论和 25,000 条负面评论,使其适用于训练和测试二进制情绪分析模型。若要获取数据集,请转到以下链接:

50K电影评论的IMDB数据集

大型影评数据集

www.kaggle.com

        下图显示了数据集的五行。我给积极情绪分配了 1,给消极情绪分配了 0。

 

三、算法

        1. 长短期记忆(LSTM):它是一种循环神经网络(RNN),旨在处理顺序数据。它可以通过使用存储单元和门来捕获长期依赖关系。

        2. BERT(来自变压器的双向编码器表示):它是一种预先训练的基于变压器的模型,使用自监督学习方法。利用双向上下文来理解句子中单词的含义。

        -配置

        对于 LSTM,模型采用文本序列以及每个序列的相应长度作为输入。它嵌入文本(嵌入维度 = 20),通过 LSTM 层(大小 = 64)处理文本,通过 ReLU 激活的全连接层传递最后一个隐藏状态,最后应用 S 形激活以生成 0 到 1 之间的单个输出值。(周期数:10,学习率:0.001,优化器:亚当)

        对于BERT,我使用了DistilBertForSequenceClassification,它基于DistilBERT架构。DistilBERT是原始BERT模型的较小,蒸馏版本。它旨在具有较少数量的参数并降低计算复杂性,同时保持相似的性能水平。(周期数:3,学习率:5e-5,优化器:亚当)

四、LSTM 代码

!pip install torchtext

!pip install portalocker>=2.0.0

import torch
import torch.nn as nn

from torchtext.datasets import IMDB
from torch.utils.data.dataset import random_split

# Step 1: load and create the datasets

train_dataset = IMDB(split='train')
test_dataset = IMDB(split='test')

# Set random number to 123 to compare with BERT model
torch.manual_seed(123)
train_dataset, valid_dataset = random_split(
    list(train_dataset), [20000, 5000])

## Step 2: find unique tokens (words)
import re
from collections import Counter, OrderedDict

token_counts = Counter()

def tokenizer(text):
    text = re.sub('<[^>]*>', '', text)
    emoticons = re.findall('(?::|;|=)(?:-)?(?:\)|\(|D|P)', text.lower())
    text = re.sub('[\W]+', ' ', text.lower()) +\
        ' '.join(emoticons).replace('-', '')
    tokenized = text.split()
    return tokenized


for label, line in train_dataset:
    tokens = tokenizer(line)
    token_counts.update(tokens)


print('Vocab-size:', len(token_counts))

## Step 3: encoding each unique token into integers
from torchtext.vocab import vocab

sorted_by_freq_tuples = sorted(token_counts.items(), key=lambda x: x[1], reverse=True)
ordered_dict = OrderedDict(sorted_by_freq_tuples)

vocab = vocab(ordered_dict)

'''
The special tokens "<pad>" and "<unk>" are inserted into the vocabulary using vocab.insert_token("<pad>", 0) and vocab.insert_token("<unk>", 1) respectively.
The index 0 is assigned to "<pad>" token, which is typically used for padding sequences.
The index 1 is assigned to "<unk>" token, which represents unknown or out-of-vocabulary tokens.


vocab.set_default_index(1) sets the default index of the vocabulary to 1, which corresponds to the "<unk>" token.
This means that if a token is not found in the vocabulary, it will be mapped to the index 1 by default.
'''

vocab.insert_token("<pad>", 0)
vocab.insert_token("<unk>", 1)
vocab.set_default_index(1)

print([vocab[token] for token in ['this', 'is', 'an', 'example']])

'''
The IMDB class in datatext contains 1 = negative and 2 = positive
'''

'''
The label_pipeline lambda function takes a label value x as input.
It checks if the label value x is equal to 2 using the comparison x == 2.
If the condition is true, it returns a float value of 1.0. This implies that the label is positive.
If the condition is false (i.e., the label value is not equal to 2), it returns a float value of 0.0. This implies that the label is negative.
'''

text_pipeline = lambda x: [vocab[token] for token in tokenizer(x)]
label_pipeline = lambda x: 1. if x == 2 else 0

'''
This line suggests that the subsequent computations and tensors will be moved to the specified CUDA device for processing,
taking advantage of GPU acceleration if available.
'''

device = torch.device("cuda:0")

## Step 3-B: wrap the encode and transformation function

'''
Instead of loading the whole reviews into memory which is way too expensive for the computer,
you can load a batch for manuy times which requires way less memory as compared to loading the complete data set.
Another reason that we use batch is that if we load the whole dataset at once, the deep learning algorithm(may be a neural network)
has to store errors values for all data points in the memory and this will cause a great decrease in speed of training.
With batches, the model updates the parameters(weights and bias) only after passing through the whole data set.
'''

def collate_batch(batch):
    label_list, text_list, lengths = [], [], []
    for _label, _text in batch:
        label_list.append(label_pipeline(_label))
        processed_text = torch.tensor(text_pipeline(_text),
                                      dtype=torch.int64)
        text_list.append(processed_text)
        lengths.append(processed_text.size(0))

    ## Convert lists to tensors
    label_list = torch.tensor(label_list)
    lengths = torch.tensor(lengths)

    ## pads the text sequences in text_list to have the same length by adding padding tokens.
    padded_text_list = nn.utils.rnn.pad_sequence(
        text_list, batch_first=True)
    return padded_text_list.to(device), label_list.to(device), lengths.to(device)

## Take a small batch to check if the wrapping works

from torch.utils.data import DataLoader
dataloader = DataLoader(train_dataset, batch_size=4, shuffle=False, collate_fn=collate_batch)
text_batch, label_batch, length_batch = next(iter(dataloader))
print(text_batch)
print(label_batch)
print(length_batch)
print(text_batch.shape)

## Step 4: batching the datasets

batch_size = 32

train_dl = DataLoader(train_dataset, batch_size=batch_size,
                      shuffle=True, collate_fn=collate_batch)
valid_dl = DataLoader(valid_dataset, batch_size=batch_size,
                      shuffle=False, collate_fn=collate_batch)
test_dl = DataLoader(test_dataset, batch_size=batch_size,
                     shuffle=False, collate_fn=collate_batch)

print(len(list(train_dl.dataset)))
print(len(list(valid_dl.dataset)))
print(len(list(test_dl.dataset)))

'''
the code defines an RNN model that takes encoded text inputs,
processes them through an embedding layer and an LSTM layer,
and produces a binary output using fully connected layers and a sigmoid activation function.
The model is initialized with specific parameters and moved to the specified device for computation.
'''


class RNN(nn.Module):
    def __init__(self, vocab_size, embed_dim, rnn_hidden_size, fc_hidden_size):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size,
                                      embed_dim,
                                      padding_idx=0)
        self.rnn = nn.LSTM(embed_dim, rnn_hidden_size,
                           batch_first=True)
        self.fc1 = nn.Linear(rnn_hidden_size, fc_hidden_size)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(fc_hidden_size, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, text, lengths):
        out = self.embedding(text)
        out = nn.utils.rnn.pack_padded_sequence(out, lengths.cpu().numpy(), enforce_sorted=False, batch_first=True)
        out, (hidden, cell) = self.rnn(out)
        out = hidden[-1, :, :]
        out = self.fc1(out)
        out = self.relu(out)
        out = self.fc2(out)
        out = self.sigmoid(out)
        return out

vocab_size = len(vocab)
embed_dim = 20
rnn_hidden_size = 64
fc_hidden_size = 64

torch.manual_seed(123)
model = RNN(vocab_size, embed_dim, rnn_hidden_size, fc_hidden_size)
model = model.to(device)

def train(dataloader):
    model.train()
    total_acc, total_loss = 0, 0
    for text_batch, label_batch, lengths in dataloader:
        optimizer.zero_grad()
        pred = model(text_batch, lengths)[:, 0]
        loss = loss_fn(pred, label_batch)
        loss.backward()
        optimizer.step()
        total_acc += ((pred>=0.5).float() == label_batch).float().sum().item()
        total_loss += loss.item()*label_batch.size(0)
    return total_acc/len(dataloader.dataset), total_loss/len(dataloader.dataset)

def evaluate(dataloader):
    model.eval()
    total_acc, total_loss = 0, 0
    with torch.no_grad():
        for text_batch, label_batch, lengths in dataloader:
            pred = model(text_batch, lengths)[:, 0]
            loss = loss_fn(pred, label_batch.float())  # Convert label_batch to Float
            total_acc += ((pred >= 0.5).float() == label_batch).float().sum().item()
            total_loss += loss.item() * label_batch.size(0)


    return total_acc/len(list(dataloader.dataset)),\
           total_loss/len(list(dataloader.dataset))

import time
start_time = time.time()

loss_fn = nn.BCELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

num_epochs = 10

torch.manual_seed(123)

for epoch in range(num_epochs):
    acc_train, loss_train = train(train_dl)
    acc_valid, loss_valid = evaluate(valid_dl)
    print(f'Epoch {epoch} accuracy: {acc_train:.4f} val_accuracy: {acc_valid:.4f}')
    print(f'Time elapsed: {(time.time() - start_time)/60:.2f} min')
print(f'Total Training Time: {(time.time() - start_time)/60:.2f} min')

acc_test, _ = evaluate(test_dl)
print(f'test_accuracy: {acc_test:.4f}')

"""### Test with new movie reviews of Spider-Man: Across the Spider-Verse (2023)"""

def collate_single_text(text):
    processed_text = torch.tensor(text_pipeline(text), dtype=torch.int64)
    length = processed_text.size(0)
    padded_text = nn.utils.rnn.pad_sequence([processed_text], batch_first=True)
    return padded_text.to(device), length

text = "It is the first marvel movie to make me shed a tear. It has heart, it feels so alive with it's conveyance of emotions and feelings, it uses our nostalgia for the first movie AGAINST US it is on a completely new level of animation, there is a twist on every turn you make while watching this movie. "
padded_text, length = collate_single_text(text)
padded_text = padded_text.to(device)

model.eval()  # Set the model to evaluation mode
with torch.no_grad():
    encoded_text = padded_text.to(device)  # Move the encoded_text tensor to the CUDA device

    lengths = torch.tensor([len(encoded_text)])  # Compute the length of the text sequence

    output = model(encoded_text, lengths)  # Pass the lengths argument
    probability = output.item()  # Obtain the predicted probability
    if probability >= 0.5:
        prediction = "Positive"
    else:
        prediction = "Negative"

print(f"Text: {text}")
print(f"Prediction: {prediction} (Probability: {probability})")

text = "This movie was very boring and garbage this is why Hollywood has zero imagination. They rewrote Spiderman as Miles Morales so that they can fit the DEI agenda which was more important than time. "
padded_text, length = collate_single_text(text)
padded_text = padded_text.to(device)

model.eval()  # Set the model to evaluation mode
with torch.no_grad():
    encoded_text = padded_text.to(device)  # Move the encoded_text tensor to the CUDA device

    lengths = torch.tensor([len(encoded_text)])  # Compute the length of the text sequence

    output = model(encoded_text, lengths)  # Pass the lengths argument
    probability = output.item()  # Obtain the predicted probability
    if probability >= 0.5:
        prediction = "Positive"
    else:
        prediction = "Negative"

print(f"Text: {text}")
print(f"Prediction: {prediction} (Probability: {probability})")

五、BERT代码

!pip install transformers

import gzip
import shutil
import time

import pandas as pd
import requests
import torch
import torch.nn.functional as F
import torchtext

import transformers
from transformers import DistilBertTokenizerFast
from transformers import DistilBertForSequenceClassification

torch.backends.cudnn.deterministic = True
RANDOM_SEED = 123
torch.manual_seed(RANDOM_SEED)
DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

NUM_EPOCHS = 3

path = '/content/drive/MyDrive/data/movie_data.csv'

df = pd.read_csv(path)

df.head()

df.shape

train_texts = df.iloc[:35000]['review'].values
train_labels = df.iloc[:35000]['sentiment'].values

valid_texts = df.iloc[35000:40000]['review'].values
valid_labels = df.iloc[35000:40000]['sentiment'].values

test_texts = df.iloc[40000:]['review'].values
test_labels = df.iloc[40000:]['sentiment'].values

tokenizer = DistilBertTokenizerFast.from_pretrained('distilbert-base-uncased')

train_encodings = tokenizer(list(train_texts), truncation=True, padding=True)
valid_encodings = tokenizer(list(valid_texts), truncation=True, padding=True)
test_encodings = tokenizer(list(test_texts), truncation=True, padding=True)

train_encodings[0]

class IMDbDataset(torch.utils.data.Dataset):
    def __init__(self, encodings, labels):
        self.encodings = encodings
        self.labels = labels

    def __getitem__(self, idx):
        item = {key: torch.tensor(val[idx]) for key, val in self.encodings.items()}
        item['labels'] = torch.tensor(self.labels[idx])
        return item

    def __len__(self):
        return len(self.labels)


train_dataset = IMDbDataset(train_encodings, train_labels)
valid_dataset = IMDbDataset(valid_encodings, valid_labels)
test_dataset = IMDbDataset(test_encodings, test_labels)

train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=16, shuffle=True)
valid_loader = torch.utils.data.DataLoader(valid_dataset, batch_size=16, shuffle=False)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=16, shuffle=False)

model = DistilBertForSequenceClassification.from_pretrained('distilbert-base-uncased')
model.to(DEVICE)
model.train()

optim = torch.optim.Adam(model.parameters(), lr=5e-5)

def compute_accuracy(model, data_loader, device):
    with torch.no_grad():
        correct_pred, num_examples = 0, 0

        for batch_idx, batch in enumerate(data_loader):

        ### Prepare data
            input_ids = batch['input_ids'].to(device)
            attention_mask = batch['attention_mask'].to(device)
            labels = batch['labels'].to(device)
            outputs = model(input_ids, attention_mask=attention_mask)
            logits = outputs['logits']
            predicted_labels = torch.argmax(logits, 1)
            num_examples += labels.size(0)
            correct_pred += (predicted_labels == labels).sum()

        return correct_pred.float()/num_examples * 100

start_time = time.time()

for epoch in range(NUM_EPOCHS):

    model.train()

    for batch_idx, batch in enumerate(train_loader):

        ### Prepare data
        input_ids = batch['input_ids'].to(DEVICE)
        attention_mask = batch['attention_mask'].to(DEVICE)
        labels = batch['labels'].to(DEVICE)

        ### Forward
        outputs = model(input_ids, attention_mask=attention_mask, labels=labels)
        loss, logits = outputs['loss'], outputs['logits']

        ### Backward
        optim.zero_grad()
        loss.backward()
        optim.step()

        ### Logging
        if not batch_idx % 250:
            print (f'Epoch: {epoch+1:04d}/{NUM_EPOCHS:04d} | '
                   f'Batch {batch_idx:04d}/{len(train_loader):04d} | '
                   f'Loss: {loss:.4f}')

    model.eval()

    with torch.set_grad_enabled(False):
        print(f'Training accuracy: '
              f'{compute_accuracy(model, train_loader, DEVICE):.2f}%'
              f'\nValid accuracy: '
              f'{compute_accuracy(model, valid_loader, DEVICE):.2f}%')

    print(f'Time elapsed: {(time.time() - start_time)/60:.2f} min')

print(f'Total Training Time: {(time.time() - start_time)/60:.2f} min')
print(f'Test accuracy: {compute_accuracy(model, test_loader, DEVICE):.2f}%')

六、结果

 

七、为什么BERT的性能优于LSTM?

        BERT之所以获得高准确率,有几个原因:
        1)BERT通过考虑给定单词两侧的周围单词来捕获单词的上下文含义。这种双向方法使模型能够理解语言的细微差别并有效地捕获单词之间的依赖关系。

        2)BERT采用变压器架构,可有效捕获顺序数据中的长期依赖关系。转换器采用自我注意机制,使模型能够权衡句子中不同单词的重要性。这种注意力机制有助于BERT专注于相关信息,从而获得更好的表示和更高的准确性。

        3)BERT在大量未标记的数据上进行预训练。这种预训练允许模型学习一般语言表示,并获得对语法、语义和世界知识的广泛理解。通过利用这些预训练的知识,BERT可以更好地适应下游任务并实现更高的准确性。

八、结论

        与 LSTM 相比,BERT 确实需要更长的时间来微调,因为它的架构更复杂,参数空间更大。但同样重要的是要考虑到BERT在许多任务中的性能优于LSTM。 达门·

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:http://www.coloradmin.cn/o/965552.html

如若内容造成侵权/违法违规/事实不符,请联系多彩编程网进行投诉反馈,一经查实,立即删除!

相关文章

OpenCV(十六):高斯图像金字塔

目录 1.高斯图像金字塔原理 2.高斯图像金字塔实现 1.高斯图像金字塔原理 高斯图像金字塔是一种用于多尺度图像表示和处理的重要技术。它通过对图像进行多次高斯模糊和下采样操作来生成不同分辨率的图像层级&#xff0c;每个层级都是原始图像的模糊和降采样版本。 以下是高斯…

系统中出现大量不可中断进程和僵尸进程(理论)

一 进程状态 当 iowait 升高时&#xff0c;进程很可能因为得不到硬件的响应&#xff0c;而长时间处于不可中断状态。从 ps 或者 top 命令的输出中&#xff0c;你可以发现它们都处于 D 状态&#xff0c;也就是不可中断状态&#xff08;Uninterruptible Sleep&#xff09;。 R …

Python Opencv实践 - 轮廓检测

import cv2 as cv import numpy as np import matplotlib.pyplot as pltimg cv.imread("../SampleImages/map.jpg") print(img.shape) plt.imshow(img[:,:,::-1])#Canny边缘检测 edges cv.Canny(img, 127, 255, 0) plt.imshow(edges, cmapplt.cm.gray)#查找轮廓 #c…

【C++代码】找出字符串中第一个匹配项的下标,重复的子字符串--代码随想录

题目&#xff1a;找出字符串中第一个匹配项的下标 给你两个字符串 haystack 和 needle &#xff0c;请你在 haystack 字符串中找出 needle 字符串的第一个匹配项的下标&#xff08;下标从 0 开始&#xff09;。如果 needle 不是 haystack 的一部分&#xff0c;则返回 -1 。 题…

地理测绘基础知识(4) 照射计算上篇

我们接着上一篇来推导。 照射计算&#xff0c;是一种常用的三维几何计算。已知一个光源的光强图&#xff0c;计算光源投射到地表各处的功率密度。这种计算需求可以直观的理解为计算已知位置、指向、聚光特性的手电筒&#xff0c;计算地表某地点强度。当然&#xff0c;如果穷尽地…

mysql使用st_distance_sphere函数报错Incorrect arguments to st_distance_sphere

最近发现执行mysql st_distance_sphere报错了。 报错的信息是Incorrect arguments to st_distance_sphere。 最开始以为是跟mysql的版本有关系&#xff0c;所以看了下自己本地的mysql版本&#xff0c;执行一下sql select version(); 发现自己本地的mysql版本是 5.7.30 这…

FFmpeg报错:Connection to tcp://XXX?timeout=XXX failed: Connection timed out

一、现象 通过FFmpeg&#xff08;FFmpeg的版本是5.0.3&#xff09;拉摄像机的rtsp流获取音视频数据&#xff0c;执行命令&#xff1a; ./ffmpeg -timeout 3000000 -i "rtsp://172.16.17.156/stream/video5" 报错&#xff1a;Connection to tcp://XXX?timeoutXXX …

JavaScript 生成 16: 9 宽高比

这篇文章只是对 for 循环一个简单应用&#xff0c;没有什么知识含量。 可以跳过这篇文章。 只是我用来保存一下我的代码&#xff0c;保存在本地我嫌碍眼&#xff0c;总想把他删了。 正文部分 公式&#xff1a;其中 width 表示宽度&#xff0c;height 表示高度 16 9 w i d t…

CLIP改进工作串讲(bryanyzhu)内容记录

文章目录 分割Language-driven semantic segmentation - ICLR2022GroupViT: Semantic Segmentation Emerges from Text Supervision 目标检测ViLD : Open-vocabulary object detection via vision and language knowledge distillation 视觉定位GLIP:Grounded Language-Image P…

VsCode搭建Java开发环境 vscode搭建java开发环境 vscode springboot 搭建springboot

VsCode搭建Java开发环境 vscode搭建java开发环境 vscode springboot 搭建springboot VsCode java开发截图1、安装Java 环境相关插件2、安装 Spring 插件3、安装 Mybatis 插件第一个 vsc-mybatis第二个 mybatisX 4、安装Maven环境4.1、安装Maven环境4.2、VsCode配置Maven环境 5、…

使用Python进行Base64编码和解码

假设您有一个想要通过网络传输的二进制图像文件。您很惊讶对方没有正确接收该文件 - 该文件只是包含奇怪的字符&#xff01; 嗯&#xff0c;您似乎试图以原始位和字节格式发送文件&#xff0c;而所使用的媒体是为流文本而设计的。 避免此类问题的解决方法是什么&#xff1f;答…

Interspeech 2023 | 火山引擎流媒体音频技术之语音增强和AI音频编码

背景介绍 为了应对处理各类复杂音视频通信场景&#xff0c;如多设备、多人、多噪音场景&#xff0c;流媒体通信技术渐渐成为人们生活中不可或缺的技术。为达到更好的主观体验&#xff0c;使用户听得清、听得真&#xff0c;流媒体音频技术方案融合了传统机器学习和基于AI的语音增…

微服务--Seata(分布式事务)

TCC模式在代码中实现&#xff1a;侵入性强&#xff0c;并且的自己实现事务控制逻辑 Try&#xff0c;Confirm() cancel() 第三方开源框架&#xff1a;BeyeTCC\TCC-transaction\Himly 异步实现&#xff1a;MQ可靠消息最终一致性 GlobalTransacational---AT模式

Threejs里反向播放动画

在Blender里给对象添加了一个动画后&#xff0c;假设是在帧1到帧40添加的动画帧&#xff0c;那么正常播放时是从帧1到帧40&#xff0c;反向播放则是从帧40到帧1&#xff0c;本文讲述如何在Threejs里方向播放Blender里添加的动画。 一 添加动画 之前文章中已经讲述如何在Blende…

MAC ITEM 解决cd: string not in pwd的问题

今天使用cd 粘贴复制的路径的时候,报了这么一个错. cd: string not in pwd eistert192 Library % cd Application Support cd: string not in pwd: Application eistert192 Library % 让人一脸懵逼. 对比一下,发现中文路径里的空格截断了路径 导致后面的路径就没有办法被包含…

财报解读:迈向高端化,珍酒李渡如何持续讲好品牌故事?

2023年上半年&#xff0c;尤其是第二季度&#xff0c;白酒行业淡季属性较为明显。对于市场情况&#xff0c;中国酒业协会《2023中国白酒市场中期研究报告》也有所披露&#xff1a;约40.91%的受访者反馈春节后平日的白酒消费量有所减少&#xff0c;约31.82%的受访者反馈五一期间…

数据结构与算法(二)算法分析

算法的特性 算法具有五个基本特性&#xff1a;输入、输出、有穷性、确定性和可行性。 输入输出 算法具有零个或多个输入至少有一个或多个输出&#xff1a;算法是一定需要输出的&#xff0c;不需要输出&#xff0c;你用这个算法干吗&#xff1f; 有穷性 指算法在执行有限的步骤…

教你如何进行vcruntime140_1.dll文件下载安装,4种方法详细的安装方法

今天主要要跟大家说说vcruntime140_1.dll文件下载安装&#xff0c;其实要下载安装这个文件还是有不少方法的&#xff0c;只要不要慌&#xff0c;有的时候办法解决&#xff0c;首先我们要知道vcruntime140_1.dll是Microsoft Visual C的一部分&#xff0c;是许多计算机程序运行所…

Python项目打包与部署(1):模块与包的概念与关系

Python是动态类型编程语言&#xff0c;意味着python不需要提前编译。1个Python项目通常也包含多个.py文件&#xff0c; 通常也会引用python标准库&#xff0c;或第3方库&#xff0c;也存在着依赖关系。因此python项目也 当实际构建1个 Python 项目时&#xff0c;模块与包是我们…

【python基础教程】类中属性和方法的具体定义方法及使用

前言 嗨喽&#xff0c;大家好呀~这里是爱看美女的茜茜呐 以下介绍在python的re模块中怎样应用正则表达式 &#x1f447; &#x1f447; &#x1f447; 更多精彩机密、教程&#xff0c;尽在下方&#xff0c;赶紧点击了解吧~ python源码、视频教程、插件安装教程、资料我都准备…