Chinese Text Classification with Bert and ERNIE

news2026/2/15 2:57:10

GitHub - 649453932/Bert-Chinese-Text-Classification-Pytorch: Chinese text classification with Bert and ERNIE. https://github.com/649453932/Bert-Chinese-Text-Classification-Pytorch

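There is a project on GitHub that uses Bert and ERNIE for Chinese text classification. It is built on PyTorch and runs well, but I had to change a few things while using it.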

1. At runtime it fails because the path THUCNews/saved_dict does not exist. Creating the folder fixes it.

# Chinese models
# https://github.com/649453932/Bert-Chinese-Text-Classification-Pytorch/tree/master
Pretrained model downloads:
bert_Chinese: model https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-chinese.tar.gz
vocabulary https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-chinese-vocab.txt
Backup: Baidu Netdisk link for the model: https://pan.baidu.com/s/1qSAD5gwClq7xlgzl_4W3Pw
ERNIE_Chinese: http://image.nghuyong.top/ERNIE.zip
Backup: Baidu Netdisk link: https://pan.baidu.com/s/1lEPdDN1-YQJmKEd_g9rLgw
After extracting, put the files in the corresponding directories as described in the project README, and check that the file names match.

# Create the missing folder
mkdir -p THUCNews/saved_dict/
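
The same folder can be created from Python, which helps on systems without `mkdir -p`. This is a minimal sketch: `ensure_save_dir` is a hypothetical helper, not part of the project, and it assumes the checkpoint folder is `saved_dict` inside the dataset directory, as in the path above.

```python
import os

def ensure_save_dir(dataset_dir="THUCNews"):
    """Create <dataset_dir>/saved_dict if it is missing and return its path.

    Hypothetical helper; the folder name comes from the error described above.
    """
    save_dir = os.path.join(dataset_dir, "saved_dict")
    # exist_ok=True makes the call safe to repeat on later runs
    os.makedirs(save_dir, exist_ok=True)
    return save_dir
```

Calling `ensure_save_dir()` from the project root before `run.py` should have the same effect as the `mkdir` command.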

2. The project has a few dependencies that need to be installed:

pip install torch


pip install tqdm scikit-learn tensorboardX  -i  https://pypi.tuna.tsinghua.edu.cn/simple/
pip install boto3 requests regex

python3 run.py  --model bert
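
Before launching a run, it can be useful to check which of these packages are actually importable. The sketch below is only an illustration: `missing_modules` is a hypothetical helper, and the list uses import names, so scikit-learn appears as `sklearn`.

```python
import importlib.util

# Import names for the packages installed above (scikit-learn imports as sklearn)
REQUIRED = ["torch", "tqdm", "sklearn", "tensorboardX", "boto3", "requests", "regex"]

def missing_modules(names=REQUIRED):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

if __name__ == "__main__":
    missing = missing_modules()
    print("missing:", ", ".join(missing) if missing else "none")
```

Anything the script reports as missing can then be installed with the `pip install` commands above.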

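3. The code emits a few warnings at runtime, most likely because PyTorch has been upgraded and some older function signatures are deprecated. They do not affect the run.

The code can still be changed as follows to get rid of the warnings.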

pytorch_pretrained\optimization.py:275: UserWarning: This overload of add_ is deprecated:
add_(Number alpha, Tensor other)
Consider using one of the following signatures instead:
add_(Tensor other, *, Number alpha) (Triggered internally at ..\torch\csrc\utils\python_arg_parser.cpp:1025.)
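The line that triggers the warning is:
next_m.mul_(beta1).add_(1 - beta1, grad)

Change the add_ call to add_(grad, alpha=1 - beta1). The addcmul_ call on the line after it takes the same fix, with the scalar moved to the value keyword:
.addcmul_(grad, grad, value=1 - beta2)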
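
To make clear what the two in-place calls compute, here is a small sketch of the same arithmetic written with plain Python lists, so it runs without PyTorch. `update_moments` is a hypothetical name, and the default `beta1` and `beta2` values are illustrative assumptions, not taken from the project. The variable name `next_v` in the comment is assumed from the surrounding optimizer code.

```python
def update_moments(m, v, grad, beta1=0.9, beta2=0.999):
    """Return the updated first and second moment estimates, element by element."""
    # Same arithmetic as: next_m.mul_(beta1).add_(grad, alpha=1 - beta1)
    new_m = [beta1 * mi + (1 - beta1) * gi for mi, gi in zip(m, grad)]
    # Same arithmetic as: next_v.mul_(beta2).addcmul_(grad, grad, value=1 - beta2)
    new_v = [beta2 * vi + (1 - beta2) * gi * gi for vi, gi in zip(v, grad)]
    return new_m, new_v

m, v = update_moments([0.0, 0.0], [0.0, 0.0], [1.0, -2.0])
print(m, v)
```

The keyword forms (`alpha=` and `value=`) only change how the scalar is passed, so the updated values should be the same as with the deprecated positional forms.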
