PyTorch搭建循环神经网络(RNN)进行文本分类、预测及损失分析(对不同国家的语言单词和姓氏进行分类,附源码和数据集)

news2025/4/3 0:52:26

需要源码和数据集请点赞关注收藏后评论区留言~~~

下面我们将使用循环神经网络训练来自18种起源于不同语言的数千种姓氏,并根据拼写方式预测名称的来源。

一、数据准备和预处理

总共有18个txt文件,并且对它们进行预处理,输出如下

部分预处理代码如下

from __future__ import unicode_literals, print_function, division
from io import open
import glob
import os

def findFiles(path): return glob.glob(path)

print(findFiles('data/names/*.txt'))

import unicodedata
import string

all_letters = string.ascii_letters + " .,;'"
n_letters = len(all_letters)

    return ''.join(
        c for c in unicodedata.normalize('NFD', s)
        if unicodedata.category(c) != 'Mn'
        and c in all_letters
    )


for filename in findFiles('data/names/*.txt'):
    category = os.path.splitext(os.path.basename(filename))[0]
    all_categories.append(category)
    lines = readLines(filename)
    category_lines[category] = lines

n_categories = len(all_categories)

 二、将名字转换为张量

现在已经整理好了所有数据集种的名字,这里需要将它们转换为张量以使用它们,为了表示单个字母,这里使用独热编码的方法

三、构建神经网络

在PyTorch种构建循环神经网络涉及在多个时间步长上克隆多个RNN层 的参数,RNN层保留了Hidden State和梯度,这些状态完全由PyTorch的计算图来自动完成维护,这意味我们只需要关心前馈网络而不需要关注反向传播

 四、训练RNN网络

训练该网络所需要做的是向他输入大量的数据,令其进行预测,然后告诉它是否有错误

每个训练的循环包含下面七个步骤

1:创建输入和目标Tensor

2:创建归零的初始Hidden State

3:输入一个字母

4:传递Hidden State给下一个字母输入

5:比较最终输出和目标

6:反向传播

7:返回输出和损失

平均损失如下

 

 五、绘制损失变化图像

绘制网络的历史损失变化,以显示网络学习情况 

可见随着训练次数的增加损失逐渐 梯度下降

 六、预测结果

为了了解网络在不同类别上的表现如何,这里将创建一个混淆矩阵,为每种实际语言指示网络猜测那种语言,结果如下图,可以从主轴上挑出一些亮点,以显示它猜错了哪些语言

可见中文/朝鲜语 西班牙语/意大利语会有混淆,网络预测希腊语名字十分准确,但是英语名字预测的很糟糕

 七、预测用户输入

大家可以输入任何希望预测的名字到模型中,网络会给出几个名字最有可能的语言类型

 八、代码

需要全部源码请点赞关注收藏后评论区留言~~~

from __future__ import unicode_literals, print_function, division


from io import open
import glob
import os

def findFiles(path): return glob.glob(path)

print(findFiles('data/names/*.txt'))

import unicodedata
import string

all_letters = string.ascii_letters + " .,;'"
n_letters = len(all_letters)

# Turn a Unicode string to plain ASCII, thanks to https://stackoverflow.com/a/518232/2809427
def unicodeToAscii(s):
    return ''.join(
        c for c in unicodedata.normalize('NFD', s)
        if unicodedata.category(c) != 'Mn'
        and c in all_letters
    )

print(unicodeToAscii('Ślusàrski'))

# Build the category_lines dictionary, a list of names per language
category_lines = {}
all_categories = []

# Read a file and split into lines
def readLines(filename):
    lines = open(filename, encoding='utf-8').read().strip().split('\n')
    return [unicodeToAscii(line) for line in lines]

for filename in findFiles('data/names/*.txt'):
    category = os.path.splitext(os.path.basename(filename))[0]
    all_categories.append(category)
    lines = readLines(filename)
    category_lines[category] = lines

n_categories = len(all_categories)



# 
# 

# In[33]:


#print(category_lines['Italian'][:5])


# Turning Names into Tensors

# 

# In[34]:


import torch

# Find letter index from all_letters, e.g. "a" = 0
def letterToIndex(letter):
    return all_letters.find(letter)

# Just for demonstration, turn a letter into a <1 x n_letters> Tensor
def letterToTensor(letter):
    tensor = torch.zeros(1, n_letters)
    tensor[0][letterToIndex(letter)] = 1
    return tensor

# Turn a line into a <line_length x 1 x n_letters>,
# or an array of one-hot letter vectors
def lineToTensor(line):
    tensor = torch.zeros(len(line), 1, n_letters)
    for li, letter in enumerate(line):
        tensor[li][0][letterToIndex(letter)] = 1
    return tensor

print(letterToTensor('J'))

print(lineToTensor('Jones').size())



# This RNN module (mostly copied from `the PyTorch for Torch users
# tutorial <https://pytorch.org/tutorials/beginner/former_torchies/
# nn_tutorial.html#example-2-recurrent-net>`__)
# is just 2 linear layers which operate on an input and hidden state, with
# a LogSoftmax layer after the output.
# 
# .. figure:: https://i.imgur.com/Z2xbySO.png
#    :alt:
# 
# 
# 
# 

# In[35]:


i
        self.i2h = nn.Linear(input_size + hidden_size, hidden_size)
        self.i2o = nn.Linear(input_size + hidden_size, output_size)
        self.softmax = nn.LogSoftmax(dim=1)

    def forward(self, input, hidden):
        combined = torch.cat((input, hidden), 1)
        hidden = self.i2h(combined)
        output = self.i2o(combined)
        output = self.softmax(output)
        return output, hidden

    def initHidden(self):
        return torch.zeros(1, self.hidden_size)

n_hidden = 128
rnn = RNN(n_letters, n_hidden, n_categories)


# To run a step of this network we need to pass an input (in our case, the
# Tensor for the current letter) and a previous hidden state (which we
# initialize as zeros at first). We'll get back the output (probability of
# each language) and a next hidden state (which we keep for the next
# step).
# 
# 
# 

# In[36]:


inp

# For the sake of efficiency we don't want to be creating a new Tensor for
# every step, so we will use ``lineToTensor`` instead of
# ``letterToTensor`` and use slices. This could be further optimized by
# pre-computing batches of Tensors.
# 
# 
# 

# In[37]:


input = lineToTensor('Albert')
hidden = torch.zeros(1, n_hidden)

output, next_hidden = rnn(input[0], hidden)
print(output)


# As you can see the output is a ``<1 x n_categories>`` Tensor, where
# every item is the likelihood of that category (higher is more likely).
# 
# 
# 

# Training
# ========
# Preparing for Training
# ----------------------
# 
# Before going into training we should make a few helper functions. The
# first is to interpret the output of the network, which we know to be a
# likelihood of each category. We can use ``Tensor.topk`` to get the index
# of the greatest value:
# 
# 
# 

# In[38]:


def categoryFromOutput(output):
    top_n, top_i = output.topk(1)
    category_i = top_i[0].item()
    return all_categories[category_i], category_i

#print(categoryFromOutput(output))


# We will also want a quick way to get a training example (a name and its
# language):
# 
# 
# 

# In[39]:


import random

def randomChoice(l):
    return l[random.randint(0, len(l) - 1)]

def randomTrainingExample():
    category = randomChoice(all_categories)
    line = randomChoice(category_lines[category])
    category_tensor = torch.tensor([all_categories.index(category)], dtype=torch.long)
    line_tensor = lineToTensor(line)
    return category, line, category_tensor, line_tensor

for i in range(10):
    category, line, category_tensor, line_tensor = randomTrainingExample()
    print('category =', category, '/ line =', line)


# Training the Network
# --------------------
# 
# Now all it takes to train this network is show it a bunch of examples,
# have it make guesses, and tell it if it's wrong.
# 
# For the loss function ``nn.NLLLoss`` is appropriate, since the last
# layer of the RNN is ``nn.LogSoftmax``.
# 
# 
# 

# In[40]:


criterion = nn.NLLLoss()

# 
#    -  Keep hidden state for next letter
# 
# -  Compare final output to target
# -  Back-propagate
# -  Return the output and loss
# 
# 
# 

# In[41]:


learning_rate = 0.005 # If you set this too high, it might explode. If too low, it might not learn

def train(category_tensor, line_tensor):
    hidden = rnn.initHidden()

    rnn.zero_grad()

    for i in range(line_tensor.size()[0]):
        output, hidden = rnn(line_tensor[i], hidden)

    loss = criterion(output, category_tensor)
    loss.backward()

    # Add parameters' gradients to their values, multiplied by learning rate
    for p in rnn.parameters():
        p.data.add_(p.grad.data, alpha=-learning_rate)

    return output, loss.item()


# Now we just have to run that with a bunch of examples. Since the
# ``train`` function returns both the output and loss we can print its
# guesses and also keep track of loss for plotting. Since there are 1000s
# of examples we print only every ``print_every`` examples, and take an
# average of the loss.
# 
# 
# 

    m = math.floor(s / 60)
    s -= m * 60
    return '%dm %ds' % (m, s)

start = time.time()

for iter in range(1, n_iters + 1):
    category, line, category_tensor, line_tensor = randomTrainingExample()
    output, loss = train(category_tensor, line_tensor)
    current_loss += loss

    # Print iter number, loss, name and guess
    if iter % print_every == 0:
        guess, guess_i = categoryFromOutput(output)
        correct = '✓' if guess == category else '✗ (%s)' % category
        print('%d %d%% (%s) %.4f %s / %s %s' % (iter, iter / n_iters * 100, timeSince(start), loss, line, guess, correct))

    # Add current loss avg to list of losses
    if iter % plot_every == 0:
        all_losses.append(current_loss / plot_every)
        current_loss = 0


# Plotting the Results
# --------------------
# 
# Plotting the historical loss from ``all_losses`` shows the network
# learning:
# 
# 
# 

# In[22]:


plt.figure()
plt.plot(all_losses)


# Evaluating the Results
# ======================
# 
# To see how well the network performs on different categories, we will
# create a confusion matrix, indicating for every actual language (rows)
# which language the network guesses (columns). To calculate the confusion
# matrix a bunch of samples are run through the network with
# ``evaluate()``, which is the same as ``train()`` minus the backprop.
# 
# 
# 

# In[46]:


# Keep track of correct guesses in a confusion matrix
confusion = torch.zeros(n_categories, n_categories)
n_confusion = 10000

# Just return an output given a line
def evaluate(line_tensor):
    hidde
# Go through a bunch of examples and record which are correctly guessed
for i in range(n_confusion):
    category, line, category_tensor, line_tensor = randomTrainingExample()
    output = evaluate(line_tensor)
    guess, guess_i = categoryFromOutput(output)
    category_i = all_categories.index(category)
    confusion[category_i][guess_i] += 1

# Normalize by dividing every row by its sum
for i in range(n_categories):
    confusion[i] = confusion[i] / confusion[i].sum()

# Set up plot
fig = plt.figure()
ax = fig.add_subplot(111)
cax = ax.matshow(confusion.numpy())
fig.colorbar(cax)

# Set up axes
ax.set_xticklabels([''] + all_categories, rotation=90)
ax.set_yticklabels([''] + all_categories)

# Force label at every tick
ax.xaxis.set_major_locator(ticker.MultipleLocator(1))
ax.yaxis.set_major_locator(ticker.MultipleLocator(1))


# You can pick out bright spots off the main axis that show which
# languages it guesses incorrectly, e.g. Chinese for Korean, and Spanish
# for Italian. It seems to do very well with Greek, and very poorly with
# English (perhaps because of overlap with other languages).
# 
# 
# 

# Running on User Input
# ---------------------
# 
# 
# 

# In[47]:


def predict(input_line, n_predictions=3):
    print('\n> %s' % input_line)
    with t evaluate(lineToTensor(input_line))
 = []

        for i in range(n_predictions):
            value = topv[0][i].item()
            category_index = topi[0][i].item()
            print('(%.2f) %s' % (value, all_categories[category_index]))
            predictions.append([value, all_categories[category_index]])

predict('Dovesky')
predict('Jackson')
predict('Satoshi')


# The final versions of the scripts `in the Practical PyTorch
# repo <https://github.com/spro/practical-pytorch/tree/master/char-rnn-classification>`__
# split the above code into a few files:
# 
# -  ``data.py`` (loads files)
# -  ``model.py`` (defines the RNN)
# -  ``train.py`` (runs training)
# -  ``predict.py`` (runs ``predict()`` with command line arguments)
# -  ``server.py`` (serve prediction as a JSON API with bottle.py)
# 
# Run ``train.py`` to train and save the network.
# 
# Run ``predict.py`` with a name to view predictions:
# 
# ::
# 
#     $ python predict.py Hazaki
#     (-0.42) Japanese
#     (-1.39) Polish
irst name -> gender
#    -  Character name -> writer
#    -  Page title -> blog or subreddit
# 
# -  Get better results with a bigger and/or better shaped network
# 
#    -  Add more linear layers
#    -  Try the ``nn.LSTM`` and ``nn.GRU`` layers
#    -  Combine multiple of these RNNs as a higher level network
# 
# 
# 

创作不易 觉得有帮助请点赞关注收藏~~~

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:http://www.coloradmin.cn/o/5577.html

如若内容造成侵权/违法违规/事实不符,请联系多彩编程网进行投诉反馈,一经查实,立即删除!

相关文章

Windows版Ros环境的搭建以及Rviz显示激光点云信息

安装步骤&#xff1a; 1.安装visual studio 2019-2022 2.安装ROS 3.创建ROS快捷终端 4.运行测试效果 一、安装Visual Studio 2022 需要利用vs编译ROS代码&#xff0c;所以需要安装Visual Studio 2022 这里注意要使用vs2022&#xff0c;ROS wiki给的教程是使用2019 1).使…

Python学习小组课程-课程大纲与Python开发环境安装

一、前言 注意&#xff1a;此为内部小组学习资料&#xff0c;非售卖品&#xff0c;仅供学习参考。 为提升项目落地的逻辑思维能力&#xff0c;以及通过自我创造工具来提升工作效率&#xff0c;特成立Python学习小组。计划每周花一个小时进行在线会议直播学习&#xff0c;面向…

力扣21 - 合并两个有序链表【归并排序思维】

链式铠甲——合体一、题目描述二、思路分析三、代码详解way1【不带头结点】way2【带头结点】四、整体代码展示【需要自取】方法一&#xff1a;不带哨兵位【无头结点】方法二&#xff1a;带哨兵位【有头结点】五、总结与提炼一、题目描述 原题传送门&#x1f6aa; 将两个升序链…

vs2019编译ffmpeg4.4为静态库或动态库

参考文章&#xff1a;vs2019编译ffmpeg源码为静态库动态库【完整步骤、亲测可行】 文章目录编译测试编译 直接把博主的项目下下来 我打开里面FFmpeg文件发现它貌似是4.4版本 然后照着他给的步骤执行命令 先找到vs2019的命令行工具 然后执行两个脚本 执行以上两个脚本后&…

快速排序和归并排序非递归的详解

在经过主页中《八大排序》&#xff08;下&#xff09;的学习&#xff0c;我们了解了快速排序和归并排序且都是递归的思想&#xff0c;但是如果递归的深度很深呢&#xff1f;这一节我们就引出用非递归的思想解决这个问题。&#x1f635;&#x1f635;&#x1f635; 快速排序非递…

根据给定数组,创建形状相同的数组并且采用不同方式填充full_like()

【小白从小学Python、C、Java】 【计算机等级考试500强双证书】 【Python-数据分析】 根据给定数组&#xff0c;创建形状相同的数组 并且采用不同方式填充 full_like() [太阳]选择题 对下面代码中full_like函数结果描述错误的选项为&#xff1f; import numpy as np print(&q…

谷粒学院——Day05【后台系统前端项目创建、讲师管理模块前端开发】

后台系统前端项目创建 一、vue-element-admin 简介 vue-element-admin 是基于 element-ui 的一套后台管理系统集成方案。 功能&#xff1a;https://panjiachen.github.io/vue-element-admin-site/zh/guide/#功能 GitHub地址&#xff1a;https://github.com/PanJiaChen/vue-ele…

分布式锁_Redis分布式锁+Redisson分布式锁+Zookeeper分布式锁+Mysql分布式锁(原版)

分布式锁_Redis分布式锁Redisson分布式锁Zookeeper分布式锁Mysql分布式锁&#xff08;原版&#xff09; 文章目录分布式锁_Redis分布式锁Redisson分布式锁Zookeeper分布式锁Mysql分布式锁&#xff08;原版&#xff09;1. 传统锁回顾1.1. 从减库存聊起1.2. 环境准备1.3. 简单实现…

Dreamweaver网页设计与制作100例——HTML5期末考核大作业——票务网站整套网页

HTML实例网页代码, 本实例适合于初学HTML的同学。该实例里面有设置了css的样式设置&#xff0c;有div的样式格局&#xff0c;这个实例比较全面&#xff0c;有助于同学的学习,本文将介绍如何通过从头开始设计个人网站并将其转换为代码的过程来实践设计。 文章目录一、网页介绍一…

正确查询DO基站IP

对于DO站的IP地址在系统中设置是否正确需要确定基站侧IP地址和RNC侧地址是否匹配&#xff0c;匹配关系为&#xff1a;基站侧IP地址减2即为RNC侧地址&#xff08;如&#xff1a;RCS 234 BTS-IP: 6.33.84.30 则匹配RNC侧地址即为6.33.84.28&#xff09;&#xff0c;下面举例进行襄…

基于单片机的语音小车设计

目 录 引言 1 1 系统概述 1 1.1 声控产品前景和发展趋势 1 1.2 研究目的和意义 1 1.3 本次设计内容 2 2 系统设计的整体方案 2 2.1 主控芯片的方案论证 2 2.2 语音识别模块的方案论证 3 2.3 电机驱动方案选择 4 2.4 本章小节 4 3 系统…

使用ssh克隆GitHub仓库以及替换https方式

目录 使用ssh克隆GitHub仓库 第一步&#xff1a;生成ssh 第二步&#xff1a;添加SSH key 第三步&#xff1a;验证绑定是否成功 第四步&#xff1a;克隆 意外的情况&#xff1a; 情况1&#xff1a;ssh连接GitHub失败 情况2&#xff1a;使用git clone 不成功 替换原来的…

队列的简单实现

队列的简单实现一、什么是队列二、队列的分类三、队列的数据结构四、队列的基本操作1、初始化队列2、销毁队列3、入队4、出队5、队列判空6、获取队头元素7、获取队尾元素8、获取队列元素总结头文件基本操作一、什么是队列 首先我们既然想要实现队列就得明白什么是队列&#xff…

1.7.4、计算机网络体系结构中的术语

1.7.4、计算机网络体系结构中的术语 1.7.4.1、实体 实体&#xff1a; 任何可发送或接收信息的硬件或软件进程。 对等实体&#xff1a; 收发双方相同层次中的实体 1.7.4.2、协议 协议&#xff1a;控制两个的对等实体进行逻辑通信的规则的集合 之所以称为逻辑通信&#xf…

目标检测论文解读复现之五:改进YOLOv5的SAR图像舰船目标检测

目标检测论文解读复现 文章目录目标检测论文解读复现前言一、摘要二、网络模型及核心创新点三、应用数据集四、实验效果&#xff08;部分展示&#xff09;五、实验结论六、投稿期刊介绍前言 此前出了目标改进算法专栏&#xff0c;但是对于应用于什么场景&#xff0c;需要什么改…

HTML5期末考核大作业,电影网站——橙色国外电影 web期末作业设计网页

HTML实例网页代码, 本实例适合于初学HTML的同学。该实例里面有设置了css的样式设置&#xff0c;有div的样式格局&#xff0c;这个实例比较全面&#xff0c;有助于同学的学习,本文将介绍如何通过从头开始设计个人网站并将其转换为代码的过程来实践设计。 文章目录一、网页介绍一…

【代码精读】ATF的异常向量表

快速链接: . 👉👉👉 【代码精读】–Kernel/ATF/optee等-目录👈👈👈 付费专栏-付费课程 【购买须知】:本专栏的视频介绍-----视频👈👈👈概要: 本文概述了ARMv8/ARMv9的aarch64体系中异常向量表的结构、以及基地寄存器的总结。然后通过导读ATF BL31的异常向量…

Flink系列文档-(YY09)-Flink时间语义

1 三种时间语义 在实时流式计算中&#xff0c;"时间"是一个能影响计算结果的非常重要因素&#xff01; 试想场景&#xff1a;每隔1分钟计算一次最近10分钟的活跃用户量&#xff1a; ①假设此刻的时间是13:10&#xff0c;要计算的活跃用户量时间段为&#xff1a;[ …

【C++】类和对象(下)

​&#x1f320; 作者&#xff1a;阿亮joy. &#x1f386;专栏&#xff1a;《吃透西嘎嘎》 &#x1f387; 座右铭&#xff1a;每个优秀的人都有一段沉默的时光&#xff0c;那段时光是付出了很多努力却得不到结果的日子&#xff0c;我们把它叫做扎根 目录&#x1f449;再谈构造…

kindle自定义屏保之自定义字帖

kindle自定义屏保之自定义字帖 01 前言 毕业以后&#xff0c;很少动笔写字了&#xff0c;某天要手写一堆材料&#xff0c;写出来实在不忍直视&#xff0c;于是当晚下班后突发奇想——能不能把一些字帖搞成kindle屏保&#xff0c;摆在桌面上&#xff0c;睡前说不准还能练练 随…