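Series articles
- Step by step: a Django-based news text classification and visualization system (text classification implemented with BERT)
- Step by step: text classification in Python (decision trees and random forests with sklearn)
- Step by step: a TensorFlow-based speech recognition system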
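Contents
- Series articles
- I. Project Overview
- II. System Features
- 1. Data import
- 2. Data cleaning
- 3. Word frequency statistics
- 4. Sentiment analysis
- 5. Visualization
- 6. User interaction
- III. Interface Overview
- 1. Home page
- 2. Question bank page
- 3. Data analysis and visualization page
- 4. User management page
- IV. System Architecture
- V. Code Walkthrough
- 1. Dependencies
- 2. Creating the database
- 3. Data analysis module
- 4. User management module
- 5. Building the Django display pages
- VI. Code Download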
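I. Project Overview
This article introduces an English-text data analysis and visualization system built with Python and the Django framework.
The system works on passages, questions, and answers from the College English Test (CET-4 and CET-6) and analyzes these texts with techniques that include, but are not limited to, the following:
It processes English text data (word frequency statistics, sentiment analysis, and so on) and presents the results as charts, so that users can analyze and visualize English text conveniently.
The complete code is linked at the bottom of this article; readers who want the source first can jump there to download it.
I have read other articles on text classification models, but most of them are heavier on theory than on practice. Many readers do not need a deep understanding of the principles; they only need to get a working visualization system built.
That is exactly why I wrote this post: most write-ups online only explain the principles, and few show a working implementation.
If that is what you are looking for, you have come to the right place!
Without further ado, let's get started!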
II. System Features
1. Data import
Supports importing English text data in several formats, such as .txt, .csv, and .json.
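The article does not show the import code itself, so here is a minimal sketch of what a multi-format loader might look like, using only the standard library. The function name `load_texts` and the `text_field` column/key name are assumptions made for illustration; they are not part of the project.

```python
import csv
import json
from pathlib import Path


def load_texts(path, text_field="text"):
    """Return a list of text strings read from a .txt, .csv, or .json file.

    `text_field` is a hypothetical column/key name that holds the text.
    """
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".txt":
        # Treat the whole file as a single document.
        return [path.read_text(encoding="utf-8")]
    if suffix == ".csv":
        # One document per row, taken from the `text_field` column.
        with path.open(newline="", encoding="utf-8") as f:
            return [row[text_field] for row in csv.DictReader(f)]
    if suffix == ".json":
        # Assumes the file holds a JSON array of objects.
        with path.open(encoding="utf-8") as f:
            return [item[text_field] for item in json.load(f)]
    raise ValueError(f"Unsupported file type: {suffix}")
```

Dispatching on the file extension keeps the rest of the pipeline format-agnostic: every loader returns the same plain list of strings.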
2. Data cleaning
Provides data cleaning, including removal of stop words, punctuation, and digits.
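As a rough illustration of this cleaning step, the sketch below lowercases the text, strips punctuation and digits with a regular expression, and then filters stop words. The tiny `STOP_WORDS` set and the function name `clean_tokens` are placeholders chosen for this example; the project itself reads its stop-word list from a file.

```python
import re

# A tiny illustrative stop-word set (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is"}


def clean_tokens(text, stop_words=STOP_WORDS):
    """Lowercase, drop punctuation and digits, then remove stop words."""
    # Replace everything except letters, apostrophes, and whitespace.
    letters_only = re.sub(r"[^A-Za-z'\s]", " ", text.lower())
    return [w for w in letters_only.split() if w not in stop_words]


tokens = clean_tokens("The price of 3 apples is $5, and rising!")
```

Here `tokens` ends up as `['price', 'apples', 'rising']`.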
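3. Word frequency statistics
Counts how often each word occurs in the text and shows the result as word clouds, bar charts, and similar figures.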
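The counting itself can be sketched with `collections.Counter`, as below. The helper name `word_frequencies` and the added "share of all tokens" column are assumptions for illustration only; the charts in this project plot the raw counts.

```python
from collections import Counter


def word_frequencies(tokens, top=3):
    """Return (word, count, share) tuples for the `top` most common words."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return [(word, n, round(n / total, 3)) for word, n in counts.most_common(top)]


freqs = word_frequencies("time data time people data time".split())
```

For this input, `freqs` is `[('time', 3, 0.5), ('data', 2, 0.333), ('people', 1, 0.167)]`.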
4. Sentiment analysis
Uses NLP techniques to analyze the sentiment of a text and classify it as positive, negative, or neutral.
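The sentiment code is not shown in this article, so the following is only a toy, lexicon-based sketch of the positive/negative/neutral idea, not the project's actual method. The two word sets are made up for this example; a real system would more likely rely on a trained model or an established lexicon.

```python
# Hypothetical mini-lexicons, for illustration only.
POSITIVE = {"good", "great", "happy", "benefit", "improve"}
NEGATIVE = {"bad", "poor", "sad", "harm", "worse"}


def classify_sentiment(text):
    """Label text by counting positive words minus negative words."""
    # Split on whitespace and strip surrounding punctuation from each word.
    words = [w.strip(".,!?;:\"'()").lower() for w in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

A word-counting scorer like this ignores negation and context ("not good" still counts as positive), which is one reason it should be treated as a sketch only.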
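5. Visualization
Visualizes the data with several chart types, such as line charts, pie charts, and scatter plots.
6. User interaction
Provides a friendly user interface that makes the system easy to operate.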
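III. Interface Overview
The finished system looks like this:
1. Home page
- Home page: provides three modules: Materials, Business Intelligence, and User Management
- These cover, respectively: (1) CET question lookup; (2) data analysis and visualization; (3) user registration and login
2. Question bank page
- Page 2: browse the CET questions and answers from 2016 to 2020 for self-practice.
- Page 3: the answer page
3. Data analysis and visualization page
- Page 4: data analysis and word frequency statistics for the questions and answers
- Word length statistics
- High-frequency word statistics
4. User management page
- User registration
- User login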
IV. System Architecture
The overall architecture and functional modules are not complicated; see the figure below:
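V. Code Walkthrough
1. Dependencies
This project was developed in PyCharm with a Python interpreter. Readers unfamiliar with PyCharm can refer to the introductory tutorials by other CSDN bloggers; I will not go into detail here.
Backend: Python, Django, NLTK (Natural Language Toolkit)
Frontend: HTML, CSS, JavaScript, Bootstrap, jQuery
Database: MySQL
Visualization libraries: Matplotlib, Pyecharts, WordCloud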
2. Creating the database
- Create the MySQL database and load the data into it
import mysql.connector
import pandas as pd


class DbCreate:
    def __init__(self):
        self.my_db = mysql.connector.connect(
            host="localhost",
            user="root",
            passwd="123456",
            auth_plugin='mysql_native_password'
        )

    def create_database(self, database_name):
        """
        Create a database.
        database_name: name of the database
        """
        # Get a cursor
        db_cursor = self.my_db.cursor()
        try:
            sql = "CREATE DATABASE " + database_name + ";"
            db_cursor.execute(sql)
            # Commit
            self.my_db.commit()
        except Exception as e:
            print("create_database failed, error:", str(e))
        else:
            print("create_database succeeded......")

    def drop_database(self, database_name):
        """
        Drop a database.
        database_name: name of the database
        """
        # Get a cursor
        db_cursor = self.my_db.cursor()
        try:
            sql = "DROP DATABASE " + database_name + ";"
            db_cursor.execute(sql)
            # Commit
            self.my_db.commit()
        except Exception as e:
            print("drop_database failed, error:", str(e))
        else:
            print("drop_database succeeded......")
class DbConnect:
    def __init__(self, dbname):
        self.my_db = mysql.connector.connect(
            host="localhost",
            user="root",
            passwd="123456",
            auth_plugin='mysql_native_password',
            database=dbname
        )

    def insert_user(self, email, name, password):
        """
        Create a user.
        """
        my_cursor = self.my_db.cursor()
        try:
            sql = "INSERT INTO user (email,name,password) VALUES (%s, %s, %s);"
            val = (email, name, password)
            # Execute
            my_cursor.execute(sql, val)
            # Commit
            self.my_db.commit()
        except Exception as e:
            print("insert_user failed, error:", str(e))
        else:
            print('insert_user succeeded......')

    def create_tb_user(self):
        """
        Create the user table.
        """
        my_cursor = self.my_db.cursor()
        try:
            # DDL for the user table
            sql = "CREATE TABLE user (id INT AUTO_INCREMENT PRIMARY KEY, email VARCHAR(255), name VARCHAR(255)" \
                  ", password VARCHAR(255));"
            my_cursor.execute(sql)
            # Commit
            self.my_db.commit()
        except Exception as e:
            print("create_tb_user failed, error:", str(e))
        else:
            print('create_tb_user succeeded......')
            # Create the administrator account
            print("Creating administrator......")
            self.insert_user("email@163.com", "test", "123456")

    def create_tb_cet(self):
        """
        Create the question bank table.
        """
        my_cursor = self.my_db.cursor()
        try:
            # DDL for the question bank (exam) table
            sql = "CREATE TABLE exam (id INT AUTO_INCREMENT PRIMARY KEY, 编号 VARCHAR(255),级别 VARCHAR(255)" \
                  ", 年份 VARCHAR(255),月份 VARCHAR(255),套数 VARCHAR(255),题型 VARCHAR(255),内容 TEXT," \
                  "选项 TEXT,答案 TEXT ,翻译 TEXT );"
            my_cursor.execute(sql)
            # Commit
            self.my_db.commit()
        except Exception as e:
            print("create_tb_cet failed, error:", str(e))
        else:
            print('create_tb_cet succeeded......')
    def insert_exam(self, nbr, level, year, month, t_nbr, tx, text, choose, answer, translation):
        """
        Insert a single question bank record.
        """
        my_cursor = self.my_db.cursor()
        try:
            sql = "INSERT INTO exam (编号,级别,年份,月份,套数,题型,内容,选项,答案,翻译) VALUES " \
                  "(%s, %s, %s, %s, %s, %s, %s, %s, %s, %s);"
            val = (nbr, level, year, month, t_nbr, tx, text, choose, answer, translation)
            # Execute
            my_cursor.execute(sql, val)
            # Commit
            self.my_db.commit()
        except Exception as e:
            print("insert_exam failed, error:", str(e))

    def insert_exam_all(self):
        """
        Read the Excel file and insert every question bank record.
        """
        data_exam = pd.read_excel("data/data_exam.xlsx", engine='openpyxl')
        print("Question bank loaded, writing it to the database......")
        for i in range(len(data_exam)):
            try:
                self.insert_exam(
                    str(data_exam['编号'][i]),
                    str(data_exam['级别'][i]),
                    str(data_exam['年份'][i]),
                    str(data_exam['月份'][i]),
                    str(data_exam['套数'][i]),
                    str(data_exam['题型'][i]),
                    str(data_exam['内容'][i]),
                    str(data_exam['选项'][i]),
                    str(data_exam['答案'][i]),
                    str(data_exam['翻译'][i])
                )
            except Exception as e:
                print("insert_exam_all raised an exception, error:", e)
            else:
                print("Question bank record", i + 1, "inserted......")
        print("Question bank import finished.......")
    def create_tb_text(self):
        """
        Create the article table.
        """
        my_cursor = self.my_db.cursor()
        try:
            # DDL for the article (text) table
            sql = "CREATE TABLE text (id INT AUTO_INCREMENT PRIMARY KEY, 级别 VARCHAR(255)" \
                  ", 年份 VARCHAR(255),月份 VARCHAR(255),套数 VARCHAR(255),题型 VARCHAR(255),内容 TEXT)"
            my_cursor.execute(sql)
            # Commit
            self.my_db.commit()
        except Exception as e:
            print("create_tb_text failed, error:", str(e))
        else:
            print('create_tb_text succeeded......')

    def insert_text(self, level, year, month, t_nbr, tx, text):
        """
        Insert a single article record.
        """
        my_cursor = self.my_db.cursor()
        try:
            sql = "INSERT INTO text (级别,年份,月份,套数,题型,内容) VALUES " \
                  "(%s, %s, %s, %s, %s, %s);"
            val = (level, year, month, t_nbr, tx, text)
            # Execute
            my_cursor.execute(sql, val)
            # Commit
            self.my_db.commit()
        except Exception as e:
            print("insert_text failed, error:", str(e))

    def insert_text_all(self):
        """
        Read the Excel file and insert every article.
        """
        data_text = pd.read_excel("data/data_text.xlsx", engine='openpyxl')
        print("Articles loaded, writing them to the database......")
        for i in range(len(data_text)):
            try:
                self.insert_text(
                    str(data_text['级别'][i]),
                    str(data_text['年份'][i]),
                    str(data_text['月份'][i]),
                    str(data_text['套数'][i]),
                    str(data_text['题型'][i]),
                    str(data_text['内容'][i])
                )
            except Exception as e:
                print("insert_text_all raised an exception, error:", e)
            else:
                print("Article record", i + 1, "inserted......")
        print("Article import finished.......")
if __name__ == '__main__':
    # Database name
    db_name = 'exam'
    # # Create the database
    # my_db_create = DbCreate()
    # # my_db_create.drop_database('english')
    # my_db_create.drop_database(db_name)
    # my_db_create.create_database(db_name)
    # Open a MySQL connection
    my_db_connect = DbConnect(db_name)
    # # Create the user table
    # my_db_connect.create_tb_user()
    # # Create the question bank table
    # my_db_connect.create_tb_cet()
    # # Insert the question bank data
    # my_db_connect.insert_exam_all()
    # Create the article table
    my_db_connect.create_tb_text()
    # Insert the article data
    my_db_connect.insert_text_all()
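The import above commits once per row. As a hedged alternative, the sketch below inserts all rows with one parameterized statement inside a single transaction. To keep it runnable without a MySQL server it uses the standard-library `sqlite3` module as a stand-in, so the placeholder is `?` rather than `%s`, and the table has simplified, hypothetical English column names (`level`, `year`, `content`). The helper name `insert_texts` is also an assumption.

```python
import sqlite3


def insert_texts(conn, rows):
    """Insert all rows in one transaction with a parameterized statement."""
    sql = "INSERT INTO text (level, year, content) VALUES (?, ?, ?)"
    # Used as a context manager, the connection commits on success
    # and rolls back if an exception is raised.
    with conn:
        conn.executemany(sql, rows)


# In-memory database, used only so that the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE text (id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "level TEXT, year TEXT, content TEXT)"
)
insert_texts(conn, [
    ("4", "2016", "First passage."),
    ("6", "2017", "Second passage."),
])
row_count = conn.execute("SELECT COUNT(*) FROM text").fetchone()[0]
```

Batching this way means a failure part-way through leaves the table unchanged instead of half-filled, at the cost of losing the per-row progress messages.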
3. Data analysis module
from collections import Counter
import mysql.connector
import pandas as pd
from pyecharts.charts import Bar
from pyecharts import options as opts
from pyecharts.charts import Pie


class DbConnect:
    def __init__(self):
        self.my_db = mysql.connector.connect(
            host="localhost",
            user="root",
            passwd="123456",
            auth_plugin='mysql_native_password',
            database='exam'
        )

    def select_text_len(self):
        """
        Fetch the passage contents and years for type-A and type-B passages.
        """
        text_result = {}
        try:
            my_cursor = self.my_db.cursor()
            print("Connected to the database, reading data......")
            sql = "SELECT 内容,年份 FROM text WHERE 题型 = 'A' AND 级别=4"
            my_cursor.execute(sql)
            result = my_cursor.fetchall()
            text_4_a = []
            text_year = []
            # CET-4 type-A passages and their years
            for i in range(len(result)):
                text_4_a.append(result[i][0])
                text_year.append(result[i][1])
            sql = "SELECT 内容 FROM text WHERE 题型 = 'A' AND 级别=6"
            my_cursor.execute(sql)
            result = my_cursor.fetchall()
            text_6_a = []
            # CET-6 type-A passages
            for i in range(len(result)):
                text_6_a.append(result[i][0])
            sql = "SELECT 内容 FROM text WHERE 题型 = 'B' AND 级别=4"
            my_cursor.execute(sql)
            result = my_cursor.fetchall()
            text_4_b = []
            # CET-4 type-B passages
            for i in range(len(result)):
                text_4_b.append(result[i][0])
            sql = "SELECT 内容 FROM text WHERE 题型 = 'B' AND 级别=6"
            my_cursor.execute(sql)
            result = my_cursor.fetchall()
            text_6_b = []
            # CET-6 type-B passages
            for i in range(len(result)):
                text_6_b.append(result[i][0])
            text_result = {
                'year': text_year,
                '4_a': text_4_a,
                '4_b': text_4_b,
                '6_a': text_6_a,
                '6_b': text_6_b
            }
        except Exception as e:
            # Build one string (the original built a tuple by mistake)
            msg = "Query failed, error: " + str(e)
            print(msg)
        return text_result
def deal_length(my_l):
    """
    Compute the length (in words) of each passage.
    """
    result = []
    for text in my_l:
        result.append(len(text.split(" ")))
    return result


def show_length():
    """
    Chart the lengths of the type-A and type-B passages.
    """
    my_dbc = DbConnect()
    result = my_dbc.select_text_len()
    year = result['year']
    text_4a = result['4_a']
    text_4b = result['4_b']
    text_6a = result['6_a']
    text_6b = result['6_b']
    # Compute passage lengths
    text_4a_l = deal_length(text_4a)
    text_4b_l = deal_length(text_4b)
    text_6a_l = deal_length(text_6a)
    text_6b_l = deal_length(text_6b)
    # Generate the CET-4 page
    print("Generating pages......")
    bar = Bar(init_opts=opts.InitOpts(width="800px", height="400px", page_title="img_view"))
    # Passage lengths for CET-4
    bar.add_xaxis(year)
    bar.add_yaxis("CET4 A Length", text_4a_l)
    bar.add_yaxis("CET4 B Length", text_4b_l)
    bar.set_global_opts(title_opts=opts.TitleOpts(title="CET4", subtitle="Statistical"))
    bar.render("./templates/bi_show_cet4l.html")
    print("Done, see /templates/bi_show_cet4l.html ......")
    # Generate the CET-6 page
    bar = Bar(init_opts=opts.InitOpts(width="800px", height="400px", page_title="img_view"))
    # Passage lengths for CET-6
    bar.add_xaxis(year)
    bar.add_yaxis("CET6 A Length", text_6a_l)
    bar.add_yaxis("CET6 B Length", text_6b_l)
    bar.set_global_opts(title_opts=opts.TitleOpts(title="CET6", subtitle="Statistical"))
    bar.render("./templates/bi_show_cet6l.html")
    print("Done, see /templates/bi_show_cet6l.html ......")
def read_stop_txt(rootdir):
    """
    Read the stop-word list (one word per line).
    """
    lines = []
    with open(rootdir, 'r', encoding="utf-8") as file_to_read:
        while True:
            line = file_to_read.readline()
            if not line:
                break
            line = line.strip('\n').strip(' ')
            lines.append(line)
    return lines


def words_clean(data):
    """Data cleaning: strip punctuation and escape residue from each word."""
    lines = []
    for word in data:
        word = word.replace("\n", '').replace("\\u", '').replace("\\", '').replace("_", '') \
            .replace("—", '').replace("-", '').replace("?", '').replace("[", '').replace("]", '') \
            .replace(".", '').replace("“", '').replace(",", '').replace("$", '').replace(";", '') \
            .replace("(", '').replace(")", '').replace("\"", '').replace(":", '').replace("”", '')
        # if word.isalpha():
        #     lines.append(word)
        # else:
        #     print(word)
        lines.append(word)
    return lines
def show_words_length():
    """
    Chart the share of each word length in the type-A and type-B passages.
    """
    my_dbc = DbConnect()
    result = my_dbc.select_text_len()
    list_4_a = result['4_a']
    list_4_b = result['4_b']
    list_6_a = result['6_a']
    list_6_b = result['6_b']
    # Merge the passages and split them into words
    text_4_a = words_clean(' '.join(list_4_a).split(" "))
    text_4_b = words_clean(' '.join(list_4_b).split(" "))
    text_6_a = words_clean(' '.join(list_6_a).split(" "))
    text_6_b = words_clean(' '.join(list_6_b).split(" "))
    len_4a = [len(word) for word in text_4_a if word != '']
    len_4b = [len(word) for word in text_4_b if word != '']
    len_6a = [len(word) for word in text_6_a if word != '']
    len_6b = [len(word) for word in text_6_b if word != '']
    # Count word lengths and keep the 12 most common
    c_4a = Counter(len_4a).most_common(12)
    c_4b = Counter(len_4b).most_common(12)
    c_6a = Counter(len_6a).most_common(12)
    c_6b = Counter(len_6b).most_common(12)
    # Chart size, pie center, and legend offset
    w = "700px"
    h = "600px"
    center = [400, 300]
    pos_left = 200
    # Generate the CET-4 pages
    # Passage A
    pie = Pie(init_opts=opts.InitOpts(width=w, height=h, page_title="img_view"))
    # Add the data
    pie.add("words length", c_4a, center=center)
    pie.set_global_opts(title_opts=opts.TitleOpts(title="CET4-A", subtitle="words length statistical"),
                        legend_opts=opts.LegendOpts(pos_left=pos_left))
    pie.render("./templates/bi_show_cet4pa.html")
    print("Done, see /templates/bi_show_cet4pa.html ......")
    # Passage B
    pie = Pie(init_opts=opts.InitOpts(width=w, height=h, page_title="img_view"))
    # Add the data
    pie.add("words length", c_4b, center=center)
    pie.set_global_opts(title_opts=opts.TitleOpts(title="CET4-B", subtitle="words length statistical"),
                        legend_opts=opts.LegendOpts(pos_left=pos_left))
    pie.render("./templates/bi_show_cet4pb.html")
    print("Done, see /templates/bi_show_cet4pb.html ......")
    # Generate the CET-6 pages
    # Passage A
    pie = Pie(init_opts=opts.InitOpts(width=w, height=h, page_title="img_view"))
    # Add the data
    pie.add("words length", c_6a, center=center)
    pie.set_global_opts(title_opts=opts.TitleOpts(title="CET6-A", subtitle="words length statistical"),
                        legend_opts=opts.LegendOpts(pos_left=pos_left))
    pie.render("./templates/bi_show_cet6pa.html")
    print("Done, see /templates/bi_show_cet6pa.html ......")
    # Passage B
    pie = Pie(init_opts=opts.InitOpts(width=w, height=h, page_title="img_view"))
    # Add the data
    pie.add("words length", c_6b, center=center)
    pie.set_global_opts(title_opts=opts.TitleOpts(title="CET6-B", subtitle="words length statistical"),
                        legend_opts=opts.LegendOpts(pos_left=pos_left))
    pie.render("./templates/bi_show_cet6pb.html")
    print("Done, see /templates/bi_show_cet6pb.html ......")
def show_words(top=20):
    """
    Chart the top keywords of the type-A and type-B passages.
    """
    my_dbc = DbConnect()
    result = my_dbc.select_text_len()
    list_4 = result['4_a'] + result['4_b']
    list_6 = result['6_a'] + result['6_b']
    # Merge the passages and split them into words
    text_4 = words_clean(' '.join(list_4).split(" "))
    text_6 = words_clean(' '.join(list_6).split(" "))
    # Remove stop words
    stop_words = read_stop_txt('./data/stopwords.txt')
    stop_4 = [word for word in text_4 if word not in stop_words and word != '']
    stop_6 = [word for word in text_6 if word not in stop_words and word != '']
    # Take the most common words; reverse so the largest bar ends up on top
    c_4 = Counter(stop_4).most_common(top)
    c_4.reverse()
    show_x = list(dict(c_4).keys())
    show_y = list(dict(c_4).values())
    # Generate the CET-4 page
    print("Generating pages......")
    bar = Bar(init_opts=opts.InitOpts(width="800px", height="700px", page_title="img_view"))
    # High-frequency words for CET-4
    bar.add_xaxis(show_x)
    bar.add_yaxis("Frequency of high frequency words", show_y, itemstyle_opts=opts.ItemStyleOpts(color='#8f4b2e'))
    bar.reversal_axis()
    bar.set_global_opts(title_opts=opts.TitleOpts(title="CET4", subtitle="Statistical"))
    bar.set_series_opts(label_opts=opts.LabelOpts(position="right"))
    bar.render("./templates/bi_show_cet4f.html")
    print("Done, see /templates/bi_show_cet4f.html ......")
    # Generate the CET-6 page
    c_6 = Counter(stop_6).most_common(top)
    c_6.reverse()
    show_x = list(dict(c_6).keys())
    show_y = list(dict(c_6).values())
    bar = Bar(init_opts=opts.InitOpts(width="800px", height="700px", page_title="img_view"))
    # High-frequency words for CET-6
    bar.add_xaxis(show_x)
    bar.add_yaxis("Frequency of high frequency words", show_y, itemstyle_opts=opts.ItemStyleOpts(color='#2a5caa'))
    bar.reversal_axis()
    bar.set_global_opts(title_opts=opts.TitleOpts(title="CET6", subtitle="Statistical"))
    bar.set_series_opts(label_opts=opts.LabelOpts(position="right"))
    bar.render("./templates/bi_show_cet6f.html")
    print("Done, see /templates/bi_show_cet6f.html ......")


if __name__ == '__main__':
    show_words_length()
    show_words()
    show_length()
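One detail worth knowing about the length calculation above: `text.split(" ")` splits on single spaces only, so double spaces produce empty strings and newlines do not separate words. The small sketch below contrasts it with `str.split()` called without arguments, which treats any run of whitespace as one separator. The helper name `word_count` and the sample sentence are made up for this comparison.

```python
def word_count(text):
    """Count words, treating any run of whitespace as a single separator."""
    return len(text.split())


# Two double spaces and one newline, to show where the two approaches differ.
sample = "Parents  worry  about\nscreen time."
naive = len(sample.split(" "))  # counts empty strings, misses the newline
robust = word_count(sample)
```

For this sample `naive` is 6 and `robust` is 5. Whether this matters for the charts depends on how clean the passage text in the database is.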
4. User management module
- User registration
from django.shortcuts import render


def do_signup(request):
    email = request.POST['email']
    name = request.POST['name']
    password1 = request.POST['password1']
    password2 = request.POST['password2']
    if email == '' or name == '' or password1 == '' or password2 == '':
        # At least one field was left empty
        context = {
            "text": "Input error, please complete all forms above!"
        }
    elif password1 != password2:
        # The two password entries do not match
        context = {
            "text": "Operation error. The passwords you entered twice are inconsistent!"
        }
    else:
        import app.db_model as db_model
        db_c = db_model.DbConnect()
        # Relies on app.db_model's insert_user returning a status message
        msg = db_c.insert_user(email, name, password1)
        context = {
            "text": msg
        }
    return render(request, 'user_signup_result.html', context)
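The two validation branches in the sign-up view can also be pulled out into a plain function, which makes them testable without a running Django project. This is only a sketch: the function name `validate_signup` is an assumption, and the messages are copied from the view.

```python
def validate_signup(email, name, password1, password2):
    """Return an error message, or None when the form input is acceptable."""
    # Any empty field counts as an incomplete form.
    if not all([email, name, password1, password2]):
        return "Input error, please complete all forms above!"
    # Both password entries must match.
    if password1 != password2:
        return "Operation error. The passwords you entered twice are inconsistent!"
    return None
```

In the view, the fields could then be read with `request.POST.get('email', '')` so that a missing field is reported as an input error instead of raising a `KeyError`.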
- User login
import json

from django.shortcuts import render


def user_login(request):
    """
    Show the login page.
    """
    rep = render(request, 'user_login.html', {})
    return rep


def get_ans(request):
    """
    Handle the login form and redirect accordingly.
    """
    import app.db_model as db_model
    try:
        user_name = request.POST["user_name"]
        password = request.POST["password"]
    except KeyError:
        # No login data was posted
        try:
            # If the user is already logged in, go straight to the home page
            user_name = json.loads(request.COOKIES.get('user_name'))
            password = json.loads(request.COOKIES.get('password'))
        except TypeError:
            # The cookies are missing, so json.loads received None
            context = {
                "state": "1",
                "text": "Login status has expired, please login again!"
            }
            return render(request, 'user_error.html', context)
    # Compare with ==; the original used "is", which tests identity, not equality
    if password == "" or user_name == "":
        context = {
            "state": "1",
            "text": "The entered content is empty, please re-enter the login information!"
        }
        return render(request, 'user_error.html', context)
    else:
        db_c = db_model.DbConnect()
        login_msg = db_c.user_login(user_name, password)
        if login_msg['state'] == 0:
            context = {
                "state": login_msg['state'],
                "text": "The user name does not exist, please enter the correct user name!"
            }
            return render(request, 'user_error.html', context)
        elif login_msg['state'] == -1:
            context = {
                "state": login_msg['state'],
                "text": login_msg['msg']
            }
            return render(request, 'user_error.html', context)
        elif login_msg['state'] == 2:
            context = {
                "state": login_msg['state'],
                "text": "Wrong password, please enter the correct password!"
            }
            return render(request, 'user_error.html', context)
        else:
            # Login succeeded: set the cookies
            rep = render(request, 'user_home.html', login_msg)
            # Serialize the values
            user_name = json.dumps(user_name)
            password = json.dumps(password)
            # max_age is the expiry time in seconds
            # Note: keeping the password in a cookie is insecure
            rep.set_cookie('user_name', str(user_name), max_age=1200)
            rep.set_cookie('password', str(password), max_age=1200)
            return rep
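5. Building the Django display pages
The display pages involve a lot of code, so I will not show all of it here; interested readers can find the download link for the complete code at the bottom of this article.
- Only the more important front-end code is included here.
- The page that displays the data analysis results: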
{% extends 'base_user.html' %}
{% block content %}
{% load static %}
<div class="container">
<!--<img id="home_img" class="img-thumbnail" src="{% static 'img/home_img.jpg' %}">-->
<div id="carouselExampleIndicators" class="carousel slide" data-ride="carousel">
<ol class="carousel-indicators">
<li data-target="#carouselExampleIndicators" data-slide-to="0" class="active"></li>
<li data-target="#carouselExampleIndicators" data-slide-to="1"></li>
<li data-target="#carouselExampleIndicators" data-slide-to="2"></li>
</ol>
<div class="carousel-inner" id="home_carousel">
<div class="carousel-item active">
<img src="{% static 'img/bi_img1.jpg' %}" class="d-block w-100" alt="home_img1">
</div>
<div class="carousel-item">
<img src="{% static 'img/bi_img2.jpg' %}" class="d-block w-100" alt="home_img2">
</div>
<div class="carousel-item">
<img src="{% static 'img/bi_img3.jpg' %}" class="d-block w-100" alt="home_img3">
</div>
</div>
<button class="carousel-control-prev" type="button" data-target="#carouselExampleIndicators" data-slide="prev">
<span class="carousel-control-prev-icon" aria-hidden="true"></span>
<span class="sr-only">Previous</span>
</button>
<button class="carousel-control-next" type="button" data-target="#carouselExampleIndicators" data-slide="next">
<span class="carousel-control-next-icon" aria-hidden="true"></span>
<span class="sr-only">Next</span>
</button>
</div>
</div>
<div class="container">
<div class="jumbotron">
<div class="row">
<div class="col-md-12">
<div class="card text-left">
<h4 class="card-header">Articles Length Analysis</h4>
<div class="card-body">
<div class="tab-content">
<p class="card-title">Length analysis of reading comprehension articles over the years</p>
<hr class="my-4">
<div id="img_l_cet4">
<iframe src="l_show_cet4" width="100%" height="450" scrolling="no" frameborder="0"
onload="changeFrameHeight()"></iframe>
</div>
<hr class="my-4">
<div id="img_l_cet6">
<iframe src="l_show_cet6" width="100%" height="450" scrolling="no" frameborder="0"
onload="changeFrameHeight()"></iframe>
</div>
</div>
</div>
</div>
</div>
</div>
<br>
<div class="row">
<div class="col-md-12">
<div class="card text-left">
<h4 class="card-header">CET4: Word Length Analysis</h4>
<div class="card-body">
<div class="tab-content">
<p class="card-title">Analysis of word length in reading comprehension articles</p>
<hr class="my-4">
<div id="img_p_cet4a">
<iframe src="p_show_cet4a" width="100%" height="650" scrolling="no"
frameborder="0"
onload="changeFrameHeight()"></iframe>
</div>
<hr class="my-4">
<div id="img_p_cet4b">
<iframe src="p_show_cet4b" width="100%" height="650" scrolling="no"
frameborder="0"
onload="changeFrameHeight()"></iframe>
</div>
</div>
</div>
</div>
</div>
</div>
<br>
<div class="row">
<div class="col-md-12">
<div class="card text-left">
<h4 class="card-header">CET6: Word Length Analysis</h4>
<div class="card-body">
<div class="tab-content">
<p class="card-title">Analysis of word length in reading comprehension articles</p>
<hr class="my-4">
<div id="img_p_cet6a">
<iframe src="p_show_cet6a" width="100%" height="650" scrolling="no"
frameborder="0"
onload="changeFrameHeight()"></iframe>
</div>
<hr class="my-4">
<div id="img_p_cet6b">
<iframe src="p_show_cet6b" width="100%" height="650" scrolling="no"
frameborder="0"
onload="changeFrameHeight()"></iframe>
</div>
</div>
</div>
</div>
</div>
</div>
<br>
<div class="row">
<div class="col-md-12">
<div class="card text-left">
<h4 class="card-header">High frequency words</h4>
<div class="card-body">
<div class="tab-content">
<p class="card-title">High frequency words and their frequency in reading comprehension A
and reading comprehension B</p>
<hr class="my-4">
<div id="img_f_cet4">
<iframe src="f_show_cet4" width="100%" height="750" scrolling="no" frameborder="0"
onload="changeFrameHeight()"></iframe>
</div>
<hr class="my-4">
<div id="img_f_cet6">
<iframe src="f_show_cet6" width="100%" height="750" scrolling="no" frameborder="0"
onload="changeFrameHeight()"></iframe>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
{% endblock %}
- The word frequency chart page (generated by pyecharts):
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>img_view</title>
<script type="text/javascript" src="https://assets.pyecharts.org/assets/echarts.min.js"></script>
</head>
<body>
<div id="52278b250e324ef4b6962156dc3c3ce8" class="chart-container" style="width:800px; height:700px;"></div>
<script>
var chart_52278b250e324ef4b6962156dc3c3ce8 = echarts.init(
document.getElementById('52278b250e324ef4b6962156dc3c3ce8'), 'white', {renderer: 'canvas'});
var option_52278b250e324ef4b6962156dc3c3ce8 = {
"animation": true,
"animationThreshold": 2000,
"animationDuration": 1000,
"animationEasing": "cubicOut",
"animationDelay": 0,
"animationDurationUpdate": 300,
"animationEasingUpdate": "cubicOut",
"animationDelayUpdate": 0,
"color": [
"#c23531",
"#2f4554",
"#61a0a8",
"#d48265",
"#749f83",
"#ca8622",
"#bda29a",
"#6e7074",
"#546570",
"#c4ccd3",
"#f05b72",
"#ef5b9c",
"#f47920",
"#905a3d",
"#fab27b",
"#2a5caa",
"#444693",
"#726930",
"#b2d235",
"#6d8346",
"#ac6767",
"#1d953f",
"#6950a1",
"#918597"
],
"series": [
{
"type": "bar",
"name": "Frequency of high frequency words",
"legendHoverLink": true,
"data": [
9,
9,
9,
10,
10,
10,
11,
11,
12,
12,
12,
14,
18,
18,
19,
19,
20,
21,
26,
28
],
"showBackground": false,
"barMinHeight": 0,
"barCategoryGap": "20%",
"barGap": "30%",
"large": false,
"largeThreshold": 400,
"seriesLayoutBy": "column",
"datasetIndex": 0,
"clip": true,
"zlevel": 0,
"z": 2,
"label": {
"show": true,
"position": "right",
"margin": 8
},
"itemStyle": {
"color": "#2a5caa"
},
"rippleEffect": {
"show": true,
"brushType": "stroke",
"scale": 2.5,
"period": 4
}
}
],
"legend": [
{
"data": [
"Frequency of high frequency words"
],
"selected": {
"Frequency of high frequency words": true
},
"show": true,
"padding": 5,
"itemGap": 10,
"itemWidth": 25,
"itemHeight": 14
}
],
"tooltip": {
"show": true,
"trigger": "item",
"triggerOn": "mousemove|click",
"axisPointer": {
"type": "line"
},
"showContent": true,
"alwaysShowContent": false,
"showDelay": 0,
"hideDelay": 100,
"textStyle": {
"fontSize": 14
},
"borderWidth": 0,
"padding": 5
},
"xAxis": [
{
"show": true,
"scale": false,
"nameLocation": "end",
"nameGap": 15,
"gridIndex": 0,
"inverse": false,
"offset": 0,
"splitNumber": 5,
"minInterval": 0,
"splitLine": {
"show": false,
"lineStyle": {
"show": true,
"width": 1,
"opacity": 1,
"curveness": 0,
"type": "solid"
}
}
}
],
"yAxis": [
{
"show": true,
"scale": false,
"nameLocation": "end",
"nameGap": 15,
"gridIndex": 0,
"inverse": false,
"offset": 0,
"splitNumber": 5,
"minInterval": 0,
"splitLine": {
"show": false,
"lineStyle": {
"show": true,
"width": 1,
"opacity": 1,
"curveness": 0,
"type": "solid"
}
},
"data": [
"So",
"learning",
"research",
"photos",
"screen",
"make",
"children",
"percent",
"social",
"devices",
"year",
"found",
"online",
"data",
"people",
"years",
"parents",
"technology",
"wellbeing",
"time"
]
}
],
"title": [
{
"text": "CET6",
"subtext": "Statistical",
"padding": 5,
"itemGap": 10
}
]
};
chart_52278b250e324ef4b6962156dc3c3ce8.setOption(option_52278b250e324ef4b6962156dc3c3ce8);
</script>
</body>
</html>
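VI. Code Download
Because the project code and dataset are fairly large, interested readers can download the code directly. If you run into any problems while using it, leave a comment and I will answer each one.
Code download:
- Code share: an English data analysis and visualization system based on Python + Django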