这里写自定义目录标题
- 作业回顾
- 作业步骤
- 完成方法
- 恢复数据库
- 设置查询日志的开关
- 删除退市股票以及新股的无效交易日的数据
- 删除个股数据表
- 查询merged_table这张表里3开头的股票中每个月成交量最大的股票
- 下载日志文件,备份数据库
作业回顾
作业内容发布在社区里:https://bbs.csdn.net/topics/613244256
作业步骤
1、恢复数据库
2、打开一般查询和慢查询日志,并设置为file输出,指定输出位置
3、删除退市和新股的无效交易日数据
4、删除个股数据表
5、在merged_table里查询数据
6、获取日志文件
7、备份数据库
作业环境: 云容器 8核8G
完成方法
恢复数据库
#新建数据库
CREATE DATABASE stocks;
USE stocks;
# 利用备份文件恢复数据
SOURCE /media/stocks_backup.sql;
# 验证是否所有的表格都导入
SHOW tables;
设置查询日志的开关
# 设置一般查询日志
show variables like 'general_log';
set global general_log=on;
show variables like 'log_output';
set global log_output='file';
set global general_log_file='/tmp/general_log';
# 设置慢查询日志
show variables like 'slow_query_log';
--slow_query_log = on;
show variables like 'slow_query_log_file';
set global slow_query_log_file='/tmp/slow_log';
删除退市股票以及新股的无效交易日的数据
退市的股票(exlisting = True)的无效交易日数据指交易日期在退市日期(end_date)之后的数据。
新股(newlisting = True)的无效交易日数据指交易日期在新股上市之前(start_date)的数据。
# 删除退市股票无效交易日的数据
DELETE merged_table FROM merged_table inner join stockList
WHERE stockList.exlisting = 'True' and merged_table.date > stockList.end_date and merged_table.code = stockList.code;
# 删除新股无效交易日的数据
DELETE merged_table FROM merged_table inner join stockList
WHERE stockList.newlisting = 'True' and merged_table.date < stockList.start_date and merged_table.code = stockList.code;
我们利用stockList表的信息将merged_table中无效交易日的数据剔除掉。效果如下图:
删除个股数据表
构建存储程序删除个股数据表
delimiter $
CREATE PROCEDURE drop_tables()
BEGIN
SET @list := (SELECT GROUP_CONCAT(code) FROM stockList);
SET @counter := 1;
WHILE @counter <= (SELECT COUNT(code) FROM stockList) DO
SET @str := (SELECT SUBSTRING_INDEX( SUBSTRING_INDEX(@list,',',@counter),',',-1));
SET @tablename := CONCAT('price_',@str);
SET @sql_text = CONCAT('DROP TABLE ' '', @tablename, ';');
PREPARE stmt FROM @sql_text;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
SET @counter = @counter + 1;
END WHILE;
END $
delimiter ;
执行结果如下图:
查询merged_table这张表里3开头的股票中每个月成交量最大的股票
因为交易日的数据是以天为单位,每个月的成交量需要对成交量按月进行聚合。首先,构造一个查询,查询以3开头的股票每个月的成交量之和。
# 查询出3开头的股票每月成交量之和
select code,sum(volume),month(date) from merged_table where code like "3%" group by code,month(date) order by month(date),sum(volume) desc;
在上面的查询基础上,构造嵌套查询。
select max(t1.sum), code,t1.month FROM
(select code,sum(volume) as sum,month(date) as month from merged_table where code like "3%" group by code,month(date) order by month(date),sum(volume) desc) as t1
group by t1.month;
可以看出嵌套查询基本上是在之前的查询结果基础上构建的,用括号将之前的查询结果括起来,用别名引用,在选择要进行查询的字段和聚合的方法。
结果输出后,我们发现只有股票的代码,没有股票的名称,于是我们进一步构造一个嵌套查询,将股票的名称输出。
#构建嵌套查询语句 ,输出股票的名称
select t2.code,t2.month, stockList.display_name,t2.vol from stockList
right join
(select max(t1.sum) as vol, code, t1.month as month FROM
(select code,sum(volume) as sum,month(date) as month from merged_table where code like "3%" group by code,month(date) order by month(date),sum(volume) desc) as t1
group by month) as t2
on t2.code = stockList.code
order by t2.month;
下载日志文件,备份数据库
按照查询日志开关设置时指定的目录去查找到日志文件,下载到本地。备份数据库(略)。
查询时间超过10s的指令会被记录到慢查询日志,日志如下:
/usr/sbin/mysqld, Version: 10.3.36-MariaDB (MariaDB Server). started with:
Tcp port: 3306 Unix socket: /var/lib/mysql/mysql.sock
Time Id Command Argument
# Time: 230307 22:26:00
# User@Host: root[root] @ localhost []
# Thread_id: 8 Schema: stocks QC_hit: No
# Query_time: 102.367915 Lock_time: 0.000156 Rows_sent: 0 Rows_examined: 204863635
# Rows_affected: 35791 Bytes_sent: 13
use stocks;
SET timestamp=1678199160;
DELETE merged_table FROM merged_table inner join stockList
WHERE stockList.exlisting = 'True' and merged_table.date > stockList.end_date and merged_table.code = stockList.code;
# Time: 230307 22:32:24
# User@Host: root[root] @ localhost []
# Thread_id: 8 Schema: stocks QC_hit: No
# Query_time: 200.943780 Lock_time: 0.000162 Rows_sent: 0 Rows_examined: 410867984
# Rows_affected: 42398 Bytes_sent: 13
SET timestamp=1678199544;
DELETE merged_table FROM merged_table inner join stockList
WHERE stockList.newlisting = 'True' and merged_table.date < stockList.start_date and merged_table.code = stockList.code;
# Time: 230307 22:39:47
# User@Host: root[root] @ localhost []
# Thread_id: 8 Schema: stocks QC_hit: No
# Query_time: 39.337147 Lock_time: 0.000149 Rows_sent: 0 Rows_examined: 25704899
# Rows_affected: 0 Bytes_sent: 11
SET timestamp=1678199987;
call drop_tables();
# Time: 230308 0:05:37
# User@Host: root[root] @ localhost []
# Thread_id: 10 Schema: stocks QC_hit: No
# Query_time: 472.102015 Lock_time: 0.000226 Rows_sent: 1240 Rows_examined: 1343811841
# Rows_affected: 0 Bytes_sent: 28499
SET timestamp=1678205137;
SELECT t1.code, max(t1.v), month(date) from merged_table t2
left join
( select code,sum(volume) as v,month(date) as d from merged_table where code like "3%" group by code,month(date)) t1
on t1.d = month(date)
group by t1.code;
从慢查询日志可以清楚的看到,查询的发起时间,查询时长,锁的时长,扫描的行,受影响的行。可以推断出,慢查询的过程中系统进行了多次全表扫描。
本次作业的解析到此结束。因为一直有事情耽搁,发布比较晚。请读者见谅。