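MySQL Database Basics (Part 3): Multi-Table Queries, Subqueries, and Window Functions

XI. Multi-Table Queries (key topic, difficult)

Relationships Between Tables

In SQL, when two tables are related, the relationship generally takes one of three forms:

① One-to-one (advanced)

Given two tables A and B, each row in A corresponds to exactly one row in B.

User table: user

user_id (user ID)	username (account)	password
001	admin	admin888
002	itheima	123456

User details table: user_items

user_id (user ID)	Real name	Age	Contact number
001	张三	16	10086
002	李四	18	10010

We call the relationship between the user table and the user details table a one-to-one relationship.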

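② One-to-many (key topic)

Given two tables A and B, each row in A can correspond to multiple rows in B. We call this a one-to-many relationship.

Product category table

Category ID	Category name
1	Phones
2	Computers

Product table

Product ID	Product name	Price	Category ID
1	Apple iPhone 13	6799.00	1
2	Redmi Note 9	3499.00	1

We call the relationship between the product category table and the product table a one-to-many relationship.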
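
As a rough sketch, the two tables above might be declared as follows. The names tb_category, tb_goods, and cid match names these notes use in the subquery examples; gid, gname, cname, and price are assumed here purely for illustration.

```sql
-- "One" side: each category appears exactly once.
create table tb_category (
    cid int not null auto_increment,
    cname varchar(20),
    primary key (cid)
) default charset=utf8;

-- "Many" side: every product stores the cid of the category it belongs to.
-- gid, gname and price are assumed column names.
create table tb_goods (
    gid int not null auto_increment,
    gname varchar(50),
    price decimal(10,2),
    cid int,
    primary key (gid)
) default charset=utf8;

insert into tb_category values (null, 'Phones'), (null, 'Computers');
insert into tb_goods values
    (null, 'Apple iPhone 13', 6799.00, 1),
    (null, 'Redmi Note 9', 3499.00, 1);

-- All products that belong to category 1 (Phones): both rows are returned.
select * from tb_goods where cid = 1;
```

The design choice is that the "many" table carries the linking column, so one category row can be referenced by any number of product rows.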

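③ Many-to-many (advanced)

User table

User ID	Login account	Login password
1	admin	admin888
2	itheima	123456

Permission table

Permission ID	Permission name
1	Insert
2	Delete
3	Update
4	Query

Looking at the two tables above, they seem to have nothing to do with each other, but they are in fact related. The relationship has to be expressed through an intermediate table.

Each user has a set of permissions: the admin account can insert, delete, update, and query, while the itheima account can only query.

Conversely, each permission can be held by multiple users: query => admin/itheima

Intermediate table: user_permission table

User ID	Permission ID
1 (admin)	1 (insert)
1	2 (delete)
1	3 (update)
1	4 (query)
2	4 (query)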
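
The intermediate table can be sketched in SQL roughly as follows. All table and column names here (tb_user, tb_permission, tb_user_permission, and their columns) are hypothetical; the notes only show the data, not a schema.

```sql
create table tb_user (
    id int not null auto_increment,
    username varchar(20),
    password varchar(20),
    primary key (id)
) default charset=utf8;

create table tb_permission (
    id int not null auto_increment,
    name varchar(20),
    primary key (id)
) default charset=utf8;

-- Intermediate table: one row per (user, permission) pair.
create table tb_user_permission (
    user_id int not null,
    permission_id int not null,
    primary key (user_id, permission_id)
) default charset=utf8;

insert into tb_user values (null, 'admin', 'admin888'), (null, 'itheima', '123456');
insert into tb_permission values (null, 'insert'), (null, 'delete'), (null, 'update'), (null, 'query');
insert into tb_user_permission values (1, 1), (1, 2), (1, 3), (1, 4), (2, 4);

-- List each user with the permissions granted to them:
-- admin gets all four rows, itheima gets only 'query'.
select u.username, p.name as permission
from tb_user as u
inner join tb_user_permission as up on up.user_id = u.id
inner join tb_permission as p on p.id = up.permission_id
order by u.id, p.id;
```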

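Cross Join (for reference)

A cross join is rarely useful on its own, but it is the basis of all other joins. It pairs every row of table 1 with every row of table 2.

Result:

Number of columns = columns in table 1 + columns in table 2

Number of rows = rows in table 1 * rows in table 2 (the Cartesian product)

select * from students cross join classes;
or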
select * from students, classes;
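
To see the row-count rule in action without creating any tables, here is a small sketch that cross joins two inline row sets. The aliases t1 and t2 and their values are made up for illustration.

```sql
-- t1 has 3 rows, t2 has 2 rows, so the cross join yields 3 * 2 = 6 rows.
select count(*) as row_count
from (select 1 as a union all select 2 union all select 3) as t1
cross join (select 'x' as b union all select 'y') as t2;
```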

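1. Inner Join

☆ Introduction to join queries

A join query retrieves data from multiple tables. Use one whenever the columns you need come from different tables. Join queries fall into these types:

Inner join
Left outer join
Right outer join
Self-join (a table joined to itself)

☆ Inner join

Returns the rows from both tables that satisfy the join condition

Inner join syntax:

select columns from table1 inner join table2 on table1.column1 = table2.column2

Notes:

inner join is the inner join keyword
on specifies the join condition

Example 1: query the students table and the classes table with an inner join: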

select * from students as s inner join classes as c on s.cls_id = c.id;
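
These notes never show the definitions of students and classes, so the following is only a minimal sketch under assumed schemas: apart from cls_id and id, which the query above relies on, every column and every data value is made up.

```sql
create table classes (
    id int not null auto_increment,
    name varchar(20),
    primary key (id)
) default charset=utf8;

create table students (
    id int not null auto_increment,
    name varchar(20),
    age int,
    cls_id int,
    primary key (id)
) default charset=utf8;

insert into classes values (null, 'Class 1'), (null, 'Class 2'), (null, 'Class 3');
insert into students values
    (null, 'Tom', 18, 1), (null, 'Amy', 20, 1),
    (null, 'Bob', 19, 2), (null, 'Eve', 21, null);

-- Only rows that satisfy the join condition are returned:
-- Tom and Amy with Class 1, Bob with Class 2.
-- Eve (no class) and Class 3 (no students) do not appear.
select s.name as student, c.name as class
from students as s
inner join classes as c on s.cls_id = c.id;
```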

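☆ Summary

An inner join uses inner join .. on .., where on gives the join condition between the two tables
An inner join returns the "intersection" of the two tables according to the join condition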

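2. Left Outer Join

☆ Left join

Takes the left table as the base and looks up matching rows in the right table by the join condition. Where no matching right-table row exists, the right table's columns are filled with null.

Left join syntax:

select columns from table1 left join table2 on table1.column1 = table2.column2

Notes:

left join is the left join keyword
on specifies the join condition
table1 is the left table
table2 is the right table

Example 1: query the students table and the classes table with a left join: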

select * from students as s left join classes as c on s.cls_id = c.id;
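
One common use of the null fill is finding left-table rows that have no match at all. The sketch below uses inline row sets in place of real tables so it can run on its own; the names and values are invented for illustration.

```sql
-- s plays the role of students, c the role of classes.
-- Bob has no class, so the left join fills c.id and c.name with null for him,
-- and the "is null" filter keeps only that unmatched row.
select s.name as student, c.name as class
from (select 1 as id, 'Tom' as name, 1 as cls_id
      union all select 2, 'Amy', 2
      union all select 3, 'Bob', null) as s
left join (select 1 as id, 'Class 1' as name
           union all select 2, 'Class 2') as c
  on s.cls_id = c.id
where c.id is null;
```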

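☆ Summary

A left join uses left join .. on .., where on gives the join condition between the two tables
A left join takes the left table as the base and looks up rows in the right table by the join condition. Where no right-table row exists, null is filled in.

3. Right Outer Join

☆ Right join

Takes the right table as the base and looks up matching rows in the left table by the join condition. Where no matching left-table row exists, the left table's columns are filled with null.

Right join syntax:

select columns from table1 right join table2 on table1.column1 = table2.column2

Notes:

right join is the right join keyword
on specifies the join condition
table1 is the left table
table2 is the right table

Example 1: query the students table and the classes table with a right join:

select * from students as s right join classes as c on s.cls_id = c.id;

☆ Summary

A right join uses right join .. on .., where on gives the join condition between the two tables
A right join takes the right table as the base and looks up rows in the left table by the join condition. Where no left-table row exists, null is filled in.

4. Self-Join (extension)

Self-join: a table joined to itself. Prerequisite: the table must be given aliases in the join!

The left table and the right table are the same table, and rows are matched between the two copies by the join condition.

Two real-world scenarios: province/city/district data and category navigation data

cid	name	pid
1	Books	null
2	Children's books	1
3	Chinese children's literature	2

Regions: area

pid stands for parent id (the ID of the parent row). A null pid means the row is itself a top-level parent. A concrete value means the row is a child of that parent.

Example 1: find all cities in the province named "广东省"

Create the tb_area table:

use db_itheima;
create table tb_area(
aid int not null AUTO_INCREMENT,
atitle varchar(20),
pid int,
primary key(aid)
) default charset=utf8;

Insert data into the tb_area table:

insert into tb_area values (null, '广东省', null),(null, '山西省', null),(null, '深圳市', 1), (null, '广州市', 1);

Self-join usage:

select c.aid, c.atitle, c.pid, p.atitle from tb_area as c inner join tb_area as p on c.pid = p.aid where p.atitle = '广东省';

Notes:

A self-join must give the table aliases
A self-join treats one table as if it were separate left and right tables, then joins them.

☆ Summary

A self-join is a special kind of join in which the table being joined is the table itself

XII. Subqueries (three steps)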

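1. Introduction to Subqueries (Nested Queries)

When one select statement is embedded inside another, the embedded statement is called the subquery and the outer select statement is called the main query.

Relationship between the main query and the subquery:

The subquery is embedded in the main query

The subquery assists the main query, serving either as a condition or as a data source (a table)

The subquery can stand on its own: it is a complete select statement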
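
A subquery used as a condition goes in the where clause. One used as a data source goes in the from clause and must be given an alias. The following sketch shows the data-source form, with an inline row set standing in for a students table; all values are made up.

```sql
-- The inner select computes the average age per class and acts as a table named t.
-- The outer (main) query then filters that derived table.
-- Class 1 averages 19 and class 2 averages 25, so only cls_id 2 is returned.
select t.cls_id, t.avg_age
from (
    select cls_id, avg(age) as avg_age
    from (select 1 as cls_id, 18 as age
          union all select 1, 20
          union all select 2, 25) as students
    group by cls_id
) as t
where t.avg_age > 20;
```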


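2. Using Subqueries

Example 1. Find all students older than the average age:

Requirement: find all students whose age > the average age

Approach:

① Get the average age of the students

② Go through all rows in the table and check which students are older than the average

Step 1: write the subquery

select avg(age) from students;

Step 2: write the main query

select * from students where age > (average age);

Step 3: combine step 1 and step 2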

select * from students where age > (select avg(age) from students);

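Example 2. Find the products in the tb_goods table that have category information

Requirement: find the products that have category information (products with no matching category are not shown)

Approach:

① Find out which categories exist in the category table (get the cid values)

② In the product table, check whether each product's cid equals one of the values from ①

Step 1: write the subquery

select cid from tb_category;

Step 2: write the main query

select * from tb_goods where cid in (all category cid values);

Step 3: combine the main query and the subquery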

select * from tb_goods where cid in (select cid from tb_category);

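Example 3. Find the student who is both the youngest and has the lowest score:

Step 1: get the minimum age and the minimum score

select min(age), min(score) from students;

Step 2: query all student rows (main query)

select * from students where (age, score) = (minimum age, minimum score);

Step 3: combine step 1 and step 2

select * from students where (age, score) = (select min(age), min(score) from students);

Note: the table must contain a row that has both the minimum age and the minimum score. Otherwise the query returns nothing. Focus your practice on subqueries that return more than one value.

3. Summary

A subquery is a complete SQL statement enclosed in a pair of parentheses
Master the three-step approach to writing subqueries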

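XIII. Foreign Key Constraints (extension)

Primary key: primary key

Foreign key: foreign key (use case: set when two or more tables are related, to mark the relationship between them)

create table table_name(
column_name column_type column_constraint [5 kinds]
) default charset=utf8;

① Primary key constraint: primary key

② Default value constraint: default

③ Not-null constraint: not null

④ Unique constraint: unique key

⑤ Foreign key constraint: foreign key

Principle: the column is the primary key in one table and an ordinary (non-primary-key) column in the other, and it is the column that links the two tables.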

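1. What a Foreign Key Constraint Does

Foreign key constraint: whenever a foreign key column is inserted or updated, the new value is validated against the referenced column of the referenced table. If the value is not valid, the insert or update fails, which keeps the data valid.

dage table (dage = leader):

id (primary key)	name
1	陈浩南
2	乌鸦哥

xiaodi table (xiaodi = follower):

id (primary key)	name	dage_id (foreign key)
1	山鸡	1
2	大天二	1
3	乌鸦的小弟	2

Foreign key design principle: preserve the relationship between the two tables and keep the data consistent. As a rule, if a column is the linking column in one table and the primary key in another, it is a good candidate for a foreign key.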

2. Adding a Foreign Key Constraint to an Existing Column

-- Add a foreign key constraint to an existing column
alter table child_table add foreign key (fk_column) references parent_table (pk_column)
[on delete {cascade | set null}] [on update {cascade | set null}];

3. Setting a Foreign Key Constraint When Creating Tables

-- Create the dage table
create table dage(
id int not null auto_increment,
name varchar(20),
primary key(id)
) default charset=utf8;
-- Insert test data
insert into dage values (null, '陈浩南');
insert into dage values (null, '乌鸦');

-- Create the xiaodi table
create table xiaodi(
id int not null auto_increment,
name varchar(20),
dage_id int,
primary key(id)
) default charset=utf8;
-- Make dage_id a foreign key that references dage(id)
alter table xiaodi add foreign key(dage_id) references dage(id) on delete cascade;
-- Insert test data
insert into xiaodi values (null, '山鸡', 1);
insert into xiaodi values (null, '大天二', 1);
insert into xiaodi values (null, '乌鸦的小弟', 2);

-- Test the foreign key
delete from dage where id = 2;
select * from xiaodi; -- check whether 乌鸦的小弟 still exists (on delete cascade removes it)

-- Drop the foreign key
show create table xiaodi; -- look up the foreign key name (e.g. xiaodi_ibfk_1)
alter table xiaodi drop foreign key xiaodi_ibfk_1;
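
A foreign key can also be declared directly inside create table instead of being added afterwards with alter table. The sketch below assumes the InnoDB engine and uses hypothetical names (tb_leader, tb_member, fk_member_leader). It also uses on delete set null rather than cascade, to show the other option.

```sql
create table tb_leader (
    id int not null auto_increment,
    name varchar(20),
    primary key (id)
) default charset=utf8;

create table tb_member (
    id int not null auto_increment,
    name varchar(20),
    leader_id int,
    primary key (id),
    -- Naming the constraint makes it easy to drop later.
    constraint fk_member_leader foreign key (leader_id)
        references tb_leader (id) on delete set null
) default charset=utf8;

insert into tb_leader values (null, 'leader A');
insert into tb_member values (null, 'member 1', 1);  -- accepted: leader 1 exists

-- The next statement would be rejected (MySQL error 1452),
-- because no row in tb_leader has id 99:
-- insert into tb_member values (null, 'member 2', 99);

delete from tb_leader where id = 1;
select * from tb_member;  -- 'member 1' remains, but its leader_id is now null
```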

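4. Dropping a Foreign Key Constraint

-- First get the name of the foreign key constraint. The system generates it automatically, and it can be found in the table's create statement
show create table table_name;

-- With the name in hand, drop the foreign key constraint by name
alter table table_name drop foreign key fk_name;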

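XIV. Indexes [for reference]

① Write the SQL ② Optimize the SQL (shorten the time a query takes)

At TB scale: 10s => 0.01s

1. Index Overview

Purpose of an index: retrieve data quickly (improve query efficiency). Under the hood, the InnoDB engine mainly uses a B+ tree structure.

2. Using Ordinary Indexes

The primary key is itself an index. For example, with a million rows and no primary key index, a query might take 3-5s. With a primary key index, and when the column being queried happens to be the primary key, that can drop to a few hundredths of a second.

Note: primary keys and unique keys are also indexes, and InnoDB automatically creates an index on a foreign key column if none exists

Create an index: create index index_cname on category(cname); or, to index only the first 20 characters: create index index_cname on category(cname(20));

Add an index by altering the table: alter table category add index index_cname(cname(20));

Show indexes: show index from category;

Drop an index: drop index index_cname on category;

View the indexes of a whole database or of one table: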

# For reference: mysql is a built-in system database. Its innodb_index_stats table records index statistics for the InnoDB engine.
# View all indexes in a database:
select *
from mysql.innodb_index_stats where database_name="bigdata_db";

# View all indexes on a table:
select *
from mysql.innodb_index_stats where database_name="bigdata_db" and table_name="products";

3. Using Unique Indexes

create unique index index_cname on category(cname(20));
-- or, equivalently:
alter table category add unique index index_cname(cname(20));

-- Turn on execution-time profiling:
set profiling=1;
-- Look up the row whose title is 'ha-99999'
select * from tb_index where title='ha-99999';
-- Check the execution time:
show profiles;
-- Create an index on the title column:
alter table tb_index add index (title);
-- Run the same query again
select * from tb_index where title='ha-99999';
-- Check the execution time again
show profiles;
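
Besides timing a query, it is also possible to ask MySQL whether it plans to use an index at all, with explain. This is only a sketch: the table tb_index_demo and the index name idx_title are hypothetical, and the exact plan can vary with the MySQL version and the data.

```sql
create table tb_index_demo (
    id int not null auto_increment,
    title varchar(20),
    primary key (id)
) default charset=utf8;

insert into tb_index_demo (title) values ('ha-0'), ('ha-1'), ('ha-2'), ('ha-3');

-- Before the index exists, there is no index on title for the plan to use.
explain select * from tb_index_demo where title = 'ha-2';

alter table tb_index_demo add index idx_title (title);

-- After the index is added, idx_title should show up under possible_keys,
-- and typically under key as well.
explain select * from tb_index_demo where title = 'ha-2';
```

Comparing the key column of the two plans gives a quick check that does not depend on how long the query happens to take.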

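4. Notes on Using Indexes

More indexes are not always better. Keep the following in mind when using indexes:
Indexes consume disk space
Creating and maintaining indexes takes time
When data is inserted, deleted, or updated frequently, the indexes must be maintained on every change, which slows writes down.
Columns that are rarely queried do not need an index
Columns where most values are the same do not need an index
Extension:
Turn on MySQL execution-time profiling: set profiling=1;
Show statement execution times: show profiles;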

XV. Window Functions (new in MySQL 8.0)

1. Data Preparation

create table employee (
empid int,
ename varchar(20) ,
deptid int,
salary decimal(10,2)
) default charset=utf8;

insert into employee values(1,'刘备',10,5500.00);
insert into employee values(2,'赵云',10,4500.00);
insert into employee values(2,'张飞',10,3500.00);
insert into employee values(2,'关羽',10,4500.00);

insert into employee values(3,'曹操',20,1900.00);
insert into employee values(4,'许褚',20,4800.00);
insert into employee values(5,'张辽',20,6500.00);
insert into employee values(6,'徐晃',20,14500.00);

insert into employee values(7,'孙权',30,44500.00);
insert into employee values(8,'周瑜',30,6500.00);
insert into employee values(9,'陆逊',30,7500.00);

2. Using Window Functions

Syntax:

select *,
row_number() over (partition by deptid order by salary) as row_n
from employee;

# select *,
# row_number() over (order by salary) as row_n
# from employee;

# 1. Query the salary ranking within each department
select *,
row_number() over (partition by deptid order by salary) as row_n,
rank() over (partition by deptid order by salary) as rank_n,
dense_rank() over (partition by deptid order by salary) as drank_n
from employee;

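# Window functions:
# row_number: numbers the rows in sort order (1, 2, 3, ...), never producing ties
# rank: gives the rank. Tied rows share a rank, and the next rank skips ahead by the number of tied rows
# dense_rank: gives the rank. Tied rows share a rank, and the next rank does not skip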
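
To make the difference between the three functions concrete, here is a small sketch over an inline row set with one tie. The names A to D and the salaries are made up, and the query needs MySQL 8.0 or later.

```sql
select ename, salary,
       row_number() over (order by salary desc) as row_n,
       rank()       over (order by salary desc) as rank_n,
       dense_rank() over (order by salary desc) as drank_n
from (select 'A' as ename, 5500 as salary
      union all select 'B', 4500
      union all select 'C', 4500
      union all select 'D', 3500) as t;
-- salary 5500: row_n 1, rank_n 1, drank_n 1
-- salary 4500 (two tied rows): row_n 2 and 3 (the order between them is arbitrary), rank_n 2, drank_n 2
-- salary 3500: row_n 4, rank_n 4 (rank 3 is skipped), drank_n 3 (no gap)
```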

# 2. Query the employees ranked 2nd by salary in each department
select * from
(select
*,
dense_rank() over (partition by deptid order by salary desc) as row_n
from employee) c where c.row_n=2;