对于聚合的功能MySQL是都是默默的发展。在最新的8.0.31版本中提供对集合操作INTERSECT和EXCEPT。这样一来,集合操作功能基本圆满了。MySQL5.7.40版本是不支持这个集合的。
In this release MySQL adds support for the SQL standard INTERSECT and EXCEPT table operators.
query_a INTERSECT query_b includes only rows appearing in both result sets.
query_a EXCEPT query_b returns any rows from the result set of query_a which are not in the result of query_b.
INTERSECT and EXCEPT both support DISTINCT and ALL, with DISTINCT the default in both cases. (This is the same as for UNION).
INTERSECT groups before EXCEPT or UNION, so TABLE r EXCEPT TABLE s INTERSECT TABLE t is evaluated as TABLE r EXCEPT (TABLE s INTERSECT TABLE t).
See INTERSECT Clause, and EXCEPT Clause, for additional information and examples. (Bug #1309, Bug #31336, Bug #11747209, Bug #11744757)
目前MySQL里提供的集合操作包含:【UNION ,UNION ALL,INTERSECT,EXCEPT】
集合类型 | 数据 | 结果 |
---|---|---|
UNION ALL | A={1,2,3},B={3,4,5,6} | ALL={1,2,3,3,4,5,6} |
UNION | A={1,2,3},B={3,4,5,6} | AUB={1,2,3,4,5,6} |
INTERSECT | A={1,2,3},B={3,4,5,6} | A∩B={3} |
EXCEPT | A={1,2,3},B={3,4,5,6} | A-B={1,2} |
集合
测试集合
CREATE TABLE `DataSetA` (
`id` bigint NOT NULL ,
`name` varchar(10) DEFAULT NULL
);
INSERT INTO DataSetA(id,name) VALUES(1,'A'),(2,'A'),(3,'C');
CREATE TABLE `DataSetB` (
`id` bigint NOT NULL ,
`name` varchar(10) DEFAULT NULL
);
INSERT INTO DataSetB(id,name) VALUES(3,'C'),(4,'D'),(5,'E'),(6,'F');
CREATE TABLE `DataSetC` (
`id` bigint NOT NULL,
`name` varchar(10) DEFAULT NULL
);
INSERT INTO DataSetC(id,name) VALUES(6,'F'),(7,'G'),(7,'G'),(8,'H');
不同的结果操作集:
执行计划:
除了UNION ALL,都需要通过Using temporary 进行运算,所以性能还是存在一定影响。毕竟涉及临时表。
集合中的DISTINCT和ALL
INTERSECT和EXCEPT都支持DISTINCT和ALL与UNION相同),在这两种情况下默认为DISTINCT。
字面意义相同是否去重或全部显示。
uery_block INTERSECT [ALL | DISTINCT] query_block
mysql> table DataSetC INTERSECT DISTINCT table DataSetC;
+----+------+
| id | name |
+----+------+
| 6 | F |
| 7 | G |
| 8 | H |
+----+------+
3 rows in set (0.00 sec)
mysql> table DataSetC INTERSECT ALL table DataSetC;
+----+------+
| id | name |
+----+------+
| 6 | F |
| 7 | G |
| 7 | G |
| 8 | H |
+----+------+
4 rows in set (0.00 sec)
复制
备注:对于INTERSECT ALL,左手表中任何唯一行的最大支持的重复数是4294967295。
注意事项
从底层实现和测试现象来看,集合对比方式是用object方式进行对比。只要列数量对就可以。类型无关紧要。但实际业务中肯定是满足字段,类型一致。
1.集合对比必须具有相同的列数,列数不同出现问题:
mysql> SELECT ID FROM DataSetA
INTERSECT
SELECT ID ,NAME FROM DataSetB;
ERROR 1222 (21000): The used SELECT statements have a different number of columns
2. 不同类型对比,可以进行对比。但对于最终结果来说就没意义。
mysql> SELECT ID FROM DataSetA
UNION
SELECT NAME FROM DataSetA;
+------+
| ID |
+------+
| 1 |
| 2 |
| 3 |
| A |
| C |
+------+
5 rows in set (0.00 sec)
3.版本对比写法不同
MySQL 5.7相比,在MySQL 8.0中,对SELECT和集合的解析器规则进行了重构,使其更加一致并减少了重复.
- LIMIT 写法
#mysql5.7.40不支持
mysql> (SELECT 1 AS result UNION SELECT 2) LIMIT 1;
ERROR 1064 (42000): You have an error in your SQL syntax;
check the manual that corresponds to your MySQL server version for the right
syntax to use near 'UNION SELECT 2) LIMIT 1' at line 1
#mysql8.0.31支持
mysql> (SELECT 1 AS result UNION SELECT 2) LIMIT 1;
+--------+
| result |
+--------+
| 1 |
+--------+
1 row in set (0.00 sec)
- FOR UPDATE 写法
#mysql5.7.40
mysql> SELECT 1 FOR UPDATE UNION SELECT 1 FOR UPDATE;
+---+
| 1 |
+---+
| 1 |
+---+
1 row in set (0.00 sec)
mysql> (SELECT 1 FOR UPDATE) UNION (SELECT 1 FOR UPDATE);
+---+
| 1 |
+---+
| 1 |
+---+
1 row in set (0.00 sec)
#mysql8.0.31 更严谨
mysql> SELECT 1 FOR UPDATE UNION SELECT 1 FOR UPDATE;
ERROR 1064 (42000): You have an error in your SQL syntax;
check the manual that corresponds to your MySQL server version for the right
syntax to use near 'UNION SELECT 1 FOR UPDATE' at line 1
mysql> (SELECT 1 FOR UPDATE) UNION (SELECT 1 FOR UPDATE);
+---+
| 1 |
+---+
| 1 |
+---+
1 row in set (0.00 sec)
总结
集合对比操作也算聚合的一种,在MySQL里性能影响还是比较可观的。基本要通过临时表进行处理。所以环境中需要合理使用。
参考:
https://dev.mysql.com/doc/relnotes/mysql/8.0/en/news-8-0-31.html