在业务开发过程中,显示的开启事务并且在事务处理过程中对不同的情况进行显示的COMMIT或ROLLBACK,这是一个完整数据库事务处理的闭环过程。
这种在应用开发逻辑层面去handle的事务执行的结果,既确保了事务操作的数据完整性,又遵循了业务处理逻辑。所以显示的提交或回滚事务也是开发规范中的要求,但是也有一些存量的业务系统或开发人员并不能严格按照这一规范执行,进而在实际生产过程中引发故障。这里介绍一个因为开启事务后未显示的回滚导致DDL阻塞进而引发的问题。
应用系统使用的是MySQL生态的数据库,业务使用的是分区表,业务在处理时候因为当日的分区没有创建导致插入报错,应用逻辑上每日又有对表新增分区的操作,结果是事务没有显示回滚导致新增表分区的DDL阻塞,进而又引发后续的问题。
1、MySQL数据库故障模拟
1.1 创建分区表并插入数据
登录mysql数据库并创建分区表
CREATE TABLE tt1 (
id int NOT NULL,
sdate date NOT NULL,
c1 varchar(4) NOT NULL,
PRIMARY KEY (id, sdate)
)
PARTITION BY RANGE columns(sdate) (
PARTITION p20240524 VALUES LESS THAN ('2024-05-25'),
PARTITION p20240525 VALUES LESS THAN ('2024-05-26')
);
1.2 显示的开启事务并插入数据
mysql> begin;
mysql> select * from tango.tt1;
+----+------------+-----+
| id | sdate | c1 |
+----+------------+-----+
| 1 | 2024-05-25 | aaa |
+----+------------+-----+
1 row in set (0.00 sec)
insert into tt1 values(1,'2024-05-25','aaa');
mysql> insert into tt1 values(3,'2024-05-27','ccc');
ERROR 1526 (HY000): Table has no partition for value from column_list
数据库执行报错提示插入的记录分区不存在。
1.3 查看数据库表中锁和事务的状态
mysql> select * from performance_schema.metadata_locks where object_name='tt1';
+-------------+---------------+-------------+-------------+-----------------------+--------------+---------------+-------------+-------------------+-----------------+----------------+
| OBJECT_TYPE | OBJECT_SCHEMA | OBJECT_NAME | COLUMN_NAME | OBJECT_INSTANCE_BEGIN | LOCK_TYPE | LOCK_DURATION | LOCK_STATUS | SOURCE | OWNER_THREAD_ID | OWNER_EVENT_ID |
+-------------+---------------+-------------+-------------+-----------------------+--------------+---------------+-------------+-------------------+-----------------+----------------+
| TABLE | tango | tt1 | NULL | 140712994313232 | SHARED_READ | TRANSACTION | GRANTED | sql_parse.cc:5768 | 85 | 24 |
| TABLE | tango | tt1 | NULL | 140712994947616 | SHARED_WRITE | TRANSACTION | GRANTED | sql_parse.cc:5768 | 85 | 25 |
+-------------+---------------+-------------+-------------+-----------------------+--------------+---------------+-------------+-------------------+-----------------+----------------+
2 rows in set (0.00 sec)
可以看到表持有SHARED_READ和SHARED_WRITE锁,并不因为事务执行失败而释放,这也是mysql系数据库内核机制,事务报错后数据库层面并没有执行rollback操作,而是由应用自己决定是rollback还是commit。
1.4 其它业务执行新增分区的DDL操作
mysql> ALTER table tt1 ADD PARTITION ( PARTITION p20240526 VALUES LESS THAN ('2024-05-27') );
此时这个DDL操作会hang住,查看表的元数据锁情况
mysql> select * from performance_schema.metadata_locks where object_name='tt1';
+-------------+---------------+-------------+-------------+-----------------------+-------------------+---------------+-------------+-------------------+-----------------+----------------+
| OBJECT_TYPE | OBJECT_SCHEMA | OBJECT_NAME | COLUMN_NAME | OBJECT_INSTANCE_BEGIN | LOCK_TYPE | LOCK_DURATION | LOCK_STATUS | SOURCE | OWNER_THREAD_ID | OWNER_EVENT_ID |
+-------------+---------------+-------------+-------------+-----------------------+-------------------+---------------+-------------+-------------------+-----------------+----------------+
| TABLE | tango | tt1 | NULL | 140712801139968 | SHARED_READ | TRANSACTION | GRANTED | sql_parse.cc:5768 | 123 | 21 |
| TABLE | tango | tt1 | NULL | 140712793308528 | SHARED_WRITE | TRANSACTION | GRANTED | sql_parse.cc:5768 | 123 | 22 |
| TABLE | tango | tt1 | NULL | 140712926580592 | SHARED_UPGRADABLE | TRANSACTION | GRANTED | sql_parse.cc:5768 | 121 | 20 |
| TABLE | tango | tt1 | NULL | 140712928177104 | EXCLUSIVE | TRANSACTION | PENDING | mdl.cc:3753 | 121 | 20 |
+-------------+---------------+-------------+-------------+-----------------------+-------------------+---------------+-------------+-------------------+-----------------+----------------+
5 rows in set (0.00 sec)
可以看到一个pending状态的锁状态,查看对应的SQL语句,知道是新增分区的DDL操作。
mysql> select THREAD_ID,EVENT_ID,EVENT_NAME,TIMER_START,TIMER_END,TIMER_WAIT,LOCK_TIME,SQL_TEXT,STATEMENT_ID from events_statements_current where thread_id=121;
+-----------+----------+---------------------------+------------------+------------------+----------------+-----------+---------------------------------------------------------------------------------------+--------------+
| THREAD_ID | EVENT_ID | EVENT_NAME | TIMER_START | TIMER_END | TIMER_WAIT | LOCK_TIME | SQL_TEXT | STATEMENT_ID |
+-----------+----------+---------------------------+------------------+------------------+----------------+-----------+---------------------------------------------------------------------------------------+--------------+
| 121 | 20 | statement/sql/alter_table | 2670208499587000 | 2687425357664000 | 17216858077000 | 246000000 | ALTER table tt1 ADD PARTITION ( PARTITION p20240526 VALUES LESS THAN ('2024-05-27') ) | 32613 |
+-----------+----------+---------------------------+------------------+------------------+----------------+-----------+---------------------------------------------------------------------------------------+--------------+
1 row in set (0.00 sec)
这里的DDL操作,在mysql数据库中通过参数lock_wait_timeout控制DDL等待超时时间,超过该时间DDL会报错。默认该参数配置为31536000s,实际生产业务系统会设置30~60s,一些核心业务系统会设置为5s。但是在DDL阻塞期间,也会影响新的业务的执行。
1.5 影响新的业务操作
mysql> select * from tango.tt1;
该操作也会hang住,查看对应的锁情况,也是处于pending状态。也就是阻塞的DDL操作会影响接下去的业务对该表的访问,直到DDL超时失败后,后续的业务才会正常。
mysql> select * from performance_schema.metadata_locks where object_name='tt1';
+-------------+---------------+-------------+-------------+-----------------------+-------------------+---------------+-------------+-------------------+-----------------+----------------+
| OBJECT_TYPE | OBJECT_SCHEMA | OBJECT_NAME | COLUMN_NAME | OBJECT_INSTANCE_BEGIN | LOCK_TYPE | LOCK_DURATION | LOCK_STATUS | SOURCE | OWNER_THREAD_ID | OWNER_EVENT_ID |
+-------------+---------------+-------------+-------------+-----------------------+-------------------+---------------+-------------+-------------------+-----------------+----------------+
| TABLE | tango | tt1 | NULL | 140712801139968 | SHARED_READ | TRANSACTION | GRANTED | sql_parse.cc:5768 | 123 | 21 |
| TABLE | tango | tt1 | NULL | 140712793308528 | SHARED_WRITE | TRANSACTION | GRANTED | sql_parse.cc:5768 | 123 | 22 |
| TABLE | tango | tt1 | NULL | 140712926580592 | SHARED_UPGRADABLE | TRANSACTION | GRANTED | sql_parse.cc:5768 | 121 | 20 |
| TABLE | tango | tt1 | NULL | 140712928177104 | EXCLUSIVE | TRANSACTION | PENDING | mdl.cc:3753 | 121 | 20 |
| TABLE | tango | tt1 | NULL | 140713468045808 | SHARED_READ | TRANSACTION | PENDING | sql_parse.cc:5768 | 120 | 6 |
+-------------+---------------+-------------+-------------+-----------------------+-------------------+---------------+-------------+-------------------+-----------------+----------------+
5 rows in set (0.00 sec)
1.6 在这个场景下存在的问题
- 事务处理报错时,业务层没有handle这个报错,并显示的去做commit或rollback;
- 表分区的预创建和监控:对于分区表是要有预先创建分区的机制,每天或每月定时窗口创建一批分区,同时分区不足时能够及时告警出来;
- 数据库层元数据锁等待超时:有些不重要的业务系统将lock_wait_timeout设置为600s设置更大,在该故障场景下是存在问题的,相当于DDL阻塞的这期间新的业务也会受到影响。所以将该参数设置到合理区间,比如5~60s是有必要的。
对于MySQL生态的数据库,事务内执行失败后数据库没有锁资源没有释放本身机制上没有问题,像国产数据库中TiDB、GoldenDB都有类似的现象。对于其它数据库,比如Oracle、PostgreSQL等,针对这个场景是什么样的表现,接下去以openGauss数据库为例进行验证。
2、openGauss数据库下故障场景模拟
2.1 登录openGauss单机版数据库,并创建分区表
gsql -d postgres -p 5432
[opgauss@tango-01 data]$ gsql -d postgres -p 5432
gsql ((openGauss-lite 5.0.2 build 48a25b11) compiled at 2024-05-14 10:41:04 commit 0 last mr release)
openGauss=# create database tango;
tango=# CREATE TABLE tt1 (
tango(# id int NOT NULL,
tango(# sdate date NOT NULL,
tango(# c1 varchar(4) NOT NULL
tango(# )
tango-# PARTITION BY RANGE(sdate) (
tango(# PARTITION p20240524 VALUES LESS THAN ('2024-05-25'),
tango(# PARTITION p20240525 VALUES LESS THAN ('2024-05-26')
tango(# );
CREATE TABLE
tango=# \dt
Schema | Name | Type | Owner | Storage
--------+------+-------+---------+----------------------------------
public | tt1 | table | opgauss | {orientation=row,compression=no}
2.2 开启事务并插入数据
tango=# begin;
BEGIN
tango=# select * from tt1;
id | sdate | c1
----+---------------------+-----
1 | 2024-05-25 00:00:00 | aaa
(1 row)
tango=# insert into tt1 values(3,'2024-05-28','ccc');
ERROR: inserted partition key does not map to any table partition
提示报错分区不存在
2.3 另外开启一个任务执行新增分区操作
tango=# ALTER table tt1 ADD PARTITION p20240526 VALUES LESS THAN ('2024-05-27');
ALTER TABLE
可以看到分区是新增成功的。
2.4 查看这种场景下表的锁和事务状态信息
tango=# SELECT l.locktype,l.database,l.relation::regclass,l.page,l.tuple,l.virtualxid,l.transactionid,l.classid,l.objid,l.objsubid,l.pid,l.mode,l.granted FROM pg_locks l JOIN pg_class c ON l.relation = c.oid WHERE c.relname = 'tt1';
locktype | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | pid | mode | granted
----------+----------+----------+------+-------+------------+---------------+---------+-------+----------+-----------------+-----------------+---------
relation | 16384 | tt1 | | | | | | | | 140405684233984 | AccessShareLock | t
(1 row)
tango=# SELECT datname,pid,sessionid,usename,application_name,backend_start,xact_start,query_start,state,query FROM pg_stat_activity where datname='tango';
datname | pid | sessionid | usename | application_name | backend_start | xact_start | query_start |
state | query
---------+-----------------+-----------+---------+------------------+-------------------------------+-------------------------------+-------------------------------+------
---------------+-----------------------------------------------------------------------------------------------------------------------------------------------------
tango | 140405684233984 | 8 | opgauss | gsql | 2024-05-26 15:45:47.008274+08 | 2024-05-26 15:47:40.481015+08 | 2024-05-26 15:47:45.822262+08 | idle
in transaction | select * from tt1;
当执行失败后,事务处于idle in transaction (aborted)状态,表锁持有的锁也不存在了。
tango=# SELECT l.locktype,l.database,l.relation::regclass,l.page,l.tuple,l.virtualxid,l.transactionid,l.classid,l.objid,l.objsubid,l.pid,l.mode,l.granted FROM pg_locks l JOIN pg_class c ON l.relation = c.oid WHERE c.relname = 'tt1';
locktype | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | pid | mode | granted
----------+----------+----------+------+-------+------------+---------------+---------+-------+----------+-----+------+---------
(0 rows)
tango=# SELECT datname,pid,sessionid,usename,application_name,backend_start,xact_start,query_start,state,query FROM pg_stat_activity where datname='tango';
datname | pid | sessionid | usename | application_name | backend_start | xact_start | query_start |
state | query
---------+-----------------+-----------+---------+------------------+-------------------------------+-------------------------------+-------------------------------+------
-------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------
----
tango | 140405684233984 | 8 | opgauss | gsql | 2024-05-26 15:45:47.008274+08 | | 2024-05-26 15:49:09.048895+08 | idle
in transaction (aborted) | insert into tt1 values(3,'2024-05-28','ccc');
可以看到openGauss数据库和MySQL数据库在这种故障场景下的不同表现,对于openGauss数据库而言,当事务内处理失败后,事务已经被数据库rollback了,事务中所持有的表锁也相应的释放了,其它如Oracle、PostgreSQL数据库是有相同的表现。
其它数据库因为时间关系暂时不验证了,总结针对这个场景需要优化的点有:①业务开发时候对事务报错主动处理,并显示的执行commit或rollback操作;②数据库层设置合理的DDL超时时间;③对分区表进行预创建和有效的监控手段;④数据库的DDL操作和业务处理主流程松耦合,尽量在投产窗口执行。
参考资料:
- https://docs-opengauss.osinfra.cn/zh/docs/5.0.0-lite