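Project scenario
The project has two tables: c_bill (the bill table) and c_bill_detail (the bill detail table). Their structures are shown below (only the relevant columns are kept):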
CREATE TABLE `c_bill_detail` (
`id` bigint unsigned NOT NULL AUTO_INCREMENT COMMENT 'Primary key',
`bill_detail_no` varchar(32) NOT NULL DEFAULT '' COMMENT 'Statement number',
`receivable_date` datetime(3) DEFAULT NULL COMMENT 'Receivable date',
`order_type` varchar(20) NOT NULL DEFAULT '',
`bill_no` varchar(32) NOT NULL DEFAULT '' COMMENT 'Bill number',
`invoice_amount` decimal(12,4) NOT NULL COMMENT 'Invoiced amount',
`active` tinyint NOT NULL DEFAULT '1' COMMENT 'Logical-delete flag',
PRIMARY KEY (`id`) USING BTREE,
KEY `idx_bill_no` (`bill_no`) USING BTREE
) ENGINE=InnoDB COMMENT='Customer bill detail';
CREATE TABLE `c_bill` (
`id` bigint unsigned NOT NULL AUTO_INCREMENT COMMENT 'Primary key',
`bill_no` varchar(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL DEFAULT '' COMMENT 'Statement number',
`should_receive_amount` decimal(12,4) NOT NULL COMMENT 'Total receivable amount',
`actual_should_receive_amount` decimal(12,4) NOT NULL COMMENT 'Actual receivable amount',
`invoice_status` tinyint DEFAULT NULL COMMENT 'Invoice status (dictionary: invoice-status)',
`invoice_amount` decimal(12,4) NOT NULL COMMENT 'Invoiced amount',
PRIMARY KEY (`id`) USING BTREE,
UNIQUE KEY `uk_bill_no` (`bill_no`) USING BTREE
) ENGINE=InnoDB COMMENT='Customer bill';
c_bill and c_bill_detail have a one-to-many relationship. The invoice_amount in c_bill is aggregated from the invoice_amount values in c_bill_detail.
The aggregation SQL is as follows:
UPDATE c_bill
SET invoice_amount = (SELECT ifnull(sum(invoice_amount), 0)
FROM c_bill_detail
WHERE bill_no = #{billNo}
AND active = 1
AND order_type in ('sale_order', 'supplement_order', 'subject_sale_order')),
invoice_date = #{invoiceDate},
invoice_status =
CASE
WHEN invoice_amount = should_receive_amount THEN 1
WHEN invoice_amount = 0 THEN 0
ELSE 2
END
where bill_no = #{billNo}
and active = 1;
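On the business side, issuing an invoice for a bill updates both the c_bill_detail table and the c_bill table.
Problem description
One day an alert fired in production.
The logs showed that a deadlock had occurred, and tracing it through the code pointed to the following SQL: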
UPDATE c_bill
SET invoice_amount = (SELECT ifnull(sum(invoice_amount), 0)
FROM c_bill_detail
WHERE bill_no = #{billNo}
AND active = 1
AND order_type in ('sale_order', 'supplement_order', 'subject_sale_order')),
invoice_date = #{invoiceDate},
invoice_status =
CASE
WHEN invoice_amount = should_receive_amount THEN 1
WHEN invoice_amount = 0 THEN 0
ELSE 2
END
where bill_no = #{billNo}
and active = 1;
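The database lock analysis yielded the following:
- Transaction 1 is waiting for an S lock on c_bill_detail; the index involved is PRIMARY (the primary key index on id).
- Transaction 1 holds an X lock on c_bill; the index involved is uk_bill_no.
- Transaction 2 is waiting for an X lock on c_bill; the index involved is uk_bill_no.
- Transaction 2 holds an S lock on c_bill_detail; the index involved is PRIMARY (the primary key index on id).
As this shows, transaction 1 and transaction 2 each wait for a lock the other holds. The waits form a cycle, which is a deadlock.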
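For readers who want to pull this kind of lock information from their own instance, the statements below are one common way to do it. This is a minimal sketch that assumes MySQL 8.0 with InnoDB and an account with the privileges to run them; it is not necessarily how the information above was captured.

```sql
-- Print the InnoDB monitor output. Its "LATEST DETECTED DEADLOCK" section
-- lists, for each transaction involved, the lock it is waiting for and
-- the lock(s) it holds, including table and index names.
SHOW ENGINE INNODB STATUS;

-- Optionally write every detected deadlock to the MySQL error log,
-- not only the most recent one (needs a privilege to set global variables).
SET GLOBAL innodb_print_all_deadlocks = ON;
```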
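Root cause analysis
Analysing the deadlock data
The lock information above points to two indexes involved in the deadlock: the primary key index of c_bill_detail and the uk_bill_no unique index of c_bill. We know that c_bill was touched by the aggregation SQL shown earlier, but what was done to c_bill_detail?
First, the audit log was used to find what the two transactions were doing at the time.
For thread 3213915 (transaction A), it shows the following update to c_bill_detail: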
-- SQL
UPDATE
c_bill_detail
SET
receivable_date = '2023-11-15 00:00:00',
invoice_status = 2,
invoice_amount = 305412,
updater = '管理员',
updater_code = 'ADMINISTRATOR',
update_time = '2023-12-08 17:47:52.382000'
WHERE id = 146947
AND active = 1
Thread 3213754 (transaction B) ran the following:
UPDATE
c_bill_detail
SET
receivable_date = '2023-11-15 00:00:00',
invoice_status = 2,
invoice_amount = 305412,
updater = '管理员',
updater_code = 'ADMINISTRATOR',
update_time = '2023-12-08 17:47:52.381000'
WHERE id = 147471
AND active = 1;
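So transaction A updated the c_bill_detail row with id = 146947, and transaction B updated the c_bill_detail row with id = 147471.
The audit log also shows that both transactions updated c_bill, and both updated the same row, bill_no = XSZD202309070005.
Transaction A: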
UPDATE
c_bill
SET
invoice_amount =
(SELECT
IFNULL(SUM(invoice_amount), 0)
FROM
c_bill_detail
WHERE bill_no = 'XSZD202309070005'
AND active = 1
AND order_type IN (
'sale_order',
'supplement_order',
'subject_sale_order'
)),
invoice_date = '2023-12-08 00:00:00',
invoice_status =
CASE
WHEN invoice_amount = should_receive_amount
THEN 1
WHEN invoice_amount = 0
THEN 0
ELSE 2
END
WHERE bill_no = 'XSZD202309070005'
AND active = 1;
Transaction B:
-- SQL
UPDATE
c_bill
SET
invoice_amount =
(SELECT
IFNULL(SUM(invoice_amount), 0)
FROM
c_bill_detail
WHERE bill_no = 'XSZD202309070005'
AND active = 1
AND order_type IN (
'sale_order',
'supplement_order',
'subject_sale_order'
)),
invoice_date = '2023-12-08 00:00:00',
invoice_status =
CASE
WHEN invoice_amount = should_receive_amount
THEN 1
WHEN invoice_amount = 0
THEN 0
ELSE 2
END
WHERE bill_no = 'XSZD202309070005'
AND active = 1;
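The audit record shows that transaction B eventually failed and was rolled back while updating c_bill (because of the deadlock).
Checking the data shows that the c_bill_detail rows with id = 146947 and id = 147471 both have bill_no XSZD202309070005.
So far this only establishes how the data is related; it does not explain why the deadlock occurs. The next step is to reproduce it in another environment.
The SELECT statement takes a shared read lock
To make the deadlock easier to reproduce, the order in which the SQL ran in production is summarised here: each transaction first updates its own c_bill_detail row, and then each runs the aggregation update on the same c_bill row.
In a local database, take the c_bill_detail rows with id = 19380 and id = 19381; both have the same bill_no = XSZD202211226768.
Open two transactions, run the SQL in that order, and watch the locks as you go:
Transaction A updates the row with id = 19380: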
Database changed
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
mysql> UPDATE
-> c_bill_detail
-> SET
-> receivable_date = '2023-11-15 00:00:00',
-> invoice_status = 2,
-> invoice_amount = 305412,
-> updater = '管理员',
-> updater_code = 'ADMINISTRATOR',
-> update_time = '2023-12-08 17:47:52.382000'
-> WHERE id = 19380
-> AND active = 1;
Query OK, 1 row affected (0.03 sec)
Rows matched: 1 Changed: 1 Warnings: 0
Transaction B updates the row with id = 19381:
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
mysql> UPDATE
-> c_bill_detail
-> SET
-> receivable_date = '2023-11-15 00:00:00',
-> invoice_status = 2,
-> invoice_amount = 305412,
-> updater = '管理员',
-> updater_code = 'ADMINISTRATOR',
-> update_time = '2023-12-08 17:47:52.381000'
-> WHERE id = 19381
-> AND active = 1;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
Now check the locks:
mysql> select * from performance_schema.data_locks\G;
*************************** 1. row ***************************
// table intention lock omitted
*************************** 2. row ***************************
ENGINE: INNODB
ENGINE_LOCK_ID: 139645347952208:230:8:12:139645358792496
ENGINE_TRANSACTION_ID: 65810
THREAD_ID: 563867
EVENT_ID: 34
OBJECT_SCHEMA: fresh
OBJECT_NAME: c_bill_detail
PARTITION_NAME: NULL
SUBPARTITION_NAME: NULL
INDEX_NAME: PRIMARY
OBJECT_INSTANCE_BEGIN: 139645358792496
LOCK_TYPE: RECORD
LOCK_MODE: X,REC_NOT_GAP
LOCK_STATUS: GRANTED
LOCK_DATA: 19381
*************************** 3. row ***************************
// table lock omitted
*************************** 4. row ***************************
ENGINE: INNODB
ENGINE_LOCK_ID: 139645347951400:230:8:9:139645358786480
ENGINE_TRANSACTION_ID: 65809
THREAD_ID: 563866
EVENT_ID: 34
OBJECT_SCHEMA: fresh
OBJECT_NAME: c_bill_detail
PARTITION_NAME: NULL
SUBPARTITION_NAME: NULL
INDEX_NAME: PRIMARY
OBJECT_INSTANCE_BEGIN: 139645358786480
LOCK_TYPE: RECORD
LOCK_MODE: X,REC_NOT_GAP
LOCK_STATUS: GRANTED
LOCK_DATA: 19380
4 rows in set (0.00 sec)
The output shows X locks on the c_bill_detail rows with id = 19381 and id = 19380, which is as expected.
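The full data_locks output is long. A narrower query, sketched below, lists only the record locks. It assumes the schema name fresh that appears in the output above and uses only columns of performance_schema.data_locks.

```sql
-- Show record locks only, grouped by owning transaction.
-- Table-level intention locks (LOCK_TYPE = 'TABLE') are filtered out.
SELECT ENGINE_TRANSACTION_ID,
       THREAD_ID,
       OBJECT_NAME,
       INDEX_NAME,
       LOCK_MODE,
       LOCK_STATUS,
       LOCK_DATA
FROM performance_schema.data_locks
WHERE OBJECT_SCHEMA = 'fresh'
  AND LOCK_TYPE = 'RECORD'
ORDER BY ENGINE_TRANSACTION_ID, OBJECT_NAME, INDEX_NAME;
```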
Next, transaction A runs the update on c_bill:
mysql> UPDATE
-> c_bill
-> SET
-> invoice_amount =
-> (SELECT
-> IFNULL(SUM(invoice_amount), 0)
-> FROM
-> c_bill_detail
-> WHERE bill_no = 'XSZD202211226768'
-> AND active = 1
-> AND order_type IN (
-> 'sale_order',
-> 'supplement_order',
-> 'subject_sale_order'
-> )),
-> invoice_date = '2023-12-08 00:00:00',
-> invoice_status =
-> CASE
-> WHEN invoice_amount = should_receive_amount
-> THEN 1
-> WHEN invoice_amount = 0
-> THEN 0
-> ELSE 2
-> END
-> WHERE bill_no = 'XSZD202211226768'
-> AND active = 1;
At this point transaction A blocks.
Check the locks again:
mysql> select * from performance_schema.data_locks\G;
*************************** 1. row ***************************
// table lock
*************************** 2. row ***************************
ENGINE: INNODB
ENGINE_LOCK_ID: 139645347952208:230:8:16:139645358792496
ENGINE_TRANSACTION_ID: 65820
THREAD_ID: 563867
EVENT_ID: 43
OBJECT_SCHEMA: fresh
OBJECT_NAME: c_bill_detail
PARTITION_NAME: NULL
SUBPARTITION_NAME: NULL
INDEX_NAME: PRIMARY
OBJECT_INSTANCE_BEGIN: 139645358792496
LOCK_TYPE: RECORD
LOCK_MODE: X,REC_NOT_GAP
LOCK_STATUS: GRANTED
LOCK_DATA: 19381
*************************** 3. row ***************************
// intention lock on c_bill
*************************** 4. row ***************************
// intention lock on c_bill_detail
*************************** 5. row ***************************
ENGINE: INNODB
ENGINE_LOCK_ID: 139645347951400:230:8:15:139645358786480
ENGINE_TRANSACTION_ID: 65819
THREAD_ID: 563866
EVENT_ID: 44
OBJECT_SCHEMA: fresh
OBJECT_NAME: c_bill_detail
PARTITION_NAME: NULL
SUBPARTITION_NAME: NULL
INDEX_NAME: PRIMARY
OBJECT_INSTANCE_BEGIN: 139645358786480
LOCK_TYPE: RECORD
LOCK_MODE: X,REC_NOT_GAP
LOCK_STATUS: GRANTED
LOCK_DATA: 19380
*************************** 6. row ***************************
ENGINE: INNODB
ENGINE_LOCK_ID: 139645347951400:229:5:6:139645358786824
ENGINE_TRANSACTION_ID: 65819
THREAD_ID: 563866
EVENT_ID: 45
OBJECT_SCHEMA: fresh
OBJECT_NAME: c_bill
PARTITION_NAME: NULL
SUBPARTITION_NAME: NULL
INDEX_NAME: uk_bill_no
OBJECT_INSTANCE_BEGIN: 139645358786824
LOCK_TYPE: RECORD
LOCK_MODE: X,REC_NOT_GAP
LOCK_STATUS: GRANTED
LOCK_DATA: 'XSZD202211226768', 5117
*************************** 7. row ***************************
ENGINE: INNODB
ENGINE_LOCK_ID: 139645347951400:229:7:6:139645358787168
ENGINE_TRANSACTION_ID: 65819
THREAD_ID: 563866
EVENT_ID: 45
OBJECT_SCHEMA: fresh
OBJECT_NAME: c_bill
PARTITION_NAME: NULL
SUBPARTITION_NAME: NULL
INDEX_NAME: PRIMARY
OBJECT_INSTANCE_BEGIN: 139645358787168
LOCK_TYPE: RECORD
LOCK_MODE: X,REC_NOT_GAP
LOCK_STATUS: GRANTED
LOCK_DATA: 5117
*************************** 8. row ***************************
ENGINE: INNODB
ENGINE_LOCK_ID: 139645347951400:230:58:117:139645358787512
ENGINE_TRANSACTION_ID: 65819
THREAD_ID: 563866
EVENT_ID: 45
OBJECT_SCHEMA: fresh
OBJECT_NAME: c_bill_detail
PARTITION_NAME: NULL
SUBPARTITION_NAME: NULL
INDEX_NAME: idx_bill_no
OBJECT_INSTANCE_BEGIN: 139645358787512
LOCK_TYPE: RECORD
LOCK_MODE: S
LOCK_STATUS: GRANTED
LOCK_DATA: 'XSZD202211226768', 19380
*************************** 9. row ***************************
ENGINE: INNODB
ENGINE_LOCK_ID: 139645347951400:230:58:118:139645358787512
ENGINE_TRANSACTION_ID: 65819
THREAD_ID: 563866
EVENT_ID: 45
OBJECT_SCHEMA: fresh
OBJECT_NAME: c_bill_detail
PARTITION_NAME: NULL
SUBPARTITION_NAME: NULL
INDEX_NAME: idx_bill_no
OBJECT_INSTANCE_BEGIN: 139645358787512
LOCK_TYPE: RECORD
LOCK_MODE: S
LOCK_STATUS: GRANTED
LOCK_DATA: 'XSZD202211226768', 19381
*************************** 10. row ***************************
ENGINE: INNODB
ENGINE_LOCK_ID: 139645347951400:230:8:16:139645358787856
ENGINE_TRANSACTION_ID: 65819
THREAD_ID: 563866
EVENT_ID: 45
OBJECT_SCHEMA: fresh
OBJECT_NAME: c_bill_detail
PARTITION_NAME: NULL
SUBPARTITION_NAME: NULL
INDEX_NAME: PRIMARY
OBJECT_INSTANCE_BEGIN: 139645358787856
LOCK_TYPE: RECORD
LOCK_MODE: S,REC_NOT_GAP
LOCK_STATUS: WAITING
LOCK_DATA: 19381
10 rows in set (0.00 sec)
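Row 10 shows that transaction A has requested an S lock on the c_bill_detail row with id = 19381 and that the lock status is WAITING, because transaction B still holds the X lock on that row (row 2).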
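Who is blocking whom can also be read directly instead of being inferred from the full lock list. The query below is a sketch that assumes MySQL 8.0; it joins performance_schema.data_lock_waits back to data_locks, and the column aliases are illustrative names only.

```sql
-- One row per lock wait: the waiting lock request and the lock blocking it.
SELECT w.REQUESTING_ENGINE_TRANSACTION_ID AS waiting_trx,
       r.OBJECT_NAME                      AS table_name,
       r.INDEX_NAME                       AS index_name,
       r.LOCK_MODE                        AS waiting_mode,
       r.LOCK_DATA                        AS lock_data,
       w.BLOCKING_ENGINE_TRANSACTION_ID   AS blocking_trx,
       b.LOCK_MODE                        AS blocking_mode
FROM performance_schema.data_lock_waits AS w
JOIN performance_schema.data_locks AS r
  ON r.ENGINE_LOCK_ID = w.REQUESTING_ENGINE_LOCK_ID
JOIN performance_schema.data_locks AS b
  ON b.ENGINE_LOCK_ID = w.BLOCKING_ENGINE_LOCK_ID;
```

Run while transaction A is blocked, this should pair A's waiting S,REC_NOT_GAP request on PRIMARY with B's granted X,REC_NOT_GAP lock on the same row.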
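Transaction B then also runs the update on c_bill, and the deadlock occurs.
To summarise the locking sequence:
- Transaction A first takes the X lock on the c_bill_detail row with id = 19380.
- Transaction B then takes the X lock on the c_bill_detail row with id = 19381. It does not compete with the previous lock, so both statements run normally.
- Transaction A updates c_bill. This puts an X lock on the c_bill row with bill_no = XSZD202211226768 and also requests an S lock on the c_bill_detail row with id = 19381, which has to wait.
- Transaction B updates c_bill. It needs the X lock on the same c_bill row, which transaction A holds, and would likewise request an S lock on the c_bill_detail row with id = 19380, which transaction A has X-locked. Transaction A is itself waiting for a lock held by transaction B, so the two wait on each other.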
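A few points to note:
- S locks and X locks are incompatible, so a lock wait occurs.
- A plain select statement does not take locks, but a select used inside an update statement to compute the assigned value does take shared locks.
- The shared lock mainly ensures that each read sees the latest value (the rows cannot be modified while they are being read).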
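The second point can be checked in isolation. The sketch below is an illustration, not part of the original reproduction. It swaps the subquery for an explicit locking read (FOR SHARE, available in MySQL 8.0), which also requests shared locks on the rows it reads. It assumes two separate sessions, an isolation level other than SERIALIZABLE, and the local test rows used above.

```sql
-- Session 1: X-lock one detail row and keep the transaction open.
BEGIN;
UPDATE c_bill_detail SET invoice_amount = 305412 WHERE id = 19381 AND active = 1;

-- Session 2: a plain SELECT is a non-locking consistent read,
-- so it returns immediately even though the row is X-locked.
SELECT id, invoice_amount FROM c_bill_detail WHERE bill_no = 'XSZD202211226768';

-- Session 2: a locking read needs S locks on the same rows, so it waits
-- until session 1 ends its transaction or innodb_lock_wait_timeout expires.
SELECT id, invoice_amount FROM c_bill_detail WHERE bill_no = 'XSZD202211226768' FOR SHARE;

-- Session 1: release the X lock so session 2 can continue.
ROLLBACK;
```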
That is the analysis of how the deadlock formed in production. For more on X locks and S locks, see: Locks in InnoDB