0. Introduction
This article covers locking in PostgreSQL (PG): two-phase locking, the lock levels PG provides, deadlocks, and how to inspect locks.
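1. Two-Phase Locking in PG
1.1 Why a lock mechanism is needed
PG implements isolation with MVCC together with two-phase locking. Why keep pessimistic locks when MVCC exists? Consider pure snapshot isolation without locks: every transaction works against a snapshot, that is, a specific version of each tuple it touches. At commit time the transaction would have to check whether every tuple it touched still matches the current version in the database, and roll back on any mismatch, which would cause a large number of rollbacks. MVCC mainly ensures that reads and writes do not block each other. The remaining conflicts are handled by taking locks while the transaction runs rather than deferring every check to commit time, which reduces rollbacks.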
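1.2 Two-phase locking
Locking rule: a transaction takes a shared lock (S lock) on a database object before reading it and an exclusive lock (X lock) before modifying it.
Locking and unlocking are split into two phases:
1) Growing phase: the transaction may request locks of any type but may not release any lock.
2) Shrinking phase: the transaction may release locks but may not request any new lock.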
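The two phases can be sketched as a small state machine. This is a minimal illustration, not PG code: the `Txn` struct and the `txn_lock`/`txn_unlock` functions are hypothetical names, objects are identified by a number from 0 to 31, and conflicts between different transactions are deliberately left out so that only the phase rule is visible.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-transaction state used only to illustrate the 2PL phases. */
typedef struct Txn
{
    unsigned held;      /* bitmask of object ids (0..31) currently locked */
    bool     shrinking; /* true once the first lock has been released */
} Txn;

/* Growing phase: take a lock on object `obj`. Refused after any release. */
bool
txn_lock(Txn *t, int obj)
{
    assert(obj >= 0 && obj < 32);
    if (t->shrinking)
        return false;           /* 2PL: no new locks in the shrinking phase */
    t->held |= 1u << obj;
    return true;
}

/* Release a lock; the first release moves the transaction to the shrinking phase. */
bool
txn_unlock(Txn *t, int obj)
{
    assert(obj >= 0 && obj < 32);
    if (!(t->held & (1u << obj)))
        return false;           /* this transaction does not hold the lock */
    t->held &= ~(1u << obj);
    t->shrinking = true;
    return true;
}
```

Once `txn_unlock` has succeeded, every later `txn_lock` call on the same transaction returns false, which is exactly the restriction the shrinking phase imposes.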
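The correctness of two-phase locking can be shown by contradiction. A schedule whose precedence graph contains a cycle is not conflict-serializable, so assume such a cycle T1 → T2 → ... → Tn → T1 exists.
In the precedence graph illustrated above, T1 and T2 perform conflicting operations on the same object with T1 going first, so T1 must release its lock before T2 is allowed to lock that object.
If T1 and T2 both obey the two-phase locking protocol, T1 is already in its shrinking phase while T2 is still in its growing phase. By the same argument, T3 is still in its growing phase when T2 is in its shrinking phase, and so on.
Because the precedence graph has a cycle, there is also a conflict dependency from Tn to T1, so T1 would still be in its growing phase when Tn is in its shrinking phase.
But T1 entered its shrinking phase at the start of the chain and cannot return to the growing phase, which violates the two-phase locking protocol. Hence a precedence graph with a cycle cannot arise when every transaction obeys two-phase locking.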
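The property the proof relies on, whether a precedence graph is acyclic, can be checked with a depth-first search. The following is a sketch under stated assumptions: `PrecedenceGraph`, `has_cycle`, and the fixed `MAX_TXN` bound are illustrative names, and the adjacency matrix is assumed to be filled in by whoever analyses the schedule.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_TXN 8

/* edge[i][j] is true when Ti must precede Tj (edge Ti -> Tj). */
typedef struct PrecedenceGraph
{
    int  n;                         /* number of transactions in use */
    bool edge[MAX_TXN][MAX_TXN];
} PrecedenceGraph;

enum { WHITE = 0, GRAY, BLACK };    /* unvisited, on the DFS stack, finished */

static bool
visit(const PrecedenceGraph *g, int u, int *color)
{
    color[u] = GRAY;
    for (int v = 0; v < g->n; v++)
    {
        if (!g->edge[u][v])
            continue;
        if (color[v] == GRAY)
            return true;            /* back edge: a cycle exists */
        if (color[v] == WHITE && visit(g, v, color))
            return true;
    }
    color[u] = BLACK;
    return false;
}

/* Returns true if the graph has a cycle, i.e. the schedule is not
 * conflict-serializable. */
bool
has_cycle(const PrecedenceGraph *g)
{
    int color[MAX_TXN] = {0};       /* all WHITE */

    assert(g->n >= 0 && g->n <= MAX_TXN);
    for (int u = 0; u < g->n; u++)
        if (color[u] == WHITE && visit(g, u, color))
            return true;
    return false;
}
```

With edges T1 → T2 → T3 the function reports no cycle; adding T3 → T1 makes it report one.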
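PG strengthens basic two-phase locking into what is called S2PL (Strict 2PL): locks are not released one by one while the transaction runs but all together when the transaction ends (commit or rollback).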
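1.3 Deadlocks under two-phase locking
A deadlock is a standoff between processes/threads that compete for resources: without outside intervention, none of them can make progress. In theory, four conditions must hold at the same time for a deadlock to occur:
1) Mutual exclusion: a resource is held with exclusive control
2) Hold and wait: a process requests new resources without releasing the ones it holds
3) No preemption: resources cannot be taken away from another process/thread; the holder must release them itself
4) Circular wait: the wait relationships form a cycle
A DBMS can satisfy all four:
1) Mutual exclusion: exclusive locks
2) Hold and wait: transactions follow two-phase locking (2PL) and release no lock during the growing phase
3) No preemption: a transaction cannot release locks held by another transaction
4) Circular wait: locks are acquired in the order statements happen to execute, not in a predetermined global order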
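Three strategies are commonly used:
1) Deadlock prevention
2) Deadlock avoidance
3) Deadlock detection and recovery
Their overhead decreases in that order. PG uses the third: because deadlocks are rare, detecting and resolving them only when one is suspected gives better performance and uses fewer resources.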
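1.3.1 When deadlock detection runs
When a process requests a lock that another process already holds in a conflicting mode, it starts a timer and goes to sleep. If it obtains the lock before the timer expires, no deadlock occurred. If the timer expires first (the interval is the deadlock_timeout setting, DeadlockTimeout in the code), the process runs the deadlock detection algorithm. The relevant code: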
void
CheckDeadLockAlert(void)
{
    int         save_errno = errno;

    got_deadlock_timeout = true;

    /*
     * Have to set the latch again, even if handle_sig_alarm already did. Back
     * then got_deadlock_timeout wasn't yet set... It's unlikely that this
     * ever would be a problem, but setting a set latch again is cheap.
     *
     * Note that, when this function runs inside procsignal_sigusr1_handler(),
     * the handler function sets the latch again after the latch is set here.
     */
    SetLatch(MyLatch);
    errno = save_errno;
}
/* Simplified excerpt: the return type and most of the body are omitted. */
ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
{
    /* Reset the timeout flag */
    got_deadlock_timeout = false;

    /* Start the deadlock timer */
    enable_timeout_after(DEADLOCK_TIMEOUT, DeadlockTimeout);

    do
    {
        if (/* ... */)
        {
            /* ... */
        }
        else
        {
            WaitLatch(MyLatch, WL_LATCH_SET, 0,
                      PG_WAIT_LOCK | locallock->tag.lock.locktag_type);

            /* The process has been woken up */
            ResetLatch(MyLatch);

            /* If the wakeup was caused by the deadlock timeout, check for a deadlock */
            if (got_deadlock_timeout)
            {
                CheckDeadLock();
                got_deadlock_timeout = false;
            }
            CHECK_FOR_INTERRUPTS();
        }
        /* ... */
    } while (myWaitStatus == STATUS_WAITING);

    /* ... */

    /* Cancel the deadlock timer */
    disable_timeout(DEADLOCK_TIMEOUT, false);
}
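1.3.2 Lock wait queue
A process that has to wait for a lock is placed in that lock's wait queue, normally at the tail. If the new waiter already holds a lock that conflicts with the request of a process already in the queue, it is inserted in front of that process instead, which lowers the chance of a deadlock.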
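The queue placement rule can be sketched as follows. This is a simplified model rather than the logic in ProcSleep: it uses only two modes (S and X), and `Waiter`, `WaitQueue`, `enqueue_waiter`, and the `conflicts` table are hypothetical names introduced for the illustration. The real code also handles other cases that are omitted here.

```c
#include <assert.h>

#define MAX_WAITERS 16
#define BIT(m) (1u << (m))

enum { MODE_S = 1, MODE_X = 2 };

/* conflicts[m]: bitmask of held modes that conflict with a request for m. */
static const unsigned conflicts[] = {
    0,
    BIT(MODE_X),                /* S conflicts with X */
    BIT(MODE_S) | BIT(MODE_X)   /* X conflicts with S and X */
};

typedef struct Waiter
{
    int      pid;
    int      req_mode;          /* mode this process is waiting for */
    unsigned held_mask;         /* modes it already holds on the same lock */
} Waiter;

typedef struct WaitQueue
{
    int    len;
    Waiter w[MAX_WAITERS];
} WaitQueue;

/* Insert a new waiter and return the position it was placed at. */
int
enqueue_waiter(WaitQueue *q, Waiter nw)
{
    int pos;

    assert(q->len < MAX_WAITERS);
    pos = q->len;               /* default: append at the tail */
    for (int i = 0; i < q->len; i++)
    {
        /* An earlier waiter is blocked by a lock the new waiter holds:
         * queue the new waiter in front of it. */
        if (conflicts[q->w[i].req_mode] & nw.held_mask)
        {
            pos = i;
            break;
        }
    }
    for (int i = q->len; i > pos; i--)
        q->w[i] = q->w[i - 1];  /* shift later waiters back by one */
    q->w[pos] = nw;
    q->len++;
    return pos;
}
```

For example, if process 102 is queued waiting for X and process 103, which already holds S, now requests X, then 103 is placed ahead of 102: 102 cannot proceed until 103 has finished anyway.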
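1.3.3 Deadlock detection and resolution
PG's deadlock checker looks for a cycle. Since the first three conditions (mutual exclusion, hold and wait, no preemption) already hold, finding a cycle confirms that a deadlock has occurred.
PG represents the wait relationships as a directed graph: each vertex is a requesting process, and each directed edge is a wait relationship pointing from the waiter to the holder. The edge structure is shown below; see deadlock.c for the full code.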
typedef struct
{
    PGPROC     *waiter;         /* the leader of the waiting lock group */
    PGPROC     *blocker;        /* the leader of the group it is waiting for */
    LOCK       *lock;           /* the lock being waited for */
    int         pred;           /* workspace for TopoSort */
    int         link;           /* workspace for TopoSort */
} EDGE;
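Each check starts from the vertex of the current process. If following the edges leads back to the starting vertex, the deadlock involves that process.
PG also distinguishes hard edges from soft edges:
1) hard edge: the waiter is blocked by a lock the other process already holds, so the dependency already exists
2) soft edge: the other process is ahead of the waiter in the wait queue with a conflicting request, so the dependency would arise later, once that request is granted
If a cycle depends on a soft edge, PG can try to resolve it in advance by reordering the lock wait queue. If that does not work, the transaction that triggered the deadlock check is rolled back to break the deadlock.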
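The "start from the current process and see whether the path returns to it" check can be sketched with a small wait-for graph. This is an illustration only: `WaitGraph`, `in_deadlock`, and the adjacency matrix are assumptions made for the sketch, and the hard/soft edge distinction and queue reordering done in deadlock.c are not modeled.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PROCS 8

typedef struct WaitGraph
{
    int  n;                                 /* number of processes in use */
    bool waits_for[MAX_PROCS][MAX_PROCS];   /* [a][b]: a waits for b */
} WaitGraph;

/* Follow wait edges from `cur`; true if some path leads back to `start`. */
static bool
reaches(const WaitGraph *g, int cur, int start, bool *seen)
{
    for (int next = 0; next < g->n; next++)
    {
        if (!g->waits_for[cur][next])
            continue;
        if (next == start)
            return true;                    /* the path closed a cycle */
        if (!seen[next])
        {
            seen[next] = true;
            if (reaches(g, next, start, seen))
                return true;
        }
    }
    return false;
}

/* True if process `start` is itself part of a wait cycle. */
bool
in_deadlock(const WaitGraph *g, int start)
{
    bool seen[MAX_PROCS] = { false };

    assert(g->n <= MAX_PROCS && start >= 0 && start < g->n);
    seen[start] = true;
    return reaches(g, start, start, seen);
}
```

Note that a process waiting on a cycle it does not belong to is not reported: with 0 → 1, 1 → 2, 2 → 1, the check started from process 0 returns false, while the check started from process 1 or 2 returns true.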
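2. Lock Types in PG
2.1 Data structures used by PG locks
The lock tag type identifies what is being locked (a relation, a page, a tuple, and so on), i.e. the granularity of the lock: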
typedef enum LockTagType
{
    LOCKTAG_RELATION,            /* whole relation */
    LOCKTAG_RELATION_EXTEND,     /* the right to extend a relation */
    LOCKTAG_DATABASE_FROZEN_IDS, /* pg_database.datfrozenxid */
    LOCKTAG_PAGE,                /* one page of a relation */
    LOCKTAG_TUPLE,               /* one physical tuple */
    LOCKTAG_TRANSACTION,         /* transaction (for waiting for xact done) */
    LOCKTAG_VIRTUALTRANSACTION,  /* virtual transaction (ditto) */
    LOCKTAG_SPECULATIVE_TOKEN,   /* speculative insertion Xid and token */
    LOCKTAG_OBJECT,              /* non-relation database object */
    LOCKTAG_USERLOCK,            /* reserved for old contrib/userlock code */
    LOCKTAG_ADVISORY,            /* advisory user locks */
} LockTagType;
The lock mode expresses share, exclusive, access and similar semantics, i.e. how restrictive the lock is:
#define NoLock                   0

#define AccessShareLock          1  /* SELECT */
#define RowShareLock             2  /* SELECT FOR UPDATE/FOR SHARE */
#define RowExclusiveLock         3  /* INSERT, UPDATE, DELETE */
#define ShareUpdateExclusiveLock 4  /* VACUUM (non-FULL), ANALYZE, CREATE
                                     * INDEX CONCURRENTLY */
#define ShareLock                5  /* CREATE INDEX (WITHOUT CONCURRENTLY) */
#define ShareRowExclusiveLock    6  /* like EXCLUSIVE MODE, but allows ROW
                                     * SHARE */
#define ExclusiveLock            7  /* blocks ROW SHARE/SELECT...FOR UPDATE */
#define AccessExclusiveLock      8  /* ALTER TABLE, DROP TABLE, VACUUM FULL,
                                     * and unqualified LOCK TABLE */

#define MaxLockMode              8  /* highest standard lock mode */
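The conflict table is used to decide whether a lock request conflicts. The entry for each mode is a bitmask of the modes it conflicts with: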
static const LOCKMASK LockConflicts[] = {
    0,

    /* AccessShareLock */
    LOCKBIT_ON(AccessExclusiveLock),

    /* RowShareLock */
    LOCKBIT_ON(ExclusiveLock) | LOCKBIT_ON(AccessExclusiveLock),

    /* RowExclusiveLock */
    LOCKBIT_ON(ShareLock) | LOCKBIT_ON(ShareRowExclusiveLock) |
    LOCKBIT_ON(ExclusiveLock) | LOCKBIT_ON(AccessExclusiveLock),

    /* ShareUpdateExclusiveLock */
    LOCKBIT_ON(ShareUpdateExclusiveLock) |
    LOCKBIT_ON(ShareLock) | LOCKBIT_ON(ShareRowExclusiveLock) |
    LOCKBIT_ON(ExclusiveLock) | LOCKBIT_ON(AccessExclusiveLock),

    /* ShareLock */
    LOCKBIT_ON(RowExclusiveLock) | LOCKBIT_ON(ShareUpdateExclusiveLock) |
    LOCKBIT_ON(ShareRowExclusiveLock) |
    LOCKBIT_ON(ExclusiveLock) | LOCKBIT_ON(AccessExclusiveLock),

    /* ShareRowExclusiveLock */
    LOCKBIT_ON(RowExclusiveLock) | LOCKBIT_ON(ShareUpdateExclusiveLock) |
    LOCKBIT_ON(ShareLock) | LOCKBIT_ON(ShareRowExclusiveLock) |
    LOCKBIT_ON(ExclusiveLock) | LOCKBIT_ON(AccessExclusiveLock),

    /* ExclusiveLock */
    LOCKBIT_ON(RowShareLock) |
    LOCKBIT_ON(RowExclusiveLock) | LOCKBIT_ON(ShareUpdateExclusiveLock) |
    LOCKBIT_ON(ShareLock) | LOCKBIT_ON(ShareRowExclusiveLock) |
    LOCKBIT_ON(ExclusiveLock) | LOCKBIT_ON(AccessExclusiveLock),

    /* AccessExclusiveLock */
    LOCKBIT_ON(AccessShareLock) | LOCKBIT_ON(RowShareLock) |
    LOCKBIT_ON(RowExclusiveLock) | LOCKBIT_ON(ShareUpdateExclusiveLock) |
    LOCKBIT_ON(ShareLock) | LOCKBIT_ON(ShareRowExclusiveLock) |
    LOCKBIT_ON(ExclusiveLock) | LOCKBIT_ON(AccessExclusiveLock)
};
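To see how such a table is consulted, here is a standalone sketch that copies the table above and tests one mode against another with a single bitwise AND. The `LOCKMASK`/`LOCKMODE` typedefs and the `LOCKBIT_ON` macro are local stand-ins so that the file compiles on its own, and `modes_conflict` is a hypothetical helper name, not a function from the PG source.

```c
#include <assert.h>
#include <stdbool.h>

typedef int LOCKMASK;   /* stand-in typedefs for this sketch */
typedef int LOCKMODE;
#define LOCKBIT_ON(lockmode) (1 << (lockmode))

enum
{
    NoLock = 0, AccessShareLock, RowShareLock, RowExclusiveLock,
    ShareUpdateExclusiveLock, ShareLock, ShareRowExclusiveLock,
    ExclusiveLock, AccessExclusiveLock, MaxLockMode = AccessExclusiveLock
};

/* Same contents as the conflict table shown above, indexed by lock mode. */
static const LOCKMASK LockConflicts[] = {
    0,
    /* AccessShareLock */
    LOCKBIT_ON(AccessExclusiveLock),
    /* RowShareLock */
    LOCKBIT_ON(ExclusiveLock) | LOCKBIT_ON(AccessExclusiveLock),
    /* RowExclusiveLock */
    LOCKBIT_ON(ShareLock) | LOCKBIT_ON(ShareRowExclusiveLock) |
    LOCKBIT_ON(ExclusiveLock) | LOCKBIT_ON(AccessExclusiveLock),
    /* ShareUpdateExclusiveLock */
    LOCKBIT_ON(ShareUpdateExclusiveLock) |
    LOCKBIT_ON(ShareLock) | LOCKBIT_ON(ShareRowExclusiveLock) |
    LOCKBIT_ON(ExclusiveLock) | LOCKBIT_ON(AccessExclusiveLock),
    /* ShareLock */
    LOCKBIT_ON(RowExclusiveLock) | LOCKBIT_ON(ShareUpdateExclusiveLock) |
    LOCKBIT_ON(ShareRowExclusiveLock) |
    LOCKBIT_ON(ExclusiveLock) | LOCKBIT_ON(AccessExclusiveLock),
    /* ShareRowExclusiveLock */
    LOCKBIT_ON(RowExclusiveLock) | LOCKBIT_ON(ShareUpdateExclusiveLock) |
    LOCKBIT_ON(ShareLock) | LOCKBIT_ON(ShareRowExclusiveLock) |
    LOCKBIT_ON(ExclusiveLock) | LOCKBIT_ON(AccessExclusiveLock),
    /* ExclusiveLock */
    LOCKBIT_ON(RowShareLock) |
    LOCKBIT_ON(RowExclusiveLock) | LOCKBIT_ON(ShareUpdateExclusiveLock) |
    LOCKBIT_ON(ShareLock) | LOCKBIT_ON(ShareRowExclusiveLock) |
    LOCKBIT_ON(ExclusiveLock) | LOCKBIT_ON(AccessExclusiveLock),
    /* AccessExclusiveLock */
    LOCKBIT_ON(AccessShareLock) | LOCKBIT_ON(RowShareLock) |
    LOCKBIT_ON(RowExclusiveLock) | LOCKBIT_ON(ShareUpdateExclusiveLock) |
    LOCKBIT_ON(ShareLock) | LOCKBIT_ON(ShareRowExclusiveLock) |
    LOCKBIT_ON(ExclusiveLock) | LOCKBIT_ON(AccessExclusiveLock)
};

/* True if a request for `requested` conflicts with an already held `held`. */
bool
modes_conflict(LOCKMODE requested, LOCKMODE held)
{
    assert(requested >= 1 && requested <= MaxLockMode);
    assert(held >= 1 && held <= MaxLockMode);
    return (LockConflicts[requested] & LOCKBIT_ON(held)) != 0;
}
```

For instance, `modes_conflict(AccessExclusiveLock, RowExclusiveLock)` is true, which is why a TRUNCATE has to wait behind an uncommitted INSERT on the same table, while two `RowExclusiveLock` holders do not conflict at the table level.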
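2.2 PG lock workflow
2.2.1 Acquiring a lock
The main logic is in LockAcquire. It first looks the lock up in the backend-local lock table. If this backend already holds the lock, only the local hold count (nLocks) is incremented; otherwise the lock is actually requested, and the backend waits if the requested mode conflicts.
2.2.2 Releasing a lock
Release also starts with a lookup in the local lock table. If the entry exists, its owner count is decremented by one, and the lock is actually released once the count reaches zero.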
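The local counting described in 2.2.1 and 2.2.2 can be sketched as a small reference-counted table. This is a toy model under stated assumptions: `LocalLockTable`, `local_acquire`, and `local_release` are hypothetical names, a lock is identified by an integer tag plus a mode, and the `shared_acquires`/`shared_releases` counters merely stand in for the trips to the shared lock table that the real code makes.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_LOCAL 16

typedef struct LocalLock
{
    int  tag;       /* simplified lock identity */
    int  mode;
    long nLocks;    /* how many times this backend holds the lock */
} LocalLock;

typedef struct LocalLockTable
{
    int       n;
    LocalLock e[MAX_LOCAL];
    int       shared_acquires;  /* stand-in: requests sent to the shared table */
    int       shared_releases;  /* stand-in: releases sent to the shared table */
} LocalLockTable;

static int
find_local(const LocalLockTable *t, int tag, int mode)
{
    for (int i = 0; i < t->n; i++)
        if (t->e[i].tag == tag && t->e[i].mode == mode)
            return i;
    return -1;
}

void
local_acquire(LocalLockTable *t, int tag, int mode)
{
    int i = find_local(t, tag, mode);

    if (i >= 0)
    {
        t->e[i].nLocks++;       /* already held locally: just count it */
        return;
    }
    assert(t->n < MAX_LOCAL);
    t->shared_acquires++;       /* first acquisition: go to the shared table */
    t->e[t->n++] = (LocalLock) { tag, mode, 1 };
}

bool
local_release(LocalLockTable *t, int tag, int mode)
{
    int i = find_local(t, tag, mode);

    if (i < 0)
        return false;           /* not held by this backend */
    if (--t->e[i].nLocks == 0)
    {
        t->shared_releases++;   /* last reference: really release */
        t->e[i] = t->e[--t->n]; /* drop the local entry */
    }
    return true;
}
```

Acquiring the same lock twice touches the shared table only once, and only the second matching release gives the lock back.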
2.3 Inspecting PG locks
Run the following in two sessions:
Session 1:
postgres=# begin;
BEGIN
postgres=# insert into t4 values(7);
INSERT 0 1
Session 2:
postgres=# begin;
BEGIN
postgres=# select * from t4;
a
---
5
(1 row)
postgres=# insert into t4 values(7);
INSERT 0 1
postgres=# select * from t4;
a
---
5
7
(2 rows)
postgres=# truncate t4;
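At this point session 2 blocks: TRUNCATE needs an AccessExclusiveLock on t4, which conflicts with the RowExclusiveLock that session 1 still holds from its uncommitted INSERT.
Use the following statement to view the lock wait information: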
with
t_wait as
(
  select a.mode, a.locktype, a.database, a.relation, a.page, a.tuple, a.classid, a.granted,
         a.objid, a.objsubid, a.pid, a.virtualtransaction, a.virtualxid, a.transactionid, a.fastpath,
         b.state, b.query, b.xact_start, b.query_start, b.usename, b.datname, b.client_addr, b.client_port, b.application_name
  from pg_locks a, pg_stat_activity b
  where a.pid = b.pid and not a.granted
),
t_run as
(
  select a.mode, a.locktype, a.database, a.relation, a.page, a.tuple, a.classid, a.granted,
         a.objid, a.objsubid, a.pid, a.virtualtransaction, a.virtualxid, a.transactionid, a.fastpath,
         b.state, b.query, b.xact_start, b.query_start, b.usename, b.datname, b.client_addr, b.client_port, b.application_name
  from pg_locks a, pg_stat_activity b
  where a.pid = b.pid and a.granted
),
t_overlap as
(
  select r.*
  from t_wait w join t_run r on
  (
    r.locktype is not distinct from w.locktype and
    r.database is not distinct from w.database and
    r.relation is not distinct from w.relation and
    r.page is not distinct from w.page and
    r.tuple is not distinct from w.tuple and
    r.virtualxid is not distinct from w.virtualxid and
    r.transactionid is not distinct from w.transactionid and
    r.classid is not distinct from w.classid and
    r.objid is not distinct from w.objid and
    r.objsubid is not distinct from w.objsubid and
    r.pid <> w.pid
  )
),
t_unionall as
(
  select r.* from t_overlap r
  union all
  select w.* from t_wait w
)
select locktype, datname, relation::regclass, page, tuple, virtualxid, transactionid::text, classid::regclass, objid, objsubid,
  string_agg(
    'Pid: ' || coalesce(pid::text, 'NULL') || chr(10) ||
    'Lock_Granted: ' || coalesce(granted::text, 'NULL') ||
    ' , Mode: ' || coalesce(mode::text, 'NULL') ||
    ' , FastPath: ' || coalesce(fastpath::text, 'NULL') ||
    ' , VirtualTransaction: ' || coalesce(virtualtransaction::text, 'NULL') ||
    ' , Session_State: ' || coalesce(state::text, 'NULL') || chr(10) ||
    'Username: ' || coalesce(usename::text, 'NULL') ||
    ' , Database: ' || coalesce(datname::text, 'NULL') ||
    ' , Client_Addr: ' || coalesce(client_addr::text, 'NULL') ||
    ' , Client_Port: ' || coalesce(client_port::text, 'NULL') ||
    ' , Application_Name: ' || coalesce(application_name::text, 'NULL') || chr(10) ||
    'Xact_Start: ' || coalesce(xact_start::text, 'NULL') ||
    ' , Query_Start: ' || coalesce(query_start::text, 'NULL') ||
    ' , Xact_Elapse: ' || coalesce((now() - xact_start)::text, 'NULL') ||
    ' , Query_Elapse: ' || coalesce((now() - query_start)::text, 'NULL') || chr(10) ||
    'SQL (Current SQL in Transaction): ' || chr(10) ||
    coalesce(query::text, 'NULL'),
    chr(10) || '--------' || chr(10)
    order by
      ( case mode
          when 'INVALID' then 0
          when 'AccessShareLock' then 1
          when 'RowShareLock' then 2
          when 'RowExclusiveLock' then 3
          when 'ShareUpdateExclusiveLock' then 4
          when 'ShareLock' then 5
          when 'ShareRowExclusiveLock' then 6
          when 'ExclusiveLock' then 7
          when 'AccessExclusiveLock' then 8
          else 0
        end ) desc,
      ( case when granted then 0 else 1 end )
  ) as lock_conflict
from t_unionall
group by
  locktype, datname, relation, page, tuple, virtualxid, transactionid::text, classid, objid, objsubid;
List the sessions that hold or are waiting for locks on ordinary tables, together with their current SQL:
SELECT pid, state, usename, query, query_start
FROM pg_stat_activity
WHERE pid IN (
    SELECT l.pid
    FROM pg_locks l
    JOIN pg_class t ON l.relation = t.oid AND t.relkind = 'r'
);