PgSQL - 17新特性 - 块级别增量备份
PgSQL可通过pg_basebackup进行全量备份。在构建复制关系时,创建备机时需要通过pg_basebackup全量拉取一个备份,形成一个mirror。但很多场景下,我们往往不需要进行全量备份/恢复,数据量特别大的时候,这个代价太大了。GPDB中有个工具gprecoverseg支持全量备份和增量备份。所谓全量备份,主要通过pg_basebackup从其他节点全量拷贝一份数据过来;而增量备份主要通过pg_rewind工具,只拷贝新增的数据。而PgSQL中单独的pg_rewind,仅从分叉点之前最近的checkpoint位置开始解析WAL,解析出变动的数据页,然后仅将变动的数据页拷贝过来。所以,仅靠pg_rewind实现不了完美的增量备份。
正在开发中的PgSQL17在pg_basebackup中新增了增量备份的功能。
1、使用方法
1.1 创建用例表及插入数据
=# CREATE TABLE just_for_fun (last_updated timestamptz);
=# INSERT INTO just_for_fun (last_updated) VALUES (now());
=# UPDATE just_for_fun SET last_updated = now();
1.2 执行pg_basebackup
=$ mkdir /var/tmp/backups; pg_basebackup -D /var/tmp/backups
=$ ls -l /var/tmp/backups/
total 360
-rw------- 1 pgdba pgdba 227 Jan 8 17:16 backup_label
-rw------- 1 pgdba pgdba 226076 Jan 8 17:16 backup_manifest
drwx------ 7 pgdba pgdba 4096 Jan 8 17:16 base/
…
-rw------- 1 pgdba pgdba 88 Jan 8 17:16 postgresql.auto.conf
-rw------- 1 pgdba pgdba 29806 Jan 8 17:16 postgresql.conf
相对于老版本的pg_basebackup多了backup_mainfest文件。该备份将PGDATA下的内容拷贝到/var/tmp/backups下。如果修改下冲突配置项,比如端口配置port,则可以通过pg_ctl -D /var/tmp/backups start直接启动。
当然,也可以备份成.tar文件:
=$ rm -rf /var/tmp/backups/; mkdir /var/tmp/backups; pg_basebackup -Ft -D /var/tmp/backups
=$ ls -l /var/tmp/backups/
total 56176
-rw------- 1 pgdba pgdba 226218 Jan 8 17:19 backup_manifest
-rw------- 1 pgdba pgdba 40509440 Jan 8 17:19 base.tar
-rw------- 1 pgdba pgdba 16778752 Jan 8 17:19 pg_wal.tar
1.3 backup_mainfest文件
=$ cat /var/tmp/backups/backup_manifest | head -n 10
{ "PostgreSQL-Backup-Manifest-Version": 1,
"Files": [
{ "Path": "backup_label", "Size": 227, "Last-Modified": "2024-01-08 16:21:14 GMT", "Checksum-Algorithm": "CRC32C", "Checksum": "f6db08ca" },
{ "Path": "tablespace_map", "Size": 0, "Last-Modified": "2024-01-08 16:21:14 GMT", "Checksum-Algorithm": "CRC32C", "Checksum": "00000000" },
{ "Path": "pg_xact/0000", "Size": 8192, "Last-Modified": "2024-01-08 16:21:13 GMT", "Checksum-Algorithm": "CRC32C", "Checksum": "c79e44f3" },
{ "Path": "PG_VERSION", "Size": 3, "Last-Modified": "2024-01-08 13:08:53 GMT", "Checksum-Algorithm": "CRC32C", "Checksum": "64440205" },
{ "Path": "pg_multixact/offsets/0000", "Size": 8192, "Last-Modified": "2024-01-08 13:09:02 GMT", "Checksum-Algorithm": "CRC32C", "Checksum": "23464490" },
{ "Path": "pg_multixact/members/0000", "Size": 8192, "Last-Modified": "2024-01-08 13:08:53 GMT", "Checksum-Algorithm": "CRC32C", "Checksum": "23464490" },
{ "Path": "conf.d/depesz.conf", "Size": 512, "Last-Modified": "2024-01-08 13:08:54 GMT", "Checksum-Algorithm": "CRC32C", "Checksum": "c6f171e0" },
{ "Path": "pg_ident.conf", "Size": 2640, "Last-Modified": "2024-01-08 13:08:53 GMT", "Checksum-Algorithm": "CRC32C", "Checksum": "0ce04d87" },
…
{ "Path": "base/5/2652", "Size": 16384, "Last-Modified": "2024-01-08 13:08:53 GMT", "Checksum-Algorithm": "CRC32C", "Checksum": "259eec8e" },
{ "Path": "pg_logical/replorigin_checkpoint", "Size": 8, "Last-Modified": "2024-01-08 16:21:13 GMT", "Checksum-Algorithm": "CRC32C", "Checksum": "c74b6748" },
{ "Path": "current_logfiles", "Size": 44, "Last-Modified": "2024-01-08 13:08:54 GMT", "Checksum-Algorithm": "CRC32C", "Checksum": "97357c1c" },
{ "Path": "log/postgresql-2024-01-08_140854.log", "Size": 1021834, "Last-Modified": "2024-01-08 16:21:14 GMT", "Checksum-Algorithm": "CRC32C", "Checksum": "d2498fb2" },
{ "Path": "global/pg_control", "Size": 8192, "Last-Modified": "2024-01-08 16:21:14 GMT", "Checksum-Algorithm": "CRC32C", "Checksum": "43872087" }
],
"WAL-Ranges": [
{ "Timeline": 1, "Start-LSN": "4/86000028", "End-LSN": "4/86000750" }
],
"Manifest-Checksum": "106517baea81404769cd4deb686ff58b58997308f0d90d9afbfa9d0111a5003d"}
这个文件可以用于校验备份是否完成,也可以用于看下自从上次备份以来改变了哪些东西。
1.4 做一个全量备份
=$ rm -rf /var/tmp/backups/; mkdir /var/tmp/backups/
=$ pg_basebackup -Ft -D "/var/tmp/backups/$( date +%Y-%m-%d_%H%M%S-FULL )"
=$ ls -l /var/tmp/backups/
total 4
drwx------ 2 pgdba pgdba 4096 Jan 8 17:39 2024-01-08_173902-FULL/
=$ ls -l /var/tmp/backups/2024-01-08_173902-FULL/
total 56356
-rw------- 1 pgdba pgdba 226219 Jan 8 17:39 backup_manifest
-rw------- 1 pgdba pgdba 40691712 Jan 8 17:39 base.tar
-rw------- 1 pgdba pgdba 16778752 Jan 8 17:39 pg_wal.tar
做增量备份,指定-i:
=$ pg_basebackup -i /var/tmp/backups/2024-01-08_173902-FULL/backup_manifest -Ft -D "/var/tmp/backups/$( date +%Y-%m-%d_%H%M%S-INCREMENTAL )"
pg_basebackup: error: could NOT initiate base backup: ERROR: incremental backups cannot be taken unless WAL summarization IS enabled
pg_basebackup: removing DATA directory "/var/tmp/backups/2024-01-08_173956-INCREMENTAL"
需要开启wal summarization:
$ ALTER system SET summarize_wal = ON;
$ SELECT pg_reload_conf();
1.5 wal_summarization
默认为off,开启后会启动一个wal summarizer进程,自动生成wal summarize信息;当然还需要wal_level>minimal才能开启。记录到一段WAL的内容中:文件大小的变化、哪些block发生变化、需要被更新或删除、lsn范围。每个summary文件包含的信息:
1)某一个TLI上的一个LSN范围
2)每个relation,包括:
a "limit block" which is 0(文件被创建或销毁) if a relation is created or destroyed withina certain range of WAL records
or otherwise the shortest length(文件缩小至某个值) to which the relation was truncated during that range of WAL records
or otherwise InvalidBlockNumber(无效块号).
In addition, it stores a list of blocks which have been modified during that range of WAL records, (被修改过的blocks id). but excluding blocks which were removed by truncation after they were modified and never subsequently modified again. (不记录被truncate并且后面没有被修改过的blocks id)
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=174c480508ac25568561443e6d4a82d5c1103487
Wal summarizer就是哪个LSN范围内的变动?2.1节进行讲述。
1.6 增量备份
=$ pg_basebackup -i /var/tmp/backups/2024-01-08_173902-FULL/backup_manifest -Ft -D "/var/tmp/backups/$( date +%Y-%m-%d_%H%M%S-INCREMENTAL )"
=$ ls -l /var/tmp/backups/
total 8
drwx------ 2 pgdba pgdba 4096 Jan 8 17:39 2024-01-08_173902-FULL/
drwx------ 2 pgdba pgdba 4096 Jan 8 17:40 2024-01-08_174043-INCREMENTAL/
=$ ls -l /var/tmp/backups/2024-01-08_174043-INCREMENTAL/
total 23860
-rw------- 1 pgdba pgdba 236528 Jan 8 17:40 backup_manifest
-rw------- 1 pgdba pgdba 7413248 Jan 8 17:40 base.tar
-rw------- 1 pgdba pgdba 16778752 Jan 8 17:40 pg_wal.tar
增量备份的base.tar只有7MB,而全量备份有40MB。
增量备份和全量备份中的backup_manifest中文件个数一样,增量备份有2中类型文件:
=$ jq .Files[13] /var/tmp/backups/2024-01-08_174043-INCREMENTAL/backup_manifest
{
"Path": "global/1214_fsm",
"Size": 24576,
"Last-Modified": "2024-01-08 16:10:41 GMT",
"Checksum-Algorithm": "CRC32C",
"Checksum": "722d586a"
}
以及
=$ jq .Files[12] /var/tmp/backups/2024-01-08_174043-INCREMENTAL/backup_manifest
{
"Path": "global/INCREMENTAL.2695",
"Size": 12,
"Last-Modified": "2024-01-08 13:08:53 GMT",
"Checksum-Algorithm": "CRC32C",
"Checksum": "e34c7d7c"
}
不以INCREMENTAL开头的文件是普通文件,不是增量的。否则需要拉取一些更早的备份。
1.7 增量备份的合并
=$ mkdir /var/tmp/backups/FULL
=$ tar -x -C /var/tmp/backups/FULL -f /var/tmp/backups/2024-01-08_173902-FULL/base.tar
=$ tar -x -C /var/tmp/backups/FULL/pg_wal/ -f /var/tmp/backups/2024-01-08_173902-FULL/pg_wal.tar
=$ cp /var/tmp/backups/2024-01-08_173902-FULL/backup_manifest /var/tmp/backups/FULL/
=$ mkdir /var/tmp/backups/INCR
=$ tar -x -C /var/tmp/backups/INCR -f /var/tmp/backups/2024-01-08_174043-INCREMENTAL/base.tar
=$ tar -x -C /var/tmp/backups/INCR/pg_wal -f /var/tmp/backups/2024-01-08_174043-INCREMENTAL/pg_wal.tar
=$ cp /var/tmp/backups/2024-01-08_174043-INCREMENTAL/backup_manifest /var/tmp/backups/INCR/
pg_combinebackup将一个全量备份+一个或多个增量备份合并为一个全新的全量备份:
=$ pg_combinebackup -o /var/tmp/backups/combined /var/tmp/backups/FULL /var/tmp/backups/INCR
2、内核原理
2.1 manifest中的WAL-ranges
1)WAL-ranges中的Timeline为备份前checkpoint时的时间线
2)WAL-ranges中的Start-LSN为备份前checkpoint的位置
3)WAL-ranges中的End-LSN备份后XLOG_BACKUP_END后的位置
详情可查询下面函数调用逻辑:
perform_base_backup
do_pg_backup_start(...);
|-- RequestCheckpoint(CHECKPOINT_FORCE | CHECKPOINT_WAIT | (fast ? CHECKPOINT_IMMEDIATE : 0));
| state->startpoint = ControlFile->checkPointCopy.redo;
|-- state->starttli = ControlFile->checkPointCopy.ThisTimeLineID;
state.startptr = backup_state->startpoint;
state.starttli = backup_state->starttli;
...备份
do_pg_backup_stop(backup_state, !opt->nowait);
|-- state->stoppoint = XLogInsert(RM_XLOG_ID, XLOG_BACKUP_END);
| state->stoptli = XLogCtl->InsertTimeLineID;
| RequestXLogSwitch(false);
|-- ...
endptr = backup_state->stoppoint;
endtli = backup_state->stoptli;
//ckp时的redo点 -- 备份结束后XLOG_BACKUP_END位置
AddWALInfoToBackupManifest(&manifest, state.startptr, state.starttli, endptr, endtli);
|-- manifest 中End-LSN为endptr位置即backup end位置,Start-LSN为state.startptr位置即开始备份前ckp位置
2.2 wal Summarize中每条记录的是哪个WAL范围的数据变化?
其实,记录的是每个checkpoint周期的WAL中记录的变动的block等信息:
SummarizeWAL
...
while (1){
//Now read the next record.
record = XLogReadRecord(xlogreader, &errormsg);
switch (XLogRecGetRmid(xlogreader)){
case RM_SMGR_ID:
SummarizeSmgrRecord(xlogreader, brtab);
break;
case RM_XACT_ID:
SummarizeXactRecord(xlogreader, brtab);
break;
case RM_XLOG_ID:
stop_requested = SummarizeXlogRecord(xlogreader);
|-- info = XLogRecGetInfo(xlogreader) & ~XLR_INFO_MASK;
| if (info == XLOG_CHECKPOINT_REDO || info == XLOG_CHECKPOINT_SHUTDOWN){
| return true;
| }
|-- return false;
break;
default:
break;
}
if (stop_requested && xlogreader->ReadRecPtr > summary_start_lsn){
//遇到Checkpoint即停止解析
summary_end_lsn = xlogreader->ReadRecPtr;//checkpoint
break;
}
解析得到更改的block
if (summary_end_lsn > summary_start_lsn){
//summary文件名:
//tli(最老的未summarized的时间线)startlsn(最老的未summarized的wal)endlsn(ckp位置)
snprintf(temp_path, MAXPGPATH,
XLOGDIR "/summaries/temp.summary");
snprintf(final_path, MAXPGPATH,
XLOGDIR "/summaries/%08X%08X%08X%08X%08X.summary",
tli,
LSN_FORMAT_ARGS(summary_start_lsn),
LSN_FORMAT_ARGS(summary_end_lsn));
...将summary内容写入该文件
}
}
2.3 增量备份
1)pg_basebackup作为客户端通过GetConnection连接服务端,服务端会fork出wal sender进程与之交互
2)pg_basebackup通过RetrieveWalSegSize向wal sender进程发送“SHOW wal_segment_size”,wal sender通过exec_replication_command处理发过来的命令。GetPGVariable获取到wal_segment_size大小,并发送回去:直到ReadyForQuery的pq_flush才将内容发送过去
3)接着pg_basebackup就进入了BaseBackup函数中
4)RunIdentifySystem向wal sender发送IDENTIFY_SYSTEM,wal sender通过IdentifySystem函数获取到系统标记systemid、时间线timeline等发送回去
5)然后进入增量备份相关步骤:PQsendQuery向wal sender发送UPLOAD_MANIFEST命令,wal sender通过UploadManifest进行处理,先发送PGRES_COPY_IN,pg_basebackup接收到后,读取指定的backup_manifest并将它发送给wal sender;wal sender通过HandleUploadManifestPacket放到IncrementalBackupInfo::buf中,直到pg_basebackup发来EOF ‘c’表示发送完。
6)wal sender解析出WAL Ranges内容,也就是得到备份前checkpoint位置
7)pg_basebackup发送BASE_BACKUP命令发起增量备份
8)wal sender通过SendBaseBackup从backup_mainifest解析的checkpoint位置开始(因为checkpoint前的内容都是已备份过的)找到需要的wal summary文件,根据其文件名(tli+ start lsn+ckp )找到需要增量的summary文件(记录的是本次增量备份需要的变动block列表等信息),根据summary文件中的内容,将本次增量备份内容发送回去
3、总结
1)wal Summarize进程通过解析每个checkpoint周期内的WAL日志,将变更信息记录到summary文件中
2)每次备份(全量备份或增量备份)都会生成一个manifest文件,文件中WAL-ranges部分会记录下备份前执行的checkpoint的WAL位置
3)通过manifest中记录的checkpoint位置就可以判断哪个summary文件是上次备份结束,本次增量备份开始的地方
4)遍历summary文件,得到增量变更,然后将变更页发送到pg_basebackup,由pg_basebackup写到指定位置,完成增量备份。
5)增量备份的完成,需要借助wal summary进程,该进程会读取WAL日志并解析,记录到变更,这个IO等代价需要考虑到业务中
参考
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=dc212340058b4e7ecfc5a7a81ec50e7a207bf288
https://www.depesz.com/2024/01/08/waiting-for-postgresql-17-add-support-for-incremental-backup/