etcd Advanced Topics
Introduction to etcd:
etcd is an open-source project started by the CoreOS team in June 2013. Its goal is to build a highly available distributed key-value database. Internally, etcd uses the Raft protocol as its consensus algorithm, and it is implemented in Go.
Official website: https://etcd.io/
GitHub repository: https://github.com/etcd-io/etcd
Official hardware recommendations: https://etcd.io/docs/v3.5/op-guide/hardware/
Official documentation: https://etcd.io/docs/v3.5/op-guide/maintenance/
etcd has the following properties:
Fully replicated: every node in the cluster holds a complete copy of the data
Highly available: etcd can be used to avoid single points of hardware failure or network problems
Consistent: every read returns the latest write across all hosts
Simple: exposes a well-defined, user-facing API (gRPC)
Secure: implements automatic TLS with optional client-certificate authentication
Fast: benchmarked at 10,000 writes per second
Reliable: uses the Raft algorithm to achieve a reasonable distribution of storage
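As a quick sanity check after installation (assuming the etcd binaries are already in the PATH), the server and client versions can be printed directly:
root@k8s-etcd1:~# etcd --version
root@k8s-etcd1:~# etcdctl version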
The etcd configuration file
root@k8s-etcd1:~# cat /etc/systemd/system/etcd.service
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos
[Service]
Type=notify
WorkingDirectory=/var/lib/etcd/ #data directory
ExecStart=/usr/bin/etcd \ #path to the binary
--name=etcd1 \ #name of the current node
--cert-file=/etc/etcd/ssl/etcd.pem \
--key-file=/etc/etcd/ssl/etcd-key.pem \
--peer-cert-file=/etc/etcd/ssl/etcd.pem \
--peer-key-file=/etc/etcd/ssl/etcd-key.pem \
--trusted-ca-file=/etc/kubernetes/ssl/ca.pem \
--peer-trusted-ca-file=/etc/kubernetes/ssl/ca.pem \
--initial-advertise-peer-urls=https://172.31.7.105:2380 \ #peer URL this node advertises to the cluster
--listen-peer-urls=https://172.31.7.105:2380 \ #port for peer-to-peer traffic within the cluster
--listen-client-urls=https://172.31.7.105:2379,http://127.0.0.1:2379 \ #addresses clients connect to
--advertise-client-urls=https://172.31.7.105:2379 \ #client URL this node advertises
--initial-cluster-token=etcd-cluster-0 \ #token used when bootstrapping the cluster; must be identical on all nodes of a cluster
--initial-cluster=etcd1=https://172.31.7.105:2380,etcd2=https://172.31.7.106:2380,etcd3=https://172.31.7.107:2380 \ #all member nodes of the cluster
--initial-cluster-state=new \ #"new" when bootstrapping a new cluster, "existing" when joining an existing one
--data-dir=/var/lib/etcd #data directory path
Restart=on-failure
RestartSec=5
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
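Note that the inline # annotations above are for explanation only: systemd does not support comments after a backslash line continuation, so they must be removed from a real unit file. One way to sanity-check the unit after editing (standard systemd tooling):
root@k8s-etcd1:~# systemd-analyze verify /etc/systemd/system/etcd.service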
#My configuration file
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos
[Service]
Type=notify
WorkingDirectory=/var/lib/etcd
ExecStart=/usr/local/bin/etcd \
--name=etcd-10.0.0.116 \
--cert-file=/etc/kubernetes/ssl/etcd.pem \
--key-file=/etc/kubernetes/ssl/etcd-key.pem \
--peer-cert-file=/etc/kubernetes/ssl/etcd.pem \
--peer-key-file=/etc/kubernetes/ssl/etcd-key.pem \
--trusted-ca-file=/etc/kubernetes/ssl/ca.pem \
--peer-trusted-ca-file=/etc/kubernetes/ssl/ca.pem \
--initial-advertise-peer-urls=https://10.0.0.116:2380 \
--listen-peer-urls=https://10.0.0.116:2380 \
--listen-client-urls=https://10.0.0.116:2379,http://127.0.0.1:2379 \
--advertise-client-urls=https://10.0.0.116:2379 \
--initial-cluster-token=etcd-cluster-0 \
--initial-cluster=etcd-10.0.0.116=https://10.0.0.116:2380,etcd-10.0.0.117=https://10.0.0.117:2380,etcd-10.0.0.118=https://10.0.0.118:2380 \
--initial-cluster-state=new \
--data-dir=/var/lib/etcd \
--wal-dir= \
--snapshot-count=50000 \
--auto-compaction-retention=1 \
--auto-compaction-mode=periodic \
--max-request-bytes=10485760 \
--quota-backend-bytes=8589934592
Restart=always
RestartSec=15
LimitNOFILE=65536
OOMScoreAdjust=-999
[Install]
WantedBy=multi-user.target
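After any change to the unit file, systemd must reload it before the service is restarted (standard systemd workflow):
root@k8s-etcd1:~# systemctl daemon-reload
root@k8s-etcd1:~# systemctl restart etcd.service
root@k8s-etcd1:~# systemctl status etcd.service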
etcd Advanced - Leader Election Overview
Node roles: each node in the cluster is in exactly one of three states: Leader, Follower, or Candidate.
follower: a follower node (analogous to a Slave node in Redis Cluster)
candidate: a candidate node, i.e. one in the middle of an election
leader: the leader node (analogous to a Master node in Redis Cluster)
After startup, nodes vote for each other based on the term ID. The term ID is an integer that defaults to 0. In the Raft algorithm, a term represents one leader's period in office: whenever a node becomes leader a new term begins, and every node increments its own term ID by 1 to distinguish the new round of elections from the previous one.
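The current term is visible in the RAFT TERM column of etcdctl endpoint status (the same command is shown with full output in the detailed-status section below):
[root@k8s-etcd1 ~]#ETCDCTL_API=3 /usr/local/bin/etcdctl --write-out=table endpoint status --endpoints=https://10.0.0.116:2379 --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/kubernetes/ssl/etcd.pem --key=/etc/kubernetes/ssl/etcd-key.pem #RAFT TERM increases by 1 after every successful election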
etcd Advanced - Leader Election
First election:
1. After startup, each etcd node defaults to the follower role with a default term ID of 0. If a node finds that the cluster has no leader, it switches to the candidate role and starts a leader election.
2. Each candidate sends vote requests (RequestVote) to the other candidates, voting for itself by default.
3. The candidates receive each other's vote requests (A receives B's and C's, B receives A's and C's, C receives A's and B's) and compare the sender's log with their own. If the sender's log is more up to date, the node gives its vote to that candidate and replies with a response containing its own latest log information. If C's log is the most up to date, C receives the votes of A, B, and C and is elected unanimously; if B is down, C still receives the votes of A and C and is elected with more than half the votes.
4. C then sends leader heartbeats to the other nodes to maintain its leadership (heartbeat-interval, 100 ms by default).
5. The other nodes switch to the follower role and synchronize data from the leader.
6. If the election times out (election-timeout), a new election is held; if two leaders were elected, the one with more than half of the cluster's total votes wins.
Later elections:
When a follower does not receive a message from the leader within the expected time, it switches to the candidate state, sends vote requests (carrying its own term ID and log records) to the other nodes, and waits for their responses. If that candidate's log records are the most up to date, it receives the majority of votes and becomes the new leader.
The new leader increments its term ID by 1 and announces it to the other nodes.
If the old leader recovers and finds that a new leader already exists, it joins the existing leader's cluster as a follower and updates its term ID to match the leader's; within a given term, all nodes have the same term ID.
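Re-election is easy to reproduce: stop the etcd service on the current leader, re-check the endpoint status from a surviving node (IS LEADER moves to another node and RAFT TERM increases by 1), then bring the old node back:
[root@k8s-etcd2 ~]#systemctl stop etcd #assuming the leader currently runs on this node
[root@k8s-etcd2 ~]#systemctl start etcd #after re-checking: the old leader rejoins as a follower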
etcd Advanced - Viewing Member Information
Configuration tuning:
--max-request-bytes=10485760 #request size limit (maximum size of a request in bytes; a single key is limited to 1.5 MiB by default, and the official recommendation is not to exceed 10 MiB)
--quota-backend-bytes=8589934592 #storage size limit (limit on the backend database size; 2 GB by default, and values above 8 GB produce a warning at startup)
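How close the backend is to the quota can be checked via the DB SIZE column of endpoint status and via the alarm list; a NOSPACE alarm means the quota is exhausted and writes are rejected until the alarm is disarmed:
[root@k8s-etcd1 ~]#ETCDCTL_API=3 /usr/local/bin/etcdctl alarm list --endpoints=https://10.0.0.116:2379 --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/kubernetes/ssl/etcd.pem --key=/etc/kubernetes/ssl/etcd-key.pem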
Cluster defragmentation:
[root@k8s-etcd1 ~]#ETCDCTL_API=3 /usr/local/bin/etcdctl defrag --cluster --endpoints=https://10.0.0.116:2379 --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/kubernetes/ssl/etcd.pem --key=/etc/kubernetes/ssl/etcd-key.pem
Finished defragmenting etcd member[https://10.0.0.116:2379]
Finished defragmenting etcd member[https://10.0.0.117:2379]
Finished defragmenting etcd member[https://10.0.0.118:2379]
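Defragmentation only reclaims space that compaction has already released. Although auto compaction is enabled above (--auto-compaction-retention=1), a manual compaction to the current revision can be run first; a sketch that extracts the revision from the JSON status output:
[root@k8s-etcd1 ~]#rev=$(ETCDCTL_API=3 /usr/local/bin/etcdctl endpoint status --write-out=json --endpoints=https://10.0.0.116:2379 --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/kubernetes/ssl/etcd.pem --key=/etc/kubernetes/ssl/etcd-key.pem | grep -o '"revision":[0-9]*' | cut -d: -f2)
[root@k8s-etcd1 ~]#ETCDCTL_API=3 /usr/local/bin/etcdctl compaction $rev --endpoints=https://10.0.0.116:2379 --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/kubernetes/ssl/etcd.pem --key=/etc/kubernetes/ssl/etcd-key.pem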
etcd has several API versions. v1 is deprecated; etcd v2 and v3 are essentially two independent applications that share the same Raft protocol code but have different interfaces and different storage, with the data isolated from each other. In other words, after upgrading from etcd v2 to etcd v3, the old v2 data can only be accessed through the v2 API, and data created through the v3 API can only be accessed through the v3 API.
WARNING:
Environment variable ETCDCTL_API is not set; defaults to etcdctl v2. #v2 is used by default
Set environment variable ETCDCTL_API=3 to use v3 API or ETCDCTL_API=2 to use v2 API. #select the API version
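To avoid prefixing every command, the variable can be exported once, e.g. in a shell profile (where to set it is a matter of preference):
root@k8s-etcd1:~# echo 'export ETCDCTL_API=3' >> /etc/profile
root@k8s-etcd1:~# source /etc/profile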
root@k8s-etcd1:~# ETCDCTL_API=3 etcdctl --help
root@k8s-etcd1:~# ETCDCTL_API=3 etcdctl member --help
NAME:
etcdctl member - member add, remove and list subcommands
USAGE:
etcdctl member command [command options] [arguments...]
COMMANDS:
list enumerate existing cluster members
add add a new member to the etcd cluster
remove remove an existing member from the etcd cluster
update update an existing member in the etcd cluster
OPTIONS:
--help, -h show help
root@k8s-etcd1:~# ETCDCTL_API=3 etcdctl member list
[root@k8s-etcd1 ~]#ETCDCTL_API=3 /usr/local/bin/etcdctl --write-out=table member list --endpoints=https://10.0.0.116:2379 --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/kubernetes/ssl/etcd.pem --key=/etc/kubernetes/ssl/etcd-key.pem
+------------------+---------+-----------------+-------------------------+-------------------------+------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS | IS LEARNER |
+------------------+---------+-----------------+-------------------------+-------------------------+------------+
| ad494a8abe2cd50c | started | etcd-10.0.0.116 | https://10.0.0.116:2380 | https://10.0.0.116:2379 | false |
| b5f9408145d08046 | started | etcd-10.0.0.117 | https://10.0.0.117:2380 | https://10.0.0.117:2379 | false |
| eb9b6ed34d1464a0 | started | etcd-10.0.0.118 | https://10.0.0.118:2380 | https://10.0.0.118:2379 | false |
+------------------+---------+-----------------+-------------------------+-------------------------+------------+
etcd Advanced - Verifying Node Health
[root@k8s-etcd1 ~]#export NODE_IPS="10.0.0.116 10.0.0.117 10.0.0.118"
[root@k8s-etcd1 ~]#for ip in ${NODE_IPS}; do ETCDCTL_API=3 /usr/local/bin/etcdctl --endpoints=https://${ip}:2379 --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/kubernetes/ssl/etcd.pem --key=/etc/kubernetes/ssl/etcd-key.pem endpoint health; done
https://10.0.0.116:2379 is healthy: successfully committed proposal: took = 18.725626ms
https://10.0.0.117:2379 is healthy: successfully committed proposal: took = 18.984525ms
https://10.0.0.118:2379 is healthy: successfully committed proposal: took = 26.566695ms
etcd Advanced - Detailed Status
[root@k8s-etcd1 ~]#export NODE_IPS="10.0.0.116 10.0.0.117 10.0.0.118"
[root@k8s-etcd1 ~]#for ip in ${NODE_IPS}; do ETCDCTL_API=3 /usr/local/bin/etcdctl --write-out=table endpoint status --endpoints=https://${ip}:2379 --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/kubernetes/ssl/etcd.pem --key=/etc/kubernetes/ssl/etcd-key.pem; done
+-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://10.0.0.116:2379 | ad494a8abe2cd50c | 3.5.5 | 623 kB | false | false | 59 | 194500 | 194500 | |
+-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
+-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://10.0.0.117:2379 | b5f9408145d08046 | 3.5.5 | 623 kB | true | false | 59 | 194500 | 194500 | |
+-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
+-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://10.0.0.118:2379 | eb9b6ed34d1464a0 | 3.5.5 | 623 kB | false | false | 59 | 194500 | 194500 | |
+-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
etcd Advanced - Viewing etcd Data
root@k8s-etcd1:~# ETCDCTL_API=3 etcdctl get / --prefix --keys-only #list all keys by path
View pod information:
root@k8s-etcd1:~# ETCDCTL_API=3 etcdctl get / --prefix --keys-only | grep pod
View namespace information:
root@k8s-etcd1:~# ETCDCTL_API=3 etcdctl get / --prefix --keys-only | grep namespaces
View deployment controller information:
root@k8s-etcd1:~# ETCDCTL_API=3 etcdctl get / --prefix --keys-only | grep deployment
View calico component information:
root@k8s-etcd1:~# ETCDCTL_API=3 etcdctl get / --prefix --keys-only | grep calico
#decode with auger
[root@k8s-etcd1 ~]#etcdctl get /registry/pods/kube-system/calico-node-phl9w | auger decode
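auger is a separate open-source tool: Kubernetes stores its objects under /registry in protobuf form, so a plain get prints mostly binary data, while auger decodes it into readable YAML. For example, decoding a namespace object (the pod name above is environment-specific):
[root@k8s-etcd1 ~]#etcdctl get /registry/namespaces/default | auger decode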
etcd Advanced - etcd CRUD Operations
#create data
root@etcd1:~# ETCDCTL_API=3 /usr/local/bin/etcdctl put /name "tom"
OK
#query data
root@etcd1:~# ETCDCTL_API=3 /usr/local/bin/etcdctl get /name
/name
tom
#update data: writing the same key again simply overwrites the value
root@etcd1:~# ETCDCTL_API=3 /usr/local/bin/etcdctl put /name "jack"
OK
#verify the update
root@etcd1:~# ETCDCTL_API=3 /usr/local/bin/etcdctl get /name
/name
jack
#delete data
root@etcd1:~# ETCDCTL_API=3 /usr/local/bin/etcdctl del /name
1
root@etcd1:~# ETCDCTL_API=3 /usr/local/bin/etcdctl get /name
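get and del also operate on key ranges via --prefix, which is how several related keys can be read or removed at once (a sketch with the expected output; del prints the number of deleted keys):
root@etcd1:~# ETCDCTL_API=3 /usr/local/bin/etcdctl put /app/name "tom"
OK
root@etcd1:~# ETCDCTL_API=3 /usr/local/bin/etcdctl put /app/age "18"
OK
root@etcd1:~# ETCDCTL_API=3 /usr/local/bin/etcdctl get /app --prefix
/app/age
18
/app/name
tom
root@etcd1:~# ETCDCTL_API=3 /usr/local/bin/etcdctl del /app --prefix
2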
etcd Advanced - The etcd Data Watch Mechanism
The watch mechanism continuously monitors data and proactively notifies clients when it changes. etcd v3 supports watching a single fixed key as well as watching a range.
#watch a key on etcd node1; the key does not need to exist yet and can be created later:
root@k8s-etcd1:~# ETCDCTL_API=3 /usr/local/bin/etcdctl watch /data
#modify the data on etcd node2 and verify that etcd node1 sees the change
root@k8s-etcd2:~# ETCDCTL_API=3 /usr/local/bin/etcdctl put /data "data v1"
OK
root@k8s-etcd2:~# ETCDCTL_API=3 /usr/local/bin/etcdctl put /data "data v1"
OK
[root@k8s-etcd2 ~]#ETCDCTL_API=3 /usr/local/bin/etcdctl watch /data
[root@k8s-etcd1 ~]#ETCDCTL_API=3 /usr/local/bin/etcdctl put /data "data v1"
OK
[root@k8s-etcd1 ~]#ETCDCTL_API=3 /usr/local/bin/etcdctl put /data "data v2"
OK
[root@k8s-etcd1 ~]#ETCDCTL_API=3 /usr/local/bin/etcdctl del /data
1
[root@k8s-etcd2 ~]#ETCDCTL_API=3 /usr/local/bin/etcdctl watch /data
PUT
/data
data v1
PUT
/data
data v2
DELETE
/data
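Watching a range instead of a single key works the same way with --prefix; every change under the prefix then appears in the watch output:
[root@k8s-etcd2 ~]#ETCDCTL_API=3 /usr/local/bin/etcdctl watch /data --prefix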
etcd Advanced - etcd v3 API Data Backup and Restore
WAL is short for write-ahead log: as the name suggests, a log entry is written before the actual write operation is performed.
wal: stores the write-ahead logs, whose most important role is recording the entire history of data changes. In etcd, every data modification must be written to the WAL before it is committed.
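With the default layout of the data directory configured above, the WAL and snapshot files can be found under member/:
root@k8s-etcd1:~# ls /var/lib/etcd/member/
snap wal
root@k8s-etcd1:~# ls /var/lib/etcd/member/wal/ #one or more .wal segment files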
v3 backup:
root@k8s-etcd1:~# ETCDCTL_API=3 etcdctl snapshot save snapshot.db
v3 restore:
root@k8s-etcd1:~# ETCDCTL_API=3 etcdctl snapshot restore snapshot.db --data-dir=/opt/etcd-testdir #restore the data into a new, not-yet-existing directory
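A snapshot's integrity can be checked before restoring it (snapshot status still works in etcdctl 3.5, though the functionality is being moved to the separate etcdutl tool):
root@k8s-etcd1:~# ETCDCTL_API=3 etcdctl snapshot status snapshot.db --write-out=table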
#automated data backups
root@k8s-etcd1:~# mkdir /data/etcd-backup-dir/ -p
root@k8s-etcd1:~# cat etcd-backup.sh
#!/bin/bash
source /etc/profile
DATE=`date +%Y-%m-%d_%H-%M-%S` #timestamp embedded in the snapshot file name
ETCDCTL_API=3 /usr/local/bin/etcdctl snapshot save /data/etcd-backup-dir/etcd-snapshot-${DATE}.db
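The script can then be scheduled, for example hourly via cron (the schedule and the script path here are assumptions; pick whatever matches the desired recovery point):
root@k8s-etcd1:~# crontab -l
0 * * * * /bin/bash /root/etcd-backup.sh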
etcd Advanced - etcd Cluster v3 Data Backup and Restore
Adapt main.yml to the 3.5.3 setup:
[root@k8s-deploy tasks]#pwd
/etc/kubeasz/roles/cluster-restore/tasks
#the parameters in main.yml must correspond to those in etcd.service
[root@k8s-deploy tasks]#vim main.yml
- name: etcd data restore
  shell: "cd /etcd_backup && \
          ETCDCTL_API=3 {{ bin_dir }}/etcdctl snapshot restore snapshot.db \
          --name etcd-{{ inventory_hostname }} \
          --initial-cluster {{ ETCD_NODES }} \
          --initial-cluster-token etcd-cluster-0 \
          --initial-advertise-peer-urls https://{{ inventory_hostname }}:2380"
[root@k8s-etcd1 ~]#cat /etc/systemd/system/etcd.service
[Unit]
............................................
............................................
............................................
(omitted)
[Service]
............................................
............................................
--name=etcd-10.0.0.116 \
--cert-file=/etc/kubernetes/ssl/etcd.pem \
--key-file=/etc/kubernetes/ssl/etcd-key.pem \
--peer-cert-file=/etc/kubernetes/ssl/etcd.pem \
--peer-key-file=/etc/kubernetes/ssl/etcd-key.pem \
--trusted-ca-file=/etc/kubernetes/ssl/ca.pem \
--peer-trusted-ca-file=/etc/kubernetes/ssl/ca.pem \
--initial-advertise-peer-urls=https://10.0.0.116:2380 \
--listen-peer-urls=https://10.0.0.116:2380 \
--listen-client-urls=https://10.0.0.116:2379,http://127.0.0.1:2379 \
--advertise-client-urls=https://10.0.0.116:2379 \
--initial-cluster-token=etcd-cluster-0 \
--initial-cluster=etcd-10.0.0.116=https://10.0.0.116:2380,etcd-10.0.0.117=https://10.0.0.117:2380,etcd-10.0.0.118=https://10.0.0.118:2380 \
--initial-cluster-state=new \
............................................
............................................
[root@k8s-master1 ~]#kubectl apply -f nginx.yaml
[root@k8s-master1 ~]#kubectl get pod -n myserver -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
myserver-nginx-deployment-6dc97c87d7-zgtkp 1/1 Running 0 36m 10.200.36.71 10.0.0.111 <none> <none>
root@k8s-deploy:/etc/kubeasz# ./ezctl backup k8s-cluster1
root@k8s-deploy:/etc/kubeasz# kubectl get deployment -n myserver
root@k8s-deploy:/etc/kubeasz# kubectl delete deployment -n myserver myserver-nginx-deployment
root@k8s-deploy:/etc/kubeasz# ./ezctl backup k8s-cluster1
The API server is unavailable while data is being restored, so the restore must be performed during off-peak hours or in a genuine emergency:
root@k8s-deploy:/etc/kubeasz# grep db_to_restore ./roles/ -R #which backup file will be restored
./roles/cluster-restore/defaults/main.yml:db_to_restore: "snapshot.db"
./roles/cluster-restore/tasks/main.yml: src: "{{ cluster_dir }}/backup/{{ db_to_restore }}"
#copy the first full backup over snapshot.db
[root@k8s-deploy backup]#cp snapshot_202411081550.db snapshot.db
[root@k8s-deploy backup]#ll
total 7808
drwxr-xr-x 2 root root 4096 Nov 8 16:00 ./
drwxr-xr-x 5 root root 4096 Oct 19 06:23 ../
-rw------- 1 root root 2658336 Nov 8 15:50 snapshot_202411081550.db
-rw------- 1 root root 2658336 Nov 8 15:53 snapshot_202411081553.db
-rw------- 1 root root 2658336 Nov 8 16:02 snapshot.db
[root@k8s-deploy kubeasz]#./ezctl restore k8s-cluster1
Verify the cluster state after the restore
[root@k8s-master1 ~]#kubectl get deployment -n myserver
NAME READY UP-TO-DATE AVAILABLE AGE
myserver-nginx-deployment 1/1 1 1 37m
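And the pod that backs the deployment is running again, which can be confirmed the same way it was checked before the deletion:
[root@k8s-master1 ~]#kubectl get pod -n myserver -o wide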
etcd Advanced - etcd Data Recovery Procedure
When more than half of the etcd cluster's nodes are down (for example two out of three), the whole cluster is down and the data has to be recovered from backup. The recovery procedure is as follows:
1. Recover the server operating systems
2. Redeploy the etcd cluster
3. Stop kube-apiserver/controller-manager/scheduler/kubelet/kube-proxy
4. Stop the etcd cluster
5. Restore the same backup onto every etcd node (see the sketch after this list)
6. Start all nodes and verify the etcd cluster
7. Start kube-apiserver/controller-manager/scheduler/kubelet/kube-proxy
8. Verify the Kubernetes master status and pod data
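Step 5, sketched for the first node of the cluster used above (each node substitutes its own --name and --initial-advertise-peer-urls; all values must match etcd.service, and the old data directory must be moved away first because restore refuses to write into an existing one):
[root@k8s-etcd1 ~]#ETCDCTL_API=3 /usr/local/bin/etcdctl snapshot restore snapshot.db \
--name etcd-10.0.0.116 \
--initial-cluster etcd-10.0.0.116=https://10.0.0.116:2380,etcd-10.0.0.117=https://10.0.0.117:2380,etcd-10.0.0.118=https://10.0.0.118:2380 \
--initial-cluster-token etcd-cluster-0 \
--initial-advertise-peer-urls https://10.0.0.116:2380 \
--data-dir=/var/lib/etcd
[root@k8s-etcd1 ~]#systemctl start etcd #repeat on every node, then verify with endpoint health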