问题1:数据库进入只读模式
最近在项目中使用clickhouse的时候,遇到了一个批量插入后报错的问题。报错的内容是数据库进入了只读模式,导致数据写不进去。发现有大量的批量写入报错日志信息。(关键异常信息:DB::Exception: Table is in readonly mode)
部署环境:使用的clickhouse的版本是20.8,比较早的版本了。微服务协调组件使用的是zookeeper。三个节点的集群。
主要原因:
-
数据量很大时,ClickHouse中的表可能会进入只读模式,这主要是因为元数据(metadata)的增长超过了协调服务(如ZooKeeper)能够有效处理的范围。
-
ZooKeeper需要在多个节点之间保持元数据的一致性。随着元数据的增加,同步这些数据的时间也会增加,从而可能导致同步延迟。
-
ZooKeeper非常强大且可靠,但它也有其自身的性能限制。当元数据的更新过于频繁时,ZooKeeper可能无法及时处理所有的更新请求。
总结:数据量过大导致Zookeeper负载过高,继而引发ClickHouse进入只读模式。
数据库报错信息:
2024.08.12 09:40:15.396881 [ 13043 ] {2f16eacd-1c5f-4560-bc5f-c121280643d7} <Error> DynamicQueryHandler: Code: 242, e.displayText() = DB::Exception: Table is in readonly mode (zookeeper path: /clickhouse/aiot_tb/1/ts_kv_cluster), Stack trace (when copying this message, always include the lines below):
0. Poco::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int) @ 0x13cd24bc in /usr/bin/clickhouse
1. DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int) @ 0xa4346c9 in /usr/bin/clickhouse
2. ? @ 0x9efb0cf in /usr/bin/clickhouse
3. DB::StorageReplicatedMergeTree::write(std::__1::shared_ptr<DB::IAST> const&, std::__1::shared_ptr<DB::StorageInMemoryMetadata const> const&, DB::Context const&) @ 0x10fa0404 in /usr/bin/clickhouse
4. DB::PushingToViewsBlockOutputStream::PushingToViewsBlockOutputStream(std::__1::shared_ptr<DB::IStorage> const&, std::__1::shared_ptr<DB::StorageInMemoryMetadata const> const&, DB::Context const&, std::__1::shared_ptr<DB::IAST> const&, bool) @ 0x114eea16 in /usr/bin/clickhouse
5. DB::InterpreterInsertQuery::execute() @ 0x10ceed1e in /usr/bin/clickhouse
6. ? @ 0x10e0e178 in /usr/bin/clickhouse
7. DB::executeQuery(DB::ReadBuffer&, DB::WriteBuffer&, bool, DB::Context&, std::__1::function<void (std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&)>) @ 0x10e0fe08 in /usr/bin/clickhouse
8. DB::HTTPHandler::processQuery(DB::Context&, Poco::Net::HTTPServerRequest&, HTMLForm&, Poco::Net::HTTPServerResponse&, DB::HTTPHandler::Output&) @ 0x117e3a2e in /usr/bin/clickhouse
9. DB::HTTPHandler::handleRequest(Poco::Net::HTTPServerRequest&, Poco::Net::HTTPServerResponse&) @ 0x117e7260 in /usr/bin/clickhouse
10. Poco::Net::HTTPServerConnection::run() @ 0x13c1466e in /usr/bin/clickhouse
11. Poco::Net::TCPServerConnection::start() @ 0x13c11197 in /usr/bin/clickhouse
12. Poco::Net::TCPServerDispatcher::run() @ 0x13c1158b in /usr/bin/clickhouse
13. Poco::PooledThread::run() @ 0x13d481ae in /usr/bin/clickhouse
14. Poco::ThreadImpl::runnableEntry(void*) @ 0x13d44ea9 in /usr/bin/clickhouse
15. start_thread @ 0x7ea5 in /usr/lib64/libpthread-2.17.so
16. __clone @ 0xfdf3d in /usr/lib64/libc-2.17.so
(version 20.8.3.18)
2024.08.12 09:40:15.397685 [ 12358 ] {17419ab6-1261-439d-b3df-6e80cfcab4b3} <Error> executeQuery: Code: 242, e.displayText() = DB::Exception: Table is in readonly mode (zookeeper path: /clickhouse/aiot_tb/1/ts_kv_cluster) (version 20.8.3.18) (from 192.168.249.17:55888) (in query: insert into ts_kv_cluster (entity_id,key,ts,bool_v,str_v,long_v,dbl_v,json_v) FORMAT RowBinary ), Stack trace (when copying this message, always include the lines below):
0. Poco::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int) @ 0x13cd24bc in /usr/bin/clickhouse
1. DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int) @ 0xa4346c9 in /usr/bin/clickhouse
2. ? @ 0x9efb0cf in /usr/bin/clickhouse
3. DB::StorageReplicatedMergeTree::write(std::__1::shared_ptr<DB::IAST> const&, std::__1::shared_ptr<DB::StorageInMemoryMetadata const> const&, DB::Context const&) @ 0x10fa0404 in /usr/bin/clickhouse
4. DB::PushingToViewsBlockOutputStream::PushingToViewsBlockOutputStream(std::__1::shared_ptr<DB::IStorage> const&, std::__1::shared_ptr<DB::StorageInMemoryMetadata const> const&, DB::Context const&, std::__1::shared_ptr<DB::IAST> const&, bool) @ 0x114eea16 in /usr/bin/clickhouse
5. DB::InterpreterInsertQuery::execute() @ 0x10ceed1e in /usr/bin/clickhouse
6. ? @ 0x10e0e178 in /usr/bin/clickhouse
7. DB::executeQuery(DB::ReadBuffer&, DB::WriteBuffer&, bool, DB::Context&, std::__1::function<void (std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&)>) @ 0x10e0fe08 in /usr/bin/clickhouse
8. DB::HTTPHandler::processQuery(DB::Context&, Poco::Net::HTTPServerRequest&, HTMLForm&, Poco::Net::HTTPServerResponse&, DB::HTTPHandler::Output&) @ 0x117e3a2e in /usr/bin/clickhouse
9. DB::HTTPHandler::handleRequest(Poco::Net::HTTPServerRequest&, Poco::Net::HTTPServerResponse&) @ 0x117e7260 in /usr/bin/clickhouse
10. Poco::Net::HTTPServerConnection::run() @ 0x13c1466e in /usr/bin/clickhouse
11. Poco::Net::TCPServerConnection::start() @ 0x13c11197 in /usr/bin/clickhouse
12. Poco::Net::TCPServerDispatcher::run() @ 0x13c1158b in /usr/bin/clickhouse
13. Poco::PooledThread::run() @ 0x13d481ae in /usr/bin/clickhouse
14. Poco::ThreadImpl::runnableEntry(void*) @ 0x13d44ea9 in /usr/bin/clickhouse
15. start_thread @ 0x7ea5 in /usr/lib64/libpthread-2.17.so
16. __clone @ 0xfdf3d in /usr/lib64/libc-2.17.so
zookeeper报错信息
关键异常总结:
Unexpected exception in LearnerHandler: java.net.SocketException: Socket closed
Ignoring unexpected exception java.lang.InterruptedException
Unexpected exception in LearnerHandler: java.net.SocketException: Connection reset
Exception while sending packets in LearnerHandler java.net.SocketException: Broken pipe (Write failed)
will be dropped if server is in r-o mode
2024-08-12 12:15:01,957 [myid:1] - WARN [SyncThread:1:FileTxnLog@393] - fsync-ing the write ahead log in SyncThread:1 took 14266ms which will adversely effect operation latency.File size is 67108880 bytes. See the ZooKeeper troubleshooting guide
2024-08-12 12:15:09,968 [myid:1] - WARN [SyncThread:1:FileTxnLog@393] - fsync-ing the write ahead log in SyncThread:1 took 8010ms which will adversely effect operation latency.File size is 67108880 bytes. See the ZooKeeper troubleshooting guide
2024-08-12 12:15:09,974 [myid:1] - INFO [CommitProcessor:1:LeaderSessionTracker@104] - Committing global session 0x20404e1f22a0002
2024-08-12 12:15:09,974 [myid:1] - INFO [CommitProcessor:1:LeaderSessionTracker@104] - Committing global session 0x20404e1f22a0003
2024-08-12 12:15:09,974 [myid:1] - INFO [CommitProcessor:1:LeaderSessionTracker@104] - Committing global session 0x30404dd5e2c0004
2024-08-12 12:15:11,475 [myid:1] - WARN [QuorumPeer[myid=1](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):LearnerHandler@1077] - Closing connection to peer due to transaction timeout.
2024-08-12 12:15:11,475 [myid:1] - INFO [QuorumPeer[myid=1](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):LearnerHandler@1158] - Synchronously closing socket to learner 3.
2024-08-12 12:15:11,476 [myid:1] - ERROR [LearnerHandler-/192.168.249.20:52462:LearnerHandler@718] - Unexpected exception in LearnerHandler:
java.net.SocketException: Socket closed
at java.net.SocketInputStream.read(SocketInputStream.java:204)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
at java.io.DataInputStream.readInt(DataInputStream.java:387)
at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:96)
at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:86)
at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:134)
at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:655)
2024-08-12 12:15:11,476 [myid:1] - WARN [LearnerHandler-/192.168.249.20:52462:LearnerHandler@734] - ******* GOODBYE /192.168.249.20:52462 ********
2024-08-12 12:15:11,476 [myid:1] - WARN [LearnerHandler-/192.168.249.20:52462:LearnerHandler@1048] - Ignoring unexpected exception
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1220)
at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:335)
at java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:339)
at org.apache.zookeeper.server.quorum.LearnerHandler.shutdown(LearnerHandler.java:1046)
at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:736)
2024-08-12 12:15:11,482 [myid:1] - WARN [NIOWorkerThread-15:ZooKeeperServer@1417] - Connection request from old client /192.168.249.19:47342; will be dropped if server is in r-o mode
2024-08-12 12:15:14,149 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLeaderElection$Messenger$WorkerReceiver@390] - Notification: my state:LEADING; n.sid:3, n.state:LOOKING, n.leader:3, n.round:0x34e, n.peerEpoch:0x34c, n.zxid:0x34c0002996f, message format version:0x2, n.config version:0x0
2024-08-12 12:15:14,154 [myid:1] - INFO [LearnerHandler-/192.168.249.20:52948:LearnerHandler@509] - Follower sid: 3 : info : bigdata03:2888:3888:participant
2024-08-12 12:15:14,155 [myid:1] - INFO [LearnerHandler-/192.168.249.20:52948:ZKDatabase@346] - On disk txn sync enabled with snapshotSizeFactor 0.33
2024-08-12 12:15:14,155 [myid:1] - INFO [LearnerHandler-/192.168.249.20:52948:LearnerHandler@804] - Synchronizing with Learner sid: 3 maxCommittedLog=0x34c0002996f minCommittedLog=0x34c0002977b lastProcessedZxid=0x34c0002996f peerLastZxid=0x34c0002996f
2024-08-12 12:15:14,155 [myid:1] - INFO [LearnerHandler-/192.168.249.20:52948:LearnerHandler@849] - Sending DIFF zxid=0x34c0002996f for peer sid: 3
2024-08-12 12:15:15,968 [myid:1] - WARN [SyncThread:1:FileTxnLog@393] - fsync-ing the write ahead log in SyncThread:1 took 5987ms which will adversely effect operation latency.File size is 67108880 bytes. See the ZooKeeper troubleshooting guide
2024-08-12 12:15:17,187 [myid:1] - WARN [SyncThread:1:FileTxnLog@393] - fsync-ing the write ahead log in SyncThread:1 took 1218ms which will adversely effect operation latency.File size is 67108880 bytes. See the ZooKeeper troubleshooting guide
2024-08-12 12:15:17,188 [myid:1] - INFO [CommitProcessor:1:LeaderSessionTracker@104] - Committing global session 0x10327dc03f20009
2024-08-12 12:15:18,500 [myid:1] - WARN [SyncThread:1:FileTxnLog@393] - fsync-ing the write ahead log in SyncThread:1 took 1310ms which will adversely effect operation latency.File size is 67108880 bytes. See the ZooKeeper troubleshooting guide
2024-08-12 12:15:19,987 [myid:1] - WARN [SyncThread:1:FileTxnLog@393] - fsync-ing the write ahead log in SyncThread:1 took 1486ms which will adversely effect operation latency.File size is 67108880 bytes. See the ZooKeeper troubleshooting guide
2024-08-12 12:15:22,481 [myid:1] - ERROR [LearnerHandler-/192.168.249.20:52948:LearnerHandler@718] - Unexpected exception in LearnerHandler:
java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:210)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
at java.io.DataInputStream.readInt(DataInputStream.java:387)
at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:96)
at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:86)
at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:134)
at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:655)
2024-08-12 12:15:22,481 [myid:1] - ERROR [Sender-/192.168.249.20:52948:LearnerHandler@371] - Exception while sending packets in LearnerHandler
java.net.SocketException: Broken pipe (Write failed)
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111)
at java.net.SocketOutputStream.write(SocketOutputStream.java:155)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
at org.apache.zookeeper.server.quorum.LearnerHandler.sendPackets(LearnerHandler.java:335)
at org.apache.zookeeper.server.quorum.LearnerHandler.access$200(LearnerHandler.java:65)
at org.apache.zookeeper.server.quorum.LearnerHandler$1.run(LearnerHandler.java:750)
2024-08-12 12:15:22,481 [myid:1] - INFO [LearnerHandler-/192.168.249.20:52948:LearnerHandler@1158] - Synchronously closing socket to learner 3.
2024-08-12 12:15:22,481 [myid:1] - WARN [LearnerHandler-/192.168.249.20:52948:LearnerHandler@734] - ******* GOODBYE /192.168.249.20:52948 ********
2024-08-12 12:15:22,482 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLeaderElection$Messenger$WorkerReceiver@390] - Notification: my state:LEADING; n.sid:3, n.state:LOOKING, n.leader:3, n.round:0x34e, n.peerEpoch:0x34c, n.zxid:0x34c0002996f, message format version:0x2, n.config version:0x0
2024-08-12 12:15:22,976 [myid:1] - WARN [SyncThread:1:FileTxnLog@393] - fsync-ing the write ahead log in SyncThread:1 took 1494ms which will adversely effect operation latency.File size is 67108880 bytes. See the ZooKeeper troubleshooting guide
2024-08-12 12:15:23,835 [myid:1] - INFO [LearnerHandler-/192.168.249.20:52950:LearnerHandler@509] - Follower sid: 3 : info : bigdata03:2888:3888:participant
2024-08-12 12:15:23,836 [myid:1] - INFO [LearnerHandler-/192.168.249.20:52950:ZKDatabase@346] - On disk txn sync enabled with snapshotSizeFactor 0.33
2024-08-12 12:15:23,836 [myid:1] - INFO [LearnerHandler-/192.168.249.20:52950:LearnerHandler@804] - Synchronizing with Learner sid: 3 maxCommittedLog=0x34c0002997f minCommittedLog=0x34c0002978b lastProcessedZxid=0x34c0002997f peerLastZxid=0x34c0002996f
2024-08-12 12:15:23,836 [myid:1] - INFO [LearnerHandler-/192.168.249.20:52950:LearnerHandler@868] - Using committedLog for peer sid: 3
解决方式1
优化zookeeper的配置
这个方式目前来看并不能完全解决,但是可以有效降低zookeeper的问题时间。
修改zookeeper安装目录下面的配置zoo.cfg
核心的点:
autopurge.snapRetainCount=3
syncLimit=10
forceSync=no
配置含义解析:
autopurge.snapRetainCount
保留autopurge.snapRetainCount个最近的数据快照(dataDir)和对应的事务日志文件(dataLogDir),其余的将会删除掉。默认值是3,最小值也是3。
syncLimit
集群中的 Leader 节点和 Follower 节点之间请求和应答时能容忍的最多心跳数
forceSync
最关键的一点,该参数用于配置ZooKeeper服务器是否在事务提交的时候,将日志写入操作强制刷入磁盘,默认情况下是“yes”,即每次事务日志写入操作都会实时刷入磁盘。forceSync=no,这个对写请求的性能提升很有帮助,是指每次写请求的数据都要从pagecache中固化到磁盘上,才算是写成功返回。当写请求数量到达一定程度的时候,后续写请求会等待前面写请求的forceSync操作,造成一定延时。如果追求低延时的写请求,配置forceSync=no,数据写到pagecache后就返回。但是机器断电的时候,pagecache中的数据有可能丢失。
我用这个方法优化后,报错减少了非常多,偶发性的还有少量报错。也是因为zookeeper同步的时候会有。但不是只读模式的报错,而是请求数据库超时导致的。
参考:ZooKeeper调优 (qq.com)
解决方式2
更换服务器协调组件,Zookeeper组件切换ClickHouseKeeper组件。
这个我还没有验证。这个更换需要ck数据库的版本也比较高,太低的换不了。所以数据库版本低的也需要同步升级。
问题2:主节点磁盘占用增长快
我在主从节点的服务器发现,主节点的数据量是从节点的好几倍不止。感觉有问题,以为像mysql那种bin日志等。后期查看发现原来是部分系统日志没有自动删除导致的。
主节点的store目录通常用于存储集群元数据信息,比如分片和副本的配置信息,集群的元数据和日志文件等。
主节点的日志大小-131G
从节点的日志大小-38G
验证方法
先查看各个日志表的大小,确认是不是该问题导致
SELECT
sum(rows) AS `总行数`,
formatReadableSize(sum(data_uncompressed_bytes)) AS `原始大小`,
formatReadableSize(sum(data_compressed_bytes)) AS `压缩大小`,
round((sum(data_compressed_bytes) / sum(data_uncompressed_bytes)) * 100, 0) AS `压缩率`,
`table` AS `表名`
FROM system.parts where database = 'system' group by `table`
查看发现这几个日志表确实很大:
解决方法
设置日志表的 TTL 来自动清理数据。例如保存 30 天的日志可以使用以下 SQL:
ALTERTABLE query_log MODIFY TTL event_date + toIntervalDay(30);
参考:Clickhouse:日志表占用大量磁盘空间怎么办?_Clickhouse_Ken_InfoQ写作社区