redis底层数据结构之ziplist

news2025/10/23 11:04:11

一、概述

一种连续内存空间存储的顺序数据结构，每个元素可以是字符串或整数。优点:节省内存空间。适用于存储小规模的列表或有序集合。缺点:修改操作可能引发连锁更新，影响性能。使用场景: 在列表键元素较少或元素都是小整数时使用。

在这里插入图片描述

二、ziplist结构

(直达源码–>src/ziplist.c)

Header（头部）: 包含了ziplist的总字节长度、尾部（最后一个元素）的偏移量，以及ziplist中元素的数量。这些信息有助于快速地访问ziplist的基本属性和迅速找到ziplist的尾部。
Entry（项&条目）: 每个项可以存储一个整数或者一个字符串。
End（结束符）: 一个特定的字节（通常是0xFF），标记着ziplist的末尾，确保了ziplist结构的正确解析和遍历的终止。

* ZIPLIST OVERALL LAYOUT
* ======================
*
* The general layout of the ziplist is as follows:
*
* <zlbytes> <zltail> <zllen> <entry> <entry> ... <entry> <zlend>
*
* NOTE: all fields are stored in little endian, if not specified otherwise.
*
* <uint32_t zlbytes> is an unsigned integer to hold the number of bytes that
* the ziplist occupies, including the four bytes of the zlbytes field itself.
* This value needs to be stored to be able to resize the entire structure
* without the need to traverse it first.
*
* <uint32_t zltail> is the offset to the last entry in the list. This allows
* a pop operation on the far side of the list without the need for full
* traversal.
*
* <uint16_t zllen> is the number of entries. When there are more than
* 2^16-2 entries, this value is set to 2^16-1 and we need to traverse the
* entire list to know how many items it holds.
*
* <uint8_t zlend> is a special entry representing the end of the ziplist.
* Is encoded as a single byte equal to 255. No other normal entry starts
* with a byte set to the value of 255.

三、Entry结构

Previous Entry Length（前一项的长度）: 存储前一项的长度，使得ziplist可以被双向遍历。
Encoding（编码）: 指定了存储的值是整数还是字符串，以及使用了哪一种格式或长度。
Content（内容）: 实际存储的数据，可以是一个整数的二进制表示，或者是一个字符串的字节序列。

* ZIPLIST ENTRIES
* ===============
*
* Every entry in the ziplist is prefixed by metadata that contains two pieces
* of information. First, the length of the previous entry is stored to be
* able to traverse the list from back to front. Second, the entry encoding is
* provided. It represents the entry type, integer or string, and in the case
* of strings it also represents the length of the string payload.
* So a complete entry is stored like this:
*
* <prevlen> <encoding> <entry-data>

在entry中存储的是比较小的int类型时，encoding和entry-data会合并一起表示，此时会没有entry-data字段

* Sometimes the encoding represents the entry itself, like for small integers
* as we'll see later. In such a case the <entry-data> part is missing, and we
* could have just:
*
* <prevlen> <encoding>

前一个entry长度prevlen编码方式：如果这个长度小于 254 字节，它只会消耗一个字节，将长度表示为无符号 8 位整数。当长度大于等于254时，会消耗5个字节。第一个字节设置为 254 (FE) 以指示后面有更大的值。其余 4 个字节以前一个条目的长度为值。

* So practically an entry is encoded in the following way:
*
* <prevlen from 0 to 253> <encoding> <entry>
*
* Or alternatively if the previous entry length is greater than 253 bytes
* the following encoding is used:
*
* 0xFE <4 bytes unsigned little endian prevlen> <encoding> <entry>

encoding字段内容：条目的encoding字段取决于条目的内容。当条目是字符串时，encoding第一个字节的前 2 位将保存用于存储字符串长度的编码类型，后面是字符串的实际长度。当条目是整数时，encoding前 2 位都设置为 1。接下来的 2 位用于指定此标头之后将存储哪种整数。不同类型和编码的概述如下。第一个字节始终足以确定条目的类型。

* |00pppppp| - 1 byte
*      String value with length less than or equal to 63 bytes (6 bits).
*      "pppppp" represents the unsigned 6 bit length.
* |01pppppp|qqqqqqqq| - 2 bytes
*      String value with length less than or equal to 16383 bytes (14 bits).
*      IMPORTANT: The 14 bit number is stored in big endian.
* |10000000|qqqqqqqq|rrrrrrrr|ssssssss|tttttttt| - 5 bytes
*      String value with length greater than or equal to 16384 bytes.
*      Only the 4 bytes following the first byte represents the length
*      up to 2^32-1. The 6 lower bits of the first byte are not used and
*      are set to zero.
*      IMPORTANT: The 32 bit number is stored in big endian.
* |11000000| - 3 bytes
*      Integer encoded as int16_t (2 bytes).
* |11010000| - 5 bytes
*      Integer encoded as int32_t (4 bytes).
* |11100000| - 9 bytes
*      Integer encoded as int64_t (8 bytes).
* |11110000| - 4 bytes
*      Integer encoded as 24 bit signed (3 bytes).
* |11111110| - 2 bytes
*      Integer encoded as 8 bit signed (1 byte).
* |1111xxxx| - (with xxxx between 0001 and 1101) immediate 4 bit integer.
*      Unsigned integer from 0 to 12. The encoded value is actually from
*      1 to 13 because 0000 and 1111 can not be used, so 1 should be
*      subtracted from the encoded 4 bit value to obtain the right value.
* |11111111| - End of ziplist special entry.
*
* Like for the ziplist header, all the integers are represented in little
* endian byte order, even when this code is compiled in big endian systems.

四、为什么ZipList特别省内存

ziplist节省内存是相对于普通的list来说的，如果是普通的数组，那么它每个元素占用的内存是一样的且取决于最大的那个元素（很明显它是需要预留空间的）；
所以ziplist在设计时就很容易想到要尽量让每个元素按照实际的内容大小存储，所以增加encoding字段，针对不同的encoding来细化存储大小；
这时候还需要解决的一个问题是遍历元素时如何定位下一个元素呢？在普通数组中每个元素定长，所以不需要考虑这个问题；但是ziplist中每个data占据的内存不一样，所以为了解决遍历，需要增加记录上一个元素的length，所以增加了prelen字段

五、ziplist的缺点

不预留内存空间：ziplist 不预留额外的内存空间，在写操作时可能需要频繁进行内存分配和释放操作，影响性能。
立即缩容：在移除节点后，ziplist 立即缩容，可能导致频繁的内存分配和释放操作。
链式扩容：节点如果扩容，可能导致节点占用的内存增长，并且在超过一定字节后，可能会导致链式扩容。链式扩容的时间复杂度为 O(N)，虽然在大多数情况下概率较小，但在恶劣情况下会导致性能下降。

链式扩容（级联更新）这种现象发生在 ziplist 数据结构中，主要是因为 ziplist 存储了关于前一个元素长度的元数据信息。当一个元素被插入或修改导致其长度增加，并且这个长度增加导致存储前一个元素长度的空间（prevlen）不足以容纳新的长度时，就需要对当前元素进行重新分配以适应长度的变化。这种重新分配可能会影响到相邻的元素，导致它们也需要进行重新分配来适应自己前一个元素的长度变化，从而形成一种“级联”效应。级联更新是 ziplist 的一个潜在缺陷，因为它可能会导致对多个连续元素进行重复的内存分配操作，从而影响性能。这是在 Redis 的后续版本中引入 listpack 作为替代的重要原因之一，listpack 设计上试图减少这种级联更新的可能性，提高数据操作的效率。

综上所述，尽管 ziplist 能够有效地节省内存空间，但在写操作频繁、节点删除较多或节点扩容较大时，可能会出现性能问题。