Data Structures: LRU Cache

news2026/3/3 14:23:30

- LRU Cache
- Implementation
  - Class Structure
  - set
  - get
  - Testing
- Complete Code

LRU Cache

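At the hardware level, a cache is the fast memory that sits between the CPU and main memory and bridges the speed gap between the two. More broadly, a cache can be thought of simply as a temporary storage area.

A cache's capacity is very limited. Once it is full, adding new content means discarding some of the existing content to make room.

LRU stands for Least Recently Used. An LRU cache's replacement policy evicts the data that has gone unused the longest to make room for new content. This post implements an LRU Cache in C++.

There are many ways to implement an LRU Cache, but to make both get and set run in O(1) time, the classic solution is a hash table plus a doubly linked list.

In a doubly linked list, insertion and deletion at a known node are O(1), but lookup is O(N). A hash table is therefore added to bring lookup down to O(1) on average.

In addition, a linked list is an ordered structure, so the position of a node can encode recency: the most recently accessed value is kept at the head of the list and the least recently accessed value at the tail. When the LRU Cache is full, removing the tail node is enough, and that is still O(1).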

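As shown in the figure:

[Figure: a hash table (blue) whose entries point to the nodes of a doubly linked list (red)]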

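In the figure, the hash table is blue and the doubly linked list is red. Each hash table entry stores a key and a pointer to the corresponding list node.

To query the LRU Cache, hash the key to find its slot in the hash table, take the pointer stored there, and access the list node directly. This takes O(1) time.

Insertion and deletion likewise obtain the pointer from the hash table by key and then perform the list insertion or deletion, which is also O(1).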
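To make the lookup path concrete, here is a minimal sketch of the two-container layout, assuming std::list serves as the doubly linked list and std::unordered_map as the hash table. The names Store, put_front and lookup are hypothetical and exist only for this illustration; they are not part of the class built later in the post.

```cpp
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical illustration type: a list of (key, value) nodes plus an index into it.
struct Store {
    using Node = std::pair<int, std::string>;
    std::list<Node> nodes;                                    // most recent node at the front
    std::unordered_map<int, std::list<Node>::iterator> index; // key -> position of its node

    // Insert a new node at the head and remember where it lives.
    // Assumes the key is not already present.
    void put_front(int key, const std::string& value) {
        nodes.emplace_front(key, value);
        index[key] = nodes.begin();
    }

    // Average O(1): one hash lookup, then a direct dereference of the stored iterator.
    // Returns nullptr when the key is absent.
    const std::string* lookup(int key) const {
        auto it = index.find(key);
        return it == index.end() ? nullptr : &it->second->second;
    }
};
```

Storing iterators is safe here because std::list iterators stay valid when other nodes are inserted or erased; only erasing the node itself invalidates its iterator.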

Implementation

Class Structure

template <typename K, typename V, typename Hash = hash<K>>
class LRUCache
{
public:
    LRUCache(int capacity)
        : _capacity(capacity)
    {}
    
private:
    unordered_map<K, typename list<pair<K, V>>::iterator, Hash> _hash;
    list<pair<K, V>> _cache;
    int _capacity;
};

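Template parameters:

K: the key type
V: the value type
Hash: the hash function. If K is a user-defined type for which std::hash has no specialization, unordered_map cannot compute its hash, and the user must supply a hash function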

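Class members:

_hash: maps each key to the position of its list node; in C++ the iterator list<pair<K, V>>::iterator is used in place of a raw pointer
_cache: the cache itself, storing pair<K, V> key-value pairs
_capacity: the maximum capacity of the cache

The constructor takes a capacity from the user, which is the maximum capacity of the LRU Cache.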

set

Function declaration:

void set(K key, V value)

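When setting a value, the key may or may not already exist, so the two cases are handled separately.

If the key exists:

Update the value to the new value
Treat the key as accessed and move its list node to the head of the list

Code: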

auto hash_it = _hash.find(key);

if (hash_it != _hash.end()) // the key already exists
{
    auto list_it = hash_it->second; // iterator to the list node
    list_it->second = value; // update the node's value

    _cache.emplace_front(list_it->first, list_it->second); // construct a copy of the node at the head of the list
    _cache.erase(list_it); // erase the original node
    hash_it->second = _cache.begin(); // update the iterator stored in the hash table
}

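First, _hash.find(key) looks the key up in the hash table and returns the iterator hash_it. If hash_it != _hash.end(), the key already exists.

hash_it->second yields the iterator to the list node stored in the hash table, and list_it->second = value updates the node's value.

Next comes the move of the node to the head of the list.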

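To keep the code simple, the move is done by inserting a new node built from the original data at the head and erasing the original node. (std::list::splice could relink the existing node without copying it.)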

Because the original node has been replaced by a new one, the iterator previously stored in the hash table is now invalid; hash_it->second = _cache.begin(); updates it.
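The move described above copies the key-value pair into a new node. As an alternative sketch, and not the approach this post uses, std::list::splice can relink the existing node to the front without copying anything. The helper name move_to_front is hypothetical.

```cpp
#include <list>
#include <utility>

// Hypothetical helper: relink the node at `pos` to the front of `cache`.
// splice moves the node itself, so no pair is copied and `pos` remains valid,
// which means an iterator stored in the hash table would not need updating.
template <typename K, typename V>
void move_to_front(std::list<std::pair<K, V>>& cache,
                   typename std::list<std::pair<K, V>>::iterator pos) {
    cache.splice(cache.begin(), cache, pos);
}
```

Splicing a single element within the same list is a constant-time operation, so the O(1) bound of set and get is preserved.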

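If the key does not exist, a new node must be created. Before creating it, check whether the LRU Cache is already full; if it is, the least recently used entry must be removed first.

If the cache is full, remove the least recently used entry
Construct a new node from key and value and insert it at the head of the list
Add the key to the hash table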

else
{
    if (_capacity == _hash.size())
    {
        pair<K, V>& back = _cache.back(); // the least recently used element
        _hash.erase(back.first); // remove it from the hash table
        _cache.pop_back(); // remove it from the list
    }

    _cache.emplace_front(key, value);
    _hash[key] = _cache.begin();
}

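First, if (_capacity == _hash.size()) checks whether the cache is full. If it is, _cache.back() gives the node at the tail of the list, which is the least recently used entry, and that entry is removed from both the hash table and the list.

_cache.emplace_front(key, value) inserts the new node at the head of the list, and _hash[key] = _cache.begin() saves the mapping from key to the new iterator in the hash table.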
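One edge case is worth noting: with a capacity of 0 the branch above would call back() on an empty list, which is undefined behavior. The sketch below restates the "key not present" branch as a free function with an early return for that case. The function name insert_new, the aliases Cache and Index, and the fixed int/std::string types are assumptions made for this example only.

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

using Cache = std::list<std::pair<int, std::string>>;
using Index = std::unordered_map<int, Cache::iterator>;

// Hypothetical free-function version of the "key not present" branch of set.
// Assumes `key` is not already in `index`.
void insert_new(Cache& cache, Index& index, std::size_t capacity,
                int key, const std::string& value) {
    if (capacity == 0) return;            // nothing can be stored, and back() must not run
    if (index.size() >= capacity) {
        index.erase(cache.back().first);  // drop the least recently used key
        cache.pop_back();                 // and its node at the tail
    }
    cache.emplace_front(key, value);      // the new entry becomes the most recent
    index[key] = cache.begin();
}
```

Whether a zero-capacity cache should be rejected in the constructor instead is a design choice the original post leaves open.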

get

get is also simple:

If the key does not exist, return immediately
Move the queried node to the head of the list
Return the queried value

V get(K key)
{
    auto hash_it = _hash.find(key);
    if (hash_it == _hash.end())
        return V();

    auto list_it = hash_it->second;
    _cache.emplace_front(list_it->first, list_it->second);
    _cache.erase(list_it);
    hash_it->second = _cache.begin();

    return _cache.begin()->second;
}

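First, hash_it == _hash.end() checks whether the element exists; if it does not, V() is returned directly.

auto list_it = hash_it->second obtains the iterator to the node. As before, a new node with the same data is constructed at the head, and the node the iterator points to is then erased. The iterator stored in the hash table is invalidated at this point, so hash_it->second = _cache.begin() updates it.

Finally, the queried value is returned. Because the node has already been moved to the head, it is _cache.begin()->second.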
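Returning V() on a miss means a caller cannot tell an absent key from a stored default-constructed value. A possible variant, assuming C++17 is available, returns std::optional instead. This is only a sketch: try_get, Cache and Index are hypothetical names, the types are fixed to int and std::string for brevity, and it uses splice rather than the copy-and-erase move from the post.

```cpp
#include <list>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

using Cache = std::list<std::pair<int, std::string>>;
using Index = std::unordered_map<int, Cache::iterator>;

// Hypothetical variant of get: std::nullopt signals a miss, so a stored
// empty string can be told apart from a key that is not in the cache.
std::optional<std::string> try_get(Cache& cache, Index& index, int key) {
    auto it = index.find(key);
    if (it == index.end()) return std::nullopt;
    cache.splice(cache.begin(), cache, it->second); // mark as most recently used
    return it->second->second;                      // iterator is still valid after splice
}
```

A caller would then write something like if (auto v = try_get(cache, index, 1)) { ... } and only use *v inside the branch.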

Testing

Write a print function that outputs the current list:

void print()
{
    int i = 1;
    for (auto& p : _cache)
    {
        cout << i++ << ": " << p.first << " - " << p.second << endl;
    }
}

Test code:

int main()
{
    LRUCache<int, string> lru(5);

    cout << "Insert five entries:" << endl;
    lru.set(1, "a");
    lru.set(2, "b");
    lru.set(3, "c");
    lru.set(4, "d");
    lru.set(5, "e");
    lru.print();

    cout << "Get key 1:" << endl;
    lru.get(1);
    lru.print();

    cout << "Insert a sixth entry:" << endl;
    lru.set(6, "f");
    lru.print();

    return 0;
}

Test output:

Insert five entries:
1: 5 - e
2: 4 - d
3: 3 - c
4: 2 - b
5: 1 - a
Get key 1:
1: 1 - a
2: 5 - e
3: 4 - d
4: 3 - c
5: 2 - b
Insert a sixth entry:
1: 6 - f
2: 1 - a
3: 5 - e
4: 4 - d
5: 3 - c

Complete Code

LRUCache.hpp:

#pragma once
#include <functional>    // hash
#include <iostream>      // cout, used by print()
#include <list>
#include <unordered_map>
#include <utility>       // pair

using namespace std;

template <typename K, typename V, typename Hash = hash<K>>
class LRUCache
{
public:
    LRUCache(int capacity)
        : _capacity(capacity)
    {}

    void set(K key, V value)
    {
        auto hash_it = _hash.find(key);

        if (hash_it != _hash.end())
        {
            auto list_it = hash_it->second;
            list_it->second = value;

            _cache.emplace_front(list_it->first, list_it->second);
            _cache.erase(list_it);
            hash_it->second = _cache.begin();
        }
        else
        {
            if (_capacity == _hash.size())
            {
                pair<K, V>& back = _cache.back();
                _hash.erase(back.first);
                _cache.pop_back();
            }

            _cache.emplace_front(key, value);
            _hash[key] = _cache.begin();
        }
    }

    V get(K key)
    {
        auto hash_it = _hash.find(key);
        if (hash_it == _hash.end())
            return V();

        auto list_it = hash_it->second;

        _cache.emplace_front(list_it->first, list_it->second);
        _cache.erase(list_it);
        hash_it->second = _cache.begin();

        return _cache.begin()->second;
    }

    void print()
    {
        int i = 1;
        for (auto& p : _cache)
        {
            cout << i++ << ": " << p.first << " - " << p.second << endl;
        }
    }

private:
    unordered_map<K, typename list<pair<K, V>>::iterator, Hash> _hash;
    list<pair<K, V>> _cache;
    int _capacity;
};

test.cpp:

#include <iostream>
#include <string>

#include "LRUCache.hpp"

int main()
{
    LRUCache<int, string> lru(5);

    cout << "Insert five entries:" << endl;
    lru.set(1, "a");
    lru.set(2, "b");
    lru.set(3, "c");
    lru.set(4, "d");
    lru.set(5, "e");
    lru.print();

    cout << "Get key 1:" << endl;
    lru.get(1);
    lru.print();

    cout << "Insert a sixth entry:" << endl;
    lru.set(6, "f");
    lru.print();

    return 0;
}