【c++】【STL】unordered_set 底层实现（简略版）

ps:这个是我自己看的不保证正确，觉得太长的后面会总结整个调用逻辑

unordered_set 内部实现

template <class _Kty, class _Hasher = hash<_Kty>, class _Keyeq = equal_to<_Kty>, class _Alloc = allocator<_Kty>>
class unordered_set : 
public _Hash<_Uset_traits<_Kty, _Uhash_compare<_Kty, _Hasher, _Keyeq>, _Alloc, false>> {
    // hash table of key-values, unique keys
public:
    static_assert(!_ENFORCE_MATCHING_ALLOCATORS || is_same_v<_Kty, typename _Alloc::value_type>,
        _MISMATCHED_ALLOCATOR_MESSAGE("unordered_set<T, Hasher, Eq, Allocator>", "T"));

private:
    using _Mytraits      = _Uhash_compare<_Kty, _Hasher, _Keyeq>;
    using _Mybase        = _Hash<_Uset_traits<_Kty, _Mytraits, _Alloc, false>>;
    using _Alnode        = typename _Mybase::_Alnode;
    using _Alnode_traits = typename _Mybase::_Alnode_traits;
    using _Key_compare   = typename _Mybase::_Key_compare;

public:
    using hasher    = _Hasher;
    using key_type  = _Kty;
    using key_equal = _Keyeq;
    ......

1 unordered_set

unordered_set 是从 _Hash 继承而来。
_Hash 是实际的哈希表实现，包含哈希冲突解决、扩容等核心逻辑。

2 _Uset_traits

unordered_set 通过模板参数传入到 _Uset_traits

_Uset_traits<_Kty, _Uhash_compare<_Kty, _Hasher, _Keyeq>, _Alloc, false>

参数	解释	例子
`_Kty`	存储的键的类型	`int`, `std::string`
`_Hasher`	哈希函数	`std::hash<int>`
`_Keyeq`	键比较器	`std::equal_to<int>`
`_Alloc`	分配器	`std::allocator<int>`
`false`	是否允许重复键（为 `false` 表示不允许）	`unordered_set` 需要唯一键

2.1 _Uset_traits 负责配置

在 unordered_set 中，_Uset_traits 负责设置哈希表的行为，包括：
✅ 元素的存储方式
✅ 哈希冲突解决方式

哈希冲突解决方式由 _Hash 通过 _Vec（桶）和 _List（链表）来完成。
_Uset_traits 通过 hasher 和 key_equal 定义了哈希冲突的对比方式。

✅ 扩容策略
_Uset_traits 定义如下：

template <class _Kty, // key type (same as value type)
    class _Tr, // comparator predicate type
    class _Alloc, // actual allocator type (should be value allocator)
    bool _Mfl> // true if multiple equivalent keys are permitted
class _Uset_traits : public _Tr { // traits required to make _Hash behave like a set
public:
    using key_type            = _Kty;
    using value_type          = _Kty;
    using _Mutable_value_type = _Kty;
    using key_compare         = _Tr;
    using allocator_type      = _Alloc;
    ......

2.2 _Uhash_compare 负责哈希和比较

在 _Uset_traits 中，_Hasher 和 _Keyeq 通过 _Uhash_compare 进行包装。
_Uhash_compare 是构造哈希表的关键：

_Myval1 = _Hasher
_Myval2 = _Keyeq 和最大负载因子

🔑 总结继承链

unordered_set
⬇️ 继承
_Hash
⬇️ 通过 _Uset_traits 配置
_Uhash_compare
⬇️ 存储哈希函数、比较器、负载因子

调用路径完整总结

std::unordered_set<int> s;

➡️ unordered_set 构造
➡️ _Uset_traits 设置 key_type, hash, equal, allocator
➡️ _Uhash_compare 负责存储哈希和比较器
➡️ _Hash 处理哈希表创建、冲突解决、扩容等核心逻辑

1 _Uhash_compare

_Uhash_compare 的作用是:封装哈希函数（_Hasher）和键值比较器（_Keyeq）

哈希函数：用于生成键的哈希值—> 哈希函数 _Hasher
比较函数：用于判断键是否相等—> 键值比较器 _Keyeq
最大桶大小的管理：管理哈希桶的最大负载因子

class _Uhash_compare 封装了这些操作。
下面是vs2022对应代码：

template <class _Kty, class _Hasher, class _Keyeq>
class _Uhash_compare
    : public _Uhash_choose_transparency<_Kty, _Hasher, _Keyeq> { // traits class for unordered containers
public:
    enum { // parameters for hash table
        bucket_size = 1 // 0 < bucket_size
    };

    _Uhash_compare() noexcept(
        conjunction_v<is_nothrow_default_constructible<_Hasher>, is_nothrow_default_constructible<_Keyeq>>)
        : _Mypair(_Zero_then_variadic_args_t{}, _Zero_then_variadic_args_t{}, 0.0f) {}

    explicit _Uhash_compare(const _Hasher& _Hasharg) noexcept(
        conjunction_v<is_nothrow_copy_constructible<_Hasher>, is_nothrow_default_constructible<_Keyeq>>)
        : _Mypair(_One_then_variadic_args_t{}, _Hasharg, _Zero_then_variadic_args_t{}, 0.0f) {}

    explicit _Uhash_compare(const _Hasher& _Hasharg, const _Keyeq& _Keyeqarg) noexcept(
        conjunction_v<is_nothrow_copy_constructible<_Hasher>, is_nothrow_copy_constructible<_Keyeq>>)
        : _Mypair(_One_then_variadic_args_t{}, _Hasharg, _One_then_variadic_args_t{}, _Keyeqarg, 0.0f) {}

    template <class _Keyty>
    _NODISCARD size_t operator()(const _Keyty& _Keyval) const noexcept(_Nothrow_hash<_Hasher, _Keyty>) {
        // hash _Keyval to size_t value
        return static_cast<size_t>(_Mypair._Get_first()(_Keyval));
    }

    template <class _Keyty1, class _Keyty2>
    _NODISCARD bool operator()(const _Keyty1& _Keyval1, const _Keyty2& _Keyval2) const
        noexcept(_Nothrow_compare<_Keyeq, _Keyty1, _Keyty2>) {
        // test if _Keyval1 NOT equal to _Keyval2
        return !static_cast<bool>(_Mypair._Myval2._Get_first()(_Keyval1, _Keyval2));
    }

    _NODISCARD float& _Get_max_bucket_size() noexcept {
        return _Mypair._Myval2._Myval2;
    }

    _NODISCARD const float& _Get_max_bucket_size() const noexcept {
        return _Mypair._Myval2._Myval2;
    }

    void swap(_Uhash_compare& _Rhs) noexcept(
        conjunction_v<_Is_nothrow_swappable<_Hasher>, _Is_nothrow_swappable<_Keyeq>>) {
        _Swap_adl(_Mypair._Get_first(), _Rhs._Mypair._Get_first());
        auto& _Lsecond = _Mypair._Myval2;
        auto& _Rsecond = _Rhs._Mypair._Myval2;
        _Swap_adl(_Lsecond._Get_first(), _Rsecond._Get_first());
        _STD swap(_Lsecond._Myval2, _Rsecond._Myval2);
    }

    _Compressed_pair<_Hasher, _Compressed_pair<_Keyeq, float>> _Mypair;
};

_Compressed_pair<_Hasher, _Compressed_pair<_Keyeq, float>> _Mypair;

通过一个 pair（即 _Compressed_pair）存储哈希函数、键值比较器和最大负载因子（即扩容因子）。
_Mypair结构：

1.1 _Uhash_compare 定义

template <class _Kty, class _Hasher, class _Keyeq>
class _Uhash_compare
    : public _Uhash_choose_transparency<_Kty, _Hasher, _Keyeq> {

模板参数解释

参数	作用
_Kty	关键字类型（key type）
_Hasher	哈希函数类型
_Keyeq	键值比较器类型

1.2 构造函数解释

在这里插入图片描述

1.2.1 默认构造函数

_Uhash_compare() noexcept(
    conjunction_v<is_nothrow_default_constructible<_Hasher>, is_nothrow_default_constructible<_Keyeq>>)
    : _Mypair(_Zero_then_variadic_args_t{}, _Zero_then_variadic_args_t{}, 0.0f) {}

✅ 作用：

默认构造 _Hasher 和 _Keyeq。
将最大负载因子初始化为 0.0f。

✅ _Zero_then_variadic_args_t{} 是一个标签，用于指示在 Compressed_pair 中使用默认构造。

1.2.2 带哈希函数的构造函数

explicit _Uhash_compare(const _Hasher& _Hasharg) noexcept(
    conjunction_v<is_nothrow_copy_constructible<_Hasher>, is_nothrow_default_constructible<_Keyeq>>)
    : _Mypair(_One_then_variadic_args_t{}, _Hasharg, _Zero_then_variadic_args_t{}, 0.0f) {}

✅ 作用：

传入一个哈希函数，并用默认构造创建比较器。
将最大负载因子初始化为 0.0f。

✅ _One_then_variadic_args_t{} 是一个标签，表示先传入 _Hasher，再调用 _Keyeq 的默认构造。

1.2.3 带哈希函数和比较器的构造函数

explicit _Uhash_compare(const _Hasher& _Hasharg, const _Keyeq& _Keyeqarg) noexcept(
    conjunction_v<is_nothrow_copy_constructible<_Hasher>, is_nothrow_copy_constructible<_Keyeq>>)
    : _Mypair(_One_then_variadic_args_t{}, _Hasharg, _One_then_variadic_args_t{}, _Keyeqarg, 0.0f) {}

✅ 作用：

同时传入哈希函数和比较器。
将**最大负载因子（扩容因子）**初始化为 0.0f。

✅ _One_then_variadic_args_t{} 作用：

第一个 _One_then_variadic_args_t{}：传入哈希函数
第二个 _One_then_variadic_args_t{}：传入键比较器

1.2.4 举例表示

1：默认构造 unordered_map

std::unordered_map<int, int> m;

➡️ 执行路径：

默认构造会触发 _Uhash_compare() 默认构造函数：

_Uhash_compare() 
    : _Mypair(_Zero_then_variadic_args_t{}, _Zero_then_variadic_args_t{}, 0.0f) {}

_Compressed_pair 的这个构造函数会被调用：

template <class... _Other2>
constexpr explicit _Compressed_pair(_Zero_then_variadic_args_t, _Other2&&... _Val2)

_Myval1 被默认构造（即哈希函数为 std::hash 默认构造）
_Myval2 被构造成 {std::equal_to{}, 0.0f}

ps:_Compressed_pair 在后面会提到
_Myval1：存储哈希函数对象（_Hasher）
_Myval2：存储键值比较对象（_Keyeq） 和 最大负载因子

➡️ 构造完成后，最大负载因子 被设置为 1.0f（在 rehash() 之后调整）。

2 传入哈希函数构造 unordered_set

std::unordered_set<int, std::hash<int>> m;

➡️ 执行路径：

触发 _Uhash_compare 的这个构造函数：

explicit _Uhash_compare(const _Hasher& _Hasharg)
    : _Mypair(_One_then_variadic_args_t{}, _Hasharg, _Zero_then_variadic_args_t{}, 0.0f) {}

_Compressed_pair 的这个构造函数被调用：

template <class _Other1, class... _Other2>
constexpr _Compressed_pair(_One_then_variadic_args_t, _Other1&& _Val1, _Other2&&... _Val2)
    : _Myval1(_STD forward<_Other1>(_Val1)), _Myval2(_STD forward<_Other2>(_Val2)...) {}

_Myval1 被设置为 std::hash<int>
_Myval2 被设置为 {std::equal_to<int>(), 0.0f}

➡️ 最大负载因子在 rehash() 之后被调整为 1.0f。

3 传入哈希函数 + 比较器构造 unordered_set

std::unordered_set<int, std::hash<int>, std::equal_to<int>> m;

➡️ 执行路径：

触发 _Uhash_compare 的这个构造函数：

explicit _Uhash_compare(const _Hasher& _Hasharg, const _Keyeq& _Keyeqarg)
    : _Mypair(_One_then_variadic_args_t{}, _Hasharg, _One_then_variadic_args_t{}, _Keyeqarg, 0.0f) {}

_Compressed_pair 的这个构造函数被调用：

template <class _Other1, class... _Other2>
constexpr _Compressed_pair(_One_then_variadic_args_t, _Other1&& _Val1, _Other2&&... _Val2)
    : _Myval1(_STD forward<_Other1>(_Val1)), _Myval2(_STD forward<_Other2>(_Val2)...) {}

_Myval1 被设置为 std::hash<int>
_Myval2 被设置为 {std::equal_to<int>(), 0.0f}

➡️ 最大负载因子在 rehash() 之后被调整为 1.0f。

_Uhash_compare 构造时，创建 _Compressed_pair 对象。
_Compressed_pair 里第一个参数存储哈希函数，
第二个参数存储一个新的 _Compressed_pair，用于存储比较器和最大负载因子。
默认的最大负载因子（扩容因子）设为 0.0f

0.0f 在初始化时作为占位值 ✅
在 rehash() 或 _Forced_rehash() 中，最大负载因子会被设置为 1.0f ✅
rehash() 的扩容策略根据当前的负载情况和桶大小动态调整 ✅

rehash()是扩容策略的函数

1.3 其它涉及到的函数

在这里插入图片描述

在_Uhash_compare 里面是_Compressed_pair

_Compressed_pair<_Hasher, _Compressed_pair<_Keyeq, float>> _Mypair;

2 _Compressed_pair

2.1 源码

template <class _Ty1, class _Ty2>
class _Compressed_pair<_Ty1, _Ty2, false> final { // store a pair of values, not deriving from first
public:
    _Ty1 _Myval1;
    _Ty2 _Myval2;

    template <class... _Other2>
    constexpr explicit _Compressed_pair(_Zero_then_variadic_args_t, _Other2&&... _Val2) noexcept(
        conjunction_v<is_nothrow_default_constructible<_Ty1>, is_nothrow_constructible<_Ty2, _Other2...>>)
        : _Myval1(), _Myval2(_STD forward<_Other2>(_Val2)...) {}

    template <class _Other1, class... _Other2>
    constexpr _Compressed_pair(_One_then_variadic_args_t, _Other1&& _Val1, _Other2&&... _Val2) noexcept(
        conjunction_v<is_nothrow_constructible<_Ty1, _Other1>, is_nothrow_constructible<_Ty2, _Other2...>>)
        : _Myval1(_STD forward<_Other1>(_Val1)), _Myval2(_STD forward<_Other2>(_Val2)...) {}

    constexpr _Ty1& _Get_first() noexcept {
        return _Myval1;
    }

    constexpr const _Ty1& _Get_first() const noexcept {
        return _Myval1;
    }
};

_Compressed_pair 被用来存储两个值：

_Myval1：存储哈希函数对象（_Hasher）
_Myval2：存储键值比较对象（_Keyeq） 和 最大负载因子

class _Compressed_pair final : private _Ty1 { // store a pair of values, deriving from empty first
public:
    _Ty2 _Myval2;

    using _Mybase = _Ty1; // for visualization

    template <class... _Other2>
    constexpr explicit _Compressed_pair(_Zero_then_variadic_args_t, _Other2&&... _Val2) noexcept(
        conjunction_v<is_nothrow_default_constructible<_Ty1>, is_nothrow_constructible<_Ty2, _Other2...>>)
        : _Ty1(), _Myval2(_STD forward<_Other2>(_Val2)...) {}

    template <class _Other1, class... _Other2>
    constexpr _Compressed_pair(_One_then_variadic_args_t, _Other1&& _Val1, _Other2&&... _Val2) noexcept(
        conjunction_v<is_nothrow_constructible<_Ty1, _Other1>, is_nothrow_constructible<_Ty2, _Other2...>>)
        : _Ty1(_STD forward<_Other1>(_Val1)), _Myval2(_STD forward<_Other2>(_Val2)...) {}

    constexpr _Ty1& _Get_first() noexcept {
        return *this;
    }

    constexpr const _Ty1& _Get_first() const noexcept {
        return *this;
    }
};

扩容因子先被初始化为0.0f
在这里插入图片描述

_Myval2 是一个对象，其结构如下：

struct _Myval2_type {
    _Keyeq _Keyeqarg;
    float _Max_load_factor;
};

所以 _Mypair 实际结构如下：
在这里插入图片描述

2.2 上面涉及到_Uhash_compare构造函数的部分

构造 _Mypair 的地方：

explicit _Uhash_compare(const _Hasher& _Hasharg, const _Keyeq& _Keyeqarg) noexcept(
    conjunction_v<is_nothrow_copy_constructible<_Hasher>, is_nothrow_copy_constructible<_Keyeq>>)
    : _Mypair(_One_then_variadic_args_t{}, _Hasharg, _One_then_variadic_args_t{}, _Keyeqarg, 0.0f) {}

在 _Compressed_pair 里对应的是：

template <class _Other1, class... _Other2>
constexpr _Compressed_pair(_One_then_variadic_args_t, _Other1&& _Val1, _Other2&&... _Val2) noexcept(
    conjunction_v<is_nothrow_constructible<_Ty1, _Other1>, is_nothrow_constructible<_Ty2, _Other2...>>)
    : _Myval1(_STD forward<_Other1>(_Val1)), _Myval2(_STD forward<_Other2>(_Val2)...) {}

_Val1 ➔ _Hasher（哈希函数对象）
_Val2... ➔ _Keyeq（键值比较对象）和 0.0f（最大负载因子）

等价于：

_Mypair._Myval1 = _Hasharg; // 存储哈希函数对象
_Mypair._Myval2 = { _Keyeqarg, 0.0f }; // 存储键比较对象和最大负载因子

_Myval1 = 存储哈希函数 _Hasher
_Myval2 = 存储键值比较对象 _Keyeq + 最大负载因子

2.3 扩容因子

在这里插入图片描述
这里就是获取扩容因子

在 rehash()中–>_Forced_rehash()
扩容因子会在 rehash() 之后更新：

_Mypair._Get_second()._Myval2 = 新的最大负载因子;

_Mypair._Get_second() 访问的是 _Myval2
_Myval2._Max_load_factor 会根据当前桶数和元素数量动态调整

2.4 总结

成员	含义	作用
`_Myval1`	`_Hasher`	存储哈希函数对象，负责将键映射为哈希值
`_Myval2`	`_Keyeq` + 最大负载因子	存储键值比较器 + 最大负载因子，负责元素比较和控制扩容行为

所以：

_Myval1 👉 控制哈希映射
_Myval2 👉 控制键值比较 + 控制负载因子（扩容因子）

3 扩容部分

unordered_set 第一次触发扩容后，内部会调用 rehash() 将最大负载因子设置成 1.0。

3.1 rehash代码

 void rehash(size_type _Buckets) { // rebuild table with at least _Buckets buckets
        // don't violate a.bucket_count() >= a.size() / a.max_load_factor() invariant:
        _Buckets = (_STD max) (_Min_load_factor_buckets(_List.size()), _Buckets);
        if (_Buckets <= _Maxidx) { // we already have enough buckets; nothing to do
            return;
        }
        _Forced_rehash(_Buckets);
    }

3.1.2 _Min_load_factor_buckets(_List.size())

_Min_load_factor_buckets() 计算的目的是：
- 维护最大负载因子的约束：

NODISCARD size_type _Min_load_factor_buckets(const size_type _For_size) const noexcept {
        // returns the minimum number of buckets necessary for the elements in _List
        return static_cast<size_type>(_CSTD ceilf(static_cast<float>(_For_size) / max_load_factor()));
    }

目的： 通过最大负载因子（max_load_factor）计算出当前存储元素所需的最小桶数量。

_For_size：表示当前容器中已有的元素数量。
max_load_factor()：最大负载因子，默认值为 1.0。
ceilf()：向上取整，确保计算结果为整数，保证即使余数很小也会分配一个完整的桶。

推导公式
最大负载因子=元素数量/桶数
可以通过最大负载因子计算出当前的桶数–>桶数=元素数量/最大负载因子

eg:

当前有 10 个元素
最大负载因子为 0.8
10/0.8 = 12.5—>向上取整= 13—>（需要至少 13 个桶）

3.1.3 _Buckets = (_STD max) (…)

std::max 选择当前计算出的最小桶数量和传入的 _Buckets 值。
这一步确保在 rehash() 过程中，桶数量不会低于当前负载因子要求的最低桶数。

3.1.4 if (_Buckets <= _Maxidx)

_Maxidx 表示当前桶的数量（即 bucket_count()）。
如果 _Buckets 小于等于当前桶数量，表示桶数量足够，无需扩容，直接返回。

4 扩容函数 _Forced_rehash(_Buckets)

4.1 源代码

void _Forced_rehash(size_type _Buckets) {
// Force rehash of elements in _List, distrusting existing bucket assignments in _Vec.
// Assumes _Buckets is greater than _Min_buckets, and that changing to that many buckets doesn't violate
// load_factor() <= max_load_factor().

// Don't violate power of 2, fits in half the bucket vector invariant:
 // (we assume because vector must use single allocations; as a result, its max_size fits in a size_t)
const unsigned long _Max_storage_buckets_log2 = _Floor_of_log_2(static_cast<size_t>(_Vec.max_size() >> 1));
const auto _Max_storage_buckets               = static_cast<size_type>(1) << _Max_storage_buckets_log2;
if (_Buckets > _Max_storage_buckets) {
_Xlength_error("invalid hash bucket count");
}

// The above test also means that we won't perform a forbidden full shift when restoring the power of
// 2 invariant
// this round up to power of 2 in addition to the _Buckets > _Maxidx above means
// we'll at least double in size (the next power of 2 above _Maxidx)
_Buckets                       = static_cast<size_type>(1) << _Ceiling_of_log_2(static_cast<size_t>(_Buckets));
const _Unchecked_iterator _End = _Unchecked_end();

 _Vec._Assign_grow(_Buckets << 1, _End);
_Mask   = _Buckets - 1;
_Maxidx = _Buckets;

_Clear_guard _Guard{this};

_Unchecked_iterator _Inserted = _Unchecked_begin();

// Remember the next _Inserted value as splices will change _Inserted's position arbitrarily.
for (_Unchecked_iterator _Next_inserted = _Inserted; _Inserted != _End; _Inserted = _Next_inserted) {
++_Next_inserted;

auto& _Inserted_key     = _Traits::_Kfn(*_Inserted);
const size_type _Bucket = bucket(_Inserted_key);

// _Bucket_lo and _Bucket_hi are the *inclusive* range of elements in the bucket, or _Unchecked_end() if
// the bucket is empty; if !_Standard then [_Bucket_lo, _Bucket_hi] is a sorted range.
_Unchecked_iterator& _Bucket_lo = _Vec._Mypair._Myval2._Myfirst[_Bucket << 1];
_Unchecked_iterator& _Bucket_hi = _Vec._Mypair._Myval2._Myfirst[(_Bucket << 1) + 1];

if (_Bucket_lo == _End) {
// The bucket was empty, set it to the inserted element.
_Bucket_lo = _Inserted;
_Bucket_hi = _Inserted;
continue;
}
// Search the bucket for the insertion location and move element if necessary.
_Unchecked_const_iterator _Insert_before = _Bucket_hi;
if (!_Traitsobj(_Inserted_key, _Traits::_Kfn(*_Insert_before))) {
// The inserted element belongs at the end of the bucket; splice it there and set _Bucket_hi to the
// new bucket inclusive end.
++_Insert_before;
if (_Insert_before != _Inserted) { // avoid splice on element already in position
_Mylist::_Scary_val::_Unchecked_splice(_Insert_before._Ptr, _Inserted._Ptr, _Next_inserted._Ptr);
}
_Bucket_hi = _Inserted;
continue;
}
// The insertion point isn't *_Bucket_hi, so search [_Bucket_lo, _Bucket_hi) for insertion point; we
// go backwards to maintain sortedness when !_Standard.
for (;;) {
if (_Bucket_lo == _Insert_before) {
// There are no equivalent keys in the bucket, so insert it at the beginning.
// Element can't be already in position here because:
// * (for !_Standard) _Inserted_key < *_Insert_before or
// * (for _Standard) _Inserted_key != *_Insert_before
_Mylist::_Scary_val::_Unchecked_splice(_Insert_before._Ptr, _Inserted._Ptr, _Next_inserted._Ptr);
_Bucket_lo = _Inserted;
break;
}
if (!_Traitsobj(_Inserted_key, _Traits::_Kfn(*--_Insert_before))) {
// Found insertion point, move the element here, bucket bounds are already okay.
++_Insert_before;
// Element can't be already in position here because all elements we're inserting are after all
// the elements already in buckets, and *_Insert_before isn't the highest element in the bucket.
_Mylist::_Scary_val::_Unchecked_splice(_Insert_before._Ptr, _Inserted._Ptr, _Next_inserted._Ptr);
break;
          }
       }
   }

_Forced_rehash() 是 unordered_set触发扩容（rehash）时的核心操作。
扩容的主要目的是为了：
✅ 解决哈希冲突
✅ 保持最大负载因子（即 load_factor <= max_load_factor()）
✅ 保证哈希桶大小是 2 的幂次方（以便通过位运算提升效率）

4.2 函数定义

void _Forced_rehash(size_type _Buckets);

参数名	说明
`_Buckets`	需要的桶数量（由 `_Min_load_factor_buckets()` 计算得出）

工作原理（整体思路）

计算新的桶数量（满足最小的，根据负载因子计算出的桶数）,位运算–>将桶数量调整为最接近的 2 的幂
如果桶数量超过最大允许数量，抛出异常。
重新分配存储空间。
将所有元素从旧桶重新分配到新桶（按照新的哈希分布）。
调整哈希表元信息（掩码、最大桶索引等）。

4.3 ① 确认桶数量是否超限

const unsigned long _Max_storage_buckets_log2 = _Floor_of_log_2(static_cast<size_t>(_Vec.max_size() >> 1));
const auto _Max_storage_buckets = static_cast<size_type>(1) << _Max_storage_buckets_log2;
if (_Buckets > _Max_storage_buckets) {
    _Xlength_error("invalid hash bucket count");
}

_Vec.max_size()：返回当前 _Vec 支持的最大元素数量。
>> 1：为保证 vector 的容量不会超出限制，最大桶数量为 vector 最大长度的一半。
超出最大桶数量限制时，直接抛出长度错误。

4.4 ② 调整桶数量到 2 的幂

_Buckets = static_cast<size_type>(1) << _Ceiling_of_log_2(static_cast<size_t>(_Buckets));

_Ceiling_of_log_2(x)：返回不小于 log2(x) 的最小整数，确保桶数量为2 的幂。
1 << n：将桶数量调整为最接近的 2 的幂（位运算）。

eg ：插入元素后需要 5 个桶

_Min_load_factor_buckets() 返回 5
_Ceiling_of_log_2(5) = 3
1 << 3 = 8

➡️ 结果：调整后桶数量为 8（最接近的 2 的幂）

为什么要调整成 2 的幂？

便于使用位运算替代取模
- hash % bucket_size → hash & (bucket_size - 1)
- 位运算速度快于取模运算
更高效的内存分配
- 2 的幂次分配，内存对齐，提升访问速度
减少哈希冲突
- 通过位运算定位索引更均匀
  ✅ 结论

unordered_map 和 unordered_set 在内部调整桶数量时，最终都会转换为最接近的 2 的幂。
_Min_load_factor_buckets() 可能会返回非 2 的幂的值，但实际分配桶数量最终通过 _Ceiling_of_log_2() 转换为 2 的幂。

4.5 ③ 分配新的存储空间

_Vec._Assign_grow(_Buckets << 1, _End);

<< 1：将桶数量扩大一倍，原因是存储 _Bucket_lo 和 _Bucket_hi。
_Assign_grow()：为哈希桶重新分配内存。
- _Bucket_lo：桶的起始位置。
- _Bucket_hi：桶的结束位置。

4.6 ④ 更新哈希表的元信息

_Mask = _Buckets - 1;
_Maxidx = _Buckets;

_Mask：用于通过位运算快速定位桶索引
_Maxidx：保存当前最大桶索引

4.7 ⑤ 重新分配元素

_Unchecked_iterator _Inserted = _Unchecked_begin();

for (_Unchecked_iterator _Next_inserted = _Inserted; _Inserted != _End; _Inserted = _Next_inserted) {
    ++_Next_inserted;

遍历所有元素
根据新桶数量和哈希函数，重新定位每个元素的桶位置
将元素放入新的桶中

4.8 ⑥ 重新插入桶

const size_type _Bucket = bucket(_Inserted_key);

_Unchecked_iterator& _Bucket_lo = _Vec._Mypair._Myval2._Myfirst[_Bucket << 1];
_Unchecked_iterator& _Bucket_hi = _Vec._Mypair._Myval2._Myfirst[(_Bucket << 1) + 1];

_Bucket = bucket(_Inserted_key)：用新的哈希函数定位元素所在桶。
_Bucket_lo 和 _Bucket_hi：维护当前桶的首尾位置。

4.9 ⑦ 插入到桶内

根据哈希冲突策略（开放链地址法）：

若桶为空，直接插入：

if (_Bucket_lo == _End) {
    _Bucket_lo = _Inserted;
    _Bucket_hi = _Inserted;
    continue;
}

若桶已存在，检查插入位置：

if (!_Traitsobj(_Inserted_key, _Traits::_Kfn(*_Insert_before))) {
    ++_Insert_before;
    if (_Insert_before != _Inserted) {
        _Mylist::_Scary_val::_Unchecked_splice(_Insert_before._Ptr, _Inserted._Ptr, _Next_inserted._Ptr);
    }
    _Bucket_hi = _Inserted;
    continue;
}

采用直接插入法或尾部插入法。

关键操作定位

步骤	关键操作	解释
调整桶数量	`_Ceiling_of_log_2()`	调整到 2 的幂次方
分配内存	`_Assign_grow()`	分配新桶
更新索引	`_Mask = _Buckets - 1`	设置新桶索引的位掩码
重新插入	`_Mylist::_Scary_val::_Unchecked_splice()`	将元素插入到桶中
更新桶指针	`_Bucket_lo = _Inserted`	更新桶起始位置

5 负载因子修改—>max_load_factor()

最大负载因子在扩容后通常不会会被调整。
实际负载因子会到 0.5 ~ 1.0 的区间内。
这个调整发生在 max_load_factor() 执行后，由用户手动调用才会修改，通过重新设置 _Myval2 来改变

void max_load_factor(float _Newmax) noexcept /* strengthened */ {
    _STL_ASSERT(!(_CSTD isnan) (_Newmax) && _Newmax > 0, "invalid hash load factor");
    _Max_bucket_size() = _Newmax;
}

max_load_factor() 是 unordered_map 和 unordered_set 的公开接口，用于设置最大负载因子。
_Max_bucket_size() 其实是 _Mypair._Myval2._Myval2 的封装，最终修改的正是存储负载因子的内部值。

_Max_bucket_size()
在 unordered_map 的实现中， _Max_bucket_size() 是对 _Mypair._Myval2._Myval2 的封装，最终会设置到 _Myval2，即：
在这里插入图片描述

NODISCARD const float& _Get_max_bucket_size() const noexcept {
        return _Mypair._Myval2._Myval2;
    }

修改的本质是：

_Mypair._Myval2._Myval2 = _Newmax;

如何被触发
用户手动调用：

max_load_factor(1.0f)

更新 _Myval2._Myval2 的值