Hash入门-通过线性探测解决哈希冲突

news2025/4/27 13:00:09

unordered_set

void test_unordered_set()
{
	unordered_set<int> us;
	us.insert(4);
	us.insert(2);
	us.insert(1);
	us.insert(5);
	us.insert(6);
	us.insert(2);
	us.insert(2);
	//去重
	unordered_set<int>::iterator it = us.begin();
	while (it != us.end())
	{
		cout << *it << " ";
		++it;
	}
	cout << endl;
}

unordered_set：可以做到去重+排序。

set：可以做到排序+去重。

unordered_map

void test_unordered_map()
{
	unordered_map<string, string> dict;
	dict.insert(make_pair("sort", "排序"));
	dict["string"] = "字符串";
	dict.insert(make_pair("left", "左边"));
	unordered_map<string, string>::iterator dit = dict.begin();
	while (dit != dict.end())
	{
		cout << dit->first << " : " << dit->second << endl;
		++dit;
	}
	cout << endl;
}

unordered_map按插入的顺序输出，map会先根据key排序然后再输出。

总结：

map和set和unordered_map/unordered_set的区别和联系

1.都可以实现key和key/value的搜索场景，功能和使用基本一样

2.map/set底层是红黑树实现的遍历后是有序的，增删查改的时间复杂都为O(logN)

3.unordered_map/unordered_set的底层是使用哈希表实现的，遍历出来是无序的，增删查改的时间复杂都为O(l)

4.map/set是双向迭代器，unordered_map/unordered_set是单向迭代器。

961. 在长度 2N 的数组中找出重复 N 次的元素

. - 力扣（LeetCode）. - 备战技术面试？力扣提供海量技术面试资源，帮助你高效提升编程技能,轻松拿下世界 IT 名企 Dream Offer。https://leetcode.cn/problems/n-repeated-element-in-size-2n-array/description/

int repeatedNTimes(vector<int>& nums) {
        unordered_map<int,int> countMap;
        for(auto e:nums)
            countMap[e]++;
        for(auto e:countMap)
            if(e.second==nums.size()/2)
                return e.first;
        return -1;
    }

. - 力扣（LeetCode）. - 备战技术面试？力扣提供海量技术面试资源，帮助你高效提升编程技能,轻松拿下世界 IT 名企 Dream Offer。https://leetcode.cn/problems/intersection-of-two-arrays/

vector<int> intersection(vector<int>& nums1, vector<int>& nums2) {
        unordered_set<int> s1;
        for(auto e:nums1)
            s1.insert(e);
        unordered_set<int> s2;
        for(auto e:nums2)
            s2.insert(e);
        vector<int> vRet;
        for(auto e:s1)
            if(s2.find(e)!=s2.end())
                vRet.push_back(e);
        return vRet;
    }

哈希函数

常见哈希函数

1. 直接定制法--(常用) 取关键字的某个线性函数为散列地址：Hash（Key）= A*Key + B 优点：简单、均匀缺点：需要事先知道关键字的分布情况使用场景：适合查找比较小且连续的情况面

2. 除留余数法--(常用) 设散列表中允许的地址数为m，取一个不大于m，但最接近或者等于m的质数p作为除数，按照哈希函数：Hash(key) = key% p(p将关键码转换成哈希地址

哈希冲突

在往hash数组中存11时，就会发生hash冲突。

哈希冲突

对于两个数据元素的关键字和 (i != j)，有 != ，但有：Hash( ) == Hash( )，即：不同关键字通过相同哈希哈数计算出相同的哈希地址，该种现象称为哈希冲突或哈希碰撞。

哈希冲突解决

闭散列

闭散列：也叫开放定址法，当发生哈希冲突时，如果哈希表未被装满，说明在哈希表中必然还有空位置，那么可以把key存放到冲突位置中的“下一个” 空位置中去。那如何寻找下一个空位置呢？

线性探测

线性探测：从发生冲突的位置开始，依次向后探测，直到寻找到下一个空位置为止。

插入

通过哈希函数获取待插入元素在哈希表中的位置

如果该位置中没有元素则直接插入新元素，如果该位置中有元素发生哈希冲突，使用线性探测找到下一个空位置，插入新元素

二次探测

按2次方往后找空位置。

代码实现

enum State
{
	EMPTY,
	EXITS,
	DELETE,
};

template<class T>
struct HashData
{
	T _data;
	State _state;
};
template<class K,class T>
class HashTable
{

private:
	vector<HashData> _tables;
	size_t _num;//存储的有效数据的个数
};

三个转态对应查找时的三种情况

insert

bool Insert(const T& x)
{
	KeyOfT koft;
	//计算x在表中的位置
	size_t index = koft(x) % _tables.size();
	while (_tables[index]._state == EXITS)
	{
		if (_tables[index]._data == x)
			return false;
		++index;
		if (index == _tables.size())
			index = 0;
	}
	_tables[index]._data = x;
	_tables[index]._state = EXITS;
	_num++;
}

负载因子

负载因子=表中数据/表的大小 -> 衡量哈希表满的程度
表接近满，插入数据容易冲突，冲突越多，效率越低

增容问题

增容是一次重新洗牌，重新印射的过程。

if (_tables.size() == 0 || _num * 10 / _tables.size() >= 7)
{
	//1.开新表
	//2.重新计算数据在新表中的位置
	//3.释放旧表
	vector<HashData> newtables; 
	size_t newsize = _tables.size() == 0 ? 10 : _tables.size() * 2;
	newtables.resize(newsize);
	for (size_t i = 0; i < _tables.size(); ++i)
	{
		if (_tables[i]._state == EXITS)
		{
			size_t index = koft(_tables[i]._data)%newtables.size();
			while (newtables[index]._state == EXITS)
			{
				++index;
				if (index == _tables.size())
					index = 0;
			}
			newtables[index] = _tables[i];
		}
	}
	swap(_tables, newtables);
}

find

HashData* Find(const K& key)
{
	KeyOfT koft;
	size_t index = key % _tables.size();
	while (_tables[index]._state != EMPTY)
	{
		if (koft(_tables[index]._data) == key)
		{
			if (_tables[index]._state == EXITS)
			{
				return &_tables[index];
			}
			else if (_tables[index]._state == DELETE)
			{
				return nullptr;
			}
		}
		++index;
		if (index == _tables.size())
		{
			index = 0;
		}
	}
	return nullptr;
}

erase

bool Erase(const K& key)
{
	HashData* ret = Find(key);
	if (ret)
	{
		ret->_state = DELETE;
		return true;
	}
	else
		return false;
}

开散列

开散列法又叫链地址法(开链法)，首先对关键码集合用散列函数计算散列地址，具有相同地址的关键码归于同一子集合，每一个子集合称为一个桶，各个桶中的元素通过一个单链表链接起来，各链表的头结点存储在哈希表中。

本文来自互联网用户投稿，该文观点仅代表作者本人，不代表本站立场。本站仅提供信息存储空间服务，不拥有所有权，不承担相关法律责任。如若转载，请注明出处：http://www.coloradmin.cn/o/2152909.html

如若内容造成侵权/违法违规/事实不符，请联系多彩编程网进行投诉反馈，一经查实，立即删除！