【C++】map和set的介绍和使用

1.序列式容器与关联式容器

序列式容器：

底层为线性序列的数据结构，里面存储的是元素本身。如vector/list/string/deque/forward_list。

关联式容器：

也是用来存储数据的，于序列式容器不同的是，里面存储的是<key，value>结构的键值对，在数据检索时比序列式容器效率高。如map/set/unordered_map/unordered_set。

2.树形结构的关联式容器

根据应用场景的不同，STL总共实现了两种不同结构的管理式容器：树型结构与哈希结构。树型结构的关联式容器主要有四种：map、set、multimap、multiset。这四种容器的共同点是：使用平衡搜索树(即红黑树)作为其底层结果，容器中的元素是一个有序的序列。下面一依次介绍每一个容器。

2.set

set文档介绍

set底层是二叉搜索树的K模型，map底层是二叉搜索树的KV模型，关于二叉搜索树可以查看上篇【二叉树进阶】二叉搜索树-CSDN博客

T: set 中存放元素的类型，实际在底层存储 <value, value> 的键值对。

Compare ： set 中元素默认按照小于来比较。

Alloc ： set 中元素空间的管理方式，使用 STL 提供的空间配置器管理。

简单使用：
void test_set1()
{
	set<int> s;
	s.insert(1);
	s.insert(3);
	s.insert(4);
	s.insert(4);
	s.insert(5);
	//迭代器遍历
	set<int>::iterator it = s.begin();
	while (it != s.end())
	{
		cout << *it << " ";
		++it;
	}
	cout << endl;
	//范围for遍历
	for (auto& e : s)
	{
		cout << e << " ";
	}
	cout << endl;
    //拷贝构造
	set<int> copy(s);
	for (auto& e : copy)
	{
		cout << e << " ";
	}
}
可以看到，set具有排序+去重的效果，set底层就是搜索树，中序是有序的。

按从小到大顺序排序。

删除操作：
//1、使用迭代器方式删除
set<int>::iterator pos = s.find(40);
if (pos != s.end())
{
	s.erase(pos);
}
//2、直接删除
s.erase(40);
//3、使用算法里的find删除
set<int>::iterator pos = find(s.begin(),s.end(),40);
s.erase(pos);
说明：

使用迭代器方式删除，迭代器的位置必须是有效位置，否则，如果不检查，就会报错。

直接删除，值存在就删除，不存在就不删。

使用set自己的find和算法里的find有什么区别？

效率的问题。算法里的find效率是O(N)，set底层是搜索树，find效率为O(log_2 N)。

算法里的find是用函数模板实现的，目的是让任何类型的迭代器都可以使用。

2.1set特点

1、set是K模型，主要用于查找“在不在”，特点就是快，因为底层是搜索树。

例如：

假设要把中国所有人的身份信息放进set中，查找一个人，只需要查找31次就可以了。

2^10*2^10*2^10~=2^30==10亿，2^31==20亿。

2、增、删、查时间复杂度为O(log_2 N)。

3、修改：不允许修改，因为修改就会破坏搜索树的结构。

2.2set应用举例

set为K模型，主要应用查找“在不在”的问题，这里举一个“简易字典”的例子。

在项目工程中，一般要求不展开std。（此示例下为没有展开std的写法）

简易字典：
void test_map2()
{
	std::map<std::string, std::string> dict;
	dict.insert(pair<std::string, std::string>("left", "左边"));
	dict.insert(pair<std::string, std::string>("sort", "排序"));
	dict.insert(pair<std::string, std::string>("string", "排序"));
	std::map < std::string, std::string>::iterator it = dict.begin();
	while (it != dict.end())
	{
		cout << it->first << ":" << it->second << endl;
		++it;
	}
	cout << endl;
}
按key的ASCII从小到大排序（走的是二叉树的中序为有序）。

3.map

map文档介绍

key: 键值对中 key 的类型。

T ：键值对中 value 的类型。

Compare: 比较器的类型， map 中的元素是按照 key 来比较的，缺省情况下按照小于来比较，一般情况下( 内置类型元素 ) 该参数不需要传递，如果无法比较时 ( 自定义类型 ) ，需要用户自己显式传递比较规则( 一般情况下按照函数指针或者仿函数来传递 )。

Alloc ：通过空间配置器来申请底层空间，不需要用户传递，除非用户不想使用标准库提供的

空间配置器。

简单使用：

这样插入会报错。来看一下insert的说明。

这里insert的参数是value_type& val，value_type就是pair。

再继续探索insert之前，就引入了键值对这一概念。

3.1键值对pair

用来表示具有一一对应关系的结构，一般只包含两个成员变量key和value，key代表键值，value表示与key对应的信息。

SGI-STL中关于键值对的定义：

为什么map的insert参数要传pair呢？与它的遍历有关。
	map<int, int> m;
	m.insert(pair<int,int>(1, 1));
	m.insert(pair<int,int>(2, 2));
	m.insert(pair<int,int>(3, 3));
	m.insert(pair<int,int>(4, 4));
	//遍历
	map<int, int>::iterator it = m.begin();
	while (it != m.end())
	{
		cout << *it << " ";
        ++it;
	}
	cout << endl;
这种方式会报错，map中存有两个值，*it会返回两个值，C++是不允许一次返回两个值的，要返回两个值，把它们构成一个结构返回可以。

pair就相当于一个结构。*it返回的是一个pair，是一个key,value的结构体。

上述会报错的原因也就是*it是一个pair，自定义类型没有重载，不支持输出。
//正确遍历方式
while (it != m.end())
{
	cout << (*it).first << ":" << (*it).second << endl;
    //cout << it->first << ":" << it->second << endl;
	++it;
}
说明：

operator* 返回值是节点中值的引用，

operator-> 返回值是节点中值的指针，也就是pair<k,v>指针，这里"it->first"本应该是"it->->first"，编译器为了可读性省略了一个"->"。

insert参数也可以用make_pair。

3.2make_pair

void test_map1()
{
	map<int, int> m;
	pair<int, int> a(0,0);
	m.insert(a);
	m.insert(pair<int,int>(1, 1));
	m.insert(pair<int,int>(2, 2));
	m.insert(pair<int,int>(3, 3));//pair构造函数，构造一个匿名对象
	m.insert(make_pair(5, 5));//函数模板构造一个pair对象
	//遍历
	map<int, int>::iterator it = m.begin();
	while (it != m.end())
	{
		cout << (*it).first << ":" << (*it).second << endl;
		++it;
	}
}

日常更喜欢用make_pair，因为不用声明模板参数，会自动推演。

3.3map应用举例

map为KV模型，根据Key找Value。这里举一个“查找水果次数”的例子。

查找水果次数：

示例方法一：

void test_map3()
{
	string strs[] = { "西瓜","苹果","西瓜","西瓜","西瓜","西瓜","苹果","樱桃" };
	map<string, int> countMap;
	for (auto& str : strs)
	{
		map<string, int>::iterator ret = countMap.find(str);
		if (ret != countMap.end())
		{
			ret->second++;
		}
		else
		{
			countMap.insert(make_pair(str, 1));
		}
	}
	for (auto& e : countMap)
	{
		cout << e.first << ":" << e.second << endl;
	}
}

3.4map的insert

依据map的应用举例，我们还要对map做进一步探索。

返回值是一个pair。

解释：

如果插入的值已经存在，则插入失败，iterator指向已经存在的元素，bool为false，

如果插入的值不存在，则插入，iterator指向新插入的元素，bool为true。

示例方法二：
void test_map3()
{
	string strs[] = { "西瓜","苹果","西瓜","西瓜","西瓜","西瓜","苹果","樱桃" };
	map<string, int> countMap;
	for (auto& str : strs)
	{
        //如果水果不在map中，直接插入。
        //如果水果在map中，通过返回值拿到水果所在节点的迭代器，++次数。
		pair<map<string, int>::iterator, int> ret = countMap.insert(make_pair(str, 1));
		if (ret.second == false)
		{
			ret.first->second++;
		}
	}
	for (auto& e : countMap)
	{
		cout << e.first << ":" << e.second << endl;
	}
}
这里，insert充当了两个作用，一个作用是查找，一个作用是插入。

3.5map的operator[]

这里operator[]是通过调用insert来实现的。

图文分析：

operator[]的作用就是给一个key，去查找key对应的value。

为什么这里不用find实现呢？

原因：假设用find，如果map中没有这个key，如何返回？

可以用抛异常，这里先不做讲解。

说明：

这里用insert：

1、如果key不在map中，则插入pair<key_type,mapped_type()>，mapped_type()为缺省值。（调用mapped_type类型的默认构造函数构造一个对象，若是int则为0，若是string则为空。），再返回key所在节点映射对象（mapped_type)的引用。

2、如果key在map中，则插入失败，返回key所在节点映射对象（mapped_type)的引用。

所以map中的operator[]有三层作用：

1、插入

2、查找key对应的映射对象

3、修改key对应的映射对象

示例方法三：
void test_map3()
{
	string strs[] = { "西瓜","苹果","西瓜","西瓜","西瓜","西瓜","苹果","樱桃" };
	map<string, int> countMap;
	//1、如果水果在map中，则operator[]会插入pair<str,0>,返回映射对象（次数）的引用，对它进行++。
	//2、如果水果不在map中，则operator[]会返回映射对象（次数）的引用，对它进行++。
	for (auto& str : strs)
	{
		countMap[str]++;
	}
	for (auto& e : countMap)
	{
		cout << e.first << ":" << e.second << endl;
	}
	countMap["香蕉"];//插入
	countMap["香蕉"] = 1;//修改
	cout << countMap["香蕉"] << endl;//查找
	countMap["猕猴桃"] = 3;//插入+修改
	for (auto& e : countMap)
	{
		cout << e.first << ":" << e.second << endl;
	}
}
map中，一般使用operator[]去完成：

1、插入+修改

2、修改

一般不会用他去查找，因为如果key不存在，会插入数据。

关于map的相关操作使用的接口：

1、增 insert+operator[]

2、删 erase

3、查 find

4、改 operator[]

5、遍历 iterator+范围for

遍历出来的数据是按key来排序的，因为底层是搜索树，遍历走的是中序。

4.multiset和multimap

multiset文档介绍

multimap文档介绍

multiset和set的区别就在于multiset允许键值冗余。
void test_multi()
{
	multiset<int> s;
	s.insert(1);
	s.insert(2);
	s.insert(2);
	s.insert(2);
	s.insert(4);
	s.insert(3);
	for (auto& e : s)
	{
		cout << e << endl;
	}
}
查找相同元素查找的是按顺序排的第一个。

multimap和map的区别也是一样允许键值冗余。map中的key是唯一的，而multimap中key是可以重复的。

但是multimap没有operator[]，因为对于相同的key，去查找value，不知道要找的是哪一个key对应的value，由于键值冗余，允许插入相同的，multimap的insert返回值也没有pair了。

关于multiset和multimap的其他接口都和set与map的接口相同。

5.在OJ中的使用

5.总结

此篇主要对set和map、multiset和multimap做了简单的理解和相关应用的举例。

关于其他的接口比较简单，由于和STL中vector、list、string等接口类似，这里没有做具体讲解，只做了特别的几个接口的说明。

遇事不决，一定要查文档。

关于set和map更深层次的探索，请看下篇。