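1. Why Learn the string Class?

2. The string Class in the Standard Library

3. What Are C++ STL Containers?

4. Member Functions of the string Class

4.1 - Constructors

4.2 - Assignment Operator Overloads

4.3 - Capacity Operations

4.4 - Traversal and Element Access

4.4.1 - operator[] and at

4.4.2 - Iterators

4.5 - Modifiers

4.6 - String Operations

5. Non-member Function Overloads for string

6. C++ STL Iterators in Detail

6.1 - Iterator Categories

6.2 - How Iterators Are Declared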
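1. Why Learn the string Class?

In C, a string is a sequence of characters terminated by '\0'. The C standard library provides a family of functions for working with such strings, but those functions are separate from the data they operate on, which does not fit the OOP approach. The caller also has to manage the underlying storage, and a small slip can lead to an out-of-bounds access.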

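2. The string Class in the Standard Library

A string is an object that represents a sequence of characters.

The standard string class supports such objects through an interface similar to that of a standard container of bytes, with additional features designed specifically for single-byte character strings.

The string class is an instantiation of the basic_string class template that uses char (i.e., bytes) as its character type, with the default char_traits and allocator types (see basic_string for details on the template).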

typedef basic_string<char> string;

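Note that this class handles bytes independently of the encoding used: if it holds sequences of multi-byte or variable-length characters (such as UTF-8), all members of this class (such as length or size), as well as its iterators, still operate in terms of bytes rather than actual encoded characters.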

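3. What Are C++ STL Containers?

Put simply, containers are a collection of class templates. Unlike ordinary class templates, what a container encapsulates is a way of organizing data (that is, a data structure). The STL provides three kinds of standard containers: sequence containers, sorted containers, and hash containers. The latter two are sometimes grouped together as associative containers. The table below describes each kind:

Container kind	Description
Sequence containers	Mainly vector, list, and deque (double-ended queue). They are called sequence containers because an element's position is independent of its value, i.e., the container is not sorted. An element inserted at a given position stays at that position.
Sorted containers	set, multiset, map, and multimap. Elements are kept sorted in ascending order by default, and a newly inserted element is placed at the appropriate position, which is why these containers offer very good lookup performance.
Hash containers	Four associative containers added in C++11: unordered_set, unordered_multiset, unordered_map, and unordered_multimap. Unlike sorted containers, their elements are not sorted; an element's position is determined by a hash function.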

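Note that hash containers only became part of the C++ standard library in C++11. Before that, unofficial hash_set, hash_multiset, hash_map, and hash_multimap versions were in circulation. These were non-standard, compiler-specific extensions, so their availability, header, and namespace varied from one compiler to another (for example, between VS and GCC/G++), and code using them was not portable.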

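In addition, the three kinds of containers store their elements in completely different ways, so the same operation can differ greatly in efficiency from one container to another. In practice, choose the container that fits what you need to do.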

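Finally, the string class belongs to the standard library but not to the STL. It nevertheless shares many operations with the STL containers, so we study the string class first, before starting on the STL proper.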

4. Member Functions of the string Class

4.1 - Constructors

      default (1) string();
         copy (2) string(const string& str);
    substring (3) string(const string& str, size_t pos, size_t len = npos);
from c-string (4) string(const char* s);
  from buffer (5) string(const char* s, size_t n);
         fill (6) string(size_t n, char c);

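Notes:

(1) Default constructor: constructs an empty string object, i.e., an empty string.
(2) Copy constructor: constructs a copy of str.
(3) Substring constructor: copies the portion of str that begins at character position pos and spans len characters (or runs to the end of str if str is too short or len is string::npos).
npos is a public static member constant whose value is the largest possible value of type size_t.
```
static const size_t npos = -1;
```
(4) From C-string: copies the '\0'-terminated string pointed to by s.
(5) From buffer: copies the first n characters from the character array pointed to by s.
(6) Fill constructor: fills the string object with n copies of the character c.

Example: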

#include <iostream>
#include <string>
using namespace std;

int main()
{
    string s1;
    cout << s1 << endl;  // (empty line)

    string s2("hello world");
    string s3("hello world", 5);
    cout << s2 << endl;  // hello world
    cout << s3 << endl;  // hello

    string s4(s2);
    string s5(s2, 6, 3);
    string s6(s2, 6);
    cout << s4 << endl;  // hello world
    cout << s5 << endl;  // wor
    cout << s6 << endl;  // world
    cout << string::npos << endl;  // 4294967295 (2^32 - 1) on a 32-bit build; 18446744073709551615 (2^64 - 1) on a 64-bit build

    string s7(10, '*');
    cout << s7 << endl;  // **********
    return 0;
}

4.2 - Assignment Operator Overloads

   string (1) string& operator=(const string& str);
 c-string (2) string& operator=(const char* s);
character (3) string& operator=(char c);

Example:

#include <iostream>
#include <string>
using namespace std;

int main()
{
    string s1;
    string s2("hello world");

    s1 = s2;
    cout << s1 << endl;  // hello world

    s1 = "你好，世界";
    cout << s1 << endl;  // 你好，世界

    s1 = 'A';
    cout << s1 << endl;  // A
    return 0;
}

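4.3 - Capacity Operations

length:

size_t length() const;

Returns the number of valid characters in the string.

size:

size_t size() const;

size() and length() behave identically. size() exists to keep the interface consistent with the other containers, so size() is the one generally used.

empty:

bool empty() const;

Returns true if the length of the string is 0, and false otherwise.

max_size:

size_t max_size() const;

Returns the maximum length the string can reach.

Note: the value may differ between compilers and between versions of the same compiler.

capacity:

size_t capacity() const;

Returns the size of the storage currently allocated for the string.

Note: as with max_size(), the value of capacity() may differ between compilers and versions.

clear:

void clear();

Erases the valid characters, leaving an empty string.

Note: clear() typically does not change the size of the underlying storage.

reserve:

void reserve(size_t n = 0);

If n is greater than the current capacity, the function makes the capacity grow to n (or more).

In all other cases, the call is treated as a non-binding request to shrink the capacity: the implementation is free to optimize otherwise and leave the string with a capacity greater than n.

The function has no effect on the length of the string and does not alter its contents.

resize:

void resize(size_t n);
void resize(size_t n, char c);

If n is greater than the current length, the string is extended at the end with '\0' characters (first form) or with copies of c (second form) until its length is n. The underlying capacity may change in the process.

If n is smaller than the current length, the characters beyond the first n are removed.

Example: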

#include <iostream>
#include <string>
using namespace std;

// Test environment: VS2019 (capacity and max_size values are implementation-specific)
int main()
{
    string s("hello world");
    cout << s.length() << endl;  // 11
    cout << s.size() << endl;  // 11
    cout << s.empty() << endl;  // 0
    cout << s.max_size() << endl;  // 2147483647
    cout << s.capacity() << endl;  // 15

    s.clear();
    cout << s.size() << endl;  // 0
    cout << s.capacity() << endl;  // 15

    s = "hello world";
    // n > capacity
    s.reserve(20);
    cout << s.capacity() << endl;  // 31
    // n < capacity
    s.reserve(5);
    cout << s.capacity() << endl;  // 31
    s.clear();
    s.reserve(5);
    cout << s.capacity() << endl;  // 15

    s = "hello world";
    // n > size
    s.resize(20, 'x');
    cout << s << endl;  // hello worldxxxxxxxxx
    cout << s.size() << endl;  // 20
    cout << s.capacity() << endl;  // 31
    // n < size
    s.resize(5);
    cout << s << endl;  // hello
    cout << s.size() << endl;  // 5
    cout << s.capacity() << endl;  // 31
    return 0;
}

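4.4 - Traversal and Element Access

4.4.1 - operator[] and at

operator[]:

      char& operator[](size_t pos);
const char& operator[](size_t pos) const;

at:

      char& at(size_t pos);
const char& at(size_t pos) const;

operator[]() and at() do similar jobs: both return a reference to the character at position pos in the string.

The difference shows up when pos is not a valid position: operator[]() performs no bounds check, so the behavior is undefined (a VS debug build stops with an assertion failure), whereas at() throws an out_of_range exception.

Example 1: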

#include <iostream>
#include <string>
using namespace std;

int main()
{
    string s("01234");
    for (size_t i = 0; i < s.size(); ++i)
    {
        // s[i] += 5;
        // cout << s[i];
        s.at(i) += 5;
        cout << s.at(i);
    }
    // 56789
    cout << endl;
    return 0;
}

Example 2:

#include <iostream>
#include <string>
using namespace std;

int main()
{
    try 
    {
        string s("01234");
        // s[10] = 'x';  // undefined behavior (assertion failure in a VS debug build)
        s.at(10) = 'x';
    }
    catch (const exception& e)
    {
        cout << e.what() << endl;  // invalid string position
    }
    return 0;
}

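4.4.2 - Iterators

begin:

      iterator begin();
const_iterator begin() const;

Returns an iterator pointing to the first character of the string.

end:

      iterator end();
const_iterator end() const;

Returns an iterator pointing to the position just past the last character of the string.

Example 1: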

#include <iostream>
#include <string>
using namespace std;

int main()
{
    string s1("01234");
    for (string::iterator it1 = s1.begin(); it1 != s1.end(); ++it1)
    {
        *it1 += 5;  // write
        cout << *it1;  // read
    }
    // 56789
    cout << endl;

    const string s2("01234");
    for (string::const_iterator it2 = s2.begin(); it2 != s2.end(); ++it2)
    {
        // s2 is const, so const_iterator is required; a plain iterator would widen access
        // *it2 += 5;  // writing is not allowed
        cout << *it2;  // read-only
    }
    // 01234
    cout << endl;
    return 0;
}

For convenience, the auto keyword can be used:

for (auto it1 = s1.begin(); it1 != s1.end(); ++it1) { ... }
for (auto it2 = s2.begin(); it2 != s2.end(); ++it2) { ... }

Under the hood, the range-based for loop is still implemented in terms of iterators:

#include <iostream>
#include <string>
using namespace std;

int main()
{
	string s("01234");
	for (char& e : s)  // or: for (auto& e : s) { ... }
	{
		e += 5;
		cout << e;
	}
	// 56789
	cout << endl;
	return 0;
}

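rbegin:

      reverse_iterator rbegin();
const_reverse_iterator rbegin() const;

Returns a reverse iterator pointing to the last character of the string.

rend:

      reverse_iterator rend();
const_reverse_iterator rend() const;

Returns a reverse iterator pointing to the position just before the first character of the string.

Example 2: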

#include <iostream>
#include <string>
using namespace std;

int main()
{
    string s("01234");
    for (auto it = s.rbegin(); it != s.rend(); ++it)
    {
        *it += 5;
        cout << *it;
    }
    // 98765
    cout << endl;
    return 0;
}

4.5 - Modifiers

operator+=:

   string (1) string& operator+=(const string& str);
 c-string (2) string& operator+=(const char* s);
character (3) string& operator+=(char c);

append:

   string (1) string& append(const string& str);
substring (2) string& append(const string& str, size_t subpos, size_t sublen);
 c-string (3) string& append(const char* s);
   buffer (4) string& append(const char* s, size_t n);
     fill (5) string& append(size_t n, char c);

Example 1:

#include <iostream>
#include <string>
using namespace std;

int main()
{
    string s1;
    string s2("hello world");
    s1 += s2;
    cout << s1 << endl;  // hello world
    s1 += "xxx";
    cout << s1 << endl;  // hello worldxxx
    s1 += 'y';
    cout << s1 << endl;  // hello worldxxxy

    s1.clear();
    s1.append(s2);
    cout << s1 << endl;  // hello world
    s1.append(s2, 6, 3);
    cout << s1 << endl;  // hello worldwor
    s1.append("xxx");
    cout << s1 << endl;  // hello worldworxxx
    s1.append("yyy", 1);
    cout << s1 << endl;  // hello worldworxxxy
    s1.append(2, 'z');
    cout << s1 << endl;  // hello worldworxxxyzz
    return 0;
}

insert:

           string (1) string& insert(size_t pos, const string& str);
        substring (2) string& insert(size_t pos, const string& str, size_t subpos, size_t sublen);
         c-string (3) string& insert(size_t pos, const char* s);
           buffer (4) string& insert(size_t pos, const char* s, size_t n);
             fill (5) string& insert(size_t pos, size_t n, char c);
                         void insert(iterator p, size_t n, char c);
single character (6) iterator insert(iterator p, char c);

erase:

 sequence (1)  string& erase(size_t pos = 0, size_t len = npos);
character (2) iterator erase(iterator p);
    range (3) iterator erase(iterator first, iterator last);

Example 2:

#include <iostream>
#include <string>
using namespace std;

int main()
{
    string s1("xxxxxyyyyy");
    string s2("hello world");
    // cout << s1.insert(5, s2) << endl;  // xxxxxhello worldyyyyy
    // cout << s1.insert(5, s2, 6, 5) << endl;  // xxxxxworldyyyyy
    // cout << s1.insert(5, "zzzzz") << endl;  // xxxxxzzzzzyyyyy
    // cout << s1.insert(5, "zzzzz", 3) << endl;  // xxxxxzzzyyyyy
    // cout << s1.insert(5, 5, '*') << endl;  // xxxxx*****yyyyy
    /* 
       s1.insert(s1.begin() + 5, 5, '*');
       cout << s1 << endl;  // xxxxx*****yyyyy
    */
    s1.insert(s1.begin() + 5, 'Z');
    cout << s1 << endl;  // xxxxxZyyyyy
    
    
    // cout << s1.erase(5, 1) << endl;  // xxxxxyyyyy
    /*
       s1.erase(s1.begin() + 5);
       cout << s1 << endl;  // xxxxxyyyyy
    */
    s1.erase(s1.begin() + 5, s1.end());  // [first, last)
    cout << s1 << endl;  // xxxxx
    return 0;
}

4.6 - String Operations

c_str:

const char* c_str() const;

Returns a pointer to a character array that holds the character sequence of the string object followed by a terminating '\0'.

find:

   string (1) size_t find(const string& str, size_t pos = 0) const;
 c-string (2) size_t find(const char* s, size_t pos = 0) const;
   buffer (3) size_t find(const char* s, size_t pos, size_t n) const;
character (4) size_t find(char c, size_t pos = 0) const;

rfind:

   string (1) size_t rfind(const string& str, size_t pos = npos) const;
 c-string (2) size_t rfind(const char* s, size_t pos = npos) const;
   buffer (3) size_t rfind(const char* s, size_t pos, size_t n) const;
character (4) size_t rfind(char c, size_t pos = npos) const;

substr:

string substr(size_t pos = 0, size_t len = npos) const;

Example:

#include <iostream>
#include <string>
using namespace std;

// URL: Uniform Resource Locator
// protocol: the part before "://"
// domain name: the host part that follows "://"
// URI: Uniform Resource Identifier

int main()
{
    string url("https://legacy.cplusplus.com/reference/string/string/");

    size_t begin1 = 0;
    size_t end1 = url.find("://", begin1);
    string protocol;
    if (end1 != string::npos)
    {
        protocol = url.substr(begin1, end1 - begin1);
        cout << protocol << endl;  // https
    }

    size_t begin2 = end1 + 3;
    size_t end2 = url.find('/', begin2);
    string domainName, uri;
    if (end2 != string::npos)
    {
        domainName = url.substr(begin2, end2 - begin2);
        uri = url.substr(end2 + 1);
        cout << domainName << endl;  // legacy.cplusplus.com
        cout << uri << endl;  // reference/string/string/
    }

    return 0;
}

5. Non-member Function Overloads for string

Function	Description
operator+	Concatenate strings (use sparingly, because the result is returned by value)
relational operators	Relational operators for string
swap	Exchanges the values of two strings
operator>>	Extract string from stream
operator<<	Insert string into stream
getline	Get line from stream into string

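6. C++ STL Iterators in Detail

Whether the container is a sequence container or an associative container, the most common operation is undoubtedly traversing the stored elements, and in most cases this is done with an iterator. So what exactly is an iterator?

Although containers differ in their internal structure, they all exist to store many pieces of data; in other words, each is a series of storage units holding multiple values. Operations that need to traverse the data, such as sorting, searching, and summing, should therefore look much the same across containers.

Since they are so similar, generic programming can be used to write them once as algorithms that work with every container, separating containers from algorithms. Doing so requires an intermediary that can traverse a container and read or write its data while hiding the container's internal differences, so that data reaches the algorithm through a uniform interface. Iterators arose as a natural result of this generic way of thinking.

Put simply, an iterator is very much like a pointer and can refer to elements of whatever type is required. An iterator points to an element in a container and, where needed, can be used to read or write that element.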

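6.1 - Iterator Categories

The STL defines an iterator type for every standard container, which means that different containers have different iterators, with differing capabilities.

The capability of a container's iterator determines whether that container can be used with a given STL algorithm.

By capability, the commonly used iterators fall into five categories: input iterators, output iterators, forward iterators, bidirectional iterators, and random access iterators.

Input and output iterators are a special case: they are typically used with input/output streams rather than with arrays or containers. These two categories will be covered in detail in later study.

Forward iterator:

If p is a forward iterator, then p supports ++p, p++, and *p, can be copied and assigned, and can be compared with the == and != operators.

Bidirectional iterator:

A bidirectional iterator has all the capabilities of a forward iterator. In addition, if p is a bidirectional iterator, --p and p-- are supported (each moves p back by one position).

Random access iterator:

A random access iterator has all the capabilities of a bidirectional iterator. In addition, if p is a random access iterator and i is an integer variable or constant, p also supports the following operations:
- p += i; moves p forward by i elements.
- p -= i; moves p backward by i elements.
- p + i; yields an iterator to the i-th element after p.
- p - i; yields an iterator to the i-th element before p.
- p[i]; yields a reference to the i-th element after p.
Two random access iterators p1 and p2 can also be compared with the <, >, <=, and >= operators. The expression p2 - p1 is defined as well: its value is the difference between the positions of the elements that p2 and p1 point to (equivalently, the number of elements in the range [p1, p2)).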

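The iterator category specified for each container in the C++11 standard:

Container	Iterator category
array	Random access iterator
vector	Random access iterator
deque	Random access iterator
list	Bidirectional iterator
set / multiset	Bidirectional iterator
map / multimap	Bidirectional iterator
forward_list	Forward iterator
unordered_map / unordered_multimap	Forward iterator
unordered_set / unordered_multiset	Forward iterator
stack	No iterators
queue	No iterators

Note that the container adaptors stack and queue have no iterators; instead, they provide a few member functions for accessing elements.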

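6.2 - How Iterators Are Declared

Although different containers have iterators of different categories, these iterators are declared in a fairly uniform way. There are four forms, shown in the table below:

Form	Syntax
Iterator	ContainerType::iterator name;
Const iterator	ContainerType::const_iterator name;
Reverse iterator	ContainerType::reverse_iterator name;
Const reverse iterator	ContainerType::const_reverse_iterator name;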