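Preface
This post takes a first look at the STL string class.
Overview:
- STL
- string
  - What it is
  - Why it exists
  - How to use it (interfaces and usage)
My knowledge is limited, so corrections are very welcome!
Background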
STL
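The STL (Standard Template Library) is a very important part of the C++ standard library. It mainly provides:
- a library of reusable components
- a software framework of data structures and algorithms
It is usually divided into six parts:
- containers (data structures)
- algorithms
- iterators
- function objects
- adapters
- allocators
Almost all of its code is written as class templates and function templates, which greatly improves code reuse.
The main versions are:
- Original version
  - HP version: developed by Alexander Stepanov and Meng Lee at HP Labs.
- Derived versions
  - SGI version: developed by Alexander Stepanov and Matt Austern at SGI. The code style is good, it is open source, and anyone may modify and sell it. Adopted by g++ on Linux.
  - P.J. version: developed by P.J. Plauger at his own three-person company. The implementation is fairly complex, not very readable, and not open source. Adopted by Microsoft's Visual Studio series.
  - RW version: developed by Rogue Wave. Adopted by C++Builder.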
string
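What it is
string is the class obtained by instantiating the class template basic_string with <char>. In essence it is a dynamically growing array of characters.
Hm? Aren't strings always made of char?
Why bother with a template that has to be instantiated with a type?
This comes down to encoding:
Encoding
What it is
An encoding maps the characters we want to store in a computer to binary values (a computer only has 0s and 1s).
We already met one when learning C: ASCII. Early computers were developed in English-speaking countries and only had to handle English, so the first question was how to represent English text.
English needs only a small set of letters and symbols, so the values 0 to 127 are enough to map every character.
For example, storing "abc" puts the binary values of 97, 98 and 99 into memory.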
#include <stdio.h>

int main()
{
    char str[] = "abc";
    // Print each char as an integer to see its ASCII code.
    printf("%d %d %d\n", str[0], str[1], str[2]);
    return 0;
}
97 98 99
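When the characters are printed normally, the computer sees that the type is char and looks the value up in the table: 97 maps to 'a', and so on.
What about other countries, though? Chinese alone has so many characters.
Unicode
Unicode, also known as the universal character set, represents the writing systems of many countries under one standard.
It has three encoding forms: UTF-8, UTF-16 and UTF-32. UTF-8 is the most widely used. It is variable-length (1 to 4 bytes per character), stays compatible with ASCII, and encodes a common Chinese character in 3 bytes.
China also defined its own national standard, GBK (Guobiao), which encodes a Chinese character in 2 bytes: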
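#include <iostream>
using namespace std;

int main()
{
    // Assumes the compiler and console use GBK, so each Chinese character takes 2 bytes.
    char str[] = "培根";
    cout << str << endl;
    cout << "size:" << sizeof(str) << endl;
    // str[2] is the first byte of the second character "根". What happens if we increment it?
    ++str[2];
    cout << str << endl;
    ++str[2];
    cout << str << endl;
    ++str[2];
    cout << str << endl;
    --str[2];
    cout << str << endl;
    --str[2];
    cout << str << endl;
    --str[2];
    cout << str << endl;
    return 0;
}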
培根
size:5
培郭
培葫
培基
培葫
培郭
培根
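As the output shows, the characters in this encoding are arranged by pronunciation (pinyin order). One scenario where this helps is cleaning up online content: characters that merely sound like a sensitive word can also be turned into "***".
At this point we can also see why string is written as a template: different encodings are instantiated with different character types.
To store a UTF-16 string, use u16string.
To store a UTF-32 string, use u32string.
To store a wide-character string, use wstring.
string stores plain char elements and is what is normally used for UTF-8 text. It is also the most commonly used of the four.
So, one possible skeleton: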
template <class T>
class basic_string
{
public:
    //...
private:
    T* _str;
    size_t _size;
    size_t _capacity;
};
basic_string<char> s_8;
basic_string<char16_t> s_16;
basic_string<char32_t> s_32;
basic_string<wchar_t> s_w;
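*A recommended site for browsing C++ documentation (unofficial, but handy)
Why it exists
In C, a string is a sequence of characters ending in \0. For convenience, the C standard library provides string functions, but these functions are separate from the string data and work through char* pointers, which does not fit OOP (object-oriented programming). The underlying memory also has to be managed by hand, and a careless access easily goes out of bounds. So C++ wraps all of this in a string class.
How to use it
Some interfaces are skipped here. It is enough to know they exist and to look them up in the documentation when you need them.
Constructor
Description | Interface |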
---|---|
default (1) | string(); |
copy (2) | string (const string& str); |
substring (3) | string (const string& str, size_t pos, size_t len = npos); |
from c-string (4) | string (const char* s); |
from sequence (5) | string (const char* s, size_t n); |
fill (6) | string (size_t n, char c); |
range (7) | template <class InputIterator> string (InputIterator first, InputIterator last); |
(1) default: constructs an empty string.
int main()
{
    string s;
    cout << s << endl << "111" << endl;
    return 0;
}
111
(2) copy: copy constructor.
int main()
{
    string s1("bacon");
    string s2(s1);
    cout << s2 << endl;
    return 0;
}
bacon
(3) substring: constructs from the substring of str that starts at pos and spans len characters.
int main()
{
    string s1("123 456 bacon", 0, 7);
    cout << s1 << endl;
    return 0;
}
123 456
(4) from c-string: constructs from a C-style string.
int main()
{
    string s("bacon");
    cout << s << endl;
    return 0;
}
bacon
(5) from sequence: constructs from the first n characters of the character sequence pointed to by s.
(6) fill: constructs a string made of n copies of c.
int main()
{
    string s(10, '!');
    cout << s << endl;
    return 0;
}
!!!!!!!!!!
(7) range: constructs from an iterator range (iterators are covered later).
int main()
{
    string s1(10, '!');
    string s2(s1.begin(), s1.end());
    cout << s2 << endl;
    return 0;
}
!!!!!!!!!!
Destructor
Releases the storage.
operator= (assignment operator overload)
Description | Interface |
---|---|
string (1) | string& operator= (const string& str); |
c-string (2) | string& operator= (const char* s); |
character (3) | string& operator= (char c); |
int main()
{
    string s1, s2, s3;
    s1 = "it's ";     // c-string (2)
    s2 = "a string."; // c-string (2)
    s3 = s1 + s2;     // string (1): assigns the result of operator+
    cout << s3 << endl;
    return 0;
}
Non-member function overloads
(1)operator+
Concatenate strings (function )
Description | Interface |
---|---|
string (1) | string operator+ (const string& lhs, const string& rhs); |
c-string (2) | string operator+ (const string& lhs, const char* rhs); |
c-string (2) | string operator+ (const char* lhs, const string& rhs); |
character (3) | string operator+ (const string& lhs, char rhs); |
character (3) | string operator+ (char lhs, const string& rhs); |
(2)relational operators
Relational operators for string (function )
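These are the comparison operators (==, !=, <, <=, >, >=). A string can be compared with another string or with a C-style string (char*), in either order. There is no overload that compares a string directly with a single char.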
(3)swap
Exchanges the values of two strings (function )
void swap (string& x, string& y);
(4)operator>>
Extract string from stream (function )
(5)operator<<
Insert string into stream (function )
(6)getline
Get line from stream into string (function )
int main()
{
    string s1;
    string s2;
    getline(cin, s1);
    getline(cin, s2);
    string s3 = s1 + s2;
    cout << s3 << endl;
    string s4 = "!!!";
    s4.swap(s3);
    cout << s3 << endl;
    cout << s4 << endl;
    return 0;
}
123 // input
abc // input
123abc
!!!
123abc
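The getline example calls s4.swap(s3). Hm? Doesn't the standard library already have a generic swap (in <algorithm> in C++98, moved to <utility> in C++11)?
It does. The C++98 version is implemented like this:
template <class T> void swap ( T& a, T& b )
{
    T c(a); a=b; b=c;
}
As you can see, it has to construct a temporary object c. For our dynamically growing string, that construction and the two assignments all involve deep copies.
So what does void swap (string& x, string& y); do instead?
The swap written for string only exchanges the members of the two strings, so the deep-copy problem simply goes away.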
Iterators
What is an iterator?
Selected member types | Description |
---|---|
iterator | a random access iterator to char (convertible to const_iterator ) |
const_iterator | a random access iterator to const char |
reverse_iterator | reverse_iterator<iterator> |
const_reverse_iterator | reverse_iterator<const_iterator> |
Underneath, a string iterator is roughly a char pointer. At this stage it is fine to think of an iterator simply as a pointer.
(1)begin
Return iterator to beginning (public member function )
iterator begin();
const_iterator begin() const;
(2)end
Return iterator to end (public member function )
iterator end();
const_iterator end() const;
(3)rbegin
Return reverse iterator to reverse beginning (public member function )
reverse_iterator rbegin();
const_reverse_iterator rbegin() const;
(4)rend
Return reverse iterator to reverse end (public member function )
reverse_iterator rend();
const_reverse_iterator rend() const;
(5)cbegin
Return const_iterator to beginning (public member function )
const_iterator cbegin() const noexcept;
(noexcept can be ignored for now; it will be covered later.)
(6)cend
Return const_iterator to end (public member function )
const_iterator cend() const noexcept;
(7)crbegin
Return const_reverse_iterator to reverse beginning (public member function )
const_reverse_iterator crbegin() const noexcept;
(8)crend
Return const_reverse_iterator to reverse end (public member function )
const_reverse_iterator crend() const noexcept;
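- r: reverse
- c: const; the returned iterator cannot be used to modify the characters
Do not mix these kinds of iterators within one traversal (for example, do not compare an iterator from rbegin() with end()).
All of these interfaces return an iterator, and the return value has to be stored in a variable of the same type. Because iterator is a member type, it must be qualified with the class scope (string::iterator).
int main()
{
    string s("it's a string.");
    string::iterator it = s.begin();
    while(it != s.end())
    {
        cout << *it << ' ';
        ++it;
    }
    cout << endl;
    return 0;
}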
i t ' s a s t r i n g .
int main()
{
    string s("it's a string.");
    string::reverse_iterator it = s.rbegin();
    while(it != s.rend())
    {
        cout << *it << ' ';
        ++it;
    }
    cout << endl;
    return 0;
}
int main()
{
    const string s("it's a string.");
    string::const_reverse_iterator it = s.crbegin();
    while(it != s.crend())
    {
        cout << *it << ' ';
        ++it;
    }
    cout << endl;
    return 0;
}
. g n i r t s a s ' t i
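In fact, the range-based for loop is built on iterators: the compiler rewrites it into code that traverses with iterators. That is also why, as mentioned before, a user-defined type needs begin and end methods to work with range-based for.
int main()
{
    string s("it's a string.");
    for(char ch : s)
    {
        cout << ch << ' ';
    }
    cout << endl;
    return 0;
}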
i t ' s a s t r i n g .
Capacity
(1)size
Return length of string (public member function )
size_t size() const;
(2)length
Return length of string (public member function )
size_t length() const;
(3)max_size
Return maximum size of string (public member function )
size_t max_size() const;
(4)resize
Resize string (public member function )
void resize (size_t n);
void resize (size_t n, char c);
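If n < size(), the size is reduced and the extra characters are removed (the capacity is not reduced).
If n > size(), the size is increased. You can pass a c to fill the added part with c; without it, the added characters are '\0'.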
(5)capacity
Return size of allocated storage (public member function )
size_t capacity() const;
(6)reserve
Request a change in capacity (public member function )
void reserve (size_t n = 0);
(7)clear
Clear string (public member function )
void clear();
(8)empty
Test if string is empty (public member function )
bool empty() const;
(9)shrink_to_fit
Shrink to fit (public member function ). Frequent use is not recommended, because shrinking has to allocate a new block elsewhere and copy the data over.
void shrink_to_fit();
size + capacity + length + max_size + clear + empty:
int main()
{
    string s = "123456";
    cout << "size: " << s.size() << endl;
    cout << "length: " << s.length() << endl;
    cout << "capacity: " << s.capacity() << endl;
    cout << "max_size: " << s.max_size() << endl;
    s.clear();
    cout << "----string cleared----" << endl;
    cout << "size: " << s.size() << endl;
    cout << "capacity: " << s.capacity() << endl;
    return 0;
}
size: 6
length: 6
capacity: 22
max_size: 18446744073709551599
----string cleared----
size: 0
capacity: 22
resize + reserve + shrink_to_fit:
int main()
{
    string s = "123456";
    cout << "size: " << s.size() << endl;
    cout << "capacity: " << s.capacity() << endl;
    s.resize(3);
    cout << "----string resized to 3----" << endl;
    cout << "size: " << s.size() << endl;
    cout << "capacity: " << s.capacity() << endl;
    s.reserve(20);
    cout << "----string reserved to 20----" << endl;
    cout << "size: " << s.size() << endl;
    cout << "capacity: " << s.capacity() << endl;
    s.shrink_to_fit();
    cout << "----string shrinked_to_fit----" << endl;
    cout << "size: " << s.size() << endl;
    cout << "capacity: " << s.capacity() << endl;
    return 0;
}
size: 6
capacity: 22
----string resized to 3----
size: 3
capacity: 22
----string reserved to 20----
size: 3
capacity: 22
----string shrinked_to_fit----
size: 3
capacity: 22
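Hm? Isn't shrink_to_fit supposed to fit the capacity to the size? Why did nothing move here, and why am I not allowed to shrink further?
Because shrink_to_fit is only a non-binding request, and the standard library implementation has its own minimum capacity for string. With the library that ships with my Xcode (libc++), short strings are stored in a small buffer inside the string object itself, so the capacity is never less than 22.
This applies to every shrinking operation: the implementation keeps its own lower limit on a string's capacity.
int main()
{
    string s(100, '?');
    cout << s.capacity() << endl;
    s.resize(10);
    s.shrink_to_fit();
    cout << s.capacity() << endl;
    return 0;
}
111
22 // size is 10, but the capacity can only shrink down to 22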
Element access
(1) operator[]
Get character of string (public member function )
char& operator[] (size_t pos);
const char& operator[] (size_t pos) const;
(2)at
Get character in string (public member function )
char& at (size_t pos);
const char& at (size_t pos) const;
(3)back
Access last character (public member function )
char& back();
const char& back() const;
(4)front
Access first character (public member function )
char& front();
const char& front() const;
int main()
{
    string s("it's a string.");
    for(size_t i = 0; i < s.size(); ++i)
    {
        // cout << s[i] << ' ';
        cout << s.at(i) << ' ';
    }
    cout << endl;
    cout << "front:" << s.front() << endl;
    cout << "back:" << s.back() << endl;
    return 0;
}
Modifiers
(1)operator+=
Append to string (public member function )
Description | Interface |
---|---|
string (1) | string& operator+= (const string& str); |
c-string (2) | string& operator+= (const char* s); |
character (3) | string& operator+= (char c); |
(2)append
Append to string (public member function )
Description | Interface |
---|---|
string (1) | string& append (const string& str); |
substring (2) | string& append (const string& str, size_t subpos, size_t sublen); |
c-string (3) | string& append (const char* s); |
buffer (4) | string& append (const char* s, size_t n); |
fill (5) | string& append (size_t n, char c); |
range (6) | template <class InputIterator> string& append (InputIterator first, InputIterator last); |
(3)push_back
Append character to string (public member function )
void push_back (char c);
(4)assign
Assign content to string (public member function )
Description | Interface |
---|---|
string (1) | string& assign (const string& str); |
substring (2) | string& assign (const string& str, size_t subpos, size_t sublen); |
c-string (3) | string& assign (const char* s); |
buffer (4) | string& assign (const char* s, size_t n); |
fill (5) | string& assign (size_t n, char c); |
range (6) | template <class InputIterator> string& assign (InputIterator first, InputIterator last); |
(5)insert
Insert into string (public member function )
Description | Interface |
---|---|
string (1) | string& insert (size_t pos, const string& str); |
substring (2) | string& insert (size_t pos, const string& str, size_t subpos, size_t sublen); |
c-string (3) | string& insert (size_t pos, const char* s); |
buffer (4) | string& insert (size_t pos, const char* s, size_t n); |
fill (5) | string& insert (size_t pos, size_t n, char c); |
fill (5) | void insert (iterator p, size_t n, char c); |
single character (6) | iterator insert (iterator p, char c); |
range (7) | template <class InputIterator> void insert (iterator p, InputIterator first, InputIterator last); |
(6)erase
Erase characters from string (public member function )
Description | Interface |
---|---|
sequence (1) | string& erase (size_t pos = 0, size_t len = npos); |
character (2) | iterator erase (iterator p); |
range (3) | iterator erase (iterator first, iterator last); |
(7)replace
Replace portion of string (public member function )
Rarely used.
(8)swap
Swap string values (public member function )
void swap (string& str);
(9)pop_back
Delete last character (public member function )
void pop_back();
+= + push_back + pop_back:
int main()
{
    string s = "Jay chou is ";
    cout << s << endl;
    s += "cool";
    cout << s << endl;
    s.push_back('!');
    cout << s << endl;
    s.pop_back();
    cout << s << endl;
    return 0;
}
Jay chou is
Jay chou is cool
Jay chou is cool!
Jay chou is cool
append + assign + insert + erase:
int main()
{
    string s = "Jay chou is ";
    cout << s << endl;
    string tmp = "very handsome";
    s.append(tmp);
    cout << s << endl;
    string s2;
    s2.assign(s);
    cout << s2 << endl; // print the string that was just assigned
    size_t pos = 0;
    s2.insert(pos, "----");
    cout << s2 << endl;
    s2.erase(pos, 7);
    cout << s2 << endl;
    return 0;
}
Jay chou is
Jay chou is very handsome
Jay chou is very handsome
----Jay chou is very handsome
chou is very handsome
String operations
(1)c_str
Get C string equivalent (public member function )
const char* c_str() const;
This interface is quite important: many interfaces, such as Linux system calls, only accept C-style strings.
(2)data
Get string data (public member function )
const char* data() const;
(3)get_allocator
Get allocator (public member function ) (covered later)
(4)copy
Copy sequence of characters from string (public member function )
(5)find
Find content in string (public member function )
Description | Interface |
---|---|
string (1) | size_t find (const string& str, size_t pos = 0) const; |
c-string (2) | size_t find (const char* s, size_t pos = 0) const; |
buffer (3) | size_t find (const char* s, size_t pos, size_t n) const; |
character (4) | size_t find (char c, size_t pos = 0) const; |
(6)rfind
Find last occurrence of content in string (public member function )
(7)find_first_of
Find character in string (public member function )
(8)find_last_of
Find character in string from the end (public member function )
(9)find_first_not_of
Find absence of character in string (public member function )
(10)find_last_not_of
Find non-matching character in string from the end (public member function )
(11)substr
Generate substring (public member function )
(12)compare
Compare strings (public member function )
c_str + find + rfind:
int main()
{
    string s = "code is beautiful.";
    size_t pos = s.find(' ');          // index of the first space
    printf("%s\n", s.c_str() + pos);   // print from that space onward
    pos = s.rfind(' ');                // index of the last space
    printf("%s\n", s.c_str() + pos);
    return 0;
}
is beautiful.
beautiful.
Member constants
npos
Maximum value for size_t (public static member constant )
int main()
{
    string s = "code is beautiful.";
    cout << string::npos << endl;
    return 0;
}
18446744073709551615
That's all for today.
This is Bacon's blog. Let's make progress together!
See you next time~