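Introduction
Some supplementary notes on garbled QString text. There are two main points: QChar and QString both store characters internally as 16-bit Unicode code units.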
QChar
A QChar corresponds to a 16-bit Unicode character.
The QChar class provides a 16-bit Unicode character.
In Qt, Unicode characters are 16-bit entities without any markup or structure. This class represents such an entity. It is lightweight, so it can be used everywhere. Most compilers treat it like an unsigned short.
QString
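QString stores QChars, and each QChar is a 16-bit (2-byte) Unicode character. A Unicode character with a code value above 65535 is therefore stored in two consecutive QChars (a surrogate pair).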
The QString class provides a Unicode character string.
QString stores a string of 16-bit QChars, where each QChar corresponds to one Unicode 4.0 character. (Unicode characters with code values above 65535 are stored using surrogate pairs, i.e., two consecutive QChars.)
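To make the surrogate-pair rule concrete, here is a minimal sketch in standard C++ rather than Qt. It assumes a std::u16string is a reasonable stand-in for QString's array of 16-bit units, and decodeUtf16 is a hypothetical helper name, not a Qt function. It shows that the number of 16-bit units can be larger than the number of characters.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Qt): decode UTF-16 code units into
// Unicode code points, combining each surrogate pair into one value.
std::u32string decodeUtf16(const std::u16string &units)
{
    std::u32string out;
    for (std::size_t i = 0; i < units.size(); ++i) {
        char32_t u = units[i];
        bool isHigh = (u >= 0xD800 && u <= 0xDBFF);
        if (isHigh && i + 1 < units.size()) {
            char32_t lo = units[i + 1];
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                // High and low surrogate together encode a value above 65535.
                out.push_back(0x10000 + ((u - 0xD800) << 10) + (lo - 0xDC00));
                ++i; // the low surrogate has been consumed
                continue;
            }
        }
        // Ordinary 16-bit character (or an unpaired surrogate, passed through).
        out.push_back(u);
    }
    return out;
}
```

For example, decodeUtf16(u"\u6768\U0001F600") receives three 16-bit units and returns two code points, 0x6768 and 0x1F600.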
// Source excerpt: QString.h
// The character buffer is an array of 16-bit ushort values.
typedef QTypedArrayData<ushort> QStringData;

class Q_CORE_EXPORT QString
{
public:
    typedef QStringData Data;
    //...
public:
    typedef Data * DataPtr;
    inline DataPtr &data_ptr() { return d; }
};
Code
The following program verifies that a Chinese string is stored in a QString using its Unicode encoding.
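#include <QtCore/QCoreApplication>
#include <QString>

int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);

    // Build the QString from a wide-string literal.
    QString str;
    str = QString::fromStdWString(L"杨奶粉");

    // Pointer to the underlying QChar array; set a breakpoint after this
    // line and inspect the memory it points to in the debugger.
    auto data = str.data();

    return a.exec();
}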
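The program above relies on QString::fromStdWString, whose input is treated as UTF-16 when wchar_t is 2 bytes (for example on Windows) and as UCS-4 when wchar_t is 4 bytes (most Unix systems). The sketch below illustrates that idea in standard C++ only. wideToUtf16 is a hypothetical helper written for this note, not Qt's actual implementation.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not Qt's implementation): turn a std::wstring into
// 16-bit units, the form in which a QString would hold the text.
std::u16string wideToUtf16(const std::wstring &ws)
{
    std::u16string out;
    for (std::size_t i = 0; i < ws.size(); ++i) {
        char32_t cp = static_cast<char32_t>(ws[i]);
        if (sizeof(wchar_t) == 2 || cp < 0x10000) {
            // 2-byte wchar_t is already UTF-16; small values fit in one unit.
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // 4-byte wchar_t holding a value above 65535: split it into a
            // high and a low surrogate.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

With the string from the program, wideToUtf16(L"\u6768\u5976\u7C89") yields the three units 0x6768, 0x5976, 0x7C89, one per character of "杨奶粉".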
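1. The Chinese string "杨奶粉" is stored in a QString.
2. Obtain the pointer to the QChar array from str.
3. Compare the QChars with the Unicode code values. Note: in the data shown in the debugger's memory view, the high-order byte sits at the higher address and the low-order byte at the lower address (little-endian).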
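The byte order described in step 3 can be reproduced without a debugger. The following is a minimal sketch in standard C++. littleEndianBytes is a hypothetical helper, and the code deliberately writes the low byte first so that the result matches a little-endian memory view regardless of the machine it runs on.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: lay out each 16-bit unit as two bytes, low byte
// first, which is how a little-endian machine stores it in memory.
std::vector<std::uint8_t> littleEndianBytes(const std::u16string &units)
{
    std::vector<std::uint8_t> bytes;
    for (char16_t u : units) {
        bytes.push_back(static_cast<std::uint8_t>(u & 0xFF)); // lower address
        bytes.push_back(static_cast<std::uint8_t>(u >> 8));   // higher address
    }
    return bytes;
}
```

For "杨奶粉" (U+6768, U+5976, U+7C89) this gives the byte sequence 68 67 76 59 89 7C, so each pair of bytes in a memory view has to be read in reverse to recover the code value.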
References:
1. "About the Unicode Character Database"
2. QString Class | Qt Core 6.5.0