Using UTF-8 CPP

Published: 2023/11/27

UTF-8 CPP is a simple, small, lightweight, cross-platform library for working with UTF-8 encoded strings.

Its usage is briefly introduced below:

1. Download the latest source archive, utf8_v2_3_4.zip, from http://sourceforge.net/projects/utfcpp/ and extract it;

2. Create a VS2013 console project named TestUTF-8CPP and add the src files from utf-8cpp to the project's include directories;

3. Following http://utfcpp.sourceforge.net/, the test code is:

#include "stdafx.h"
#include <cassert>
#include <cstring>
#include <fstream>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>
#include "utf8.h"

// Checks whether the content of a file is valid UTF-8 encoded text
// without reading the whole content into memory.
bool valid_utf8_file(const char* file_name)
{
    std::ifstream ifs(file_name);
    if (!ifs)
        return false; // even better, throw here

    std::istreambuf_iterator<char> it(ifs.rdbuf());
    std::istreambuf_iterator<char> eos;
    return utf8::is_valid(it, eos);
}

// Replaces any invalid UTF-8 sequence with the Unicode replacement character.
void fix_utf8_string(std::string& str)
{
    std::string temp;
    utf8::replace_invalid(str.begin(), str.end(), std::back_inserter(temp));
    str = temp;
}

int main(int argc, char* argv[])
{
    const char* test_file_path = "../../../demo/test.txt";

    // Open the test file (contains UTF-8 encoded text)
    std::ifstream fs8(test_file_path);
    if (!fs8.is_open()) {
        std::cout << "Could not open " << test_file_path << std::endl;
        return -1;
    }

    if (!valid_utf8_file(test_file_path))
        return -1;

    unsigned line_count = 1;
    std::string line;

    // Play with all the lines in the file
    while (std::getline(fs8, line)) {
        // Check for invalid UTF-8 (for a simple yes/no check, there is also utf8::is_valid)
        std::string::iterator end_it = utf8::find_invalid(line.begin(), line.end());
        if (end_it != line.end()) {
            std::cout << "Invalid UTF-8 encoding detected at line " << line_count << "\n";
            std::cout << "This part is fine: " << std::string(line.begin(), end_it) << "\n";
        }

        // Get the line length in code points (at least for the valid part)
        int length = static_cast<int>(utf8::distance(line.begin(), end_it));
        std::cout << "Length of line " << line_count << " is " << length << "\n";

        // Convert it to UTF-16
        std::vector<unsigned short> utf16line;
        utf8::utf8to16(line.begin(), end_it, std::back_inserter(utf16line));

        // And back to UTF-8
        std::string utf8line;
        utf8::utf16to8(utf16line.begin(), utf16line.end(), std::back_inserter(utf8line));

        // Confirm that the conversion went OK
        if (utf8line != std::string(line.begin(), end_it))
            std::cout << "Error in UTF-16 conversion at line: " << line_count << "\n";

        line_count++;
    }

    // ASCII text: one UTF-16 code unit per character
    std::string str = "ABCD";
    std::vector<unsigned short> utf16result;
    utf8::utf8to16(str.begin(), str.end(), std::back_inserter(utf16result));
    size_t size1 = utf16result.size();

    // NOTE: str2 is a narrow (byte) string, so each byte is read here as one
    // UTF-16 code unit; the output is not a meaningful transcoding of the text.
    std::string str2 = "濦粿夿旴";
    std::string utf8str;
    utf8::utf16to8(str2.begin(), str2.end(), std::back_inserter(utf8str));
    size_t size2 = utf8str.length();

    // UTF-8 input whose last character (U+1D11E) needs a UTF-16 surrogate pair
    char utf8_with_surrogates[] = "\xe6\x97\xa5\xd1\x88\xf0\x9d\x84\x9e";
    std::vector<unsigned short> utf16result1;
    utf8::utf8to16(utf8_with_surrogates, utf8_with_surrogates + 9, std::back_inserter(utf16result1));
    assert(utf16result1.size() == 4);
    assert(utf16result1[2] == 0xd834);
    assert(utf16result1[3] == 0xdd1e);

    // UTF-16 input: 1 + 2 + 3 + 4 = 10 bytes once encoded as UTF-8
    unsigned short utf16string[] = { 0x41, 0x0448, 0x65e5, 0xd834, 0xdd1e };
    std::vector<unsigned char> utf8result;
    utf8::utf16to8(utf16string, utf16string + 5, std::back_inserter(utf8result));
    assert(utf8result.size() == 10);

    // String literals are const, so use const char* rather than char*
    const char* szSex = "\xe7\x94\xb7\x00";
    std::basic_string<wchar_t> sex;
    utf8::utf8to16(szSex, szSex + std::strlen(szSex), std::back_inserter(sex));
    // L"\x7537" is U+7537 (男), written as an escape so the comparison
    // does not depend on the source file encoding
    if (sex != L"\x7537") {
        std::cout << "unicode char utf16 error" << std::endl;
        return -1;
    }

    std::cout << "ok!" << std::endl;
    return 0;
}
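The numbers asserted in the test program (4 UTF-16 code units, the pair 0xd834/0xdd1e, and 10 UTF-8 bytes) follow directly from the UTF-8 and UTF-16 encoding rules. The sketch below reproduces that arithmetic using only the standard library. The helper names utf8_encoded_length, to_surrogate_pair and check_against_test_program are hypothetical and are not part of UTF-8 CPP; this is an illustration of the encoding rules, not the library's implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <utility>

// Hypothetical helper (not part of UTF-8 CPP): number of bytes needed
// to encode a single Unicode code point in UTF-8.
inline int utf8_encoded_length(std::uint32_t cp)
{
    if (cp < 0x80)    return 1; // ASCII
    if (cp < 0x800)   return 2;
    if (cp < 0x10000) return 3; // rest of the Basic Multilingual Plane
    return 4;                   // supplementary planes, up to U+10FFFF
}

// Hypothetical helper: split a code point above U+FFFF into its
// UTF-16 lead (high) and trail (low) surrogates.
inline std::pair<std::uint16_t, std::uint16_t> to_surrogate_pair(std::uint32_t cp)
{
    if (cp < 0x10000 || cp > 0x10FFFF)
        throw std::invalid_argument("code point is not in a supplementary plane");

    const std::uint32_t v = cp - 0x10000; // 20-bit value
    return std::make_pair(
        static_cast<std::uint16_t>(0xD800 + (v >> 10)),    // top 10 bits
        static_cast<std::uint16_t>(0xDC00 + (v & 0x3FF))); // low 10 bits
}

// Usage sketch: the same values the test program asserts.
inline void check_against_test_program()
{
    // U+1D11E is the last character of utf8_with_surrogates.
    const std::pair<std::uint16_t, std::uint16_t> p = to_surrogate_pair(0x1D11E);
    assert(p.first == 0xD834);
    assert(p.second == 0xDD1E);

    // U+0041, U+0448, U+65E5, U+1D11E take 1 + 2 + 3 + 4 bytes in UTF-8.
    const int total = utf8_encoded_length(0x41) + utf8_encoded_length(0x0448)
                    + utf8_encoded_length(0x65E5) + utf8_encoded_length(0x1D11E);
    assert(total == 10);
}
```

Calling check_against_test_program() from a main function runs the checks. The 9-byte input in the test holds three characters (3 + 2 + 4 bytes), and only the last one needs a surrogate pair, which is why utf8to16 yields 1 + 1 + 2 = 4 code units.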

GitHub: https://github.com/fengbingchun/UTF-8CPP_Test
