What are TCHAR, WCHAR, LPSTR, LPWSTR, LPCTSTR (etc.)?

Published: 2024/4/11

http://www.codeproject.com/Articles/76252/What-are-TCHAR-WCHAR-LPSTR-LPWSTR-LPCTSTR-etc

Many C++ Windows programmers get confused over what bizarre identifiers like TCHAR and LPCTSTR mean. In this article, I will do my best to clear the fog.

In general, a character can be represented in 1 byte or 2 bytes. Let's say a 1-byte character is an ANSI character; all English characters can be represented in this encoding. And let's say a 2-byte character is Unicode, which can represent the languages of the world.

The Visual C++ compiler supports char and wchar_t as native data types for ANSI and Unicode characters, respectively. Unicode has a more precise definition than this, but for the purpose of understanding, treat it as the two-byte character that Windows uses for multiple-language support.

There is more to Unicode than the 2-byte representation. Microsoft Windows uses the UTF-16 encoding, in which most characters take one 16-bit unit and the rest take two.

What if you want your C/C++ code to be independent of character encoding/mode used?

Suggestion: Use generic data types and names to represent characters and strings.

For example, instead of replacing:

char cResponse; // 'Y' or 'N'
char sUsername[64]; // str* functions

with

wchar_t cResponse; // 'Y' or 'N'
wchar_t sUsername[64]; // wcs* functions

To support multiple languages (i.e., Unicode) in your program, you can simply code it in a more generic manner:

#include <tchar.h> // Implicit or explicit include
TCHAR cResponse; // 'Y' or 'N'
TCHAR sUsername[64]; // _tcs* functions

The Character Set project setting on the General page determines which character set is used for compilation (General -> Character Set).

This way, when your project is being compiled as Unicode, the TCHAR would translate to wchar_t. If it is being compiled as ANSI/MBCS, it would be translated to char. You are free to use char and wchar_t, and project settings will not affect any direct use of these keywords.

TCHAR is defined as:

#ifdef _UNICODE
typedef wchar_t TCHAR;
#else
typedef char TCHAR;
#endif

The macro _UNICODE is defined when you set Character Set to "Use Unicode Character Set", and therefore TCHAR would mean wchar_t. When Character Set is set to "Use Multi-Byte Character Set", TCHAR would mean char.

Likewise, to support multiple character sets from a single code base, and possibly multiple languages, use the generic functions (macros). Instead of strcpy, strlen, strcat (including the secure versions suffixed with _s), or wcscpy, wcslen, wcscat (including the secure versions), you are better off using the _tcscpy, _tcslen, _tcscat functions.

As you know strlen is prototyped as:

size_t strlen(const char*);

And, wcslen is prototyped as:

size_t wcslen(const wchar_t* );

You are better off using _tcslen, which is logically prototyped as:

size_t _tcslen(const TCHAR* );

WC is for Wide Character. Therefore, wcs stands for wide-character string. In the same way, _tcs means _T character string. And you know _T may be char or wchar_t, logically.

But, in reality, _tcslen (and other _tcs functions) are actually not functions, but macros. They are defined simply as:

#ifdef _UNICODE
#define _tcslen wcslen
#else
#define _tcslen strlen
#endif

Refer to TCHAR.H to look up more macro definitions like this.
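The same pattern can be imitated outside TCHAR.H. What follows is a minimal sketch, not the real header: DEMO_UNICODE, DemoTChar, demo_tcslen and LengthOf are hypothetical names, chosen so that they cannot collide with the real _UNICODE, TCHAR and _tcslen.

```cpp
#include <cstddef>
#include <cstring>   // std::strlen
#include <cwchar>    // std::wcslen

// Hypothetical switch standing in for _UNICODE.
// Uncomment it (or pass -DDEMO_UNICODE) to get the wide build.
// #define DEMO_UNICODE

#ifdef DEMO_UNICODE
typedef wchar_t DemoTChar;            // stand-in for TCHAR
#define demo_tcslen std::wcslen       // stand-in for _tcslen
#else
typedef char DemoTChar;
#define demo_tcslen std::strlen
#endif

// Written once, compiled for either character type: the typedef picks the
// parameter type and the macro picks the matching length function.
inline std::size_t LengthOf(const DemoTChar* text) {
    return demo_tcslen(text);
}
```

In the narrow build, LengthOf("Saturn") is a call to strlen; in the wide build the same source line needs a wide literal and ends up in wcslen.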

You might ask why they are defined as macros and not implemented as functions. The reason is simple: a library or DLL may export only a single function with a given name and prototype (ignore the overloading concept of C++). For instance, when you export a function as:

void _TPrintChar(char);

How, then, is the client supposed to call it as:

void _TPrintChar(wchar_t);

_TPrintChar cannot be magically converted into a function taking a 2-byte character. There have to be two separate functions:

void PrintCharA(char); // A = ANSI
void PrintCharW(wchar_t); // W = Wide character

And a simple macro, as defined below, would hide the difference:

#ifdef _UNICODE
#define _TPrintChar PrintCharW
#else
#define _TPrintChar PrintCharA
#endif

The client would simply call it as:

TCHAR cChar;
_TPrintChar(cChar);

Note that both TCHAR and _TPrintChar would map to either Unicode or ANSI, and therefore cChar and the argument to function would be either char or wchar_t.
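As a rough, portable illustration of this A/W arrangement, the sketch below pairs two separately named functions with a macro that selects one of them. All names here (DescribeCharA, DescribeCharW, DescribeChar, DemoTChar, DEMO_UNICODE) are hypothetical and exist only for the example.

```cpp
#include <string>

// Two separately named functions, as a DLL would have to export them.
inline std::string DescribeCharA(char)    { return "A (char)"; }
inline std::string DescribeCharW(wchar_t) { return "W (wchar_t)"; }

// Hypothetical switch standing in for _UNICODE.
#ifdef DEMO_UNICODE
typedef wchar_t DemoTChar;
#define DescribeChar DescribeCharW
#else
typedef char DemoTChar;
#define DescribeChar DescribeCharA
#endif

// Client code names neither variant; the preprocessor rewrites the call
// before the compiler ever sees it.
inline std::string DescribeDefault() {
    DemoTChar c = 'Y';
    return DescribeChar(c);
}
```

Both variants stay callable by their full names whichever way the switch is set, which is also true of the Windows functions.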

Macros avoid these complications and allow us to use either the ANSI or the Unicode function for characters and strings. Most of the Windows functions that take a string or a character are implemented this way, and for the programmer's convenience, only one name (a macro!) is needed. SetWindowText is one example:

// WinUser.H
#ifdef UNICODE
#define SetWindowText SetWindowTextW
#else
#define SetWindowText SetWindowTextA
#endif // !UNICODE

There are very few functions that do not have macros and are available only with the W or A suffix. One example is ReadDirectoryChangesW, which doesn't have an ANSI equivalent.


You all know that we use double quotation marks to represent strings. A string represented in this manner is an ANSI string, with 1 byte per character. Example:

"This is ANSI String. Each letter takes 1 byte."

The string text given above is not Unicode, and would not qualify for multi-language support. To represent a Unicode string, you need to use the prefix L. An example:

L"This is Unicode string. Each letter would take 2 bytes, including spaces."

Note the L at the beginning of the string, which makes it a Unicode string. All characters (I repeat, all characters) take two bytes, including all English letters, spaces, digits, and the null character. Therefore, the size of a Unicode string is always a multiple of 2 bytes. A Unicode string of 7 characters needs 14 bytes, and so on. A Unicode string taking 15 bytes, for example, would not be valid in any context.

In general, the size of a string is a multiple of sizeof(TCHAR) bytes!
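The byte arithmetic can be checked at compile time. This is a small sketch with hypothetical helper names (LiteralBytes, LiteralChars); it deliberately multiplies by sizeof(wchar_t) instead of hard-coding 2, because wchar_t is 2 bytes with Visual C++ on Windows but 4 bytes on most Unix-like compilers.

```cpp
#include <cstddef>

// Size in bytes of a string literal (or any character array), including the
// terminating null: element count times the size of one element.
template <typename CharT, std::size_t N>
constexpr std::size_t LiteralBytes(const CharT (&)[N]) {
    return N * sizeof(CharT);
}

// Number of characters, excluding the terminating null.
template <typename CharT, std::size_t N>
constexpr std::size_t LiteralChars(const CharT (&)[N]) {
    return N - 1;
}

// "Saturn" is 6 letters plus the null character.
static_assert(LiteralBytes("Saturn") == 7, "narrow: one byte per element");
static_assert(LiteralBytes(L"Saturn") == 7 * sizeof(wchar_t),
              "wide: sizeof(wchar_t) bytes per element");
static_assert(LiteralChars(L"Saturn") == 6, "character count is unchanged");
```

With a 2-byte wchar_t, the second assertion is the 14 bytes mentioned in the text.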

When you need to express hard-coded string, you can use:

"ANSI String"; // ANSI
L"Unicode String"; // Unicode
_T("Either string, depending on compilation"); // ANSI or Unicode
// or use TEXT macro, if you need more readability

The non-prefixed string is ANSI string, the L prefixed string is Unicode, and string specified in _T or TEXT would be either, depending on compilation. Again, _T and TEXT are nothing but macros, and are defined as:

// SIMPLIFIED
#ifdef _UNICODE
#define _T(c) L##c
#define TEXT(c) L##c
#else
#define _T(c) c
#define TEXT(c) c
#endif

The ## symbol is the token-pasting operator, which turns _T("Unicode") into L"Unicode" when _UNICODE is defined; the string passed is the argument to the macro. If _UNICODE is not defined, _T("Unicode") simply means "Unicode". The token-pasting operator exists in the C language as well, and is not specific to VC++ or to character encoding.

Note that these macros can be used for strings as well as characters. _T('R') turns into L'R' or plain 'R'; the former is a Unicode character, the latter an ANSI character.
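Token pasting is easy to try in isolation. The sketch below uses a hypothetical macro, DEMO_WIDE, that behaves like the Unicode flavour of _T; the constants and the helper function are likewise made up for the example.

```cpp
#include <cstddef>

// Hypothetical stand-in for the Unicode flavour of _T: paste an L in front
// of whatever token is passed in.
#define DEMO_WIDE(x) L##x

// DEMO_WIDE("Saturn") becomes L"Saturn"; DEMO_WIDE('R') becomes L'R'.
static const wchar_t kPlanet[] = DEMO_WIDE("Saturn");
static const wchar_t kInitial  = DEMO_WIDE('R');

// Pasting works on tokens, not on values: DEMO_WIDE(str) would become the
// single identifier Lstr, so the macro cannot convert a variable.
inline std::size_t PlanetUnits() {
    return sizeof(kPlanet) / sizeof(kPlanet[0]);   // elements, null included
}
```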

No, you cannot use these macros to convert variables (string or character) into Unicode/non-Unicode text. Following is not valid:

char c = 'C';
char str[16] = "CodeProject";
_T(c);
_T(str);

The last two lines would compile successfully in an ANSI (Multi-Byte) build, since _T(x) would simply be x, and therefore _T(c) and _T(str) would come out to be c and str, respectively. But when you build with the Unicode character set, they fail to compile:

error C2065: 'Lc' : undeclared identifier
error C2065: 'Lstr' : undeclared identifier

I would not like to insult your intelligence by describing why and what those errors are.

There exists a set of conversion routines to convert MBCS to Unicode and vice versa, which I will explain soon.

It is important to note that almost all functions that take a string (or a character), primarily in the Windows API, have a generalized prototype in MSDN and elsewhere. The function SetWindowTextA/W, for instance, is documented as:

BOOL SetWindowText(HWND, const TCHAR*);

But, as you know, SetWindowText is just a macro, and depending on your build settings, it means either of the following:

BOOL SetWindowTextA(HWND, const char*);
BOOL SetWindowTextW(HWND, const wchar_t*);

Therefore, don't be puzzled if the following call fails to get the address of this function!

HMODULE hDLLHandle;
FARPROC pFuncPtr;
hDLLHandle = LoadLibrary(_T("user32.dll"));
pFuncPtr = GetProcAddress(hDLLHandle, "SetWindowText");
// pFuncPtr will be null, since no function named SetWindowText exists!

From User32.DLL, the two functions SetWindowTextA and SetWindowTextW are exported, not the function with generalized name.

Interestingly, the .NET Framework is smart enough to locate the function in the DLL by its generalized name:

[DllImport("user32.dll")]
public static extern int SetWindowText(IntPtr hWnd, string lpString);

No rocket science, just a bunch of ifs and elses around GetProcAddress!

All of the functions that have ANSI and Unicode versions have their actual implementation only in the Unicode version. That means that when you call SetWindowTextA from your code, passing an ANSI string, it converts the ANSI string to Unicode text and then calls SetWindowTextW. The actual work (setting the window text/title/caption) is performed by the Unicode version only!

Take another example, which retrieves the window text using GetWindowText. You call GetWindowTextA, passing an ANSI buffer as the target buffer. GetWindowTextA would first call GetWindowTextW, probably allocating a Unicode string (a wchar_t array) for it. Then it would convert that Unicode text, for you, into an ANSI string.
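The forwarding arrangement can be sketched in portable C++. Everything below is hypothetical (g_title, SetTitleW, SetTitleA), and the conversion is a deliberate simplification: it assumes plain ASCII input and widens byte by byte, whereas Windows converts through the active code page (the job that MultiByteToWideChar does).

```cpp
#include <string>

// Hypothetical storage standing in for the window's title.
static std::wstring g_title;

// The wide version does the real work.
inline bool SetTitleW(const wchar_t* text) {
    g_title = text;
    return true;
}

// The narrow version only converts and forwards.
// Assumption: the input is plain ASCII, so each byte widens to one wchar_t.
inline bool SetTitleA(const char* text) {
    std::wstring wide;
    for (const char* p = text; *p != '\0'; ++p) {
        wide.push_back(static_cast<wchar_t>(static_cast<unsigned char>(*p)));
    }
    return SetTitleW(wide.c_str());
}
```

Each call through the narrow wrapper pays for a temporary wide copy, which is the overhead the text is pointing at.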

This ANSI to Unicode (and vice versa) conversion is not limited to GUI functions; it applies to the entire set of Windows API functions that take strings and have two variants. A few examples:

  • CreateProcess
  • GetUserName
  • OpenDesktop
  • DeleteFile
  • etc

It is therefore very much recommended to call the Unicode version directly. In turn, it means you should always target Unicode builds, and not ANSI builds just because you have been accustomed to using ANSI strings for years. Yes, you may still save and retrieve ANSI strings, for example in a file, or send them as chat messages in your messenger application. The conversion routines exist for such needs.

Note: There exists another typedef: WCHAR, which is equivalent to wchar_t.


The TCHAR type is for a single character. You can definitely declare an array of TCHAR. What if you would like to express a character pointer, or a const character pointer? Which one of the following?

// ANSI characters
foo_ansi(char*);
foo_ansi(const char*);
/*const*/ char* pString;
// Unicode/wide-string
foo_uni(WCHAR*);
foo_uni(const WCHAR*);
/*const*/ WCHAR* pString;
// Independent
foo_char(TCHAR*);
foo_char(const TCHAR*);
/*const*/ TCHAR* pString;

After reading about TCHAR, you would definitely select the last one as your choice. But there are better alternatives available to represent strings. For that, you just need to include Windows.h. Note: Windows.h itself defines TCHAR, TEXT and the pointer type-names described below, so TCHAR.H is needed only for _T and the _tcs* macros.

First, revisit old string functions for better understanding. You know strlen:

size_t strlen(const char*);

Which may be represented as:

size_t strlen(LPCSTR);

Where symbol LPCSTR is typedef'ed as:

// Simplified
typedef const char* LPCSTR;

The meaning goes like:

  • LP - Long Pointer
  • C - Constant
  • STR - String

Essentially, LPCSTR would mean (Long) Pointer to a Constant String.

Let's represent strcpy using new style type-names:

LPSTR strcpy(LPSTR szTarget, LPCSTR szSource);

The type of szTarget is LPSTR, without C in the type-name. It is defined as:

typedef char* LPSTR;

Note that szSource is LPCSTR, since the strcpy function will not modify the source buffer, hence the const attribute. The return type is a non-constant string: LPSTR.

Alright, these str-functions are for ANSI string manipulation. But we want routines for 2-byte Unicode strings. For that, the equivalent wide-character functions are provided. For example, to calculate the length of a wide-character (Unicode) string, you would use wcslen:

size_t nLength;
nLength = wcslen(L"Unicode");

The prototype of wcslen is:

size_t wcslen(const wchar_t* szString); // Or WCHAR*

And that can be represented as:

size_t wcslen(LPCWSTR szString);

Where the symbol LPCWSTR is defined as:

typedef const WCHAR* LPCWSTR; // const wchar_t*

Which can be broken down as:

  • LP - Pointer
  • C - Constant
  • WSTR - Wide character String

Similarly, the strcpy equivalent for Unicode strings is wcscpy:

wchar_t* wcscpy(wchar_t* szTarget, const wchar_t* szSource);

Which can be represented as:

LPWSTR wcscpy(LPWSTR szTarget, LPCWSTR szSource);

Where the target is non-constant wide-string (LPWSTR), and source is constant-wide-string.

There exists a set of equivalent wcs-functions for the str-functions. The str-functions are used for plain ANSI strings, and the wcs-functions for Unicode strings.

I have already advised using the native Unicode functions instead of the ANSI-only or TCHAR-synthesized functions. The reason was simple: your application should be Unicode only, and you should not even care about code portability for ANSI builds. But for the sake of completeness, I am mentioning these generic mappings.

To calculate length of string, you may use _tcslen function (a macro). In general, it is prototyped as:

size_t _tcslen(const TCHAR* szString);

Or, as:

size_t _tcslen(LPCTSTR szString);

Where the type-name LPCTSTR can be broken down as:

  • LP - Pointer
  • C - Constant
  • T - TCHAR
  • STR - String

Depending on the project settings, LPCTSTR would be mapped to either LPCSTR (ANSI) or LPCWSTR (Unicode).

Note: strlen, wcslen or _tcslen will return number of characters in string, not the number of bytes.

The generalized string-copy routine _tcscpy is defined as:

TCHAR* _tcscpy(TCHAR* pTarget, const TCHAR* pSource);

Or, in more generalized form, as:

LPTSTR _tcscpy(LPTSTR pTarget, LPCTSTR pSource);

You can deduce the meaning of LPTSTR!
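To see the whole naming scheme side by side, here is a sketch that rebuilds the family with local typedefs and checks a few of them. The Demo prefix and the DEMO_UNICODE switch are hypothetical, used so that the sketch compiles without Windows.h; the real definitions live in the Windows headers.

```cpp
#include <type_traits>

// Hypothetical stand-ins for CHAR, WCHAR and TCHAR.
typedef char     DemoCHAR;
typedef wchar_t  DemoWCHAR;
#ifdef DEMO_UNICODE
typedef DemoWCHAR DemoTCHAR;
#else
typedef DemoCHAR  DemoTCHAR;
#endif

// LP = pointer, C = constant, W = wide, T = TCHAR, STR = string.
typedef DemoCHAR*         DemoLPSTR;
typedef const DemoCHAR*   DemoLPCSTR;
typedef DemoWCHAR*        DemoLPWSTR;
typedef const DemoWCHAR*  DemoLPCWSTR;
typedef DemoTCHAR*        DemoLPTSTR;
typedef const DemoTCHAR*  DemoLPCTSTR;

static_assert(std::is_same<DemoLPCSTR, const char*>::value,
              "LPCSTR: pointer to a constant narrow string");
static_assert(std::is_same<DemoLPWSTR, wchar_t*>::value,
              "LPWSTR: pointer to a modifiable wide string");
static_assert(std::is_same<DemoLPCTSTR, const DemoTCHAR*>::value,
              "LPCTSTR: follows whatever TCHAR currently is");
```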

Usage Examples

First, some broken code:

int main()
{
TCHAR name[] = "Saturn";
int nLen; // Or size_t
nLen = strlen(name);
}

On an ANSI build, this code will compile successfully, since TCHAR would be char, and hence name would be an array of char. Calling strlen on the name variable would also work flawlessly.

Alright. Let's compile the same code with UNICODE/_UNICODE defined (i.e. "Use Unicode Character Set" in project settings). Now the compiler reports a set of errors:

  • error C2440: 'initializing' : cannot convert from 'const char [7]' to 'TCHAR []'
  • error C2664: 'strlen' : cannot convert parameter 1 from 'TCHAR []' to 'const char *'

And programmers then start making mistakes by "correcting" the first error this way:

TCHAR name[] = (TCHAR*)"Saturn";

Which will not pacify the compiler, since the conversion from TCHAR* to TCHAR[7] is not possible. The same kind of error also appears when a native ANSI string is passed to a Unicode function:

nLen = wcslen("Saturn"); // ERROR: cannot convert parameter 1 from 'const char [7]' to 'const wchar_t *'

Unfortunately (or fortunately), this error can be incorrectly "corrected" by a simple C-style typecast:

nLen = wcslen((const wchar_t*)"Saturn");

And you'd think you've attained one more experience level in pointers! You are wrong: the code gives an incorrect result, and in most cases simply causes an Access Violation. Typecasting this way is like passing a float variable where a structure of 80 bytes is expected (logically).

The string "Saturn" is a sequence of 7 bytes:

'S' (83)
'a' (97)
't' (116)
'u' (117)
'r' (114)
'n' (110)
'\0' (0)

But when you pass the same set of bytes to wcslen, it treats each pair of bytes as a single character. On a little-endian machine, the first two bytes [83, 97] are therefore read as one character with the value 24915 (97<<8 | 83), which is the Unicode character U+6153, a CJK ideograph. The next character is formed from [116, 117], and so on.
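The arithmetic is easy to reproduce. The helper below is a hypothetical name (CodeUnitLE) and models only the little-endian byte order used by x86 and x64 Windows; it is a sketch of the calculation, not of what wcslen does internally.

```cpp
#include <cstdint>

// Combine two consecutive bytes the way a little-endian machine reads one
// 16-bit unit: the first byte is the low half, the second the high half.
inline std::uint16_t CodeUnitLE(unsigned char first, unsigned char second) {
    return static_cast<std::uint16_t>(first | (second << 8));
}
```

CodeUnitLE('S', 'a') gives 24915, which is 0x6153: the bytes 0x53 and 0x61 of "Sa" fused into one code unit.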

For sure, you didn't pass that set of Chinese characters, but improper typecasting has done it! It is therefore essential to know that typecasting will not work. So, for the first line of initialization, you must do:

TCHAR name[] = _T("Saturn");

Which would translate to 7 bytes or 14 bytes, depending on compilation. The call to wcslen should be:

wcslen(L"Saturn");

In the sample program code given above, I used strlen, which causes an error when building in Unicode. The non-working solution is a C-style typecast:

nLen = strlen((const char*)name);

On a Unicode build, name would be 14 bytes (7 Unicode characters, including the null). Since the string "Saturn" contains only English letters, which can be represented in original ASCII, the Unicode letter 'S' would be represented as [83, 0]. The other ASCII characters would likewise be represented with a zero next to them. Note that 'S' is now represented as the 2-byte value 83. The end of the string is represented by two bytes having the value 0.

So, when you pass such string to strlen, the first character (i.e. first byte) would be correct ('S' in case of "Saturn"). But the second character/byte would indicate end of string. Therefore, strlen would return incorrect value 1 as the length of string.
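This is simple to demonstrate without any cast. In the sketch below, the 14 bytes are written out explicitly so that the example does not depend on sizeof(wchar_t); the array and function names are hypothetical.

```cpp
#include <cstddef>
#include <cstring>

// "Saturn" spelled out as the 14 bytes of its UTF-16 little-endian form:
// each letter followed by a zero byte, then a two-byte terminator.
static const char kSaturnUtf16LE[] = {
    'S', 0, 'a', 0, 't', 0, 'u', 0, 'r', 0, 'n', 0, 0, 0
};

// strlen stops at the first zero byte, which is the high half of 'S'.
inline std::size_t NarrowLengthOfWideBytes() {
    return std::strlen(kSaturnUtf16LE);
}
```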

Since a Unicode string may also contain non-English characters, the result of strlen would be even less predictable.

In short, typecasting will not work. You either need to represent strings in correct form itself, or use ANSI to Unicode, and vice-versa, routines for conversions.

(There is more to add from this location, stay tuned!)


Now, I hope you understand the following signatures:

BOOL SetCurrentDirectory(LPCTSTR lpPathName);
DWORD GetCurrentDirectory(DWORD nBufferLength, LPTSTR lpBuffer);

Continuing. You must have seen some functions/methods that ask you to pass a number of characters, or that return a number of characters. With GetCurrentDirectory, for example, you need to pass the number of characters, not the number of bytes:

TCHAR sCurrentDir[255];
// Pass 255 and not 255*2
GetCurrentDirectory(255, sCurrentDir);

On the other hand, if you need to allocate room for a number of characters, you must allocate the proper number of bytes. In C++, you can simply use new:

LPTSTR pBuffer; // TCHAR*
pBuffer = new TCHAR[128]; // Allocates 128 or 256 BYTES, depending on compilation.

But if you use memory allocation functions like malloc, LocalAlloc, GlobalAlloc, etc; you must specify the number of bytes!

pBuffer = (TCHAR*) malloc (128 * sizeof(TCHAR) );

Typecasting the return value is required, as you know. The expression in malloc's argument ensures that it allocates desired number of bytes - and makes up room for desired number of characters.
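The character-count versus byte-count distinction can be wrapped in a pair of small helpers. BytesFor and AllocChars are hypothetical names for this sketch; the template parameter plays the role of TCHAR.

```cpp
#include <cstddef>
#include <cstdlib>

// Bytes needed for `count` characters of type CharT.
template <typename CharT>
constexpr std::size_t BytesFor(std::size_t count) {
    return count * sizeof(CharT);
}

// Allocate room for `count` characters with malloc. The caller releases the
// buffer with std::free. Returns nullptr if the allocation fails.
template <typename CharT>
CharT* AllocChars(std::size_t count) {
    return static_cast<CharT*>(std::malloc(BytesFor<CharT>(count)));
}
```

A call such as AllocChars<wchar_t>(128) requests 128 * sizeof(wchar_t) bytes, so the caller keeps thinking in characters while malloc receives bytes.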

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Reposted from: https://my.oschina.net/ruiwong/blog/73957
