WCHAR and LPCTSTR

Hello Everyone!

I know LPCTSTR is a long pointer to a null-terminated string, but is this string made of regular char or wchar_t? I.e., is LPCTSTR equivalent to wchar_t *?

Should I be concerned about this anyway? I mean, I read that one can use tchar which would be replaced by char or wchar_t depending on which is defined (but there's no tchar type I presume).

I used to concatenate two strings as follows:

strcat(s1, s2);

Where s1 and s2 are char *.

Now how can I concatenate an LPCTSTR and a wchar_t?

I tried this:

_tcscat(lpszVerb, L"__.html")

But it failed with an error saying it cannot convert a const wchar_t * to a char *.


Any general tips on working with characters and wide characters? What's the safest data type to work with when processing strings in files (less error prone, function-supported)?

Thanks!
This is a Windows-only thing. The Windows OS supports both ANSI and Unicode builds, but it started out with ANSI only. To support conversion of old code to Unicode, Microsoft took the TCHAR route.

The whole thing works like this: If you #define the UNICODE identifier, all TCHAR-related data types (such as LPCTSTR and LPTSTR) convert to wide char strings; if UNICODE is not #defined then they convert to single-byte (good ol' char) strings. Like this:

typedef wchar_t WCHAR;

typedef char *LPSTR;
typedef const char *LPCSTR;
typedef WCHAR *LPWSTR;
typedef const WCHAR *LPCWSTR;

#ifdef UNICODE
typedef WCHAR TCHAR;
#else
typedef char TCHAR;
#endif

typedef TCHAR *LPTSTR;
typedef const TCHAR *LPCTSTR;
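
If you want to see which way the types resolved in a given build, you can ask the compiler. Here is a minimal portable sketch of the idea. It assumes made-up stand-ins (MyTCHAR, MyLPCTSTR, and the MY_UNICODE switch) instead of the real Windows names, so it compiles without any Windows header; the helper functions are hypothetical too.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical stand-ins for TCHAR and LPCTSTR, renamed so they cannot
// clash with the real Windows headers. MY_UNICODE plays the role of UNICODE.
#ifdef MY_UNICODE
typedef wchar_t MyTCHAR;
#else
typedef char MyTCHAR;
#endif
typedef const MyTCHAR *MyLPCTSTR;

// True when the text types resolved to wide characters.
constexpr bool is_wide_build() {
    return std::is_same<MyTCHAR, wchar_t>::value;
}

// Size in bytes of one character of the selected text type.
constexpr std::size_t text_char_size() {
    return sizeof(MyTCHAR);
}

// Compile-time check: the pointer type follows the character type.
static_assert(std::is_same<MyLPCTSTR, const MyTCHAR *>::value,
              "MyLPCTSTR must point to constant MyTCHAR");

// Run-time view of the same decision.
void check_text_types() {
    if (is_wide_build()) {
        assert(text_char_size() == sizeof(wchar_t));
    } else {
        assert(text_char_size() == 1);
    }
}
```

So the answer to "is LPCTSTR equivalent to wchar_t *?" is "only in a Unicode build, and then it is const wchar_t *".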


So that takes care of the data types (in a nutshell). Now to the Windows API functions. Every function that takes strings needs two versions: one that takes wide strings and another that takes narrow strings. Example:

int WINAPI GetWindowTextA(HWND hWnd, LPSTR lpString, int nMaxCount);
int WINAPI GetWindowTextW(HWND hWnd, LPWSTR lpString, int nMaxCount);


If you look closely, you'll notice an A and a W at the end of the well-known function name GetWindowText(). What the...?, you say. That's right: those are the real function names. The name GetWindowText is just a macro. Like this:

#ifdef UNICODE
#define GetWindowText GetWindowTextW
#else
#define GetWindowText GetWindowTextA
#endif
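
The same trick can be reproduced in a few lines of portable code. This is only a sketch: TextLengthA, TextLengthW, MY_TEXT and MY_UNICODE are invented names that mimic the convention, not anything from the Windows headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>

// Hypothetical function pair that follows the A/W naming convention.
std::size_t TextLengthA(const char *text) { return std::strlen(text); }
std::size_t TextLengthW(const wchar_t *text) { return std::wcslen(text); }

// MY_UNICODE plays the role of UNICODE; the unsuffixed name is only a macro.
#ifdef MY_UNICODE
#define TextLength TextLengthW
#define MY_TEXT(x) L##x
#else
#define TextLength TextLengthA
#define MY_TEXT(x) x
#endif

// The same source line compiles against either version of the function.
std::size_t length_of_greeting() {
    return TextLength(MY_TEXT("Hello!"));
}

void check_dispatch() {
    assert(length_of_greeting() == 6);
}
```

Note that the preprocessor makes the choice before the compiler ever sees the call, which is why a mismatched argument produces a type error against the suffixed function.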


So we are almost done. String literals are the last thing. Microsoft defines the TEXT macro for those:

#ifdef UNICODE
#define __TEXT(x) L##x
#else
#define __TEXT(x) x
#endif

#define TEXT(x) __TEXT(x)

//It is used like this:

LPCTSTR myConstString = TEXT("Hello!"); //This will be ANSI or Unicode depending on whether you #define UNICODE. 
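
In case you wonder why TEXT() goes through __TEXT() instead of pasting the L itself: presumably it is so that an argument which is itself a macro gets expanded before the L is glued on. A small sketch of that effect, using invented names (MY_WIDE, MY_WIDE_IMPL, GREETING) and an always-wide variant to keep it short:

```cpp
#include <cassert>
#include <cwchar>

// Hypothetical always-wide version of the two-step macro.
#define MY_WIDE_IMPL(x) L##x
#define MY_WIDE(x) MY_WIDE_IMPL(x)

// A string literal hidden behind another macro.
#define GREETING "Hello!"

// MY_WIDE first expands GREETING to "Hello!", then MY_WIDE_IMPL pastes the
// L prefix onto the literal. Calling MY_WIDE_IMPL(GREETING) directly would
// paste L onto the name instead and yield the unknown identifier LGREETING.
const wchar_t *wide_greeting() {
    return MY_WIDE(GREETING);
}

void check_wide_greeting() {
    assert(std::wcscmp(wide_greeting(), L"Hello!") == 0);
}
```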


And that completes the basic picture here: If you #define UNICODE, you work with wide chars, but if you don't, everything automatically turns into narrow chars.

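From the above, you can conclude that your use of _tcscat() was incorrect. You mixed data types: judging by the error message, your build is not a Unicode one, so _tcscat() resolved to a narrow-character function that expects char strings, and you handed it a wide literal (L"__.html"). You must never mix data types. If the documentation says to use TCHARs, then use only TCHARs, with string (and character) literals enclosed in TEXT(). Also keep in mind that the destination of a concatenation must be a writable buffer (LPTSTR) with enough room; an LPCTSTR points to constant characters. In general: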

If you use the function names ending in A, you must use char and related types, and you must use normal string and character literals.

If you use the function names ending in W, you must use wchar_t (or technically speaking, WCHAR) and related data types, and you must use L-prepended string and character literals.

If you use the function name without a suffix, you must use TCHAR and related data types, and you must use TEXT()-enclosed string and character literals.

Any other combination is plain wrong as you have clearly experienced. Always follow these rules; never deviate no matter how much code you see out there that doesn't follow this.
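
As for the safest way to build strings under these rules, one common approach is a string class whose character type follows the build setting, so that buffer sizes are no longer your problem. A rough sketch, again with invented stand-ins (MyTCHAR, MY_TEXT, MY_UNICODE, MyTString, make_page_name) so that it compiles anywhere; with the real headers you would presumably write std::basic_string<TCHAR> and TEXT() instead:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for TCHAR and TEXT(); MY_UNICODE mimics UNICODE.
#ifdef MY_UNICODE
typedef wchar_t MyTCHAR;
#define MY_TEXT(x) L##x
#else
typedef char MyTCHAR;
#define MY_TEXT(x) x
#endif

// A string class whose character type follows the build setting.
typedef std::basic_string<MyTCHAR> MyTString;

// Appends the suffix from the original question to a verb.
MyTString make_page_name(const MyTCHAR *verb) {
    MyTString name(verb);          // copy into a writable, growable string
    name += MY_TEXT("__.html");    // literal has the same character type
    return name;
}

void check_page_name() {
    assert(make_page_name(MY_TEXT("open")) == MY_TEXT("open__.html"));
}
```

Every piece (the character type, the literal, the string class) comes from the same family, which is the whole point of the three rules.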

And that's the Windows side of things. But wait! There's one more thing you need to know: the tchar.h header file works very similarly, but it depends on the _UNICODE identifier (yes, with a leading underscore). This header also provides its own version of TEXT() called _T(), so you'll see the two mixed. Also, the functions in this header don't follow the naming convention the Windows API uses (-A or -W). A clear example is right there in your code: _tcscat(). The ANSI version is called strcat() and the wide version is called wcscat().
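
To make that concrete, here are the two calls that _tcscat() can turn into, written out by hand with standard functions only. The "open" verb and the 32-character buffers are just assumptions for the example.

```cpp
#include <cassert>
#include <cstring>
#include <cwchar>
#include <string>

// Narrow version: char buffer, plain literal, strcat().
std::string narrow_page_name() {
    char buffer[32] = "open";              // hypothetical verb
    const char *suffix = "__.html";
    // The destination must have room for both parts plus the terminator.
    assert(std::strlen(buffer) + std::strlen(suffix) < sizeof buffer);
    std::strcat(buffer, suffix);
    return buffer;
}

// Wide version: wchar_t buffer, L-prefixed literal, wcscat().
std::wstring wide_page_name() {
    wchar_t buffer[32] = L"open";          // hypothetical verb
    const wchar_t *suffix = L"__.html";
    // Capacity is counted in characters here, not in bytes.
    assert(std::wcslen(buffer) + std::wcslen(suffix)
           < sizeof buffer / sizeof buffer[0]);
    std::wcscat(buffer, suffix);
    return buffer;
}
```

Mixing the halves, for example passing the L"__.html" literal to strcat(), gives the kind of conversion error quoted in the question.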

One more note: If you use MS Visual Studio, projects default to Unicode builds, and you don't have to #define UNICODE or _UNICODE yourself. Both are already defined through the project's properties.
Thanks, webJose, for the explanation.
A million thanks to you, webJose. Very detailed and extremely useful guide.