LPWSTR to std::string

The Win32 API mostly works with wchar_t and LPWSTR, and those types are incompatible with std::string (or so it seems). So how could I simply convert LPWSTR to std::string and std::string to LPWSTR?

Also, ofstream seems to write only memory addresses to the file; could that be fixed?
You can use WideCharToMultiByte() and MultiByteToWideChar() to convert between ANSI and Unicode strings. Ideally you would use std::wstring instead, unless you absolutely have to use ANSI strings for some reason.

Oh, and you can explicitly call the ANSI version of a Windows function that has a string parameter to have it do the converting for you. The ANSI version has an A at the end of the function name, e.g. MessageBoxA instead of MessageBox.
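For example, going the other way (std::string to std::wstring) with MultiByteToWideChar() looks something like this; a minimal sketch, and the function name strtowstr is my own, not from the API:

#include <windows.h>
#include <string>

std::wstring strtowstr(const std::string &str)
{
    if (str.empty())
        return std::wstring();
    // First call with a null buffer asks how many wide characters are
    // needed; -1 means "the input is null-terminated" (terminator included).
    int len = MultiByteToWideChar(CP_ACP, 0, str.c_str(), -1, NULL, 0);
    std::wstring wstr(len, L'\0');
    // Second call performs the actual conversion into the sized buffer.
    MultiByteToWideChar(CP_ACP, 0, str.c_str(), -1, &wstr[0], len);
    wstr.resize(len - 1); // drop the terminator the API wrote into the buffer
    return wstr;
}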
How about with fstreams?
Well, I solved it with this:
#include <windows.h>
#include <string>

std::string wstrtostr(const std::wstring &wstr)
{
    std::string strTo;
    // First call: ask how many bytes the converted string needs;
    // -1 means "the input is null-terminated" (terminator counted too).
    int len = WideCharToMultiByte(CP_ACP, 0, wstr.c_str(), -1,
                                  NULL, 0, NULL, NULL);
    char *szTo = new char[len];
    // Second call: convert into the correctly sized buffer.
    WideCharToMultiByte(CP_ACP, 0, wstr.c_str(), -1, szTo, len, NULL, NULL);
    strTo = szTo;
    delete[] szTo;
    return strTo;
}
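As an aside, this also explains the fstream behaviour from the original question: a narrow ofstream has no operator<< overload for const wchar_t*, so the pointer falls back to the const void* overload and the address gets written. Converting first avoids that; a hypothetical usage of the wstrtostr() above:

#include <fstream>

int main()
{
    const wchar_t *msg = L"hello";
    std::ofstream out("log.txt");
    // out << msg;          // would write a pointer value via operator<<(const void*)
    out << wstrtostr(msg);  // writes the actual text
}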
Errr... use std::wstring instead and leave the 20th-century char in the past? You would then have full compatibility with LPWSTR.
Windows version of std::wstring is hardly a step into the future, since it's frozen in the age of 16-bit "Unicode", retrofitted to hold UTF-16.

The ideal is std::u32string (holding UTF-32) for program logic and std::string (holding UTF-8) for I/O.
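Converting between those two is straightforward with the C++11 <codecvt> facets; a minimal sketch, assuming a C++11 standard library (this particular API was much later deprecated, in C++17):

#include <codecvt>
#include <locale>
#include <string>

int main()
{
    // UTF-32 <-> UTF-8 through the standard conversion facet.
    std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> conv;

    std::string    utf8  = conv.to_bytes(U"caf\u00e9"); // UTF-32 -> UTF-8, for I/O
    std::u32string logic = conv.from_bytes(utf8);       // UTF-8 -> UTF-32, for program logic
}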
May not be cutting edge but it is still a step forward.
About using u32string, how would I use it with WINAPI or CSTDIO? It seems there is only support for UTF-16, at least with the MS CRT. Also, in a u32string, is a "character" an unsigned long (DWORD)?
About using u32string, how would I use it with WINAPI or CSTDIO? It seems there is only support for UTF-16, at least with the MS CRT

Yes, WinAPI supports UTF-16 and, in a few places, UTF-8. For UTF-32, there are standard C++ conversion routines (supported since VS 2010), multiple libraries (iconv, ICU), and, really, 32-to-16 conversion is trivial to write yourself.
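For instance, the 32-to-16 direction fits in a dozen lines; a sketch (the function name is mine), assuming the input is valid Unicode, i.e. no values above U+10FFFF and no stray surrogates:

#include <string>

std::u16string utf32_to_utf16(const std::u32string &in)
{
    std::u16string out;
    out.reserve(in.size());
    for (char32_t cp : in)
    {
        if (cp <= 0xFFFF)
        {
            out.push_back(static_cast<char16_t>(cp)); // BMP: one code unit
        }
        else
        {
            cp -= 0x10000; // 20 bits left; split into a surrogate pair
            out.push_back(static_cast<char16_t>(0xD800 | (cp >> 10)));   // high surrogate
            out.push_back(static_cast<char16_t>(0xDC00 | (cp & 0x3FF))); // low surrogate
        }
    }
    return out;
}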

in a u32string, is a "character" an unsigned long (DWORD)?

No, it is a char32_t.
No, it is a char32_t.

And the typedef for char32_t is? Signed long, I bet.
It's not a typedef (except in VS2010, but that's a bug).
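This is easy to check with C++11 type traits; a small sketch (the assertions are mine, not from the thread):

#include <type_traits>

// char32_t is a distinct built-in type in C++11, not an alias for any
// other integer type, so neither of these assertions fires.
static_assert(!std::is_same<char32_t, unsigned long>::value,
              "char32_t is not a typedef for unsigned long");
static_assert(!std::is_same<char32_t, unsigned int>::value,
              "char32_t is not a typedef for unsigned int");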
Windows uses UTF-16 internally (Win32, COM, .NET, WinRT, resource strings, registry strings, and so on). Using anything else just adds overhead to your program. UTF-16 is already capable of representing characters beyond 16 bits with "surrogates". I don't see how UTF-32 is in any way ideal.
Windows uses UTF-16 internally

And that would have been fine if those internals were actually internalized.

I don't see how UTF-32 is in any way ideal.

It gives your strings a 1:1 correspondence between elements of storage and code points. The elements of the Windows version of std::wstring do not correspond to anything meaningful.
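A small illustration of that point; a sketch assuming a C++11 compiler, with the counts in the comments as they come out on Windows, where wchar_t is 16 bits:

#include <iostream>
#include <string>

int main()
{
    // U+1F600 is a single code point outside the BMP.
    std::u32string u32 = U"\U0001F600";
    std::wstring   w   = L"\U0001F600";

    std::cout << u32.size() << '\n'; // 1: one storage element per code point
    std::cout << w.size()   << '\n'; // 2: a surrogate pair, because wchar_t
                                     //    is only 16 bits on Windows
}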
Well, that post went over my head, to be quite honest; perhaps I should read up on localization. But I have to question the relevance of this to Windows application development, because you are the first person I have seen suggest using UTF-32 strings instead of UTF-16 despite the overhead involved: the extra memory required to store a single character, the constant allocating and freeing of temporary buffers for converted strings, and the actual process of converting. Why go through all of that when you can just give Windows what it expects?
Well, in fact, since Windows' standard is UTF-16, even if you somehow get UTF-32 from the user, you will have to "truncate" the values to half the width.

The only usefulness is with files: if you encounter a UTF-32 file (maybe from a different OS that supports UTF-32) you will be able to read it correctly, but you will still have to convert it down to display it.

Anyway, I'm probably not going to use UTF-32 as long as there is no Windows/STDIO UTF-32 standard. When there is, I'll expand my class to use char32_t.
Files (and other communication) are best in UTF-8, since there are no endianness issues and no stray zero bytes (unless you're in China, where Unicode is GB18030 :)

Take Linux for example: you take a std::wstring (UTF-32 there, as on almost every platform besides Windows), output to an std::wofstream or std::wcout, and get UTF-8 in file/on screen (if a utf8 locale is in effect, but that's the default on most distros). Same on the way back.
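A sketch of that round trip on Linux, assuming the en_US.UTF-8 locale is installed (the file name is made up):

#include <fstream>
#include <locale>

int main()
{
    // Imbue the wide stream with a UTF-8 locale: the wchar_t data
    // (UTF-32 on Linux) is converted to UTF-8 bytes on the way out.
    std::wofstream out("hello.txt");
    out.imbue(std::locale("en_US.UTF-8"));
    out << L"caf\u00e9\n"; // stored in the file as UTF-8
}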

With UTF-16 in a wstring, you simply cannot *use* C++ I/O, because it's designed for 1-to-N and N-to-1 conversions only.