Does wcout provide UTF-8?

Hi there!

Guys, I'm trying to figure out how wcout works.

The problem is that I use wcout to output some text to the console (cmd.exe):

#include <iostream>
 
int main() {
  std::wcout << L"双喜雙喜!" << std::endl;
}


But in the console I see:
????!

I changed the console's encoding with the following command:
chcp 65001
but got the same result :(

So the question is: what encoding does wcout provide?


Thank you!
By default, it won't convert your wide characters to anything, since it doesn't know what you're expecting: it supports a lot more than UTF-8 (for example, you could request GB18030 output).

On operating systems that provide Unicode support in C++, such as Linux, you would have to imbue the right locale:

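#include <iostream>
#include <locale>
int main() {
  // Decouple from C stdio so that wcout honors the imbued locale
  std::ios::sync_with_stdio(false);
  // The locale name is platform-specific; "en_US.utf8" must be installed
  std::wcout.imbue(std::locale("en_US.utf8"));
  std::wcout << L"双喜雙喜!" << std::endl;
}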


http://coliru.stacked-crooked.com/a/f4bfba567e72e990

Windows requires the use of non-standard APIs. For your output, you will need to use _setmode:

#include <iostream>
#include <cstdio>   // _fileno, stdout
#include <io.h>     // _setmode
#include <fcntl.h>  // _O_U16TEXT

int main() {
  _setmode(_fileno(stdout), _O_U16TEXT); // or _O_WTEXT
  std::wcout << L"双喜雙喜!" << std::endl;
}


(for more on _setmode, see MSDN: http://msdn.microsoft.com/en-us/library/tw4k6df8(v=vs.120).aspx )
@Cubbi: is there any reason to use the wide standard streams over the narrow standard streams in C++11?
Cubbi, I'm sorry, but is there a more common solution on Windows?

The problem is that I'm using Embarcadero C++ Builder XE2, and it reports the following errors for the provided code:

[BCC32 Error] File1.cpp(6): E2268 Call to undefined function '_setmode'
[BCC32 Error] File1.cpp(6): E2451 Undefined symbol '_O_U16TEXT'


UPD: Changed the code to the following:
#include <iostream>
#include <io.h>
#include <fcntl.h>

int main() {
  setmode(_fileno(stdout), _O_TEXT); // There is no _O_WTEXT
  std::wcout << L"双喜雙喜!" << std::endl;
}


But got the same result :-(
The issue is that if you're on Windows, you're using the Windows console, and to change the behavior of that console, you need to interact with Windows in a non-portable way.

You might have luck if you try piping your output to a file, however.
@Undestiny I'm not familiar with Builder. On the off chance that it has support for C++11, you could use the C++ standard library:

// C++11 system-independent approach
#include <iostream>
#include <codecvt>
#include <locale>  // std::wbuffer_convert
int main() {
  std::wbuffer_convert<std::codecvt_utf8<wchar_t>> conv_out(std::cout.rdbuf());
  std::wostream out(&conv_out);
  out << L"双喜雙喜!" << std::endl;
}


demo http://coliru.stacked-crooked.com/a/0e758db4056d295d

Otherwise, you may need a third-party library. Does Boost.Locale work with Builder?

@LB wide streams do wide/multibyte conversions for you; with narrow streams, you'd have to do it yourself.
@Cubbi: Really? How does that work? I thought that the wide output streams were just type aliases for basic_ostream with CharT=wchar_t. Where does the overload for operator<< with narrow characters come in?
@LB a file is always a sequence of bytes. basic_ostream<wchar_t>'s operator<< takes a wchar_t, converts it into one or more bytes, and stores them into the file/stdout/socket/etc.
Ok, guys, thank you! It's ridiculous but it seems there is no way to use standard 'wide' output to console on Windows :-)

But I have another question then. What encoding does cout provide?
I read somewhere that it provides UTF-8. So is it true?
What encoding does cout provide?

cout (by default) just sends the bytes you give it to the OS to display. And the way the OS interprets those bytes is usually tunable - through locale settings on Unix systems, or through codepages on Windows.
If you redirect the output of a program to a file, you will capture those bytes as-is.
Cubbi wrote:
wide streams do wide/multibyte conversions for you; with narrow streams, you'd have to do it yourself.
I took this to mean that wide streams converted from UTF8 to UTF16 for you, so it was just a misunderstanding. But, what did you mean?
@LB they can do that, along with many other conversions. For example, that's what std::wostream is doing in my example above ( http://www.cplusplus.com/forum/beginner/126557/#msg685301 ) on a Windows platform (where wchar_t is 16 bits).
I meant converting from char const * to wchar_t const * automatically. Apparently I'm bad at communicating too ;p
@LB string-to-string conversion is a whole other story. There have always been C95 library functions to do that: http://coliru.stacked-crooked.com/a/df94e6ac180761e0 (edit: oh, and swprintf of course: http://coliru.stacked-crooked.com/a/d32dde945aa41073 ), as well as C++98's conversion facets: http://coliru.stacked-crooked.com/a/0148514de8b42b91
C++11 made it possible to use standard streams as well: http://coliru.stacked-crooked.com/a/db25ee25a69da568 but it's way easier (in C++11) to just use wstring_convert: http://coliru.stacked-crooked.com/a/eb9e315f8ecaca47