Dealing with non-ASCII characters in the console

I've spent the last three days searching for a way to print accented, box-drawing, or other non-ASCII characters on the console, but so far I have found only one solution, and I'm quite skeptical about it. Basically, I call setlocale(LC_ALL, ""). Suppose you then have to read from or write to a screen buffer: would the operation complete correctly, or would it run into problems? I've also tried changing the code page through the SetConsoleCP() function, but nothing happened. Other suggestions say I could use wchar_t, wstring and wcout, but I've read some articles that discourage their use or say to avoid them altogether. I'm probably a little lost and mixing something up, I don't know.
Which compiler are you using?
I'm using GNU GCC (MinGW version 5.1.0) as the compiler and Code::Blocks as the IDE.
Code::Blocks has a new IDE version, 20.03, with MinGW 8.1.0 bundled as part of the setup. You might want to grab it.

This, of course, presumes you are using the Windows C::B version.
Other suggestions say I could use wchar_t, wstring and wcout but I read some articles which told me to avoid their usage or at least discourage it.


This may be your problem: you can't print Unicode with std::cout and char.

Also, you can't mix cout with wcout, because the stream will be set to whichever one you use first! Choose either wide or narrow (ASCII) mode and stick to it throughout your entire program, including stream output (cout/wcout).

Those who say to avoid wcout don't know what they are talking about; that's just my opinion, of course.
Whether you can print Unicode to the console or not depends. There is some command you can issue to the console to enable it, I believe, but I always forget it because I only use English.
Here is a sample that writes Unicode strings both to the console and to a file.
This is of course a Windows solution, but it should be easy to extend to other systems.

#include <iostream>
#include <locale>
#include <fstream>
#include <Windows.h>
#include <strsafe.h>


int main()
{
	// set locale (std::locale throws std::runtime_error if this name is unavailable)
	std::locale loc("ru_RU.utf8");
	std::locale::global(loc);

	// set console output
	SetConsoleOutputCP(CP_UTF8);
	SetConsoleCP(CP_UTF8);

	// set file
	std::wofstream file;
	file.imbue(loc);
	file.open(L"output.txt");

	if (file.is_open())
	{
		// set sample unicode string
		wchar_t BOM = static_cast<wchar_t>(0xFEFF);
		wchar_t test_char = L'й';
		const wchar_t* test_str = L"Познер обнародовал личную переписку после обвинений Михалкова во лжи";

		// write to file
		file.put(BOM);
		file.put(test_char);
		std::size_t len = 0;

		if (FAILED(StringCchLengthW(test_str, STRSAFE_MAX_CCH, &len)))
		{
			std::wcout << L"Failed getting string length" << std::endl;
			return 1; // report the failure to the caller
		}

		file.write(test_str, len);

		// write to console
		std::wcout << test_str << std::endl;

		if (!file.good())
		{
			std::wcerr << L"Failed to write" << std::endl;
		}

		file.close();
	}
	else
	{
		std::wcerr << L"Failed to open file" << std::endl;
	}

	std::wcout << L"Done!" << std::endl;

	std::wcin.get();
	return 0;
}
I kept searching and finally found a solution along the lines of yours (malibor), which sets the console output to CP_UTF8 and so on, but it turned out not to work: I still can't display accented or other such characters and get blank spaces instead, so I decided to stick with the setlocale option. Anyway, thank you for your answer, I appreciate it.
CP_UTF8 won't work for all possible characters; there are other code pages as well, so you have to change this for the specific language.

It is possible to update this value at runtime by using an enumeration callback, if you want a truly generic Unicode program.