Unicode?

I'm still studying C++. I've heard about Unicode, which will hopefully let me use characters other than ASCII ones. What do you suggest? Is there a library that I can use in C++ code? I saw "ICU" mentioned somewhere, but the website given for it is down.

hanst99 (2869)

Avoid Unicode. Of course it's good for a cross-language application and the like, but Unicode is very hard to deal with, and the standard library of C/C++ doesn't help you very much there. C++ has the wide character type wchar_t, but it's not really suitable for Unicode characters because the size of wchar_t is implementation-defined (16 bits on Windows, 32 bits on most Unix-like systems, so you can't depend on it), whereas Unicode uses 32 bit characters. Also, Unicode is tricky to handle correctly (e.g. the German 'ä' can be written as one code point or as two), so it's really an advanced topic you should usually avoid unless you are writing a browser or text editor or anything else that absolutely NEEDS to deal with Unicode characters.
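To make the two points above concrete, here is a minimal sketch, assuming a C++11 compiler. The function names are made up for illustration. It reports the width of wchar_t on the current compiler and builds 'ä' both ways, using char32_t strings so that each element holds exactly one code point.

```cpp
#include <climits>
#include <cstddef>
#include <string>

// Width of wchar_t in bits on this compiler (implementation-defined).
constexpr std::size_t wchar_bits() { return sizeof(wchar_t) * CHAR_BIT; }

// 'ä' as a single precomposed code point, U+00E4.
inline std::u32string a_umlaut_precomposed() { return U"\u00E4"; }

// 'ä' as 'a' (U+0061) followed by a combining diaeresis (U+0308).
inline std::u32string a_umlaut_decomposed() { return U"a\u0308"; }

// char32_t is guaranteed to be wide enough for any Unicode code point.
static_assert(sizeof(char32_t) * CHAR_BIT >= 32, "char32_t too small");
```

Both strings render as the same letter, but they have different lengths and do not compare equal with ==. Comparing them properly needs Unicode normalization, which the standard library does not provide.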

filipe (1165)

whereas Unicode uses 32 bit characters

I'm not particularly familiar with the details of Unicode encodings, but I'm sure there are different sizes, and even variable sizes within a single encoding. For instance, UTF-8 stores each ASCII character in a single byte, making it backwards compatible with ASCII, but it may use up to 4 bytes for other characters.
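As a rough illustration of that variable size, here is a sketch of a UTF-8 encoder for a single code point. The name encode_utf8 is hypothetical, not from any library, and returning an empty string for invalid input is just a choice made for this example.

```cpp
#include <string>

// Encode one Unicode code point as UTF-8 (1 to 4 bytes).
// Returns an empty string for surrogates and values above U+10FFFF.
std::string encode_utf8(char32_t cp) {
    std::string out;
    if (cp <= 0x7F) {
        // ASCII range: one byte, identical to the ASCII value.
        out += static_cast<char>(cp);
    } else if (cp <= 0x7FF) {
        // Two bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0xFFFF) {
        // Surrogate code points are not valid in UTF-8.
        if (cp >= 0xD800 && cp <= 0xDFFF) return out;
        // Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0x10FFFF) {
        // Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

For example, encode_utf8(U'A') gives one byte, while 'ä' (U+00E4) takes two, '€' (U+20AC) takes three, and code points outside the Basic Multilingual Plane take four.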

hanst99 (2869)

Certainly, but Unicode characters are still identified by code points, which are normally held in 32 bit numbers (the largest is U+10FFFF). Only the encoding differs. That doesn't make dealing with Unicode characters the least bit easier; rather, it forces you to differentiate between ASCII, 2, 3 and 4 byte characters.

EDIT: Just to clarify, my little digression about Unicode sizes was only meant to explain that the size of wchar_t is compiler specific.
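The differentiation between 1, 2, 3 and 4 byte characters mentioned above might look roughly like this. It is only a sketch with hypothetical function names; it assumes mostly well-formed UTF-8 and does not validate continuation bytes.

```cpp
#include <cstddef>
#include <string>

// Length in bytes of the UTF-8 sequence that starts with this lead byte.
// Returns 0 if the byte cannot start a sequence (e.g. a continuation byte).
std::size_t utf8_sequence_length(unsigned char lead) {
    if (lead < 0x80) return 1;            // 0xxxxxxx: ASCII
    if ((lead & 0xE0) == 0xC0) return 2;  // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3;  // 1110xxxx
    if ((lead & 0xF8) == 0xF0) return 4;  // 11110xxx
    return 0;
}

// Count code points in a UTF-8 string by hopping from lead byte to lead byte.
// A byte that is not a valid lead byte is counted as one unit and skipped
// (an arbitrary recovery choice for this sketch).
std::size_t count_code_points(const std::string& s) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < s.size(); ++count) {
        std::size_t len = utf8_sequence_length(static_cast<unsigned char>(s[i]));
        i += (len == 0) ? 1 : len;
    }
    return count;
}
```

The result is a count of code points, not of visible characters: a decomposed 'ä' would still count as two. It also shows why s.size() and s[i] on a std::string holding UTF-8 work in bytes rather than characters.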