the woe and lo but lol

closed account (GTbMSL3A)
this is definitely woe and lo. but lol

apparently my text file has a character c such that isupper(c) will scream "assertion failed (c >=-1 && c <=255)"

so I'm looping through all the characters in the text file and erasing the characters that are triggering the assertion.

I know I can probably catch an exception, but meh. oh, and my program is still going through the text file because it's an English dictionary :/

might take an hour or two, right now it's up to the "C" words.
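isupper(...) takes an int whose value must be either EOF or representable as an unsigned char. A plain char may be signed, so any byte above 127 arrives as a negative number. If you provide a char you can simply

static_cast<unsigned char>(c)

before passing it, to prevent this assertion.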
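A minimal sketch of that cast in use, assuming the dictionary words are held in std::string. The helper names is_upper_safe and count_upper are hypothetical and only there for illustration:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: wraps std::isupper so it is safe for any char value.
bool is_upper_safe(char c) {
    // Plain char may be signed, so a byte such as 0xE6 could become a
    // negative int. Going through unsigned char keeps it in 0..255.
    const int as_int = static_cast<unsigned char>(c);
    assert(as_int >= 0 && as_int <= 255); // the range the runtime check wants
    return std::isupper(as_int) != 0;
}

// Hypothetical helper: counts uppercase letters in a word without
// triggering the assertion on non-ASCII bytes.
std::size_t count_upper(const std::string& word) {
    std::size_t n = 0;
    for (char c : word) {
        if (is_upper_safe(c)) {
            ++n;
        }
    }
    return n;
}
```

With a wrapper like this there is no need to erase the offending characters from the file first; they are simply reported as "not uppercase" in the default "C" locale.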
closed account (GTbMSL3A)
oh cool, thank you coder777.

there are some weird characters that I didn't think would be in that dictionary.txt.

for example, 'æ'

I can't make this stuff up. lol lol

anyways, command prompt is up to the "L" words.

15 minutes from the "C" words to the "L" words.
for example, 'æ'


That's not a weird character. It is a letter in the alphabet in Danish and Norwegian.
for example, 'æ'
It is also in the ASCII Table:

http://www.asciitable.com/


But if that is not intended, it hints at a problem in the code that writes the file. Maybe an out of bounds access or a missing terminating '\0'.
It is also in the ASCII Table: http://www.asciitable.com/

That table is a lie though, there is no 'æ' in ASCII, which only defines codes 0 to 127. They are showing the number "145", which is the code from the old DOS codepage (cp437) that the Windows Alt-code character entry system still uses, but even on Windows, that's not the code it stores: the default codepage on Western-locale Windows is cp1252, which records 'æ' as 0xE6, just like iso8859-1. Since UTF-8 won the encoding wars, the only sensible way for that character to appear in a file is actually the two-byte sequence 0xC3 0xA6 (though using isupper() on that directly won't work either, because it would see two separate bytes rather than one letter).
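To make the byte values concrete, here is a small sketch of how the two UTF-8 bytes 0xC3 0xA6 combine into the code point 0xE6. The function name decode_utf8_two_byte is hypothetical, and it only handles the two-byte case, not UTF-8 in general:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: decodes one two-byte UTF-8 sequence into a code point.
// Assumes the caller already knows the sequence is exactly two bytes long.
std::uint32_t decode_utf8_two_byte(unsigned char lead, unsigned char cont) {
    assert((lead & 0xE0) == 0xC0); // lead byte has the form 110xxxxx
    assert((cont & 0xC0) == 0x80); // continuation byte has the form 10xxxxxx
    // Take 5 payload bits from the lead byte and 6 from the continuation byte.
    return (static_cast<std::uint32_t>(lead & 0x1F) << 6) |
           static_cast<std::uint32_t>(cont & 0x3F);
}
```

Both bytes are above 127, which is why feeding them to isupper() one at a time as plain chars can trip the assertion, and why even with the cast neither byte on its own is the letter.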