I'm reading from a file that contains quotation marks, and when the program reads them I get question marks printed out instead. What do I need to do to fix it?
str3 contains the line read from the file.
I have it print the line out word by word. A word with a quotation mark attached prints fine, but when a quotation mark stands by itself in the line I get a question mark, though not every time.
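strrr = str3;
cstr = new char[strrr.size() + 1];
strcpy(cstr, strrr.c_str());
pr = strtok(cstr, " "); // the earlier pr[10]='\0' is dropped: pr did not point at anything yet
while (pr != NULL)
{
    cout << pr << " ";
    pr = strtok(NULL, " "); // advance to the next word, otherwise the loop never ends
}
delete[] cstr; // release the copy once the tokens are no longer needed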
Here is the code. It works on Windows but won't work on my Mac: I get a question mark instead of the quotation mark.
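#include <cstring>
#include <iostream>
#include <string>
using namespace std;

int main () {
    char *cstr, *pr;
    char str3[] = "hello world \"";
    string strrr = str3;
    cstr = new char[strrr.size() + 1]; // writable copy, because strtok modifies its input
    strcpy(cstr, strrr.c_str());
    pr = strtok(cstr, " ");
    while (pr != NULL)
    {
        if (strcmp(pr, "\"") == 0) // comparing against a quotation mark on its own
            cout << pr << endl;
        cout << pr << " "; // here it prints out a question mark
        pr = strtok(NULL, " ");
    }
    delete[] cstr;
    return 0;
}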
the file...
Check it with a hex editor. Some text editors (MS Word is one example) substitute look-alike symbols for the characters you typed. MS Word WILL change straight quote marks into curly "smart" quotes, which are different characters from the real ASCII quote. So will other tools.
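If you don't have a hex editor handy, something like the sketch below can do the same check from inside your program. The helper names toHex and findNonAscii are made up for this example, and it assumes you already have the line from the file in a std::string. A real ASCII quotation mark shows up as 22; anything above 7F is not plain ASCII.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the bytes of `data` as two-digit
// uppercase hex values separated by single spaces.
std::string toHex(const std::string& data) {
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    for (std::size_t i = 0; i < data.size(); ++i) {
        unsigned char byte = static_cast<unsigned char>(data[i]);
        if (i != 0) out += ' ';
        out += digits[byte >> 4];    // high nibble
        out += digits[byte & 0x0F];  // low nibble
    }
    // Sanity check: two digits per byte plus one space between bytes.
    assert(data.empty() || out.size() == data.size() * 3 - 1);
    return out;
}

// Hypothetical helper: returns the offset of every byte that is
// outside the 7-bit ASCII range (greater than 0x7F).
std::vector<std::size_t> findNonAscii(const std::string& data) {
    std::vector<std::size_t> offsets;
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (static_cast<unsigned char>(data[i]) > 0x7F) {
            offsets.push_back(i);
        }
    }
    return offsets;
}
```

Read each line with std::getline from a std::ifstream and print toHex(line). If findNonAscii(line) comes back non-empty, the file holds something other than plain ASCII at those offsets.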
This has sat in the back of my head since you posted it — I wasn’t initially concerned about answering it as I figured it would be something stupid you would figure out.
If you are editing with anything but a plain-text editor, you will get non-ASCII characters in there.
You are using the bare-basics of C string manipulation, so you aren’t really in a position to handle the input much better.
If your editor is encoding “ and ” as UTF-8, the file will contain three bytes for each: "\xE2\x80\x9C" and "\xE2\x80\x9D" respectively. Each quote may then print as three question marks, depending on your I/O method.
(I don’t think you are encoding as UTF-16, otherwise you would get two question marks at the very beginning of the file and two question marks for each double-quote.)
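If re-saving the file is not an option, one possible workaround is to swap the UTF-8 curly quotes for the plain ASCII quote before you tokenize the line. This is only a sketch: straightenQuotes is a name invented here, and it handles just the two three-byte sequences mentioned above and leaves every other byte alone.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: replaces the UTF-8 encodings of the left and right
// curly double quotes (E2 80 9C and E2 80 9D) with the ASCII quote (0x22).
std::string straightenQuotes(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    std::size_t i = 0;
    while (i < in.size()) {
        // compare() clamps the length at the end of the string, so a
        // truncated sequence near the end simply fails to match.
        if (in.compare(i, 3, "\xE2\x80\x9C") == 0 ||
            in.compare(i, 3, "\xE2\x80\x9D") == 0) {
            out += '"';
            i += 3;  // skip all three bytes of the curly quote
        } else {
            out += in[i];
            ++i;
        }
    }
    // Each replacement shrinks three bytes to one, so the result never grows.
    assert(out.size() <= in.size());
    return out;
}
```

In the posted program this would be called on the line before it is copied into cstr, for example strrr = straightenQuotes(str3). After that, the strcmp against "\"" can match a quote that was originally curly.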
Recommended solution: Use a code editor — one that only edits plain-text files, and make sure it writes files encoded in ANSI/ASCII/plain text/whatever it calls it; no UTF-anything or UCS-anything.