Reading eof but not eof

I am attempting to write a program that reads a random HTML file I pulled. For some reason, it does not read the whole file when I use the following code:

while(!inFile.eof()){
    getline(inFile, sentence);
    cout << sentence;
}


It reads only up to the end of the fifth closing div. From what I've seen, reading a regular text file this way works fine, but it does not work for the HTML file I am trying to read.
Have you tried putting '\n' after every line? And are you running the code in a terminal rather than just double-clicking the exe? It could be a flushing problem: stdout is flushed when the program terminates, but you never see the output because the window disappears along with the program.

Also, eof() isn't a good loop check. You should use the return value of getline instead (it returns an istream&, which converts to bool; the same goes for std::cin >> a), so the loop should look like while(getline(inFile, sentence))
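A minimal sketch of that loop, in case it helps. The helper name readLines and the vector return type are made up for illustration; the only part taken from the thread is using the stream returned by getline as the loop condition.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collects every line from `in`.
// std::getline returns the stream itself, and the stream converts to
// false as soon as a read fails, so the loop body only runs for lines
// that were actually read.
std::vector<std::string> readLines(std::istream& in) {
    std::vector<std::string> lines;
    std::string sentence;
    while (std::getline(in, sentence)) {
        lines.push_back(sentence);
    }
    return lines;
}
```

Because it takes a std::istream&, the same function works with an std::ifstream such as inFile or with an std::istringstream for quick experiments.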

Edit: Yeah, I bet it was that weird "NO-BREAK SPACE" in Unicode. Personally, though, I think it should just work on MinGW or VS, even if it prints junk or an "invalid square box". Have you tried running the program through cmd to see whether it is the IDE's problem (assuming jGRASP doesn't run a real terminal like cmd as a hooked process, and instead just prints the data into a sub-window of the IDE)? I can type alt+255 to print an NBSP in my terminal and it works fine; it should even work if you had a NUL character in your text.
Poteto's right, check the return value of getline().

The problem with eof is that the program doesn't know it's at the end of the file until it tries to read beyond the end. So if you read 10 bytes from a 10 byte file, eof() will be false. It's when you try to read the 11th byte that eof becomes true.
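Here is a small sketch of that behaviour using an std::istringstream in place of a file. The names EofStates, probeEof and countEofLoopIterations are invented for this example.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical result type: eof() right after reading exactly all bytes,
// and eof() after attempting to read one more byte.
struct EofStates {
    bool afterExactRead;
    bool afterOneMore;
};

EofStates probeEof(const std::string& text) {
    std::istringstream in(text);
    std::string buf(text.size(), '\0');
    // Read exactly as many bytes as the stream holds.
    in.read(&buf[0], static_cast<std::streamsize>(buf.size()));
    bool afterExactRead = in.eof();
    // Try to read one byte past the end.
    in.get();
    bool afterOneMore = in.eof();
    return {afterExactRead, afterOneMore};
}

// Counts how many times the eof()-controlled loop body runs.
int countEofLoopIterations(const std::string& text) {
    std::istringstream in(text);
    std::string sentence;
    int iterations = 0;
    while (!in.eof()) {
        std::getline(in, sentence);  // may fail, but the body still runs
        ++iterations;
    }
    return iterations;
}
```

For the 10-byte input "0123456789", probeEof reports false after the exact read and true after the extra get(). For the two-line input "a\nb\n", countEofLoopIterations returns 3, because the third pass runs with a getline that fails and leaves sentence empty.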
I am running the code in the jGRASP IDE; I assume that is like running it in a cmd prompt or terminal.

I have also written the output to a file, and it yields the same result.

I will give that while statement a try and see how it works out.

Edit: Well, I figured out what the issue was. I guess the random HTML that I acquired includes characters outside of ASCII. When I removed or converted all of them, it worked fine.
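For anyone hitting something similar, a rough sketch for locating such bytes might look like this. The function name firstNonAscii is made up, and it assumes "outside of ASCII" means any byte value above 0x7F; it only finds the bytes, it does not convert them.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the index of the first byte that is not
// 7-bit ASCII, or std::string::npos if every byte is ASCII.
std::size_t firstNonAscii(const std::string& line) {
    for (std::size_t i = 0; i < line.size(); ++i) {
        // Cast to unsigned char so bytes >= 0x80 are not seen as negative.
        if (static_cast<unsigned char>(line[i]) > 0x7F) {
            return i;
        }
    }
    return std::string::npos;
}
```

Calling it on each line read from the file shows which lines contain such bytes, for example the two-byte UTF-8 encoding of a no-break space (0xC2 0xA0).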