Basics of reading text files and counting characters

I am learning programming in C. Hope that's OK on this C++ board. I am using gcc 4.9.2 on a Linux system.

I am experimenting with counting characters in text files. My problem is that my program counts one more character than exists in the file, and I'm not sure why. I'll provide a couple of examples.

My first test works as expected. I type characters at the keyboard and terminate input with ctrl-d. I don't use my enter key at all. My input is the word apples, and my program counts six characters as expected. Here is my code for this test.

#include <stdio.h>

int main(void)
{
	int count = 0;
	
	while( (getchar()) != EOF )
	{
		count++;
	}
	
	printf("\nchar count: %d\n", count);
	
	return 0;
}


Next, I use the same code but read the text from a file using file redirection. Even though I again use the word "apples" and don't use my enter key after typing the word, my code now counts 7 characters.

I'll give one more example of reading from a file. In this test the result is the same. I'm using the same text file as in the previous test. Here is the code; it should be self-explanatory.

#include <stdio.h>

int main(void)
{
	FILE *fp;
	int count = 0;
	
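	fp = fopen("test.txt", "r");
	if (fp == NULL)	/* getc() on a NULL stream is undefined behaviour */
	{
		perror("test.txt");
		return 1;
	}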
	
	while( getc(fp) != EOF )
	{
		++count;
	}
	fclose(fp);
	
	printf("charcount: %d\n", count);
	
	return 0;
}


So my question is "Why the extra character when reading from a file?". The only thing I can think of is an extra newline character, but I haven't used my enter key at all.
Here's an example of your program behaving correctly:
http://coliru.stacked-crooked.com/a/809284ca3cc27423

Both Emacs and Vim append a trailing newline when saving a file that does not already end with one.
You can disable this in Emacs by evaluating
(setq require-final-newline nil)
In Vim, open in binary mode (with -b) and then run the command
:set noeol

You can verify your program's output using wc -c.
I am using the text editor pluma for coding. pluma comes with the MATE desktop. I did a little googling and didn't find anything specific about pluma, but apparently gedit does the same thing with extra newlines. Pluma is supposed to be a fork of gedit, so I guess pluma does the same thing.

This issue is not a showstopper for me, more a curiosity. I will point out that when I repeat my earlier tests, but add my own newline to the end of the text file, pluma still adds something, because the character count I get is now 8 instead of 7.

regards.
