Reading from a file

Yeah, I spoke too soon. I didn't check all the files before posting that. If the count is set to 1 like it is now, the third and fourth test files are off by 1: the file that has 21 words and the file that has 19 words.
If they are all off by the same constant amount, then it’s a matter of finding out what is causing the difference and adjusting the program. If they are off by different amounts, it will probably be harder to track down.
I think I'll go with what seeplus said. That code is working as required, but I need to understand it a bit more. Is he actually using getline before opening the file? Like so:
"std::getline(std::cin, fnam);

std::ifstream ifs(fnam);"

I didn't know that was possible. I also need to read up on stringstream again, because I don't know what "iss" does in:
"for (std::string wrd; iss >> wrd; ++wcnt);"
Enter file name: test2.txt
This is a &%file that should!!,...


This file must have several spaces at the end of the file.
have 21 words.


There are 25 words:

#1#Enter #2#file #3#name: #4#test2.txt
#5#This #6#is #7#a #8#&%file #9#that #10#should!!,...


#11#This #12#file #13#must #14#have #15#several #16#spaces #17#at #18#the #19#end #20#of #21#the #22#file.
#23#have #24#21 #25#words.
OK, so after some time I reached out to my teacher, and he told me to watch a video lecture from the previous year that had almost all the problems solved. So I ended up using his code instead. It was indeed the teacher's idea to open and close the file twice, and to my understanding this resets the read position (the pointer, or placeholder, whatever you call it). One of my biggest problems was running into infinite loops because I didn't have a proper while-loop condition such as "while (getline (inputFile, output))"; instead I had the getline inside the loop body, which was ruining my loop. After that the whole thing was working, file after file. This is what I wrote:
#include <iostream>
#include <fstream>
#include <string>
using namespace std;

int main()
{
    // Holds word output, users file name, and file output
    string word;
    string fileName;
    string output;

    // Declare input file stream named inputFile
    ifstream inputFile;

    // Holds the word count
    int wordCount = 0;

    // Print out beginning statement
    cout << "    *** A SIMPLE FILE PROCESSING PROGRAM ***\n\n";
    cout << "Enter a filename or type quit to exit: ";

    cin >> fileName;


    // Loop as long as user did not type in quit
    while (fileName != "quit")
    {
        inputFile.open(fileName);

        // Check for fail of file open
        while (!inputFile.is_open())
        {
           cout << "File not found. Enter the correct filename: ";
           cin >> fileName;
           inputFile.open(fileName);
        }

        cout << fileName << " data\n";
        cout << "***********************\n";

        // Print file output as long as there is data to read
        while (getline (inputFile, output))
        {
            cout << output << "\n";
        }

        cout << "\n***********************\n";

        // Close and open to reset the read pointer
        inputFile.close();
        inputFile.open(fileName);

        // Count number of words as long as there is data to read
        while (inputFile >> word)
        {
            wordCount++;
        }

        // Print out ending statements
        cout << fileName << " has " << wordCount << " words.\n";

        inputFile.close();

        wordCount = 0;

        cout << "Enter a filename or type quit to exit: ";
        cin >> fileName;

    }
    cout << "Now exiting the program........\n";

    return 0;
}
And @againtry I'm not sure what you mean by your last post. What I learned is that it totally matters where the read position is in this case; it doesn't just start over in the file randomly or when you wish it to. Also, I need to practice manipulating strings and chars, because I worked with numbers before and that seemed easier, and also practice writing more condensed loops and conditions. I'm going to mark this as solved.
Yeah - but that means reading the file twice which is not required. This hits performance if the file is large.

Also, rather than closing/opening the same file to display and then to count, once you have read the file the first time you can reset the file position to the beginning with:

inputFile.clear();
inputFile.seekg(0);

This program can be simplified - eg by only asking for file name once. Consider:

#include <iostream>
#include <fstream>
#include <string>

int main() {
	const std::string getfname {"\nEnter a filename or type quit to exit : "};
	std::string fileName;

	std::cout << "    *** A SIMPLE FILE PROCESSING PROGRAM ***\n\n";

	while ((std::cout << getfname) && std::getline(std::cin, fileName) && fileName != "quit") {
		std::ifstream inputFile(fileName);

		if (inputFile) {
			std::cout << fileName << " data\n";
			std::cout << "***********************\n";

			for (std::string output; getline(inputFile, output); )
				std::cout << output << "\n";

			std::cout << "\n***********************\n";

			inputFile.clear();
			inputFile.seekg(0);

			unsigned wordCount {};

			for (std::string word; inputFile >> word; )
				++wordCount;

			std::cout << fileName << " has " << wordCount << " words.\n";
		} else
			std::cout << "File not found\n";
	}

	std::cout << "Now exiting the program........\n";
}

"Yeah - but that means reading the file twice which is not required. This hits performance if the file is large."
Well, that brings me to my next point. It's a love-hate relationship with my book (currently reading "Starting Out with C++" by Tony Gaddis). It's easy to read; however, it does not go over things like pointer position (yet), and it only lightly covers some topics, like the getline function, which gets only a couple of paragraphs. Also, I noticed in your previous code you used a getline before opening the file. Is this possible?

"std::getline(std::cin, fnam);"

"std::ifstream ifs(fnam);"

And in your current example, I did just happen to read about "inputFile.seekg(0)". seekg is used for input files, and I'm guessing "(0)" sets the pointer back to the beginning? Thank you, I will try that out.
His getline reads from the keyboard, not the file. cin is the standard input (keyboard) "file" (it's a lot like a file, but not on disk, which is a whole new topic on streams).

The line after that opens the file, with the name that was typed in.

seekg has some handy constants with good names you can use, std::ios::beg and std::ios::end (plus std::ios::cur), if you don't want to use magic numbers.


Oh, of course. It's stored in the keyboard buffer, wow ok. Thank you! I'll read up more on seek.
And @againtry I'm not sure what you mean by your last post.

@jetm0t0
FWIW, the point of my #-ridden post is to show the correct word count, word by word, based on the particular sample you provided.

Regardless of the program/method used to determine the count, needless to say, the automated count must be the same as that manual count.

Interestingly, the punctuation in the sample can lead us astray, and the inclusion of the word count in the sample was a 'sinister' red herring - very smart :)

This must be the cheesiest way so far.
#include <cstdio>    // FILE, fopen, fseek, ftell, fread, fwrite, fclose, std::remove
#include <string>
#include <iostream>

using namespace std;


int filelen(string filename)
{  
    FILE * fp =fopen(filename.c_str(),"rt+");
       fseek(fp, 0, SEEK_END);
       int length =ftell(fp);
    fclose(fp);
    return(length);
}


void save(string filename,string content)
{
    FILE * f = fopen(filename.c_str(), "wb");
    int size= content.length();
    fwrite(&content[0],size,1,f) ;
    fclose(f);
}


void load(string filename,string  &content)
{
	FILE * f = fopen(filename.c_str(), "rb");
	int lngth=filelen(filename);
	content.resize(lngth,' ');
	fread(&content[0],lngth,1,f); 
   fclose(f);
}

int words(string s)
{
	int i,c=0;
	s=s+" ";
	for (i=1;i<=s.length()-1;i++)
	{if ((s[i]==32 or s[i]==10 or s[i]==13)   and (s[i-1] != 32 and s[i-1]!=13 and s[i-1]!=10))  c=c+1;}
	return c;
}

int fileexists(string filename)
{
FILE * f = fopen(filename.c_str(), "r");
if (f){
   fclose(f);
   	return 1;
}else{
	return 0;
}
}


int main()
{
	string s,ret;
s =	"This is a test.\n";
s=s+"Some     more            text\n";
s=s+"And   even more         text";
cout<<"To file:"<<endl;
cout<<s<<endl<<endl;

	save("Ltest.txt",s);
	load("Ltest.txt",ret);
	cout<<"From file:"<<endl;
	cout<<ret<<endl<<endl;
	cout<<"characters "<<filelen("Ltest.txt")<<endl;   // measure the file that was just saved
	cout<<"words "<<words(ret)<<endl;

   
	std::remove("Ltest.txt");
	if (fileexists("Ltest.txt")) {cout<<"Delete the file manually"<<endl;}
	else
	{ cout<<"The file has been deleted"<<endl;}

		cout <<"Press return to end . . ."<<endl; 
	 cin.get();
		 	
} 
Why c-style file i/o in a C++ program?

C++ style instead of C style:

#include <cstdio>    // std::remove
#include <iostream>
#include <sstream>
#include <fstream>
#include <string>

using namespace std;


bool fileexists(const char *fileName)
{
    ifstream infile(fileName);
    return infile.good();
}

int filelen(const char* filename)
{
    ifstream in(filename,ifstream::ate | ifstream::binary);
    return in.tellg(); 
}

void load(const char* filename,string &content)
{
	ifstream file(filename);
    stringstream strStream;
    strStream << file.rdbuf(); 
    content = strStream.str();
    file.close();
}

void save(const char* filename,string content)
{		
		ofstream file(filename);
		file << content;
	    file.close();
}

int words(string s)
{
	int i,c=0;
	s=s+" ";
	for (i=1;i<=s.length()-1;i++)
	{if ((s[i]==32 or s[i]==10 or s[i]==13)   and (s[i-1] != 32 and s[i-1]!=13 and s[i-1]!=10))  c=c+1;}
	return c;
}


int main()
{
string s,ret;
s =	"This is a test.\n";
s=s+"Some     more            text\n";
s=s+"And   even more         text";


cout<<"To file:"<<endl;
cout<<s<<endl<<endl;

	save("Ltest.txt",s);
	load("Ltest.txt",ret);
	cout<<"From file:"<<endl;
	cout<<ret<<endl<<endl;
	cout<<"characters "<<filelen("Ltest.txt")<<endl;
	cout<<"words "<<words(ret)<<endl;
	
		std::remove("Ltest.txt");
	
	if (fileexists("Ltest.txt")) {cout<<"Delete the file manually"<<endl;}
	else
	{ cout<<"The file has been deleted"<<endl;}

		cout <<"Press return to end . . ."<<endl; 
	 cin.get();
		 	
}

 
#include <iostream>
#include <fstream>
#include <string>
#include <cctype>
using namespace std;

int main()
{
   string filename;
   cout << "Enter filename: ";   cin >> filename;
   ifstream in( filename );

   bool inWord = false;
   int numChars = 0, numAlnum = 0, numWords = 0, numLines = 0;
   for ( char c; in.get( c ); numChars++ )
   {
//    cout.put( c );            // for debug
      if ( !isspace( c ) )
      {
         if ( !inWord ) numWords++;
         if ( isalnum( c ) ) numAlnum++;
         inWord = true;
      }
      else
      {
         if ( c == '\n' ) numLines++;
         inWord = false;
      }
   }
   cout << "Number of characters (inc. EOL): " << numChars << '\n';
   cout << "Number of alphanumeric characters: " << numAlnum << '\n';
   cout << "Number of words: " << numWords << '\n';
   cout << "Number of lines: " << numLines << '\n';
}


Testfile:
Enter file name: test2.txt
This is a &%file that should!!,...


This file must have several spaces at the end of the file.
have 21 words.


Output:
Number of characters (inc. EOL): 138
Number of alphanumeric characters: 99
Number of words: 25
Number of lines: 6


(Note that Windows condenses the CR-LF pair to a single character on read; the actual number of reported bytes for the above file would be 138+6 in Windows.)
I counted 25 too, but on reflection there are only 7.
There is also the issue of when the last line isn't terminated by a new-line but by eof marker. In this case the number of lines is off by 1.

Also, as Lastchance notes, on Windows the number of chars read (in text mode) is usually not the same as the size of the file. The size includes the '\r' chars, which are discarded when reading in text mode. If the size of the file is needed, then either use std::filesystem::file_size() (C++17) or use .seekg(0, std::ios::end) and then .tellg().
Perhaps something like:

#include <iostream>
#include <fstream>
#include <string>
#include <cctype>
using namespace std;

int main() {
	string filename;

	cout << "Enter filename: ";
	cin >> filename;

	ifstream in(filename);

	if (!in)
		return (std::cout << "Problem opening file\n"), 1;

	size_t numChars {}, numAlnum {}, numWords {}, numLines {};

	for (char c {}, inWord {}; in.get(c); ++numChars)
		if (!isspace(static_cast<unsigned char>(c))) {   // cast avoids UB for negative char values
			numWords += !inWord;
			numAlnum += isalnum(static_cast<unsigned char>(c)) != 0;
			inWord = true;
		} else {
			numLines += c == '\n';
			inWord = false;
		}

	in.clear();
	in.seekg(-1, std::ios::end);

	cout << "Number of characters (inc. EOL): " << numChars << '\n';
	cout << "Number of alphanumeric characters: " << numAlnum << '\n';
	cout << "Number of words: " << numWords << '\n';
	cout << "Number of lines: " << numLines + (in.peek() != '\n') << '\n';
}


There is also the issue of when the last line isn't terminated by a new-line but by eof marker. In this case the number of lines is off by 1.


Well, all I can say is that the output from my program gave the correct number of lines (6 for that test file) when tested under Windows. I'll have to test it on unix next time I log on. I've no way of testing what happens on a Mac, although I thought that the end-of-line character there was a '\r' alone, so the program probably won't work.

As far as I'm aware, files (on modern operating systems) don't need or have an EOF marker.