Really serious problem C++ even hard to explain

So I'm having a headache with this ugly bug... I need to count how many times the longest words in two txt files (CD1 and CD2) are repeated in both of them. I know it sounds easy, but I don't understand what's wrong! Please help me or give me some advice on how to do that. Thanks

Deividas

Here are my txt files. CD1:
Be who you are and say what you feel, because those who mind don't matter, and those who matter don't mind

CD2:
Be who you are and say only what you really feel, because those who mind don't matter


And here are my code fragments (sorry, I'm from Lithuania, so English is not my native language)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
void Read(const char CD[], string & eil, string & skyr){
	ifstream fd(CD1); // Count CD2 !!!
	ofstream rf(RF);
	ofstream rf2(RF2);
	string max[10]; int ind = 0;
	rf << string(35, '-') << "Pradiniai duomenys" << string(35, '-') << endl; // "Pradiniai duomenys" = "Initial data"
	while(!fd.eof()){
		getline(fd, eil);
		rf << eil << endl;
		Less(eil);
		AnalyzeEil(eil, skyr, ind, max);
	}
	for(int j=0; j<ind; j++){
		if(Find(max[j]) == true){
		cout << max[j] << " " << max[j].length() << endl;
		}
	}
	fd.close();
	rf.close();
	rf2.close();
}

void Read2(const char CD2[], string & eil2, string & skyr){
	ifstream fd(CD2);
	ofstream rf(RF, ios::app);
	ofstream rf2(RF2, ios::app);
	rf << endl << string(70, '-') << endl;
	while(!fd.eof()){
		getline(fd, eil2);
		rf << eil2 << endl;
		Less(eil2);
	}
	fd.close();
	rf.close();
	rf2.close();
}

void Less(string & eil){
	for(int i=0; i<eil.length(); i++){
		eil[i] = tolower(eil[i]);
	}
}

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
void AnalyzeEil(string eil, string skyr, int & ind, string max[]){
string zodis;
    int zpr = 0, zpb = 0;
    while ((zpr = eil.find_first_not_of(skyr, zpb)) != string::npos){
        zpb = eil.find_first_of(skyr, zpr);
        zodis = eil.substr(zpr, zpb - zpr);
	    if(ind!=10){
			max[ind++] = zodis;
			//cout << Count(max[ind]) << endl;
		}else{
			for(int i=0; i<ind; i++){
				if(max[i].length() < zodis.length()){
					max[i] = zodis;
					//cout << Count(max[i]) << endl;
					break;
				}
			}
		}
	}
}

bool Find(string f){
	ifstream fd(CD2);
	string eil;
	while(!fd.eof()){
		getline(fd, eil);
		if(tolower(eil.find(f) != -1)){
			return true;
		}
	}
	fd.close(); 
	return false;
}

int Count(string k){
	ifstream fd2(CD2);
	string eil2;
	int kiek = 0;
	while(!fd2.eof()){
		getline(fd2, eil2);
		if(eil2.find(k) != -1){
			kiek++;
		}
	}
	return kiek;
}


Please, someone help me. I think that my function Count is not correct!
> Please, someone help me. I think that my function Count is not correct!


No, it's not.

http://www.cplusplus.com/reference/string/string/find/
Function Count counts each line of text that contains at least one occurrence of k. If there is more than one match per line, the extras are ignored, because find is only called once per getline.
Do one thing, and do it well

Read()
a ¿why is it opening two files for writing?
b The first argument is unused
c Don't loop on eof, use the reading operation instead: while( getline(fd, eil) )
d `eil' does not make sense as an argument; it should be a local variable
e If you don't intend to modify `skyr', pass it as const std::string &
f 13-17: ¿output? that shouldn't be here.
g 18-20: the destructors would take care of that.

Read2()
shares problem a, c, g with Read()
¿how is this different from `Read()'? (¿why should it be different?)
You are overwriting the variable. At the end you'll only have the last line.

AnalyzeEil()
¿what is the purpose of this function?
10-18: ¿what is the purpose of the `else' block?

Find()
problem c
if(tolower(eil.find(f) != -1)){ ¿what?
you shouldn't need to be reopening the file over and over again.

Count()
problem c
.find() returns std::string::npos if it doesn't find the string. It may not be the same as -1
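
A minimal sketch of the comparison against `std::string::npos`, as it might be used inside `Find()` or `Count()`. The name `Contains` is hypothetical and only serves the example.

```cpp
#include <string>

// Hypothetical helper: true if `word` occurs anywhere in `line`.
// find() reports "not found" as std::string::npos, so compare against that constant.
bool Contains(const std::string& line, const std::string& word) {
    return line.find(word) != std::string::npos;
}
```

Note that `find()` is a substring search, so `Contains("whole", "who")` is also true. If whole words are required, the line has to be split into words first.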



> I need to count how many words (which are longest in txt files)
> in both txt files (CD1 and CD2) repeated...
I don't understand your description
¿what would be the desired output?
Thank you all for the replies. I need the longest words that are repeated in both txt files, and then I need to count how many times they occur. So, for example, with my txt files

Be who you are and say what you feel, because those who mind don't matter, and those who matter don't mind


Be who you are and say only what you really feel, because those who mind don't matter

So my output should be (repeated in both files)
because 1 time
matter 1 time
who 2 times
and so on.
To JockX, thanks for the reply. Is it possible to change my Count function somehow? I'm really confused because I have no idea how I need to change it. Generally speaking, is it possible to count words with my algorithm?
Thanks

Deividas
To fix your function, make sure you repeat the eil2.find() as long as it finds something. But the next find() should start searching eil2 from the position where the previous k was found, so you may use the other version of find, that takes two arguments:
while(!fd2.eof()){            // Your loop
    getline(fd2, eil2);       // your existing read
    size_t startAt = 0;
    while (true){             // My loop
        startAt = eil2.find(k, startAt); // start searching at position startAt
        if( startAt != string::npos){ // if hit something
            startAt += k.length(); // move startAt to the position just after k
            kiek ++;
            continue;
        }
        break;
    } // my loop
} // your loop

It would be even easier, if the function accepted entire content of file as one string, instead of processing it line by line, which unnecessarily adds complexity.
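
As a rough sketch of that idea, the counting could be done on a single string as shown below. `CountOccurrences` is a hypothetical name, and the guard against an empty `k` is an addition: an empty string matches at every position, so `startAt` would never advance.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts non-overlapping occurrences of `k` in `text`.
int CountOccurrences(const std::string& text, const std::string& k) {
    if (k.empty()) return 0;   // an empty k would make the loop below run forever
    int kiek = 0;
    std::size_t startAt = 0;
    while ((startAt = text.find(k, startAt)) != std::string::npos) {
        ++kiek;
        startAt += k.length(); // continue searching just after this match
    }
    return kiek;
}
```

The same function works on one line at a time, so it can also be called from a line-by-line loop and the results summed.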
Thanks for the reply.
> It would be even easier, if the function accepted entire content of file as one string, instead of processing it line by line, which unnecessarily adds complexity.

But what if the txt document has a lot of lines? My teacher doesn't let us do that, because the program could break. I don't know why, but when I change my function I get an endless loop.
int Count(string k){
	ifstream fd2(CD2);
	string eil2;
	int kiek = 0;
	while(!fd2.eof()){
		size_t startAt = 0;
		getline(fd2, eil2);
		while(true){
			startAt = eil2.find(k, startAt);
			if(startAt != string::npos){
				startAt = startAt + k.length();
				kiek++;
				continue;
			}
			break;
		}
	}
	return kiek;
}
Any ideas what's wrong?
> I get endless loop.
run through a debugger
interrupt your program
check out the variables

(k may be empty)
I have realised why I get the endless loop. Unfortunately, my program still doesn't work :(. When I write Count(eil) in my Read function, I get the results 0 0.
Read() should not count. Read() should read, period.
I would suggest to put all the words in a container (like std::vector), and pass that to the other functions.

Update your code
Also, you can test each function individually.
To ne555, thanks for the reply, but my teacher said that putting all the words in a container or array is not a good idea, because my txt file could have millions of lines, so the program could break. So I need a better idea for how to solve this ugly bug.
You don't need a `Read()' function then (or change its name to something more meaningful)
You could simply work word by word instead of reading lines. That would simplify your `Count()' function to a sane level.


Edit: I still don't understand the `longest words repeated' part
In your example `who' has length 3, ¿why is it considered "longest" ?
Also `matter' appears two times in the first sentence, and one in the second. ¿why is your output 1?
¿do you want to simply count each word in the other file?
while read word from input1
   count=0
   while read aux from input2
      if word==aux
         ++count
   print word "appeared" count "times"
ne555, thanks again for the quick reply. Here is my full code: http://pastebin.com/5TpWJiFs You can see that I try to separate my line (eil) in the Max function.
To ne555: I need to find the 10 longest words in BOTH FILES. I need to find the longest words in text file 1 and then in text file 2, and if they are the same I need to count them.
The 10 longest in each file, and then count the mutual
Or, take all the mutual, and then count the 10 longest


I suppose that `Max()' should obtain the 10 longest words, but it is incorrect.
You simply overwrite any word that has less length than the one being tested.

Suppose a simplified situation where you are interested in the 2 longest words. So far you have a = "1234" and b = "12".
Then a test string test = "123456" comes in, and your algorithm will do
a = "123456";
b = "12";
but the correct result would be
a = "123456";
b = "1234";


You need to keep the array sorted: each time you test a new word, you `insert' it at the position that keeps the array sorted (like insertion sort). You could also avoid repeated elements here.
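
One possible shape for such a sorted insert is sketched below. The names `TOP` and `InsertLongest` are assumptions for illustration, and the capacity of 10 is taken from the "10 longest words" requirement; the array is kept ordered from longest to shortest.

```cpp
#include <string>

const int TOP = 10;   // assumed capacity, matching the "10 longest words" requirement

// Hypothetical helper: keeps top[0..n) sorted from longest to shortest,
// without duplicates, holding at most TOP words.
void InsertLongest(std::string top[], int& n, const std::string& zodis) {
    for (int i = 0; i < n; ++i)
        if (top[i] == zodis) return;              // already stored, skip the repeat
    if (n == TOP && zodis.length() <= top[n - 1].length())
        return;                                   // not longer than the shortest kept word
    int pos = (n < TOP) ? n++ : TOP - 1;          // grow, or overwrite the last (shortest) word
    while (pos > 0 && top[pos - 1].length() < zodis.length()) {
        top[pos] = top[pos - 1];                  // shift shorter words one slot down
        --pos;
    }
    top[pos] = zodis;
}
```

With the example above, inserting "1234", "12" and then "123456" leaves the array as "123456", "1234", "12", so the second-longest word is no longer lost.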


I don't see you using `Count()' anywhere.
Again, simplify your functions so they do just one thing and test them separately.
ne555, thanks again. Here is my new algorithm, which really counts the mutual words
void Mutual(string longestWords[], int & n, const char CD1[], const char CD2[], int longest){
	string eil;
	string zodziai[CMAX];
	while (longest != 0 || n < 10){
		ifstream fd(CD1);
		while (!fd.eof() && longest != 0 && n < 10){
			int kiek = 0;
			getline(fd, eil);
			Less(eil);
			AnalyseEil(eil, zodziai, kiek);
			for(int i = 0; i < kiek; i++){
				if(zodziai[i].length() == longest && NeraIrasytas(ilgiausiZodziai, n, zodziai[i]) && n < 10){
					if(YraAntrameTekste(CD2, zodziai[i])){
						longestWords[n++] = longest[i];
						 Kiek(ilgiausiZodziai);
					}
				}
			}
		}
		fd.close();
		longest--;
	}
}


But I still can't count the number of words in the files :( How should I do that?
?