Counting duplicate strings in a vector efficiently

Assume I have some function like

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
void tallyList(vector<string> list)
{
	cout<<"Tally List called!\n";
	if(list.size()==0)
	{
		cout<<"the list is empty\n";
		return;
	}
	vector<string> nStrng;
	vector<int> nInt;

	for(int i=0;i<list.size();i++)
	{
		bool duplicate=false;

		for(int j=0;j<nStrng.size();j++)
		{
			if(list[i]==nStrng[j])
			{
				//cout<<"match found!\nAdd one to "<<nStrng[j]<<"\n";
				nInt[j]+=1;
				duplicate=true;
				break;
			}
		}
		if (duplicate==false)
		{
			//cout<<"New word found! "<<list[i]<<"\n";
			nStrng.push_back(list[i]);
			nInt.push_back(1);
		}
	}
	//cout<<"Print final tally...\n";
	cout<<nStrng.size()<<" unique words\n";
	for (int i=0;i<nStrng.size();i++)
	{
		cout<<nStrng[i]<<": "<<nInt[i]<<"\n";
	}

	//cout<<"Tally List complete.\n";
}


The above code works. My question is more about how I did it. Did I do anything wrong and or dumb? Is there something that could be changed to make my code cleaner or more efficient? Is there some completely different way to approach the problem that is more advantageous?

Thanks.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
#include <set>
#include <algorithm>

void tallyList(vector<string> &list)
{
	//ends function if list is empty
	if (list.empty())
	{
		cout << "The list is empty" << endl;
		cin.ignore();
		return;
	}

	//creates a set which containes the unique words within list
	set<string> unique(list.begin(), list.end());

	cout << "unique words " << unique.size() << endl;

	//outputs a list of uniqe words
	for (auto element : unique)
		cout << element << ' ' << count(list.begin(), list.end(), element) << endl;
}



You should pass vector by reference passing by value can be expensive
You could make use of the unique function in the algorithm header file
http://www.cplusplus.com/reference/algorithm/unique/
Last edited on
Thanks! This was exactly the type of answer I was hoping for. My only question is what cin.ignore() does here
Last edited on
It is not always working method of waiting for confirmation enter press.
1
2
3
	//outputs a list of uniqe words
	for (auto element : unique)
		cout << element << ' ' << count(list.begin(), list.end(), element) << endl;
O(n^2)


1
2
3
4
5
6
7
8
9
10
std::map<std::string, int>
tallyList(const std::vector<std::string> &list){
   std::map<std::string, int> count;
   for( auto x : list )
      ++count[x];

   //count.size() has the number of unique words
   //count["word"] has the frequency
   return count;
}
O(n lg n)
you could also sort the vector and then use std::upper_bound and std::distance to count
std::unordered_map instead of std::map could give you an O(n) in theory.
Topic archived. No new replies allowed.