Removing duplicate words from Array of Structures

I'm working on a project where I am supposed to read in a file chosen by the user and store each word in a structure along with the number of times it occurs.

My structure is as follows:
struct wordStruct
{
string word;
int occur;
};
wordStruct StrArray[1000];

I'm currently working on a function where I first sort the words and then remove the duplicates. I have been able to sort the words, but am having trouble removing the duplicates. Here is what I have thus far:

void sortArray (wordStruct StrArray[], int wordCount)
{
for(int spot = 0; spot < items -1; spot ++)
{
int idxMin = spot;
for(int idx = spot+1; idx < items; idx++)
if(list[idx].word < list[idxMin].word)
idxMin = idx;

if(idxMin! = spot)
swap(list[idxMin], list[spot]);
// Remove duplicates (Where I am having trouble)
if (spot > 0)
{
for (int idx = spot-1;idx < wordCount; idx++)
{
swap (StrArray[spot],StrArray[wordCount-1]);
wordCount--;
spot--;
}
}
}

I don't know exactly what I'm doing wrong in the last 4 lines, after trying it a couple of different ways. Any help would be greatly appreciated.

Do you know hash tables? If so, it would be easier to hash each word to a value (ideally a unique one) and then use that hash value as the index into an array. Each element of the array can be a structure that stores the word and the number of occurrences.

Storing it in an array would be a bit difficult as I assume you'd have/want to fix the holes caused by removing the duplicate words.
No, I'm not familiar with them; I'm still in my first C++ class, unfortunately. This is how our prof told us to do it for now, so I think it will work somehow, but I keep getting a Segmentation fault (core dumped) error message when I run it.
You shouldn't have any duplicates. Before adding a word, check to see if it is already there. If it is, just increase the occur value.
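A rough sketch of that check, using the wordStruct from the first post. The helper name addWord and the capacity parameter are assumptions for illustration; the idea is just a linear scan before every insert.

```cpp
#include <string>

struct wordStruct
{
    std::string word;
    int occur;
};

// Hypothetical helper: record one occurrence of `word` in StrArray.
// wordCount is the number of slots in use, capacity is the array size.
// Returns false only when a new word does not fit in the array.
bool addWord(wordStruct StrArray[], int& wordCount, int capacity,
             const std::string& word)
{
    for (int idx = 0; idx < wordCount; idx++)
    {
        if (StrArray[idx].word == word)
        {
            StrArray[idx].occur++;   // already stored: just bump the count
            return true;
        }
    }

    if (wordCount >= capacity)
        return false;                // array is full; refuse rather than overflow

    StrArray[wordCount].word = word; // first time we see this word
    StrArray[wordCount].occur = 1;
    wordCount++;
    return true;
}
```

Called once per word as the file is read, this keeps the array free of duplicates, so the sort never has to remove anything afterwards.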

The braces in your code are messed up. I would recommend indenting every time you write an open brace.

As you can see, something is very wrong here:
void sortArray (wordStruct StrArray[], int wordCount)
{
	for(int spot = 0; spot < items -1; spot ++)
	{
		int idxMin = spot;
		for(int idx = spot+1; idx < items; idx++)
			if(list[idx].word < list[idxMin].word)
				idxMin = idx;

		if(idxMin! = spot)
			swap(list[idxMin], list[spot]);
		// Remove duplicates (Where I am having trouble)
		if (spot > 0)
		{
			for (int idx = spot-1;idx < wordCount; idx++)
			{
				swap (StrArray[spot],StrArray[wordCount-1]);
				wordCount--;
				spot--;
			}
	}
}
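For comparison, here is one possible shape with the braces balanced. It is a sketch, not necessarily what your prof has in mind. It assumes `items` and `list` were meant to be the parameters `wordCount` and `StrArray`, and that every entry's occur was set (for example to 1) when the word was read in. The duplicate removal is moved into a separate, hypothetical removeDuplicates function that runs after the sort has finished. In the posted loop, spot is decremented on every pass, so it can drop below 0, which would plausibly explain the segmentation fault.

```cpp
#include <string>
#include <utility>   // std::swap

struct wordStruct
{
    std::string word;
    int occur;
};

// Selection sort by word, using the parameter names consistently.
void sortArray(wordStruct StrArray[], int wordCount)
{
    for (int spot = 0; spot < wordCount - 1; spot++)
    {
        int idxMin = spot;
        for (int idx = spot + 1; idx < wordCount; idx++)
        {
            if (StrArray[idx].word < StrArray[idxMin].word)
                idxMin = idx;
        }

        if (idxMin != spot)
            std::swap(StrArray[idxMin], StrArray[spot]);
    }
}

// Hypothetical second pass: in a sorted array equal words are neighbours,
// so each run of equal words is folded into one entry.
// Returns the new number of entries in use.
int removeDuplicates(wordStruct StrArray[], int wordCount)
{
    if (wordCount == 0)
        return 0;

    int last = 0;   // index of the last unique entry kept so far
    for (int idx = 1; idx < wordCount; idx++)
    {
        if (StrArray[idx].word == StrArray[last].word)
        {
            StrArray[last].occur += StrArray[idx].occur;   // merge the counts
        }
        else
        {
            last++;
            if (last != idx)
                StrArray[last] = StrArray[idx];   // close the gap
        }
    }
    return last + 1;
}
```

You would call sortArray(StrArray, wordCount) first and then wordCount = removeDuplicates(StrArray, wordCount). Copying entries forward instead of swapping them to the end keeps the array sorted.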
LowestOne is correct. Given that this is an intro class to C++, I don't think efficiency is that important yet, so just scan the list to see whether the word already exists before adding it.