Recording the frequency of n-grams in a map

I am tasked with recording the frequency of each n-gram in the map<string, long> that is passed in by reference. Below is what I have so far. Note that clean_string just removes spaces and special characters and makes the line lowercase.

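#include <map>
#include <string>
#include <vector>
using namespace std;

// Returns every substring of length n in w, in order of appearance.
vector<string> generate_ngrams(string w, size_t n) {
	vector<string> ngrams;

	// Compare i + n against the length instead of computing w.length() - n:
	// that subtraction is unsigned and wraps around when n > w.length().
	for (size_t i = 0; i + n <= w.length(); i++) {
		ngrams.push_back(w.substr(i, n));
	}

	return ngrams;
}

void process_line(map<string, long>& m, string line, size_t n) {
	// clean_string is defined elsewhere.
	string cleaned = clean_string(line);
	// The returned n-grams are not recorded in m yet.
	generate_ngrams(cleaned, n);
}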


Basically, what is supposed to happen is that the n-grams are generated by my generate_ngrams function, and then I am supposed to use the map m to count how many times each one repeats and output the result as follows:

If my input is: this thin thing!! and n is 3 (so I am finding 3-grams)

My output should be: hin:2, his:1, ing:1, int:1, ist:1, nth:1, sth:1, thi:3

I am not sure how to take the elements of the vector that generate_ngrams returns and check them against the map.
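
One possible approach, as a minimal sketch rather than the required solution: loop over the returned vector and increment m[gram] for each element. No explicit comparison is needed, because map's operator[] creates a missing key with a count of 0 before the increment, and a map iterates its keys in sorted order. The clean_string below is a hypothetical stand-in that assumes "special characters" means anything that is not a letter or digit, and format_counts is a made-up helper name used only to build the output line.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the clean_string mentioned in the question:
// keeps only letters and digits, lowercased.
std::string clean_string(const std::string& line) {
    std::string out;
    for (unsigned char c : line) {
        if (std::isalnum(c)) out += static_cast<char>(std::tolower(c));
    }
    return out;
}

// Every substring of length n in w, in order of appearance.
std::vector<std::string> generate_ngrams(const std::string& w, std::size_t n) {
    std::vector<std::string> ngrams;
    for (std::size_t i = 0; i + n <= w.length(); i++) {
        ngrams.push_back(w.substr(i, n));
    }
    return ngrams;
}

// Adds one to the count of every n-gram found in line.
void process_line(std::map<std::string, long>& m, const std::string& line, std::size_t n) {
    for (const std::string& gram : generate_ngrams(clean_string(line), n)) {
        ++m[gram];  // operator[] inserts a missing key with value 0 first
    }
}

// Hypothetical helper: renders the map as "key:count, key:count".
std::string format_counts(const std::map<std::string, long>& m) {
    std::string out;
    for (const auto& entry : m) {
        if (!out.empty()) out += ", ";
        out += entry.first + ":" + std::to_string(entry.second);
    }
    return out;
}
```

Under those assumptions, calling process_line on an empty map with the line this thin thing!! and n set to 3, then passing the map to format_counts, gives hin:2, his:1, ing:1, int:1, ist:1, nth:1, sth:1, thi:3. Because process_line only ever adds to m, calling it once per input line accumulates counts across the whole input.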