Function that prints the frequency of all words that appear in two given texts?

Hi. I need to create a function that prints the frequency of all words from two given texts. I must do this using mainly POINTERS. I already have the first part of the program which prints the common words from the two texts, now all I need is to display the frequency of each word. Any tips or suggestions are welcome.

PS: Disregard the actual texts, they are in Romanian.

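#include <cstring>
#include <iostream>

int main()
{
    char text_A[512] = "Ana are mere. Andrei i le-a mancat pe toate.";
    char text_B[512] = "Ana nu mai are mere din cauza ca Andrei i le-a mancat.";

    // Split text_A into words; strtok modifies the buffer in place.
    char* token = std::strtok(text_A, " ,.!?");
    while (token)
    {
        // Note: strstr finds substrings, so "i" also matches inside "mai".
        if (std::strstr(text_B, token))
        {
            std::cout << token << std::endl;
        }

        token = std::strtok(nullptr, " ,.!?");
    }
}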
This version counts the words in text_A so that your loop stays unchanged. You may want to change the loop to walk over the 512 characters directly, so that you can tokenize and compare both A and B at the same time.

#include <iostream>
#include <cstring>
#include <map>
#include <string>

int main() 
{
    char text_A[512] = "Ana are mere. Andrei i le-a mancat pe toate.";
    char text_B[512] = "Ana nu mai are mere din cauza ca Andrei i le-a mancat.";
    std::map<std::string, int> freq;

    char* token = strtok(text_A, " ,.!?");
    while (token)
    {
        if (std::strstr(text_B, token))
        {
            std::cout << token << std::endl;   
        }
        freq[token] += 1;

        token = strtok(nullptr, " ,.!?");
    }
    
    std::cout << "\n\n";
    for (auto& item : freq)
        std::cout << item.first << ": " << item.second << std::endl;
}
Thanks, but we are supposed to use mainly pointers to solve this, as I specified in the post. We didn't even study map structure yet :(
What are you allowed to use?
Well, this is a homework about Pointers. So I'm guessing we have to focus on pointers mostly. The only things we studied by now are Stack, Queue and Pointers. We are also allowed to use strtok and other things that don't require extra libraries to be imported. This is only the 4th week of the course.
What are all the things you need to do? Is it just
"...to create a function that prints the frequency of all words from two given texts... using mainly POINTERS"?

Is finding common words an actual requirement, or did you do that for fun?

map, ordered or unordered, is a very common way to determine frequency. You'll find this same strategy in all sorts of languages -- Ruby, Python, etc., where sometimes it's called a "hash", or a "dictionary". You can probably change your loop to use pointers, but in the end the storage will likely be in a map.
Just found strtok_r, which remembers its position in a particular string through a caller-supplied pointer. The regular strtok keeps its position in internal static state, so it can't track two strings at the same time.

If you don't need the common strings, this will do the total frequency job of both texts:

#include <iostream>
#include <cstring>
#include <map>
#include <string>

int main() 
{
    char text_A[] = "this... this... this... sentence is is awesome awesome awesome!";
    char text_B[] = "I don't know why you think this sentence is awesome?!";
    char delimiters[] = " ,.!?";
    std::map<std::string, int> freq;
    
    // Remember positions of tokenizers, using POINTERS!
    char* pos_a;
    char* pos_b;
    
    // The regular strtok keeps a single internal position, so it can't interleave two strings
    char* token_a = strtok_r(text_A, delimiters, &pos_a);
    char* token_b = strtok_r(text_B, delimiters, &pos_b);
    
    while (token_a || token_b)
    {
        if (token_a)
        {
            //std::cout << "found tokenA: "<<token_a<< std::endl;
            freq[token_a] += 1;
            token_a = strtok_r(nullptr, delimiters, &pos_a);
        }
        
        if (token_b)
        {
            //std::cout << "found tokenB: "<<token_b<< std::endl;
            freq[token_b] += 1;
            token_b = strtok_r(nullptr, delimiters, &pos_b);
        }
    }
    
    std::cout << "\n\n";
    for (auto& item : freq)
        std::cout << item.first << ": " << item.second << std::endl;
}


Since a std::map is ordered by key by default, so is the output:


I: 1
awesome: 4
don't: 1
is: 3
know: 1
sentence: 2
think: 1
this: 4
why: 1
you: 1

You can run it at https://repl.it/repls/AnguishedPrevailingSale
Ok so the assignment asks to print the common words of two texts using pointers, and then to display the frequency of words using pointers. Probably ok if in two separate programs. The thing is I am sure we can't use map yet, because we barely studied the basics. In the course objectives map/hash is at the bottom of the list. So I think we should use POINTERS ONLY for this. Thank you for your solutions anyway @icy1 and sorry for wasting your time.
Are you allowed to use a struct and array?
I mean, somewhere you have to store the count of each word.
In this case you could create a struct like this:
const int MAX_WORDS = 100; // or whatever 

struct WordInfo
{
  char word[20];
  int count;
};

WordInfo words[MAX_WORDS];

Then you could iterate through all the words and look each one up in the array. If you find it, increment the count; otherwise add it with a count of 1.
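A minimal sketch of that idea, assuming the WordInfo struct from this post and walking the array with pointers instead of indices. The helper names (find_word, count_words, print_frequencies) and MAX_LEN are made up for illustration, and words longer than 19 characters are simply truncated:

```cpp
#include <cstring>
#include <iostream>

const int MAX_WORDS = 100; // assumed capacity, as in the post above
const int MAX_LEN = 20;    // matches char word[20]

struct WordInfo
{
    char word[MAX_LEN];
    int count;
};

// Hypothetical helper: return a pointer to the entry for `token`
// in [first, last), or `last` if the word is not stored yet.
WordInfo* find_word(WordInfo* first, WordInfo* last, const char* token)
{
    for (WordInfo* p = first; p != last; ++p)
        if (std::strcmp(p->word, token) == 0)
            return p;
    return last;
}

// Hypothetical helper: tokenize `text` in place and update the table.
// [first, last) is the used part of the table, `limit` is one past its
// capacity. Returns the new end of the used part.
WordInfo* count_words(char* text, WordInfo* first, WordInfo* last, const WordInfo* limit)
{
    const char* delimiters = " ,.!?";
    for (char* token = std::strtok(text, delimiters); token;
         token = std::strtok(nullptr, delimiters))
    {
        // Truncate first, so lookup and storage use the same key.
        char key[MAX_LEN];
        std::strncpy(key, token, MAX_LEN - 1);
        key[MAX_LEN - 1] = '\0';

        WordInfo* hit = find_word(first, last, key);
        if (hit != last)
        {
            ++hit->count;               // word already known
        }
        else if (last != limit)
        {
            std::strcpy(last->word, key); // new word, count starts at 1
            last->count = 1;
            ++last;
        }
        // If the table is full, further new words are silently skipped.
    }
    return last;
}

// Print every stored word with its count, in first-seen order.
void print_frequencies(const WordInfo* first, const WordInfo* last)
{
    for (const WordInfo* p = first; p != last; ++p)
        std::cout << p->word << ": " << p->count << '\n';
}
```

To count both texts together, declare WordInfo words[MAX_WORDS], call count_words(text_A, words, words, words + MAX_WORDS), pass the returned pointer as `last` in a second call for text_B, and hand the final pointer to print_frequencies.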
Yes, I am allowed. So how exactly would I iterate through words using pointers? I know how to iterate through chars, but how do I save an entire word? Do I need something like two pointers?
closed account (E0p9LyTq)
Here's an example using std::strtok() to parse out whole words.
http://en.cppreference.com/w/cpp/string/byte/strtok

#include <cstring>
#include <iostream>
 
int main() 
{
    char input[100] = "A bird came down the walk";
    char *token = std::strtok(input, " ");
    while (token != NULL)
    {
        std::cout << token << '\n';
        token = std::strtok(NULL, " ");
    }
}

A
bird
came
down
the
walk

Easy to adapt for storing the tokenized words in an array. Only one pointer needed.
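As a rough sketch of that adaptation: the function below stores a pointer to the start of each token in a caller-supplied array. The name split_words and its parameters are made up for illustration. Nothing is copied, so the stored pointers are only valid while the original buffer is alive.

```cpp
#include <cstring>

// Hypothetical helper: split `text` in place and store a pointer to the
// start of each word in `out`. Returns the number of words stored,
// never more than `max_words`.
int split_words(char* text, const char* delimiters, char** out, int max_words)
{
    char** next = out;                    // next free slot in the array
    char** const limit = out + max_words; // one past the last slot

    char* token = std::strtok(text, delimiters);
    while (token != nullptr && next != limit)
    {
        *next++ = token;                  // points into `text`, no copy
        token = std::strtok(nullptr, delimiters);
    }
    return static_cast<int>(next - out);
}
```

With char input[100] = "A bird came down the walk" and char* words[10], split_words(input, " ", words, 10) fills words[0] through words[5], and each entry can then be printed or compared with std::strcmp.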

There is no C++ standard strtok() equivalent for dealing with a C++ std::string. There are 3rd party solutions. Boost has one, as well as numerous "hand-rolled" solutions.
http://oopweb.com/CPP/Documents/CPPHOWTO/Volume/C++Programming-HOWTO-7.html
closed account (E0p9LyTq)
Just found a strtok_r

strtok_r is POSIX only, not part of the C standard. C11 has strtok_s() in its optional Annex K.
http://en.cppreference.com/w/c/string/byte/strtok

The strtok_s function differs from the POSIX strtok_r function by guarding against storing outside of the string being tokenized, and by checking runtime constraints.
So how exactly would I iterate through words using pointers?

You did it in your first post with strtok. Instead of printing the word you count it as I described earlier.
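On the earlier question about needing two pointers: if you want to avoid strtok, one possible sketch is to mark each word with a begin pointer and an end pointer. The names is_delimiter, next_word and count_occurrences are made up for illustration. Unlike the strstr check in the first post, this compares whole words only, and it does not modify the text.

```cpp
#include <cstring>

// Hypothetical helper: true if `c` is one of the delimiter characters.
bool is_delimiter(char c, const char* delimiters)
{
    // strchr would also "find" the terminating '\0', so exclude it.
    return c != '\0' && std::strchr(delimiters, c) != nullptr;
}

// Hypothetical helper: find the next word at or after `cursor`.
// On success, [begin, end) delimits the word and `cursor` is moved to `end`.
bool next_word(const char*& cursor, const char* delimiters,
               const char*& begin, const char*& end)
{
    const char* p = cursor;
    while (is_delimiter(*p, delimiters))  // skip leading delimiters
        ++p;
    if (*p == '\0')
    {
        cursor = p;
        return false;                     // no more words
    }
    begin = p;                            // first pointer: start of word
    while (*p != '\0' && !is_delimiter(*p, delimiters))
        ++p;
    end = p;                              // second pointer: one past the word
    cursor = p;
    return true;
}

// Count how many times `word` appears in `text` as a whole word.
int count_occurrences(const char* text, const char* word, const char* delimiters)
{
    const std::size_t len = std::strlen(word);
    const char* begin = nullptr;
    const char* end = nullptr;
    int count = 0;
    while (next_word(text, delimiters, begin, end))
    {
        if (static_cast<std::size_t>(end - begin) == len &&
            std::strncmp(begin, word, len) == 0)
            ++count;
    }
    return count;
}
```

Because the word is described by two pointers rather than a terminated copy, the comparison needs both the length check and strncmp.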