Need a faster (more efficient) program

Write a program that, given a sequence of words, computes how many characters have in total the words that appear once, how many characters have in total the words that appear twice, how many characters have in total the words that appear thrice, etc.

Input

Input consists of several cases. Each case begins with a number 1 ≤ n ≤ 10^5, followed by n words.

Output

For each case of the input and for each number x of repetitions, print in a line the total number of characters (not counting repetitions) of the words that appear exactly x times, If there is no word with x repetitions, do not print anything for this x. Print an empty line after every case.

Public inputs
Input
12 z z ab dddddd z z ab yy yy yy yy x
2 c3po r2d2

Output
1 : 7
2 : 2
4 : 3

1 : 8


The program must have the following interface:
1
2
3
4
5
6
7
#include <iostream>
#include <vector>
using namespace std;

int main() {
//code here
}


The problem would be easy if you were allowed to change whatever part of the program but since you can only write in the main this discards efficient stuff like sorts, functions, etc.

I have tried during 6 straight hours and I have not been able to get a program efficient enough.

The best one I got so far I believe it's this one:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
int main() {
	int size;
	while (cin >> size) {
	    vector <string> v (size);
	    vector <string> result;
	    for (int i = 0; i < size; ++i) {
	        cin >> v[i];
	        if (not result.empty()) {
	            bool repeated = false;
	            for (int j = 0; j < result.size() and not repeated; ++j) {
	                if (v[i] == result[j]) repeated = true;
	            }
	            if (not repeated) result.push_back(v[i]);
	        }
	        else result.push_back(v[0]);
	    }
	    vector <int> counter (result.size(), 0);
	    for (int i = 0; i < result.size(); ++i) {
	        for (int j = 0; j < v.size(); ++j) {
	            if (result[i] == v[j]) ++counter[i];
	        }
	    }
	    for (int condition = 1; condition < result.size(); ++condition) {
	        int sum = 0;
    	        for (int i = 0; i < result.size(); ++i) {
    	            if (counter[i] == condition) {
    	                sum += result[i].size();
    	            }
    	        }
    	        if (sum != 0) cout << condition << " : " << sum << endl;
	    }
	    cout << endl;
	}
}


Here the code explained in parts:
1
2
3
4
5
6
7
8
9
10
11
12
13
            vector <string> v (size);
	    vector <string> result;
	    for (int i = 0; i < size; ++i) {
	        cin >> v[i];
	        if (not result.empty()) {
	            bool repeated = false;
	            for (int j = 0; j < result.size() and not repeated; ++j) {
	                if (v[i] == result[j]) repeated = true;
	            }
	            if (not repeated) result.push_back(v[i]);
	        }
	        else result.push_back(v[0]);
	    }

Reads input and saves it in a vector called v and simultaneously writes all different strings in a vector called result. This means result doesn't have repeated strings.

1
2
3
4
5
6
            vector <int> counter (result.size(), 0);
	    for (int i = 0; i < result.size(); ++i) {
	        for (int j = 0; j < v.size(); ++j) {
	            if (result[i] == v[j]) ++counter[i];
	        }
	    }

Creates a vector called counter which will hold the number of times a repeated string appears.

1
2
3
4
5
6
7
8
9
10
            for (int condition = 1; condition < result.size(); ++condition) {
	        int sum = 0;
    	        for (int i = 0; i < result.size(); ++i) {
    	            if (counter[i] == condition) {
    	                sum += result[i].size();
    	            }
    	        }
    	        if (sum != 0) cout << condition << " : " << sum << endl;
	    }
	    cout << endl;

A loop that will iterate result.size() times since we know result hold all different strings. A variable called sum will be added the size of the string depending if the word is repeated condition times.
Last edited on
what does 1 7 mean ?
I get that ab is in there twice.
yy is in there 4 times though. ??

what is 1:8 ??

I cant make sense of your input/output.

If you think a sort would help, use vector sort, built into the language, since you are using vectors anyway. I am not sure about the sort. It would put all the strings together for an easy counting, but the sort itself does a ton of comparison and unnecessary data movements. I think this would be slower, BUT the built in sort is really fast. You could try it, it may or may not be better. I think it would be worse if the strings and "words" were long.

should not be that inefficient. Grab the first string, check every string after that, count them, delete them all after counting. Repeat until nothing left. Deleting them makes it go faster and faster as it removes data to be checked by later operations.

I am not sure exactly what you are doing but pulling substrings and comparing them should be straightforward.

Combine those ideas and it should be PDQ. If you need it even faster, preallocate the space in your vectors and strings then we can peel apart the slowness of your new code once you have tried these ideas.

how long is it taking, anyway?

Last edited on
The program must have the following interface:
1
2
3
#include <iostream>
#include <vector>
using namespace std;

I'm not sure how seriously I can take instructions that ask (a) to use namespace std; and (b) says nothing about #include <string> for this specific exercise
http://www.cplusplus.com/forum/general/214265/2/#msg998799

and if we should #include <string> on our own initiative then why not #include <algorithm> etc as well?
You should fully read the problem, there is a limit on how many words there are but none of how much their length must be.

I am unable to use inclusions, libraries, functions, etc. Only able to change main.

About the input/output
12 z z ab dddddd z z ab yy yy yy yy x
strings repeated once: dddddd x
sum of lengths of strings = 6 + 1 = 7
1 : 7
strings repeated twice: ab
sum of lengths of strings = 2
2 : 2
strings repeated four times: z yy
sum of lengths of strings = 1 + 2 = 3
4 : 3

Grab the first string, check every string after that, count them, delete them all after counting. Repeat until nothing left. Deleting them makes it go faster and faster as it removes data to be checked by later operations.
Will try it.
Last edited on
>
1
2
3
4
5
6
7
#include <iostream>
#include <vector>
using namespace std;

int main() {
//code here
}


No #include <string> ?
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
#include <iostream>
#include <vector>
#include <string>


int main()
{
    std::vector<std::string> givenWords{"z", "z","ab", "dddddd", "z", "z", "ab", "yy", "yy", "yy", "yy", "x"};
    std::vector<std::string> doneWords{};
    std::vector<size_t> counter(givenWords.size(), 0);

    for (auto itrGiven = givenWords.begin(); itrGiven != givenWords.end(); ++itrGiven)
    {
        bool match{false};
        for (auto itrDone = doneWords.begin(); itrDone != doneWords.end(); ++itrDone)
        {
            if ((*itrGiven) == (*itrDone))
            {
                match = true;
                break;
            }
        }
        if(!match)
        {
            doneWords.push_back((*itrGiven));
            size_t num{};
            for (auto itr_dupe = itrGiven; itr_dupe != givenWords.end(); ++itr_dupe)
            {
                if((*itrGiven)== (*itr_dupe))++num;
            }
            counter[num] += ((*itrGiven).size());
        }
    }
    for (auto i = 0; i < counter.size(); ++i)
    {
        if(counter[i])std::cout << i  << " " << counter[i] << "\n";
    }
}
Grab the first string, check every string after that, count them, delete them all after counting.

as an aside, is it possible to delete strings from the vector using only the <vector> library?
@JLBorges no #include<string>
@gunnerfunner pls respect the interfase the problem gives and if possible the code should be in C++98, you can ignore using namespace std; if you want

This problem at less how I see it tries to make you a program that follows strict rules and with limited resources. A really good practice exercise I would say.

@jonnin was able to create this:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
int main() {
	int size;
	while (cin >> size) {
	    vector <string> v (size);
	    for (int i = 0; i < v.size(); ++i) cin >> v[i];
	    vector <int> counter;
	    for (int i = 0; i < v.size(); ++i) {
	        counter.push_back(1);
	        for (int j = i + 1; j < v.size(); ++j) {
	            if (v[i] == v[j]) {
	                ++counter[i];
	                v.erase(v.begin() + j);
	                --j;
	            }
	        }
	    }
	    for (int condition = 1; condition < v.size(); ++condition) {
	        int sum = 0;
    	        for (int i = 0; i < v.size(); ++i) {
    	            if (counter[i] == condition) {
    	                sum += v[i].size();
    	            }
    	        }
    	        if (sum != 0) cout << condition << " : " << sum << endl;
	    }
	    cout << endl;
	}
}

Not good enough.

Which would be the most effective way to solve this respecting the rules?
Last edited on
if possible the code should be in C++98

OK, evaluate this first and then see if it merits a conversion to C++98:
http://www.cplusplus.com/forum/beginner/214561/#msg998821
I have no experience in C++11 so I cannot evaluate this even though I understand some parts. I am able to convert some easy programs from C++11 to C++98 but not this. Also I lack experience in pointers, working on it at the moment.
Last edited on
OK, let's give it a wait-and-see what comes up on the C++98 front but, honestly, C++98 in 2017??
Isn't it normal to first start with C++98, then move to C++11 and finally C++14?
I am going to create a program using sort to see more or less when I get a time limit exceed.
Last edited on
Ok this program got accepted however it doesn't follow the rules so if anyone wants to use it as a guide here it is:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;

struct data {
    string s;
    int repeated;
};

typedef vector <data> all_data;

bool comp (const data &d1, const data &d2) {
    return (d1.repeated > d2.repeated);
}

int main() {
    int size;
    while (cin >> size) {
        vector <string> v (size);
        for (int i = 0; i < size; ++i) cin >> v[i];
        sort (v.begin(), v.end());
        all_data u (size);
        u[0].s = v[0];
        int k = 0;
            for (int i = 0; i < size; ++i) {
                if (u[k].s != v[i]) {
                    ++k;
                    u[k].s = v[i];
                    u[k].repeated = 1;
                }
                else ++u[k].repeated;
            }
        sort (u.begin(), u.end(), comp);
        while (k >= 0) {
            int sum = u[k].s.size(), jump = 1;
            bool loop = true;
            for (int j = k - 1; j >= 0 and loop; --j) {
                if (u[k].repeated == u[j].repeated) {
                    ++jump;
                    sum += u[j].s.size();
                }
                else loop = false;
            }
            cout << u[k].repeated << " : " << sum << endl;
            k -= jump;
        }
        cout << endl;
    }
}
you CAN put # includes inside main, you know.
Putting them at the top is just good manners.

I may take a crack at it in a min, but I am not good at modern C++ so it might be a little medieval.
Last edited on
> you CAN put # includes inside main, you know.

IS:
Headers

The entities in the C ++ standard library are defined in headers, whose contents are made available to a translation unit when it contains the appropriate #include preprocessing directive.
...
A translation unit shall include a header only outside of any declaration or definition...
No diagnostic is required.


you CAN put # includes inside main
this may help.

I am getting compiler errors when trying to #include <algorithm> inside the main from a code like this one:
1
2
3
4
5
6
7
8
9
#include <iostream>
#include <vector>
using namespace std;

int main() {
    #include <algorithm>
    int n = 0;
    cout << n << endl;
}


Tried to search about what I am doing wrong but didn't found anything useful. Some say you cannot do it, some say depending on how the library is implemented will give compile errors, others say you can do it, but none give an apropiate exemple on how to do it.
Last edited on
Repeat: In a well formed C++ program, you CAN'T include a standard header within the definition of main.

You are fortunate in that you get compiler errors when you do something that should not be done;
"No diagnostic is required".

What we can or can not do with our own headers depends on how we have written a header;
what we can or can not do with standard library headers is governed by what the standard specifies.
Didn't see your previous post, that settles that we cannot use headers inside the main at less for this problem.

From now on, I will solve the problem myself.

Thanks for all the help and tips.
Sorry about that. Ive done it with my own, but never standard headers, because Ive never had my hands tied by silly requirements like this.


Topic archived. No new replies allowed.