My c++ program i a lot of slower than my java counterpart

Good day. I have written a program in java and c++ which takes an int list as an input and returns a list of all possible permutations. I used the exact same logic for my c++ and my java program (as far as I know) to accomplish this. The problem is that the java code is up to a factor 100 faster t than my c++ code. (I did a benchmark of permutating a 6 Element array 10000 times and took the time taken) Both codes consist of a Permute and a DoPermute method. The user calls the Permute method and it receives an int list as an argument. The Permute calls the DoPermute method, which calls itself recursively later. I would be happy, if someone would explain me how such different results are possible.

Here are the 2 codes:

C++:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
#include <vector>
#include <map>

class Permutation
{
	static void DoPermute(int* keys, int* values, int lenght, const std::vector<int>& choice, int count, std::vector<std::vector<int>>& result)
	{
		if (choice.size() == count)
		{
			result.push_back(choice);
			return;
		}

		for (int i = 0; i < length; ++i)
		{
			if (values[i] <= 0) continue;
			--values[i];
			std::vector<int> new_choice(choice);
			new_choice.push_back(keys[i]);
			DoPermute(keys, values, length, new_choice, count, result);
			++values[i];
		}
	}
public:
	static std::vector<std::vector<int>> Permute(const std::vector<int>& input)
	{
		//Determine amount of each element in input
		std::map<int, int> dict;
		for (int i : input)
		{
			if (dict.count(i) > 0) { dict[i] += 1; }
			else { dict[i] = 1; }
		}
		//Each Integer in input is put into the keys array, and the amount of each integer is put into the values array
		int length = dict.size();
		int* keys = new int[length];
		int* values = new int[length];
		auto iter = dict.begin();
		for (int i = 0; i < length; ++i) {
			auto pair = *iter;
			keys[i] = pair.first;
			values[i] = pair.second;
			++iter;
		}

		//Vector to store all combinations
		std::vector<std::vector<int>> result;
		DoPermute(keys, values, length, std::vector<int>(), input.size(), result);
		delete[] keys;
		delete[] values;
		return result;
	}
};



Java:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
import java.util.*;

class Permutation {

    static List<List<Integer>> permute(List<Integer> input) {
        Map<Integer, Integer> map = new HashMap<>();

        for (int i : input) {
            if (map.containsKey(i)) {
                map.put(i, map.get(i) + 1);
            } else {
                map.put(i, 1);
            }
        }

        int[] keys = new int[map.size()];
        int[] values = new int[map.size()];
        Iterator<Integer> keySet = map.keySet().iterator();
        for (int i = 0; i < map.size(); ++i) {
            int key = keySet.next();
            keys[i] = key;
            values[i] = map.get(key);
        }
        List<List<Integer>> result = new ArrayList<>();
        doPermute(keys, values, new ArrayList<>(), input.size(), result);
        return result;
    }

    private static void doPermute(int[] keys, int[] values, List<Integer> choice, int count, List<List<Integer>> result) {
        if (choice.size() == count) {
            result.add(choice);
            return;
        }
        for (int i = 0; i < keys.length; ++i) {
            if (values[i] <= 0) continue;
            --values[i];
            List<Integer> newChoice = new ArrayList<>(choice);
            newChoice.add(keys[i]);
            doPermute(keys, values, newChoice, count, result);
            ++values[i];
        }
    }

}
Last edited on
Haven't looked at your code; but you may want to have a look at the 'Possible implementation' of std::next_permutation() http://en.cppreference.com/w/cpp/algorithm/next_permutation


> permutating a 6 Element array 10000 times

Worst case (unique elements, so 6! permutations) should take about a second or so.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
#include <vector>
#include <algorithm>
#include <ctime>
#include <cassert>
#include <iostream>

std::vector< std::vector<int> > all_permutations( std::vector<int> input )
{
    std::sort( std::begin(input), std::end(input) ) ;

    std::vector< std::vector<int> > result { input } ;
    
    // http://en.cppreference.com/w/cpp/algorithm/next_permutation
    while( std::next_permutation( std::begin(input), std::end(input) ) )
        result.push_back(input) ;

    return result ;
}

int main()
{
    // permutating a 6 Element array 10000 times
    const std::vector<int> array { 1, 3, 5, 2, 4, 6 } ;

    const auto start = std::clock() ;

    constexpr int N = 10'000 ;
    long long num_permutations = 0 ;
    for( int i = 0 ; i < N ; ++i ) num_permutations += all_permutations(array).size() ;

    const auto end = std::clock() ;

    // sanity check
    assert( num_permutations == N * 2LL * 3 * 4 * 5 * 6 ) ; // N * 6!

    std::cout << (end-start) * 1000.0 / CLOCKS_PER_SEC << " milliseconds.\n" ;
} 

clang++ libc++ -O3 640.099 milliseconds
g++ -O3 559.255 milliseconds.
http://coliru.stacked-crooked.com/a/b2ca43f8b65df2f3

Microsoft -Ox 1087 milliseconds. http://rextester.com/CIX3955
Your code is spending an enormous amount of time pointlessly copying vectors. Every time round your permutation loop, you are making a complete copy of the vector named choice, and all you do with that copy is add something to the end and pass it on.

Copying is very expensive, and as the vector gets longer it only gets more and more expensive. I don't know how expensive it is in the Java version (or maybe in the Java version no copy gets made; I don't know enough about Java to answer that).

Removing the pointless copying from the C++ version significantly reduces the time taken. A quick test I did permuting 10000 values once took 0.132 second in the original, and 0.004 seconds in the version without copying the vectors over and over and over again.

The majority of the time spent in user space in the 0.04 faster version looks like it's primarily involved in housekeeping in the vectors, so there's still a lot of time being spent dealing with memory. Every so often, a push_back will cause the vector to need to be reallocated and copied to make it bigger, which is expensive. Pre-allocating the vector to a suitable size should eliminate that. Again, I don't know if the Java version suffers in the same way.
Last edited on
In @JLBorges' code, simply replacing the vectors of vectors with a vector of arrays avoids extra indirections and allocations, and offers a roughly 600% speedup.
http://coliru.stacked-crooked.com/a/924231fb1956491c
Last edited on
> simply replacing the vectors of vectors with a vector of arrays avoids extra indirections and allocations

Yes.

But it would also violate the interface requirements in the original code:
std::vector<std::vector<int>> Permute(const std::vector<int>& input);
where the number of elements in the input sequence need not be a constant known at compile time.
(Though, with some boiler plate, a small sequence optimisation would still be possible.)
@Repeater but isn't copying necessary, because the vector choice needs to have the exact same contents, when the for loop increments?
In the function DoPermute, choice has three purposes only.

1) You work out the size of it
2) Sometimes you make a copy of it and stick it on the end of result
3) Sometimes you make a copy of it to pass it onwards.

Why make that copy in number 3? Why not just pass choice onwards by reference, since you never ever use it again?
Last edited on
you know how many you need, its a given computation... do a reserve on the vector so it never needs to reallocate internally.
Topic archived. No new replies allowed.