best way to distribute array elements uniquely between 2 smaller arrays

Let's say I have an array of values
const valuePool[10000] = {1, 2, 3, 5, 7, 9, 11, 13, 21, 22, ...};
(The actual values in array are not sequential, nor do they follow any particular pattern)

I need to distribute the values of valuePool randomly and uniquely between 2 other arrays and have found an acceptable solution. I think there is a problem if the size of valuePool is arbitrarily large though.

Currently, I have a function that chooses an index randomly and assigns the value of that index to the first array (subarr1[5000]), then copies the chosen value to a second array chosen[10000]. The function then repeats for subarr2[5000]. Each time an array index is selected, the value is compared against all values in the chosen[10000] array. If it is a duplicate, the index is chosen again. When both sub arrays have 5000 elements, the function returns.

When 9999 values have already been assigned, the odds of the function randomly choosing the last index are 1 in 10000. Since the processor performs millions of instructions each second, this low probability doesn't matter that much.
But what if the valuePool array has 1000000 elements, or 100 000 000 000? Is there a technique or algorithm that compensates for this low probability of choosing the final element? Or would I still just "brute force" my way through it and rely on the sheer number of calculations per second?
I need to distribute the values of valuePool randomly and uniquely between 2 other arrays and have found an acceptable solution.

what do you mean with uniquely?
may your valuePool have the same number twice?

I think there is a problem if the size of valuePool is arbitrarily large though.
Yeah, that's true.
You talk about complexety, your problem sounds like a O(n) problem, that means that the time of the operation rises linear with the number of elements.
The problem you are trying to solve looks like it has a O(n) complexity and therefore I think you can't really do much about arbitrary many elements...

I have a question:
do you really have arrays of that size or pointers to that much dynamically allocated memory?
Do you still need the old array?
do you need to modify the new arrays? (the old is const so I think that shouldn't be changed)
Last edited on
Use std::shuffle to rearrange the contents of the array. Then just copy the contents into the smaller arrays. If you can't change the original array then copy it first.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
// shuffle algorithm example
#include <iostream>     // std::cout
#include <algorithm>    // std::shuffle
#include <random>       // std::default_random_engine
#include <chrono>       // std::chrono::system_clock
void print(int arr[], const unsigned SIZE);
int main ()
{

    const unsigned SIZE = 10;
    const unsigned HALF_SIZE = SIZE / 2;
    int arr[SIZE] = {4, 85, 99, 20, 1, 0, 54, 88, 12, 22};

    // obtain a time-based seed:
    unsigned seed = std::chrono::system_clock::now().time_since_epoch().count();

    std::shuffle (arr, arr + SIZE, std::default_random_engine(seed));


    int arr2[HALF_SIZE] = {0};
    int arr3[HALF_SIZE] = {0};

    std::copy(arr, arr + HALF_SIZE, arr2);
    std::copy(arr + HALF_SIZE, arr + SIZE, arr3 );

    print(arr2, HALF_SIZE);
    print(arr3, HALF_SIZE);

    std::cin.ignore();

  return 0;
}
void print(int arr[], const unsigned SIZE)
{
    for(int i = 0; i < SIZE; ++i)
        std::cout << arr[i] << ", ";

    std::cout << std::endl;
}


I'm not sure what you mean by uniquely. Does that mean that the numbers in the original array could contain duplicates?

Note: some of the code above was copied from here
http://www.cplusplus.com/reference/algorithm/shuffle/
Last edited on
By uniquely, I meant that each element of the original array can only appear once in both of the smaller arrays, like a deck of cards. The deck is shuffled and cards are dealt to both players. There is only one copy of each card between both players.

The function that chooses a random index is like the shuffling of the deck and the chosen[] array is used to ensure only one copy of the element or 'card' exists in either sub array.

And each element of the original array appears only once.

@yanson: I just looked at your code and realized that I was trying to reinvent the wheel. It seems that my solution was an unoptimized and unnecessarily complex version of std::shuffle.

Last edited on
You have, however, an interesting problem.

1) Subarrays must have unique elements. (The same value may appear once in each subarray? Or once in both subarrays taken together?)

2) Subarrays must be randomized.

3) I assume the algorithm may work offline (meaning, that you have the complete and bounded valuePool before the algorithm begins to work).

Unfortunately, unless you factor in a significant memory cost, you can only do this by first sorting and then randomizing.

The STL can help you here:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
// 1) Sort valuePool
std::sort( std::begin(valuePool), std::end(valuePool) );

// Split the values to your subarrays
// (There are many ways to do this. This method makes a value unique to both subarrays.) 
auto b = std::begin(valuePool);
auto e = std::end(valuePool);
auto d = std::distance(b,e);
std::unique_copy( b, std::next(b,d/2), std::begin(arr2) );
std::unique_copy( std::next(b,d/2), e, std::begin(arr3) );

// 2) Randomize the subarrays
std::shuffle( std::begin(arr2), std::end(arr2), my_rng );
std::shuffle( std::begin(arr3), std::end(arr3), my_rng );
Another way to do it:
1
2
3
4
5
6
7
8
9
10
11
12
13
// 1) sort and unique
auto b = std::begin(valuePool);
auto e = std::end(valuePool);
std::sort(b,e);
e = std::unique(b,e);

// 2) randomize
std::shuffle(b,e,my_rng);

// split
auto d = std::distance(b,e);
std::copy( b, std::next(b,d/2), std::begin(arr2) );
std::copy( std::next(b,d/2), e, std::begin(arr3) );

Hope this helps.
Topic archived. No new replies allowed.