choose random vector entry

@Mikey Boy
Thanks!

kingkush wrote:

But yes, I do completely agree: if you understand icy1's example, that is currently the best and easiest way to do it if you're looking to shuffle the vector.

Yeah, I don't want to shuffle the vector (or change the vector at all), just pick a random element. I suppose one could copy the vector and shuffle the copy, but this would probably be less efficient?
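For context, the kind of thing I mean is roughly this (a rough sketch of the index-based approach; not necessarily Ganado's exact code):

#include <iostream>
#include <random>
#include <vector>

int main()
{
    std::vector<int> v { 10, 20, 30, 40, 50 };

    std::mt19937 gen(std::random_device{}());                         // seed the engine once
    std::uniform_int_distribution<std::size_t> pick(0, v.size() - 1); // random valid index

    std::cout << v[pick(gen)] << '\n';                                // the vector itself is untouched
}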

I also came across the rand() and srand() functions on YouTube and wanted to ask what the problem with them is. I tested them with a self-written dice program.


#include <iostream>
#include <ctime>
#include <cstdlib>

using namespace std;

int main()
{
    int one{0}, two{0}, three{0}, four{0}, five{0}, six{0};
    int z;
    int i=0;
    int N;
    cout << "how often do you want to roll the dice?"<<endl;
    cin >> N;
    srand(static_cast<unsigned int>(time(0)));

    while(i<N){
        z=(rand() % 6) +1;

        if(z==1) one++;
        else if(z==2) two++;
        else if(z==3) three++;
        else if(z==4) four++;
        else if(z==5) five++;
        else six++;
        i++;
    }
    cout << "1 was rolled so many times:    "<< one<<"  with probability P= "<< (double)one/N<<endl;
    cout << "2 was rolled so many times:    "<< two<<"  with probability P= "<< (double)two/N<<endl;
    cout << "3 was rolled so many times:    "<< three<<"  with probability P= "<< (double)three/N<<endl;
    cout << "4 was rolled so many times:    "<< four<<"  with probability P= "<< (double)four/N<<endl;
    cout << "5 was rolled so many times:    "<< five<<"  with probability P= "<< (double)five/N<<endl;
    cout << "6 was rolled so many times:    "<< six<<"  with probability P= "<< (double)six/N<<endl;
    return 0;
}


Now I understand that rand() yields a random int between 0 and roughly 30,000 (RAND_MAX) or so.
rand() % n + 1 gives a random number between 1 and n.
I see the problem that, due to the modulo, lower ints are a bit more likely than higher ones.

If rand() only gave numbers between 0 and 15, for example, rand() % 6 would give numbers between 0 and 5, but 0, 1, 2, 3 a bit more often than 4 and 5.

For the actual rand() % n, however, this bias seems to be negligible as long as 30,000 >> n.
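To make the 0-to-15 example concrete, here is a tiny toy sketch (my own, not from the video) that just counts the residues:

#include <iostream>

int main()
{
    int count[6] {};                    // how often each value 0..5 shows up

    for (int x = 0; x <= 15; ++x)       // pretend the generator could only return 0..15
        ++count[x % 6];

    for (int v = 0; v < 6; ++v)
        std::cout << v << " appears " << count[v] << " times\n";
    // prints: 0,1,2,3 appear 3 times each, 4 and 5 only 2 times -> the modulo bias
}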

Is there another problem with this 'easy' way, or is it just that @Ganado's way simply avoids this bias?
The n in my code would be 400, so the easy way might be good enough.

edit: I will just check it with a 400-sided die :D

edit 2: So after 100 billion throws of a 400-sided die using rand() and srand(), the relative frequencies of the different sides fall into two groups. The frequencies of the lower numbers are a bit higher than those of the higher numbers, while the frequency within each group is constant. So the modulo effect is visible (at least with this number of throws; I don't know yet how many 'throws' I am going to use in my actual program).

Testing Ganado's code now with my dice.
The community has found flaws with the C libs and is telling everyone to use the C++ <random> header; idk why you don't believe them. How is the other way not 'easy' ?

#include <iostream>
#include <random>

int main()
{
    std::random_device rd;   // usual name is 'rd', to not confuse readers familiar with the C function

    int z = rd() % 6 + 1;    // rest is pretty much the same as yours, without srand or rand
    std::cout << z << '\n';
}


I'm also curious about the usage of vectors you mentioned in the opening post. No vectors in these custom dice.
added some more to open up the dice ceiling with less copypasta
also, <iomanip> is awesome:
setw sets the width of the next item in the stream
left and right (from the std namespace) change the alignment

#include <iostream>
#include <iomanip>
#include <random>
#include <vector>

using namespace std;

int main()
{
    random_device rd;
    int ceiling, r;
    double N;

    cout << "Dice ceiling (e.g. 6)? ";
    cin >> ceiling;
    cout << "Number of rolls? ";
    cin >> N;
    vector<int> arr(ceiling, 0);   // counters; a variable-length array is not standard C++
    int times = (int)N;

    while(times--)
    {
        r = rd()%ceiling;
        arr[r]++;
    }

    int rolled;
    for (int i=0; i<ceiling; ++i)
    {
        rolled = arr[i];
        cout << setw(2) << right << (i+1) << " was rolled " <<
                setw(6) << rolled << " times with probability P= " << 
                setw(9) << left << rolled/N << '\n';
    }
    return 0;
}


example output:
Dice ceiling (e.g. 6)?  10
Number of rolls?  30000
 1 was rolled   3030 times with probability P= 0.101    
 2 was rolled   3016 times with probability P= 0.100533 
 3 was rolled   2967 times with probability P= 0.0989   
 4 was rolled   3007 times with probability P= 0.100233 
 5 was rolled   2964 times with probability P= 0.0988   
 6 was rolled   2936 times with probability P= 0.0978667
 7 was rolled   3007 times with probability P= 0.100233 
 8 was rolled   3107 times with probability P= 0.103567 
 9 was rolled   2928 times with probability P= 0.0976   
10 was rolled   3038 times with probability P= 0.101267 
icy1 wrote:
The community has found flaws with the C libs and is telling everyone to use the C++ <random> header; idk why you don't believe them. How is the other way not 'easy' ?

Using rand() and srand() is easier to understand at a beginner level (fewer namespaces, clocks, classes and so on).
Also, it's not about 'not believing', but about asking for confirmation of my guessed reason why these functions are not perfect:
PhysicsIsFun wrote:
Now I understand that rand() yields a random int between 0 and roughly 30,000 (RAND_MAX) or so.
rand() % n + 1 gives a random number between 1 and n.
I see the problem that, due to the modulo, lower ints are a bit more likely than higher ones.

If rand() only gave numbers between 0 and 15, for example, rand() % 6 would give numbers between 0 and 5, but 0, 1, 2, 3 a bit more often than 4 and 5.



icy1 wrote:

I'm also curious about the usage of vectors you mentioned in the opening post. No vectors in these custom dice.

The dice is just a test program to check the outcomes of the different random generators (Ganado's version vs. the YouTube proposal with rand() and srand()). It has nothing to do with the program where I am actually going to use it.

Now, if I compare the outcome of the srand() dice...
PhysicsIsFun wrote:

edit 2: So after 100 billion throws of a 400-sided die using rand() and srand(), the relative frequencies of the different sides fall into two groups. The frequencies of the lower numbers are a bit higher than those of the higher numbers, while the frequency within each group is constant. So the modulo effect is visible (at least with this number of throws; I don't know yet how many 'throws' I am going to use in my actual program).


... with Ganado's proposal, I see that the relative frequencies of getting the numbers 1, ..., 400 after 100 billion throws with Ganado's code are more uniform and a bit closer to the value 1/400.
This should confirm that this code is definitely better than the stuff I have seen on YouTube.

It also means that it's better to ask here than watch typical beginner tutorials, if it's about a serious academic project ~~



lol, I'm glad you're having fun exploring random number generation.

Now, if you'd describe your actual, concrete problem (preferably as a self-contained mini-program), I'm sure you'll get lots of concrete answers. You could be a politician, you know, with how you phrased the topic of this thread and then carefully avoided any details involving the vectors themselves ;D
Sorry, I am just a bit worried about using something without understanding its inner workings or why I am using that routine and not another one :(

The program I am writing is a Monte Carlo simulation. If you are familiar with it, I can post the code as soon as I'm done. I will probably post it anyway, to fix all the bugs that will probably be in there ~~


Next, I also need a way to pick a random double in [-1,1].
Is there an equivalent version of Ganado's algorithm or an even simpler way?

edit: I realize I could just define a double delta=0.0001 and then use the previous code to generate a random_int in [-10 000, 10 000] and then get my random double through delta*random_int.
Use uniform_real_distribution instead of uniform_int_distribution :)
Produces random floating-point values i, uniformly distributed on the interval [a, b), that is, distributed according to the probability density function: ...

https://en.cppreference.com/w/cpp/numeric/random/uniform_real_distribution
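For example, something along these lines (a minimal sketch; note that the interval is half-open, so 1.0 itself is never produced):

#include <iostream>
#include <random>

int main()
{
    std::mt19937 gen(std::random_device{}());                // seed once
    std::uniform_real_distribution<double> dist(-1.0, 1.0);  // doubles in [-1.0, 1.0)

    for (int i = 0; i < 5; ++i)
        std::cout << dist(gen) << '\n';
}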
Alright, thank you!

What about my proposal, is this worse?

edit: I realize I could just define a double delta=0.0001 and then use the previous code to generate a random_int in [-10 000, 10 000] and then get my random double through delta*random_int.
Not certain about it being worse in practice, but it strikes me that
a.) that's actually a discrete distribution (obviously)
b.) shrinking the delta to the point it's "close enough" might introduce precision issues.

Perhaps using an exactly-representable delta would be a better choice as it gets progressively smaller.
double delta = 0x1.p-10; // exactly representable, delta = (1 / 1024)
PhysicsIsFun, I suggest using uniform_real_distribution if you want to generate floating-point numbers between -1.0 and 1.0. Using a system of deltas just produces unnecessary discreteness; if your numbers are going to stay small, it doesn't really matter (each discrete point will probably be within the margin of error of measurement). But if you're planning on then multiplying that random number by a large factor, the discreteness of it will be more noticeable.

a.) that's actually a discrete distribution (obviously)

So is all floating-point math! But the gaps are definitely much, much wider by doing [-10000, -9999, ...10000] / 10000.
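For a rough sense of scale, the spacing of doubles near 1.0 can be checked with std::nextafter (quick sketch):

#include <cmath>
#include <iostream>

int main()
{
    double gap = std::nextafter(1.0, 2.0) - 1.0;   // distance to the next representable double above 1.0
    std::cout << "double spacing near 1.0:        " << gap       << '\n'   // about 2.2e-16
              << "spacing of the delta=1e-4 grid: " << 1.0/10000 << '\n';  // 1e-4, many orders wider
}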
If you are a beginner, then there is no shame in doing things the beginner's way. This is how it was done before someone made it better.

#include <iostream>
#include <cstdlib>   // rand, srand
#include <ctime>
#include <vector>

using namespace std;

int main() {

	srand(time(NULL));

	vector<int> myVector{ 1,2,3,4,5,6,7,8,9,10 };

	int aRandomInt = rand() % 10; //a random number from 0 to 9

	cout << myVector[aRandomInt];

	cin.get();
	return 0;
}
mbozzi wrote:
Not certain about it being worse in practice, but it strikes me that
a.) that's actually a discrete distribution (obviously)
b.) shrinking the delta to the point it's "close enough" might introduce precision issues.

Perhaps using an exactly-representable delta would be a better choice as it gets progressively smaller.
double delta = 0x1.p-10; // exactly representable, delta = (1 / 1024)


Good point, I did not think about precision issues. I just thought that I could keep decreasing the delta to approximate the continuous distribution.

Ganado wrote:
PhysicsIsFun, I suggest using uniform_real_distribution if you want to generate floating-point numbers between -1.0 and 1.0. Using a system of deltas just produces unnecessary discreteness; if your numbers are going to stay small, it doesn't really matter (each discrete point will probably be within the margin of error of measurement). But if you're planning on then multiplying that random number by a large factor, the discreteness of it will be more noticeable.


I don't think I am going to multiply it with a large number, but I get your point and, for the sake of 'good practice', will go with your suggestion. Thanks again :)

@manga
While I am a C++ beginner, the results of my code must not be beginner-like. Your proposal was discussed above, but thanks anyway :)


Thank you guys, I have learned a lot in this thread!
Hey guys, a question about the random generator for the double values.

Here is my code, based on @Ganado's proposal earlier:

#include <iostream>
#include <random>
#include <chrono>

using namespace std;

//this function gives a random double in [a,b], similar method to random_index
double random_double(double a, double b){
    unsigned seed =chrono::system_clock::now().time_since_epoch().count();
    mt19937 gen(seed);
    uniform_real_distribution<> random_dist(a, b);

    return random_dist(gen);
}


Now, if I create 20 000 random doubles between -0.1 and 0.1 and write them to a file like this:

#include "random_number_generators.h"
#include <fstream>

int main(){

    double dl=0.1;
    ofstream test_random("testrandom.dat");

    for(int i=1; i<20000;i++){
        test_random<<dl*random_double(-1,1)<<"  "<<i<<endl;
    }
    test_random.close();
}


... I get sections where the numbers don't change at all and stay constant for long stretches.
In this part of the output, for example, the value 0.0894171 just repeats:

0.0894171  463
0.0894171  464
0.0894171  465
0.0894171  466
0.0894171  467
0.0894171  468
0.0894171  469
0.0894171  470
0.0894171  471
0.0894171  472
0.0894171  473
0.0894171  474
0.0894171  475
0.0894171  476
0.0894171  477
0.0894171  478
0.0894171  479
0.0894171  480
0.0894171  481
0.0894171  482
0.0894171  483
-0.00719267  484
-0.00719267  485
-0.00719267  486
-0.00719267  487
-0.00719267  488
-0.00719267  489
-0.00719267  490
-0.00719267  491
-0.00719267  492
-0.00719267  493
-0.00719267  494
-0.00719267  495
-0.00719267  496
-0.00719267  497
-0.00719267  498
-0.00719267  499
-0.00719267  500
-0.00719267  501
-0.00719267  502
-0.00719267  503
-0.00719267  504
-0.00719267  505


That value doesn't change until 1316; then another number appears that stays constant for a while, and so on. Is there something wrong with my code?

Regards
I see that the code you're using seeds the random number generator on every call.

If you're calling it so frequently that the seed value doesn't change, then you'll get the same value each time. You should seed your random number generator only once.
Ah, this is the problem?

So in my simulation I have tens of thousands of loop steps, and in each one I want a random double from the interval.
For this purpose, I should only seed it once?

I thought that if I seeded it again and again, I would just change the sequence of random numbers I can get, so it wouldn't change the randomness of my results and wouldn't lead to problems.
Last edited on
Yes. A typical random number generator isn't random; it starts with a number (the seed) and calculates the next number based on that, and then the next based on that one, and so on.

Same seed, same sequence. Seed your generator once only.
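A minimal way to restructure your earlier random_double so the engine is created and seeded only once would be a function-local static engine (one possible fix, just a sketch):

#include <iostream>
#include <random>

// the static engine lives across calls, so it is seeded exactly once
double random_double(double a, double b)
{
    static std::mt19937 gen(std::random_device{}());
    std::uniform_real_distribution<double> dist(a, b);
    return dist(gen);
}

int main()
{
    for (int i = 0; i < 5; ++i)
        std::cout << random_double(-1.0, 1.0) << '\n';   // values now change on every call
}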
Ok, so just for my understanding:

If I seed the generator based on the current time and then get the first random number, and then immediately reseed it and take another random number, the two numbers will be the same, because the seeds were identical and I took the first number of both (identical) sequences?
Yes.

The sequence of pseudo random numbers generated depends on the internal state of the random number engine. Seeding a random number engine resets the internal state of the engine using the seed values that are provided.

If reasonably high quality pseudo randomness is required, something like this, perhaps:

#include <iostream>
#include <random>
#include <chrono>
#include <thread>
#include <iomanip>

std::mt19937 randomly_seeded_engine() {

    // create a seed sequence of several reasonably random values
    std::seed_seq seed_seq { (unsigned int) std::random_device{}(),
                             (unsigned int) std::chrono::system_clock::now().time_since_epoch().count(),
                             (unsigned int) std::random_device{}(),
                             (unsigned int) std::chrono::steady_clock::now().time_since_epoch().count(),
                             (unsigned int) std::random_device{}(),
                             (unsigned int) std::hash< std::thread::id >{}( std::this_thread::get_id() ) };

    return std::mt19937(seed_seq) ; // note: the seed sequence provides a warm up sequence for the rng
}

double random_double( double a, double b ) {

    static auto rng = randomly_seeded_engine() ;
    static std::uniform_real_distribution<double> distribution ;

    distribution.param( std::uniform_real_distribution<double>::param_type{ a, b } ) ;
    return distribution(rng) ;
}

int main() {

    const double dl = 0.1 ;
    const double a = -1 ;
    const double b = +1 ;

    std::cout << std::fixed << std::setprecision(8) << std::showpos ;

    for( int i = 0 ; i < 50 ; ++i ) {

        for( int j = 0 ; j < 10 ; ++j )
            std::cout << std::setw(11) << dl * random_double(a,b) << ' ' ;
        std::cout << '\n' ;
    }
}

http://coliru.stacked-crooked.com/a/47e1e31f9bd8a2e2
http://rextester.com/JPV13207
Thank you!

So I have now set the seed to a constant value and put the seeding in front of the loop. I make it constant to be able to reproduce my simulation results.
But the randomness should already be good this way, right?
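Concretely, something like this is what I mean (sketch):

#include <iostream>
#include <random>

int main()
{
    std::mt19937 gen(12345);                                 // fixed seed -> same sequence on every run
    std::uniform_real_distribution<double> dist(-1.0, 1.0);

    for (int i = 0; i < 5; ++i)
        std::cout << dist(gen) << '\n';
}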
You have probably seen a roulette wheel: https://www.vectorstock.com/royalty-free-vector/american-roulette-wheel-vector-13367259

1. Put your finger on one of the pockets.
2. When you read the number of the pocket, you move your finger to the next pocket.
3. You can repeat step 2 as many times as you want.

The reads give you a sequence of numbers. If you had chosen a different pocket in step 1, the sequence would be different. The wheel is small though, and things start to repeat really quickly.


The pseudo-random engines are wheels too. They are just much bigger: not 37/38 but billions of pockets. Step 1 is the seeding.


The std::random_device does not need seeding, because it ought to use "real" randomness from hardware. There is no predictable sequence. However, at least the MinGW implementation of GCC on Windows does not return hardware randomness, but an entirely known sequence.

