MKL random generator correlates 100%

When checking my random generator, I am getting total correlation for the initialized random sequences. Why?

RandomGenerator::RandomGenerator()
{
    vslNewStream (&_stream, VSL_BRNG_MCG31, VSL_RNG_METHOD_UNIFORM_STD);

    vsRngUniform (VSL_RNG_METHOD_UNIFORM_STD, _stream, 1000, _numbers.data(), 0.0f, 1.0f);

    _index = 0;
}

float
RandomGenerator::get_float(float a, float b)
{
    const auto u = _numbers[_index];

    _index++;

    _update();

    return a + (b - a) * u;
}

    for (int32_t i = 0; i < 2; i++)
    {

        RandomGenerator random;

        int32_t ncount = 0;

        float avg = 0;

        std::vector<float> x, y;

        for (int32_t j = 0; j < 100; j++)
        {
            x.push_back(random.get_float(0,10));
        }

        for (int32_t j = 0; j < 100; j++)
        {
            y.push_back(random.get_float(0,10));
        }

        float avgx = 0;

        float avgy = 0;

        for (int32_t j = 0; j < 100; j++)
        {
            avgx = avgx + x[j];

            avgy = avgy + y[j];
        }

        avgx = avgx / 100;

        avgy = avgy / 100;

        float xx = 0;

        float yy = 0;

        for (int32_t j = 0; j < 100; j++)
        {
            xx = xx + (x[j] - avgx)*(y[j] - avgy);

            yy = yy + (x[j] - avgx)*(x[j] - avgx)*(y[j] - avgy)*(y[j] - avgy);
        }

        std::cout<<avgx<<std::endl<<avgy<<std::endl;

        std::cout<<xx/std::sqrt(yy)<<std::endl;

    }
}


The result is 4.99885 5.10161 0.120765 on the first pass and 4.99885 5.10161 0.120765 again on the second, which just shows absolute correlation. Why?
At a rough guess, you never give your random number generator a different seed: you (needlessly) create a new generator object on every pass through the loop, so each one starts from the same state:
RandomGenerator random;


You presumably mean it produces the same answers twice, not "shows absolute correlation" in a statistical sense.
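To illustrate the seeding point, here is a minimal sketch. It assumes std::mt19937 as a stand-in for the MKL stream (only so that it compiles without MKL), and the names SeededGenerator and draw are made up for the example. In the posted constructor, the third argument of vslNewStream appears to be the seed, and it is the same constant for every object, which would have the same effect as constructing every generator below with the same seed.

```cpp
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical stand-in for the poster's RandomGenerator: std::mt19937
// replaces the MKL stream (an assumption, not the original implementation).
class SeededGenerator
{
public:
    explicit SeededGenerator(std::uint32_t seed) : _engine(seed) {}

    // Uniform float between a and b, like get_float in the question.
    float get_float(float a, float b)
    {
        std::uniform_real_distribution<float> dist(a, b);
        return dist(_engine);
    }

private:
    std::mt19937 _engine;
};

// Construct a fresh generator with the given seed and draw n values from it.
std::vector<float> draw(std::uint32_t seed, int n)
{
    SeededGenerator random(seed);
    std::vector<float> values;
    for (int i = 0; i < n; i++)
    {
        values.push_back(random.get_float(0.0f, 10.0f));
    }
    return values;
}
```

Calling draw(0, 100) twice returns the same 100 numbers, just as the two loop passes in the question print the same three values. Either construct the generator once, outside the loop, or pass a different seed to each object.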


The following code doesn't compute anything remotely statistical:
yy = yy + (x[j] - avgx)*(x[j] - avgx)*(y[j] - avgy)*(y[j] - avgy);
and "xx" looks like pretty unlikely notation for a sum of products of the fluctuations in x and y:
xx = xx + (x[j] - avgx)*(y[j] - avgy);


If you want the correlation coefficient then it is
SUM{ (x - xav)*(y-yav) } / SQRT[ SUM{ (x-xav)^2 } * SUM{ (y-yav)^2 } ]
Those are SEPARATE sums of squares in the denominator: you appear to be computing a single sum of a very big product in yy.
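As a rough sketch of that formula in code (the function name correlation and the choice to return 0 for degenerate input are assumptions made for this example, not anything from the original post):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Pearson correlation coefficient of two equally sized samples.
// Returns 0 if the inputs are empty, differ in size, or either sample has
// zero variance (a convention chosen for this sketch).
double correlation(const std::vector<double>& x, const std::vector<double>& y)
{
    const std::size_t n = x.size();
    if (n == 0 || y.size() != n) return 0.0;

    // Sample means.
    double xav = 0.0, yav = 0.0;
    for (std::size_t j = 0; j < n; j++)
    {
        xav += x[j];
        yav += y[j];
    }
    xav /= static_cast<double>(n);
    yav /= static_cast<double>(n);

    // Three SEPARATE sums: the cross term and the two sums of squares.
    double sxy = 0.0, sxx = 0.0, syy = 0.0;
    for (std::size_t j = 0; j < n; j++)
    {
        const double dx = x[j] - xav;
        const double dy = y[j] - yav;
        sxy += dx * dy;
        sxx += dx * dx;
        syy += dy * dy;
    }

    const double denom = std::sqrt(sxx * syy);
    return denom > 0.0 ? sxy / denom : 0.0;
}
```

For two independent uniform samples of 100 values each, the result should be small in magnitude but will not be exactly zero. Printing the same small value twice means the same numbers were drawn twice, not that x and y are correlated.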
Thanks, I understood my possible mistake!