True random number generator

Hi!

In June 2015, we unexpectedly discovered that it is possible to generate true random numbers using only software.

Can you help us?

a) Explain in detail why these programs generate random numbers.
b) Try to prove the opposite: that they are pseudo-random numbers.
c) Do they work on your computer and operating system?

For computers with more than two CPUs

#include<chrono>
#include<iostream>
#include<thread>
using namespace std;

#define Mili 7 // warm-up time before output starts, in milliseconds
#define Base 7 // multiplier of the linear congruential step
#define Thrs 2 // number of racing threads

typedef unsigned char      Num;
typedef unsigned long long Out;

// shared state, deliberately written by all threads without synchronization
volatile Out out,tmp;

void inline thr(Num const num){
  while(true)
    out=out*Base+tmp,tmp=tmp*Base+num; // LCG step, racing against the other thread
}

int main(){
  thread ths[Thrs];
  for(Num i=0;i<Thrs;++i)
    ths[i]=thread(thr,i);
  this_thread::sleep_for(chrono::milliseconds(Mili));
  while(true)
    cout.write((char*)&out,sizeof(out)); // stream the raw 64-bit state to stdout
}
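(To build it, something like g++ -std=c++11 -O2 -pthread normal.cpp should work, since std::thread needs -pthread with g++ on GNU/Linux; the exact flags are a guess, the post doesn't say.)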

For computers with two CPUs

#include<iostream>
using namespace std;

#define Base 7  // multiplier of the linear congruential step
#define Iter 53 // iterations per parallel batch

typedef unsigned char      Num;
typedef unsigned long long Out;

// shared state, deliberately written by all threads without synchronization
volatile Out out,tmp;

int main(){
  // warm-up: mix the state before any output
  for(Num i=0;i<Iter;++i)
    #pragma omp parallel for schedule(runtime)
    for(Num num=0;num<Iter;++num)
      out=out*Base+tmp,tmp=tmp*Base+num;
  while(true){
    #pragma omp parallel for schedule(runtime)
    for(Num num=0;num<Iter;++num)
      out=out*Base+tmp,tmp=tmp*Base+num;
    cout.write((char*)&out,sizeof(out)); // stream the raw 64-bit state to stdout
  }
}
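(This one needs OpenMP enabled, e.g. g++ -std=c++11 -O2 -fopenmp openmp.cpp; without -fopenmp the pragmas are ignored and it degenerates to a single thread. Again, the flags are a guess.)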

For computers with one CPU

#include<chrono>
#include<iostream>
#include<thread>
using namespace std;

#define Nano 1 // pause between outputs, in nanoseconds
#define Mili 7 // warm-up time before output starts, in milliseconds
#define Base 7 // multiplier of the linear congruential step
#define Thrs 2 // number of racing threads

typedef unsigned char      Num;
typedef unsigned long long Out;

// shared state, deliberately written by all threads without synchronization
volatile Out out,tmp;

void inline thr(Num const num){
  while(true)
    out=out*Base+tmp,tmp=tmp*Base+num; // LCG step, racing against the other thread
}

int main(){
  thread ths[Thrs];
  for(Num i=0;i<Thrs;++i)
    ths[i]=thread(thr,i);
  this_thread::sleep_for(chrono::milliseconds(Mili));
  while(true)
    cout.write((char*)&out,sizeof(out)),               // stream the raw 64-bit state
    this_thread::sleep_for(chrono::nanoseconds(Nano)); // let the LCG advance a jittery number of steps
}

You can find more information here:

http://ncomputers.org/projects/random
http://youtube.com/watch?v=AKy00kxClJk

Thank you for helping us!
They don't. Your thr function is a simple LCG, which is a bad pseudo-random generator. On top of that you add a bit of undefined behavior with a race condition, where many threads access a non-atomic variable without synchronisation; then you sleep for some time, letting the threads advance the state a number of times, and then output its value.

It is still a pseudo-random generator whose resulting value depends on the number of times the loop was executed (the x86-64 architecture is likely to make the accesses atomic). That is pretty trivial to predict or manipulate in the case of a serious attack.
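To see how little the update itself contributes, here is a minimal sketch (mine, not from the project) of the core recurrence with the threading stripped away; run single-threaded, it prints the same sequence on every run:

#include <cstdio>

int main(){
  // the per-thread update from thr(): out = out*7 + tmp, tmp = tmp*7 + num
  unsigned long long out = 0, tmp = 0;
  const unsigned long long num = 1; // one thread's argument
  for(int i = 0; i < 8; ++i){
    out = out * 7 + tmp;
    tmp = tmp * 7 + num;
    std::printf("%llu\n", out); // identical output every run: no randomness here
  }
}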
MiiNiPaa: Thank you very much for your answer. I trust you! Let's try to predict the next number! How should we start?
If ncomputers.org is a company, there's some hopelessly clueless programmer in there that needs to get fired before they ruin the company with their ineptitude. Judging from your posts, you guys seem to not know the most trivial stuff, yet you somehow manage to get involved in seemingly non-trivial projects.
helios: Thank you very much for your answer. We are a team of researchers and are also imperfect beings.

If you can successfully predict the numbers, proving that they are pseudo-random, then he will be fired :D
@MiiNiPaa, helios
You seem to have forgotten who you are dealing with.
Is that so?
Duoas
Yes, probably. We are a small team of researchers. We came here to ask for your help. We can't prove that these programs generate true random numbers; we can only disprove it, either by finding a way to predict the numbers or by refuting the hypothesis, showing that they are not related to Heisenberg's uncertainty principle.

We are humans like you and almost 8 billion other beings, which means that we can make mistakes.

We are here asking for your help, that's all.

helios
Yes, he is betting his career as a researcher.
I was replying to Duoas.

If you can successfully predict the numbers, proving that they are pseudo-random
How much control do we have over the system, in this hypothetical scenario?
helios
1 - 7 Shielded CPUs with cset on Debian GNU/Linux, SSH, chrooted user

If your question goes in another direction, please let me know :D
MiiNiPaa
Thank you very much! You helped me to get the answer! You described the code so well that I researched what you wrote and got the answer.

These codes generate true random numbers because:

1. They implement a linear congruential generator, which generates pseudo-random numbers.
2. Unpredictable data races upgrade the pseudo-randomness into true randomness.
3. CPU time jitter upgrades the pseudo-randomness into true randomness.

The CPUs are physical sources of entropy. Due to their energetic inefficiency, when executing a single instruction they produce entropy, which is finally exchanged with the environment through the heatsink. This entropy affects the results of the time jitter and the data races.

To predict a number it is required to know the complete current state of the whole system, which is impossible thanks to the uncertainty principle.

The normal and paused versions give better results when they are executed with higher priority, under low workloads, or on shielded CPUs, because this prevents the numbers from remaining unchanged over long sequences.

Their quality differs:

- normal.cpp: Offers the best performance and generates true random numbers thanks to data races.
- openmp.cpp: Offers a high-performance alternative for computers with two CPUs, which also generates true random numbers thanks to data races.
- paused.cpp: Offers the best quality despite the lowest performance. It generates true random numbers thanks to data races when executed on at least two CPUs, and thanks to time jitter on one or more CPUs.

The numbers are like the cat in Schrödinger's paradox: they are undefined until they are read.

References
https://www.kernel.org/doc/ols/2014/ols2014-mueller.pdf
http://www.researchgate.net/publication/224578073_Software_Random_Number_Generation_Based_on_Race_Conditions
Unpredictable data races
The race conditions in that particular code are very subtle. Yes, the exact sequence may be impossible to predict, but the overall behavior will be extremely similar to an LCG with a few hiccups here and there. If you bet a cryptanalyst $1 for every number they guess right, they'll rob you blind.

3. CPU time jitter upgrades the pseudo-randomness into true randomness.
You're not even measuring the jitter in that code. At least read your own citations.

The CPUs are physical sources of entropy. Due to their energetic inefficiency, when executing a single instruction they produce entropy, which is finally exchanged with the environment through the heatsink. This entropy affects the results of the time jitter and the data races.
Are you confusing thermodynamic entropy and information-theoretical entropy?
helios
the exact sequence may be impossible to predict

Thank you very much for your answer :D

At least read your own citations.

I read them.

Are you confusing thermodynamic entropy and information-theoretical entropy?

No.

If you bet a cryptanalyst $1 for every number they guess right, they'll rob you blind.

Let's run the NIST test! :D

http://csrc.nist.gov/publications/nistpubs/800-22-rev1a/SP800-22rev1a.pdf
helios
Thank you very much for helping me to understand and explain these things better.

I ran the NIST test, but I don't understand the results. So I contacted a professional cryptologist and am waiting for his answer. Do you want to see the results of these tests?

I wrote these explanations only for you. What do you think about them?

The thermodynamic entropy produced by the CPUs' energetic inefficiency affects physical properties, such as size, due to thermal expansion, and resistance, due to small variations in temperature. The results of the data races depend on the exact speed of the CPUs, which depends on the physical state of the CPUs.

Timing is crucial in determining the result of a data race. When did thread 0 start to change the address 0x01? What value did that memory address hold at that moment? What was the state of thread 1? What was the exact speed of thread 0 and thread 1? What was the exact thermodynamic entropy of CPU 0 and CPU 1? What was the exact conductance of each CPU circuit? What was the exact frequency of the computer's oscillators? Thanks to the uncertainty principle, these questions are theoretically impossible to answer. The smallest attempt to check the CPUs' state will change that state. It is possible to know which state the CPU had, but not which state it is in now. It is possible to know what happened, but not what is happening. How are you going to know the exact position of each electron without changing its position?

This implementation of CPU jitter also depends on the exact speed of the CPU. The jitter deviation occurs when the real speed of the CPU is confronted with the real-time clock of the PC. The randomness happens when one or both rates of change have small variations, affecting the number of iterations of the linear congruential generator before the number is read. In other words, the randomness occurs thanks to small changes in the relative speed between two clocks. One is supposed to always have the same speed, the real-time clock, while the other changes slightly through the simple action of making the CPU execute one single instruction.

The thermodynamic entropy is used to produce information entropy.
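One way to see this effect directly — a sketch of mine, not part of the project — is to count how many LCG steps fit into a fixed wall-clock interval; the count drifts slightly from interval to interval:

#include <chrono>
#include <cstdio>

int main(){
  volatile unsigned long long x = 1;
  for(int i = 0; i < 8; ++i){
    unsigned long long iters = 0;
    // run the LCG step for 100 microseconds of wall-clock time
    auto end = std::chrono::steady_clock::now() + std::chrono::microseconds(100);
    while(std::chrono::steady_clock::now() < end){
      x = x * 7 + 1;
      ++iters;
    }
    std::printf("iterations in 100 us: %llu\n", iters); // varies slightly between intervals
  }
}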
The thermodynamic entropy produced by the CPUs' energetic inefficiency affects physical properties, such as size, due to thermal expansion, and resistance, due to small variations in temperature. The results of the data races depend on the exact speed of the CPUs, which depends on the physical state of the CPUs.
The relative timing of the threads depends most directly on how they are scheduled by the kernel. You can't just assert that the things you mentioned measurably affect the results, especially when those results are timing-dependent, not speed-dependent. You might have an argument if the cores heated up differently, but that's not the case, even in multi-processor systems, since all processors are performing the same task.

Timing is crucial in determining the result of a data race. When did thread 0 start to change the address 0x01? What value did that memory address hold at that moment? What was the state of thread 1? What was the exact speed of thread 0 and thread 1? What was the exact thermodynamic entropy of CPU 0 and CPU 1? What was the exact conductance of each CPU circuit? What was the exact frequency of the computer's oscillators? Thanks to the uncertainty principle, these questions are theoretically impossible to answer.
You're overcomplicating things. Like I said, there's no reason to think that the temperature of the CPU would affect the results.
This is, approximately, the code the compiler will generate for the threads:
    mov rdx, [num]
loop:
    mov rax, [out]
    mov rcx, [tmp]
    lea rax, [rax * 7 + rcx]
    mov [out], rax
    
    mov rax, [tmp]
    lea rax, [rax * 7 + rdx]
    mov [tmp], rax
    jmp loop
Things to notice:
1. Writing to out depends on a single read from out and a single read from tmp.
2. Writing to tmp depends on a single read from tmp.
3. Although not explicitly stated here, all addresses are aligned (because you didn't tell the compiler otherwise).
From #3 we know that loads and stores will be atomic. You will never get or save a value in a half-updated state.
#1 and #2 imply that the value of the write depends on consistently-valued variables, from the point of view of a single thread. If in order to get the next value of out you had to read out twice, a thread might find that the value of out has changed, even though it has not written to it.
So what will happen at any given moment almost all the time is that n threads will update out n times to the same value, and m threads will update tmp m times to the same value, where n+m = number of cores in the system. In order for the execution to significantly deviate from this, the scheduler would need to pause one of the threads between a load and a store for several iterations, such that when it resumes and stores the value, it will cause it to jump a few iterations back. I wouldn't expect this situation to happen very often, though.
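Restated in C++ (a sketch of mine, not code from the thread), each iteration is exactly one load of each variable plus one store; std::atomic with relaxed ordering expresses the same access pattern without the undefined behavior:

#include <atomic>
#include <thread>

std::atomic<unsigned long long> out{0}, tmp{0};

void thr(unsigned char num){
  while(true){
    // one load of out, one load of tmp, one store to out (point #1)
    unsigned long long o = out.load(std::memory_order_relaxed);
    unsigned long long t = tmp.load(std::memory_order_relaxed);
    out.store(o * 7 + t, std::memory_order_relaxed);
    // one fresh load of tmp, one store to tmp (point #2)
    t = tmp.load(std::memory_order_relaxed);
    tmp.store(t * 7 + num, std::memory_order_relaxed);
  }
}

int main(){
  std::thread a(thr, 0), b(thr, 1);
  a.join(); // never returns; runs forever like the original
}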

This implementation of CPU jitter also depends on the exact speed of the CPU. The jitter deviation occurs when the real speed of the CPU is confronted with the real-time clock of the PC. The randomness happens when one or both rates of change have small variations, affecting the number of iterations of the linear congruential generator before the number is read.
You're reading the results as fast as the thread is able to, independently of any timings external to the CPU clock.
The paper on jitter obtained entropy from the variation in nanoseconds of the speed of checking the time in nanoseconds. You're not doing anything that's in any way similar to that.
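The core of the paper's measurement looks roughly like this (a much-simplified sketch of mine): time how long reading the clock itself takes, and harvest the variation from the low bits:

#include <chrono>
#include <cstdio>

int main(){
  using clk = std::chrono::high_resolution_clock;
  for(int i = 0; i < 16; ++i){
    auto t0 = clk::now();
    auto t1 = clk::now(); // how long did the clock read itself take?
    long long d = std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count();
    std::printf("delta=%lld ns, harvested bit=%lld\n", d, d & 1); // the jitter lives in the low bits
  }
}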
helios
Looking at the system at the macro level, I agree with you.

Looking at the system at the micro level, I disagree with you.

At the macro level, a straight line is constantly straight.

At the macro level, the CPU temperature is 53°C.

At the micro level, a straight line is dynamically trying to be straight.

At the micro level, the CPU temperature is ~53°C.

Two good examples are:

https://en.wikipedia.org/wiki/Gutmann_method#Technical_overview
https://en.wikipedia.org/wiki/Fourier_series

An easy explanation of how thermodynamic entropy affects the results of data races is:

(Both threads start at the same time and tmp=0).

    mov rax, [tmp]
    lea rax, [rax * 7 + rdx]
    mov [tmp], rax

The speed of CPU 1 was: 2.1000001 GHz
The speed of CPU 2 was: 2.0999999 GHz

The result:

Thread 1:
    mov rax, [tmp]          # rax=0
    lea rax, [rax * 7 +rdx] # rax=rax*7+0=0
    mov [tmp], rax          # tmp=0 

Thread 2:
    mov rax, [tmp]          # rax=0
    lea rax, [rax * 7 +rdx] # rax=rax*7+1=1
    mov [tmp], rax          # tmp=1 

Because thread 2 finished last, the result is: tmp=1

If thread 2 had been faster than thread 1, the result would have been: tmp=0

You might have an argument if the cores heated up differently, but that's not the case

No matter how sophisticated the cooling system is, there will always be a microscopic difference, similar to the microscopic differences that ballistics studies.

even in multi-processor systems, since all processors are performing the same task.

All processors have different energetic efficiency. Maybe one has 99.7000001% and the other 99.70020001%. So they will heat up differently. Also, the density of the thermal conductor between the two CPUs and the heatsink will be different, maybe by +/- 1 g/m³ or less.

I want to thank you for your kind help. You help me to learn, think, and improve myself. I admire your courage and your command of the English language. You motivate me to try to expand our knowledge and understanding, which is good for me, and I hope it is a good motive for you too.
# rax=rax*7+0=0
[...]
# rax=rax*7+1=1
Wut? Why is rdx different, when it's not modified at all inside the loop?

Because thread 2 finished last, the result is: tmp=1
If both threads entered that section at the same time, then they both read the same value off tmp, which means that LEA will update rax to the same value, since the result depends only on what was read from tmp. Regardless of what order they finish that section in, they will both write the same value back into tmp.

A thermodynamic argument can only prove that the generated sequence is at least somewhat random, which I have already conceded. My point is that this particular algorithm gets most of its entropy from the seed, and not from race conditions in the circuitry. If you want to show otherwise, you need to show that thermodynamics will cause the threads to go significantly out of sync from each other, such that at any given point the current element of the sequence has very little correlation with the previous element.
Instead of arguing the philosophy of quantum mechanics with me you could be fixing the algorithm, which I have indirectly explained how to do in my previous post.
Wut? Why is rdx different, when it's not modified at all inside the loop?

rdx stores the num variable.

void inline thr(Num const num){
  while(true)
    out=out*Base+tmp,tmp=tmp*Base+num;
}

thread ths[Thrs];
for(Num i=0;i<Thrs;++i)
  ths[i]=thread(thr,i);

A thermodynamic argument can only prove that the generated sequence is at least somewhat random, which I have already conceded. My point is that this particular algorithm gets most of its entropy from the seed, and not from race conditions in the circuitry. If you want to show otherwise, you need to show that thermodynamics will cause the threads to go significantly out of sync from each other, such that at any given point the current element of the sequence has very little correlation with the previous element.
Instead of arguing the philosophy of quantum mechanics with me you could be fixing the algorithm, which I have indirectly explained how to do in my previous post.

Ok helios :D
helios
you need to show that thermodynamics will cause the threads to go significantly out of sync from each other

OK. That will be a difficult and expensive experiment, but it will probably be done.

the current element of the sequence has very little correlation with the previous element.

I agree with you. That's why these questions must be answered in order to predict the numbers:

When did thread 0 start to change the output? What value did the prospective output hold at that moment? What was the state of thread 1? What was the exact speed of thread 0 and thread 1? What was the exact thermodynamic entropy of CPU 0 and CPU 1? What was the exact conductance of each CPU circuit? What was the exact frequency of the computer's oscillators?

Instead of arguing the philosophy of quantum mechanics with me you could be fixing the algorithm, which I have indirectly explained how to do in my previous post.

Since you wrote that post I've been analyzing the code and trying to improve it, and I have used every option I could. This is the best way I currently have to generate true random numbers through CPU time jitter and data races.

From #3 we know that loads and stores will be atomic. You will never get or save a value in a half-updated state.

Thank you very much helios. I didn't know that. You taught me something new! :D

You're reading the results as fast as the thread is able to, independently of any timings external to the CPU clock.
The paper on jitter obtained entropy from the variation in nanoseconds of the speed of checking the time in nanoseconds. You're not doing anything that's in any way similar to that.

I agree with you. The version for more than two CPUs reads the results as fast as the thread is able to. The version for one CPU confronts the CPU speed with the real time clock.

Code analysis
This is an extract of the disassembly of the binary, compiled for x86_64; it corresponds to void inline thr(Num const num):

mov    (out),%rdx        #read the value of out
mov    (tmp),%rcx        #read the value of tmp
lea    0x0(,%rdx,8),%rax #step 1: multiply out*7
sub    %rdx,%rax         #step 2: multiply out*7
add    %rcx,%rax         #add the value of tmp to out
mov    %rax,(out)        #save the result (out=out*7+tmp)
mov    (tmp),%rdx        #read the value of tmp
lea    0x0(,%rdx,8),%rax #step 1: multiply tmp*7
sub    %rdx,%rax         #step 2: multiply tmp*7
add    %rsi,%rax         #add the value of num
mov    %rax,(tmp)        #save the result (tmp=tmp*7+num)
jmp    beginning         #do it again 

Defining a clock as a device with a rate of change over time: a data race is the result of confronting two CPU clocks, and CPU time jitter is the result of confronting the CPU clock with the real-time clock.

The CPU speeds change slightly due to the physical entropy caused by the execution of each single instruction, a consequence of their energetic inefficiency.

When the entropy level is enough to activate the heatsink fan, this entropy is exchanged with the environment faster.

When two threads execute exactly at the same time on two different CPUs and tmp=out=0, two cases can occur:

Thread 0

mov    (out),%rdx        #rdx=0
mov    (tmp),%rcx        #rcx=0
lea    0x0(,%rdx,8),%rax #rax=0
sub    %rdx,%rax         #rax=0
add    %rcx,%rax         #rax=0
mov    %rax,(out)        #out=0
mov    (tmp),%rdx        #rdx=0
lea    0x0(,%rdx,8),%rax #rax=0
sub    %rdx,%rax         #rax=0
add    %rsi,%rax         #rax=0
mov    %rax,(tmp)        #tmp=0 

Thread 1

mov    (out),%rdx        #rdx=0
mov    (tmp),%rcx        #rcx=0
lea    0x0(,%rdx,8),%rax #rax=0
sub    %rdx,%rax         #rax=0
add    %rcx,%rax         #rax=0
mov    %rax,(out)        #out=0
mov    (tmp),%rdx        #rdx=0
lea    0x0(,%rdx,8),%rax #rax=0
sub    %rdx,%rax         #rax=0
add    %rsi,%rax         #rax=1
mov    %rax,(tmp)        #tmp=1 

Case #1

The relative speed between the two clocks is different from zero.
CPU 0 executed the above instructions at a speed of: 2.1111111 GHz
CPU 1 executed the above instructions at a speed of: 2.0999999 GHz
Relative speed between the two clocks: 0.0111112 GHz

CPU 0 was faster and CPU 1 was slower.

tmp=0 was written before tmp=1.
Hence the final result is: tmp=1

Case #2

The relative speed between the two clocks is exactly zero.

CPU 0 executed the above instructions at a speed of: 2.1111111 GHz
CPU 1 executed the above instructions at a speed of: 2.1111111 GHz
Relative speed between the two clocks: 0 GHz

CPU 0 and CPU 1 wrote different values at the same time, which put the last bit of tmp into a superposition state.
It is 1 and 0 at the same time. This superposition may live for one Planck time or until its value is read by thread 0, thread 1 or thread 2.

Thread 2

while(true)
  cout.write((char*)&out,sizeof(out));

It is also possible that gravitational waves, caused by the flow of energy inside the CPUs, produce small time dilations.

Desynchronization

mov    (out),%rdx        #out may be changed by the other thread
mov    (tmp),%rcx        #tmp may be changed by the other thread
lea    0x0(,%rdx,8),%rax
sub    %rdx,%rax
add    %rcx,%rax
mov    %rax,(out)        #out may never be read
mov    (tmp),%rdx        #tmp may be changed by the other thread
lea    0x0(,%rdx,8),%rax
sub    %rdx,%rax
add    %rsi,%rax
mov    %rax,(tmp)        #tmp may never be read 
helios
3. Although not explicitly stated here, all addresses are aligned (because you didn't tell the compiler otherwise).
From #3 we know that loads and stores will be atomic. You will never get or save a value in a half-updated state.

I agree with you. Some operations happen atomically. I've verified it.

At least on my computer, byte operations happen atomically, but 64-bit operations do not.

I ran this code:

#include<iostream>
#include<thread>
using namespace std;

typedef unsigned long long Out;

volatile Out out; // shared 64-bit value, written by both threads without synchronization

void inline thr(Out const num){
  while(true)
    out=num; // repeatedly store this thread's bit pattern
}

int main(){
  thread ths[]={
    thread(thr,0xaaaaaaaaaaaaaaaa),  // pattern 1010...
    thread(thr,0x5555555555555555)}; // pattern 0101...
  while(true)
    cout.write((char*)&out,sizeof(out)); // stream whatever bytes are observed
}

If the operations on out and tmp, which are 64 bits wide because they are unsigned long long, were completely atomic, every 64-bit group streamed by the above code would be:

aaaaaaaaaaaaaaaa

or

5555555555555555

But sometimes the output was:

5555aa5555555555

or

aaaa55aaaaa55aaa

:D
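One possible source of the mixed values is the reader itself: cout.write copies the eight bytes from the buffer without any guarantee of reading them in a single operation, so it can observe a torn value even if each store is one instruction. For comparison, here is a sketch (mine, assuming C++11 std::atomic) where both the stores and the read are guaranteed-indivisible 64-bit operations, so it should never print a mixed pattern:

#include <atomic>
#include <cstdio>
#include <thread>

std::atomic<unsigned long long> out{0};

void writer(unsigned long long num){
  while(true)
    out.store(num, std::memory_order_relaxed); // indivisible 64-bit store
}

int main(){
  std::thread a(writer, 0xaaaaaaaaaaaaaaaaULL);
  std::thread b(writer, 0x5555555555555555ULL);
  while(true)
    std::printf("%016llx\n", out.load(std::memory_order_relaxed)); // indivisible 64-bit load
}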
Topic archived. No new replies allowed.