Why is there a volatile qualifier on the variable "d" in the function "f()"?

I am trying to understand the difference between processor time (CPU time) and real time (wall-clock time). I found a very good example here:
http://en.cppreference.com/w/cpp/chrono/c/clock

Here is the code:
#include <iostream>
#include <iomanip>
#include <chrono>
#include <ctime>
#include <thread>
 
// the function f() does some time-consuming work
void f()
{
    volatile double d = 0;
    for(int n=0; n<10000; ++n)
       for(int m=0; m<10000; ++m)
           d += d*n*m;
}
 
int main()
{
    std::clock_t c_start = std::clock();
    auto t_start = std::chrono::high_resolution_clock::now();
    std::thread t1(f);
    std::thread t2(f); // f() is called on two threads
    t1.join();
    t2.join();
    std::clock_t c_end = std::clock();
    auto t_end = std::chrono::high_resolution_clock::now();
 
    std::cout << std::fixed << std::setprecision(2) << "CPU time used: "
              << 1000.0 * (c_end-c_start) / CLOCKS_PER_SEC << " ms\n"
              << "Wall clock time passed: "
              << std::chrono::duration<double, std::milli>(t_end-t_start).count()
              << " ms\n";
}


At one point a thread is created with the function f() passed as its argument:
std::thread t1(f);
In the function f(), a double variable with the volatile qualifier is declared:
volatile double d = 0;

Here I read the definition of what a volatile variable is:
https://stackoverflow.com/questions/4437527/why-do-we-use-volatile-keyword-in-c

But overall... when we use the qualifier "volatile" we tell the compiler not to optimize the source code where the variable is used, because, for example, an external interface could change the value of the variable.
However for me this is not the case in this example... So I don't understand why "d" has the qualifier volatile in this example. Does anybody know?
Thanks.
> when we use the qualifier "volatile" we tell the compiler not to optimize the code where the variable is used.

No. Operations on volatile qualified variables can be optimised.
For example, certain forms of strength reduction would still be possible
(replacing integer division by a power of 2 with a cheaper shift instruction).

Reads from and writes to volatile objects are part of the observable behaviour of the program.
This means that, with volatile int i = 7 ;
the sequence of statements ++i ; ++i ; ++i ; ++i can't be rewritten as i += 4 ;
and for( int j = 0 ; j < 10 ; ++j ) i += j ; can't be rewritten as i += 45 ;
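A minimal sketch of the two cases just described. The function names are made up for illustration. The increments are written out as i = i + 1 because C++20 deprecated ++ and += on volatile operands; each statement is still one read and one write of the volatile object, and the compiler has to perform every one of them.

```cpp
// Four separate read-modify-write accesses to a volatile object.
// Each statement is an observable access, so they cannot be merged into i += 4.
int four_increments()
{
    volatile int i = 7 ;
    i = i + 1 ; i = i + 1 ; i = i + 1 ; i = i + 1 ;
    return i ; // 7 + 4 == 11
}

// Ten separate read-modify-write accesses; the loop cannot be
// collapsed into a single i += 45.
int loop_sum()
{
    volatile int i = 7 ;
    for( int j = 0 ; j < 10 ; ++j ) i = i + j ;
    return i ; // 7 + (0 + 1 + ... + 9) == 52
}
```

The results are the same as they would be without volatile; what changes is that the generated code really performs every read and write instead of folding them into one.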
My question is actually the following:
So I don't understand why "d" has the qualifier volatile in this example. Does anybody know?


Regarding the comment of @JLBorges:
1.-
Reads from and writes to volatile objects

In the example I am talking about a volatile variable not a volatile object. Can variables also be seen as objects in cpp?

2.-
certain forms of strength reduction would still be possible

What kind of compiler optimization is blocked when using the keyword volatile? Only for loop arguments?
Can variables also be seen as objects in cpp?


http://en.cppreference.com/w/cpp/language/object

What kind of compiler optimization is blocked when using the keyword volatile? Only for loop arguments?

I guess the as-if rule might have something to do with it:

http://en.cppreference.com/w/cpp/language/as_if

Anyway, consult the vastly superior (light years ahead) JLBorges et al. for non guesses :+)
However for me this is not the case in this example... So I don't understand why "d" has the qualifier volatile in this example. Does anybody know?


The function is being used by 2 threads, I am guessing that is a big factor. Not sure what happens if the calls to it are interleaved, would the value of d go crazy? Does d have a separate representation in each thread, or does volatile enforce that somehow? I am guessing it doesn't. There is no mutex to prevent different threads running that code.

So I don't know, just guessing and putting ideas out, as per my user tag :+)
What kind of compiler optimization is blocked when using the keyword volatile?

What volatile does precisely (with respect to accessing a volatile glvalue) is up to the implementation. As JLBorges indicates, in general, it means that accesses can't be re-ordered or removed; each access is considered a side-effect for the purposes of the optimizer.

The volatile keyword serves to prevent the accesses to d being combined, removed, or reordered, or more likely to prevent the function body from being optimized out entirely.
Here is an example:

With no volatile access,
int foo()
{
    int n = 1'000 ; 
    int v = 0 ;

    for( int i = 0 ; i < n ; ++i )
       for( int j = 0 ; j < n ; ++j )
          ++v ;
    
    return v ;
} 


is optimised away as int foo() { return 1'000'000 ; }


With volatile access, this kind of optimisation is not possible, the variable v is incremented 1'000'000 times :
int foo_with_volatile_var()
{
    int n = 1'000 ; 
    volatile int v = 0 ;

    for( int i = 0 ; i < n ; ++i )
       for( int j = 0 ; j < n ; ++j )
          ++v ;
    
    return v ;
} 

https://godbolt.org/g/U3EcTV

However, even with volatile access, other kinds of optimisations are still possible; for example, the LLVM compiler unrolls the inner loop, performing five increments in each of 200 iterations. https://godbolt.org/g/jBWKN6
This is an example of optimisation (strength reduction) on an operation on a volatile variable that was mentioned earlier: division is replaced with a cheaper arithmetic shift, even for the volatile object.

void bar_with_volatile( volatile unsigned int& v ) { v /= 64 ; }

void bar_without_volatile( unsigned int& v ) { v /= 64 ; }

https://godbolt.org/g/bVQdgj
by the way

found a very good example here:
http://www.cplusplus.com/reference/ctime/clock/

The example quoted is actually from http://en.cppreference.com/w/cpp/chrono/c/clock
consult the vastly superior (light years ahead) JLBorges et al. for non guesses :+)

I am astounded

@JLBorges Thanks for this example:
int foo()
{
    int n = 1'000 ; 
    int v = 0 ;

    for( int i = 0 ; i < n ; ++i )
       for( int j = 0 ; j < n ; ++j )
          ++v ;
    
    return v ;
}  

vs
int foo_with_volatile_var()
{
    int n = 1'000 ; 
    volatile int v = 0 ;

    for( int i = 0 ; i < n ; ++i )
       for( int j = 0 ; j < n ; ++j )
          ++v ;
    
    return v ;
}  

I got it! So in the example they want the function "f()" to spend some time. To make sure that time is considerably long, they make the variable "d" volatile. Basically this prevents the accesses to this variable from being optimised (combined, removed, or reordered), so the compiler cannot take any shortcut in the loop iterations.

Regarding the second example:
I see that the editor you linked to also shows the generated assembly:
shr dword ptr [rdi], 6
ret

But I don't understand what you mean by
cheaper arithmetic shift


@Cubbi: Thanks for the observation. I will change it... Too many tabs open and a hasty copy-paste.
> But I don't understand what you mean by cheaper arithmetic shift

Assume that on a particular hardware platform, for (unsigned integer) values in registers, an integer divide instruction takes 7 clock cycles and a bit-wise shift instruction takes 2 clock cycles. This is typical; on most platforms a divide instruction is more expensive (takes more time) than a shift instruction. Here, if the compiler can replace a divide instruction with a shift instruction that produces an equivalent result, the generated code would execute faster. https://en.wikipedia.org/wiki/Division_by_two#Binary
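As a small sketch of why the replacement is valid: for unsigned integers, dividing by 64 (which is 2 to the power 6) gives the same result as shifting right by 6 bits. The function names below are invented for this illustration. For signed integers the two are not interchangeable for negative values without an extra adjustment, because division truncates toward zero.

```cpp
#include <cstdint>

// Division by 64, written the way the source code expresses it.
std::uint32_t divide_by_64( std::uint32_t v ) { return v / 64 ; }

// The same value computed with a right shift by 6 bits (64 == 2 to the power 6).
std::uint32_t shift_right_by_6( std::uint32_t v ) { return v >> 6 ; }

// Returns true if the two functions agree for every value in [first, last).
bool results_agree( std::uint32_t first, std::uint32_t last )
{
    for( std::uint32_t v = first ; v < last ; ++v )
        if( divide_by_64(v) != shift_right_by_6(v) ) return false ;
    return true ;
}
```

An optimising compiler would typically generate a shift for both functions, which is the substitution described above.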
---- @JLBorges ----
I understand that shifting a binary number one position to the right divides it by two; that is what I read in the link you provided.
I also read that the assembly instruction for shifting to the right is shr.

In the example you provided:
void bar_with_volatile( volatile unsigned int& v ) { v /= 64 ; }
void bar_without_volatile( unsigned int& v ) { v /= 64 ; }


They are actually executing this shift to the right:
 shr dword ptr [rdi], 6
 ret


Unfortunately I do not understand the rest of the line:

ptr [rdi], 6
ret

So I think I am missing something about what you say:
if the compiler can replace a divide instruction with a shift instruction


----
I would like also to point out what @TheIdeasMan said:
Not sure what happens if the calls to it are interleaved, would the value of d go crazy? Does d have a separate representation in each thread, or does volatile enforce that somehow? I am guessing it doesn't. There is no mutex to prevent different threads running that code


In the example there is no mutex. So the variable "d" is accessed by both threads and, at first sight, its value should be modified by both of them. However, the program seems to execute correctly.
Why does the variable "d" not go crazy?
> Unfortunately I do not understand the rest of the line:

To understand the rest of the line, you would need to learn about how arguments are passed to a function in the x86-64 architecture, and how a typical compiler implements passing references to objects to a function.

To understand that a shift instruction is used for the integer divide operation, the knowledge that shr is the shift right instruction is sufficient.
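As a rough reading of that line, assuming the code was compiled for x86-64 with the System V calling convention (the one used on Linux, which the use of rdi suggests): the first integer or pointer argument arrives in the register rdi, and a compiler typically passes a reference as the address of the object. Under that assumption, "shr dword ptr [rdi], 6" shifts the 32-bit value (dword) stored at the address held in rdi right by 6 bits, directly in memory, and "ret" returns to the caller. The hypothetical pointer version below would usually compile to the same instructions as the reference version.

```cpp
// Reference version, as in the example above (without volatile).
void bar_by_reference( unsigned int& v ) { v /= 64 ; }

// Hypothetical pointer version: the caller passes the address explicitly.
// This mirrors how a reference argument is typically passed at machine level.
void bar_by_pointer( unsigned int* p ) { *p /= 64 ; }
```

Calling bar_by_reference(x) and bar_by_pointer(&x) on equal values leaves both with the same result.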


> So the variable "d" is accessed by both threads

No. The object has automatic storage duration; the two threads operate on two different objects.
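A small sketch of this point, with invented names (count_up, results, run_on_two_threads). Each call to count_up creates its own automatic object d, so two threads running it at the same time never touch each other's d, and each gets the result it would get when running alone. If d were instead declared static or at namespace scope, the two threads would share one object and the unsynchronised accesses would be a data race; volatile would not make that safe.

```cpp
#include <functional>
#include <thread>

// d has automatic storage duration: every call, on whichever thread,
// gets its own separate d.
void count_up( double start, int steps, double& result )
{
    volatile double d = start ;
    for( int n = 0 ; n < steps ; ++n ) d = d + 1 ;
    result = d ;
}

struct results { double first ; double second ; } ;

// Runs count_up on two threads at once; each thread writes only its own result.
results run_on_two_threads()
{
    results r { 0, 0 } ;
    std::thread t1( count_up, 0.0, 100000, std::ref(r.first) ) ;
    std::thread t2( count_up, 1000000.0, 100000, std::ref(r.second) ) ;
    t1.join() ;
    t2.join() ;
    return r ;
}
```

With these arguments the first thread ends with 100000 and the second with 1100000, regardless of how the two threads are interleaved.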