Why is the execution time 0ms?

Pages: 123
My program calculates the factorial of 100:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
#include <iostream>

using namespace std;

int main() {
	double fac, tmp1;
	double duration;

	fac=100;

	tmp1 = fac-1;

	clock_t c_start = clock();
	while (tmp1>0){
		fac*=tmp1;
		tmp1--;
		//cout << " #####fac: " <<fac;
	}

	clock_t c_end=clock();

	duration= 1000.0*(c_start - c_end) / CLOCKS_PER_SEC;

	cout<<endl;

	cout<< "fac is: " <<fac <<endl;
	cout<< "#####Execution time: " <<duration <<"ms" <<endl;


I was surprised to see an execution time of 0 ms.
Does anybody know why?
Thanks.
On Line 22 you require c_end - c_start, not the other way round. It actually gave me -0.001 milliseconds in c++ shell on one trial.

The number of clock ticks per second is machine-dependent, but 100 multiplies hopefully won't take long. If it's less than one clock interval then the discrete result will be 0.
Last edited on
Thanks @lastchance.

c_end - c_start, not the other way round

Silly mistake. I was already too late when I was working on this topic...

If it's less than one clock interval then the discrete result will be 0

I think this can be the reason why the 0 is displayed in the stdout.

How can I confirm that this is the reason?
I mean... Is there a way to know how long a clock interval takes?

I also tried to increase the 100 multiplies by 1000 but it goes to "inf". Probably because the double type precision is not big enough to store the factorial.
Last edited on
Is there a way to know how long a clock interval takes?


You could output CLOCKS_PER_SEC. (It will be different on every machine: it's miles better in c++ shell than my desktop PC, for instance.)

Factorials grow very quickly indeed and will rapidly exceed standard numerical limits, even with doubles. You could calculate instead log(N!), which will grow much more slowly. The code below does this for N=1000000. Timing is for my (slow!) desktop PC. (Laws of logs: log(product) = sum of logs).

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
#include <iostream>
#include <ctime>
#include <cmath>
using namespace std;

int main()
{
   cout << "Number of clock ticks per second = " << CLOCKS_PER_SEC << '\n';
   cout << "Time between clock ticks = " << 1.0 / CLOCKS_PER_SEC << " s\n\n";

   int N = 1000000;
   double log10Nfactorial = 0.0;

   clock_t c_start = clock();
   for ( int i = 1; i <= N; i++ ) log10Nfactorial += log10( (double)i );
   clock_t c_end = clock();

   cout << "log_10( " << N << "! ) = " << log10Nfactorial << '\n';
   cout << "This vitally-important calculation took " << 1000.0 * ( c_end - c_start ) / CLOCKS_PER_SEC << " ms\n";
}


Number of clock ticks per second = 1000
Time between clock ticks = 0.001 s

log_10( 1000000! ) = 5.56571e+006
This vitally-important calculation took 93 ms
Last edited on
Thanks again @lastchance.

You could output CLOCKS_PER_SEC

This worked!! :)

So I assume if an execution takes less than one clock tick there is no possible way to calculate the execution time :'(. Is there?

I was also checking other ways to measure the execution time with the clock ticks in C++98 and I found this:
std::chrono::steady_clock::now

But it looks that returns the same as
std::clock

I don't see any special advantage between both functions...


Last edited on
So I assume if an execution takes less than one clock tick there is no possible way to calculate the execution time :'(. Is there?


That is true. Actually, the more clock ticks a calculation takes, the more accurate your answer is. The percentage error in 1 tick from 3 is much greater than 1 tick in 10 for example. A reasonable approach is simply to repeat the calculation a large number of times, so that the relative error is small (i.e. you have a large number of clock ticks).

There are many more routines available in <chrono>, so they add more flexibility, but whether they give you better time accuracy I'm afraid I'm not qualified to answer. (Search on this site and you will find more knowledgeable answers about this.)


My approach would always be to repeat the calculation a sufficient number of times that the relative error is small. If I repeat it 100 times, say, then I could always divide my final answer by 100 to get the unit time.
closed account (48T7M4Gy)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
#include <chrono>

using namespace std::chrono;

class Stopwatch
{
private:
    steady_clock::time_point start;
    
public:
    Stopwatch(){
        reset();
    }
    
    ~Stopwatch(){}
    
    // Start the clock
    void reset(){ start = steady_clock::now(); }
    
    // Read time elapsed in nanoseconds
    double getTime(){
        return duration<double, std::nano>(steady_clock::now() - start).count();
    }
};

#include <iostream>
using namespace std;

int main() {
    double fac, tmp1;
    
    fac=100;
    
    tmp1 = fac-1;
    
    Stopwatch sw;
    
    while (tmp1>0){
        fac*=tmp1;
        tmp1--;
        //cout << " #####fac: " <<fac;
    }
    cout<< "fac is: " <<fac <<endl;
    cout<< "#####Execution time: " <<sw.getTime() <<" nanoseconds" <<endl;
}
Thanks @kemort.

I think that is the modern solution @puertas12.
closed account (48T7M4Gy)
My pleasure @lastchance

I decided a couple of weeks ago to write myself a stopwatch class for some other timing stuff I'm doing and thought it might be useful here.

I was surprised about the nanoseconds at first but it makes sense.

http://en.cppreference.com/w/cpp/chrono/duration
hey kemort, using the helper type std::chrono::nanoseconds instead of the duration_cast, am getting 0 nanoseconds:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
 
#include <chrono>

using namespace std::chrono;

class Stopwatch
{
private:
    steady_clock::time_point start;

public:
    Stopwatch(){
        reset();
    }

    ~Stopwatch(){}

    // Start the clock
    void reset(){ start = steady_clock::now(); }

    // Read time elapsed in nanoseconds
    double getTime(){
    //    return duration<double, std::nano>(steady_clock::now() - start).count();
       return std::chrono::nanoseconds (steady_clock::now() - start).count();
    }
};

#include <iostream>
using namespace std;

int main() {
    double fac, tmp1;

    fac=100;

    tmp1 = fac-1;

    Stopwatch sw;

    while (tmp1>0){
        fac*=tmp1;
        tmp1--;
        //cout << " #####fac: " <<fac;
    }
    cout<< "fac is: " <<fac <<endl;
    cout<< "#####Execution time: " <<sw.getTime() <<" nanoseconds" <<endl;
}

using the version of your program the execution time is, for example, 2.967e+006 nanoseconds. any ideas why this difference? thanks
closed account (48T7M4Gy)
I don't know - chrono is a bit of a mystery to me in that sort of detail. Maybe Cubbi will chime in. The logic(?) I decided on was to measure all the times in nanoseconds and convert as I need outside the class eg in main().

It might have something to do with double appearing in the function instead of duration</*signed integer type of at least 64 bits*/, std::nano> as in cppreference. I put double in before I decided to measure all times in nanoseconds.

Have you tried running you version in the cpp shell because I get:
fac is: 9.33262e+157
#####Execution time: 74436 nanoseconds


and for my version above I get
fac is: 9.33262e+157
#####Execution time: 51258 nanoseconds
both vaguely the same while dramatically different to your 2.967e+006

I can say no more ... :)
All the clocks in <chrono> measure the wrong time for this; they are not appropriate for this program. The elapsed time reported by clocks in <chrono> would vary from run to run depending on the processor and other execution resources given to the program by the operating system.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
#include <ctime>
#include <iostream>
#include <chrono>
#include <thread>

volatile long long v = 0 ;

int main()
{
    constexpr int N = 1024*1024*128 ;
    
    using namespace std::chrono ;
    const auto timepoint_start = steady_clock::now() ;         
    const auto clock_start = std::clock() ;
    
    for( int i=0 ; i < N ; ++i ) v += 1 ;
    
    // simulate the processor not being available to our program for two seconds
    // for example, this could happer if our timeslice expired after the first loop
    //                                and there is contention for processor time
    std::this_thread::sleep_for( seconds(2) ) ; 
    
    for( int i=0 ; i < N ; ++i ) v += 1 ;

    const auto clock_end = std::clock() ;
    const auto timepoint_end = steady_clock::now() ;         

    std::cout << " elapsed processor time: " << double(clock_end-clock_start) / CLOCKS_PER_SEC * 1000.0 << " milliseconds.\n" ;
    std::cout << "elapsed wall clock time: " << duration_cast<milliseconds>(timepoint_end-timepoint_start).count() << " milliseconds.\n" ;
}

 elapsed processor time: 684.851 milliseconds.
elapsed wall clock time: 2685 milliseconds.

http://coliru.stacked-crooked.com/a/6e1cbae48ca5fcda
@JLBorges,
when I run your code I get the same result for steady_clock::now() and clock().
Why is this ?

elapsed processor time: 2617 milliseconds.
elapsed wall clock time: 2617 milliseconds.

closed account (48T7M4Gy)
Interesting. I don't know where this leaves us but if we comment out line 23 and use the "Stopwatch" then all 3 times are the same on my machine.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
#include <iostream>
#include <chrono>
#include <thread>
#include "stopwatch.h"

volatile long long v = 0 ;

int main()
{
    constexpr int N = 1024*1024*128 ;
    
    using namespace std::chrono ;
    const auto timepoint_start = steady_clock::now() ;
    const auto clock_start = std::clock() ;
    Stopwatch sw;
    for( int i=0 ; i < N ; ++i ) v += 1 ;
    
    // simulate the processor not being available to our program for two seconds
    // for example, this could happer if our timeslice expired after the first loop
    //                                and there is contention for processor time
    
    //std::this_thread::sleep_for( seconds(2) ) ; // <--
    
    for( int i=0 ; i < N ; ++i ) v += 1 ;
    
    const auto clock_end = std::clock() ;
    const auto timepoint_end = steady_clock::now() ;
    
    std::cout << " elapsed processor time: " << double(clock_end-clock_start) / CLOCKS_PER_SEC * 1000.0 << " milliseconds.\n" ;
    std::cout << "elapsed wall clock time: " << duration_cast<milliseconds>(timepoint_end-timepoint_start).count() << " milliseconds.\n" ;
    std::cout << sw.getTime()/1000000 << " milliseconds\n";
}


elapsed processor time: 518.371 milliseconds.
elapsed wall clock time: 520 milliseconds.
520 milliseconds
Program ended with exit code: 0


Put the "thread line" back in and:
1
2
3
4
elapsed processor time: 521.417 milliseconds.
elapsed wall clock time: 2524 milliseconds.
2524 milliseconds
Program ended with exit code: 0
JLBorges wrote:
All the clocks in <chrono> measure the wrong time for this; they are not appropriate for this program.

Hi, JLBorges. Do you mean there’s no solution in standard C++ or is there an appropriate clock elsewhere? Should we rely on third part library to get a correct result?
Could the average of several program execution be considered reliable enough, in your opinion?
Presumably, you are using the Microsoft library (or a library like the GNU library on windows which internally uses the facilities provided by the Microsoft library).

The implementation of clock() and std::clock() in the Microsoft library does not conform to the standard.
Note that this function does not strictly conform to ISO C, which specifies net CPU time as the return value.
https://msdn.microsoft.com/en-us/library/4e2ess30.aspx


Microsoft has categorically stated that they are not going to fix the non-conforming behaviour of std::clock()
This is indeed an instance of nonconformance in the Visual C++ C Standard Library implementation. We have made substantial improvements to the implementation of clock() for the next major release of the Visual C++ libraries by reimplementing it in terms of QueryPerformanceCounter, to improve precision and ensure that the results are monotonic. However, we have opted to avoid reimplementing clock() in such a way that it might return time values advancing faster than one second per physical second, as this change would silently break programs depending on the previous behavior (and we expect there are many such programs).
https://connect.microsoft.com/VisualStudio/feedback/details/860388/in-c-std-clock-returns-wrong-value

Thanks,

yes I use VS 2015 CE.
> Do you mean there’s no solution in standard C++

There is a solution; the standard does provide std::clock().


> is there an appropriate clock elsewhere?

Boost CPU timers can report processor time usage with high resolution.
http://www.boost.org/doc/libs/1_65_0/libs/timer/doc/cpu_timers.html
I was trying out some of the Windows-specific functions, in this case GetProcessTimes().

I may have taken some shortcuts converting from windows structures to usable numbers but it seems plausible. The resolution though seems to be in steps of 1/64 second on my machine.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
#include <windows.h>

class CPUTimer {
// comment: I suspect resolution is 1/64 second

public:
    CPUTimer();
    void stop(); 
    double CPU()    const { return utime; }
    double kernel() const { return ktime; }
   
private:
    FILETIME CreationTime;
    FILETIME ExitTime;
    FILETIME KernelTime;
    FILETIME UserTime;
    HANDLE handle;
    
    double ktime;
    double utime;
    
    unsigned long long value(const FILETIME &);
};

unsigned long long CPUTimer::value(const FILETIME & input)
{
    unsigned long long result = input.dwHighDateTime;
    result <<= 32;
    result |= input.dwLowDateTime;
    return result;
}

CPUTimer::CPUTimer() : ktime(0.0), utime(0.0)
{
    handle = GetCurrentProcess();
    GetProcessTimes(handle, 
        &CreationTime, &ExitTime, 
        &KernelTime, &UserTime );
}

void CPUTimer::stop()
{
    FILETIME CTime;
    FILETIME ETime;
    FILETIME KTime;
    FILETIME UTime;

    GetProcessTimes(handle, 
        &CTime, &ETime, 
        &KTime, &UTime ); 
                  
    ktime = (value(KTime) - value(KernelTime)) / 1.0E7;
    utime = (value(UTime) - value(UserTime)) / 1.0E7;
}

//------------------------------------------------------

#include <ctime>
#include <iostream>
#include <chrono>
#include <thread>

volatile long long v = 0 ;

int main()
{
    constexpr int N = 1024*1024*128 ;
    
    using namespace std::chrono ;
    CPUTimer cputimer;
    
    const auto timepoint_start = steady_clock::now() ;         
    const auto clock_start = std::clock() ;
    
    for( int i=0 ; i < N ; ++i ) v += 1 ;
    
    // simulate the processor not being available to our program for two seconds
    // for example, this could happer if our timeslice expired after the first loop
    //                                and there is contention for processor time
    std::this_thread::sleep_for( seconds(2) ) ; 
    
    for( int i=0 ; i < N ; ++i ) v += 1 ;

    const auto clock_end = std::clock() ;
    const auto timepoint_end = steady_clock::now() ; 
    cputimer.stop();        

    std::cout << " elapsed processor time: " << double(clock_end-clock_start) / CLOCKS_PER_SEC * 1000.0 << " milliseconds.\n" ;
    std::cout << "elapsed wall clock time: " << duration_cast<milliseconds>(timepoint_end-timepoint_start).count() << " milliseconds.\n" ;
    std::cout << " windows processor time: " << cputimer.CPU()  * 1000.0 << " milliseconds.\n" ;

}

My output:
 elapsed processor time: 3093 milliseconds.
elapsed wall clock time: 3100 milliseconds.
 windows processor time: 1093.75 milliseconds.


and same program on a different machine:
 elapsed processor time: 2747 milliseconds.
elapsed wall clock time: 2749 milliseconds.
 windows processor time: 750 milliseconds.

Last edited on
Oh! It has been quite a long conversation here :)

Overall, according with what JLBorges said:

All the clocks in <chrono> measure the wrong time for this; they are not appropriate for this program


I cannot rely in the library <chrono>, although the class provided by kemort looks interesting.

Specifically for my factiorial implementation:
1
2
3
4
5
while (tmp1>0){
		fac*=tmp1;
		tmp1--;
		//cout << " #####fac: " <<fac;
}

There is no way to measure the execution time with std::clock() because this execution takes less than one clock tick :(
However I will try with the Boost CPU timers JLBorges suggested. I wonder if with high precision I will be able to measure the execution....

In C++11 the library <chrono> adds a class high_resolution_clock. Could I rely on the function in two instances of chrono::high_resolution_clock::now() to calculate the execution time?

I also would like to point out the calculation of duration:
 
duration= 1000.0*(c_end - c_start) / CLOCKS_PER_SEC;

I noticed that c_start returns always 0. Is it then mandatory to do the subtraction?
I mean I think this would return the same result.
 
duration= 1000.0*(c_end) / CLOCKS_PER_SEC;


Finally a question for @Chervil:
What do you want to say with your last posted message? :)
Have you found a way to calculate the execution time with cpp in Windows?
Pages: 123