Unexpected Timing Results

Hello there!

So I wanted to run an experiment to test the timing difference between using a switch statement and using an array of lambdas. I seem to be having an issue with my timer or something, and I suspect I'm going to feel stupid after asking this, but here goes.

#include <iostream>
#include <functional>
#include <chrono>

using std::function;
using namespace std::chrono;
using namespace std::chrono_literals;

enum MyEnum
{
	First,
	Second,
	Third,
	Fourth,
	Fifth,
	Sixth,
	Seventh,
	Eigth,
	Ninth,
	Tenth,
	Eleventh,
	NUM_ELEMENTS
};

struct LambdaOp
{
	function< void() > op = []() { std::cout << "Why?\n"; };
};

static const LambdaOp s_LambdaOps[ NUM_ELEMENTS ] =
{
	{ []() { std::cout << "adummy\n"; } }, // First
	{ []() { std::cout << "bdummy\n"; } }, // Second
	{ []() { std::cout << "cdummy\n"; } }, // Third
	{ []() { std::cout << "ddummy\n"; } }, // Fourth
	{ []() { std::cout << "edummy\n"; } }, // Fifth
	{ []() { std::cout << "fdummy\n"; } }, // Sixth
	{ []() { std::cout << "gdummy\n"; } }, // Seventh
	{ []() { std::cout << "hdummy\n"; } }, // Eigth
	{ []() { std::cout << "idummy\n"; } }, // Ninth
	{ []() { std::cout << "jdummy\n"; } }, // Tenth
	{ []() { std::cout << "kdummy\n"; } }, // Eleventh
};

void LambdaMethod( const MyEnum &lambdaOp )
{
	s_LambdaOps[ lambdaOp ].op();
}

void SwitchMethod( const MyEnum &statement )
{
	switch ( statement )
	{
	case First:
		std::cout << "adummy\n";
		break;
	case Second:
		std::cout << "bdummy\n";
		break;
	case Third:
		std::cout << "cdummy\n";
		break;
	case Fourth:
		std::cout << "ddummy\n";
		break;
	case Fifth:
		std::cout << "edummy\n";
		break;
	case Sixth:
		std::cout << "fdummy\n";
		break;
	case Seventh:
		std::cout << "gdummy\n";
		break;
	case Eigth:
		std::cout << "hdummy\n";
		break;
	case Ninth:
		std::cout << "idummy\n";
		break;
	case Tenth:
		std::cout << "jdummy\n";
		break;
	case Eleventh:
		std::cout << "kdummy\n";
		break;
	default:
		std::cout << "Why?\n";
		break;
	}
}

class MyTimer
{
public:
	void Start()
	{
		start = high_resolution_clock::now();
	}

	void Stop()
	{
		end = high_resolution_clock::now();
		time_span = duration_cast< duration< double > >( end - start );
	}

	double GetTime() const
	{
		return time_span.count();
	}

private:
	high_resolution_clock::time_point start = high_resolution_clock::now();
	high_resolution_clock::time_point end = start;
	duration< double > time_span = {};
};

int main( int argc, char *argv[] )
{
	const int numIterations = 100000;
	const MyEnum element = Sixth;

	MyTimer SwitchTimer = {};
	MyTimer LambdaTimer = {};

	LambdaTimer.Start();
	for ( int i = 0; i < numIterations; ++i )
	{
		LambdaMethod( element );
	}
	LambdaTimer.Stop();

	SwitchTimer.Start();
	for ( int i = 0; i < numIterations; ++i )
	{
		SwitchMethod( element );
	}
	SwitchTimer.Stop();

	std::cout << "Lambda Time: " << LambdaTimer.GetTime() << std::endl;
	std::cout << "Switch Time: " << SwitchTimer.GetTime() << std::endl;
	
	std::cin.get();

	return 0;
}


It seems like no matter what, whichever method I do SECOND comes out faster.

So you see this:

	LambdaTimer.Start();
	for ( int i = 0; i < numIterations; ++i )
	{
		LambdaMethod( element );
	}
	LambdaTimer.Stop();

	SwitchTimer.Start();
	for ( int i = 0; i < numIterations; ++i )
	{
		SwitchMethod( element );
	}
	SwitchTimer.Stop();


The SwitchTimer comes out faster. However, if I swap these like this:

	SwitchTimer.Start();
	for ( int i = 0; i < numIterations; ++i )
	{
		SwitchMethod( element );
	}
	SwitchTimer.Stop();

	LambdaTimer.Start();
	for ( int i = 0; i < numIterations; ++i )
	{
		LambdaMethod( element );
	}
	LambdaTimer.Stop();


The LambdaTimer comes out faster.

I must clearly be doing something wrong. What am I not accounting for here? This is making me feel stupid.
I'm getting fairly consistent results.

#include <iostream>
#include <functional>
#include <chrono>
#include <ctime>

using std::function;
using namespace std::chrono;
using namespace std::chrono_literals;

enum MyEnum
{
	First,
	Second,
	Third,
	Fourth,
	Fifth,
	Sixth,
	Seventh,
	Eigth,
	Ninth,
	Tenth,
	Eleventh,
	NUM_ELEMENTS
};

volatile unsigned long long n = 0 ;

struct LambdaOp
{
	function< void() > op = []() { std::cout << "Why?\n"; };
};

static const LambdaOp s_LambdaOps[ NUM_ELEMENTS ] =
{
	{ []() { std::cout << "adummy\n"; } }, // First
	{ []() { std::cout << "bdummy\n"; } }, // Second
	{ []() { std::cout << "cdummy\n"; } }, // Third
	{ []() { std::cout << "ddummy\n"; } }, // Fourth
	{ []() { std::cout << "edummy\n"; } }, // Fifth
	{ []() { ++n ; } }, // Sixth
	{ []() { std::cout << "gdummy\n"; } }, // Seventh
	{ []() { std::cout << "hdummy\n"; } }, // Eigth
	{ []() { std::cout << "idummy\n"; } }, // Ninth
	{ []() { std::cout << "jdummy\n"; } }, // Tenth
	{ []() { std::cout << "kdummy\n"; } }, // Eleventh
};

void LambdaMethod( const MyEnum &lambdaOp )
{
	s_LambdaOps[ lambdaOp ].op();
}

void SwitchMethod( const MyEnum &statement )
{
	switch ( statement )
	{
	case First:
		std::cout << "adummy\n";
		break;
	case Second:
		std::cout << "bdummy\n";
		break;
	case Third:
		std::cout << "cdummy\n";
		break;
	case Fourth:
		std::cout << "ddummy\n";
		break;
	case Fifth:
		std::cout << "edummy\n";
		break;
	case Sixth:
		++n ;
		break;
	case Seventh:
		std::cout << "gdummy\n";
		break;
	case Eigth:
		std::cout << "hdummy\n";
		break;
	case Ninth:
		std::cout << "idummy\n";
		break;
	case Tenth:
		std::cout << "jdummy\n";
		break;
	case Eleventh:
		std::cout << "kdummy\n";
		break;
	default:
		std::cout << "Why?\n";
		break;
	}
}

class MyTimer
{
public:
	void Start()
	{
		start = std::clock() ;
	}

	void Stop()
	{
		end = std::clock() ;
	}

	double GetTime() const
	{
		return (end-start) * 1000.0 / CLOCKS_PER_SEC ;
	}

private:
	std::clock_t start = std::clock();
	std::clock_t end = std::clock() ;
};

int main()
{
	std::cout << std::fixed ;
	const long long numIterations = 400'000'000;

	const MyEnum element = Sixth;

	// get relevant stuff into the cache
	SwitchMethod( element );
	LambdaMethod( element );

	MyTimer SwitchTimer = {};
	MyTimer LambdaTimer = {};

	SwitchTimer.Start();
	for ( long long i = 0; i < numIterations; ++i )
	{
		SwitchMethod( element );
	}
	SwitchTimer.Stop();
	std::cout << "Switch Time: " << SwitchTimer.GetTime() << "  (processor time, milliseconds)\n";

	LambdaTimer.Start();
	for ( long long i = 0; i < numIterations; ++i )
	{
		LambdaMethod( element );
	}
	LambdaTimer.Stop();
	std::cout << "Lambda Time: " << LambdaTimer.GetTime() << "  (processor time, milliseconds)\n";

	LambdaTimer.Start();
	for ( long long i = 0; i < numIterations; ++i )
	{
		LambdaMethod( element );
	}
	LambdaTimer.Stop();
	std::cout << "Lambda Time: " << LambdaTimer.GetTime() << "  (processor time, milliseconds)\n";

	SwitchTimer.Start();
	for ( long long i = 0; i < numIterations; ++i )
	{
		SwitchMethod( element );
	}
	SwitchTimer.Stop();
	std::cout << "Switch Time: " << SwitchTimer.GetTime() << "  (processor time, milliseconds)\n";
}

echo && echo && g++ -std=c++17 -O3 -Wall -Wextra -pedantic-errors -pthread main.cpp && ./a.out
echo && echo =========== && echo 
clang++ -std=c++17 -stdlib=libc++ -O3 -Wall -Wextra -pedantic-errors -pthread main.cpp -lsupc++ && ./a.out


Switch Time: 1038.101000  (processor time, milliseconds)
Lambda Time: 1148.268000  (processor time, milliseconds)
Lambda Time: 1112.642000  (processor time, milliseconds)
Switch Time: 1076.517000  (processor time, milliseconds)

===========

Switch Time: 1032.746000  (processor time, milliseconds)
Lambda Time: 1142.727000  (processor time, milliseconds)
Lambda Time: 1087.514000  (processor time, milliseconds)
Switch Time: 1155.309000  (processor time, milliseconds)

http://coliru.stacked-crooked.com/a/39a02a1b7aead344
I ended up trying different platforms/compilers and got different results that made more sense. From what I'm gathering, it probably had to do with CPU caching.

However I'm still perplexed. Isn't a switch statement nothing more than a glorified set of if/else statements? If that's the case, why does the array of std::functions that contain lambdas tend to take longer given your results?

I would've expected for example that going for "Eleventh" would take longer in the switch statement than the lambda array.
A switch statement is usually not turned into a set of if-else by the compiler. Typically, it is a single look-up in a table.

https://stackoverflow.com/questions/2596320/how-does-switch-compile-in-visual-c-and-how-optimized-and-fast-is-it
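To make the "single look-up in a table" idea concrete, here is a rough hand-written sketch. The handler names and return values are made up for illustration, and whether a compiler really emits a jump table for a given switch depends on the compiler, the optimization flags, and how dense the case values are. DispatchTable only shows what such a look-up amounts to: one bounds check followed by one indexed call.

```cpp
#include <cstddef>

// Hypothetical handlers, one per case. The names and values are invented for this sketch.
static int HandleFirst()   { return 10; }
static int HandleSecond()  { return 20; }
static int HandleThird()   { return 30; }
static int HandleDefault() { return -1; }

// Dispatch with a switch over dense, consecutive case values.
int DispatchSwitch( unsigned value )
{
	switch ( value )
	{
	case 0:  return HandleFirst();
	case 1:  return HandleSecond();
	case 2:  return HandleThird();
	default: return HandleDefault();
	}
}

// Hand-written equivalent: one bounds check, then one indexed look-up.
// The cost does not grow with the position of the case in the list.
int DispatchTable( unsigned value )
{
	using Handler = int (*)();
	static const Handler table[] = { HandleFirst, HandleSecond, HandleThird };
	const std::size_t count = sizeof( table ) / sizeof( table[ 0 ] );
	return value < count ? table[ value ]() : HandleDefault();
}
```

Under that model, reaching the last case costs about the same as reaching the first, which is why "Eleventh" would not be expected to be slower than "First".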
Tests like this can be misleading. There's enough information in the program for an optimizer to theoretically unwind both loops into identical code.
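One common way to reduce that risk is to hide the selector from the optimizer, so it cannot be treated as a compile-time constant. The sketch below does this with a volatile read, similar in spirit to the volatile counter used in the code posted earlier. The names g_selector, g_sink, Work and RunLoop are made up for this example, and this is only one possible approach, not a guarantee that the two loops compile to different code.

```cpp
#include <cstdint>

// Assumption: a volatile variable stands in for a value that is only known at run time.
volatile int g_selector = 5;

// Writing the result to a volatile sink keeps the loop from being discarded as dead code.
volatile std::uint64_t g_sink = 0;

// Small stand-in for the work being dispatched.
std::uint64_t Work( int which )
{
	switch ( which )
	{
	case 5:  return 1;
	case 6:  return 2;
	default: return 0;
	}
}

// The selector is re-read on every iteration, so the compiler cannot fold
// the whole loop into a single precomputed constant.
std::uint64_t RunLoop( int iterations )
{
	std::uint64_t total = 0;
	for ( int i = 0; i < iterations; ++i )
	{
		total += Work( g_selector );
	}
	g_sink = total;
	return total;
}
```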
Don't fall into a trap with your thinking. Different platforms and different compilers sometimes give different results. That does not prove anything except that different compilers and CPUs do things differently.

Use the one that gives the best results on the target platform. If there are multiple target platforms, and it matters to you that much, swap the code versions for the different compilers with a #define block.
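A minimal sketch of such a block might look like the following. _MSC_VER is the macro Visual C++ predefines; USE_TABLE_DISPATCH and the Op functions are hypothetical names made up for this example, and the choice of which compiler gets which version is a placeholder to be filled in from your own measurements.

```cpp
// Pick an implementation per compiler. The mapping below is only a placeholder:
// replace it with whatever your measurements on each target show.
#if defined( _MSC_VER )
#define USE_TABLE_DISPATCH 1
#else
#define USE_TABLE_DISPATCH 0
#endif

// Hypothetical operations, invented for this sketch.
static int OpFirst()  { return 1; }
static int OpSecond() { return 2; }

// Both versions return the same values, so callers do not care which one was built.
int Dispatch( int element )
{
#if USE_TABLE_DISPATCH
	// Table version: bounds check, then indexed call.
	using Op = int (*)();
	static const Op ops[] = { OpFirst, OpSecond };
	return ( element >= 0 && element < 2 ) ? ops[ element ]() : 0;
#else
	// Switch version.
	switch ( element )
	{
	case 0:  return OpFirst();
	case 1:  return OpSecond();
	default: return 0;
	}
#endif
}
```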

In other words, changing it up gave a different result, but that does not mean your expected result is somehow confirmed. The test on the original system is still valid for that system.

@jonnin

Nah, I'm pretty sure by now it had to do with CPU caching or some form of optimization. Everything was tested on the exact same computer. It just so happened that my first test was on Windows + VS2017.

And by doing what JLBorges was doing

// get relevant stuff into the cache
SwitchMethod( element );
LambdaMethod( element );


The results there were more consistent.
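The warm-up idea can be folded into a small helper, sketched below under some assumptions: BestOfMs is a made-up name, and taking the best of several timed trials is my own addition to cut down on noise, not something from the posts above. It calls the function once untimed, then times it a few times with std::chrono::steady_clock.

```cpp
#include <algorithm>
#include <chrono>
#include <limits>

// Runs fn once untimed (warm-up), then 'trials' timed runs.
// Returns the fastest timed run in milliseconds.
template < typename Fn >
double BestOfMs( Fn &&fn, int trials )
{
	using clock = std::chrono::steady_clock;

	fn(); // warm-up: pulls code and data into the caches before timing starts

	double best = std::numeric_limits< double >::max();
	for ( int t = 0; t < trials; ++t )
	{
		const auto start = clock::now();
		fn();
		const auto stop = clock::now();
		const std::chrono::duration< double, std::milli > elapsed = stop - start;
		best = std::min( best, elapsed.count() );
	}
	return best;
}
```

With the functions from the thread, a call could look like BestOfMs( [&] { for ( long long i = 0; i < numIterations; ++i ) SwitchMethod( element ); }, 5 ), and the same for LambdaMethod. Since each measurement then gets its own warm-up, the order of the two tests should matter less.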


Oh, and when I said I had results more in line with what I expected, I meant more consistent, not that I was expecting one to be better than the other. Though I definitely would've thought the lambda method would be faster.