Threading Trouble

Hey guys,

I'm trying to teach myself a bit about threading. Below is the code I have. It has a 'control' loop that sums [0 : 20000000], and two threads that split the same work between them (thread one sums [0 : 10000000], thread two sums [10000001 : 20000000]).

#include <iostream>
#include <thread>
#include <ctime>

using std::cout;
using std::cin;
using std::thread;

typedef unsigned int uint;

double diffClocks(const clock_t start, const clock_t end);
void worker(const uint start, const uint end);

int main(int argc, char* argv[]) {

	// Control
	clock_t start = clock();

	uint x = 0;
	for(uint i = 0; i < 20000000; i++) {
		x += i;
	} // END for(i)

	clock_t end = clock();
	cout << diffClocks(start, end) << '\n';
	// END Control

	// Test
	start = clock();

	thread t1(worker, 0, 10000000);
	thread t2(worker, 10000001, 20000000);

	t1.join();
	t2.join();

	end = clock();
	cout << diffClocks(start, end) << '\n';
 	// END Test

	return 0;
} // END int main()

double diffClocks(const clock_t start, const clock_t end) {
	return ( (double)(end - start) / (double)CLOCKS_PER_SEC );
} // END double diffClocks()

void worker(const uint start, const uint end) {
	uint x = 0;
	for(uint i = start; i < end; i++) {
		x += i;
	} // END for(i)

	return;
} // END worker 


Compiling with the following on Ubuntu:
g++ -std=c++11


I am getting the following errors:
/tmp/ccm9nZvO.o: In function `std::thread::thread<void (&)(unsigned int, unsigned int), int, int>(void (&)(unsigned int, unsigned int), int&&, int&&)':
test.cpp:(.text._ZNSt6threadC2IRFvjjEJiiEEEOT_DpOT0_[_ZNSt6threadC5IRFvjjEJiiEEEOT_DpOT0_]+0xa8): undefined reference to `pthread_create'
collect2: error: ld returned 1 exit status


What is causing this?
Solved: I was missing the -pthread option on the command line:
g++ -std=c++11 -pthread

That being said, and with the same code posted above, I am getting the following output:
Control: 0.049702
Test: 0.106213


Why would it be slower to do this in parallel as opposed to with one for loop?
It takes time to create a thread. You're not doing a very long or complex operation, so the time saved by splitting the work doesn't make up for the cost of creating the threads.
So I am currently including the creation of the threads within the timed section. What if I move it out, like this:
	// Test
	thread t1(worker, 0, 10000000);
	thread t2(worker, 10000001, 20000000);

	start = clock();

	t1.join();
	t2.join();

	end = clock();
	cout << diffClocks(start, end) << '\n';
        // END Test 


This would remove the overhead of creating the threads, correct? Or does a thread immediately start running whatever function/parameters it was passed?
Well... it can. You don't know for sure; that's the tricky thing with multithreaded code. I would expect it to start running immediately, however.

And anyway, you really should be measuring the overhead that creating threads adds. Even if running two threads lets you do the work twice as fast, if creating those threads costs more time than you save, you aren't really saving anything at all, so why not measure that?
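
One way to quantify that (my own sketch, not from the original post) is to time nothing but the creation and joining of threads with empty bodies. I'm using std::chrono::steady_clock here because it measures elapsed wall time; more on the timer issue in the next reply.

#include <chrono>
#include <iostream>
#include <thread>

int main() {
	using clk = std::chrono::steady_clock;

	auto start = clk::now();
	std::thread t1([] {});	// empty body: we only pay the creation cost
	std::thread t2([] {});
	t1.join();
	t2.join();
	auto end = clk::now();

	std::chrono::duration<double> elapsed = end - start;
	std::cout << "create + join overhead: " << elapsed.count() << " s\n";
	return 0;
}
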
I'm not the best one to explain this, but the timer itself is actually misleading. Extend the number of iterations so the run takes longer (say 7 or 15 seconds) and simply count in your head roughly how long it takes; you'll notice the timer and what you counted are way off.

This is because clock() reports CPU time consumed by the whole process, summed across all of its threads, not elapsed time. With two threads each doing half the work, the total CPU time stays roughly the same (plus a bit of creation overhead) even though the elapsed time is cut roughly in half. What you want to measure here is wall time.
There are various ways to fix the timer. I personally enjoy the simplicity of Boost Timer's "cpu_timer"; it also has a wrapper called "auto_cpu_timer" which outputs wall time, user time, and system time (plus the sum of user and system time). If you don't like Boost, you can use clock_gettime from POSIX. WinAPI has its alternatives as well.
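
For example, here's a minimal sketch (mine, not from the original post) of the same two-thread test timed with std::chrono::steady_clock, which reports elapsed wall time and needs nothing beyond standard C++11:

#include <chrono>
#include <iostream>
#include <thread>

typedef unsigned int uint;

void worker(const uint start, const uint end) {
	uint x = 0;
	for(uint i = start; i < end; i++) {
		x += i;
	} // NOTE: x is never used, so with -O2 the compiler may remove this loop entirely
}

int main() {
	auto start = std::chrono::steady_clock::now();

	std::thread t1(worker, 0, 10000000);
	std::thread t2(worker, 10000000, 20000000);	// start at 10000000 so no value is skipped
	t1.join();
	t2.join();

	auto end = std::chrono::steady_clock::now();
	std::chrono::duration<double> elapsed = end - start;	// seconds of wall time
	std::cout << "Test (wall time): " << elapsed.count() << '\n';
	return 0;
}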

TL;DR: Your timer is measuring the wrong thing. Measured in wall time, the "test" is roughly twice as fast, i.e. close to the linear speedup you'd expect from two threads.

EDIT: Improved wording...
Well... it can. You don't know for sure; that's the tricky thing with multithreaded code. I would expect it to start running immediately, however.

And anyway, you really should be measuring the overhead that creating threads adds. Even if running two threads lets you do the work twice as fast, if creating those threads costs more time than you save, you aren't really saving anything at all, so why not measure that?


This is kind of a bad metric. The goal of threading is, for the most part, scalability. That's why things like thread pools were created: to remove the per-task overhead of thread creation and keep only the queuing overhead, which is a good deal lighter. Threading is always going to have some overhead because of resource and scheduling requirements, but it can be mitigated; it just depends on the application.
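
For illustration, here's a minimal sketch of the idea; none of it is from this thread, and ThreadPool/submit are made-up names. A fixed set of workers is created once and reused, and each piece of work only pays the cost of being pushed onto a queue:

#include <condition_variable>
#include <functional>
#include <iostream>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

class ThreadPool {
public:
	explicit ThreadPool(unsigned n) {
		for(unsigned i = 0; i < n; i++)
			workers.emplace_back([this] { run(); });
	}

	~ThreadPool() {
		{
			std::lock_guard<std::mutex> lock(m);
			done = true;
		}
		cv.notify_all();
		for(auto& w : workers) w.join();	// remaining tasks are drained first
	}

	void submit(std::function<void()> task) {
		{
			std::lock_guard<std::mutex> lock(m);
			tasks.push(std::move(task));
		}
		cv.notify_one();
	}

private:
	void run() {
		for(;;) {
			std::function<void()> task;
			{
				std::unique_lock<std::mutex> lock(m);
				cv.wait(lock, [this] { return done || !tasks.empty(); });
				if(done && tasks.empty()) return;
				task = std::move(tasks.front());
				tasks.pop();
			}
			task();	// run the task outside the lock
		}
	}

	std::vector<std::thread> workers;
	std::queue<std::function<void()>> tasks;
	std::mutex m;
	std::condition_variable cv;
	bool done = false;
};

int main() {
	unsigned long long a = 0, b = 0;	// separate results, one per task
	{
		ThreadPool pool(2);
		pool.submit([&a] { for(unsigned i = 0; i < 10000000; i++) a += i; });
		pool.submit([&b] { for(unsigned i = 10000000; i < 20000000; i++) b += i; });
	}	// the destructor drains the queue and joins the workers
	std::cout << a + b << '\n';
	return 0;
}

A real pool would also want a way to wait for results without destroying it (std::future works for that), but that's beyond this sketch.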

That said, threading also adds complexity on a whole new level. It's not just a matter of adding a thread to the group to do some extra work. If your purpose is optimization, you *must* take things like the CPU cache into account (false sharing between cores, for example). If your purpose is simultaneous work rather than raw speed, it matters less, although it's still worth understanding. If you don't take those things into account while threading, you're sometimes going to be very displeased with the results, with benchmarks that genuinely come out slower than single-threaded performance.
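
As a concrete example of the cache effects mentioned above (my own illustration, not from this thread): if two threads keep writing to counters that sit on the same cache line, the line bounces between cores ("false sharing"), and the threaded version can come out slower than expected. Giving each counter its own cache line avoids it. The 64-byte line size below is an assumption that happens to match most current x86 hardware:

#include <atomic>
#include <chrono>
#include <iostream>
#include <thread>

struct Padded {
	std::atomic<unsigned long long> value;
	char pad[64 - sizeof(std::atomic<unsigned long long>)];	// assume 64-byte cache lines
};

std::atomic<unsigned long long> packed[2];	// adjacent: likely share one cache line
Padded padded[2];				// each counter gets its own cache line

template<typename F>
double timeIt(F f) {
	auto start = std::chrono::steady_clock::now();
	f();
	auto end = std::chrono::steady_clock::now();
	return std::chrono::duration<double>(end - start).count();
}

int main() {
	const unsigned long long iters = 50000000ULL;

	double shared = timeIt([&] {
		std::thread t1([&] { for(unsigned long long i = 0; i < iters; i++) packed[0].fetch_add(1, std::memory_order_relaxed); });
		std::thread t2([&] { for(unsigned long long i = 0; i < iters; i++) packed[1].fetch_add(1, std::memory_order_relaxed); });
		t1.join();
		t2.join();
	});

	double separate = timeIt([&] {
		std::thread t1([&] { for(unsigned long long i = 0; i < iters; i++) padded[0].value.fetch_add(1, std::memory_order_relaxed); });
		std::thread t2([&] { for(unsigned long long i = 0; i < iters; i++) padded[1].value.fetch_add(1, std::memory_order_relaxed); });
		t1.join();
		t2.join();
	});

	std::cout << "same cache line:      " << shared << " s\n";
	std::cout << "separate cache lines: " << separate << " s\n";
	return 0;
}

On typical hardware the padded version finishes noticeably faster, even though both variants perform exactly the same number of increments.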

Making a single-threaded application is simpler and sometimes the better way to go, usually for short (less than a few seconds at most), single-purpose applications. For long-running, real-time applications, it's almost irresponsible not to take advantage of threading.