Multi Threading Performance

Hi everyone

I was trying to compare the performance of a multithreaded scheme against a serialized scheme on a simple program.
I find that the serialized version completes in under 1 second, but the multithreaded version takes as much as 10 seconds.

I am trying to understand why this is the case.

Any help appreciated,
thanks

Threaded version:
#include <cstdio>
#include <ctime>
#include <pthread.h>

int tCount = 20000;

void *functionC(void * dummyPtr);

int main(){
	time_t t;
	time(&t);
	
	int rc1, rc2;
	pthread_t * pt;
   
	pt = new pthread_t[tCount];
	for (int i = 0; i < tCount; ++i){
		pthread_create(&pt[i], NULL, functionC, NULL);
	}
	
	for (int i = 0; i < tCount; ++i){
		pthread_join(pt[i],NULL);
	}
	
	
	delete[] pt;
	printf("Done\n");
	printf("%.21f\n",difftime(time(NULL),t));
	return 0;
}



void * functionC(void *dummyPtr){

	int sum = 0;
	for (int i = 1; i <= 25; ++i){
		sum +=i;
	}
	
}



UnThreaded version:

#include <cstdio>
#include <ctime>

int tCount = 20000;

void *functionC(void * dummyPtr);

int main(){
	time_t t;
	time(&t);
	
	for (int i = 0; i < tCount; ++i){
		functionC(NULL);
	}
	
	printf("Done\n");
	printf("%.21f\n",difftime(time(NULL),t));
	return 0;
}



void * functionC(void *dummyPtr){

	int sum = 0;
	for (int i = 1; i <= 25; ++i){
		sum +=i;
	}
	//printf("%d\n",sum);
}
If you have optimizations turned on the compiler will probably be smart enough to see that functionC doesn't do anything. In the UnThreaded version the compiler can probably remove the whole for loop. In the Threaded version you are probably just measuring the overhead of creating and joining threads because the threads don't actually do any work.
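To see the overhead described above in isolation, here is a rough sketch, assuming a C++11 or later compiler. The names tinyWork, g_sink, timeSerialMicros and timeThreadedMicros are made up for illustration. It uses std::chrono::steady_clock because time() only has one-second resolution, and it writes the result to a volatile so the optimizer cannot discard the loop. Unlike the original test, it joins each thread before creating the next one, so it never has thousands of threads alive at once.

```cpp
#include <chrono>
#include <thread>

// Hypothetical sink: storing the result in a volatile keeps the
// optimizer from removing the loop in tinyWork().
volatile int g_sink = 0;

// Same tiny workload as functionC in the question: sum of 1..25.
void tinyWork() {
    int sum = 0;
    for (int i = 1; i <= 25; ++i) {
        sum += i;
    }
    g_sink = sum;
}

// Time `count` direct calls, in microseconds.
long long timeSerialMicros(int count) {
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < count; ++i) {
        tinyWork();
    }
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}

// Time `count` threads, each created and joined before the next starts.
// Almost all of the measured time is thread creation and teardown.
long long timeThreadedMicros(int count) {
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < count; ++i) {
        std::thread t(tinyWork);
        t.join();
    }
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

Calling both functions with the same count and printing the two results should make the gap visible. The exact numbers depend on the machine and the OS.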
20,000 threads is unrealistic. Typical multithreaded applications rarely go above 300 threads and almost never above 500. Not only does it take a long time to create and destroy that many threads (which is what is eating up the time in your test; run it under a profiler to see for yourself), it also slows down the OS scheduler (unless it is a real-time one). Beyond that point you need task-based concurrency, if you need that much concurrency at all (Intel Cilk++ is pretty good, and they are hoping to bring it into the next C++ language standard).

Also, your functionC is declared to return void*, but is not returning anything.
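For reference, a minimal corrected version of the thread function might look like the sketch below. Flowing off the end of a non-void function is undefined behavior in C++, so the function needs an explicit return. The (void) casts are only there to silence unused-variable warnings.

```cpp
#include <cstddef>

// Same signature as in the question; pthread_create expects
// a function of type void *(*)(void *).
void *functionC(void *dummyPtr) {
    (void)dummyPtr;  // the argument is not used

    int sum = 0;
    for (int i = 1; i <= 25; ++i) {
        sum += i;
    }
    (void)sum;  // the result is not used either

    return NULL;  // explicit return value, as the declared type requires
}
```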
As Cubbi said, you are doing it wrong. To benefit from multithreading, you should create as many threads as you have CPU cores and keep them fully loaded with tasks in such a way that they almost never block, so forget mutexes and critical sections. This can be achieved with queues, asynchronous message passing and immutable data structures.

Cilk is nice, but nowhere near the capabilities of the Akka framework:
http://akka.io
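The one-thread-per-core advice above could be sketched roughly as follows, assuming a C++11 or later compiler. This is not a queue or message-passing design. It is a simpler static partition in which each worker takes every Nth task and writes only to its own result slot, so no mutex is needed. The names taskResult and runTasksOnCores are made up for illustration.

```cpp
#include <thread>
#include <vector>

// The per-task work from the question: sum of 1..25.
int taskResult() {
    int sum = 0;
    for (int i = 1; i <= 25; ++i) {
        sum += i;
    }
    return sum;
}

// Run `taskCount` tasks on one thread per core and return the
// total of all task results.
long long runTasksOnCores(int taskCount) {
    unsigned workers = std::thread::hardware_concurrency();
    if (workers == 0) {
        workers = 1;  // hardware_concurrency() may return 0 when unknown
    }

    std::vector<long long> partial(workers, 0);
    std::vector<std::thread> pool;
    pool.reserve(workers);

    for (unsigned w = 0; w < workers; ++w) {
        pool.emplace_back([&partial, w, workers, taskCount]() {
            // Static partition: worker w handles tasks w, w + workers, ...
            long long local = 0;
            for (int t = static_cast<int>(w); t < taskCount;
                 t += static_cast<int>(workers)) {
                local += taskResult();
            }
            partial[w] = local;  // each worker writes only its own slot
        });
    }

    long long total = 0;
    for (unsigned w = 0; w < workers; ++w) {
        pool[w].join();       // join makes the worker's write visible here
        total += partial[w];
    }
    return total;
}
```

With 20000 tasks this creates only a handful of threads instead of 20000, so the creation cost is paid once per core rather than once per task.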