Multi-threading performance hits

Hi.

First post here and sorry if this has been asked a thousand times already. I'm just starting out with multi-threading and I've got it working but performance seems to degrade almost exponentially the more threads I create.

In the code snippet below, each thread calls "threadTest()", which just does a printf 4 million times. One thread can execute the code in 4 seconds. 2 threads, 8 seconds. 4 threads, 18 seconds, etc. What would explain this performance hit? My system is a 3.6 GHz quad core running Windows 7 64-bit. My application is a Win32 app built using Visual Studio Express 2012. Thanks.

.
.
.
vector<thread*> threadVector;

for(unsigned i=0;i<numberOfThreads;i++)
{
   threadVector.push_back(new thread(threadTest));
}

for(unsigned i=0;i<numberOfThreads;i++)
{
   (*threadVector[i]).join();
}
.
.
.

void threadTest()
{
   for(unsigned i=0;i<4000000;i++)
   {
      printf("test");
   }
}
My guess is that printing to the console is single-threaded and that your application spends most of its time printing. Try some other operation instead of printf, one where the threads do not all try to use the same resource at the same time.
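As a rough illustration of that suggestion, here is a minimal sketch of a workload where the threads share nothing. The names busyWork and runIndependent are hypothetical and not from the original post; each thread does plain arithmetic on a local variable and writes only to its own slot of a result vector.

```cpp
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for threadTest(): pure arithmetic on a local
// variable, so the threads share no console, lock, or heap buffer.
std::uint64_t busyWork(std::uint64_t iterations)
{
    std::uint64_t sum = 0;
    for (std::uint64_t i = 0; i < iterations; ++i)
    {
        sum += i * i;   // unsigned arithmetic wraps, which is well defined
    }
    return sum;
}

// Runs busyWork on numberOfThreads threads. Each thread writes only to
// its own slot of the result vector, and only once, when it finishes.
std::vector<std::uint64_t> runIndependent(unsigned numberOfThreads,
                                          std::uint64_t iterations)
{
    std::vector<std::uint64_t> results(numberOfThreads, 0);
    std::vector<std::thread> threads;
    threads.reserve(numberOfThreads);

    for (unsigned i = 0; i < numberOfThreads; ++i)
    {
        threads.emplace_back([&results, i, iterations] {
            results[i] = busyWork(iterations);
        });
    }
    for (auto& t : threads)
    {
        t.join();
    }
    return results;
}
```

Timing something like runIndependent(1, n) against runIndependent(4, n) should show whether the slowdown follows the shared resource or the threads themselves. Keep in mind that an optimizing compiler may simplify a loop this trivial, so treat the numbers as indicative only.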
I tried just pushing and popping strings from a vector. Same slowdown as before.

void threadTest()
{
   vector<string> myStringVector;

   for(unsigned i=0;i<1000000;i++)
   {
      string myString = "";
      myStringVector.push_back(myString);
      myStringVector.pop_back();
   }
}
I have to ask.... is this with optimizations turned on?
Multithreading isn't a "make things go faster" panacea.

It actually involves a little more work to keep all those threads around, so unless you know some of that work is getting offloaded onto another processor, you're taking a performance hit anyway.
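One way to act on that point is to avoid creating more threads than the machine can run at once. The sketch below is only an assumption about how one might do it: chooseThreadCount is a hypothetical helper, and std::thread::hardware_concurrency() is a hint that may return 0 when the value is unknown.

```cpp
#include <algorithm>
#include <thread>

// Hypothetical helper: caps the requested thread count at the number of
// hardware threads the standard library reports, and never returns 0.
unsigned chooseThreadCount(unsigned requested)
{
    // hardware_concurrency() is only a hint; 0 means "not computable".
    const unsigned hw = std::thread::hardware_concurrency();
    if (hw == 0)
    {
        return 1;   // assumption: fall back to a single thread when unknown
    }
    return std::max(1u, std::min(requested, hw));
}
```

Threads beyond that count cannot run in parallel with the rest, so they mostly add scheduling and creation overhead.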

Beyond that, there are technical issues. Obnoxious ones. Like these:
http://www.gotw.ca/publications/optimizations.htm

Multithreading is designed to increase performance, but the meaning of the word "performance" here is very narrow. Make sure you know what you are trying to accomplish by multithreading. MSDN's article is worth a read:
http://www.google.com/search?btnI=1&q=msdn+multithreading+performance

Hope this helps.
Honestly, I don't know. How would I find out?
When you compiled your program, did you tell the compiler to use optimizations?

In VS, make sure you have "Release" selected at the top. Then right-click your application name in the Solution Explorer and select "Properties". This will bring up a dialog. From the left pane, click through "Configuration Properties"-->"C/C++"-->"Optimization".

From the command line, compilers typically need you to type extra stuff to tell them to turn on optimizations. For example, GCC requires you to type something like:

    g++ -O2 myprog.cpp

You need to read your documentation for more.

Hope this helps.
Creating threads will not necessarily bring down the run time. Threaded applications have to be designed so that they perform well; just creating many threads will not always work. You have to find out how much time the threads spend waiting on locks. On Unix you can find that out; I do not know about Windows.

You can probably use a performance/code profiling tool such as gprof on Linux/Unix. VS2012 also has a performance analyzer, but I am not sure what information it will give you about a multithreaded program.

For a beginner, I would suggest not going into optimization or reducing run time yet. Implement the threads so that they work correctly and give you the correct result. As we say, "Premature optimization is the root of all evil."
You have a bottleneck: console output. Not only is it slow by itself, so you cannot make it faster, it gets even slower because the threads are fighting over that limited resource.

You get pretty much the same performance from the single-threaded and dual-threaded programs, but when you try to use 4 threads, they start to waste more time on synchronisation and they cannot use the whole CPU: one core is usually not entirely free because of background activity.
The only situation where threads have a positive effect on speed is when no synchronization occurs and the threads run on different processors.

So threads are usually not used to speed up your program. Usually they are used for long-running tasks that should not block other work.
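To make the synchronisation point concrete, here is a hedged sketch with two hypothetical functions (neither is from the original post). Both count to the same total; the first takes a shared mutex on every increment, the second lets each thread count locally and publish its result once.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Every increment takes the same lock, so the threads spend most of
// their time waiting for each other instead of running in parallel.
unsigned long countWithSharedLock(unsigned numberOfThreads, unsigned long perThread)
{
    unsigned long total = 0;
    std::mutex m;
    std::vector<std::thread> threads;
    for (unsigned t = 0; t < numberOfThreads; ++t)
    {
        threads.emplace_back([&total, &m, perThread] {
            for (unsigned long i = 0; i < perThread; ++i)
            {
                std::lock_guard<std::mutex> lock(m);
                ++total;
            }
        });
    }
    for (auto& th : threads)
    {
        th.join();
    }
    return total;
}

// Each thread counts in a local variable and writes its own slot once,
// so no locking is needed until the final single-threaded sum.
unsigned long countWithLocalTotals(unsigned numberOfThreads, unsigned long perThread)
{
    std::vector<unsigned long> partial(numberOfThreads, 0);
    std::vector<std::thread> threads;
    for (unsigned t = 0; t < numberOfThreads; ++t)
    {
        threads.emplace_back([&partial, t, perThread] {
            unsigned long local = 0;
            for (unsigned long i = 0; i < perThread; ++i)
            {
                ++local;
            }
            partial[t] = local;
        });
    }
    for (auto& th : threads)
    {
        th.join();
    }
    unsigned long total = 0;
    for (unsigned long p : partial)
    {
        total += p;
    }
    return total;
}
```

Both return numberOfThreads * perThread. Any timing difference between them comes from the locking alone, which is roughly the situation with printf, where every thread has to wait its turn for the same console.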
And the prize goes to MiiNiPaa! Not to invalidate what everyone else is saying about threading, OP: they are all correct that it is not a silver bullet and that it requires you to think about your project differently. But your issue is specifically caused by outputting to a console window on a Windows platform. This is the face of your culprit: http://en.wikipedia.org/wiki/Csrss

Thanks all for your responses. I will check the resources you gave me. The "printf" example was just a quick and dirty one. I also provided another example with vectors where I experience a similar slowdown the more threads I add. In that example, I don't do any writing to the console.
The slowdown from the vectors could be from page faults or variable re-basing. Try rewriting that one with a container that doesn't try to keep its memory contiguous, and try using perfmon to keep track of the page faults your app triggers. How are you measuring performance, by the way? Are you running the test multiple times?
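On the measurement question, one possible approach is to time the work with a monotonic clock and repeat it several times. The helper below is a hypothetical sketch, not something from the original post: it runs a callable a given number of times and keeps the fastest wall-clock time, which reduces noise from background activity.

```cpp
#include <algorithm>
#include <chrono>
#include <functional>
#include <limits>

// Hypothetical helper: runs `work` `repeats` times and returns the
// fastest wall-clock time in milliseconds.
double bestTimeMs(const std::function<void()>& work, int repeats)
{
    double best = std::numeric_limits<double>::max();
    for (int r = 0; r < repeats; ++r)
    {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        const std::chrono::duration<double, std::milli> elapsed = stop - start;
        best = std::min(best, elapsed.count());
    }
    return best;
}
```

The callable would wrap the whole create-and-join sequence, so thread start-up cost is included in the measurement. Comparing the best of several runs for each thread count is more reliable than comparing single runs.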
And, of course, you are not accessing the same vector from different threads, are you?
What exactly are you trying to measure here, anyway?