Multi-threading stories

Today, I finally broke through and started to multi-thread. It's quite a new world and requires a completely new way of thinking. In this thread, I'd like to hear some of your multi-threading stories and unexpected problems.

My story:
Today, I had to make my application multi-threaded so that I could take advantage of multiple processors and meet my real-time schedule. The threading and mutex protections themselves weren't hard, but I started to see weird behavior in areas I didn't expect, areas which had always worked in the past:
extern int g_someFlag;

int MY_API CALLBACK someEntryFunc() // now run in a second thread
{
    static int flag_prev;

    // g_someFlag is read here and again below when updating flag_prev
    if (g_someFlag != flag_prev)
        someFunc();
    else
        someOtherFunc();

    flag_prev = g_someFlag;

    return 0;
}

This is how I do things all of the time, and it has always been reliable (in a single thread, that is). Today I found that someFunc() wasn't always called when g_someFlag changed, which was very confusing. Then I realized that when g_someFlag is changed by another thread at the same time that someOtherFunc() is running, we lose the edge detection: flag_prev is updated from a second read of g_someFlag, so it picks up the new value without someFunc() ever having been called!
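One way to avoid the double read (a minimal sketch, assuming the flag can be made a C++11 std::atomic, and with the MY_API/CALLBACK decoration dropped for brevity) is to take a single snapshot of the flag per call:

#include <atomic>

void someFunc();
void someOtherFunc();

// Assumption: the flag can be turned into a std::atomic<int> (C++11).
extern std::atomic<int> g_someFlag;

void someEntryFunc() // still run in a second thread
{
    static int flag_prev;

    // Read the shared flag exactly once per call, so the comparison and the
    // stored "previous" value always refer to the same snapshot.
    const int flag_now = g_someFlag.load();

    if (flag_now != flag_prev)
        someFunc();
    else
        someOtherFunc();

    flag_prev = flag_now;
}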

The multi-thread implementation was easy. This debugging is going to be insane. I just hope that I can find ALL of the bugs in hundreds of thousands of lines of code. At least now I know what to look for.
closed account (o1vk4iN6)
I'll just leave this here too...
http://developinthecloud.drdobbs.com/author.asp?section_id=2284&doc_id=255275&
Having spent the last 12 years habitually dealing with hundreds of threads running on hundreds of cores (today, my smallest dev boxes are 64-core), I find it funny reading how "the future is multicore". Commodity PC future maybe.
so that I could take advantage of multiple processors and meet my real-time schedule
In my experience, threading with very small workloads makes little sense unless you can send off a thread almost instantly. In one particular example, a workload of 14 serial ms ran parallel for 2-3 ms. I was using thread pools and everything, too.
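To put a rough number on that launch cost, here's a crude timing sketch (assuming C++11 std::thread and std::chrono); it only measures the fixed overhead a tiny workload has to amortize:

#include <chrono>
#include <iostream>
#include <thread>

int main()
{
    using clock = std::chrono::steady_clock;

    // Time how long it takes just to launch and join a thread that does
    // (almost) nothing -- overhead any small workload has to pay up front.
    const auto start = clock::now();
    std::thread t([]{ /* tiny workload */ });
    t.join();
    const auto elapsed = clock::now() - start;

    std::cout << "launch + join: "
              << std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count()
              << " us\n";
}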

If you have a lot of shared state in your program, that's a sign that it's not a prime candidate for parallelism. Either you spend a lot of time in redesign, or your program will spend a lot of time synchronizing threads (or producing race conditions).
It's also possible that the problem you're facing is inherently serial.
You probably shouldn't have a lot of shared state in your programs anyway. It's best not to have any.
I was actually referring to imperative/object-oriented languages. Functions that depend on shared mutable state generally aren't re-entrant or thread-safe without a lot of synchronisation, and having a function operate on data that isn't part of the function itself (or its class if it's a method) isn't conducive to easy understanding of what the function is operating on.

I also think that imperative programmers can learn a lot of good habits from functional programming. There are the obvious things, like keeping functions short and having them "do one thing and do it well", but also, functions should be written in such a way that most of them don't do anything except combine other functions, and avoid side-effects as much as possible, as is the style in functional programming.
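As a toy illustration of that style (not code from anyone here, just a sketch): small side-effect-free helpers combined by a function that does nothing but combine them, so all state stays local and the result is trivially re-entrant:

#include <algorithm>
#include <numeric>
#include <vector>

// A small, side-effect-free building block.
int square(int x) { return x * x; }

// A function that does little more than combine other functions; it touches
// no non-local state, so its result depends only on its argument.
int sumOfSquares(const std::vector<int>& v)
{
    std::vector<int> squared(v.size());
    std::transform(v.begin(), v.end(), squared.begin(), square);
    return std::accumulate(squared.begin(), squared.end(), 0);
}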
@chrisname, is that a reply to my deleted comment? After writing that this is too Haskell and somewhat unreasonable, I realized that, if we're not talking about threads, I don't know what "shared state" is. Shared between what? Objects? That just encourages making them larger. Functions? Surely that is not an option. Modules, maybe?
Well, shared state is just non-local (not necessarily global) state that's made visible to more than one thread at a time. In a way, the two are almost interchangeable.

Functions? Surely that is not an option.
What do you mean?
#include <iostream>

int global = 0;

// Both functions read or modify non-local state, so what they return (or do)
// depends on call order, not just on their arguments.
int f(int &n){
    return global++ * n;
}

void g(int &n){
    n += global;
}

int main(){
    int a = 1;
    std::cout << f(a) << std::endl;  // prints 0
    std::cout << f(a) << std::endl;  // prints 1
    g(a);                            // a becomes 3
    std::cout << f(a) << std::endl;  // prints 6
}
@hamsterman
Yes, it was. By shared state I was referring to non-thread-local state (accessible concurrently from multiple threads) as well as ordinary global state (accessible from different areas of a program, whether concurrently or not).
I assumed your comment was an addition to helios's
If you have a lot of shared state in your program, that's a sign that it's not a prime candidate for parallelism.
and derived that you meant not to have shared state even if there is no parallelism intended. My bad.
That is what I meant. I think shared state (whether between threads or between functions, except those within a single object) is bad. It's often necessary, but it's still something to be avoided in general. Functions that access shared mutable state aren't re-entrant, whether they're thread-safe or not. Re-entrancy doesn't necessarily have anything to do with threads; it also applies to signal and interrupt handlers, and to any functions they might call. For a function to be re-entrant, it has to call only functions which are also re-entrant.
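A contrived illustration of the difference (the function names are made up for the example): the first version keeps its state in a static buffer and is not re-entrant, while the second keeps all state in caller-supplied storage:

#include <cctype>
#include <cstddef>
#include <cstring>

// Not re-entrant: every call shares one static buffer, so a signal handler
// (or a second thread) that interrupts a call and re-enters the function
// will clobber the first caller's result.
char* toUpperUnsafe(const char* s)
{
    static char buf[64];
    std::strncpy(buf, s, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';
    for (char* p = buf; *p; ++p)
        *p = static_cast<char>(std::toupper(static_cast<unsigned char>(*p)));
    return buf;
}

// Re-entrant: all state lives in arguments and locals supplied by the caller.
void toUpperSafe(const char* s, char* out, std::size_t n)
{
    std::size_t i = 0;
    for (; s[i] != '\0' && i + 1 < n; ++i)
        out[i] = static_cast<char>(std::toupper(static_cast<unsigned char>(s[i])));
    if (n > 0)
        out[i] = '\0';
}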
Interesting discussion.

I've been chatting with the software director here, and he is very much against multi-threading, instead pushing me to use a multi-process solution. This may make more sense in light of helios's message, which said that threads are best used when they have a finite lifespan and need no synchronization.

I have a good interface defined for these two threads to run without synchronization, which is good, but then I suppose multiple processes would work just as well. I'd love to go multi-threaded, though, because it is "the future" and this advance in technology would probably make our next few projects much easier.

In terms of shared data, all of the components read from and write to a massive (300 KB) shared memory block. Fortunately the layout is sequential, so we can mutex specific segments of the memory at a time.
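If those components do end up as threads in a single process, the per-segment locking could look roughly like this (a sketch with a made-up segment count, assuming C++11 std::mutex; truly cross-process shared memory would need a different kind of lock):

#include <array>
#include <cstddef>
#include <mutex>

// Hypothetical striped locking over a 300 KB block: one mutex per fixed-size
// segment, so writers to different segments don't contend with each other.
constexpr std::size_t kBlockSize   = 300 * 1024;
constexpr std::size_t kSegments    = 16;
constexpr std::size_t kSegmentSize = kBlockSize / kSegments;

struct SharedBlock {            // intended to live as a single long-lived object
    std::array<unsigned char, kBlockSize> data{};
    std::array<std::mutex, kSegments>     locks;

    void write(std::size_t offset, unsigned char value) {
        std::lock_guard<std::mutex> guard(locks[offset / kSegmentSize]);
        data[offset] = value;
    }

    unsigned char read(std::size_t offset) {
        std::lock_guard<std::mutex> guard(locks[offset / kSegmentSize]);
        return data[offset];
    }
};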
threads are best used when they have finite lifespan

I would say the opposite, they are best used if their lifetime matches the lifetime of the program.

we can mutex specific segments of the memory at a time.

That's just another coarse lock. Perhaps you have non-thread-safe code to link?

(also, 300 KB of shared memory is massive? I'm annoyed at IBM for their 3 GB limit, but they have ridiculous memory addressing)

I'd love to go multi-thread though because it is "the future"

I think some people here are trying to say it's "the past" :)
I have a good interface defined for these two threads to run without synchronization, which is good, but then I suppose multiple processes would work just as well. I'd love to go multi-threaded, though, because it is "the future" and this advance in technology would probably make our next few projects much easier.
The only difference between these two configurations:

process A{[thread 0], [thread 1]}
versus
process A{[thread 0]}
process B{[thread 1]}

is that one has separate address spaces for its threads. Personally, I would default to threads unless there are special considerations (e.g. you might not want possible errors in one workload to propagate to other workloads), if only because threads make for a slightly neater codebase. Plus, inter-thread communication mechanisms are generally better known than their inter-process counterparts.
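To make configuration A concrete, the whole two-thread setup can be as small as this (a sketch assuming C++11 std::thread; the workloads are placeholders):

#include <iostream>
#include <thread>

int main()
{
    int result0 = 0, result1 = 0;

    // Configuration A: one process, two threads sharing one address space.
    std::thread t0([&result0]{ result0 = 42; });   // workload 0
    std::thread t1([&result1]{ result1 = 7;  });   // workload 1
    t0.join();
    t1.join();

    std::cout << result0 << ' ' << result1 << '\n';
    // The multi-process alternative (configuration B) would run each workload
    // in its own process, with its own address space, and communicate through
    // pipes, sockets, or shared memory instead.
}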
Cubbi, ok, at worst, it's the present and I need to catch up. But where are we going to go from here? Processors aren't getting any faster, just more cores.
where are we going to go from here

fibers, strands, whatever you call them. Intel Cilk Plus is probably what it's gonna look like in C++. I don't see it as a replacement though, just as a tool that covers the problem cases where threads aren't all that suitable.
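For the curious, Cilk Plus code reads roughly like this (a sketch; it needs a compiler with Cilk Plus support, such as icc or the old gcc cilkplus branch):

#include <cilk/cilk.h>

int fib(int n)
{
    if (n < 2)
        return n;
    // cilk_spawn lets the runtime run this call as a separate strand on
    // another core; it is a hint, not a guarantee.
    int a = cilk_spawn fib(n - 1);
    int b = fib(n - 2);
    // cilk_sync waits for every strand spawned in this function.
    cilk_sync;
    return a + b;
}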