Reuse a thread?

Hello.
I have a simple question: is it possible to "reuse" a thread, or in other words, pass a different method to a thread that is already executing or has executed another method? Are threads capable of running several functions one after another? Or is a thread limited to one and only one function, so that when it returns, the thread ends? I looked through the list of thread functions, but I can't seem to find anything designed for this. For example:
#include <windows.h>

DWORD WINAPI function1(LPVOID lpParameter){ return 0; }
DWORD WINAPI function2(LPVOID lpParameter){ return 0; }

int main(){
    HANDLE Thread = CreateThread(0, 0, function1, 0, 0, 0);
    AppendToThread(Thread, 0, 0, function2, 0, 0, 0); // hypothetical -- no such WinAPI function exists
    WaitForSingleObject(Thread, INFINITE);
    return 0;
}

MSDN describes the lpStartAddress parameter as "A pointer to the application-defined function to be executed by the thread. This pointer represents the starting address of the thread." and "starting address" somehow sounds like there could also be "continuing addresses". Any clues?
Thank you.
Or is a thread limited to one and only one function, so that when it returns, the thread ends?


This. However you can call any number of functions from that one function, so yes you can do what you want. Just not the way you're thinking.

You can make some kind of job queue that your worker thread looks at. Just have your worker thread spin in a loop pulling functions out of the queue and executing them. In your main thread, when you want the worker to do some more work, just add a new function to the queue.

Of course you'll have to make the queue threadsafe.
I was bored so I made this and very briefly tested it. It doesn't have many bells or whistles but it might work for what you want.

This uses some C++11 features (functional, smart pointers, threads). If you are using VS2010 like me, it doesn't support C++11 threads, so I fall back to Boost for those (the API is identical for both):

#if YOU_HAVE_A_COMPILER_THAT_SUPPORTS_STD_THREAD_LIB_AS_OF_CPP11
    #include <thread>
    namespace thd = std;
#else
    #include <boost/thread.hpp>
    namespace thd = boost;
#endif

#include <functional>
#include <memory>
#include <list>       // std::list for the job queue

class WorkerThread
{
public:
    typedef std::function<void()> job_t;

    WorkerThread()
    {
        wantExit = false;
        thread = std::unique_ptr<thd::thread>( new thd::thread( std::bind(&WorkerThread::Entry, this) ) );
    }

    ~WorkerThread()
    {
        {
            thd::lock_guard<thd::mutex> lock(queueMutex);
            wantExit = true;
            queuePending.notify_one();
        }
        thread->join();
    }

    void addJob(job_t job)
    {
        thd::lock_guard<thd::mutex> lock(queueMutex);
        jobQueue.push_back(job);
        queuePending.notify_one();
    }


private:
    void Entry()
    {
        job_t job;

        while(true)
        {
            {
                thd::unique_lock<thd::mutex> lock(queueMutex);
                queuePending.wait( lock, [&] () { return wantExit || !jobQueue.empty(); } );

                if(wantExit)
                    return;

                job = jobQueue.front();
                jobQueue.pop_front();
            }

            job();
        }
    }

private:
    std::unique_ptr<thd::thread>    thread;
    thd::condition_variable         queuePending;
    thd::mutex                      queueMutex;
    std::list<job_t>                jobQueue;
    bool                            wantExit;
    
    WorkerThread(const WorkerThread&);              // no copying!
    WorkerThread& operator = (const WorkerThread&); // no copying!
};





//
//  Example usage:
//

#include <Windows.h>
#include <iostream>   // std::cout used in the jobs below

int main()
{
    {
        WorkerThread t;
        t.addJob( [] () { Sleep(2000); } );
        t.addJob( [] () { std::cout << "printing 1." << std::endl; } );
        t.addJob( [] () { std::cout << "printing 2." << std::endl; } );
        t.addJob( [] () { Sleep(2000); } );
        t.addJob( [] () { std::cout << "printing 3." << std::endl; } );
        t.addJob( [] () { std::cout << "printing 4." << std::endl; } );

        Sleep(3000);
    } // <- thread exits here
    return 0;
}



Didn't bother to comment but it isn't that complicated. I'm happy to answer any questions you have about it.
You can make some kind of job queue that your worker thread looks at.

Thanks a lot!
In fact, I was previously doing something somewhat similar to this:
It's just that the project is split into two big parts, and I wanted to split these so that there are two "thread processing" functions. I guess I'll stick with using the same function for both sections.

int MyClass::CallAdd(CallListType Function,void* Parameters){
    CallList[CallListSize]=Function;
    CallListParameters[CallListSize]=Parameters;
    CallListSize++;
    return 0;
};
int MyClass::CallCall(){
    int i;
    (this->*CallList[0])();
    for(i=0;i<CallListSize-1;i++){
        CallList[i]=CallList[i+1];
        CallListParameters[i]=CallListParameters[i+1];
    };
    CallList[CallListSize]=NULL;
    CallListParameters[CallListSize]=NULL;
    CallListSize--;
    return 0;
};

...and somewhere in the main loop...

if(CallListSize)CallCall();


where CallList, CallListParameters and CallListSize are members of MyClass. (of course, I simplified things a bit, but this is not far from the original at all). Do you think it's okay how I'm doing it? (Frankly, I'm awful at using mutex-es).

[Edit] Code goes into code box, quote goes into quote box.
It's just that the project is split in two big parts, and I wanted to split these so that there are two "thread processing" functions.


I strongly recommend against throwing multithreading in as an afterthought. It's very tricky to get right and it kind of has to be in your design from the start or else you'll likely be chasing bugs forever.

Unless these two 'big parts' do not share access to any variables/objects (unlikely), this will be a difficult task to work into an existing project.


You also should ask yourself what benefit you hope to gain by adding a separate thread. Is it really worth the added complexity? Chasing down bugs related to multiple thread race conditions is not fun -- they are some of the more difficult bugs (if not THE most difficult kind of bug) to find and fix.


Do you think it's okay how I'm doing it? (Frankly, I'm awful at using mutex-es).


Not really... it's not threadsafe at all. It will behave very strangely. Any kind of information that is shared between threads (like that CallList) MUST be guarded by a mutex or you are asking for serious trouble.

Mutexes are a critical, fundamental part of multithreaded programming. If you are uncomfortable with them you really should not be working multithreading into an existing program, because odds are extremely high you won't get it right.

I don't mean to sound so discouraging. If you want to start working on a multithreaded app, I recommend you start with a smaller program and work it into the design from day one. Making a multithreaded program takes careful planning as to what data is used where and when. It's not something that can be easily patched in later.
I strongly recommend against throwing multithreading in as an afterthought. It's very tricky to get right and it kind of has to be in your design from the start or else you'll likely be chasing bugs forever.
Technically, that is how I originally planned it; ironically, it turns out that it seems like I should be dumping the idea now. Fortunately, doing that wouldn't be much of a complication, as there are only a few adjustments that need to be made, and the functions get merged.

Unless these two 'big parts' do not share access with any variables/objects (unlikely), this will be a difficult task to work into an existing project.
I think there's a slight confusion here: the two functions are never supposed to run simultaneously. However, they both use members and variables of the same class, which I don't particularly like, therefore I am going to get rid of the second thread. However, the working thread / main thread duality still has to remain. Speaking of which, if a thread is associated with a function, is the program's main thread associated with the function main() (or WinMain, etc.)? (This is just my assumption, based on nothing but speculation.) I was also quite curious about what exactly makes the difference between main() and WinMain() or DllMain() at compile time... Perhaps I'm asking too many questions.

You also should ask yourself what benefit you hope to gain by adding a separate thread. Is it really worth the added complexity? Chasing down bugs related to multiple thread race conditions is not fun -- they are some of the more difficult bugs (if not THE most difficult kind of bug) to find and fix.
Good point; I believe that keeping things simple is the best solution for any problems. I will still have to learn how to handle such problems some day though, right?

Not really... it's not threadsafe at all. It will behave very strangely. Any kind of information that is shared between threads (like that CallList) MUST be guarded by a mutex or you are asking for serious trouble.
Looks like I was under the wrong impression that declaring a variable as volatile would spare me from most of the thread problems. I was hoping that only using this CallList for input to the thread and a very similar method for output from it would be safe enough, considering that no other variables are shared. Of course, somehow I always thought that wouldn't be enough. In other words, I'll look into mutex-es as soon as possible.

I don't mean to sound so discouraging. If you want to start working on a multithreaded app, I recommend you start with a smaller program and work it into the design from day one. Making a multithreaded program takes careful planning as to what data is used where and when. It's not something that can be easily patched in later.
I know what you're thinking: that I'm just jumping into action unprepared and unaware, hoping to get it to work. In fact, I feel pretty ashamed that I didn't read up on everything that was going to be needed in this case. It's just that I'm quite new to multithreading, and my "bug sensor" for it is limited to realizing that memory shared between threads can be accessed at the same time by different threads, which makes things get ugly. Now I must go play with mutex-es :)
I could be wrong (probably am) but with *nix threads, since thread creation and thread task-giving (exec()) are separated, can't you just keep using that thread to execute various functions after it's finished the previous?

P.S., I didn't actually read most of this thread so if this question has been answered, just let me know and I'll read.
I think there's a slight confusion here: The two functions are never supposed to run simultaneously.


That is a little confusing. If the two functions are never run simultaneously, then what's the benefit of putting them in separate threads?

Speaking of which, if a thread is associated to a function, is the program's main thread associated with the function main() (or WinMain, etc.)?


I guess you could say that. main() is the entry point for the main thread, just as <yourfunction> is the entry point for whatever additional thread you spawn. But there isn't really any association between them other than the function being the entry point.

I was also quite curious about what exactly makes the difference between main() and WinMain() or DllMain() at compiling time


main() is the standard entry point as dictated by the C++ standard.

WinMain is WinAPI's custom entry point. It exists so that additional, platform-specific information can be passed to the program's entry point (particularly the HINSTANCE of the program).

I pretty much never use DllMain, but it's the same idea as WinMain, just for DLLs.

Looks like I was under the wrong impression that declaring a variable as volatile would spare me from most of the thread problems.


In C++, 'volatile' is rather ill-defined... sort of. It's actually very clearly defined by the standard, but some compilers (like MSVS) give their own special meaning to the word.

Strictly speaking, all 'volatile' does is it guarantees that accesses to a variable will do a read/write to memory. This removes optimization possibilities. For example... normally the compiler might choose to keep a variable in a register so that accesses can be faster -- but with the volatile keyword it isn't allowed to do that.

It's really not that useful on its own.

However MSVS takes the volatile keyword a step further and puts a memory barrier around accesses. I think it might also ensure that accesses to the variable are atomic (but don't quote me on that!).

Both of these make the individual variable thread-safe on its own... but don't necessarily make the whole program thread-safe. For example in your code.. even if all accesses to all variables are atomic... and the memory accesses are behind barriers and occur in the order you expect, you could still get screwed:

    CallList[CallListSize]=NULL;
       // if 'CallListSize' is modified here -- you're boned
    CallListParameters[CallListSize]=NULL;
    CallListSize--;



But really... you shouldn't rely on 'volatile' doing this. Because like I said, it only works in MSVS... so if you try to build on another compiler it'll be disastrous. What's worse... the behavior isn't even consistent across different versions of MSVS. Some versions do it, others don't. So really I would avoid it altogether.


I'll look into mutex-es as soon as possible.


They're conceptually very simple. A mutex can only be locked by one thread at a time. If you try to lock and another thread has it locked already, the thread will stop (sleep) and wait for it to be unlocked. This ensures that two threads are not trying to access sensitive data at the same time.

Furthermore, they form a memory barrier, so when you unlock a mutex, you are guaranteed that all previous writes that the thread has performed are "done". (memory barriers and pipelining are tricky to explain -- reply if you're interested and I'd be happy to give a crash course).

So for an example of a mutex, let's look at some broken code:

// thread A
foo++;


// thread B
ar1[foo] = x;
  // <- caution!
ar2[foo] = y;


If thread A runs its foo++ line while thread B is on the 'caution' line, you're boned because ar1 and ar2 will fall out of sync. To make sure this never happens, we can put those accesses behind a mutex:

// thread A
mymutex.lock();
foo++;
mymutex.unlock();

// thread B
mymutex.lock();
ar1[foo] = x;
ar2[foo] = y;
mymutex.unlock();


Now we are guaranteed that only one of those blocks of code will be run at a time. So it is impossible for the foo++ line in thread A to interrupt the array updating in thread B.


In my code, I used RAII constructs 'unique_lock' and 'lock_guard' which basically automate the process of locking and unlocking.

For example my code here:

        {
            thd::lock_guard<thd::mutex> lock(queueMutex);
            wantExit = true;
            queuePending.notify_one();
        }


Is the same as this:
        {
            queueMutex.lock();
            wantExit = true;
            queuePending.notify_one();
            queueMutex.unlock();
        }


The lock_guard object automatically locks the mutex in its constructor and unlocks it in its destructor. Use of these is advised because if something throws an exception, execution would normally skip over the unlock(), keeping the mutex locked (which might be trouble) -- but with RAII the destructor still kicks in even if there was an exception, so it will always unlock the mutex.




Anyway blah blah blah. Hopefully I'm clarifying things and not confusing you. I'm happy to answer more questions. This stuff is actually a lot of fun for me. :)
That is a little confusing. If the two functions are never run simultaneously, then what's the benefit of putting them in separate threads?
I know, right? What was I thinking?!

Strictly speaking, all 'volatile' does is it guarantees that accesses to a variable will do a read/write to memory. This removes optimization possibilities. For example... normally the compiler might choose to keep a variable in a register so that accesses can be faster -- but with the volatile keyword it isn't allowed to do that.
So, it is correct that declaring a variable as 'volatile' allocates an "impartial" place in memory for it, allowing it to be accessed and modified from "somewhere else" (including a different thread - maybe even a different process?). That makes it untouchable by the compiler's optimization procedures, because who knows who's going to access that variable in the meantime. That's probably why it would become somewhat insecure, which could be why VS makes accesses to it atomic. Which leads me to wonder: are you saying that normally, a simple variable access for read or write is non-atomic by default? Meaning that one thread writing a chunk of data while another thread reads the same chunk could end up with thread 2 reading a mix of the old and the new value? In that case, there still has to be a quantum. One byte or one bit, perhaps? Who knows, this "hack" might even be useful (although I haven't yet found any situation in which it would be), excluding of course the fact that it would be completely uncontrolled, because as far as I know, threads never do things at a constant rate. Anyway, I bet the real problem in this case is not about variables; it's about pointers...

Thanks for the fast mutex tutorial. By my understanding, a mutex is similar to a global bool: the first thread that gets to write it as 1 locks it, and the others do while(itislocked)sleep(); (of course, it's curious how the mutual exclusion itself isn't affected by thread safety, but I'm far from knowledgeable in processor science). Sounds like an elegant solution - easy to understand and hopefully efficient. Not sure what's with all the template stuff for lock_guard, but I suppose it's a matter of preference. I haven't yet had time to play with them enough, but hopefully I will soon. Anyway, are there other fundamental things that I'm missing and need to watch out for in order to continue developing my program?

Thanks a lot for the patience, knowledge and time.
Volatile doesn't allow access to a variable from "somewhere else"[1]; it's more like it informs the compiler that the variable is being controlled from somewhere else.

An example of when to use volatile is when a variable represents the state of a real-time sensor. Your program has no way of knowing that it might change in an instant, so it might want to copy the current value of the variable into a register to do a comparison, because that would normally be faster than comparing two memory addresses. Volatile prevents this.

[1]: By "somewhere else" I assume you mean interprocess communication (IPC). This is done with things like named pipes and Remote Procedure Calls. The volatile specifier has no bearing on the scope of the variable.
So, it is correct that declaring a variable as 'volatile' allocates an "impartial" place in the memory for it, allowing it to be accessed and modified from "somewhere else" (including a different thread - maybe even a different process?).

That's not correct.

Because a volatile access is not associated with a memory barrier, just because your thread performed a volatile write doesn't mean another thread will ever observe it when it reads from the same address, and just because another thread updated a volatile variable doesn't mean your thread will ever observe the change.

Moreover, because a volatile access is not atomic, if one thread performs a volatile write and another thread performs a volatile read of the same variable without additional synchronization, a data race occurs, which is a kind of undefined behavior: the program no longer makes any sense; it can do anything whatsoever.

That causes it to become untouchable from the compiler's optimization procedures, because who knows who's going to access that variable in the meantime.

That's also not correct: volatile accesses are not "untouchable". They are simply treated by the optimizer the same way as I/O (because they *are* how memory-mapped I/O is implemented). They cannot be removed, but they can be moved. The compiler guarantees that they will be performed in the same sequence, and that the writes will be done with the same values, as the abstract machine would have done, regardless of actual code transformations.

In particular, the compiler is free to move code over the volatile access (as long as that code is not a volatile access or other I/O itself): if you volatile write, then do some long loop that calculates a value, and volatile write that value, it can do the loop first, and then volatile write old value and immediately volatile write new.

Are you saying that normally, a simple variable access for read or write is non-atomic by default?

That's correct: writing a non-atomic variable (volatile or not) from one thread and reading it from another without additional synchronization is UB.

Mutexes are a critical, fundamental part of multithreaded programming.


I disagree.

Actually, I can't see a problem with writing multithreaded code without mutexes at all (except that it is extremely hard in C++, given the lack of concurrent collections in the STL, the lack of GC, and the lack of immutable structures). Mutexes are almost always a scalability killer at some point. That's why Erlang and Go got rid of them, yet they are among the languages with top-notch multithreading / parallelism support.
Fine, then let me qualify that statement with "in C++".

EDIT:

Though really... it's not just C++; it's a low-level concept. Mutual exclusion is practically a building block of parallelism. Languages that don't implement mutexes must abstract or automate them away somehow (unless they rely solely on barrier-guarded atomic memory accesses). So when you get to a low enough level, it's still an important concept. So I don't really have to qualify my statement.

If the implementation for Erlang/Go do not employ mutexes in some form, I would be astounded.
lack of concurrent collections in STL


As opposed to the "thread-safe" structures in other languages that still require external locking for nontrivial use.

lack of GC


Garbage collectors pause the rest of the program when they run. That's much worse than an occasional block on one thread.

lack of immutable structures


struct S
{
    int a;
    int b;
};

const S s = {1, 2};


s is immutable unless you destroy all guarantees of program correctness.

Erlang and Go got rid of them


Both of these languages not only use mutexes internally but provide them to the programmer.
I can't see a problem of writing multithreaded code without mutexes at all


All the programmers past just rolled over in their graves. Maybe in the next 2,000 years there will be enough lock-free algorithms discovered to make that quote a reality.


Erlang and Go got rid of them


It only looks to me like they hid them better from end-users to trick them into believing that the languages are magical.

http://golang.org/pkg/sync/

The erlang documentation is so cryptic that I gave up looking for a link.
lack of GC
Garbage collectors pause the rest of the program when they run

He was probably alluding to the ABA problem in lockless lists that use CAS-based atomic instructions. One of the many ways to deal with it is GC, the worst possible way in my opinion.
Topic archived. No new replies allowed.