Okay everyone, I have a problem that is KILLING me. I'm reaching out to you guys in hopes that someone can point out the problem.
Long story short
Mutex locks are not forcing my variables to be physically written, resulting in a condition_variable failing to wake up, resulting in program deadlock.
Long story long
I'm writing an NES emulator which runs the CPU in the main thread, and the PPU (graphics) in a spinoff thread. Emulation runs a frame at a time. Once a frame completes, I need the PPU to stop and wait for the next frame to start.
This is accomplished with 2 variables.
1) an "EmulationOn" boolean to indicate if a frame is currently being emulated
2) a "timestamp" which is used to sync up the CPU/PPU. When the PPU finishes the frame, it gets set to a crazy high value. It is reset to 0 at frame start.
Bear with me... there's a lot of code but it's not that hard to see the problem. Overall emulation flow is below. All the 'SuperLog' stuff is dumped to a log file:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
|
void EmulateOneFrame()
{
mPpuSyncSystem->SuperLog( "N: Start of frame...\n" );
mClock.EmulationOn = true;
mCpu.AdjustTimestamp( -mPpu.GetLastFrameLength() );
mPpu.SetVideoOut(video);
mPpu.NextFrame(); // <- HERE, this resets the timestamp to zero
mPpuSyncSystem->SuperLog( "<ON> = %d\n", mClock.EmulationOn ? 1 : 0 );
// ... emulation is started here, once both CPU and PPU threads finish
// emulating the frame, the PPU thread waits and execution resumes
// here...
mClock.EmulationOn = false;
mPpuSyncSystem->SuperLog( "<OFF> = %d\n", mClock.EmulationOn ? 1 : 0 );
mPpuSyncSystem->SuperLog( "N: End of frame...\n" );
}
|
Some other relevent areas in the code:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
|
// Called by above code at frame start:
void NesPpu::NextFrame()
{
mClock->PpuTimestamp_ = 0; //<- HERE, notice, PpuTimestamp_ is zero'd
mSync->SuperLog("P: Timestamp reset T=%d &clk=%08X\n", (timestamp_t)mClock->PpuTimestamp_, mClock);
// ... other stuff here to reset the PPU state in prep to begin a new frame
mSync->WakeThis(); // <- calls notify_one on the condition_variable to
// wake the PPU thread. This much is working.
}
// Used as the condition callback for the condition_variable which the PPU
// waits for once it finishes a frame
bool NesPpu::TODORemoveMe()
{
bool result = (mClock->EmulationOn && (mClock->PpuTimestamp_ != timestamp_never)) || mSync->WantToBeJoined();
mSync->SuperLog( "P: checking: on=%d, ts=%d OUT=%d &clk=%08X\n", mClock->EmulationOn ? 1 : 0, (timestamp_t)(mClock->PpuTimestamp_), result ? 1 : 0, mClock );
return result;
}
// .. when the PPU finishes a frame...
mClock->PpuTimestamp_ = timestamp_never; // timestamp_never==0x7FFFFFFF
mSync->SuperLog( "P: About to wait for frame to start...\n" );
mSync->Wait( [&] () { return TODORemoveMe(); }, mMasterSync );
mSync->SuperLog( "P: Begin Frame emulation...\n" );
// (ignore the mMasterSync thing, this effectively just calls
// condition_variable::wait )
|
So when running this... it works fine for a while, then deadlocks. Checking the log reveals the problem:
P: About to wait for frame to start... <- PPU finished frame
P: Thread waiting... <- PPU thread now waiting
C: Thread Waking... <- CPU thread is poked (still waiting, this just means notify_one has been called and it should recheck its condition)
P: checking: on=1, ts=2147483647 OUT=0 &clk=023DF788 <- PPU thread condition is checked, 'OUT=0' so thread remains waiting
C: Thread Waking... <- poke
P: Thread Waking... <- poke
C: ...waiting complete <- CPU thread now wakes up and resumes
<OFF> = 0 <- EmulationOn changed to false, to signal this frame is done
N: End of frame... <- Frame end
N: Start of frame... <- Next frame starts
P: Timestamp reset T=0 &clk=023DF788 <- PPU resets timestamp !!!!IMPORTANT notice timestamp=0
P: Thread Waking... <- poke PPU thread
<ON> = 1 <- EmulationOn is true, new frame is starting
(reg write) 2000=10 <- CPU needs to sync with PPU on reg writes
C: Thread waiting... <- so CPU waits
P: Thread Waking... <- PPU is poked
P: Thread Waking... <- ... twice
P: checking: on=1, ts=2147483647 OUT=0 &clk=023DF788 <- PPU's condition is checked !!!! NOTE ts != 0
|
The timestamp here is the zinger. Notice how it's clearly being set to zero at the start of the frame, but is still being read back as crazy-high when the condition variable is checking the status.
The program deadlocks here, as now both CPU and PPU threads are waiting, and neither one ever gets poked again.
Now I know that variables used to sync multiple threads need to be guarded. I also know that marking them as 'volatile' isn't enough, as you have to ensure memory is actually serialized in the right order and race conditions have to be avoided. As such, both EmulationOn and my timestamp variables are 'atomic'. std::atomic isn't supported by my compiler, so I wrote my own ghetto Atomic class which basically ensures all accesses are locked behind a mutex:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
|
template <typename T>
class Atomic
{
public:
Atomic(T x = T()) : v(x) { }
Atomic(const Atomic<T>& x) : v(x) { }
operator T () const
{
T copy;
{
threadlib::lock_guard<threadlib::mutex> lk(m);
copy = v;
}
return copy;
}
void operator = (const T& x)
{
threadlib::lock_guard<threadlib::mutex> lk(m);
v = x;
}
void operator = (const Atomic<T>& x)
{
*this = static_cast<T>(x);
}
void operator += (const T& x)
{
threadlib::lock_guard<threadlib::mutex> lk(m);
v += x;
}
private:
volatile T v;
mutable threadlib::mutex m;
};
|
Does anyone have any ideas why this is happening? I'm ripping my hair out over this.
I
thought that a mutex lock combined with the actual variable being volatile would ensure reads/writes across threads would be in every way protected, but now I'm thinking maybe there's more that's needed?
I'm using boost for threads/mutexes/etc (my compiler doesn't have std::thread implemented yet)
Thanks.