Globals really aren't as bad as some people in this thread make them out to be... you just have to be careful with them. Accessing a resource in a uniform location is often faster than passing it via a function or handler.
You just need to either use a mutex (or something else that prevents data races) or use TLS (thread-local storage). SFML uses TLS.
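To make the two options concrete, here is a rough sketch of a global guarded by a mutex next to a thread-local global. All the names (`g_shared_counter`, `t_local_counter`, `run_shared`, and so on) are made up for illustration, and this is not how SFML does it internally, just the general shape of each approach.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical shared global: one copy for the whole process, guarded by a mutex.
std::mutex g_counter_mutex;
long g_shared_counter = 0;

// Hypothetical thread-local global: every thread gets its own copy (TLS),
// so no locking is needed to touch it.
thread_local long t_local_counter = 0;

// Increment the shared global n times, taking the mutex for each increment.
void add_shared(long n) {
    for (long i = 0; i < n; ++i) {
        std::lock_guard<std::mutex> lock(g_counter_mutex);
        ++g_shared_counter;
    }
}

// Increment the calling thread's private copy n times and return its value.
long add_local(long n) {
    for (long i = 0; i < n; ++i) {
        ++t_local_counter;  // no data race: other threads never see this copy
    }
    return t_local_counter;
}

// Reset the shared counter, run add_shared on several threads, return the total.
long run_shared(int threads, long per_thread) {
    g_shared_counter = 0;
    std::vector<std::thread> pool;
    for (int i = 0; i < threads; ++i) {
        pool.emplace_back(add_shared, per_thread);
    }
    for (auto& t : pool) {
        t.join();
    }
    return g_shared_counter;
}
```

The trade-off is that the mutex version gives every thread the same data, while the TLS version avoids locking only because each thread sees a separate copy.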
Well, a mutex can often have practically no lock time at all. It's only the unlucky case of actually contending for the lock that hurts. Bad programming can also make the mutex case slow, for instance doing unnecessary work while keeping the mutex locked (like doing synchronous IO with a locked mutex = rather slow and probably wrong).
Although, you're also right. Bad practice can make this method just as bad, though.
I am not following computerquip's point. With a mutex you can waste up to (n - 1)/n of your processor resources (n = the number of processors) if you are not using it correctly, since only one thread makes progress while the others wait, and that is not limited to sync IO. Even if you are using it correctly, the waste still grows as the processor count increases. This is the reason we have lots of lock-free data structures, say queues, memory allocators, etc.
Usually you will see no difference between a mutex-based queue and a lock-free queue if you have only 2 - 4 cores. But when you run it on a 32-core machine, you will see the gap.
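A real lock-free queue is too long to post here, so as a much smaller stand-in here is the same contrast on a counter: one version takes a mutex per increment, the other uses `std::atomic` with `fetch_add`. The names (`g_locked_total`, `g_atomic_total`, `run_on_threads`) are invented for the example, and note that `std::atomic<long>` is lock-free on mainstream platforms but the standard does not guarantee it (you can check with `is_lock_free()`).

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical mutex-protected total.
std::mutex g_total_mutex;
long g_locked_total = 0;

// Hypothetical atomic total: increments do not take a lock.
std::atomic<long> g_atomic_total{0};

// Every increment serializes on the mutex.
void bump_locked(long n) {
    for (long i = 0; i < n; ++i) {
        std::lock_guard<std::mutex> lock(g_total_mutex);
        ++g_locked_total;
    }
}

// Every increment is a single atomic read-modify-write.
void bump_atomic(long n) {
    for (long i = 0; i < n; ++i) {
        g_atomic_total.fetch_add(1, std::memory_order_relaxed);
    }
}

// Run fn(per_thread) on the given number of threads and wait for all of them.
template <typename Fn>
void run_on_threads(int threads, Fn fn, long per_thread) {
    std::vector<std::thread> pool;
    for (int i = 0; i < threads; ++i) {
        pool.emplace_back(fn, per_thread);
    }
    for (auto& t : pool) {
        t.join();
    }
}
```

Both give the same total; to see the gap I am talking about you would have to time `run_on_threads` with each function on machines with different core counts.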