Fastest performance lookup/insert/update thread safe

I've got a name and a score. The name is a std::string, the score is a double. I need to update/read name/score pairs from multiple threads simultaneously, in a thread-safe way.

Such that:
void MakeScore(name,score);
where name is a name, e.g. "John", and score is a score, e.g. 87.5, would be written to some container. The pair would be inserted if it did not exist. Any later call to MakeScore with "John" as the name would update the value (score) at that node.

At the same time I can call GetScore(name); from any thread at any time; assume GetScore will be called more often than MakeScore.

What is the fastest thread safe mechanism for this?

I've implemented this using critical sections, but it's not fast enough, so now I'm considering a lock-less approach. Any ideas?

Also, what would be the best container for this, assuming inserts are the least common operation and lookups and updates the most common?

You should use reference counting in a smart way:

Keep a count of how many threads are currently reading the variable. While it's only being read, don't lock it.
When a write is required, wait until all readers have finished reading, then lock it.

The trick is to only lock while the data is being modified.
If threads only need to read, they can all read concurrently without waiting on each other. Just be careful none of them writes while doing so!

If a coarse rwlock was profiled and shown to be a bottleneck, I would start by evaluating TBB:
http://threadingbuildingblocks.org/docs/help/reference/containers_overview/concurrent_unordered_map_cls.htm
or the older
http://threadingbuildingblocks.org/docs/help/reference/containers_overview/concurrent_hash_map_cls.htm

(also, keep in mind that lockless doesn't mean fast)
Thanks for the suggestions; I'll look into both and report back.

@EssGeEich - One of my ideas that sort of plays off your approach is using the interlocked methods. A count will be atomically incremented while the data is being read and, if it is being read by multiple threads, the counter will increase; when a thread is done reading, it decrements the counter.

The problem I was thinking of is that the writer would then have to race to acquire the data. If readers never lock, a moment's hesitation by the writer means its acquire gets put on the back burner as yet another reader increments the counter. Say 16 threads are reading and only 1 thread is writing: the writer could be starved indefinitely.

I thought that maybe I could implement some sort of interlocked "flag", if you will, for the writer, such that when it tries to acquire while readers are reading, the writer will "spin"; however, any new readers will have to spin as well, until the writer acquires the data, does the update, and clears this flag. All the flagging would be atomic for both readers and writers. Does this sound like a 'sound' approach?

Humor me for a second. What if the writer never updates the data the readers are actually accessing, but instead updates a copy of that data? When the writer is done, its atomic flag says "I'm ready to be read now", and any new readers entering the "count" process will not access the old data. All old readers finish their reads and decrement their counters, and when the count reaches zero the old data is destroyed.
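That idea is essentially copy-on-write (RCU-style) publication. A sketch using shared_ptr, where the per-reader counting and the destruction of the old data fall out of shared_ptr's reference count automatically (assumes the C++11 std::atomic_load/atomic_store overloads for shared_ptr, which may themselves lock internally depending on the library; the class name is my own):

```cpp
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>

using Map = std::unordered_map<std::string, double>;

class CowScores {
public:
    CowScores() : current_(std::make_shared<Map>()) {}

    void MakeScore(const std::string& name, double score) {
        std::lock_guard<std::mutex> g(writeMutex_);  // serialize writers only
        // Copy the current map, modify the copy, then publish it atomically.
        auto next = std::make_shared<Map>(*std::atomic_load(&current_));
        (*next)[name] = score;
        std::atomic_store(&current_, std::shared_ptr<const Map>(next));
    }

    double GetScore(const std::string& name) const {
        // Readers grab a snapshot and never block writers or each other.
        auto snap = std::atomic_load(&current_);
        auto it = snap->find(name);
        return it != snap->end() ? it->second : 0.0;
    }

private:
    std::mutex writeMutex_;
    std::shared_ptr<const Map> current_;
};
```

Old snapshots are destroyed when the last reader holding one releases its shared_ptr, exactly the "count reaches zero, old data destroyed" behavior described above. The cost is copying the whole map on every write, so this fits the stated workload (reads far outnumber writes) but not a write-heavy one.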

@Cubbi going to look into those now. BTW I've never run across that website, thanks for this one. Yeah, I hear you on the lockless bit; I think the problem was a lot of kernel transitions, which I can avoid using a lockless approach. However, the overhead in such an implementation may just prove to be as inefficient.
Hey, I just thought again about my way of doing it and, yeah, let me say it, I was being dumb. If two threads were to increment the reference count at the same time, bad things would happen. The way to fix it is to lock the reference count, and we're back where we started. So I can only say: follow Cubbi.
If you're already using critical sections then you're unlikely to get much more speed out of it. You're well into the point of diminishing returns.

Usually when critical sections are "too slow" and people start thinking about lockless data structures, they are going about solving the problem the wrong way. For example: do you prevent race conditions? Do you ensure that updates are transactional? Do you need to do either or both of those things? Can you do updates in batches rather than one at a time? Do inserts need to be immediately visible to other threads? Have you considered locking to read the value, then unlocking while you do processing, then re-locking, confirming that nothing has changed, and only then updating before unlocking again? Are you doing everything you possibly can outside of the lock? Have you considered giving each thread its own map which is synced back to the main copy just once a second or so? I mean, does it really matter if one thread doesn't see another thread's changes immediately? After all, it is a race condition anyway.

My point being with these questions is that there are a lot of other ways to improve and fix the system before resorting to the horrors of a lock-free system, which likely still won't make your program much faster anyway.
You need to give a lot more detail to get the best outcome.
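One of the suggestions above, each thread keeping its own map that is synced back periodically, might be sketched like this (all names are illustrative; a real version would call Flush on a timer or every N updates):

```cpp
#include <mutex>
#include <string>
#include <unordered_map>

using Map = std::unordered_map<std::string, double>;

Map        g_shared;        // the map every thread can read
std::mutex g_sharedMutex;

thread_local Map t_pending; // this thread's not-yet-synced updates

void MakeScoreLocal(const std::string& name, double score) {
    t_pending[name] = score;            // no lock: thread-private buffer
}

void Flush() {                          // call once a second or so
    std::lock_guard<std::mutex> g(g_sharedMutex);
    for (const auto& kv : t_pending)
        g_shared[kv.first] = kv.second;
    t_pending.clear();
}

double GetScore(const std::string& name) {
    std::lock_guard<std::mutex> g(g_sharedMutex);
    auto it = g_shared.find(name);
    return it != g_shared.end() ? it->second : 0.0;
}
```

The lock is now taken once per batch instead of once per update, at the cost of other threads seeing updates with a short delay, which, as argued above, is often acceptable since the ordering was racy anyway.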