Concurrent lock-free unique id generator

Hello.
I'm trying to implement a concurrent lock-free unique id "generator".
Currently it looks more or less like this:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
using UID = unsigned;

class UIDPool {
public:
    UIDPool() : m_nextId{1}, m_numIds{0} {}
    ~UIDPool() {}

    UID acquire() {
        std::lock_guard<SpinLock> lock{m_lock};
        m_numIds++;
        return m_nextId++;
    }

    void release(UID id) {
        std::lock_guard<SpinLock> lock{m_lock};
        if (m_numIds <= id) {
            m_nextId = m_numIds;
        }
        m_numIds--;
    }

private:
    SpinLock m_lock;
    UID m_nextId;
    UID m_numIds;
}


The usage is obvious:
// in some object class declaration
static UIDPool s_uidPool;
UID m_id;
...
// in constructor of this object
m_id = s_uidPool.acquire();
...
// in destructor of this object
s_uidPool.release(m_id);

What I don't like about this code is the lock guards, even though they use a (fast) spinlock and contention is expected to occur rarely. So I'm wondering if it is possible to implement such (or similar) unique id generator using std::atomic's magic. But simply increasing some counter atomically and using its current value as id won't do because unused ids must be returned to the "pool" of unused ids.
Last edited on
How about a lock free queue? Make it whatever size is the size of the pool, unused ids go back onto the end of the queue.
Something like this, perhaps:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
#include <atomic>

struct has_unique_id
{
    using uid = unsigned long long ;

    uid unique_id() const { return m_id ; }

    // do not copy or assign unique ids
    has_unique_id( const has_unique_id& ) : m_id(++next) {}
    has_unique_id( has_unique_id&& ) : m_id(++next) {}
    has_unique_id& operator= ( has_unique_id ) noexcept { return *this ; }

    protected: has_unique_id() : m_id(++next) {}

    private:
        uid m_id ;
        static std::atomic<uid> next ; // http://en.cppreference.com/w/cpp/atomic/atomic
};

////////////////// in has_unique_id.cpp /////////////////////

// start with 1001 as the first id
std::atomic<has_unique_id::uid> has_unique_id::next { 1000 } ;

///////////////////////////////////////////////////////// 


And then, to define a type with objects having unique ids: struct A : has_unique_id { /*... */ };

http://coliru.stacked-crooked.com/a/b531057532b3489d
Repeater
The thing is that the largest possible size of this "pool" is more or less unknown. The worst case I think will be 10^7 (and even about this I'm not sure, but definitely less than 10^9). And allocating 40MB+ just for this queue is too wasteful.
So use a lock free queue containing 1000 IDs (or some other sensible allocation). As IDs are released, add them back into the queue. If the queue is ever down to some small size, you need more IDs so add another thousand into the queue. The queue thus seems infinite to the user, but never takes up too much space, and released IDs are put back into the queue to be reused.

Why do you need to reuse IDs? Since you could have 10^7 items at once, why so keen to limit the number of IDs?
Last edited on
JLBorges
My code needs to also run on 32-bit systems (x86/ARMv7). I think that atomicity of operations on unsigned long long variables is realized internally using locks (mutexes) on 32-bit systems. Am I wrong?

Repeater
I was just hoping that this could be done without external data structures like in my 1st post for example (although its kinda unsafe, even considering that all objects will be relatively short-lived).
But such queue of ids seems to be the good and safe choice. Thanks.

If the queue is ever down to some small size, you need more IDs so add another thousand into the queue.

Indeed. I was thinking that many threads can drastically increase this queue's size when they see it being depleted and start adding new ids to it all at once (e.g. 1024 ids at a time each thread). But this can be adjusted.

Why do you need to reuse IDs?

Because objects with these ids will be coming and going. And overflowing unsigned int by simply increasing the counter is quite possible.

Since you could have 10^7 items at once, why so keen to limit the number of IDs?

I don't know how many objects there will be at one time. 10^7 is just a "soft top". But the probability of some number of objects existing at one time decreases greatly as the number of these objects increases. So I just don't want to waste memory considering that there will be several independent id-generators of that kind.
Last edited on
And overflowing unsigned int by simply increasing the counter is quite possible.


If you used a 64 bit unsigned int, and you allocated one new ID every clock cycle, you'd take about 146 years to overflow on a 4Ghz processor.
Last edited on
Repeater
If you used a 64 bit unsigned int

Yeah. And I already answered JLBorges why I don't want to do that.
I suspect that the interaction with a 64 bit int value on a 32 bit system will be orders of magnitude faster than anything done involving a class and queues and all that sort of thing.

Something along the lines of this would seem to do everything needed to generate the ID.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
#include <atomic>
#include <iostream>

int getNextSerial()
{
  static std::atomic<uint64_t> i;
  return i++;
}

int main()
{
  for (int  a = 0; a < 34; ++a)
  {
    std::cout << getNextSerial() << std::endl;
  }
}


Simple. Fast. Easy to understand. Easy to maintain. Ticks all the boxes.
Last edited on
Repeater
A 32 bit system is quite capable of using a 64 bit int.

I know. But 32-bit systems are incapable of atomic operations on 64-bit ints without using locks (mutexes) afaik.
> My code needs to also run on 32-bit systems (x86/ARMv7). I think that atomicity of operations
> on unsigned long long variables is realized internally using locks (mutexes) on 32-bit systems.
> Am I wrong?

Check it out by running this program (on a 32-bit platform):
1
2
3
4
5
6
7
8
#include <atomic>
#include <iostream>

int main()
{
    std::atomic< unsigned long long > uid ;
    std::cout << "lock free? " << std::boolalpha << uid.is_lock_free() << '\n' ;
}
I just read the reference page, where I found

Atomic types are also allowed to be sometimes lock-free
http://en.cppreference.com/w/cpp/atomic/atomic/is_lock_free

C++17 introduces std::atomic::is_always_lock_free - would that be a better choice here?
http://en.cppreference.com/w/cpp/atomic/atomic/is_always_lock_free
Last edited on
> Atomic types are also allowed to be sometimes lock-free

For some atomic types, naturally aligned objects may be lock-free, while misaligned objects require locking operations.


> would that be a better choice here?

Yes, if a C++17 standard library is available.
fmad wrote:
32-bit systems are incapable of atomic operations on 64-bit ints without using locks (mutexes) afaik.

32-bit Intel is certainly capable, that was the reason they introduced the cmpxchg8b instruction all the way back in 1993


Well I've tested atomicity of 64-bit variables on needed architectures and yes they're lock-free. So it seems that I was deluding myself. :P
Thanks everyone.
Topic archived. No new replies allowed.