ID resrevation & generation

I have this helper class, but I'm wondering if there's some better way to do it. I tried looking through the functions in <algorithm> but none seem to really do the trick.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
		template<typename T>
		struct ID_Manager
		{
			using ID_type = T;
			ID_type GenerateID()
			{
				Assert(lowest < std::numeric_limits<ID_type>::max());
				ID_type ret (lowest);
				while(IDs.find(++lowest) != IDs.end())
				{
				}
				return IDs.insert(ret), ret;
			}
			void ReleaseID(ID_type ID)
			{
				Assert(IDs.find(ID) != IDs.end());
				IDs.erase(ID);
				lowest = (ID < lowest) ? ID : lowest;
			}
		private:
			std::set<ID_type> IDs;
			ID_type lowest = std::numeric_limits<ID_type>::min();
		};
Is there a better way to go about this?
Last edited on
So this is the most effective way to do it, and there is no existing standard library code, container, or function that behaves the same?
This would be more efficient.

(It can be made even more efficient under certain situations with a std::set<> and std::set<>::upper_bound() , but that's unlikely to be worth it.
Another idea, which may be worth it with some patterns of usage would be to maintain a cache of recently released ids).

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
#include <type_traits>
#include <limits>
#include <unordered_set>

template < typename T, typename = T > struct id_factory ;

template < typename T >
using int_type = typename std::enable_if< std::is_integral<T>::value, T >::type ;

template < typename T >struct id_factory< T, int_type<T> >
{
    static constexpr T min = std::numeric_limits<T>::min() ;
    static constexpr T max = std::numeric_limits<T>::max() ;

    static T generate()
    {
        T n = last_id + 1 ;

        while( id_set.insert(n).second == false )
        {
            if( n < max ) ++n ;
            else n = min ;
        }

        return last_id = n ;

    }

    static void release( T id ) ;

    private:
        static T last_id ;
        static std::unordered_set<T> id_set ;
};

template < typename T >
T id_factory< T, int_type<T> >::last_id = std::numeric_limits<T>::min() ;

template < typename T >
std::unordered_set<T> id_factory< T, int_type<T> >::id_set ;

// if required
#include <string>
template <> struct id_factory< std::string, std::string >
{
    // factory for string ids

    // ...
};
The protocol I am implementing requires me to use the lowest possible IDs, and I'd also need the ID factories to be independent of each other for the same type. I understand how your code works, mostly, but I'm not sure how to change it. Also, I don't see where the implementation is for release?

Edit: I updated my first post with my current code
Last edited on
> I'd also need the ID factories to be independent of each other for the same type.

So, everything becomes non-static.


> Also, I don't see where the implementation is for release?

I'd just left it at the declaration; thought that the implementation was obvious.


> The protocol I am implementing requires me to use the lowest possible IDs

I'm a bit curious. Why?

For the lowest possible IDs:

Using a std::set<>, like in your code,
with generate() being O(N) instead of O( N log N )
and release() remaining O( log N )

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
template < typename T, typename = T > struct id_factory ;

template < typename T >
using int_type = typename std::enable_if< std::is_integral<T>::value, T >::type ;

template < typename T >struct id_factory< T, int_type<T> >
{
    static constexpr T min = std::numeric_limits<T>::min() ;
    static constexpr T max = std::numeric_limits<T>::max() ;

    T generate()
    {
        T n = lowest ;
        
        // EDIT:
        // for( const T& id : id_set )
        for( auto iter = id_set.lower_bound(n) ; iter != id_set.end() ; ++iter )
        {
            if( *iter > n ) break ;
            else if( *iter < max ) n = *iter+1 ;
            else throw std::range_error( "no more ids" ) ;
        }

        id_set.insert(n) ;
        lowest = n+1 ;
        return n ;
    }

    void release( T id )
    {
        if( !id_set.erase(id) ) throw std::domain_error( "invalid id" ) ;
        lowest = std::min( id, lowest ) ;
    }

    private:
        std::set<T> id_set ;
        T lowest = min ;
};



A std::priority_queue<> version
with generate() being O( log N ),
release() remaining O( log N )
(and using more memory if sanity checks are enabled)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
template < typename T, typename = T > struct id_factory ;

template < typename T >
using int_type = typename std::enable_if< std::is_integral<T>::value, T >::type ;

template < typename T > struct id_factory< T, int_type<T> >
{
    static constexpr T min = std::numeric_limits<T>::min() ;
    static constexpr T max = std::numeric_limits<T>::max() ;

    explicit id_factory( bool check_id = false ) : check(check_id) {}

    T generate()
    {
        T n ;
        if( free_ids.empty() ) n = next++ ;
        else
        {
            n = free_ids.top() ;
            free_ids.pop() ;
        }

        if(check) used_ids.insert(n) ;
        return n ;
    }

    void release( T id )
    {
        if( check && !used_ids.erase(id) )
            throw std::domain_error( "invalid id" ) ;
        free_ids.push(id) ;
    }

    private:
        std::unordered_set<T> used_ids ;
        std::priority_queue< T, std::vector<T>, std::greater<T> > free_ids ;
        T next = min ;
        const bool check ;
};


Caveat: untested code!
Last edited on
Thanks, I'll start dissecting that code to try and understand it...
JLBorges wrote:
> The protocol I am implementing requires me to use the lowest possible IDs

I'm a bit curious. Why?
I actually don't know, and I'm pretty sure I could get away with using your previous example given how the protocol works (and if it turns out that I can, I will indeed use that method), but currently my job is to implement the protocol as-is. The problem is that it has to be compatible with existing software that expects it to behave this way.


Anyway, I'll start digesting your examples. I guess it is reassuring to know that the reason I couldn't think of a simpler way is because there actually wasn't a simpler way.
In your priority_queue example:
13
14
15
16
17
18
19
20
21
22
23
24
25
    T generate()
    {
        T n ;
        if( free_ids.empty() ) n = next++ ; //what is 'next'?
        else
        {
            n = free_ids.top() ;
            free_ids.pop() ;
        }

        if(check) used_ids.insert(n) ;
        return n ;
    }



I'm also trying to compare your std::set example to my code. I don't know what the difference is, besides being more verbose (which is fine).
5
6
7
8
9
10
11
12
13
	ID_type GenerateID()
	{
		Assert(lowest < std::numeric_limits<ID_type>::max()); //A
		ID_type ret (lowest); //B
		while(IDs.find(++lowest) != IDs.end()) //C
		{
		}
		return IDs.insert(ret), ret; //D
	}
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
    T generate()
    {
        T n = lowest ; //B
        
        // EDIT:
        // for( const T& id : id_set ) //C
        for( auto iter = id_set.lower_bound(n) ; iter != id_set.end() ; ++iter ) //C
        {
            if( *iter > n ) break ; //C
            else if( *iter < max ) n = *iter+1 ; //C
            else throw std::range_error( "no more ids" ) ; //A
        }

        id_set.insert(n) ; //D
        lowest = n+1 ; //C
        return n ; //D
    }
I can't tell which performs better, or if mine is somehow incorrect.
Last edited on
> In your priority_queue example, what is 'next'

'next' holds the next id to be generated if there are no free released ids to pick from.

If we have generated a total of 1000 fresh ids, 'next' would be 'min' + 1000. If 50 of these 1000 ids are in the free ids list, no new id is generated, and the smallest free id is picked. We would now have 49 of the 1000 ids generated left in the free list.

If the free list is empty, all 1000 ids generated so far are in use; we would need to get a fresh id. The new id would be 'next' ( 'min' + 1000 ), and next would be incremented; it becomes 'min' + 1001.


> I can't tell which performs better, or if mine is somehow incorrect.

Yours is correct.

However, IDs.find(++lowest) is O( log N );
while ++iter ... ( *iter > n ) ... ( iter < max ) is O(1).

EDIT:
In while(IDs.find(++lowest) != IDs.end()) the ordered-ness of the std::set<> is not being exploited.

In that scenario, we might as well change std::set<> to std::unordered_set<> for which find() is amortised O(1)
Last edited on
Thanks!

About 'next', I meant, where is it defined? Was it just something I was supposed to fill in myself? I wasn't sure.
> About 'next', I meant, where is it defined?

'next' is a non-static member variable.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
template < typename T > struct id_factory< T, int_type<T> >
{
    static constexpr T min = std::numeric_limits<T>::min() ;

   // ...

    T generate()
    {
        T n ;
        if( free_ids.empty() ) n = next++ ; // *** and used here

        else // ...
        // ...
        return n ;
    }

    // ...

    private:
        // ...
        T next = min ; // **** declared with an initializer here
        // ...
};
Oops, haha, I can't believe I missed that. Thanks!
Topic archived. No new replies allowed.