An unusual data type... what do I use?

I have an unusual situation in which I want to use a basic (raw/built in) data type that hides some of its value. Let me explain what I mean.

Basically the type is an unsigned long. However, I want to hide the 4 most significant bits and use them as flags. For the purposes of this situation, let's call it MyType. Here is an example of how I want to use it:

MyType v = 0x3FFF0FFFFFFFFFFF;
unsigned long ulv = (unsigned long)v;  // ulv = 0xFFF0FFFFFFFFFFF... note that the most significant 
                                       // hex digit is cut off.  Those bits will be used as flags.

// the leading 3 (binary 0011) breaks down as follows:
bool flag0 = v.flag0(); // would be true
bool flag1 = v.flag1(); // would be true
bool flag2 = v.flag2(); // would be false
bool flag3 = v.flag3(); // would be false 


Important note: I want data of this type to be treated like a basic type or struct... (i.e. a value... specifically an unsigned long) when it is passed on the stack so that any data created locally will still be usable when returned from a function.

I am familiar with operator overloading and all that but I am not familiar with which operators can be overloaded in which data type. (i.e. not sure which operators can be overloaded in a struct for example).

I just want to know what the best data type would be. A class? A struct? Any other options? And possibly any tips on which operators I can overload if I need to use a struct.
If you want this, there is an easy way and a hard way. The easy way is like this:
struct DataType {
    unsigned char flag;
    unsigned long data;
};

That will have a similar effect to what you want. The other way is to use a class/struct of some kind. Note that the only difference between a class and a struct is the default access privilege (public vs private). As an example:
#include <cassert>
#include <cstdint>

class DataType {
    public:
        // you may want to split these into flags and data
        DataType(std::uint32_t data) : _data(data) {}
        void set(std::uint32_t data) { _data = data; }

        // low 28 bits: the value with the 4 flag bits masked off
        std::uint32_t get() const { return _data & 0x0FFFFFFF; }
        operator std::uint32_t() const { return get(); }

        // flags 0..3 live in bits 28..31
        bool flag(unsigned num) const {
            assert(num < 4); // make sure flag isn't out of range
            return (_data >> (28 + num)) & 0x1u;
        }

    private:
        std::uint32_t _data; // the data, includes flags as well
};

I'm not sure if I got that entirely right, but you should get the idea. In case you were wondering, the 'operator uint32_t' business is a conversion operator: it is what gets called when the variable is cast (or implicitly converted) to an unsigned integer.
Also FYI, unsigned long is not necessarily 64 bits wide; the standard only guarantees at least 32. (A 16-digit hex constant like the one in your example needs all 64 bits.)

If you need 64-bit width, consider unsigned long long, or uint64_t.
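To tie the width remark back to the original question, here is a minimal sketch of the same wrapper idea on a fixed 64-bit type, assuming std::uint64_t is available on the target. The name Flagged64 and its members are made up for illustration and do not come from any post above.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical 64-bit wrapper: the top 4 bits are flags, the low 60 bits are the value.
class Flagged64 {
public:
    constexpr Flagged64(std::uint64_t raw = 0) : raw_(raw) {}

    // The value with the flag bits masked off.
    constexpr std::uint64_t value() const { return raw_ & value_mask; }
    constexpr operator std::uint64_t() const { return value(); }

    // Flag n lives in bit 60 + n; n must be in the range 0..3.
    constexpr bool flag(unsigned n) const { return ((raw_ >> (60 + n)) & 1u) != 0; }

private:
    static constexpr std::uint64_t value_mask = 0x0FFFFFFFFFFFFFFFull;
    std::uint64_t raw_; // value and flags packed together
};

// Copying a Flagged64 is a plain member-wise copy, as for a built-in integer.
static_assert(std::is_trivially_copyable<Flagged64>::value,
              "Flagged64 should copy like a plain value");
```

With this sketch, Flagged64 v = 0x3FFF0FFFFFFFFFFF; converts to 0x0FFF0FFFFFFFFFFF, and v.flag(0) and v.flag(1) are true while v.flag(2) and v.flag(3) are false, which matches the behaviour asked for in the first post.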
The only difference between a struct and a class is that the members of a struct are public by default and the members of a class are private by default.

You can overload the cast operators:
struct Data
{
  int my_value;

  // top 8 bits: the flags
  int      flag( void ) const { return ( my_value & 0xFF000000 ) >> 24; }

  // low 24 bits: what a Data looks like when converted to int
  operator  int( void ) const { return my_value & 0x00FFFFFF; }
};

void func( int x )  { /* blah */ }

int main()
{
  Data d = { 0x12345678 };
  func( d );  // does something with 0x00345678
}


But now you have a whole lot of messing with bits to change the Data.

Better to use bit-fields.

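struct Data
{
  unsigned flag1 : 1;  // says "flag1 uses one bit of an unsigned int"
  unsigned flag2 : 1;  // (unsigned, so a set flag reads back as 1 rather than -1)
  unsigned flag3 : 1;
  unsigned flag4 : 1;
  unsigned value : 28;
};

int main()
{
  Data d = {};
  d.value = 1234;
}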
LOL NT3, I didn’t need you to write a class for me.

A lot of people say that the only difference between a class and a struct is [insert comment here]
But usually what they say is not true when it comes to memory management (unless things have changed in recent years…)

Consider the following example:
#include "MyClass.h"
MyClass returnLocalInstance(/*some data*/)
{
	MyClass localInstance(/*some data*/);
	return localInstance;
}
 


It has been a long time since I tried something like this but when I first learned C++, trying to access any data returned by the above code would more than likely result in either an error or garbage data because the returned instance is created on the stack rather than the heap and NOT copied upon return. As far as I understand this is different from a struct because the struct is actually copied in full.

As I said, I want the data COPIED in memory when assigning from one variable to another as would be the case with a struct. But I also would prefer not to have the data visible by publicly accessing a member variable. Basically I want the transparency offered by a class but I also want the data copied upon return.

EDIT: Thanks LowestOne for the info on cast operators. And... even better the ability to use the bit specifications. That helps. So will the compiler automatically resolve that to use 32 consecutive bits of memory? essentially a single int?
> unless things have changed in recent years…
yes, a little more than fifteen years ago (but I think that your claim is just plain wrong, and that was never the case)
> I want the data COPIED in memory when assigning from one variable to another as would be the case with a struct.
> But I also would prefer not to have the data visible by publicly accessing a member variable.
> Basically I want the transparency offered by a class but I also want the data copied upon return.

In short, a light-weight CopyAssignable class having logically visible, but publicly inaccessible state.

Taking a somewhat more generalised approach:

#include <iostream>
#include <type_traits>
#include <limits>
#include <bitset>

template < std::size_t NFLAG_BITS, typename UINT_TYPE = unsigned long long >
struct flagged_unsigned
{
    static_assert( std::is_same<UINT_TYPE,unsigned long>::value ||
                   std::is_same<UINT_TYPE,unsigned long long>::value,
                   "unsupported integer type" ) ;

    static constexpr std::size_t NBITS = std::numeric_limits<UINT_TYPE>::digits ;
    static_assert( NFLAG_BITS < NBITS, "too many flag bits" ) ;

    static constexpr std::size_t NVALUE_BITS = NBITS - NFLAG_BITS ;
    static constexpr UINT_TYPE FLAGS_MASK = UINT_TYPE(-1ULL) << NVALUE_BITS ;
    static constexpr UINT_TYPE VALUE_MASK = UINT_TYPE(-1ULL) >> NFLAG_BITS ;
    static constexpr bool is_unsigned_long = std::is_same< UINT_TYPE, unsigned long >::value  ;

    void operator& () const = delete ;
    UINT_TYPE to_uint_type() const { return is_unsigned_long ? bits.to_ulong() : bits.to_ullong() ; }

    flagged_unsigned( UINT_TYPE value = 0, std::bitset<NFLAG_BITS> flags = {} )
    {
        bits = value ;
        bits |= ( UINT_TYPE( flags.to_ullong() ) << NVALUE_BITS ) ;
    }

    template < std::size_t N > bool flag() const
    {
        static_assert( N < NFLAG_BITS, "no flag with this id" ) ;
        return bits[ NBITS - 1 - N ] ;
    }

    template < std::size_t N > void flag( bool b )
    {
        static_assert( N < NFLAG_BITS, "no flag with this id" ) ;
        bits[ NBITS - 1 - N ] = b ;
    }

    std::bitset<NFLAG_BITS> flags() const
    { return ( to_uint_type() & FLAGS_MASK ) >> NVALUE_BITS ; }

    operator UINT_TYPE () const { return to_uint_type() & VALUE_MASK ; }

    flagged_unsigned& operator= ( UINT_TYPE value )
    {
        bits = ( to_uint_type() & FLAGS_MASK ) | ( value & VALUE_MASK ) ;
        return *this ;
    }

    // TODO: overload other integer operations on lvalues += etc.

    private: std::bitset<NBITS> bits ;
};

template < std::size_t NFLAG_BITS, typename UINT_TYPE >
std::ostream& operator<< ( std::ostream& stm, flagged_unsigned<NFLAG_BITS,UINT_TYPE> v )
{ return stm << "( flags: " << v.flags() << " value: " << UINT_TYPE(v) << " )" ; }

int main()
{
    flagged_unsigned<4> a = { 0x123456789abcdef, 3 } ;
    std::cout << std::hex << std::showbase << a << '\n' // ( flags: 0011 value: 0x123456789abcdef )
              << std::boolalpha << a.flag<1>() << '\n' // false
              << a.flag<2>() << '\n' // true
              << a.flags() << '\n' ; // 0011

    a.flag<0>(true) ;
    std::cout << a << '\n' ; // ( flags: 1011 value: 0x123456789abcdef )

    a = 0x1a2b3c4d5e6f ;
    std::cout << a << '\n' ; // ( flags: 1011 value: 0x1a2b3c4d5e6f )

    unsigned long long b = a ;
    std::cout << b << '\n' ; // 0x1a2b3c4d5e6f

    a = a + 4095 ;
    std::cout << a << '\n' ; // ( flags: 1011 value: 0x1a2b3c4d6e6e )
}

http://coliru.stacked-crooked.com/a/47169fe454715fb6
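The TODO in the listing above mentions compound operations such as +=. One possible shape, sketched on a simplified non-template stand-in rather than on flagged_unsigned itself (mini_flagged and its members are hypothetical names): do the arithmetic on the value part only, then recombine the result with the untouched flag bits. Whether overflow of the value part should wrap silently, as it does here, is a design decision the thread doesn't settle.

```cpp
#include <cstdint>

// Hypothetical simplified stand-in: 4 flag bits on top of a 60-bit value.
struct mini_flagged {
    static constexpr std::uint64_t value_mask = ~std::uint64_t(0) >> 4;

    std::uint64_t raw = 0; // flags in bits 60..63, value in bits 0..59

    std::uint64_t value() const { return raw & value_mask; }
    std::uint64_t flags() const { return raw >> 60; }

    // Add to the value part only; the flag bits are carried over unchanged.
    // Overflow out of the low 60 bits is discarded (wraps modulo 2^60).
    mini_flagged& operator+=(std::uint64_t n) {
        raw = (raw & ~value_mask) | ((value() + n) & value_mask);
        return *this;
    }
};
```

The same pattern (mask, compute, recombine) would apply to -=, *= and the other compound operators.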
> So will the compiler automatically resolve that to use 32 consecutive bits of memory?


Under the assumption an int is 32 bits, yes. However, the struct will be padded in the case of using less.

struct Thing
{
  int a : 20;
  int b : 20;
};


I believe that Thing will be 64 bits large. I don't think there is a promise that a is the first 20 bits. It might have 12 bits of padding before it, or it might even be somewhere else. Maybe b starts at the 20th bit and there are 24 bits of padding at the end of the struct. Who knows?



> But usually what they say is not true when it comes to memory management (unless things have changed in recent years…)

Things have not changed with regards to the code you illustrated with, and your claim is as false pre-standard as it is with the current standard.
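To make that concrete, here is a minimal sketch (the class Counter and the function make_counter are invented for illustration): a class with private state is returned by value exactly as a struct would be. The caller receives its own object, so nothing refers to the dead local.

```cpp
#include <type_traits>

// Hypothetical class with private state; 'class' vs 'struct' changes only default access.
class Counter {
public:
    explicit Counter(unsigned long start) : count_(start) {}
    void bump() { ++count_; }
    unsigned long count() const { return count_; }
private:
    unsigned long count_;
};

// Returns a local object by value. The caller gets a copy
// (or the copy is elided); no reference to the local escapes.
Counter make_counter(unsigned long start) {
    Counter local(start);
    local.bump();
    return local;
}

// Copying a Counter is a plain member-wise copy, just as for a struct.
static_assert(std::is_trivially_copyable<Counter>::value,
              "Counter copies like a plain value");
```

Calling make_counter(41) yields an object whose count() is 42, and copying that object gives an independent second Counter. Garbage data of the kind described would be expected when a pointer or reference to the local is returned instead of the object itself.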
If you want to guarantee 32-bits, use std::int32_t or std::uint32_t

http://en.cppreference.com/w/cpp/header/cstdint
> So will the compiler automatically resolve that to use 32 consecutive bits of memory? essentially a single int?

With bit-fields, nothing is guaranteed. In our code, we specify the value representation of a bit-field; the object representation of the bit-field is implementation-defined, except for this special case:
An unnamed bit-field with a width of zero specifies alignment of the next bit-field at an allocation unit boundary.


For instance, with 8-bit bytes and 32-bit integers, the object representation of
struct Data
{
  int flag1 : 1;  // says "flag1 uses one bit of an int"
  int flag2 : 1;
  int flag3 : 1;
  int flag4 : 1;
  int value : 28;
};

may be 32 bits on some implementations (where bit-fields straddle allocation units) and 64 bits on others (where bit-fields do not straddle allocation units). An implementation which uses 160 bits would be rare, but conforming.
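Since the layout is implementation-defined, the practical approach is to inspect it on the target compiler rather than assume it. A small sketch, assuming unsigned int has at least 32 bits (Packed, Split and the two helper functions are illustrative names): the values stored in bit-fields round-trip portably, while the sizes are whatever the implementation chooses.

```cpp
#include <climits>
#include <cstddef>

// Four 1-bit flags plus a 28-bit value: 32 bits of value representation.
struct Packed {
    unsigned flag1 : 1;
    unsigned flag2 : 1;
    unsigned flag3 : 1;
    unsigned flag4 : 1;
    unsigned value : 28;
};

// The unnamed zero-width bit-field asks for 'b' to start at an allocation unit boundary.
struct Split {
    unsigned a : 4;
    unsigned   : 0;
    unsigned b : 4;
};

// Size in bits as chosen by this implementation (not fixed by the standard).
constexpr std::size_t bits_of_packed() { return sizeof(Packed) * CHAR_BIT; }
constexpr std::size_t bits_of_split()  { return sizeof(Split)  * CHAR_BIT; }

// The one thing relied on here: the object must be able to hold all 32 declared bits.
static_assert(sizeof(Packed) * CHAR_BIT >= 32, "Packed must hold its declared bits");
```

Printing bits_of_packed() and bits_of_split() on each compiler of interest shows what that implementation actually does. If the exact size matters, a static_assert on sizeof turns a silent layout change into a compile error.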
I looked up the convention you suggested before, LowestOne, and decided to use it. Thanks. As a result I haven't really looked at this thread in a week or so.

Thanks for everyone's help but I would like to clear something up if possible:

> Things have not changed with regards to the code you illustrated with, and your claim is as false pre-standard as it is with the current standard.


Hmm... well, if that is not the case (if my statement about the past is false), I wish someone could explain to me why any object returned from that function was always inaccessible or, more often, contained garbage data. Whenever I debugged a program that used that convention when I started programming, the data would not be the data I had assigned to the variable within the function, presumably because other stuff on the stack had already overwritten that memory. So I started using pointers allocated with the new keyword instead. This caused a lot of frustration when I was first starting with C++. I know this may be difficult to diagnose without actual examples from when I had the problems, but it was a good while ago and I changed the code anyway.

Also, why do older games that have released their source code (such as Civ4) always use pointers or reference types instead of straight objects? I would accept saving time as a reason were it not for my own experiences.

FYI my programming background: I started with C in 1995 and C++ back in 1997 for small things. Around 2003-2004 I started using C# and used it for quite a bit of time in most of my projects because it was "easier" and quicker to get things done. In 2006-2007 I went back to C++ for a project I developed with a team (a mod for Civ4:Warlords using their publicly released SDK).

Recently I have been using C++ again because I am working on a new, very large, project that makes use of Qt (for its cross-platform GUI) and because based on my experience with programs developed with different languages, C++ is faster at the kind of thing I am currently working on.

One big question about "references" has arisen from my working with C#. C# hides the difference between references, pointers, and objects because it treats what it calls "references" similarly to pointers (actually garbage collected pointers) and all object variables are actually of this type. (You can assign null to a reference in C#). (In the case of unsafe code, an actual pointer can be used but the object must explicitly be kept from being moved by the memory manager). In moving back to C++ it is clear that there is a gap in my understanding in what it calls a reference. C++ treats references more similarly to objects themselves. So what is the difference between what is returned from the following functions?

MyClass& getObjectReference();
MyClass getObject();


Perhaps I should put this question in another thread?
> Also, why do older games that have released their source code (such as Civ4) always use pointers or reference types instead of straight objects?
To avoid the copy dilemma.

There are two types of copy:

Shallow and Deep

A shallow copy copies the pointer instead of the object pointed to. Very dangerous when it comes to freeing the corresponding memory, since two owners may both try to free it.
A deep copy requires additional operations (a copy constructor and operator=(...)) in order to copy the content of the objects pointed to. Slower but safer.

With C++11 smart pointers such as std::shared_ptr, copying is safe and relatively fast thanks to reference counting.
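A rough sketch of the three behaviours, using invented type names (Shallow, Deep, Shared) and std::string as the pointed-to object. It is only meant to show what each kind of copy does to the pointee, not how any particular game's code base is organised.

```cpp
#include <memory>
#include <string>

// Shallow: copying the struct copies only the pointer; both copies see one string.
// Nobody clearly owns it, which is what makes freeing it dangerous.
struct Shallow {
    std::string* text;
};

// Deep: the copy operations duplicate the pointed-to string, so each copy owns its own.
struct Deep {
    std::unique_ptr<std::string> text;

    explicit Deep(const std::string& s) : text(new std::string(s)) {}
    Deep(const Deep& other) : text(new std::string(*other.text)) {}
    Deep& operator=(const Deep& other) {
        *text = *other.text; // copy the content, keep our own allocation
        return *this;
    }
};

// Shared: copies share one string; the reference count frees it
// once the last owner has gone away.
struct Shared {
    std::shared_ptr<std::string> text;
};
```

Modifying the string through a copy of a Shallow changes what the original sees, while a copy of a Deep can be modified without affecting the original. For Shared, text.use_count() goes up by one for each live copy and back down when a copy is destroyed.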
> So what is the difference between what is returned from the following functions?
> MyClass& getObjectReference();
> MyClass getObject();


With MyClass getObject(); the expression getObject() yields a prvalue (a pure rvalue).
This prvalue identifies a temporary object of type MyClass.

With MyClass& getObjectReference(); the expression getObjectReference() yields an lvalue.
This lvalue identifies a non-temporary object of type MyClass.

#include <string>

struct MyClass
{
    std::string str ;
};

MyClass foo()
{
    MyClass object = { "hello" } ;
    return object ; // this is fine
    // lifetime of object is over once the block is exited.
    // but we are not returning the object,
    // we are returning a prvalue (an anonymous temporary) initialised with object
    // we are semantically returning a temporary copy of the object
    // the object must be copyable, even though compilers are permitted to 'elide' this copy
}

MyClass& bar()
{
    MyClass object = { "hello" } ;
    return object ; // *** this is a serious logical error
    // lifetime of object is over once the block is exited.
    // we are returning an alias of an object that no longer exists.
    // a compiler will typically warn us about the problem
    // *** warning: reference to local variable returned
}

MyClass& baz()
{
    static MyClass object = { "hello" } ;
    return object ; // this is fine
    // lifetime of the object extends beyond the return
    // we are returning an alias (an lvalue) of this non-temporary object
}
Thanks JLBorges (and coder). What does 'elide' mean though?

Now I also understand why I have had different experiences with structs than with classes. I have never used structs to hold anything other than pointers and values (all basic or typedef'd types) because I never really saw a point in doing so. In my mind, structs have always been for sending memory blocks of a limited size to and from functions. So, thinking back, I realize I have never created a struct that contained a string (or any other class instance) itself, only a pointer to one. So I never would have returned a struct containing a class instance from a function.
> What does 'elide' mean though?

In English, 'elide' means 'omit'; primarily 'omit to pronounce (a sound or syllable) when speaking'.

In C++, 'copy elision' is used as a technical term meaning 'optimising away gratuitous copying/moving of objects'.

A brief explanation, and link to the cppreference page:
http://www.cplusplus.com/forum/general/141939/#msg749400
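One way to watch elision happen is to count constructor calls. A small sketch with a made-up helper type (Tracer, make_unnamed and make_named are illustrative names only); the exact number of moves you observe depends on the compiler, the language standard and the optimisation flags.

```cpp
// Hypothetical helper type that counts how often it is copied or moved.
struct Tracer {
    static int copies;
    static int moves;

    Tracer() = default;
    Tracer(const Tracer&) { ++copies; }
    Tracer(Tracer&&) noexcept { ++moves; }
};
int Tracer::copies = 0;
int Tracer::moves = 0;

// Returns an unnamed temporary. Since C++17 no copy or move happens here;
// before C++17 compilers were permitted to elide it and usually did.
Tracer make_unnamed() { return Tracer(); }

// Returns a named local. Eliding this copy/move (NRVO) is permitted but optional;
// if it is not elided, the move constructor is used rather than the copy constructor.
Tracer make_named() {
    Tracer t;
    return t;
}
```

After Tracer a = make_unnamed(); and Tracer b = make_named();, Tracer::copies stays at zero, because any return that is not elided falls back to the move constructor. Tracer::moves shows how many returns were not elided.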