Can I create a custom data type? (Not Classes)

Hello there!

I was wondering if there is any way to create a custom data type. I don't mean classes, I mean a 3 bit data type that holds values from 0 to 7.

I read somewhere to use __asm__, but I didn't quite get that.

In case anyone is wondering, this is for a project where an existing 1000 x 1000 array, which contains only 8 distinct values (they are very big, hence stored as float), has to be converted into a 1000 x 1000 array holding the values 0 to 7.
Rather than using a short int, which takes 16 bits (2 bytes), is there any way I could create a new data type called smallDataType that has only 3 bits?

Or do compilers not work that way?
Neither hardware nor compilers "work that way". You either use what exists, or you create a class.

The hardware does not operate on 3-bit chunks, but you can wrap the entire 3'000'000 bit array into a class that makes it look like you had 1'000'000 three-bit values.

Look at http://www.cplusplus.com/reference/bitset/bitset/
You could use std::bitset in your class ...
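
As a rough illustration of that idea, here is a minimal sketch of a wrapper around std::bitset. The class name Grid3Bit, its get/set interface, and the row-major layout are assumptions made for this example, not something from the thread. The bitset is heap-allocated because 3,000,000 bits (about 375 KB) is a lot to put on the stack.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <memory>

// Dimensions taken from the question: a 1000 x 1000 grid of 3-bit values.
constexpr std::size_t kRows = 1000;
constexpr std::size_t kCols = 1000;

// Hypothetical wrapper: makes one large bitset look like a grid of values 0..7.
class Grid3Bit
{
public:
  typedef std::bitset<kRows * kCols * 3> bits_type;

  Grid3Bit() : bits_( new bits_type() ) { }   // all bits start at zero

  unsigned get( std::size_t row, std::size_t col ) const
  {
    const std::size_t base = offset( row, col );
    // test() throws std::out_of_range for a bad position
    return  unsigned( bits_->test( base ) )
         | (unsigned( bits_->test( base + 1 ) ) << 1)
         | (unsigned( bits_->test( base + 2 ) ) << 2);
  }

  void set( std::size_t row, std::size_t col, unsigned value )
  {
    assert( value < 8 );
    const std::size_t base = offset( row, col );
    bits_->set( base,     (value & 1u) != 0 );
    bits_->set( base + 1, (value & 2u) != 0 );
    bits_->set( base + 2, (value & 4u) != 0 );
  }

private:
  // Row-major layout: each cell owns three consecutive bits.
  static std::size_t offset( std::size_t row, std::size_t col )
  {
    return (row * kCols + col) * 3;
  }

  std::unique_ptr<bits_type> bits_;
};
```

Calling set( 0, 0, 7 ) and then get( 0, 0 ) gives back 7. Each access touches three individual bits, so this is compact rather than fast.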
Computers don't work that way: the smallest addressable unit is a byte (typically eight bits).

However, since you are storing them in a collection, you have a friend. Make a 3-bit collection class. You only need three bytes to store exactly eight three-bit fields.

byte:  000000001111111122222222
field: 000111222333444555666777

This makes for a very convenient storage.

#include <stdexcept>
#include <vector>

struct vec3bits
{
  typedef unsigned char byte_type;
  typedef std::vector <byte_type> bytes_type;
  typedef bytes_type::size_type size_type;

  // n is the number of triplets; round the bit count up to whole bytes
  vec3bits( size_type n = 0 ): _bytes( (n * 3 + 7) / 8 ) { }

  size_type size() const
  {
    // Returns the maximum number of triplets that fit in our array
    return _bytes.size() * 8 / 3;
  }

  byte_type get( size_type index ) const
  {
    // Validate the index: the whole triplet must lie inside the array
    if (index >= size()) throw std::range_error( "vec3bits" );

    // Calculate our offsets into the array
    auto byte_index = index * 3 / 8;
    auto shift_offset = index * 3 % 8;

    // Now, our triplet may span more than one byte, so we will grab two bytes as an int
    // (the second byte only if it exists)
    unsigned word = _bytes[ byte_index ];
    if (byte_index + 1 < _bytes.size()) word |= _bytes[ byte_index + 1 ] << 8;

    // And finally, we can shift down and mask out our three bits
    return (word >> shift_offset) & 7;
  }

  void set( size_type index, byte_type value )
  {
    // I'll leave this as an exercise for you
  }

private:
  bytes_type _bytes;
};
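
For the set() left as an exercise above, here is one possible sketch, written as free functions so it stands on its own. The names get3 and set3 are made up for this example. It assumes the same layout as get(): triplet i starts at bit i * 3, lowest bit first.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: read the triplet at `index` (same logic as get() above).
unsigned get3( const std::vector<unsigned char>& bytes, std::size_t index )
{
  assert( index * 3 + 3 <= bytes.size() * 8 );
  const std::size_t byte_index = index * 3 / 8;
  const unsigned shift = static_cast<unsigned>( index * 3 % 8 );

  unsigned word = bytes[ byte_index ];
  if (byte_index + 1 < bytes.size()) word |= unsigned( bytes[ byte_index + 1 ] ) << 8;
  return (word >> shift) & 7u;
}

// Hypothetical helper: overwrite the triplet at `index` with `value` (0..7).
void set3( std::vector<unsigned char>& bytes, std::size_t index, unsigned value )
{
  assert( value < 8 );
  assert( index * 3 + 3 <= bytes.size() * 8 );
  const std::size_t byte_index = index * 3 / 8;
  const unsigned shift = static_cast<unsigned>( index * 3 % 8 );

  // Load the one or two bytes the triplet may touch into a 16-bit window
  const bool has_next = byte_index + 1 < bytes.size();
  unsigned word = bytes[ byte_index ];
  if (has_next) word |= unsigned( bytes[ byte_index + 1 ] ) << 8;

  // Clear the old three bits, then OR in the new ones
  word = (word & ~(7u << shift)) | (value << shift);

  // Store the window back
  bytes[ byte_index ] = static_cast<unsigned char>( word & 0xFFu );
  if (has_next) bytes[ byte_index + 1 ] = static_cast<unsigned char>( (word >> 8) & 0xFFu );
}
```

The read-modify-write is what makes 3-bit packing slower than plain bytes: every store has to load, mask, and write back up to two bytes.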

Now you have a type that can be used in your matrix:

struct mat3bits
{
  typedef std::vector <vec3bits> matrix_type;
  typedef matrix_type::size_type size_type;
  typedef vec3bits::byte_type byte_type;
  
  mat3bits() { }
  mat3bits( size_type rows, size_type columns ): _m( rows, vec3bits( columns ) ) { }

  size_type size() const { return _m.size(); }

  byte_type get( size_type row, size_type column ) const
  {
    return _m[ row ].get( column );
  }

  void set( size_type row, size_type column, byte_type value )
  {
    _m[ row ].set( column, value );
  }

private:
  matrix_type _m;
};

Well, that should get you started.

Disclaimer: I just typed this all in off the top of my head. Typos and other stupidities may have occurred.
Thank you very much. This is exactly what I required!
> create a new data type called smallDataType which has only 3 bits?

Strongly consider using 4 bits instead of 3 bits.

Though this would use a little more storage, it would lead to code that is not only cleaner but also more efficient: in every implementation currently in use, CHAR_BIT == 8 (a byte is an octet), so two 4-bit values fill a byte exactly and no value ever straddles a byte boundary.
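
To make the comparison concrete, here is a minimal sketch of the 4-bit variant. The class name NibbleArray and its interface are assumptions for illustration only. Even indices use the low nibble of a byte and odd indices the high nibble.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container: two 4-bit values per byte.
class NibbleArray
{
public:
  explicit NibbleArray( std::size_t n ) : size_( n ), bytes_( (n + 1) / 2 ) { }

  std::size_t size() const { return size_; }

  unsigned get( std::size_t i ) const
  {
    assert( i < size_ );
    const unsigned b = bytes_[ i / 2 ];
    return (i % 2) ? (b >> 4) : (b & 0x0Fu);   // odd: high nibble, even: low nibble
  }

  void set( std::size_t i, unsigned value )
  {
    assert( i < size_ && value < 16 );
    unsigned char& b = bytes_[ i / 2 ];
    if (i % 2) b = static_cast<unsigned char>( (b & 0x0Fu) | (value << 4) );
    else       b = static_cast<unsigned char>( (b & 0xF0u) | value );
  }

private:
  std::size_t size_;
  std::vector<unsigned char> bytes_;
};
```

For 1,000,000 values this takes 500,000 bytes instead of the 375,000 needed with 3 bits, but every access touches exactly one byte.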
Pardon me if this was already said, not reading it all right this sec.
Assembly on some systems can address and operate on 1/2 a byte. On those systems, embedded assembly routines can expose this to your program.

__asm and similar keywords are platform-specific ways to embed assembly language in C++ code. This is not portable and usually overkill. Assembly is a lot of work, so only very performance-critical code needs it, which is rare these days now that CPUs are so fast. It was more common long ago.

In memory, you should probably work on whole bytes, usually in power-of-two sizes, that is 1-, 2-, 4-, or 8-byte integers. However, if you know that your data uses only 3 bits of each byte, you can compact it when writing a file or an Ethernet packet to save space or bandwidth. Doing this on live data saves RAM at a terribly high performance cost. It is the usual time/space tradeoff, but the time cost is too high in this case unless you are on a wristwatch or something similar with very little memory to work with.

You can do some funky stuff with typedef, bit-fields, booleans, and more in the bit realm on some systems, but for the most part this isn't practical. Usually all this lets you do is pack several small types into less real space, and again, it comes at a cost.
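
As an example of the bit-field approach, here is a small sketch. The struct name Packed8 and its member names are made up for this example. How the compiler lays out and pads bit-fields is implementation-defined, so the struct is not guaranteed to occupy exactly three bytes.

```cpp
#include <cassert>

// Hypothetical struct: eight 3-bit bit-fields, each holding a value 0..7.
// sizeof(Packed8) is implementation-defined (often 4 with a 32-bit unsigned).
struct Packed8
{
  unsigned v0 : 3;
  unsigned v1 : 3;
  unsigned v2 : 3;
  unsigned v3 : 3;
  unsigned v4 : 3;
  unsigned v5 : 3;
  unsigned v6 : 3;
  unsigned v7 : 3;
};

// Bit-fields cannot be indexed or pointed to, so picking one at run time
// needs a switch (or similar), which is part of the cost mentioned above.
inline unsigned get_field( const Packed8& p, unsigned i )
{
  assert( i < 8 );
  switch (i)
  {
    case 0: return p.v0;
    case 1: return p.v1;
    case 2: return p.v2;
    case 3: return p.v3;
    case 4: return p.v4;
    case 5: return p.v5;
    case 6: return p.v6;
    default: return p.v7;
  }
}
```

The compiler generates the shifting and masking for you, but the lack of indexing makes this awkward for a large array.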
Heh, I let my save-every-ounce tendency take over. I agree with the others here: use nibbles and waste a bit. Life is so much easier and faster that way.