Compilers, bit operations, and data alignment

Let's say I have some struct compiled for a 64-bit processor:

struct MyStruct
{
	__uint32 d1;
	__int32 d2;
	__uint64 d3;
};


I know that sizeof(MyStruct) will give me 16 bytes. When it comes to memory alignment however, does it actually take 16 bytes? I have always assumed such. If so, I also assume it builds in the bitwise operations necessary to convert the alignment. For example, assuming d1 is stored in the higher order 32 bits, to read d1, the operation must essentially perform this (inlined):

__uint64 d1_equivalent = 0xFFFFFFFF00000000 & d1d2_MEMORY_LOCATION;
return (__uint32)(d1_equivalent >> 32);


Are these assumptions correct? Is it compiler dependent? Do I need to provide my own bitwise logic in order to guarantee that the above structure only takes 16 bytes?


> I know that sizeof(MyStruct) will give me 16 bytes.
Well, your first incorrect assumption is that all machines use 8-bit bytes.
https://stackoverflow.com/questions/32091992/is-char-bit-ever-8

> If so, I also assume it builds in the bitwise operations necessary to convert the alignment.
In the absence of bit-fields, the compiler adds padding between struct members (if necessary) to ensure that each member can be accessed with the appropriate aligned memory read instruction.

In your particular case, assuming the machine is capable of 32-bit memory accesses, there should be no padding in that structure.
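A minimal sketch of how that "no padding" claim could be checked at compile time. It assumes the standard fixed-width types from <cstdint> stand in for the __uint32/__int32/__uint64 names in the question; Layout and padding_bytes are hypothetical names. The expected offsets are assumptions about common desktop ABIs, not language guarantees.

```cpp
#include <cstddef>   // offsetof, std::size_t
#include <cstdint>   // std::uint32_t, std::int32_t, std::uint64_t

// Hypothetical stand-in for MyStruct, using standard fixed-width types.
struct Layout
{
    std::uint32_t d1;
    std::int32_t  d2;
    std::uint64_t d3;
};

// Padding = total size minus the sum of the member sizes.
constexpr std::size_t padding_bytes()
{
    return sizeof(Layout)
         - (sizeof(std::uint32_t) + sizeof(std::int32_t) + sizeof(std::uint64_t));
}

// Assumed layout for typical ABIs; compilation fails if the compiler
// places the members differently.
static_assert(offsetof(Layout, d1) == 0, "d1 should start the struct");
static_assert(offsetof(Layout, d2) == 4, "d2 should follow d1 directly");
static_assert(offsetof(Layout, d3) == 8, "d3 should start at offset 8");
static_assert(padding_bytes() == 0, "no padding expected in this layout");
```

If any of these assertions fires on your target, the compiler has inserted padding and the error message tells you where.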

> Do I need to provide my own bitwise logic in order to guarantee that the above structure only takes 16 bytes?
That depends on what you're asking.
If you've got 16 bytes (or perhaps more accurately, octets) from some external source (file, network packet, I/O device), then you actually do need to do your own bitwise logic if you want portable code.

Even if you can make the struct align perfectly with the external data (for your 'foo' compiler), endian issues cannot be overcome with simple structure rearrangement.
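A rough sketch of what that bitwise logic might look like. It assumes the external source sends d1, d2, d3 in that order, least significant byte first; both are assumptions for illustration, and the real format has to come from its specification. The names read_u32_le, read_u64_le, Record, and decode_record are made up here.

```cpp
#include <cassert>
#include <cstdint>

// Decode a 32-bit unsigned value stored least significant byte first.
// Works the same on little-endian and big-endian hosts.
inline std::uint32_t read_u32_le(const unsigned char* p)
{
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Decode a 64-bit unsigned value from two little-endian 32-bit halves.
inline std::uint64_t read_u64_le(const unsigned char* p)
{
    return  static_cast<std::uint64_t>(read_u32_le(p))
         | (static_cast<std::uint64_t>(read_u32_le(p + 4)) << 32);
}

// Hypothetical in-memory form of the 16-octet external record.
struct Record
{
    std::uint32_t d1;
    std::int32_t  d2;
    std::uint64_t d3;
};

// Assumed wire layout: d1 at octet 0, d2 at octet 4, d3 at octet 8.
inline Record decode_record(const unsigned char* p)
{
    assert(p != nullptr);
    Record r;
    r.d1 = read_u32_le(p);
    // Unsigned-to-signed wrap is guaranteed from C++20; earlier standards
    // leave it implementation-defined.
    r.d2 = static_cast<std::int32_t>(read_u32_le(p + 4));
    r.d3 = read_u64_le(p + 8);
    return r;
}
```

Because the decoder only ever indexes individual octets, it never depends on how the compiler pads Record or on the host's byte order.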

But if it's just an internal structure to your program, then you minimise wasted space by ordering the members from largest to smallest.
struct MyStruct
{
	__uint64 d3;
	__uint32 d1;
	__int32 d2;
};
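
To see the effect of member order, here is a small sketch with two hypothetical structs (Scattered and Ordered are not from the original post) that hold the same members. The exact sizes depend on the ABI, so the code only asserts that the size-ordered version is no larger.

```cpp
#include <cstddef>
#include <cstdint>

// Small members separated by a large one: the compiler usually has to
// insert padding before b and after c.
struct Scattered
{
    std::uint8_t  a;
    std::uint64_t b;
    std::uint8_t  c;
};

// Same members, largest first: the small members can share the tail.
struct Ordered
{
    std::uint64_t b;
    std::uint8_t  a;
    std::uint8_t  c;
};

// Bytes recovered by reordering on the current target.
constexpr std::size_t bytes_saved()
{
    return sizeof(Scattered) - sizeof(Ordered);
}

// Expected on mainstream ABIs, not guaranteed by the standard.
static_assert(sizeof(Ordered) <= sizeof(Scattered),
              "size-ordered layout should not be larger");
```

Printing sizeof(Scattered), sizeof(Ordered), and bytes_saved() on your own targets shows how much the reordering actually buys.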

Edit. I didn't see @salem c's post until after I posted this.

> I know that sizeof(MyStruct) will give me 16 bytes.
Yes, it would probably be 16 bytes. It's not guaranteed, however - we could conceive of a (probably stupid) implementation that would yield a larger struct.

> When it comes to memory alignment however, does it actually take 16 bytes?

Yes. Alignment is a property inherent to the address of an object; the alignment of an object never affects that object's size.

The important distinction is that the alignment requirements of a subobject can affect the alignment requirement and size of an object.

struct A { alignas(8) std::uint8_t x[1]; } a;

static_assert(sizeof(a) == 8);
static_assert(alignof(a) == 8);
static_assert(sizeof(a.x) == 1);
static_assert(alignof(a.x) == 8);


If we use the memory optimization tool pahole to examine the layout of A, we'll get the following. C++ comments are mine:
struct A {
        // x[1] starts at offset 0 and is 1 byte long
	uint8_t                    x[1];                 /*     0     1 */
        // The size of the structure is 8 bytes; it requires one cache-line
        // The structure has one member.
   	/* size: 8, cachelines: 1, members: 1 */
        // 7 total bytes of padding in the structure;
	/* padding: 7 */
        // The object requires only 8 bytes of its last cache-line 
	/* last cacheline: 8 bytes */
};

Pahole: https://linux.die.net/man/1/pahole

Note that the type A is effectively the same as std::aligned_storage_t<1, 8>.

> I also assume it builds in the bitwise operations necessary to convert the alignment.

I don't know what you mean. As I wrote above, alignment is inherent to the address of an object. Bitmath is not generally necessary (or even natural) to access an object that is naturally aligned.

> Is it compiler dependent?
Yes.

> Do I need to provide my own bitwise logic in order to guarantee that the above structure only takes 16 bytes?
If it's just a plain struct, make this static assertion:
 
static_assert(sizeof(MyStruct) == 16);


If you need to pack or unpack structures that are not naturally aligned, you can use vendor-specific language extensions, e.g., the GCC attribute [[gnu::packed]] to allow the compiler to misalign the members of a class, or use std::memcpy as I discussed before. Misaligned accesses are undefined behavior; be careful.
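
As a sketch of the std::memcpy route: the two helpers below (load_u32 and store_u32 are hypothetical names) copy a 32-bit value to or from any byte address, aligned or not, in the host's byte order. Compilers commonly turn these small fixed-size copies into a single load or store where the hardware allows it, though that is an optimization and not a guarantee.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>   // std::memcpy

// Read a 32-bit value (host byte order) from an address with no
// alignment requirement.
inline std::uint32_t load_u32(const unsigned char* p)
{
    assert(p != nullptr);
    std::uint32_t v;
    std::memcpy(&v, p, sizeof v);   // byte-wise copy, never a misaligned access
    return v;
}

// Write a 32-bit value (host byte order) to an address with no
// alignment requirement.
inline void store_u32(unsigned char* p, std::uint32_t v)
{
    assert(p != nullptr);
    std::memcpy(p, &v, sizeof v);
}
```

For example, store_u32(buf + 1, x) followed by load_u32(buf + 1) round-trips x even though buf + 1 is generally not 4-byte aligned.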

I suggest that you play around with pahole on Matt Godbolt's page:
https://godbolt.org/z/UqlG-G
> If so, I also assume it builds in the bitwise operations necessary to convert the alignment


I asked this question because I have not updated my understanding of assembly/machine language through the years. I first learned assembly on 8-bit machines (specifically the 6502 processor), so my question arises from not understanding the specifics of memory retrieval on modern machines/processors. For example, assuming A is a register, is the following load instruction (from memory) valid in assembly?

LDA 0xCA220501 // Note this is not aligned...

@salem c seems to imply that, at the very least, a register load instruction (from memory) can fall on a 32-bit boundary... so I suppose bitwise operations would not be necessary for int operations of any kind. Can a load instruction operate on any address?
The golden rule here is "it's compiler dependent."

> I know that sizeof(MyStruct) will give me 16 bytes.
Probably, but not necessarily. If you want to be sure, add an assertion to your code. If you ever find that it isn't 16 bytes, investigate #pragma pack (I think that's it).

primem0ver wrote:
When it comes to memory alignment however, does it actually take 16 bytes?
mbozzi wrote:
the alignment of an object never affects that object's size.

The alignment can affect the size of the object. To see why, just consider an array of the objects. For example:
struct s {
    double d;   // 8 bytes
    char ch;    // 1 byte
};

You might look at this and think "the size is 9 and it must be aligned on an 8-byte boundary." But then what about:
struct s sArray[2];
If the size is 9, then sArray[1] won't be properly aligned. The compiler can't just add the extra space when creating an array, because that would make it impossible to create a dynamic array in C:
struct s *dynamicArray = calloc(2, sizeof(struct s)); // oops, allocates 2*9=18 bytes - not enough for 2 s's

To solve these problems, the compiler pads the struct to 16 bytes.
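
A small sketch that checks this reasoning on whatever compiler you are using. The struct is the one above; element_stride and is_aligned_for_s are hypothetical helper names. The address check assumes that converting a pointer to std::uintptr_t yields a plain byte address, which holds on mainstream platforms but is not promised by the standard.

```cpp
#include <cassert>
#include <cstddef>   // std::ptrdiff_t
#include <cstdint>   // std::uintptr_t

struct s {
    double d;   // 8 bytes on typical platforms
    char ch;    // 1 byte
};

// The size is a whole multiple of the alignment, so every array element
// lands on a properly aligned address.
static_assert(sizeof(s) % alignof(s) == 0,
              "size must be a multiple of alignment");

// Distance in bytes between two consecutive array elements.
inline std::ptrdiff_t element_stride()
{
    s arr[2] = {};
    return reinterpret_cast<const char*>(&arr[1])
         - reinterpret_cast<const char*>(&arr[0]);
}

// True if p satisfies the alignment requirement of s.
inline bool is_aligned_for_s(const void* p)
{
    assert(p != nullptr);
    return reinterpret_cast<std::uintptr_t>(p) % alignof(s) == 0;
}
```

The stride always equals sizeof(s), which is why the padding has to live inside the struct instead of being added only when an array is declared.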

> If so, I also assume it builds in the bitwise operations necessary to convert the alignment.
No, most processors have instructions to read, write, and manipulate data of different sizes. For example, the x86 architecture has registers AL (8 bits), AX (16 bits), EAX (32 bits), and RAX (64 bits). So to read or write MyStruct::d1, it just executes a 32-bit load or store instruction.

You shouldn't need to provide your own bit-wise logic to guarantee the alignment that you want, but you may need some ifdefs around the declaration to handle the specifics of getting the right alignment for different compilers.
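
As one possible shape for those ifdefs, here is a sketch that requests a byte-packed layout on MSVC-style and GCC-style compilers. The macro names and WireRecord are invented for this example; the underlying mechanisms (`__pragma(pack(...))` on MSVC, `__attribute__((packed))` on GCC and Clang) are vendor extensions, so any other compiler needs its own branch.

```cpp
#include <cstddef>   // offsetof
#include <cstdint>

#if defined(_MSC_VER)
   // MSVC (and clang-cl): packing is controlled by a pragma around the type.
#  define PACKED_STRUCT_BEGIN __pragma(pack(push, 1))
#  define PACKED_STRUCT_END   __pragma(pack(pop))
#  define PACKED_ATTRIBUTE
#elif defined(__GNUC__)
   // GCC and Clang (Clang also defines __GNUC__): attribute on the type.
#  define PACKED_STRUCT_BEGIN
#  define PACKED_STRUCT_END
#  define PACKED_ATTRIBUTE __attribute__((packed))
#else
#  error "Add the packing syntax for this compiler"
#endif

PACKED_STRUCT_BEGIN
struct PACKED_ATTRIBUTE WireRecord
{
    std::uint8_t  tag;     // offset 0
    std::uint32_t value;   // offset 1 when packed, misaligned on purpose
};
PACKED_STRUCT_END

// Fails to compile if the packing request was not honoured.
static_assert(sizeof(WireRecord) == 5, "WireRecord should be packed");
static_assert(offsetof(WireRecord, value) == 1, "value should follow tag");
```

Reading record.value directly is fine because the compiler knows the member is packed; taking a pointer or reference to it and passing that elsewhere is where misaligned accesses can creep in.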
> assuming A is a register is the following load command (from memory) in assembly valid?
> LDA 0xCA220501 // Note this is not aligned...
That depends on your architecture. The possible outcomes are:
- It costs zero time, alignment doesn't matter.
- It costs 1 extra memory read. The processor reads 0xCA220500 and 0xCA220504, and does the bit-magic for you.
- It's an expensive trap into the OS to fix things up for you.
- You get a bus error, and your program dies.
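
To make the "bit-magic" in the second case concrete, here is a software sketch of how two aligned 32-bit reads can be combined into one unaligned 32-bit value. It assumes a little-endian machine and is only an illustration of the idea, not a description of any particular processor; combine_unaligned_le is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// lo     : the aligned word at the lower address (e.g. 0xCA220500)
// hi     : the next aligned word              (e.g. 0xCA220504)
// offset : how many bytes past the aligned address the read starts (0..3)
inline std::uint32_t combine_unaligned_le(std::uint32_t lo,
                                          std::uint32_t hi,
                                          unsigned offset)
{
    assert(offset < 4);
    if (offset == 0)
        return lo;                      // aligned after all; also avoids a 32-bit shift
    const unsigned shift = 8 * offset;  // bits to discard from the low word
    // Keep the upper bytes of lo, then fill the top with the lower bytes of hi.
    return (lo >> shift) | (hi << (32 - shift));
}
```

With memory bytes 00 11 22 33 44 55 66 77, the two aligned words are 0x33221100 and 0x77665544, and a read at offset 1 covers bytes 11 22 33 44, which this function returns as 0x44332211.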

> The alignment can affect the size of the object. To see why, just consider an array of the objects. For example:
I did choose my language carefully. The alignment of a particular object x has nothing to do with the size of the same object x. A one-byte char can be aligned on any 2^N-byte boundary, but it is always one byte in size.

But otherwise you're absolutely correct, I was just unclear.
mbozzi wrote:
the alignment requirements of a subobject can affect the alignment requirement and size of an object
For example, the size and alignment requirement of s is adjusted (via the insertion of padding) to satisfy the alignment requirement of its member subobjects.
Thanks everyone for the info. It was helpful. One last question though for salem c and dhayden (regarding architecture and defines). I am targeting the PC (Linux and Windows), Apple (MacOS), and tablet markets. How do I find out the details regarding architecture?