Alignment and Layout

2022-09-15

Memory layouts, size, and alignments

Alignment

If you ask a CPU to load a 32-bit value from memory, it will expect that value to be 32-bit aligned. That is, its address should be evenly divisible by 4 bytes (32 bits). Similarly, 16-bit values are 2-byte aligned, and 8-bit values can go anywhere.

On some 32-bit systems (such as 32-bit x86), 64-bit values only need to be 4-byte aligned, while on 64-bit systems they're 8-byte aligned. In general, the alignment required by CPUs rarely exceeds the size of pointers on the system.
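
A quick way to check these rules on your own machine is std::mem::align_of. The sketch below prints the alignment of a few primitives and checks one address by hand. The helper is_aligned is a made-up name for illustration, and the values for u64 and usize depend on the target, so no particular output is promised for them.

```rust
use std::mem::{align_of, size_of};

/// Returns true if `addr` is a multiple of `align`.
/// `align` must be a power of two, which all Rust alignments are.
fn is_aligned(addr: usize, align: usize) -> bool {
    (addr & (align - 1)) == 0
}

fn main() {
    println!("u8:    align {}", align_of::<u8>());
    println!("u16:   align {}", align_of::<u16>());
    println!("u32:   align {}", align_of::<u32>());
    // Target-dependent: typically 8 on 64-bit targets, 4 on 32-bit x86.
    println!("u64:   align {}", align_of::<u64>());
    println!("usize: size {}, align {}", size_of::<usize>(), align_of::<usize>());

    // The compiler places `x` at a suitably aligned address for us.
    let x: u32 = 7;
    let addr = &x as *const u32 as usize;
    println!("x is aligned: {}", is_aligned(addr, align_of::<u32>()));
}
```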

The consequences of trying to access a misaligned value depend on the CPU. On x86(-64), a misaligned access can be much slower because the initial read fails, forcing the CPU to perform two smaller reads and glue the results together. Other ISAs, like ARM, might raise processor exceptions, or even misbehave.
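
In Rust, dereferencing a misaligned pointer is undefined behaviour regardless of what the CPU would do, so code that has to read from an arbitrary offset should say so explicitly. A minimal sketch using std::ptr::read_unaligned follows; read_u32_at is a hypothetical helper, not something from the standard library.

```rust
use std::ptr;

/// Reads a u32 starting at `offset` in `bytes`, with no alignment requirement.
fn read_u32_at(bytes: &[u8], offset: usize) -> u32 {
    assert!(offset + 4 <= bytes.len(), "read would run past the buffer");
    // SAFETY: the bounds check above guarantees 4 readable bytes at `offset`,
    // and read_unaligned does not require the pointer to be aligned.
    unsafe { ptr::read_unaligned(bytes.as_ptr().add(offset) as *const u32) }
}

fn main() {
    let bytes = [0u8, 1, 2, 3, 4, 5, 6, 7];
    // Offset 1 is not a multiple of 4, so a plain `*ptr` read would be invalid here.
    let value = read_u32_at(&bytes, 1);
    println!("{:#010x}", value);
}
```

For plain integers, u32::from_ne_bytes on a 4-byte slice does the same job without unsafe.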

Primitive types (e.g. numbers, pointers) have alignments that can be inferred from the rules above. Compound types like a struct will have an alignment equal to the greatest alignment of its fields. E.g. a struct with a u8, an i16 and an f32 will have a 4-byte alignment due to the f32.
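
The struct rule can be checked the same way. In this sketch Mixed is a made-up struct with the three field types from the example; the comparison against f32 assumes a mainstream target where f32 has the largest alignment of the three.

```rust
use std::mem::align_of;

#[allow(dead_code)]
struct Mixed {
    a: u8,
    b: i16,
    c: f32,
}

fn main() {
    println!("u8  align = {}", align_of::<u8>());
    println!("i16 align = {}", align_of::<i16>());
    println!("f32 align = {}", align_of::<f32>());
    // The struct takes the greatest alignment of its fields.
    println!("Mixed align = {}", align_of::<Mixed>());
}
```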

If the sum of the sizes of a struct's fields is not a multiple of its alignment, the compiler adds padding bytes to round it up. This means that types can be larger than expected:

struct Example {
  a: u64,
  b: u8
}

has a size of 16 bytes because the struct has an alignment of 8, but a naïve size of 9: an additional 7 bytes of padding are required.
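
The rounding can be written out by hand and compared against what the compiler reports. round_up is a hypothetical helper; the 16-byte figure assumes u64 is 8-byte aligned, as on typical 64-bit targets.

```rust
use std::mem::{align_of, size_of};

#[allow(dead_code)]
struct Example {
    a: u64,
    b: u8,
}

/// Rounds `size` up to the next multiple of `align` (a power of two).
fn round_up(size: usize, align: usize) -> usize {
    (size + align - 1) & !(align - 1)
}

fn main() {
    let naive = size_of::<u64>() + size_of::<u8>(); // 8 + 1 = 9
    let align = align_of::<Example>();
    println!("naive size = {}, alignment = {}", naive, align);
    println!("rounded up = {}", round_up(naive, align));
    println!("size_of::<Example>() = {}", size_of::<Example>());
}
```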

Memory layouts

In C and C++, the layout of fields in a struct follows the exact order specified in the struct's definition.

#include <stdint.h>

struct Example {
  uint64_t a;
  uint8_t b;
  uint64_t c;
  uint8_t d;
};

has a size of 32 bytes, with 7 padding bytes after b and after d. C/C++'s strict adherence to declaration order means that field ordering is very important in these languages.
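
Rust can reproduce the C layout with #[repr(C)], which makes the cost of field order easy to measure. In this sketch ExampleC and ExampleSorted are hypothetical names, and the sizes in the comments assume u64 is 8-byte aligned.

```rust
use std::mem::size_of;

// Same field order as the C struct above.
#[allow(dead_code)]
#[repr(C)]
struct ExampleC {
    a: u64,
    b: u8, // followed by 7 bytes of padding so that c is aligned
    c: u64,
    d: u8, // followed by 7 bytes of padding to round the size up
}

// Same fields, reordered by hand so the two u8s share one padded slot.
#[allow(dead_code)]
#[repr(C)]
struct ExampleSorted {
    a: u64,
    c: u64,
    b: u8,
    d: u8, // followed by 6 bytes of padding
}

fn main() {
    println!("declaration order: {} bytes", size_of::<ExampleC>()); // 32
    println!("sorted by hand:    {} bytes", size_of::<ExampleSorted>()); // 24
}
```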

By default, the equivalent Rust struct

struct Example {
  a: u64,
  b: u8,
  c: u64,
  d: u8
}

could have its fields reordered in memory, e.g. moving the two u8 fields next to each other. This kind of flexibility is especially useful when it comes to generic values: given

struct Example2<T> {
  a: T,
  b: i32,
  c: T
}

then Example2<u8> might order the fields b, a, c; Example2<u32> could keep the fields in a, b, c order, and Example2<&str> might be a, c, b.
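
To see what the compiler actually chose, you can compute field offsets from addresses. This is only a sketch for poking around: offsets is a hypothetical helper, and because the default layout is unspecified, the numbers it prints may differ between compiler versions or builds.

```rust
use std::mem::size_of;

#[allow(dead_code)]
struct Example2<T> {
    a: T,
    b: i32,
    c: T,
}

/// Returns the byte offsets of fields (a, b, c) within `value`.
fn offsets<T>(value: &Example2<T>) -> (usize, usize, usize) {
    let base = value as *const Example2<T> as usize;
    (
        &value.a as *const T as usize - base,
        &value.b as *const i32 as usize - base,
        &value.c as *const T as usize - base,
    )
}

fn main() {
    let small = Example2 { a: 1u8, b: 2, c: 3u8 };
    let medium = Example2 { a: 1u32, b: 2, c: 3u32 };
    let large = Example2 { a: "x", b: 2, c: "y" };

    println!("Example2<u8>:   size {}, offsets (a, b, c) = {:?}",
             size_of::<Example2<u8>>(), offsets(&small));
    println!("Example2<u32>:  size {}, offsets (a, b, c) = {:?}",
             size_of::<Example2<u32>>(), offsets(&medium));
    println!("Example2<&str>: size {}, offsets (a, b, c) = {:?}",
             size_of::<Example2<&str>>(), offsets(&large));
}
```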

Rust lets you use the repr attribute to control field ordering behaviour. The default behaviour is #[repr(Rust)]. See the Nomicon for information on other reprs, including #[repr(transparent)] and the unsafe #[repr(packed)].
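
A small sketch of how three of these reprs affect size and alignment. InOrder, Packed and Meters are made-up types for illustration; see the Nomicon for the precise guarantees of each repr.

```rust
use std::mem::{align_of, size_of};

// repr(C): fields stay in declaration order, padded as C would pad them.
#[allow(dead_code)]
#[repr(C)]
struct InOrder {
    b: u8,
    a: u64,
}

// repr(packed): no padding at all, so `a` may sit at a misaligned address.
#[allow(dead_code)]
#[repr(C, packed)]
struct Packed {
    b: u8,
    a: u64,
}

// repr(transparent): same layout as the single field it wraps.
#[allow(dead_code)]
#[repr(transparent)]
struct Meters(f64);

fn main() {
    println!("InOrder: size {}, align {}", size_of::<InOrder>(), align_of::<InOrder>());
    println!("Packed:  size {}, align {}", size_of::<Packed>(), align_of::<Packed>());
    println!("Meters:  size {}, align {}", size_of::<Meters>(), align_of::<Meters>());

    let p = Packed { b: 1, a: 2 };
    // Copy the field out by value; taking a reference to a field that may be
    // misaligned is rejected by the compiler.
    let a = p.a;
    println!("p.a = {}", a);
}
```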

The layout of #[repr(Rust)] (the default repr) structs is not guaranteed, and should not be relied on. Different invocations of the same compiler, let alone different versions of the compiler, can result in different layouts. There is even a nightly compiler flag, -Z randomize-layout, that deliberately randomizes the order of the fields.

Alignment considerations are one reason why struct-of-arrays layouts can outperform array-of-structs:

struct Foo {
  a: u64,
  b: u8
}
let foo: [Foo; 20];

foo will take 320 bytes, while

struct Bar {
  a: [u64; 20],
  b: [u8; 20]
}
let bar: Bar;

bar only needs 184 bytes. (a = 160 bytes, b = 20 bytes, then another 4 for padding.)

In addition to just using less memory overall, denser data plays more nicely with CPU caches.
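
Both figures can be confirmed with size_of. The numbers in the comments assume u64 is 8-byte aligned, as on typical 64-bit targets.

```rust
use std::mem::size_of;

// Array-of-structs element: 9 bytes of data padded to 16.
#[allow(dead_code)]
struct Foo {
    a: u64,
    b: u8,
}

// Struct-of-arrays: padding is paid once, at the end of the struct.
#[allow(dead_code)]
struct Bar {
    a: [u64; 20],
    b: [u8; 20],
}

fn main() {
    println!("[Foo; 20]: {} bytes", size_of::<[Foo; 20]>()); // 20 * 16 = 320
    println!("Bar:       {} bytes", size_of::<Bar>()); // 160 + 20 + 4 = 184
}
```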

Enums and niche optimization

Naively, Option<T> should require at least size_of::<T>() + align_of::<T>() bytes: room for the value, plus a discriminant padded out to T's alignment. This is true for many types: Option<u16> has a size of 4 bytes. For some types, such as references, Vec, and String, there are particular bit patterns that are invalid values for these types. Rust is smart enough to use these invalid bit patterns to represent the None values for these types. In the case of references, since references must be non-null, Rust can use 0x00000000 to represent None. Vec and String also avoid all-zero values to enable this optimization.
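
A rough way to measure this is to compare size_of::<Option<T>>() with size_of::<T>(). option_overhead is a hypothetical helper; the zero overhead for references and NonZeroU32 is documented behaviour, while the results for Vec and String reflect what current compilers do in practice.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

/// Extra bytes that wrapping `T` in `Option` costs.
fn option_overhead<T>() -> usize {
    size_of::<Option<T>>() - size_of::<T>()
}

fn main() {
    // Every bit pattern of u16 is a valid value, so a separate tag is needed.
    println!("u16:        +{}", option_overhead::<u16>());
    // Null is never a valid reference, so None can reuse that bit pattern.
    println!("&u8:        +{}", option_overhead::<&u8>());
    // Zero is never a valid NonZeroU32.
    println!("NonZeroU32: +{}", option_overhead::<NonZeroU32>());
    // Vec and String hold a non-null pointer internally.
    println!("Vec<u8>:    +{}", option_overhead::<Vec<u8>>());
    println!("String:     +{}", option_overhead::<String>());
}
```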

There's nothing special about Option in this example! Any enum you write will automatically make use of this optimization, called niche optimization, if possible.
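
The same measurement works on hand-written enums. Maybe and Tri are made-up types for illustration, and the sizes reflect what current compilers do rather than a documented guarantee.

```rust
use std::mem::size_of;

// A hand-rolled Option look-alike.
#[allow(dead_code)]
enum Maybe<T> {
    Nothing,
    Just(T),
}

// Three variants use only 3 of the 256 possible values of one byte.
#[allow(dead_code)]
enum Tri {
    No,
    Yes,
    Unknown,
}

fn main() {
    // The null pointer serves as Nothing, so no extra tag is stored.
    println!("&u8:         {}", size_of::<&u8>());
    println!("Maybe<&u8>:  {}", size_of::<Maybe<&u8>>());
    // u32 has no invalid bit patterns, so a tag (plus padding) is added.
    println!("Maybe<u32>:  {}", size_of::<Maybe<u32>>());
    // bool and Tri leave spare byte values that Option can use for None.
    println!("Option<bool>: {}", size_of::<Option<bool>>());
    println!("Option<Tri>:  {}", size_of::<Option<Tri>>());
}
```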