If you ask a CPU to load a 32-bit value from memory, it will expect that value to be 32-bit aligned. That is, its address should be a multiple of 4 bytes (32 bits). Similarly, 16-bit values are 2-byte aligned, and 8-bit values can go anywhere.
On 32-bit systems, 64-bit values often only need to be 4-byte aligned (32-bit x86 is one example), while on 64-bit systems they're 8-byte aligned. In general, the alignment required by CPUs rarely exceeds the size of pointers on the system.
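To see these rules on your own machine, you can ask the compiler directly. This is a minimal sketch; `size_and_align` is a helper name made up for this example, and the `u64` and `usize` results depend on the target you compile for.

```rust
use std::mem::{align_of, size_of};

/// Returns (size, alignment) in bytes for the type `T`.
/// Hypothetical helper, not part of the standard library.
fn size_and_align<T>() -> (usize, usize) {
    (size_of::<T>(), align_of::<T>())
}

fn main() {
    println!("u8:    {:?}", size_and_align::<u8>());
    println!("u16:   {:?}", size_and_align::<u16>());
    println!("u32:   {:?}", size_and_align::<u32>());
    // Target-dependent: alignment 8 on x86-64, but 4 on 32-bit x86.
    println!("u64:   {:?}", size_and_align::<u64>());
    // Pointer-sized integer: follows the pointer width of the target.
    println!("usize: {:?}", size_and_align::<usize>());
}
```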
The consequences of trying to access a misaligned value depend on the CPU. On x86(-64), misaligned access can be much slower, because the CPU may have to perform two smaller reads and glue the results together. Other ISAs, like ARM, might raise processor exceptions, or even misbehave.
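When you genuinely need to read a value from an address that may not be aligned (parsing a byte buffer, say), Rust offers `std::ptr::read_unaligned`, which makes no alignment assumption. The sketch below uses a made-up helper, `read_u32_at`, and native byte order; it is an illustration, not a recommendation to reach for `unsafe`.

```rust
use std::ptr;

/// Reads a native-endian u32 starting at `offset` in `bytes`,
/// regardless of how that address is aligned.
/// Hypothetical helper for illustration.
fn read_u32_at(bytes: &[u8], offset: usize) -> Option<u32> {
    let end = offset.checked_add(4)?;
    let window = bytes.get(offset..end)?; // bounds check
    // SAFETY: `window` is exactly 4 readable bytes, and
    // `read_unaligned` does not require the pointer to be aligned.
    Some(unsafe { ptr::read_unaligned(window.as_ptr() as *const u32) })
}

fn main() {
    let bytes = [0u8, 0x78, 0x56, 0x34, 0x12, 0];
    // Offsets 1 and 2 cannot both be 4-byte aligned, so at least one
    // of these reads is misaligned.
    for offset in [1, 2] {
        println!("{:?}", read_u32_at(&bytes, offset).map(|v| format!("{:#x}", v)));
    }
}
```

In safe code, `u32::from_ne_bytes` on a copied 4-byte array does the same job and is usually the better choice.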
Primitive types (e.g. numbers, pointers) have alignments that can be inferred from the rules above. Compound types like a struct will have an alignment equal to the greatest alignment of its fields. E.g. a struct with a u8, an i16 and a 32-bit float will have a 4-byte alignment due to the float.
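A quick way to check the "greatest alignment of its fields" rule is to compare `align_of` for a struct against its field types. `Mixed` and `max_field_align` are names invented for this sketch.

```rust
use std::mem::{align_of, size_of};

// Hypothetical struct with the three field types from the text.
#[allow(dead_code)]
struct Mixed {
    a: u8,
    b: i16,
    c: f32,
}

/// Largest alignment among the field types of `Mixed`.
fn max_field_align() -> usize {
    align_of::<u8>()
        .max(align_of::<i16>())
        .max(align_of::<f32>())
}

fn main() {
    println!("largest field alignment: {}", max_field_align());
    println!("align_of::<Mixed>():     {}", align_of::<Mixed>());
    // The size is always a multiple of the alignment.
    println!("size_of::<Mixed>():      {}", size_of::<Mixed>());
}
```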
If the sum of the sizes of a struct's fields is not a multiple of its alignment, padding bytes will be added by the compiler to round it up. This means that types can be larger than expected:
struct Example {
a: u64,
b: u8
}
has a size of 16 bytes because the struct has an alignment of 8, but a naïve size of 9: an additional 7 bytes of padding are required.
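The rounding can be written out explicitly. In this sketch `round_up` is a hypothetical helper that rounds a size up to the next multiple of a power-of-two alignment; on a target where `u64` is 8-byte aligned it turns the naive 9 bytes into 16.

```rust
use std::mem::{align_of, size_of};

#[allow(dead_code)]
struct Example {
    a: u64,
    b: u8,
}

/// Rounds `size` up to the next multiple of `align`.
/// `align` must be a power of two, which Rust alignments always are.
fn round_up(size: usize, align: usize) -> usize {
    (size + align - 1) & !(align - 1)
}

fn main() {
    let naive = size_of::<u64>() + size_of::<u8>(); // 8 + 1 = 9
    let align = align_of::<Example>();
    println!(
        "naive = {}, align = {}, padded = {}, actual = {}",
        naive,
        align,
        round_up(naive, align),
        size_of::<Example>()
    );
}
```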
In C and C++, the layout of fields in a struct follows the exact order specified in the struct's definition.
#include <stdint.h>

struct Example {
    uint64_t a;
    uint8_t b;
    uint64_t c;
    uint8_t d;
};
has a size of 32 bytes on a typical 64-bit target, with 7 padding bytes after b and another 7 after d. C/C++'s strict adherence to declaration order means that field ordering is very important in these languages.
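The cost of a poor field order can be measured from Rust as well, because `#[repr(C)]` asks the compiler to follow C's layout rules. The struct names below are made up for this sketch, and the byte counts in the comments assume `u64` is 8-byte aligned.

```rust
use std::mem::size_of;

// Same field order as the C struct above: small fields between large ones.
#[allow(dead_code)]
#[repr(C)]
struct Interleaved {
    a: u64,
    b: u8, // followed by 7 bytes of padding so `c` is aligned
    c: u64,
    d: u8, // followed by 7 bytes of tail padding
}

// Same fields, reordered by hand so the two small fields share padding.
#[allow(dead_code)]
#[repr(C)]
struct Grouped {
    a: u64,
    c: u64,
    b: u8,
    d: u8, // followed by 6 bytes of tail padding
}

/// Returns (interleaved size, grouped size) in bytes.
fn sizes() -> (usize, usize) {
    (size_of::<Interleaved>(), size_of::<Grouped>())
}

fn main() {
    let (interleaved, grouped) = sizes();
    println!("interleaved = {} bytes, grouped = {} bytes", interleaved, grouped);
}
```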
In Rust, by default, the equivalent struct
struct Example {
a: u64,
b: u8,
c: u64,
d: u8
}
could have its fields reordered in memory, e.g. moving the two u8 fields next to each other. This kind of flexibility is especially useful when it comes to generic values: given
struct Example2<T> {
a: T,
b: i32,
c: T
}
then Example2<u8> might order the fields b, a, c; Example2<u32> could keep the fields in a, b, c order, and Example2<&str> might be a, c, b.
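The chosen field order is not observable directly, but its effect on size is. The sketch below compares each instantiation against the smallest size any ordering could reach (the field sizes summed, rounded up to the struct's alignment). `lower_bound` is a hypothetical helper, and nothing guarantees that the compiler actually hits that bound.

```rust
use std::mem::{align_of, size_of};

#[allow(dead_code)]
struct Example2<T> {
    a: T,
    b: i32,
    c: T,
}

/// Smallest size any field order could achieve for Example2<T>:
/// the sum of the field sizes, rounded up to the struct's alignment.
fn lower_bound<T>() -> usize {
    let sum = 2 * size_of::<T>() + size_of::<i32>();
    let align = align_of::<Example2<T>>();
    (sum + align - 1) / align * align
}

fn main() {
    println!("u8:   actual {}, best possible {}",
        size_of::<Example2<u8>>(), lower_bound::<u8>());
    println!("u32:  actual {}, best possible {}",
        size_of::<Example2<u32>>(), lower_bound::<u32>());
    println!("&str: actual {}, best possible {}",
        size_of::<Example2<&str>>(), lower_bound::<&str>());
}
```

With C's declaration-order rule, Example2<u8> would need 12 bytes (3 bytes of padding after each u8); any order that places the two u8 fields side by side fits in 8.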
Rust lets you use the repr attribute to control field ordering behaviour. The default behaviour is #[repr(Rust)]. See the Nomicon for information on other reprs, including #[repr(transparent)] and the unsafe #[repr(packed)].
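As a rough illustration of two of those reprs: `packed` drops padding (and lowers the alignment to 1), while `transparent` promises that a single-field wrapper has the same layout as its field. The type names here are invented for the sketch.

```rust
use std::mem::{align_of, size_of};

// Default repr: padded up to the alignment of u64.
#[allow(dead_code)]
struct Padded {
    a: u64,
    b: u8,
}

// Packed: no padding, alignment 1. Fields may end up misaligned, which is
// why the compiler rejects taking references to them.
#[allow(dead_code)]
#[repr(C, packed)]
struct Packed {
    a: u64,
    b: u8,
}

// Transparent: same size and alignment as the single field it wraps.
#[allow(dead_code)]
#[repr(transparent)]
struct Meters(f64);

fn main() {
    println!("Padded: size {}, align {}", size_of::<Padded>(), align_of::<Padded>());
    println!("Packed: size {}, align {}", size_of::<Packed>(), align_of::<Packed>());
    println!("Meters: size {}, align {}", size_of::<Meters>(), align_of::<Meters>());
}
```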
The representation of #[repr(Rust)] (the default repr) structs is not guaranteed, and should not be relied on. Different invocations of the same compiler, let alone different versions of the compiler, can result in different layouts. There is a nightly compiler flag (-Z randomize-layout) that will deliberately randomize the order of the fields on each build.
Alignment considerations are one reason why struct-of-arrays layouts can outperform array-of-structs:
struct Foo {
a: u64,
b: u8
}
let foo: [Foo; 20];
foo will take 320 bytes, while
struct Bar {
a: [u64; 20],
b: [u8; 20]
}
let bar: Bar;
bar only needs 184 bytes. (a = 160 bytes, b = 20 bytes, then another 4 for padding.)
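Both figures can be checked with `size_of`. This sketch repeats the two definitions so it compiles on its own; `aos_bytes` and `soa_bytes` are hypothetical helper names, and the 320 and 184 byte figures assume `u64` is 8-byte aligned.

```rust
use std::mem::size_of;

#[allow(dead_code)]
struct Foo {
    a: u64,
    b: u8,
}

#[allow(dead_code)]
struct Bar {
    a: [u64; 20],
    b: [u8; 20],
}

/// Array-of-structs: every element carries its own padding.
fn aos_bytes() -> usize {
    size_of::<[Foo; 20]>()
}

/// Struct-of-arrays: padding is paid once, at the end of the struct.
fn soa_bytes() -> usize {
    size_of::<Bar>()
}

fn main() {
    println!("array of structs: {} bytes", aos_bytes());
    println!("struct of arrays: {} bytes", soa_bytes());
    println!("saved:            {} bytes", aos_bytes() - soa_bytes());
}
```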
In addition to just using less memory overall, denser data plays more nicely with CPU caches.
Naively, Option<T> should require at least size_of::<T>() + align_of::<T>() bytes: room for the value, plus a discriminant padded out to T's alignment. This is true for many types: Option<u16> has a size of 4 bytes. For some types, such as references, Vec, and String, there are particular bit patterns that are invalid values for these types. Rust is smart enough to use these invalid bit patterns to represent the None values for these types. In the case of references, since references must be non-null, Rust can use the all-zero bit pattern (0x00000000 on a 32-bit target) to represent None. Vec and String also avoid all-zero values to enable this optimization.
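The difference shows up as soon as you measure how many bytes wrapping a type in Option adds. `option_overhead` is a hypothetical helper; the zero overhead for references is documented by the standard library, while the results for Vec and String reflect what current compilers do rather than a stated guarantee.

```rust
use std::mem::size_of;

/// Extra bytes needed to wrap `T` in an `Option`.
/// Hypothetical helper for illustration.
fn option_overhead<T>() -> usize {
    size_of::<Option<T>>() - size_of::<T>()
}

fn main() {
    // Every bit pattern of u16 is a valid value, so None needs a separate tag.
    println!("u16:     +{} bytes", option_overhead::<u16>());
    // References are never null, so the null pattern can stand for None.
    println!("&u8:     +{} bytes", option_overhead::<&u8>());
    // Vec and String hold a non-null pointer internally.
    println!("Vec<u8>: +{} bytes", option_overhead::<Vec<u8>>());
    println!("String:  +{} bytes", option_overhead::<String>());
}
```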
There's nothing special about Option in this example! Any enum you write will automatically make use of this optimization, called niche optimization, if possible.
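Here is a small sketch with two hand-written enums, both invented for this example. The first wraps a reference, so the compiler has a spare bit pattern to use; the second wraps a u32, which has none. The layout of user-defined enums is not guaranteed, so treat the sizes as what current compilers produce.

```rust
use std::mem::size_of;

// Payload is a reference: the null pattern is free to represent `Missing`.
#[allow(dead_code)]
enum Lookup<'a> {
    Missing,
    Found(&'a str),
}

// Payload is a u32: every bit pattern is taken, so a separate tag is needed.
#[allow(dead_code)]
enum Counter {
    Empty,
    Value(u32),
}

fn main() {
    println!("&str:    {} bytes", size_of::<&str>());
    println!("Lookup:  {} bytes", size_of::<Lookup<'static>>());
    println!("u32:     {} bytes", size_of::<u32>());
    println!("Counter: {} bytes", size_of::<Counter>());
}
```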