Make UNALIGNED_LOAD16/32 on ARMv7 go through an explicitly unaligned struct, to avoid the compiler coalescing multiple loads into a single load instruction (which only work for aligned accesses). A typical example where GCC would coalesce: uint8* p = ...; uint32 a = UNALIGNED_LOAD32(p); uint32 b = UNALIGNED_LOAD32(p + 4); uint32 c = a | b;