# ShadowCallStack in Zircon & Fuchsia
[TOC]
## Introduction
LLVM's [shadow-call-stack feature][shadow-call-stack] is a compiler mode
intended to harden the generated code against stack-smashing attacks such as
exploits of buffer overrun bugs.
The Clang/LLVM documentation page linked above describes the scheme. The
capsule summary is that the function return address is never reloaded from the
normal stack but only from a separate "shadow call stack". This is an
additional stack, but rather than containing whole stack frames of whatever
size each function needs, it contains only a single address word for each call
frame it records: just the return address. Since the shadow call stack is
allocated independently of other stacks or heap blocks with its own randomized
address to which pointers are rare, it is much less likely that some sort of
buffer overrun or use-after-free exploit will overwrite a return address in
memory so that it can cause the program to return to an instruction chosen by the
attacker.
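
The idea in the summary above can be sketched as a toy model in C. This is only an illustration under simplifying assumptions, not how the compiler implements the feature: plain integers stand in for code addresses, and the names `toy_frame`, `toy_shadow_stack`, and `toy_call` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { TOY_BUF_SIZE = 8 };

/* A simplified "normal" stack frame: a local buffer followed by the saved
 * return address, the layout that makes buffer overruns dangerous. */
struct toy_frame {
    unsigned char buf[TOY_BUF_SIZE];
    uintptr_t saved_return;
};

/* The shadow call stack holds one address word per call frame, nothing else. */
static uintptr_t toy_shadow_stack[16];
static size_t toy_shadow_top;

/* Simulates one call: records the return address in both places, lets an
 * "attacker" overwrite the whole frame, then returns via the chosen source. */
uintptr_t toy_call(uintptr_t return_address, uintptr_t attacker_address,
                   int use_shadow_stack) {
    struct toy_frame frame;
    frame.saved_return = return_address;
    toy_shadow_stack[toy_shadow_top++] = return_address;

    /* Model the overrun as a whole-frame overwrite, so that the toy itself
     * stays free of undefined behavior. */
    struct toy_frame smashed;
    memset(&smashed, 'A', sizeof smashed);
    smashed.saved_return = attacker_address;
    memcpy(&frame, &smashed, sizeof frame);

    uintptr_t from_shadow = toy_shadow_stack[--toy_shadow_top];
    return use_shadow_stack ? from_shadow : frame.saved_return;
}
```

In this model, `toy_call(0x1000, 0xdead, 1)` still yields the original return address, while `toy_call(0x1000, 0xdead, 0)` yields the attacker's value, because only the copy inside the smashed frame was reachable from the buffer.
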
The [shadow-call-stack] and [safe-stack] instrumentation schemes and ABIs are
related and similar but also orthogonal. Each can be enabled or disabled
independently for any function. Fuchsia's compiler ABI and libc always
interoperate with code built with or without either kind of instrumentation,
regardless of what instrumentation was or wasn't used in the particular libc
build.
[shadow-call-stack]: https://clang.llvm.org/docs/ShadowCallStack.html
[safe-stack]: safestack.md
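
As a sketch of per-function control, Clang's documentation describes a `no_sanitize("shadow-call-stack")` function attribute for opting a single function out. The macro name `NO_SHADOW_CALL_STACK` and both functions below are hypothetical, chosen only for illustration; the empty fallback is an assumption that lets the file build with other compilers.

```c
/* Clang spells the per-function opt-out as a no_sanitize attribute. Other
 * compilers get an empty macro so that the file still builds. */
#if defined(__clang__)
#define NO_SHADOW_CALL_STACK __attribute__((no_sanitize("shadow-call-stack")))
#else
#define NO_SHADOW_CALL_STACK
#endif

/* Never instrumented, regardless of the command-line options in effect. */
NO_SHADOW_CALL_STACK int plain_double(int x) {
    return 2 * x;
}

/* Instrumented only if the translation unit is built with
 * -fsanitize=shadow-call-stack (or the target enables it by default).
 * Either way it can call plain_double() freely. */
int mixed_sum(int a, int b) {
    return plain_double(a) + plain_double(b);
}
```

Calls between the two functions need no glue code, which is the interoperability property described above.
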
## Interoperation and ABI Effects
In general, shadow-call-stack does not affect the ABI. The machine-specific
calling conventions are unchanged. It works fine to have some functions in a
program built with shadow-call-stack and some not. It doesn't matter if
combining the two comes from directly compiled `.o` files, from archive
libraries (`.a` files), or from shared libraries (`.so` files), in any
combination.
While there is some additional per-thread state (the *shadow call stack
pointer*, [see below](#implementation-details)), code not using
shadow-call-stack does not need to do anything about this state to keep it
correct when calling, or being called by, code that does use shadow-call-stack.
The only potential exceptions to this are for code that is implementing its own
kinds of non-local exits or context-switching (e.g. coroutines). The Zircon C
library's `setjmp`/`longjmp` code saves and restores this additional state
automatically, so anything that is based on `longjmp` already handles everything
correctly even if the code calling `setjmp` and `longjmp` doesn't know about
shadow-call-stack.
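
For instance, ordinary `setjmp`/`longjmp` code such as the following sketch contains nothing specific to shadow-call-stack. The function and variable names are hypothetical; the point is only that unwinding across several frames is left entirely to the C library.

```c
#include <setjmp.h>

static jmp_buf unwind_point;
static int last_code;

/* Recurses `depth` frames down, then unwinds all of them in one jump. */
static void fail_deep(int depth, int code) {
    if (depth == 0) {
        last_code = code;
        longjmp(unwind_point, 1);
    }
    fail_deep(depth - 1, code);
}

/* Returns the code passed to fail_deep() after the non-local exit. */
int run_and_catch(int depth, int code) {
    if (setjmp(unwind_point) != 0) {
        /* Reached via longjmp(). On Fuchsia, libc has also restored the
         * shadow call stack pointer saved by setjmp() at this point. */
        return last_code;
    }
    fail_deep(depth, code);
    return -1; /* not reached: fail_deep() always ends in longjmp() */
}
```
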
For AArch64 (ARM64), the `x18` register is already reserved as "fixed" in the
ABI generally. Code unaware of the shadow-call-stack extension to the ABI is
interoperable with the shadow-call-stack ABI by default if it simply never
touches `x18`.
The feature is not yet supported on any other architecture.
## Use in Zircon & Fuchsia
Zircon on AArch64 (ARM64) supports shadow-call-stack both in the kernel and
for user-mode code. This is enabled in the Clang compiler by the
`-fsanitize=shadow-call-stack` command-line option. For `aarch64-fuchsia`
(ARM64) targets, it is enabled by default. To disable it for a specific
compilation, use the `-fno-sanitize=shadow-call-stack` command-line option.
As with [safe-stack], there is no separate facility for specifying the size of
the shadow call stack. Instead, the size specified for "the stack" in legacy
APIs (such as `pthread_attr_setstacksize`) and ABIs (such as `PT_GNU_STACK`) is
used as the size for **each** kind of stack. Because the different kinds of
stack are used in different proportions according to the particular program
behavior, there is no good way to choose the shadow call stack size based on
the traditional single stack size. So each kind of stack is as big as it might
need to be in the worst case expected by the tuned "unitary" stack size. While
this seems wasteful, it is only slightly so: at worst one page is wasted per
kind of stack, plus the page table overhead of using more address space for
pages that are never accessed.
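
As an illustration, the sketch below sets a thread's stack size through the standard `pthread_attr_setstacksize` API. Under the scheme described above, the single size passed here would apply to each kind of stack for that thread; the helper names are hypothetical and the code itself is plain POSIX.

```c
#include <pthread.h>
#include <stddef.h>

/* Thread body: records that it ran. */
static void *mark_ran(void *arg) {
    int *ran = arg;
    *ran = 1;
    return NULL;
}

/* Starts one thread with the given stack size and waits for it.
 * Returns 1 if the thread ran, 0 on any error. */
int thread_ran_with_stack_size(size_t stack_size) {
    pthread_attr_t attr;
    pthread_t thread;
    int ran = 0;

    if (pthread_attr_init(&attr) != 0) return 0;

    /* There is one size knob only; no separate shadow call stack size. */
    int err = pthread_attr_setstacksize(&attr, stack_size);
    if (err == 0) err = pthread_create(&thread, &attr, mark_ran, &ran);
    pthread_attr_destroy(&attr);
    if (err != 0) return 0;

    if (pthread_join(thread, NULL) != 0) return 0;
    return ran;
}
```

A call such as `thread_ran_with_stack_size(256 * 1024)` requests 256 KiB, which a program would tune for its worst-case use of any one stack.
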
## Implementation details
The essential addition to support shadow-call-stack code is the *shadow call
stack pointer*. This is a register reserved for a single global use, like the
traditional stack pointer. Unlike a normal stack frame, which holds arbitrary
data, each call frame pushes and pops only a single return address word.
For AArch64 (ARM64), the `x18` register holds the shadow call stack pointer at
function entry. The shadow call stack grows upwards with post-increment
semantics, so `x18` always points to the next free slot. The compiler never
touches the register except to spill and reload the return address register
(`x30`, aka LR). The Fuchsia ABI requires that `x18` contain a valid shadow
call stack pointer at all times. That is, it must **always** be valid to push a
new address onto the shadow call stack at `x18` (modulo stack overflow).
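
The post-increment push and matching pre-decrement pop can be modelled in C as follows. This is a toy sketch, not generated code: `scs_pointer` stands in for `x18` and `link_register` for `x30`, and all names are hypothetical. The instruction forms quoted in the comments are the ones shown in the Clang ShadowCallStack documentation for AArch64.

```c
#include <stddef.h>
#include <stdint.h>

static uintptr_t scs_memory[32];
static uintptr_t *scs_pointer = scs_memory; /* models x18 */
static uintptr_t link_register;             /* models x30 (LR) */

/* Function entry, modelling `str x30, [x18], #8`: store, then advance. */
void scs_push(void) { *scs_pointer++ = link_register; }

/* Function exit, modelling `ldr x30, [x18, #-8]!`: step back, then load. */
void scs_pop(void) { link_register = *--scs_pointer; }

/* Number of return addresses currently recorded. */
size_t scs_depth(void) { return (size_t)(scs_pointer - scs_memory); }

/* Walks through an outer function calling a non-leaf inner function.
 * Returns 1 if both return addresses come back intact and in order. */
int scs_nested_call_demo(uintptr_t outer_return, uintptr_t inner_return) {
    link_register = outer_return; /* the caller's call set LR */
    scs_push();                   /* outer prologue */
    link_register = inner_return; /* outer's own call clobbers LR */
    scs_push();                   /* inner prologue */
    link_register = 0;            /* deeper calls clobber LR again */
    scs_pop();                    /* inner epilogue */
    int inner_ok = (link_register == inner_return);
    scs_pop();                    /* outer epilogue */
    int outer_ok = (link_register == outer_return);
    return inner_ok && outer_ok && scs_depth() == 0;
}
```

Because the pointer always refers to the next free slot, a push is valid at any moment, which mirrors the ABI requirement stated above.
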
### Notes for low-level and assembly code
Most code, even in assembly, does not need to think about shadow-call-stack
issues at all. The calling conventions are not changed. All use of the stack
(and/or the [unsafe stack][safe-stack]) is the same with or without
shadow-call-stack; *when frame pointers are enabled*, the return address will
be stored on the machine stack next to the frame pointer as expected. For
AArch64 (ARM64), function calls still use `x30` for the return address as
normal, though functions that clobber `x30` can choose to spill and reload it
using different memory. Non-leaf functions written in assembly should ideally
make use of the shadow-call-stack ABI by spilling and reloading the return
address register there instead of on the machine stack.
The main exception is code that is implementing something like a non-local
exit or context switch. Such code may need to save or restore the shadow call
stack pointer. Both the `longjmp` function and C++ `throw` already handle
this directly, so C or C++ code using those constructs does not need to do
anything new.
New code implementing some new kind of non-local exit or context switch will
need to handle the shadow call stack pointer similarly to how it handles the
traditional machine stack pointer register and the [unsafe stack][safe-stack]
pointer. Any such code should use `#if __has_feature(shadow_call_stack)` to
test at compile time whether shadow-call-stack is being used in the particular
build. That preprocessor construct can be used in C, C++, or assembly (`.S`)
source files.
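
A minimal sketch of such a compile-time test follows. The `saved_context` layout and the function name are hypothetical, intended only to show where the conditional would go; the fallback definition of `__has_feature` is the usual idiom for compilers that lack it.

```c
#include <stdint.h>

/* __has_feature is a Clang extension. The fallback makes compilers without
 * it treat every feature as absent. */
#ifndef __has_feature
#define __has_feature(x) 0
#endif

/* Hypothetical register state saved by a custom context switch. */
struct saved_context {
    uintptr_t sp;             /* machine stack pointer */
#if __has_feature(safe_stack)
    uintptr_t unsafe_sp;      /* unsafe stack pointer */
#endif
#if __has_feature(shadow_call_stack)
    uintptr_t shadow_call_sp; /* x18 on AArch64 */
#endif
};

/* Reports whether this build needs to save the shadow call stack pointer. */
int context_saves_shadow_call_sp(void) {
#if __has_feature(shadow_call_stack)
    return 1;
#else
    return 0;
#endif
}
```

The save and restore code itself would carry the same conditionals, so builds without shadow-call-stack pay nothing for the extra field.
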