|  | # ShadowCallStack in Zircon & Fuchsia | 
|  |  | 
|  | [TOC] | 
|  |  | 
|  | ## Introduction | 
|  |  | 
|  | LLVM's [shadow-call-stack feature][shadow-call-stack] is a compiler mode | 
|  | intended to harden the generated code against stack-smashing attacks such as | 
|  | exploits of buffer overrun bugs. | 
|  |  | 
|  | The Clang/LLVM documentation page linked above describes the scheme.  The | 
|  | capsule summary is that the function return address is never reloaded from the | 
|  | normal stack but only from a separate "shadow call stack".  This is an | 
|  | additional stack, but rather than containing whole stack frames of whatever | 
|  | size each function needs, it contains only a single address word for each call | 
|  | frame it records: just the return address.  Since the shadow call stack is | 
|  | allocated independently of other stacks or heap blocks with its own randomized | 
|  | address to which pointers are rare, it is much less likely that some sort of | 
|  | buffer overrun or use-after-free exploit will overwrite a return address in | 
|  | memory so that it can cause the program to return to an instruction chosen by | 
|  | the attacker. | 
|  |  | 
|  | The [shadow-call-stack] and [safe-stack] instrumentation schemes and ABIs are | 
|  | related and similar but also orthogonal.  Each can be enabled or disabled | 
|  | independently for any function.  Fuchsia's compiler ABI and libc always | 
|  | interoperate with code built with or without either kind of instrumentation, | 
|  | regardless of what instrumentation was or wasn't used in the particular libc | 
|  | build. | 
|  |  | 
|  | [shadow-call-stack]: https://clang.llvm.org/docs/ShadowCallStack.html | 
|  | [safe-stack]: safestack.md | 
|  |  | 
|  | ## Interoperation and ABI Effects | 
|  |  | 
|  | In general, shadow-call-stack does not affect the ABI.  The machine-specific | 
|  | calling conventions are unchanged.  It works fine to have some functions in a | 
|  | program built with shadow-call-stack and some not.  It doesn't matter whether | 
|  | the two are combined through directly compiled `.o` files, archive libraries | 
|  | (`.a` files), or shared libraries (`.so` files), in any combination. | 
|  |  | 
|  | While there is some additional per-thread state (the *shadow call stack | 
|  | pointer*, [see below](#implementation-details)), code not using | 
|  | shadow-call-stack does not need to do anything about this state to keep it | 
|  | correct when calling, or being called by, code that does use shadow-call-stack.  The | 
|  | only potential exceptions to this are for code that is implementing its own | 
|  | kinds of non-local exits or context-switching (e.g. coroutines).  The Zircon C | 
|  | library's `setjmp`/`longjmp` code saves and restores this additional state | 
|  | automatically, so anything that is based on `longjmp` already handles everything | 
|  | correctly even if the code calling `setjmp` and `longjmp` doesn't know about | 
|  | shadow-call-stack. | 
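
As a minimal sketch of the point above, the following C code performs a non-local exit through several nested call frames using only the standard `setjmp`/`longjmp` interface. Nothing in it refers to the shadow call stack; per the text above, the C library is what saves and restores that state. The names `unwind_from_depth` and `run_with_recovery` are hypothetical and exist only for this illustration.

```c
#include <setjmp.h>

// Where longjmp returns to.  Per the text above, the C library's jmp_buf
// handling also covers the shadow call stack pointer, so this code does not
// need to know about it.
static jmp_buf recovery_point;

// Recurse a few frames deep, then abandon all of them at once.
static void unwind_from_depth(int depth) {
  if (depth == 0) {
    longjmp(recovery_point, 1);
  }
  unwind_from_depth(depth - 1);
}

// Returns 1 if control came back through longjmp, 0 otherwise.
int run_with_recovery(void) {
  if (setjmp(recovery_point) == 0) {
    unwind_from_depth(8);
    return 0;  // Not reached: unwind_from_depth never returns normally.
  }
  return 1;
}
```

The same source should behave identically whether or not it is compiled with `-fsanitize=shadow-call-stack`.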
|  |  | 
|  | For AArch64 (ARM64), the `x18` register is already reserved as "fixed" in the | 
|  | ABI generally.  Code unaware of the shadow-call-stack extension to the ABI is | 
|  | interoperable with the shadow-call-stack ABI by default if it simply never | 
|  | touches `x18`. | 
|  |  | 
|  | The feature is not yet supported on any other architecture. | 
|  |  | 
|  | ## Use in Zircon & Fuchsia | 
|  |  | 
|  | Zircon on AArch64 (ARM64) supports shadow-call-stack both in the kernel and | 
|  | for user-mode code.  This is enabled in the Clang compiler by the | 
|  | `-fsanitize=shadow-call-stack` command-line option.  For `aarch64-fuchsia` | 
|  | (ARM64) targets, it is enabled by default.  To disable it for a specific | 
|  | compilation, use the `-fno-sanitize=shadow-call-stack` command-line option. | 
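
The command-line option applies to a whole compilation. The Clang documentation also describes a per-function attribute, `no_sanitize("shadow-call-stack")`, for finer-grained control. The sketch below assumes a Clang-compatible compiler; the macro `NO_SHADOW_CALL_STACK` and the function names are hypothetical, and the attribute is compiled out for other compilers.

```c
// Hypothetical convenience macro.  The attribute spelling is taken from the
// Clang ShadowCallStack documentation; other compilers get an empty macro.
#if defined(__clang__)
#define NO_SHADOW_CALL_STACK __attribute__((no_sanitize("shadow-call-stack")))
#else
#define NO_SHADOW_CALL_STACK
#endif

static int add_one(int x) { return x + 1; }

// Not instrumented, even when the file is built with
// -fsanitize=shadow-call-stack.
NO_SHADOW_CALL_STACK
int uninstrumented_caller(int x) { return add_one(x) * 2; }

// Follows whatever the compilation's command-line setting is.
int instrumented_caller(int x) { return uninstrumented_caller(x) + 1; }
```

Because the calling conventions are unchanged, the two kinds of function can call each other freely.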
|  |  | 
|  | As with [safe-stack], there is no separate facility for specifying the size of | 
|  | the shadow call stack.  Instead, the size specified for "the stack" in legacy | 
|  | APIs (such as `pthread_attr_setstacksize`) and ABIs (such as `PT_GNU_STACK`) is | 
|  | used as the size for **each** kind of stack.  Because the different kinds of | 
|  | stack are used in different proportions according to the particular program | 
|  | behavior, there is no good way to choose the shadow call stack size based on | 
|  | the traditional single stack size.  So each kind of stack is as big as it might | 
|  | need to be in the worst case expected by the tuned "unitary" stack size.  While | 
|  | this seems wasteful, it is only slightly so: at worst one page is wasted per | 
|  | kind of stack, plus the page table overhead of using more address space for | 
|  | pages that are never accessed. | 
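
For illustration, here is a sketch of requesting a thread stack size through the legacy POSIX API mentioned above. The code is ordinary portable pthreads; the only Fuchsia-specific point, taken from the text above, is that the one size given here is used for each kind of stack. The function names and the 256 KiB figure in the usage note are arbitrary choices for the example.

```c
#include <pthread.h>
#include <stddef.h>

// Thread body: records that it ran.
static void *worker(void *arg) {
  *(int *)arg = 1;
  return NULL;
}

// Creates a thread with the requested stack size and waits for it.
// Returns 1 if the thread ran, -1 on any pthread error.
int run_thread_with_stack_size(size_t stack_size) {
  pthread_attr_t attr;
  pthread_t thread;
  int ran = 0;

  if (pthread_attr_init(&attr) != 0) {
    return -1;
  }
  // A single size: there is no separate knob for the shadow call stack.
  if (pthread_attr_setstacksize(&attr, stack_size) != 0) {
    pthread_attr_destroy(&attr);
    return -1;
  }
  int err = pthread_create(&thread, &attr, worker, &ran);
  pthread_attr_destroy(&attr);
  if (err != 0 || pthread_join(thread, NULL) != 0) {
    return -1;
  }
  return ran;
}
```

A call such as `run_thread_with_stack_size(256 * 1024)` requests 256 KiB for "the stack".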
|  |  | 
|  | ## Implementation details | 
|  |  | 
|  | The essential addition to support shadow-call-stack code is the *shadow call | 
|  | stack pointer*.  This register is reserved globally for one purpose, like the | 
|  | traditional stack pointer.  But each call frame pushes and pops a single return address | 
|  | word rather than arbitrary data as in the normal stack frame. | 
|  |  | 
|  | For AArch64 (ARM64), the `x18` register holds the shadow call stack pointer at | 
|  | function entry.  The shadow call stack grows upwards with post-increment | 
|  | semantics, so `x18` always points to the next free slot.  The compiler never | 
|  | touches the register except to spill and reload the return address register | 
|  | (`x30`, aka LR).  The Fuchsia ABI requires that `x18` contain a valid shadow | 
|  | call stack pointer at all times.  That is, it must **always** be valid to push a | 
|  | new address onto the shadow call stack at `x18` (modulo stack overflow). | 
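
The push and pop discipline described above can be modeled in plain C. This is only an illustrative model, not how the compiler or runtime implements it: the type `scs_model` and its functions are hypothetical, and the AArch64 instructions in the comments are the prologue and epilogue forms shown in the Clang documentation.

```c
#include <stdint.h>

// Hypothetical model: next_free plays the role of x18 and always points at
// the next free slot.
typedef struct {
  uintptr_t *next_free;
} scs_model;

// Function entry: store the return address, then advance (post-increment).
// On AArch64 this corresponds to: str x30, [x18], #8
static void scs_model_push(scs_model *scs, uintptr_t return_address) {
  *scs->next_free++ = return_address;
}

// Function exit: step back, then reload the return address (pre-decrement).
// On AArch64 this corresponds to: ldr x30, [x18, #-8]!
static uintptr_t scs_model_pop(scs_model *scs) {
  return *--scs->next_free;
}

// Returns 1 if two nested "calls" unwind in LIFO order and leave the
// pointer back at its starting slot.
int scs_model_demo(void) {
  uintptr_t slots[4] = {0};
  scs_model scs = {slots};

  scs_model_push(&scs, 0x1000);  // outer call
  scs_model_push(&scs, 0x2000);  // inner call

  int ok = scs_model_pop(&scs) == 0x2000 && scs_model_pop(&scs) == 0x1000;
  return ok && scs.next_free == slots;
}
```

Each frame occupies exactly one word, and the stack grows toward higher addresses.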
|  |  | 
|  | ### Notes for low-level and assembly code | 
|  |  | 
|  | Most code, even in assembly, does not need to think about shadow-call-stack | 
|  | issues at all.  The calling conventions are not changed.  All use of the stack | 
|  | (and/or the [unsafe stack][safe-stack]) is the same with or without | 
|  | shadow-call-stack; *when frame pointers are enabled*, the return address will | 
|  | be stored on the machine stack next to the frame pointer as expected.  For | 
|  | AArch64 (ARM64), function calls still use `x30` for the return address as | 
|  | normal, though functions that clobber `x30` can choose to spill and reload it | 
|  | using different memory.  Non-leaf functions written in assembly should ideally | 
|  | make use of the shadow-call-stack ABI by spilling and reloading the return | 
|  | address register on the shadow call stack instead of on the machine stack. | 
|  |  | 
|  | The main exception is code that is implementing something like a non-local | 
|  | exit or context switch.  Such code may need to save or restore the shadow call | 
|  | stack pointer.  Both the `longjmp` function and C++ `throw` already handle | 
|  | this directly, so C or C++ code using those constructs does not need to do | 
|  | anything new. | 
|  |  | 
|  | New code implementing some new kind of non-local exit or context switch will | 
|  | need to handle the shadow call stack pointer similarly to how it handles the | 
|  | traditional machine stack pointer register and the [unsafe stack][safe-stack] | 
|  | pointer.  Any such code should use `#if __has_feature(shadow_call_stack)` to | 
|  | test at compile time whether shadow-call-stack is being used in the particular | 
|  | build.  That preprocessor construct can be used in C, C++, or assembly (`.S`) | 
|  | source files. |
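
A sketch of that compile-time test is shown below, applied to a saved-context structure for a hypothetical context-switch primitive. The names `MY_HAVE_SHADOW_CALL_STACK`, `my_context`, and `my_context_saved_words` are assumptions made for this example. The fallback definition of `__has_feature` only keeps the file compiling on compilers that lack that builtin.

```c
#include <stddef.h>
#include <stdint.h>

// Compilers without __has_feature would otherwise fail on the #if below.
#ifndef __has_feature
#define __has_feature(x) 0
#endif

#if __has_feature(shadow_call_stack)
#define MY_HAVE_SHADOW_CALL_STACK 1
#else
#define MY_HAVE_SHADOW_CALL_STACK 0
#endif

// Hypothetical saved context: the machine stack pointer, plus the shadow
// call stack pointer only when the build uses shadow-call-stack.
typedef struct {
  uintptr_t sp;
#if MY_HAVE_SHADOW_CALL_STACK
  uintptr_t shadow_call_sp;
#endif
} my_context;

// Number of pointer-sized words saved per context in this build.
size_t my_context_saved_words(void) {
  return sizeof(my_context) / sizeof(uintptr_t);
}
```

A real implementation would also save whatever else its switch needs, such as callee-saved registers and the unsafe stack pointer. The conditional field only shows where the shadow call stack pointer fits.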