
 # ShadowCallStack in Zircon & Fuchsia

 [TOC]

 ## Introduction

 LLVM's [shadow-call-stack feature][shadow-call-stack] is a compiler mode
 intended to harden the generated code against stack-smashing attacks such as
 exploits of buffer overrun bugs.

 The Clang/LLVM documentation page linked above describes the scheme.  The
 capsule summary is that the function return address is never reloaded from the
 normal stack but only from a separate "shadow call stack".  This is an
 additional stack, but rather than containing whole stack frames of whatever
 size each function needs, it contains only a single address word for each call
 frame it records: just the return address.  The shadow call stack is
 allocated independently of other stacks and heap blocks, at its own randomized
 address, and pointers to it are rare.  This makes it much less likely that a
 buffer overrun or use-after-free exploit can overwrite a return address in
 memory and so cause the program to return to an instruction chosen by the
 attacker.

 The [shadow-call-stack] and [safe-stack] instrumentation schemes and ABIs are
 related and similar but also orthogonal.  Each can be enabled or disabled
 independently for any function.  Fuchsia's compiler ABI and libc always
 interoperate with code built with or without either kind of instrumentation,
 regardless of what instrumentation was or wasn't used in the particular libc
 build.

 [shadow-call-stack]: https://clang.llvm.org/docs/ShadowCallStack.html
 [safe-stack]: safestack.md

 ## Interoperation and ABI Effects

 In general, shadow-call-stack does not affect the ABI.  The machine-specific
 calling conventions are unchanged.  It works fine to have some functions in a
 program built with shadow-call-stack and some not.  It doesn't matter if
 combining the two comes from directly compiled `.o` files, from archive
 libraries (`.a` files), or from shared libraries (`.so` files), in any
 combination.

 While there is some additional per-thread state (the *shadow call stack
 pointer*, [see below](#implementation-details)), code not using
 shadow-call-stack does not need to do anything about this state to keep it
 correct when calling, or being called by, code that does use shadow-call-stack.  The
 only potential exceptions to this are for code that is implementing its own
 kinds of non-local exits or context-switching (e.g. coroutines).  The Zircon C
 library's `setjmp`/`longjmp` code saves and restores this additional state
 automatically, so anything that is based on `longjmp` already handles everything
 correctly even if the code calling `setjmp` and `longjmp` doesn't know about
 shadow-call-stack.

 For AArch64 (ARM64), the `x18` register is already reserved as "fixed" in the
 ABI generally.  Code unaware of the shadow-call-stack extension to the ABI is
 interoperable with the shadow-call-stack ABI by default if it simply never
 touches `x18`.

 The feature is not yet supported on any other architecture.

 ## Use in Zircon & Fuchsia

 Zircon on AArch64 (ARM64) supports shadow-call-stack both in the kernel and
 for user-mode code.  This is enabled in the Clang compiler by the
 `-fsanitize=shadow-call-stack` command-line option.  For `aarch64-fuchsia`
 (ARM64) targets, it is enabled by default.  To disable it for a specific
 compilation, use the `-fno-sanitize=shadow-call-stack` command-line option.
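
Besides the per-compilation flags, Clang also documents a per-function
opt-out attribute, `no_sanitize("shadow-call-stack")`.  The following is a
minimal sketch of mixing instrumented and uninstrumented functions in one
translation unit.  The macro `NO_SHADOW_CALL_STACK` and the function names are
hypothetical, chosen only for illustration, and the sketch assumes other
compilers should simply ignore the attribute.

```c
// Hypothetical helper macro: expands to Clang's per-function opt-out
// attribute, and to nothing on compilers assumed not to understand it.
#if defined(__clang__)
#define NO_SHADOW_CALL_STACK __attribute__((no_sanitize("shadow-call-stack")))
#else
#define NO_SHADOW_CALL_STACK
#endif

// Instrumented whenever the build uses -fsanitize=shadow-call-stack.
int checked_double(int value) { return value * 2; }

// Never instrumented, even in a shadow-call-stack build.  It can still call,
// and be called by, instrumented code because the calling convention is
// unchanged.
NO_SHADOW_CALL_STACK int unchecked_sum(int a, int b) {
  return checked_double(a) + b;
}
```

Calling `unchecked_sum(2, 3)` returns 7 whether or not the rest of the build
is instrumented.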

 As with [safe-stack], there is no separate facility for specifying the size of
 the shadow call stack.  Instead, the size specified for "the stack" in legacy
 APIs (such as `pthread_attr_setstacksize`) and ABIs (such as `PT_GNU_STACK`) is
 used as the size for **each** kind of stack.  Because the different kinds of
 stack are used in different proportions according to the particular program
 behavior, there is no good way to choose the shadow call stack size based on
 the traditional single stack size.  So each kind of stack is as big as it might
 need to be in the worst case expected by the tuned "unitary" stack size.  While
 this seems wasteful, it is only slightly so: at worst one page is wasted per
 kind of stack, plus the page table overhead of using more address space for
 pages that are never accessed.
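
As an illustration of the legacy API mentioned above, here is a minimal sketch
that requests a thread stack size through `pthread_attr_setstacksize`.  The
function name `run_with_stack_size` and the worker are hypothetical.  On
Fuchsia, per the description above, the single size given here would be
applied to each kind of stack rather than only the machine stack.

```c
#include <pthread.h>
#include <stddef.h>

// Trivial thread body: records that it ran.
static void *worker(void *arg) {
  *(int *)arg = 1;
  return NULL;
}

// Runs one thread with the requested stack size.
// Returns 0 on success and -1 on any failure.
int run_with_stack_size(size_t stack_size) {
  pthread_attr_t attr;
  if (pthread_attr_init(&attr) != 0) return -1;

  int ran = 0;
  pthread_t thread;
  // The one "unitary" size: there is no separate knob for the shadow call
  // stack or the unsafe stack.
  int status = pthread_attr_setstacksize(&attr, stack_size);
  if (status == 0) status = pthread_create(&thread, &attr, worker, &ran);
  if (status == 0) status = pthread_join(thread, NULL);

  pthread_attr_destroy(&attr);
  return (status == 0 && ran) ? 0 : -1;
}
```

A call such as `run_with_stack_size(256 * 1024)` asks for 256 KiB; the value
is arbitrary and only needs to be at least `PTHREAD_STACK_MIN`.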

 ## Implementation details

 The essential addition to support shadow-call-stack code is the *shadow call
 stack pointer*.  This is a register reserved for a single global purpose, like
 the traditional stack pointer.  But each call frame pushes and pops only a
 single return address word rather than arbitrary data as in the normal stack
 frame.

 For AArch64 (ARM64), the `x18` register holds the shadow call stack pointer at
 function entry.  The shadow call stack grows upwards with post-increment
 semantics, so `x18` always points to the next free slot.  The compiler never
 touches the register except to spill and reload the return address register
 (`x30`, aka LR).  The Fuchsia ABI requires that `x18` contain a valid shadow
 stack pointer at all times.  That is, it must **always** be valid to push a
 new address onto the shadow call stack at `x18` (modulo stack overflow).
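
The push and pop discipline described above can be modeled in portable C.
This is only a sketch of the semantics, not what the compiler emits: the type
`shadow_stack_model`, its fixed capacity, and the function names are
hypothetical, and the `sp` field merely stands in for `x18`.  The Clang
documentation shows the real instrumentation as a post-increment store and a
pre-decrement load of `x30`, quoted in the comments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SCS_MODEL_SLOTS 8  // Arbitrary capacity for the model.

typedef struct {
  uintptr_t slots[SCS_MODEL_SLOTS];
  uintptr_t *sp;  // Models x18: always points at the next free slot.
} shadow_stack_model;

static void scs_init(shadow_stack_model *s) { s->sp = s->slots; }

// Function entry: store the return address, then advance upwards.
// Corresponds to: str x30, [x18], #8
static void scs_push(shadow_stack_model *s, uintptr_t return_address) {
  assert(s->sp < s->slots + SCS_MODEL_SLOTS);  // "modulo stack overflow"
  *s->sp++ = return_address;
}

// Function exit: step back down, then reload the return address.
// Corresponds to: ldr x30, [x18, #-8]!
static uintptr_t scs_pop(shadow_stack_model *s) {
  assert(s->sp > s->slots);
  return *--s->sp;
}

// Number of return addresses currently recorded.
static size_t scs_depth(const shadow_stack_model *s) {
  return (size_t)(s->sp - s->slots);
}
```

Because the pointer always designates a free slot, a push is valid at any
moment, which is the invariant the Fuchsia ABI requires of `x18`.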

 ### Notes for low-level and assembly code

 Most code, even in assembly, does not need to think about shadow-call-stack
 issues at all.  The calling conventions are not changed.  All use of the stack
 (and/or the [unsafe stack][safe-stack]) is the same with or without
 shadow-call-stack; *when frame pointers are enabled*, the return address will
 be stored on the machine stack next to the frame pointer as expected.  For
 AArch64 (ARM64), function calls still use `x30` for the return address as
 normal, though functions that clobber `x30` can choose to spill and reload it
 using different memory.  Non-leaf functions written in assembly should ideally
 make use of the shadow-call-stack ABI by spilling and reloading the return
 address register on the shadow call stack instead of on the machine stack.

 The main exception is code that is implementing something like a non-local
 exit or context switch.  Such code may need to save or restore the shadow call
 stack pointer.  Both the `longjmp` function and C++ `throw` already handle
 this directly, so C or C++ code using those constructs does not need to do
 anything new.

 New code implementing some new kind of non-local exit or context switch will
 need to handle the shadow call stack pointer similarly to how it handles the
 traditional machine stack pointer register and the [unsafe stack][safe-stack]
 pointer.  Any such code should use `#if __has_feature(shadow_call_stack)` to
 test at compile time whether shadow-call-stack is being used in the particular
 build.  That preprocessor construct can be used in C, C++, or assembly (`.S`)
 source files.
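
As a minimal sketch of that compile-time test, the following declares a saved
context for a hypothetical context-switch mechanism that carries the shadow
call stack pointer only when the feature is enabled.  The type `saved_context`,
its fields, and `context_has_shadow_call_sp` are illustrative names, not part
of any Fuchsia API.  The fallback definition of `__has_feature` is an
assumption added so the file also preprocesses on compilers that lack it.

```c
#include <stddef.h>
#include <stdint.h>

// Compilers without __has_feature would otherwise reject the #if below.
#ifndef __has_feature
#define __has_feature(x) 0
#endif

// Hypothetical register state saved across a context switch.
typedef struct {
  uintptr_t sp;  // Machine stack pointer.
  uintptr_t pc;  // Resume address.
#if __has_feature(shadow_call_stack)
  uintptr_t shadow_call_sp;  // Shadow call stack pointer (x18 on AArch64).
#endif
} saved_context;

// Reports whether this build saves the shadow call stack pointer.
static inline int context_has_shadow_call_sp(void) {
#if __has_feature(shadow_call_stack)
  return 1;
#else
  return 0;
#endif
}
```

The code that actually saves and restores the register would sit under the
same `#if` guard, next to the handling of the machine stack pointer.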