docs/OptimizationTips.rst - third_party/swift - Git at Google

 :orphan:

 Writing High-Performance Swift Code
 ===================================

 The following document is a gathering of various tips and tricks for writing
 high-performance Swift code. The intended audience of this document is compiler
 and standard library developers.

 Some of the tips in this document can help improve the quality of your Swift
 program and make your code less error prone and more readable. Explicitly
 marking final-classes and class-protocols are two obvious examples. However some
 of the tips described in this document are unprincipled, twisted and come to
 solve a specific temporary limitation of the compiler or the language. Many of
 the recommendations in this document come with trade offs for things like
 program runtime, binary size, code readability, etc.


 Enabling Optimizations
 ======================

 The first thing one should always do is to enable optimization. Swift provides
 three different optimization levels:

 - ``-Onone``: This is meant for normal development. It performs minimal
   optimizations and preserves all debug info.
 - ``-O``: This is meant for most production code. The compiler performs
   aggressive optimizations that can drastically change the type and amount of
   emitted code. Debug information will be emitted but will be lossy.
 - ``-Ounchecked``: This is a special optimization mode meant for specific
   libraries or applications where one is willing to trade safety for
   performance.  The compiler will remove all overflow checks as well as some
   implicit type checks.  This is not intended to be used in general since it may
   result in undetected memory safety issues and integer overflows. Only use this
   if you have carefully reviewed that your code is safe with respect to integer
   overflow and type casts.

 In the Xcode UI, one can modify the current optimization level as follows:

 ...


 Whole Module Optimizations
 ==========================

 By default Swift compiles each file individually. This allows Xcode to
 compile multiple files in parallel very quickly. However, compiling each file
 separately prevents certain compiler optimizations. Swift can also compile
 the entire program as if it were one file and optimize the program as if it
 were a single compilation unit. This mode is enabled using the command
 line flag ``-whole-module-optimization``. Programs that are compiled in
 this mode will most likely take longer to compile, but may run faster.

 This mode can be enabled using the Xcode build setting 'Whole Module Optimization'.


 Reducing Dynamic Dispatch
 =========================

 Swift by default is a very dynamic language like Objective-C. Unlike Objective
 C, Swift gives the programmer the ability to improve runtime performance when
 necessary by removing or reducing this dynamicism. This section goes through
 several examples of language constructs that can be used to perform such an
 operation.

 Dynamic Dispatch
 ----------------

 Classes use dynamic dispatch for methods and property accesses by default. Thus
 in the following code snippet, ``a.aProperty``, ``a.doSomething()`` and
 ``a.doSomethingElse()`` will all be invoked via dynamic dispatch:

 ::

   class A {
     var aProperty: [Int]
     func doSomething() { ... }
     dynamic doSomethingElse() { ... }
   }

   class B : A {
     override var aProperty {
       get { ... }
       set { ... }
     }

     override func doSomething() { ... }
   }

   func usingAnA(a: A) {
     a.doSomething()
     a.aProperty = ...
   }

 In Swift, dynamic dispatch defaults to indirect invocation through a vtable
 [#]_. If one attaches the ``dynamic`` keyword to the declaration, Swift will
 emit calls via Objective-C message send instead. In both cases this is slower
 than a direct function call because it prevents many compiler optimizations [#]_
 in addition to the overhead of performing the indirect call itself. In
 performance critical code, one often will want to restrict this dynamic
 behavior.

 Advice: Use 'final' when you know the declaration does not need to be overridden
 --------------------------------------------------------------------------------

 The ``final`` keyword is a restriction on a declaration of a class, a method, or
 a property such that the declaration cannot be overridden. This implies that the
 compiler can emit direct function calls instead of indirect calls. For instance
 in the following ``C.array1`` and ``D.array1`` will be accessed directly
 [#]_. In contrast, ``D.array2`` will be called via a vtable:

 ::

   final class C {
     // No declarations in class 'C' can be overridden.
     var array1: [Int]
     func doSomething() { ... }
   }

   class D {
     final var array1 [Int] // 'array1' cannot be overridden by a computed property.
     var array2: [Int]      // 'array2' *can* be overridden by a computed property.
   }

   func usingC(c: C) {
      c.array1[i] = ... // Can directly access C.array without going through dynamic dispatch.
      c.doSomething() = ... // Can directly call C.doSomething without going through virtual dispatch.
   }

   func usingD(d: D) {
      d.array1[i] = ... // Can directly access D.array1 without going through dynamic dispatch.
      d.array2[i] = ... // Will access D.array2 through dynamic dispatch.
   }

 Advice: Use 'private' when declaration does not need to be accessed outside of file
 -----------------------------------------------------------------------------------

 Applying the ``private`` keyword to a declaration restricts the visibility of
 the declaration to the file in which it is declared. This allows the compiler to
 be able to ascertain all other potentially overriding declarations. Thus the
 absence of any such declarations enables the compiler to infer the ``final``
 keyword automatically and remove indirect calls for methods and field accesses
 accordingly. For instance in the following, ``e.doSomething()`` and
 ``f.myPrivateVar``, will be able to be accessed directly assuming ``E``, ``F``
 do not have any overriding declarations in the same file:

 ::

   private class E {
     func doSomething() { ... }
   }

   class F {
     private var myPrivateVar : Int
   }

   func usingE(e: E) {
     e.doSomething() // There is no sub class in the file that declares this class.
                     // The compiler can remove virtual calls to doSomething()
                     // and directly call A's doSomething method.
   }

   func usingF(f: F) -> Int {
     return f.myPrivateVar
   }

 Using Container Types Efficiently
 =================================

 An important feature provided by the Swift standard library are the generic
 containers Array and Dictionary. This section will explain how to use these
 types in a performant manner.

 Advice: Use value types in Array
 --------------------------------

 In Swift, types can be divided into two different categories: value types
 (structs, enums, tuples) and reference types (classes). A key distinction is
 that value types cannot be included inside an NSArray. Thus when using value
 types, the optimizer can remove most of the overhead in Array that is necessary
 to handle the possibility of the array being backed an NSArray.

 Additionally, In contrast to reference types, value types only need reference
 counting if they contain, recursively, a reference type. By using value types
 without reference types, one can avoid additional retain, release traffic inside
 Array.

 ::

   // Don't use a class here.
   struct PhonebookEntry {
     var name : String
     var number : [Int]
   }

   var a : [PhonebookEntry]

 Keep in mind that there is a trade-off between using large value types and using
 reference types. In certain cases, the overhead of copying and moving around
 large value types will outweigh the cost of removing the bridging and
 retain/release overhead.

 Advice: Use ContiguousArray with reference types when NSArray bridging is unnecessary
 -------------------------------------------------------------------------------------

 If you need an array of reference types and the array does not need to be
 bridged to NSArray, use ContiguousArray instead of Array:

 ::

   class C { ... }
   var a: ContiguousArray<C> = [C(...), C(...), ..., C(...)]

 Advice: Use inplace mutation instead of object-reassignment
 -----------------------------------------------------------

 All standard library containers in Swift are value types that use COW
 (copy-on-write) [#]_ to perform copies instead of explicit copies. In many cases
 this allows the compiler to elide unnecessary copies by retaining the container
 instead of performing a deep copy. This is done by only copying the underlying
 container if the reference count of the container is greater than 1 and the
 container is mutated. For instance in the following, no copying will occur when
 ``d`` is assigned to ``c``, but when ``d`` undergoes structural mutation by
 appending ``2``, ``d`` will be copied and then ``2`` will be appended to ``d``:

 ::

   var c: [Int] = [ ... ]
   var d = c        // No copy will occur here.
   d.append(2)      // A copy *does* occur here.

 Sometimes COW can introduce additional unexpected copies if the user is not
 careful. An example of this is attempting to perform mutation via
 object-reassignment in functions. In Swift, all parameters are passed in at +1,
 i.e. the parameters are retained before a callsite, and then are released at the
 end of the callee. This means that if one writes a function like the following:

 ::

   func append_one(a: [Int]) -> [Int] {
     a.append(1)
     return a
   }

   var a = [1, 2, 3]
   a = append_one(a)

 ``a`` may be copied [#]_ despite the version of ``a`` without one appended to it
 has no uses after ``append_one`` due to the assignment. This can be avoided
 through the usage of ``inout`` parameters:

 ::

   func append_one_in_place(inout a: [Int]) {
     a.append(1)
   }

   var a = [1, 2, 3]
   append_one_in_place(&a)

 Unchecked operations
 ====================

 Swift eliminates integer overflow bugs by checking for overflow when performing
 normal arithmetic. These checks are not appropriate in high performance code
 where one knows that no memory safety issues can result.

 Advice: Use unchecked integer arithmetic when you can prove that overflow cannot occur
 ---------------------------------------------------------------------------------------

 In performance-critical code you can elide overflow checks if you know it is
 safe.

 ::

   a : [Int]
   b : [Int]
   c : [Int]

   // Precondition: for all a[i], b[i]: a[i] + b[i] does not overflow!
   for i in 0 ... n {
     c[i] = a[i] &+ b[i]
   }

 Generics
 ========

 Swift provides a very powerful abstraction mechanism through the use of generic
 types. The Swift compiler emits one block of concrete code that can perform
 ``MySwiftFunc<T>`` for any ``T``. The generated code takes a table of function
 pointers and a box containing ``T`` as additional parameters. Any differences in
 behavior between ``MySwiftFunc<Int>`` and ``MySwiftFunc<String>`` are accounted
 for by passing a different table of function pointers and the size abstraction
 provided by the box. An example of generics:

 ::

   class MySwiftFunc<T> { ... }

   MySwiftFunc<Int> X    // Will emit code that works with Int...
   MySwiftFunc<String> Y // ... as well as String.

 When optimizations are enabled, the Swift compiler looks at each invocation of
 such code and attempts to ascertain the concrete (i.e. non-generic type) used in
 the invocation. If the generic function's definition is visible to the optimizer
 and the concrete type is known, the Swift compiler will emit a version of the
 generic function specialized to the specific type. This process, called
 *specialization*, enables the removal of the overhead associated with
 generics. Some more examples of generics:

 ::

   class MyStack<T> {
     func push(element: T) { ... }
     func pop() -> T { ... }
   }

   func myAlgorithm(a: [T], length: Int) { ... }

   // The compiler can specialize code of MyStack[Int]
   var stackOfInts: MyStack[Int]
   // Use stack of ints.
   for i in ... {
     stack.push(...)
     stack.pop(...)
   }

   var arrayOfInts: [Int]
   // The compiler can emit a specialized version of 'myAlgorithm' targeted for
   // [Int]' types.
   myAlgorithm(arrayOfInts, arrayOfInts.length)

 Advice: Put generic declarations in the same file where they are used
 ---------------------------------------------------------------------

 The optimizer can only perform specializations if the definition of the generic
 declaration is visible in the current Module. This can only occur if the
 declaration is in the same file as the invocation of the generic. *NOTE* The
 standard library is a special case. Definitions in the standard library are
 visible in all modules and available for specialization.

 Advice: Allow the compiler to perform generic specialization
 ------------------------------------------------------------

 The compiler can only specialize generic code if the call site and the callee
 function are located in the same compilation unit. One trick that we can use to
 allow compiler to optimize the callee function is to write code that performs a
 type check in the same compilation unit as the callee function. The code behind
 the type check then re-dispatches the call to the generic function - but this
 time it has the type information. In the code sample below we've inserted a type
 check into the function "play_a_game" and made the code run hundreds of times
 faster.

 ::

   //Framework.swift:

   protocol Pingable { func ping() -> Self }
   protocol Playable { func play() }

   extension Int : Pingable {
     func ping() -> Int { return self + 1 }
   }

   class Game<T : Pingable> : Playable {
     var t : T

     init (_ v : T) {t = v}

     func play() {
       for _ in 0...100_000_000 { t = t.ping() }
     }
   }

   func play_a_game(game : Playable ) {
     // This check allows the optimizer to specialize the
     // generic call 'play'
     if let z = game as? Game<Int> {
       z.play()
     } else {
       game.play()
     }
   }

   /// -------------- >8

   // Application.swift:

   play_a_game(Game(10))


 The cost of large swift values
 ==============================

 In Swift, values keep a unique copy of their data. There are several advantages
 to using value-types, like ensuring that values have independent state. When we
 copy values (the effect of assignment, initialization, and argument passing) the
 program will create a new copy of the value. For some large values these copies
 could be time consuming and hurt the performance of the program.

 .. More on value types:
 .. https://developer.apple.com/swift/blog/?id=10

 Consider the example below that defines a tree using "value" nodes. The tree
 nodes contain other nodes using a protocol. In computer graphics scenes are
 often composed from different entities and transformations that can be
 represented as values, so this example is somewhat realistic.

 .. See Protocol-Oriented-Programming:
 .. https://developer.apple.com/videos/play/wwdc2015-408/

 ::

   protocol P {}
   struct Node : P {
     var left, right : P?
   }

   struct Tree {
     var node : P?
     init() { ... }
   }


 When a tree is copied (passed as an argument, initialized or assigned) the whole
 tree needs to be copied. In the case of our tree this is an expensive operation
 that requires many calls to malloc/free and a significant reference counting
 overhead.

 However, we don't really care if the value is copied in memory as long as the
 semantics of the value remains.

 Advice: Use copy-on-write semantics for large values
 ----------------------------------------------------

 To eliminate the cost of copying large values adopt copy-on-write behavior.  The
 easiest way to implement copy-on-write is to compose existing copy-on-write data
 structures, such as Array. Swift arrays are values, but the content of the array
 is not copied around every time the array is passed as an argument because it
 features copy-on-write traits.

 In our Tree example we eliminate the cost of copying the content of the tree by
 wrapping it in an array. This simple change has a major impact on the
 performance of our tree data structure, and the cost of passing the array as an
 argument drops from being O(n), depending on the size of the tree to O(1).

 ::

   struct Tree : P {
     var node : [P?]
     init() {
       node = [ thing ]
     }
   }


 There are two obvious disadvantages of using Array for COW semantics. The first
 problem is that Array exposes methods like "append" and "count" that don't make
 any sense in the context of a value wrapper. These methods can make the use of
 the reference wrapper awkward. It is possible to work around this problem by
 creating a wrapper struct that will hide the unused APIs and the optimizer will
 remove this overhead, but this wrapper will not solve the second problem.  The
 Second problem is that Array has code for ensuring program safety and
 interaction with Objective-C. Swift checks if indexed accesses fall within the
 array bounds and when storing a value if the array storage needs to be extended.
 These runtime checks can slow things down.

 An alternative to using Array is to implement a dedicated copy-on-write data
 structure to replace Array as the value wrapper. The example below shows how to
 construct such a data structure:

 .. Note: This solution is suboptimal for nested structs, and an addressor based
 ..       COW data structure would be more efficient. However at this point it's not
 ..       possible to implement addressors out of the standard library.

 .. More details in this blog post by Mike Ash:
 .. https://www.mikeash.com/pyblog/friday-qa-2015-04-17-lets-build-swiftarray.html

 ::

   final class Ref<T> {
     var val : T
     init(_ v : T) {val = v}
   }

   struct Box<T> {
       var ref : Ref<T>
       init(_ x : T) { ref = Ref(x) }

       var value: T {
           get { return ref.val }
           set {
             if (!isUniquelyReferencedNonObjC(&ref)) {
               ref = Ref(newValue)
               return
             }
             ref.val = newValue
           }
       }
   }

 The type ``Box`` can replace the array in the code sample above.

 Unsafe code
 ===========

 Swift classes are always reference counted. The swift compiler inserts code
 that increments the reference count every time the object is accessed.
 For example, consider the problem of scanning a linked list that's
 implemented using classes. Scanning the list is done by moving a
 reference from one node to the next: ``elem = elem.next``. Every time we move
 the reference swift will increment the reference count of the ``next`` object
 and decrement the reference count of the previous object. These reference
 count operations are expensive and unavoidable when using Swift classes.

 ::

   final class Node {
    var next: Node?
    var data: Int
    ...
   }


 Advice: Use unmanaged references to avoid reference counting overhead
 ---------------------------------------------------------------------

 In performance-critical code you can use choose to use unmanaged
 references. The ``Unmanaged<T>`` structure allows developers to disable
 automatic reference counting for a specific reference.

 ::

     var Ref : Unmanaged<Node> = Unmanaged.passUnretained(Head)

     while let Next = Ref.takeUnretainedValue().next {
       ...
       Ref = Unmanaged.passUnretained(Next)
     }


 Protocols
 =========

 Advice: Mark protocols that are only satisfied by classes as class-protocols
 ----------------------------------------------------------------------------

 Swift can limit protocols adoption to classes only. One advantage of marking
 protocols as class-only is that the compiler can optimize the program based on
 the knowledge that only classes satisfy a protocol. For example, the ARC memory
 management system can easily retain (increase the reference count of an object)
 if it knows that it is dealing with a class. Without this knowledge the compiler
 has to assume that a struct may satisfy the protocol and it needs to be prepared
 to retain or release non-trivial structures, which can be expensive.

 If it makes sense to limit the adoption of protocols to classes then mark
 protocols as class-only protocols to get better runtime performance.

 ::

   protocol Pingable : class { func ping() -> Int }

 .. https://developer.apple.com/library/ios/documentation/Swift/Conceptual/Swift_Programming_Language/Protocols.html


 Footnotes
 =========

 .. [#] A virtual method table or 'vtable' is a type specific table referenced by
        instances that contains the addresses of the type's methods. Dynamic
        dispatch proceeds by first looking up the table from the object and then
        looking up the method in the table.

 .. [#] This is due to the compiler not knowing the exact function being called.

 .. [#] i.e. a direct load of a class's field or a direct call to a function.

 .. [#] Explain what COW is here.

 .. [#] In certain cases the optimizer is able to via inlining and ARC
        optimization remove the retain, release causing no copy to occur.
	:orphan:

	Writing High-Performance Swift Code
	===================================

	The following document is a gathering of various tips and tricks for writing
	high-performance Swift code. The intended audience of this document is compiler
	and standard library developers.

	Some of the tips in this document can help improve the quality of your Swift
	program and make your code less error prone and more readable. Explicitly
	marking final-classes and class-protocols are two obvious examples. However some
	of the tips described in this document are unprincipled, twisted and come to
	solve a specific temporary limitation of the compiler or the language. Many of
	the recommendations in this document come with trade offs for things like
	program runtime, binary size, code readability, etc.


	Enabling Optimizations
	======================

	The first thing one should always do is to enable optimization. Swift provides
	three different optimization levels:

	- ``-Onone``: This is meant for normal development. It performs minimal
	optimizations and preserves all debug info.
	- ``-O``: This is meant for most production code. The compiler performs
	aggressive optimizations that can drastically change the type and amount of
	emitted code. Debug information will be emitted but will be lossy.
	- ``-Ounchecked``: This is a special optimization mode meant for specific
	libraries or applications where one is willing to trade safety for
	performance. The compiler will remove all overflow checks as well as some
	implicit type checks. This is not intended to be used in general since it may
	result in undetected memory safety issues and integer overflows. Only use this
	if you have carefully reviewed that your code is safe with respect to integer
	overflow and type casts.

	In the Xcode UI, one can modify the current optimization level as follows:

	...


	Whole Module Optimizations
	==========================

	By default Swift compiles each file individually. This allows Xcode to
	compile multiple files in parallel very quickly. However, compiling each file
	separately prevents certain compiler optimizations. Swift can also compile
	the entire program as if it were one file and optimize the program as if it
	were a single compilation unit. This mode is enabled using the command
	line flag ``-whole-module-optimization``. Programs that are compiled in
	this mode will most likely take longer to compile, but may run faster.

	This mode can be enabled using the Xcode build setting 'Whole Module Optimization'.


	Reducing Dynamic Dispatch
	=========================

	Swift by default is a very dynamic language like Objective-C. Unlike Objective
	C, Swift gives the programmer the ability to improve runtime performance when
	necessary by removing or reducing this dynamicism. This section goes through
	several examples of language constructs that can be used to perform such an
	operation.

	Dynamic Dispatch
	----------------

	Classes use dynamic dispatch for methods and property accesses by default. Thus
	in the following code snippet, ``a.aProperty``, ``a.doSomething()`` and
	``a.doSomethingElse()`` will all be invoked via dynamic dispatch:

	::

	class A {
	var aProperty: [Int]
	func doSomething() { ... }
	dynamic doSomethingElse() { ... }
	}

	class B : A {
	override var aProperty {
	get { ... }
	set { ... }
	}

	override func doSomething() { ... }
	}

	func usingAnA(a: A) {
	a.doSomething()
	a.aProperty = ...
	}

	In Swift, dynamic dispatch defaults to indirect invocation through a vtable
	[#]_. If one attaches the ``dynamic`` keyword to the declaration, Swift will
	emit calls via Objective-C message send instead. In both cases this is slower
	than a direct function call because it prevents many compiler optimizations [#]_
	in addition to the overhead of performing the indirect call itself. In
	performance critical code, one often will want to restrict this dynamic
	behavior.

	Advice: Use 'final' when you know the declaration does not need to be overridden
	--------------------------------------------------------------------------------

	The ``final`` keyword is a restriction on a declaration of a class, a method, or
	a property such that the declaration cannot be overridden. This implies that the
	compiler can emit direct function calls instead of indirect calls. For instance
	in the following ``C.array1`` and ``D.array1`` will be accessed directly
	[#]_. In contrast, ``D.array2`` will be called via a vtable:

	::

	final class C {
	// No declarations in class 'C' can be overridden.
	var array1: [Int]
	func doSomething() { ... }
	}

	class D {
	final var array1 [Int] // 'array1' cannot be overridden by a computed property.
	var array2: [Int] // 'array2' can be overridden by a computed property.
	}

	func usingC(c: C) {
	c.array1[i] = ... // Can directly access C.array without going through dynamic dispatch.
	c.doSomething() = ... // Can directly call C.doSomething without going through virtual dispatch.
	}

	func usingD(d: D) {
	d.array1[i] = ... // Can directly access D.array1 without going through dynamic dispatch.
	d.array2[i] = ... // Will access D.array2 through dynamic dispatch.
	}

	Advice: Use 'private' when declaration does not need to be accessed outside of file
	-----------------------------------------------------------------------------------

	Applying the ``private`` keyword to a declaration restricts the visibility of
	the declaration to the file in which it is declared. This allows the compiler to
	be able to ascertain all other potentially overriding declarations. Thus the
	absence of any such declarations enables the compiler to infer the ``final``
	keyword automatically and remove indirect calls for methods and field accesses
	accordingly. For instance in the following, ``e.doSomething()`` and
	``f.myPrivateVar``, will be able to be accessed directly assuming ``E``, ``F``
	do not have any overriding declarations in the same file:

	::

	private class E {
	func doSomething() { ... }
	}

	class F {
	private var myPrivateVar : Int
	}

	func usingE(e: E) {
	e.doSomething() // There is no sub class in the file that declares this class.
	// The compiler can remove virtual calls to doSomething()
	// and directly call A's doSomething method.
	}

	func usingF(f: F) -> Int {
	return f.myPrivateVar
	}

	Using Container Types Efficiently
	=================================

	An important feature provided by the Swift standard library are the generic
	containers Array and Dictionary. This section will explain how to use these
	types in a performant manner.

	Advice: Use value types in Array
	--------------------------------

	In Swift, types can be divided into two different categories: value types
	(structs, enums, tuples) and reference types (classes). A key distinction is
	that value types cannot be included inside an NSArray. Thus when using value
	types, the optimizer can remove most of the overhead in Array that is necessary
	to handle the possibility of the array being backed an NSArray.

	Additionally, In contrast to reference types, value types only need reference
	counting if they contain, recursively, a reference type. By using value types
	without reference types, one can avoid additional retain, release traffic inside
	Array.

	::

	// Don't use a class here.
	struct PhonebookEntry {
	var name : String
	var number : [Int]
	}

	var a : [PhonebookEntry]

	Keep in mind that there is a trade-off between using large value types and using
	reference types. In certain cases, the overhead of copying and moving around
	large value types will outweigh the cost of removing the bridging and
	retain/release overhead.

	Advice: Use ContiguousArray with reference types when NSArray bridging is unnecessary
	-------------------------------------------------------------------------------------

	If you need an array of reference types and the array does not need to be
	bridged to NSArray, use ContiguousArray instead of Array:

	::

	class C { ... }
	var a: ContiguousArray<C> = [C(...), C(...), ..., C(...)]

	Advice: Use inplace mutation instead of object-reassignment
	-----------------------------------------------------------

	All standard library containers in Swift are value types that use COW
	(copy-on-write) [#]_ to perform copies instead of explicit copies. In many cases
	this allows the compiler to elide unnecessary copies by retaining the container
	instead of performing a deep copy. This is done by only copying the underlying
	container if the reference count of the container is greater than 1 and the
	container is mutated. For instance in the following, no copying will occur when
	``d`` is assigned to ``c``, but when ``d`` undergoes structural mutation by
	appending ``2``, ``d`` will be copied and then ``2`` will be appended to ``d``:

	::

	var c: [Int] = [ ... ]
	var d = c // No copy will occur here.
	d.append(2) // A copy does occur here.

	Sometimes COW can introduce additional unexpected copies if the user is not
	careful. An example of this is attempting to perform mutation via
	object-reassignment in functions. In Swift, all parameters are passed in at +1,
	i.e. the parameters are retained before a callsite, and then are released at the
	end of the callee. This means that if one writes a function like the following:

	::

	func append_one(a: [Int]) -> [Int] {
	a.append(1)
	return a
	}

	var a = [1, 2, 3]
	a = append_one(a)

	``a`` may be copied [#]_ despite the version of ``a`` without one appended to it
	has no uses after ``append_one`` due to the assignment. This can be avoided
	through the usage of ``inout`` parameters:

	::

	func append_one_in_place(inout a: [Int]) {
	a.append(1)
	}

	var a = [1, 2, 3]
	append_one_in_place(&a)

	Unchecked operations
	====================

	Swift eliminates integer overflow bugs by checking for overflow when performing
	normal arithmetic. These checks are not appropriate in high performance code
	where one knows that no memory safety issues can result.

	Advice: Use unchecked integer arithmetic when you can prove that overflow cannot occur
	---------------------------------------------------------------------------------------

	In performance-critical code you can elide overflow checks if you know it is
	safe.

	::

	a : [Int]
	b : [Int]
	c : [Int]

	// Precondition: for all a[i], b[i]: a[i] + b[i] does not overflow!
	for i in 0 ... n {
	c[i] = a[i] &+ b[i]
	}

	Generics
	========

	Swift provides a very powerful abstraction mechanism through the use of generic
	types. The Swift compiler emits one block of concrete code that can perform
	``MySwiftFunc<T>`` for any ``T``. The generated code takes a table of function
	pointers and a box containing ``T`` as additional parameters. Any differences in
	behavior between ``MySwiftFunc<Int>`` and ``MySwiftFunc<String>`` are accounted
	for by passing a different table of function pointers and the size abstraction
	provided by the box. An example of generics:

	::

	class MySwiftFunc<T> { ... }

	MySwiftFunc<Int> X // Will emit code that works with Int...
	MySwiftFunc<String> Y // ... as well as String.

	When optimizations are enabled, the Swift compiler looks at each invocation of
	such code and attempts to ascertain the concrete (i.e. non-generic type) used in
	the invocation. If the generic function's definition is visible to the optimizer
	and the concrete type is known, the Swift compiler will emit a version of the
	generic function specialized to the specific type. This process, called
	specialization, enables the removal of the overhead associated with
	generics. Some more examples of generics:

	::

	class MyStack<T> {
	func push(element: T) { ... }
	func pop() -> T { ... }
	}

	func myAlgorithm(a: [T], length: Int) { ... }

	// The compiler can specialize code of MyStack[Int]
	var stackOfInts: MyStack[Int]
	// Use stack of ints.
	for i in ... {
	stack.push(...)
	stack.pop(...)
	}

	var arrayOfInts: [Int]
	// The compiler can emit a specialized version of 'myAlgorithm' targeted for
	// [Int]' types.
	myAlgorithm(arrayOfInts, arrayOfInts.length)

	Advice: Put generic declarations in the same file where they are used
	---------------------------------------------------------------------

	The optimizer can only perform specializations if the definition of the generic
	declaration is visible in the current Module. This can only occur if the
	declaration is in the same file as the invocation of the generic. NOTE The
	standard library is a special case. Definitions in the standard library are
	visible in all modules and available for specialization.

	Advice: Allow the compiler to perform generic specialization
	------------------------------------------------------------

	The compiler can only specialize generic code if the call site and the callee
	function are located in the same compilation unit. One trick that we can use to
	allow compiler to optimize the callee function is to write code that performs a
	type check in the same compilation unit as the callee function. The code behind
	the type check then re-dispatches the call to the generic function - but this
	time it has the type information. In the code sample below we've inserted a type
	check into the function "play_a_game" and made the code run hundreds of times
	faster.

	::

	//Framework.swift:

	protocol Pingable { func ping() -> Self }
	protocol Playable { func play() }

	extension Int : Pingable {
	func ping() -> Int { return self + 1 }
	}

	class Game<T : Pingable> : Playable {
	var t : T

	init (_ v : T) {t = v}

	func play() {
	for _ in 0...100_000_000 { t = t.ping() }
	}
	}

	func play_a_game(game : Playable ) {
	// This check allows the optimizer to specialize the
	// generic call 'play'
	if let z = game as? Game<Int> {
	z.play()
	} else {
	game.play()
	}
	}

	/// -------------- >8

	// Application.swift:

	play_a_game(Game(10))


	The cost of large swift values
	==============================

	In Swift, values keep a unique copy of their data. There are several advantages
	to using value-types, like ensuring that values have independent state. When we
	copy values (the effect of assignment, initialization, and argument passing) the
	program will create a new copy of the value. For some large values these copies
	could be time consuming and hurt the performance of the program.

	.. More on value types:
	.. https://developer.apple.com/swift/blog/?id=10

	Consider the example below that defines a tree using "value" nodes. The tree
	nodes contain other nodes using a protocol. In computer graphics scenes are
	often composed from different entities and transformations that can be
	represented as values, so this example is somewhat realistic.

	.. See Protocol-Oriented-Programming:
	.. https://developer.apple.com/videos/play/wwdc2015-408/

	::

	protocol P {}
	struct Node : P {
	var left, right : P?
	}

	struct Tree {
	var node : P?
	init() { ... }
	}


	When a tree is copied (passed as an argument, initialized or assigned) the whole
	tree needs to be copied. In the case of our tree this is an expensive operation
	that requires many calls to malloc/free and a significant reference counting
	overhead.

	However, we don't really care if the value is copied in memory as long as the
	semantics of the value remains.

	Advice: Use copy-on-write semantics for large values
	----------------------------------------------------

	To eliminate the cost of copying large values adopt copy-on-write behavior. The
	easiest way to implement copy-on-write is to compose existing copy-on-write data
	structures, such as Array. Swift arrays are values, but the content of the array
	is not copied around every time the array is passed as an argument because it
	features copy-on-write traits.

	In our Tree example we eliminate the cost of copying the content of the tree by
	wrapping it in an array. This simple change has a major impact on the
	performance of our tree data structure, and the cost of passing the array as an
	argument drops from being O(n), depending on the size of the tree to O(1).

	::

	struct Tree : P {
	var node : [P?]
	init() {
	node = [ thing ]
	}
	}


	There are two obvious disadvantages of using Array for COW semantics. The first
	problem is that Array exposes methods like "append" and "count" that don't make
	any sense in the context of a value wrapper. These methods can make the use of
	the reference wrapper awkward. It is possible to work around this problem by
	creating a wrapper struct that will hide the unused APIs and the optimizer will
	remove this overhead, but this wrapper will not solve the second problem. The
	Second problem is that Array has code for ensuring program safety and
	interaction with Objective-C. Swift checks if indexed accesses fall within the
	array bounds and when storing a value if the array storage needs to be extended.
	These runtime checks can slow things down.

	An alternative to using Array is to implement a dedicated copy-on-write data
	structure to replace Array as the value wrapper. The example below shows how to
	construct such a data structure:

	.. Note: This solution is suboptimal for nested structs, and an addressor based
	.. COW data structure would be more efficient. However at this point it's not
	.. possible to implement addressors out of the standard library.

	.. More details in this blog post by Mike Ash:
	.. https://www.mikeash.com/pyblog/friday-qa-2015-04-17-lets-build-swiftarray.html

	::

	final class Ref<T> {
	var val : T
	init(_ v : T) {val = v}
	}

	struct Box<T> {
	var ref : Ref<T>
	init(_ x : T) { ref = Ref(x) }

	var value: T {
	get { return ref.val }
	set {
	if (!isUniquelyReferencedNonObjC(&ref)) {
	ref = Ref(newValue)
	return
	}
	ref.val = newValue
	}
	}
	}

	The type ``Box`` can replace the array in the code sample above.

	Unsafe code
	===========

	Swift classes are always reference counted. The swift compiler inserts code
	that increments the reference count every time the object is accessed.
	For example, consider the problem of scanning a linked list that's
	implemented using classes. Scanning the list is done by moving a
	reference from one node to the next: ``elem = elem.next``. Every time we move
	the reference swift will increment the reference count of the ``next`` object
	and decrement the reference count of the previous object. These reference
	count operations are expensive and unavoidable when using Swift classes.

	::

	final class Node {
	var next: Node?
	var data: Int
	...
	}


	Advice: Use unmanaged references to avoid reference counting overhead
	---------------------------------------------------------------------

	In performance-critical code you can use choose to use unmanaged
	references. The ``Unmanaged<T>`` structure allows developers to disable
	automatic reference counting for a specific reference.

	::

	var Ref : Unmanaged<Node> = Unmanaged.passUnretained(Head)

	while let Next = Ref.takeUnretainedValue().next {
	...
	Ref = Unmanaged.passUnretained(Next)
	}


	Protocols
	=========

	Advice: Mark protocols that are only satisfied by classes as class-protocols
	----------------------------------------------------------------------------

	Swift can limit protocols adoption to classes only. One advantage of marking
	protocols as class-only is that the compiler can optimize the program based on
	the knowledge that only classes satisfy a protocol. For example, the ARC memory
	management system can easily retain (increase the reference count of an object)
	if it knows that it is dealing with a class. Without this knowledge the compiler
	has to assume that a struct may satisfy the protocol and it needs to be prepared
	to retain or release non-trivial structures, which can be expensive.

	If it makes sense to limit the adoption of protocols to classes then mark
	protocols as class-only protocols to get better runtime performance.

	::

	protocol Pingable : class { func ping() -> Int }

	.. https://developer.apple.com/library/ios/documentation/Swift/Conceptual/Swift_Programming_Language/Protocols.html



	Footnotes
	=========

	.. [#] A virtual method table or 'vtable' is a type specific table referenced by
	instances that contains the addresses of the type's methods. Dynamic
	dispatch proceeds by first looking up the table from the object and then
	looking up the method in the table.

	.. [#] This is due to the compiler not knowing the exact function being called.

	.. [#] i.e. a direct load of a class's field or a direct call to a function.

	.. [#] Explain what COW is here.

	.. [#] In certain cases the optimizer is able to via inlining and ARC
	optimization remove the retain, release causing no copy to occur.