Type Checker Design and Implementation
========================================
Purpose
-----------------
This document describes the design and implementation of the Swift
type checker. It is intended for developers who wish to modify,
extend, or improve on the type checker, or simply to understand in
greater depth how the Swift type system works. Familiarity with the
Swift programming language is assumed.
Approach
-------------------
The Swift language and its type system incorporate a number of popular
language features, including object-oriented programming via classes,
function and operator overloading, subtyping, and constrained
parametric polymorphism. Swift makes extensive use of type inference,
allowing one to omit the types of many variables and expressions. For
example::
func round(_ x: Double) -> Int { /* ... */ }
var pi: Double = 3.14159
var three = round(pi) // 'three' has type 'Int'
func identity<T>(_ x: T) -> T { return x }
var eFloat: Float = -identity(2.71828) // numeric literal gets type 'Float'
Swift's type inference allows type information to flow in two
directions. As in most mainstream languages, type information can flow
from the leaves of the expression tree (e.g., the expression 'pi',
which refers to a double) up to the root (the type of the variable
'three'). However, Swift also allows type information to flow from the
context (e.g., the fixed type of the variable 'eFloat') at the root of
the expression tree down to the leaves (the type of the numeric
literal 2.71828). This bi-directional type inference is common in
languages that use ML-like type systems, but is not present in
mainstream languages like C++, Java, C#, or Objective-C.
Swift implements bi-directional type inference using a
constraint-based type checker that is reminiscent of the classical
Hindley-Milner type inference algorithm. The use of a constraint
system allows a straightforward, general presentation of language
semantics that is decoupled from the actual implementation of the
solver. It is expected that the constraints themselves will be
relatively stable, while the solver will evolve over time to improve
performance and diagnostics.
The Swift language contains a number of features not part of the
Hindley-Milner type system, including constrained polymorphic types
and function overloading, which complicate the presentation and
implementation somewhat. On the other hand, Swift limits the scope of
type inference to a single expression or statement, for purely
practical reasons: we expect that we can provide better performance
and vastly better diagnostics when the problem is limited in scope.
Type checking proceeds in three main stages:
`Constraint Generation`_
Given an input expression and (possibly) additional contextual
information, generate a set of type constraints that describes the
relationships among the types of the various subexpressions. The
generated constraints may contain unknown types, represented by type
variables, which will be determined by the solver.
`Constraint Solving`_
Solve the system of constraints by assigning concrete types to each
of the type variables in the constraint system. The constraint
solver should provide the most specific solution possible among
different alternatives.
`Solution Application`_
Given the input expression, the set of type constraints generated
from that expression, and the set of assignments of concrete types
to each of the type variables, produce a well-typed expression that
makes all implicit conversions (and other transformations) explicit
and resolves all unknown types and overloads. This step cannot fail.
The following sections describe these three stages of type checking,
as well as matters of performance and diagnostics.
Constraints
----------------
A constraint system consists of a set of type constraints. Each type
constraint either places a requirement on a single type (e.g., it is
an integer literal type) or relates two types (e.g., one is a subtype
of the other). The types described in constraints can be any type in
the Swift type system including, e.g., builtin types, tuple types,
function types, enum/struct/class types, protocol types, and generic
types. Additionally, a type can be a type variable ``T`` (which are
typically numbered, ``T0``, ``T1``, ``T2``, etc., and are introduced
as needed). Type variables can be used in place of any other type,
e.g., a tuple type ``(T0, Int, (T0) -> Int)`` involving the type
variable ``T0``.
There are a number of different kinds of constraints used to describe
the Swift type system:
**Equality**
An equality constraint requires two types to be identical. For
example, the constraint ``T0 == T1`` effectively ensures that ``T0`` and
``T1`` get the same concrete type binding. There are two different
flavors of equality constraints:
- Exact equality constraints, or "binding", written ``T0 := X``
for some type variable ``T0`` and type ``X``, which requires
that ``T0`` be exactly identical to ``X``;
- Equality constraints, written ``X == Y`` for types ``X`` and
``Y``, which require ``X`` and ``Y`` to have the same type,
ignoring lvalue types in the process. For example, the
constraint ``T0 == X`` would be satisfied by assigning ``T0``
the type ``X`` and by assigning ``T0`` the type ``@lvalue X``.
**Subtyping**
A subtype constraint requires the first type to be equivalent to or
a subtype of the second. For example, a class type ``Dog`` is a
subtype of a class type ``Animal`` if ``Dog`` inherits from
``Animal`` either directly or indirectly. Subtyping constraints are
written ``X < Y``.
**Conversion**
A conversion constraint requires that the first type be convertible
to the second, which includes subtyping and equality. Additionally,
it allows a user-defined conversion function to be
called. Conversion constraints are written ``X <c Y``, read as
"``X`` can be converted to ``Y``."
**Construction**
A construction constraint, written ``X <C Y``, requires that the
second type be a nominal type with a constructor that accepts a
value of the first type. For example, the constraint ``Int <C
String`` is satisfiable because ``String`` has a constructor that
accepts an ``Int``.
**Member**
A member constraint ``X[.name] == Y`` specifies that the first type
(``X``) have a member (or an overloaded set of members) with the
given name, and that the type of that member be bound to the second
type (``Y``). There are two flavors of member constraint: value
member constraints, which refer to the member in an expression
context, and type member constraints, which refer to the member in a
type context (and therefore can only refer to types).
**Conformance**
A conformance constraint ``X conforms to Y`` specifies that the
first type (``X``) must conform to the protocol ``Y``.
**Checked cast**
A constraint describing a checked cast from the first type to the
second, i.e., for ``x as T``.
**Applicable function**
An applicable function constraint requires that both types are function
types with the same input and output types. It is used when the function
type on the left-hand side is being split into its input and output
types for function application purposes. Note that it does not
require the type attributes to match.
**Overload binding**
An overload binding constraint binds a type variable by selecting a
particular choice from an overload set. Multiple overloads are
represented by a disjunction constraint.
**Conjunction**
A constraint that is the conjunction of two or more other
constraints. Typically used within a disjunction.
**Disjunction**
A constraint that is the disjunction of two or more
constraints. Disjunctions are used to model different decisions that
the solver could make, i.e., the sets of overloaded functions from
which the solver could choose, or different potential conversions,
each of which might result in a (different) solution.
**Archetype**
An archetype constraint requires that the constrained type be bound
to an archetype. This is a very specific kind of constraint that is
only used for calls to operators in protocols.
**Class**
A class constraint requires that the constrained type be bound to a
class type.
**Self object of protocol**
An internal-use-only constraint that describes the conformance of a
``Self`` type to a protocol. It is similar to a conformance
constraint but "looser" because it allows a protocol type to be the
self object of its own protocol (even when an existential type would
not conform to its own protocol).
**Dynamic lookup value**
A constraint that requires that the constrained type be
DynamicLookup or an lvalue thereof.
Constraint Generation
``````````````````````````
The process of constraint generation produces a constraint system
that relates the types of the various subexpressions within an
expression. Programmatically, constraint generation walks an
expression from the leaves up to the root, assigning a type (which
often involves type variables) to each subexpression as it goes.
Constraint generation is driven by the syntax of the
expression, and each different kind of expression---function
application, member access, etc.---generates a specific set of
constraints. Here, we enumerate the primary expression kinds in the
language and describe the type assigned to the expression and the
constraints generated from such an expression. We use ``T(a)`` to
refer to the type assigned to the subexpression ``a``. The constraints
and types generated from the primary expression kinds are:
**Declaration reference**
An expression that refers to a declaration ``x`` is assigned the
type of a reference to ``x``. For example, if ``x`` is declared as
``var x: Int``, the expression ``x`` is assigned the type
``@lvalue Int``. No constraints are generated.
When a name refers to a set of overloaded declarations, the
selection of the appropriate declaration is handled by the
solver. This particular issue is discussed in the `Overloading`_
section. Additionally, when the name refers to a generic function or
a generic type, the declaration reference may introduce new type
variables; see the `Polymorphic Types`_ section for more information.
**Member reference**
A member reference expression ``a.b`` is assigned the type ``T0``
for a fresh type variable ``T0``. In addition, the expression
generates the value member constraint ``T(a).b == T0``. Member
references may end up resolving to a member of a nominal type or an
element of a tuple; in the latter case, the name (``b``) may
either be an identifier or a positional argument (e.g., ``1``).
Note that resolution of the member constraint can refer to a set of
overloaded declarations; this is described further in the
`Overloading`_ section.
**Unresolved member reference**
An unresolved member reference ``.name`` refers to a member of an
enum type. The enum type is represented by a fresh type variable
``T0`` (since that type can only be known from context), and a
value member constraint ``T0.name == T1``, for fresh type variable
``T1``, captures the fact that it has a member named ``name`` with
some as-yet-unknown type ``T1``. The type of the unresolved member
reference is ``T1``, the type of the member. When the unresolved
member reference is actually a call ``.name(x)``, the function
application is folded into the constraints generated by the
unresolved member reference.
Note that the constraint system above actually has insufficient
information to determine the type ``T0`` without additional
contextual information. The `Overloading`_ section describes how the
overload-selection mechanism is used to resolve this problem.
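For illustration, consider the following (assumed) example, in which the
contextual parameter type supplies the binding for ``T0``::
enum Suit {
    case hearts, spades, diamonds, clubs
}
func discard(_ s: Suit) { /* ... */ }
// '.spades' generates 'T0.spades == T1'; converting the argument to the
// parameter type binds T0 := Suit, and name lookup then gives T1 := Suit.
discard(.spades)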
**Function application**
A function application ``a(b)`` generates two constraints. First,
the applicable function constraint ``T0 -> T1 ==Fn T(a)`` (for fresh
type variables ``T0`` and ``T1``) captures the rvalue-to-lvalue
conversion applied on the function (``a``) and decomposes the
function type into its argument and result types. Second, the
conversion constraint ``T(b) <c T0`` captures the requirement that
the actual argument type (``b``) be convertible to the argument type
of the function. Finally, the expression is given the type ``T1``,
i.e., the result type of the function.
**Construction**
A type construction ``A(b)``, where ``A`` refers to a type, generates
a construction constraint ``T(b) <C A``, which requires that ``A``
have a constructor that accepts ``b``. The type of the expression is
``A``.
Note that construction and function application use the same
syntax. Here, the constraint generator performs a shallow analysis
of the type of the "function" argument (``A`` or ``a``, in the
exposition above); if it obviously has metatype type, the expression
is considered a coercion/construction rather than a function
application. This particular area of the language needs more work.
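For example, given an (assumed) type with a suitable initializer, the
construction constraint is satisfied directly::
struct Celsius {
    init(_ value: Double) { /* ... */ }
}
// 'Celsius(21)' generates the construction constraint 'T(21) <C Celsius',
// which is satisfiable because 'Celsius' has an initializer whose parameter
// type the integer literal can be converted to.
let temperature = Celsius(21)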
**Subscripting**
A subscript operation ``a[b]`` is similar to function application. A
value member constraint ``T(a).subscript == T0 -> T1`` treats the
subscript as a function from the key type to the value type,
represented by fresh type variables ``T0`` and ``T1``,
respectively. The constraint ``T(b) <c T0`` requires the key
argument to be convertible to the key type, and the type of the
subscript operation is ``T1``.
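For example, with an (assumed) type that declares a subscript, the two
constraints resolve as follows::
struct Histogram {
    subscript(key: String) -> Int { return key.count }
}
let h = Histogram()
// 'T(h).subscript == T0 -> T1' resolves to T0 := String and T1 := Int;
// 'T("word") <c T0' is satisfied, so 'h["word"]' has type Int.
let wordCount = h["word"]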
**Literals**
A literal expression, such as ``17``, ``1.5``, or ``"Hello,
world!"``, is assigned a fresh type variable ``T0``. Additionally, a
literal constraint is placed on that type variable depending on the
kind of literal, e.g., "``T0`` is an integer literal."
**Closures**
A closure is assigned a function type based on the parameters and
return type. When a parameter has no specified type or is positional
(``$1``, ``$2``, etc.), it is assigned a fresh type variable to
capture the type. Similarly, if the return type is omitted, it is
assigned a fresh type variable.
When the body of the closure is a single expression, that expression
participates in the type checking of its enclosing expression
directly. Otherwise, the body of the closure is separately
type-checked once the type checking of its context has computed a
complete function type.
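For example, in the following call the closure's positional parameter and
its omitted return type each receive a fresh type variable, both of which
the enclosing call to ``map`` determines::
// '$0' and the closure's result type are fresh type variables; the context
// provided by mapping over an array of Int binds both to 'Int'.
let doubled = [1, 2, 3].map { $0 * 2 }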
**Array allocation**
An array allocation ``new A[s]`` is assigned the type ``A[]``. The
type checker (separately) checks that ``T(s)`` is an array bound
type.
**Address of**
An address-of expression ``&a`` always returns an ``@inout``
type. Therefore, it is assigned the type ``@inout T0`` for a fresh
type variable ``T0``. The subtyping constraint ``@inout T0 < @lvalue
T(a)`` captures the requirement that the input expression be an lvalue
of some type.
**Ternary operator**
A ternary operator ``x ? y : z`` generates a number of
constraints. The type ``T(x)`` must conform to the ``LogicValue``
protocol to determine which branch is taken. Then, a new type
variable ``T0`` is introduced to capture the result type, and the
constraints ``T(y) <c T0`` and ``T(z) <c T0`` capture the need for
both branches of the ternary operator to convert to a common type.
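For example, in the following (assumed) expression both branch types must
convert to the common result type::
let flag = true
// 'T(1) <c T0' and 'T(2.5) <c T0' force both branches to a common type;
// the solver binds T0 := Double, so the integer literal is typed as Double.
let result = flag ? 1 : 2.5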
There are a number of other expression kinds within the language; see
the constraint generator for their mapping to constraints.
Overloading
''''''''''''''''''''''''''
Overloading is the process of giving multiple, different definitions
to the same name. For example, we might overload a ``negate`` function
to work on both ``Int`` and ``Double`` types, e.g.::
func negate(_ x: Int) -> Int { return -x }
func negate(_ x: Double) -> Double { return -x }
Given that there are two definitions of ``negate``, what is the type
of the declaration reference expression ``negate``? If one selects the
first overload, the type is ``(Int) -> Int``; for the second overload,
the type is ``(Double) -> Double``. However, constraint generation
needs to assign some specific type to the expression, so that its
parent expressions can refer to that type.
Overloading in the type checker is modeled by introducing a fresh type
variable (call it ``T0``) for the type of the reference to an
overloaded declaration. Then, a disjunction constraint is introduced,
in which each term binds that type variable (via an exact equality
constraint) to the type produced by one of the overloads in the
overload set. In our negate example, the disjunction is
``T0 := (Int) -> Int or T0 := (Double) -> Double``. The constraint
solver, discussed in the later section on `Constraint Solving`_,
explores both possible bindings, and the overloaded reference resolves
to whichever binding results in a solution that satisfies all
constraints [#]_.
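Continuing the ``negate`` example, each call site resolves the disjunction
based on its argument::
negate(42)       // both terms admit a solution; the '(Int) -> Int' overload
                 // is preferred because the literal keeps its default type
negate(2.71828)  // only 'T0 := (Double) -> Double' leads to a solution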
Overloading can be introduced both by expressions that refer to sets
of overloaded declarations and by member constraints that end up
resolving to a set of overloaded declarations. One particularly
interesting case is the unresolved member reference, e.g.,
``.name``. As noted in the prior section, this generates the
constraint ``T0.name == T1``, where ``T0`` is a fresh type variable
that will be bound to the enum type and ``T1`` is a fresh type
variable that will be bound to the type of the selected member. The
issue noted in the prior section is that this constraint does not give
the solver enough information to determine ``T0`` without
guesswork. However, we note that the type of an enum member actually
has a regular structure. For example, consider the ``Optional`` type::
enum Optional<T> {
case None
case Some(T)
}
The type of ``Optional<T>.None`` is ``Optional<T>``, while the type of
``Optional<T>.Some`` is ``(T) -> Optional<T>``. In fact, the
type of an enum element can have one of two forms: it can be ``T0``,
for an enum element that has no extra data, or it can be ``T2 -> T0``,
where ``T2`` is the data associated with the enum element. For the
latter case, the actual arguments are parsed as part of the unresolved
member reference, so that a function application constraint describes
their conversion to the input type ``T2``.
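Using the ``Optional`` declaration above (with its ``None`` and ``Some``
cases), contextual type information is what determines ``T0``::
// '.None' generates 'T0.None == T1'; the declared type of the variable binds
// T0 := Optional<Int>, and name lookup then gives T1 := Optional<Int>.
var noValue: Optional<Int> = .None
// For '.Some(5)', the member has type '(Int) -> Optional<Int>' once T0 is
// bound, and the folded-in application gives the expression type Optional<Int>.
var someValue: Optional<Int> = .Some(5)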
Polymorphic Types
''''''''''''''''''''''''''''''''''''''''''''''
The Swift language includes generics, a system of constrained
parameter polymorphism that enables polymorphic types and
functions. For example, one can implement a ``min`` function as,
e.g.,::
func min<T : Comparable>(x: T, y: T) -> T {
if y < x { return y }
return x
}
Here, ``T`` is a generic parameter that can be replaced with any
concrete type, so long as that type conforms to the protocol
``Comparable``. The type of ``min`` is (internally) written as ``<T :
Comparable> (x: T, y: T) -> T``, which can be read as "for all ``T``,
where ``T`` conforms to ``Comparable``, the type of the function is
``(x: T, y: T) -> T``." Different uses of the ``min`` function may
have different bindings for the generic parameter ``T``.
When the constraint generator encounters a reference to a generic
function, it immediately replaces each of the generic parameters within
the function type with a fresh type variable, introduces constraints
on that type variable to match the constraints listed in the generic
function, and produces a monomorphic function type based on the
newly-generated type variables. For example, the first occurrence of
the declaration reference expression ``min`` would result in a type
``(x : T0, y : T0) -> T0``, where ``T0`` is a fresh type variable, as
well as the subtype constraint ``T0 < Comparable``, which expresses
protocol conformance. The next occurrence of the declaration reference
expression ``min`` would produce the type ``(x : T1, y : T1) -> T1``,
where ``T1`` is a fresh type variable (and therefore distinct from
``T0``), and so on. This replacement process is referred to as
"opening" the generic function type, and is a fairly simple (but
effective) way to model the use of polymorphic functions within the
constraint system without complicating the solver. Note that this
immediate opening of generic function types is only valid because
Swift does not support first-class polymorphic functions, e.g., one
cannot declare a variable of type ``<T> T -> T``.
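For example, assuming the declaration of ``min`` above, each use is opened
independently::
let smallerInt = min(x: 3, y: 4)        // opened as (x: T0, y: T0) -> T0; T0 := Int
let smallerDouble = min(x: 3.5, y: 4.5) // opened as (x: T1, y: T1) -> T1; T1 := Double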
Uses of generic types are also immediately opened by the constraint
solver. For example, consider the following generic dictionary type::
class Dictionary<Key : Hashable, Value> {
// ...
}
When the constraint solver encounters the expression ``Dictionary()``,
it opens up the type ``Dictionary``---which has not
been provided with any specific generic arguments---to the type
``Dictionary<T0, T1>``, for fresh type variables ``T0`` and ``T1``,
and introduces the constraint ``T0 conforms to Hashable``. This allows
the actual key and value types of the dictionary to be determined by
the context of the expression. As noted above for first-class
polymorphic functions, this immediate opening is valid because an
unbound generic type, i.e., one that does not have specified generic
arguments, cannot be used except where the generic arguments can be
inferred.
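For example, assuming the ``Dictionary`` class above provides a suitable
initializer, a contextual type supplies its generic arguments::
// 'Dictionary()' is opened to 'Dictionary<T0, T1>'; the declared type of 'd'
// binds T0 := String and T1 := Int, and 'T0 conforms to Hashable' is
// satisfied because String is Hashable.
var d: Dictionary<String, Int> = Dictionary()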
Constraint Solving
-----------------------------
The primary purpose of the constraint solver is to take a given set of
constraints and determine the most specific type binding for each of the type
variables in the constraint system. As part of this determination, the
constraint solver also resolves overloaded declaration references by
selecting one of the overloads.
Solving the constraint systems generated by the Swift language can, in
the worst case, require exponential time. Even the classic
Hindley-Milner type inference algorithm requires exponential time, and
the Swift type system introduces additional complications, especially
overload resolution. However, the problem size for any particular
expression is still fairly small, and the constraint solver can employ
a number of tricks to improve performance. The Performance_ section
describes some tricks that have been implemented or are planned, and
it is expected that the solver will be extended with additional tricks
going forward.
This section will focus on the basic ideas behind the design of the
solver, as well as the type rules that it applies.
Simplification
```````````````````
The constraint generation process introduces a number of constraints
that can be immediately solved, either directly (because the solution
is obvious and trivial) or by breaking the constraint down into a
number of smaller constraints. This process, referred to as
*simplification*, canonicalizes a constraint system for later stages
of constraint solving. It is also re-invoked each time the constraint
solver makes a guess (at resolving an overload or binding a type
variable, for example), because each such guess often leads to other
simplifications. When all type variables and overloads have been
resolved, simplification terminates the constraint solving process
either by detecting a trivial constraint that is not satisfied (hence,
this is not a proper solution) or by reducing the set of constraints
down to only simple constraints that are trivially satisfied.
The simplification process breaks down constraints into simpler
constraints, and each different kind of constraint is handled by
different rules based on the Swift type system. The constraints fall
into five categories: relational constraints, member constraints,
type properties, conjunctions, and disjunctions. Only the first three
kinds of constraints have interesting simplification rules, and are
discussed in the following sections.
Relational Constraints
''''''''''''''''''''''''''''''''''''''''''''''''
Relational constraints describe a relationship between two types. This
category covers the equality, subtyping, and conversion constraints,
and provides the most common simplifications. The simplification of
relational constraints proceeds by comparing the structure of the
two types and applying the typing rules of the Swift language to
generate additional constraints. For example, if the constraint is a
conversion constraint::
A -> B <c C -> D
then both types are function types, and we can break down this
constraint into two smaller constraints ``C < A`` and ``B < D`` by
applying the conversion rule for function types. Similarly, one can
destructure all of the various type constructors---tuple types, generic
type specializations, lvalue types, etc.---to produce simpler
requirements, based on the type rules of the language [#]_.
Relational constraints involving a type variable on one or both sides
generally cannot be solved directly. Rather, these constraints inform
the solving process later by providing possible type bindings,
described in the `Type Variable Bindings`_ section. The exception is
an equality constraint between two type variables, e.g., ``T0 ==
T1``. These constraints are simplified by unifying the equivalence
classes of ``T0`` and ``T1`` (using a basic union-find algorithm),
such that the solver need only determine a binding for one of the type
variables (and the other gets the same binding).
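The following is a minimal sketch of that unification step, assuming a
deliberately simplified representation of type variables (it is not the
solver's actual data structure)::
final class TypeVariable {
    // The parent link; 'nil' means this variable is the representative of
    // its equivalence class.
    private var parent: TypeVariable?
    // Find the representative of this variable's equivalence class,
    // compressing paths along the way.
    func representative() -> TypeVariable {
        guard let parent = parent else { return self }
        let rep = parent.representative()
        self.parent = rep
        return rep
    }
    // Simplify 'self == other' by merging the two equivalence classes, so a
    // binding chosen for one representative applies to every member.
    func merge(with other: TypeVariable) {
        let a = representative()
        let b = other.representative()
        if a !== b { a.parent = b }
    }
}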
Member Constraints
'''''''''''''''''''''''''''''''''''''''''''
Member constraints specify that a certain type has a member of a given
name and provide a binding for the type of that member. A member
constraint ``A.member == B`` can be simplified when the type of ``A``
is determined to be a nominal or tuple type, in which case name lookup
can resolve the member name to an actual declaration. That declaration
has some type ``C``, so the member constraint is simplified to the
exact equality constraint ``B := C``.
The member name may refer to a set of overloaded declarations. In this
case, the type ``C`` is a fresh type variable (call it ``T0``). A
disjunction constraint is introduced, in which each term binds the type
of a different declaration from the overload set to ``T0``, as described in
the section on Overloading_.
The kind of member constraint---type or value---also affects the
declaration type ``C``. A type constraint can only refer to member
types, and ``C`` will be the declared type of the named member. A
value constraint, on the other hand, can refer to either a type or a
value, and ``C`` is the type of a reference to that entity. For a
reference to a type, ``C`` will be a metatype of the declared type.
Strategies
```````````````````````````````
The basic approach to constraint solving is to simplify the
constraints until they can no longer be simplified, then produce (and
check) educated guesses about which declaration from an overload set
should be selected or what concrete type should be bound to a given
type variable. Each guess is tested as an assumption, possibly with
other guesses, until the solver either arrives at a solution or
concludes that the guess was incorrect.
Within the implementation, each guess is modeled as an assumption
within a new solver scope. The solver scope inherits all of the
constraints, overload selections, and type variable bindings of its
parent solver scope, then adds one more guess. As such, the solution
space explored by the solver can be viewed as a tree, where the
top-most node is the constraint system generated directly from the
expression. The leaves of the tree are either solutions to the
type-checking problem (where all constraints have been simplified
away) or represent sets of assumptions that do not lead to a solution.
The following sections describe the techniques used by the solver to
produce derived constraint systems that explore the solution space.
Overload Selection
'''''''''''''''''''''''''''''''''''''''''''''''''''''
Overload selection is the simplest way to make an assumption. For an
overload set that introduced a disjunction constraint
``T0 := A1 or T0 := A2 or ... or T0 := AN`` into the constraint
system, each term in the disjunction will be visited separately. Each
solver state binds the type variable ``T0`` and explores
whether the selected overload leads to a suitable solution.
Type Variable Bindings
'''''''''''''''''''''''''''''''''''''''''''''''''''''
A second way in which the solver makes assumptions is to guess at the
concrete type to which a given type variable should be bound. That
type binding is then introduced in a new, derived constraint system to
determine if the binding is feasible.
The solver does not conjure concrete type bindings from nothing, nor
does it perform an exhaustive search. Rather, it uses the constraints
placed on that type variable to produce potential candidate
types. There are several strategies employed by the solver.
Meets and Joins
..........................................
A given type variable ``T0`` often has relational constraints
placed on it that relate it to concrete types, e.g., ``T0 <c Int`` or
``Float <c T0``. In these cases, we can use the concrete types as a
starting point to make educated guesses for the type ``T0``.
To determine an appropriate guess, the relational constraints placed
on the type variable are categorized. Given a relational constraint of the form
``T0 <? A`` (where ``<?`` is one of ``<``, ``<t``, or ``<c``), where
``A`` is some concrete type, ``A`` is said to be "above"
``T0``. Similarly, given a constraint of the form ``B <? T0`` for a
concrete type ``B``, ``B`` is said to be "below" ``T0``. The
above/below terminology comes from a visualization of the lattice of
types formed by the conversion relationship, e.g., there is an edge
``A -> B`` in the lattice if ``A`` is convertible to ``B``. ``B`` would
therefore be higher in the lattice than ``A``, and the topmost element
of the lattice is the element to which all types can be converted,
``Any`` (often called "top").
The concrete types "above" and "below" a given type variable provide
bounds on the possible concrete types that can be assigned to that
type variable. The solver computes [#]_ the join of the types "below"
the type variable, i.e., the most specific (lowest) type to which all
of the types "below" can be converted, and uses that join as a
starting guess.
Supertype Fallback
..........................................
The join of the "below" types computed as a starting point may be too
specific, due to constraints that involve the type variable but
weren't simple enough to consider as part of the join. To cope with
such cases, if no solution can be found with the join of the "below"
types, the solver creates a new set of derived constraint systems with
weaker assumptions, corresponding to each of the types that the join
is directly convertible to. For example, if the join was some class
``Derived``, the supertype fallback would then try the class ``Base``
from which ``Derived`` directly inherits. This fallback process
continues until the types produced are no longer convertible to the
meet of types "above" the type variable, i.e., the least specific
(highest) type from which all of the types "above" the type variable
can be converted [#]_.
Default Literal Types
..........................................
If a type variable is bound by a conformance constraint to one of the
literal protocols, "``T0`` conforms to ``ExpressibleByIntegerLiteral``",
then the constraint solver will guess that the type variable can be
bound to the default literal type for that protocol. For example,
``T0`` would get the default integer literal type ``Int``, allowing
one to type-check expressions with too little type information to
determine the types of these literals, e.g., ``-1``.
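For example, literals with no other contextual constraints fall back to
their protocols' default types::
let a = -1    // the integer literal's type variable defaults to Int
let b = 1.5   // the floating-point literal's type variable defaults to Double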
Comparing Solutions
`````````````````````````
The solver explores a potentially large solution space, and it is
possible that it will find multiple solutions to the constraint system
as given. Such cases are not necessarily ambiguities, because the
solver can then compare the solutions to determine whether one of
the solutions is better than all of the others. To do so, it computes
a "score" for each solution based on a number of factors:
- How many user-defined conversions were applied.
- How many non-trivial function conversions were applied.
- How many literals were given "non-default" types.
Solutions with smaller scores are considered better solutions. When
two solutions have the same score, the type variables and overload
choices of the two systems are compared to produce a relative score:
- If the two solutions have selected different type variable bindings
for a type variable where a "more specific" type variable binding is
a better match, and one of the type variable bindings is a subtype of
the other, the solution with the subtype earns +1.
- If an overload set has different selected overloads in the two
solutions, the overloads are compared. If the type of the
overload picked in one solution is a subtype of the type of
the overload picked in the other solution, then the first solution earns
+1.
The solution with the greater relative score is considered to be
better than the other solution.
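As a small (assumed) example, overload resolution for a call with an integer
literal argument can produce two complete solutions whose scores differ::
func f(_ x: Int) -> Int { return x }
func f(_ x: Double) -> Double { return x }
// Both overloads admit a solution, because an integer literal can be typed
// as either Int or Double. The solution using the Double overload gives the
// literal a "non-default" type and therefore scores worse, so the Int
// overload is chosen.
let r = f(1)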
Solution Application
-------------------------
Once the solver has produced a solution to the constraint system, that
solution must be applied to the original expression to produce a fully
type-checked expression that makes all implicit conversions and
resolved overloads explicit. This application process walks the
expression tree from the leaves to the root, rewriting each expression
node based on the kind of expression:
*Declaration references*
Declaration references are rewritten with the precise type of the
declaration as referenced. For overloaded declaration references, the
``Overload*Expr`` node is replaced with a simple declaration
reference expression. For references to polymorphic functions or
members of generic types, a ``SpecializeExpr`` node is introduced to
provide substitutions for all of the generic parameters.
*Member references*
References to members are similar to declaration
references. However, they have the added constraint that the base
expression needs to be a reference. Therefore, an rvalue of
non-reference type will be materialized to produce the necessary
reference.
*Literals*
Literals are converted to the appropriate literal type, which
typically involves introducing calls to the witnesses for the
appropriate literal protocols.
*Closures*
Since the closure has acquired a complete function type,
the body of the closure is type-checked with that
complete function type.
The solution application step cannot fail for any type checking rule
modeled by the constraint system. However, there are some failures
that are intentionally left to the solution application phase, such as
a postfix '!' applied to a non-optional type.
Locators
```````````
During constraint generation and solving, numerous constraints are
created, broken apart, and solved. During constraint application as
well as during diagnostics emission, it is important to track the
relationship between the constraints and the actual expressions from
which they originally came. For example, consider the following type
checking problem::
struct X {
// user-defined conversions
func [conversion] __conversion () -> String { /* ... */ }
func [conversion] __conversion () -> Int { /* ... */ }
}
func f(_ i : Int, s : String) { }
var x : X
f(10.5, x)
This constraint system generates the constraints "``T(f)`` ==Fn ``T0
-> T1``" (for fresh variables ``T0`` and ``T1``), "``(T2, X) <c
T0``" (for fresh variable ``T2``) and "``T2`` conforms to
``ExpressibleByFloatLiteral``". As part of the solution, after ``T0`` is
replaced with ``(i : Int, s : String)``, the second of
these constraints is broken down into "``T2 <c Int``" and "``X <c
String``". These two constraints are interesting for different
reasons: the first will fail, because ``Int`` does not conform to
``ExpressibleByFloatLiteral``. The second will succeed by selecting one
of the (overloaded) conversion functions.
In both of these cases, we need to map the actual constraint of
interest back to the expressions they refer to. In the first case, we
want to report not only that the failure occurred because ``Int`` is
not ``ExpressibleByFloatLiteral``, but we also want to point out where
the ``Int`` type actually came from, i.e., in the parameter. In the
second case, we want to determine which of the overloaded conversion
functions was selected to perform the conversion, so that conversion
function can be called by constraint application if all else succeeds.
*Locators* address both issues by tracking the location and derivation
of constraints. Each locator is anchored at a specific expression,
i.e., the function application ``f(10.5, x)``, and contains a path of
zero or more derivation steps from that anchor. For example, the
"``T(f)`` ==Fn ``T0 -> T1``" constraint has a locator that is
anchored at the function application and a path with the "apply
function" derivation step, meaning that this is the function being
applied. Similarly, the "``(T2, X) <c T0``" constraint has a
locator anchored at the function application and a path with the
"apply argument" derivation step, meaning that this is the argument
to the function.
When constraints are simplified, the resulting constraints have
locators with longer paths. For example, when a conversion constraint
between two tuples is simplified into conversion constraints between the
corresponding tuple elements, the resulting locators refer to the specific
elements. The ``T2 <c Int`` constraint will be anchored at the function
application (still), and have two derivation steps in its path: the
"apply argument" derivation step from its parent constraint followed
by the "tuple element #0" derivation step that refers to this specific tuple
element. Similarly, the ``X <c String`` constraint will have the same
locator, but with "tuple element #1" rather than "tuple element #0". The
``ConstraintLocator`` type in the constraint solver has a number of
different derivation step kinds (called "path elements" in the source)
that describe the various ways in which larger constraints can be
broken down into smaller ones.
Overload Choices
'''''''''''''''''''''''''''''
Whenever the solver creates a new overload set, that overload set is
associated with a particular locator. Continuing the example from the
parent section, the solver will create an overload set containing the
two user-defined conversions. This overload set is created while
simplifying the constraint ``X <c String``, so it uses the locator
from that constraint extended by a "conversion member" derivation
step. The complete locator for this overload set is, therefore::
function application -> apply argument -> tuple element #1 -> conversion member
When the solver selects a particular overload from the overload set,
it records the selected overload based on the locator of the overload
set. When it comes time to perform constraint application, the locator
is recreated based on context (as the bottom-up traversal walks the
expressions to rewrite them with their final types) and used to find
the appropriate conversion to call. The same mechanism is used to
select the appropriate overload when an expression refers directly to
an overloaded function. Additionally, when comparing two solutions to
the same constraint system, overload sets present in both solutions
can be found by comparing the locators for each of the overload
choices made in each solution. Naturally, all of these operations
require locators to be uniqued, which is handled by the constraint system
itself.
Simplifying Locators
'''''''''''''''''''''''''''''
Locators provide the derivation of location information that follows
the path of the solver, and can be used to query and recover the
important decisions made by the solver. However, the locators
determined by the solver may not directly refer to the most specific
expression for the purposes of identifying the corresponding source
location. For example, the failed constraint "``Int`` conforms to
``ExpressibleByFloatLiteral``" can most specifically be centered on the
floating-point literal ``10.5``, but its locator is::
function application -> apply argument -> tuple element #0
The process of locator simplification maps a locator to its most
specific expression. Essentially, it starts at the anchor of the
locator (in this case, the application ``f(10.5, x)``) and then walks
the path, matching derivation steps to subexpressions. The "apply
argument" derivation step extracts the argument (``(10.5,
x)``). Then, the "tuple element #0" derivation step extracts the tuple
element 0 subexpression, ``10.5``, at which point we have traversed
the entire path and now have the most specific expression for
source-location purposes.
Simplification does not always exhaust the complete path. For example,
consider a slight modification to our example in which the argument to
``f`` is provided by another call; we get a different result
entirely::
func f(_ i : Int, s : String) { }
func g() -> (f : Float, x : X) { }
f(g())
Here, the failing constraint is ``Float <c Int``, with the same
locator::
function application -> apply argument -> tuple element #0
When we simplify this locator, we start with ``f(g())``. The "apply
argument" derivation step takes us to the argument expression
``g()``. Here, however, there is no subexpression for the first tuple
element of ``g()``, because it's simply part of the tuple returned
from ``g``. At this point, simplification ceases and produces the
simplified locator::
function application of g -> tuple element #0
Performance
-----------------
The performance of the type checker is dependent on a number of
factors, but the chief concerns are the size of the solution space
(which is exponential in the worst case) and the effectiveness of the
solver in exploring that solution space. This section describes some
of the techniques used to improve solver performance, many of which
can doubtless be improved.
Constraint Graph
````````````````
The constraint graph describes the relationships among type variables
in the constraint system. Each vertex in the constraint graph
corresponds to a single type variable. The edges of the graph
correspond to constraints in the constraint system, relating sets of
type variables together. Technically, this makes the constraint graph
a *multigraph*, although the internal representation is more akin to a
graph with multiple kinds of edges: each vertex (node) tracks the set
of constraints that mention the given type variable as well as the set
of type variables that are adjacent to this type variable. A vertex
also includes information about the equivalence class corresponding to
a given type variable (when type variables have been merged) or the
binding of a type variable to a specific type.
The constraint graph is critical to a number of solver
optimizations. For example, it is used to compute the connected
components within the constraint graph, so that each connected
component can be solved independently. The partial results from all of
the connected components are then combined into a complete
solution. Additionally, the constraint graph is used to direct
simplification, described below.
Simplification Worklist
```````````````````````
When the solver has attempted a type variable binding, that binding
often leads to additional simplifications in the constraint
system. The solver will query the constraint graph to determine which
constraints mention the type variable and will place those constraints
onto the simplification worklist. If those constraints can be
simplified further, it may lead to additional type variable bindings,
which in turn adds more constraints to the worklist. Once the worklist
is exhausted, simplification has completed. The use of the worklist
eliminates the need to reprocess constraints that could not have
changed because the type variables they mention have not changed.
Solver Scopes
`````````````
The solver proceeds through the solution space in a depth-first
manner. Whenever the solver is about to make a guess---such as a
speculative type variable binding or the selection of a term from a
disjunction---it introduces a new solver scope to capture the results
of that assumption. Subsequent solver scopes are nested as the solver
builds up a set of assumptions, eventually leading to either a
solution or an error. When a solution is found, the stack of solver
scopes contains all of the assumptions needed to produce that
solution, and is saved in a separate solution data structure.
The solver scopes themselves are designed to be fairly cheap to create
and destroy. To support this, all of the major data structures used by
the constraint solver have reversible operations, allowing the solver
to easily backtrack. For example, the addition of a constraint to the
constraint graph can be reversed by removing that same constraint. The
constraint graph tracks all such additions in a stack: pushing a new
solver scope stores a marker to the current top of the stack, and
popping that solver scope reverses all of the operations on that stack
until it hits the marker.
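The following is a minimal sketch of that pattern, using assumed names that
do not correspond to the solver's actual types::
// Records reversible changes so that popping a solver scope can undo, in
// reverse order, everything recorded since the scope was pushed.
struct TrailingChangeLog<Change> {
    private var changes: [Change] = []
    private let undo: (Change) -> Void
    init(undo: @escaping (Change) -> Void) { self.undo = undo }
    // Pushing a scope just remembers the current depth of the stack.
    func pushScope() -> Int { return changes.count }
    // Every reversible operation records the information needed to undo it.
    mutating func record(_ change: Change) { changes.append(change) }
    // Popping a scope replays the recorded changes backwards until the marker.
    mutating func popScope(to marker: Int) {
        while changes.count > marker {
            undo(changes.removeLast())
        }
    }
}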
Online Scoring
``````````````
As the solver evaluates potential solutions, it keeps track of the
score of the current solution and of the best complete solution found
thus far. If the score of the current solution is ever greater than
that of the best complete solution, it abandons the current solution
and backtracks to continue its search.
The solver makes some attempt at evaluating cheaper solutions before
more expensive solutions. For example, it will prefer to try normal
conversions before user-defined conversions, prefer the "default"
literal types over other literal types, and prefer cheaper conversions
to more expensive conversions. However, some of the rules are fairly
ad hoc, and could benefit from more study.
Arena Memory Management
```````````````````````
Each constraint system introduces its own memory allocation arena,
making allocations cheap and deallocation essentially free. The
allocation arena extends all the way into the AST context, so that
types composed of type variables (e.g., ``T0 -> T1``) will be
allocated within the constraint system's arena rather than the
permanent arena. Most data structures involved in constraint solving
use this same arena.
Diagnostics
-----------------
The diagnostics produced by the type checker are currently
terrible. We plan to do something about this, eventually. We also
believe that we can implement some heroics, such as spell-checking
that takes into account the surrounding expression to only provide
well-typed suggestions.
.. [#] It is possible that both overloads will result in a solution,
in which case the solutions will be ranked based on the rules
discussed in the section `Comparing Solutions`_.
.. [#] As of the time of this writing, the type rules of Swift have
not specifically been documented outside of the source code. The
constraints-based type checker contains a function ``matchTypes``
that documents and implements each of these rules. A future revision
of this document will provide a more readily-accessible version.
.. [#] More accurately, as of this writing, "will compute". The solver
doesn't currently compute meets and joins properly. Rather, it
arbitrarily picks one of the constraints "below" to start with.
.. [#] Again, as of this writing, the solver doesn't actually compute
meets and joins, so the solver continues until it runs out of
supertypes to enumerate.