| .. _Generics: |
| |
| Generics in Swift |
| ================= |
| |
| Motivation |
| ---------- |
| |
| Most types and functions in code are expressed in terms of a single, concrete |
| set of sets. Generics generalize this notion by allowing one to express types |
| and functions in terms of an abstraction over a (typically unbounded) set of |
| types, allowing improved code reuse. A typical example of a generic type is a |
| linked list of values, which can be used with any type of value. In C++, this |
| might be expressed as:: |
| |
| template<typename T> |
| class List { |
| public: |
| struct Node { |
| T value; |
| Node *next; |
| }; |
| |
| Node *first; |
| }; |
| |
| where List<Int>, List<String>, and List<DataRecord> are all distinct types that |
| provide a linked list storing integers, strings, and DataRecords, |
| respectively. Given such a data structure, one also needs to be able to |
| implement generic functions that can operate on a list of any kind of elements, |
| such as a simple, linear search algorithm:: |
| |
| template<typename T> |
| typename List<T>::Node *find(const List<T>&list, const T& value) { |
| for (typename List<T>::Node *result = list.first; result; result = result->next) |
| if (result->value == value) |
| return result; |
| |
| return 0; |
| } |
| |
| .. @test('compile', howmany = 'all', cmake_args = ['COMPILER', '${CMAKE_CXX_COMPILER}']) |
| |
| Generics are important for the construction of useful libraries, because they |
| allow the library to adapt to application-specific data types without losing |
| type safety. This is especially important for foundational libraries containing |
| common data structures and algorithms, since these libraries are used across |
| nearly every interesting application. |
| |
| The alternatives to generics tend to lead to poor solutions: |
| |
| * Object-oriented languages tend to use "top" types (id in Objective-C, |
| java.lang.Object in pre-generics Java, etc.) for their containers and |
| algorithms, which gives up static type safety. Pre-generics Java forced the |
| user to introduce run-time-checked type casts when interacting with containers |
| (which is overly verbose), while Objective-C relies on id's unsound implicit |
| conversion behavior to eliminate the need for casts. |
| * Many languages bake common data structures (arrays, dictionaries, tables) into |
| the language itself. This is unfortunate both because it significantly |
| increases the size of the core language and because users then tend to use |
| this limited set of data structures for *every* problem, even when another |
| (not-baked-in) data structure would be better. |
| |
| Swift is intended to be a small, expressive language with great support for |
| building libraries. We'll need generics to be able to build those libraries |
| well. |
| |
| Goals |
| ----- |
| |
| * Generics should enable the development of rich generic libraries that feel |
| similar to first-class language features |
| * Generics should work on any type, whether it is a value type or some kind of |
| object type |
| * Generic code should be almost as easy to write as non-generic code |
| * Generic code should be compiled such that it can be executed with any data |
| type without requiring a separate "instantiation" step |
| * Generics should interoperate cleanly with run-time polymorphism |
| * Types should be able to retroactively modified to meet the requirements of a |
| generic algorithm or data structure |
| |
| As important as the goals of a feature are the explicit non-goals, which we |
| don't want or don't need to support: |
| |
| * Compile-time "metaprogramming" in any form |
| * Expression-template tricks a la Boost.Spirit, POOMA |
| |
| Polymorphism |
| ------------ |
| |
| Polymorphism allows one to use different data types with a uniform |
| interface. Overloading already allows a form of polymorphism (ad hoc |
| polymorphism) in Swift. For example, given:: |
| |
| func +(x : Int, y : Int) -> Int { add... } |
| func +(x : String, y : String) -> String { concat... } |
| |
| .. @example.replace('add...','return 1') |
| example.replace('concat...','return ""') |
| test() |
| |
| we can write the expression "x + y", which will work for both integers and |
| strings. |
| |
| However, we want the ability to express an algorithm or data structure |
| independently of mentioning any data type. To do so, we need a way to express |
| the essential interface that algorithm or data structure requires. For example, |
| an accumulation algorithm would need to express that for any type T, one can |
| write the expression "x + y" (where x and y are both of type T) and it will |
| produce another T. |
| |
| Protocols |
| --------- |
| |
| Most languages that provide some form of polymorphism also have a way to |
| describe abstract interfaces that cover a range of types: Java and C# |
| interfaces, C++ abstract base classes, Objective-C protocols, Scala traits, |
| Haskell type classes, C++ concepts (briefly), and many more. All allow one to |
| describe functions or methods that are part of the interface, and provide some |
| way to re-use or extend a previous interface by adding to it. We'll start with |
| that core feature, and build onto it what we need. |
| |
| In Swift, I suggest that we use the term protocol for this feature, because I |
| expect the end result to be similar enough to Objective-C protocols that our |
| users will benefit, and (more importantly) different enough from Java/C# |
| interfaces and C++ abstract base classes that those terms will be harmful. The |
| term trait comes with the wrong connotation for C++ programmers, and none of our |
| users know Scala. |
| |
| In its most basic form, a protocol is a collection of function signatures:: |
| |
| protocol Document { |
| func title() -> String |
| } |
| |
| Document describes types that have a title() operation that accepts no arguments |
| and returns a String. Note that there is implicitly a 'self' type, |
| which is the type that conforms to the protocol itself. This follows how most |
| object-oriented languages describe interfaces, but deviates from Haskell type |
| classes and C++ concepts, which require explicit type parameters for all of the |
| types. We'll revisit this decision later. |
| |
| Protocol Inheritance |
| -------------------- |
| |
| Composition of protocols is important to help programmers organize and |
| understand a large number of protocols and the data types that conform to those |
| protocols. For example, we could extend our Document protocol to cover documents |
| that support versioning:: |
| |
| protocol VersionedDocument : Document { |
| func version() -> Int |
| } |
| |
| Multiple inheritance is permitted, allowing us to form a directed acyclic graph |
| of protocols:: |
| |
| protocol PersistentDocument : VersionedDocument, Serializable { |
| func saveToFile(_ filename : path) |
| } |
| |
| .. @example.prepend('struct path {} ; protocol Serializable {}') |
| test(howmany='all') |
| |
| Any type that conforms to PersistentDocument also conforms to VersionedDocument, |
| Document, and Serializable, which gives us substitutability. |
| |
| Self Types |
| ---------- |
| |
| Protocols thus far do not give us an easy way to express simple binary |
| operations. For example, let's try to write a Comparable protocol that could be |
| used to search for a generic find() operation:: |
| |
| protocol Comparable { |
| func isEqual(_ other : ???) -> Bool |
| } |
| |
| Our options for filling in ??? are currently very poor. We could use the syntax |
| for saying "any type" or "any type that is comparable", as one must do most OO |
| languages, including Java, C#, and Objective-C, but that's not expressing what |
| we want: that the type of both of the arguments be the same. This is sometimes |
| referred to as the binary method problem |
| (http://www.cis.upenn.edu/~bcpierce/papers/binary.ps has a discussion of this |
| problem, including the solution I'm proposing below). |
| |
| Neither C++ concepts nor Haskell type classes have this particular problem, |
| because they don't have the notion of an implicit 'Self' type. Rather, |
| they explicitly parameterize everything. In C++ concepts:: |
| |
| concept Comparable<typename T> { |
| bool T::isEqual(T); |
| } |
| |
| .. @ignore() |
| .. We don't have a compiler for ConceptC++ |
| |
| Java and C# programmers work around this issue by parameterizing the |
| interface, e.g. (in Java):: |
| |
| abstract class Comparable<THIS extends Comparable<THIS>> { |
| public bool isEqual(THIS other); |
| } |
| |
| .. @ignore() |
| .. This test just doesn't compile at the moment, but that represents a |
| bug in Swift |
| |
| and then a class X that wants to be Comparable will inherit from |
| Comparable<X>. This is ugly and has a number of pitfalls; see |
| http://bugs.java.com/bugdatabase/view_bug.do?bug_id=6479372. |
| |
| Scala and Strongtalk have the notion of the 'Self' type, which effectively |
| allows one to refer to the eventual type of 'self' (which we call |
| 'self'). 'Self' (which we call 'Self' in Swift) allows us to express the |
| Comparable protocol in a natural way:: |
| |
| protocol Comparable { |
| func isEqual(_ other : Self) -> Bool |
| } |
| |
| By expressing Comparable in this way, we know that if we have two objects of |
| type T where T conforms to Comparable, comparison between those two objects with |
| isEqual is well-typed. However, if we have objects of different types T and U, |
| we cannot compare those objects with isEqual even if both T and U are |
| Comparable. |
| |
| Self types are not without their costs, particularly in the case where Self is |
| used as a parameter type of a class method that will be subclassed. Here, the |
| parameter type ends up being (implicitly) covariant, which tightens up |
| type-checking but may also force us into more dynamic type checks. We can |
| explore this separately; within protocols, type-checking for Self is more |
| direct. |
| |
| Associated Types |
| ---------------- |
| |
| In addition to Self, a protocol's operations often need to refer to types that |
| are related to the type of 'Self', such as a type of data stored in a |
| collection, or the node and edge types of a graph. For example, this would allow |
| us to cleanly describe a protocol for collections:: |
| |
| protocol Collection { |
| typealias Element |
| func forEach(_ callback : (value : Element) -> Void) |
| func add(_ value : Element) |
| } |
| |
| It is important here that a generic function that refers to a given type T, |
| which is known to be a collection, can access the associated types corresponding |
| to T. For example, one could implement an "accumulate" operation for an |
| arbitrary Collection, but doing so requires us to specify some constraints on |
| the Value type of the collection. We'll return to this later. |
| |
| Operators, Properties, and Subscripting |
| --------------------------------------- |
| |
| As previously noted, protocols can contain both function requirements (which are |
| in effect requirements for instance methods) and associated type |
| requirements. Protocols can also contain operators, properties, and subscript |
| operators:: |
| |
| protocol RandomAccessContainer : Collection { |
| var count: Int |
| func == (lhs: Self, rhs: Self) |
| subscript(i: Int) -> Element |
| } |
| |
| Operator requirements can be satisfied by operator definitions, property |
| requirements can be satisfied by either variables or properties, and subscript |
| requirements can be satisfied by subscript operators. |
| |
| Conforming to a Protocol |
| ------------------------ |
| |
| Thus far, we have not actually shown how a type can meet the requirements of a |
| protocol. The most syntactically lightweight approach is to allow implicit |
| conformance. This is essentially duck typing, where a type is assumed to conform |
| to a protocol if it meets the syntactic requirements of the protocol. For |
| example, given:: |
| |
| protocol Shape { |
| func draw() |
| } |
| |
| One could write a Circle struct such as:: |
| |
| struct Circle { |
| var center : Point |
| var radius : Int |
| |
| func draw() { |
| // draw it |
| } |
| } |
| |
| Circle provides a draw() method with the same input and result types as required |
| by the Shape protocol. Therefore, Circle conforms to Shape. |
| |
| Implicit protocol conformance is convenient, because it requires no additional |
| typing. However, it can run into some trouble when an entity that syntactically |
| matches a protocol doesn't provide the required semantics. For example, Cowboys |
| also know how to "draw!":: |
| |
| struct Cowboy { |
| var gun : SixShooter |
| |
| func draw() { |
| // draw! |
| } |
| } |
| |
| It is unlikely that Cowboy is meant to conform to Shape, but the method name and |
| signatures match, so implicit conformance deduces that Cowboy conforms to |
| Shape. Random collisions between types are fairly rare. However, when one is |
| using protocol inheritance with fine-grained (semantic or mostly-semantic) |
| differences between protocols in the hierarchy, they become more common. See |
| http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1798.html for examples |
| of this problem as it surfaced with C++ concepts. It is not clear at this time |
| whether we want implicit conformance in Swift: there's no existing code to worry |
| about, and explicit conformance (described below) provides some benefits. |
| |
| Explicit Protocol Conformance |
| ----------------------------- |
| |
| Type authors often implement types that are intended to conform to a particular |
| protocol. For example, if we want a linked-list type to conform to Collection, |
| we can specify that it is by adding a protocol conformance annotation to the |
| type:: |
| |
| struct EmployeeList : Collection { // EmployeeList is a collection |
| typealias Element = T |
| func forEach(_ callback : (value : Element) -> Void) { /* Implement this */ } |
| func add(_ value : Element) { /* Implement this */ } |
| } |
| |
| This explicit protocol conformance declaration forces the compiler to check that |
| EmployeeList actually does meet the requirements of the Collection protocol. If |
| we were missing an operation (say, forEach) or had the wrong signature, the |
| definition of 'EmployeeList' would be ill-formed. Therefore, explicit |
| conformance provides both documentation for the user of EmployeeList and |
| checking for the author and future maintainers of EmployeeList. |
| |
| Any nominal type (such as an enum, struct, or class) can be specified to conform |
| to one or more protocols in this manner. Additionally, a typealias can be |
| specified to conform to one or more protocols, e.g.,:: |
| |
| typealias NSInteger : Numeric = Int |
| |
| While not technically necessary due to retroactive modeling (below), this can be |
| used to document and check that a particular type alias does in fact meet some |
| basic, important requirements. Moreover, it falls out of the syntax that places |
| requirements on associated types. |
| |
| Retroactive Modeling |
| -------------------- |
| |
| When using a set of libraries, it's fairly common that one library defines a |
| protocol (and useful generic entities requiring that protocol) while another |
| library provides a data type that provides similar functionality to that |
| protocol, but under a different name. Retroactive modeling is the process by |
| which the type is retrofitted (without changing the type) to meet the |
| requirements of the protocol. |
| |
| In Swift, we provide support for retroactive modeling by allowing |
| extensions, e.g.,:: |
| |
| extension String : Collection { |
| typealias Element = char |
| func forEach(_ callback : (value : Element) -> Void) { /* use existing String routines to enumerate characters */ } |
| func add(_ value : Element) { self += value /* append character */ } |
| } |
| |
| Once an extension is defined, the extension now conforms to the Collection |
| protocol, and can be used anywhere a Collection is expected. |
| |
| Default Implementations |
| ----------------------- |
| |
| The functions declared within a protocol are requirements that any type must |
| meet if it wants to conform to the protocol. There is a natural tension here, |
| then, between larger protocols that make it easier to write generic algorithms, |
| and smaller protocols that make it easier to write conforming types. For |
| example, should a Numeric protocol implement all operations, e.g.,:: |
| |
| protocol Numeric { |
| func +(lhs : Self, rhs : Self) -> Self |
| func -(lhs : Self, rhs : Self) -> Self |
| func +(x : Self) -> Self |
| func -(x : Self) -> Self |
| } |
| |
| which would make it easy to write general numeric algorithms, but requires the |
| author of some BigInt class to implement a lot of functionality, or should the |
| numeric protocol implement just the core operations:: |
| |
| protocol Numeric { |
| func +(lhs : Self, rhs : Self) -> Self |
| func -(x : Self) -> Self |
| } |
| |
| to make it easier to adopt the protocol (but harder to write numeric |
| algorithms)? Both of the protocols express the same thing (semantically), |
| because one can use the core operations (binary +, unary -) to implement the |
| other algorithms. However, it's far easier to allow the protocol itself to |
| provide default implementations:: |
| |
| protocol Numeric { |
| func +(lhs : Self, rhs : Self) -> Self |
| func -(lhs : Self, rhs : Self) -> Self { return lhs + -rhs } |
| func +(x : Self) -> Self { return x } |
| func -(x : Self) -> Self |
| } |
| |
| This makes it easier both to implement generic algorithms (which can use the |
| most natural syntax) and to make a new type conform to the protocol. For |
| example, if we were to define only the core algorithms in our BigNum type:: |
| |
| struct BigNum : Numeric { |
| func +(lhs : BigNum, rhs : BigNum) -> BigNum { ... } |
| func -(x : BigNum) -> BigNum { ... } |
| } |
| |
| the compiler will automatically synthesize the other operations needed for the |
| protocol. Moreover, these operations will be available to uses of the BigNum |
| class as if they had been written in the type itself (or in an extension of the |
| type, if that feature is used), which means that protocol conformance actually |
| makes it easier to define types that conform to protocols, rather than just |
| providing additional checking. |
| |
| Subtype Polymorphism |
| -------------------- |
| |
| Subtype polymorphism is based on the notion of substitutability. If a type S is |
| a subtype of a type T, then a value of type S can safely be used where a value |
| of type T is expected. Object-oriented languages typically use subtype |
| polymorphism, where the subtype relationship is based on inheritance: if the |
| class Dog inherits from the class Animal, then Dog is a subtype of |
| Animal. Subtype polymorphism is generally dynamic, in the sense that the |
| substitution occurs at run-time, even if it is statically type-checked. |
| |
| In Swift, we consider protocols to be types. A value of protocol type has an |
| existential type, meaning that we don't know the concrete type until run-time |
| (and even then it varies), but we know that the type conforms to the given |
| protocol. Thus, a variable can be declared with type "Serializable", e.g.,:: |
| |
| var x : Serializable = // value of any Serializable type |
| x.serialize() // okay: serialize() is part of the Serializable protocol |
| |
| Naturally, such polymorphism is dynamic, and will require boxing of value types |
| to implement. We can now see how Self types interact with subtype |
| polymorphism. For example, say we have two values of type Comparable, and we try |
| to compare them:: |
| |
| var x : Comparable = ... |
| var y : Comparable = ... |
| if x.isEqual(y) { // well-typed? |
| } |
| |
| Whether x.isEqual(y) is well-typed is not statically determinable, because the |
| dynamic type of x may different from the dynamic type of y, even if they are |
| both comparable (e.g., one is an Int and the other a String). It can be |
| implemented by the compiler as a dynamic type check, with some general failure |
| mode (aborting, throwing an exception, etc.) if the dynamic type check fails. |
| |
| To express types that meet the requirements of several protocols, one can just |
| create a new protocol aggregating those protocols:: |
| |
| protocol SerializableDocument : Document, Serializable { } |
| var doc : SerializableDocument |
| print(doc.title()) // okay: title() is part of the Document protocol, so we can call it |
| doc.serialize(stout) // okay: serialize() is part of the Serializable protocol |
| |
| However, this only makes sense when the resulting protocol is a useful |
| abstraction. A SerializableDocument may or may not be a useful abstraction. When |
| it is not useful, one can instead use '&' types to compose different protocols, e.g.,:: |
| |
| var doc : Document & Serializable |
| |
| Here, doc has an existential type that is known to conform to both the Document |
| and Serializable protocols. This gives rise to a natural "top" type, such that |
| every type in the language is a subtype of "top". Java has java.lang.Object, C# |
| has object, Objective-C has "id" (although "id" is weird, because it is also |
| convertible to everything; it's best not to use it as a model). In Swift, the |
| "top" type is simply an empty protocol composition: ``Any``:: |
| |
| var value : Any = 17 // an any can hold an integer |
| value = "hello" // or a String |
| value = (42, "hello", Red) // or anything else |
| |
| Bounded Parametric Polymorphism |
| ------------------------------- |
| |
| Parametric polymorphism is based on the idea of providing type parameters for a |
| generic function or type. When using that function or type, one substitutes |
| concrete types for the type parameters. Strictly speaking, parametric |
| polymorphism allows *any* type to be substituted for a type parameter, but it's |
| useless in practice because that means that generic functions or types cannot do |
| anything to the type parameters: they must instead rely on first-class functions |
| passed into the generic function or type to perform any meaningful work. |
| |
| Far more useful (and prevalent) is bounded parametric polymorphism, which allows |
| the generic function or type to specify constraints (bounds) on the type |
| parameters. By specifying these bounds, it becomes far easier to write and use |
| these generic functions and types. Haskell type classes, Java and C# generics, |
| C++ concepts, and many other language features support bounded parametric |
| polymorphism. |
| |
| Protocols provide a natural way to express the constraints of a generic function |
| in Swift. For example, one could define a generic linked list as:: |
| |
| struct ListNode<T> { |
| var Value : T |
| enum NextNode { case Node : ListNode<T>, End } |
| var Next : NextNode |
| } |
| |
| struct List<T > { |
| var First : ListNode<T>::NextNode |
| } |
| |
| This list works on any type T. One could then add a generic function that |
| inserts at the beginning of the list:: |
| |
| func insertAtBeginning<T>(_ list : List<T>, value : T) { |
| list.First = ListNode<T>(value, list.First) |
| } |
| |
| Expressing Constraints |
| ---------------------- |
| |
| Within the type parameter list of a generic type or function (e.g., the <T> in |
| ListNode<T>), the 'T' introduces a new type parameter and the (optional) ": |
| type" names a protocol (or protocol composition) to which 'T' must |
| conform. Within the body of the generic type or function, any of the functions |
| or types described by the constraints are available. For example, let's |
| implement a find() operation on lists:: |
| |
| func find<T : Comparable>(_ list : List<T>, value : T) -> Int { |
| var index = 0 |
| var current |
| for (current = list.First; current is Node; current = current.Next) { |
| if current.Value.isEqual(value) { // okay: T is Comparable |
| return index |
| } |
| index = index + 1 |
| } |
| return -1 |
| } |
| |
| In addition to providing constraints on the type parameters, we also need to be |
| able to constrain associated types. To do so, we introduce the notion of a |
| "where" clause, which follows the signature of the generic type or |
| function. For example, let's generalize our find algorithm to work on any |
| ordered collection:: |
| |
| protocol OrderedCollection : Collection { |
| func size() -> Int |
| func getAt(_ index : Int) -> Element // Element is an associated type |
| } |
| |
| func find<C : OrderedCollection where C.Element : Comparable>( |
| _ collection : C, value : C.Element) -> Int |
| { |
| for index in 0...collection.size() { |
| if (collection.getAt(index) == value) { // okay: we know that C.Element is Comparable |
| return index |
| } |
| } |
| return -1 |
| } |
| |
| The where clause is actually the more general way of expressing constraints, |
| and the constraints expressed in the angle brackets (e.g., <C : |
| OrderedCollection>) are just sugar for a where clause. For example, the |
| above find() signature is equivalent to:: |
| |
| func find<C where C : OrderedCollection, C.Element : Comparable>( |
| _ collection : C, value : C.Element) -> Int |
| |
| Note that find<C> is shorthand for (and equivalent to) find<C : Any>, since |
| every type conforms to the Any protocol composition. |
| |
| There are two other important kinds of constraints that need to be |
| expressible. Before we get to those, consider a simple "Enumerator" protocol that |
| lets us describe an iteration of values of some given value type:: |
| |
| protocol Enumerator { |
| typealias Element |
| func isEmpty() -> Bool |
| func next() -> Element |
| } |
| |
| Now, we want to express the notion of an enumerable collection, which provides |
| iteration, which we do by adding requirements into the protocol:: |
| |
| protocol EnumerableCollection : Collection { |
| typealias EnumeratorType : Enumerator |
| where EnumeratorType.Element == Element |
| func getEnumeratorType() -> EnumeratorType |
| } |
| |
| Here, we are specifying constraints on an associated type (EnumeratorType must |
| conform to the Enumerator protocol), by adding a conformance clause (: Enumerator) |
| to the associated type definition. We also use a separate where clause to |
| require that the type of values produced by querying the enumerator is the same as |
| the type of values stored in the container. This is important, for example, for |
| use with the Comparable protocol (and any protocol using Self types), because it |
| maintains type identity within the generic function or type. |
| |
| Constraint Inference |
| -------------------- |
| |
| Generic types often constrain their type parameters. For example, a |
| SortedDictionary, which provides dictionary functionality using some kind of |
| balanced binary tree (as in C++'s std::map), would require that its key type be |
| Comparable:: |
| |
| class SortedDictionary<Key : Comparable, Value> { |
| // ... |
| } |
| |
| Naturally, one any generic operation on a SortedDictionary<K,V> would also require |
| that K be Comparable, e.g.,:: |
| |
| func forEachKey<Key : Comparable, Value>(_ c : SortedDictionary<Key, Value>, |
| f : (Key) -> Void) { /* ... */ } |
| |
| However, explicitly requiring that Key conform to Comparable is redundant: one |
| could not provide an argument for 'c' without the Key type of the |
| SortedDictionary conforming to Comparable, because the SortedDictionary type |
| itself could not be formed. Constraint inference infers these additional |
| constraints within a generic function from the parameter and return types of the |
| function, simplifying the specification of forEachKey:: |
| |
| func forEachKey<Key, Value>(_ c : SortedDictionary<Key, Value>, |
| f : (Key) -> Void) { /* ... */ } |
| |
| Type Parameter Deduction |
| ------------------------ |
| |
| As noted above, type arguments will be deduced from the call arguments to a |
| generic function:: |
| |
| var values : list<Int> |
| insertAtBeginning(values, 17) // deduces T = Int |
| |
| Since Swift already has top-down type inference (as well as the C++-like |
| bottom-up inference), we can also deduce type arguments from the result type:: |
| |
| func cast<T, U>(_ value : T) -> U { ... } |
| var x : Any |
| var y : Int = cast(x) // deduces T = Any, U = Int |
| |
| We require that all type parameters for a generic function be deducible. We |
| introduce this restriction so that we can avoid introducing a syntax for |
| explicitly specifying type arguments to a generic function, e.g.,:: |
| |
| var y : Int = cast<Int>(x) // not permitted: < is the less-than operator |
| |
| This syntax is horribly ambiguous in C++, and with good type argument deduction, |
| should not be necessary in Swift. |
| |
| Implementation Model |
| -------------------- |
| |
| Because generics are constrained, a well-typed generic function or type can be |
| translated into object code that uses dynamic dispatch to perform each of its |
| operations on type parameters. This is in stark contrast to the instantiation |
| model of C++ templates, where each new set of template arguments requires the |
| generic function or type to be compiled again. This model is important for |
| scalability of builds, so that the time to perform type-checking and code |
| generation scales with the amount of code written rather than the amount of code |
| instantiated. Moreover, it can lead to smaller binaries and a more flexible |
| language (generic functions can be "virtual"). |
| |
| The translation model is fairly simple. Consider the generic find() we |
| implemented for lists, above:: |
| |
| func find<T : Comparable>(_ list : List<T>, value : T) -> Int { |
| var index = 0 |
| var current = list.First |
| while current is ListNode<T> { // now I'm just making stuff up |
| if current.value.isEqual(value) { // okay: T is Comparable |
| return index |
| } |
| current = current.Next |
| index = index + 1 |
| } |
| return -1 |
| } |
| |
| to translate this into executable code, we form a vtable for each of the |
| constraints on the generic function. In this case, we'll have a vtable for |
| Comparable T. Every operation within the body of this generic function |
| type-checks to either an operation on some concrete type (e.g., the operations |
| on Int), to an operation within a protocol (which requires indirection through |
| the corresponding vtable), or to an operation on a generic type definition, all |
| of which can be emitted as object code. |
| |
| .. _generics-specialization: |
| |
| Specialization |
| -------------- |
| |
| This implementation model lends itself to optimization when we know the specific |
| argument types that will be used when invoking the generic function. In this |
| case, some or all of the vtables provided for the constraints will effectively |
| be constants. By specializing the generic function (at compile-time, link-time, |
| or (if we have a JIT) run-time) for these types, we can eliminate the cost of |
| the virtual dispatch, inline calls when appropriate, and eliminate the overhead |
| of the generic system. Such optimizations can be performed based on heuristics, |
| user direction, or profile-guided optimization. |
| |
| An internal @_specialize function attribute allows developers to force |
| full specialization by listing concrete type names corresponding to the |
| function's generic signature. A function's generic signature is a |
| concatenation of its generic context and the function's own generic |
| type parameters.:: |
| |
| struct S<T> { |
| var x: T |
| @_specialize(where T == Int, U == Float) |
| mutating func exchangeSecond<U>(_ u: U, _ t: T) -> (U, T) { |
| x = t |
| return (u, x) |
| } |
| } |
| |
| // Substitutes: <T, U> with <Int, Float> producing: |
| // S<Int>::exchangeSecond<Float>(u: Float, t: Int) -> (Float, Int) |
| |
| @_specialize currently acts as a hint to the optimizer, which |
| generates type checks and code to dispatch to the specialized routine |
| without affecting the signature of the generic function. The |
| intention is to support efforts at evaluating the performance of |
| specialized code. The performance impact is not guaranteed and is |
| likely to change with the optimizer. This attribute should only be |
| used in conjunction with rigorous performance analysis. Eventually, |
| a similar attribute could be defined in the language, allowing it to be |
| exposed as part of a function's API. That would allow direct dispatch |
| to specialized code without type checks, even across modules. |
| |
| The exact syntax of the @_specialize function attribute is defined as: :: |
| |
| @_specialize(exported: true, kind: full, where K == Int, V == Int) |
| @_specialize(exported: false, kind: partial, where K: _Trivial64) |
| func dictFunction<K, V>(dict: Dictionary<K, V>) { |
| } |
| |
| If 'exported' is set, the corresponding specialization would have a public |
| visibility and can be used by clients. If 'exported' is omitted, it's value |
| is assumed to be 'false'. |
| |
| If 'kind' is 'full' it means a full specialization and the compiler will |
| produce an error if you forget to specify the type for some of the generic |
| parameters in the 'where' clause. If 'kind' is 'partial' it means a partial |
| specialization. If 'kind' is omitted, its value is assumed to be 'full. |
| |
| The requirements in the where clause may be same-type constraints like 'T == Int', |
| but they may also specify so-called layout constraints like 'T: _Trivial'. |
| |
| The following layout constraints are currently supported: |
| * AnyObject - the actual parameter should be an instance of a class |
| * _NativeClass - the actual parameter should be an instance of a Swift native |
| class |
| * _RefCountedObject - the actual parameter should be a reference-counted object |
| * _NativeRefCountedObject - the actual parameter should be a Swift-native |
| reference-counted object |
| * _Trivial - the actual parameter should be of a trivial type, i.e. a type |
| without any reference counted properties. |
| * _Trivial(SizeInBits) - like _Trivial, but the size of the type should be |
| exactly 'SizeInBits' bits. |
| * _TrivialAtMost(SizeInBits) - like _Trivial, but the size of the type should |
| be at most 'SizeInBits' bits. |
| |
| |
| Existential Types and Generics |
| ------------------------------ |
| |
| Both existential types and generics depend on dynamic dispatching based on |
| protocols. A value of an existential type (say, Comparable) is a pair (value, |
| vtable). 'value' stores the current value either directly (if it fits in the 3 |
| words allocated to the value) or as a pointer to the boxed representation (if |
| the actual representation is larger than 3 words). By itself, this value cannot |
| be interpreted, because it's type is not known statically, and may change due to |
| assignment. The vtable provides the means to manipulate the value, because it |
| provides a mapping between the protocols to which the existential type conforms |
| (which is known statically) to the functions that implements that |
| functionality for the type of the value. The value, therefore, can only be |
| safely manipulated through the functions in this vtable. |
| |
| A value of some generic type T uses a similar implementation model. However, |
| the (value, vtable) pair is split apart: values of type T contain only the value |
| part (the 3 words of data), while the vtable is maintained as a separate value |
| that can be shared among all T's within that generic function. |
| |
| Overloading |
| ----------- |
| |
| Generic functions can be overloaded based entirely on constraints. For example, |
| consider a binary search algorithm:: |
| |
| func binarySearch< |
| C : EnumerableCollection where C.Element : Comparable |
| >(_ collection : C, value : C.Element) |
| -> C.EnumeratorType |
| { |
| // We can perform log(N) comparisons, but EnumerableCollection |
| // only supports linear walks, so this is linear time |
| } |
| |
| protocol RandomAccessEnumerator : Enumerator { |
| // splits a range in half, returning both halves |
| func split() -> (Enumerator, Enumerator) |
| } |
| |
| func binarySearch< |
| C : EnumerableCollection |
| where C.Element : Comparable, |
| C.EnumeratorType: RandomAccessEnumerator |
| >(_ collection : C, value : C.Element) |
| -> C.EnumeratorType |
| { |
| // We can perform log(N) comparisons and log(N) range splits, |
| // so this is logarithmic time |
| } |
| |
| If binarySearch is called with a sequence whose range type conforms to |
| RandomAccessEnumerator, both of the generic functions match. However, the second |
| function is more specialized, because its constraints are a superset of the |
| constraints of the first function. In such a case, overloading should pick the |
| more specialized function. |
| |
| There is a question as to when this overloading occurs. For example, |
| binarySearch might be called as a subroutine of another generic function with |
| minimal requirements:: |
| |
| func doSomethingWithSearch< |
| C : EnumerableCollection where C.Element : Ordered |
| >( |
| _ collection : C, value : C.Element |
| ) -> C.EnumeratorType |
| { |
| binarySearch(collection, value) |
| } |
| |
| At the time when the generic definition of doSomethingWithSearch is |
| type-checked, only the first binarySearch() function applies, since we don't |
| know that C.EnumeratorType conforms to RandomAccessEnumerator. However, when |
| doSomethingWithSearch is actually invoked, C.EnumeratorType might conform to the |
| RandomAccessEnumerator, in which case we'd be better off picking the second |
| binarySearch. This amounts to run-time overload resolution, which may be |
| desirable, but also has downsides, such as the potential for run-time failures |
| due to ambiguities and the cost of performing such an expensive operation at |
| these call sites. Of course, that cost could be mitigated in hot generic |
| functions via the specialization mentioned above. |
| |
| Our current proposal for this is to decide statically which function is called |
| (based on similar partial-ordering rules as used in C++), and avoid run-time |
| overload resolution. If this proves onerous, we can revisit the decision later. |
| |
| Parsing Issues |
| -------------- |
| |
| The use of angle brackets to supply arguments to a generic type, while familiar |
| to C++/C#/Java programmers, cause some parsing problems. The problem stems from |
| the fact that '<', '>', and '>>' (the latter of which will show up in generic |
| types such as Array<Array<Int>>) match the 'operator' terminal in the grammar, |
| and we wish to continue using this as operators. |
| |
| When we're in the type grammar, this is a minor inconvenience for the parser, |
| because code like this:: |
| |
| var x : Array<Int> |
| |
| will essentially parse the type as:: |
| |
| identifier operator identifier operator |
| |
| and verify that the operators are '<' and '>', respectively. Cases |
| involving <> are more interesting, because the type of:: |
| |
| var y : Array<Array<Int>> |
| |
| is effectively parsed as:: |
| |
| identifier operator identifier operator identifier operator operator |
| |
| by splitting the '>>' operator token into two '>' operator tokens. |
| |
| However, this is manageable, and is already implemented for the (now deprecated) |
| protocol composition syntax (protocol<>). The larger problem occurs at expression |
| context, where the parser cannot disambiguate the tokens:: |
| |
| Matrix<Double>(10, 10) |
| |
| i.e.,:: |
| |
| identifier operator identifier operator unspaced_lparen integer-literal comma integer-literal rparen |
| |
| which can be interpreted as either:: |
| |
| (greater_than |
| (less_than |
| (declref Matrix) |
| (declref Double) |
| (tuple |
| (integer_literal 10) |
| (integer_literal 10))) |
| |
| or:: |
| |
| (constructor Matrix<Double> |
| (tuple |
| (integer_literal 10) |
| (integer_literal 10))) |
| |
| Both Java and C# have this ambiguity. C# resolves the ambiguity by looking at |
| the token after the closing '>' to decide which way to go; Java seems to do the |
| same. We have a few options: |
| |
| 1. Follow C# and Java and implement the infinite lookahead needed to make this |
| work. Note that we have true ambiguities, because one could make either of |
| the above parse trees well-formed. |
| |
| 2. Introduce some kind of special rule for '<' like we have for '(', such as: an |
| identifier followed by an unspaced '<' is a type, while an identifier |
| followed by spacing and then '<' is an expression, or |
| |
| 3. Pick some syntax other than angle brackets, which is not ambiguous. Note |
| that neither '(' nor '[' work, because they too have expression forms. |
| |
| 4. Disambiguate between the two parses semantically. |
| |
| We're going to try a variant of #1, using a variation of the disambiguation |
| rule used in C#. Essentially, when we see:: |
| |
| identifier < |
| |
| we look ahead, trying to parse a type parameter list, until parsing the type |
| parameter list fails or we find a closing '>'. We then look ahead an additional |
| token to see if the closing '>' is followed by a '(', '.', or closing bracketing |
| token (since types are most commonly followed by a constructor call or static |
| member access). If parsing the type parameter list succeeds, and the closing |
| angle bracket is followed by a '(', '.', or closing bracket token, then the |
| '<...>' sequence is parsed as a generic parameter list; otherwise, the '<' |
| is parsed as an operator. |
| |
| .. @ignore('all') |