blob: b5e15cbf3ea57343847e08f8ea7cf3a17c400d5c [file] [log] [blame]
.. _Generics:
Generics in Swift
Most types and functions in code are expressed in terms of a single, concrete
set of sets. Generics generalize this notion by allowing one to express types
and functions in terms of an abstraction over a (typically unbounded) set of
types, allowing improved code reuse. A typical example of a generic type is a
linked list of values, which can be used with any type of value. In C++, this
might be expressed as::
template<typename T>
class List {
struct Node {
T value;
Node *next;
Node *first;
where List<Int>, List<String>, and List<DataRecord> are all distinct types that
provide a linked list storing integers, strings, and DataRecords,
respectively. Given such a data structure, one also needs to be able to
implement generic functions that can operate on a list of any kind of elements,
such as a simple, linear search algorithm::
template<typename T>
typename List<T>::Node *find(const List<T>&list, const T& value) {
for (typename List<T>::Node *result = list.first; result; result = result->next)
if (result->value == value)
return result;
return 0;
.. @test('compile', howmany = 'all', cmake_args = ['COMPILER', '${CMAKE_CXX_COMPILER}'])
Generics are important for the construction of useful libraries, because they
allow the library to adapt to application-specific data types without losing
type safety. This is especially important for foundational libraries containing
common data structures and algorithms, since these libraries are used across
nearly every interesting application.
The alternatives to generics tend to lead to poor solutions:
* Object-oriented languages tend to use "top" types (id in Objective-C,
java.lang.Object in pre-generics Java, etc.) for their containers and
algorithms, which gives up static type safety. Pre-generics Java forced the
user to introduce run-time-checked type casts when interacting with containers
(which is overly verbose), while Objective-C relies on id's unsound implicit
conversion behavior to eliminate the need for casts.
* Many languages bake common data structures (arrays, dictionaries, tables) into
the language itself. This is unfortunate both because it significantly
increases the size of the core language and because users then tend to use
this limited set of data structures for *every* problem, even when another
(not-baked-in) data structure would be better.
Swift is intended to be a small, expressive language with great support for
building libraries. We'll need generics to be able to build those libraries
* Generics should enable the development of rich generic libraries that feel
similar to first-class language features
* Generics should work on any type, whether it is a value type or some kind of
object type
* Generic code should be almost as easy to write as non-generic code
* Generic code should be compiled such that it can be executed with any data
type without requiring a separate "instantiation" step
* Generics should interoperate cleanly with run-time polymorphism
* Types should be able to retroactively modified to meet the requirements of a
generic algorithm or data structure
As important as the goals of a feature are the explicit non-goals, which we
don't want or don't need to support:
* Compile-time "metaprogramming" in any form
* Expression-template tricks a la Boost.Spirit, POOMA
Polymorphism allows one to use different data types with a uniform
interface. Overloading already allows a form of polymorphism (ad hoc
polymorphism) in Swift. For example, given::
func +(x : Int, y : Int) -> Int { add... }
func +(x : String, y : String) -> String { concat... }
.. @example.replace('add...','return 1')
example.replace('concat...','return ""')
we can write the expression "x + y", which will work for both integers and
However, we want the ability to express an algorithm or data structure
independently of mentioning any data type. To do so, we need a way to express
the essential interface that algorithm or data structure requires. For example,
an accumulation algorithm would need to express that for any type T, one can
write the expression "x + y" (where x and y are both of type T) and it will
produce another T.
Most languages that provide some form of polymorphism also have a way to
describe abstract interfaces that cover a range of types: Java and C#
interfaces, C++ abstract base classes, Objective-C protocols, Scala traits,
Haskell type classes, C++ concepts (briefly), and many more. All allow one to
describe functions or methods that are part of the interface, and provide some
way to re-use or extend a previous interface by adding to it. We'll start with
that core feature, and build onto it what we need.
In Swift, I suggest that we use the term protocol for this feature, because I
expect the end result to be similar enough to Objective-C protocols that our
users will benefit, and (more importantly) different enough from Java/C#
interfaces and C++ abstract base classes that those terms will be harmful. The
term trait comes with the wrong connotation for C++ programmers, and none of our
users know Scala.
In its most basic form, a protocol is a collection of function signatures::
protocol Document {
func title() -> String
Document describes types that have a title() operation that accepts no arguments
and returns a String. Note that there is implicitly a 'self' type,
which is the type that conforms to the protocol itself. This follows how most
object-oriented languages describe interfaces, but deviates from Haskell type
classes and C++ concepts, which require explicit type parameters for all of the
types. We'll revisit this decision later.
Protocol Inheritance
Composition of protocols is important to help programmers organize and
understand a large number of protocols and the data types that conform to those
protocols. For example, we could extend our Document protocol to cover documents
that support versioning::
protocol VersionedDocument : Document {
func version() -> Int
Multiple inheritance is permitted, allowing us to form a directed acyclic graph
of protocols::
protocol PersistentDocument : VersionedDocument, Serializable {
func saveToFile(_ filename : path)
.. @example.prepend('struct path {} ; protocol Serializable {}')
Any type that conforms to PersistentDocument also conforms to VersionedDocument,
Document, and Serializable, which gives us substitutability.
Self Types
Protocols thus far do not give us an easy way to express simple binary
operations. For example, let's try to write a Comparable protocol that could be
used to search for a generic find() operation::
protocol Comparable {
func isEqual(_ other : ???) -> Bool
Our options for filling in ??? are currently very poor. We could use the syntax
for saying "any type" or "any type that is comparable", as one must do most OO
languages, including Java, C#, and Objective-C, but that's not expressing what
we want: that the type of both of the arguments be the same. This is sometimes
referred to as the binary method problem
( has a discussion of this
problem, including the solution I'm proposing below).
Neither C++ concepts nor Haskell type classes have this particular problem,
because they don't have the notion of an implicit 'Self' type. Rather,
they explicitly parameterize everything. In C++ concepts::
concept Comparable<typename T> {
bool T::isEqual(T);
.. @ignore()
.. We don't have a compiler for ConceptC++
Java and C# programmers work around this issue by parameterizing the
interface, e.g. (in Java)::
abstract class Comparable<THIS extends Comparable<THIS>> {
public bool isEqual(THIS other);
.. @ignore()
.. This test just doesn't compile at the moment, but that represents a
bug in Swift
and then a class X that wants to be Comparable will inherit from
Comparable<X>. This is ugly and has a number of pitfalls; see
Scala and Strongtalk have the notion of the 'Self' type, which effectively
allows one to refer to the eventual type of 'self' (which we call
'self'). 'Self' (which we call 'Self' in Swift) allows us to express the
Comparable protocol in a natural way::
protocol Comparable {
func isEqual(_ other : Self) -> Bool
By expressing Comparable in this way, we know that if we have two objects of
type T where T conforms to Comparable, comparison between those two objects with
isEqual is well-typed. However, if we have objects of different types T and U,
we cannot compare those objects with isEqual even if both T and U are
Self types are not without their costs, particularly in the case where Self is
used as a parameter type of a class method that will be subclassed. Here, the
parameter type ends up being (implicitly) covariant, which tightens up
type-checking but may also force us into more dynamic type checks. We can
explore this separately; within protocols, type-checking for Self is more
Associated Types
In addition to Self, a protocol's operations often need to refer to types that
are related to the type of 'Self', such as a type of data stored in a
collection, or the node and edge types of a graph. For example, this would allow
us to cleanly describe a protocol for collections::
protocol Collection {
typealias Element
func forEach(_ callback : (value : Element) -> Void)
func add(_ value : Element)
It is important here that a generic function that refers to a given type T,
which is known to be a collection, can access the associated types corresponding
to T. For example, one could implement an "accumulate" operation for an
arbitrary Collection, but doing so requires us to specify some constraints on
the Value type of the collection. We'll return to this later.
Operators, Properties, and Subscripting
As previously noted, protocols can contain both function requirements (which are
in effect requirements for instance methods) and associated type
requirements. Protocols can also contain operators, properties, and subscript
protocol RandomAccessContainer : Collection {
var count: Int
func == (lhs: Self, rhs: Self)
subscript(i: Int) -> Element
Operator requirements can be satisfied by operator definitions, property
requirements can be satisfied by either variables or properties, and subscript
requirements can be satisfied by subscript operators.
Conforming to a Protocol
Thus far, we have not actually shown how a type can meet the requirements of a
protocol. The most syntactically lightweight approach is to allow implicit
conformance. This is essentially duck typing, where a type is assumed to conform
to a protocol if it meets the syntactic requirements of the protocol. For
example, given::
protocol Shape {
func draw()
One could write a Circle struct such as::
struct Circle {
var center : Point
var radius : Int
func draw() {
// draw it
Circle provides a draw() method with the same input and result types as required
by the Shape protocol. Therefore, Circle conforms to Shape.
Implicit protocol conformance is convenient, because it requires no additional
typing. However, it can run into some trouble when an entity that syntactically
matches a protocol doesn't provide the required semantics. For example, Cowboys
also know how to "draw!"::
struct Cowboy {
var gun : SixShooter
func draw() {
// draw!
It is unlikely that Cowboy is meant to conform to Shape, but the method name and
signatures match, so implicit conformance deduces that Cowboy conforms to
Shape. Random collisions between types are fairly rare. However, when one is
using protocol inheritance with fine-grained (semantic or mostly-semantic)
differences between protocols in the hierarchy, they become more common. See for examples
of this problem as it surfaced with C++ concepts. It is not clear at this time
whether we want implicit conformance in Swift: there's no existing code to worry
about, and explicit conformance (described below) provides some benefits.
Explicit Protocol Conformance
Type authors often implement types that are intended to conform to a particular
protocol. For example, if we want a linked-list type to conform to Collection,
we can specify that it is by adding a protocol conformance annotation to the
struct EmployeeList : Collection { // EmployeeList is a collection
typealias Element = T
func forEach(_ callback : (value : Element) -> Void) { /* Implement this */ }
func add(_ value : Element) { /* Implement this */ }
This explicit protocol conformance declaration forces the compiler to check that
EmployeeList actually does meet the requirements of the Collection protocol. If
we were missing an operation (say, forEach) or had the wrong signature, the
definition of 'EmployeeList' would be ill-formed. Therefore, explicit
conformance provides both documentation for the user of EmployeeList and
checking for the author and future maintainers of EmployeeList.
Any nominal type (such as an enum, struct, or class) can be specified to conform
to one or more protocols in this manner. Additionally, a typealias can be
specified to conform to one or more protocols, e.g.,::
typealias NSInteger : Numeric = Int
While not technically necessary due to retroactive modeling (below), this can be
used to document and check that a particular type alias does in fact meet some
basic, important requirements. Moreover, it falls out of the syntax that places
requirements on associated types.
Retroactive Modeling
When using a set of libraries, it's fairly common that one library defines a
protocol (and useful generic entities requiring that protocol) while another
library provides a data type that provides similar functionality to that
protocol, but under a different name. Retroactive modeling is the process by
which the type is retrofitted (without changing the type) to meet the
requirements of the protocol.
In Swift, we provide support for retroactive modeling by allowing
extensions, e.g.,::
extension String : Collection {
typealias Element = char
func forEach(_ callback : (value : Element) -> Void) { /* use existing String routines to enumerate characters */ }
func add(_ value : Element) { self += value /* append character */ }
Once an extension is defined, the extension now conforms to the Collection
protocol, and can be used anywhere a Collection is expected.
Default Implementations
The functions declared within a protocol are requirements that any type must
meet if it wants to conform to the protocol. There is a natural tension here,
then, between larger protocols that make it easier to write generic algorithms,
and smaller protocols that make it easier to write conforming types. For
example, should a Numeric protocol implement all operations, e.g.,::
protocol Numeric {
func +(lhs : Self, rhs : Self) -> Self
func -(lhs : Self, rhs : Self) -> Self
func +(x : Self) -> Self
func -(x : Self) -> Self
which would make it easy to write general numeric algorithms, but requires the
author of some BigInt class to implement a lot of functionality, or should the
numeric protocol implement just the core operations::
protocol Numeric {
func +(lhs : Self, rhs : Self) -> Self
func -(x : Self) -> Self
to make it easier to adopt the protocol (but harder to write numeric
algorithms)? Both of the protocols express the same thing (semantically),
because one can use the core operations (binary +, unary -) to implement the
other algorithms. However, it's far easier to allow the protocol itself to
provide default implementations::
protocol Numeric {
func +(lhs : Self, rhs : Self) -> Self
func -(lhs : Self, rhs : Self) -> Self { return lhs + -rhs }
func +(x : Self) -> Self { return x }
func -(x : Self) -> Self
This makes it easier both to implement generic algorithms (which can use the
most natural syntax) and to make a new type conform to the protocol. For
example, if we were to define only the core algorithms in our BigNum type::
struct BigNum : Numeric {
func +(lhs : BigNum, rhs : BigNum) -> BigNum { ... }
func -(x : BigNum) -> BigNum { ... }
the compiler will automatically synthesize the other operations needed for the
protocol. Moreover, these operations will be available to uses of the BigNum
class as if they had been written in the type itself (or in an extension of the
type, if that feature is used), which means that protocol conformance actually
makes it easier to define types that conform to protocols, rather than just
providing additional checking.
Subtype Polymorphism
Subtype polymorphism is based on the notion of substitutability. If a type S is
a subtype of a type T, then a value of type S can safely be used where a value
of type T is expected. Object-oriented languages typically use subtype
polymorphism, where the subtype relationship is based on inheritance: if the
class Dog inherits from the class Animal, then Dog is a subtype of
Animal. Subtype polymorphism is generally dynamic, in the sense that the
substitution occurs at run-time, even if it is statically type-checked.
In Swift, we consider protocols to be types. A value of protocol type has an
existential type, meaning that we don't know the concrete type until run-time
(and even then it varies), but we know that the type conforms to the given
protocol. Thus, a variable can be declared with type "Serializable", e.g.,::
var x : Serializable = // value of any Serializable type
x.serialize() // okay: serialize() is part of the Serializable protocol
Naturally, such polymorphism is dynamic, and will require boxing of value types
to implement. We can now see how Self types interact with subtype
polymorphism. For example, say we have two values of type Comparable, and we try
to compare them::
var x : Comparable = ...
var y : Comparable = ...
if x.isEqual(y) { // well-typed?
Whether x.isEqual(y) is well-typed is not statically determinable, because the
dynamic type of x may different from the dynamic type of y, even if they are
both comparable (e.g., one is an Int and the other a String). It can be
implemented by the compiler as a dynamic type check, with some general failure
mode (aborting, throwing an exception, etc.) if the dynamic type check fails.
To express types that meet the requirements of several protocols, one can just
create a new protocol aggregating those protocols::
protocol SerializableDocument : Document, Serializable { }
var doc : SerializableDocument
print(doc.title()) // okay: title() is part of the Document protocol, so we can call it
doc.serialize(stout) // okay: serialize() is part of the Serializable protocol
However, this only makes sense when the resulting protocol is a useful
abstraction. A SerializableDocument may or may not be a useful abstraction. When
it is not useful, one can instead use '&' types to compose different protocols, e.g.,::
var doc : Document & Serializable
Here, doc has an existential type that is known to conform to both the Document
and Serializable protocols. This gives rise to a natural "top" type, such that
every type in the language is a subtype of "top". Java has java.lang.Object, C#
has object, Objective-C has "id" (although "id" is weird, because it is also
convertible to everything; it's best not to use it as a model). In Swift, the
"top" type is simply an empty protocol composition: ``Any``::
var value : Any = 17 // an any can hold an integer
value = "hello" // or a String
value = (42, "hello", Red) // or anything else
Bounded Parametric Polymorphism
Parametric polymorphism is based on the idea of providing type parameters for a
generic function or type. When using that function or type, one substitutes
concrete types for the type parameters. Strictly speaking, parametric
polymorphism allows *any* type to be substituted for a type parameter, but it's
useless in practice because that means that generic functions or types cannot do
anything to the type parameters: they must instead rely on first-class functions
passed into the generic function or type to perform any meaningful work.
Far more useful (and prevalent) is bounded parametric polymorphism, which allows
the generic function or type to specify constraints (bounds) on the type
parameters. By specifying these bounds, it becomes far easier to write and use
these generic functions and types. Haskell type classes, Java and C# generics,
C++ concepts, and many other language features support bounded parametric
Protocols provide a natural way to express the constraints of a generic function
in Swift. For example, one could define a generic linked list as::
struct ListNode<T> {
var Value : T
enum NextNode { case Node : ListNode<T>, End }
var Next : NextNode
struct List<T > {
var First : ListNode<T>::NextNode
This list works on any type T. One could then add a generic function that
inserts at the beginning of the list::
func insertAtBeginning<T>(_ list : List<T>, value : T) {
list.First = ListNode<T>(value, list.First)
Expressing Constraints
Within the type parameter list of a generic type or function (e.g., the <T> in
ListNode<T>), the 'T' introduces a new type parameter and the (optional) ":
type" names a protocol (or protocol composition) to which 'T' must
conform. Within the body of the generic type or function, any of the functions
or types described by the constraints are available. For example, let's
implement a find() operation on lists::
func find<T : Comparable>(_ list : List<T>, value : T) -> Int {
var index = 0
var current
for (current = list.First; current is Node; current = current.Next) {
if current.Value.isEqual(value) { // okay: T is Comparable
return index
index = index + 1
return -1
In addition to providing constraints on the type parameters, we also need to be
able to constrain associated types. To do so, we introduce the notion of a
"where" clause, which follows the signature of the generic type or
function. For example, let's generalize our find algorithm to work on any
ordered collection::
protocol OrderedCollection : Collection {
func size() -> Int
func getAt(_ index : Int) -> Element // Element is an associated type
func find<C : OrderedCollection where C.Element : Comparable>(
_ collection : C, value : C.Element) -> Int
for index in 0...collection.size() {
if (collection.getAt(index) == value) { // okay: we know that C.Element is Comparable
return index
return -1
The where clause is actually the more general way of expressing constraints,
and the constraints expressed in the angle brackets (e.g., <C :
OrderedCollection>) are just sugar for a where clause. For example, the
above find() signature is equivalent to::
func find<C where C : OrderedCollection, C.Element : Comparable>(
_ collection : C, value : C.Element) -> Int
Note that find<C> is shorthand for (and equivalent to) find<C : Any>, since
every type conforms to the Any protocol composition.
There are two other important kinds of constraints that need to be
expressible. Before we get to those, consider a simple "Enumerator" protocol that
lets us describe an iteration of values of some given value type::
protocol Enumerator {
typealias Element
func isEmpty() -> Bool
func next() -> Element
Now, we want to express the notion of an enumerable collection, which provides
iteration, which we do by adding requirements into the protocol::
protocol EnumerableCollection : Collection {
typealias EnumeratorType : Enumerator
where EnumeratorType.Element == Element
func getEnumeratorType() -> EnumeratorType
Here, we are specifying constraints on an associated type (EnumeratorType must
conform to the Enumerator protocol), by adding a conformance clause (: Enumerator)
to the associated type definition. We also use a separate where clause to
require that the type of values produced by querying the enumerator is the same as
the type of values stored in the container. This is important, for example, for
use with the Comparable protocol (and any protocol using Self types), because it
maintains type identity within the generic function or type.
Constraint Inference
Generic types often constrain their type parameters. For example, a
SortedDictionary, which provides dictionary functionality using some kind of
balanced binary tree (as in C++'s std::map), would require that its key type be
class SortedDictionary<Key : Comparable, Value> {
// ...
Naturally, one any generic operation on a SortedDictionary<K,V> would also require
that K be Comparable, e.g.,::
func forEachKey<Key : Comparable, Value>(_ c : SortedDictionary<Key, Value>,
f : (Key) -> Void) { /* ... */ }
However, explicitly requiring that Key conform to Comparable is redundant: one
could not provide an argument for 'c' without the Key type of the
SortedDictionary conforming to Comparable, because the SortedDictionary type
itself could not be formed. Constraint inference infers these additional
constraints within a generic function from the parameter and return types of the
function, simplifying the specification of forEachKey::
func forEachKey<Key, Value>(_ c : SortedDictionary<Key, Value>,
f : (Key) -> Void) { /* ... */ }
Type Parameter Deduction
As noted above, type arguments will be deduced from the call arguments to a
generic function::
var values : list<Int>
insertAtBeginning(values, 17) // deduces T = Int
Since Swift already has top-down type inference (as well as the C++-like
bottom-up inference), we can also deduce type arguments from the result type::
func cast<T, U>(_ value : T) -> U { ... }
var x : Any
var y : Int = cast(x) // deduces T = Any, U = Int
We require that all type parameters for a generic function be deducible. We
introduce this restriction so that we can avoid introducing a syntax for
explicitly specifying type arguments to a generic function, e.g.,::
var y : Int = cast<Int>(x) // not permitted: < is the less-than operator
This syntax is horribly ambiguous in C++, and with good type argument deduction,
should not be necessary in Swift.
Implementation Model
Because generics are constrained, a well-typed generic function or type can be
translated into object code that uses dynamic dispatch to perform each of its
operations on type parameters. This is in stark contrast to the instantiation
model of C++ templates, where each new set of template arguments requires the
generic function or type to be compiled again. This model is important for
scalability of builds, so that the time to perform type-checking and code
generation scales with the amount of code written rather than the amount of code
instantiated. Moreover, it can lead to smaller binaries and a more flexible
language (generic functions can be "virtual").
The translation model is fairly simple. Consider the generic find() we
implemented for lists, above::
func find<T : Comparable>(_ list : List<T>, value : T) -> Int {
var index = 0
var current = list.First
while current is ListNode<T> { // now I'm just making stuff up
if current.value.isEqual(value) { // okay: T is Comparable
return index
current = current.Next
index = index + 1
return -1
to translate this into executable code, we form a vtable for each of the
constraints on the generic function. In this case, we'll have a vtable for
Comparable T. Every operation within the body of this generic function
type-checks to either an operation on some concrete type (e.g., the operations
on Int), to an operation within a protocol (which requires indirection through
the corresponding vtable), or to an operation on a generic type definition, all
of which can be emitted as object code.
.. _generics-specialization:
This implementation model lends itself to optimization when we know the specific
argument types that will be used when invoking the generic function. In this
case, some or all of the vtables provided for the constraints will effectively
be constants. By specializing the generic function (at compile-time, link-time,
or (if we have a JIT) run-time) for these types, we can eliminate the cost of
the virtual dispatch, inline calls when appropriate, and eliminate the overhead
of the generic system. Such optimizations can be performed based on heuristics,
user direction, or profile-guided optimization.
An internal @_specialize function attribute allows developers to force
full specialization by listing concrete type names corresponding to the
function's generic signature. A function's generic signature is a
concatenation of its generic context and the function's own generic
type parameters.::
struct S<T> {
var x: T
@_specialize(where T == Int, U == Float)
mutating func exchangeSecond<U>(_ u: U, _ t: T) -> (U, T) {
x = t
return (u, x)
// Substitutes: <T, U> with <Int, Float> producing:
// S<Int>::exchangeSecond<Float>(u: Float, t: Int) -> (Float, Int)
@_specialize currently acts as a hint to the optimizer, which
generates type checks and code to dispatch to the specialized routine
without affecting the signature of the generic function. The
intention is to support efforts at evaluating the performance of
specialized code. The performance impact is not guaranteed and is
likely to change with the optimizer. This attribute should only be
used in conjunction with rigorous performance analysis. Eventually,
a similar attribute could be defined in the language, allowing it to be
exposed as part of a function's API. That would allow direct dispatch
to specialized code without type checks, even across modules.
The exact syntax of the @_specialize function attribute is defined as: ::
@_specialize(exported: true, kind: full, where K == Int, V == Int)
@_specialize(exported: false, kind: partial, where K: _Trivial64)
func dictFunction<K, V>(dict: Dictionary<K, V>) {
If 'exported' is set, the corresponding specialization would have a public
visibility and can be used by clients. If 'exported' is omitted, it's value
is assumed to be 'false'.
If 'kind' is 'full' it means a full specialization and the compiler will
produce an error if you forget to specify the type for some of the generic
parameters in the 'where' clause. If 'kind' is 'partial' it means a partial
specialization. If 'kind' is omitted, its value is assumed to be 'full.
The requirements in the where clause may be same-type constraints like 'T == Int',
but they may also specify so-called layout constraints like 'T: _Trivial'.
The following layout constraints are currently supported:
* AnyObject - the actual parameter should be an instance of a class
* _NativeClass - the actual parameter should be an instance of a Swift native
* _RefCountedObject - the actual parameter should be a reference-counted object
* _NativeRefCountedObject - the actual parameter should be a Swift-native
reference-counted object
* _Trivial - the actual parameter should be of a trivial type, i.e. a type
without any reference counted properties.
* _Trivial(SizeInBits) - like _Trivial, but the size of the type should be
exactly 'SizeInBits' bits.
* _TrivialAtMost(SizeInBits) - like _Trivial, but the size of the type should
be at most 'SizeInBits' bits.
Existential Types and Generics
Both existential types and generics depend on dynamic dispatching based on
protocols. A value of an existential type (say, Comparable) is a pair (value,
vtable). 'value' stores the current value either directly (if it fits in the 3
words allocated to the value) or as a pointer to the boxed representation (if
the actual representation is larger than 3 words). By itself, this value cannot
be interpreted, because its type is not known statically, and may change due to
assignment. The vtable provides the means to manipulate the value, because it
provides a mapping between the protocols to which the existential type conforms
(which is known statically) to the functions that implements that
functionality for the type of the value. The value, therefore, can only be
safely manipulated through the functions in this vtable.
A value of some generic type T uses a similar implementation model. However,
the (value, vtable) pair is split apart: values of type T contain only the value
part (the 3 words of data), while the vtable is maintained as a separate value
that can be shared among all T's within that generic function.
Generic functions can be overloaded based entirely on constraints. For example,
consider a binary search algorithm::
func binarySearch<
C : EnumerableCollection where C.Element : Comparable
>(_ collection : C, value : C.Element)
-> C.EnumeratorType
// We can perform log(N) comparisons, but EnumerableCollection
// only supports linear walks, so this is linear time
protocol RandomAccessEnumerator : Enumerator {
// splits a range in half, returning both halves
func split() -> (Enumerator, Enumerator)
func binarySearch<
C : EnumerableCollection
where C.Element : Comparable,
C.EnumeratorType: RandomAccessEnumerator
>(_ collection : C, value : C.Element)
-> C.EnumeratorType
// We can perform log(N) comparisons and log(N) range splits,
// so this is logarithmic time
If binarySearch is called with a sequence whose range type conforms to
RandomAccessEnumerator, both of the generic functions match. However, the second
function is more specialized, because its constraints are a superset of the
constraints of the first function. In such a case, overloading should pick the
more specialized function.
There is a question as to when this overloading occurs. For example,
binarySearch might be called as a subroutine of another generic function with
minimal requirements::
func doSomethingWithSearch<
C : EnumerableCollection where C.Element : Ordered
_ collection : C, value : C.Element
) -> C.EnumeratorType
binarySearch(collection, value)
At the time when the generic definition of doSomethingWithSearch is
type-checked, only the first binarySearch() function applies, since we don't
know that C.EnumeratorType conforms to RandomAccessEnumerator. However, when
doSomethingWithSearch is actually invoked, C.EnumeratorType might conform to the
RandomAccessEnumerator, in which case we'd be better off picking the second
binarySearch. This amounts to run-time overload resolution, which may be
desirable, but also has downsides, such as the potential for run-time failures
due to ambiguities and the cost of performing such an expensive operation at
these call sites. Of course, that cost could be mitigated in hot generic
functions via the specialization mentioned above.
Our current proposal for this is to decide statically which function is called
(based on similar partial-ordering rules as used in C++), and avoid run-time
overload resolution. If this proves onerous, we can revisit the decision later.
Parsing Issues
The use of angle brackets to supply arguments to a generic type, while familiar
to C++/C#/Java programmers, cause some parsing problems. The problem stems from
the fact that '<', '>', and '>>' (the latter of which will show up in generic
types such as Array<Array<Int>>) match the 'operator' terminal in the grammar,
and we wish to continue using this as operators.
When we're in the type grammar, this is a minor inconvenience for the parser,
because code like this::
var x : Array<Int>
will essentially parse the type as::
identifier operator identifier operator
and verify that the operators are '<' and '>', respectively. Cases
involving <> are more interesting, because the type of::
var y : Array<Array<Int>>
is effectively parsed as::
identifier operator identifier operator identifier operator operator
by splitting the '>>' operator token into two '>' operator tokens.
However, this is manageable, and is already implemented for the (now deprecated)
protocol composition syntax (protocol<>). The larger problem occurs at expression
context, where the parser cannot disambiguate the tokens::
Matrix<Double>(10, 10)
identifier operator identifier operator unspaced_lparen integer-literal comma integer-literal rparen
which can be interpreted as either::
(declref Matrix)
(declref Double)
(integer_literal 10)
(integer_literal 10)))
(constructor Matrix<Double>
(integer_literal 10)
(integer_literal 10)))
Both Java and C# have this ambiguity. C# resolves the ambiguity by looking at
the token after the closing '>' to decide which way to go; Java seems to do the
same. We have a few options:
1. Follow C# and Java and implement the infinite lookahead needed to make this
work. Note that we have true ambiguities, because one could make either of
the above parse trees well-formed.
2. Introduce some kind of special rule for '<' like we have for '(', such as: an
identifier followed by an unspaced '<' is a type, while an identifier
followed by spacing and then '<' is an expression, or
3. Pick some syntax other than angle brackets, which is not ambiguous. Note
that neither '(' nor '[' work, because they too have expression forms.
4. Disambiguate between the two parses semantically.
We're going to try a variant of #1, using a variation of the disambiguation
rule used in C#. Essentially, when we see::
identifier <
we look ahead, trying to parse a type parameter list, until parsing the type
parameter list fails or we find a closing '>'. We then look ahead an additional
token to see if the closing '>' is followed by a '(', '.', or closing bracketing
token (since types are most commonly followed by a constructor call or static
member access). If parsing the type parameter list succeeds, and the closing
angle bracket is followed by a '(', '.', or closing bracket token, then the
'<...>' sequence is parsed as a generic parameter list; otherwise, the '<'
is parsed as an operator.
.. @ignore('all')