| ====================================================== |
| Index Invalidation Rules in the Swift Standard Library |
| ====================================================== |
| |
| Points to consider |
| ================== |
| |
| (1) Collections can be implemented as value types or a reference types. |
| |
| (2) Copying an instance of a value type, or copying a reference has |
| well-defined semantics built into the language and is not controllable by the |
| user code. |
| |
| Consequence: value-typed collections in Swift have to use copy-on-write for |
| data stored out-of-line in reference-typed buffers. |
| |
| (3) We want to be able to pass/return a Collection along with its indices in a |
| safe manner. |
| |
| In Swift, unlike C++, indices are not sufficient to access collection data; |
| one needs an index and a collection. Thus, merely passing a collection by |
| value to a function should not invalidate indices. |
| |
| General principles |
| ================== |
| |
| In C++, validity of an iterator is a property of the iterator itself, since |
| iterators can be dereferenced to access collection elements. |
| |
| In Swift, in order to access a collection element designated by an index, |
| subscript operator is applied to the collection, ``C[I]``. Thus, index is |
| valid or not only in context of a certain collection instance at a certain |
| point of program execution. A given index can be valid for zero, one or more |
| than one collection instance at the same time. |
| |
| An index that is valid for a certain collection designates an element of that |
| collection or represents a one-past-end index. |
| |
| Operations that access collection elements require valid indexes (this includes |
| accessing using the subscript operator, slicing, swapping elements, removing |
| elements etc.) |
| |
| Using an invalid index to access elements of a collection leads to unspecified |
| memory-safe behavior. (Possibilities include trapping, performing the |
| operation on an arbitrary element of this or any other collection etc.) |
| Concrete collection types can specify behavior; implementations are advised to |
| perform a trap. |
| |
| An arbitrary index instance is not valid for an arbitrary collection instance. |
| |
| The following points apply to all collections, defined in the library or by the |
| user: |
| |
| (1) Indices obtained from a collection ``C`` via ``C.startIndex``, |
| ``C.endIndex`` and other collection-specific APIs returning indices, are |
| valid for ``C``. |
| |
| (2) If an index ``I`` is valid for a collection ``C``, a copy of ``I`` is valid |
| for ``C``. |
| |
| (3) If an index ``I`` is valid for a collection ``C``, indices obtained from |
| ``I`` via ``I.successor()``, ``I.predecessor()``, and other index-specific |
| APIs, are valid for ``C``. |
| FIXME: disallow startIndex.predecessor(), endIndex.successor() |
| |
| (4) **Indices of collections and slices freely interoperate.** |
| |
| If an index ``I`` is valid for a collection ``C``, it is also valid for |
| slices of ``C``, provided that ``I`` was in the bounds that were passed to |
| the slicing subscript. |
| |
| If an index ``I`` is valid for a slice obtained from a collection ``C``, it |
| is also valid for ``C`` itself. |
| |
| (5) If an index ``I`` is valid for a collection ``C``, it is also valid for |
| a copy of ``C``. |
| |
| (6) If an index ``I`` is valid for a collection ``C``, it continues to be valid |
| after a call to a non-mutating method on ``C``. |
| |
| (7) Calling a non-mutating method on a collection instance does not invalidate |
| any indexes. |
| |
| (8) Indices behave as if they are composites of offsets in the underlying data |
| structure. For example: |
| |
| - an index into a set backed by a hash table with open addressing is the |
| number of the bucket where the element is stored; |
| |
| - an index into a collection backed by a tree is a sequence of integers |
| that describe the path from the root of the tree to the leaf node; |
| |
| - an index into a lazy flatMap collection consists of a pair of indices, an |
| index into the base collection that is being mapped, and the index into |
| the result of mapping the element designated by the first index. |
| |
| This rule does not imply that indices should be cheap to convert to actual |
| integers. The offsets for consecutive elements could be non-consecutive |
| (e.g., in a hash table with open addressing), or consist of multiple |
| offsets so that the conversion to an integer is non-trivial (e.g., in a |
| tree). |
| |
| Note that this rule, like all other rules, is an "as if" rule. As long as |
| the resulting semantics match what the rules dictate, the actual |
| implementation can be anything. |
| |
| Rationale and discussion: |
| |
| - This rule is mostly motivated by its consequences, in particular, being |
| able to mutate an element of a collection without changing the |
| collection's structure, and, thus, without invalidating indices. |
| |
| - Replacing a collection element has runtime complexity O(1) and is not |
| considered a structural mutation. Therefore, there seems to be no reason |
| for a collection model would need to invalidate indices from the |
| implementation point of view. |
| |
| - Iterating over a collection and performing mutations in place is a common |
| pattern that Swift's collection library needs to support. If replacing |
| individual collection elements would invalidate indices, many common |
| algorithms (like sorting) wouldn't be implementable directly with |
| indices; the code would need to maintain its own shadow indices, for |
| example, plain integers, that are not invalidated by mutations. |
| |
| Consequences: |
| |
| - The setter of ``MutableCollection.subscript(_: Index)`` does not invalidate |
| any indices. Indices are composites of offsets, so replacing the value does |
| not change the shape of the data structure and preserves offsets. |
| |
| - A value type mutable linked list cannot conform to |
| ``MutableCollectionType``. An index for a linked list has to be implemented |
| as a pointer to the list node to provide O(1) element access. Mutating an |
| element of a non-uniquely referenced linked list will create a copy of the |
| nodes that comprise the list. Indices obtained before the copy was made |
| would point to the old nodes and wouldn't be valid for the copy of the list. |
| |
| It is still valid to have a value type linked list conform to |
| ``CollectionType``, or to have a reference type mutable linked list conform |
| to ``MutableCollection``. |
| |
| The following points apply to all collections by default, but specific |
| collection implementations can be less strict: |
| |
| (1) A call to a mutating method on a collection instance, except the setter of |
| ``MutableCollection.subscript(_: Index)``, invalidates all indices for that |
| collection instance. |
| |
| Consequences: |
| |
| - Passing a collection as an ``inout`` argument invalidates all indexes for |
| that collection instance, unless the function explicitly documents stronger |
| guarantees. (The function can call mutating methods on an ``inout`` argument |
| or completely replace it.) |
| |
| * ``Swift.swap()`` does not invalidate any indexes. |
| |
| Additional guarantees for ``Swift.Array``, ``Swift.ContiguousArray``, ``Swift.ArraySlice`` |
| ========================================================================================== |
| |
| **Valid array indexes can be created without using Array APIs.** Array indexes |
| are plain integers. Integers that are dynamically in the range ``0..<A.count`` |
| are valid indexes for the array or slice ``A``. It does not matter if an index |
| was obtained from the collection instance, or derived from input or unrelated |
| data. |
| |
| **Traps are guaranteed.** Using an invalid index to designate elements of an |
| array or an array slice is guaranteed to perform a trap. |
| |
| Additional guarantees for ``Swift.Dictionary`` |
| ============================================== |
| |
| **Insertion into a Dictionary invalidates indexes only on a rehash.** If a |
| ``Dictionary`` has enough free buckets (guaranteed by calling an initializer or |
| reserving space), then inserting elements does not invalidate indexes. |
| |
| Note: unlike C++'s ``std::unordered_map``, removing elements from a |
| ``Dictionary`` invalidates indexes. |
| |