| :orphan: |
| |
| .. @raise litre.TestsAreMissing |
| .. _MemoryAndConcurrencyModel: |
| |
| Swift Memory and Concurrency Model |
| ================================== |
| |
| .. warning:: This is a very early design document discussing the features of |
| a possible Swift concurrency model. It should not be taken as a plan of |
| record. |
| |
| The goal of this writeup is to provide a safe and efficient way to model, |
| design, and implement concurrent applications in Swift. It is believed that it |
| will completely eliminate data races and reduce deadlocks in swift apps, and |
| will allow for important performance wins as well. This happens by eliminating |
| shared mutable data, locks, and the need for most atomic memory accesses. The |
| model is quite different from what traditional unix folks are used to |
| though. :-) |
| |
| As usual, Swift will eventually support unsafe constructs, so if it turns out |
| that the model isn't strong enough for some particular use case, that coder can |
| always fall back to the hunting for concurrency with arrows and bone knives |
| instead of using the native language support. Ideally this will never be |
| required though. |
| |
| This starts by describing the basic model, then talks about issues and some |
| possible extensions to improve the model. |
| |
| Language Concepts |
| ----------------- |
| |
| The **unit of concurrency** in Swift is an Actor. As an execution context, an |
| actor corresponds to a lightweight thread or dispatch queue that can respond to |
| a declared list of async messages. Each actor has its own private set of mutable |
| data, can communicate with other actors by sending them async messages, and can |
| lock each other to perform synchronous actions. |
| |
| Only one message can be active at a time (you can also extend the model to allow |
| read/writer locks as a refinement) in an actor. Through the type system and |
| runtime, we make it impossible for an actor to read or change another actor's |
| mutable data: all mutable data is private to some actor. When a message is sent |
| from one actor to another, the argument is deep copied to ensure that two actors |
| can never share mutable data. |
| |
| To achieve this, there are three important kinds of memory object in |
| Swift. Given a static type, it is obvious what kind it is from its |
| definition. These kinds are: |
| |
| 1. **Immutable Data** - Immutable data (which can have a constructor, but whose |
| value cannot be changed after it completes) is shareable across actors, and it |
| would make sense to unique them where possible. Immutable data can |
| (transitively) point to other immutable data, but it isn't valid (and the |
| compiler rejects) immutable data that is pointing to mutable data. For |
| example:: |
| |
| struct [immutable, inout] IList { data : int, next : List } |
| ... |
| var a = IList(1, IList(2, IList(3, nil))) |
| ... |
| a.data = 42 // error, can't mutate field of immutable 'IList' type! |
| a.next = nil // error, same deal |
| a = IList(2, IList(3, nil)) // ok, "a" itself is mutable, it now points to something new. |
| a = IList(1, a) // ok |
| |
| Strings are a very important example of (inout) immutable data. |
| |
| |
| |
| |
| 2. **Normal mutable data** - Data (objects, structs, etc) are mutable by |
| default, and are thus local to the containing actor. Mutable data can point |
| to immutable data with no problem, and since it is local to an actor, no |
| synchronization or atomic accesses are ever required for any mutable |
| data. For example:: |
| |
| struct [inout] MEntry { x : int, y : int, list : IList } |
| ... |
| var b = MEntry(4, 2, IList(1, nil)) |
| b.x = 4 // ok |
| b.y = 71 // ok |
| b.list = nil // ok |
| b.list = IList(2, nil) // ok |
| b.list.data = 42 // error, can't change immutable data. |
| |
| As part of mutable data, it is worth pointing out that mutable "global |
| variables" in swift are not truly global, they are local to the current actor |
| (somewhat similar to "thread local storage", or perhaps to "an ivar on the |
| actor"). Immutable global variables (like lookup tables) are simple immutable |
| data just like today. Global variables with "static constructors / |
| initializers" in the C++ sense have their constructor lazily run on the first |
| use of the variable. |
| |
| Not having mutable shared state also prevents race conditions from turning |
| into safety violations. Here's a description of the go issue which causes |
| them to be unsafe when running multiple threads: |
| <http://research.swtch.com/gorace> |
| |
| Note that a highly desirable feature of the language is the ability to take |
| an instance of a mutable type and *make it immutable* somehow, allowing it to |
| be passed around by-value. It is unclear how to achieve this, but perhaps |
| typestate can magically make it possible. |
| |
| 3. **Actors** - In addition to being the unit of concurrency, Actors have an |
| address and are thus memory objects. Actors themselves can have mutable |
| fields, one actor can point to another, and mutable data can point to an |
| actor. However, any mutable data in an actor can only be directly accessed by |
| that actor, it isn't possible for one actor to access another actor's mutable |
| data. Also note that actors and objects are completely unrelated: it is not |
| possible for an object to inherit from an actor, and it is not possible for |
| an actor to inherit from an object. Either can contain the other as an ivar |
| though. |
| |
| Syntax for actors is TBD and not super important for now, but here is a silly |
| multithreaded mandelbrot example, that does each pixel in "parallel", to |
| illustrate some ideas:: |
| |
| func do_mandelbrot(_ x : float, y : float) -> int { |
| // details elided |
| } |
| |
| actor MandelbrotCalculator { |
| func compute(_ x : float, y : float, Driver D) { |
| var num_iters = do_mandelbrot(x, y) |
| D.collect_point(x, y, num_iters) |
| } |
| } |
| |
| actor Driver { |
| var result : image; // result and numpoints are mutable per-actor data. |
| var numpoints : int; |
| func main() { |
| result = new image() |
| foreach i in -2.0 ... 2.0 by 0.001 { |
| // Arbitrarily, create one MandelbrotCalculator for each row. |
| var MC = new MandelbrotCalculator() |
| foreach j in -2.0 ... 2.0 by 0.001 { |
| MC.compute(i, j, self) |
| ++numpoints; |
| } |
| } |
| } |
| |
| func collect_point(_ x : float, y : float, num_iters : int) { |
| result.setPoint(x, y, Color(num_iters, num_iters, num_iters)) |
| if (--numpoints == 0) |
| draw(result) |
| } |
| } |
| |
| Though actors have mutable data (like 'result' and 'numpoints'), there is no |
| need for any synchronization on that mutable data. |
| |
| One of the great things about this model (in my opinion) is that it gives |
| programmers a way to reason about granularity, and the data copy/sharing |
| issue gives them something very concrete and understandable that they can use |
| to make design decisions when building their app. While it is a common |
| pattern to have one class that corresponds to a thread in C++ and ObjC, this |
| is an informal pattern -- baking this into the language with actors and |
| giving a semantic difference between objects and actors makes the tradeoffs |
| crisp and easy to understand and reason about. |
| |
| Communicating with Actors |
| ------------------------- |
| |
| As the example above shows, the primary and preferred way to communicate with |
| actors is through one-way asynchronous messages. Asynchronous message sensed |
| are nice because they cannot block, deadlock, or have other bad |
| effects. However, they aren't great for two things: 1) invoking multiple methods |
| on an actor that need to be synchronized together, and 2) getting a value back |
| from the actor. |
| |
| Sending multiple messages asynchronously |
| ---------------------------------------- |
| |
| With the basic approach above, you can only perform actions on actors that are |
| built into the actor. For example, if you had an actor with two methods:: |
| |
| actor MyActor { |
| func foo() {...} |
| func bar() {...} |
| func getvalue() -> double {... } |
| } |
| |
| Then there is no way to perform a composite operation that needs to "atomically" |
| perform foo() and bar() without any other operations getting in between. If you |
| had code like this:: |
| |
| var a : MyActor = ... |
| a.foo() |
| a.bar() |
| |
| Then the foo/bar methods are both sent asynchronously, and (while they would be |
| ordered with respect to each other) there is no guarantee that some other method |
| wouldn't be run in between them. To handle this, the async block structure can |
| be used to submit a sequence of code that is atomically run in the actor's |
| context, e.g.:: |
| |
| var a : MyActor = ... |
| async a { |
| a.foo() |
| a.bar() |
| } |
| |
| This conceptually submits a closure to run in the context of the actor. If you |
| look at it this way, an async message send is conceptually equivalent to an |
| async block. As such, the original example was equivalent to:: |
| |
| var a : MyActor = ... |
| async a { a.foo() } |
| async a { a.bar() } |
| |
| which makes it pretty clear that the two sends are separate from each other. We |
| could optionally require all accesses to an actor to be in an async block, which |
| would make this behavior really clear at the cost of coding clarity. |
| |
| It is worth pointing out that you can't asynchronously call a message and get |
| its return value back. However, if the return value is ignored, a message send |
| can be performed. For example, "a.getvalue()" would be fine so long as the |
| result is ignored or if the value is in an explicit async block structure. |
| |
| From an implementation perspective, the code above corresponds directly to GCD's |
| dispatch_asynch on a per-actor queue. |
| |
| Performing synchronous operations |
| --------------------------------- |
| |
| Asynchronous calls are nice and define away the possibility of deadlock, but at |
| some point you need to get a return value back and async programming is very |
| awkward. To handle this, a 'synch' block is used. For example, the following is |
| valid:: |
| |
| var x : double |
| synch a { |
| x = a.getvalue(); |
| } |
| |
| but this is not:: |
| |
| var x = a.getvalue(); |
| |
| A synch block statement is directly related to dispatch_sync and conceptually |
| locks the specified actor's lock/queue and performs the block within its |
| context. |
| |
| Memory Ownership Model |
| ---------------------- |
| |
| Within an actor there is a question of how ownership is handled. It's not in the |
| scope of this document to say what the "one true model" is, but here are a |
| couple of interesting observations: |
| |
| 1. **Automated reference counting** would be much more efficient in this model |
| than in ObjC, because the compiler statically knows whether something is |
| mutable data or is shared. Mutable data (e.g. normal objects) can be ref |
| counted with non-atomic reference counting, which is 20-30x faster than |
| atomic adjustments. Actors are shared, so they'd have to have atomic ref |
| counts, but they should be much much less common than the normal objects in |
| the program. Immutable data is shared (and thus needs atomic reference |
| counts) but there are optimizations that can be performed since the edges in |
| the pointer graph can never change and cycles aren't possible in immutable |
| data. |
| |
| 2. **Garbage collection** for mutable data becomes a lot more attractive than in |
| ObjC for four reasons: 1) all GC is local to an actor, so you don't need to |
| stop the world to do a collection. 2) actors have natural local quiescent |
| points: when they have finished servicing a message, if their dispatch queue |
| is empty, they go to sleep. If nothing else in the CPU needs the thread, it |
| would be a natural time to collect. 3) GC would be fully precise in swift, |
| unlike in ObjC, no conservative stack scanning or other hacks are needed. 4) |
| If GC is used for mutable data, it would make sense to still use reference |
| counting for actors themselves and especially for immutable data, meaning |
| that you'd have *no* "whole process" GC. |
| |
| 3. Each actor can use a **different memory management policy**: it is completely |
| fine for one actor to be GC and one actor to be ARC, and another to be |
| manually malloc/freed (and thus unsafe) because actors can't reach each |
| other's pointers. However, realistically, we will still have to pick "the |
| right" model, because different actors can share the same code (e.g. they can |
| instantiate the same objects) and the compiled code has to implement the |
| model the actor wants. |
| |
| Issues with this Model |
| ---------------------- |
| |
| There are two significant issues with this model: 1) the amount of data copying |
| may be excessive if you have lots of messages each passing lots of mutable data |
| that is deep copied, and 2) the awkward nature of async programming for some |
| (common) classes of programming. For example, the "branch and rejoin" pattern |
| in the example requires a counter to know when everyone rejoined, and we really |
| want a "parallel for loop". |
| |
| I'd advocate implementing the simple model first, but once it is there, there |
| are several extensions that can help with these two problems: |
| |
| **No copy is needed for some important cases:** If you can prove (through the |
| type system) that an object graph has a single (unique) pointer to it, the |
| pointer value can be sent in the message and nil'd out in the sender. In this |
| way you're "transferring" ownership of the subgraph from one actor to the |
| other. It's not fully clear how to do this though. Another similar example: if |
| we add some way for an actor to self destruct along with a message send, then it |
| is safe for an actor to transfer any and all of its mutable state to another |
| actor when it destroys itself, avoiding a copy. |
| |
| **Getters for trivial immutable actor fields**: If an actor has an ivar with an |
| immutable type, then we can make all stores to it atomic, and allow other actors |
| to access the ivar. Silly example:: |
| |
| actor Window { |
| var title : string; // string is an immutable by-ref type. |
| ... |
| } |
| |
| ... |
| var x = new Window; |
| print(x.title) // ok, all stores will be atomic, an (recursively) immutable data is valid in all actors, so this is fine to load. |
| ... |
| |
| **Parallel for loops** and other constructs that don't guarantee that each |
| "thread" has its own non-shared mutable memory are very important and not |
| covered by this model at all. For example, having multiple threads execute on |
| different slices of the same array would require copying the array to temporary |
| disjoint memory spaces to do operations, then recopy it back into place. This |
| data copying can be awkward and reduce the benefits of parallelism to make it |
| non-profitable. |
| |
| There are multiple different ways to tackle this. We can just throw it back into |
| the programmer's lap and tell them that the behavior is undefined if they get a |
| race condition. This is fine for some systems levels stuff, but defeats the |
| purpose of having a safe language and is clearly not good enough for mobile |
| code. |
| |
| Another (more aggressive) approach is to provide a parallel for loop, and use it |
| as a hint that each iteration can be executed in parallel. It would then be up |
| to the implementation to try to prove the safety of this (e.g. using dependence |
| analysis), and if provable everything is good. If not provable, then the |
| implementation would have to compile it as serial code, or use something like an |
| STM approach to guarantee that the program is correct or the error is |
| detected. There is much work in academia that can be tapped for this sort of |
| thing. One nice thing about this approach is that you'd always get full |
| parallel performance if you "disable checking", which could be done in a |
| production build or something. |
| |
| Some blue sky kinds of random thoughts |
| -------------------------------------- |
| |
| **Distributed Programming** - Since deep copy is part of the language and "deep |
| copy" is so similar to "serialization", it would be easy to do a simple |
| implementation of something like "Distributed Objects". The primary additional |
| thing that is required is for messages sent to actors to be able to fail, which |
| is required anyway. The granularity issues that come up are similar in these two |
| domains. |
| |
| **Immutable Data w/Synch and Lazy Faulting** - Not a fully baked idea, but if |
| you're heavily using immutable data to avoid copies, a "distributed objects" |
| implementation would suffer because it would have to deep copy all the immutable |
| data that the receiver doesn't have, defeating the optimization. One approach to |
| handling this is to treat this as a data synch problem, and have the client |
| fault pieces of the immutable data subgraph in on demand, instead of eagerly |
| copying it. |
| |
| **OpenCL Integration** with this model could be really natural: the GPU is an |
| inherently async device to talk to. |
| |
| **UNIX processes**: Actors in a shared address space with no shared mutable data |
| are related to processes in a unix app that share by communicating with mmap |
| etc. |
| |
| .. _http://research.swtch.com/2010/02/off-to-races.html: http://research.swtch.com/2010/02/off-to-races.html |
| |
| |