:orphan:
.. _MemoryAndConcurrencyModel:
Swift Memory and Concurrency Model
==================================
.. warning:: This is a very early design document discussing the features of
a possible Swift concurrency model. It should not be taken as a plan of
record.
The goal of this writeup is to provide a safe and efficient way to model,
design, and implement concurrent applications in Swift. It is believed that it
will completely eliminate data races and reduce deadlocks in Swift apps, and
will allow for important performance wins as well. This happens by eliminating
shared mutable data, locks, and the need for most atomic memory accesses. The
model is quite different from what traditional unix folks are used to
though. :-)
As usual, Swift will eventually support unsafe constructs, so if it turns out
that the model isn't strong enough for some particular use case, that coder can
always fall back to hunting for concurrency with arrows and bone knives
instead of using the native language support. Ideally this will never be
required though.
This starts by describing the basic model, then talks about issues and some
possible extensions to improve the model.
Language Concepts
-----------------
The **unit of concurrency** in Swift is an Actor. As an execution context, an
actor corresponds to a lightweight thread or dispatch queue that can respond to
a declared list of async messages. Each actor has its own private set of mutable
data, can communicate with other actors by sending them async messages, and can
lock other actors to perform synchronous actions.
Only one message can be active at a time in an actor (the model could also be
extended to allow reader/writer locks as a refinement). Through the type system
and
runtime, we make it impossible for an actor to read or change another actor's
mutable data: all mutable data is private to some actor. When a message is sent
from one actor to another, the argument is deep copied to ensure that two actors
can never share mutable data.
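As a quick sketch of what that deep copy means in practice (the `Point` struct,
`Logger` actor, and `log` message here are purely illustrative, written in the
same not-yet-final syntax used by the examples below)::

  struct [inout] Point { x : int, y : int }

  actor Logger {
    func log(_ p : Point) {
      // 'p' arrives as a private deep copy; mutating it here can never
      // race with (or be observed by) the sender.
    }
  }

  var logger = new Logger()
  var pt = Point(1, 2)
  logger.log(pt)   // 'pt' is deep copied into the async message send.
  pt.x = 100       // changes only the sender's copy; Logger never sees it.
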
To achieve this, there are three important kinds of memory object in
Swift. Given a static type, it is obvious what kind it is from its
definition. These kinds are:
1. **Immutable Data** - Immutable data (which can have a constructor, but whose
value cannot be changed after it completes) is shareable across actors, and it
would make sense to unique it where possible. Immutable data can
(transitively) point to other immutable data, but it isn't valid (and the
compiler rejects it) for immutable data to point to mutable data. For
example::
struct [immutable, inout] IList { data : int, next : IList }
...
var a = IList(1, IList(2, IList(3, nil)))
...
a.data = 42 // error, can't mutate field of immutable 'IList' type!
a.next = nil // error, same deal
a = IList(2, IList(3, nil)) // ok, "a" itself is mutable, it now points to something new.
a = IList(1, a) // ok
Strings are a very important example of (inout) immutable data.
2. **Normal mutable data** - Data (objects, structs, etc) are mutable by
default, and are thus local to the containing actor. Mutable data can point
to immutable data with no problem, and since it is local to an actor, no
synchronization or atomic accesses are ever required for any mutable
data. For example::
struct [inout] MEntry { x : int, y : int, list : IList }
...
var b = MEntry(4, 2, IList(1, nil))
b.x = 4 // ok
b.y = 71 // ok
b.list = nil // ok
b.list = IList(2, nil) // ok
b.list.data = 42 // error, can't change immutable data.
As part of mutable data, it is worth pointing out that mutable "global
variables" in Swift are not truly global; they are local to the current actor
(somewhat similar to "thread local storage", or perhaps to "an ivar on the
actor"), as the sketch at the end of this section shows. Immutable global
variables (like lookup tables) are simple immutable data just like today.
Global variables with "static constructors / initializers" in the C++ sense
have their constructor lazily run on the first use of the variable.
Not having mutable shared state also prevents race conditions from turning
into safety violations. Here's a description of the issue that makes Go
programs unsafe when running multiple threads:
<http://research.swtch.com/gorace>
Note that a highly desirable feature of the language is the ability to take
an instance of a mutable type and *make it immutable* somehow, allowing it to
be passed around by-value. It is unclear how to achieve this, but perhaps
typestate can magically make it possible.
3. **Actors** - In addition to being the unit of concurrency, Actors have an
address and are thus memory objects. Actors themselves can have mutable
fields, one actor can point to another, and mutable data can point to an
actor. However, any mutable data in an actor can only be directly accessed by
that actor; it isn't possible for one actor to access another actor's mutable
data. Also note that actors and objects are completely unrelated: it is not
possible for an object to inherit from an actor, and it is not possible for
an actor to inherit from an object. Either can contain the other as an ivar
though.
Syntax for actors is TBD and not super important for now, but here is a silly
multithreaded Mandelbrot example that does each pixel in "parallel", to
illustrate some ideas::
func do_mandelbrot(_ x : float, y : float) -> int {
// details elided
}
actor MandelbrotCalculator {
func compute(_ x : float, y : float, D : Driver) {
var num_iters = do_mandelbrot(x, y)
D.collect_point(x, y, num_iters)
}
}
actor Driver {
var result : image; // result and numpoints are mutable per-actor data.
var numpoints : int;
func main() {
result = new image()
foreach i in -2.0 ... 2.0 by 0.001 {
// Arbitrarily, create one MandelbrotCalculator for each row.
var MC = new MandelbrotCalculator()
foreach j in -2.0 ... 2.0 by 0.001 {
MC.compute(i, j, self)
++numpoints;
}
}
}
func collect_point(_ x : float, y : float, num_iters : int) {
result.setPoint(x, y, Color(num_iters, num_iters, num_iters))
if (--numpoints == 0)
draw(result)
}
}
Though actors have mutable data (like 'result' and 'numpoints'), there is no
need for any synchronization on that mutable data.
One of the great things about this model (in my opinion) is that it gives
programmers a way to reason about granularity, and the data copy/sharing
issue gives them something very concrete and understandable that they can use
to make design decisions when building their app. While it is a common
pattern to have one class that corresponds to a thread in C++ and ObjC, this
is an informal pattern -- baking this into the language with actors and
giving a semantic difference between objects and actors makes the tradeoffs
crisp and easy to understand and reason about.
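Returning to the earlier point about mutable "global variables" being local to
the current actor, here is a hedged sketch of the intended behavior (the
`counter` global and `Worker` actor are purely illustrative)::

  var counter : int = 0      // mutable, so each actor gets its own copy,
                             // much like thread local storage.

  actor Worker {
    func bump() {
      counter = counter + 1  // touches only this actor's 'counter'; no
                             // synchronization is needed, and no other
                             // actor can observe the change.
    }
  }
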
Communicating with Actors
-------------------------
As the example above shows, the primary and preferred way to communicate with
actors is through one-way asynchronous messages. Asynchronous message sends
are nice because they cannot block, deadlock, or have other bad
effects. However, they aren't great for two things: 1) invoking multiple methods
on an actor that need to be synchronized together, and 2) getting a value back
from the actor.
Sending multiple messages asynchronously
----------------------------------------
With the basic approach above, you can only perform actions on actors that are
built into the actor. For example, if you had an actor with two methods::
actor MyActor {
func foo() {...}
func bar() {...}
func getvalue() -> double {... }
}
Then there is no way to perform a composite operation that needs to "atomically"
perform foo() and bar() without any other operations getting in between. If you
had code like this::
var a : MyActor = ...
a.foo()
a.bar()
Then the foo/bar methods are both sent asynchronously, and (while they would be
ordered with respect to each other) there is no guarantee that some other method
wouldn't be run in between them. To handle this, the async block structure can
be used to submit a sequence of code that is atomically run in the actor's
context, e.g.::
var a : MyActor = ...
async a {
a.foo()
a.bar()
}
This conceptually submits a closure to run in the context of the actor. If you
look at it this way, an async message send is conceptually equivalent to an
async block. As such, the original example was equivalent to::
var a : MyActor = ...
async a { a.foo() }
async a { a.bar() }
which makes it pretty clear that the two sends are separate from each other. We
could optionally require all accesses to an actor to be in an async block, which
would make this behavior really clear at the cost of coding clarity.
It is worth pointing out that you can't asynchronously send a message and get
its return value back. However, if the return value is ignored, a message send
can be performed. For example, "a.getvalue()" would be fine so long as the
result is ignored or the call is inside an explicit async block.
From an implementation perspective, the code above corresponds directly to GCD's
dispatch_async on a per-actor queue.
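To make the correspondence concrete, here is a rough sketch of the lowering
(the per-actor `queue` member is an assumption made for illustration, not a
committed runtime design)::

  // The earlier async block, expressed directly in terms of GCD:
  dispatch_async(a.queue) {
    a.foo()
    a.bar()
  }
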
Performing synchronous operations
---------------------------------
Asynchronous calls are nice and define away the possibility of deadlock, but at
some point you need to get a return value back, and purely async programming is
very awkward for that. To handle this, a 'synch' block is used. For example, the
following is
valid::
var x : double
synch a {
x = a.getvalue();
}
but this is not::
var x = a.getvalue();
A synch block statement is directly related to dispatch_sync and conceptually
locks the specified actor's lock/queue and performs the block within its
context.
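Likewise, a rough sketch of how the synch block above might lower (again, the
actor's `queue` member is hypothetical)::

  var x : double
  dispatch_sync(a.queue) {
    x = a.getvalue()   // runs in the actor's context; the caller blocks
                       // until the result has been stored into 'x'.
  }
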
Memory Ownership Model
----------------------
Within an actor there is a question of how ownership is handled. It's not in the
scope of this document to say what the "one true model" is, but here are a
couple of interesting observations:
1. **Automated reference counting** would be much more efficient in this model
than in ObjC, because the compiler statically knows whether something is
mutable data or is shared. Mutable data (e.g. normal objects) can be ref
counted with non-atomic reference counting, which is 20-30x faster than
atomic adjustments. Actors are shared, so they'd have to have atomic ref
counts, but they should be much much less common than the normal objects in
the program. Immutable data is shared (and thus needs atomic reference
counts) but there are optimizations that can be performed since the edges in
the pointer graph can never change and cycles aren't possible in immutable
data.
2. **Garbage collection** for mutable data becomes a lot more attractive than in
ObjC for four reasons: 1) all GC is local to an actor, so you don't need to
stop the world to do a collection. 2) actors have natural local quiescent
points: when they have finished servicing a message, if their dispatch queue
is empty, they go to sleep. If nothing else in the CPU needs the thread, it
would be a natural time to collect. 3) GC would be fully precise in Swift,
unlike in ObjC; no conservative stack scanning or other hacks are needed. 4)
If GC is used for mutable data, it would make sense to still use reference
counting for actors themselves and especially for immutable data, meaning
that you'd have *no* "whole process" GC.
3. Each actor can use a **different memory management policy**: it is completely
fine for one actor to be GC and one actor to be ARC, and another to be
manually malloc/freed (and thus unsafe) because actors can't reach each
other's pointers. However, realistically, we will still have to pick "the
right" model, because different actors can share the same code (e.g. they can
instantiate the same objects) and the compiled code has to implement the
model the actor wants.
Issues with this Model
----------------------
There are two significant issues with this model: 1) the amount of data copying
may be excessive if you have lots of messages each passing lots of mutable data
that is deep copied, and 2) the awkward nature of async programming for some
(common) classes of programming. For example, the "branch and rejoin" pattern
in the example requires a counter to know when everyone rejoined, and we really
want a "parallel for loop".
I'd advocate implementing the simple model first, but once it is there, there
are several extensions that can help with these two problems:
**No copy is needed for some important cases:** If you can prove (through the
type system) that an object graph has a single (unique) pointer to it, the
pointer value can be sent in the message and nil'd out in the sender. In this
way you're "transferring" ownership of the subgraph from one actor to the
other. It's not fully clear how to do this though. Another similar example: if
we add some way for an actor to self destruct along with a message send, then it
is safe for an actor to transfer any and all of its mutable state to another
actor when it destroys itself, avoiding a copy.
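Here is a hedged sketch of what such an ownership transfer might look like (the
`[unique]` attribute, the `Worker` actor, and the implicit nil'ing-out of the
sender's variable are all hypothetical)::

  actor Worker {
    func process(_ img : image) { /* consume the image */ }
  }

  var worker = new Worker()
  var buf : [unique] image = new image()

  worker.process(buf)   // the uniquely-owned graph is handed off rather than
                        // deep copied; 'buf' is implicitly nil'd out here.
  buf.setPoint(0, 0, Color(0, 0, 0))  // error: the sender gave ownership away.
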
**Getters for trivial immutable actor fields**: If an actor has an ivar with an
immutable type, then we can make all stores to it atomic, and allow other actors
to access the ivar. Silly example::
actor Window {
var title : string; // string is an immutable by-ref type.
...
}
...
var x = new Window;
print(x.title) // ok: all stores are atomic, and (recursively) immutable data is valid in all actors, so this is fine to load.
...
**Parallel for loops** and other constructs that don't guarantee that each
"thread" has its own non-shared mutable memory are very important and not
covered by this model at all. For example, having multiple threads execute on
different slices of the same array would require copying the array to temporary
disjoint memory spaces to do the operations, and then recopying it back into
place. This data copying can be awkward and can reduce the benefits of
parallelism enough to make it unprofitable.
There are multiple different ways to tackle this. We can just throw it back into
the programmer's lap and tell them that the behavior is undefined if they get a
race condition. This is fine for some systems-level code, but defeats the
purpose of having a safe language and is clearly not good enough for mobile
code.
Another (more aggressive) approach is to provide a parallel for loop, and use it
as a hint that each iteration can be executed in parallel. It would then be up
to the implementation to try to prove the safety of this (e.g. using dependence
analysis), and if provable everything is good. If not provable, then the
implementation would have to compile it as serial code, or use something like an
STM approach to guarantee that the program is correct or the error is
detected. There is much work in academia that can be tapped for this sort of
thing. One nice thing about this approach is that you'd always get full
parallel performance if you "disable checking", which could be done in a
production build or something.
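To make the idea concrete, here is one purely hypothetical surface syntax for
such a loop (the `parallel foreach` form and the `compute_row`/`setRow` names
are illustrative, nothing here is designed)::

  var results = new image()
  parallel foreach i in 0 ... 4000 {
    // The implementation tries to prove the iterations independent (e.g. via
    // dependence analysis); if it can't, it falls back to serial execution or
    // to checked (e.g. STM-style) execution so a race never goes undetected.
    results.setRow(i, compute_row(i))
  }
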
Some blue sky kinds of random thoughts
--------------------------------------
**Distributed Programming** - Since deep copy is part of the language and "deep
copy" is so similar to "serialization", it would be easy to do a simple
implementation of something like "Distributed Objects". The primary additional
thing that is required is for messages sent to actors to be able to fail, which
is required anyway. The granularity issues that come up are similar in these two
domains.
**Immutable Data w/Synch and Lazy Faulting** - Not a fully baked idea, but if
you're heavily using immutable data to avoid copies, a "distributed objects"
implementation would suffer because it would have to deep copy all the immutable
data that the receiver doesn't have, defeating the optimization. One approach to
handling this is to treat this as a data synch problem, and have the client
fault pieces of the immutable data subgraph in on demand, instead of eagerly
copying it.
**OpenCL Integration** with this model could be really natural: the GPU is an
inherently async device to talk to.
**UNIX processes**: Actors in a shared address space with no shared mutable data
are related to processes in a UNIX app that share by communicating with mmap
etc.