# ChainLocks and ChainLockTransactions in the kernel
## What is a "ChainLock"?
ChainLocks are a type of synchronization building block similar to a SpinLock.
Exclusive locks generally have ordering requirements which prevent
deadlock. For example, if there exists a place in code where two locks (`A`,
and `B`) are acquired in the order `A` then `B`, then there cannot exist any
place in code where the locks are acquired in the order `B` then `A` which can
execute concurrently with the first sequence. Failure to follow this
requirement leads to deadlock potential.
ChainLocks are exclusive locks whose design includes a built-in protocol which
can be used to obtain a set of locks in an arbitrary order, even when there
might exist other places in the code where an overlapping set of locks might be
obtained in a different order (which could traditionally lead to deadlock).
They were introduced as part of the effort to remove the kernel's global
`thread_lock` and reduce overall SpinLock contention.
## How do they work?
Internally, the state of a ChainLock is represented by a 64-bit unsigned
integer, referred to as the lock "Token". A reserved sentinel value of 0 is
used to indicate that the lock is in the unlocked state.
To acquire a set of ChainLocks (`S`), a user must first generate a Token from a
global monotonically increasing token generator. This Token can be thought of
as a type of priority or ticket. Lower internal integer values (Tokens
generated earlier in the sequence) have a "higher priority" than Tokens with a
higher internal integer value. This generated token must be the token used to
obtain *all* of the locks in `S`.
When a thread `A` attempts to obtain a ChainLock `L` with a token `T`, a compare
and exchange of the lock's state with `T` is performed, expecting the state of
the lock to be the unlocked sentinel value. If this succeeds (the state value
was "unlocked"), the lock has been acquired and `A` can continue to execute.
If the `CMPX` operation fails, it implies that some other thread, `B`, currently
holds the lock using a different Token, `X`. In this situation, there are three
possibilities; a sketch of the resulting acquire loop follows the list.
1) `T` < `X` : `A` is "higher priority" than `B`. `A` should simply continue to
attempt to obtain the lock, as it would have done with a SpinLock, until it
finally succeeds.
2) `T` > `X` : `A` is "lower priority" than `B`. `A` now needs to back off,
dropping _all_ of the ChainLocks it currently holds. It should then wait a
short amount of time before re-attempting the operation. This prevents
deadlock in the case where `B` is concurrently attempting to obtain a member
of `S` which is already held by `A` while `A` is attempting to acquire `L`.
3) `T` == `X` : `A` is attempting to re-acquire `L`, which it already holds, and
`A` == `B`. This indicates that `A` is following a lock cycle, and would have
otherwise deadlocked. Typically, this would be considered to be a fatal
error; however, there are places in kernel code where this can happen because
of a request made by user-mode code, and the kernel needs to detect the
situation and implement proper policy to deal with the errant user-mode
program instead of triggering a panic.
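
Here is a minimal sketch of this acquire protocol. The names (`GenerateToken`,
`LockResult`, and this particular `ChainLock` shape) are illustrative, not the
kernel's real API:

```c++
#include <atomic>
#include <cstdint>

using Token = uint64_t;
constexpr Token kUnlockedToken = 0;  // Reserved sentinel: "unlocked".

// Global monotonic token generator; lower values are "higher priority".
inline std::atomic<Token> g_token_generator{1};
Token GenerateToken() { return g_token_generator.fetch_add(1); }

enum class LockResult { kOk, kBackoff, kCycle };

class ChainLock {
 public:
  LockResult Acquire(Token t) {
    for (;;) {
      Token observed = kUnlockedToken;
      if (state_.compare_exchange_strong(observed, t)) {
        return LockResult::kOk;  // The lock was free; it now holds our token.
      }
      if (observed == t) {
        return LockResult::kCycle;  // Case 3: we already hold this lock.
      }
      if (observed < t) {
        return LockResult::kBackoff;  // Case 2: holder outranks us; drop all.
      }
      // Case 1: we outrank the holder; keep spinning until it releases.
    }
  }
  void Release() { state_.store(kUnlockedToken); }
  Token state() const { return state_.load(); }

 private:
  std::atomic<Token> state_{kUnlockedToken};
};
```
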
See also //zircon/system/ulib/concurrent/models/chainlock for a TLA+ model of
the locking protocol. This model shows how the protocol avoids deadlock while
also guaranteeing that all threads will eventually obtain their sets of locks,
regardless of how the sets overlap, or the order in which the locks are
obtained.
## How are they used in the kernel?
With the removal of the global `thread_lock`, new locks needed to be introduced
to replace the global lock's functionality. While a number of new lock classes
were added, the primary ones are:
1) A ChainLock for each `Thread` instance (the "thread's lock").
2) A ChainLock for each `WaitQueue` instance.
3) A SpinLock for each Scheduler instance (the Scheduler's "queue_lock").
Because of profile inheritance, `Thread`s and `WaitQueue`s have relationships
which form "profile inheritance graphs". Adding and removing edges in the
global profile inheritance graphs requires holding sets of locks at the same
time, and user requests which cause these graph manipulations have the potential
to produce concurrent sequences of acquisition which could result in an
ephemeral cycle and deadlock were it not for the back-off protocol implemented
by the ChainLocks. This was the primary motivation for the initial introduction
of the ChainLock; however, other reasons for using them have shown up during
the process of breaking the GTL.
### Read-Only vs. Read-Write access.
A general rule for the use of ChainLocks in the kernel goes something like this:
1) In order to perform some operation `X`, a thread may need to hold some set of
locks `S`.
2) `S` may not be completely known at the outset. For example, if a thread `T` is to
block in an owned wait queue `W`, we know that we must hold both `T` and
`W`'s locks, but we do not yet know if `W` has an owner which also needs to
be locked.
3) For operations involving a set of locks `S`, the general rule is that the
state of an object `O` protected by `O.lock` may be examined if `O.lock` is
held, but *no* mutations of any of the objects involved in `X` may take place
until *all* of the locks in `S` have been obtained exclusively.
So, if we wanted to block a thread `T` in a wait queue `W`, we would:
1) Generate a token `A`.
2) Obtain `T.lock` using `A`.
3) Obtain `W.lock` using `A`.
4) Check to see if `W` has an owner, and if so, obtain `W.owner.lock` using `A`.
5) Check to see if `W.owner` is blocked in a WaitQueue, and if so, obtain
`W.owner.blocking_queue.lock` using `A`.
6) Repeat this process until we hold all of the locks in the priority
inheritance chain.
7) Now that all of the locks are held, add the edge between `T` and `W`, and
propagate the profile inheritance consequences through the PI chain to the
target node.
8) Drop all of the locks.
If, at any point, attempting to obtain a lock results in a back-off, all of the
currently held locks must be dropped and the sequence restarted from step #2.
Note that we *do not* want to regenerate `A` when we back off and retry.
Keeping our original token `A` guarantees that we eventually become the highest
priority lock operation in the system, and will be able to obtain all of our
locks without needing to back off.
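
Continuing the earlier sketch, the retry loop for a simple two-lock set might
look like this (again with illustrative names). Note that the token is
generated once, outside the loop:

```c++
// Acquire t_lock then w_lock with a single token, backing off on conflict.
void LockPair(ChainLock& t_lock, ChainLock& w_lock) {
  const Token token = GenerateToken();  // Generated once; kept across retries.
  for (;;) {
    if (t_lock.Acquire(token) != LockResult::kOk) {
      continue;  // We hold nothing, so we are in no one's way; just retry.
    }
    if (w_lock.Acquire(token) == LockResult::kOk) {
      return;  // Both locks held, acquired with the same token.
    }
    t_lock.Release();  // Back off: drop everything we hold, then retry.
    // (A real implementation would pause briefly here before retrying.)
  }
}
```
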
## ChainLockTransactions
While ChainLocks can be a generalized tool, in the kernel they are used to solve
a specific class of problems where operations on objects protected by a set of
locks need to be performed, but the order in which the locks are obtained is not
always guaranteed to be the same. The ergonomics of attempting to manually
manage ChainLocks and Tokens can become burdensome in general, and also lead to
situations where it can be difficult to implement common static and dynamic
assertions related to lock ownership.
In order to make this easier to use and less error prone overall, the kernel
introduces a new helper object called the ChainLockTransaction (or CLT).
### What is a ChainLockTransaction, logically speaking?
Logically speaking, a ChainLockTransaction represents a sequence of operations
similar to the "block `T` in `W`" sequence previously described. At the start
of the transaction, `A` is generated, and the transaction is in the
"non-finalized" state. The transaction becomes the "active transaction", and
its generated token `A` is used to acquire the set of required locks. Once all
of the locks are acquired, the transaction becomes "finalized" (more on this
later) and no new chain locks may be obtained. Finally, any required mutations
take place, the locks are dropped, and the transaction is complete.
### What is a ChainLockTransaction, practically speaking?
In code, CLTs are RAII style objects which define the scope of an operation
involving a set of locks (henceforth, a transaction). They are always
instantiated on the stack at the start of a transaction. It is an error to ever
instantiate one in a global scope, a static local scope, or on the heap. For a
given thread, there may only ever be one active CLT at a time, and a pointer to
this CLT is stored in the CPU's `percpu` data structure.
There are a number of rules in the kernel which need to be followed when using
ChainLocks, and CLTs are meant to provide a way to both easily follow those
rules, as well as to enforce that the rules are being followed, either
statically or dynamically (using DEBUG_ASSERTs). These rules include:
1) The token `A` used during a transaction must be the same token used for all
of the locks involved in the transaction. CLTs and the kernel's ChainLock
specialization guarantee this by having the CLT generate `A`, and by not
allowing any modification of this token after it has been generated.
Additionally, the kernel version of a ChainLock does not provide a way to pass
a token during an `Acquire` operation, instead demanding that the token be
provided by the currently active CLT registered in the percpu data structure.
2) Like SpinLocks, no interrupt processing or context switching can take place
while any ChainLocks are held. CLTs help to enforce these requirements by
providing specialized versions which can automatically disable interrupts and
preemption as appropriate based on the scope of the transaction. The fact
that no context switching or IRQ handling can take place during a CLT is
also what makes it possible to store the pointer to the currently active
transaction in the percpu data structure, making the transaction available
outside of the immediate scope the transaction was instantiated in.
3) There may only ever be a single CLT active at once. Overlapping transactions
are illegal. CLTs help enforce this rule in two different ways, both statically
and dynamically. First, a global static analysis token (the
`chainlock_transaction_token`) is introduced, and the instantiation of a CLT
will `TA_ACQUIRE` this token. Note that this token is not actually a lock, and
"acquiring" it is a no-op from an instruction execution perspective. That said,
it gives the static analyzer the ability to understand that we are in a
currently active CLT, and any attempt to start another transaction will result
in a compile time error, as it would look to the analyzer like we are
attempting to acquire the same lock twice. Additionally, methods can use this
token to statically demand that we either are or are not involved in a CLT, as
appropriate. Second, we sometimes need to dynamically enforce that we are or
are not in an active CLT. The percpu registration of the CLT allows us to
check this at runtime, and additionally allows us to implement an `AssertHeld`
style check on the `chainlock_transaction_token`, meaning that we can use the
dynamic check to satisfy the static analyzer in situations where it would
otherwise not be possible. A sketch of this annotation pattern follows the
list.
4) ChainLocks may only be obtained in the pre-finalized stage of a CLT. After
the CLT becomes finalized, no new locks may be obtained. CLTs can be used to
enforce this requirement by internally tracking whether or not they have
entered the finalized state, and triggering a panic in the case that someone
attempts to acquire a ChainLock after the transaction was finalized. Note that
this is additional CLT state which is not strictly needed, and can be disabled
in production builds if desired.
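
The static half of rule #3 can be sketched with raw clang thread-safety
attributes, which the kernel wraps in its `TA_*` macros. The class shape below
is illustrative, not the kernel's real definition:

```c++
// A zero-size "capability" that exists only for the static analyzer.
struct __attribute__((capability("chainlock transaction"))) TransactionToken {};
extern TransactionToken chainlock_transaction_token;

class __attribute__((scoped_lockable)) ChainLockTransactionSketch {
 public:
  // "Acquiring" the token emits no instructions; it only tells the analyzer
  // that a transaction is now active. A second instantiation in the same
  // scope is flagged as acquiring the same capability twice.
  ChainLockTransactionSketch()
      __attribute__((acquire_capability(chainlock_transaction_token)));
  ~ChainLockTransactionSketch()
      __attribute__((release_capability(chainlock_transaction_token)));
};

// Functions can statically demand that a transaction is (or is not) active.
void RequiresActiveTransaction()
    __attribute__((requires_capability(chainlock_transaction_token)));
void RequiresNoTransaction()
    __attribute__((locks_excluded(chainlock_transaction_token)));
```
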
### Other practical uses of ChainLockTransactions
#### Dynamic assertion that a ChainLock is currently held.
There are several places in the kernel where we need to be able to demonstrate
to the static analyzer that a particular lock is held, but where it is not
possible to do so from existing static annotations. Kernel SpinLocks can
currently do this by invoking a method on the lock called `AssertHeld`, which
is annotated with a `TA_ASSERT` static annotation. The state of a kernel
SpinLock is an integer whose value is zero if the lock is not held, and the
1s-indexed CPU id of the CPU which holds the lock in the case that it is held.
A SpinLock's `AssertHeld` method simply needs to `DEBUG_ASSERT` that its state
equals `current_cpu_number + 1`. The current CPU (when interrupts are off and
context switching is disabled) is a constant property of the thread of
execution, and can be fetched from any context to establish whether or not a
given SpinLock instance is currently held.
The state of a ChainLock, however, is the Token which was used to obtain it.
Generally speaking, this is not available to every scope unless the token was
explicitly plumbed into the scope, passed either by parameter or lambda capture.
The CLT, as used in the kernel, removes this restriction. To hold any
ChainLocks at all requires that we have an active CLT, and that active CLT holds
our current token. This allows us to implement our own AssertHeld from any
scope, even those where the token would not otherwise be immediately available.
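
A sketch of the idea, extending the earlier `ChainLock` sketch.
`ActiveTransaction()` and the transaction shape below are hypothetical
stand-ins for the percpu registration:

```c++
#include <cassert>

struct Transaction {
  Token token;  // The one token used for every lock in this transaction.
};

// Hypothetical lookup of the CLT registered in the percpu data structure.
Transaction* ActiveTransaction();

// Callable from any scope: no token needs to be plumbed in, because the lock
// is held by us iff its state equals the active transaction's token.
void AssertChainLockHeld(const ChainLock& lock) {
  assert(lock.state() == ActiveTransaction()->token);
}
```
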
Common places where this pattern shows up include:
1) Places where lambda functions are used, especially where they are injected
into a method in a functional programming pattern. The "migrate" function for
a thread is a good example of this.
2) Places involved in context switching, especially when the context switch is
to a newly created thread's Trampoline function. Lock handoff from the old
thread to the new thread needs to take place at this point in time (which is
a bit of a tricky process to begin with), but the Trampoline function has no
static way to demonstrate that it holds the locks which need to be traded.
CLTs provide the way to do this, even though no Tokens were explicitly
plumbed.
#### Quantitative contention analysis.
Quantitative analysis of how much time was lost to contention on a SpinLock
instance is relatively straightforward. If an attempt to acquire a SpinLock
fails the first time, the current time is observed and stored. When the lock is
finally obtained, the amount of time from the first contention to the eventual
acquisition can be computed, and the amount of time spent spinning for the lock
instance can be recorded (typically via kernel tracing).
ChainLocks are a bit different. Talking about the "time spent spinning" on a
ChainLock is difficult, because when a lock is contended, the result is
sometimes that the back-off protocol needs to be invoked. All of the locks
involved in the transaction need to be dropped and reacquired. During
re-acquisition, we may end up contending on a different lock involved in the
transaction, or the same lock, or no lock at all. None of this time would be
properly accounted for if we simply attempted to track the time from first
conflict to subsequent acquisition of the same individual lock.
In general, it does not make a lot of sense to talk about the "time spent
contending" on an individual ChainLock. The CLT, however, gives us a useful analog
of this measurement. Instead of talking about the contention time for an
individual lock, we can talk about the time spent obtaining all of the locks
involved in a transaction if _any_ contention event was experienced along the
way.
CLTs are integrated with the kernel's version of ChainLocks to record a
contention start time the very first time any ChainLock contention event happens
during a non-finalized transaction. When all of the locks are finally obtained,
the transaction becomes finalized, and the time between the first contention
event and finalization can be recorded as the overall "spin contention time"
for the transaction.
This functionality can be enabled/disabled at compile time, dropping completely
out of the build when not needed.
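
A sketch of this accounting, with hypothetical names (`OnContention` stands in
for the hook the kernel's ChainLock would invoke on a failed acquire attempt):

```c++
#include <cstdint>

int64_t CurrentTicks();        // Hypothetical monotonic clock.
void RecordSpinTime(int64_t);  // Hypothetical trace/telemetry sink.

class CltContentionSketch {
 public:
  // Invoked on any contention event during the transaction; only the very
  // first event starts the clock.
  void OnContention() {
    if (contention_start_ == 0) {
      contention_start_ = CurrentTicks();
    }
  }

  // Invoked once all locks are held. If any contention happened along the
  // way, the whole first-conflict-to-finalization interval is recorded.
  void Finalize() {
    if (contention_start_ != 0) {
      RecordSpinTime(CurrentTicks() - contention_start_);
    }
  }

 private:
  int64_t contention_start_ = 0;
};
```
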
#### Implementing proper `Relax` behavior.
As noted earlier, when a CLT encounters a situation where a back-off is
required, it must drop all locks and wait a small amount of time before trying
again. At a minimum, we want to `arch::Yield` at this point in time, but
sometimes we should go further.
For example, if we had to disable both preemption and interrupts at the start of
our transaction, we probably do not want to keep them disabled during our
back-off period, as this risks holding off the servicing of time-critical
interrupts and threads during acquisition.
The specialized versions of the CLT (instantiated on the stack by users)
provide us a better option. CLT instances provide a `Relax()` method which can
be used to implement the proper behavior automatically. In the simplest case,
where interrupts and preemption were already appropriately disabled, all the
CLT needs to do is `arch::Yield`. In cases where the CLT instance is managing
the IRQ/Preempt state, it can:
1) Unregister the current transaction as the active transaction.
2) Restore the preempt/IRQ state. If we are preempted or take an interrupt at
this point in time, our per-cpu state properly reflects the fact that we are
not in a currently active transaction and hold no ChainLocks.
3) `arch::Yield()`
4) Undo step #2, disabling preemption/IRQs as appropriate.
5) Re-register the CLT as the active transaction.
Users do not need to explicitly implement the proper behavior; the type of the
instantiated transaction handles it for them, as sketched below.
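
A sketch of a managed `Relax()` implementing steps 1-5 above. All names here
(`SetActiveTransaction`, the interrupt helpers, the class shape) are
illustrative stand-ins:

```c++
struct IrqState {};
IrqState DisableInterrupts();
void RestoreInterrupts(IrqState);
void SetActiveTransaction(void* clt);  // Percpu registration stand-in.
namespace arch { void Yield(); }

class CltIrqSaveSketch {
 public:
  void Relax() {
    // All ChainLocks must already be dropped at this point; the real kernel
    // would DEBUG_ASSERT this via its lock-count bookkeeping.
    SetActiveTransaction(nullptr);           // 1) Unregister as active CLT.
    RestoreInterrupts(saved_irq_state_);     // 2) Re-enable IRQs/preemption.
    arch::Yield();                           // 3) Give other CPUs a chance.
    saved_irq_state_ = DisableInterrupts();  // 4) Disable again.
    SetActiveTransaction(this);              // 5) Re-register as active.
  }

 private:
  IrqState saved_irq_state_{};
};
```
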
#### Enforcing lock count consistency checks.
There are places in code where we would like to DEBUG_ASSERT things about the
number of ChainLocks currently held. For example, at the start of a
`RescheduleCommon` operation, we expect exactly one ChainLock to be held, that
of the currently running thread. The dynamic assertion capability of CLTs
allows us to assert that we hold the current thread's lock, but we'd like to go
further and assert that this is the _only_ chain lock held.
Other examples of where we want to make a check like this are during a `Relax`
operation and at the end of a transaction (when the CLT instance destructs).
At both of these points in time, we want to be able to assert that we hold
no ChainLocks at all.
CLTs give us a tool to perform this style of consistency check. They are hooked
into the kernel-specific implementation of ChainLocks in a way which allows them
to monitor Acquire/Release operations, and can maintain a count of the number of
locks currently held. This takes extra state in the CLT (the count of locks
held), but like all of the debug checks made possible by CLTs, it can be
disabled at compile time and completely drop out of the build when not needed.
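
A sketch of the bookkeeping, with hypothetical hooks; in the kernel this state
and the checks compile out when disabled:

```c++
#include <cassert>
#include <cstdint>

class CltLockCountSketch {
 public:
  // Invoked by the ChainLock implementation on every acquire/release.
  void OnAcquire() { ++locks_held_; }
  void OnRelease() { --locks_held_; }

  // e.g. a reschedule expects exactly one lock held (the current thread's);
  // Relax and the CLT destructor expect exactly zero.
  void AssertNumLocksHeld(uint32_t expected) const {
    assert(locks_held_ == expected);
  }

 private:
  uint32_t locks_held_ = 0;
};
```
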
### Types of ChainLockTransactions.
As noted earlier, multiple different versions of ChainLockTransaction are
provided to make it easier for users to satisfy preemption and IRQ
enabled/disabled requirements. All of these specific types of CLT are derived
from a common `ChainLockTransaction` class. This base class defines the common
CLT interface, and a `ChainLockTransaction*` is the type of pointer which is
stored in the percpu data structure when a transaction is active. Users,
however, may never instantiate a base ChainLockTransaction object. Instead, the
following specific types of transactions are available for use.
1) `ChainLockTransactionNoIrqSave`. This transaction does not manipulate either
preempt or IRQ enabled/disabled state. Instead, it simply asserts that IRQs
are disabled when it is generated, and implements a simple `arch::Yield`
if/when the transaction is `Relax`ed.
2) `ChainLockTransactionIrqSave`. Automatically saves and restores the IRQ
disabled state. Also automatically restores IRQ state and un/re-registers
the active transaction during a `Relax` operation.
3) `ChainLockTransactionPreemptDisableAndIrqSave`. The same as
`ChainLockTransactionIrqSave`, but goes one step further, automatically
disabling preemption, and then IRQs.
4) `ChainLockTransactionEagerReschedDisableAndIrqSave`. The same as
`ChainLockTransactionPreemptDisableAndIrqSave`, but goes even further,
disabling eager rescheduling in addition to preemption.
### A usage example.
Let's take a look at a simple (but heavily commented) example of using
ChainLockTransactions and ChainLocks. For our simplified scenario, let's
pretend that we want to block a `Thread` in an `OwnedWaitQueue` but we do not
have to deal with ownership reassignment.
```c++
extern ChainLock::LockResult AcquireDownstream(OwnedWaitQueue& owq)
    TA_REQ(chainlock_transaction_token, owq.lock_);
extern void LinkAndPropagate(Thread& t, OwnedWaitQueue& owq)
    TA_REQ(chainlock_transaction_token, t.lock_, owq.lock_);
extern void ReleaseDownstream(OwnedWaitQueue& owq)
    TA_REQ(chainlock_transaction_token) TA_REL(owq.lock_);
extern ChainLock::LockResult BlockInScheduler(Thread& t)
    TA_REQ(chainlock_transaction_token, t.lock_);

void BlockInQueue(OwnedWaitQueue& owq) {
  using LockResult = ChainLock::LockResult;
  Thread* const current_thread = Thread::Current::Get();

  // Create our transaction. This will "acquire" the chainlock_transaction_token,
  // which is a static requirement for acquiring any ChainLocks. It will
  // also handle disabling preemption and interrupts for us. Additionally, it
  // will ensure that a consistent token is used throughout the transaction.
  ChainLockTransactionPreemptDisableAndIrqSave clt("BlockInQueue");

  // Enter into our retry loop. We will cycle through this loop any time we
  // need to back off, automatically relaxing the transaction as we do. The
  // relax operation will dynamically assert that we have properly released all
  // of our locks before we relax.
  for (;; clt.Relax()) {
    // "Unconditionally" acquire the current thread's lock. An "unconditional"
    // acquisition of a ChainLock means that the acquire operation will continue
    // to attempt to obtain the lock, even if it is held by a higher priority
    // transaction. This is legal _only_ because we know that at this point in
    // time we hold no other locks, therefore we cannot be "in the way" of any
    // other concurrently executing CLTs. IOW - we have no locks to drop, and
    // can simply keep trying.
    //
    // The advantage of unconditional acquisition here is that we can acquire
    // the lock using a traditional RAII style guard which will handle both
    // releasing the lock for us, as well as establishing that we hold the lock
    // statically.
    UnconditionalChainLockGuard guard{current_thread->lock_};

    if (const LockResult res = owq.lock_.Acquire(); res != LockResult::kOk) {
      // Cycles should be impossible here. The only possible non-ok result
      // should be "Backoff".
      DEBUG_ASSERT(res == LockResult::kBackoff);

      // All we need to do here is continue. We only hold the current thread's
      // lock, and our guard will automatically drop it before our for loop
      // automatically relaxes the transaction for us.
      continue;
    }

    // Now that we have successfully obtained the lock, assert that we just
    // acquired it. AssertAcquired is similar to AssertHeld, but goes a step
    // further. Not only are we asserting that we hold it for static analysis
    // purposes, we are asserting that we just _acquired_ the lock as well,
    // meaning that we _must_ release the lock in our scope as well.
    owq.lock_.AssertAcquired();

    // If this lock has an owner, grab the downstream locks as well. If this
    // fails, drop our locks and start over.
    if (const LockResult res = AcquireDownstream(owq); res != LockResult::kOk) {
      // Same as before, cycles should be impossible here.
      DEBUG_ASSERT(res == LockResult::kBackoff);

      // Drop the queue lock, and continue, which will drop the thread lock and
      // relax the transaction for us. We cannot forget to drop the queue lock
      // because of the AssertAcquired we executed before.
      owq.lock_.Release();
      continue;
    }

    // We have all of the locks we need! Finalize our transaction (which will
    // automatically handle logging any contention statistics if enabled).
    clt.Finalize();

    // Link the thread and the queue, and propagate the PI consequences. The
    // requirement that we hold both the thread's lock and the queue's lock is
    // statically enforced, while the fact that the downstream lock chain needs
    // to be held can be dynamically enforced thanks to our active CLT.
    LinkAndPropagate(*current_thread, owq);

    // Now that we have the bookkeeping handled, we can release all of the
    // locks at and downstream of the queue we are blocking in.
    ReleaseDownstream(owq);

    // Finally, block in the scheduler. This requires that our current thread's
    // lock be held, but no other ChainLocks. We statically require that the
    // current thread's lock be held, and the scheduler's code will dynamically
    // assert that this is the only lock held. Our thread's lock will
    // automatically be released and reacquired before and after the block
    // operation.
    BlockInScheduler(*current_thread);

    // Break out of our loop. Our guard will automatically release our
    // (now re-acquired) thread's lock.
    break;
  }

  // Exit the scope, destroying our CLT in the process. During destruction, we
  // will automatically:
  //
  // 1) Check to make sure we no longer hold any ChainLocks.
  // 2) Check to make sure that the transaction was properly finalized.
  // 3) Unregister the transaction as the active transaction.
  // 4) Restore the previous preempt/IRQ enabled/disabled state.
}
```