Recursive helping is a technique for implementing lock-free concurrent data structures and algorithms. I’m going to illustrate this in the case of implementing a multi-variable compare-and-swap (MCAS) in terms of a single variable compare-and-swap. Basically everything I’m going to talk about comes from Keir Fraser’s PhD Thesis, Practical Lock-Freedom (2004) which I strongly recommend. Fraser’s thesis goes much further than this, e.g. fine-grained lock-free implementations of software transactional memory (STM)1. Fraser went on to contribute to the initial implementations of STM in Haskell, though his thesis uses C++.
First, some prerequisites.
I imagine most developers, when they hear the term “lock-free”, take it to mean a concurrent algorithm implemented without using locks. It, however, has a technical definition. Assuming a concurrent application is functionally correct, e.g. it does the right thing if it terminates no matter how things are scheduled, we still have three liveness problems in decreasing order of severity:

- deadlock, where some processes block waiting on each other and none of them can ever make progress,
- livelock, where processes are not blocked but continually interfere with, undo, and redo each other’s work without any of them making progress, and
- starvation, where the system as a whole makes progress but some individual processes never do.

In parallel, we have three properties corresponding to programs that cannot exhibit the above behaviors:

- obstruction-freedom, where any process completes in a bounded number of steps if it runs in isolation,
- lock-freedom, where some process always completes in a bounded number of steps, and
- wait-freedom, where every process completes in a bounded number of steps.
Wait-freedom is the most desirable property but was difficult to achieve with reasonable efficiency. However, relatively recent techniques (2014) may do for wait-free algorithms what Fraser’s thesis did for lock-free algorithms, namely reduce a research problem to an exercise. These techniques, however, still start from a lock-free algorithm. Obstruction-freedom is usually what you get from concurrency control mechanisms that abort and retry in the case of conflicts. To achieve lock-freedom, we need to avoid losing and redoing work, or at least doing so repeatedly indefinitely.
See Fraser’s thesis for more formal definitions.
I’ll use “lockless” to mean an algorithm implemented without using locks.
The usual primitive used to implement and describe lockless algorithms is compare-and-swap, often just called cas. There are other possibilities, but cas is universal, relatively simple, and widely implemented, e.g. as the cmpxchg operation for the x86 architecture. The following is a specification of a cas operation in Haskell with the additional note that this is intended to be performed atomically (which it would not be in Haskell).
cas :: (Eq a) => IORef a -> a -> a -> IO a
cas ref old new = do
  curr <- readIORef ref
  if curr == old then do
    writeIORef ref new
    return curr
  else do
    return curr
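The specification above is not atomic as written. For what it's worth, GHC does let you realize it atomically: atomicModifyIORef' applies its update function atomically, so the comparison and the write cannot be interleaved with other updates. A sketch (the name casAtomic is mine, and this is not how cas is assumed to be implemented in the rest of the post, which targets a hardware CAS):

```haskell
import Data.IORef

-- An atomic realization of the cas specification, via atomicModifyIORef'.
-- The returned value is the value seen, as in the specification.
casAtomic :: (Eq a) => IORef a -> a -> a -> IO a
casAtomic ref old new = atomicModifyIORef' ref $ \curr ->
  if curr == old then (new, curr) else (curr, curr)

main :: IO ()
main = do
  r <- newIORef (1 :: Int)
  v1 <- casAtomic r 1 2 -- succeeds, returns the old value 1
  v2 <- casAtomic r 1 3 -- fails, returns the current value 2
  v <- readIORef r
  print (v1, v2, v) -- (1,2,2)
```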
The specification of multiple compare-and-swap is the straightforward extension of the above to several variables.
mcasSpec :: (Eq a) => [(IORef a, a, a)] -> IO Bool
mcasSpec entries = do
  eqs <- forM entries $ \(ref, old, _) -> do
    curr <- readIORef ref
    return (curr == old)
  if and eqs then do -- if all equal
    forM_ entries $ \(ref, _, new) -> do
      writeIORef ref new
    return True
  else do
    return False
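Run single-threaded (where atomicity is moot), the specification behaves as you would expect. For example, repeating mcasSpec so the sketch compiles standalone:

```haskell
import Control.Monad (forM, forM_)
import Data.IORef

-- The specification from above, repeated so this runs standalone.
mcasSpec :: (Eq a) => [(IORef a, a, a)] -> IO Bool
mcasSpec entries = do
  eqs <- forM entries $ \(ref, old, _) -> do
    curr <- readIORef ref
    return (curr == old)
  if and eqs then do
    forM_ entries $ \(ref, _, new) -> writeIORef ref new
    return True
  else return False

main :: IO ()
main = do
  a <- newIORef (10 :: Int)
  b <- newIORef 20
  ok1 <- mcasSpec [(a, 10, 11), (b, 20, 21)] -- all expectations hold: swap both
  ok2 <- mcasSpec [(a, 10, 12), (b, 21, 22)] -- a now holds 11, not 10: change nothing
  va <- readIORef a
  vb <- readIORef b
  print (ok1, ok2, va, vb) -- (True,False,11,21)
```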
The above is, again, intended to be executed atomically. It will be convenient to allow a bit more flexibility in the type, producing:
mcas :: (Eq a) => [(MCASRef a, a, a)] -> IO Bool
where we have
-- Abstract.
newtype MCASRef a = MCASRef { unMCASRef :: IORef (Either (T a) a) } deriving ( Eq )

newMCASRef :: a -> IO (MCASRef a)
newMCASRef v = MCASRef <$> newIORef (Right v)

readMCASRef :: MCASRef a -> IO a
-- Will be implemented below.
The idea here is that, in addition to values of type a, we can also store values of type T a into the pointers for internal use, and we can unambiguously distinguish them from values of type a. T a can be any type constructor you like.
In the code below, I will assume IORefs have an Ord instance, i.e. that they can be sorted. This is not true, but an approach as in ioref-stable could be used to accomplish this. Alternatively, Ptrs to StablePtrs could be used.
We won’t worry about memory consistency concerns here. That is, we’ll assume sequential consistency where all CPU cores see all updates immediately.
I recommend stopping here and thinking about how you would implement mcas in terms of cas while, of course, achieving the desired atomicity. The solution I’ll present – the one from Fraser’s thesis – is moderately involved, so if the approach you come up with is very simple, then you’ve probably made a mistake. Unsurprisingly, the solution I’ll present makes use of the additional flexibility in the type and the ability to sort IORefs. I’m not claiming it is impossible to accomplish this without these though. For example, you could apply a universal construction which witnesses the universality of cas.
Lockless algorithms typically proceed by attempting an operation and detecting conflicts, e.g. as in multi-version concurrency control. This requires storing enough information to tell that a conflicting operation has occurred/is occurring. Once a conflict is detected, the simplest solution is typically to abort and retry hoping that there isn’t a conflict the next time. This clearly leads to the possibility of livelock.
Instead of aborting, the later invocation of the operation could instead help the earlier one to complete, thereby getting it out of its way. This ensures that the first invocation will always, eventually complete giving us lock-freedom. However, this doesn’t guarantee that once the second invocation finishes helping the first invocation that a third invocation won’t jump in before the second invocation gets a chance to start on its own work. In this case, the second invocation will help the third invocation to complete before attempting to start itself. A process can end up spending all its time helping other processes while never getting its own work done, leading to starvation.
To perform recursive helping, we need an invocation to store enough information so that subsequent, overlapping invocations are able to assist. To accomplish this, we’ll store a (pointer to a) “descriptor” containing the parameters of the invocation being helped and potentially additional information2. This is what we’ll use for the T type constructor.
The general approach will be: at the beginning of the operation, we will attempt to “install” a descriptor in the first field we touch utilizing cas. There are then three possible outcomes. If we fail and find a value, then the operation has failed. If we fail and find an existing descriptor, then we (potentially recursively) help that descriptor. If we succeed, then we have successfully “acquired” the field and we “help” ourselves. We can have many processes all trying to help the same invocation at the same time, so it is still important that multiple identical help calls don’t interfere with each other. Just because we’re helping an invocation of an operation doesn’t mean that the original process isn’t still executing.
Since we’ll be replacing pointers to values with pointers to descriptors, reading the value becomes non-trivial. In particular, if when we read a pointer we get a descriptor, we’ll need to help the invocation described to completion. We will need to keep doing this until we successfully read a value.
An operation we’ll use in the implementation of mcas is a conditional compare-and-swap (CCAS), where we only perform the swap if an additional variable is set to a given value. It has the following specification.
-- Specification. Implementations should perform this atomically.
ccasSpec :: (Eq a, Eq c) => IORef a -> a -> a -> IORef c -> c -> IO ()
ccasSpec ref old new condRef check = do
  curr <- readIORef ref
  cond <- readIORef condRef
  if cond == check && curr == old then do
    writeIORef ref new
  else do
    return ()
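Concretely, single-threaded, the specification swaps only when the condition reference holds the expected check value:

```haskell
import Data.IORef

-- The CCAS specification from above, repeated so this runs standalone.
ccasSpec :: (Eq a, Eq c) => IORef a -> a -> a -> IORef c -> c -> IO ()
ccasSpec ref old new condRef check = do
  curr <- readIORef ref
  cond <- readIORef condRef
  if cond == check && curr == old
    then writeIORef ref new
    else return ()

main :: IO ()
main = do
  flag <- newIORef True
  r <- newIORef (1 :: Int)
  ccasSpec r 1 2 flag True -- flag matches the check: the swap happens
  writeIORef flag False
  ccasSpec r 2 3 flag True -- flag no longer matches: no swap
  v <- readIORef r
  print v -- 2
```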
We’ll need to show that this can be implemented in terms of cas, or rather a version with modifications similar to those mentioned for mcas. This will be a simple instance of the recursive helping approach that will be applied in the implementation of mcas.
type CCASDescriptor a c = IORef (CCASRef a c, a, a, IORef c, c)
newtype CCASRef a c = CCASRef { unCCASRef :: IORef (Either (CCASDescriptor a c) a) } deriving ( Eq, Ord )
We begin with the types. As described above, a CCASRef is just an IORef that holds either a value or a descriptor, and the descriptor is just an IORef pointing at a tuple holding the arguments to ccas.

We won’t actually modify this latter IORef and instead are just using it for its object identity. It could be replaced with a read-only IVar, or a Unique could be allocated and used as an identifier instead. In a lower-level language, this IORef corresponds to having a pointer to the descriptor.
newCCASRef :: a -> IO (CCASRef a c)
newCCASRef v = CCASRef <$> newIORef (Right v)

readCCASRef :: (Eq a, Eq c) => CCASRef a c -> IO a
readCCASRef ref = do
  x <- readIORef (unCCASRef ref)
  case x of
    Right v -> return v
    Left d -> do
      ccasHelp d
      readCCASRef ref

-- Not atomic. This CAS can fail even when it would be impossible if `ccas` was truly atomic.
-- Example: ccas a reference to the same value but where the condRef is False. The ccas fails
-- and thus should behave as a no-op, but if a `casCCASRef` occurs during the course of the ccas,
-- the `casCCASRef` can fail even though it should succeed in all cases.
casCCASRef :: (Eq a) => CCASRef a c -> a -> a -> IO Bool
casCCASRef (CCASRef ref) old new = do
  curr <- cas ref (Right old) (Right new)
  return (curr == Right old)

tryReadCCASRef :: CCASRef a c -> IO (Maybe a)
tryReadCCASRef ref = do
  x <- readIORef (unCCASRef ref)
  return (case x of Left _ -> Nothing; Right v -> Just v)
To get them out of the way, the following functions implement the reference-like aspects of a CCASRef. The descriptor is an internal implementation detail. The interface is meant to look like a normal reference to a value of type a. The main notes are:

- since a CCASRef may not contain a value when we read, we loop, helping to complete the ccas, until it does,
- casCCASRef is a slightly simplified cas used in mcas but should not be considered part of the interface, and
- tryReadCCASRef is used in the implementation of mcas, but you quite possibly wouldn’t provide it otherwise.

ccas :: (Eq a, Eq c) => CCASRef a c -> a -> a -> IORef c -> c -> IO ()
ccas ref old new condRef check = do
  d <- newIORef (ref, old, new, condRef, check)
  v <- cas (unCCASRef ref) (Right old) (Left d)
  go d v
 where
  go d (Left d') = do -- descriptor already there
    ccasHelp d'
    v <- cas (unCCASRef ref) (Right old) (Left d)
    go d v
  go d (Right curr) | curr == old = ccasHelp d -- we succeeded
                    | otherwise = return () -- we failed

ccasHelp :: (Eq a, Eq c) => CCASDescriptor a c -> IO ()
ccasHelp d = do
  (CCASRef ref, old, new, condRef, check) <- readIORef d
  cond <- readIORef condRef
  _ <- cas ref (Left d) (Right $! if cond == check then new else old)
  return ()
Here we illustrate the (not so recursive) helping pattern. ccas allocates a descriptor and then attempts to “acquire” the reference with a cas. There are three possibilities:

- the cas succeeds, in which case we “help” our own invocation to completion via ccasHelp,
- the cas fails having found another descriptor, in which case we help that invocation to completion and then try again, or
- the cas fails having found some other value, in which case the ccas fails and behaves as a no-op.
Helping, implemented by ccasHelp, just performs the logic of CCAS. If we’ve gotten to ccasHelp, we know the invocation described by the descriptor did, in fact, find the expected value there. By installing our descriptor, we’ve effectively “locked out” any other calls to ccas until we complete. We can thus check the condRef at our leisure. As long as our descriptor is still in the CCASRef, which we check via a cas, we know that there have been no intervening operations, including other processes completing this ccas. ccasHelp is idempotent in the sense that running it multiple times, even in parallel, with the same descriptor is the same as running it once. This is due to the fact that we only (successfully) CAS in the descriptor once, so we can only CAS it out at most once.
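The idempotence claim can be exercised directly: install a descriptor by hand (exactly as ccas would) and then run ccasHelp twice; the second run’s cas finds the descriptor already gone and changes nothing. This sketch reuses the definitions above, with the atomic cas emulated via atomicModifyIORef' (an assumption for the demo only):

```haskell
import Data.IORef

-- Sequential stand-in for an atomic compare-and-swap.
cas :: (Eq a) => IORef a -> a -> a -> IO a
cas ref old new = atomicModifyIORef' ref $ \c ->
  if c == old then (new, c) else (c, c)

type CCASDescriptor a c = IORef (CCASRef a c, a, a, IORef c, c)
newtype CCASRef a c = CCASRef { unCCASRef :: IORef (Either (CCASDescriptor a c) a) }

-- ccasHelp as in the post.
ccasHelp :: (Eq a, Eq c) => CCASDescriptor a c -> IO ()
ccasHelp d = do
  (CCASRef ref, old, new, condRef, check) <- readIORef d
  cond <- readIORef condRef
  _ <- cas ref (Left d) (Right $! if cond == check then new else old)
  return ()

main :: IO ()
main = do
  cond <- newIORef True
  r <- CCASRef <$> newIORef (Right (1 :: Int))
  -- Manually "install" a descriptor for `ccas r 1 2 cond True`, as ccas would.
  d <- newIORef (r, 1, 2, cond, True)
  _ <- cas (unCCASRef r) (Right 1) (Left d)
  -- Two helpers complete the same invocation; the second CAS fails harmlessly.
  ccasHelp d
  ccasHelp d
  Right v <- readIORef (unCCASRef r)
  print v -- 2
```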
The setup for MCAS is much the same as CCAS. The main additional complexity comes from the fact that we need to simultaneously “acquire” multiple references. This is handled by a two-phase approach. In the first phase, we attempt to “acquire” each reference. We proceed to the second phase once we’ve either seen that the MCAS is going to fail, or we have successfully “acquired” each reference. In the second phase, we either reset all the “acquired” references to their old values if the MCAS failed or to their new values if it succeeded. The MCAS will be considered to have occurred atomically at the point we record this decision, via a CAS, i.e. between the two phases.
data MCASStatus = UNDECIDED | FAILED | SUCCESSFUL deriving ( Eq )
data MCASDescriptor' a = MCASDescriptor [(MCASRef a, a, a)] (IORef MCASStatus)
type MCASDescriptor a = IORef (MCASDescriptor' a)
newtype MCASRef a = MCASRef { unMCASRef :: CCASRef (Either (MCASDescriptor a) a) MCASStatus }
  deriving ( Eq, Ord )
As with CCAS, an MCASRef is a reference, in this case a CCASRef, that either holds a value or a descriptor. The descriptor holds the arguments of mcas, as with ccas, but it additionally holds a status reference. This status reference will be used as the condition reference of the CCAS. In particular, as we’ll see, we will only perform ccas’s when the status is UNDECIDED.
newMCASRef :: a -> IO (MCASRef a)
newMCASRef v = MCASRef <$> newCCASRef (Right v)

readMCASRef :: (Eq a) => MCASRef a -> IO a
readMCASRef ref = do
  x <- readCCASRef (unMCASRef ref)
  case x of
    Right v -> return v
    Left d -> do
      mcasHelp d
      readMCASRef ref
There’s nothing to say about the reference interface functions. They are essentially identical to the CCAS ones for the same reasons, only with CCASRefs instead of IORefs.
mcas :: (Eq a) => [(MCASRef a, a, a)] -> IO Bool
mcas entries = do
  status <- newIORef UNDECIDED
  d <- newIORef (MCASDescriptor (sortOn (\(ref, _, _) -> ref) entries) status)
  mcasHelp d
The mcas function is fairly straightforward. It allocates a status reference and a descriptor and delegates most of the work to mcasHelp. The one subtlety is the sort, which is critical to ensuring termination.
mcasHelp :: (Eq a) => MCASDescriptor a -> IO Bool
mcasHelp d = do
  MCASDescriptor entries statusRef <- readIORef d
  let phase1 [] = do
        _ <- cas statusRef UNDECIDED SUCCESSFUL
        phase2
      phase1 ((MCASRef ref, old, new):es) = tryAcquire ref old new es

      tryAcquire ref old new es = do
        _ <- ccas ref (Right old) (Left d) statusRef UNDECIDED
        v <- tryReadCCASRef ref
        case v of
          Just (Left d') | d == d' -> phase1 es -- successful acquisition
                         | otherwise -> do -- help someone else
                             mcasHelp d'
                             tryAcquire ref old new es
          Just (Right curr) | curr == old -> do
            status <- readIORef statusRef
            if status == UNDECIDED then do
              -- failed to acquire but could still succeed
              tryAcquire ref old new es
            else do
              phase2
          _ -> do -- failed MCAS
            _ <- cas statusRef UNDECIDED FAILED
            phase2

      phase2 = do
        status <- readIORef statusRef
        let succeeded = status == SUCCESSFUL
        forM_ entries $ \(MCASRef ref, old, new) -> do
          casCCASRef ref (Left d) (Right (if succeeded then new else old))
        return succeeded
  phase1 entries
phase1 attempts to “acquire” each MCASRef by using tryAcquire, which will move phase1 to the next entry each time it succeeds. Therefore, if phase1 reaches the end of the list and there was no interference, we will have successfully “acquired” all references. We record this with a CAS against statusRef. If this CAS succeeds, then the MCAS will be considered successful and conceptually to have occurred at this point. If the CAS fails, then some other process has already completed this MCAS, whether in success or failure. We then move to phase2.
tryAcquire can also detect that the MCAS should fail. In this case, we immediately attempt to record this fact via a CAS into statusRef. As with the successful case, this CAS succeeding marks the conceptual instant that the MCAS completes. As before, we then move on to phase2.
We never enter phase2 without statusRef being set to either SUCCESSFUL or FAILED. phase2 is completely straightforward. We simply set each “acquired” reference to either the new or old value depending on whether the MCAS succeeded or not. The casCCASRef will fail if either we never got around to “acquiring” a particular reference (in the case of MCAS failure), or if a reference was written to since it was “acquired”. Since such writes conceptually occurred after the MCAS completed, we do not want to overwrite them.
During tryAcquire, there are a few cases that lead to retrying. First, if we find that a reference has already been “acquired” by some other MCAS operation, we recursively help it. Here, the sorting of the references is important to ensure that any MCAS operation we help will never try to help us back. It’s easy to see that without the sorting, if two MCAS operations on the same two references each acquired one of the references, the (concurrent) recursive calls to mcasHelp would become infinite loops. With a total ordering on references, each recursive call to mcasHelp will be at a greater reference and thus must eventually terminate. The other case for tryAcquire is that the expected value is written after the ccas but before the tryReadCCASRef. In this case, we try again unless the status has already been decided. It might seem like this is just an optimization, and that we could instead treat this as the MCAS failing. However, the intervening write may have written the value that was there before the ccas, meaning that there was never a point at which the MCAS could have failed.
References are only “acquired” at the ccas in phase1. Once the status has been decided, no references may be “acquired” any longer. Since it’s impossible to enter phase2 without deciding the status, once one process enters phase2, no processes are capable of “acquiring” references. This makes phase2 idempotent and, indeed, each CAS in phase2 is independently idempotent. Overlapping executions of phase1 are fine essentially because each ccas is idempotent and the statusRef can change at most once.
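To check that the moving parts fit together, here is a self-contained, single-threaded sketch assembling the pieces above. Two things are faked, matching caveats flagged earlier in the post: the atomic cas is emulated with atomicModifyIORef', and the sort in mcas is omitted because IORef genuinely has no Ord instance (the sort only matters for termination when operations overlap):

```haskell
import Control.Monad (forM_)
import Data.IORef

-- Sequential stand-in for an atomic compare-and-swap.
cas :: (Eq a) => IORef a -> a -> a -> IO a
cas ref old new = atomicModifyIORef' ref $ \c ->
  if c == old then (new, c) else (c, c)

type CCASDescriptor a c = IORef (CCASRef a c, a, a, IORef c, c)
newtype CCASRef a c = CCASRef { unCCASRef :: IORef (Either (CCASDescriptor a c) a) } deriving ( Eq )

newCCASRef :: a -> IO (CCASRef a c)
newCCASRef v = CCASRef <$> newIORef (Right v)

readCCASRef :: (Eq a, Eq c) => CCASRef a c -> IO a
readCCASRef ref = do
  x <- readIORef (unCCASRef ref)
  case x of
    Right v -> return v
    Left d -> ccasHelp d >> readCCASRef ref

casCCASRef :: (Eq a) => CCASRef a c -> a -> a -> IO Bool
casCCASRef (CCASRef ref) old new = (== Right old) <$> cas ref (Right old) (Right new)

tryReadCCASRef :: CCASRef a c -> IO (Maybe a)
tryReadCCASRef ref = either (const Nothing) Just <$> readIORef (unCCASRef ref)

ccas :: (Eq a, Eq c) => CCASRef a c -> a -> a -> IORef c -> c -> IO ()
ccas ref old new condRef check = do
  d <- newIORef (ref, old, new, condRef, check)
  let loop = do
        v <- cas (unCCASRef ref) (Right old) (Left d)
        case v of
          Left d' -> ccasHelp d' >> loop          -- descriptor already there
          Right curr | curr == old -> ccasHelp d  -- we succeeded
                     | otherwise -> return ()     -- we failed
  loop

ccasHelp :: (Eq a, Eq c) => CCASDescriptor a c -> IO ()
ccasHelp d = do
  (CCASRef ref, old, new, condRef, check) <- readIORef d
  cond <- readIORef condRef
  _ <- cas ref (Left d) (Right $! if cond == check then new else old)
  return ()

data MCASStatus = UNDECIDED | FAILED | SUCCESSFUL deriving ( Eq )
data MCASDescriptor' a = MCASDescriptor [(MCASRef a, a, a)] (IORef MCASStatus)
type MCASDescriptor a = IORef (MCASDescriptor' a)
newtype MCASRef a = MCASRef { unMCASRef :: CCASRef (Either (MCASDescriptor a) a) MCASStatus } deriving ( Eq )

newMCASRef :: a -> IO (MCASRef a)
newMCASRef v = MCASRef <$> newCCASRef (Right v)

readMCASRef :: (Eq a) => MCASRef a -> IO a
readMCASRef ref = do
  x <- readCCASRef (unMCASRef ref)
  case x of
    Right v -> return v
    Left d -> mcasHelp d >> readMCASRef ref

mcas :: (Eq a) => [(MCASRef a, a, a)] -> IO Bool
mcas entries = do
  status <- newIORef UNDECIDED
  -- Sort omitted: IORef has no Ord instance, and the sort only matters
  -- for termination when operations actually overlap.
  d <- newIORef (MCASDescriptor entries status)
  mcasHelp d

mcasHelp :: (Eq a) => MCASDescriptor a -> IO Bool
mcasHelp d = do
  MCASDescriptor entries statusRef <- readIORef d
  let phase1 [] = cas statusRef UNDECIDED SUCCESSFUL >> phase2
      phase1 ((MCASRef ref, old, new):es) = tryAcquire ref old new es
      tryAcquire ref old new es = do
        ccas ref (Right old) (Left d) statusRef UNDECIDED
        v <- tryReadCCASRef ref
        case v of
          Just (Left d') | d == d' -> phase1 es
                         | otherwise -> mcasHelp d' >> tryAcquire ref old new es
          Just (Right curr) | curr == old -> do
            status <- readIORef statusRef
            if status == UNDECIDED then tryAcquire ref old new es else phase2
          _ -> cas statusRef UNDECIDED FAILED >> phase2
      phase2 = do
        status <- readIORef statusRef
        let succeeded = status == SUCCESSFUL
        forM_ entries $ \(MCASRef ref, old, new) ->
          casCCASRef ref (Left d) (Right (if succeeded then new else old))
        return succeeded
  phase1 entries

main :: IO ()
main = do
  a <- newMCASRef (10 :: Int)
  b <- newMCASRef (20 :: Int)
  ok1 <- mcas [(a, 10, 11), (b, 20, 21)] -- both expectations hold: succeeds
  ok2 <- mcas [(a, 10, 99), (b, 21, 22)] -- a no longer holds 10: fails
  va <- readMCASRef a
  vb <- readMCASRef b
  print (ok1, ok2, va, vb) -- (True,False,11,21)
```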
Let’s see how an mcas interacts with other operations from the perspective of atomicity. If we attempt to read a reference via readMCASRef which is included in the list of references of an ongoing mcas, there are two possibilities. Either that reference has not yet been “acquired” by the mcas, in which case the read will occur conceptually before the MCAS, or it has been “acquired”, in which case the read will help the MCAS to completion and then try again after the MCAS. The story is similar for overlapping mcas’s. The mcas which “acquires” the least reference in their intersection will conceptually complete first, either because it literally finishes before the second mcas notices or because the second mcas will help it to completion. Writes are only slightly different.
It is important to note that these operations are NOT atomic with respect to some other reasonable operations. Most notably, they are not atomic with respect to blind writes. It is easy to construct a scenario where two blind writes happen in sequence but the first appears to happen after the mcas and the second before. Except for initialization, I don’t believe there are any blind writes to references involved in an mcas in Fraser’s thesis. Fraser’s thesis does, however, contain cas operations directly against these references. These are also not atomic, for exactly the same reason casCCASRef isn’t. That said, Fraser’s uses of cas against MCASRefs are safe, because in each case they just retry until success.
While I’ve gone into a good amount of detail here, I’ve mainly wanted to illustrate the concept of recursive helping. It’s a key concept for lock-free and wait-free algorithm designs, but it also may be a useful idea to have in mind when designing concurrent code even if you aren’t explicitly trying to achieve a lock-free guarantee.
MCAS allows you to perform transactions involving multiple updates atomically. STM additionally allows you to perform transactions involving multiple reads and updates atomically.↩︎
While I haven’t attempted it to see if it works out, it seems like you could make a generic “recursive helping” framework by storing an IO action instead. The “descriptors” have the flavor of defunctionalized continuations.↩︎