Recursive helping is a technique for implementing lock-free concurrent data structures and algorithms. I’m going to illustrate this in the case of implementing a multi-variable compare-and-swap (MCAS) in terms of a single variable compare-and-swap. Basically everything I’m going to talk about comes from Keir Fraser’s PhD Thesis, Practical Lock-Freedom (2004) which I strongly recommend. Fraser’s thesis goes much further than this, e.g. fine-grained lock-free implementations of software transactional memory (STM)1. Fraser went on to contribute to the initial implementations of STM in Haskell, though his thesis uses C++.
First, some prerequisites.
I imagine most developers, when they hear the term “lock-free”, take it to mean a concurrent algorithm implemented without using locks. It, however, has a technical definition. Assuming a concurrent application is functionally correct, e.g. it does the right thing if it terminates no matter how things are scheduled, we still have three liveness problems in decreasing order of severity:

- deadlock, where some processes block waiting on each other and none of them can ever make progress,
- livelock, where processes are not blocked but continually interfere with, undo, and redo each other’s work without any of them making progress, and
- starvation, where the system as a whole makes progress but some individual processes never do.

In parallel, we have three properties corresponding to programs that cannot exhibit the above behaviors:

- obstruction-freedom, where any process completes in a bounded number of steps if it runs in isolation,
- lock-freedom, where some process always completes in a bounded number of steps, and
- wait-freedom, where every process completes in a bounded number of steps.
Wait-freedom is the most desirable property but was difficult to achieve with reasonable efficiency. However, relatively recent techniques (2014) may do for wait-free algorithms what Fraser’s thesis did for lock-free algorithms, namely reduce a research problem to an exercise. These techniques, however, still start from a lock-free algorithm. Obstruction-freedom is usually what you get from concurrency control mechanisms that abort and retry in the case of conflicts. To achieve lock-freedom, we need to avoid losing and redoing work, or at least doing so repeatedly indefinitely.
See Fraser’s thesis for more formal definitions.
I’ll use “lockless” to mean an algorithm implemented without using locks.
The usual primitive used to implement and describe lockless algorithms is compare-and-swap, often just called cas. There are other possibilities, but cas is universal, relatively simple, and widely implemented, e.g. as the cmpxchg operation for the x86 architecture. The following is a specification of a cas operation in Haskell with the additional note that this is intended to be performed atomically (which it would not be in Haskell).
cas :: (Eq a) => IORef a -> a -> a -> IO a
cas ref old new = do
  curr <- readIORef ref
  if curr == old then do
    writeIORef ref new
    return curr
  else do
    return curr
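The specification above is not atomic as written. For what it's worth, GHC does let you realize it atomically: atomicModifyIORef' applies its update function atomically, so the comparison and the write cannot be interleaved with other updates. A sketch (the name casAtomic is mine, and this is not how cas is assumed to be implemented in the rest of the post, which targets a hardware CAS):

```haskell
import Data.IORef

-- An atomic realization of the cas specification, via atomicModifyIORef'.
-- The returned value is the value seen, as in the specification.
casAtomic :: (Eq a) => IORef a -> a -> a -> IO a
casAtomic ref old new = atomicModifyIORef' ref $ \curr ->
  if curr == old then (new, curr) else (curr, curr)

main :: IO ()
main = do
  r <- newIORef (1 :: Int)
  v1 <- casAtomic r 1 2 -- succeeds, returns the old value 1
  v2 <- casAtomic r 1 3 -- fails, returns the current value 2
  v <- readIORef r
  print (v1, v2, v) -- (1,2,2)
```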
The specification of multiple compare-and-swap is the straightforward extension of the above to several variables.
mcasSpec :: (Eq a) => [(IORef a, a, a)] -> IO Bool
mcasSpec entries = do
  eqs <- forM entries $ \(ref, old, _) -> do
    curr <- readIORef ref
    return (curr == old)
  if and eqs then do -- if all equal
    forM_ entries $ \(ref, _, new) -> do
      writeIORef ref new
    return True
  else do
    return False
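Run single-threaded (where atomicity is moot), the specification behaves as you would expect. For example, repeating mcasSpec so the sketch compiles standalone:

```haskell
import Control.Monad (forM, forM_)
import Data.IORef

-- The specification from above, repeated so this runs standalone.
mcasSpec :: (Eq a) => [(IORef a, a, a)] -> IO Bool
mcasSpec entries = do
  eqs <- forM entries $ \(ref, old, _) -> do
    curr <- readIORef ref
    return (curr == old)
  if and eqs then do
    forM_ entries $ \(ref, _, new) -> writeIORef ref new
    return True
  else return False

main :: IO ()
main = do
  a <- newIORef (10 :: Int)
  b <- newIORef 20
  ok1 <- mcasSpec [(a, 10, 11), (b, 20, 21)] -- all expectations hold: swap both
  ok2 <- mcasSpec [(a, 10, 12), (b, 21, 22)] -- a now holds 11, not 10: change nothing
  va <- readIORef a
  vb <- readIORef b
  print (ok1, ok2, va, vb) -- (True,False,11,21)
```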
The above is, again, intended to be executed atomically. It will be convenient to allow a bit more flexibility in the type, producing:
mcas :: (Eq a) => [(MCASRef a, a, a)] -> IO Bool
where we have
-- Abstract.
newtype MCASRef a = MCASRef { unMCASRef :: IORef (Either (T a) a) } deriving ( Eq )

newMCASRef :: a -> IO (MCASRef a)
newMCASRef v = MCASRef <$> newIORef (Right v)

readMCASRef :: MCASRef a -> IO a
-- Will be implemented below.
The idea here is that, in addition to values of type a, we can also store values of type T a into the pointers for internal use, and we can unambiguously distinguish them from values of type a. T a can be any type constructor you like.
In the code below, I will assume IORefs have an Ord instance, i.e. that they can be sorted. This is not true, but an approach as in ioref-stable could be used to accomplish this. Alternatively, Ptrs to StablePtrs could be used.
We won’t worry about memory consistency concerns here. That is, we’ll assume sequential consistency where all CPU cores see all updates immediately.
I recommend stopping here and thinking about how you would implement mcas in terms of cas while, of course, achieving the desired atomicity. The solution I’ll present – the one from Fraser’s thesis – is moderately involved, so if the approach you come up with is very simple, then you’ve probably made a mistake. Unsurprisingly, the solution I’ll present makes use of the additional flexibility in the type and the ability to sort IORefs. I’m not claiming it is impossible to accomplish this without these though. For example, you could apply a universal construction which witnesses the universality of cas.
Lockless algorithms typically proceed by attempting an operation and detecting conflicts, e.g. as in multi-version concurrency control. This requires storing enough information to tell that a conflicting operation has occurred/is occurring. Once a conflict is detected, the simplest solution is typically to abort and retry hoping that there isn’t a conflict the next time. This clearly leads to the possibility of livelock.
Instead of aborting, the later invocation of the operation could instead help the earlier one to complete, thereby getting it out of its way. This ensures that the first invocation will always, eventually complete giving us lock-freedom. However, this doesn’t guarantee that once the second invocation finishes helping the first invocation that a third invocation won’t jump in before the second invocation gets a chance to start on its own work. In this case, the second invocation will help the third invocation to complete before attempting to start itself. A process can end up spending all its time helping other processes while never getting its own work done, leading to starvation.
To perform recursive helping, we need an invocation to store enough information so that subsequent, overlapping invocations are able to assist. To accomplish this, we’ll store a (pointer to a) “descriptor” containing the parameters of the invocation being helped and potentially additional information2. This is what we’ll use for the T type constructor.
The general approach will be: at the beginning of the operation, we will attempt to “install” a descriptor in the first field we touch utilizing cas. There are then three possible outcomes. If we fail and find a value, then the operation has failed. If we fail and find an existing descriptor, then we (potentially recursively) help that descriptor. If we succeed, then we have successfully “acquired” the field and we “help” ourselves. We can have many processes all trying to help the same invocation at the same time, so it is still important that multiple identical help calls don’t interfere with each other. Just because we’re helping an invocation of an operation doesn’t mean that the original process isn’t still executing.
Since we’ll be replacing pointers to values with pointers to descriptors, reading the value becomes non-trivial. In particular, if when we read a pointer we get a descriptor, we’ll need to help the invocation described to completion. We will need to keep doing this until we successfully read a value.
An operation we’ll use in the implementation of mcas is a conditional compare-and-swap (CCAS), where we only perform the swap if an additional variable is set to a given value. It has the following specification.
-- Specification. Implementations should perform this atomically.
ccasSpec :: (Eq a, Eq c) => IORef a -> a -> a -> IORef c -> c -> IO ()
ccasSpec ref old new condRef check = do
  curr <- readIORef ref
  cond <- readIORef condRef
  if cond == check && curr == old then do
    writeIORef ref new
  else do
    return ()
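Concretely, single-threaded, the specification swaps only when the condition reference holds the expected check value:

```haskell
import Data.IORef

-- The CCAS specification from above, repeated so this runs standalone.
ccasSpec :: (Eq a, Eq c) => IORef a -> a -> a -> IORef c -> c -> IO ()
ccasSpec ref old new condRef check = do
  curr <- readIORef ref
  cond <- readIORef condRef
  if cond == check && curr == old
    then writeIORef ref new
    else return ()

main :: IO ()
main = do
  flag <- newIORef True
  r <- newIORef (1 :: Int)
  ccasSpec r 1 2 flag True -- flag matches the check: the swap happens
  writeIORef flag False
  ccasSpec r 2 3 flag True -- flag no longer matches: no swap
  v <- readIORef r
  print v -- 2
```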
We’ll need to show that this can be implemented in terms of cas, or rather a version with modifications similar to those mentioned for mcas. This will be a simple instance of the recursive helping approach that will be applied in the implementation of mcas.
type CCASDescriptor a c = IORef (CCASRef a c, a, a, IORef c, c)
newtype CCASRef a c = CCASRef { unCCASRef :: IORef (Either (CCASDescriptor a c) a) } deriving ( Eq, Ord )
We begin with the types. As described above, a CCASRef is just an IORef that holds either a value or a descriptor, and the descriptor is just an IORef pointing at a tuple holding the arguments to ccas.

We won’t actually modify this latter IORef and instead are just using it for its object identity. It could be replaced with a read-only IVar, or a Unique could be allocated and used as an identifier instead. In a lower-level language, this IORef corresponds to having a pointer to the descriptor.
newCCASRef :: a -> IO (CCASRef a c)
newCCASRef v = CCASRef <$> newIORef (Right v)

readCCASRef :: (Eq a, Eq c) => CCASRef a c -> IO a
readCCASRef ref = do
  x <- readIORef (unCCASRef ref)
  case x of
    Right v -> return v
    Left d -> do
      ccasHelp d
      readCCASRef ref

-- Not atomic. This CAS can fail even when it would be impossible if `ccas` was truly atomic.
-- Example: ccas a reference to the same value but where the condRef is False. The ccas fails
-- and thus should behave as a no-op, but if a `casCCASRef` occurs during the course of the ccas,
-- the `casCCASRef` can fail even though it should succeed in all cases.
casCCASRef :: (Eq a) => CCASRef a c -> a -> a -> IO Bool
casCCASRef (CCASRef ref) old new = do
  curr <- cas ref (Right old) (Right new)
  return (curr == Right old)

tryReadCCASRef :: CCASRef a c -> IO (Maybe a)
tryReadCCASRef ref = do
  x <- readIORef (unCCASRef ref)
  return (case x of Left _ -> Nothing; Right v -> Just v)
To get them out of the way, the following functions implement the reference-like aspects of a CCASRef. The descriptor is an internal implementation detail. The interface is meant to look like a normal reference to a value of type a. The main notes are:

- since a CCASRef may not contain a value when we read, we loop, helping to complete the ccas, until it does,
- casCCASRef is a slightly simplified cas used in mcas but should not be considered part of the interface, and
- tryReadCCASRef is used in the implementation of mcas, but you quite possibly wouldn’t provide it otherwise.

ccas :: (Eq a, Eq c) => CCASRef a c -> a -> a -> IORef c -> c -> IO ()
ccas ref old new condRef check = do
  d <- newIORef (ref, old, new, condRef, check)
  v <- cas (unCCASRef ref) (Right old) (Left d)
  go d v
 where
  go d (Left d') = do -- descriptor already there
    ccasHelp d'
    v <- cas (unCCASRef ref) (Right old) (Left d)
    go d v
  go d (Right curr) | curr == old = ccasHelp d -- we succeeded
                    | otherwise = return () -- we failed

ccasHelp :: (Eq a, Eq c) => CCASDescriptor a c -> IO ()
ccasHelp d = do
  (CCASRef ref, old, new, condRef, check) <- readIORef d
  cond <- readIORef condRef
  _ <- cas ref (Left d) (Right $! if cond == check then new else old)
  return ()
Here we illustrate the (not so recursive) helping pattern. ccas allocates a descriptor and then attempts to “acquire” the reference with a cas. There are three possibilities:

- the cas succeeds, in which case we “help” our own invocation to completion via ccasHelp,
- the cas fails having found another descriptor, in which case we help that invocation to completion and then try again, or
- the cas fails having found some other value, in which case the ccas fails and behaves as a no-op.
Helping, implemented by ccasHelp, just performs the logic of CCAS. If we’ve gotten to ccasHelp, we know the invocation described by the descriptor did, in fact, find the expected value there. By installing our descriptor, we’ve effectively “locked out” any other calls to ccas until we complete. We can thus check the condRef at our leisure. As long as our descriptor is still in the CCASRef, which we check via a cas, we know that there have been no intervening operations, including other processes completing this ccas. ccasHelp is idempotent in the sense that running it multiple times, even in parallel, with the same descriptor is the same as running it once. This is due to the fact that we only (successfully) CAS in the descriptor once, so we can only CAS it out at most once.
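The idempotence claim can be exercised directly: install a descriptor by hand (exactly as ccas would) and then run ccasHelp twice; the second run’s cas finds the descriptor already gone and changes nothing. This sketch reuses the definitions above, with the atomic cas emulated via atomicModifyIORef' (an assumption for the demo only):

```haskell
import Data.IORef

-- Sequential stand-in for an atomic compare-and-swap.
cas :: (Eq a) => IORef a -> a -> a -> IO a
cas ref old new = atomicModifyIORef' ref $ \c ->
  if c == old then (new, c) else (c, c)

type CCASDescriptor a c = IORef (CCASRef a c, a, a, IORef c, c)
newtype CCASRef a c = CCASRef { unCCASRef :: IORef (Either (CCASDescriptor a c) a) }

-- ccasHelp as in the post.
ccasHelp :: (Eq a, Eq c) => CCASDescriptor a c -> IO ()
ccasHelp d = do
  (CCASRef ref, old, new, condRef, check) <- readIORef d
  cond <- readIORef condRef
  _ <- cas ref (Left d) (Right $! if cond == check then new else old)
  return ()

main :: IO ()
main = do
  cond <- newIORef True
  r <- CCASRef <$> newIORef (Right (1 :: Int))
  -- Manually "install" a descriptor for `ccas r 1 2 cond True`, as ccas would.
  d <- newIORef (r, 1, 2, cond, True)
  _ <- cas (unCCASRef r) (Right 1) (Left d)
  -- Two helpers complete the same invocation; the second CAS fails harmlessly.
  ccasHelp d
  ccasHelp d
  Right v <- readIORef (unCCASRef r)
  print v -- 2
```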
The setup for MCAS is much the same as CCAS. The main additional complexity comes from the fact that we need to simultaneously “acquire” multiple references. This is handled by a two-phase approach. In the first phase, we attempt to “acquire” each reference. We proceed to the second phase once we’ve either seen that the MCAS is going to fail, or we have successfully “acquired” each reference. In the second phase, we either reset all the “acquired” references to their old values if the MCAS failed or to their new values if it succeeded. The MCAS will be considered to have occurred atomically at the point we record this decision, via a CAS, i.e. between the two phases.
data MCASStatus = UNDECIDED | FAILED | SUCCESSFUL deriving ( Eq )
data MCASDescriptor' a = MCASDescriptor [(MCASRef a, a, a)] (IORef MCASStatus)
type MCASDescriptor a = IORef (MCASDescriptor' a)
newtype MCASRef a = MCASRef { unMCASRef :: CCASRef (Either (MCASDescriptor a) a) MCASStatus }
  deriving ( Eq, Ord )
As with CCAS, an MCASRef is a reference, in this case a CCASRef, that either holds a value or a descriptor. The descriptor holds the arguments of mcas, as with ccas, but it additionally holds a status reference. This status reference will be used as the condition reference of the CCAS. In particular, as we’ll see, we will only perform ccas’s when the status is UNDECIDED.
newMCASRef :: a -> IO (MCASRef a)
newMCASRef v = MCASRef <$> newCCASRef (Right v)

readMCASRef :: (Eq a) => MCASRef a -> IO a
readMCASRef ref = do
  x <- readCCASRef (unMCASRef ref)
  case x of
    Right v -> return v
    Left d -> do
      mcasHelp d
      readMCASRef ref
There’s nothing to say about the reference interface functions. They are essentially identical to the CCAS ones for the same reasons, only with CCASRefs instead of IORefs.
mcas :: (Eq a) => [(MCASRef a, a, a)] -> IO Bool
mcas entries = do
  status <- newIORef UNDECIDED
  d <- newIORef (MCASDescriptor (sortOn (\(ref, _, _) -> ref) entries) status)
  mcasHelp d
The mcas function is fairly straightforward. It allocates a status reference and a descriptor and delegates most of the work to mcasHelp. The one subtlety is the sort, which is critical to ensuring termination.
mcasHelp :: (Eq a) => MCASDescriptor a -> IO Bool
mcasHelp d = do
  MCASDescriptor entries statusRef <- readIORef d
  let phase1 [] = do
        _ <- cas statusRef UNDECIDED SUCCESSFUL
        phase2
      phase1 ((MCASRef ref, old, new):es) = tryAcquire ref old new es

      tryAcquire ref old new es = do
        _ <- ccas ref (Right old) (Left d) statusRef UNDECIDED
        v <- tryReadCCASRef ref
        case v of
          Just (Left d') | d == d' -> phase1 es -- successful acquisition
                         | otherwise -> do -- help someone else
                             mcasHelp d'
                             tryAcquire ref old new es
          Just (Right curr) | curr == old -> do
            status <- readIORef statusRef
            if status == UNDECIDED then do
              -- failed to acquire but could still succeed
              tryAcquire ref old new es
            else do
              phase2
          _ -> do -- failed MCAS
            _ <- cas statusRef UNDECIDED FAILED
            phase2

      phase2 = do
        status <- readIORef statusRef
        let succeeded = status == SUCCESSFUL
        forM_ entries $ \(MCASRef ref, old, new) -> do
          casCCASRef ref (Left d) (Right (if succeeded then new else old))
        return succeeded
  phase1 entries
phase1 attempts to “acquire” each MCASRef by using tryAcquire, which will move phase1 to the next entry each time it succeeds. Therefore, if phase1 reaches the end of the list and there was no interference, we will have successfully “acquired” all references. We record this with a CAS against statusRef. If this CAS succeeds, then the MCAS will be considered successful and conceptually to have occurred at this point. If the CAS fails, then some other process has already completed this MCAS, whether in success or failure. We then move to phase2.
tryAcquire can also detect that the MCAS should fail. In this case, we immediately attempt to record this fact via a CAS into statusRef. As with the successful case, this CAS succeeding marks the conceptual instant that the MCAS completes. As before, we then move on to phase2.
We never enter phase2 without statusRef being set to either SUCCESSFUL or FAILED. phase2 is completely straightforward. We simply set each “acquired” reference to either the new or old value depending on whether the MCAS succeeded or not. The casCCASRef will fail if either we never got around to “acquiring” a particular reference (in the case of MCAS failure), or if a reference was written to since it was “acquired”. Since such writes conceptually occurred after the MCAS completed, we do not want to overwrite them.
During tryAcquire, there are a few cases that lead to retrying. First, if we find that a reference has already been “acquired” by some other MCAS operation, we recursively help it. Here, the sorting of the references is important to ensure that any MCAS operation we help will never try to help us back. It’s easy to see that without the sorting, if two MCAS operations on the same two references each acquired one of the references, the (concurrent) recursive calls to mcasHelp would become infinite loops. With a total ordering on references, each recursive call to mcasHelp will be at a greater reference and thus must eventually terminate. The other case for tryAcquire is that the expected value is written after the ccas but before the tryReadCCASRef. In this case, we try again unless the status has already been decided. It might seem like this is just an optimization, and that we could instead treat this as the MCAS failing. However, the intervening write may have written the value that was there before the ccas, meaning that there was never a point at which the MCAS could have failed.
References are only “acquired” at the ccas in phase1. Once the status has been decided, no references may be “acquired” any longer. Since it’s impossible to enter phase2 without deciding the status, once one process enters phase2, no processes are capable of “acquiring” references. This makes phase2 idempotent and, indeed, each CAS in phase2 is independently idempotent. Overlapping executions of phase1 are fine essentially because each ccas is idempotent and the statusRef can change at most once.
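To check that the moving parts fit together, here is a self-contained, single-threaded sketch assembling the pieces above. Two things are faked, matching caveats flagged earlier in the post: the atomic cas is emulated with atomicModifyIORef', and the sort in mcas is omitted because IORef genuinely has no Ord instance (the sort only matters for termination when operations overlap):

```haskell
import Control.Monad (forM_)
import Data.IORef

-- Sequential stand-in for an atomic compare-and-swap.
cas :: (Eq a) => IORef a -> a -> a -> IO a
cas ref old new = atomicModifyIORef' ref $ \c ->
  if c == old then (new, c) else (c, c)

type CCASDescriptor a c = IORef (CCASRef a c, a, a, IORef c, c)
newtype CCASRef a c = CCASRef { unCCASRef :: IORef (Either (CCASDescriptor a c) a) } deriving ( Eq )

newCCASRef :: a -> IO (CCASRef a c)
newCCASRef v = CCASRef <$> newIORef (Right v)

readCCASRef :: (Eq a, Eq c) => CCASRef a c -> IO a
readCCASRef ref = do
  x <- readIORef (unCCASRef ref)
  case x of
    Right v -> return v
    Left d -> ccasHelp d >> readCCASRef ref

casCCASRef :: (Eq a) => CCASRef a c -> a -> a -> IO Bool
casCCASRef (CCASRef ref) old new = (== Right old) <$> cas ref (Right old) (Right new)

tryReadCCASRef :: CCASRef a c -> IO (Maybe a)
tryReadCCASRef ref = either (const Nothing) Just <$> readIORef (unCCASRef ref)

ccas :: (Eq a, Eq c) => CCASRef a c -> a -> a -> IORef c -> c -> IO ()
ccas ref old new condRef check = do
  d <- newIORef (ref, old, new, condRef, check)
  let loop = do
        v <- cas (unCCASRef ref) (Right old) (Left d)
        case v of
          Left d' -> ccasHelp d' >> loop          -- descriptor already there
          Right curr | curr == old -> ccasHelp d  -- we succeeded
                     | otherwise -> return ()     -- we failed
  loop

ccasHelp :: (Eq a, Eq c) => CCASDescriptor a c -> IO ()
ccasHelp d = do
  (CCASRef ref, old, new, condRef, check) <- readIORef d
  cond <- readIORef condRef
  _ <- cas ref (Left d) (Right $! if cond == check then new else old)
  return ()

data MCASStatus = UNDECIDED | FAILED | SUCCESSFUL deriving ( Eq )
data MCASDescriptor' a = MCASDescriptor [(MCASRef a, a, a)] (IORef MCASStatus)
type MCASDescriptor a = IORef (MCASDescriptor' a)
newtype MCASRef a = MCASRef { unMCASRef :: CCASRef (Either (MCASDescriptor a) a) MCASStatus } deriving ( Eq )

newMCASRef :: a -> IO (MCASRef a)
newMCASRef v = MCASRef <$> newCCASRef (Right v)

readMCASRef :: (Eq a) => MCASRef a -> IO a
readMCASRef ref = do
  x <- readCCASRef (unMCASRef ref)
  case x of
    Right v -> return v
    Left d -> mcasHelp d >> readMCASRef ref

mcas :: (Eq a) => [(MCASRef a, a, a)] -> IO Bool
mcas entries = do
  status <- newIORef UNDECIDED
  -- Sort omitted: IORef has no Ord instance, and the sort only matters
  -- for termination when operations actually overlap.
  d <- newIORef (MCASDescriptor entries status)
  mcasHelp d

mcasHelp :: (Eq a) => MCASDescriptor a -> IO Bool
mcasHelp d = do
  MCASDescriptor entries statusRef <- readIORef d
  let phase1 [] = cas statusRef UNDECIDED SUCCESSFUL >> phase2
      phase1 ((MCASRef ref, old, new):es) = tryAcquire ref old new es
      tryAcquire ref old new es = do
        ccas ref (Right old) (Left d) statusRef UNDECIDED
        v <- tryReadCCASRef ref
        case v of
          Just (Left d') | d == d' -> phase1 es
                         | otherwise -> mcasHelp d' >> tryAcquire ref old new es
          Just (Right curr) | curr == old -> do
            status <- readIORef statusRef
            if status == UNDECIDED then tryAcquire ref old new es else phase2
          _ -> cas statusRef UNDECIDED FAILED >> phase2
      phase2 = do
        status <- readIORef statusRef
        let succeeded = status == SUCCESSFUL
        forM_ entries $ \(MCASRef ref, old, new) ->
          casCCASRef ref (Left d) (Right (if succeeded then new else old))
        return succeeded
  phase1 entries

main :: IO ()
main = do
  a <- newMCASRef (10 :: Int)
  b <- newMCASRef (20 :: Int)
  ok1 <- mcas [(a, 10, 11), (b, 20, 21)] -- both expectations hold: succeeds
  ok2 <- mcas [(a, 10, 99), (b, 21, 22)] -- a no longer holds 10: fails
  va <- readMCASRef a
  vb <- readMCASRef b
  print (ok1, ok2, va, vb) -- (True,False,11,21)
```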
Let’s see how an mcas interacts with other operations from the perspective of atomicity. If we attempt to read a reference via readMCASRef which is included in the list of references of an ongoing mcas, there are two possibilities. Either that reference has not yet been “acquired” by the mcas, in which case the read will occur conceptually before the MCAS, or it has been “acquired”, in which case the read will help the MCAS to completion and then try again after the MCAS. The story is similar for overlapping mcas’s. The mcas which “acquires” the least reference in their intersection will conceptually complete first, either because it literally finishes before the second mcas notices or because the second mcas will help it to completion. Writes are only slightly different.
It is important to note that these operations are NOT atomic with respect to some other reasonable operations. Most notably, they are not atomic with respect to blind writes. It is easy to construct a scenario where two blind writes happen in sequence but the first appears to happen after the mcas and the second before. Except for initialization, I don’t believe there are any blind writes to references involved in an mcas in Fraser’s thesis. Fraser’s thesis does, however, contain cas operations directly against these references. These are also not atomic, for exactly the same reason casCCASRef isn’t. That said, Fraser’s uses of cas against MCASRefs are safe, because in each case they just retry until success.
While I’ve gone into a good amount of detail here, I’ve mainly wanted to illustrate the concept of recursive helping. It’s a key concept for lock-free and wait-free algorithm designs, but it also may be a useful idea to have in mind when designing concurrent code even if you aren’t explicitly trying to achieve a lock-free guarantee.
MCAS allows you to perform transactions involving multiple updates atomically. STM additionally allows you to perform transactions involving multiple reads and updates atomically.↩︎
While I haven’t attempted it to see if it works out, it seems like you could make a generic “recursive helping” framework by storing an IO action instead. The “descriptors” have the flavor of defunctionalized continuations.↩︎