Non-Blocking Algorithms

Disadvantages of Blocking Algorithms

  • Performance: Lock contention reduces performance under high concurrency.
  • Deadlocks: Threads can deadlock when they wait on each other’s locks.
  • Resource Utilization: Blocked threads sit idle, so system resources are used inefficiently.

Advantages of Non-Blocking Algorithms

  • Scalability: Enable concurrent operations without locks.
  • Deadlock-Free: Eliminate the risk of deadlocks, since no locks are held.
  • Efficiency: Synchronize at a finer granularity (a single variable rather than a whole critical section).

Atomic Variables

Role of Atomic Variables

  • Atomic Operations: Provide low-level atomic read-modify-write operations such as compare-and-set.
  • Data Integrity: Keep a shared variable consistent without locks.
  • Utility:
    • Counters and statistics.
    • Concurrent data structure implementation.

A Counter Implementation Using AtomicInteger

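import java.util.concurrent.atomic.AtomicInteger;

public class AtomicCounter {
    private final AtomicInteger value = new AtomicInteger(0);

    // Atomically adds 1; no lock is needed.
    void increment() {
        value.incrementAndGet();
    }

    // Returns the current value.
    int getValue() {
        return value.get();
    }
}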
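
To see the counter under concurrent use, here is a minimal sketch that increments a shared AtomicInteger from several threads. The class name CounterDemo, the method countWith, and the thread and iteration counts are illustrative assumptions, not part of the original material.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class: several threads increment one shared counter.
class CounterDemo {
    // Starts `threads` threads that each increment the counter `perThread`
    // times, waits for all of them, and returns the final value.
    static int countWith(int threads, int perThread) {
        AtomicInteger value = new AtomicInteger(0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    value.incrementAndGet(); // atomic read-modify-write
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join(); // wait for the worker to finish
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return value.get();
    }
}
```

Because every increment is atomic, no update is lost: the result should always equal threads * perThread. With a plain int field and value++ the same experiment would typically lose updates.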

Typical Operations

  • get(), set(int newValue), getAndSet(int newValue)
  • compareAndSet(int expect, int newValue)
  • getAndIncrement(), getAndDecrement(), getAndAdd(int delta)
  • getAndUpdate(IntUnaryOperator lambda)
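
The operations above can be combined into the usual compare-and-set retry loop. The sketch below keeps a running maximum in two ways: with an explicit compareAndSet loop and with getAndUpdate. The class AtomicOpsDemo and its method names are hypothetical, chosen only for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper class showing two ways to keep an atomic maximum.
class AtomicOpsDemo {
    // Raises `target` to `candidate` if candidate is larger.
    // Returns the value stored in `target` after the call.
    static int updateMax(AtomicInteger target, int candidate) {
        while (true) {
            int current = target.get();
            if (candidate <= current) {
                return current; // already large enough, nothing to write
            }
            if (target.compareAndSet(current, candidate)) {
                return candidate; // our write won
            }
            // Another thread changed the value between get() and
            // compareAndSet(); read again and retry.
        }
    }

    // Same effect using getAndUpdate, which runs the retry loop internally.
    // Returns the value that was stored before the update.
    static int updateMaxWithLambda(AtomicInteger target, int candidate) {
        return target.getAndUpdate(cur -> Math.max(cur, candidate));
    }
}
```

The function passed to getAndUpdate may be re-applied when a retry is needed, so it should be free of side effects.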

In Rust

use std::sync::atomic::{AtomicU64, Ordering};

struct Counter { value: AtomicU64 }

impl Counter {
    // Initialize a new counter
    fn new() -> Counter { Counter { value: AtomicU64::new(0) } }

    // Increment the counter by 1
    fn increment(&self) {
        // Relaxed ordering is often sufficient for simple counters.
        self.value.fetch_add(1, Ordering::Relaxed);
    }

    // Get the current value of the counter (u64, matching AtomicU64)
    fn get(&self) -> u64 { self.value.load(Ordering::Relaxed) }
}

For more on the Ordering parameter, see The Rustonomicon.

Typical Operations

  • new(val: i32) -> AtomicI32
  • load(order: Ordering) -> i32, store(val: i32, order: Ordering)
  • compare_exchange(expected: i32, new: i32, ...)
  • fetch_add(val: i32, order: Ordering) -> i32, fetch_sub(val: i32, order: Ordering) -> i32
  • fetch_update<F>(set_order: Ordering, fetch_order: Ordering, lambda: F)

Non-Blocking Concurrent Data Structures

A Stack

class Stack<E> {
    class Node<E>(val item: E, var next: Node<E>? = null)

    private var top: Node<E>? = null

    fun push(item: E) {
        val newHead = Node(item)
        newHead.next = top
        top = newHead
    }
    
    // fun pop(): E? { ... }
}

A Non-Blocking Concurrent Stack

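import java.util.concurrent.atomic.AtomicReference

class ConcurrentStack<E> {
    class Node<E>(val item: E, var next: Node<E>? = null)

    // The AtomicReference itself never changes, only the node it points to
    private val top = AtomicReference<Node<E>?>()

    fun push(item: E) {
        val newHead = Node(item)
        var oldHead: Node<E>?
        do {
            oldHead = top.get()
            newHead.next = oldHead
            // Retry if another thread changed top since we read it
        } while (!top.compareAndSet(oldHead, newHead))
    }
}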

Pop Implementation

fun pop(): E? {
    var oldHead: Node<E>? = top.get()

    // Retry until the stack is empty or the CAS moves top to the next node
    while (oldHead != null && !top.compareAndSet(oldHead, oldHead.next))
        oldHead = top.get()

    return oldHead?.item
}
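
For readers who prefer Java, the push and pop loops above can be sketched as a single class and exercised from several threads. The class name LockFreeStack and the helper pushConcurrently are assumptions made for this illustration; the design is commonly known as a Treiber stack.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical Java rendering of the CAS-based stack shown above.
class LockFreeStack<E> {
    private static final class Node<E> {
        final E item;
        Node<E> next;
        Node(E item) { this.item = item; }
    }

    private final AtomicReference<Node<E>> top = new AtomicReference<>();

    void push(E item) {
        Node<E> newHead = new Node<>(item);
        Node<E> oldHead;
        do {
            oldHead = top.get();
            newHead.next = oldHead;
            // Retry if another thread changed top since we read it.
        } while (!top.compareAndSet(oldHead, newHead));
    }

    // Returns the most recently pushed item, or null if the stack is empty.
    E pop() {
        Node<E> oldHead;
        do {
            oldHead = top.get();
            if (oldHead == null) {
                return null; // empty stack
            }
        } while (!top.compareAndSet(oldHead, oldHead.next));
        return oldHead.item;
    }

    // Illustrative stress helper: pushes from several threads at once,
    // then counts how many items can be popped afterwards.
    static int pushConcurrently(int threads, int perThread) {
        LockFreeStack<Integer> stack = new LockFreeStack<>();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    stack.push(j);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        int count = 0;
        while (stack.pop() != null) {
            count++;
        }
        return count;
    }
}
```

If no push is lost, the number of items popped at the end equals threads * perThread.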

A Non-Blocking Queue

A typical Queue implementation using a linked list

Implementation (Non-Concurrent Version)

class Queue<E> {
    class Node<E>(val item: E?, var next: Node<E>? = null)

    val dummy = Node<E>(null)
    var head = dummy
    var tail = dummy

    fun enqueue(item: E) {
        val newNode = Node(item)
        tail.next = newNode
        tail = newNode
    }
    fun dequeue(): E? {
        val headNext = head.next
        if (headNext == null) return null
        head = headNext
        return head.item
    }
}

The Problem for a concurrent algorithm

  • Two pointers refer to the node at the tail (Node 2)
    • The next pointer of the second to last element (Node 1),
    • The tail pointer
  • To insert a new element both should be updated atomically

Steps during insertion

Intermediate State

val newNode = Node(item)
tail.next = newNode

"At rest" State

tail = newNode

The Trick!

  • Given Threads A and B
  • If B finds the structure in the middle of an update by A, B can complete the update.
  • B “helps” A to finish A’s operation
  • When A tries to finish its operation, it finds that B has already done the job.

The implementation

First - Replace the pointers by AtomicReferences

class Queue<E> {
    class Node<E>(val item: E?,
                  val next: AtomicReference<Node<E>?> = AtomicReference(null))

    private val dummy = Node<E>(null)
    private val head = AtomicReference(dummy)
    private val tail = AtomicReference(dummy)

    fun enqueue(item: E) {
        val newNode = Node(item)
        val curTail = tail.get()
        curTail.next.compareAndSet(null, newNode) // What happens if this fails ??
        tail.compareAndSet(curTail, newNode)
    }
}

Add a loop to try until success

val newNode = Node(item)
while (true) {
    val curTail = tail.get()

    if (curTail.next.compareAndSet(null, newNode)) {
        tail.compareAndSet(curTail, newNode) // What happens if this fails ???
        return
    }
}

Complete the operation if tail is already updated

val newNode = Node(item)
while (true) {
    val curTail = tail.get()
    val tailNext = curTail.next.get()

    if (tailNext != null) {
        // Queue in intermediate state, advance tail
        tail.compareAndSet(curTail, tailNext)
    }
    else if (curTail.next.compareAndSet(null, newNode)) {
        tail.compareAndSet(curTail, newNode)
        return
    }
}

Last detail

val curTail = tail.get()

// What happens if the thread is preempted here?

val tailNext = curTail.next.get()

// curTail may no longer be the tail, so tailNext may not be the tail's successor!

Ensure that I have the right tail

while (true) {
    val curTail = tail.get()
    val tailNext = curTail.next.get()
    // Check if the tail has not moved
    if (curTail == tail.get()) { 
        // Previous code
    }
    // Try again 
}
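
Putting the steps together, the whole queue might look like the Java sketch below. The enqueue method follows the slides above; the dequeue method is an assumed addition (the slides do not show a concurrent dequeue) modelled on the well-known Michael-Scott queue. The names LockFreeQueue and countAfterConcurrentEnqueue are hypothetical.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical assembled version of the non-blocking queue.
class LockFreeQueue<E> {
    private static final class Node<E> {
        final E item;
        final AtomicReference<Node<E>> next = new AtomicReference<>();
        Node(E item) { this.item = item; }
    }

    private final AtomicReference<Node<E>> head;
    private final AtomicReference<Node<E>> tail;

    LockFreeQueue() {
        Node<E> dummy = new Node<>(null); // sentinel node
        head = new AtomicReference<>(dummy);
        tail = new AtomicReference<>(dummy);
    }

    void enqueue(E item) {
        Node<E> newNode = new Node<>(item);
        while (true) {
            Node<E> curTail = tail.get();
            Node<E> tailNext = curTail.next.get();
            if (curTail != tail.get()) {
                continue; // tail moved while we were reading; start over
            }
            if (tailNext != null) {
                // Intermediate state: help the other thread advance tail.
                tail.compareAndSet(curTail, tailNext);
            } else if (curTail.next.compareAndSet(null, newNode)) {
                // Linked in; this CAS may fail if someone already helped us.
                tail.compareAndSet(curTail, newNode);
                return;
            }
        }
    }

    // Assumed addition: returns the oldest item, or null if the queue is empty.
    E dequeue() {
        while (true) {
            Node<E> curHead = head.get();
            Node<E> curTail = tail.get();
            Node<E> first = curHead.next.get();
            if (curHead != head.get()) {
                continue; // head moved while we were reading; start over
            }
            if (first == null) {
                return null; // only the sentinel is left
            }
            if (curHead == curTail) {
                // Tail is lagging behind; help advance it, then retry.
                tail.compareAndSet(curTail, first);
                continue;
            }
            E item = first.item;
            if (head.compareAndSet(curHead, first)) {
                return item; // `first` becomes the new sentinel
            }
        }
    }

    // Illustrative stress helper: enqueue from several threads, then drain.
    static int countAfterConcurrentEnqueue(int threads, int perThread) {
        LockFreeQueue<Integer> queue = new LockFreeQueue<>();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    queue.enqueue(j);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        int count = 0;
        while (queue.dequeue() != null) {
            count++;
        }
        return count;
    }
}
```

Note that both operations use the same helping rule: any thread that sees a non-null next on the tail node advances tail before doing its own work.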

The ABA Problem in Concurrency

  • What is the ABA Problem?
    • A concurrency issue where a value changes from A to B and back to A.
    • Concurrent operations may not detect this change, leading to incorrect assumptions.
  • Why is it a Problem?
    • Operations like compare-and-swap (CAS) can be tricked into thinking no change occurred.
    • Potentially causing incorrect program behavior.
    • For example, a thread pops an item from a stack and its CAS succeeds because the top looks unchanged, even though other threads modified the stack in between.

Solutions and Mitigations

  • Versioning: Attach a version number or timestamp to the data, so that ABA becomes A1-B2-A3.
  • In Java, the AtomicStampedReference class does this.
  • Example: ref.compareAndSet(currentValue, newValue, currentStamp, newStamp);
  • Note that in Rust the problem in the example above should not happen. Why?
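
The effect of versioning can be sketched in a few lines of Java. The example below simulates the interleaving in a single thread for determinism: one "slow" observer reads A, the value then goes A to B and back to A, and the observer finally attempts its CAS. The class AbaDemo and its method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.atomic.AtomicStampedReference;

// Hypothetical demo: the same A -> B -> A history seen with and without stamps.
class AbaDemo {
    // With a plain AtomicReference the stale CAS cannot see the history.
    static boolean staleCasSucceedsPlain() {
        String a = "A", b = "B";
        AtomicReference<String> ref = new AtomicReference<>(a);
        String seen = ref.get();             // slow thread reads A
        ref.compareAndSet(a, b);             // other thread: A -> B
        ref.compareAndSet(b, a);             // other thread: B -> A
        return ref.compareAndSet(seen, "C"); // reference matches again
    }

    // With a stamp, every update also bumps a version number.
    static boolean staleCasSucceedsStamped() {
        String a = "A", b = "B";
        AtomicStampedReference<String> ref = new AtomicStampedReference<>(a, 0);
        int[] stampHolder = new int[1];
        String seen = ref.get(stampHolder);  // slow thread reads A with stamp 0
        int seenStamp = stampHolder[0];
        ref.compareAndSet(a, b, 0, 1);       // other thread: A0 -> B1
        ref.compareAndSet(b, a, 1, 2);       // other thread: B1 -> A2
        // Reference matches, but the stamp is now 2 instead of 0.
        return ref.compareAndSet(seen, "C", seenStamp, seenStamp + 1);
    }
}
```

In this sketch the plain CAS succeeds even though the value changed twice, while the stamped CAS is rejected because the stamp no longer matches.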

Pros & Cons of Non-Blocking Algorithms

Aspect      | Pros                                   | Cons
Performance | High under low contention.             | Can degrade under high contention.
Scalability | Improved due to no blocking.           | Limited by contention and retry overhead.
Deadlock    | Avoided entirely.                      | Livelocks can occur.
Simplicity  | Straightforward for simple operations. | Complex operations are hard to design.

High contention occurs when many threads frequently attempt to access and modify the same shared resource at the same time.

Pros & Cons of Non-Blocking Algorithms (continued)

Aspect          | Pros                                      | Cons
System Overhead | Lower, with no context switching.         | Increased by busy-waiting under contention.
Recovery        | No inconsistent states on thread failure. | Complex recovery for consistency.
Fairness        | None inherent.                            | Starvation is possible; fairness is hard to ensure.
Memory Model    | Can be efficient with modern CPUs.        | Requires deep understanding to avoid issues.