Concurrent Queues and Stacks
Concurrent Queues and Stacks Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit
The Five-Fold Path
Art of Multiprocessor Programming 2 2 The Five-Fold Path Coarse-grained locking Fine-grained locking Optimistic synchronization Lazy synchronization Lock-free synchronization
Another Fundamental Problem
Art of Multiprocessor Programming 3 3 Another Fundamental Problem We told you about Sets implemented by linked lists Next: queues Next: stacks
Queues & Stacks
Art of Multiprocessor Programming 4 4 Queues & Stacks pool of items
Queues
Art of Multiprocessor Programming 5 5 Queues deq()/ enq( ) Total order First in First out
Stacks
Art of Multiprocessor Programming 6 6 Stacks pop()/ push( ) Total order Last in First out
Bounded
Art of Multiprocessor Programming 7 7 Bounded Fixed capacity Good when resources an issue
Unbounded
Art of Multiprocessor Programming 8 8 Unbounded Unlimited capacity Often more convenient
Blocking
Art of Multiprocessor Programming 9 Blocking zzz … Block on attempt to remove from empty stack or queue
Blocking
Art of Multiprocessor Programming 10 Blocking zzz … Block on attempt to add to full bounded stack or queue
Non-Blocking
Art of Multiprocessor Programming 11 Non-Blocking Ouch! Throw exception on attempt to remove from empty stack or queue
This Lecture
Art of Multiprocessor Programming 12 12 This Lecture Queue Bounded, blocking, lock-based Unbounded, non-blocking, lock-free Stack Unbounded, non-blocking lock-free Elimination-backoff algorithm
Queue: Concurrency
Art of Multiprocessor Programming 13 13 Queue: Concurrency enq(x) y=deq() enq() and deq() work at different ends of the object tail head
Concurrency
Art of Multiprocessor Programming 14 14 Concurrency enq(x) Challenge: what if the queue is empty or full? y=deq() tail head
Bounded Queue
Art of Multiprocessor Programming 15 15 Bounded Queue Sentinel head tail
Bounded Queue
Art of Multiprocessor Programming 16 16 Bounded Queue head tail First actual item
Bounded Queue
Art of Multiprocessor Programming 17 17 Bounded Queue head tail Lock out other deq() calls deqLock
Bounded Queue
Art of Multiprocessor Programming 18 18 Bounded Queue head tail Lock out other enq() calls deqLock enqLock
Not Done Yet
Art of Multiprocessor Programming 19 19 Not Done Yet head tail deqLock enqLock Need to tell whether queue is full or empty
Not Done Yet
Art of Multiprocessor Programming 20 20 Not Done Yet head tail deqLock enqLock Max size is 8 items size 1
Not Done Yet
Art of Multiprocessor Programming 21 21 Not Done Yet head tail deqLock enqLock Incremented by enq() Decremented by deq() size 1
Enqueuer
Art of Multiprocessor Programming 22 22 Enqueuer head tail deqLock enqLock size 1 Lock enqLock
Enqueuer
Art of Multiprocessor Programming 23 23 Enqueuer head tail deqLock enqLock size 1 Read size OK
Enqueuer
Art of Multiprocessor Programming 24 24 Enqueuer head tail deqLock enqLock size 1 No need to lock tail
Enqueuer
Art of Multiprocessor Programming 25 25 Enqueuer head tail deqLock enqLock size 1 Enqueue Node
Enqueuer
Art of Multiprocessor Programming 26 26 Enqueuer head tail deqLock enqLock size 1 2 getAndincrement()
Enqueuer
Art of Multiprocessor Programming 27 27 Enqueuer head tail deqLock enqLock size 8 Release lock 2
Enqueuer
Art of Multiprocessor Programming 28 28 Enqueuer head tail deqLock enqLock size 2 If queue was empty, notify waiting dequeuers
Unsuccesful Enqueuer
Art of Multiprocessor Programming 29 29 Unsuccesful Enqueuer head tail deqLock enqLock size 8 Uh-oh Read size
Dequeuer
Art of Multiprocessor Programming 30 30 Dequeuer head tail deqLock enqLock size 2 Lock deqLock
Dequeuer
Art of Multiprocessor Programming 31 31 Dequeuer head tail deqLock enqLock size 2 Read sentinel’s next field OK
Dequeuer
Art of Multiprocessor Programming 32 32 Dequeuer head tail deqLock enqLock size 2 Read value
Dequeuer
Art of Multiprocessor Programming 33 33 Dequeuer head tail deqLock enqLock size 2 Make first Node new sentinel
Dequeuer
Art of Multiprocessor Programming 34 34 Dequeuer head tail deqLock enqLock size 1 Decrement size
Dequeuer
Art of Multiprocessor Programming 35 35 Dequeuer head tail deqLock enqLock size 1 Release deqLock
Unsuccesful Dequeuer
Art of Multiprocessor Programming 36 36 Unsuccesful Dequeuer head tail deqLock enqLock size 0 Read sentinel’s next field uh-oh
Bounded Queue
Art of Multiprocessor Programming 37 37 Bounded Queue public class BoundedQueue<T> { ReentrantLock enqLock, deqLock; Condition notEmptyCondition, notFullCondition; AtomicInteger size; Node head; Node tail; int capacity; enqLock = new ReentrantLock(); notFullCondition = enqLock.newCondition(); deqLock = new ReentrantLock(); notEmptyCondition = deqLock.newCondition(); }
Bounded Queue
Art of Multiprocessor Programming 38 38 Bounded Queue public class BoundedQueue<T> { ReentrantLock enqLock, deqLock; Condition notEmptyCondition, notFullCondition; AtomicInteger size; Node head; Node tail; int capacity; enqLock = new ReentrantLock(); notFullCondition = enqLock.newCondition(); deqLock = new ReentrantLock(); notEmptyCondition = deqLock.newCondition(); } enq & deq locks
Digression: Monitor Locks
Art of Multiprocessor Programming 39 39 Digression: Monitor Locks Java synchronized objects and ReentrantLocks are monitors Allow blocking on a condition rather than spinning Threads: acquire and release lock wait on a condition
The Java Lock Interface
Art of Multiprocessor Programming 40 40 public interface Lock { void lock(); void lockInterruptibly() throws InterruptedException; boolean tryLock(); boolean tryLock(long time, TimeUnit unit); Condition newCondition(); void unlock; } The Java Lock Interface Acquire lock
The Java Lock Interface
Art of Multiprocessor Programming 41 41 public interface Lock { void lock(); void lockInterruptibly() throws InterruptedException; boolean tryLock(); boolean tryLock(long time, TimeUnit unit); Condition newCondition(); void unlock; } The Java Lock Interface Release lock
The Java Lock Interface
Art of Multiprocessor Programming 42 42 public interface Lock { void lock(); void lockInterruptibly() throws InterruptedException; boolean tryLock(); boolean tryLock( long time, TimeUnit unit); Condition newCondition(); void unlock; } The Java Lock Interface Try for lock, but not too hard
The Java Lock Interface
Art of Multiprocessor Programming 43 43 public interface Lock { void lock(); void lockInterruptibly() throws InterruptedException; boolean tryLock(); boolean tryLock(long time, TimeUnit unit); Condition newCondition(); void unlock; } The Java Lock Interface Create condition to wait on
The Java Lock Interface
Art of Multiprocessor Programming 44 44 The Java Lock Interface public interface Lock { void lock(); void lockInterruptibly() throws InterruptedException; boolean tryLock(); boolean tryLock(long time, TimeUnit unit); Condition newCondition(); void unlock; } Never mind what this method does
Lock Conditions
Art of Multiprocessor Programming 45 45 Lock Conditions public interface Condition { void await(); boolean await( long time, TimeUnit unit); void signal(); void signalAll(); }
Lock Conditions
Art of Multiprocessor Programming 46 46 public interface Condition { void await(); boolean await( long time, TimeUnit unit); void signal(); void signalAll(); } Lock Conditions Release lock and wait on condition
Lock Conditions
Art of Multiprocessor Programming 47 47 public interface Condition { void await(); boolean await(long time, TimeUnit unit); void signal(); void signalAll(); } Lock Conditions Wake up one waiting thread
Lock Conditions
Art of Multiprocessor Programming 48 48 public interface Condition { void await(); boolean await(long time, TimeUnit unit); void signal(); void signalAll(); } Lock Conditions Wake up all waiting threads
Await
Art of Multiprocessor Programming 49 49 Await Releases lock associated with q Sleeps (gives up processor) Awakens (resumes running) Reacquires lock & returns q.await()
Signal
Art of Multiprocessor Programming 50 50 Signal Awakens one waiting thread Which will reacquire lock q.signal();
Signal All
Art of Multiprocessor Programming 51 51 Signal All Awakens all waiting threads Which will each reacquire lock q.signalAll();
A Monitor Lock
Art of Multiprocessor Programming 52 52 A Monitor Lock Critical Section waiting room lock () unlock ()
Unsuccessful Deq
Art of Multiprocessor Programming 53 53 Unsuccessful Deq Critical Section waiting room lock () await() deq () Oh no, empty !
Another One
Art of Multiprocessor Programming 54 54 Another One Critical Section waiting room lock () await() deq () Oh no, empty !
Enqueuer to the Rescue
Art of Multiprocessor Programming 55 55 Enqueuer to the Rescue Critical Section waiting room lock () signalAll () enq ( ) unlock () Yawn! Yawn!
Monitor Signalling
Art of Multiprocessor Programming 56 56 Yawn! Monitor Signalling Critical Section waiting room Yawn! Awakened thread might still lose lock to outside contender…
Dequeuers Signalled
Art of Multiprocessor Programming 57 57 Dequeuers Signalled Critical Section waiting room Found it Yawn!
Dequeuers Signaled
Art of Multiprocessor Programming 58 58 Yawn! Dequeuers Signaled Critical Section waiting room Still empty!
Dollar Short + Day Late
Art of Multiprocessor Programming 59 59 Dollar Short + Day Late Critical Section waiting room
Java Synchronized Methods
Art of Multiprocessor Programming 60 60 public class Queue<T> { int head = 0, tail = 0; T[QSIZE] items; public synchronized T deq() { while (tail – head == 0) wait(); T result = items[head % QSIZE]; head++; notifyAll(); return result; } }} Java Synchronized Methods
Java Synchronized Methods
Art of Multiprocessor Programming 61 61 public class Queue<T> { int head = 0, tail = 0; T[QSIZE] items; public synchronized T deq() { while (tail – head == 0) wait(); T result = items[head % QSIZE]; head++; notifyAll(); return result; } }} Java Synchronized Methods Each object has an implicit lock with an implicit condition
Java Synchronized Methods
Art of Multiprocessor Programming 62 62 public class Queue<T> { int head = 0, tail = 0; T[QSIZE] items; public synchronized T deq() { while (tail – head == 0) wait(); T result = items[head % QSIZE]; head++; notifyAll(); return result; } }} Java Synchronized Methods Lock on entry, unlock on return
Java Synchronized Methods
Art of Multiprocessor Programming 63 63 public class Queue<T> { int head = 0, tail = 0; T[QSIZE] items; public synchronized T deq() { while (tail – head == 0) wait(); T result = items[head % QSIZE]; head++; this.notifyAll(); return result; } }} Java Synchronized Methods Wait on implicit condition
Java Synchronized Methods
Art of Multiprocessor Programming 64 64 public class Queue<T> { int head = 0, tail = 0; T[QSIZE] items; public synchronized T deq() { while (tail – head == 0) this.wait(); T result = items[head % QSIZE]; head++; notifyAll(); return result; } }} Java Synchronized Methods Signal all threads waiting on condition
(Pop!) The Bounded Queue
Art of Multiprocessor Programming 65 65 (Pop!) The Bounded Queue public class BoundedQueue<T> { ReentrantLock enqLock, deqLock; Condition notEmptyCondition, notFullCondition; AtomicInteger size; Node head; Node tail; int capacity; enqLock = new ReentrantLock(); notFullCondition = enqLock.newCondition(); deqLock = new ReentrantLock(); notEmptyCondition = deqLock.newCondition(); }
Bounded Queue Fields
Art of Multiprocessor Programming 66 66 Bounded Queue Fields public class BoundedQueue<T> { ReentrantLock enqLock, deqLock; Condition notEmptyCondition, notFullCondition; AtomicInteger size; Node head; Node tail; int capacity; enqLock = new ReentrantLock(); notFullCondition = enqLock.newCondition(); deqLock = new ReentrantLock(); notEmptyCondition = deqLock.newCondition(); } Enq & deq locks
Bounded Queue Fields
Art of Multiprocessor Programming 67 67 Bounded Queue Fields public class BoundedQueue<T> { ReentrantLock enqLock, deqLock; Condition notEmptyCondition, notFullCondition; AtomicInteger size; Node head; Node tail; int capacity; enqLock = new ReentrantLock(); notFullCondition = enqLock.newCondition(); deqLock = new ReentrantLock(); notEmptyCondition = deqLock.newCondition(); } Enq lock’s associated condition
Bounded Queue Fields
Art of Multiprocessor Programming 68 68 Bounded Queue Fields public class BoundedQueue<T> { ReentrantLock enqLock, deqLock; Condition notEmptyCondition, notFullCondition; AtomicInteger size; Node head; Node tail; int capacity; enqLock = new ReentrantLock(); notFullCondition = enqLock.newCondition(); deqLock = new ReentrantLock(); notEmptyCondition = deqLock.newCondition(); } size : 0 to capacity
Bounded Queue Fields
Art of Multiprocessor Programming 69 69 Bounded Queue Fields public class BoundedQueue<T> { ReentrantLock enqLock, deqLock; Condition notEmptyCondition, notFullCondition; AtomicInteger size; Node head; Node tail; int capacity; enqLock = new ReentrantLock(); notFullCondition = enqLock.newCondition(); deqLock = new ReentrantLock(); notEmptyCondition = deqLock.newCondition(); } Head and Tail
Enq Method Part One
Art of Multiprocessor Programming 70 70 Enq Method Part One public void enq(T x) { boolean mustWakeDequeuers = false ; enqLock.lock(); try { while (size.get() == Capacity) notFullCondition.await(); Node e = new Node(x); tail.next = e; tail = tail.next; if (size.getAndIncrement() == 0) mustWakeDequeuers = true ; } finally { enqLock.unlock(); } }
Enq Method Part One
Art of Multiprocessor Programming 71 71 public void enq(T x) { boolean mustWakeDequeuers = false; enqLock.lock(); try { while (size.get() == capacity) notFullCondition.await(); Node e = new Node(x); tail.next = e; tail = tail.next; if (size.getAndIncrement() == 0) mustWakeDequeuers = true; } finally { enqLock.unlock(); } } Enq Method Part One Lock and unlock enq lock
Enq Method Part One
Art of Multiprocessor Programming 72 72 public void enq(T x) { boolean mustWakeDequeuers = false; enqLock.lock(); try { while (size.get() == capacity) notFullCondition.await(); Node e = new Node(x); tail.next = e; tail = tail.next; if (size.getAndIncrement() == 0) mustWakeDequeuers = true; } finally { enqLock.unlock(); } } Enq Method Part One Wait while queue is full
Enq Method Part One
Art of Multiprocessor Programming 73 73 public void enq(T x) { boolean mustWakeDequeuers = false; enqLock.lock(); try { while (size.get() == capacity) notFullCondition.await(); Node e = new Node(x); tail.next = e; tail = tail.next; if (size.getAndIncrement() == 0) mustWakeDequeuers = true; } finally { enqLock.unlock(); } } Enq Method Part One when await() returns, you might still fail the test !
Be Afraid
Art of Multiprocessor Programming 74 74 public void enq(T x) { boolean mustWakeDequeuers = false; enqLock.lock(); try { while (size.get() == capacity) notFullCondition.await(); Node e = new Node(x); tail.next = e; tail = tail.next; if (size.getAndIncrement() == 0) mustWakeDequeuers = true; } finally { enqLock.unlock(); } } Be Afraid After the loop: how do we know the queue won’t become full again?
Enq Method Part One
Art of Multiprocessor Programming 75 75 public void enq(T x) { boolean mustWakeDequeuers = false; enqLock.lock(); try { while (size.get() == capacity) notFullCondition.await(); Node e = new Node(x); tail.next = e; tail = tail.next; if (size.getAndIncrement() == 0) mustWakeDequeuers = true; } finally { enqLock.unlock(); } } Enq Method Part One Add new node
Enq Method Part One
Art of Multiprocessor Programming 76 76 public void enq(T x) { boolean mustWakeDequeuers = false; enqLock.lock(); try { while (size.get() == capacity) notFullCondition.await(); Node e = new Node(x); tail.next = e; tail = tail.next; if (size.getAndIncrement() == 0) mustWakeDequeuers = true ; } finally { enqLock.unlock(); } } Enq Method Part One If queue was empty, wake frustrated dequeuers
Beware Lost Wake-Ups
Art of Multiprocessor Programming 77 77 Beware Lost Wake-Ups Critical Section waiting room lock () Queue empty so signal () enq ( ) unlock () Yawn!
Lost Wake-Up
Art of Multiprocessor Programming 78 78 Lost Wake-Up Critical Section waiting room lock () enq ( ) unlock () Yawn! Queue not empty so no need to signal
Lost Wake-Up
Art of Multiprocessor Programming 79 79 Lost Wake-Up Critical Section waiting room Yawn!
Lost Wake-Up
Art of Multiprocessor Programming 80 80 Lost Wake-Up Critical Section waiting room Found it
What’s Wrong Here?
Art of Multiprocessor Programming 81 81 What’s Wrong Here? Critical Section waiting room Still waiting ….!
Solution to Lost Wakeup
Art of Multiprocessor Programming 82 Solution to Lost Wakeup Always use signalAll() and notifyAll() Not signal() and notify() 82
Enq Method Part Deux
Art of Multiprocessor Programming 83 83 Enq Method Part Deux public void enq(T x) { if (mustWakeDequeuers) { deqLock.lock(); try { notEmptyCondition.signalAll(); } finally { deqLock.unlock(); } } }
Enq Method Part Deux
Art of Multiprocessor Programming 84 84 Enq Method Part Deux public void enq(T x) { if (mustWakeDequeuers) { deqLock.lock(); try { notEmptyCondition.signalAll(); } finally { deqLock.unlock(); } } } Are there dequeuers to be signaled?
Enq Method Part Deux
Art of Multiprocessor Programming 85 85 public void enq(T x) { if (mustWakeDequeuers) { deqLock.lock(); try { notEmptyCondition.signalAll(); } finally { deqLock.unlock(); } } } Enq Method Part Deux Lock and unlock deq lock
Enq Method Part Deux
Art of Multiprocessor Programming 86 86 public void enq(T x) { if (mustWakeDequeuers) { deqLock.lock(); try { notEmptyCondition.signalAll(); } finally { deqLock.unlock(); } } } Enq Method Part Deux Signal dequeuers that queue is no longer empty
The Enq() & Deq() Methods
Art of Multiprocessor Programming 87 87 The Enq() & Deq() Methods Share no locks That’s good But do share an atomic counter Accessed on every method call That’s not so good Can we alleviate this bottleneck?
Split the Counter
Art of Multiprocessor Programming 88 88 Split the Counter The enq() method Increments only Cares only if value is capacity The deq() method Decrements only Cares only if value is zero
Split Counter
Art of Multiprocessor Programming 89 89 Split Counter Enqueuer increments enqSize Dequeuer decrements deqSize When enqueuer runs out Locks deqLock computes size = enqSize - DeqSize Intermittent synchronization Not with each method call Need both locks! (careful …)
A Lock-Free Queue
Art of Multiprocessor Programming 90 90 A Lock-Free Queue Sentinel head tail
Compare and Set
Art of Multiprocessor Programming 91 91 Compare and Set CAS
Enqueue
Art of Multiprocessor Programming 92 92 Enqueue head tail enq ( )
Enqueue
Art of Multiprocessor Programming 93 93 Enqueue head tail
Logical Enqueue
Art of Multiprocessor Programming 94 94 Logical Enqueue head tail CAS
Physical Enqueue
Art of Multiprocessor Programming 95 95 Physical Enqueue head tail CAS
Enqueue
Art of Multiprocessor Programming 96 96 Enqueue These two steps are not atomic The tail field refers to either Actual last Node (good) Penultimate Node (not so good) Be prepared!
Enqueue
Art of Multiprocessor Programming 97 97 Enqueue What do you do if you find A trailing tail ? Stop and help fix it If tail node has non- null next field CAS the queue’s tail field to tail.next As in the universal construction
When CASs Fail
Art of Multiprocessor Programming 98 98 When CASs Fail During logical enqueue Abandon hope, restart Still lock-free (why?) During physical enqueue Ignore it (why?)
Dequeuer
Art of Multiprocessor Programming 99 99 Dequeuer head tail Read value
Dequeuer
Art of Multiprocessor Programming 100 100 Dequeuer head tail Make first Node new sentinel CAS
Memory Reuse?
Art of Multiprocessor Programming 101 101 Memory Reuse? What do we do with nodes after we dequeue them? Java: let garbage collector deal? Suppose there is no GC, or we prefer not to use it?
Dequeuer
Art of Multiprocessor Programming 102 102 Dequeuer head tail CAS Can recycle
Simple Solution
Art of Multiprocessor Programming 103 103 Simple Solution Each thread has a free list of unused queue nodes Allocate node: pop from list Free node: push onto list Deal with underflow somehow …
Why Recycling is Hard
Art of Multiprocessor Programming 104 104 Why Recycling is Hard Free pool head tail Want to redirect head from gray to red zzz…
Both Nodes Reclaimed
Art of Multiprocessor Programming 105 105 Both Nodes Reclaimed Free pool zzz head tail
One Node Recycled
Art of Multiprocessor Programming 106 106 One Node Recycled Free pool Yawn! head tail
Why Recycling is Hard
Art of Multiprocessor Programming 107 107 Why Recycling is Hard Free pool CAS head tail OK, here I go!
Recycle FAIL
Art of Multiprocessor Programming 108 108 Recycle FAIL Free pool zOMG what went wrong? head tail
The Dreaded ABA Problem
Art of Multiprocessor Programming 109 109 The Dreaded ABA Problem head tail Head reference has value A Thread reads value A
Dreaded ABA continued
Art of Multiprocessor Programming 110 110 Dreaded ABA continued zzz head tail Head reference has value B Node A freed
Dreaded ABA continued
Art of Multiprocessor Programming 111 111 Dreaded ABA continued Yawn! head tail Head reference has value A again Node A recycled and reinitialized
Dreaded ABA continued
Art of Multiprocessor Programming 112 112 Dreaded ABA continued CAS head tail CAS succeeds because references match, even though reference’s meaning has changed
The Dreaded ABA FAIL
Art of Multiprocessor Programming 113 113 The Dreaded ABA FAIL Is a result of CAS() semantics I blame Sun, Intel, AMD, … Not with Load-Locked/Store-Conditional Good for IBM?
Dreaded ABA – A Solution
Art of Multiprocessor Programming 114 114 Dreaded ABA – A Solution Tag each pointer with a counter Unique over lifetime of node Pointer size vs word size issues Overflow? Don’t worry be happy? Bounded tags? AtomicStampedReference class
Atomic Stamped Reference
Art of Multiprocessor Programming 115 115 Atomic Stamped Reference AtomicStampedReference class Java.util.concurrent.atomic package address S Stamp Reference Can get reference & stamp atomically
Concurrent Stack
Art of Multiprocessor Programming 116 116 Concurrent Stack Methods push(x) pop() Last-in, First-out (LIFO) order Lock-Free!
Empty Stack
Art of Multiprocessor Programming 117 117 Empty Stack Top
Push
Art of Multiprocessor Programming 118 118 Push Top
Push
Art of Multiprocessor Programming 119 119 Push Top CAS
Push
Art of Multiprocessor Programming 120 120 Push Top
Push
Art of Multiprocessor Programming 121 121 Push Top
Push
Art of Multiprocessor Programming 122 122 Push Top
Push
Art of Multiprocessor Programming 123 123 Push Top CAS
Push
Art of Multiprocessor Programming 124 124 Push Top
Pop
Art of Multiprocessor Programming 125 125 Pop Top
Pop
Art of Multiprocessor Programming 126 126 Pop Top CAS
Pop
Art of Multiprocessor Programming 127 127 Pop Top CAS mine!
Pop
Art of Multiprocessor Programming 128 128 Pop Top CAS
Pop
Art of Multiprocessor Programming 129 129 Pop Top
Lock-free Stack
Art of Multiprocessor Programming 130 130 public class LockFreeStack { private AtomicReference top = new AtomicReference( null ); public boolean tryPush(Node node){ Node oldTop = top.get(); node.next = oldTop; return (top.compareAndSet(oldTop, node)) } public void push(T value) { Node node = new Node(value); while ( true ) { if (tryPush(node)) { return ; } else backoff.backoff(); }} Lock-free Stack
Lock-free Stack
Art of Multiprocessor Programming 131 131 public class LockFreeStack { private AtomicReference top = new AtomicReference(null); public Boolean tryPush(Node node){ Node oldTop = top.get(); node.next = oldTop; return(top.compareAndSet(oldTop, node)) } public void push(T value) { Node node = new Node(value); while (true) { if (tryPush(node)) { return; } else backoff.backoff() }} Lock-free Stack tryPush attempts to push a node
Lock-free Stack
Art of Multiprocessor Programming 132 132 public class LockFreeStack { private AtomicReference top = new AtomicReference(null); public boolean tryPush(Node node){ Node oldTop = top.get(); node.next = oldTop; return(top.compareAndSet(oldTop, node)) } public void push(T value) { Node node = new Node(value); while (true) { if (tryPush(node)) { return; } else backoff.backoff() }} Lock-free Stack Read top value
Lock-free Stack
Art of Multiprocessor Programming 133 133 public class LockFreeStack { private AtomicReference top = new AtomicReference(null); public boolean tryPush(Node node){ Node oldTop = top.get(); node.next = oldTop; return(top.compareAndSet(oldTop, node)) } public void push(T value) { Node node = new Node(value); while (true) { if (tryPush(node)) { return; } else backoff.backoff() }} Lock-free Stack current top will be new node’s successor
Lock-free Stack
Art of Multiprocessor Programming 134 134 public class LockFreeStack { private AtomicReference top = new AtomicReference(null); public boolean tryPush(Node node){ Node oldTop = top.get(); node.next = oldTop; return (top.compareAndSet(oldTop, node)) } public void push(T value) { Node node = new Node(value); while (true) { if (tryPush(node)) { return; } else backoff.backoff() }} Lock-free Stack Try to swing top, return success or failure
Lock-free Stack
Art of Multiprocessor Programming 135 135 public class LockFreeStack { private AtomicReference top = new AtomicReference(null); public boolean tryPush(Node node){ Node oldTop = top.get(); node.next = oldTop; return(top.compareAndSet(oldTop, node)) } public void push(T value) { Node node = new Node(value); while (true) { if (tryPush(node)) { return; } else backoff.backoff() }} Lock-free Stack Push calls tryPush
Lock-free Stack
Art of Multiprocessor Programming 136 136 public class LockFreeStack { private AtomicReference top = new AtomicReference(null); public boolean tryPush(Node node){ Node oldTop = top.get(); node.next = oldTop; return(top.compareAndSet(oldTop, node)) } public void push(T value) { Node node = new Node(value); while (true) { if (tryPush(node)) { return; } else backoff.backoff() }} Lock-free Stack Create new node
Lock-free Stack
Art of Multiprocessor Programming 137 137 public class LockFreeStack { private AtomicReference top = new AtomicReference(null); public boolean tryPush(Node node){ Node oldTop = top.get(); node.next = oldTop; return(top.compareAndSet(oldTop, node)) } public void push(T value) { Node node = new Node(value); while (true) { if (tryPush(node)) { return ; } else backoff.backoff() }} Lock-free Stack If tryPush() fails, back off before retrying
Lock-free Stack
Art of Multiprocessor Programming 138 138 Lock-free Stack Good No locking Bad Without GC, fear ABA Without backoff, huge contention at top In any case, no parallelism
Big Question
Art of Multiprocessor Programming 139 139 Big Question Are stacks inherently sequential? Reasons why Every pop() call fights for top item Reasons why not Stay tuned …
Elimination-Backoff Stack
Art of Multiprocessor Programming 140 140 Elimination-Backoff Stack How to “turn contention into parallelism” Replace familiar exponential backoff With alternative elimination-backoff
Observation
Art of Multiprocessor Programming 141 141 Observation Push( ) Pop() linearizable stack After an equal number of pushes and pops, stack stays the same Yes!
Idea: Elimination Array
Art of Multiprocessor Programming 142 142 Idea: Elimination Array Push( ) Pop() stack Pick at random Pick at random Elimination Array
Push Collides With Pop
Art of Multiprocessor Programming 143 143 Push Collides With Pop Push( ) Pop() stack continue continue No need to access stack Yes!
No Collision
Art of Multiprocessor Programming 144 144 No Collision Push( ) Pop() stack If no collision, access stack If pushes collide or pops collide access stack
Elimination-Backoff Stack
Art of Multiprocessor Programming 145 145 Elimination-Backoff Stack Lock-free stack + elimination array Access Lock-free stack, If uncontended , apply operation if contended , back off to elimination array and attempt elimination
Elimination-Backoff Stack
Art of Multiprocessor Programming 146 146 Elimination-Backoff Stack Push( ) Pop() Top CAS If CAS fails, back off
Dynamic Range and Delay
Art of Multiprocessor Programming 147 147 Dynamic Range and Delay Push( ) Pick random range and max waiting time based on level of contention encountered
Linearizability
Art of Multiprocessor Programming 148 148 Linearizability Un-eliminated calls linearized as before Eliminated calls: linearize pop() immediately after matching push() Combination is a linearizable stack
Un-Eliminated Linearizability
Art of Multiprocessor Programming 149 149 Un-Eliminated Linearizability push(v 1 ) time time linearizable push(v 1 ) pop(v 1 ) pop(v 1 )
Eliminated Linearizability
Art of Multiprocessor Programming 150 150 Eliminated Linearizability pop(v 2 ) push(v 1 ) push(v 2 ) time time push(v 2 ) pop(v 2 ) push(v 1 ) pop(v 1 ) Collision Point Red calls are eliminated pop(v 1 ) linearizable
Backoff Has Dual Effect
Art of Multiprocessor Programming 151 151 Backoff Has Dual Effect Elimination introduces parallelism Backoff to array cuts contention on lock- free stack Elimination in array cuts down number of threads accessing lock-free stack
Elimination Array
Art of Multiprocessor Programming 152 152 public class EliminationArray { private static final int duration = ...; private static final int timeUnit = ...; Exchanger<T>[] exchanger; public EliminationArray( int capacity) { exchanger = new Exchanger[capacity]; for ( int i = 0; i < capacity; i++) exchanger[i] = new Exchanger<T>(); } } Elimination Array
Elimination Array
Art of Multiprocessor Programming 153 153 public class EliminationArray { private static final int duration = ...; private static final int timeUnit = ...; Exchanger<T>[] exchanger; public EliminationArray(int capacity) { exchanger = new Exchanger[capacity]; for ( int i = 0; i < capacity; i++) exchanger[i] = new Exchanger<T>(); } } Elimination Array An array of Exchangers
Digression: A Lock-Free Exchanger
Art of Multiprocessor Programming 154 154 public class Exchanger<T> { AtomicStampedReference<T> slot = new AtomicStampedReference<T>( null , 0); Digression: A Lock-Free Exchanger
A Lock-Free Exchanger
Art of Multiprocessor Programming 155 155 public class Exchanger<T> { AtomicStampedReference<T> slot = new AtomicStampedReference<T>(null, 0); A Lock-Free Exchanger Atomically modifiable reference + status
Atomic Stamped Reference
Art of Multiprocessor Programming 156 156 Atomic Stamped Reference AtomicStampedReference class Java.util.concurrent.atomic package In C or C++ : address S reference stamp
Extracting Reference & Stamp
Art of Multiprocessor Programming 157 157 Extracting Reference & Stamp public T get( int[] stampHolder);
Extracting Reference & Stamp
Art of Multiprocessor Programming 158 158 Extracting Reference & Stamp public T get(int[] stampHolder ); Returns reference to object of type T Returns stamp at array index 0!
Exchanger Status
Art of Multiprocessor Programming 159 159 Exchanger Status enum Status {EMPTY, WAITING, BUSY};
Exchanger Status
Art of Multiprocessor Programming 160 160 Exchanger Status enum Status { EMPTY , WAITING, BUSY}; Nothing yet
Exchange Status
enum Status { EMPTY, WAITING , BUSY}; Art of Multiprocessor Programming 161 161 Exchange Status Nothing yet One thread is waiting for rendez-vous
Exchange Status
Art of Multiprocessor Programming 162 162 Exchange Status enum Status { EMPTY, WAITING, BUSY}; Nothing yet One thread is waiting for rendez-vous Other threads busy with rendez-vous
The Exchange
Art of Multiprocessor Programming 163 163 public T Exchange(T myItem, long nanos) throws TimeoutException { long timeBound = System.nanoTime() + nanos; int[] stampHolder = {EMPTY}; while ( true ) { if (System.nanoTime() > timeBound) throw new TimeoutException(); T herItem = slot.get(stampHolder); int stamp = stampHolder[0]; switch (stamp) { case EMPTY: … // slot is free case WAITING: … // someone waiting for me case BUSY: … // others exchanging } } The Exchange
The Exchange
Art of Multiprocessor Programming 164 164 public T Exchange(T myItem, long nanos) throws TimeoutException { long timeBound = System.nanoTime() + nanos; int[] stampHolder = {EMPTY}; while (true) { if (System.nanoTime() > timeBound) throw new TimeoutException(); T herItem = slot.get(stampHolder); int stamp = stampHolder[0]; switch(stamp) { case EMPTY: … // slot is free case WAITING: … // someone waiting for me case BUSY: … // others exchanging } } The Exchange Item and timeout
The Exchange
Art of Multiprocessor Programming 165 165 public T Exchange(T myItem, long nanos) throws TimeoutException { long timeBound = System.nanoTime() + nanos; int[] stampHolder = {EMPTY}; while (true) { if (System.nanoTime() > timeBound) throw new TimeoutException(); T herItem = slot.get(stampHolder); int stamp = stampHolder[0]; switch(stamp) { case EMPTY: … // slot is free case WAITING: … // someone waiting for me case BUSY: … // others exchanging } } The Exchange Array holds status
The Exchange
Art of Multiprocessor Programming 166 166 public T Exchange(T myItem, long nanos) throws TimeoutException { long timeBound = System.nanoTime() + nanos; int[] stampHolder = {0}; while ( true ) { if (System.nanoTime() > timeBound) throw new TimeoutException(); T herItem = slot.get(stampHolder); int stamp = stampHolder[0]; switch(stamp) { case EMPTY: // slot is free case WAITING: // someone waiting for me case BUSY: // others exchanging } }} The Exchange Loop until timeout
The Exchange
Art of Multiprocessor Programming 167 167 public T Exchange(T myItem, long nanos) throws TimeoutException { long timeBound = System.nanoTime() + nanos; int[] stampHolder = {0}; while (true) { if (System.nanoTime() > timeBound) throw new TimeoutException(); T herItem = slot.get(stampHolder); int stamp = stampHolder[0]; switch(stamp) { case EMPTY: // slot is free case WAITING: // someone waiting for me case BUSY: // others exchanging } }} The Exchange Get other’s item and status
The Exchange
Art of Multiprocessor Programming 168 168 public T Exchange(T myItem, long nanos) throws TimeoutException { long timeBound = System.nanoTime() + nanos; int[] stampHolder = {0}; while (true) { if (System.nanoTime() > timeBound) throw new TimeoutException(); T herItem = slot.get(stampHolder); int stamp = stampHolder[0]; switch (stamp) { case EMPTY: … // slot is free case WAITING: … // someone waiting for me case BUSY: … // others exchanging } }} The Exchange An Exchanger has three possible states
Lock-free Exchanger
Art of Multiprocessor Programming 169 169 Lock-free Exchanger
Lock-free Exchanger
Art of Multiprocessor Programming 170 170 Lock-free Exchanger CAS
Lock-free Exchanger
Art of Multiprocessor Programming 171 171 Lock-free Exchanger
Lock-free Exchanger
Art of Multiprocessor Programming 172 172 Lock-free Exchanger In search of partner …
Lock-free Exchanger
Art of Multiprocessor Programming 173 173 Lock-free Exchanger Slot Still waiting … Try to exchange item and set status to BUSY CAS
Lock-free Exchanger
Art of Multiprocessor Programming 174 174 Lock-free Exchanger Slot Partner showed up, take item and reset to EMPTY item status
Lock-free Exchanger
Art of Multiprocessor Programming 175 175 EMPTY Lock-free Exchanger Slot item status Partner showed up, take item and reset to EMPTY
Exchanger State EMPTY
Art of Multiprocessor Programming 176 176 case EMPTY: // slot is free if (slot.CAS(herItem, myItem, EMPTY, WAITING)) { while (System.nanoTime() < timeBound){ herItem = slot.get(stampHolder); if (stampHolder[0] == BUSY) { slot.set(null, EMPTY); return herItem; }} if (slot.CAS(myItem, null , WAITING, EMPTY)){ throw new TimeoutException(); } else { herItem = slot.get(stampHolder); slot.set( null , EMPTY); return herItem; } } break ; Exchanger State EMPTY
Exchanger State EMPTY
Art of Multiprocessor Programming 177 177 case EMPTY: // slot is free if (slot.CAS(herItem, myItem, EMPTY, WAITING)) { while (System.nanoTime() < timeBound){ herItem = slot.get(stampHolder); if (stampHolder[0] == BUSY) { slot.set(null, EMPTY); return herItem; }} if (slot.CAS(myItem, null, WAITING, EMPTY)){ throw new TimeoutException(); } else { herItem = slot.get(stampHolder); slot.set(null, EMPTY); return herItem; } } break; Exchanger State EMPTY Try to insert myItem and change state to WAITING
Exchanger State EMPTY
Art of Multiprocessor Programming 178 178 case EMPTY: // slot is free if (slot.CAS(herItem, myItem, EMPTY, WAITING)) { while (System.nanoTime() < timeBound){ herItem = slot.get(stampHolder); if (stampHolder[0] == BUSY) { slot.set(null, EMPTY); return herItem; }} if (slot.CAS(myItem, null, WAITING, EMPTY)){ throw new TimeoutException(); } else { herItem = slot.get(stampHolder); slot.set(null, EMPTY); return herItem; } } break; Exchanger State EMPTY Spin until either myItem is taken or timeout
Exchanger State EMPTY
Art of Multiprocessor Programming 179 179 case EMPTY: // slot is free if (slot.CAS(herItem, myItem, EMPTY, WAITING)) { while (System.nanoTime() < timeBound){ herItem = slot.get(stampHolder); if (stampHolder[0] == BUSY) { slot.set(null, EMPTY); return herItem; }} if (slot.CAS(myItem, null, WAITING, EMPTY)){ throw new TimeoutException(); } else { herItem = slot.get(stampHolder); slot.set(null, EMPTY); return herItem; } } break; Exchanger State EMPTY myItem was taken, so return herItem that was put in its place
Exchanger State EMPTY
Art of Multiprocessor Programming 180 180 case EMPTY: // slot is free if (slot.CAS(herItem, myItem, EMPTY, WAITING)) { while (System.nanoTime() < timeBound){ herItem = slot.get(stampHolder); if (stampHolder[0] == BUSY) { slot.set(null, EMPTY); return herItem; }} if (slot.CAS(myItem, null , WAITING, EMPTY)){ throw new TimeoutException(); } else { herItem = slot.get(stampHolder); slot.set(null, EMPTY); return herItem; } } break; Exchanger State EMPTY Otherwise we ran out of time, try to reset status to EMPTY and time out
Exchanger State EMPTY
Art of Multiprocessor Programming 181 Art of Multiprocessor Programming© Herlihy-Shavit 2007 181 case EMPTY: // slot is free if (slot.compareAndSet(herItem, myItem, WAITING, BUSY)) { while (System.nanoTime() < timeBound){ herItem = slot.get(stampHolder); if (stampHolder[0] == BUSY) { slot.set(null, EMPTY); return herItem; }} if (slot.compareAndSet(myItem, null, WAITING, EMPTY)){throw new TimeoutException(); } else { herItem = slot.get(stampHolder); slot.set(null, EMPTY); return herItem; } } break; Exchanger State EMPTY If reset failed, someone showed up after all, so take that item
Exchanger State EMPTY
Art of Multiprocessor Programming 182 182 case EMPTY: // slot is free if (slot.CAS(herItem, myItem, EMPTY, WAITING)) { while (System.nanoTime() < timeBound){ herItem = slot.get(stampHolder); if (stampHolder[0] == BUSY) { slot.set(null, EMPTY); return herItem; }} if (slot.CAS(myItem, null, WAITING, EMPTY)){ throw new TimeoutException(); } else { herItem = slot.get(stampHolder); slot.set( null , EMPTY); return herItem; } } break; Exchanger State EMPTY Clear slot and take that item
Exchanger State EMPTY
Art of Multiprocessor Programming 183 183 case EMPTY: // slot is free if (slot.CAS(herItem, myItem, EMPTY, WAITING)) { while (System.nanoTime() < timeBound){ herItem = slot.get(stampHolder); if (stampHolder[0] == BUSY) { slot.set(null, EMPTY); return herItem; }} if (slot.CAS(myItem, null, WAITING, EMPTY)){ throw new TimeoutException(); } else { herItem = slot.get(stampHolder); slot.set(null, EMPTY); return herItem; } } break ; Exchanger State EMPTY If initial CAS failed, then someone else changed status from EMPTY to WAITING, so retry from start
States WAITING and BUSY
Art of Multiprocessor Programming 184 184 case WAITING: // someone waiting for me if (slot.CAS(herItem, myItem, WAITING, BUSY)) return herItem; break ; case BUSY: // others in middle of exchanging break ; default : // impossible break ; } } } } States WAITING and BUSY
States WAITING and BUSY
Art of Multiprocessor Programming 185 185 case WAITING: // someone waiting for me if (slot.CAS(herItem, myItem, WAITING, BUSY)) return herItem; break; case BUSY: // others in middle of exchanging break; default: // impossible break; } } } } States WAITING and BUSY someone is waiting to exchange, so try to CAS my item in and change state to BUSY
States WAITING and BUSY
Art of Multiprocessor Programming 186 186 case WAITING: // someone waiting for me if (slot.CAS(herItem, myItem, WAITING, BUSY)) return herItem; break; case BUSY: // others in middle of exchanging break; default: // impossible break; } } } } States WAITING and BUSY If successful, return other’s item, otherwise someone else took it, so try again from start
States WAITING and BUSY
Art of Multiprocessor Programming 187 187 case WAITING: // someone waiting for me if (slot.CAS(herItem, myItem, WAITING, BUSY)) return herItem; break; case BUSY: // others in middle of exchanging break ; default: // impossible break; } } } } States WAITING and BUSY If BUSY, other threads exchanging, so start again
The Exchanger Slot
Art of Multiprocessor Programming 188 188 The Exchanger Slot Exchanger is lock-free Because the only way an exchange can fail is if others repeatedly succeeded or no-one showed up The slot we need does not require symmetric exchange
Back to the Stack: the Elimination Array
Art of Multiprocessor Programming 189 189 public class EliminationArray { public T visit(T value, int range) throws TimeoutException { int slot = random.nextInt(range); int nanodur = convertToNanos(duration, timeUnit)); return (exchanger[slot].exchange(value, nanodur) }} Back to the Stack: the Elimination Array
Elimination Array
Art of Multiprocessor Programming 190 190 public class EliminationArray { public T visit(T value, int range) throws TimeoutException { int slot = random.nextInt(range); int nanodur = convertToNanos(duration, timeUnit)); return (exchanger[slot].exchange(value, nanodur) }} Elimination Array visit the elimination array with fixed value and range
Elimination Array
Art of Multiprocessor Programming 191 191 public class EliminationArray { public T visit(T value, int range) throws TimeoutException { int slot = random.nextInt(range); int nanodur = convertToNanos(duration, timeUnit)); return (exchanger[slot].exchange(value, nanodur) }} Elimination Array Pick a random array entry
Elimination Array
Art of Multiprocessor Programming 192 192 public class EliminationArray { public T visit(T value, int range) throws TimeoutException { int slot = random.nextInt(range); int nanodur = convertToNanos(duration, timeUnit)); return (exchanger[slot].exchange(value, nanodur) }} Elimination Array Exchange value or time out
Elimination Stack Push
Art of Multiprocessor Programming 193 193 public void push(T value) { ... while ( true ) { if (tryPush(node)) { return ; } else try { T otherValue = eliminationArray.visit(value,policy.range); if (otherValue == null ) { return ; } } Elimination Stack Push
Elimination Stack Push
Art of Multiprocessor Programming 194 194 public void push(T value) { ... while (true) { if (tryPush(node)) { return ; } else try { T otherValue = eliminationArray.visit(value,policy.range); if (otherValue == null) { return; } } Elimination Stack Push First, try to push
Elimination Stack Push
Art of Multiprocessor Programming 195 195 public void push(T value) { ... while (true) { if (tryPush(node)) { return; } else try { T otherValue = eliminationArray.visit(value,policy.range); if (otherValue == null) { return; } } Elimination Stack Push If I failed, backoff & try to eliminate
Elimination Stack Push
Art of Multiprocessor Programming 196 196 public void push(T value) { ... while (true) { if (tryPush(node)) { return; } else try { T otherValue = eliminationArray.visit (value,policy.range); if (otherValue == null) { return; } } Elimination Stack Push Value pushed and range to try
Elimination Stack Push
Art of Multiprocessor Programming 197 197 public void push(T value) { ... while (true) { if (tryPush(node)) { return; } else try { T otherValue = eliminationArray.visit(value,policy.range); if (otherValue == null) { return ; } } Elimination Stack Push Only pop() leaves null, so elimination was successful
Elimination Stack Push
Art of Multiprocessor Programming 198 198 public void push(T value) { ... while (true) { if (tryPush(node)) { return; } else try { T otherValue = eliminationArray.visit(value,policy.range); if (otherValue == null) { return; } } Elimination Stack Push Otherwise, retry push() on lock-free stack
Elimination Stack Pop
Art of Multiprocessor Programming 199 199 public T pop() { ... while ( true ) { if (tryPop()) { return returnNode.value; } else try { T otherValue = eliminationArray.visit(null,policy.range; if (otherValue != null ) { return otherValue; } } }} Elimination Stack Pop
Elimination Stack Pop
Art of Multiprocessor Programming 200 200 public T pop() { ... while (true) { if (tryPop()) { return returnNode.value; } else try { T otherValue = eliminationArray.visit(null,policy.range; if ( otherValue != null ) { return otherValue; } } }} Elimination Stack Pop If value not null, other thread is a push(), so elimination succeeded
Summary
Art of Multiprocessor Programming 201 201 Summary We saw both lock-based and lock-free implementations of queues and stacks Don’t be quick to declare a data structure inherently sequential Linearizable stack is not inherently sequential (though it is in worst case) ABA is a real problem, pay attention
Art of Multiprocessor Programming 202 202   creativecommons.org/licenses/by-sa/2.5/ Creative Commons Attribution-ShareAlike 2.5 License . You are free : to Share — to copy, distribute and transmit the work to Remix — to adapt the work Under the following conditions : Attribution . You must attribute the work to “The Art of Multiprocessor Programming” (but not in any way that suggests that the authors endorse you or your use of the work). Share Alike . If you alter, transform, or build upon this work, you may distribute the resulting work only under the same, similar or a compatible license. For any reuse or distribution, you must make clear to others the license terms of this work. The best way to do this is with a link to http://creativecommons.org/licenses/by-sa/3.0/. Any of the above conditions can be waived if you get permission from the copyright holder. Nothing in this license impairs or restricts the author's moral rights.
Art of Multiprocessor Programming 203