Shared Counters and Parallelism
A Shared Pool
A Shared Pool
Simple Locking Implementation
Simple Locking Implementation
Simple Locking Implementation
Simple Locking Implementation
Simple Locking Implementation
Counting Implementation
Counting Implementation
Shared Counter
Shared Counter
Shared Counter
Shared Counter
Shared Counters
Software Combining Tree
Combining Trees
Combining Trees
Combining Trees
Combining Trees
Combining Trees
Combining Trees
Combining Trees
Combining Trees
What if?
Combining Status
Combining Status
Combining Status
Combining Status
Combining Status
Combining Status
Node Synchronization
Phases
Phases
Phases
Phases
Precombining Phase
Precombining Phase
Precombining Phase
Precombining Phase
Precombining Phase
Code
Precombining Navigation
Precombining Navigation
Precombining Navigation
Precombining Navigation
Precombining Node
Precombining Node
Synchronization
Precombining Node
Node was IDLE
Precombining Node
I’m the 2nd Thread
Precombining Node
Precombining Node
Node is the Root
Precombining Node
Combining Phase
Combining Phase
Combining Phase
Combining (reloaded)
Combining (reloaded)
Combining (reloaded)
Combining (reloaded)
Combining Navigation
Combining Navigation
Combining Navigation
Combining Navigation
Combining Navigation
Combining Navigation
Combining Navigation
Combining Phase Node
Combining Phase Node
Combining Phase Node
Combining Phase Node
Combining Phase Node
Combining Phase Node
Combining Phase Node
Combining Node
Operation Phase
Operation Phase (reloaded)
Operation Phase (reloaded)
Operation Phase Navigation
Operation Phase Navigation
Operation on Stopped Node
Op States of Stop Node
At Root
Intermediate Node
Intermediate Node
Intermediate Node
Intermediate Node
Distribution Phase
Distribution Phase
Distribution Phase
Distribution Phase
Distribution Phase Navigation
Distribution Phase Navigation
Distribution Phase Navigation
Distribution Phase Navigation
Distribution Phase
Distribution Phase
Distribution Phase
Bad News: High Latency
Good News: Real Parallelism
Throughput Puzzles
Index Distribution Benchmark
Index Distribution Benchmark
Index Distribution Benchmark
Index Distribution Benchmark
Index Distribution Benchmark
Performance
Performance (Simulated)
Load Fluctuations
Combining Rate vs Work
Better to Wait Longer
Conclusions
A Balancer
Tokens Traverse Balancers
Tokens Traverse Balancers
Tokens Traverse Balancers
Tokens Traverse Balancers
Tokens Traverse Balancers
Tokens Traverse Balancers
Smoothing Network
Counting Network
Counting Networks Count!
Counting Networks Count!
Counting Networks
Counting Network
Counting Network
Counting Network
Counting Network
Counting Network
Counting Network
Bitonic[k] Counting Network
Bitonic[k] Counting Network
Bitonic[k] not Linearizable
Bitonic[k] is not Linearizable
Bitonic[k] is not Linearizable
Bitonic[k] is not Linearizable
Bitonic[k] is not Linearizable
But it is “Quiescently Consistent”
Shared Memory Implementation
Shared Memory Implementation
Shared Memory Implementation
Shared Memory Implementation
Shared Memory Implementation
Shared Memory Implementation
Shared Memory Implementation
Shared Memory Implementation
Bitonic[2k] Inductive Structure
Bitonic[4] Counting Network
Bitonic[8] Layout
Unfolded Bitonic[8] Network
Unfolded Bitonic[8] Network
Unfolded Bitonic[8] Network
Bitonic[k] Depth
Proof by Induction
Bitonic[2k] Schematic
Need to Prove only Merger[2k]
Merger[2k] Schematic
Merger[2k] Layout
Induction Step
Assume Bitonic[k] and Merger[k] and Prove Merger[2k]
Proof: Lemma 1
Lemma 1
Lemma 1
Lemma 2
Bitonic[2k] Layout Details
By induction hypothesis
By Lemma 1
By Lemma 2
By Induction Hypothesis
By Lemma 2
Last Row of Balancers
Last Row of Balancers
Last Row of Balancers
Last Row of Balancers
So Counting Networks Count
Periodic Network Block
Periodic Network Block
Periodic Network Block
Periodic Network Block
Block[2k] Schematic
Block[2k] Layout
Periodic[8]
Network Depth
Lower Bound on Depth
Sequential Theorem
Red First, Blue Second
Blue First, Red Second
Either Way
Order Doesn’t Matter
Index Distribution Benchmark
Performance (Simulated)
Performance (Simulated)
Performance (Simulated)
Performance (Simulated)
Saturation and Performance
Throughput vs. Size
Shared Pool
Shared Pool
Counting Trees
Counting Trees
Counting Trees
Counting Trees
Counting Trees
Inductive Construction
Inductive Construction
Diffraction Balancing
Performance
Summary