Low Latency Trading Architecture
Sam Adams
QCon London, March 2017
Watch the video with slide synchronization on InfoQ.com!
https://www.infoq.com/presentations/lmax-trading-architecture
Presented at QCon London
www.qconlondon.com
don't panic
buy
sell
GBP/USD
Typical day:
1,000's active clients
100,000's trades occur
100,000,000's orders placed
– very bursty: spikes of 100s / ms
1,000,000,000's market data updates sent
End-to-end latency:
50%: 80 µs
99%: 150 µs
99.99%: 500 µs
Max: 4ms(*)
System Architecture
Building low latency applications
Instruction
Execution reports
Market data
* latency sensitive *
* throughput matters *
The Disruptor
Producer, Consumer
High performance inter-thread messaging
public class ArrayBlockingQueue<E>
{
    final Object[] items;
    int takeIndex;
    int putIndex;
    int count;

    /** Main lock guarding all access */
    final ReentrantLock lock;
}
public class RingBuffer<E>
    implements DataProvider<E>
{
    // ...
    final long indexMask;
    final Object[] entries;
    final Sequence cursor;
    // ...
}
public class BatchEventProcessor<E>
{
    final DataProvider<E> dataProvider;
    final Sequence sequence;
}
ArrayBlockingQueue vs Disruptor
locking & contention vs single writers
Producer and consumer sequences, step by step:

Step  Producer action   Claimed  Published  Consumed  Consumer state
1     (initial)         -1       -1         -1        Waiting for: 0
2     Claim slot: 0     0        -1         -1        Waiting for: 0
3     Publish slot: 0   0        0          -1        Waiting for: 0
4                       0        0          -1        Available: 0, Processing: 0
5                       0        0          0         Waiting for: 1
6     Published: 1-3    3        3          0         Waiting for: 1
7                       3        3          0         Available: 3, Processing: 1,2,3
8                       3        3          3         Waiting for: 4
Supports dependency graphs between consumers
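The claim / publish / consume steps can be sketched as a tiny single-producer, single-consumer ring buffer. This is a minimal illustration under stated assumptions, not the Disruptor's implementation: the class name SimpleRingBuffer and its offer and drain methods are hypothetical, entries are plain long values, and there is no wait strategy or consumer dependency graph.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongConsumer;

/** Hypothetical single-producer, single-consumer ring buffer (not the Disruptor's code). */
final class SimpleRingBuffer {
    private final long[] entries;      // preallocated slots, reused for every message
    private final int indexMask;       // size - 1; works because size is a power of two
    private long claimed = -1;         // only ever touched by the producer thread
    private final AtomicLong published = new AtomicLong(-1); // single writer: producer
    private final AtomicLong consumed = new AtomicLong(-1);  // single writer: consumer

    SimpleRingBuffer(int size) {
        if (Integer.bitCount(size) != 1) {
            throw new IllegalArgumentException("size must be a power of two");
        }
        entries = new long[size];
        indexMask = size - 1;
    }

    /** Producer side: claim the next slot, write it, then publish it. False if the buffer is full. */
    boolean offer(long value) {
        long next = claimed + 1;
        if (next - consumed.get() > entries.length) {
            return false;              // claiming would overwrite a slot not yet consumed
        }
        claimed = next;
        entries[(int) (next & indexMask)] = value;
        published.set(next);           // volatile write makes the slot visible to the consumer
        return true;
    }

    /** Consumer side: process everything published since the last call as one batch. */
    int drain(LongConsumer handler) {
        long available = published.get();
        int count = 0;
        for (long sequence = consumed.get() + 1; sequence <= available; sequence++) {
            handler.accept(entries[(int) (sequence & indexMask)]);
            count++;
        }
        consumed.set(available);       // lets the producer reuse the drained slots
        return count;
    }
}
```

Each sequence has exactly one writer, so no lock is needed. Draining up to the latest published sequence is what gives the batching shown in steps 6 to 8.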
Messaging
Asynchronous Pub/Sub messaging:
- UDP Multicast: low latency, scalable, unreliable
- Services publish / subscribe to topics
- topic = unique multicast group
- Informatica UMS (aka 29 West LBM) provides * some reliability *
- Push based
- If you miss a message, it is gone
- Late-join: no history
javassist generated proxies to interfaces
public interface TradingInstructions
{
    void placeOrder(PlaceOrderInstruction instruction);
    void cancelOrder(CancelOrderInstruction instruction);
}
See GeneratedRingBufferProxyGenerator in disruptor-proxy for inter-thread version
https://github.com/LMAX-Exchange/disruptor-proxy
Event:
long sequence
byte operationIndex
byte[] data
int length
Publisher proxy:
public void placeOrder(PlaceOrderInstruction arg0)
{
    // ...
    event.initialise(sequence, 1); // operation index
    marshaller.encode(arg0, event.outputStream());
    // ...
}
Subscriber proxy:
Invoker invokers[];
TradingInstructions implementation;

public void onEvent(Event event)
{
    // Look up the invoker by the operation index written by the publisher
    Invoker invoker = invokers[event.getOperationIndex()];
    invoker.invoke(event.getInputStream(), implementation);
}

// Generated invoker for placeOrder
public void invoke(InputStream input, TradingInstructions implementation)
{
    PlaceOrderInstruction arg0 = marshaller.decode(input);
    implementation.placeOrder(arg0);
}
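A rough sketch of the same publisher / subscriber idea, using JDK dynamic proxies (java.lang.reflect.Proxy) instead of Javassist-generated classes. Everything here is an assumption made for illustration: the interface is simplified to take long arguments, the Event class carries only an operation index and a payload, a List stands in for the transport, and ProxySketch with its methods is a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Simplified stand-in for the interface on the slide: arguments are plain longs. */
interface TradingInstructions {
    void placeOrder(long orderId, long quantity);
    void cancelOrder(long orderId);
}

/** Simplified event: which operation was called, plus the encoded arguments. */
final class Event {
    final byte operationIndex;
    final byte[] data;

    Event(byte operationIndex, byte[] data) {
        this.operationIndex = operationIndex;
        this.data = data;
    }
}

final class ProxySketch {
    /** Both sides must agree on operation indices, so sort the methods by name. */
    static Method[] operations() {
        Method[] ops = TradingInstructions.class.getMethods();
        Arrays.sort(ops, Comparator.comparing(Method::getName));
        return ops;
    }

    /** Publisher proxy: each interface call becomes one Event appended to the transport. */
    static TradingInstructions publisher(Method[] ops, List<Event> transport) {
        return (TradingInstructions) Proxy.newProxyInstance(
            TradingInstructions.class.getClassLoader(),
            new Class<?>[] {TradingInstructions.class},
            (proxy, method, args) -> {
                int index = Arrays.asList(ops).indexOf(method);
                if (index < 0) {
                    throw new UnsupportedOperationException(method.getName());
                }
                ByteArrayOutputStream bytes = new ByteArrayOutputStream();
                DataOutputStream out = new DataOutputStream(bytes);
                for (Object arg : args) {
                    out.writeLong((Long) arg);   // every argument is a long in this sketch
                }
                transport.add(new Event((byte) index, bytes.toByteArray()));
                return null;
            });
    }

    /** Subscriber side: decode the arguments and invoke the real implementation. */
    static void dispatch(Method[] ops, Event event, TradingInstructions implementation) {
        try {
            Method op = ops[event.operationIndex];
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(event.data));
            Object[] args = new Object[op.getParameterCount()];
            for (int i = 0; i < args.length; i++) {
                args[i] = in.readLong();
            }
            op.invoke(implementation, args);
        } catch (Exception e) {
            throw new IllegalStateException("could not dispatch event", e);
        }
    }
}
```

Reflection keeps the sketch short, but it allocates on every call. Generated code, as on the slides, avoids that cost on the hot path.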
Matching Engine
For speed:
All working state held in memory
Remove contention: single threaded
Don’t block business logic: buffer for outbound I/O
Don’t block network thread: buffer incoming events
All state in volatile memory:
Save on shutdown / Load on startup
Recover from unclean shutdown
Journal incoming events to disk, replay on startup
Replicate events to hot-standby for resiliency
Manual fail-over
(also to offsite DR)
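A minimal sketch of "journal incoming events to disk, replay on startup". The names JournalSketch, Engine, runAndJournal and replay are hypothetical, the event is reduced to a single signed quantity, and a real journal would also force writes to disk and replicate to the standby, neither of which is shown.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class JournalSketch {
    /** Hypothetical in-memory state: a running position built purely from events. */
    static final class Engine {
        long position;
        long eventsApplied;

        void onEvent(long signedQuantity) {
            position += signedQuantity;
            eventsApplied++;
        }
    }

    static Path tempJournal() {
        try {
            Path file = Files.createTempFile("journal", ".bin");
            file.toFile().deleteOnExit();
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Normal running: every incoming event is journalled before it is applied. */
    static Engine runAndJournal(Path journalFile, long[] events) {
        Engine live = new Engine();
        try (DataOutputStream journal = new DataOutputStream(
                new BufferedOutputStream(Files.newOutputStream(journalFile)))) {
            for (long event : events) {
                journal.writeLong(event);
                live.onEvent(event);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return live;
    }

    /** Startup: rebuild the state by feeding the journal through the same logic. */
    static Engine replay(Path journalFile) {
        Engine recovered = new Engine();
        try (DataInputStream journal = new DataInputStream(
                new BufferedInputStream(Files.newInputStream(journalFile)))) {
            while (true) {
                recovered.onEvent(journal.readLong());
            }
        } catch (EOFException endOfJournal) {
            return recovered;          // end of journal, or a torn final record
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Replay only reproduces the live state if the business logic is deterministic, since the journal stores inputs rather than results.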
Holding all your state in memory
No database
No roll-back
Up-front validation is critical
Never throw exceptions
- result is inconsistent state
System must be deterministic
All operations event sourced
time sourced from events
collections must be ordered
no local configuration
Determinism bugs are really nasty
Only an issue if we have to fail-over or replay
Primary is the source of truth
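Two of the determinism rules, "time sourced from events" and "collections must be ordered", can be illustrated with a small sketch. The DeterministicEngine class and its order-expiry behaviour are hypothetical. The point is only that the logic never reads the system clock and iterates a sorted map, so replaying the same events gives the same output.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class DeterministicEngine {
    // TreeMap iterates in key order, so expiry output order is the same on every replay.
    private final TreeMap<Long, Long> expiryByOrderId = new TreeMap<>();
    private final List<String> output = new ArrayList<>();

    /** The event carries its own timestamp; System.currentTimeMillis() is never called. */
    void onOrderPlaced(long eventTimeMillis, long orderId, long timeToLiveMillis) {
        expireUpTo(eventTimeMillis);
        expiryByOrderId.put(orderId, eventTimeMillis + timeToLiveMillis);
        output.add("placed " + orderId);
    }

    private void expireUpTo(long eventTimeMillis) {
        Iterator<Map.Entry<Long, Long>> it = expiryByOrderId.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Long, Long> entry = it.next();
            if (entry.getValue() <= eventTimeMillis) {
                output.add("expired " + entry.getKey());
                it.remove();
            }
        }
    }

    List<String> output() {
        return output;
    }

    /** Each event is {eventTimeMillis, orderId, timeToLiveMillis}. */
    static List<String> replay(long[][] events) {
        DeterministicEngine engine = new DeterministicEngine();
        for (long[] event : events) {
            engine.onOrderPlaced(event[0], event[1], event[2]);
        }
        return engine.output();
    }
}
```

If the engine used wall-clock time or a hash-ordered collection, a replay on the standby could expire orders at a different point or in a different order than the primary did.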
Gateways
Same principles:
- non-blocking / message passing
- minimise shared state
Stream Processing
[Diagram: Matching Engine (Order Book, All Orders[ ]) with its event stream: Order Added, Order Cancelled, Order Added, Trade, Trade, Order Added, ...]
[Diagram: Matching Engine (Order Book) alongside Market Analysis, Order Book Image, Event Store, AML Alerts]
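As a sketch of stream processing, a downstream service can build its own order book image purely from the event stream. The OrderBookImage class, its method names and the event fields are assumptions. It tracks one side of the book only and uses fixed-point long values for price and quantity.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical downstream view: total open quantity per price level, one side only. */
final class OrderBookImage {
    private final Map<Long, long[]> openOrders = new HashMap<>();       // orderId -> {price, remaining}
    private final TreeMap<Long, Long> quantityByPrice = new TreeMap<>(); // price -> total open quantity

    void onOrderAdded(long orderId, long price, long quantity) {
        openOrders.put(orderId, new long[] {price, quantity});
        quantityByPrice.merge(price, quantity, Long::sum);
    }

    void onOrderCancelled(long orderId) {
        long[] order = openOrders.remove(orderId);
        if (order != null) {
            reduceLevel(order[0], order[1]);
        }
    }

    void onTrade(long orderId, long tradedQuantity) {
        long[] order = openOrders.get(orderId);
        if (order == null) {
            return;                    // order unknown to this image, e.g. after a late join
        }
        order[1] -= tradedQuantity;
        reduceLevel(order[0], tradedQuantity);
        if (order[1] == 0) {
            openOrders.remove(orderId);
        }
    }

    private void reduceLevel(long price, long quantity) {
        long remaining = quantityByPrice.get(price) - quantity;
        if (remaining == 0) {
            quantityByPrice.remove(price);
        } else {
            quantityByPrice.put(price, remaining);
        }
    }

    long quantityAt(long price) {
        return quantityByPrice.getOrDefault(price, 0L);
    }

    int levels() {
        return quantityByPrice.size();
    }
}
```

The matching engine never queries this image. Each consumer keeps the view it needs and can be rebuilt by replaying the events.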
Where latency doesn’t matter...
- How big are the bursts?
- Buffers are your friend
Does data loss matter?
More Reliable Messaging
Handling buffer wraps
‘better never than late’
- reset & late join
persistent data loss
- recover from event store
- journal replay and gap-fill
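Gap-fill needs the subscriber to notice that it missed something. One common way, assumed here and not taken from the slides, is a per-stream sequence number checked on receipt. GapDetector and its method names are hypothetical.

```java
/** Hypothetical receiver-side check: detects missed messages from their sequence numbers. */
final class GapDetector {
    enum Action { DELIVER, DUPLICATE, GAP }

    private long expected;

    GapDetector(long firstExpectedSequence) {
        this.expected = firstExpectedSequence;
    }

    /** Classify an incoming message by its sequence number. */
    Action onMessage(long sequence) {
        if (sequence == expected) {
            expected++;
            return Action.DELIVER;
        }
        if (sequence < expected) {
            return Action.DUPLICATE;   // already seen, e.g. a retransmission
        }
        return Action.GAP;             // messages from nextExpected() up to sequence - 1 were missed
    }

    long nextExpected() {
        return expected;
    }

    /** Call once the missing range, and the message that revealed it, have been recovered. */
    void resumeAfter(long lastRecoveredSequence) {
        expected = lastRecoveredSequence + 1;
    }
}
```

On GAP the caller would fetch the missing range from the event store or a journal replay, or reset and late-join if stale data is worse than no data.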
Low latency applications:
mechanical sympathy
[sam@box ~]$ lstopo
Machine (126GB)
  NUMANode P#0 (63GB)
    Socket P#0
      L3 (30MB)
        L2 (256KB), L1d (32KB), L1i (32KB), Core P#0: PU P#0, PU P#24
        L2 (256KB), L1d (32KB), L1i (32KB), Core P#1: PU P#2, PU P#26
        L2 (256KB), L1d (32KB), L1i (32KB), Core P#2: PU P#4, PU P#28
        L2 (256KB), L1d (32KB), L1i (32KB), Core P#3: PU P#6, PU P#30
        L2 (256KB), L1d (32KB), L1i (32KB), Core P#4: PU P#8, PU P#32
        L2 (256KB), L1d (32KB), L1i (32KB), Core P#5: PU P#10, PU P#34
        ...
Diagram labels: Main Memory, L1/L2 Caches, CPU Core / Hyper Threads
CPUs are faster than memory
Intel Performance Analysis Guide:
L1 CACHE hit, 4 cycles
L2 CACHE hit, 10 cycles
local L3 CACHE hit, ~40-75 cycles
remote L3 CACHE hit, ~100-300 cycles
Local DRAM ~60 ns
Remote DRAM ~100 ns
Memory system optimised for:
Temporal locality
Spatial locality
Equidistant locality
Reference vs Primitives
Long[] vs long[]
public class Cash
{
    long value;
}
Calculations with money
- double: inexact
- BigDecimal: expensive
Fixed-point arithmetic with long
But I want type-safety...
long price1 = 1250000L;
long quantity1 = 1520L;
// BUG
long price2 = quantity1;
Prices, precision: 6dp
1250000L → 1.250000
Quantities, precision: 2dp
1520L → 15.20
With Type Annotations & Units Checker (https://checkerframework.org/):
@Price long price1 = 1250000L;
@Qty long quantity1 = 1520L;
// Compilation error
@Price long price2 = quantity1;
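A small worked example of fixed-point arithmetic with long, using the precisions from the slide (prices at 6dp, quantities at 2dp). The FixedPoint class and its method names are hypothetical. Multiplying the two raw values adds their scales, so the result carries 8 decimal places.

```java
import java.math.BigDecimal;

final class FixedPoint {
    /**
     * Notional value = price * quantity. The raw result has 6 + 2 = 8 decimal places.
     * multiplyExact fails loudly on overflow instead of silently wrapping; in a system
     * that must not throw mid-processing this check belongs in up-front validation.
     */
    static long notional8dp(long price6dp, long quantity2dp) {
        return Math.multiplyExact(price6dp, quantity2dp);
    }

    /** For display only: BigDecimal is used at the edge, never on the hot path. */
    static String format(long value, int decimalPlaces) {
        return BigDecimal.valueOf(value, decimalPlaces).toPlainString();
    }

    public static void main(String[] args) {
        long price = 1250000L;    // 1.250000
        long quantity = 1520L;    // 15.20
        long notional = notional8dp(price, quantity);
        System.out.println(format(price, 6) + " * " + format(quantity, 2)
            + " = " + format(notional, 8)); // prints 1.250000 * 15.20 = 19.00000000
    }
}
```

The arithmetic itself is a single integer multiply, and the result is exact, unlike the double comparison 0.1 + 0.2 == 0.3, which is false.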
java.util vs fastutil
Map<Long,X> vs LongMap<X>
public class HashMap<K,V>
{
    Node<K,V>[] table;

    static class Node<K,V>
    {
        K key;
        V value;
        Node<K,V> next;
    }
}
public class Long2ObjectOpenHashMap<V>
{
    long[] keys;
    V[] values;
}
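To make the layout difference concrete, here is a minimal open-addressing map keyed by primitive long. It is a simplified illustration, not fastutil's implementation: LongToObjectMap is a hypothetical class, it uses linear probing with a separate "used" array, grows at 50% load, and has no remove.

```java
/** Hypothetical long-keyed map: keys live in a flat long[] with no boxing and no Node objects. */
final class LongToObjectMap<V> {
    private long[] keys = new long[16];
    private Object[] values = new Object[16];
    private boolean[] used = new boolean[16];
    private int size;

    /** Linear probing: start at the hashed slot and walk forward until the key or a free slot. */
    private static int slotFor(long key, long[] keys, boolean[] used) {
        int mask = keys.length - 1;               // table length is always a power of two
        long h = key * 0x9E3779B97F4A7C15L;       // multiplicative mix to spread nearby keys
        int slot = (int) (h ^ (h >>> 32)) & mask;
        while (used[slot] && keys[slot] != key) {
            slot = (slot + 1) & mask;
        }
        return slot;
    }

    /** Returns the previous value for the key, or null if there was none. */
    V put(long key, V value) {
        if ((size + 1) * 2 > keys.length) {
            grow();                               // keep load at or below 50% so probes stay short
        }
        int slot = slotFor(key, keys, used);
        @SuppressWarnings("unchecked")
        V previous = used[slot] ? (V) values[slot] : null;
        if (!used[slot]) {
            used[slot] = true;
            keys[slot] = key;
            size++;
        }
        values[slot] = value;
        return previous;
    }

    @SuppressWarnings("unchecked")
    V get(long key) {
        int slot = slotFor(key, keys, used);
        return used[slot] ? (V) values[slot] : null;
    }

    int size() {
        return size;
    }

    private void grow() {
        long[] oldKeys = keys;
        Object[] oldValues = values;
        boolean[] oldUsed = used;
        keys = new long[oldKeys.length * 2];
        values = new Object[oldKeys.length * 2];
        used = new boolean[oldKeys.length * 2];
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldUsed[i]) {
                int slot = slotFor(oldKeys[i], keys, used);
                used[slot] = true;
                keys[slot] = oldKeys[i];
                values[slot] = oldValues[i];
            }
        }
    }
}
```

A lookup touches a contiguous run of the keys array rather than following Node pointers and unboxing Long keys, which fits the spatial locality point made earlier in the talk.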
False sharing: revisit the Disruptor
public class ArrayBlockingQueue<E>
{
    final Object[] items;
    int takeIndex;
    int putIndex;
    int count;

    /** Main lock guarding all access */
    final ReentrantLock lock;
}
public class RingBuffer
{
    // ...
    final Object[] entries;
    final Sequence cursor;
    // ...
}
public class Sequence
{
    long p1, p2, p3, p4, p5, p6, p7;
    long value;
    long p9, p10, p11, p12, p13, p14, p15;
}
Java 8:
public class Sequence
{
    @Contended
    long value;
}
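A sketch of manual padding that does not depend on @Contended. It spreads the padding across a small class hierarchy, on the assumption that the JVM lays out superclass fields before subclass fields, so the padding stays on both sides of the value. All class names here are hypothetical, and the helper method only checks correctness, not speed.

```java
class LeftPadding { long p1, p2, p3, p4, p5, p6, p7; }

class PaddedValue extends LeftPadding { volatile long value; }

class RightPadding extends PaddedValue { long p9, p10, p11, p12, p13, p14, p15; }

/** Hypothetical padded counter: 7 longs on each side keep 'value' alone on a 64-byte cache line. */
final class PaddedSequence extends RightPadding {
    long get() {
        return value;
    }

    /** Single writer only: one thread owns each sequence, others just read it. */
    void set(long newValue) {
        value = newValue;
    }

    /** Two threads, each advancing its own sequence; returns both final values. */
    static long[] runTwoWriters(int iterations) {
        PaddedSequence first = new PaddedSequence();
        PaddedSequence second = new PaddedSequence();
        Thread a = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                first.set(first.get() + 1);
            }
        });
        Thread b = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                second.set(second.get() + 1);
            }
        });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new long[] {first.get(), second.get()};
    }
}
```

Without padding, two such counters allocated next to each other could share a cache line, and each write by one thread would invalidate the line for the other even though they never touch the same field.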
Removing Jitter:
GC & Scheduling
GC Options:
Zero garbage
Massive heap, GC when convenient
Commercial JVM – Azul Zing
Avoiding scheduling jitter
[Diagram: JVM and OS placed on the cores of Socket P#0]
Socket P#0, L3 (30MB); each core has L2 (256KB), L1d (32KB), L1i (32KB)
Core P#0:  PU P#0,  PU P#24
Core P#1:  PU P#2,  PU P#26
Core P#2:  PU P#4,  PU P#28
Core P#3:  PU P#6,  PU P#30
Core P#4:  PU P#8,  PU P#32
Core P#5:  PU P#10, PU P#34
Core P#8:  PU P#12, PU P#36
Core P#9:  PU P#14, PU P#38
Core P#10: PU P#16, PU P#40
Core P#11: PU P#18, PU P#42
Core P#12: PU P#20, PU P#44
Core P#13: PU P#22, PU P#46
Remove reserved CPUs from the kernel scheduler
isolcpus=0,2,4,6,8,24,26,28,30,32
Create CPU sets for system, application
# cset set --set=/system --cpu=18,20,...,46
# cset set --set=/app --cpu=0,2,...,40
Processes default to the / CPU set
Move all threads into /system CPU set
# cset proc --move -k --threads --force \
    --from-set=/ --to-set=/system
Launch application in /app CPU set, taskset to run in pool
$ cset proc --exec /app \
    taskset -c 10,12...38,40 \
    java <args>
Move critical threads onto their own cores
using JNA / JNI
sched_set_affinity(0);
sched_set_affinity(2);
...
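The native call itself needs JNA or JNI and is not shown here. As a hedged helper sketch, the code below only prepares the argument: it parses a taskset-style CPU list and turns it into the kind of bit mask that the Linux sched_setaffinity call works with, where bit N stands for CPU N. CpuList and its methods are hypothetical, and the mask form covers at most 64 CPUs.

```java
import java.util.BitSet;

final class CpuList {
    /** Parse a list such as "0,2,4-6" into the set of CPU ids it names. */
    static BitSet parse(String list) {
        BitSet cpus = new BitSet();
        for (String part : list.split(",")) {
            String[] range = part.trim().split("-");
            int first = Integer.parseInt(range[0]);
            int last = range.length == 2 ? Integer.parseInt(range[1]) : first;
            cpus.set(first, last + 1);   // BitSet.set takes an exclusive upper bound
        }
        return cpus;
    }

    /** Collapse the set into a single 64-bit mask: bit N set means CPU N is allowed. */
    static long toMask(BitSet cpus) {
        long[] words = cpus.toLongArray();
        if (words.length > 1) {
            throw new IllegalArgumentException("mask form only covers CPUs 0-63");
        }
        return words.length == 0 ? 0L : words[0];
    }

    public static void main(String[] args) {
        BitSet isolated = parse("0,2,4,6,8,24,26,28,30,32");
        System.out.println(isolated.cardinality() + " CPUs, mask 0x"
            + Long.toHexString(toMask(isolated)));
    }
}
```

A critical thread would then be pinned to a single CPU from the isolated list, so that nothing else is scheduled alongside it.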
Summary
round-trip a correlation ID
25µs
Thank You!
sam.adams@lmax.com
https://www.lmax.com/blog/staff-blogs/
p.s. we’re hiring!
The End.
Watch the video with slide
synchronization on InfoQ.com!
https://www.infoq.com/presentations/
lmax-trading-architecture

More Related Content

More from C4Media

Next Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy MobileNext Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy MobileC4Media
 
Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020C4Media
 
Understand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java ApplicationsUnderstand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java ApplicationsC4Media
 
Kafka Needs No Keeper
Kafka Needs No KeeperKafka Needs No Keeper
Kafka Needs No KeeperC4Media
 
High Performing Teams Act Like Owners
High Performing Teams Act Like OwnersHigh Performing Teams Act Like Owners
High Performing Teams Act Like OwnersC4Media
 
Does Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to JavaDoes Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to JavaC4Media
 
Service Meshes- The Ultimate Guide
Service Meshes- The Ultimate GuideService Meshes- The Ultimate Guide
Service Meshes- The Ultimate GuideC4Media
 
Shifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDShifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDC4Media
 
CI/CD for Machine Learning
CI/CD for Machine LearningCI/CD for Machine Learning
CI/CD for Machine LearningC4Media
 
Fault Tolerance at Speed
Fault Tolerance at SpeedFault Tolerance at Speed
Fault Tolerance at SpeedC4Media
 
Architectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsArchitectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsC4Media
 
ML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsC4Media
 
Build Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerBuild Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerC4Media
 
User & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleUser & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleC4Media
 
Scaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeScaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeC4Media
 
Make Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereMake Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereC4Media
 
The Talk You've Been Await-ing For
The Talk You've Been Await-ing ForThe Talk You've Been Await-ing For
The Talk You've Been Await-ing ForC4Media
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data EngineeringC4Media
 
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreAutomated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreC4Media
 
Navigating Complexity: High-performance Delivery and Discovery Teams
Navigating Complexity: High-performance Delivery and Discovery TeamsNavigating Complexity: High-performance Delivery and Discovery Teams
Navigating Complexity: High-performance Delivery and Discovery TeamsC4Media
 

More from C4Media (20)

Next Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy MobileNext Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy Mobile
 
Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020
 
Understand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java ApplicationsUnderstand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java Applications
 
Kafka Needs No Keeper
Kafka Needs No KeeperKafka Needs No Keeper
Kafka Needs No Keeper
 
High Performing Teams Act Like Owners
High Performing Teams Act Like OwnersHigh Performing Teams Act Like Owners
High Performing Teams Act Like Owners
 
Does Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to JavaDoes Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to Java
 
Service Meshes- The Ultimate Guide
Service Meshes- The Ultimate GuideService Meshes- The Ultimate Guide
Service Meshes- The Ultimate Guide
 
Shifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDShifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CD
 
CI/CD for Machine Learning
CI/CD for Machine LearningCI/CD for Machine Learning
CI/CD for Machine Learning
 
Fault Tolerance at Speed
Fault Tolerance at SpeedFault Tolerance at Speed
Fault Tolerance at Speed
 
Architectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsArchitectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep Systems
 
ML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.js
 
Build Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerBuild Your Own WebAssembly Compiler
Build Your Own WebAssembly Compiler
 
User & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleUser & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix Scale
 
Scaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeScaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's Edge
 
Make Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereMake Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home Everywhere
 
The Talk You've Been Await-ing For
The Talk You've Been Await-ing ForThe Talk You've Been Await-ing For
The Talk You've Been Await-ing For
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data Engineering
 
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreAutomated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
 
Navigating Complexity: High-performance Delivery and Discovery Teams
Navigating Complexity: High-performance Delivery and Discovery TeamsNavigating Complexity: High-performance Delivery and Discovery Teams
Navigating Complexity: High-performance Delivery and Discovery Teams
 

Recently uploaded

SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 

Recently uploaded (20)

SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Low Latency Trading Architecture at LMAX Exchange

  • 1. Low Latency Trading Architecture Sam Adams QCon London, March, 2017
  • 2. InfoQ.com: News & Community Site. Watch the video with slide synchronization on InfoQ.com! https://www.infoq.com/presentations/lmax-trading-architecture • Over 1,000,000 software developers, architects and CTOs read the site worldwide every month • 250,000 senior developers subscribe to our weekly newsletter • Published in 4 languages (English, Chinese, Japanese and Brazilian Portuguese) • Post content from our QCon conferences • 2 dedicated podcast channels: The InfoQ Podcast, with a focus on Architecture and The Engineering Culture Podcast, with a focus on building • 96 deep dives on innovative topics packed as downloadable emags and minibooks • Over 40 new content items per week
  • 3. Purpose of QCon - to empower software development by facilitating the spread of knowledge and innovation Strategy - practitioner-driven conference designed for YOU: influencers of change and innovation in your teams - speakers and topics driving the evolution and innovation - connecting and catalyzing the influencers and innovators Highlights - attended by more than 12,000 delegates since 2007 - held in 9 cities worldwide Presented at QCon London www.qconlondon.com
  • 5. Typical day: 1,000s of active clients; 100,000s of trades occur; 100,000,000s of orders placed (very bursty: spikes of 100s / ms); 1,000,000,000s of market data updates sent
  • 6. End-to-end latency: 50%: 80 µs; 99%: 150 µs; 99.99%: 500 µs; Max: 4 ms (*)
  • 7. System Architecture Building low latency applications
  • 11. ArrayBlockingQueue vs Disruptor: locking & contention
        public class ArrayBlockingQueue<E> {
            final Object[] items;
            int takeIndex;
            int putIndex;
            int count;
            /** Main lock guarding all access */
            final ReentrantLock lock;
        }
  • 12. ArrayBlockingQueue vs Disruptor: locking & contention vs single writers
        public class ArrayBlockingQueue<E> {
            final Object[] items;
            int takeIndex;
            int putIndex;
            int count;
            /** Main lock guarding all access */
            final ReentrantLock lock;
        }
        public class RingBuffer<E> implements DataProvider<E> {
            // ...
            final long indexMask;
            final Object[] entries;
            final Sequence cursor;
            // ...
        }
        public class BatchEventProcessor<E> {
            final DataProvider<E> dataProvider;
            final Sequence sequence;
        }
  • 13. Producer: Claimed: -1, Published: -1. Consumer: Consumed: -1, Waiting for: 0
  • 14. Claim slot: 0. Producer: Claimed: 0, Published: -1. Consumer: Consumed: -1, Waiting for: 0
  • 15. Publish slot: 0. Producer: Claimed: 0, Published: 0. Consumer: Consumed: -1, Waiting for: 0
  • 16. Producer: Claimed: 0, Published: 0. Consumer: Consumed: -1, Available: 0, Processing: 0
  • 17. Producer: Claimed: 0, Published: 0. Consumer: Consumed: 0, Waiting for: 1
  • 18. Published: 1-3. Producer: Claimed: 3, Published: 3. Consumer: Consumed: 0, Waiting for: 1
  • 19. Producer: Claimed: 3, Published: 3. Consumer: Consumed: 0, Available: 3, Processing: 1,2,3
  • 20. Producer: Claimed: 3, Published: 3. Consumer: Consumed: 3, Waiting for: 4
  • 21. Supports dependency graphs between consumers
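The claim / publish / consume walk-through in slides 13-20 can be sketched as a single-producer, single-consumer ring buffer. This is a minimal illustration only, not the Disruptor's actual code: the class `SpscRing` and its methods are hypothetical names, entries are plain `long` values, and the consumer polls rather than using a wait strategy. Each sequence has exactly one writer, which is the "single writers" point from slide 12.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongConsumer;

/** Illustrative single-producer / single-consumer ring buffer (hypothetical, not the LMAX code). */
final class SpscRing {
    private final long[] entries;       // pre-allocated slots, reused as the sequence wraps
    private final int indexMask;        // size is a power of two, so (seq & mask) == seq % size
    private final AtomicLong published = new AtomicLong(-1); // written only by the producer
    private final AtomicLong consumed = new AtomicLong(-1);  // written only by the consumer
    private long claimed = -1;          // producer-private state

    SpscRing(int sizePowerOfTwo) {
        if (Integer.bitCount(sizePowerOfTwo) != 1) {
            throw new IllegalArgumentException("size must be a power of two");
        }
        entries = new long[sizePowerOfTwo];
        indexMask = sizePowerOfTwo - 1;
    }

    /** Producer: claim the next slot, write it, publish it. Returns false if the ring is full. */
    boolean offer(long value) {
        long next = claimed + 1;
        if (next - consumed.get() > entries.length) {
            return false;               // would wrap over a slot the consumer has not processed
        }
        claimed = next;                                // "Claim slot"
        entries[(int) (next & indexMask)] = value;
        published.set(next);                           // "Publish slot": volatile write
        return true;
    }

    /** Consumer: process everything published so far as one batch. Returns the batch size. */
    int drain(LongConsumer handler) {
        long available = published.get();              // "Available"
        int count = 0;
        for (long seq = consumed.get() + 1; seq <= available; seq++) {
            handler.accept(entries[(int) (seq & indexMask)]);
            count++;
        }
        if (count > 0) {
            consumed.set(available);                   // "Consumed"
        }
        return count;
    }

    long publishedSequence() { return published.get(); }
    long consumedSequence() { return consumed.get(); }
}
```

When the producer runs ahead (slide 18), one `drain` call handles slots 1-3 together, which is the batching effect shown in slide 19.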
  • 23. Asynchronous Pub/Sub messaging: - UDP Multicast: low latency, scalable, unreliable - Services publish / subscribe to topics - topic = unique multicast group - Informatica UMS (aka 29 West LBM) provides * some reliability *
  • 24. Asynchronous Pub/Sub messaging: - Push based - If you miss a message, it is gone - Late-join: no history
  • 25. javassist generated proxies to interfaces
        public interface TradingInstructions {
            void placeOrder(PlaceOrderInstruction instruction);
            void cancelOrder(CancelOrderInstruction instruction);
        }
        Event: long sequence; byte operationIndex; byte[] data; int length
        See GeneratedRingBufferProxyGenerator in disruptor-proxy for inter-thread version: https://github.com/LMAX-Exchange/disruptor-proxy
  • 26. Publisher proxy:
        public void placeOrder(PlaceOrderInstruction arg0) {
            // ...
            event.initialise(sequence, 1); // operation index
            marshaller.encode(arg0, event.outputStream());
            // ...
        }
        Event: long sequence; byte operationIndex; byte[] data; int length
        See GeneratedRingBufferProxyGenerator in disruptor-proxy for inter-thread version: https://github.com/LMAX-Exchange/disruptor-proxy
  • 27. Subscriber proxy:
        Invoker invokers[];
        TradingInstructions implementation;
        public void onEvent(Event event) {
            Invoker invoker = invokers[event.getOperationIndex()];
            invoker.invoke(event.getInputStream(), implementation);
        }
        public void invoke(InputStream input, TradingInstructions implementation) {
            PlaceOrderInstruction arg0 = marshaller.decode(input);
            implementation.placeOrder(arg0);
        }
        Event: long sequence; byte operationIndex; byte[] data; int length
        See GeneratedRingBufferProxyGenerator in disruptor-proxy for inter-thread version: https://github.com/LMAX-Exchange/disruptor-proxy
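The publisher / subscriber proxy idea from slides 25-27 can be sketched with the JDK's `java.lang.reflect.Proxy` in place of javassist-generated classes. That swap is deliberate, to keep the example dependency-free, and it is slower than generated bytecode. Everything here is an assumption made for illustration: the `Instructions` interface takes plain `long` arguments instead of instruction objects, the wire format is one index byte followed by the arguments, and `ProxyMessaging` is a hypothetical name.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.Comparator;
import java.util.function.Consumer;

/** Simplified stand-in for the slide's TradingInstructions (assumption: long arguments only). */
interface Instructions {
    void placeOrder(long orderId, long price);
    void cancelOrder(long orderId);
}

final class ProxyMessaging {
    /** Sort methods by name so both ends derive the same operation index (no overloads assumed). */
    static Method[] operations(Class<?> iface) {
        Method[] methods = iface.getMethods();
        Arrays.sort(methods, Comparator.comparing(Method::getName));
        return methods;
    }

    /** Publisher proxy: encodes [operationIndex][long args...] and hands the bytes to a transport. */
    @SuppressWarnings("unchecked")
    static <T> T publisher(Class<T> iface, Consumer<byte[]> transport) {
        Method[] ops = operations(iface);
        return (T) Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[] {iface},
            (proxy, method, args) -> {
                int index = Arrays.asList(ops).indexOf(method);
                if (index < 0) {
                    throw new UnsupportedOperationException(method.getName());
                }
                int argCount = args == null ? 0 : args.length;
                ByteBuffer buffer = ByteBuffer.allocate(1 + Long.BYTES * argCount);
                buffer.put((byte) index);
                for (int i = 0; i < argCount; i++) {
                    buffer.putLong((Long) args[i]);
                }
                transport.accept(buffer.array());
                return null;
            });
    }

    /** Subscriber side: picks the operation by index, decodes arguments, calls the implementation. */
    static <T> Consumer<byte[]> subscriber(Class<T> iface, T implementation) {
        Method[] ops = operations(iface);
        return message -> {
            ByteBuffer buffer = ByteBuffer.wrap(message);
            Method method = ops[buffer.get()];
            Object[] args = new Object[method.getParameterCount()];
            for (int i = 0; i < args.length; i++) {
                args[i] = buffer.getLong();
            }
            try {
                method.invoke(implementation, args);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        };
    }
}
```

Wiring `publisher(Instructions.class, subscriber(Instructions.class, impl))` gives an in-process loopback; in a real deployment the `Consumer<byte[]>` would be whatever transport carries the bytes between threads or services.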
  • 29. For speed: all working state held in memory. Remove contention: single threaded.
  • 30. Don’t block business logic: buffer for outbound I/O.
  • 31. Don’t block network thread: buffer incoming events.
  • 32. All state in volatile memory: save on shutdown, load on startup.
  • 33. Recover from unclean shutdown: journal incoming events to disk, replay on startup.
  • 34. Replicate events to hot-standby for resiliency. Manual fail-over (also to offsite DR).
  • 35. Holding all your state in memory: no database; no roll-back; up-front validation is critical; never throw exceptions (the result is inconsistent state).
  • 36. System must be deterministic: all operations event sourced; time sourced from events; collections must be ordered; no local configuration.
  • 37. Determinism bugs are really nasty. Only an issue if we have to fail-over or replay. Primary is the source of truth.
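Slides 32-37 describe in-memory state rebuilt by replaying a journal of input events, which only works if processing is deterministic. The following is a rough sketch of that idea under simplifying assumptions: the journal is an in-memory list standing in for a disk file, and `DepositEvent`, `Ledger`, and `Journal` are hypothetical names. Time comes from the event, the collection is ordered, and validation happens before any state is touched so no exception leaves the state half-updated.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** An input event carries its own timestamp, so replay never reads the wall clock. */
final class DepositEvent {
    final long timestampMicros;
    final long accountId;
    final long amount;

    DepositEvent(long timestampMicros, long accountId, long amount) {
        this.timestampMicros = timestampMicros;
        this.accountId = accountId;
        this.amount = amount;
    }
}

/** Single-threaded business logic holding all working state in memory. */
final class Ledger {
    private final TreeMap<Long, Long> balances = new TreeMap<>(); // ordered, so iteration is reproducible
    private long lastEventTime;

    /** Validates up front and rejects by return value, so state is never left half-updated. */
    boolean apply(DepositEvent event) {
        if (event.amount <= 0) {
            return false;                           // rejected before any mutation
        }
        lastEventTime = event.timestampMicros;      // time sourced from the event
        balances.merge(event.accountId, event.amount, Long::sum);
        return true;
    }

    String snapshot() {
        return lastEventTime + ":" + balances;
    }
}

/** Stand-in for the on-disk journal: records inputs in arrival order and can replay them. */
final class Journal {
    private final List<DepositEvent> events = new ArrayList<>();

    void append(DepositEvent event) {
        events.add(event);
    }

    Ledger replay() {
        Ledger recovered = new Ledger();
        for (DepositEvent event : events) {
            recovered.apply(event);
        }
        return recovered;
    }
}
```

Because `apply` depends only on the event and on prior state, a replayed `Ledger` ends up with the same snapshot as the primary that processed the events live, which is the property a hot-standby or a restart relies on.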
  • 39. Same principles: - non-blocking / message passing - minimise shared state
  • 42. [Diagram: Matching Engine with Order Book and All Orders[ ]; event stream: Order Added, Order Cancelled, Order Added, Trade, Trade, Order Added, ...]
  • 43. [Diagram: Matching Engine with Order Book and All Orders[ ]; event stream: Order Added, Order Cancelled, Order Added, Trade, Trade, Order Added, ...; downstream: Market Analysis, Order Book Image, Event Store, AML Alerts, Order Book Image]
  • 44. Where latency doesn’t matter... - How big are the bursts? - Buffers are your friend. Does data loss matter? [Diagram: Market Analysis, Order Book Image, Event Store, AML Alerts, Order Book Image]
  • 55. Handling buffer wraps ‘better never than late’ - reset & late join persistent data loss - recover from event store - journal replay and gap-fill
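The slide mentions gap-fill without showing how a gap is noticed. One common approach, given here purely as an assumption about how such a subscriber might work, is to track the next expected sequence number per topic and classify each arriving message. `GapDetector` and its `Action` values are hypothetical names; fetching the missing range from a journal or event store is left to the caller.

```java
/** Classifies incoming sequence numbers for one topic (illustrative sketch, hypothetical API). */
final class GapDetector {
    enum Action { DELIVER, DROP_DUPLICATE, GAP_FILL }

    private long nextExpected;

    GapDetector(long firstSequence) {
        this.nextExpected = firstSequence;
    }

    /**
     * DELIVER: the message is the next one in order.
     * DROP_DUPLICATE: already seen, for example resent during recovery.
     * GAP_FILL: messages in [nextExpected, sequence) were missed and must be recovered first.
     */
    Action onMessage(long sequence) {
        if (sequence < nextExpected) {
            return Action.DROP_DUPLICATE;
        }
        if (sequence > nextExpected) {
            return Action.GAP_FILL;
        }
        nextExpected++;
        return Action.DELIVER;
    }

    /** Called once the caller has replayed everything up to and including the given sequence. */
    void gapFilledThrough(long sequence) {
        nextExpected = Math.max(nextExpected, sequence + 1);
    }

    long nextExpected() {
        return nextExpected;
    }
}
```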
  • 58. [hwloc topology diagram: Machine (126GB); NUMANode P#0 (63GB); Socket P#0; L3 (30MB); each core has L2 (256KB), L1d (32KB), L1i (32KB) and two PUs (Core P#0: PU P#0, PU P#24; Core P#1: PU P#2, PU P#26; ...). Callouts: Main Memory; L1/L2 Caches; CPU Core / Hyper Threads]
  • 59. CPUs are faster than memory. Intel Performance Analysis Guide: L1 cache hit, 4 cycles; L2 cache hit, 10 cycles; local L3 cache hit, ~40-75 cycles; remote L3 cache hit, ~100-300 cycles; local DRAM, ~60 ns; remote DRAM, ~100 ns. [Same hwloc topology diagram as the previous slide]
  • 60. Memory system optimised for: Temporal locality Spatial locality Equidistant locality
  • 62. Calculations with money - double: inexact - BigDecimal: expensive. Fixed-point arithmetic with long. But I want type-safety...
        public class Cash {
            long value;
        }
  • 63. Prices, precision: 6dp, 1250000L → 1.250000. Quantities, precision: 2dp, 1520L → 15.20
        long price1 = 1250000L;
        long quantity1 = 1520L;
        // BUG
        long price2 = quantity1;
  • 64. With Type Annotations & Units Checker (https://checkerframework.org/). Prices, precision: 6dp, 1250000L → 1.250000. Quantities, precision: 2dp, 1520L → 15.20
        @Price long price1 = 1250000L;
        @Qty long quantity1 = 1520L;
        // Compilation error
        @Price long price2 = quantity1;
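To make the fixed-point convention concrete, here is a small sketch using the precisions from the slides (prices at 6 decimal places, quantities at 2). The `FixedPoint` class, its method names, and the half-up rounding rule are assumptions for illustration; only non-negative values are handled.

```java
/** Fixed-point helpers for the slide's conventions: prices at 6 dp, quantities at 2 dp (sketch). */
final class FixedPoint {
    static final long PRICE_SCALE = 1_000_000L; // 1250000L represents 1.250000
    static final long QTY_SCALE = 100L;         // 1520L represents 15.20

    private FixedPoint() { }

    /**
     * price (6 dp) times quantity (2 dp) gives a raw value at 8 dp; dividing by QTY_SCALE
     * brings it back to 6 dp. Rounds half up; assumes non-negative inputs.
     * Math.multiplyExact throws ArithmeticException on overflow instead of wrapping silently.
     */
    static long notional(long price, long quantity) {
        long raw = Math.multiplyExact(price, quantity);
        return (raw + QTY_SCALE / 2) / QTY_SCALE;
    }

    static String formatPrice(long price) {
        return String.format("%d.%06d", price / PRICE_SCALE, price % PRICE_SCALE);
    }

    static String formatQuantity(long quantity) {
        return String.format("%d.%02d", quantity / QTY_SCALE, quantity % QTY_SCALE);
    }
}
```

With the slide's values, `FixedPoint.notional(1250000L, 1520L)` is `19000000L`, which `formatPrice` renders as `19.000000` (1.25 × 15.20 = 19). The arithmetic stays in `long` throughout, which is why the scale of each value has to be tracked by convention or by a checker.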
  • 65. java.util vs fastutil: Map<Long,X> vs LongMap<X>
        public class HashMap<K,V> {
            Node<K,V>[] table;
            static class Node<K,V> {
                K key;
                V value;
                Node<K,V> next;
            }
        }
        public class Long2ObjectOpenHashMap<V> {
            long[] keys;
            V[] values;
        }
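The layout difference above is that an open-addressing map keeps primitive keys in one flat `long[]` and probes neighbouring slots on collision, rather than chasing `Node` pointers and boxing each `Long` key. A stripped-down sketch of that layout follows. It is not fastutil's implementation: `LongOpenMap` is a hypothetical class with fixed capacity, no removal, no resizing, and no support for null values.

```java
/** Minimal open-addressing map with primitive long keys (illustrative, fixed capacity). */
final class LongOpenMap<V> {
    private final long[] keys;       // keys stored inline: no boxing, no per-entry node object
    private final Object[] values;   // a null value marks an empty slot
    private final int mask;
    private int size;

    LongOpenMap(int capacityPowerOfTwo) {
        if (capacityPowerOfTwo < 2 || Integer.bitCount(capacityPowerOfTwo) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two >= 2");
        }
        keys = new long[capacityPowerOfTwo];
        values = new Object[capacityPowerOfTwo];
        mask = capacityPowerOfTwo - 1;
    }

    /** Linear probing: walk adjacent slots until the key or an empty slot is found. */
    private int slot(long key) {
        int h = Long.hashCode(key) * 0x9E3779B9;   // cheap scramble of the hash bits
        int i = (h ^ (h >>> 16)) & mask;
        while (values[i] != null && keys[i] != key) {
            i = (i + 1) & mask;
        }
        return i;
    }

    @SuppressWarnings("unchecked")
    V put(long key, V value) {
        if (value == null) {
            throw new NullPointerException("null values are not supported");
        }
        int i = slot(key);
        V previous = (V) values[i];
        if (previous == null) {
            if (size == mask) {
                // One slot always stays empty so that probing for a missing key terminates.
                throw new IllegalStateException("map is full");
            }
            keys[i] = key;
            size++;
        }
        values[i] = value;
        return previous;
    }

    @SuppressWarnings("unchecked")
    V get(long key) {
        return (V) values[slot(key)];
    }

    int size() {
        return size;
    }
}
```

Probing walks consecutive array elements, so a lookup tends to stay within a few cache lines, which ties back to the spatial-locality point on slide 60.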
  • 67. False sharing: revisit the Disruptor
        public class ArrayBlockingQueue<E> {
            final Object[] items;
            int takeIndex;
            int putIndex;
            int count;
            /** Main lock guarding all access */
            final ReentrantLock lock;
        }
  • 68. False sharing: revisit the Disruptor
        public class RingBuffer {
            // ...
            final Object[] entries;
            final Sequence cursor;
            // ...
        }
        public class Sequence {
            long p1, p2, p3, p4, p5, p6, p7;
            long value;
            long p9, p10, p11, p12, p13, p14, p15;
        }
  • 69. False sharing: revisit the Disruptor. Java 8:
        public class RingBuffer {
            // ...
            final Object[] entries;
            final Sequence cursor;
            // ...
        }
        public class Sequence {
            @Contended
            long value;
        }
  • 70. Removing Jitter: GC & Scheduling
  • 71. GC Options: Zero garbage Massive heap, GC when convenient Commercial JVM – Azul Zing
  • 75. [CPU topology diagram, labelled JVM and OS: Socket P#0; L3 (30MB); 12 cores (Core P#0-P#5 and P#8-P#13), each with L2 (256KB), L1d (32KB), L1i (32KB) and two PUs (P#0/P#24, P#2/P#26, ..., P#22/P#46)]
  • 76. Remove reserved CPUs from the kernel scheduler: isolcpus=0,2,4,6,8,24,26,28,30,32 [CPU topology diagram of Socket P#0, labelled JVM and OS]
  • 77. Create CPU sets for system, application. CPU sets: /, /system, /app [CPU topology diagram of Socket P#0, labelled JVM and OS]
        # cset set --set=/system --cpu=18,20,...,46
        # cset set --set=/app --cpu=0,2,...,40
  • 78. Processes default to the / CPU set. CPU sets: /, /system, /app [CPU topology diagram of Socket P#0, labelled OS]
  • 79. Move all threads into /system CPU set. CPU sets: /, /system, /app [CPU topology diagram of Socket P#0, labelled OS]
        # cset proc --move -k --threads --force --from-set=/ --to-set=/system
  • 80. Launch application in /app CPU set, taskset to run in pool. CPU sets: /, /system, /app [CPU topology diagram of Socket P#0, labelled JVM and OS]
        $ cset proc --exec /app taskset -c 10,12...38,40 java <args>
  • 81. Move critical threads onto their own cores using JNA / JNI [CPU topology diagram of Socket P#0, labelled JVM and OS]
        sched_set_affinity(0);
        sched_set_affinity(2);
        ...
  • 82. [CPU topology diagram of Socket P#0, labelled JVM and OS: L3 (30MB); 12 cores, each with L2 (256KB), L1d (32KB), L1i (32KB) and two PUs]