Writing and Testing
Higher Frequency Trading Engine
Peter Lawrey
Higher Frequency Trading Ltd
Who am I?
Australian living in the UK. Father of three: 15, 9 and 6.
“Vanilla Java” blog gets 120K page views per month.
3rd for Java on StackOverflow.
Six years designing, developing and supporting HFT
systems in Java for hedge funds, trading houses and
investment banks.
Principal Consultant for Higher Frequency Trading Ltd.
Event driven determinism
Critical operations are modelled as a series of
asynchronous events
Producer is not slowed by the consumer
Can be recorded for deterministic testing and monitoring
Can know the state of the critical system without having to
ask it.
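The recording idea above can be sketched in a few lines. This is a minimal illustration, not the author's implementation: `FillEvent`, `PositionKeeper` and `RecordingFeed` are hypothetical names, and the journal is an in-memory list standing in for a persisted event log.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical event: a signed change in quantity.
final class FillEvent {
    final long quantity;
    FillEvent(long quantity) { this.quantity = quantity; }
}

// The critical component: its state depends only on the events it has seen.
final class PositionKeeper {
    private long position;
    void onFill(FillEvent e) { position += e.quantity; }
    long position() { return position; }
}

// Records every event before handing it on, so the stream can be replayed later.
final class RecordingFeed {
    private final List<FillEvent> journal = new ArrayList<>();
    private final PositionKeeper target;

    RecordingFeed(PositionKeeper target) { this.target = target; }

    void publish(FillEvent e) {
        journal.add(e);
        target.onFill(e);
    }

    // Replaying the journal into a fresh component rebuilds the same state,
    // without asking the live component anything.
    PositionKeeper replay() {
        PositionKeeper copy = new PositionKeeper();
        for (FillEvent e : journal) copy.onFill(e);
        return copy;
    }
}
```

A monitoring process or a test can call `replay()` on the recorded events and compare the result with what the live system reports.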
Transparency and Understanding
Horizontal scalability is valuable for high
throughput.
For low latency, you need simplicity. The less the
system has to do the less time it takes.
Productivity
For many systems, a key driver is how easy it is to add new
features.
For low latency, a key driver is how easy it is to take
redundant operations out of the critical path.
Layering
Traditional design encourages layering to deal with one
concept at a time. A driver is to hide from the developer
what the lower layers are really doing.
In low latency, you need to understand what critical code is
doing, and often combine layers to minimise the work done.
This is more challenging for developers to deal with.
Taming your system
Ultra low GC, ideally not while trading.
Busy waiting isolated critical threads. Giving up the CPU
slows your program by 2-5x.
Lock free coding. While locks are typically cheap, they
make very bad outliers.
Direct access to memory for critical structures. You can
control the layout and minimise garbage.
Latency profile
In a complex system, the latency increases sharply as you
approach the worst latencies.
Latency
In a typical system, the worst 0.1% latency can be ten times
the typical latency, but is often much more. This means
your application needs to be able to track these outliers and
profile them.
This is something most existing tools won't do for you. You
need to build these into your system so you can monitor
production.
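One way to build this tracking into the system is a fixed-size histogram that the critical thread updates itself. The sketch below is an assumption about how such a recorder might look, not code from the talk; `LatencyRecorder` and its methods are hypothetical names. Recording allocates nothing, so it can stay enabled in production.

```java
// Fixed-size histogram with power-of-two buckets; record() creates no garbage.
final class LatencyRecorder {
    // buckets[i] counts samples where floor(log2(nanos)) == i
    private final long[] buckets = new long[64];
    private long worstNanos;
    private long count;

    void record(long nanos) {
        if (nanos < 1) nanos = 1; // clamp so the log2 bucket index is defined
        buckets[63 - Long.numberOfLeadingZeros(nanos)]++;
        if (nanos > worstNanos) worstNanos = nanos;
        count++;
    }

    long count() { return count; }

    long worstNanos() { return worstNanos; }

    // Number of samples that took at least 2^exponent nanoseconds.
    long countAtOrAbove(int exponent) {
        long total = 0;
        for (int i = exponent; i < buckets.length; i++) total += buckets[i];
        return total;
    }
}
```

For example, `countAtOrAbove(16)` gives the number of events slower than roughly 65 microseconds, which is the kind of outlier count you would want to alert on.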
What does a low GC system look like?
Typical tick to trade latency of 60 micros external to the box
Logged Eden space usage every 5 minutes.
Full GC every morning at 5 AM.
Low level Java
Java the language is suitable for low latency
You can use natural Java for non critical code. This should be
the majority of your code.
For critical sections you need a subset of Java and the
libraries which are suitable for low latency.
Low level Java and natural Java integrate very easily, unlike
other low level languages.
Latency reporting
● Look at the percentiles: typical, 90%, 99%, 99.9% and worst in sample.
● You should try to minimise the 99% or 99.9%. You should look at the worst latencies for acceptability.
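A report like this can be produced from raw samples with a nearest-rank percentile. This is a minimal sketch under the assumption that samples are held in a non-empty `long[]` of nanoseconds; `LatencyPercentiles` is a hypothetical helper, not part of any library mentioned in the talk.

```java
import java.util.Arrays;

final class LatencyPercentiles {
    // Nearest-rank percentile of an ascending, non-empty array:
    // the smallest sample with at least pct% of all samples at or below it.
    static long percentile(long[] sorted, double pct) {
        int rank = (int) Math.ceil(pct * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    // Returns {typical (50%), 90%, 99%, 99.9%, worst} for the given samples.
    static long[] summarise(long[] samples) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        return new long[] {
            percentile(sorted, 50),
            percentile(sorted, 90),
            percentile(sorted, 99),
            percentile(sorted, 99.9),
            sorted[sorted.length - 1]
        };
    }
}
```

Sorting a copy is fine for an offline report; on the critical path you would record into a preallocated structure and summarise elsewhere.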
Latency and throughput
● There are periodic disturbances in your system. This means low throughput sees all of these.
● In high throughput systems, the delays not only impact one event, but many events, possibly thousands.
● Test realistic throughputs for your systems, as well as stress tests.
Why ultra low garbage
● Accessing L1 cache is about 3x faster than accessing L2. L2 is 4 to 7 times faster than accessing L3. L3 is shared between cores. One thread running in L1 cache can be faster than using all your CPUs at once using L3 cache.
● Your L1 cache is 32 KB, so if you are creating 32 MB/s of garbage you are filling your L1 cache with garbage every millisecond.
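The arithmetic behind the last point, as a tiny sketch. `CacheFillRate` is a hypothetical helper, and the sizes are treated as binary units (an assumption; the slide does not say which).

```java
final class CacheFillRate {
    // How many times per second a given garbage rate fills a cache of the given size.
    static double fillsPerSecond(long garbageBytesPerSecond, long cacheBytes) {
        return (double) garbageBytesPerSecond / cacheBytes;
    }

    public static void main(String[] args) {
        long l1Bytes = 32L << 10;     // 32 KB L1 data cache
        long garbageRate = 32L << 20; // 32 MB/s of garbage
        // 1024 fills per second, i.e. roughly one per millisecond
        System.out.println(fillsPerSecond(garbageRate, l1Bytes));
    }
}
```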
Recycling is good
Recycling mutable objects works best if:
They replace short or medium lived immutable objects.
The lifecycle is easy to reason about.
Data structure is simple and doesn't change significantly.
These can help eliminate, not just reduce GCs.
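A minimal sketch of recycling, assuming a hypothetical `MutableQuote` that replaces a short-lived immutable quote object. The names and fields are illustrative only.

```java
// Hypothetical mutable quote, reused for every update instead of
// allocating a new immutable object per event.
final class MutableQuote {
    long bidTicks;
    long askTicks;

    MutableQuote set(long bidTicks, long askTicks) {
        this.bidTicks = bidTicks;
        this.askTicks = askTicks;
        return this;
    }

    long spreadTicks() { return askTicks - bidTicks; }
}

final class QuoteHandler {
    private final MutableQuote quote = new MutableQuote(); // allocated once
    private long widestSpread;

    // Called for each market data update; nothing is allocated on this path.
    void onUpdate(long bidTicks, long askTicks) {
        long spread = quote.set(bidTicks, askTicks).spreadTicks();
        if (spread > widestSpread) widestSpread = spread;
    }

    long widestSpread() { return widestSpread; }
}
```

The lifecycle is easy to reason about here because the recycled object never escapes the handler; anything that needs to keep the values must copy them.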
Avoid the kernel
The kernel can be the biggest source of delays in your
system. It can be avoided by
● Kernel bypass network adapters
● Isolating busy waiting CPUs
● Memory mapped files for storage.
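For the memory mapped file option, plain Java already offers `FileChannel.map`. The sketch below is only an illustration using the JDK API, not how Chronicle is implemented; `MappedCounter` is a hypothetical name. Once the region is mapped, reads and writes are ordinary memory accesses rather than a system call per operation.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MappedCounter {
    // Maps the first 8 bytes of the file, increments the long stored there,
    // and returns the new value. The file is created and extended if needed.
    static long increment(Path file) {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            MappedByteBuffer buffer =
                    channel.map(FileChannel.MapMode.READ_WRITE, 0, Long.BYTES);
            long next = buffer.getLong(0) + 1;
            buffer.putLong(0, next); // a plain memory write; the OS writes it back later
            return next;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A real system would map the file once and keep the buffer, rather than remapping on every call as this short example does.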
Avoid the kernel
Binding critical, busy waiting threads to isolated
CPUs can make a big difference to jitter.
Chart: count of interrupts per hour, by length.
Lock free coding
Minimising the use of locks allows threads to perform more
consistently.

More complex to test.

Useful in ultra low latency context

Will scale better.
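As a small illustration of the lock free style, here is a compare-and-set retry loop using `java.util.concurrent.atomic.AtomicLong`. `HighWaterMark` is a hypothetical example class; the talk does not show its own lock free code.

```java
import java.util.concurrent.atomic.AtomicLong;

// Tracks the largest value seen by any thread without taking a lock.
final class HighWaterMark {
    private final AtomicLong max = new AtomicLong(Long.MIN_VALUE);

    // Retry the compare-and-set until our value is stored,
    // or until another thread has already stored something at least as large.
    void offer(long value) {
        long current;
        do {
            current = max.get();
            if (value <= current) return;
        } while (!max.compareAndSet(current, value));
    }

    long get() { return max.get(); }
}
```

No thread ever blocks here, so a descheduled thread cannot hold up the others. The cost is the one noted above: the interleavings are harder to test than code guarded by a lock.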
Faster math

Use double with rounding, or long, instead of BigDecimal:
~100x faster and no garbage.

Use long instead of Date or Calendar

Use sentinel values such as 0, NaN, MIN_VALUE or
MAX_VALUE instead of nullable references.

Use Trove for collections with primitives.
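The first and third points can be sketched with the JDK alone. `FastMathSketch`, `NO_PRICE` and the two-decimal-place scale are assumptions chosen for the example; the rounding shown is only suitable while `value * 100` is well within the range of a `long`.

```java
final class FastMathSketch {
    // Sentinel used instead of a nullable Long to mean "no price yet".
    static final long NO_PRICE = Long.MIN_VALUE;

    // double with rounding: round to 2 decimal places without BigDecimal.
    static double round2(double value) {
        return Math.round(value * 100.0) / 100.0;
    }

    // long as fixed point: a price held as a whole number of hundredths.
    static long toCents(double price) {
        return Math.round(price * 100.0);
    }

    static boolean hasPrice(long cents) {
        return cents != NO_PRICE;
    }
}
```

For example, `round2(0.1 + 0.2)` removes the representation error left by the addition, and sums of `toCents` values are exact integer arithmetic. Neither creates any objects.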
Low latency libraries
Light weight as possible
The essence of what you need and no more
Designed to make full use of your hardware
Performance characteristics are a key requirement.
OpenHFT project
● Thread Affinity binding: OpenHFT/Java-Thread-Affinity
● Low latency persistence and IPC: OpenHFT/Java-Chronicle
● Data structures in off heap memory: OpenHFT/Java-Lang
● Runtime Compiler and loader: OpenHFT/Java-Runtime-Compiler
Apache 2.0 open source.
Java Chronicle
● Designed to allow you to log everything, especially tracing timestamps for profiling.
● Typical IPC latency is less than one microsecond for small messages, and less than 10 microseconds for large messages.
● Supports reading/writing text and binary.
Java Chronicle performance
● Sustained throughput is limited by the bandwidth of the disk subsystem.
● Burst throughput can be 1 to 3 GB per second depending on your hardware.
● Latencies for loads up to 100K events per second are stable on good hardware (ok on a laptop).
● Loads over one million events per second magnify any jitter in your system or application.
Java Chronicle Example
Writing text
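int count = 10 * 1000 * 1000;
// One appender is reused for every message; loop until count excerpts are written.
for (ExcerptAppender e = chronicle.createAppender();
     e.index() < count; ) {
    e.startExcerpt(100);                                // start an excerpt of up to 100 bytes
    e.appendDateTimeMillis(System.currentTimeMillis()); // timestamp as text
    e.append(", id=").append(e.index());
    e.append(", name=lyj").append(e.index());
    e.finish();                                         // complete the excerpt
}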
Writes 10 million messages in 1.7 seconds on this laptop
Java Chronicle Example
Writing binary
ExcerptAppender excerpt = ic.createAppender();
long next = System.nanoTime();
for (int i = 1; i <= runs; i++) {
    double v = random.nextDouble();
    excerpt.startExcerpt(25);          // 1 + 8 + 8 + 8 bytes
    excerpt.writeUnsignedByte('M');    // message type
    excerpt.writeLong(next);           // write time stamp
    excerpt.writeLong(0L);             // read time stamp
    excerpt.writeDouble(v);
    excerpt.finish();
    next += 1e9 / rate;                // time the next message is due, for the target rate
    while (System.nanoTime() < next) ; // busy wait until then
}
Java Chronicle Example
Reading binary
ExcerptTailer excerpt = ic.createTailer();
for (int i = 1; i <= runs; i++) {
    while (!excerpt.nextIndex()) {
        // busy wait
    }
    char ch = (char) excerpt.readUnsignedByte(); // message type
    long writeTS = excerpt.readLong();           // write time stamp
    excerpt.writeLong(System.nanoTime());        // fill in the read time stamp (written as 0L)
    double d = excerpt.readDouble();
}
Java Chronicle Latencies
500K/second
Took 10.11 seconds to write and read 5,000,000 entries
Time 1us: 1.541% 3us: 0.378% 10us: 0.218% 30us: 0.008% 100us: 0.002%
1 million/second
Took 5.01 seconds to write and read 5,000,000 entries
Time 1us: 3.064% 3us: 0.996% 10us: 0.625% 30us: 0.147% 100us: 0.105%
2 million/second
Took 2.51 seconds to write and read 5,000,000 entries
Time 1us: 7.769% 3us: 3.836% 10us: 2.943% 30us: 1.865% 100us: 1.798%
5 million/second
Took 1.01 seconds to write and read 5,000,000 entries
Time 1us: 37.039% 3us: 27.926% 10us: 23.635% 30us: 21% 100us: 21%
How does it perform
With one thread writing and another reading
Chronicle 2.0.1
-Xmx32m
        Tiny (4 B)   Small (16 B)   Medium (64 B)   Large (256 B)
tmpfs   77 M/s       57 M/s         23 M/s          6.6 M/s
ext4    65 M/s       35 M/s         12 M/s          3.2 M/s
Java Affinity
● Designed to help reduce jitter in your system.
● Can reduce the amount of jitter if ~50 micro-seconds is important to you.
● Only really useful for isolated CPUs.
● Understands the CPU layout so you can be declarative about your requirement.
Java Lang
● Supports allocation and deallocation of 64-bit sized off heap memory regions.
● Thread safe data structures.
● Fast low level serialization and deserialization.
● Wraps Unsafe to make it safer to use, without losing too much performance.
Java Runtime Compiler
● Wraps the Compiler API so you can compile in memory from a String and have the class loaded.
● In debug mode, supports writing the text to a directory, allowing you to step into generated code.
● Generated Java code is slower but easier to read/debug than generated byte code.
● Dependency injection from Java is easier to debug and profile than XML.
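To show what compiling from a String involves, here is a sketch that uses only the JDK's `javax.tools` Compiler API. It is not the OpenHFT library's API and, unlike that library, it is not in-memory: it writes the source to a temporary directory (similar to the debug mode described above) and loads the class from there. `CompileFromString` is a hypothetical name, and the class being compiled is assumed to be in the default package.

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class CompileFromString {
    // Writes the source to a temp directory, compiles it there, and loads the class.
    static Class<?> compile(String className, String source) throws Exception {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("no system Java compiler, a JDK is required");
        }
        Path dir = Files.createTempDirectory("generated");
        Path file = dir.resolve(className + ".java");
        Files.write(file, source.getBytes(StandardCharsets.UTF_8));

        // Returns 0 on success; diagnostics go to System.err by default.
        int result = compiler.run(null, null, null, "-d", dir.toString(), file.toString());
        if (result != 0) {
            throw new IllegalArgumentException("compilation failed for " + className);
        }
        // The loader is left open so the class can resolve further classes lazily.
        URLClassLoader loader = new URLClassLoader(new URL[] { dir.toUri().toURL() });
        return loader.loadClass(className);
    }
}
```

Because the generated `.java` file exists on disk, a debugger can step into it, which is the readability advantage over generating byte code directly.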
Higher level interface
Instead of serializing raw messages, you can abstract this
functionality with asynchronous interfaces.
You have one or more interfaces which describe all the messages
into the system and all the messages out of the system.
You can test the processing engine without any queuing/transport
layers.
An example
An interface for messages inbound.
An interface for messages outbound.
All messages via persisted IPC.
Is there a higher level API?
The interfaces look like this
public interface Gw2PeEvents {
    public void small(MetaData metaData, SmallCommand command);
}
public interface Pe2GwEvents {
    public void report(MetaData metaData, SmallReport smallReport);
}
Is there a higher level API?
The processing engine
class PEEvents implements Gw2PeEvents {
    private final Pe2GwWriter pe2GwWriter;
    private final SmallReport smallReport = new SmallReport();

    public PEEvents(Pe2GwWriter pe2GwWriter) {
        this.pe2GwWriter = pe2GwWriter;
    }

    @Override
    public void small(MetaData metaData, SmallCommand command) {
        smallReport.orderOkay(command.clientOrderId);
        pe2GwWriter.report(metaData, smallReport);
    }
}
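Because the engine only talks to interfaces, it can be exercised without any queue or transport. The sketch below is self-contained, so it redefines hypothetical stand-ins for `MetaData`, `SmallCommand` and `SmallReport` (their real fields are not shown in the slides) and assumes the engine can depend on `Pe2GwEvents` rather than the concrete writer. `EngineUnderTest` and `CapturingGateway` are invented names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the message types; the real fields are not shown.
final class MetaData { long sourceTimestampNanos; }

final class SmallCommand { long clientOrderId; }

final class SmallReport {
    long clientOrderId;
    boolean okay;

    void orderOkay(long clientOrderId) {
        this.clientOrderId = clientOrderId;
        this.okay = true;
    }
}

interface Gw2PeEvents { void small(MetaData metaData, SmallCommand command); }

interface Pe2GwEvents { void report(MetaData metaData, SmallReport smallReport); }

// Same logic as the processing engine above, but written against the outbound
// interface (an assumption) so a test double can take the writer's place.
final class EngineUnderTest implements Gw2PeEvents {
    private final Pe2GwEvents out;
    private final SmallReport smallReport = new SmallReport(); // recycled per call

    EngineUnderTest(Pe2GwEvents out) { this.out = out; }

    @Override
    public void small(MetaData metaData, SmallCommand command) {
        smallReport.orderOkay(command.clientOrderId);
        out.report(metaData, smallReport);
    }
}

// Test double: captures what the engine sends. It copies the order id
// because the engine reuses the same SmallReport instance.
final class CapturingGateway implements Pe2GwEvents {
    final List<Long> okayOrderIds = new ArrayList<>();

    @Override
    public void report(MetaData metaData, SmallReport smallReport) {
        if (smallReport.okay) okayOrderIds.add(smallReport.clientOrderId);
    }
}
```

A test then calls `small(...)` directly and inspects `okayOrderIds`; the persisted IPC layer is only needed when the whole system is run end to end.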
Demo
An interface for messages inbound.
An interface for messages outbound.
All messages via persisted IPC.
How does it perform?
On this laptop
[GC 15925K->5838K(120320K), 0.0132370 secs]
[Full GC 5838K->5755K(120320K), 0.0521970 secs]
Started
processed 0
processed 1000000
Processed 2000000
… deleted …
processed 9000000
processed 10000000
Received 10000000
Processed 10,000,000 events in and out in 20.2 seconds
The latency distribution was 0.6, 0.7/2.7, 5/26 (611) us for the
50, 90/99, 99.9/99.99 %tile, (worst)
On an i7 desktop
Processed 10,000,000 events in and out in 20.0 seconds
The latency distribution was 0.3, 0.3/1.6, 2/12 (77) us for the
50, 90/99, 99.9/99.99 %tile, (worst)
Q & A
Blog: Vanilla Java
Libraries: OpenHFT
peter.lawrey@higherfrequencytrading.com

 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 

Writing and testing high frequency trading engines in java

  • 1. Writing and Testing Higher Frequency Trading Engine Peter Lawrey Higher Frequency Trading Ltd
  • 2. Who am I? Australian living in UK. Father of three 15, 9 and 6 “Vanilla Java” blog gets 120K page views per month. 3rd for Java on StackOverflow. Six years designing, developing and supporting HFT systems in Java for hedge funds, trading houses and investment banks. Principal Consultant for Higher Frequency Trading Ltd.
  • 3. Event driven determinism Critical operations are modelled as a series of asynchronous events. Producer is not slowed by the consumer. Can be recorded for deterministic testing and monitoring. Can know the state of the critical system without having to ask it.
  • 4. Transparency and Understanding Horizontal scalability is valuable for high throughput. For low latency, you need simplicity. The less the system has to do, the less time it takes.
  • 5. Productivity For many systems, a key driver is how easy it is to add new features. For low latency, a key driver is how easy it is to take redundant operations out of the critical path.
  • 6. Layering Traditional design encourages layering to deal with one concept at a time. A driver is to hide from the developer what the lower layers are really doing. In low latency, you need to understand what critical code is doing, and often combine layers to minimise the work done. This is more challenging for developers to deal with.
  • 7. Taming your system Ultra low GC, ideally not while trading. Busy waiting isolated critical threads. Giving up the CPU slows your program by 2-5x. Lock free coding. While locks are typically cheap, they make very bad outliers. Direct access to memory for critical structures. You can control the layout and minimise garbage.
  • 8. Latency profile In a complex system, the latency increases sharply as you approach the worst latencies.
  • 9. Latency In a typical system, the worst 0.1% latency can be ten times the typical latency, but is often much more. This means your application needs to be able to track these outliers and profile them. This is something most existing tools won't do for you. You need to build these into your system so you can monitor production.
  • 10. What does a low GC system look like? Typical tick to trade latency of 60 microseconds external to the box. Logged Eden space usage every 5 minutes. Full GC every morning at 5 AM.
  • 11. Low level Java Java the language is suitable for low latency. You can use natural Java for non critical code. This should be the majority of your code. For critical sections you need a subset of Java and the libraries which are suitable for low latency. Low level Java and natural Java integrate very easily, unlike other low level languages.
  • 12. Latency reporting ● Look at the percentiles: typical, 90%, 99%, 99.9% and worst in sample. ● You should try to minimise the 99% or 99.9%. You should look at the worst latencies for acceptability.
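
The percentile reporting on slide 12 can be sketched roughly as below. This is a minimal illustration, not code from the talk: the class name `LatencyReport`, the nearest-rank method and the synthetic samples are all assumptions.

```java
import java.util.Arrays;

/** Hypothetical helper (not from the slides): summarises latency samples by percentile. */
public class LatencyReport {

    /** Nearest-rank percentile of an already sorted array; pct is in the range (0, 100]. */
    public static long percentile(long[] sorted, double pct) {
        if (sorted.length == 0) throw new IllegalArgumentException("no samples");
        int rank = (int) Math.ceil(pct * sorted.length / 100.0);
        int index = Math.max(0, Math.min(sorted.length - 1, rank - 1));
        return sorted[index];
    }

    /** One-line summary: typical, 90%, 99%, 99.9% and the worst sample. */
    public static String summarise(long[] samplesNanos) {
        long[] sorted = samplesNanos.clone();
        Arrays.sort(sorted);
        return String.format("50%%=%d 90%%=%d 99%%=%d 99.9%%=%d worst=%d (ns)",
                percentile(sorted, 50), percentile(sorted, 90), percentile(sorted, 99),
                percentile(sorted, 99.9), sorted[sorted.length - 1]);
    }

    public static void main(String[] args) {
        // Synthetic samples: 999 fast events and one slow outlier.
        long[] samples = new long[1000];
        for (int i = 0; i < samples.length; i++) samples[i] = 1_000 + i;
        samples[999] = 50_000;
        System.out.println(summarise(samples));
    }
}
```

Sorting a copy allocates, so a report like this would normally run off the critical path, for example on a monitoring thread.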
  • 13. Latency and throughput ● There are periodic disturbances in your system. This means low throughput sees all of these. ● In high throughput systems, the delays not only impact one event, but many events, possibly thousands. ● Test realistic throughputs for your systems, as well as stress tests.
  • 14. Why ultra low garbage ● Accessing L1 cache is about 3x faster than accessing L2. L2 is 4 to 7 times faster than accessing L3. L3 is shared between cores. One thread running in L1 cache can be faster than using all your CPUs at once using L3 cache. ● Your L1 cache is 32 KB, so if you are creating 32 MB/s of garbage you are filling your L1 cache with garbage every millisecond.
  • 15. Recycling is good Recycling mutable objects works best if: they replace short or medium lived immutable objects; the lifecycle is easy to reason about; the data structure is simple and doesn't change significantly. These can help eliminate, not just reduce, GCs.
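
As a rough illustration of slide 15, the sketch below reuses one mutable object for every event instead of allocating an immutable one per tick. The names `RecyclingExample`, `PriceUpdate` and `PriceListener` are hypothetical and do not come from the slides or from OpenHFT.

```java
/** Hypothetical sketch: one mutable event object is reused for every tick. */
public class RecyclingExample {

    /** Mutable holder with primitive fields only, so updating it creates no garbage. */
    public static final class PriceUpdate {
        public long instrumentId;
        public long priceTicks;
        public long timestampNanos;

        PriceUpdate set(long instrumentId, long priceTicks, long timestampNanos) {
            this.instrumentId = instrumentId;
            this.priceTicks = priceTicks;
            this.timestampNanos = timestampNanos;
            return this;
        }
    }

    /** Assumed listener shape; it must copy out anything it needs before returning. */
    public interface PriceListener {
        void onPrice(PriceUpdate update);
    }

    private final PriceUpdate recycled = new PriceUpdate(); // allocated once
    private final PriceListener listener;

    public RecyclingExample(PriceListener listener) {
        this.listener = listener;
    }

    /** Called on the critical path: overwrites the recycled object instead of allocating. */
    public void onTick(long instrumentId, long priceTicks, long timestampNanos) {
        listener.onPrice(recycled.set(instrumentId, priceTicks, timestampNanos));
    }

    public static void main(String[] args) {
        long[] lastPrice = new long[1];
        RecyclingExample feed = new RecyclingExample(u -> lastPrice[0] = u.priceTicks);
        for (int i = 0; i < 1_000_000; i++) {
            feed.onTick(42, 10_000 + i, System.nanoTime());
        }
        System.out.println("last price in ticks: " + lastPrice[0]);
    }
}
```

The lifecycle stays easy to reason about because the object is only valid for the duration of one callback; a listener that keeps a reference will see it overwritten by the next tick.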
  • 16. Avoid the kernel The kernel can be the biggest source of delays in your system. It can be avoided by ● Kernel bypass network adapters ● Isolating busy waiting CPUs ● Memory mapped files for storage.
  • 17. Avoid the kernel Binding critical, busy waiting threads to isolated CPUs can make a big difference to jitter. Count of interrupts per hour by length.
  • 18. Lock free coding Minimising the use of locks allows threads to perform more consistently. ● More complex to test. ● Useful in ultra low latency contexts. ● Will scale better.
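
One common lock free building block is a compare-and-set retry loop. The sketch below is an assumption about what such code might look like, not something shown in the talk: it tracks the largest value seen by several threads using `AtomicLong.compareAndSet` instead of a `synchronized` block. The class name `LockFreeMax` is hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: tracks the maximum recorded value without taking a lock. */
public class LockFreeMax {
    private final AtomicLong max = new AtomicLong(Long.MIN_VALUE);

    /** Retries the compare-and-set until this value is published or a larger one already is. */
    public void record(long value) {
        long current;
        do {
            current = max.get();
            if (value <= current) return; // nothing to update
        } while (!max.compareAndSet(current, value));
    }

    public long get() {
        return max.get();
    }

    public static void main(String[] args) throws InterruptedException {
        LockFreeMax worst = new LockFreeMax();
        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            final int base = t * 1000;
            threads[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) worst.record(base + i);
            });
            threads[t].start();
        }
        for (Thread thread : threads) thread.join();
        System.out.println(worst.get()); // prints 3999
    }
}
```

No thread ever blocks here, so a descheduled writer cannot stall the others, which is where locks tend to produce their bad outliers.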
  • 19. Faster math ● Use double with rounding or long instead of BigDecimal: ~100x faster and no garbage. ● Use long instead of Date or Calendar. ● Use sentinel values such as 0, NaN, MIN_VALUE or MAX_VALUE instead of nullable references. ● Use Trove for collections with primitives.
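
A small sketch of the ideas on slide 19, under stated assumptions: prices need at most four decimal places (the `SCALE` constant), and `NaN` is used as the "no price" sentinel. The class and method names are hypothetical.

```java
/** Hypothetical sketch of garbage-free price arithmetic with primitives. */
public class FasterMath {
    /** Assumption: four decimal places is enough precision for these prices. */
    static final long SCALE = 10_000;

    /** Sentinel used instead of a nullable Double reference. */
    static final double NO_PRICE = Double.NaN;

    /** Converts a price to a fixed-point long, e.g. 1.2345 becomes 12345. */
    public static long toFixed(double price) {
        return Math.round(price * SCALE);
    }

    /** Rounds a double to four decimal places; only valid while value * 1e4 fits in a long. */
    public static double round4(double value) {
        return Math.round(value * 1e4) / 1e4;
    }

    public static boolean hasPrice(double price) {
        return !Double.isNaN(price);
    }

    public static void main(String[] args) {
        long total = toFixed(0.1) + toFixed(0.2);
        System.out.println(total == toFixed(0.3));    // true: exact in fixed point
        System.out.println(0.1 + 0.2 == 0.3);         // false: raw double error
        System.out.println(round4(0.1 + 0.2) == 0.3); // true: rounding removes the error
        System.out.println(hasPrice(NO_PRICE));       // false
    }
}
```

Both approaches avoid allocating, but they need an agreed precision up front, which `BigDecimal` does not.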
  • 20. Low latency libraries As lightweight as possible. The essence of what you need and no more. Designed to make full use of your hardware. Performance characteristics are a key requirement.
  • 21. OpenHFT project ● Thread Affinity binding OpenHFT/Java-Thread-Affinity ● Low latency persistence and IPC OpenHFT/Java-Chronicle ● Data structures in off heap memory OpenHFT/Java-Lang ● Runtime Compiler and loader OpenHFT/Java-Runtime-Compiler Apache 2.0 open source.
  • 22. Java Chronicle ● Designed to allow you to log everything. Esp tracing timestamps for profiling. ● Typical IPC latency is less than one micro-second for small messages. And less than 10 micro-seconds for large messages. ● Support reading/writing text and binary.
  • 23. Java Chronicle performance ● Sustained throughput limited by bandwidth of disk subsystem. ● Burst throughput can be 1 to 3 GB per second depending on your hardware ● Latencies for loads up to 100K events per second stable for good hardware (ok on a laptop) ● Latencies for loads over one million per second, magnify any jitter in your system or application.
  • 24. Java Chronicle Example Writing text
        int count = 10 * 1000 * 1000;
        for (ExcerptAppender e = chronicle.createAppender(); e.index() < count; ) {
            e.startExcerpt(100);
            e.appendDateTimeMillis(System.currentTimeMillis());
            e.append(", id=").append(e.index());
            e.append(", name=lyj").append(e.index());
            e.finish();
        }
    Writes 10 million messages in 1.7 seconds on this laptop
  • 25. Java Chronicle Example Writing binary
        ExcerptAppender excerpt = ic.createAppender();
        long next = System.nanoTime();
        for (int i = 1; i <= runs; i++) {
            double v = random.nextDouble();
            excerpt.startExcerpt(25);
            excerpt.writeUnsignedByte('M'); // message type
            excerpt.writeLong(next); // write time stamp
            excerpt.writeLong(0L); // read time stamp
            excerpt.writeDouble(v);
            excerpt.finish();
            next += 1e9 / rate;
            while (System.nanoTime() < next)
                ;
        }
  • 26. Java Chronicle Example Reading binary
        ExcerptTailer excerpt = ic.createTailer();
        for (int i = 1; i <= runs; i++) {
            while (!excerpt.nextIndex()) {
                // busy wait
            }
            char ch = (char) excerpt.readUnsignedByte();
            long writeTS = excerpt.readLong();
            excerpt.writeLong(System.nanoTime());
            double d = excerpt.readDouble();
        }
  • 27. Java Chronicle Latencies
        500K/second: Took 10.11 seconds to write and read 5,000,000 entries
            Time 1us: 1.541% 3us: 0.378% 10us: 0.218% 30us: 0.008% 100us: 0.002%
        1 million/second: Took 5.01 seconds to write and read 5,000,000 entries
            Time 1us: 3.064% 3us: 0.996% 10us: 0.625% 30us: 0.147% 100us: 0.105%
        2 million/second: Took 2.51 seconds to write and read 5,000,000 entries
            Time 1us: 7.769% 3us: 3.836% 10us: 2.943% 30us: 1.865% 100us: 1.798%
        5 million/second: Took 1.01 seconds to write and read 5,000,000 entries
            Time 1us: 37.039% 3us: 27.926% 10us: 23.635% 30us: 21% 100us: 21%
  • 28. How does it perform With one thread writing and another reading
        Chronicle 2.0.1 -Xmx32m | Tiny 4 B | Small 16 B | Medium 64 B | Large 256 B
        tmpfs                   | 77 M/s   | 57 M/s     | 23 M/s      | 6.6 M/s
        ext4                    | 65 M/s   | 35 M/s     | 12 M/s      | 3.2 M/s
  • 29. Java Affinity ● Designed to help reduce jitter in your system. ● Can reduce the amount of jitter if ~50 micro-seconds is important to you. ● Only really useful for isolated CPUs. ● Understands the CPU layout so you can be declarative about your requirement.
  • 30. Java Lang ● Supports allocation and deallocation of 64-bit sized off heap memory regions. ● Thread safe data structures. ● Fast low level serialization and deserialization. ● Wraps Unsafe to make it safer to use, without losing too much performance.
  • 31. Java Runtime Compiler ● Wraps the Compiler API so you can compile in memory from a String and have the class loaded. ● Supports writing the text to a directory in debug mode, allowing you to step into generated code. ● Generated Java code is slower but easier to read/debug than generated byte code. ● Dependency injection from Java is easier to debug and profile than XML.
  • 32. Higher level interface Instead of serializing raw messages, you can abstract this functionality with asynchronous interfaces. You have one or more interfaces which describe all the messages into the system and all the messages out of the system. You can test the processing engine without any queuing/transport layers.
  • 33. An example An interface for messages inbound. An interface for messages outbound. All messages via persisted IPC.
  • 34. Is there a higher level API? The interfaces look like this
        public interface Gw2PeEvents {
            public void small(MetaData metaData, SmallCommand command);
        }

        public interface Pe2GwEvents {
            public void report(MetaData metaData, SmallReport smallReport);
        }
  • 35. Is there a higher level API? The processing engine
        class PEEvents implements Gw2PeEvents {
            private final Pe2GwWriter pe2GwWriter;
            private final SmallReport smallReport = new SmallReport();

            public PEEvents(Pe2GwWriter pe2GwWriter) {
                this.pe2GwWriter = pe2GwWriter;
            }

            @Override
            public void small(MetaData metaData, SmallCommand command) {
                smallReport.orderOkay(command.clientOrderId);
                pe2GwWriter.report(metaData, smallReport);
            }
        }
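
To illustrate the claim that the processing engine can be tested without any queuing or transport layer, here is a minimal sketch. It makes several assumptions that the slides do not confirm: `MetaData`, `SmallCommand` and `SmallReport` are reduced to stand-ins with guessed fields, and the engine takes the `Pe2GwEvents` interface rather than the `Pe2GwWriter` class so that a test can pass in a plain lambda.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: drives the engine directly, with a lambda standing in for the transport. */
public class EngineTestSketch {
    // Minimal stand-ins; their fields are assumptions, the slides do not show these classes.
    static class MetaData { long sourceId; }
    static class SmallCommand { long clientOrderId; }
    static class SmallReport {
        long clientOrderId;
        boolean okay;
        void orderOkay(long clientOrderId) {
            this.clientOrderId = clientOrderId;
            this.okay = true;
        }
    }

    interface Gw2PeEvents { void small(MetaData metaData, SmallCommand command); }
    interface Pe2GwEvents { void report(MetaData metaData, SmallReport smallReport); }

    /** Same shape as the engine on the slide, but depending on the outbound interface. */
    static class PEEvents implements Gw2PeEvents {
        private final Pe2GwEvents out;
        private final SmallReport smallReport = new SmallReport(); // recycled per event

        PEEvents(Pe2GwEvents out) { this.out = out; }

        @Override
        public void small(MetaData metaData, SmallCommand command) {
            smallReport.orderOkay(command.clientOrderId);
            out.report(metaData, smallReport);
        }
    }

    /** Feeds order ids straight into the engine and collects the ids it reports back. */
    public static List<Long> runEngine(long... orderIds) {
        List<Long> acknowledged = new ArrayList<>();
        // The report object is recycled, so copy the id out rather than keeping the reference.
        PEEvents engine = new PEEvents((md, report) -> acknowledged.add(report.clientOrderId));
        MetaData metaData = new MetaData();
        SmallCommand command = new SmallCommand();
        for (long id : orderIds) {
            command.clientOrderId = id;
            engine.small(metaData, command);
        }
        return acknowledged;
    }

    public static void main(String[] args) {
        System.out.println(runEngine(1, 2, 3)); // prints [1, 2, 3]
    }
}
```

Because the inbound and outbound sides are plain interfaces, the same engine can be wired to persisted IPC in production and to an in-memory collector in a unit test.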
  • 36. Demo An interface for messages inbound. An interface for messages outbound. All messages via persisted IPC.
  • 37. How does it perform? On this laptop
        [GC 15925K->5838K(120320K), 0.0132370 secs]
        [Full GC 5838K->5755K(120320K), 0.0521970 secs]
        Started
        processed 0
        processed 1000000
        Processed 2000000
        … deleted …
        processed 9000000
        processed 10000000
        Received 10000000
        Processed 10,000,000 events in and out in 20.2 seconds
        The latency distribution was 0.6, 0.7/2.7, 5/26 (611) us for the 50, 90/99, 99.9/99.99 %tile, (worst)
    On an i7 desktop
        Processed 10,000,000 events in and out in 20.0 seconds
        The latency distribution was 0.3, 0.3/1.6, 2/12 (77) us for the 50, 90/99, 99.9/99.99 %tile, (worst)
  • 38. Q & A Blog: Vanilla Java Libraries: OpenHFT peter.lawrey@higherfrequencytrading.com