Introduction to OpenHFT for Melbourne Java Users Group

Introduction to OpenHFT
Peter Lawrey
Melbourne Java & JVM Users Group.

What is OpenHFT?

Apache 2.0, open source libraries designed
with HFT systems in mind

Designed to be useful in systems with high
performance requirements.

Intended to encourage developers to think
differently about what Java can do.

What is HFT?

HFT stands for High Frequency Trading. There is
no technical definition of what that is.

Too fast to see. Application must measure
itself.

Speed is critical for the commercial success of
the application. A slow HFT system can lose
money in the long term. A fast HFT system
can make money.

What is HFT in Java?

A fast trading system in Java is
< 100 microseconds 90% of the time, with no GCs
during the trading day.

A medium speed trading system in Java is
< 1 ms 95% of the time and rare minor
collections.

A slower trading system in Java is
< 10 ms, 99% or 99.9% of the time with minor
GCs every few minutes.
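
Because such a system is too fast to observe by eye, targets like the ones above are usually checked by timing each event and reporting percentiles. A minimal sketch, assuming System.nanoTime() is precise enough on the target machine; the class and method names are illustrative and not part of OpenHFT:

```java
import java.util.Arrays;

final class LatencyPercentiles {
    /** Nearest-rank percentile (0 < pct <= 100) of an ascending-sorted array. */
    static long percentile(long[] sortedNanos, double pct) {
        int rank = (int) Math.ceil(pct * sortedNanos.length / 100.0);
        return sortedNanos[Math.max(0, rank - 1)];
    }

    /** Times each call of task with System.nanoTime() and returns the sorted latencies. */
    static long[] measure(Runnable task, int runs) {
        long[] nanos = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            nanos[i] = System.nanoTime() - start;
        }
        Arrays.sort(nanos);
        return nanos;
    }

    public static void main(String[] args) {
        // The task here is a stand-in for handling one trading event.
        long[] nanos = measure(() -> Math.sqrt(System.nanoTime()), 100_000);
        System.out.printf("90%%: %d ns, 99%%: %d ns, 99.9%%: %d ns%n",
                percentile(nanos, 90), percentile(nanos, 99), percentile(nanos, 99.9));
    }
}
```

The results array is allocated once before the loop, so the measurement itself creates no garbage while timing.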

Why use Java at all?

Shorter time to market means being able to
chase short term trading opportunities.

Larger developer pool.

Larger open source library pool which can be
used in a commercial context.

Usually the external systems are 10+ times
slower than your Java trading system, so there
are more gains in being smarter about how you
use those external systems.

What is Chronicle?
Very fast embedded persistence for Java.
Functionality is simple and low level by design

Where does Chronicle come from?

Low latency, high frequency trading
– Applications with sub-100 microsecond latency,
measured externally to the system.

High throughput trading systems
– Hundreds of thousands of events per second

Modes of use
– GC free
– Lock-less
– Shared memory
– Text or binary
– Replicated over TCP
– Supports thread affinity

Is there a free version?

It is open source and free with an Apache 2.0
license.

You can pay for training and consulting

Use for Chronicle

Synchronous text logging
– support for SLF4J coming.

Synchronous binary data logging

Use for Chronicle

Messaging between processes
via shared memory

Messaging across systems

Use for Chronicle

Supports recording micro-second timestamps
across the systems

Replay of production data in test

Writing to Chronicle
IndexedChronicle ic = new IndexedChronicle(basePath);
Appender excerpt = ic.createAppender();
for (int i = 1; i <= runs; i++) {
    excerpt.startExcerpt();
    excerpt.writeUnsignedByte('M'); // message type
    excerpt.writeLong(i); // e.g. time stamp
    excerpt.writeDouble(i);
    excerpt.finish();
}
ic.close();

Reading from Chronicle
IndexedChronicle ic = new IndexedChronicle(basePath);
ic.useUnsafe(true); // for benchmarks
Tailer excerpt = ic.createTailer();
for (int i = 1; i <= runs; i++) {
    while (!excerpt.nextIndex()) {
        // busy wait
    }
    char ch = (char) excerpt.readUnsignedByte();
    long l = excerpt.readLong();
    double d = excerpt.readDouble();
    assert ch == 'M';
    assert l == i;
    assert d == i;
    excerpt.finish();
}
ic.close();

How does it perform?
With one thread writing and another reading
(Chronicle 2.0, -Xmx32m -verbose:gc)

        Tiny     Small    Medium   Large
        4 B      16 B     64 B     256 B
tmpfs   77 M/s   57 M/s   23 M/s   6.6 M/s
ext4    65 M/s   35 M/s   12 M/s   3.2 M/s

How does it recover?
Once finish() returns, the OS will do the rest.
If an excerpt is incomplete, it will be pruned.
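
One way to picture this recovery rule is a length prefix that is written last: until finish() stores the length, a reader sees zero and treats the excerpt as absent. The sketch below uses java.nio only and is not Chronicle's actual file format; TinyLog and its methods are hypothetical names, and it cannot represent empty excerpts.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Append-only log: each excerpt is [int length][payload]; length 0 means "not finished". */
final class TinyLog {
    private final ByteBuffer buf;  // may be a MappedByteBuffer over a file
    private int writeStart = 0;    // where the excerpt being written begins
    private int writePos = 4;      // next payload byte (after the length slot)
    private int readPos = 0;       // where the reader's next excerpt begins

    TinyLog(ByteBuffer buf) { this.buf = buf; }

    void writeLong(long v) { buf.putLong(writePos, v); writePos += 8; }

    /** Publishes the current excerpt by writing its length last. */
    void finish() {
        buf.putInt(writeStart, writePos - writeStart - 4);
        writeStart = writePos;
        writePos = writeStart + 4;
    }

    /** Returns the payload of the next finished excerpt, or null if none is complete. */
    ByteBuffer next() {
        if (readPos + 4 > buf.capacity()) return null;
        int len = buf.getInt(readPos);
        if (len == 0) return null;          // unfinished or never written
        ByteBuffer payload = buf.duplicate();
        payload.limit(readPos + 4 + len);
        payload.position(readPos + 4);
        readPos += 4 + len;
        return payload.slice().order(buf.order());
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("tinylog", ".dat");
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Map 64 KB of the file; the OS writes dirty pages back on its own.
            TinyLog log = new TinyLog(ch.map(FileChannel.MapMode.READ_WRITE, 0, 1 << 16));
            log.writeLong(System.nanoTime());
            System.out.println(log.next());           // null: not finished yet
            log.finish();
            System.out.println(log.next().getLong()); // the timestamp written above
        }
    }
}
```

The reader keeps its own position, so in this sketch it can lag the writer by any amount without slowing it down.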

Cache friendly
Data is laid out contiguously and naturally packed.
You can compress some types. Each entry
starts at the byte immediately after the previous one.
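
As an illustration of compressing a type, small integers can be stored in fewer bytes with a stop-bit style encoding: 7 data bits per byte, with the top bit meaning "more bytes follow". This is a sketch of the general technique for non-negative values only; the StopBit class is a hypothetical name and this is not the library's own implementation.

```java
import java.nio.ByteBuffer;

final class StopBit {
    /** Writes a non-negative long using 7 bits per byte, lowest bits first. */
    static void write(ByteBuffer out, long n) {
        if (n < 0) throw new IllegalArgumentException("sketch handles non-negative values only");
        while (n > 0x7F) {
            out.put((byte) (0x80 | (n & 0x7F))); // top bit set: more bytes follow
            n >>>= 7;
        }
        out.put((byte) n);                       // top bit clear: last byte
    }

    /** Reads a value written by write(). */
    static long read(ByteBuffer in) {
        long n = 0;
        int shift = 0;
        byte b;
        while ((b = in.get()) < 0) {             // negative byte means top bit set
            n |= (long) (b & 0x7F) << shift;
            shift += 7;
        }
        return n | ((long) b << shift);
    }
}
```

With this scheme a value below 128 occupies one byte instead of the eight a plain long would take.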

Consumer insensitive
No matter how slow the consumer is, the
producer never has to wait. It never needs to
clean up messages before publishing (as a ring
buffer does).
You can start a consumer at the end of the day,
e.g. for reporting. The consumer can fall behind
the producer by more than the size of main
memory, as a Chronicle is not limited by main
memory.

How does it collect garbage?
There is an assumption that your application has a daily
or weekly maintenance cycle.
This is implemented by closing the files and creating new ones,
i.e. the whole lot is moved, compressed or deleted.
Anything which must be retained can be copied to the new Chronicle.
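
A maintenance cycle like this can be as simple as deriving the base path from the date and deciding which older files to move or delete. The sketch below is one possible naming scheme, not something the library prescribes; DailyRoll, its methods and the example path are all hypothetical.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

final class DailyRoll {
    // Formats a LocalDate as yyyyMMdd, e.g. 20140131.
    private static final DateTimeFormatter FMT = DateTimeFormatter.BASIC_ISO_DATE;

    /** Base path for the files written on the given day. */
    static String basePathFor(String prefix, LocalDate day) {
        return prefix + "-" + FMT.format(day);
    }

    /** True if files from fileDay are older than the retention window. */
    static boolean isExpired(LocalDate fileDay, LocalDate today, int keepDays) {
        return fileDay.isBefore(today.minusDays(keepDays));
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.now();
        // During maintenance: close yesterday's files, then open today's.
        System.out.println("write to " + basePathFor("/data/orders", today));
        System.out.println("expired? " + isExpired(today.minusDays(10), today, 7));
    }
}
```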

Is there a lower level API?
Chronicle 2.0 is based on the OpenHFT Java Lang
library, which supports access to 64-bit native
memory.

Sizes and offsets are long values.

Supports serialization and deserialization.

Thread safe access, including locking.

Is there a higher level API?
You can hide the low level details with an
interface.

Is there a higher level API?
There is a demo program with a simple interface.
This models a “hub” process which takes in events,
processes them and publishes results.

Introduction to HugeCollections
The two main collections are:
− HugeHashMap (off heap, volatile, private)
− SharedHashMap (off heap, persisted, shared)
Both are designed to support billions of entries,
with zero copy serialization.
Concurrent access, with over a million
operations per second, per core.

Creating a SharedHashMap

Uses a builder to create the map as there are
a number of options.

Updating an entry in the SHM

Create an off heap reference from an interface
and update it as if it were on the heap

Accessing a SHM entry

Accessing an entry looks like normal Java
code, except arrays use a method xxxAt(n)
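
The idea of an interface backed by off-heap memory can be sketched with a direct ByteBuffer. The library generates such implementations for you; the hand-written flyweight below only illustrates the pattern. The Quote interface, its fields and the memory layout are hypothetical and are not taken from the slides.

```java
import java.nio.ByteBuffer;

/** Hypothetical value type; array elements use getXxxAt(n) / setXxxAt(n, v). */
interface Quote {
    double getPrice();
    void setPrice(double price);
    int getSizeAt(int n);
    void setSizeAt(int n, int size);
}

/** Flyweight over off-heap memory. Layout: [double price][int size0 .. size3]. */
final class OffHeapQuote implements Quote {
    static final int LEVELS = 4, BYTES = 8 + 4 * LEVELS;
    private final ByteBuffer mem;
    private final int offset;     // start of this entry within mem

    OffHeapQuote(ByteBuffer mem, int offset) { this.mem = mem; this.offset = offset; }

    public double getPrice() { return mem.getDouble(offset); }
    public void setPrice(double price) { mem.putDouble(offset, price); }
    public int getSizeAt(int n) { return mem.getInt(offset + 8 + 4 * check(n)); }
    public void setSizeAt(int n, int size) { mem.putInt(offset + 8 + 4 * check(n), size); }

    private static int check(int n) {
        if (n < 0 || n >= LEVELS) throw new IndexOutOfBoundsException("level " + n);
        return n;
    }

    public static void main(String[] args) {
        ByteBuffer mem = ByteBuffer.allocateDirect(BYTES * 2); // room for two entries
        Quote q = new OffHeapQuote(mem, BYTES);                // view of the second entry
        q.setPrice(101.5);
        q.setSizeAt(0, 700);
        System.out.println(q.getPrice() + " x " + q.getSizeAt(0));
    }
}
```

Calling code reads and writes through the interface as if it were an ordinary object, while the data itself stays outside the heap and adds nothing for the garbage collector to scan.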

Why use SHM?

Shared between processes

Persisted, or “written” to tmpfs e.g. /dev/shm

Can be GC-less, so there is no impact on pause
times.

As little as 1/5th of the memory of
ConcurrentHashMap

TCP/UDP multi-master replication planned.

Performance of CHM
With a 30 GB heap, 12 updates per entry

Performance of SHM
With a 64 MB heap, 12 updates per entry, no GCs
