4. Current Design Flaw
● Bad Assumptions
– Serial Execution Among Instruments
– Batch Job (1 Sec Interval)
– I/O Cost/Latency: Negligible
– Computation Cost/Latency: Negligible
– No Uncertainty
5. Reality
● Parallel execution of Instruments
● Markets change in ms; a fixed response interval of 1 sec (or more)
is 1000X slower
● I/O is SLOW compared to CPU speed (order of 10^9)
● Communication Channel will Break
● Only God has ALL TIMELY information for decision making.
We people live in UNCERTAINTY
● Tradeoff: Best Effort Decision versus Perfect Decision
6. Distributed System: Our Reality
● Parallel: Everything is Moving
● Error Prone (Fault): No 99.9999999...%
Guarantee
● Uncertainty: Who/What/Where/When/Why
● Consensus/Consistency comes at HIGH COST
● Speed limit: Light travels at 300,000 km/sec
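The speed limit above puts a floor under any network round trip, no matter how fast the software is. A minimal sketch of that arithmetic follows; the distances are hypothetical and chosen only for illustration, and the constant is the rounded figure from the slide. Real links are slower than this bound, since signals in fiber travel at roughly two thirds of this speed and switching adds further delay.

```python
# Lower bound on network latency from the speed of light alone.
C_KM_PER_SEC = 300_000  # rounded figure from the slide


def min_one_way_ms(distance_km: float) -> float:
    """Minimum one-way delay in milliseconds over distance_km."""
    return distance_km / C_KM_PER_SEC * 1000.0


def min_round_trip_ms(distance_km: float) -> float:
    """Minimum request/response delay in milliseconds over distance_km."""
    return 2 * min_one_way_ms(distance_km)


if __name__ == "__main__":
    # Hypothetical distances, for illustration only.
    for km in (1, 1_000, 10_000):
        print(f"{km:>6} km: one-way >= {min_one_way_ms(km):.4f} ms, "
              f"round trip >= {min_round_trip_ms(km):.4f} ms")
```

At 1,000 km the round trip alone costs more than 6 ms, which is already several times a millisecond-level processing budget.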
9. Case: Two Generals' Problem
● E. A. Akkoyunlu, K. Ekanadham, and R. V. Huber (1975),
"Some Constraints and Trade-offs in
the Design of Network Communications"
● “A pragmatic approach to dealing with the Two
Generals' Problem is to use schemes that
accept the uncertainty of the communications
channel and not attempt to eliminate it, but
rather mitigate it to an acceptable degree.”
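One common way to "mitigate to an acceptable degree" is retransmission with acknowledgements plus duplicate suppression on the receiver. The sketch below simulates that scheme; `LossyChannel`, `Receiver`, `send_with_retry` and `residual_uncertainty` are hypothetical names invented for this illustration, and the loss model (each message dropped independently) is an assumption. Note that a missing ack means "unknown", not "not delivered": the uncertainty shrinks with each attempt but never reaches zero.

```python
import random


class LossyChannel:
    """Hypothetical channel that drops each message with probability `loss`."""

    def __init__(self, loss: float, seed: int = 0):
        self.loss = loss
        self.rng = random.Random(seed)

    def deliver(self) -> bool:
        return self.rng.random() >= self.loss


class Receiver:
    """Applies each message id once, so retransmissions are harmless."""

    def __init__(self):
        self.seen = set()
        self.applied = []

    def on_message(self, msg_id: int, payload: str) -> None:
        if msg_id not in self.seen:  # duplicates are ignored (idempotent)
            self.seen.add(msg_id)
            self.applied.append(payload)


def send_with_retry(channel, receiver, msg_id, payload, max_attempts=5):
    """Return (acked, attempts). acked=False means the outcome is unknown."""
    for attempt in range(1, max_attempts + 1):
        if channel.deliver():  # the request survives the channel
            receiver.on_message(msg_id, payload)
            if channel.deliver():  # the ack survives the channel
                return True, attempt
        # No ack before the (simulated) timeout: retransmit.
    return False, max_attempts


def residual_uncertainty(loss: float, attempts: int) -> float:
    """Probability that no ack arrives within `attempts` tries."""
    per_try_success = (1.0 - loss) ** 2  # request and ack must both arrive
    return (1.0 - per_try_success) ** attempts
```

With 50% loss per message, `residual_uncertainty(0.5, 5)` is 0.75 to the fifth power, about 0.24; more attempts push it lower at the price of more delay, which is the tradeoff the quote describes.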
21. So we are event-driven ready.
But...
● Components are mostly event driven
● Interfaces among components are BLOCKING
INTERFACES! (block/wait/timeout...)
● One blocking call will ruin the whole infrastructure!
● “Tragedy of the Commons”
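The effect of a single blocking call on an event-driven system can be measured directly. The sketch below uses Python's asyncio purely as a stand-in for the platform's event loop; the function names and the 0.3 s delay are assumptions made for the illustration. A heartbeat task records how long the loop goes between wake-ups while one component either blocks the loop or hands the same wait to a worker thread.

```python
import asyncio
import time


async def heartbeat(gaps: list, stop: asyncio.Event, period: float = 0.01) -> None:
    """Record the time between consecutive wake-ups of the loop."""
    last = time.monotonic()
    while not stop.is_set():
        await asyncio.sleep(period)
        now = time.monotonic()
        gaps.append(now - last)
        last = now


async def blocking_component(delay: float) -> None:
    time.sleep(delay)  # blocks the loop thread: no other task can run


async def offloaded_component(delay: float) -> None:
    loop = asyncio.get_running_loop()
    # The same wait, but it only occupies a thread-pool worker.
    await loop.run_in_executor(None, time.sleep, delay)


async def worst_gap(component, delay: float) -> float:
    gaps: list = []
    stop = asyncio.Event()
    hb = asyncio.create_task(heartbeat(gaps, stop))
    await asyncio.sleep(0.05)  # let the heartbeat start ticking
    await component(delay)
    await asyncio.sleep(0.05)
    stop.set()
    await hb
    return max(gaps)


def measure(component, delay: float = 0.3) -> float:
    """Longest heartbeat gap, in seconds, while `component` runs."""
    return asyncio.run(worst_gap(component, delay))


if __name__ == "__main__":
    print("blocking :", measure(blocking_component))
    print("offloaded:", measure(offloaded_component))
```

In the blocking case the heartbeat stalls for the full delay, so every other component sharing the loop pays for one component's wait. In the offloaded case the heartbeat keeps its normal rhythm.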
25. Vision
● A fault-tolerant, responsive trading platform
– Respond to every market tick/heartbeat
● Maintain “fresh” market snapshot
● React as every tick comes
– Async, parallel order fulfillment
● Handle > 1000 orders per server instance without
pressure
● Millisecond-level processing delay per order action
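The two halves of this vision, a fresh snapshot and asynchronous order fulfillment, can be sketched together. The example below is an illustration under stated assumptions, not the platform's design: `MarketSnapshot`, `fulfill` and `run_orders` are hypothetical names, ticks are assumed to carry a sequence number, the symbol and prices are made up, and a 10 ms `asyncio.sleep` stands in for non-blocking I/O to the order system.

```python
import asyncio


class MarketSnapshot:
    """Keeps only the latest tick per instrument."""

    def __init__(self):
        self.latest = {}

    def on_tick(self, symbol: str, price: float, seq: int) -> None:
        cur = self.latest.get(symbol)
        if cur is None or seq > cur[1]:  # drop stale, out-of-order ticks
            self.latest[symbol] = (price, seq)


async def fulfill(order_id: int, symbol: str, snapshot: MarketSnapshot) -> tuple:
    await asyncio.sleep(0.01)  # stands in for non-blocking I/O to the OMS
    price, _ = snapshot.latest[symbol]  # decide on the freshest known price
    return order_id, symbol, price


async def run_orders(n: int) -> list:
    snap = MarketSnapshot()
    snap.on_tick("ABC", 10.0, seq=1)
    snap.on_tick("ABC", 10.5, seq=3)
    snap.on_tick("ABC", 10.2, seq=2)  # late arrival, ignored
    # All orders wait on their I/O concurrently instead of one after another.
    return await asyncio.gather(*(fulfill(i, "ABC", snap) for i in range(n)))


if __name__ == "__main__":
    results = asyncio.run(run_orders(1000))
    print(len(results), "orders filled at", results[0][2])
```

Run serially, 1000 orders at 10 ms each would take about 10 seconds; run concurrently on one loop, the waits overlap and the batch finishes in a small fraction of that time.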
26. Vision: Unified Event Loop
[Diagram: Libuv event loop as the unified loop]
● Event sources: ZeroMQ socket, normal socket, timer, other async events
● Loop handles: eventfd, fd, timer_t handle, async_t handle
● Parallel tasks (req_t) go to the Thread Pool
● Async tasks: Order Fulfill, Market Snapshot update
● Other event loops, other business modules
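The idea of a unified loop is that timers, sockets and thread-pool work all report back to one place. The sketch below illustrates this with Python's asyncio as a substitute; it does not use the libuv or ZeroMQ APIs named on the slide, and every function name, message and interval in it is an assumption made for the example.

```python
import asyncio
import socket


async def timer_source(events: list, period: float, count: int) -> None:
    for i in range(count):
        await asyncio.sleep(period)
        events.append(("timer", i))


async def socket_source(events: list, rsock: socket.socket, expected: int) -> None:
    loop = asyncio.get_running_loop()
    received = 0
    while received < expected:  # a stream socket may merge or split messages
        data = await loop.sock_recv(rsock, 1024)
        received += len(data)
        events.append(("socket", data))


def cpu_task(n: int) -> int:
    return sum(range(n))  # stands in for work sent to the thread pool


async def pool_source(events: list, n: int) -> None:
    loop = asyncio.get_running_loop()
    result = await loop.run_in_executor(None, cpu_task, n)
    events.append(("pool", result))


async def main() -> list:
    events: list = []
    rsock, wsock = socket.socketpair()
    rsock.setblocking(False)
    wsock.setblocking(False)
    loop = asyncio.get_running_loop()
    try:
        # Three kinds of event source, all serviced by the same loop.
        sources = asyncio.gather(
            timer_source(events, 0.01, 3),
            socket_source(events, rsock, len(b"tick-1tick-2")),
            pool_source(events, 1000),
        )
        await loop.sock_sendall(wsock, b"tick-1")
        await asyncio.sleep(0.02)
        await loop.sock_sendall(wsock, b"tick-2")
        await sources
    finally:
        rsock.close()
        wsock.close()
    return events


if __name__ == "__main__":
    for kind, value in asyncio.run(main()):
        print(kind, value)
```

Business modules only append to or read from `events`; none of them owns a thread or a blocking wait, which is the property the unified loop is meant to provide.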
29. Project Plan
● Goal: A testable new framework
– Algotrading as typical business test case
– High throughput, low latency
– Fault Tolerant with sensible tradeoff strategy
– Side-by-side comparison with current order system
30. Task:
● Project Management (Wei Song)
● Event Loop, ZeroMQ (Tie Gang/Luke)
● Market snapshot update (Wei Song/Ze Yu)
● Order business module rewrite (Ze Yu/Wei Song)
● Testing environment (Wei Song/Tie Gang)
– Simulation data/replay
– OMS simulator
● Testing
– All
– + Calvin + other business staff