Modernizing with Microservices and Fast Data

Presented by Patrick Di Loreto
Head of Engineering
Site: https://developer.williamhill.com/
Blog: http://patricknoir.blogspot.com
Twitter: https://twitter.com/patricknoir

Big Data in Numbers
By the end of 2016 there will be more than
25,000,000,000 devices connected to the internet
In 2013 we produced more data in 2 days
than in the whole of previous human history

What does it mean for us?
- 160 TB of data flow through our system every day
- We push more than 5 million price changes in real time
- On a busy day we have ½ million simultaneous customers on our platform

The Challenge
Build a data platform suitable for the development of modern applications

Requirements
- Be able to process large amounts of data in near real time
- Respect non-functional requirements such as:
  - FAULT TOLERANCE
  - HIGH AVAILABILITY
  - SCALABILITY
- Deal with existing/legacy systems
- Scale team delivery capability through adoption of a Microservices Architecture

False Myth: Microservices Architecture 1/2
• Microservices are not exclusively STATELESS applications!
[Diagram: a monolith and services A, B and C]

False Myth: Microservices Architecture 2/2
• Achieve great ISOLATION without using synchronous protocols
[Diagram: services A, B, C, D and E, a DB, a message bus and the monolith]

Omnia – Distributed Data Management Platform
Respecting Reactive Principles
Based on a Lambda Architecture
• Chronos – Data Source
• Fates – Batch Layer
• NeoCortex – Speed Layer
• Hermes – Serving Layer

Omnia Chronos
Chronos is in charge of collecting/intercepting data
from different sources and making it
available as streams of observable
events.
Data sources are classified as internal or external, and as product centric or customer centric:
• Social media
• Facebook
• Twitter
• Affiliates
• Page viewing
• Articles read, following and followers, bets etc…
• Sports related
• Tweets
• News
• Gaming
• Web Analytics
• Activities within our applications
{
  "type": "bet",
  "version": "1.0",
  "time": "2015-06-03 08:00:31",
  "acquisitionTime": ". . .",
  "source": "WHBetSystem",
  "payload": { … any valid json }
}
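
As an illustration of the envelope above, the following minimal Python sketch wraps an arbitrary payload in the same fields. The function name `make_incident`, the `stake` payload and the reading of `acquisitionTime` as "time the platform received the event" are assumptions made for the example; they are not taken from the Chronos code base.

```python
import json
from datetime import datetime, timezone


def make_incident(kind, source, payload, event_time, version="1.0"):
    """Wrap a payload in an envelope shaped like the sample incident."""
    return {
        "type": kind,
        "version": version,
        "time": event_time,
        # Assumption: acquisitionTime is when the platform received the event.
        "acquisitionTime": datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S"),
        "source": source,
        "payload": payload,  # any valid JSON value
    }


# Hypothetical bet payload, used only to exercise the envelope.
incident = make_incident("bet", "WHBetSystem", {"stake": 10}, "2015-06-03 08:00:31")
encoded = json.dumps(incident)   # what would travel on a stream
decoded = json.loads(encoded)    # what a consumer would read back
```

Keeping the payload opaque means new event types can be introduced without changing the envelope.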

Omnia Chronos
In Chronos you define streams that collect data and
convert/persist it into a stream of Observable[Incident].
[Diagram: Chronos hosting Stream 1, Stream 2 and Stream 3]
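
The idea of turning collected data into a stream of observable incidents can be sketched in a few lines of Python. This is not the Scala Rx API that Chronos uses; the `Observable` class, `to_incident` and the tweet payload are hypothetical names introduced only to show the push-based pattern.

```python
class Observable:
    """A tiny push-based stream: subscribers are called for every emitted item."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, on_next):
        self._subscribers.append(on_next)

    def emit(self, item):
        for on_next in self._subscribers:
            on_next(item)

    def map(self, fn):
        """Return a new Observable that emits fn(item) for each upstream item."""
        downstream = Observable()
        self.subscribe(lambda item: downstream.emit(fn(item)))
        return downstream


def to_incident(raw_tweet):
    # Hypothetical converter: wraps raw source data in an incident envelope.
    return {"type": "tweet", "version": "1.0", "source": "Twitter", "payload": raw_tweet}


raw_tweets = Observable()                # data collected from the source
incidents = raw_tweets.map(to_incident)  # the stream of incidents

received = []
incidents.subscribe(received.append)
raw_tweets.emit({"text": "goal!"})
```

Consumers only see incidents; the source-specific format stays inside the converter.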

Omnia Chronos - Clustering
[Diagram: a Twitter stream on a cluster of three nodes: Chronos 1, Chronos 2 and Chronos 3]
Distributed System Properties:
1. Concurrency
2. Distribution
3. Mobility

Omnia Chronos
• Chronos is built on top of Akka to leverage:
– Location transparency (Mobility)
– Error Kernel Pattern (fail fast and in isolation)
– Concurrency and Distribution for Horizontal and Vertical Scalability
• We use the Scala Rx API to promote non-blocking APIs and achieve
Vertical Scalability
• Data is persisted in Kafka for durability:
– Fast write operations with Zero Copy and Filesystem Cache
– Compaction and Compression to optimise message consumption
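
To make the compaction point concrete: a compacted log retains the most recent record for each key, so a consumer that replays the log reads fewer messages to reach the current state. The Python sketch below only illustrates that idea on an in-memory list; it is not how Kafka implements compaction, and the price records are invented for the example.

```python
def compact(log):
    """Keep only the most recent record for each key, preserving log order."""
    last_offset = {key: offset for offset, (key, _) in enumerate(log)}
    return [
        (key, value)
        for offset, (key, value) in enumerate(log)
        if last_offset[key] == offset
    ]


# Hypothetical price changes keyed by event id.
price_log = [("event-1", 1.5), ("event-2", 2.0), ("event-1", 1.8), ("event-1", 1.7)]
compacted = compact(price_log)
```

After compaction only the latest price per event remains, which is all a late-joining consumer needs.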

Vertical Scalability vs Horizontal Scalability
Horizontal – Distribute the load across different machines (Akka Cluster)
Vertical – Maximise local resource utilisation (non-blocking IO + non-blocking API)
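
The vertical side can be illustrated with a small Python sketch using `asyncio`: while one call waits on IO, the same thread starts the next one instead of sitting idle. The `fetch` function and the short sleep that stands in for a network call are assumptions for the example; the deck's own stack uses Akka and Scala Rx for this.

```python
import asyncio


async def fetch(name, log):
    log.append(f"{name} start")
    await asyncio.sleep(0.01)  # stands in for a non-blocking IO call
    log.append(f"{name} done")
    return name


async def main():
    log = []
    # All three calls are in flight at once on a single thread.
    results = await asyncio.gather(fetch("a", log), fetch("b", log), fetch("c", log))
    return results, log


results, log = asyncio.run(main())
```

All three calls start before any of them completes, so the waiting time overlaps instead of adding up.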

Timing for Machine Operations
Operation                                Time
Execute typical instruction              1/1,000,000,000 sec = 1 nanosec
Fetch from L1 cache memory               0.5 nanosec
Branch misprediction                     5 nanosec
Fetch from L2 cache memory               7 nanosec
Mutex lock/unlock                        25 nanosec
Fetch from main memory                   100 nanosec
Send 2K bytes over 1Gbps network         20,000 nanosec (20µs)
Read 1MB sequentially from memory        250,000 nanosec (250µs)
Fetch from new disk location (seek)      8,000,000 nanosec (8ms)
Read 1MB sequentially from disk          20,000,000 nanosec (20ms)
Send packet US to Europe and back        150,000,000 nanosec (150ms)

Humanised Time (1 nanosec scaled to 1 s)
Operation                                Time
Execute typical instruction              1 s
Fetch from L1 cache memory               0.5 s
Branch misprediction                     5 s
Fetch from L2 cache memory               7 s
Mutex lock/unlock                        25 s
Fetch from main memory                   1½ min
Send 2K bytes over 1Gbps network         5½ hours
Read 1MB sequentially from memory        3 days
Fetch from new disk location (seek)      13 weeks
Read 1MB sequentially from disk          7½ months
Send packet US to Europe and back        5 years
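
The humanised figures come from a change of scale: each nanosecond is read as one second, and the result is expressed in the largest convenient unit. A minimal Python sketch of that conversion follows; the function name `humanise` and the choice of units are illustrative.

```python
def humanise(nanoseconds):
    """Read 1 nanosecond as 1 second, then express the result in a larger unit."""
    seconds = nanoseconds  # 1 ns -> 1 s, so the number itself is unchanged
    units = [
        ("years", 365 * 24 * 3600),
        ("weeks", 7 * 24 * 3600),
        ("days", 24 * 3600),
        ("hours", 3600),
        ("min", 60),
    ]
    for name, size in units:
        if seconds >= size:
            return f"{seconds / size:.1f} {name}"
    return f"{seconds:g} s"


mutex = humanise(25)                    # mutex lock/unlock
main_memory = humanise(100)             # fetch from main memory
network_2k = humanise(20_000)           # send 2K bytes over 1Gbps network
round_trip = humanise(150_000_000)      # packet US to Europe and back
```

On this scale a main memory fetch takes well over a minute and a transatlantic round trip takes years, which is the argument for keeping threads from blocking on IO.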

Omnia Fates
Fates represents the long term memory of Omnia. It is in charge of organising all the incidents recorded by Chronos into
timelines and of creating new information as views, using machine learning, logical reasoning and time series analysis.
• A timeline represents the history: the sequence of incidents performed by a specific entity over time. Timelines
are organised by category. An example is the customer timeline, which might contain all the bets
placed, deposit and withdrawal activities, tweets etc. performed by a specific customer.
A timeline category is not limited to customers; it can be anything, for example a sport event such as a football match
or a competition.
• Views are the result of jobs that process data from:
– Timelines
– Other views
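
The relationship between timelines and views can be sketched as follows. The timeline contents and the two view functions are hypothetical; they only show that a view is derived data computed by a job over a timeline (or over other views).

```python
from collections import Counter

# Hypothetical customer timeline: incidents ordered by time.
customer_timeline = [
    {"type": "login", "time": "2015-06-03 08:00:00", "payload": {}},
    {"type": "deposit", "time": "2015-06-03 08:01:10", "payload": {"amount": 50}},
    {"type": "bet", "time": "2015-06-03 08:02:31", "payload": {"stake": 10}},
    {"type": "bet", "time": "2015-06-03 08:05:02", "payload": {"stake": 5}},
]


def activity_view(timeline):
    """A view: number of incidents per type, derived from one timeline."""
    return Counter(incident["type"] for incident in timeline)


def staking_view(timeline):
    """Another view: total amount staked across all bets in the timeline."""
    return sum(i["payload"]["stake"] for i in timeline if i["type"] == "bet")


counts = activity_view(customer_timeline)
total_staked = staking_view(customer_timeline)
```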

Fates: Batch Layer
Fates represents the long term memory of Omnia. It organises the incidents that
Chronos collected into timelines and also elaborates new information as views by
using machine learning, logical reasoning and time series analysis.
Timelines & Views
• Timeline "Customer: 123": Login, Deposit, Bet placed, …, Logout
• Timeline "Event: 78": Started, Fault, Penalty, …, Goal
• Views: Bets, Deposits, Events, Session

Omnia Fates
Timelines are created from timeline streams: each timeline stream reads data from a Chronos stream and
feeds the right timeline.
[Diagram: Chronos streams feeding Fates]
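
A timeline stream can be pictured as a router: it reads incidents from one stream and appends each to the timeline of the entity it belongs to. The sketch below assumes a hypothetical `customerId` field in the payload and uses an in-memory dictionary in place of real storage.

```python
from collections import defaultdict


def feed_timelines(stream, entity_of):
    """Route each incident to the timeline of its entity, ordered by time."""
    timelines = defaultdict(list)
    for incident in stream:
        timelines[entity_of(incident)].append(incident)
    for incidents in timelines.values():
        incidents.sort(key=lambda i: i["time"])
    return dict(timelines)


# Hypothetical incidents as they might arrive from a stream, out of order.
bet_stream = [
    {"type": "bet", "time": "2015-06-03 08:05:02", "payload": {"customerId": 123}},
    {"type": "bet", "time": "2015-06-03 08:00:31", "payload": {"customerId": 123}},
    {"type": "bet", "time": "2015-06-03 08:01:00", "payload": {"customerId": 456}},
]
customer_timelines = feed_timelines(bet_stream, lambda i: i["payload"]["customerId"])
```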

Omnia Fates - Cassandra
• Fates persists timelines of incidents.
• Column family name: <TimelineCategory>_tl
• Key definition: ( (entityId, date), timestamp )
• The partition key (entityId, date) is a strong hash key, which keeps the Cassandra cluster well balanced
• Composite key: incidents are ordered by timestamp under a specific entity within a day
(date = yyyy-MM-dd)
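
The key layout can be mimicked in plain Python to show its effect: the pair (entityId, date) selects a partition, and rows inside a partition are ordered by timestamp. The helper names and sample values below are assumptions; no Cassandra driver is involved.

```python
from datetime import datetime


def column_family(category):
    """Name of the column family that holds one timeline category."""
    return f"{category}_tl"


def timeline_key(entity_id, timestamp):
    """Build ((entityId, date), timestamp): partition part first, ordering part second."""
    date = timestamp.strftime("%Y-%m-%d")
    return ((entity_id, date), timestamp)


# Hypothetical incidents for customer 123 over two days.
keys = [
    timeline_key(123, datetime(2015, 6, 3, 8, 5, 2)),
    timeline_key(123, datetime(2015, 6, 3, 8, 0, 31)),
    timeline_key(123, datetime(2015, 6, 4, 9, 0, 0)),
]

partitions = {}
for partition, ts in keys:
    partitions.setdefault(partition, []).append(ts)
for rows in partitions.values():
    rows.sort()  # within a partition, incidents are ordered by timestamp
```

Bucketing by day bounds the size of each partition, while reading one entity's day remains a single-partition query.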

Omnia Fates – Separation of Concerns
• Multi data center application for operations and analytics/reporting
• Online analysis versus ETL!

Omnia Fates
• We build views with jobs able to perform:
– Logical reasoning: deduction, induction, abduction
– Timeline analysis: trends, cycles, seasonality
– Other ML: classification, clustering, predictions
• Jobs are performed on top of NeoCortex

Omnia Neo Cortex
• NeoCortex is a runtime platform and a set of libraries to perform concurrent and
distributed computations in a highly resilient way.
• It was initially designed as a library on top of Spark (Streaming), but it evolved into a
platform for reactive microservices, which allows applications to be built with:
– SPARK STREAMING
– AKKA STREAMS
– WILLIAM HILL LAMBDAS
• Applications are deployed in NeoCortex as Docker-isolated microservices. They
can interact with each other using Chronos streams and with client applications through
Hermes.

Omnia Neo Cortex – SPARK STREAMING

Omnia Neo Cortex - Parallelism
[Diagram: a chronos stream feeds a Spark job; the Driver distributes work to Executors 1 to 4 across Stage 1 (map) and Stage 2 (reduceByKey); results go to Hermes (Serving Layer) and to Fates timelines and views]
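
The two stages named in the diagram, map and reduceByKey, can be imitated on a single machine to show what each one computes. This is plain Python, not Spark code: in Spark the same two steps run in parallel on the executors. The `eventId` and `stake` fields are hypothetical.

```python
def map_stage(incidents):
    """Stage 1 (map): turn each incident into a (key, value) pair."""
    return [(i["payload"]["eventId"], i["payload"]["stake"]) for i in incidents]


def reduce_by_key(pairs, combine):
    """Stage 2 (reduceByKey): merge all values that share the same key."""
    result = {}
    for key, value in pairs:
        result[key] = combine(result[key], value) if key in result else value
    return result


# Hypothetical bet incidents read from a stream.
bets = [
    {"type": "bet", "payload": {"eventId": 78, "stake": 10}},
    {"type": "bet", "payload": {"eventId": 79, "stake": 7}},
    {"type": "bet", "payload": {"eventId": 78, "stake": 5}},
]
stakes_per_event = reduce_by_key(map_stage(bets), lambda a, b: a + b)
```

The map stage needs no coordination between records, while the reduce stage has to bring equal keys together, which is why the two sit in separate stages.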

Omnia Hermes
Hermes is the layer in which data is presented for consumption, both B2B and B2C.
Microservices, notifications and data as an API are the key aspects of its design.
• Scalable and simple full duplex communication for the web
• Expresses the correlation between the entities of the model
• Inspired by Falcor (Netflix) and GraphQL (Facebook)

Omnia Hermes
[Diagram: Hermes architecture. Components: DistributedCache; Hermes Node with LocalCache, SubscriptionManager, ClientManager, AuthenticationHandler and Dispatcher; protocols HTTP, WS and TCP; clients Browser (HermesJS) and WHApps; data source Chronos]

Omnia Infrastructure – Mesos/Marathon/Docker

Omnia Infrastructure
[Diagram: layered stack, top to bottom: Omnia, Docker, Marathon, Mesos, five Nodes]
