Modernizing with Microservices and Fast Data
1. Presented by Patrick Di Loreto
Head of Engineering
Site: https://developer.williamhill.com/
BLOG: http://patricknoir.blogspot.com
Twitter: https://twitter.com/patricknoir
2. Big Data in Numbers
By the end of 2016 there will be more than:
25,000,000,000 devices connected to the internet
In 2013 we produced more data in 2 days
than in the whole of prior human history
3. What does it mean for us
- 160 TB of data flow through our system every day
- We push more than 5 million price changes in real time
- On a busy day we have ½ million simultaneous customers on our platform
4. The Challenge
Build a data platform suitable for the development of modern applications
5. Requirements
- Be able to process large amounts of data in near real time
- Respect non-functional requirements such as:
  - FAULT TOLERANCE
  - HIGH AVAILABILITY
  - SCALABILITY
- Deal with existing/legacy systems
- Scale team delivery capability through adoption of a Microservices Architecture
6. False Myth: Microservices Architecture 1/2
• Microservices are not exclusively STATELESS applications!
[Diagram: Monolith; components A, B, C]
7. False Myth: Microservices Architecture 2/2
• Achieve great ISOLATION without using synchronous protocols
[Diagram: Monolith; components A, B, C, D, E; DB; Message Bus]
8. Omnia – Distributed Data Management Platform
Respecting Reactive Principles
Based on a Lambda Architecture:
• Chronos – Data Source
• Fates – Batch Layer
• NeoCortex – Speed Layer
• Hermes – Serving Layer
[Diagram: Omnia and its components Chronos, Fates, NeoCortex, Hermes]
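The split between batch, speed and serving layers can be illustrated with a minimal Python sketch. This is not Omnia code: the function and class names (`batch_view`, `SpeedView`, `serve`) and the "count incidents per type" view are assumptions chosen only to show how a serving layer might merge a recomputed batch view with a real-time delta.

```python
from collections import Counter


def batch_view(incidents):
    """Batch layer: recompute a view from the full recorded history."""
    return Counter(incident["type"] for incident in incidents)


class SpeedView:
    """Speed layer: incrementally counts incidents that arrived after the last batch run."""

    def __init__(self):
        self.counts = Counter()

    def update(self, incident):
        self.counts[incident["type"]] += 1


def serve(batch, speed, incident_type):
    """Serving layer: answer a query by merging the batch view with the real-time delta."""
    return batch[incident_type] + speed.counts[incident_type]


# Hypothetical data: three historical incidents, then one fresh bet.
history = [{"type": "bet"}, {"type": "deposit"}, {"type": "bet"}]
batch = batch_view(history)
speed = SpeedView()
speed.update({"type": "bet"})

print(serve(batch, speed, "bet"))  # 3
```

When the next batch run absorbs the fresh incidents, the speed view can be reset, which keeps the real-time state small.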
10. Omnia Chronos
Chronos is in charge of collecting or intercepting data
from different sources and making it
available as streams of observable
events.
[Diagram: data sources feeding an Observable[ ], classified as Internal / External and Product Centric / Customer Centric]
• Social media
• Facebook
• Twitter
• Affiliates
• Page viewing
• Articles read, following and followers, bets etc…
• Sports related
• Tweets
• News
• Gaming
• Web Analytics
• Activities within our applications
{
  "type": "bet",
  "version": "1.0",
  "time": "2015-06-03 08:00:31",
  "acquisitionTime": ". . .",
  "source": "WHBetSystem",
  "payload": { … any valid json }
}
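A minimal Python sketch of how such an incident envelope could be built and checked. Only the six field names and the sample values "bet", "1.0" and "WHBetSystem" come from the slide; the helper names, the `stake` payload and the reading of `acquisitionTime` as "the moment the collector received the event" are assumptions.

```python
import json
from datetime import datetime, timezone

# Field names taken from the incident example on the slide.
REQUIRED_FIELDS = ("type", "version", "time", "acquisitionTime", "source", "payload")


def make_incident(kind, source, payload, time, version="1.0"):
    """Wrap a source-specific payload in the common incident envelope."""
    return {
        "type": kind,
        "version": version,
        "time": time,
        # Assumption: acquisitionTime is when the collector received the event.
        "acquisitionTime": datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S"),
        "source": source,
        "payload": payload,
    }


def is_valid_incident(doc):
    """Check that every envelope field is present; the payload itself is free-form JSON."""
    return all(field in doc for field in REQUIRED_FIELDS)


# Hypothetical payload, for illustration only.
incident = make_incident("bet", "WHBetSystem", {"stake": 10}, "2015-06-03 08:00:31")
encoded = json.dumps(incident)
print(is_valid_incident(json.loads(encoded)))
```

Keeping the envelope fixed and the payload free-form lets very different sources share one stream format.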
11. Omnia Chronos
In Chronos you define streams that collect data and
convert and persist it as a stream of Observable[Incident].
[Diagram: Chronos with Stream 1, Stream 2, Stream 3]
12. Omnia Chronos - Clustering
[Diagram: Twitter; Chronos 1, Chronos 2, Chronos 3]
Distributed System Properties:
1. Concurrency
2. Distribution
3. Mobility
13. Omnia Chronos
• Chronos is built on top of Akka to leverage:
– Location transparency (Mobility)
– Error Kernel Pattern (fail fast and in isolation)
– Concurrency and Distribution for Horizontal and Vertical Scalability
• We use the Scala Rx API to promote non-blocking APIs and achieve
Vertical Scalability
• Data is persisted in Kafka for durability:
– Fast write operations with Zero Copy and Filesystem Cache
– Compaction and Compression to optimise message consumption
14. Vertical Scalability vs Horizontal Scalability
Horizontal – Distribute the load across different machines (Akka Cluster)
Vertical – Maximise local resource utilisation (non-blocking IO + non-blocking API)
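The vertical side of this comparison can be sketched in Python with `asyncio`, used here only as a stand-in for the non-blocking Akka and Rx APIs named in the deck. The `fetch_price` coroutine and its 0.1 s delay are hypothetical; the point is that one thread can keep many slow IO calls in flight at once instead of blocking on each.

```python
import asyncio
import time


async def fetch_price(market_id):
    """Stand-in for a non-blocking network call (hypothetical, 0.1 s latency)."""
    await asyncio.sleep(0.1)  # yields the thread to other tasks while "waiting on IO"
    return market_id, 1.0 + market_id / 100


async def fetch_all(n):
    """Start n calls concurrently on a single thread and wait for all of them."""
    return await asyncio.gather(*(fetch_price(i) for i in range(n)))


start = time.perf_counter()
prices = asyncio.run(fetch_all(100))
elapsed = time.perf_counter() - start

# Run sequentially with blocking calls, the same work would take about 100 x 0.1 s.
print(f"{len(prices)} calls completed in {elapsed:.2f}s on one thread")
```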
15. Timing for Machine operations
Operation                              Time
Execute typical instruction            1/1,000,000,000 sec = 1 nanosec
Fetch from L1 cache memory             0.5 nanosec
Branch misprediction                   5 nanosec
Fetch from L2 cache memory             7 nanosec
Mutex lock/unlock                      25 nanosec
Fetch from main memory                 100 nanosec
Send 2K bytes over 1Gbps network       20,000 nanosec (20µs)
Read 1MB sequentially from memory      250,000 nanosec (250µs)
Fetch from new disk location (seek)    8,000,000 nanosec (8ms)
Read 1MB sequentially from disk        20,000,000 nanosec (20ms)
Send packet US to Europe and back      150,000,000 nanosec (150ms)
16. Humanised Time
Operation                              Time (1 nanosec scaled to 1 s)
Execute typical instruction            1 s
Fetch from L1 cache memory             0.5 s
Branch misprediction                   5 s
Fetch from L2 cache memory             7 s
Mutex lock/unlock                      25 s
Fetch from main memory                 1½ min
Send 2K bytes over 1Gbps network       5½ hours
Read 1MB sequentially from memory      3 days
Fetch from new disk location (seek)    13 weeks
Read 1MB sequentially from disk        7½ months
Send packet US to Europe and back      5 years
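The humanised figures follow from scaling every duration by a factor of 10^9, so that 1 nanosecond becomes 1 second. A small Python sketch of that arithmetic (the `humanise` helper and its choice of units are illustrative, not part of the original deck):

```python
def humanise(nanoseconds):
    """Scale a machine-time duration so that 1 ns reads as 1 s, then pick a readable unit."""
    seconds = nanoseconds  # scaling by 10^9 turns N nanoseconds into N seconds
    units = (
        ("years", 365 * 86400),
        ("weeks", 7 * 86400),
        ("days", 86400),
        ("hours", 3600),
        ("min", 60),
    )
    for name, size in units:
        if seconds >= size:
            return f"{seconds / size:.1f} {name}"
    return f"{seconds:g} s"


print(humanise(25))           # mutex lock/unlock  -> 25 s
print(humanise(100))          # main memory fetch  -> 1.7 min
print(humanise(20_000))       # 2K over 1Gbps      -> 5.6 hours
print(humanise(8_000_000))    # disk seek          -> 13.2 weeks
print(humanise(150_000_000))  # US-Europe roundtrip -> 4.8 years
```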
18. Omnia Fates
Fates represents the long-term memory of Omnia. It is in charge of organising all the incidents recorded by Chronos into
timelines and of creating new information, as views, by using machine learning, logical reasoning and time series analysis.
• A timeline represents the history: the sequence of incidents performed by a specific entity over time. Timelines
are organised by category. An example is the customer timeline, which might contain all the bets
placed, deposit and withdrawal activities, tweets etc... performed by a specific customer.
A timeline category is not limited to customers; it can be anything, for example a sport event: a football match or a
competition.
• Views are the result of job tasks that process data from:
– Timelines
– Other Views
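As a rough illustration of the timeline idea, the following Python sketch groups incidents by entity within one category and orders each group by time. The `build_timelines` helper, the `customerId` payload field and the sample incidents are assumptions made for the example; only the notions of category, entity and time-ordered incidents come from the slide.

```python
from collections import defaultdict


def build_timelines(incidents, category, entity_of):
    """Group incidents per (category, entity) and order each timeline by incident time."""
    timelines = defaultdict(list)
    for incident in incidents:
        entity = entity_of(incident)
        if entity is not None:  # skip incidents that do not belong to this category
            timelines[(category, entity)].append(incident)
    for events in timelines.values():
        # "YYYY-MM-DD HH:MM:SS" strings sort chronologically when compared as text.
        events.sort(key=lambda i: i["time"])
    return dict(timelines)


# Hypothetical incidents; "customerId" is an assumed payload field.
incidents = [
    {"type": "bet", "time": "2015-06-03 08:00:31", "payload": {"customerId": 123}},
    {"type": "login", "time": "2015-06-03 07:58:02", "payload": {"customerId": 123}},
    {"type": "deposit", "time": "2015-06-03 07:59:10", "payload": {"customerId": 456}},
]

customer_timelines = build_timelines(
    incidents, "customer", lambda i: i["payload"].get("customerId")
)
print([i["type"] for i in customer_timelines[("customer", 123)]])  # ['login', 'bet']
```

The same function could serve another category, such as sport events, by passing a different `entity_of` extractor.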
19. Fates: Batch layer
Fates represents the long-term memory of Omnia. It organises the incidents that
Chronos collected into timelines and also derives new information as views by
using machine learning, logical reasoning and time series analysis.
Omnia: Distributed & Reactive platform for data management
[Diagram: Timelines & Views in Fates (Batch Layer)]
Customer: 123: Login, Deposit, Bet placed, …, Logout
Event: 78: Started, Fault, Penalty, …, Goal
Timelines & Views: Bets, Deposits, Events, Session
20. Omnia Fates
Timelines are created from timeline streams; each timeline stream reads data from a Chronos stream and
feeds the right timeline.
[Diagram: Chronos; Fates]
21. Omnia Fates - Cassandra
• Fates persists timelines of incidents.
• Column Family Name: <TimelineCategory>_tl
• Key Definition: ( (entityId, date), timestamp )
• The partition key is a strong hash key: a well-balanced Cassandra cluster
• Composite key: incidents are ordered by timestamp under a specific entity within a day
(date = yyyy-MM-dd)
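The key layout above can be sketched in plain Python, without a Cassandra driver, to show how an incident maps to a column family, a partition and a position inside that partition. The `timeline_row_key` helper, the use of epoch milliseconds and the choice of UTC for the day bucket are assumptions; the `<TimelineCategory>_tl` name and the `((entityId, date), timestamp)` shape come from the slide.

```python
from datetime import datetime, timezone


def timeline_row_key(category, entity_id, timestamp_ms):
    """Derive where an incident lands: column family, partition key and clustering key."""
    # Assumption: the day bucket (yyyy-MM-dd) is computed in UTC.
    day = datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc).strftime("%Y-%m-%d")
    return {
        "column_family": f"{category}_tl",
        "partition_key": (entity_id, day),  # hashed to spread load across the cluster
        "clustering_key": timestamp_ms,     # orders incidents inside the partition
    }


def to_ms(*args):
    """Helper: UTC calendar time to epoch milliseconds."""
    return int(datetime(*args, tzinfo=timezone.utc).timestamp() * 1000)


morning = timeline_row_key("customer", 123, to_ms(2015, 6, 3, 8, 0, 31))
evening = timeline_row_key("customer", 123, to_ms(2015, 6, 3, 21, 15, 0))
next_day = timeline_row_key("customer", 123, to_ms(2015, 6, 4, 9, 0, 0))

print(morning["column_family"], morning["partition_key"])
print(morning["partition_key"] == evening["partition_key"])   # same entity, same day
print(morning["partition_key"] == next_day["partition_key"])  # new day, new partition
```

Bucketing by day bounds the size of each partition, while the timestamp clustering key keeps a day's incidents in order for range reads.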
22. Omnia Fates – Separation of Concerns
• Multi-data-center application for operations and analytics/reporting
• Online analysis as opposed to ETL!
23. Omnia Fates
• We build views with jobs that can perform:
Logical reasoning
• Deduction
• Induction
• Abduction
Timeline analysis
• Trends
• Cycles
• Seasonality
Other ML
• Classification
• Clustering
• Predictions
Jobs are performed on top of NeoCortex
25. Omnia NeoCortex
• NeoCortex is a runtime platform and a set of libraries for performing concurrent and
distributed computations in a highly resilient way.
• It was initially designed as a library on top of Spark (Streaming), but it evolved into a
platform for Reactive Microservices which allows applications to be built in:
– SPARK STREAMING
– AKKA STREAMS
– WILLIAM HILL LAMBDAS
• Applications are deployed in NeoCortex as Docker-isolated microservices; they
can interact with each other using Chronos streams and with client applications through
Hermes.
30. Omnia Hermes
Hermes is the layer in which data is represented for consumption: B2B and B2C. At its
foundation, microservices, notifications and data as an API are key aspects of the design.
Scalable and simple full-duplex communication for the web
Express the correlation between the entities of the model
Inspired by Falcor (Netflix) and GraphQL (Facebook)
Modern applications need to be designed around the customer's needs
Mobile use cases are tailored to specific contexts, which need to be analysed in real time
Data must be delivered efficiently, be always available and scale linearly.