1) What is data-driven business?
2) What is Lambda Architecture 2.0, and why use it?
3) What problems did it solve for us?
4) Workshop with case study:
Building an A/B testing tool for digital marketing with Lambda Architecture 2.0
1. Lambda Architecture 2.0 for Data-Driven Business
Team
Trieu Nguyen - http://nguyentantrieu.info
Truc Le - https://www.linkedin.com/pub/le-kien-truc/31/379/938
Data-driven + Lambda Architecture = growing business2
mc2ads.com - Fast Data Labs
2. Key questions for us today
1. What if the business is not driven by data?
2. What is Lambda Architecture, and why use it?
3. What problems did it solve for us?
Workshop with case study:
Improving “Flappy bird” with an A/B testing tool and Lambda Architecture 2.0
3. Red bird vs. Blue bird
Which bird could let you down soon?
OK, let’s play the game! Design it better with data.
9. Why Lambda Architecture 2.0?
It helps organize your data infrastructure into an understandable structure and react quickly to context changes.
10. “Vision Without Execution Is Just Hallucination”
OK, cool ideas, but how do we build it?
We are here
11. Our goals
1. Understand the big picture
2. See the reality
3. Take action to make it happen
OK! Let’s turn “Flappy bird” into “Happy bird”!
12. What is Lambda Architecture 2.0?
It’s just the architecture for a data-driven business:
● for reacting to fast data
● for data mining and machine learning on Big Data
● for observable data
● for SQL querying (SQL is the true lambda language!?)
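Sketch (not from the slides): the classic Lambda Architecture pattern serves a query by merging a batch view, recomputed from the full log, with a speed view, updated per event. The event fields `variant` and `type`, the class `SpeedView`, and the sample events below are hypothetical and only illustrate the idea; the deck's own stack uses Kafka, RFX and Spark for these roles.

```python
from collections import Counter


def batch_view(events):
    """Recompute counts per (variant, event type) from the full historical log."""
    return Counter((e["variant"], e["type"]) for e in events)


class SpeedView:
    """Incrementally updated counts for events the batch view has not seen yet."""

    def __init__(self):
        self.counts = Counter()

    def update(self, event):
        self.counts[(event["variant"], event["type"])] += 1


def query(batch, speed, variant, event_type):
    """Answer a query by adding the batch count and the speed count."""
    key = (variant, event_type)
    return batch[key] + speed.counts[key]


# Hypothetical sample events (made up for illustration).
historical = [
    {"variant": "red", "type": "play"},
    {"variant": "red", "type": "quit"},
    {"variant": "blue", "type": "play"},
]
recent = [{"variant": "blue", "type": "play"}]

batch = batch_view(historical)
speed = SpeedView()
for event in recent:
    speed.update(event)

total_blue_plays = query(batch, speed, "blue", "play")
print(total_blue_plays)
```

The batch side can be slow but is always recomputable from raw data; the speed side only has to cover the gap since the last batch run.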
13. Case study:
Improving “Flappy bird” with an A/B testing tool and Lambda Architecture 2.0
● Short introduction to A/B testing
● Set up a full open-source technology stack
● Run example code with Java and Python
16. How? One basic principle is “Test our theory”
From the observable solutions, test them all to find the best one! More at http://en.wikipedia.org/wiki/A/B_testing
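Sketch (not from the slides): one common way to "test our theory" for two variants is a two-proportion z-test on a success rate, for example the share of players who come back for another game. The function name and the counts below are made up for illustration; the deck does not say which statistical test the authors use.

```python
import math


def two_proportion_z_test(success_a, total_a, success_b, total_b):
    """Return (z, two-sided p-value) for the difference between two rates."""
    p_a = success_a / total_a
    p_b = success_b / total_b
    # Pooled rate under the null hypothesis that both variants perform equally.
    pooled = (success_a + success_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: returning players out of 1000 per variant.
z, p_value = two_proportion_z_test(120, 1000, 150, 1000)
print(f"z = {z:.3f}, p = {p_value:.4f}")
```

A small p-value suggests the difference between the red and blue variants is unlikely to be noise; with small samples, keep collecting data before deciding.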
17. Steps
1. Working with the A/B testing tool (using the Abba framework)
2. Let’s play Flappy Bird 2.0!
3. Collecting data → store data as stream (Kafka)
4. Stream processing → real-time view processing (RFX)
5. Batch processing → sampling AB Test (Spark)
6. Query processing → finding facts from experiment (SQL over Phoenix / HBase)
7. Collecting feedback data → Game Design Report
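Sketch (not from the slides): the query step boils down to a SQL aggregation per variant. The example below uses Python's built-in `sqlite3` as a stand-in so it runs anywhere; the table `game_events`, its columns, and the rows are hypothetical, and the Phoenix SQL dialect and connection setup differ from SQLite.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Phoenix / HBase.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE game_events (user_id TEXT, variant TEXT, score INTEGER)"
)

# Hypothetical experiment data (made up for illustration).
rows = [
    ("u1", "blue", 3),
    ("u2", "blue", 5),
    ("u3", "red", 1),
    ("u4", "red", 2),
    ("u5", "red", 6),
]
conn.executemany("INSERT INTO game_events VALUES (?, ?, ?)", rows)

# Number of plays and average score per variant.
query = """
SELECT variant, COUNT(*) AS plays, AVG(score) AS avg_score
FROM game_events
GROUP BY variant
ORDER BY variant
"""
results = conn.execute(query).fetchall()
for variant, plays, avg_score in results:
    print(variant, plays, avg_score)
```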
18. For a simple demo, we use Abba, a simple self-hosted A/B testing framework
19. Why a reactive view in Lambda Architecture 2.0?
UX is the key to successful product development, so we must react to bad UX quickly (with data)
20. Technology stack (5D model)
1) Data collector (I/O networking)
● Netty for event log collector and HTTP server (lambda2)
2) Data persistence (aka: data storage)
● Kafka for distributed message storage (Apache Kafka)
● HBase for scalable big table
3) Data processing
● RFX for fast data processing (RFX framework)
● Python for data sampling in A/B test experiments
● Rx (Java/JS) for reacting to experiment data (ReactiveX)
4) Data analysis
● Measures of uncertainty (Python, Dempster-Shafer theory)
5) Data ad-hoc reporting
● SQL over Phoenix / HBase (http://phoenix.apache.org)
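Sketch (not from the slides): Dempster-Shafer theory combines evidence from independent sources with Dempster's rule of combination. The deck does not show how the authors apply it, so the two evidence sources (`m_retention`, `m_score`) and their masses below are made-up assumptions for illustration only.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of hypotheses to a mass;
    the masses of one function sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to contradictory hypotheses.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalize so the remaining masses sum to 1.
    return {h: m / (1.0 - conflict) for h, m in combined.items()}


RED = frozenset({"red"})
BLUE = frozenset({"blue"})
EITHER = RED | BLUE  # "cannot tell which variant is better"

# Hypothetical evidence about which bird is the better design.
m_retention = {BLUE: 0.6, EITHER: 0.4}
m_score = {BLUE: 0.5, RED: 0.2, EITHER: 0.3}

fused = combine(m_retention, m_score)
for hypothesis, mass in fused.items():
    print(sorted(hypothesis), round(mass, 4))
```

The mass left on `EITHER` is the remaining uncertainty: evidence that does not favor one variant over the other.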