SlideShare a Scribd company logo
1 of 27
Download to read offline
Fast Data Processing with RFX
Simplify Fast Data Processing
trieunt@fpt.com.vn
tantrieuf31@gmail.com
Today topic : We would talk about all things in this red circle
Demo first
https://github.com/rfxlab/pageview-analytics-with-rfx
Content at glance
1. BEAM✲ methodology for agile data warehouse
2. Introduction to Fast Data
3. Problem “Fast Data in web analytics”
4. Examples for fast data design pattern (RFX or Reactive Function X)
4.1. Event data actor
4.2. Event data agent
4.3. Event data collector
4.4. Event data router
4.5. Event data processor
4.6. Event data storage
4.7. Event data query
4.8. Event data reactor
5. Demo “Fast Data in web analytics” with source code explanation
1 - BEAM✲ methodology
1 - BEAM✲ methodology for Agile Data Warehouse
BEAM✲ stands for Business Event Analysis &
Modelling, and it’s a methodology for gathering business
requirements for Agile Data Warehouses and building
those warehouses.
It was developed by Lawrence Corr (@LawrenceCorr) and
Jim Stagnitto (@JimStag), and published in their book Agile
Data Warehouse Design: Collaborative Dimensional
Modeling, from Whiteboard to Star Schema.
Example with BEAM✲
Goal: Modeling all business events and put into a database
in agile way
2 - Fast Data
Introduction to Fast Data
3 - Problems in Practice
Problems
“Fast Data in web analytics”
1. Counting pageview of website
2. Counting unique user of website
3. Sending email when pageview is unnormal (simple DDOS
attack detection)
4 - Thinking with RFX
● A design pattern to solve big fast data problems
● A collection of Open Source Tools
● The mission of RFX
1. Build data product quickly with design patterns
2. Apply BEAM✲ for agile data pipeline
3. React to critical events in near-real-time
What is RFX or Reactive Function X ?
RFX framework
What ?
● The Java framework, is built from open source projects:
○ Based on core Akka Actor ( http://akka.io )
○ Lightweight DAO with Spring JDBC ( https://spring.io )
○ Netty ( http://netty.io ) and VertX ( http://vertx.io/ )
○ Common utils class for Apache { Kafka, Hadoop , Spark }
○ Common utils class for NoSQL ( Redis ( http://redis.io ), MongoDB )
● a R&D project, started since 11/2013 for fast data processing
Why ?
● Divide Java code into modules:
○ common infrastructure code ( rfx-stream )
○ business logic code ( check valid data stream )
○ machine learning code ( automation & optimization )
● Focus on best practices and reusability
● Foundation for scalability (system and business)
● Test-driven development for Real-Time Analytics
● Continuous integration & improvement
Your business logic here
Reactive Function (X) Philosophy
Core elements of rfx-stream
Core backend modules
rfx-track:
● collecting all events from log agent
rfx-stream:
● processing stream data (PipelineProcessing pattern)
● processing real-time analytics
● processing business logic (by reactive function)
rfx-cronjob:
● synchronizing real-time data to report database (by
parsing data in Redis and update to Report database)
Core frontend modules
rfx-report:
● visualizing data in real-time
● monitoring real-time event
rfx-agent:
● tracking user activity: heatmap data, pageview, ...
● logging user activity to rfx-track (via network
protocol: HTTP, TCP or UDP)
How to solve problems with RFX ?
Use Cases in “Fast Data in web analytics”
1. Counting pageview of website
2. Counting unique user of website
3. Sending email when pageview is unnormal (simple
DDOS attack detection)
Apply RFX into Pageview Analytics
1.1. Event data actor: a web user
1.2. Event data agent: RFX-track-js
1.3. Event data collector: RFX-track-server
1.4. Event data queue: Apache Kafka
1.5. Event data processor: RFX-stream
1.6. Event data storage: Redis, MySQL
1.7. Event data query: RFX-data-api
1.8. Event data reactor: RFX-reactor
Demo and Explanation for code and concepts
Readings
● http://www.decisionone.co.uk/press/agile-data-warehouse-design-sampler.pdf
● http://www.slideshare.net/votrongdao/agile-data-warehouse-34427798
● Apache Kafka Installation Video | How To Setup Apache Kafka https://youtu.be/Fg8cTsEk7Gc
● https://www.tutorialspoint.com/apache_kafka/
● https://kafka.apache.org/quickstart
● http://xyu.io/2015/07/13/building-a-faster-etl-pipeline-with-flume-kafka-and-hive/
● http://blog.cloudera.com/blog/2015/06/architectural-patterns-for-near-real-time-data-pr
ocessing-with-apache-hadoop/
● https://www.oreilly.com/ideas/drivetrain-approach-data-products

More Related Content

What's hot

Data analytic for mobile app development
Data analytic for mobile app developmentData analytic for mobile app development
Data analytic for mobile app developmentTrieu Nguyen
 
Misusing MLflow To Help Deduplicate Data At Scale
Misusing MLflow To Help Deduplicate Data At ScaleMisusing MLflow To Help Deduplicate Data At Scale
Misusing MLflow To Help Deduplicate Data At ScaleDatabricks
 
CI/CD Templates: Continuous Delivery of ML-Enabled Data Pipelines on Databricks
CI/CD Templates: Continuous Delivery of ML-Enabled Data Pipelines on DatabricksCI/CD Templates: Continuous Delivery of ML-Enabled Data Pipelines on Databricks
CI/CD Templates: Continuous Delivery of ML-Enabled Data Pipelines on DatabricksDatabricks
 
London atlassian meetup 31 jan 2016 jira metrics-extract slides
London atlassian meetup 31 jan 2016 jira metrics-extract slidesLondon atlassian meetup 31 jan 2016 jira metrics-extract slides
London atlassian meetup 31 jan 2016 jira metrics-extract slidesRudiger Wolf
 
The journey toward a self-service data platform at Netflix - sf 2019
The journey toward a self-service data platform at Netflix - sf 2019The journey toward a self-service data platform at Netflix - sf 2019
The journey toward a self-service data platform at Netflix - sf 2019Karthik Murugesan
 
City of Amsterdam: High velocity development
City of Amsterdam: High velocity developmentCity of Amsterdam: High velocity development
City of Amsterdam: High velocity developmentBoris van Hoytema
 
Time travel and time series analysis with pandas + statsmodels
Time travel and time series analysis with pandas + statsmodelsTime travel and time series analysis with pandas + statsmodels
Time travel and time series analysis with pandas + statsmodelsAlexander Hendorf
 
Amundsen at Brex and Looker integration
Amundsen at Brex and Looker integrationAmundsen at Brex and Looker integration
Amundsen at Brex and Looker integrationmarkgrover
 
2017-01-08-scaling tribalknowledge
2017-01-08-scaling tribalknowledge2017-01-08-scaling tribalknowledge
2017-01-08-scaling tribalknowledgeChristopher Williams
 
Intranet show and_tell_2010
Intranet show and_tell_2010Intranet show and_tell_2010
Intranet show and_tell_2010Charlie Hull
 
The More the Merrier: Scaling Model Building Infrastructure at Zendesk
The More the Merrier: Scaling Model Building Infrastructure at ZendeskThe More the Merrier: Scaling Model Building Infrastructure at Zendesk
The More the Merrier: Scaling Model Building Infrastructure at ZendeskDatabricks
 
FIBEP WMIC 2015 - How Infomedia upgraded their closed-source search engine to...
FIBEP WMIC 2015 - How Infomedia upgraded their closed-source search engine to...FIBEP WMIC 2015 - How Infomedia upgraded their closed-source search engine to...
FIBEP WMIC 2015 - How Infomedia upgraded their closed-source search engine to...Charlie Hull
 
WSO2 Guest Webinar: Building Enterprise Awareness with API Analytics in the A...
WSO2 Guest Webinar: Building Enterprise Awareness with API Analytics in the A...WSO2 Guest Webinar: Building Enterprise Awareness with API Analytics in the A...
WSO2 Guest Webinar: Building Enterprise Awareness with API Analytics in the A...WSO2
 
NLP Text Recommendation System Journey to Automated Training
NLP Text Recommendation System Journey to Automated TrainingNLP Text Recommendation System Journey to Automated Training
NLP Text Recommendation System Journey to Automated TrainingDatabricks
 
Importance of ML Reproducibility & Applications with MLfLow
Importance of ML Reproducibility & Applications with MLfLowImportance of ML Reproducibility & Applications with MLfLow
Importance of ML Reproducibility & Applications with MLfLowDatabricks
 
Introduction to Distributed Computing Engines for Data Processing - Simone Ro...
Introduction to Distributed Computing Engines for Data Processing - Simone Ro...Introduction to Distributed Computing Engines for Data Processing - Simone Ro...
Introduction to Distributed Computing Engines for Data Processing - Simone Ro...Data Science Milan
 
Siligong.Data - May 2021 - Transforming your analytics workflow with dbt
Siligong.Data - May 2021 - Transforming your analytics workflow with dbtSiligong.Data - May 2021 - Transforming your analytics workflow with dbt
Siligong.Data - May 2021 - Transforming your analytics workflow with dbtJon Su
 
What's the story with Open Source?
What's the story with Open Source? What's the story with Open Source?
What's the story with Open Source? Charlie Hull
 
Connected data meetup group - introduction & scope
Connected data meetup group - introduction & scopeConnected data meetup group - introduction & scope
Connected data meetup group - introduction & scopeConnected Data World
 

What's hot (20)

Data analytic for mobile app development
Data analytic for mobile app developmentData analytic for mobile app development
Data analytic for mobile app development
 
Misusing MLflow To Help Deduplicate Data At Scale
Misusing MLflow To Help Deduplicate Data At ScaleMisusing MLflow To Help Deduplicate Data At Scale
Misusing MLflow To Help Deduplicate Data At Scale
 
CI/CD Templates: Continuous Delivery of ML-Enabled Data Pipelines on Databricks
CI/CD Templates: Continuous Delivery of ML-Enabled Data Pipelines on DatabricksCI/CD Templates: Continuous Delivery of ML-Enabled Data Pipelines on Databricks
CI/CD Templates: Continuous Delivery of ML-Enabled Data Pipelines on Databricks
 
London atlassian meetup 31 jan 2016 jira metrics-extract slides
London atlassian meetup 31 jan 2016 jira metrics-extract slidesLondon atlassian meetup 31 jan 2016 jira metrics-extract slides
London atlassian meetup 31 jan 2016 jira metrics-extract slides
 
The journey toward a self-service data platform at Netflix - sf 2019
The journey toward a self-service data platform at Netflix - sf 2019The journey toward a self-service data platform at Netflix - sf 2019
The journey toward a self-service data platform at Netflix - sf 2019
 
City of Amsterdam: High velocity development
City of Amsterdam: High velocity developmentCity of Amsterdam: High velocity development
City of Amsterdam: High velocity development
 
Time travel and time series analysis with pandas + statsmodels
Time travel and time series analysis with pandas + statsmodelsTime travel and time series analysis with pandas + statsmodels
Time travel and time series analysis with pandas + statsmodels
 
Amundsen at Brex and Looker integration
Amundsen at Brex and Looker integrationAmundsen at Brex and Looker integration
Amundsen at Brex and Looker integration
 
2017-01-08-scaling tribalknowledge
2017-01-08-scaling tribalknowledge2017-01-08-scaling tribalknowledge
2017-01-08-scaling tribalknowledge
 
Intranet show and_tell_2010
Intranet show and_tell_2010Intranet show and_tell_2010
Intranet show and_tell_2010
 
The More the Merrier: Scaling Model Building Infrastructure at Zendesk
The More the Merrier: Scaling Model Building Infrastructure at ZendeskThe More the Merrier: Scaling Model Building Infrastructure at Zendesk
The More the Merrier: Scaling Model Building Infrastructure at Zendesk
 
FIBEP WMIC 2015 - How Infomedia upgraded their closed-source search engine to...
FIBEP WMIC 2015 - How Infomedia upgraded their closed-source search engine to...FIBEP WMIC 2015 - How Infomedia upgraded their closed-source search engine to...
FIBEP WMIC 2015 - How Infomedia upgraded their closed-source search engine to...
 
WSO2 Guest Webinar: Building Enterprise Awareness with API Analytics in the A...
WSO2 Guest Webinar: Building Enterprise Awareness with API Analytics in the A...WSO2 Guest Webinar: Building Enterprise Awareness with API Analytics in the A...
WSO2 Guest Webinar: Building Enterprise Awareness with API Analytics in the A...
 
NLP Text Recommendation System Journey to Automated Training
NLP Text Recommendation System Journey to Automated TrainingNLP Text Recommendation System Journey to Automated Training
NLP Text Recommendation System Journey to Automated Training
 
Importance of ML Reproducibility & Applications with MLfLow
Importance of ML Reproducibility & Applications with MLfLowImportance of ML Reproducibility & Applications with MLfLow
Importance of ML Reproducibility & Applications with MLfLow
 
Introduction to Distributed Computing Engines for Data Processing - Simone Ro...
Introduction to Distributed Computing Engines for Data Processing - Simone Ro...Introduction to Distributed Computing Engines for Data Processing - Simone Ro...
Introduction to Distributed Computing Engines for Data Processing - Simone Ro...
 
Siligong.Data - May 2021 - Transforming your analytics workflow with dbt
Siligong.Data - May 2021 - Transforming your analytics workflow with dbtSiligong.Data - May 2021 - Transforming your analytics workflow with dbt
Siligong.Data - May 2021 - Transforming your analytics workflow with dbt
 
What's the story with Open Source?
What's the story with Open Source? What's the story with Open Source?
What's the story with Open Source?
 
Charles Ivie
Charles Ivie Charles Ivie
Charles Ivie
 
Connected data meetup group - introduction & scope
Connected data meetup group - introduction & scopeConnected data meetup group - introduction & scope
Connected data meetup group - introduction & scope
 

Viewers also liked

Slide 2 collecting, storing and analyzing big data
Slide 2 collecting, storing and analyzing big dataSlide 2 collecting, storing and analyzing big data
Slide 2 collecting, storing and analyzing big dataTrieu Nguyen
 
How to build a data driven business in big data age
How to build a data driven business in big data ageHow to build a data driven business in big data age
How to build a data driven business in big data ageTrieu Nguyen
 
Reactive Data System in Practice
Reactive Data System in PracticeReactive Data System in Practice
Reactive Data System in PracticeTrieu Nguyen
 
Where is my next jobs in the age of Big Data and Automation
Where is my next jobs in the age of Big Data and AutomationWhere is my next jobs in the age of Big Data and Automation
Where is my next jobs in the age of Big Data and AutomationTrieu Nguyen
 
2016 Data Science Salary Survey
2016 Data Science Salary Survey2016 Data Science Salary Survey
2016 Data Science Salary SurveyTrieu Nguyen
 
Experience economy
Experience economyExperience economy
Experience economyTrieu Nguyen
 
Introduction to Human Data Theory for Digital Economy
Introduction to Human Data Theory for Digital EconomyIntroduction to Human Data Theory for Digital Economy
Introduction to Human Data Theory for Digital EconomyTrieu Nguyen
 
Application-oriented ping-pong benchmarking: how to assess the real communica...
Application-oriented ping-pong benchmarking: how to assess the real communica...Application-oriented ping-pong benchmarking: how to assess the real communica...
Application-oriented ping-pong benchmarking: how to assess the real communica...Trieu Nguyen
 
A Day in the Life of a Hadoop Administrator
A Day in the Life of a Hadoop AdministratorA Day in the Life of a Hadoop Administrator
A Day in the Life of a Hadoop AdministratorEdureka!
 
Upgrade Without the Headache: Best Practices for Upgrading Hadoop in Production
Upgrade Without the Headache: Best Practices for Upgrading Hadoop in ProductionUpgrade Without the Headache: Best Practices for Upgrading Hadoop in Production
Upgrade Without the Headache: Best Practices for Upgrading Hadoop in ProductionCloudera, Inc.
 
Luan van hadoop-final
Luan van hadoop-finalLuan van hadoop-final
Luan van hadoop-finalnobjta2015
 
Hadoop trong triển khai Big Data
Hadoop trong triển khai Big DataHadoop trong triển khai Big Data
Hadoop trong triển khai Big DataNguyễn Duy Nhân
 
TỔNG QUAN VỀ DỮ LIỆU LỚN (BIGDATA)
TỔNG QUAN VỀ DỮ LIỆU LỚN (BIGDATA)TỔNG QUAN VỀ DỮ LIỆU LỚN (BIGDATA)
TỔNG QUAN VỀ DỮ LIỆU LỚN (BIGDATA)Trieu Nguyen
 
Introduction to RFX for Backend Developer
Introduction to RFX for Backend DeveloperIntroduction to RFX for Backend Developer
Introduction to RFX for Backend DeveloperTrieu Nguyen
 
Parallel and Iterative Processing for Machine Learning Recommendations with S...
Parallel and Iterative Processing for Machine Learning Recommendations with S...Parallel and Iterative Processing for Machine Learning Recommendations with S...
Parallel and Iterative Processing for Machine Learning Recommendations with S...MapR Technologies
 
Giới thiệu cơ bản về Big Data và các ứng dụng thực tiễn
Giới thiệu cơ bản về Big Data và các ứng dụng thực tiễnGiới thiệu cơ bản về Big Data và các ứng dụng thực tiễn
Giới thiệu cơ bản về Big Data và các ứng dụng thực tiễnTrieu Nguyen
 
Building Reactive Real-time Data Pipeline
Building Reactive Real-time Data PipelineBuilding Reactive Real-time Data Pipeline
Building Reactive Real-time Data PipelineTrieu Nguyen
 
Netty Cookbook - Chapter 1
Netty Cookbook - Chapter 1Netty Cookbook - Chapter 1
Netty Cookbook - Chapter 1Trieu Nguyen
 
Netty Cookbook - Chapter 2
Netty Cookbook - Chapter 2Netty Cookbook - Chapter 2
Netty Cookbook - Chapter 2Trieu Nguyen
 

Viewers also liked (20)

Slide 2 collecting, storing and analyzing big data
Slide 2 collecting, storing and analyzing big dataSlide 2 collecting, storing and analyzing big data
Slide 2 collecting, storing and analyzing big data
 
How to build a data driven business in big data age
How to build a data driven business in big data ageHow to build a data driven business in big data age
How to build a data driven business in big data age
 
Reactive Data System in Practice
Reactive Data System in PracticeReactive Data System in Practice
Reactive Data System in Practice
 
Where is my next jobs in the age of Big Data and Automation
Where is my next jobs in the age of Big Data and AutomationWhere is my next jobs in the age of Big Data and Automation
Where is my next jobs in the age of Big Data and Automation
 
2016 Data Science Salary Survey
2016 Data Science Salary Survey2016 Data Science Salary Survey
2016 Data Science Salary Survey
 
Experience economy
Experience economyExperience economy
Experience economy
 
Introduction to Human Data Theory for Digital Economy
Introduction to Human Data Theory for Digital EconomyIntroduction to Human Data Theory for Digital Economy
Introduction to Human Data Theory for Digital Economy
 
Application-oriented ping-pong benchmarking: how to assess the real communica...
Application-oriented ping-pong benchmarking: how to assess the real communica...Application-oriented ping-pong benchmarking: how to assess the real communica...
Application-oriented ping-pong benchmarking: how to assess the real communica...
 
A Day in the Life of a Hadoop Administrator
A Day in the Life of a Hadoop AdministratorA Day in the Life of a Hadoop Administrator
A Day in the Life of a Hadoop Administrator
 
Upgrade Without the Headache: Best Practices for Upgrading Hadoop in Production
Upgrade Without the Headache: Best Practices for Upgrading Hadoop in ProductionUpgrade Without the Headache: Best Practices for Upgrading Hadoop in Production
Upgrade Without the Headache: Best Practices for Upgrading Hadoop in Production
 
Luan van hadoop-final
Luan van hadoop-finalLuan van hadoop-final
Luan van hadoop-final
 
Hadoop trong triển khai Big Data
Hadoop trong triển khai Big DataHadoop trong triển khai Big Data
Hadoop trong triển khai Big Data
 
TỔNG QUAN VỀ DỮ LIỆU LỚN (BIGDATA)
TỔNG QUAN VỀ DỮ LIỆU LỚN (BIGDATA)TỔNG QUAN VỀ DỮ LIỆU LỚN (BIGDATA)
TỔNG QUAN VỀ DỮ LIỆU LỚN (BIGDATA)
 
Building Netty Servers
Building Netty ServersBuilding Netty Servers
Building Netty Servers
 
Introduction to RFX for Backend Developer
Introduction to RFX for Backend DeveloperIntroduction to RFX for Backend Developer
Introduction to RFX for Backend Developer
 
Parallel and Iterative Processing for Machine Learning Recommendations with S...
Parallel and Iterative Processing for Machine Learning Recommendations with S...Parallel and Iterative Processing for Machine Learning Recommendations with S...
Parallel and Iterative Processing for Machine Learning Recommendations with S...
 
Giới thiệu cơ bản về Big Data và các ứng dụng thực tiễn
Giới thiệu cơ bản về Big Data và các ứng dụng thực tiễnGiới thiệu cơ bản về Big Data và các ứng dụng thực tiễn
Giới thiệu cơ bản về Big Data và các ứng dụng thực tiễn
 
Building Reactive Real-time Data Pipeline
Building Reactive Real-time Data PipelineBuilding Reactive Real-time Data Pipeline
Building Reactive Real-time Data Pipeline
 
Netty Cookbook - Chapter 1
Netty Cookbook - Chapter 1Netty Cookbook - Chapter 1
Netty Cookbook - Chapter 1
 
Netty Cookbook - Chapter 2
Netty Cookbook - Chapter 2Netty Cookbook - Chapter 2
Netty Cookbook - Chapter 2
 

Similar to Slide 3 Fast Data processing with kafka, rfx and redis

Pivotal Real Time Data Stream Analytics
Pivotal Real Time Data Stream AnalyticsPivotal Real Time Data Stream Analytics
Pivotal Real Time Data Stream Analyticskgshukla
 
Neo4j Database and Graph Platform Overview
Neo4j Database and Graph Platform OverviewNeo4j Database and Graph Platform Overview
Neo4j Database and Graph Platform OverviewNeo4j
 
Real time analytics at uber @ strata data 2019
Real time analytics at uber @ strata data 2019Real time analytics at uber @ strata data 2019
Real time analytics at uber @ strata data 2019Zhenxiao Luo
 
Why apache Flink is the 4G of Big Data Analytics Frameworks
Why apache Flink is the 4G of Big Data Analytics FrameworksWhy apache Flink is the 4G of Big Data Analytics Frameworks
Why apache Flink is the 4G of Big Data Analytics FrameworksSlim Baltagi
 
Apache Flink: Past, Present and Future
Apache Flink: Past, Present and FutureApache Flink: Past, Present and Future
Apache Flink: Past, Present and FutureGyula Fóra
 
LINQ 2 SQL Presentation To Palmchip And Trg, Technology Resource Group
LINQ 2 SQL Presentation To Palmchip  And Trg, Technology Resource GroupLINQ 2 SQL Presentation To Palmchip  And Trg, Technology Resource Group
LINQ 2 SQL Presentation To Palmchip And Trg, Technology Resource GroupShahzad
 
aip_developer_overview_icar_2014
aip_developer_overview_icar_2014aip_developer_overview_icar_2014
aip_developer_overview_icar_2014Matthew Vaughn
 
Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog
 Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog
Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDogRedis Labs
 
I Love APIs 2015: Building Predictive Apps with Lamda and MicroServices
I Love APIs 2015: Building Predictive Apps with Lamda and MicroServices I Love APIs 2015: Building Predictive Apps with Lamda and MicroServices
I Love APIs 2015: Building Predictive Apps with Lamda and MicroServices Apigee | Google Cloud
 
Making Machine Learning Easy with H2O and WebFlux
Making Machine Learning Easy with H2O and WebFluxMaking Machine Learning Easy with H2O and WebFlux
Making Machine Learning Easy with H2O and WebFluxTrayan Iliev
 
Practical automation for beginners
Practical automation for beginnersPractical automation for beginners
Practical automation for beginnersSeoweon Yoo
 
Getting started with apache flink streaming api
Getting started with apache flink streaming apiGetting started with apache flink streaming api
Getting started with apache flink streaming apiPreetdeep Kumar
 
Apache Flink Adoption at Shopify
Apache Flink Adoption at ShopifyApache Flink Adoption at Shopify
Apache Flink Adoption at ShopifyYaroslav Tkachenko
 
Enterprise Software Architecture styles
Enterprise Software Architecture stylesEnterprise Software Architecture styles
Enterprise Software Architecture stylesAraf Karsh Hamid
 
From an experiment to a real production environment
From an experiment to a real production environmentFrom an experiment to a real production environment
From an experiment to a real production environmentDataWorks Summit
 
Web Technology Management Lecture III
Web Technology Management Lecture IIIWeb Technology Management Lecture III
Web Technology Management Lecture IIIsopekmir
 
Apricot2017 Request tracing in distributed environment
Apricot2017 Request tracing in distributed environmentApricot2017 Request tracing in distributed environment
Apricot2017 Request tracing in distributed environmentHieu LE ☁
 

Similar to Slide 3 Fast Data processing with kafka, rfx and redis (20)

Pivotal Real Time Data Stream Analytics
Pivotal Real Time Data Stream AnalyticsPivotal Real Time Data Stream Analytics
Pivotal Real Time Data Stream Analytics
 
Neo4j Database and Graph Platform Overview
Neo4j Database and Graph Platform OverviewNeo4j Database and Graph Platform Overview
Neo4j Database and Graph Platform Overview
 
Real time analytics at uber @ strata data 2019
Real time analytics at uber @ strata data 2019Real time analytics at uber @ strata data 2019
Real time analytics at uber @ strata data 2019
 
Sergey Stoyan 2016
Sergey Stoyan 2016Sergey Stoyan 2016
Sergey Stoyan 2016
 
Sergey Stoyan 2016
Sergey Stoyan 2016Sergey Stoyan 2016
Sergey Stoyan 2016
 
Why apache Flink is the 4G of Big Data Analytics Frameworks
Why apache Flink is the 4G of Big Data Analytics FrameworksWhy apache Flink is the 4G of Big Data Analytics Frameworks
Why apache Flink is the 4G of Big Data Analytics Frameworks
 
Apache Flink: Past, Present and Future
Apache Flink: Past, Present and FutureApache Flink: Past, Present and Future
Apache Flink: Past, Present and Future
 
LINQ 2 SQL Presentation To Palmchip And Trg, Technology Resource Group
LINQ 2 SQL Presentation To Palmchip  And Trg, Technology Resource GroupLINQ 2 SQL Presentation To Palmchip  And Trg, Technology Resource Group
LINQ 2 SQL Presentation To Palmchip And Trg, Technology Resource Group
 
aip_developer_overview_icar_2014
aip_developer_overview_icar_2014aip_developer_overview_icar_2014
aip_developer_overview_icar_2014
 
20170126 big data processing
20170126 big data processing20170126 big data processing
20170126 big data processing
 
Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog
 Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog
Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog
 
I Love APIs 2015: Building Predictive Apps with Lamda and MicroServices
I Love APIs 2015: Building Predictive Apps with Lamda and MicroServices I Love APIs 2015: Building Predictive Apps with Lamda and MicroServices
I Love APIs 2015: Building Predictive Apps with Lamda and MicroServices
 
Making Machine Learning Easy with H2O and WebFlux
Making Machine Learning Easy with H2O and WebFluxMaking Machine Learning Easy with H2O and WebFlux
Making Machine Learning Easy with H2O and WebFlux
 
Practical automation for beginners
Practical automation for beginnersPractical automation for beginners
Practical automation for beginners
 
Getting started with apache flink streaming api
Getting started with apache flink streaming apiGetting started with apache flink streaming api
Getting started with apache flink streaming api
 
Apache Flink Adoption at Shopify
Apache Flink Adoption at ShopifyApache Flink Adoption at Shopify
Apache Flink Adoption at Shopify
 
Enterprise Software Architecture styles
Enterprise Software Architecture stylesEnterprise Software Architecture styles
Enterprise Software Architecture styles
 
From an experiment to a real production environment
From an experiment to a real production environmentFrom an experiment to a real production environment
From an experiment to a real production environment
 
Web Technology Management Lecture III
Web Technology Management Lecture IIIWeb Technology Management Lecture III
Web Technology Management Lecture III
 
Apricot2017 Request tracing in distributed environment
Apricot2017 Request tracing in distributed environmentApricot2017 Request tracing in distributed environment
Apricot2017 Request tracing in distributed environment
 

More from Trieu Nguyen

Building Your Customer Data Platform with LEO CDP in Travel Industry.pdf
Building Your Customer Data Platform with LEO CDP in Travel Industry.pdfBuilding Your Customer Data Platform with LEO CDP in Travel Industry.pdf
Building Your Customer Data Platform with LEO CDP in Travel Industry.pdfTrieu Nguyen
 
Building Your Customer Data Platform with LEO CDP - Spa and Hotel Business
Building Your Customer Data Platform with LEO CDP - Spa and Hotel BusinessBuilding Your Customer Data Platform with LEO CDP - Spa and Hotel Business
Building Your Customer Data Platform with LEO CDP - Spa and Hotel BusinessTrieu Nguyen
 
Building Your Customer Data Platform with LEO CDP
Building Your Customer Data Platform with LEO CDP Building Your Customer Data Platform with LEO CDP
Building Your Customer Data Platform with LEO CDP Trieu Nguyen
 
How to track and improve Customer Experience with LEO CDP
How to track and improve Customer Experience with LEO CDPHow to track and improve Customer Experience with LEO CDP
How to track and improve Customer Experience with LEO CDPTrieu Nguyen
 
[Notes] Customer 360 Analytics with LEO CDP
[Notes] Customer 360 Analytics with LEO CDP[Notes] Customer 360 Analytics with LEO CDP
[Notes] Customer 360 Analytics with LEO CDPTrieu Nguyen
 
Leo CDP - Pitch Deck
Leo CDP - Pitch DeckLeo CDP - Pitch Deck
Leo CDP - Pitch DeckTrieu Nguyen
 
LEO CDP - What's new in 2022
LEO CDP  - What's new in 2022LEO CDP  - What's new in 2022
LEO CDP - What's new in 2022Trieu Nguyen
 
Lộ trình triển khai LEO CDP cho ngành bất động sản
Lộ trình triển khai LEO CDP cho ngành bất động sảnLộ trình triển khai LEO CDP cho ngành bất động sản
Lộ trình triển khai LEO CDP cho ngành bất động sảnTrieu Nguyen
 
Why is LEO CDP important for digital business ?
Why is LEO CDP important for digital business ?Why is LEO CDP important for digital business ?
Why is LEO CDP important for digital business ?Trieu Nguyen
 
From Dataism to Customer Data Platform
From Dataism to Customer Data PlatformFrom Dataism to Customer Data Platform
From Dataism to Customer Data PlatformTrieu Nguyen
 
Data collection, processing & organization with USPA framework
Data collection, processing & organization with USPA frameworkData collection, processing & organization with USPA framework
Data collection, processing & organization with USPA frameworkTrieu Nguyen
 
Part 1: Introduction to digital marketing technology
Part 1: Introduction to digital marketing technologyPart 1: Introduction to digital marketing technology
Part 1: Introduction to digital marketing technologyTrieu Nguyen
 
Why is Customer Data Platform (CDP) ?
Why is Customer Data Platform (CDP) ?Why is Customer Data Platform (CDP) ?
Why is Customer Data Platform (CDP) ?Trieu Nguyen
 
How to build a Personalized News Recommendation Platform
How to build a Personalized News Recommendation PlatformHow to build a Personalized News Recommendation Platform
How to build a Personalized News Recommendation PlatformTrieu Nguyen
 
How to grow your business in the age of digital marketing 4.0
How to grow your business  in the age of digital marketing 4.0How to grow your business  in the age of digital marketing 4.0
How to grow your business in the age of digital marketing 4.0Trieu Nguyen
 
Video Ecosystem and some ideas about video big data
Video Ecosystem and some ideas about video big dataVideo Ecosystem and some ideas about video big data
Video Ecosystem and some ideas about video big dataTrieu Nguyen
 
Concepts, use cases and principles to build big data systems (1)
Concepts, use cases and principles to build big data systems (1)Concepts, use cases and principles to build big data systems (1)
Concepts, use cases and principles to build big data systems (1)Trieu Nguyen
 
Open OTT - Video Content Platform
Open OTT - Video Content PlatformOpen OTT - Video Content Platform
Open OTT - Video Content PlatformTrieu Nguyen
 
Apache Hadoop and Spark: Introduction and Use Cases for Data Analysis
Apache Hadoop and Spark: Introduction and Use Cases for Data AnalysisApache Hadoop and Spark: Introduction and Use Cases for Data Analysis
Apache Hadoop and Spark: Introduction and Use Cases for Data AnalysisTrieu Nguyen
 
Introduction to Recommendation Systems (Vietnam Web Submit)
Introduction to Recommendation Systems (Vietnam Web Submit)Introduction to Recommendation Systems (Vietnam Web Submit)
Introduction to Recommendation Systems (Vietnam Web Submit)Trieu Nguyen
 

More from Trieu Nguyen (20)

Building Your Customer Data Platform with LEO CDP in Travel Industry.pdf
Building Your Customer Data Platform with LEO CDP in Travel Industry.pdfBuilding Your Customer Data Platform with LEO CDP in Travel Industry.pdf
Building Your Customer Data Platform with LEO CDP in Travel Industry.pdf
 
Building Your Customer Data Platform with LEO CDP - Spa and Hotel Business
Building Your Customer Data Platform with LEO CDP - Spa and Hotel BusinessBuilding Your Customer Data Platform with LEO CDP - Spa and Hotel Business
Building Your Customer Data Platform with LEO CDP - Spa and Hotel Business
 
Building Your Customer Data Platform with LEO CDP
Building Your Customer Data Platform with LEO CDP Building Your Customer Data Platform with LEO CDP
Building Your Customer Data Platform with LEO CDP
 
How to track and improve Customer Experience with LEO CDP
How to track and improve Customer Experience with LEO CDPHow to track and improve Customer Experience with LEO CDP
How to track and improve Customer Experience with LEO CDP
 
[Notes] Customer 360 Analytics with LEO CDP
[Notes] Customer 360 Analytics with LEO CDP[Notes] Customer 360 Analytics with LEO CDP
[Notes] Customer 360 Analytics with LEO CDP
 
Leo CDP - Pitch Deck
Leo CDP - Pitch DeckLeo CDP - Pitch Deck
Leo CDP - Pitch Deck
 
LEO CDP - What's new in 2022
LEO CDP  - What's new in 2022LEO CDP  - What's new in 2022
LEO CDP - What's new in 2022
 
Lộ trình triển khai LEO CDP cho ngành bất động sản
Lộ trình triển khai LEO CDP cho ngành bất động sảnLộ trình triển khai LEO CDP cho ngành bất động sản
Lộ trình triển khai LEO CDP cho ngành bất động sản
 
Why is LEO CDP important for digital business ?
Why is LEO CDP important for digital business ?Why is LEO CDP important for digital business ?
Why is LEO CDP important for digital business ?
 
From Dataism to Customer Data Platform
From Dataism to Customer Data PlatformFrom Dataism to Customer Data Platform
From Dataism to Customer Data Platform
 
Data collection, processing & organization with USPA framework
Data collection, processing & organization with USPA frameworkData collection, processing & organization with USPA framework
Data collection, processing & organization with USPA framework
 
Part 1: Introduction to digital marketing technology
Part 1: Introduction to digital marketing technologyPart 1: Introduction to digital marketing technology
Part 1: Introduction to digital marketing technology
 
Why is Customer Data Platform (CDP) ?
Why is Customer Data Platform (CDP) ?Why is Customer Data Platform (CDP) ?
Why is Customer Data Platform (CDP) ?
 
How to build a Personalized News Recommendation Platform
How to build a Personalized News Recommendation PlatformHow to build a Personalized News Recommendation Platform
How to build a Personalized News Recommendation Platform
 
How to grow your business in the age of digital marketing 4.0
How to grow your business  in the age of digital marketing 4.0How to grow your business  in the age of digital marketing 4.0
How to grow your business in the age of digital marketing 4.0
 
Video Ecosystem and some ideas about video big data
Video Ecosystem and some ideas about video big dataVideo Ecosystem and some ideas about video big data
Video Ecosystem and some ideas about video big data
 
Concepts, use cases and principles to build big data systems (1)
Concepts, use cases and principles to build big data systems (1)Concepts, use cases and principles to build big data systems (1)
Concepts, use cases and principles to build big data systems (1)
 
Open OTT - Video Content Platform
Open OTT - Video Content PlatformOpen OTT - Video Content Platform
Open OTT - Video Content Platform
 
Apache Hadoop and Spark: Introduction and Use Cases for Data Analysis
Apache Hadoop and Spark: Introduction and Use Cases for Data AnalysisApache Hadoop and Spark: Introduction and Use Cases for Data Analysis
Apache Hadoop and Spark: Introduction and Use Cases for Data Analysis
 
Introduction to Recommendation Systems (Vietnam Web Submit)
Introduction to Recommendation Systems (Vietnam Web Submit)Introduction to Recommendation Systems (Vietnam Web Submit)
Introduction to Recommendation Systems (Vietnam Web Submit)
 

Recently uploaded

Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfadriantubila
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...Pooja Nehwal
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
ALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptxALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptxolyaivanovalion
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Delhi Call girls
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023ymrp368
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...shambhavirathore45
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...shivangimorya083
 

Recently uploaded (20)

CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
ALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptxALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptx
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
 

Slide 3 Fast Data processing with kafka, rfx and redis

  • 1. Fast Data Processing with RFX Simplify Fast Data Processing trieunt@fpt.com.vn tantrieuf31@gmail.com
  • 2. Today topic : We would talk about all things in this red circle
  • 4. Content at glance 1. BEAM✲ methodology for agile data warehouse 2. Introduction to Fast Data 3. Problem “Fast Data in web analytics” 4. Examples for fast data design pattern (RFX or Reactive Function X) 4.1. Event data actor 4.2. Event data agent 4.3. Event data collector 4.4. Event data router 4.5. Event data processor 4.6. Event data storage 4.7. Event data query 4.8. Event data reactor 5. Demo “Fast Data in web analytics” with source code explanation
  • 5. 1 - BEAM✲ methodology
  • 6. 1 - BEAM✲ methodology for Agile Data Warehouse BEAM✲ stands for Business Event Analysis & Modelling, and it’s a methodology for gathering business requirements for Agile Data Warehouses and building those warehouses. It was developed by Lawrence Corr (@LawrenceCorr) and Jim Stagnitto (@JimStag), and published in their book Agile Data Warehouse Design: Collaborative Dimensional Modeling, from Whiteboard to Star Schema.
  • 8. Goal: Modeling all business events and put into a database in agile way
  • 9. 2 - Fast Data
  • 10.
  • 12.
  • 13. 3 - Problems in Practice
  • 14. Problems “Fast Data in web analytics” 1. Counting pageview of website 2. Counting unique user of website 3. Sending email when pageview is unnormal (simple DDOS attack detection)
  • 15. 4 - Thinking with RFX
  • 16. ● A design pattern to solve big fast data problems ● A collection of Open Source Tools ● The mission of RFX 1. Build data product quickly with design patterns 2. Apply BEAM✲ for agile data pipeline 3. React to critical events in near-real-time What is RFX or Reactive Function X ?
  • 17. RFX framework What ? ● The Java framework, is built from open source projects: ○ Based on core Akka Actor ( http://akka.io ) ○ Lightweight DAO with Spring JDBC ( https://spring.io ) ○ Netty ( http://netty.io ) and VertX ( http://vertx.io/ ) ○ Common utils class for Apache { Kafka, Hadoop , Spark } ○ Common utils class for NoSQL ( Redis ( http://redis.io ), MongoDB ) ● a R&D project, started since 11/2013 for fast data processing Why ? ● Divide Java code into modules: ○ common infrastructure code ( rfx-stream ) ○ business logic code ( check valid data stream ) ○ machine learning code ( automation & optimization ) ● Focus on best practices and reusability ● Foundation for scalability (system and business) ● Test-driven development for Real-Time Analytics ● Continuous integration & improvement
  • 19. Reactive Function (X) Philosophy
  • 20. Core elements of rfx-stream
  • 21. Core backend modules rfx-track: ● collecting all events from log agent rfx-stream: ● processing stream data (PipelineProcessing pattern) ● processing real-time analytics ● processing business logic (by reactive function) rfx-cronjob: ● synchronizing real-time data to report database (by parsing data in Redis and update to Report database)
  • 22. Core frontend modules rfx-report: ● visualizing data in real-time ● monitoring real-time event rfx-agent: ● tracking user activity: heatmap data, pageview, ... ● logging user activity to rfx-track (via network protocol: HTTP, TCP or UDP)
  • 23. How to solve problems with RFX ?
  • 24. Use Cases in “Fast Data in web analytics” 1. Counting pageview of website 2. Counting unique user of website 3. Sending email when pageview is unnormal (simple DDOS attack detection)
  • 25. Apply RFX into Pageview Analytics 1.1. Event data actor: a web user 1.2. Event data agent: RFX-track-js 1.3. Event data collector: RFX-track-server 1.4. Event data queue: Apache Kafka 1.5. Event data processor: RFX-stream 1.6. Event data storage: Redis, MySQL 1.7. Event data query: RFX-data-api 1.8. Event data reactor: RFX-reactor
  • 26. Demo and Explanation for code and concepts
  • 27. Readings ● http://www.decisionone.co.uk/press/agile-data-warehouse-design-sampler.pdf ● http://www.slideshare.net/votrongdao/agile-data-warehouse-34427798 ● Apache Kafka Installation Video | How To Setup Apache Kafka https://youtu.be/Fg8cTsEk7Gc ● https://www.tutorialspoint.com/apache_kafka/ ● https://kafka.apache.org/quickstart ● http://xyu.io/2015/07/13/building-a-faster-etl-pipeline-with-flume-kafka-and-hive/ ● http://blog.cloudera.com/blog/2015/06/architectural-patterns-for-near-real-time-data-pr ocessing-with-apache-hadoop/ ● https://www.oreilly.com/ideas/drivetrain-approach-data-products