SlideShare a Scribd company logo
1 of 48
Download to read offline
{"
"eventType: " "PageViewEvent,"
"3mestamp: " "1413215518,"
"viewerId:" " "1234,"
"sessionId: " "fa1afe101234deadbeef,"
"pageKey:" " "profileFview,"
"viewedProfileId:"4321,"
"trackingKey:" "invita3onFemail,"
"..."etc."metadata"about"what"content"was"displayed..."
}"
{"
"eventType: " "ProfileEditEvent,"
"3mestamp: " "1413215518,"
"profileId:" " "1234,"
"old:"{"
" "loca3on:" ""San"Francisco,"CA","
" "industry:" ""Internet"},"
"new:"{"
" "loca3on:" ""New"York,"NY","
" "industry:" ""Financial"Services"}"
}"
References (fun stuff to read)!
1.  Martin Kleppmann:“Designing data-intensive applications.” O’Reilly Media, to appear in 2015. http://dataintensive.net!
2.  Jay Kreps:“Why local state is a fundamental primitive in stream processing.” 31 July 2014. http://radar.oreilly.com/2014/07/why-local-
state-is-a-fundamental-primitive-in-stream-processing.html!
3.  Jay Kreps:“I ︎ Logs.” O'Reilly Media, September 2014. http://shop.oreilly.com/product/0636920034339.do!
4.  Nathan Marz and James Warren:“Big Data: Principles and best practices of scalable realtime data systems.” Manning MEAP, to appear
January 2015. http://manning.com/marz/!
5.  Jakob Homan:“Real time insights into LinkedIn's performance using Apache Samza.” 18 Aug 2014. http://engineering.linkedin.com/samza/
real-time-insights-linkedins-performance-using-apache-samza!
6.  Martin Kleppmann:“Moving faster with data streams:The rise of Samza at LinkedIn.” 14 July 2014. http://engineering.linkedin.com/stream-
processing/moving-faster-data-streams-rise-samza-linkedin!
7.  Praveen Neppalli Naga:“Real-time Analytics at Massive Scale with Pinot.” 29 Sept 2014. http://engineering.linkedin.com/analytics/real-
time-analytics-massive-scale-pinot!
8.  David He:“Monitor and Improve Web Performance Using RUM DataVisualization.” 19 Sept 2014. http://engineering.linkedin.com/
performance/monitor-and-improve-web-performance-using-rum-data-visualization!
9.  Lili Wu, Sam Shah, Sean Choi, Mitul Tiwari, and Christian Posse:“The Browsemaps: Collaborative Filtering at LinkedIn,” at 6th Workshop
on Recommender Systems and the Social Web, Oct 2014. http://ls13-www.cs.uni-dortmund.de/homepage/rsweb2014/papers/
rsweb2014_submission_3.pdf!
10.  Shirshanka Das, Chavdar Botev, Kapil Surlaker, et al.:“All Aboard the Databus!,” at ACM Symposium on Cloud Computing (SoCC),
October 2012. http://www.socc2012.org/s18-das.pdf!
11.  Apache Samza documentation. http://samza.incubator.apache.org!

More Related Content

Viewers also liked

A Random Walk Through Search Research
A Random Walk Through Search ResearchA Random Walk Through Search Research
A Random Walk Through Search ResearchNick Watkins
 
LinkedIn Data Products
LinkedIn Data ProductsLinkedIn Data Products
LinkedIn Data ProductsVitaly Gordon
 
Developing Data Products
Developing Data ProductsDeveloping Data Products
Developing Data ProductsPeter Skomoroch
 
Cassandra's Sweet Spot - an introduction to Apache Cassandra
Cassandra's Sweet Spot - an introduction to Apache CassandraCassandra's Sweet Spot - an introduction to Apache Cassandra
Cassandra's Sweet Spot - an introduction to Apache CassandraDave Gardner
 
Building Automated REST APIs with Python
Building Automated REST APIs with PythonBuilding Automated REST APIs with Python
Building Automated REST APIs with PythonJeff Knupp
 
Introduction to Streaming Analytics
Introduction to Streaming AnalyticsIntroduction to Streaming Analytics
Introduction to Streaming AnalyticsGuido Schmutz
 
Experience economy
Experience economyExperience economy
Experience economyTrieu Nguyen
 
ML and Data Science at Uber - GITPro talk 2017
ML and Data Science at Uber - GITPro talk 2017ML and Data Science at Uber - GITPro talk 2017
ML and Data Science at Uber - GITPro talk 2017Sudhir Tonse
 
Recomendation system: Community Detection Based Recomendation System using Hy...
Recomendation system: Community Detection Based Recomendation System using Hy...Recomendation system: Community Detection Based Recomendation System using Hy...
Recomendation system: Community Detection Based Recomendation System using Hy...Rajul Kukreja
 
Customer Event Hub - the modern Customer 360° view
Customer Event Hub - the modern Customer 360° viewCustomer Event Hub - the modern Customer 360° view
Customer Event Hub - the modern Customer 360° viewGuido Schmutz
 
Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...
Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...
Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...Impetus Technologies
 
Graph database super star
Graph database super starGraph database super star
Graph database super starandres_taylor
 
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...Helena Edelson
 
General Tips for participating Kaggle Competitions
General Tips for participating Kaggle CompetitionsGeneral Tips for participating Kaggle Competitions
General Tips for participating Kaggle CompetitionsMark Peng
 
Optimizing Apache Spark SQL Joins
Optimizing Apache Spark SQL JoinsOptimizing Apache Spark SQL Joins
Optimizing Apache Spark SQL JoinsDatabricks
 
Stream Computing & Analytics at Uber
Stream Computing & Analytics at UberStream Computing & Analytics at Uber
Stream Computing & Analytics at UberSudhir Tonse
 
Cassandra Data Modeling - Practical Considerations @ Netflix
Cassandra Data Modeling - Practical Considerations @ NetflixCassandra Data Modeling - Practical Considerations @ Netflix
Cassandra Data Modeling - Practical Considerations @ Netflixnkorla1share
 

Viewers also liked (18)

A Random Walk Through Search Research
A Random Walk Through Search ResearchA Random Walk Through Search Research
A Random Walk Through Search Research
 
LinkedIn Data Products
LinkedIn Data ProductsLinkedIn Data Products
LinkedIn Data Products
 
Developing Data Products
Developing Data ProductsDeveloping Data Products
Developing Data Products
 
Streaming SQL
Streaming SQLStreaming SQL
Streaming SQL
 
Cassandra's Sweet Spot - an introduction to Apache Cassandra
Cassandra's Sweet Spot - an introduction to Apache CassandraCassandra's Sweet Spot - an introduction to Apache Cassandra
Cassandra's Sweet Spot - an introduction to Apache Cassandra
 
Building Automated REST APIs with Python
Building Automated REST APIs with PythonBuilding Automated REST APIs with Python
Building Automated REST APIs with Python
 
Introduction to Streaming Analytics
Introduction to Streaming AnalyticsIntroduction to Streaming Analytics
Introduction to Streaming Analytics
 
Experience economy
Experience economyExperience economy
Experience economy
 
ML and Data Science at Uber - GITPro talk 2017
ML and Data Science at Uber - GITPro talk 2017ML and Data Science at Uber - GITPro talk 2017
ML and Data Science at Uber - GITPro talk 2017
 
Recomendation system: Community Detection Based Recomendation System using Hy...
Recomendation system: Community Detection Based Recomendation System using Hy...Recomendation system: Community Detection Based Recomendation System using Hy...
Recomendation system: Community Detection Based Recomendation System using Hy...
 
Customer Event Hub - the modern Customer 360° view
Customer Event Hub - the modern Customer 360° viewCustomer Event Hub - the modern Customer 360° view
Customer Event Hub - the modern Customer 360° view
 
Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...
Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...
Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...
 
Graph database super star
Graph database super starGraph database super star
Graph database super star
 
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
 
General Tips for participating Kaggle Competitions
General Tips for participating Kaggle CompetitionsGeneral Tips for participating Kaggle Competitions
General Tips for participating Kaggle Competitions
 
Optimizing Apache Spark SQL Joins
Optimizing Apache Spark SQL JoinsOptimizing Apache Spark SQL Joins
Optimizing Apache Spark SQL Joins
 
Stream Computing & Analytics at Uber
Stream Computing & Analytics at UberStream Computing & Analytics at Uber
Stream Computing & Analytics at Uber
 
Cassandra Data Modeling - Practical Considerations @ Netflix
Cassandra Data Modeling - Practical Considerations @ NetflixCassandra Data Modeling - Practical Considerations @ Netflix
Cassandra Data Modeling - Practical Considerations @ Netflix
 

Similar to Building Real-time Data Products at LinkedIn with Apache Samza

Introduction to CQRS and DDDD
Introduction to CQRS and DDDDIntroduction to CQRS and DDDD
Introduction to CQRS and DDDDVladik Khononov
 
Aligning Web Services with the Semantic Web to Create a Global Read-Write Gra...
Aligning Web Services with the Semantic Web to Create a Global Read-Write Gra...Aligning Web Services with the Semantic Web to Create a Global Read-Write Gra...
Aligning Web Services with the Semantic Web to Create a Global Read-Write Gra...Markus Lanthaler
 
MongoDB Performance Tuning
MongoDB Performance TuningMongoDB Performance Tuning
MongoDB Performance TuningPuneet Behl
 
Designing The Right Schema To Power Heap (PGConf Silicon Valley 2016)
Designing The Right Schema To Power Heap (PGConf Silicon Valley 2016)Designing The Right Schema To Power Heap (PGConf Silicon Valley 2016)
Designing The Right Schema To Power Heap (PGConf Silicon Valley 2016)Dan Robinson
 
A model based approach for developing event-driven architectures with AsyncAPI
A model based approach for developing event-driven architectures with AsyncAPIA model based approach for developing event-driven architectures with AsyncAPI
A model based approach for developing event-driven architectures with AsyncAPIabgolla
 
Closing the Loop in Extended Reality with Kafka Streams and Machine Learning ...
Closing the Loop in Extended Reality with Kafka Streams and Machine Learning ...Closing the Loop in Extended Reality with Kafka Streams and Machine Learning ...
Closing the Loop in Extended Reality with Kafka Streams and Machine Learning ...confluent
 
SFScon17 - Patrick Puecher: "Exploring data with Elasticsearch and Kibana"
SFScon17 - Patrick Puecher: "Exploring data with Elasticsearch and Kibana"SFScon17 - Patrick Puecher: "Exploring data with Elasticsearch and Kibana"
SFScon17 - Patrick Puecher: "Exploring data with Elasticsearch and Kibana"South Tyrol Free Software Conference
 
[OpenInfra Days Korea 2018] (Track 4) CloudEvents 소개 - 상호 운용 가능성을 극대화한 이벤트 데이...
[OpenInfra Days Korea 2018] (Track 4) CloudEvents 소개 - 상호 운용 가능성을 극대화한 이벤트 데이...[OpenInfra Days Korea 2018] (Track 4) CloudEvents 소개 - 상호 운용 가능성을 극대화한 이벤트 데이...
[OpenInfra Days Korea 2018] (Track 4) CloudEvents 소개 - 상호 운용 가능성을 극대화한 이벤트 데이...OpenStack Korea Community
 
Microservices in Java and Scala (sfscala)
Microservices in Java and Scala (sfscala)Microservices in Java and Scala (sfscala)
Microservices in Java and Scala (sfscala)Chris Richardson
 
Implementing and Visualizing Clickstream data with MongoDB
Implementing and Visualizing Clickstream data with MongoDBImplementing and Visualizing Clickstream data with MongoDB
Implementing and Visualizing Clickstream data with MongoDBMongoDB
 
Events Processing and Data Analysis with Lucidworks Fusion: Presented by Kira...
Events Processing and Data Analysis with Lucidworks Fusion: Presented by Kira...Events Processing and Data Analysis with Lucidworks Fusion: Presented by Kira...
Events Processing and Data Analysis with Lucidworks Fusion: Presented by Kira...Lucidworks
 

Similar to Building Real-time Data Products at LinkedIn with Apache Samza (12)

Introduction to CQRS and DDDD
Introduction to CQRS and DDDDIntroduction to CQRS and DDDD
Introduction to CQRS and DDDD
 
Mixpanel
MixpanelMixpanel
Mixpanel
 
Aligning Web Services with the Semantic Web to Create a Global Read-Write Gra...
Aligning Web Services with the Semantic Web to Create a Global Read-Write Gra...Aligning Web Services with the Semantic Web to Create a Global Read-Write Gra...
Aligning Web Services with the Semantic Web to Create a Global Read-Write Gra...
 
MongoDB Performance Tuning
MongoDB Performance TuningMongoDB Performance Tuning
MongoDB Performance Tuning
 
Designing The Right Schema To Power Heap (PGConf Silicon Valley 2016)
Designing The Right Schema To Power Heap (PGConf Silicon Valley 2016)Designing The Right Schema To Power Heap (PGConf Silicon Valley 2016)
Designing The Right Schema To Power Heap (PGConf Silicon Valley 2016)
 
A model based approach for developing event-driven architectures with AsyncAPI
A model based approach for developing event-driven architectures with AsyncAPIA model based approach for developing event-driven architectures with AsyncAPI
A model based approach for developing event-driven architectures with AsyncAPI
 
Closing the Loop in Extended Reality with Kafka Streams and Machine Learning ...
Closing the Loop in Extended Reality with Kafka Streams and Machine Learning ...Closing the Loop in Extended Reality with Kafka Streams and Machine Learning ...
Closing the Loop in Extended Reality with Kafka Streams and Machine Learning ...
 
SFScon17 - Patrick Puecher: "Exploring data with Elasticsearch and Kibana"
SFScon17 - Patrick Puecher: "Exploring data with Elasticsearch and Kibana"SFScon17 - Patrick Puecher: "Exploring data with Elasticsearch and Kibana"
SFScon17 - Patrick Puecher: "Exploring data with Elasticsearch and Kibana"
 
[OpenInfra Days Korea 2018] (Track 4) CloudEvents 소개 - 상호 운용 가능성을 극대화한 이벤트 데이...
[OpenInfra Days Korea 2018] (Track 4) CloudEvents 소개 - 상호 운용 가능성을 극대화한 이벤트 데이...[OpenInfra Days Korea 2018] (Track 4) CloudEvents 소개 - 상호 운용 가능성을 극대화한 이벤트 데이...
[OpenInfra Days Korea 2018] (Track 4) CloudEvents 소개 - 상호 운용 가능성을 극대화한 이벤트 데이...
 
Microservices in Java and Scala (sfscala)
Microservices in Java and Scala (sfscala)Microservices in Java and Scala (sfscala)
Microservices in Java and Scala (sfscala)
 
Implementing and Visualizing Clickstream data with MongoDB
Implementing and Visualizing Clickstream data with MongoDBImplementing and Visualizing Clickstream data with MongoDB
Implementing and Visualizing Clickstream data with MongoDB
 
Events Processing and Data Analysis with Lucidworks Fusion: Presented by Kira...
Events Processing and Data Analysis with Lucidworks Fusion: Presented by Kira...Events Processing and Data Analysis with Lucidworks Fusion: Presented by Kira...
Events Processing and Data Analysis with Lucidworks Fusion: Presented by Kira...
 

More from Trieu Nguyen

Building Your Customer Data Platform with LEO CDP in Travel Industry.pdf
Building Your Customer Data Platform with LEO CDP in Travel Industry.pdfBuilding Your Customer Data Platform with LEO CDP in Travel Industry.pdf
Building Your Customer Data Platform with LEO CDP in Travel Industry.pdfTrieu Nguyen
 
Building Your Customer Data Platform with LEO CDP - Spa and Hotel Business
Building Your Customer Data Platform with LEO CDP - Spa and Hotel BusinessBuilding Your Customer Data Platform with LEO CDP - Spa and Hotel Business
Building Your Customer Data Platform with LEO CDP - Spa and Hotel BusinessTrieu Nguyen
 
Building Your Customer Data Platform with LEO CDP
Building Your Customer Data Platform with LEO CDP Building Your Customer Data Platform with LEO CDP
Building Your Customer Data Platform with LEO CDP Trieu Nguyen
 
How to track and improve Customer Experience with LEO CDP
How to track and improve Customer Experience with LEO CDPHow to track and improve Customer Experience with LEO CDP
How to track and improve Customer Experience with LEO CDPTrieu Nguyen
 
[Notes] Customer 360 Analytics with LEO CDP
[Notes] Customer 360 Analytics with LEO CDP[Notes] Customer 360 Analytics with LEO CDP
[Notes] Customer 360 Analytics with LEO CDPTrieu Nguyen
 
Leo CDP - Pitch Deck
Leo CDP - Pitch DeckLeo CDP - Pitch Deck
Leo CDP - Pitch DeckTrieu Nguyen
 
LEO CDP - What's new in 2022
LEO CDP  - What's new in 2022LEO CDP  - What's new in 2022
LEO CDP - What's new in 2022Trieu Nguyen
 
Lộ trình triển khai LEO CDP cho ngành bất động sản
Lộ trình triển khai LEO CDP cho ngành bất động sảnLộ trình triển khai LEO CDP cho ngành bất động sản
Lộ trình triển khai LEO CDP cho ngành bất động sảnTrieu Nguyen
 
Why is LEO CDP important for digital business ?
Why is LEO CDP important for digital business ?Why is LEO CDP important for digital business ?
Why is LEO CDP important for digital business ?Trieu Nguyen
 
From Dataism to Customer Data Platform
From Dataism to Customer Data PlatformFrom Dataism to Customer Data Platform
From Dataism to Customer Data PlatformTrieu Nguyen
 
Data collection, processing & organization with USPA framework
Data collection, processing & organization with USPA frameworkData collection, processing & organization with USPA framework
Data collection, processing & organization with USPA frameworkTrieu Nguyen
 
Part 1: Introduction to digital marketing technology
Part 1: Introduction to digital marketing technologyPart 1: Introduction to digital marketing technology
Part 1: Introduction to digital marketing technologyTrieu Nguyen
 
Why is Customer Data Platform (CDP) ?
Why is Customer Data Platform (CDP) ?Why is Customer Data Platform (CDP) ?
Why is Customer Data Platform (CDP) ?Trieu Nguyen
 
How to build a Personalized News Recommendation Platform
How to build a Personalized News Recommendation PlatformHow to build a Personalized News Recommendation Platform
How to build a Personalized News Recommendation PlatformTrieu Nguyen
 
How to grow your business in the age of digital marketing 4.0
How to grow your business  in the age of digital marketing 4.0How to grow your business  in the age of digital marketing 4.0
How to grow your business in the age of digital marketing 4.0Trieu Nguyen
 
Video Ecosystem and some ideas about video big data
Video Ecosystem and some ideas about video big dataVideo Ecosystem and some ideas about video big data
Video Ecosystem and some ideas about video big dataTrieu Nguyen
 
Concepts, use cases and principles to build big data systems (1)
Concepts, use cases and principles to build big data systems (1)Concepts, use cases and principles to build big data systems (1)
Concepts, use cases and principles to build big data systems (1)Trieu Nguyen
 
Open OTT - Video Content Platform
Open OTT - Video Content PlatformOpen OTT - Video Content Platform
Open OTT - Video Content PlatformTrieu Nguyen
 
Apache Hadoop and Spark: Introduction and Use Cases for Data Analysis
Apache Hadoop and Spark: Introduction and Use Cases for Data AnalysisApache Hadoop and Spark: Introduction and Use Cases for Data Analysis
Apache Hadoop and Spark: Introduction and Use Cases for Data AnalysisTrieu Nguyen
 
Introduction to Recommendation Systems (Vietnam Web Submit)
Introduction to Recommendation Systems (Vietnam Web Submit)Introduction to Recommendation Systems (Vietnam Web Submit)
Introduction to Recommendation Systems (Vietnam Web Submit)Trieu Nguyen
 

More from Trieu Nguyen (20)

Building Your Customer Data Platform with LEO CDP in Travel Industry.pdf
Building Your Customer Data Platform with LEO CDP in Travel Industry.pdfBuilding Your Customer Data Platform with LEO CDP in Travel Industry.pdf
Building Your Customer Data Platform with LEO CDP in Travel Industry.pdf
 
Building Your Customer Data Platform with LEO CDP - Spa and Hotel Business
Building Your Customer Data Platform with LEO CDP - Spa and Hotel BusinessBuilding Your Customer Data Platform with LEO CDP - Spa and Hotel Business
Building Your Customer Data Platform with LEO CDP - Spa and Hotel Business
 
Building Your Customer Data Platform with LEO CDP
Building Your Customer Data Platform with LEO CDP Building Your Customer Data Platform with LEO CDP
Building Your Customer Data Platform with LEO CDP
 
How to track and improve Customer Experience with LEO CDP
How to track and improve Customer Experience with LEO CDPHow to track and improve Customer Experience with LEO CDP
How to track and improve Customer Experience with LEO CDP
 
[Notes] Customer 360 Analytics with LEO CDP
[Notes] Customer 360 Analytics with LEO CDP[Notes] Customer 360 Analytics with LEO CDP
[Notes] Customer 360 Analytics with LEO CDP
 
Leo CDP - Pitch Deck
Leo CDP - Pitch DeckLeo CDP - Pitch Deck
Leo CDP - Pitch Deck
 
LEO CDP - What's new in 2022
LEO CDP  - What's new in 2022LEO CDP  - What's new in 2022
LEO CDP - What's new in 2022
 
Lộ trình triển khai LEO CDP cho ngành bất động sản
Lộ trình triển khai LEO CDP cho ngành bất động sảnLộ trình triển khai LEO CDP cho ngành bất động sản
Lộ trình triển khai LEO CDP cho ngành bất động sản
 
Why is LEO CDP important for digital business ?
Why is LEO CDP important for digital business ?Why is LEO CDP important for digital business ?
Why is LEO CDP important for digital business ?
 
From Dataism to Customer Data Platform
From Dataism to Customer Data PlatformFrom Dataism to Customer Data Platform
From Dataism to Customer Data Platform
 
Data collection, processing & organization with USPA framework
Data collection, processing & organization with USPA frameworkData collection, processing & organization with USPA framework
Data collection, processing & organization with USPA framework
 
Part 1: Introduction to digital marketing technology
Part 1: Introduction to digital marketing technologyPart 1: Introduction to digital marketing technology
Part 1: Introduction to digital marketing technology
 
Why is Customer Data Platform (CDP) ?
Why is Customer Data Platform (CDP) ?Why is Customer Data Platform (CDP) ?
Why is Customer Data Platform (CDP) ?
 
How to build a Personalized News Recommendation Platform
How to build a Personalized News Recommendation PlatformHow to build a Personalized News Recommendation Platform
How to build a Personalized News Recommendation Platform
 
How to grow your business in the age of digital marketing 4.0
How to grow your business  in the age of digital marketing 4.0How to grow your business  in the age of digital marketing 4.0
How to grow your business in the age of digital marketing 4.0
 
Video Ecosystem and some ideas about video big data
Video Ecosystem and some ideas about video big dataVideo Ecosystem and some ideas about video big data
Video Ecosystem and some ideas about video big data
 
Concepts, use cases and principles to build big data systems (1)
Concepts, use cases and principles to build big data systems (1)Concepts, use cases and principles to build big data systems (1)
Concepts, use cases and principles to build big data systems (1)
 
Open OTT - Video Content Platform
Open OTT - Video Content PlatformOpen OTT - Video Content Platform
Open OTT - Video Content Platform
 
Apache Hadoop and Spark: Introduction and Use Cases for Data Analysis
Apache Hadoop and Spark: Introduction and Use Cases for Data AnalysisApache Hadoop and Spark: Introduction and Use Cases for Data Analysis
Apache Hadoop and Spark: Introduction and Use Cases for Data Analysis
 
Introduction to Recommendation Systems (Vietnam Web Submit)
Introduction to Recommendation Systems (Vietnam Web Submit)Introduction to Recommendation Systems (Vietnam Web Submit)
Introduction to Recommendation Systems (Vietnam Web Submit)
 

Recently uploaded

Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Boston Institute of Analytics
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfblazblazml
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksdeepakthakur548787
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Boston Institute of Analytics
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Seán Kennedy
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max PrincetonTimothy Spann
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Boston Institute of Analytics
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our WorldEduminds Learning
 
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxThe Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxTasha Penwell
 
Cyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataCyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataTecnoIncentive
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...Amil Baba Dawood bangali
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024Susanna-Assunta Sansone
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxaleedritatuxx
 
Decoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectDecoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectBoston Institute of Analytics
 
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...Milind Agarwal
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxMike Bennett
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data VisualizationKianJazayeri1
 

Recently uploaded (20)

Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing works
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max Princeton
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our World
 
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxThe Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
 
Cyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataCyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded data
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
 
Decoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectDecoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis Project
 
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data Visualization
 
Insurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis ProjectInsurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis Project
 

Building Real-time Data Products at LinkedIn with Apache Samza

  • 1.
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.
  • 7.
  • 8.
  • 9. {" "eventType: " "PageViewEvent," "3mestamp: " "1413215518," "viewerId:" " "1234," "sessionId: " "fa1afe101234deadbeef," "pageKey:" " "profileFview," "viewedProfileId:"4321," "trackingKey:" "invita3onFemail," "..."etc."metadata"about"what"content"was"displayed..." }"
  • 10.
  • 11.
  • 12.
  • 13.
  • 14.
  • 15.
  • 16.
  • 17.
  • 18.
  • 19.
  • 20.
  • 21.
  • 22. {" "eventType: " "ProfileEditEvent," "3mestamp: " "1413215518," "profileId:" " "1234," "old:"{" " "loca3on:" ""San"Francisco,"CA"," " "industry:" ""Internet"}," "new:"{" " "loca3on:" ""New"York,"NY"," " "industry:" ""Financial"Services"}" }"
  • 23.
  • 24.
  • 25.
  • 26.
  • 27.
  • 28.
  • 29.
  • 30.
  • 31.
  • 32.
  • 33.
  • 34.
  • 35.
  • 36.
  • 37.
  • 38.
  • 39.
  • 40.
  • 41.
  • 42.
  • 43.
  • 44.
  • 45.
  • 46.
  • 47.
  • 48. References (fun stuff to read)! 1.  Martin Kleppmann:“Designing data-intensive applications.” O’Reilly Media, to appear in 2015. http://dataintensive.net! 2.  Jay Kreps:“Why local state is a fundamental primitive in stream processing.” 31 July 2014. http://radar.oreilly.com/2014/07/why-local- state-is-a-fundamental-primitive-in-stream-processing.html! 3.  Jay Kreps:“I ︎ Logs.” O'Reilly Media, September 2014. http://shop.oreilly.com/product/0636920034339.do! 4.  Nathan Marz and James Warren:“Big Data: Principles and best practices of scalable realtime data systems.” Manning MEAP, to appear January 2015. http://manning.com/marz/! 5.  Jakob Homan:“Real time insights into LinkedIn's performance using Apache Samza.” 18 Aug 2014. http://engineering.linkedin.com/samza/ real-time-insights-linkedins-performance-using-apache-samza! 6.  Martin Kleppmann:“Moving faster with data streams:The rise of Samza at LinkedIn.” 14 July 2014. http://engineering.linkedin.com/stream- processing/moving-faster-data-streams-rise-samza-linkedin! 7.  Praveen Neppalli Naga:“Real-time Analytics at Massive Scale with Pinot.” 29 Sept 2014. http://engineering.linkedin.com/analytics/real- time-analytics-massive-scale-pinot! 8.  David He:“Monitor and Improve Web Performance Using RUM DataVisualization.” 19 Sept 2014. http://engineering.linkedin.com/ performance/monitor-and-improve-web-performance-using-rum-data-visualization! 9.  Lili Wu, Sam Shah, Sean Choi, Mitul Tiwari, and Christian Posse:“The Browsemaps: Collaborative Filtering at LinkedIn,” at 6th Workshop on Recommender Systems and the Social Web, Oct 2014. http://ls13-www.cs.uni-dortmund.de/homepage/rsweb2014/papers/ rsweb2014_submission_3.pdf! 10.  Shirshanka Das, Chavdar Botev, Kapil Surlaker, et al.:“All Aboard the Databus!,” at ACM Symposium on Cloud Computing (SoCC), October 2012. http://www.socc2012.org/s18-das.pdf! 11.  Apache Samza documentation. http://samza.incubator.apache.org!