Scalable Real-Time Complex Event Processing @Uber
Shuyi Chen
Uber Technology Inc.
Uber
● 6 continents, 70 countries and 400+ cities
● Transportation as reliable as running water, everywhere, for
everyone
Outline
• Motivation
• Architecture
• Limitations
• Challenges
Motivation
Uber is a data-driven company
Thousands of Kafka topics from different services
We can extract a lot of useful information from this rich
set of logs in real-time!
Window aggregation: multiple logins from the same IP in the last 10 minutes
Pattern detection: partner accepted a trip
→ partner calls rider through the Uber app
→ rider cancels the trip
Filter: partners reject the second pickup of an UberPOOL trip
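As a rough illustration of the window aggregation case (multiple logins from the same IP in the last 10 minutes), here is a minimal Python sketch. The class name, the threshold of 3 logins, and the assumption that events arrive in timestamp order are all hypothetical; this is not how the talk's system or Siddhi implements windows.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 10 * 60  # "last 10 minutes" from the slide
THRESHOLD = 3             # hypothetical: how many logins count as "multiple"


class LoginWindow:
    """Sliding time window of login timestamps, grouped by IP."""

    def __init__(self, window_seconds=WINDOW_SECONDS, threshold=THRESHOLD):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self.logins = defaultdict(deque)  # ip -> timestamps still in the window

    def on_login(self, ip, timestamp):
        """Record one login; return True if the IP reached the threshold."""
        window = self.logins[ip]
        window.append(timestamp)
        # Expire timestamps that have fallen out of the window.
        while window and window[0] <= timestamp - self.window_seconds:
            window.popleft()
        return len(window) >= self.threshold


detector = LoginWindow()
# Three logins within two minutes: the third one trips the threshold.
alerts = [detector.on_login("10.0.0.1", t) for t in (0, 60, 120)]
# A login much later starts from an empty window again.
late = detector.on_login("10.0.0.1", 2000)
```

Writing this by hand for every use case is the kind of repetition that motivates a declarative query language.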
Can we use declarative semantics to specify this stream
processing logic?
Complex event processing
• Combines data from multiple sources to infer events or patterns that
suggest more complicated circumstances
• CEP is used across many industries for various use cases, including:
– Finance: Trade analysis, fraud detection
– Airlines: Operations monitoring
– Healthcare: Claims processing, patient monitoring
– Energy and Telecommunications: Outage detection
• CEP uses a declarative rule/query language to specify event processing
logic
WSO2/Siddhi: Complex event processing engine
• Lightweight, extensible, open source, released as a Java library
• Features supported
– Filter
– Join
– Aggregation
– Group by
– Window
– Pattern processing
– Sequence processing
– Event tables
– Event-time processing
– UDF
– Extensions
– Declarative query language: SiddhiQL
How Siddhi works
• Specify processing logic declaratively with SiddhiQL
• The query is parsed at runtime into an execution plan runtime
• As events flow in, the execution plan runtime processes events inside
the CEP engine according to the query logic
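To make the idea of "a declarative query turned into a runtime" concrete, here is a small Python sketch, not SiddhiQL and not Siddhi's actual engine. A pattern is described as plain data (a list of event types) and compiled into a stateful matcher that is fed events one by one. The function name, the event field names, and the event type names are hypothetical.

```python
def compile_pattern(steps, key_field):
    """Turn a declarative list of event types into a stateful matcher.

    The returned function consumes events (dicts) and returns True when
    the last step of the pattern completes for the event's key.
    """
    progress = {}  # key -> index of the next expected step

    def on_event(event):
        key = event[key_field]
        expected = progress.get(key, 0)
        if event["type"] == steps[expected]:
            expected += 1
        elif event["type"] == steps[0]:
            expected = 1  # the pattern restarts on a fresh first step
        # Any other event leaves the partial match untouched.
        if expected == len(steps):
            progress.pop(key, None)
            return True
        progress[key] = expected
        return False

    return on_event


# The "query" is data, so it can be stored and loaded at runtime.
query = {
    "key": "trip_id",
    "steps": ["trip_accepted", "partner_called_rider", "rider_cancelled"],
}
matcher = compile_pattern(query["steps"], query["key"])

events = [
    {"trip_id": "t1", "type": "trip_accepted"},
    {"trip_id": "t2", "type": "trip_accepted"},
    {"trip_id": "t1", "type": "partner_called_rider"},
    {"trip_id": "t2", "type": "rider_cancelled"},
    {"trip_id": "t1", "type": "rider_cancelled"},
]
matches = [e["trip_id"] for e in events if matcher(e)]
```

Only trip `t1` goes through all three steps in order, so it is the only match.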
How can we make it scalable at Uber scale?
Apache Samza
• A distributed stream processing framework
– Distributed and scalable
– Built-in state management
– Built-in fault tolerance
– At-least-once message processing
How can we make the stream processing output useful?
Actions
• Generalize a set of common action templates to make it easy for
services and humans to harness the power of real-time stream
processing
• Currently we support
– Make an RPC call
– Invoke a Webhook endpoint
– Index to Elasticsearch
– Index to Cassandra
– Kafka
– Statsd
– Chat service
– Email
– Push notification
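One way to picture action templates is a registry that maps an action type to a handler, with each use case supplying only configuration. The Python sketch below is an assumption-laden illustration: the registry, the config fields (`type`, `topic`, `to`, `subject`) and the handlers are hypothetical, and the handlers only record what they would send instead of calling Kafka or a mail server.

```python
ACTIONS = {}  # action type -> handler function
sent = []     # stands in for the outside world in this sketch


def action(name):
    """Decorator that registers a handler under an action type."""
    def register(fn):
        ACTIONS[name] = fn
        return fn
    return register


@action("kafka")
def publish_to_kafka(config, event):
    # A real handler would produce to the configured topic.
    sent.append(("kafka", config["topic"], event))


@action("email")
def send_email(config, event):
    # The subject is a template filled in from the output event.
    sent.append(("email", config["to"], config["subject"].format(**event)))


def run_actions(action_configs, event):
    """Apply every configured action to one CEP output event."""
    for config in action_configs:
        ACTIONS[config["type"]](config, event)


configs = [
    {"type": "kafka", "topic": "suspicious-logins"},
    {"type": "email", "to": "oncall@example.com",
     "subject": "Suspicious logins from {ip}"},
]
run_actions(configs, {"ip": "10.0.0.1", "count": 5})
```

Adding a new action type then means registering one handler; existing use cases are untouched.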
Real-time Scalable Complex Event Processing
Architecture
Partitioner
• Re-partition events based on key
• Support predicate pushdown through query analysis
• Support column pruning through query analysis (WIP)
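The three partitioner responsibilities can be sketched together in a few lines of Python. This is an illustration under assumptions, not the talk's implementation: the function names, the CRC32 hash, and the event fields are hypothetical, and the predicate and column list are passed in by hand here, whereas the slides say they are derived through query analysis.

```python
import zlib


def partition_for(key, num_partitions):
    """Stable hash so the same key always lands on the same partition."""
    return zlib.crc32(str(key).encode("utf-8")) % num_partitions


def repartition(events, key_field, num_partitions, predicate=None, columns=None):
    """Route events to partitions by key, filtering and pruning early."""
    partitions = [[] for _ in range(num_partitions)]
    for event in events:
        # Predicate pushdown: drop events the query would discard anyway,
        # before they are sent downstream.
        if predicate is not None and not predicate(event):
            continue
        key = event[key_field]
        # Column pruning: forward only the fields the query reads.
        if columns is not None:
            event = {name: event[name] for name in columns}
        partitions[partition_for(key, num_partitions)].append(event)
    return partitions


events = [
    {"rider_id": "r1", "status": "cancelled", "city": "SF"},
    {"rider_id": "r2", "status": "completed", "city": "NYC"},
    {"rider_id": "r1", "status": "cancelled", "city": "SF"},
]
parts = repartition(
    events,
    key_field="rider_id",
    num_partitions=4,
    predicate=lambda e: e["status"] == "cancelled",
    columns=["rider_id", "status"],
)
```

Both surviving events share the key `r1`, so they end up in the same partition and can be processed by the same downstream task.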
Query processor
• Parse Siddhi queries into execution plan runtime
• Process events in Siddhi execution plan runtime
• Checkpoint state regularly using RocksDB to ensure recovery upon
crash/restart
Action processor
• Execute actions upon the complex event processing output
• Support various kinds of actions for easy integration
• Implement action retry mechanism using RocksDB to provide
at-least-once delivery
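A minimal sketch of the retry idea, in Python: persist the action before attempting it, and delete it only after the call succeeds. A plain dict stands in for RocksDB, and the class and method names are hypothetical. If the process crashed after sending but before deleting, the action would be sent again on recovery, which is why this gives at-least-once rather than exactly-once delivery.

```python
class ActionProcessor:
    """Retries pending actions until the sender accepts them."""

    def __init__(self, store, send):
        self.store = store  # dict standing in for a durable store
        self.send = send    # callable that raises on failure

    def submit(self, action_id, payload):
        # Persist first, so a crash before delivery does not lose the action.
        self.store[action_id] = payload
        self.flush()

    def flush(self):
        """Try to deliver everything still pending."""
        for action_id, payload in list(self.store.items()):
            try:
                self.send(payload)
            except Exception:
                continue  # keep it in the store for the next retry
            del self.store[action_id]


delivered = []
attempts = {"count": 0}


def flaky_send(payload):
    """Hypothetical endpoint that fails on the first call only."""
    attempts["count"] += 1
    if attempts["count"] == 1:
        raise ConnectionError("endpoint unavailable")
    delivered.append(payload)


store = {}
processor = ActionProcessor(store, flaky_send)
processor.submit("a1", {"alert": "fraud"})  # first attempt fails
pending_after_failure = dict(store)
processor.flush()                           # retry succeeds
```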
How do we translate a query into a physical plan that
runs?
DAG (Directed Acyclic Graph) generation
• Analyze Siddhi query to automatically generate the stream
processing DAG in Samza using the processors
Filter, transformation
Join, window, pattern
More complicated
No stream processing logic is hard-coded in any of the
processors
REST API backend
• All queries and actions are stored externally in a database
• RESTful API for CRUD operations
• If query/action logic changes
– Redeploy the Samza DAG if needed
– Otherwise, the updated queries/actions will be loaded at runtime w/o
interruption
Unified management and monitoring
• Every use case
– Shares the same set of processors
– Uses queries and actions to describe its processing logic
• A single monitoring template can be reused across different use
cases
Production status
• 100+ production use cases
• 30+ billion messages processed per day
Applications
• Real-time fraud detection
• Real-time anomaly detection
• Real-time marketing campaign
• Real-time promotion
• Real-time monitoring
• Real-time feedback system
• Real-time analytics
• Real-time visualizations
• And more
Limitations
Out-of-order event handling
• Not a big concern
– Events of the same rider/partner are usually seconds apart
• K-slack extension in Siddhi for out-of-order event processing
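The K-slack idea is to hold events in a buffer and release an event only once the largest timestamp seen so far is at least K ahead of it, so that late arrivals within K can still be slotted into order. The Python sketch below keeps K fixed for simplicity, which is an assumption; it is not the code of the Siddhi extension, and the class name is hypothetical.

```python
import heapq


class KSlackBuffer:
    """Reorders events that arrive at most k time units late."""

    def __init__(self, k):
        self.k = k
        self.max_ts = float("-inf")  # largest timestamp seen so far
        self.heap = []               # min-heap ordered by timestamp
        self.seq = 0                 # tie-breaker for equal timestamps

    def add(self, timestamp, event):
        """Buffer one event; return the events that are now safe to emit."""
        self.max_ts = max(self.max_ts, timestamp)
        heapq.heappush(self.heap, (timestamp, self.seq, event))
        self.seq += 1
        released = []
        while self.heap and self.heap[0][0] <= self.max_ts - self.k:
            ts, _, ev = heapq.heappop(self.heap)
            released.append((ts, ev))
        return released

    def drain(self):
        """Emit everything left, in timestamp order (e.g. at shutdown)."""
        released = []
        while self.heap:
            ts, _, ev = heapq.heappop(self.heap)
            released.append((ts, ev))
        return released


buffer = KSlackBuffer(k=2)
arrivals = [(1, "a"), (3, "c"), (2, "b"), (5, "d"), (4, "e"), (8, "f")]
emitted = []
for ts, ev in arrivals:
    emitted.extend(buffer.add(ts, ev))
emitted.extend(buffer.drain())
```

The trade-off is latency: every event waits until the stream has advanced by K before it is emitted.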
Auto-scaling
• Manually re-partition Kafka topics to increase parallelism
• Manually tune container memory if needed
• Future
– Use CPU/memory/IO stats to auto-scale the data pipelines
Challenges
Large checkpointing state
• Samza uses Kafka to log state changes
• Siddhi engine snapshots can be large
• Kafka limits message size to 1 MB by default
• Solution: we built logic to slice the state into smaller pieces and
checkpoint them
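The slicing step can be sketched as follows in Python. This is an illustration under assumptions, not the talk's code: the chunk size, the chunk record layout (`index`, `total`, `data`) and the function names are hypothetical. Each chunk carries enough metadata to reassemble the snapshot and to detect a missing piece.

```python
MAX_CHUNK_BYTES = 1_000_000  # hypothetical size chosen to stay under the broker limit


def slice_snapshot(snapshot, max_chunk_bytes=MAX_CHUNK_BYTES):
    """Split a serialized snapshot into chunks no larger than the limit."""
    total = max(1, -(-len(snapshot) // max_chunk_bytes))  # ceiling division
    return [
        {
            "index": i,
            "total": total,
            "data": snapshot[i * max_chunk_bytes:(i + 1) * max_chunk_bytes],
        }
        for i in range(total)
    ]


def assemble_snapshot(chunks):
    """Rebuild the snapshot; refuse to restore from an incomplete set."""
    ordered = sorted(chunks, key=lambda c: c["index"])
    if not ordered or len(ordered) != ordered[0]["total"]:
        raise ValueError("incomplete snapshot")
    return b"".join(c["data"] for c in ordered)


snapshot = bytes(range(256)) * 10_000  # 2,560,000 bytes of fake state
chunks = slice_snapshot(snapshot)
restored = assemble_snapshot(list(reversed(chunks)))  # arrival order may differ
```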
Synchronous checkpointing
• If state is large, time to checkpoint can be long
• Samza uses a single-threaded model, so it is unsafe to checkpoint
asynchronously (SAMZA-863)
Exactly once state processing?
• Cannot commit state and offset atomically
• No exactly-once state processing
Custom business logic
• Common logic implemented as Siddhi extensions
• Ad-hoc logic implemented as UDFs in JavaScript or Scala
Intermediate Kafka messages
• Samza uses Kafka as message queue for intermediate processing
output
– This can create a large load on Kafka if a heavy topic is partitioned
multiple times
– Encode the intermediate messages to reduce footprint
Multi-tenancy
• Older Siddhi versions process events using a thread pool
– Bad for multi-tenancy in YARN
– Consumes more CPU resources than claimed
• Newer versions still use a thread pool for scheduled tasks, but main
processing runs in a single thread
– Good: CPU consumption per YARN container is bounded
Upgrading Samza jobs
• Upgrading Samza jobs requires a full restart, which can take minutes
due to
– Offset checkpointing topic too large → set retention to hours
– Changelog topic too large → set retention or enable compaction in
Kafka, or use host affinity (SAMZA-617)
• To minimize the interruption during upgrade, it would be nice to
have
– Rolling restart
– Per container restart
Our solution: non-interrupted handoff
• For critical jobs, we use replication during upgrade
– Start a shadow job
– Upgrade shadow
– Switch primary and shadow
– Upgrade primary
– Switch back
• Downside: requires 2x capacity during upgrade
Thank You!

From Family Reminiscence to Scholarly Archive .Alan Dix
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 

Recently uploaded (20)

TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 

Scalable Real-Time CEP at Uber

  • 1. Scalable Real-Time Complex Event Processing @Uber Shuyi Chen Uber Technology Inc.
  • 2. ● 6 continents, 70 countries and 400+ cities ● Transportation as reliable as running water, everywhere, for everyone Uber
  • 3. Outline • Motivation • Architecture • Limitations • Challenges
  • 4. Outline • Motivation • Architecture • Limitations • Challenges
  • 5. Uber is a data-driven company
  • 6. Thousands of Kafka topics from different services
  • 7. We can extract a lot of useful information from this rich set of logs in real-time!
  • 8. Multiple logins from the same IP in the last 10 minutes
  • 9. Partner accepted a trip → partner calls rider through the Uber APP → rider cancels the trip
  • 10. Partners reject the second pickup of an UberPOOL trip
  • 11. Multiple logins from the same IP in the last 10 minutes Window Aggregation
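As a rough illustration of the window aggregation named on this slide, the sketch below counts logins per IP over a sliding 10-minute window. It is plain Python, not Siddhi; the class name `LoginWindow`, the threshold of 3, and the assumption that timestamps arrive in order are all hypothetical choices made for the example.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 10 * 60  # "the last 10 minutes" from the slide
THRESHOLD = 3             # hypothetical: how many logins count as "multiple"


class LoginWindow:
    """Sliding time window of login timestamps, kept separately per IP."""

    def __init__(self, window_seconds=WINDOW_SECONDS, threshold=THRESHOLD):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self._logins = defaultdict(deque)  # ip -> timestamps inside the window

    def on_login(self, ip, timestamp):
        """Record a login; return True if this IP reached the threshold.

        Assumes timestamps (in seconds) arrive in non-decreasing order.
        """
        window = self._logins[ip]
        window.append(timestamp)
        # Evict logins that have fallen out of the window.
        while window and window[0] <= timestamp - self.window_seconds:
            window.popleft()
        return len(window) >= self.threshold
```

Each call to `on_login` does work proportional to the number of expired entries, so the per-event cost stays small even for busy IPs.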
  • 12. Partner accepted a trip → partner calls rider through the Uber APP → rider cancels the trip Pattern detection
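The pattern on this slide can be pictured as a small state machine per trip. The following is a minimal sketch, assuming hypothetical event names (`trip_accepted`, `partner_called_rider`, `rider_cancelled`) and assuming that unrelated events between the steps are simply ignored; it is not how Siddhi implements pattern processing.

```python
# Hypothetical event type names; the slides do not give the real ones.
PATTERN = ("trip_accepted", "partner_called_rider", "rider_cancelled")


class SequenceDetector:
    """Tracks, per trip, how far along the ordered pattern we are."""

    def __init__(self, pattern=PATTERN):
        self.pattern = pattern
        self._progress = {}  # trip_id -> index of the next expected step

    def on_event(self, trip_id, event_type):
        """Feed one event; return True when the full pattern has matched."""
        idx = self._progress.get(trip_id, 0)
        if event_type == self.pattern[idx]:
            idx += 1
            if idx == len(self.pattern):
                # Pattern complete: forget the trip and report the match.
                self._progress.pop(trip_id, None)
                return True
            self._progress[trip_id] = idx
        # Events that are not the next expected step are ignored.
        return False
```

Keying the state by trip is what lets many trips progress through the pattern independently on the same stream.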
  • 13. Partners reject the second pickup of an UberPOOL trip Filter
  • 14. Can we use declarative semantics to specify this stream processing logic?
  • 15. Complex event processing • Combines data from multiple sources to infer events or patterns that suggest more complicated circumstances • CEP is used across many industries for various use cases, including: – Finance: Trade analysis, fraud detection – Airlines: Operations monitoring – Healthcare: Claims processing, patient monitoring – Energy and Telecommunications: Outage detection • CEP uses a declarative rule/query language to specify event processing logic
  • 16. WSO2/Siddhi: Complex event processing engine • Lightweight, extensible, open source, released as a Java library • Features supported – Filter – Join – Aggregation – Group by – Window – Pattern processing – Sequence processing – Event tables – Event-time processing – UDF – Extensions – Declarative query language: SiddhiQL
  • 17. How Siddhi works • Specify processing logic declaratively with SiddhiQL
  • 18. How Siddhi works • Query is parsed at runtime into an execution plan runtime • As events flow in, the execution plan runtime processes events inside the CEP engine according to the query logic
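To make the "parse at runtime, then process events" idea concrete, here is a toy sketch in Python. The query format (a dict with `where` and `select`), the function `compile_query`, and the event fields are all invented for illustration; real Siddhi queries are written in SiddhiQL and compiled by the Siddhi library.

```python
import operator

# Comparison operators the toy query format supports (an assumption).
OPS = {"==": operator.eq, "!=": operator.ne, ">": operator.gt, "<": operator.lt}


def compile_query(query):
    """Turn a declarative query description into a callable 'plan'.

    The plan returns the selected fields for matching events, else None.
    """
    predicates = [(c["field"], OPS[c["op"]], c["value"]) for c in query["where"]]
    fields = list(query["select"])

    def plan(event):
        for field, op, value in predicates:
            if field not in event or not op(event[field], value):
                return None
        return {f: event.get(f) for f in fields}

    return plan
```

Because the query is data rather than code, it can be stored externally and recompiled when it changes, without touching the processor that runs the plan.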
  • 19. How can we make it scalable at Uber scale?
  • 20. Apache Samza • A distributed stream processing framework – Distributed and scalable – Built-in state management – Built-in fault tolerance – At-least-once message processing
  • 21. How can we make the stream processing output useful?
  • 22. Actions • Generalize a set of common action templates to make it easy for services and humans to harness the power of real-time stream processing • Currently we support – Make an RPC call – Invoke a Webhook endpoint – Index to ElasticSearch – Index to Cassandra – Kafka – Statsd – Chat service – Email – Push notification
  • 24. Outline • Motivation • Architecture • Limitations • Challenges
  • 27. Partitioner • Re-partition events based on key • Support predicate pushdown through query analysis • Support column pruning through query analysis (WIP)
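The three partitioner responsibilities can be sketched together. This is a simplified in-memory illustration, assuming events are dicts and partitions are plain lists; `partition_for` and `repartition` are hypothetical names, and the CRC32 hash is just a convenient stable hash, not necessarily what the real system uses.

```python
import zlib


def partition_for(key, num_partitions):
    """Stable hash partitioning: the same key always maps to the same partition."""
    return zlib.crc32(str(key).encode("utf-8")) % num_partitions


def repartition(events, key_field, num_partitions, predicate=None, columns=None):
    """Re-partition events by key, filtering and pruning before they are emitted.

    predicate: filter derived from the query (predicate pushdown).
    columns:   fields the query actually reads (column pruning).
    """
    partitions = {p: [] for p in range(num_partitions)}
    for event in events:
        # Pushdown: drop events the downstream query would discard anyway.
        if predicate is not None and not predicate(event):
            continue
        target = partition_for(event[key_field], num_partitions)
        # Pruning: forward only the fields the query needs.
        if columns is not None:
            event = {c: event[c] for c in columns if c in event}
        partitions[target].append(event)
    return partitions
```

Filtering and pruning before re-partitioning reduces the volume written to the intermediate topic, which matters because every re-partition is another pass through the message queue.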
  • 28. Query processor • Parse Siddhi queries into an execution plan runtime • Process events in the Siddhi execution plan runtime • Checkpoint state regularly using RocksDB to ensure recovery upon crash/restart
  • 29. Action processor • Execute actions upon the complex event processing output • Support various kinds of actions for easy integration • Implement an action retry mechanism using RocksDB to provide at-least-once delivery
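One common way to get at-least-once delivery is to persist an action before attempting it and delete it only after it succeeds. The sketch below follows that idea, with a plain dict standing in for the RocksDB store; the class `ActionProcessor` and its methods are hypothetical, and the slides do not describe the actual implementation.

```python
class ActionProcessor:
    """Retry-until-success action execution backed by a pending-action store."""

    def __init__(self, send, store=None):
        self.send = send  # callable that performs the action; may raise
        # A dict stands in for a durable key-value store such as RocksDB.
        self.pending = store if store is not None else {}

    def submit(self, action_id, payload):
        """Persist the action first, then try to execute it."""
        self.pending[action_id] = payload
        return self._attempt(action_id)

    def _attempt(self, action_id):
        try:
            self.send(self.pending[action_id])
        except Exception:
            # Leave the action in the store so a later retry picks it up.
            return False
        # Remove only after success; a crash before this line means a resend.
        del self.pending[action_id]
        return True

    def retry_pending(self):
        """Re-attempt every action that has not yet succeeded."""
        for action_id in list(self.pending):
            self._attempt(action_id)
```

The trade-off is visible in `_attempt`: a failure between sending and deleting causes a duplicate, so receivers need to tolerate repeated actions.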
  • 30. How do we translate a query into a physical plan that runs?
  • 31. DAG (Directed Acyclic Graph) generation • Analyze Siddhi query to automatically generate the stream processing DAG in Samza using the processors Filter, transformation
  • 34. No stream processing logic is hard-coded in any of the processors
  • 35. REST API backend • All queries and actions are stored externally in a database. • RESTful API for CRUD operations • If query/action logic changes – Redeploy the Samza DAG if needed – Otherwise, the updated queries/actions will be loaded at runtime w/o interruption
  • 36. Unified management and monitoring • Every use case – share the same set of processors – Use queries and actions to describe its processing logic • A single monitoring template can be reused across different use cases
  • 37. Production status • 100+ production use cases • 30+ billion messages processed per day
  • 38. Applications • Real-time fraud detection • Real-time anomaly detection • Real-time marketing campaign • Real-time promotion • Real-time monitoring • Real-time feedback system • Real-time analytics • Real-time visualizations • Etc.
  • 39. Outline • Motivation • Architecture • Limitations • Challenges
  • 40. Out-of-order event handling • Not a big concern – Events of the same rider/partner are usually seconds apart • K-slack extension in Siddhi for out-of-order event processing
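The general K-slack idea is to hold events in a buffer and release them in timestamp order once they are at least K time units older than the newest timestamp seen. The sketch below is a generic version of that technique, not the Siddhi extension itself; `KSlackBuffer` and its methods are hypothetical names.

```python
import heapq
import itertools


class KSlackBuffer:
    """Reorders events that arrive at most k time units out of order."""

    def __init__(self, k):
        self.k = k
        self._heap = []                 # min-heap ordered by timestamp
        self._max_ts = None             # largest timestamp seen so far
        self._seq = itertools.count()   # tie-breaker for equal timestamps

    def add(self, ts, event):
        """Buffer one event; return events that are now safe to emit, in order."""
        heapq.heappush(self._heap, (ts, next(self._seq), event))
        self._max_ts = ts if self._max_ts is None else max(self._max_ts, ts)
        released = []
        # Anything at least k older than the newest timestamp is released.
        while self._heap and self._heap[0][0] <= self._max_ts - self.k:
            t, _, e = heapq.heappop(self._heap)
            released.append((t, e))
        return released

    def flush(self):
        """Emit everything still buffered, in timestamp order."""
        released = []
        while self._heap:
            t, _, e = heapq.heappop(self._heap)
            released.append((t, e))
        return released
```

A larger `k` tolerates more disorder at the cost of added latency, since every event waits until the stream has advanced `k` past it.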
  • 41. Auto-scaling • Manually re-partition Kafka topics to increase parallelism • Manually tune container memory if needed • Future – Use CPU/memory/IO stats to auto-scale the data pipelines
  • 42. Outline • Motivation • Architecture • Limitations • Challenges
  • 43. Large checkpointing state • Samza uses Kafka to log state changes • Siddhi engine snapshots can be large • Kafka limits message size to 1 MB by default • Solution: we built logic to slice state into smaller pieces and checkpoint them.
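Slicing a snapshot into pieces that fit under a message size limit can be sketched as follows. The chunk format `(index, total, payload)`, the function names, and the 1,000,000-byte default are assumptions for illustration; the slides do not describe the actual encoding.

```python
MAX_CHUNK_BYTES = 1_000_000  # hypothetical cap, chosen to stay under a ~1 MB limit


def slice_state(snapshot, max_chunk_bytes=MAX_CHUNK_BYTES):
    """Split a serialized snapshot into (index, total, payload) chunks."""
    total = max(1, -(-len(snapshot) // max_chunk_bytes))  # ceiling division
    return [
        (i, total, snapshot[i * max_chunk_bytes:(i + 1) * max_chunk_bytes])
        for i in range(total)
    ]


def assemble_state(chunks):
    """Rebuild the snapshot; raise ValueError if any chunk is missing."""
    ordered = sorted(chunks, key=lambda c: c[0])
    if not ordered:
        raise ValueError("no chunks to assemble")
    total = ordered[0][1]
    if [c[0] for c in ordered] != list(range(total)):
        raise ValueError("incomplete or duplicated chunks")
    return b"".join(c[2] for c in ordered)
```

Carrying the index and total in every chunk lets the reader detect a partially written checkpoint and fall back to an older complete one rather than restoring corrupt state.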
  • 44. Synchronous checkpointing • If state is large, time to checkpoint can be long • Samza uses a single-threaded model, so it is unsafe to checkpoint asynchronously (SAMZA-863)
  • 45. Exactly once state processing? • Cannot commit state and offset atomically • No exactly once state processing
  • 46. Custom business logic • Common logic implemented as Siddhi extensions • Ad-hoc logic implemented as UDF in JavaScript or Scala
  • 47. Intermediate Kafka messages • Samza uses Kafka as message queue for intermediate processing output – This can create large load on Kafka if a heavy topic is partitioned multiple times – Encode the intermediate messages to reduce footprint
  • 48. Multi-tenancy • Older Siddhi versions process events using a thread pool – Bad for multi-tenancy in YARN – Consume more CPU resources than claimed • Newer versions still use a thread pool for scheduled tasks, but do the main processing in a single thread – Good: CPU consumption per YARN container is bounded
  • 49. Upgrading Samza jobs • Upgrading Samza jobs requires a full restart, and can take minutes due to – Offset checkpointing topic too large → set retention to hours – Changelog topic too large → set retention or enable compaction in Kafka or host affinity (SAMZA-617) • To minimize the interruption during upgrade, it would be nice to have – Rolling restart – Per container restart
  • 50. Our solution: non-interrupted handoff • For critical jobs, we use replication during upgrade – Start a shadow job – Upgrade shadow – Switch primary and shadow – Upgrade primary – Switch back • Downside: requires 2x capacity during upgrade