The Marketplace data team at Uber has built a scalable complex event processing platform to solve many challenging real-time data needs for various Uber products. This platform has been in production for more than a year and supports over 100 real-time data use cases with a team of three. In this talk, we will share the details of the design, our experience building it, and how we employ Siddhi, Kafka, and Samza at scale.
14. Can we use declarative semantics to specify this stream
processing logic?
15. Complex event processing
• Combines data from multiple sources to infer events or patterns that
suggest more complicated circumstances
• CEP is used across many industries for various use cases, including:
– Finance: Trade analysis, fraud detection
– Airlines: Operations monitoring
– Healthcare: Claims processing, patient monitoring
– Energy and Telecommunications: Outage detection
• CEP uses a declarative rule/query language to specify event
processing logic
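The kind of circumstance inference described above can be sketched in a few lines. This is a hypothetical fraud-detection rule (a card used in two different cities within a minute), purely illustrative and not code from the platform:

```python
# Minimal sketch of a CEP-style pattern rule (hypothetical event shapes,
# not Siddhi's implementation): flag a card used in two different cities
# within 60 seconds -- the kind of condition a CEP rule language
# expresses declaratively.

def detect_fraud(events, window_sec=60):
    """events: list of dicts with 'card', 'city', 'ts' (seconds)."""
    last_seen = {}          # card -> (city, ts) of the previous event
    alerts = []
    for e in sorted(events, key=lambda e: e["ts"]):
        prev = last_seen.get(e["card"])
        if prev and prev[0] != e["city"] and e["ts"] - prev[1] <= window_sec:
            alerts.append((e["card"], prev[0], e["city"]))
        last_seen[e["card"]] = (e["city"], e["ts"])
    return alerts

events = [
    {"card": "c1", "city": "SF",  "ts": 0},
    {"card": "c1", "city": "NYC", "ts": 30},   # 30 s later, other city
    {"card": "c2", "city": "LA",  "ts": 10},
    {"card": "c2", "city": "LA",  "ts": 40},   # same city: fine
]
print(detect_fraud(events))   # [('c1', 'SF', 'NYC')]
```

A CEP engine lets you state this condition as a declarative pattern over streams instead of hand-writing the bookkeeping.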
16. WSO2/Siddhi: Complex event processing engine
• Lightweight, extensible, open source, released as a Java library
• Features supported
– Filter
– Join
– Aggregation
– Group by
– Window
– Pattern processing
– Sequence processing
– Event tables
– Event-time processing
– UDF
– Extensions
– Declarative query language: SiddhiQL
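As a rough illustration of what a window + group-by + aggregation query computes (the class and event shapes below are made up for this sketch, not SiddhiQL or Siddhi APIs):

```python
# Illustrative sketch (not Siddhi code) of a sliding time window with
# group-by aggregation: average fare per city over the last minute.
from collections import deque, defaultdict

class TimeWindowAvg:
    def __init__(self, window_sec):
        self.window_sec = window_sec
        self.events = deque()            # (ts, city, fare), oldest first

    def on_event(self, ts, city, fare):
        self.events.append((ts, city, fare))
        # expire events older than the window
        while self.events and self.events[0][0] < ts - self.window_sec:
            self.events.popleft()
        sums = defaultdict(lambda: [0.0, 0])
        for _, c, f in self.events:
            sums[c][0] += f
            sums[c][1] += 1
        return {c: s / n for c, (s, n) in sums.items()}

w = TimeWindowAvg(window_sec=60)
w.on_event(0, "SF", 10.0)
w.on_event(30, "SF", 20.0)
snapshot = w.on_event(90, "NYC", 8.0)   # the ts=0 event has expired
print(snapshot)   # {'SF': 20.0, 'NYC': 8.0}
```

In SiddhiQL the same intent is written declaratively (window, `group by`, aggregate) and the engine maintains the incremental state for you.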
18. How Siddhi works
• A query is parsed at runtime into an execution plan runtime
• As events flow in, the execution plan runtime processes events inside
the CEP engine according to the query logic
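The parse-once, run-many flow can be sketched with a toy one-clause filter language (purely illustrative; Siddhi's actual parser and execution plan runtime are far richer):

```python
# Sketch of the parse-once / run-many idea (hypothetical mini-language,
# not SiddhiQL): the query text is compiled into an execution plan once,
# then each incoming event is evaluated against the compiled plan.
import operator

OPS = {">": operator.gt, "<": operator.lt, "==": operator.eq}

def compile_filter(query):
    field, op, value = query.split()          # e.g. "fare > 50"
    op_fn, threshold = OPS[op], float(value)
    return lambda event: op_fn(event[field], threshold)  # the "plan"

plan = compile_filter("fare > 50")            # parsed once at startup
events = [{"fare": 30.0}, {"fare": 80.0}, {"fare": 55.0}]
matched = [e for e in events if plan(e)]      # events flow through the plan
print(matched)   # [{'fare': 80.0}, {'fare': 55.0}]
```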
20. Apache Samza
• A distributed stream processing framework
– Distributed and scalable
– Built-in state management
– Built-in fault tolerance
– At-least-once message processing
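At-least-once semantics typically falls out of processing events before committing the offset; a minimal sketch of that failure mode (the `Checkpoint` class and crash flag are invented for illustration, not Samza APIs):

```python
# Sketch of why "process, then commit the offset" yields at-least-once:
# a crash after processing but before the commit replays the events
# since the last committed offset.

class Checkpoint:
    def __init__(self):
        self.offset = 0
    def commit(self, offset):
        self.offset = offset

def run(messages, checkpoint, out, crash_before_commit_at=None):
    for i in range(checkpoint.offset, len(messages)):
        out.append(messages[i])               # side effect: "process"
        if i == crash_before_commit_at:
            return                            # crash before committing
        checkpoint.commit(i + 1)

cp, out = Checkpoint(), []
run(["a", "b", "c"], cp, out, crash_before_commit_at=1)  # die after "b"
run(["a", "b", "c"], cp, out)                            # restart, replay
print(out)   # ['a', 'b', 'b', 'c'] -- "b" processed twice, never lost
```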
21. How can we make the stream processing output useful?
22. Actions
• Generalize a set of common action templates to make it easy for
services and humans to harness the power of real-time stream
processing
• Currently we support
– Make an RPC call
– Invoke a Webhook endpoint
– Index to Elasticsearch
– Index to Cassandra
– Kafka
– Statsd
– Chat service
– Email
– Push notification
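A sketch of the action-template idea: a registry maps each action type to a handler, so adding a use case means declaring configuration rather than writing integration code. All names and shapes below are assumptions, not the platform's actual interfaces:

```python
# Sketch of a generalized action-template registry (hypothetical shapes):
# a use case declares {"type": ..., params...} and the matching handler
# performs the side effect.

sent = []   # stand-in for real side effects (RPC, email, Kafka produce...)

ACTIONS = {
    "webhook": lambda cfg, ev: sent.append(("POST", cfg["url"], ev)),
    "kafka":   lambda cfg, ev: sent.append(("produce", cfg["topic"], ev)),
    "email":   lambda cfg, ev: sent.append(("mail", cfg["to"], ev)),
}

def execute(action_cfg, event):
    ACTIONS[action_cfg["type"]](action_cfg, event)

execute({"type": "webhook", "url": "https://example.com/alert"}, {"id": 1})
execute({"type": "kafka", "topic": "alerts"}, {"id": 2})
print(sent)
```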
27. Partitioner
• Re-partition events based on key
• Support predicate pushdown through query analysis
• Support column pruning through query analysis (WIP)
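One way to picture predicate pushdown in the partitioner (illustrative only): the downstream query's filter is applied before the shuffle, so non-matching events never reach the intermediate topic:

```python
# Sketch of a key-based partitioner with predicate pushdown: the filter
# extracted from query analysis runs *before* re-partitioning, shrinking
# the intermediate traffic. All names here are illustrative.

def partition(events, key_fn, num_partitions, pushed_predicate=None):
    partitions = [[] for _ in range(num_partitions)]
    for e in events:
        if pushed_predicate and not pushed_predicate(e):
            continue                       # dropped before the shuffle
        partitions[hash(key_fn(e)) % num_partitions].append(e)
    return partitions

events = [{"city": "SF", "fare": 80}, {"city": "SF", "fare": 10},
          {"city": "NYC", "fare": 60}]
parts = partition(events, key_fn=lambda e: e["city"], num_partitions=2,
                  pushed_predicate=lambda e: e["fare"] > 50)
print(sum(len(p) for p in parts))   # 2 events survive the pushed filter
```

Column pruning follows the same logic: project away unused fields before the shuffle instead of after.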
28. Query processor
• Parse Siddhi queries into an execution plan runtime
• Process events in the Siddhi execution plan runtime
• Checkpoint state to RocksDB regularly to ensure recovery upon
crash/restart
29. Action processor
• Execute actions upon the complex event processing output
• Support various kinds of actions for easy integration
• Implement action retry mechanism using RocksDB to provide
at-least-once delivery
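The retry mechanism might look roughly like this sketch, where a plain dict stands in for the RocksDB store: pending actions are persisted before execution and removed only after success, so a crash replays them:

```python
# Sketch of an at-least-once action retry loop (a dict stands in for
# RocksDB; function names are invented for illustration).
import itertools

store = {}                     # stand-in for RocksDB: action_id -> payload
_ids = itertools.count()

def enqueue(payload):
    store[next(_ids)] = payload

def drain(execute, max_retries=3):
    delivered = []
    for action_id in sorted(store):
        payload = store[action_id]
        for attempt in range(max_retries):
            if execute(payload, attempt):
                delivered.append(payload)
                del store[action_id]       # remove only after success
                break
        # else: stays in the store; a later drain will retry it
    return delivered

enqueue("alert-1")
flaky = lambda payload, attempt: attempt >= 1   # fails once, then works
delivered = drain(flaky)
print(delivered)   # ['alert-1'] -- delivered on the second attempt
```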
30. How do we translate a query into a physical plan that
runs?
31. DAG (Directed Acyclic Graph) generation
• Analyze Siddhi query to automatically generate the stream
processing DAG in Samza using the processors
– Example DAG stages: filter, transformation
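A toy version of the query-analysis step (the real system analyzes the Siddhi query; the string matching below is only a stand-in): if the query aggregates by a key, a partitioner stage is inserted ahead of the query processor:

```python
# Sketch of deriving a processing DAG from query analysis. A "group by"
# means all events for a key must land on the same task, so a
# partitioner stage is placed before the query processor; the action
# processor always terminates the DAG.

def build_dag(query):
    stages = []
    if "group by" in query:             # key-wise processing needed
        key = query.split("group by")[1].split()[0]
        stages.append(("partitioner", key))
    stages.append(("query_processor", query))
    stages.append(("action_processor", None))
    return stages

dag = build_dag("select city, avg(fare) from trips group by city")
print([name for name, _ in dag])
# ['partitioner', 'query_processor', 'action_processor']
```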
35. REST API backend
• All queries and actions are stored externally in a database
• RESTful API for CRUD operations
• If query/action logic changes
– Redeploy the Samza DAG if needed
– Otherwise, the updated queries/actions are loaded at runtime without
interruption
36. Unified management and monitoring
• Every use case
– shares the same set of processors
– uses queries and actions to describe its processing logic
• A single monitoring template can be reused across different use
cases
40. Out-of-order event handling
• Not a big concern
– Events of the same rider/partner are usually seconds apart
• K-slack extension in Siddhi for out-of-order event processing
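The K-slack idea can be sketched as a small reorder buffer: hold events until they are at least K time units behind the newest timestamp seen, then release them in timestamp order (illustrative, not Siddhi's extension code):

```python
# Sketch of K-slack out-of-order handling: events up to K time units
# late are put back in order before downstream processing sees them.
import heapq

class KSlack:
    def __init__(self, k):
        self.k, self.max_ts, self.buf = k, 0, []

    def on_event(self, ts, payload):
        heapq.heappush(self.buf, (ts, payload))
        self.max_ts = max(self.max_ts, ts)
        released = []
        # release events that can no longer be overtaken by a late one
        while self.buf and self.buf[0][0] <= self.max_ts - self.k:
            released.append(heapq.heappop(self.buf))
        return released

ks = KSlack(k=5)
out = []
for ts, p in [(1, "a"), (3, "b"), (2, "c"), (9, "d")]:  # "c" arrives late
    out += ks.on_event(ts, p)
print(out)   # [(1, 'a'), (2, 'c'), (3, 'b')] -- back in timestamp order
```

The trade-off is added latency of up to K, which is acceptable here since events for the same rider/partner are seconds apart.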
41. Auto-scaling
• Manually re-partition Kafka topics to increase parallelism
• Manually tune container memory if needed
• Future
– Use CPU/memory/IO stats to auto-scale the data pipelines
43. Large checkpointing state
• Samza uses Kafka to log state changes
• Siddhi engine snapshots can be large
• Kafka limits message size to 1 MB by default
• Solution: we built logic to slice the state into smaller pieces and
checkpoint them separately
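The slicing approach can be sketched as follows (chunk naming is an assumption; Kafka's 1 MB default message limit is the constraint being worked around):

```python
# Sketch of slicing a large state snapshot into chunks that each fit
# under the broker's message-size limit; every chunk gets a
# sequence-numbered key so the snapshot can be reassembled on restore.

CHUNK = 1024 * 1024            # stay under the 1 MB default limit

def slice_state(snapshot: bytes, chunk=CHUNK):
    return [(f"state-{i}", snapshot[off:off + chunk])
            for i, off in enumerate(range(0, len(snapshot), chunk))]

def restore(chunks):
    return b"".join(data for _, data in
                    sorted(chunks, key=lambda kv: int(kv[0].split("-")[1])))

snapshot = b"x" * (3 * CHUNK + 17)            # ~3 MB of state
chunks = slice_state(snapshot)
print(len(chunks), restore(chunks) == snapshot)   # 4 True
```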
44. Synchronous checkpointing
• If state is large, checkpointing can take a long time
• Samza uses a single-threaded model, so it is unsafe to checkpoint
asynchronously (SAMZA-863)
45. Exactly once state processing?
• Cannot commit state and offset atomically
• Therefore, no exactly-once state processing
46. Custom business logic
• Common logic is implemented as Siddhi extensions
• Ad-hoc logic is implemented as UDFs in JavaScript or Scala
47. Intermediate Kafka messages
• Samza uses Kafka as the message queue for intermediate processing
output
– This can create a large load on Kafka if a heavy topic is partitioned
multiple times
– Encode the intermediate messages to reduce their footprint
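A sketch of encoding intermediate messages to cut their footprint; the talk does not say which encoding the platform uses, so the zlib-compressed JSON below is just one plausible choice:

```python
# Sketch of shrinking intermediate messages before they hit Kafka:
# compress the serialized payload on the producing side, decompress on
# the consuming side.
import json, zlib

def encode(event: dict) -> bytes:
    return zlib.compress(json.dumps(event).encode())

def decode(blob: bytes) -> dict:
    return json.loads(zlib.decompress(blob))

event = {"city": "SF", "fares": [12.5] * 500}     # repetitive payload
blob = encode(event)
print(len(blob) < len(json.dumps(event)), decode(blob) == event)
```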
48. Multi-tenancy
• Older Siddhi versions process events using a thread pool
– Bad for multi-tenancy in YARN
– Consumes more CPU resources than claimed
• Newer versions still use a thread pool for scheduled tasks, but the
main processing runs in a single thread
– Good: CPU consumption per YARN container is bounded
49. Upgrading Samza jobs
• Upgrading Samza jobs requires a full restart, which can take minutes
due to
– Offset checkpointing topic too large → set retention to hours
– Changelog topic too large → set retention, enable compaction in
Kafka, or use host affinity (SAMZA-617)
• To minimize the interruption during upgrade, it would be nice to
have
– Rolling restart
– Per container restart
50. Our solution: non-interrupted handoff
• For critical jobs, we use replication during upgrade
– Start a shadow job
– Upgrade shadow
– Switch primary and shadow
– Upgrade primary
– Switch back
• Downside: requires 2x capacity during upgrade