The Marketplace data team at Uber has built a scalable complex event processing platform to solve many challenging real-time data needs for various Uber products. This platform has been in production for more than a year and supports over 100 real-time data use cases with a team of three. In this talk, we will share the details of the design, our experience building it, and how we employ Siddhi, Kafka, and Samza at scale.
14. Can we use declarative semantics to specify this stream
processing logic?
15. Complex event processing
• Combines data from multiple sources to infer events or patterns that
suggest more complicated circumstances
• CEP is used across many industries for various use cases, including:
– Finance: Trade analysis, fraud detection
– Airlines: Operations monitoring
– Healthcare: Claims processing, patient monitoring
– Energy and Telecommunications: Outage detection
• CEP uses a declarative rule/query language to specify event
processing logic
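The kind of circumstance inference described above can be sketched in a few lines. This is a hypothetical fraud-detection rule (a card used in two different cities within a minute), purely illustrative and not code from the platform:

```python
# Minimal sketch of a CEP-style pattern rule (hypothetical event shapes,
# not Siddhi's implementation): flag a card used in two different cities
# within 60 seconds -- the kind of condition a CEP rule language
# expresses declaratively.

def detect_fraud(events, window_sec=60):
    """events: list of dicts with 'card', 'city', 'ts' (seconds)."""
    last_seen = {}          # card -> (city, ts) of the previous event
    alerts = []
    for e in sorted(events, key=lambda e: e["ts"]):
        prev = last_seen.get(e["card"])
        if prev and prev[0] != e["city"] and e["ts"] - prev[1] <= window_sec:
            alerts.append((e["card"], prev[0], e["city"]))
        last_seen[e["card"]] = (e["city"], e["ts"])
    return alerts

events = [
    {"card": "c1", "city": "SF",  "ts": 0},
    {"card": "c1", "city": "NYC", "ts": 30},   # 30 s later, other city
    {"card": "c2", "city": "LA",  "ts": 10},
    {"card": "c2", "city": "LA",  "ts": 40},   # same city: fine
]
print(detect_fraud(events))   # [('c1', 'SF', 'NYC')]
```

A CEP engine lets you state this condition as a declarative pattern over streams instead of hand-writing the bookkeeping.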
16. WSO2/Siddhi: Complex event processing engine
• Lightweight, extensible, open source, released as a Java library
• Features supported
– Filter
– Join
– Aggregation
– Group by
– Window
– Pattern processing
– Sequence processing
– Event tables
– Event-time processing
– UDF
– Extensions
– Declarative query language: SiddhiQL
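As a rough illustration of what a window + group-by + aggregation query computes (the class and event shapes below are made up for this sketch, not SiddhiQL or Siddhi APIs):

```python
# Illustrative sketch (not Siddhi code) of a sliding time window with
# group-by aggregation: average fare per city over the last minute.
from collections import deque, defaultdict

class TimeWindowAvg:
    def __init__(self, window_sec):
        self.window_sec = window_sec
        self.events = deque()            # (ts, city, fare), oldest first

    def on_event(self, ts, city, fare):
        self.events.append((ts, city, fare))
        # expire events older than the window
        while self.events and self.events[0][0] < ts - self.window_sec:
            self.events.popleft()
        sums = defaultdict(lambda: [0.0, 0])
        for _, c, f in self.events:
            sums[c][0] += f
            sums[c][1] += 1
        return {c: s / n for c, (s, n) in sums.items()}

w = TimeWindowAvg(window_sec=60)
w.on_event(0, "SF", 10.0)
w.on_event(30, "SF", 20.0)
snapshot = w.on_event(90, "NYC", 8.0)   # the ts=0 event has expired
print(snapshot)   # {'SF': 20.0, 'NYC': 8.0}
```

In SiddhiQL the same intent is written declaratively (window, `group by`, aggregate) and the engine maintains the incremental state for you.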
18. How Siddhi works
• A query is parsed at runtime into an execution plan runtime
• As events flow in, the execution plan runtime processes events inside
the CEP engine according to the query logic
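The parse-once, run-many flow can be sketched with a toy one-clause filter language (purely illustrative; Siddhi's actual parser and execution plan runtime are far richer):

```python
# Sketch of the parse-once / run-many idea (hypothetical mini-language,
# not SiddhiQL): the query text is compiled into an execution plan once,
# then each incoming event is evaluated against the compiled plan.
import operator

OPS = {">": operator.gt, "<": operator.lt, "==": operator.eq}

def compile_filter(query):
    field, op, value = query.split()          # e.g. "fare > 50"
    op_fn, threshold = OPS[op], float(value)
    return lambda event: op_fn(event[field], threshold)  # the "plan"

plan = compile_filter("fare > 50")            # parsed once at startup
events = [{"fare": 30.0}, {"fare": 80.0}, {"fare": 55.0}]
matched = [e for e in events if plan(e)]      # events flow through the plan
print(matched)   # [{'fare': 80.0}, {'fare': 55.0}]
```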
20. Apache Samza
• A distributed stream processing framework
– Distributed and scalable
– Built-in state management
– Built-in fault tolerance
– At-least-once message processing
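At-least-once semantics typically falls out of processing events before committing the offset; a minimal sketch of that failure mode (the `Checkpoint` class and crash flag are invented for illustration, not Samza APIs):

```python
# Sketch of why "process, then commit the offset" yields at-least-once:
# a crash after processing but before the commit replays the events
# since the last committed offset.

class Checkpoint:
    def __init__(self):
        self.offset = 0
    def commit(self, offset):
        self.offset = offset

def run(messages, checkpoint, out, crash_before_commit_at=None):
    for i in range(checkpoint.offset, len(messages)):
        out.append(messages[i])               # side effect: "process"
        if i == crash_before_commit_at:
            return                            # crash before committing
        checkpoint.commit(i + 1)

cp, out = Checkpoint(), []
run(["a", "b", "c"], cp, out, crash_before_commit_at=1)  # die after "b"
run(["a", "b", "c"], cp, out)                            # restart, replay
print(out)   # ['a', 'b', 'b', 'c'] -- "b" processed twice, never lost
```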
21. How can we make the stream processing output useful?
22. Actions
• Generalize a set of common action templates to make it easy for
services and humans to harness the power of real-time stream
processing
• Currently we support
– Make an RPC call
– Invoke a Webhook endpoint
– Index to Elasticsearch
– Index to Cassandra
– Kafka
– Statsd
– Chat service
– Email
– Push notification
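A sketch of the action-template idea: a registry maps each action type to a handler, so adding a use case means declaring configuration rather than writing integration code. All names and shapes below are assumptions, not the platform's actual interfaces:

```python
# Sketch of a generalized action-template registry (hypothetical shapes):
# a use case declares {"type": ..., params...} and the matching handler
# performs the side effect.

sent = []   # stand-in for real side effects (RPC, email, Kafka produce...)

ACTIONS = {
    "webhook": lambda cfg, ev: sent.append(("POST", cfg["url"], ev)),
    "kafka":   lambda cfg, ev: sent.append(("produce", cfg["topic"], ev)),
    "email":   lambda cfg, ev: sent.append(("mail", cfg["to"], ev)),
}

def execute(action_cfg, event):
    ACTIONS[action_cfg["type"]](action_cfg, event)

execute({"type": "webhook", "url": "https://example.com/alert"}, {"id": 1})
execute({"type": "kafka", "topic": "alerts"}, {"id": 2})
print(sent)
```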
27. Partitioner
• Re-partition events based on key
• Support predicate pushdown through query analysis
• Support column pruning through query analysis (WIP)
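One way to picture predicate pushdown in the partitioner (illustrative only): the downstream query's filter is applied before the shuffle, so non-matching events never reach the intermediate topic:

```python
# Sketch of a key-based partitioner with predicate pushdown: the filter
# extracted from query analysis runs *before* re-partitioning, shrinking
# the intermediate traffic. All names here are illustrative.

def partition(events, key_fn, num_partitions, pushed_predicate=None):
    partitions = [[] for _ in range(num_partitions)]
    for e in events:
        if pushed_predicate and not pushed_predicate(e):
            continue                       # dropped before the shuffle
        partitions[hash(key_fn(e)) % num_partitions].append(e)
    return partitions

events = [{"city": "SF", "fare": 80}, {"city": "SF", "fare": 10},
          {"city": "NYC", "fare": 60}]
parts = partition(events, key_fn=lambda e: e["city"], num_partitions=2,
                  pushed_predicate=lambda e: e["fare"] > 50)
print(sum(len(p) for p in parts))   # 2 events survive the pushed filter
```

Column pruning follows the same logic: project away unused fields before the shuffle instead of after.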
28. Query processor
• Parse Siddhi queries into an execution plan runtime
• Process events in the Siddhi execution plan runtime
• Checkpoint state to RocksDB regularly to ensure recovery upon
crash/restart
29. Action processor
• Execute actions upon the complex event processing output
• Support various kinds of actions for easy integration
• Implement action retry mechanism using RocksDB to provide
at-least-once delivery
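The retry mechanism might look roughly like this sketch, where a plain dict stands in for the RocksDB store: pending actions are persisted before execution and removed only after success, so a crash replays them:

```python
# Sketch of an at-least-once action retry loop (a dict stands in for
# RocksDB; function names are invented for illustration).
import itertools

store = {}                     # stand-in for RocksDB: action_id -> payload
_ids = itertools.count()

def enqueue(payload):
    store[next(_ids)] = payload

def drain(execute, max_retries=3):
    delivered = []
    for action_id in sorted(store):
        payload = store[action_id]
        for attempt in range(max_retries):
            if execute(payload, attempt):
                delivered.append(payload)
                del store[action_id]       # remove only after success
                break
        # else: stays in the store; a later drain will retry it
    return delivered

enqueue("alert-1")
flaky = lambda payload, attempt: attempt >= 1   # fails once, then works
delivered = drain(flaky)
print(delivered)   # ['alert-1'] -- delivered on the second attempt
```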
30. How do we translate a query into a physical plan that
runs?
31. DAG (Directed Acyclic Graph) generation
• Analyze Siddhi query to automatically generate the stream
processing DAG in Samza using the processors
– Example DAG stages: filter, transformation
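A toy version of the query-analysis step (the real system analyzes the Siddhi query; the string matching below is only a stand-in): if the query aggregates by a key, a partitioner stage is inserted ahead of the query processor:

```python
# Sketch of deriving a processing DAG from query analysis. A "group by"
# means all events for a key must land on the same task, so a
# partitioner stage is placed before the query processor; the action
# processor always terminates the DAG.

def build_dag(query):
    stages = []
    if "group by" in query:             # key-wise processing needed
        key = query.split("group by")[1].split()[0]
        stages.append(("partitioner", key))
    stages.append(("query_processor", query))
    stages.append(("action_processor", None))
    return stages

dag = build_dag("select city, avg(fare) from trips group by city")
print([name for name, _ in dag])
# ['partitioner', 'query_processor', 'action_processor']
```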
35. REST API backend
• All queries and actions are stored externally in a database
• RESTful API for CRUD operations
• If query/action logic changes
– Redeploy the Samza DAG if needed
– Otherwise, the updated queries/actions are loaded at runtime without
interruption
36. Unified management and monitoring
• Every use case
– shares the same set of processors
– uses queries and actions to describe its processing logic
• A single monitoring template can be reused across different use
cases
40. Out-of-order event handling
• Not a big concern
– Events of the same rider/partner are usually seconds apart
• K-slack extension in Siddhi for out-of-order event processing
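The K-slack idea can be sketched as a small reorder buffer: hold events until they are at least K time units behind the newest timestamp seen, then release them in timestamp order (illustrative, not Siddhi's extension code):

```python
# Sketch of K-slack out-of-order handling: events up to K time units
# late are put back in order before downstream processing sees them.
import heapq

class KSlack:
    def __init__(self, k):
        self.k, self.max_ts, self.buf = k, 0, []

    def on_event(self, ts, payload):
        heapq.heappush(self.buf, (ts, payload))
        self.max_ts = max(self.max_ts, ts)
        released = []
        # release events that can no longer be overtaken by a late one
        while self.buf and self.buf[0][0] <= self.max_ts - self.k:
            released.append(heapq.heappop(self.buf))
        return released

ks = KSlack(k=5)
out = []
for ts, p in [(1, "a"), (3, "b"), (2, "c"), (9, "d")]:  # "c" arrives late
    out += ks.on_event(ts, p)
print(out)   # [(1, 'a'), (2, 'c'), (3, 'b')] -- back in timestamp order
```

The trade-off is added latency of up to K, which is acceptable here since events for the same rider/partner are seconds apart.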
41. Auto-scaling
• Manually re-partition Kafka topics to increase parallelism
• Manually tune container memory if needed
• Future
– Use CPU/memory/IO stats to auto-scale the data pipelines
43. Large checkpointing state
• Samza uses Kafka to log state changes
• Siddhi engine snapshots can be large
• Kafka limits message size to 1 MB by default
• Solution: we built logic to slice the state into smaller pieces and
checkpoint them separately
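The slicing approach can be sketched as follows (chunk naming is an assumption; Kafka's 1 MB default message limit is the constraint being worked around):

```python
# Sketch of slicing a large state snapshot into chunks that each fit
# under the broker's message-size limit; every chunk gets a
# sequence-numbered key so the snapshot can be reassembled on restore.

CHUNK = 1024 * 1024            # stay under the 1 MB default limit

def slice_state(snapshot: bytes, chunk=CHUNK):
    return [(f"state-{i}", snapshot[off:off + chunk])
            for i, off in enumerate(range(0, len(snapshot), chunk))]

def restore(chunks):
    return b"".join(data for _, data in
                    sorted(chunks, key=lambda kv: int(kv[0].split("-")[1])))

snapshot = b"x" * (3 * CHUNK + 17)            # ~3 MB of state
chunks = slice_state(snapshot)
print(len(chunks), restore(chunks) == snapshot)   # 4 True
```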
44. Synchronous checkpointing
• If state is large, checkpointing can take a long time
• Samza uses a single-threaded model, so it is unsafe to checkpoint
asynchronously (SAMZA-863)
45. Exactly once state processing?
• Cannot commit state and offset atomically
• Therefore, no exactly-once state processing
46. Custom business logic
• Common logic is implemented as Siddhi extensions
• Ad-hoc logic is implemented as UDFs in JavaScript or Scala
47. Intermediate Kafka messages
• Samza uses Kafka as the message queue for intermediate processing
output
– This can create a large load on Kafka if a heavy topic is partitioned
multiple times
– Encode the intermediate messages to reduce their footprint
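A sketch of encoding intermediate messages to cut their footprint; the talk does not say which encoding the platform uses, so the zlib-compressed JSON below is just one plausible choice:

```python
# Sketch of shrinking intermediate messages before they hit Kafka:
# compress the serialized payload on the producing side, decompress on
# the consuming side.
import json, zlib

def encode(event: dict) -> bytes:
    return zlib.compress(json.dumps(event).encode())

def decode(blob: bytes) -> dict:
    return json.loads(zlib.decompress(blob))

event = {"city": "SF", "fares": [12.5] * 500}     # repetitive payload
blob = encode(event)
print(len(blob) < len(json.dumps(event)), decode(blob) == event)
```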
48. Multi-tenancy
• Older Siddhi versions process events using a thread pool
– Bad for multi-tenancy in YARN
– Consumes more CPU resources than claimed
• Newer versions still use a thread pool for scheduled tasks, but the
main processing runs in a single thread
– Good: CPU consumption per YARN container is bounded
49. Upgrading Samza jobs
• Upgrading Samza jobs requires a full restart, which can take minutes
due to
– Offset checkpointing topic too large → set retention to hours
– Changelog topic too large → set retention, enable compaction in
Kafka, or use host affinity (SAMZA-617)
• To minimize the interruption during upgrade, it would be nice to
have
– Rolling restart
– Per container restart
50. Our solution: non-interrupted handoff
• For critical jobs, we use replication during upgrade
– Start a shadow job
– Upgrade shadow
– Switch primary and shadow
– Upgrade primary
– Switch back
• Downside: requires 2x capacity during upgrade