Over 100 million subscribers in more than 190 countries enjoy the Netflix service. This generates over a trillion events per day, amounting to 3 PB of data, flowing through the Keystone infrastructure to help improve customer experience and glean business insights. The self-serve Keystone stream processing service processes these messages in near real time with at-least-once semantics in the cloud. This enables users to focus on extracting insights rather than building out scalable infrastructure. In this session, I share the benefits of the platform and our experience building it.
2. What Do I Get Out Of This Talk?
Different perspectives:
● Data Engineer - Why stream processing, and what to expect from a platform?
● Data Leader - Product / vision of a Stream Processing as a Service platform
● Platform Engineer - How to build and operate a Stream Processing as a Service platform
@monaldax
3. @monaldax
● I will focus on the stream processing platform for business insights, which my team builds, mostly based on Flink
● I won't be addressing operational insights, for which we have different systems
6. Why Real Time Data?
● Low latency insights and analytics
● Processing data as it arrives helps spread workload over time, & reduce processing redundancy
● The need to process unbounded data sets is becoming increasingly common
@monaldax
7. Why Build A Stream Processing Platform?
● Enable users to focus on data and business insights, and not worry about building stream processing infrastructure and tooling
@monaldax
9. SPaaS: The Platform Needs To Offer A Robust Way To Process Streams, Allowing A Tradeoff Between Ease, Capability, & Flexibility
@monaldax
10. Stream Processing as a Service Platform Offers
● Point & Click: routing, filtering, projection
● Streaming Jobs
● Support for Streaming SQL (Future)
● Interactive exploration of streams for quick prototyping (Future)
@monaldax
20. Create Kafka Topic, And Three Separate Jobs
[Diagram: Event Producer (KCW = Kafka Client Wrapper, KSGateway) → Fronting Kafka → SPaaS Router → Consumer Kafka / Elasticsearch; Keystone Management provisions 1 topic and 3 jobs]
@monaldax
21. Event Flow: Producer Uses Kafka Client Wrapper Or Proxy
[Diagram: Event Producer sends via KCW (Kafka Client Wrapper) or KSGateway to Fronting Kafka; SPaaS Router, Consumer Kafka, Elasticsearch, and Keystone Management downstream]
@monaldax
22. Event Flow: Events Queued In Kafka
[Diagram: events land in Fronting Kafka (3 instances); SPaaS Router, Consumer Kafka, Elasticsearch, and Keystone Management downstream]
@monaldax
23. Event Flow: Each Router Reads From Source, Optionally Applies Filter & Projection
[Diagram: SPaaS Routers read from Fronting Kafka (3 instances) and apply an optional filter & projection before writing to the sinks]
@monaldax
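The router behavior described on this slide (read events, optionally filter, optionally project) can be sketched in plain Python. This is only an illustration of the concept; the real routers are Flink jobs, and the field names here are hypothetical:

```python
def make_router(filter_fn=None, projected_fields=None):
    """Build a router stage: optional filter, then optional projection."""
    def route(events):
        for event in events:
            if filter_fn is not None and not filter_fn(event):
                continue  # drop events that fail the filter
            if projected_fields is not None:
                # Projection: keep only the fields the sink needs.
                event = {k: event[k] for k in projected_fields if k in event}
            yield event
    return route

# Hypothetical play events; this router keeps US events, projected to two fields.
events = [
    {"country": "US", "title": "A", "device": "tv"},
    {"country": "FR", "title": "B", "device": "phone"},
]
us_router = make_router(lambda e: e["country"] == "US", ["country", "title"])
print(list(us_router(events)))  # → [{'country': 'US', 'title': 'A'}]
```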
24. Event Flow: Each Router Writes To Its Respective Sink
[Diagram: routers write to Consumer Kafka (non-keyed and keyed supported) and Elasticsearch]
@monaldax
50. Stateless Streaming Job Use Case: High Level Architecture - Enriching And Identifying Certain Plays
[Diagram: Play Logs → Streaming Job; the job enriches events with lookup data from the Playback History Service, Video Metadata, and a Live Service]
@monaldax
55. Streaming Job (Flink) Savepoint Tooling Support
• S3-based multi-tenant storage management
• Auto savepoint, and resume from savepoint on redeploy
• Resume from an existing savepoint
@monaldax
56. Streaming Job (Flink) High Level Features
• Stateless jobs
• Event enrichment support by accessing services using platform thick clients
• Stateful jobs with 100s of GB of state, with larger state support in the works
• Reusable blocks (in progress)
• Job development, deployment, and monitoring tooling (alpha)
@monaldax
57. Streaming Jobs - The Road Ahead
• Easy resource provisioning estimates
• Flink support for reading from and writing to the data warehouse, backfill
• Continue to evolve tooling and support for large state
• Reusable components - sources, sinks, operators, schema support, data hygiene
• Tooling support for Spark Streaming
@monaldax
59. Prod - Trending Events & Scale, With Events Flowing To Hive, Elasticsearch, Kafka
• 1.3T+ events processed per day (trending ≅ 80B to 1.3T)
• 600B to 1T unique events per day
• 2+ PB in, 4.5+ PB out per day
• Peak: 12M events in / sec & 36 GB / sec
@monaldax
62. RTDI Consists Of 4 Systems. The Keystone Pipeline Runs 24x7, & Does Not Impact Members' Ability To Play Videos
[Diagram: Keystone Stream Processing (SPaaS), Keystone Management, Keystone Messaging; runs 24x7 across Dev, Test, and Prod, with granular shadowing]
@monaldax
63. Components & Streaming Jobs
[Diagram: Event Producer (KCW, KSGateway) → Fronting Kafka → SPaaS Router / Streaming Job → Consumer Kafka, Hive, Elasticsearch; Keystone Management orchestrates]
@monaldax
67. Producer Library - Kafka Client Wrapper
• Automated Kafka producer buffer (60s) tuning based on traffic
• Best-effort delivery; prioritizes host application availability
• acks=1, does not block to send events, unclean leader election
• Non-keyed messages; retry send to an available partition
@monaldax
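As a sketch, the bullets above map roughly onto standard Kafka 0.10 producer properties (values illustrative; note that unclean leader election is a broker/topic-level setting, `unclean.leader.election.enable`, not a producer property):

```python
def producer_config(observed_bytes_per_sec):
    """Illustrative producer settings for best-effort, non-blocking delivery."""
    return {
        "acks": "1",          # leader ack only: best-effort delivery
        "max.block.ms": "0",  # never block the host application on send
        # Automated tuning sketch: size the buffer to absorb ~60s of observed traffic.
        "buffer.memory": str(observed_bytes_per_sec * 60),
    }

cfg = producer_config(1_000_000)  # hypothetical producer doing 1 MB/s
print(cfg["buffer.memory"])  # → 60000000
```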
68. KSGateway - Event Proxy For Non-Java Clients, REST & gRPC
[Diagram: Event Producer → KSGateway → Fronting Kafka; SPaaS Router / Streaming Job → Consumer Kafka, Hive, Elasticsearch; Keystone Management orchestrates]
@monaldax
69. Kafka Clusters (0.10) On Amazon EC2
[Diagram: Event Producer (KCW, KSGateway) → Fronting Kafka → SPaaS Router / Streaming Job → Consumer Kafka, Elasticsearch; Keystone Management orchestrates]
@monaldax
70. Why Kafka?
• Message sizes > 1 MB and up to 10 MB
• Large-scale Keystone ingest pipelines result in large fan-out
• Lower latency - used for ad-hoc messaging as well
• Open source - we can enhance, patch, or extend it
@monaldax
71. Scale For Large Fan-out And Isolation - Cascading Topology
[Diagram: Fronting Kafka → Consumer Kafka → Consumer]
@monaldax
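A back-of-the-envelope way to see the benefit: without cascading, every consumer reads each message from the fronting cluster; with a consumer-Kafka tier, the fronting cluster serves one copy per message regardless of fan-out, and the consumer cluster absorbs the rest. Numbers here are hypothetical:

```python
def fronting_reads_per_message(num_consumers, cascaded):
    """Reads the fronting cluster must serve per message."""
    # Cascaded: a single router copies the message to the consumer cluster,
    # which then absorbs the full consumer fan-out in isolation.
    return 1 if cascaded else num_consumers

print(fronting_reads_per_message(50, cascaded=False))  # → 50
print(fronting_reads_per_message(50, cascaded=True))   # → 1
```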
72. Alternative: Logical Stream (Topic) Spread Across Multiple Topics Across Multiple Clusters (WIP)
[Diagram: Multi-Cluster Producer → multiple clusters → Multi-Cluster Consumer]
@monaldax
73. Kafka Deployment Strategies - Version 0.10 (YMMV)
• Dedicated Zookeeper cluster per Kafka cluster
• Small clusters: < 200 brokers, <= 10K partitions
• Partitions distributed evenly across brokers
• Rack-aware replica assignment, brokers spread across 3 zones
• 2 copies & unclean leader election on
• Non-transactional
@monaldax
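The rack-aware bullet can be sketched as a toy assignment that puts a partition's two replicas in different zones. This is not Kafka's actual assignment algorithm, just an illustration of the invariant it maintains:

```python
def assign_replicas(partitions, brokers_by_zone, copies=2):
    """Assign replicas so each partition's copies land in distinct zones."""
    zones = sorted(brokers_by_zone)
    assignment = {}
    for p in range(partitions):
        replicas = []
        for r in range(copies):
            # Rotate the starting zone per partition; offset per replica.
            zone = zones[(p + r) % len(zones)]
            pool = brokers_by_zone[zone]
            replicas.append(pool[p % len(pool)])
        assignment[p] = replicas
    return assignment

# Hypothetical brokers spread across 3 availability zones.
brokers = {"us-east-1a": [1, 2], "us-east-1b": [3, 4], "us-east-1c": [5, 6]}
a = assign_replicas(6, brokers)
print(a[0])  # → [1, 3]  (one replica in zone a, one in zone b)
```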
77. Streaming Jobs (Flink 1.3.2)
• Keystone pipeline is built on Flink Routers
• Each Flink Router is a stream processing job
• Router provisioning based on incoming traffic or estimates
• Runs on containers atop EC2
• Island mode - single AWS Region
@monaldax
78. High-level Stream Processing Platform Architecture
[Diagram: 1. Create streaming job via Point & Click or Streaming Job (Keystone Management); 2. Launch job with config overrides; 3. Launch containers (Container Runtime) - immutable image, user-driven config overrides]
@monaldax
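Step 2 of the flow, launching an immutable image with user-driven config overrides, amounts to layering the user's settings over the image defaults. A minimal sketch, with made-up config keys:

```python
def launch_config(image_defaults, user_overrides):
    """Effective job config: immutable image defaults with user overrides on top."""
    merged = dict(image_defaults)   # the image itself is never mutated
    merged.update(user_overrides)   # user-driven overrides win
    return merged

defaults = {"parallelism": 1, "checkpoint.interval.ms": 30_000}  # illustrative
effective = launch_config(defaults, {"parallelism": 8})
print(effective)  # → {'parallelism': 8, 'checkpoint.interval.ms': 30000}
```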
80. Flink Job Cluster In HA Mode
[Diagram: Zookeeper; leader Job Manager (WebUI) and standby Job Manager (WebUI); three Task Managers. One dedicated Zookeeper cluster serves all streaming jobs]
@monaldax
82. Flink Job Cluster In HA Mode With Checkpoints
[Diagram: Zookeeper, leader and standby Job Managers, Task Managers; state checkpoints and checkpoint metadata are written to external storage]
@monaldax
84. Checkpoints Are Taken Often
[Diagram: a Titus job runs the Job Managers (master + standby) and Task Managers in containers with their own IPs across Titus hosts in an AWS VPC; state checkpoints and Kafka offsets are saved]
@monaldax
85. Checkpoints Are Taken Often. A Container Could Fail…
[Diagram: same Titus deployment; a Task Manager container fails (X) while checkpoints and Kafka offsets continue to be saved]
@monaldax
86. Restored To Last Checkpoint; Partial Recovery Supported
[Diagram: a replacement container is launched on another Titus host and restores state and Kafka offsets from the last checkpoint]
@monaldax
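The checkpoint/restore cycle of slides 84-86 can be modeled in a few lines: processing state and the Kafka offset are snapshotted together, and a replacement container resumes from the last snapshot, which yields the at-least-once semantics mentioned at the start. This is a toy model, not Flink's actual checkpointing mechanism:

```python
class CheckpointingConsumer:
    """Toy model of recover-from-last-checkpoint: state + Kafka offset saved together."""
    def __init__(self):
        self.state = 0            # e.g., a running sum over events
        self.offset = 0           # next Kafka offset to read
        self._checkpoint = (0, 0)

    def process(self, events, checkpoint_every=2):
        for i, e in enumerate(events[self.offset:], start=self.offset):
            self.state += e
            self.offset = i + 1
            if self.offset % checkpoint_every == 0:
                self._checkpoint = (self.state, self.offset)  # atomic snapshot

    def crash_and_restore(self):
        # A replacement container resumes from the last completed checkpoint.
        self.state, self.offset = self._checkpoint

events = [1, 2, 3, 4, 5]
c = CheckpointingConsumer()
c.process(events[:3])    # processed offsets 0..2; checkpoint taken at offset 2
c.crash_and_restore()    # post-checkpoint progress is lost
c.process(events)        # reprocess from offset 2: at-least-once delivery
print(c.state, c.offset)  # → 15 5
```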
87. Streaming Jobs Management
[Diagram: Event Producer (KCW, KSGateway) → Fronting Kafka → SPaaS Router / Streaming Job → Consumer Kafka, Hive, Elasticsearch; Keystone Management orchestrates]
@monaldax
91. Keystone Management Unique Features
• The ability to pass data along the chain of Joblets within a Job
• Locks and semaphores on resources spanning across jobs
• Customization and integration into the Netflix ecosystem - Eureka, etc.
@monaldax
93. We Run What We Build!
• No separate Ops team
• No separate QA team
• No separate Dev team
• It's all done by the developers of the Real Time Data Infrastructure
@monaldax
94. We Leverage Other Netflix Systems
• We rely on metrics, monitoring, alerting & paging, & automation
• Separate metrics system - Atlas
• Separate alert configuration and alert actions system
• Options for a separate system to run cross-system automation tasks
@monaldax
105. Launch Backup Kafka Cluster With Same Number Of Instances, But Smaller Instance Type
[Diagram: Fronting Kafka has failed (X); bring up a failover Kafka cluster and copy metadata from Zookeeper; Event Producer and Flink Router unchanged]
@monaldax
106. Change Producer Config To Produce To Failover Cluster, And Launch Routers For Failover Traffic
[Diagram: Event Producer now writes to the failover cluster; failover Flink Routers process its traffic while the original Fronting Kafka remains failed (X)]
@monaldax
107. Change Producer Config Back To Original Cluster, And Finish Draining Events From Backup Flink Router
[Diagram: Event Producer writes to the recovered Fronting Kafka again; the failover Flink Router drains remaining events]
@monaldax
108. Decommission Backup Cluster And Router Once Original Cluster Is Fixed, Or A Replacement Cluster Is Live
[Diagram: failover Kafka cluster and failover Flink Router are decommissioned (X X)]
@monaldax
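Slides 105-108 describe an ordered runbook. As a sketch, the sequence looks like the following (cluster names are placeholders):

```python
def kafka_failover_plan(failed_cluster, backup_cluster):
    """The failover slides as an ordered runbook."""
    return [
        f"launch backup cluster {backup_cluster} (same instance count, smaller type)",
        f"copy topic metadata from {failed_cluster}'s Zookeeper",
        f"point producers at {backup_cluster}; launch failover routers",
        f"point producers back at {failed_cluster} once healthy; drain failover routers",
        f"decommission {backup_cluster} and its routers",
    ]

steps = kafka_failover_plan("fronting-1", "fronting-1-failover")
for step in steps:
    print(step)
```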
110. Consumer Kafka Clusters
• Failover is currently supported for Fronting Kafka clusters only
• We are working on a multi-consumer client with support for keyed messages, to support failover of Consumer Kafka clusters
@monaldax
111. Planned & Regular Kafka Kong
This automation also serves as Kafka Kong, a tool that follows the principles of Chaos Engineering
@monaldax
112. Kafka Operation Strategies (YMMV)
• Over-provision for traffic variations and for failover
• Broker health & outlier detection, and auto-termination, based on:
  • 99th-percentile response time
  • Broker TCP timeouts, errors, retransmissions
  • Producer send latency
@monaldax
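The outlier-detection bullet hinges on the 99th-percentile response time. A nearest-rank sketch of flagging a slow broker follows; the thresholds, broker names, and latency numbers are all made up:

```python
import math

def p99(samples):
    """Naive nearest-rank 99th percentile of response-time samples."""
    ordered = sorted(samples)
    rank = math.ceil(0.99 * len(ordered))  # 1-indexed nearest rank
    return ordered[rank - 1]

def outlier_brokers(latencies_ms, threshold_ms):
    """Brokers whose p99 response time exceeds the threshold (termination candidates)."""
    return [b for b, samples in latencies_ms.items() if p99(samples) > threshold_ms]

latencies = {
    "broker-1": [5] * 99 + [8],         # healthy: one slow-ish sample
    "broker-2": [5] * 90 + [500] * 10,  # unhealthy: 10% of requests very slow
}
print(outlier_brokers(latencies, threshold_ms=100))  # → ['broker-2']
```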
113. Kafka Operation Strategies (YMMV)
• Scale up by
  • Adding partitions - to new brokers; requires non-keyed messages
  • Partition reassignment - in small batches with a custom tool
• Scale down by
  • Creating new topics / new clusters - use Kafka failover
@monaldax
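Why does adding partitions require non-keyed messages? Keyed partitioning maps a key to `hash(key) mod N`, so changing N remaps most keys to different partitions, breaking per-key ordering and any partition-local keyed state. A simplified illustration (Kafka's default partitioner actually uses murmur2 over the key bytes; integer keys stand in here for hashes):

```python
def partition_for(key_id, num_partitions):
    """Simplified keyed partitioning: key mod partition count."""
    return key_id % num_partitions

keys = range(100)
before = {k: partition_for(k, 8) for k in keys}   # original: 8 partitions
after = {k: partition_for(k, 12) for k in keys}   # partitions added to new brokers
moved = sum(before[k] != after[k] for k in keys)
print(moved)  # → 64  (most keys now land on a different partition)
```

Non-keyed messages have no per-key placement guarantee to break, so partitions can be added freely.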
115. Routers & Streaming Job Fault Tolerance By Design
• Container replacement
• Checkpoints and savepoints
• Keep retrying if the event data format is valid
• Isolation - an issue with one sink does not impact another
@monaldax
116. Router Deployment Automation
• Provision new or updated streams
• Bulk update, terminate, and re-deploy routers
• Automatic partial recovery allows zero-touch migration of the underlying container infrastructure
• Manual - KSRunbook
@monaldax
118. Router Capacity Planning And Provisioning
• Per-stream provisioning based on the past week's traffic or a bit-rate estimate
• Provision buffer capacity
• Run 1 additional container for latency-sensitive consumers
• Manual, % increase, easy to compute and deploy
• Plan capacity to handle service failover
@monaldax
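The provisioning bullets reduce to simple arithmetic: take the past week's peak, add a buffer percentage, round up, and add one extra container for latency-sensitive consumers. A sketch with made-up throughput figures:

```python
import math

def router_containers(peak_mbps_last_week, mbps_per_container,
                      buffer_pct=50, latency_sensitive=False):
    """Containers to provision: last week's peak plus a buffer, rounded up,
    plus one extra container for latency-sensitive consumers."""
    needed = peak_mbps_last_week * (1 + buffer_pct / 100) / mbps_per_container
    return math.ceil(needed) + (1 if latency_sensitive else 0)

print(router_containers(80, 10))                          # → 12
print(router_containers(80, 10, latency_sensitive=True))  # → 13
```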
119. Admin Tooling To Scale Up Manually, Or To Deploy A New Build
@monaldax
125. Flink Streaming Job
● Split between application and infrastructure
● Metrics and monitoring
● Alerts
● Paging and on-call rotations
● Platform customers follow the same "we build it, we run it" model
@monaldax
128. Operations - The Road Ahead
● True auto-scaling
● Bootstrap capacity planning for stateful streaming jobs
● Automated canary tooling & data parity
● Point & Click components: quick testing and performance profiling
  ● E.g., iterating over a Filter definition
@monaldax
129. I Want To Learn More
● http://bit.ly/mLOOP - Deep dive into Unbounded Data Processing Systems
● http://bit.ly/m17FF - Keynote - Stream Processing with Flink at Netflix
● http://bit.ly/2BoYAq0 - Multi-tenant Multi-cluster Kafka
@monaldax