SlideShare a Scribd company logo
1 of 82
Monal Daxini
11/ 6 / 2018 @ monaldax
Patterns Of
Streaming Applications
P
O
S
A
InfoQ.com: News & Community Site
• Over 1,000,000 software developers, architects and CTOs read the site world-
wide every month
• 250,000 senior developers subscribe to our weekly newsletter
• Published in 4 languages (English, Chinese, Japanese and Brazilian
Portuguese)
• Post content from our QCon conferences
• 2 dedicated podcast channels: The InfoQ Podcast, with a focus on
Architecture and The Engineering Culture Podcast, with a focus on building
• 96 deep dives on innovative topics packed as downloadable emags and
minibooks
• Over 40 new content items per week
Watch the video with slide
synchronization on InfoQ.com!
https://www.infoq.com/presentations/
apache-flink-streaming-app
Purpose of QCon
- to empower software development by facilitating the spread of
knowledge and innovation
Strategy
- practitioner-driven conference designed for YOU: influencers of
change and innovation in your teams
- speakers and topics driving the evolution and innovation
- connecting and catalyzing the influencers and innovators
Highlights
- attended by more than 12,000 delegates since 2007
- held in 9 cities worldwide
Presented at QCon San Francisco
www.qconsf.com
• 4+ years building stream processing platform at Netflix
• Drove technical vision, roadmap, led implementation
• 17+ years building distributed systems
Profile
@monaldax
Set The Stage 8 Patterns
5 Functional 3 Non-Functional
Structure Of The Talk
Stream
Processing ?
@monaldax
Disclaimer
Inspired by True Events encountered building and operating a
Stream Processing platform, and use cases that are in
production or in ideation phase in the cloud.
Some code and identifying details have been changed, artistic
liberties have been taken, to protect the privacy of streaming
applications, and for sharing the know-how. Some use cases
may have been simplified.
@monaldax
Stream Processing?
Processing Data-In-Motion
@monaldax
Lower Latency Analytics
User Activity Stream - Batched
Flash
Jessica
Luke
Feb 25 Feb 26
@monaldax
Sessions - Batched User Activity Stream
Feb 25 Feb 26
@monaldax
Flash
Jessica
Luke
Correct Session - Batched User Activity Stream
Feb 25 Feb 26
@monaldax
Flash
Jessica
Luke
Stream Processing Natural For User Activity Stream Sessions
@monaldax
Flash
Jessica
Luke
1. Low latency insights and analytics
2. Process unbounded data sets
3. ETL as data arrives
4. Ad-hoc analytics and Event driven applications
Why Stream Processing?
@monaldax
Set The Stage
Architecture & Flink
Stream Processing App Architecture Blueprint
Stream
Processing Job
Source
@monaldax
Sink
Stream Processing App Architecture Blueprint
Stream
Processing Job
Source
@monaldax
Sinks
Source
Source
Side Input
Why Flink?
Flink Programs Are Streaming Dataflows – Streams And
Transformation Operators
Image adapted, source: Flink Docs@monaldax
Streams And Transformation Operators - Windowing
Image source: Flink Docs
10 Second
@monaldax
Streaming Dataflow DAG
Image adapted, source: Flink Docs@monaldax
Scalable Automatic Scheduling Of Operations
Image adapted, source: Flink Docs
Job Manager
(Process)(Process)
(Process)
@monaldax
Parallelism 2
Sink 1
Flexible Deployment
Bare Metal VM / Cloud Containers
@monaldax
Image adapted from: Stephan Ewen@monaldax
Stateless Stream Processing
No state maintained across events
Streaming Application
Flink TaskManager
Local State
Fault-tolerant Processing – Stateful Processing
Sink
Savepoints
(Explicitly Triggered)
Checkpoints
(Periodic, Asynchronous,
Incremental Checkpoint)
Source /
Producers
@monaldax
In-Memory / On-Disk
Local State Access
Levels Of API Abstraction In Flink
Source: Flink Documentation
Describing Patterns
@monaldax
● Use Case / Motivation
● Pattern
● Code Snippet & Deployment mechanism
● Related Pattern, if any
Describing Design Patterns
@monaldax
Patterns
Functional
1. Configurable Router
@monaldax
• Create ingest pipelines for different event streams declaratively
• Route events to data warehouse, data stores for analytics
• With at-least-once semantics
• Streaming ETL - Allow declarative filtering and projection
1.1 Use Case / Motivation – Ingest Pipelines
@monaldax
1.1 Keystone Pipeline – A Self-serve Product
• SERVERLESS
• Turnkey – ready to use
• 100% in the cloud
• No code, Managed Code & Operations
@monaldax
1.1 UI To Provision 1 Data Stream, A Filter, & 3 Sinks
1.1 Optional Filter & Projection (Out of the box)
1.1 Provision 1 Kafka Topic, 3 Configurable Router Jobs
play_events
Consumer Kafka
Elasticsearch
@monaldax
Events
Configurable Router Job
Filter Projection Connector
Configurable Router Job
Filter Projection Connector
Configurable Router Job
Filter Projection Connector
1 2 3 4 5 6 7
Fan-out: 3
R
@monaldax
1.1 Keystone Pipeline Scale
● Up to 1 trillion new events / day
● Peak: 12M events / sec, 36 GB / sec
● ̴ 4 PB of data transported / day
● ̴ 2000 Router Jobs / 10,000 containers
1.1 Pattern: Configurable Isolated Router
@monaldax
Sink
Configurable Router Job
Declarative ProcessorsDeclarative Processors
Events
Producer
val ka#aSource = getSourceBuilder.fromKa#a("topic1").build()
val selectedSink = getSinkBuilder()
.toSelector(sinkName).declareWith("ka,asink", ka#aSink)
.or("s3sink", s3Sink).or("essink", esSink).or("nullsink", nullSink).build();
ka#aSource
.filter(KeystoneFilterFunc6on).map(KeystoneProjec6onFunc6on)
.addSink(selectedSink)
1.1 Code Snippet: Configurable Isolated Router
NoUserCode
@monaldax
• Popular stream / topic has high fan-out factor
• Requires large Kafka Clusters, expensive
1.2 Use Case / Motivation – Ingest large streams with high fan-out Efficiently
@monaldax
R
R
Filter
Projection
Kafka
R
TopicA Cluster1
TopicB Cluster1
Events
Producer
1.2 Pattern: Configurable Co-Isolated Router
@monaldax
Co-Isolated Router
R
Filter
Projection
Kafka
TopicA Cluster1
TopicB Cluster1
Merge Routing To Same Kafka Cluster
Events
Producer
ui_A_Clicks_KakfaSource
.filter(filter)
.map(projection)
.map(outputConverter)
.addSink(kafkaSinkA_Topic1)
1.2 Code Snippet: Configurable Co-Isolated Router
ui_A_Clicks_KafkaSource
.map(transformer)
.flatMap(outputFlatMap)
.map(outputConverter)
.addSink(kafkaSinkA_Topic2)
NoUserCode
@monaldax
2. Script UDF* Component
[Static / Dynamic]
*UDF – User Defined Function
@monaldax
2. Use Case / Motivation – Configurable Business Logic Code
for operations like transformations and filtering
Managed Router / Streaming Job
Source Sink
Biz
Logic
Job DAG
@monaldax
2. Pattern: Static or Dynamic Script UDF (stateless) Component
Comes with all the Pros and Cons of scripting engine
Streaming Job
Source Sink
UDF
Script Engine executes
function defined in the UI
@monaldax
val xscript =
new DynamicConfig("x.script")
kakfaSource
.map(new ScriptFunc>on(xscript))
.filter(new ScriptFunc>on(xsricpt2))
.addSink(new NoopSink())
2. Code Snippet: Script UDF Component
// Script Function
val sm = new ScriptEngineManager()
ScriptEngine se =
m.getEngineByName ("nashorn");
se .eval(script)
Contents configurable at runtime
@monaldax
3. The Enricher
@monaldax
Next 3 Patterns (3-5) Require Explicit Deployment
@monaldax
3. User Case - Generating Play Events For
Personalization And Show Discovery
@monaldax
@monaldax
3. Use-case: Create play events with current data from services, and lookup
table for analytics. Using lookup table keeps originating events lightweight
Playback
History Service
Video
Metadata
Play Logs
Service call
Periodically updating
lookup data
Streaming Job
Resource
Rate Limiter
3. Pattern: The Enricher
- Rate limit with source or service rate limiter, or with resources
- Pull or push data, Sync / async
Streaming Job
Source Sink
Side Input
• Service call
• Lookup from Data Store
• Static or Periodically updated lookup data
@monaldax
Source / Service
Rate Limiter
3. Code Snippet: The Enricher
val kafkaSource = getSourceBuilder.fromKafka("topic1").build()
val parsedMessages = kafkaSource.flatMap(parser).name(”parser")
val enrichedSessions = parsedMessages.filter(reflushFilter).name(”filter")
.map(playbackEnrichment).name(”service")
.map(dataLookup)
enrichmentSessions.addSink(sink).name("sink")
@monaldax
4. The Co-process Joiner
@monaldax
4. Use Case – Play-Impressions Conversion Rate
@monaldax
• 130+ M members
• 10+ B Impressions / day
• 2.5+ B Play Events / day
• ~ 2 TB Processing State
4. Impressions And Plays Scale
@monaldax
Streaming Job
• # Impressions per user play
• Impression attributes leading to the play
4. Join Large Streams With Delayed, Out Of Order Events Based on
Event Time
@monaldax
P1P3
I2I1
Sink
Kafka Topics
plays
impressions
Understanding Event Time
Image Adapted from The Apache Beam Presentation Material
Event Time
Processing
Time 11:0010:00 15:0014:0013:0012:00
11:0010:00 15:0014:0013:0012:00
Input
Output
1 hour Window
impressions
Streaming Job
keyBy F1
4. Use Case: Join Impressions And Plays Stream On Event Time
Co-process
@monaldax
Kafka Topics
keyBy F2playsP2 K
Keyed State
I1 K
I2 K
Merge I2 P2
& Emit
Merge I2 P2
& Emit
Source 1
Streaming Job
keyBy F1
Co-process
State 1
State 2
@monaldax
keyBy F2Source 2
Keyed State
Sink
4. Pattern: The Co-process Joiner
• Process and Coalesce events for each stream grouped by same key
• Join if there is a match, evict when joined or timed out
4. Code Snippet – The Co-process Joiner, Setup sources
env.setStreamTimeCharacteristic(EventTime)
val impressionSource = kafkaSrc1
.filter(eventTypeFilter)
.flatMap(impressionParser)
.keyBy(in => (s"${profile_id}_${title_id}"))
val impressionSource = kafkaSrc2
.flatMap(playbackParser)
.keyBy(in => (s"${profile_id}_${title_id}"))
@monaldax
env.setStreamTimeCharacteristic(EventTime)
val impressionSource = kafkaSrc1.filter(eventTypeFilter)
.flatMap(impressionParser)
.assignTimestampsAndWatermarks(
new BoundedOutOfOrdernessTimestampExtractor(Time.seconds(10) ) {...})
.keyBy(in => (s"${profile_id}_${title_id}"))
val impressionSource = kafkaSrc2.flatMap(playbackParser)
.assignTimestampsAndWatermarks(
new BoundedOutOfOrdernessTimestampExtractor(Time.seconds(10) ) {...})
.keyBy(in => (s"${profile_id}_${title_id}"))
4. Code Snippet – The Co-process Joiner, Setup sources
4. Code Snippet – The Co-process Joiner, Connect Streams
// Connect
impressionSource.connect(playSessionSource)
.process( new CoprocessImpressionsPlays())
.addSink(kafkaSink)
@monaldax
4. Code Snippet – The Co-process Joiner, Co-process Function
class CoprocessJoin extends CoProcessFunction {
override def processElement1(value, context, collector) {
… // update and reduce state, join with stream 2, set timer
}
override def processElement2(value, context, collector) {
… // update and reduce state, join with stream 2, set timer
}
override def onTimer(timestamp, context, collector) {
… // clear up state based on event time
}
@monaldax
5. Event-Sourced Materialized View
[Event Driven Application]
@monaldax
5. Use-case: Publish movie assets CDN location to Cache, to steer
clients to the closest location for playback
EvCache
asset1 OCA1, CA2
asset2 OCA1, CA3
Playback Assets
Service
Service
asset_1 added asset_8 deleted
OCA
OCA1
OCA
OCAN
Open Connect
Appliances (CDN)
Upload asset_1 Delete asset_8
Streaming Job
Source1
Generates Events to
publish all assets
asset1 OCA1, CA2….
asset2 OCA1, CA3 …
@monaldax
Materialized View
5. Use-case: Publish movie assets CDN location to Cache, to steer
clients to the closest location for playback
Streaming Job
Event Publisher
Optional Trigger
to flush view
Materialized View
@monaldax
Materialized View
Sink
5. Code Snippet – Setting up Sources
val fullPublishSource = env .addSource(new FullPublishSourceFunc7on(),
TypeInfoParser.parse("Tuple3<String, Integer, com.ne7lix.AMUpdate>"))
.setParallelism(1);
val kaCaSource = getSourceBuilder().fromKaCa("am_source")
@monaldax
5. Code Snippet – Union Source & Processing
val kafkaSource
.flatMap(new FlatmapFunction())) //split by movie
.assignTimestampsAndWatermarks(new AssignerWithPunctuatedWatermarks[...]())
.union(fullPublishSource) // union with full publish source
.keyBy(0, 1) // (cdn stack, movie)
.process(new UpdateFunction())) // update in-memory state, output at intervals.
.keyBy(0, 1) // (cdn stack, movie)
.process(new PublishToEvCacheFunction())); // publish to evcache
@monaldax
Patterns
Non-Functional
6. Elastic Dev Interface
@monaldax
6 Elastic Dev Interface
Spectrum Of Ease, Capability, & Flexibility
• Point & Click, with UDFs
• SQL with UDFs
• Annotation based API with code generation
• Code with reusable components
• (e.g., Data Hygiene, Script Transformer)
EaseofUse
Capability
7. Stream Processing Platform
@monaldax
7. Stream Processing Platform (SpaaS - Stream Processing Service as a Service)
Amazon EC2
Container Runtime
Reusable Components
Source & Sink Connectors, Filtering, Projection, etc.
Routers
(Streaming Job)
Streaming Jobs
ManagementService&UI
Metrics&Monitoring
StreamingJobDevelopment
Dashboards
@monaldax
Stream Processing Platform
(Streaming Engine, Config Management)
8. Rewind & Restatement
@monaldax
8. Use Case - Restate Results Due To Outage Or Bug In Business Logic
@monaldax
Time
Checkpoint y Checkpoint x
outage
Checkpoint x+1
Now
Time
8. Pattern: Rewind And Restatement
Rewind the source and state to a know good state
@monaldax
Checkpoint y Checkpoint x
outage
Checkpoint x+1
Now
Summary
@monaldax
1. Configurable Router
2. Script UDF Component
3. The Enricher
4. The Co-process Joiner
5. Event-Sourced Materialized View
Patterns Summary
@monaldax
6. Elastic Dev Interface
7. Stream Processing Platform
8. Rewind & Restatement
FUNCTIONAL NON-FUNCTIONAL
Q & A
Thank you
If you would like to discuss more
• @monaldax
• linkedin.com/in/monaldax
• Flink at Netflix, Paypal speaker series, 2018– http://bit.ly/monal-paypal
• Unbounded Data Processing Systems, Strangeloop, 2016 - http://bit.ly/monal-sloop
• AWS Re-Invent 2017 Netflix Keystone SPaaS, 2017 – http://bit.ly/monal-reInvent
• Keynote - Stream Processing with Flink, 2017 - http://bit.ly/monal-ff2017
• Dataflow Paper - http://bit.ly/dataflow-paper
Additional Stream Processing Material
@monaldax
Watch the video with slide
synchronization on InfoQ.com!
https://www.infoq.com/presentations/
apache-flink-streaming-app

More Related Content

What's hot

Data Stream Algorithms in Storm and R
Data Stream Algorithms in Storm and RData Stream Algorithms in Storm and R
Data Stream Algorithms in Storm and RRadek Maciaszek
 
Cloud-based Data Stream Processing
Cloud-based Data Stream ProcessingCloud-based Data Stream Processing
Cloud-based Data Stream ProcessingZbigniew Jerzak
 
Anomaly Detection with Apache Spark
Anomaly Detection with Apache SparkAnomaly Detection with Apache Spark
Anomaly Detection with Apache SparkCloudera, Inc.
 
High Performance Machine Learning in R with H2O
High Performance Machine Learning in R with H2OHigh Performance Machine Learning in R with H2O
High Performance Machine Learning in R with H2OSri Ambati
 
Data Pipelines & Integrating Real-time Web Services w/ Storm : Improving on t...
Data Pipelines & Integrating Real-time Web Services w/ Storm : Improving on t...Data Pipelines & Integrating Real-time Web Services w/ Storm : Improving on t...
Data Pipelines & Integrating Real-time Web Services w/ Storm : Improving on t...Brian O'Neill
 
Mining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOAMining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOAAlbert Bifet
 
Apache spark-melbourne-april-2015-meetup
Apache spark-melbourne-april-2015-meetupApache spark-melbourne-april-2015-meetup
Apache spark-melbourne-april-2015-meetupNed Shawa
 
Advanced Data Science on Spark-(Reza Zadeh, Stanford)
Advanced Data Science on Spark-(Reza Zadeh, Stanford)Advanced Data Science on Spark-(Reza Zadeh, Stanford)
Advanced Data Science on Spark-(Reza Zadeh, Stanford)Spark Summit
 
Data Science with Spark
Data Science with SparkData Science with Spark
Data Science with SparkKrishna Sankar
 
Deep Dive Into Catalyst: Apache Spark 2.0'S Optimizer
Deep Dive Into Catalyst: Apache Spark 2.0'S OptimizerDeep Dive Into Catalyst: Apache Spark 2.0'S Optimizer
Deep Dive Into Catalyst: Apache Spark 2.0'S OptimizerSpark Summit
 
Time Series Processing with Apache Spark
Time Series Processing with Apache SparkTime Series Processing with Apache Spark
Time Series Processing with Apache SparkJosef Adersberger
 
Big Data Science with H2O in R
Big Data Science with H2O in RBig Data Science with H2O in R
Big Data Science with H2O in RAnqi Fu
 
Unified Big Data Processing with Apache Spark (QCON 2014)
Unified Big Data Processing with Apache Spark (QCON 2014)Unified Big Data Processing with Apache Spark (QCON 2014)
Unified Big Data Processing with Apache Spark (QCON 2014)Databricks
 
Spark Meetup @ Netflix, 05/19/2015
Spark Meetup @ Netflix, 05/19/2015Spark Meetup @ Netflix, 05/19/2015
Spark Meetup @ Netflix, 05/19/2015Yves Raimond
 
Combining Machine Learning Frameworks with Apache Spark
Combining Machine Learning Frameworks with Apache SparkCombining Machine Learning Frameworks with Apache Spark
Combining Machine Learning Frameworks with Apache SparkDatabricks
 
Deep-Dive into Deep Learning Pipelines with Sue Ann Hong and Tim Hunter
Deep-Dive into Deep Learning Pipelines with Sue Ann Hong and Tim HunterDeep-Dive into Deep Learning Pipelines with Sue Ann Hong and Tim Hunter
Deep-Dive into Deep Learning Pipelines with Sue Ann Hong and Tim HunterDatabricks
 
Sparkling Water 5 28-14
Sparkling Water 5 28-14Sparkling Water 5 28-14
Sparkling Water 5 28-14Sri Ambati
 
Thoth - Real-time Solr Monitor and Search Analysis Engine: Presented by Damia...
Thoth - Real-time Solr Monitor and Search Analysis Engine: Presented by Damia...Thoth - Real-time Solr Monitor and Search Analysis Engine: Presented by Damia...
Thoth - Real-time Solr Monitor and Search Analysis Engine: Presented by Damia...Lucidworks
 
Introduction into scalable graph analysis with Apache Giraph and Spark GraphX
Introduction into scalable graph analysis with Apache Giraph and Spark GraphXIntroduction into scalable graph analysis with Apache Giraph and Spark GraphX
Introduction into scalable graph analysis with Apache Giraph and Spark GraphXrhatr
 

What's hot (20)

Data Stream Algorithms in Storm and R
Data Stream Algorithms in Storm and RData Stream Algorithms in Storm and R
Data Stream Algorithms in Storm and R
 
Cloud-based Data Stream Processing
Cloud-based Data Stream ProcessingCloud-based Data Stream Processing
Cloud-based Data Stream Processing
 
Anomaly Detection with Apache Spark
Anomaly Detection with Apache SparkAnomaly Detection with Apache Spark
Anomaly Detection with Apache Spark
 
High Performance Machine Learning in R with H2O
High Performance Machine Learning in R with H2OHigh Performance Machine Learning in R with H2O
High Performance Machine Learning in R with H2O
 
Data Pipelines & Integrating Real-time Web Services w/ Storm : Improving on t...
Data Pipelines & Integrating Real-time Web Services w/ Storm : Improving on t...Data Pipelines & Integrating Real-time Web Services w/ Storm : Improving on t...
Data Pipelines & Integrating Real-time Web Services w/ Storm : Improving on t...
 
Mining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOAMining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOA
 
Apache spark-melbourne-april-2015-meetup
Apache spark-melbourne-april-2015-meetupApache spark-melbourne-april-2015-meetup
Apache spark-melbourne-april-2015-meetup
 
Advanced Data Science on Spark-(Reza Zadeh, Stanford)
Advanced Data Science on Spark-(Reza Zadeh, Stanford)Advanced Data Science on Spark-(Reza Zadeh, Stanford)
Advanced Data Science on Spark-(Reza Zadeh, Stanford)
 
Data Science with Spark
Data Science with SparkData Science with Spark
Data Science with Spark
 
Meetup tensorframes
Meetup tensorframesMeetup tensorframes
Meetup tensorframes
 
Deep Dive Into Catalyst: Apache Spark 2.0'S Optimizer
Deep Dive Into Catalyst: Apache Spark 2.0'S OptimizerDeep Dive Into Catalyst: Apache Spark 2.0'S Optimizer
Deep Dive Into Catalyst: Apache Spark 2.0'S Optimizer
 
Time Series Processing with Apache Spark
Time Series Processing with Apache SparkTime Series Processing with Apache Spark
Time Series Processing with Apache Spark
 
Big Data Science with H2O in R
Big Data Science with H2O in RBig Data Science with H2O in R
Big Data Science with H2O in R
 
Unified Big Data Processing with Apache Spark (QCON 2014)
Unified Big Data Processing with Apache Spark (QCON 2014)Unified Big Data Processing with Apache Spark (QCON 2014)
Unified Big Data Processing with Apache Spark (QCON 2014)
 
Spark Meetup @ Netflix, 05/19/2015
Spark Meetup @ Netflix, 05/19/2015Spark Meetup @ Netflix, 05/19/2015
Spark Meetup @ Netflix, 05/19/2015
 
Combining Machine Learning Frameworks with Apache Spark
Combining Machine Learning Frameworks with Apache SparkCombining Machine Learning Frameworks with Apache Spark
Combining Machine Learning Frameworks with Apache Spark
 
Deep-Dive into Deep Learning Pipelines with Sue Ann Hong and Tim Hunter
Deep-Dive into Deep Learning Pipelines with Sue Ann Hong and Tim HunterDeep-Dive into Deep Learning Pipelines with Sue Ann Hong and Tim Hunter
Deep-Dive into Deep Learning Pipelines with Sue Ann Hong and Tim Hunter
 
Sparkling Water 5 28-14
Sparkling Water 5 28-14Sparkling Water 5 28-14
Sparkling Water 5 28-14
 
Thoth - Real-time Solr Monitor and Search Analysis Engine: Presented by Damia...
Thoth - Real-time Solr Monitor and Search Analysis Engine: Presented by Damia...Thoth - Real-time Solr Monitor and Search Analysis Engine: Presented by Damia...
Thoth - Real-time Solr Monitor and Search Analysis Engine: Presented by Damia...
 
Introduction into scalable graph analysis with Apache Giraph and Spark GraphX
Introduction into scalable graph analysis with Apache Giraph and Spark GraphXIntroduction into scalable graph analysis with Apache Giraph and Spark GraphX
Introduction into scalable graph analysis with Apache Giraph and Spark GraphX
 

Similar to Patterns of Streaming Applications

Patterns of-streaming-applications-qcon-2018-monal-daxini
Patterns of-streaming-applications-qcon-2018-monal-daxiniPatterns of-streaming-applications-qcon-2018-monal-daxini
Patterns of-streaming-applications-qcon-2018-monal-daxiniMonal Daxini
 
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017Monal Daxini
 
Netflix Keystone SPaaS: Real-time Stream Processing as a Service - ABD320 - r...
Netflix Keystone SPaaS: Real-time Stream Processing as a Service - ABD320 - r...Netflix Keystone SPaaS: Real-time Stream Processing as a Service - ABD320 - r...
Netflix Keystone SPaaS: Real-time Stream Processing as a Service - ABD320 - r...Amazon Web Services
 
Introduction to apache kafka, confluent and why they matter
Introduction to apache kafka, confluent and why they matterIntroduction to apache kafka, confluent and why they matter
Introduction to apache kafka, confluent and why they matterPaolo Castagna
 
Flink at netflix paypal speaker series
Flink at netflix   paypal speaker seriesFlink at netflix   paypal speaker series
Flink at netflix paypal speaker seriesMonal Daxini
 
O'Reilly Media Webcast: Building Real-Time Data Pipelines
O'Reilly Media Webcast: Building Real-Time Data PipelinesO'Reilly Media Webcast: Building Real-Time Data Pipelines
O'Reilly Media Webcast: Building Real-Time Data PipelinesSingleStore
 
Real-Time Data Pipelines with Kafka, Spark, and Operational Databases
Real-Time Data Pipelines with Kafka, Spark, and Operational DatabasesReal-Time Data Pipelines with Kafka, Spark, and Operational Databases
Real-Time Data Pipelines with Kafka, Spark, and Operational DatabasesSingleStore
 
What's new in Confluent 3.2 and Apache Kafka 0.10.2
What's new in Confluent 3.2 and Apache Kafka 0.10.2 What's new in Confluent 3.2 and Apache Kafka 0.10.2
What's new in Confluent 3.2 and Apache Kafka 0.10.2 confluent
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Guido Schmutz
 
The Power of Distributed Snapshots in Apache Flink
The Power of Distributed Snapshots in Apache FlinkThe Power of Distributed Snapshots in Apache Flink
The Power of Distributed Snapshots in Apache FlinkC4Media
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Guido Schmutz
 
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Guido Schmutz
 
BDX 2016- Monal daxini @ Netflix
BDX 2016-  Monal daxini  @ NetflixBDX 2016-  Monal daxini  @ Netflix
BDX 2016- Monal daxini @ NetflixIdo Shilon
 
Rackspace: Email's Solution for Indexing 50K Documents per Second: Presented ...
Rackspace: Email's Solution for Indexing 50K Documents per Second: Presented ...Rackspace: Email's Solution for Indexing 50K Documents per Second: Presented ...
Rackspace: Email's Solution for Indexing 50K Documents per Second: Presented ...Lucidworks
 
Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka
Kafka Connect & Kafka Streams/KSQL - the ecosystem around KafkaKafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka
Kafka Connect & Kafka Streams/KSQL - the ecosystem around KafkaGuido Schmutz
 
Big Data Open Source Security LLC: Realtime log analysis with Mesos, Docker, ...
Big Data Open Source Security LLC: Realtime log analysis with Mesos, Docker, ...Big Data Open Source Security LLC: Realtime log analysis with Mesos, Docker, ...
Big Data Open Source Security LLC: Realtime log analysis with Mesos, Docker, ...DataStax Academy
 
Building Stream Processing as a Service
Building Stream Processing as a ServiceBuilding Stream Processing as a Service
Building Stream Processing as a ServiceSteven Wu
 
Louise McCluskey, Kx Engineer at Kx Systems
Louise McCluskey, Kx Engineer at Kx SystemsLouise McCluskey, Kx Engineer at Kx Systems
Louise McCluskey, Kx Engineer at Kx SystemsDataconomy Media
 
Netflix Keystone—Cloud scale event processing pipeline
Netflix Keystone—Cloud scale event processing pipelineNetflix Keystone—Cloud scale event processing pipeline
Netflix Keystone—Cloud scale event processing pipelineMonal Daxini
 

Similar to Patterns of Streaming Applications (20)

Patterns of-streaming-applications-qcon-2018-monal-daxini
Patterns of-streaming-applications-qcon-2018-monal-daxiniPatterns of-streaming-applications-qcon-2018-monal-daxini
Patterns of-streaming-applications-qcon-2018-monal-daxini
 
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017
 
Netflix Keystone SPaaS: Real-time Stream Processing as a Service - ABD320 - r...
Netflix Keystone SPaaS: Real-time Stream Processing as a Service - ABD320 - r...Netflix Keystone SPaaS: Real-time Stream Processing as a Service - ABD320 - r...
Netflix Keystone SPaaS: Real-time Stream Processing as a Service - ABD320 - r...
 
Introduction to apache kafka, confluent and why they matter
Introduction to apache kafka, confluent and why they matterIntroduction to apache kafka, confluent and why they matter
Introduction to apache kafka, confluent and why they matter
 
Flink at netflix paypal speaker series
Flink at netflix   paypal speaker seriesFlink at netflix   paypal speaker series
Flink at netflix paypal speaker series
 
O'Reilly Media Webcast: Building Real-Time Data Pipelines
O'Reilly Media Webcast: Building Real-Time Data PipelinesO'Reilly Media Webcast: Building Real-Time Data Pipelines
O'Reilly Media Webcast: Building Real-Time Data Pipelines
 
Real-Time Data Pipelines with Kafka, Spark, and Operational Databases
Real-Time Data Pipelines with Kafka, Spark, and Operational DatabasesReal-Time Data Pipelines with Kafka, Spark, and Operational Databases
Real-Time Data Pipelines with Kafka, Spark, and Operational Databases
 
What's new in Confluent 3.2 and Apache Kafka 0.10.2
What's new in Confluent 3.2 and Apache Kafka 0.10.2 What's new in Confluent 3.2 and Apache Kafka 0.10.2
What's new in Confluent 3.2 and Apache Kafka 0.10.2
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !
 
The Power of Distributed Snapshots in Apache Flink
The Power of Distributed Snapshots in Apache FlinkThe Power of Distributed Snapshots in Apache Flink
The Power of Distributed Snapshots in Apache Flink
 
Data Science
Data ScienceData Science
Data Science
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !
 
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
 
BDX 2016- Monal daxini @ Netflix
BDX 2016-  Monal daxini  @ NetflixBDX 2016-  Monal daxini  @ Netflix
BDX 2016- Monal daxini @ Netflix
 
Rackspace: Email's Solution for Indexing 50K Documents per Second: Presented ...
Rackspace: Email's Solution for Indexing 50K Documents per Second: Presented ...Rackspace: Email's Solution for Indexing 50K Documents per Second: Presented ...
Rackspace: Email's Solution for Indexing 50K Documents per Second: Presented ...
 
Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka
Kafka Connect & Kafka Streams/KSQL - the ecosystem around KafkaKafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka
Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka
 
Big Data Open Source Security LLC: Realtime log analysis with Mesos, Docker, ...
Big Data Open Source Security LLC: Realtime log analysis with Mesos, Docker, ...Big Data Open Source Security LLC: Realtime log analysis with Mesos, Docker, ...
Big Data Open Source Security LLC: Realtime log analysis with Mesos, Docker, ...
 
Building Stream Processing as a Service
Building Stream Processing as a ServiceBuilding Stream Processing as a Service
Building Stream Processing as a Service
 
Louise McCluskey, Kx Engineer at Kx Systems
Louise McCluskey, Kx Engineer at Kx SystemsLouise McCluskey, Kx Engineer at Kx Systems
Louise McCluskey, Kx Engineer at Kx Systems
 
Netflix Keystone—Cloud scale event processing pipeline
Netflix Keystone—Cloud scale event processing pipelineNetflix Keystone—Cloud scale event processing pipeline
Netflix Keystone—Cloud scale event processing pipeline
 

More from C4Media

Streaming a Million Likes/Second: Real-Time Interactions on Live Video
Streaming a Million Likes/Second: Real-Time Interactions on Live VideoStreaming a Million Likes/Second: Real-Time Interactions on Live Video
Streaming a Million Likes/Second: Real-Time Interactions on Live VideoC4Media
 
Next Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy MobileNext Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy MobileC4Media
 
Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020C4Media
 
Understand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java ApplicationsUnderstand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java ApplicationsC4Media
 
Kafka Needs No Keeper
Kafka Needs No KeeperKafka Needs No Keeper
Kafka Needs No KeeperC4Media
 
High Performing Teams Act Like Owners
High Performing Teams Act Like OwnersHigh Performing Teams Act Like Owners
High Performing Teams Act Like OwnersC4Media
 
Does Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to JavaDoes Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to JavaC4Media
 
Service Meshes- The Ultimate Guide
Service Meshes- The Ultimate GuideService Meshes- The Ultimate Guide
Service Meshes- The Ultimate GuideC4Media
 
Shifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDShifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDC4Media
 
CI/CD for Machine Learning
CI/CD for Machine LearningCI/CD for Machine Learning
CI/CD for Machine LearningC4Media
 
Fault Tolerance at Speed
Fault Tolerance at SpeedFault Tolerance at Speed
Fault Tolerance at SpeedC4Media
 
Architectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsArchitectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsC4Media
 
ML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsC4Media
 
Build Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerBuild Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerC4Media
 
User & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleUser & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleC4Media
 
Scaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeScaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeC4Media
 
Make Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereMake Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereC4Media
 
The Talk You've Been Await-ing For
The Talk You've Been Await-ing ForThe Talk You've Been Await-ing For
The Talk You've Been Await-ing ForC4Media
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data EngineeringC4Media
 
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreAutomated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreC4Media
 

More from C4Media (20)

Streaming a Million Likes/Second: Real-Time Interactions on Live Video
Streaming a Million Likes/Second: Real-Time Interactions on Live VideoStreaming a Million Likes/Second: Real-Time Interactions on Live Video
Streaming a Million Likes/Second: Real-Time Interactions on Live Video
 
Next Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy MobileNext Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy Mobile
 
Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020
 
Understand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java ApplicationsUnderstand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java Applications
 
Kafka Needs No Keeper
Kafka Needs No KeeperKafka Needs No Keeper
Kafka Needs No Keeper
 
High Performing Teams Act Like Owners
High Performing Teams Act Like OwnersHigh Performing Teams Act Like Owners
High Performing Teams Act Like Owners
 
Does Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to JavaDoes Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to Java
 
Service Meshes- The Ultimate Guide
Service Meshes- The Ultimate GuideService Meshes- The Ultimate Guide
Service Meshes- The Ultimate Guide
 
Shifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDShifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CD
 
CI/CD for Machine Learning
CI/CD for Machine LearningCI/CD for Machine Learning
CI/CD for Machine Learning
 
Fault Tolerance at Speed
Fault Tolerance at SpeedFault Tolerance at Speed
Fault Tolerance at Speed
 
Architectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsArchitectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep Systems
 
ML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.js
 
Build Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerBuild Your Own WebAssembly Compiler
Build Your Own WebAssembly Compiler
 
User & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleUser & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix Scale
 
Scaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeScaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's Edge
 
Make Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereMake Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home Everywhere
 
The Talk You've Been Await-ing For
The Talk You've Been Await-ing ForThe Talk You've Been Await-ing For
The Talk You've Been Await-ing For
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data Engineering
 
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreAutomated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
 

Recently uploaded

MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Principled Technologies
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024The Digital Insurer
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 

Recently uploaded (20)

MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 

Patterns of Streaming Applications

  • 1. Monal Daxini 11/ 6 / 2018 @ monaldax Patterns Of Streaming Applications P O S A
  • 2. InfoQ.com: News & Community Site • Over 1,000,000 software developers, architects and CTOs read the site world- wide every month • 250,000 senior developers subscribe to our weekly newsletter • Published in 4 languages (English, Chinese, Japanese and Brazilian Portuguese) • Post content from our QCon conferences • 2 dedicated podcast channels: The InfoQ Podcast, with a focus on Architecture and The Engineering Culture Podcast, with a focus on building • 96 deep dives on innovative topics packed as downloadable emags and minibooks • Over 40 new content items per week Watch the video with slide synchronization on InfoQ.com! https://www.infoq.com/presentations/ apache-flink-streaming-app
  • 3. Purpose of QCon - to empower software development by facilitating the spread of knowledge and innovation Strategy - practitioner-driven conference designed for YOU: influencers of change and innovation in your teams - speakers and topics driving the evolution and innovation - connecting and catalyzing the influencers and innovators Highlights - attended by more than 12,000 delegates since 2007 - held in 9 cities worldwide Presented at QCon San Francisco www.qconsf.com
  • 4. • 4+ years building stream processing platform at Netflix • Drove technical vision, roadmap, led implementation • 17+ years building distributed systems Profile @monaldax
  • 5. Set The Stage 8 Patterns 5 Functional 3 Non-Functional Structure Of The Talk Stream Processing ? @monaldax
  • 6. Disclaimer Inspired by True Events encountered building and operating a Stream Processing platform, and use cases that are in production or in ideation phase in the cloud. Some code and identifying details have been changed, artistic liberties have been taken, to protect the privacy of streaming applications, and for sharing the know-how. Some use cases may have been simplified. @monaldax
  • 8.
  • 10. User Activity Stream - Batched Flash Jessica Luke Feb 25 Feb 26 @monaldax
  • 11. Sessions - Batched User Activity Stream Feb 25 Feb 26 @monaldax Flash Jessica Luke
  • 12. Correct Session - Batched User Activity Stream Feb 25 Feb 26 @monaldax Flash Jessica Luke
  • 13. Stream Processing Natural For User Activity Stream Sessions @monaldax Flash Jessica Luke
  • 14. 1. Low latency insights and analytics 2. Process unbounded data sets 3. ETL as data arrives 4. Ad-hoc analytics and Event driven applications Why Stream Processing? @monaldax
  • 16. Stream Processing App Architecture Blueprint Stream Processing Job Source @monaldax Sink
  • 17. Stream Processing App Architecture Blueprint Stream Processing Job Source @monaldax Sinks Source Source Side Input
  • 19. Flink Programs Are Streaming Dataflows – Streams And Transformation Operators Image adapted, source: Flink Docs@monaldax
  • 20. Streams And Transformation Operators - Windowing Image source: Flink Docs 10 Second @monaldax
  • 21. Streaming Dataflow DAG Image adapted, source: Flink Docs@monaldax
  • 22. Scalable Automatic Scheduling Of Operations Image adapted, source: Flink Docs Job Manager (Process)(Process) (Process) @monaldax Parallelism 2 Sink 1
  • 23. Flexible Deployment Bare Metal VM / Cloud Containers @monaldax
  • 24. Image adapted from: Stephan Ewen@monaldax Stateless Stream Processing No state maintained across events
  • 25. Streaming Application Flink TaskManager Local State Fault-tolerant Processing – Stateful Processing Sink Savepoints (Explicitly Triggered) Checkpoints (Periodic, Asynchronous, Incremental Checkpoint) Source / Producers @monaldax In-Memory / On-Disk Local State Access
  • 26. Levels Of API Abstraction In Flink Source: Flink Documentation
  • 28. ● Use Case / Motivation ● Pattern ● Code Snippet & Deployment mechanism ● Related Pattern, if any Describing Design Patterns @monaldax
  • 31.
  • 32.
  • 33. • Create ingest pipelines for different event streams declaratively • Route events to data warehouse, data stores for analytics • With at-least-once semantics • Streaming ETL - Allow declarative filtering and projection 1.1 Use Case / Motivation – Ingest Pipelines @monaldax
  • 34. 1.1 Keystone Pipeline – A Self-serve Product • SERVERLESS • Turnkey – ready to use • 100% in the cloud • No code, Managed Code & Operations @monaldax
  • 35. 1.1 UI To Provision 1 Data Stream, A Filter, & 3 Sinks
  • 36. 1.1 Optional Filter & Projection (Out of the box)
  • 37. 1.1 Provision 1 Kafka Topic, 3 Configurable Router Jobs play_events Consumer Kafka Elasticsearch @monaldax Events Configurable Router Job Filter Projection Connector Configurable Router Job Filter Projection Connector Configurable Router Job Filter Projection Connector 1 2 3 4 5 6 7 Fan-out: 3 R
  • 38. @monaldax 1.1 Keystone Pipeline Scale ● Up to 1 trillion new events / day ● Peak: 12M events / sec, 36 GB / sec ● ̴ 4 PB of data transported / day ● ̴ 2000 Router Jobs / 10,000 containers
  • 39. 1.1 Pattern: Configurable Isolated Router @monaldax Sink Configurable Router Job Declarative ProcessorsDeclarative Processors Events Producer
  • 40. val ka#aSource = getSourceBuilder.fromKa#a("topic1").build() val selectedSink = getSinkBuilder() .toSelector(sinkName).declareWith("ka,asink", ka#aSink) .or("s3sink", s3Sink).or("essink", esSink).or("nullsink", nullSink).build(); ka#aSource .filter(KeystoneFilterFunc6on).map(KeystoneProjec6onFunc6on) .addSink(selectedSink) 1.1 Code Snippet: Configurable Isolated Router NoUserCode @monaldax
  • 41. • Popular stream / topic has high fan-out factor • Requires large Kafka Clusters, expensive 1.2 Use Case / Motivation – Ingest large streams with high fan-out Efficiently @monaldax R R Filter Projection Kafka R TopicA Cluster1 TopicB Cluster1 Events Producer
  • 42. 1.2 Pattern: Configurable Co-Isolated Router @monaldax Co-Isolated Router R Filter Projection Kafka TopicA Cluster1 TopicB Cluster1 Merge Routing To Same Kafka Cluster Events Producer
  • 43. ui_A_Clicks_KakfaSource .filter(filter) .map(projection) .map(outputConverter) .addSink(kafkaSinkA_Topic1) 1.2 Code Snippet: Configurable Co-Isolated Router ui_A_Clicks_KafkaSource .map(transformer) .flatMap(outputFlatMap) .map(outputConverter) .addSink(kafkaSinkA_Topic2) NoUserCode @monaldax
  • 44. 2. Script UDF* Component [Static / Dynamic] *UDF – User Defined Function @monaldax
  • 45. 2. Use Case / Motivation – Configurable Business Logic Code for operations like transformations and filtering Managed Router / Streaming Job Source Sink Biz Logic Job DAG @monaldax
  • 46. 2. Pattern: Static or Dynamic Script UDF (stateless) Component Comes with all the Pros and Cons of scripting engine Streaming Job Source Sink UDF Script Engine executes function defined in the UI @monaldax
  • 47. val xscript = new DynamicConfig("x.script") kakfaSource .map(new ScriptFunc>on(xscript)) .filter(new ScriptFunc>on(xsricpt2)) .addSink(new NoopSink()) 2. Code Snippet: Script UDF Component // Script Function val sm = new ScriptEngineManager() ScriptEngine se = m.getEngineByName ("nashorn"); se .eval(script) Contents configurable at runtime @monaldax
  • 49. Next 3 Patterns (3-5) Require Explicit Deployment @monaldax
  • 50. 3. User Case - Generating Play Events For Personalization And Show Discovery @monaldax
  • 51. @monaldax 3. Use-case: Create play events with current data from services, and lookup table for analytics. Using lookup table keeps originating events lightweight Playback History Service Video Metadata Play Logs Service call Periodically updating lookup data Streaming Job Resource Rate Limiter
  • 52. 3. Pattern: The Enricher - Rate limit with source or service rate limiter, or with resources - Pull or push data, Sync / async Streaming Job Source Sink Side Input • Service call • Lookup from Data Store • Static or Periodically updated lookup data @monaldax Source / Service Rate Limiter
  • 53. 3. Code Snippet: The Enricher val kafkaSource = getSourceBuilder.fromKafka("topic1").build() val parsedMessages = kafkaSource.flatMap(parser).name(”parser") val enrichedSessions = parsedMessages.filter(reflushFilter).name(”filter") .map(playbackEnrichment).name(”service") .map(dataLookup) enrichmentSessions.addSink(sink).name("sink") @monaldax
  • 54. 4. The Co-process Joiner @monaldax
  • 55. 4. Use Case – Play-Impressions Conversion Rate @monaldax
  • 56. • 130+ M members • 10+ B Impressions / day • 2.5+ B Play Events / day • ~ 2 TB Processing State 4. Impressions And Plays Scale @monaldax
  • 57. Streaming Job • # Impressions per user play • Impression attributes leading to the play 4. Join Large Streams With Delayed, Out Of Order Events Based on Event Time @monaldax P1P3 I2I1 Sink Kafka Topics plays impressions
  • 58. Understanding Event Time Image Adapted from The Apache Beam Presentation Material Event Time Processing Time 11:0010:00 15:0014:0013:0012:00 11:0010:00 15:0014:0013:0012:00 Input Output 1 hour Window
  • 59. impressions Streaming Job keyBy F1 4. Use Case: Join Impressions And Plays Stream On Event Time Co-process @monaldax Kafka Topics keyBy F2playsP2 K Keyed State I1 K I2 K Merge I2 P2 & Emit Merge I2 P2 & Emit
  • 60. Source 1 Streaming Job keyBy F1 Co-process State 1 State 2 @monaldax keyBy F2Source 2 Keyed State Sink 4. Pattern: The Co-process Joiner • Process and Coalesce events for each stream grouped by same key • Join if there is a match, evict when joined or timed out
  • 61. 4. Code Snippet – The Co-process Joiner, Setup sources env.setStreamTimeCharacteristic(EventTime) val impressionSource = kafkaSrc1 .filter(eventTypeFilter) .flatMap(impressionParser) .keyBy(in => (s"${profile_id}_${title_id}")) val impressionSource = kafkaSrc2 .flatMap(playbackParser) .keyBy(in => (s"${profile_id}_${title_id}")) @monaldax
  • 62. env.setStreamTimeCharacteristic(EventTime) val impressionSource = kafkaSrc1.filter(eventTypeFilter) .flatMap(impressionParser) .assignTimestampsAndWatermarks( new BoundedOutOfOrdernessTimestampExtractor(Time.seconds(10) ) {...}) .keyBy(in => (s"${profile_id}_${title_id}")) val impressionSource = kafkaSrc2.flatMap(playbackParser) .assignTimestampsAndWatermarks( new BoundedOutOfOrdernessTimestampExtractor(Time.seconds(10) ) {...}) .keyBy(in => (s"${profile_id}_${title_id}")) 4. Code Snippet – The Co-process Joiner, Setup sources
  • 63. 4. Code Snippet – The Co-process Joiner, Connect Streams // Connect impressionSource.connect(playSessionSource) .process( new CoprocessImpressionsPlays()) .addSink(kafkaSink) @monaldax
  • 64. 4. Code Snippet – The Co-process Joiner, Co-process Function class CoprocessJoin extends CoProcessFunction { override def processElement1(value, context, collector) { … // update and reduce state, join with stream 2, set timer } override def processElement2(value, context, collector) { … // update and reduce state, join with stream 2, set timer } override def onTimer(timestamp, context, collector) { … // clear up state based on event time } @monaldax
  • 65. 5. Event-Sourced Materialized View [Event Driven Application] @monaldax
  • 66. 5. Use-case: Publish movie assets CDN location to Cache, to steer clients to the closest location for playback EvCache asset1 OCA1, CA2 asset2 OCA1, CA3 Playback Assets Service Service asset_1 added asset_8 deleted OCA OCA1 OCA OCAN Open Connect Appliances (CDN) Upload asset_1 Delete asset_8 Streaming Job Source1 Generates Events to publish all assets asset1 OCA1, CA2…. asset2 OCA1, CA3 … @monaldax Materialized View
  • 67. 5. Use-case: Publish movie assets CDN location to Cache, to steer clients to the closest location for playback Streaming Job Event Publisher Optional Trigger to flush view Materialized View @monaldax Materialized View Sink
  • 68. 5. Code Snippet – Setting up Sources val fullPublishSource = env .addSource(new FullPublishSourceFunc7on(), TypeInfoParser.parse("Tuple3<String, Integer, com.ne7lix.AMUpdate>")) .setParallelism(1); val kaCaSource = getSourceBuilder().fromKaCa("am_source") @monaldax
  • 69. 5. Code Snippet – Union Source & Processing val kafkaSource .flatMap(new FlatmapFunction())) //split by movie .assignTimestampsAndWatermarks(new AssignerWithPunctuatedWatermarks[...]()) .union(fullPublishSource) // union with full publish source .keyBy(0, 1) // (cdn stack, movie) .process(new UpdateFunction())) // update in-memory state, output at intervals. .keyBy(0, 1) // (cdn stack, movie) .process(new PublishToEvCacheFunction())); // publish to evcache @monaldax
  • 71. 6. Elastic Dev Interface @monaldax
  • 72. 6 Elastic Dev Interface Spectrum Of Ease, Capability, & Flexibility • Point & Click, with UDFs • SQL with UDFs • Annotation based API with code generation • Code with reusable components • (e.g., Data Hygiene, Script Transformer) EaseofUse Capability
  • 73. 7. Stream Processing Platform @monaldax
  • 74. 7. Stream Processing Platform (SpaaS - Stream Processing Service as a Service) Amazon EC2 Container Runtime Reusable Components Source & Sink Connectors, Filtering, Projection, etc. Routers (Streaming Job) Streaming Jobs ManagementService&UI Metrics&Monitoring StreamingJobDevelopment Dashboards @monaldax Stream Processing Platform (Streaming Engine, Config Management)
  • 75. 8. Rewind & Restatement @monaldax
  • 76. 8. Use Case - Restate Results Due To Outage Or Bug In Business Logic @monaldax Time Checkpoint y Checkpoint x outage Checkpoint x+1 Now
  • 77. Time 8. Pattern: Rewind And Restatement Rewind the source and state to a know good state @monaldax Checkpoint y Checkpoint x outage Checkpoint x+1 Now
  • 79. 1. Configurable Router 2. Script UDF Component 3. The Enricher 4. The Co-process Joiner 5. Event-Sourced Materialized View Patterns Summary @monaldax 6. Elastic Dev Interface 7. Stream Processing Platform 8. Rewind & Restatement FUNCTIONAL NON-FUNCTIONAL
  • 80. Q & A Thank you If you would like to discuss more • @monaldax • linkedin.com/in/monaldax
  • 81. • Flink at Netflix, Paypal speaker series, 2018– http://bit.ly/monal-paypal • Unbounded Data Processing Systems, Strangeloop, 2016 - http://bit.ly/monal-sloop • AWS Re-Invent 2017 Netflix Keystone SPaaS, 2017 – http://bit.ly/monal-reInvent • Keynote - Stream Processing with Flink, 2017 - http://bit.ly/monal-ff2017 • Dataflow Paper - http://bit.ly/dataflow-paper Additional Stream Processing Material @monaldax
  • 82. Watch the video with slide synchronization on InfoQ.com! https://www.infoq.com/presentations/ apache-flink-streaming-app