The document describes Pravega and how it provides a unified model for batch and stream processing. Pravega stores data streams and provides a common storage layer. It handles both large historic batch data and real-time streaming data using segments that can be scaled independently. Pravega supports exactly-once processing through transactions that write to temporary transaction segments before committing to the stream. Readers can access segments in parallel for unordered reads of the stream data.
An elastic batch- and stream-processing stack with Pravega and Apache Flink
1. Unified and Elastic
Batch and Stream Processing
with Pravega and Apache Flink
Stephan Ewen, data Artisans
Flavio Junqueira, Pravega
2. Batch and Stream Processing
DataWorks Summit Berlin - April, 2018
3. What changes faster? Data or Query?
Batch processing use case: data changes slowly compared to fast-changing queries (ad-hoc queries, data exploration, ML training and (hyper)parameter tuning).
Stream processing use case: data changes fast and the application logic is long-lived (continuous applications, data pipelines, standing queries, anomaly detection, ML evaluation, …).
4. Streams as a Unified View on Data
5. Stream Processing Unifies Data Use Cases
Stateful computations over data streams cover:
• Batch processing: process static and historic data
• Data stream processing: real-time results from data streams
• Event-driven applications: data-driven actions and services
6. The Quest for Unified
Batch- and Stream Processing
7. Querying the Past
SELECT
  campaign,
  TUMBLE_START(clickTime, INTERVAL '1' HOUR),
  COUNT(ip) AS clickCnt
FROM adClicks
WHERE clickTime BETWEEN '2015-01-01' AND '2017-12-31'
GROUP BY campaign, TUMBLE(clickTime, INTERVAL '1' HOUR)
The query ranges over the past of the stream. Use a batch processor (or a capable stream processor) connected to bulk storage (S3, HDFS, GFS, …).
8. Querying the Past
Recorded events (file system, object storage) feed a batch processor: a massively parallel, unordered scan, with algorithms and data structures to process finite data.
9. Querying the Future
SELECT
  campaign,
  TUMBLE_START(clickTime, INTERVAL '1' HOUR),
  COUNT(ip) AS clickCnt
FROM adClicks
WHERE clickTime > now()
GROUP BY campaign, TUMBLE(clickTime, INTERVAL '1' HOUR)
The query ranges over the future of the stream. Use a stream processor connected to a pub-sub system (Kafka, Kinesis, PubSub, …).
10. Querying the Future
Real-time events (message queue, event log) feed a stream processor, which serves real-time events in order and has the state and event-time support to process unbounded data.
11. Querying the Past and the Future
SELECT
  campaign,
  TUMBLE_START(clickTime, INTERVAL '1' HOUR),
  COUNT(ip) AS clickCnt
FROM adClicks
WHERE clickTime > '2017-01-01'
GROUP BY campaign, TUMBLE(clickTime, INTERVAL '1' HOUR)
The query spans both the past and the future of the stream. Use a stream processor? Connect it to both bulk storage and pub-sub?
12. Querying the Past and the Future
A unified batch/stream processor reads from two systems: recorded events (file system, object storage) via parallel scans of historic data, and real-time events (message queue, event log) with low-latency serving. It must switch from the batch scan to stream ingestion.
14. The Stack
• Unified model, semantics, APIs: the same model/API to treat historic and real-time data
• Unified storage: the same view on, and access to, historic and real-time data
• Unified runtime: handles both large historic data and low-latency real-time data
17. Pravega
• Stores data streams
• A young project, under active development
• Open source
http://pravega.io
http://github.com/pravega/pravega
19. Anatomy of a stream
(timeline: distant past → recent past → present)
20. Anatomy of a stream
Messaging and pub-sub systems serve the present; a bulk store holds the distant past.
21. Anatomy of a stream
Pravega covers the whole timeline, from the distant past to the present.
22. Anatomy of a stream
Pravega covers the whole timeline. A stream holds an unbounded amount of data, and the ingestion rate might vary.
23. Pravega and Streams
Writers ingest stream data by appending to Pravega; applications process stream data by reading from it. (diagram: Append → Pravega → Read)
26. Guarantees of the write path
• Order
• Writers append following application order
• Per-key order
• No duplicates
• Writer IDs map to the last appended data on the segment store
• Exactly-once on the write path via deduplication based on writer IDs
• Atomicity for groups of writes with transactions
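The deduplication guarantee can be sketched in a few lines (a toy model with hypothetical names, not Pravega's actual Java implementation): the segment store tracks, per writer ID, the highest sequence number it has appended, and drops any retried append at or below it.

```python
class SegmentStore:
    """Toy model of per-writer deduplication on the write path."""
    def __init__(self):
        self.data = []      # appended events, in arrival order
        self.last_seq = {}  # writer_id -> highest sequence number seen

    def append(self, writer_id, seq, event):
        # A retry resends an already-acknowledged sequence number;
        # anything at or below the last recorded one is a duplicate.
        if seq <= self.last_seq.get(writer_id, -1):
            return False    # duplicate: drop silently
        self.last_seq[writer_id] = seq
        self.data.append(event)
        return True

store = SegmentStore()
store.append("w1", 0, "a")
store.append("w1", 1, "b")
store.append("w1", 1, "b")  # retried append, deduplicated
assert store.data == ["a", "b"]
```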
28. The read path
The segment store serves stream data from its cache; on a cache miss, it fetches from long-term storage (Tier 2), which holds everything from the start of the stream through the recent past.
30. Segments in Pravega
A stream is a composition of segments. A segment is:
• The unit of a stream
• Append-only
• A sequence of bytes
32. Segments in Pravega
Event writers append to a stream's segments and event readers read from them.
• Segments are sequences of bytes
• An append carries a routing key, ⟨key, event⟩, which determines the target segment
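The mapping from routing key to segment can be sketched as a stable hash into the stream's key space (the hash function and the fixed segment count here are illustrative assumptions; Pravega maps keys to the key ranges owned by the current segments):

```python
import hashlib

def segment_for(routing_key, num_segments):
    """Map a routing key to one of the stream's current segments."""
    # Stable hash of the key into [0, 1), then pick the owning key range.
    h = int(hashlib.sha256(routing_key.encode()).hexdigest(), 16)
    position = (h % 10**6) / 10**6
    return int(position * num_segments)

# The same key always lands in the same segment, preserving per-key order.
assert segment_for("sensor-42", 4) == segment_for("sensor-42", 4)
```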
33. Segments can be sealed
34. Segments in Pravega
Once sealed, a segment can't be appended to any longer; readers can still read it (e.g., historical ad clicks).
35. How is sealing segments useful?
39. Some useful ways to compose segments
40. Scaling a stream
• Say the input load has increased and we need more parallelism (auto or manual scaling)
1. The stream has one segment
2. Seal the current segment and create new ones; subsequent appends go to the new segments
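The scale-up steps above can be sketched as splitting a sealed segment's key range between two successors (a simplified model; the names and the even split are assumptions):

```python
def scale_up(segments, seal_idx):
    """Seal one segment and replace it with two successors
    that split its key range (illustrative model of a scale-up)."""
    lo, hi = segments[seal_idx]["range"]
    segments[seal_idx]["sealed"] = True
    mid = (lo + hi) / 2
    return segments + [
        {"range": (lo, mid), "sealed": False},
        {"range": (mid, hi), "sealed": False},
    ]

stream = [{"range": (0.0, 1.0), "sealed": False}]  # stream has one segment
stream = scale_up(stream, 0)                       # seal it, create two successors
assert stream[0]["sealed"] and len(stream) == 3
```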
47. Daily Cycles
The peak rate is 10x higher than the lowest rate (e.g., a 4:00 AM trough vs. a 9:00 AM peak).
NYC Yellow Taxi Trip Records, March 2015
http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml
49. Transactions
• Transactional writes: all or nothing
• A transaction can write to any open segment of the stream; there is no limitation on the routing-key range
• Transactional writes are interleaved with regular writes
• Important for exactly-once semantics: either all writes become visible or none
• A transaction is aborted manually or via timeout
(diagram: regular writes go directly to stream segments s1 and s2, while transactional writes go to separate txn segments)
50. Transactions
1. The stream has two segments, s1 and s2
2. Begin txn: create txn segments, one per stream segment
3. Write to the txn
4. Write to the txn
5. Upon commit: seal the txn segments
6. Merge each txn segment into its stream segment
51. Transactions
1. The stream has two segments, s1 and s2
2. Begin txn: create txn segments
3. Write to the txn
4. Write to the txn
5. Upon abort: eliminate the txn segments; the stream segments are untouched
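The all-or-nothing behavior can be modeled in a few lines (a toy sketch with hypothetical names, not the Pravega client API): writes to a transaction accumulate in separate txn segments and only become readable when they are merged on commit.

```python
class Stream:
    """Toy model of transactional writes: data goes to txn segments
    first, and becomes visible only if the txn segments are merged."""
    def __init__(self, segment_names):
        self.segments = {s: [] for s in segment_names}
        self.txns = {}

    def begin_txn(self, txn_id):
        self.txns[txn_id] = {s: [] for s in self.segments}

    def write_txn(self, txn_id, segment, event):
        self.txns[txn_id][segment].append(event)

    def commit(self, txn_id):
        # Seal the txn segments and merge each into its stream segment.
        for seg, events in self.txns.pop(txn_id).items():
            self.segments[seg].extend(events)

    def abort(self, txn_id):
        self.txns.pop(txn_id)  # eliminate txn segments; nothing becomes visible

s = Stream(["s1", "s2"])
s.begin_txn("t1")
s.write_txn("t1", "s1", "x")
assert s.segments["s1"] == []     # not visible before commit
s.commit("t1")
assert s.segments["s1"] == ["x"]  # all-or-nothing visibility
```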
52. Unordered reads
• The stream started with a single segment, then scaled up from one to two segments; three segments are available in total
• One iterator per segment: readers can read in parallel from all segments
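When time order across segments is irrelevant, each segment can be scanned independently. A minimal sketch of one-iterator-per-segment parallel reads (the helper names and the click-count query are illustrative assumptions):

```python
from concurrent.futures import ThreadPoolExecutor

def scan_segment(events):
    """One iterator per segment: count clicks in a single segment."""
    return sum(1 for e in events if e == "click")

# Three segments available after scaling; order across segments does not
# matter for this query, so all segments can be scanned in parallel.
segments = [["click", "view"], ["click"], ["click", "click"]]
with ThreadPoolExecutor(max_workers=3) as pool:
    total = sum(pool.map(scan_segment, segments))
assert total == 4
```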
53. Putting it all together
54. Writers, Readers, and Reader Groups
• Event writers make regular and transactional appends to the stream's segments
• Event readers read the stream as part of a reader group, which coordinates the assignment of segments to readers and checkpointing
55. Putting everything together
Two event writers append to stream segments 1 and 2; event readers 1 and 2 form a reader group.
Reader group state:
• Event Reader 1: {1}
• Event Reader 2: {2}
• Unassigned: {}
56. Putting everything together
• Start scaling: seal Segment 2, create Segments 3 and 4
Reader group state (not yet updated):
• Event Reader 1: {1}
• Event Reader 2: {2}
• Unassigned: {}
57. Putting everything together
• Get the successors of the sealed segment from the controller
• Add them to the reader group state
Reader group state:
• Event Reader 1: {1}
• Event Reader 2: {}
• Unassigned: {3, 4}
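The reassignment can be sketched as operations on the reader group state (hypothetical function names; the real coordination happens through Pravega's controller and the group's shared state):

```python
def release_and_fetch_successors(state, reader, segment, successors):
    """A reader finished a sealed segment: drop it and make the
    successors (obtained from the controller) available to the group."""
    state[reader].remove(segment)
    state["unassigned"].extend(successors)

def acquire(state, reader):
    """A reader picks up one unassigned segment, if any."""
    if state["unassigned"]:
        state[reader].append(state["unassigned"].pop(0))

state = {"reader1": [1], "reader2": [2], "unassigned": []}
release_and_fetch_successors(state, "reader2", 2, [3, 4])
assert state["unassigned"] == [3, 4]
acquire(state, "reader1")
acquire(state, "reader2")
assert state == {"reader1": [1, 3], "reader2": [4], "unassigned": []}
```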
58. Putting everything together
The readers pick up the unassigned segments.
Reader group state:
• Event Reader 1: {1, 3}
• Event Reader 2: {4}
• Unassigned: {}
61. Putting everything together
• Initiate a checkpoint
Reader group state:
• Event Reader 1: {1, 3}
• Event Reader 2: {4}
• Unassigned: {}
62. Putting everything together
Checkpoint events flow to each reader; the completed checkpoint records a position per segment:
Checkpoint:
• Segment 1: 2
• Segment 3: 1
• Segment 4: 1
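The resulting checkpoint is essentially a map from segment to read position, merged across the readers in the group. A minimal sketch (hypothetical names):

```python
def take_checkpoint(readers):
    """Collect each reader's current positions into one checkpoint:
    a map from segment to read offset."""
    checkpoint = {}
    for positions in readers.values():
        checkpoint.update(positions)
    return checkpoint

# Positions recorded once the checkpoint event reaches each reader
readers = {"reader1": {"segment1": 2, "segment3": 1},
           "reader2": {"segment4": 1}}
assert take_checkpoint(readers) == {"segment1": 2, "segment3": 1, "segment4": 1}
```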
63. Apache Flink: APIs and Execution
In the stack, Flink provides the model/semantics/APIs layer and the execution runtime on top of the storage layer.
64. Apache Flink in a Nutshell
Stateful computations over streams, real-time and historic: fast, scalable, fault-tolerant, in-memory, with event time, large state, and exactly-once guarantees.
Applications read streams and historic data (from databases, streams, file/object storage) and serve queries, applications, devices, etc.
66. Layered APIs
• Stream SQL / Tables (dynamic tables): high-level analytics API
• DataStream API (streams, windows): stream- & batch data processing
• Process Function (events, state, time): stateful event-driven applications

val stats = stream
  .keyBy("sensor")
  .timeWindow(Time.seconds(5))
  .reduce((a, b) => a.add(b))

def processElement(event: MyEvent, ctx: Context, out: Collector[Result]) = {
  // work with event and state
  (event, state.value) match { … }
  out.collect(…)   // emit events
  state.update(…)  // modify state
  // schedule a timer callback
  ctx.timerService.registerEventTimeTimer(event.timestamp + 500)
}

Navigate simple to complex use cases
67. DataStream API

// Source
val lines: DataStream[String] = env.addSource(new FlinkKafkaConsumer011(…))
// Transformation
val events: DataStream[Event] = lines.map((line) => parse(line))
// Windowed transformation
val stats: DataStream[Statistic] = events
  .keyBy("sensor")
  .timeWindow(Time.seconds(5))
  .aggregate(new MyAggregationFunction())
// Sink
stats.addSink(new RollingSink(path))

Streaming dataflow: Source → Transform → Window (state read/write) → Sink
68. SQL (ANSI) – Streaming and Batch
SELECT
  campaign,
  TUMBLE_START(clickTime, INTERVAL '1' HOUR),
  COUNT(ip) AS clickCnt
FROM adClicks
WHERE clickTime > '2017-01-01'
GROUP BY campaign, TUMBLE(clickTime, INTERVAL '1' HOUR)
The query spans both the past and the future of the stream.
69. Flink in Practice
• AthenaX streaming SQL platform service
• Streaming platform as a service
• Fraud detection
• Streaming analytics platform: 100s of jobs, 1000s of nodes, TBs of state; metrics, analytics, real-time ML
• Streaming SQL as a platform
70. Unified Batch- & Streaming APIs
71. Batch and Streaming in the APIs
Batch processing use case: data changes slowly compared to fast-changing queries (ad-hoc queries, data exploration, ML training and (hyper)parameter tuning).
Stream processing use case: data changes fast and the application logic is long-lived (continuous applications, data pipelines, standing queries, anomaly detection, ML evaluation, …).
72. Batch and Streaming in the APIs
For the same two use cases, Flink's DataSet API handles bounded streams (batch) and its DataStream API handles unbounded streams.
73. Latency vs. Completeness
• Streaming trades data completeness (waiting longer for delayed data) against latency (emitting results early)
• The tradeoff is captured by the watermark, which drives Flink's event-time clock
• A watermark captures full or heuristic completeness with respect to a certain event time
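The watermark's role can be sketched as a firing condition for event-time windows (a toy model, not Flink's API): a window's result is emitted once the watermark passes the window's end.

```python
def fire_windows(watermark, windows):
    """A window's result may be emitted once the watermark
    (the event-time clock) passes the window's end timestamp."""
    return [w for w in windows if w["end"] <= watermark]

windows = [{"end": 100}, {"end": 200}, {"end": 300}]
# A conservative watermark waits longer (completeness); an eager one
# emits earlier (latency) at the risk of missing late events.
assert fire_windows(250, windows) == [{"end": 100}, {"end": 200}]
```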
78. Connecting Flink and Pravega
FlinkPravegaReader
• Exactly-once reader
• Integrates Flink checkpoints with Pravega checkpoints
FlinkPravegaWriter
• Transactional exactly-once event producer
• Distributed 2-phase commit coordinated by asynchronous checkpoints
https://github.com/pravega/flink-connectors
79. Streaming and Batch Reads
• DataStream API: in-order reads of the segments; parallelism limited to the number of segments at a certain time
• DataSet API: out-of-order reads; fully parallel reads of all segments
81. Status of Batch and Streaming Unification
We have unified batch and streaming APIs:
• Apache Flink and Apache Beam (Dataflow Model style)
• Stream SQL (Apache Flink + Beam + Calcite)
• Batch makes some simplifying assumptions
Pravega is streaming storage with an end-to-end streaming abstraction:
• It also has optimizations for batch-style reads
82. Status of Batch and Streaming Unification
Batch and streaming runtimes are still different:
• Streaming needs some form of bounded out-of-orderness
• Batch does highly parallel bulk out-of-order processing
There is potential to use both modes in the same application:
• Use cases that process historic and real-time data (bootstrapping)
• Use batch-style execution on historic data
• Use streaming execution on live data
83. Outlook: Autoscaling
• Scaling policies (Flink 1.6.0+) enable applications to dynamically adjust their parallelism
• The Pravega source operator integrates with scaling policies
• Adjust the Flink source stage's parallelism together with Pravega stream scaling
84. Outlook: Batch and Streaming Runtime
A query over past and future could use batch execution for the past and streaming execution for the future of the stream.
85. Outlook: Batch and Streaming Runtime
Parallel batch reads scan segments S1–S4 over the past; streaming (ordered) reads take over toward the present.
86. Outlook: Batch and Streaming Runtime
The handover can fall inside a segment: batch reads cover S1–S3 and part of S4, and streaming reads continue with the rest of S4.
94. Powerful Abstractions
The same layered APIs as before (Stream SQL / Tables with dynamic tables, the DataStream API with streams and windows, and the Process Function with events, state, and time): layered abstractions to navigate simple to complex use cases.
95. DataStream API
The same DataStream API example as before: Source → Transformation → Windowed Transformation → Sink.
97. High Level: SQL (ANSI)
The same windowed ad-clicks aggregation query as before, spanning both the past and the future of the stream.
99. Latency vs. Completeness (in my words)
Star Wars illustrates the difference: in processing time (release years 1977, 1980, 1983, 1999, 2002, 2005, 2015, 2016, 2017), the films arrive as Episodes IV, V, VI, I, II, III, VII, Rogue One ("III.5"), and VIII, but in event time they belong at their episode positions. Out-of-order arrival is the norm.
102. The FlinkPravegaWriter
• A regular Flink SinkFunction
• No partitioner, but a routing key
• Remember: there are no partitions in Pravega, just dynamically created segments
• The same key always goes to the same segment
• Order of elements is guaranteed per key!
(diagram: a Flink application writing to Pravega segments 1–4)
103. Exactly-Once Writes via Transactions
• Similar to a distributed 2-phase commit
• Coordinated by asynchronous checkpoints, with no voting delays
• Basic algorithm:
• Between checkpoints: produce into a transaction
• On operator snapshot: flush the local transaction (vote-to-commit)
• On checkpoint complete: commit the transactions
• On recovery: check and commit any pending transactions
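The basic algorithm above can be sketched as a small state machine (a toy model with hypothetical names, not the actual FlinkPravegaWriter):

```python
class TwoPhaseSink:
    """Sketch of the checkpoint-coordinated two-phase commit:
    produce into a txn, flush on snapshot, commit on completion."""
    def __init__(self, stream):
        self.stream = stream  # committed, visible events
        self.txn = []         # current open transaction
        self.pending = {}     # checkpoint id -> flushed txn

    def produce(self, event):
        self.txn.append(event)          # between checkpoints: write to txn

    def snapshot(self, checkpoint_id):
        # On operator snapshot: flush the local txn (vote-to-commit)
        self.pending[checkpoint_id] = self.txn
        self.txn = []

    def checkpoint_complete(self, checkpoint_id):
        # On checkpoint complete: commit the transaction
        self.stream.extend(self.pending.pop(checkpoint_id))

    def recover(self):
        # On recovery: commit any transaction that was flushed but
        # whose commit notification never arrived
        for cid in sorted(self.pending):
            self.stream.extend(self.pending.pop(cid))

sink = TwoPhaseSink([])
sink.produce("a")
sink.snapshot(1)
sink.produce("b")
assert sink.stream == []          # nothing visible before commit
sink.checkpoint_complete(1)
assert sink.stream == ["a"]
sink.snapshot(2)
sink.recover()                    # failure happened before chk-2's commit
assert sink.stream == ["a", "b"]
```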
104. Exactly-Once Writes via Transactions
(diagram: transactions TXN-1, TXN-2, TXN-3 aligned with checkpoints chk-1 and chk-2; each checkpoint becomes globally complete (✔ global) once its transaction is committed to the Pravega stream)
105. Transaction fails after local snapshot
(diagram: TXN-2 fails (✘) after the local snapshot for chk-2; only chk-1 is globally complete)
106. Transaction fails before commit…
(diagram: chk-2 completed globally, but the failure (✘) strikes before TXN-2's commit reaches the Pravega stream)
107. … commit on recovery
(diagram: on recovery, the TXN-2 handle is recovered and the pending transaction is committed; processing resumes with TXN-3 and chk-3)
108. Use Cases for Unified Stream-Batch Processing
• More applications than "just" analytics
• Building a machine-learning model from the past (in batch mode), then applying and refining it on real-time data
• Running A/B tests for algorithms on historic and live data
• …
109. Abstract
• Stream processing is becoming more relevant as many applications require low-latency response times and new application domains emerge that naturally demand data to be processed in motion. One particularly attractive characteristic of the stream-processing paradigm is that it conceptually unifies batch processing (bounded/static historic data) and continuous near-real-time data processing (unbounded streaming event data).
• However, in practice, implementing a unified batch and streaming data architecture is not seamless: near-real-time event data and bulk historic data use different storage systems (message queues or logs versus filesystems or object stores). Consequently, running the same analysis now and at some arbitrary time in the future (e.g., months, possibly years ahead) means dealing with different data sources and APIs. Few systems are capable of handling both near-real-time streaming workloads and large batch workloads at the same time. And streaming workloads tend to be inherently dynamic, requiring both storage and compute to adjust continuously for maximum resource efficiency.
• Flavio Junqueira and Fabian Hueske detail an open source streaming data stack consisting of Pravega (stream storage) and Apache Flink (computation on streams) that offers an unprecedented way of handling "everything as a stream": it includes unbounded streaming storage and a unified batch and streaming abstraction, and it dynamically accommodates workload variations in a novel way.
• Pravega enables the ingestion capacity of a stream to grow and shrink according to workload and sends signals downstream to enable Flink to scale accordingly; it also offers permanent streaming storage, exposing an API that enables applications to access data in near real time or at any arbitrary time in the future in a uniform fashion. Apache Flink's SQL and streaming APIs provide a common interface for processing continuous near-real-time data, a set of historic data, or combinations of both. A deep integration between these two systems provides end-to-end exactly-once semantics for pipelines of streams and stream processing and lets both systems jointly scale and adjust automatically to changing data rates.
110. Notes by Flavio
• The talk will have three parts:
• Motivation for “everything as a stream”.
• Realizing our vision with a combination of a stream store + unified stream/batch processor
• Where we are with respect to our vision and where we want to go
• Motivation
• There are three cases mentioned that we can use to motivate:
• 1. Always process data as a stream: same API independent of when the application processes the data (reprocessing, historical
processing)
• 2. Catch-up: does not require starting from a bulk store like HDFS and then switching to something else
• 3. Processing stream data in parallel (batch processing)
• Realizing vision
• Pravega intro
• Flink connector
• Flink examples?
• How do we compare to other systems?
• Apache Pulsar: Pub-sub messaging
• Apache Kafka: inflexible in a number of ways
111. I've heard Batch is a Subset of Streaming…
-> Stream processing subsumes batch processing.

Batch vs. Stream:
• Input: batch has bounded, fixed-size input; stream has unbounded, infinite input
• Input ordering: batch requires no ordering (the full data set can be sorted); stream may require ordering to reason about completeness of the input
• Processing: batch algorithms can collect all input data before processing it; stream algorithms must process data as it arrives
• Termination & output: batch programs terminate and produce finite output; streaming programs do not terminate and produce continuous output
112. Scanning the Past in Order
• Many streaming queries have temporal operations
• Time-windowed aggregations
• Joins with temporal conditions
• The processor can leverage the (imperfect) time order
• No full sort or hash tables required -> smaller memory requirements
• Analogous to a clustered index scan in a relational DBMS
• No need to switch to ordered ingestion when reaching the tail of the stream
• BUT: scanning in order typically means scanning with lower parallelism
113. Ordered Scans are not Always Beneficial
• Get the total number of clicks per campaign
• The query does not have a temporal operation
• Events can be processed without respecting time order
• Massively parallel catch-up scan of the past

SELECT
  campaign,
  COUNT(*) AS clickCnt
FROM adClicks
GROUP BY campaign
114. Requirements for Unified Stream-Batch Processing
• Storage
• Single storage system for historic and real-time data with unified API
• Scanning historic data in time order
• Scanning historic data out of time order with high parallelism
• Ingestion of data in time order
• Processor
• Efficient processing of nearly time-sorted data
• Efficient processing of unordered, bounded data
115. A System for Unified Stream-Batch Processing
• Stream Storage: Pravega
• Long-term storage with support for ordered and unordered scans
• Real-time event log with ordered scans
• Dynamically scales writes and reads
• Unified Stream-Batch Processor: Apache Flink
• Stream processing with sophisticated state handling
• Event-time with watermark support for ingestion of ordered data
• Dedicated algorithms to efficiently handle bounded data
• Tight integration of storage and processor
• End-to-end exactly-once processing
• Dynamic scaling