Marton Balassi – Stateful Stream Processing

Stateful Stream Processing
Márton Balassi
mbalassi@apache.org
Gábor Hermann
ghermann@ilab.sztaki.hu

This talk
▪ Stateful stream processing by example
▪ Open source stream processors
▪ Runtime architecture
▪ Fault tolerance
▪ Stateful processing
▪ Closing
Flink Forward, 2015-10-13

Stateful stream processing by example

Streaming applications
ETL style operations
• Filter incoming data, Log analysis
• No or minimal state
Window aggregations
• Trending tweets, User sessions, Stream joins
• State: Window buffer
[Figure: multiple inputs feeding a Process/Enrich step]

Streaming applications
Machine learning
• Fitting trends to the evolving stream, Stream clustering
• State: Model
Pattern recognition
• Fraud detection, Triggering signals based on activity
• State: Finite state machine

State in streaming programs
▪ Almost all non-trivial streaming programs are stateful
▪ Stateful operators (in essence):
f: (in, state) ⟶ (out, state′)
▪ State hangs around and can be read and modified as the stream evolves
▪ Goal: get as close as possible to this abstraction while maintaining scalability and fault tolerance
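The operator signature above can be sketched in plain Java. This is a minimal illustration, not any framework's API: the class `StatefulOperatorSketch`, the `Result` holder and the `runningSum` example are hypothetical names introduced here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

/** Hypothetical sketch of f: (in, state) -> (out, state'). Not a framework API. */
final class StatefulOperatorSketch {

    /** Holds the emitted output together with the updated state. */
    static final class Result<O, S> {
        final O out;
        final S state;
        Result(O out, S state) { this.out = out; this.state = state; }
    }

    /** Threads the state through the stream, one record at a time. */
    static <I, O, S> List<O> run(List<I> stream, S initial,
                                 BiFunction<I, S, Result<O, S>> f) {
        List<O> outputs = new ArrayList<>();
        S state = initial;
        for (I in : stream) {
            Result<O, S> r = f.apply(in, state);
            outputs.add(r.out);
            state = r.state; // the state "hangs around" for the next record
        }
        return outputs;
    }

    /** Example operator: emits the running sum of the integers seen so far. */
    static List<Integer> runningSum(List<Integer> stream) {
        return run(stream, 0, (in, sum) -> new Result<>(sum + in, sum + in));
    }
}
```

Here the state lives in a local variable. The systems compared in this talk differ mainly in where that state lives and how it survives failures.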

Open source stream processors

Apache Streaming landscape

Apache Storm
▪ Started in 2010, development driven by BackType, then Twitter
▪ Pioneer in large-scale stream processing
▪ Distributed dataflow abstraction (spouts & bolts)

Apache Flink
▪ Started in 2008 as a research project (Stratosphere) at European universities
▪ Unique combination of low-latency streaming and high-throughput batch analysis
▪ Flexible operator states and windowing
[Figure: Flink data sources. Stream data: Kafka, RabbitMQ, ...; batch data: HDFS, JDBC, ...]

Apache Samza
▪ Developed at LinkedIn, open sourced in 2013
▪ Builds heavily on Kafka’s log-based philosophy
▪ Pluggable messaging system and execution backend

Apache Spark
▪ Started in 2009 at UC Berkeley, Apache since 2013
▪ Very strong community, wide adoption
▪ Unified batch and stream processing over a batch runtime
▪ Good integration with batch programs

Runtime and programming model

Native Streaming

Distributed dataflow runtime
▪ Storm, Samza and Flink
▪ General properties
• Long-running operators
• Pipelined execution
• Usually possible to create cyclic flows
Pros
• Full expressivity
• Low-latency execution
• Stateful operators
Cons
• Fault-tolerance is hard
• Throughput may suffer
• Load balancing is an issue

Micro-batching

Micro-batch runtime
▪ Implemented by Apache Spark
▪ General properties
• Computation broken down into time intervals
• Load-aware scheduling
• Easy interaction with batch
Pros
• Easy to reason about
• High throughput
• Fault tolerance comes for “free”
• Dynamic load balancing
Cons
• Latency depends on batch size
• Limited expressivity
• Stateless by nature
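The idea of breaking computation down into time intervals can be sketched as follows. This is a toy illustration only, not Spark's scheduler: `MicroBatchSketch` and `toBatches` are hypothetical names, and records are represented by their arrival timestamps in milliseconds.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: assigns records to fixed-length batch intervals. */
final class MicroBatchSketch {

    /** Groups record timestamps (ms) by the start of the interval they fall into. */
    static Map<Long, List<Long>> toBatches(List<Long> timestamps, long batchMillis) {
        Map<Long, List<Long>> batches = new TreeMap<>();
        for (long ts : timestamps) {
            long batchStart = (ts / batchMillis) * batchMillis;
            batches.computeIfAbsent(batchStart, k -> new ArrayList<>()).add(ts);
        }
        return batches;
    }
}
```

A record arriving right after an interval starts is only processed once the interval closes, which is why latency depends on the batch size.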

Fault tolerance

Fault tolerance intro
▪ Fault-tolerance in streaming systems is inherently harder than in batch
• Can’t just restart computation
• State is a problem
• Fast recovery is crucial
• Streaming topologies run 24/7 for a long period
▪ Fault-tolerance is a complex issue
• No single point of failure is allowed
• Guaranteeing input processing
• Consistent operator state
• Fast recovery
• At-least-once vs Exactly-once semantics

Storm record acknowledgements
▪ Track the lineage of tuples as they are processed (anchors and acks)
▪ Special “acker” bolts track each lineage DAG (efficient XOR-based algorithm)
▪ Replay the root of failed (or timed out) tuples
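The XOR-based tracking mentioned above can be sketched like this. It is a simplified model under stated assumptions, not Storm's acker code: `XorAckerSketch`, `track` and `ack` are hypothetical names, and tuple ids are passed in explicitly, whereas a real system would use random 64-bit ids.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of XOR-based lineage tracking; not Storm's actual classes. */
final class XorAckerSketch {

    // root tuple id -> XOR of every tuple id seen so far (once on emit, once on ack)
    private final Map<Long, Long> ackVals = new HashMap<>();

    /** Registers a new root tuple emitted by a source. */
    void track(long rootId, long rootTupleId) {
        ackVals.put(rootId, rootTupleId);
    }

    /**
     * Acks one tuple and, in the same update, anchors the tuples derived from it.
     * Every id is XORed in exactly twice over the tree's lifetime, so the value
     * returns to zero once all tuples are acked. Returns true when that happens.
     */
    boolean ack(long rootId, long ackedId, long... newChildIds) {
        long update = ackedId;
        for (long child : newChildIds) {
            update ^= child;
        }
        long value = ackVals.merge(rootId, update, (a, b) -> a ^ b);
        if (value == 0L) {
            ackVals.remove(rootId); // fully processed, stop tracking
            return true;
        }
        return false;
    }
}
```

The appeal of the scheme is constant memory per root tuple, regardless of how large the lineage DAG grows.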

Samza offset tracking
▪ Exploits the properties of a durable, offset-based messaging layer
▪ Each task maintains its current offset, which moves forward as it processes elements
▪ The offset is checkpointed and restored on failure (some messages might be repeated)
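Why some messages might be repeated can be seen in a small simulation. This is a sketch under simplifying assumptions (a single task, an in-memory list standing in for the durable log, one injected failure); `OffsetCheckpointSketch` and its parameters are hypothetical names, not Samza API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical simulation of offset checkpointing with one injected failure. */
final class OffsetCheckpointSketch {

    /**
     * Processes the log, checkpointing the offset every checkpointEvery messages.
     * After failAfter messages the task "crashes" once and restarts from the
     * last checkpointed offset. Returns every message that was processed.
     */
    static List<String> run(List<String> log, int checkpointEvery, int failAfter) {
        List<String> processed = new ArrayList<>();
        int checkpointed = 0; // last durably stored offset
        int offset = 0;       // in-memory position of the task
        boolean failed = false;
        while (offset < log.size()) {
            processed.add(log.get(offset));
            offset++;
            if (offset % checkpointEvery == 0) {
                checkpointed = offset;
            }
            if (!failed && offset == failAfter) {
                failed = true;
                offset = checkpointed; // restore: everything since is replayed
            }
        }
        return processed;
    }
}
```

With the log `a, b, c, d, e`, a checkpoint every 2 messages and a failure after 3, message `c` is processed twice: at-least-once delivery.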

Flink checkpointing
▪ Based on consistent global snapshots
▪ Algorithm designed for stateful dataflows (minimal runtime overhead)
▪ Exactly-once semantics
Spark RDD recomputation
▪ Immutable data model with repeatable computation
▪ Failed RDDs are recomputed using their lineage
▪ Checkpoint RDDs to reduce lineage length
▪ Parallel recovery of failed RDDs
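Lineage-based recovery can be sketched as re-running recorded transformations over immutable input. This is a single-machine illustration with hypothetical names (`LineageSketch`, `computePartition`, `demo`); it does not use or mirror Spark's RDD classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical sketch: a dataset defined by its source and an ordered lineage. */
final class LineageSketch {

    private final List<Integer> source;                 // immutable, replayable input
    private final List<UnaryOperator<Integer>> lineage; // transformations, in order

    LineageSketch(List<Integer> source, List<UnaryOperator<Integer>> lineage) {
        this.source = source;
        this.lineage = lineage;
    }

    /** (Re)computes one partition, here an index range, from the source. */
    List<Integer> computePartition(int from, int to) {
        List<Integer> out = new ArrayList<>();
        for (Integer x : source.subList(from, to)) {
            Integer v = x;
            for (UnaryOperator<Integer> op : lineage) {
                v = op.apply(v);
            }
            out.add(v);
        }
        return out;
    }

    /** Recomputes a "lost" second partition of a two-partition dataset. */
    static List<Integer> demo() {
        List<UnaryOperator<Integer>> lineage = new ArrayList<>();
        lineage.add(x -> x + 1);
        lineage.add(x -> x * 2);
        LineageSketch data = new LineageSketch(Arrays.asList(1, 2, 3, 4), lineage);
        return data.computePartition(2, 4);
    }
}
```

Since partitions are computed independently, lost ones can be recovered in parallel. Checkpointing an intermediate result would shorten the list of transformations that has to be replayed.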

Stateful stream processing

State in streaming programs
▪ Almost all non-trivial streaming programs are stateful
▪ Stateful operators (in essence):
f: (in, state) ⟶ (out, state′)
▪ State hangs around and can be read and modified as the stream evolves
▪ Goal: get as close as possible to this abstraction while maintaining scalability and fault tolerance

Storm (Trident)
▪ States available only in Trident API
▪ Dedicated operators for state updates and queries
▪ State access methods
• stateQuery(…)
• partitionPersist(…)
• persistentAggregate(…)
▪ It’s very difficult to implement transactional states
▪ Exactly-once guarantee
Storm Trident Word Count

Spark
▪ Stateless runtime by design
• No continuous operators
• UDFs are assumed to be stateless
▪ State can be generated as a separate stream of RDDs: updateStateByKey(…)
f: (Seq[in_k], state_k) ⟶ state′_k
▪ f is scoped to a specific key

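Spark Streaming Word Count
import org.apache.spark.HashPartitioner

// Per-key update: add the new occurrences to the previous count
val updateFunc = (values: Seq[Int], state: Option[Int]) => {
  val currentCount = values.sum
  val previousCount = state.getOrElse(0)
  Some(currentCount + previousCount)
}
// Adapter to the iterator-based signature this overload expects
val newUpdateFunc = (iterator: Iterator[(String, Seq[Int], Option[Int])]) => {
  iterator.flatMap(t => updateFunc(t._2, t._3).map(s => (t._1, s)))
}
val stateDstream = wordDstream.updateStateByKey[Int](
  newUpdateFunc,
  new HashPartitioner(ssc.sparkContext.defaultParallelism),
  true, // remember the partitioner
  initialRDD)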

Samza
▪ Stateful dataflow operators (Any task can hold state)
▪ State changes are stored as a log by Kafka
▪ Custom storage engines can be plugged in to the log
▪ f is scoped to a specific task
▪ At-least-once processing semantics

Samza Word Count
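import org.apache.samza.storage.kv.KeyValueStore;
import org.apache.samza.system.IncomingMessageEnvelope;
import org.apache.samza.system.OutgoingMessageEnvelope;
import org.apache.samza.task.InitableTask;
import org.apache.samza.task.MessageCollector;
import org.apache.samza.task.StreamTask;
import org.apache.samza.task.TaskCoordinator;

public class WordCounter implements StreamTask, InitableTask {
    //Some omitted details…
    private KeyValueStore<String, Integer> store;

    public void process(IncomingMessageEnvelope envelope,
                        MessageCollector collector,
                        TaskCoordinator coordinator) {
        //Get the current count
        String word = (String) envelope.getKey();
        Integer count = store.get(word);
        if (count == null) count = 0;
        //Increment, store and send
        count += 1;
        store.put(word, count);
        collector.send(
            new OutgoingMessageEnvelope(OUTPUT_STREAM, word, count));
    }
}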

Flink (0.9.1)
▪ Stateful dataflow operators (conceptually similar to Samza)
▪ Two state access patterns
• Local (Task) state
• Partitioned (Key) state
▪ Proper API integration
• Java: OperatorState interface
• Scala: mapWithState, flatMapWithState…
▪ Exactly-once semantics by checkpointing

Flink Word Count
words.keyBy(x => x).mapWithState {
  (word, count: Option[Int]) => {
    val newCount = count.getOrElse(0) + 1
    val output = (word, newCount)
    (output, Some(newCount))
  }
}

Local State Example (Java)
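import org.apache.flink.api.common.state.OperatorState;
import org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;

public class MySource extends RichParallelSourceFunction {
    // Omitted details
    private OperatorState<Long> offset;

    @Override
    public void run(SourceContext ctx) throws Exception {
        // State updates and emits happen under the checkpoint lock,
        // so a checkpoint never observes a half-applied update
        Object checkpointLock = ctx.getCheckpointLock();
        isRunning = true;
        while (isRunning) {
            synchronized (checkpointLock) {
                offset.update(offset.value() + 1);
                //ctx.collect(next);
            }
        }
    }
}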

Flink (0.10)
▪ Internal operators are checkpointed:
• Aggregations
• Window operators
• …
▪ KeyValue state
• Easing common access patterns
▪ Flexible state backend interface
▪ Removes non-partitioned operator state

Performance (Fault tolerance)

Performance (Statefulness)
Checkpoint interval: 5 sec; batch duration: 5 sec

Closing

Summary
▪ Storm (Trident)
+ Consistent state accessible from outside
– Only works well with idempotent states
– States are not part of the operators
▪ Spark
+ Integrates well with the system guarantees
– Limited expressivity
– Immutability increases update complexity
▪ Samza
+ Efficient log-based state updates
+ States are well integrated with the operators
– Lack of exactly-once semantics
– State access is not fully transparent
▪ Flink
+ Lightweight, exactly-once distributed snapshots
+ Flexible checkpointing and state backend interfaces
– Has to coordinate with a persistent source
– Internal operators not checkpointed in 0.9.1

List of Figures (in order of usage)
▪ https://upload.wikimedia.org/wikipedia/commons/thumb/2/2a/CPT-FSM-abcd.svg/326px-CPT-FSM-abcd.svg.png
▪ https://storm.apache.org/images/topology.png
▪ https://databricks.com/wp-content/uploads/2015/07/image11-1024x655.png
▪ https://people.csail.mit.edu/matei/papers/2012/hotcloud_spark_streaming.pdf, page 2.
▪ http://www.slideshare.net/ptgoetz/storm-hadoop-summit2014, page 69-71.
▪ http://samza.apache.org/img/0.9/learn/documentation/container/checkpointing.svg
▪ https://storm.apache.org/documentation/images/spout-vs-state.png
▪ http://samza.apache.org/img/0.9/learn/documentation/container/stateful_job.png
