Flink started with the mission to unify batch and stream processing. We believe that Flink’s architecture is uniquely positioned to be a great engine for streaming, batch and AI workloads at the same time. We will talk about the work we did in this direction.
2. Stratify Inc., 2000 - 2002, Member of Technical Staff
Microsoft, 2002 - 2010, Principal Software Engineer
Facebook, 2010 - 2014, Software Engineer
Alibaba Group, 2014 - Now, Senior Director
About Me
3. EB Total, PB Everyday, 472M Events/sec, 1T Events/Day
About Alibaba
5. Low Latency
Fixed Query
Periodic/Continuous
Batch Jobs
High Latency
Flexible Query
Stream Processing Progressive Processing Batch Analytics
Streaming as the core abstraction, Batch as a special case of Streaming
Scenarios for Data Processing
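The idea that batch is a special case of streaming can be sketched in a few lines. This is a toy illustration, not Flink code: `running_counts` and `batch_counts` are hypothetical names, and the "operator" is a plain Python generator. The same counting logic emits an update per event on any input; on a bounded input, the batch answer is simply the last update.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator


def running_counts(events: Iterable[str]) -> Iterator[Dict[str, int]]:
    """Streaming view: emit an updated count table after every event."""
    counts: Counter = Counter()
    for event in events:
        counts[event] += 1
        yield dict(counts)


def batch_counts(events: Iterable[str]) -> Dict[str, int]:
    """Batch view: run the same operator on a bounded input, keep the final update."""
    result: Dict[str, int] = {}
    for result in running_counts(events):
        pass  # intermediate updates are discarded; only the last one matters
    return result


# A bounded input is just a stream that ends.
bounded_input = ["click", "view", "click"]
stream_updates = list(running_counts(bounded_input))
final_result = batch_counts(bounded_input)
```

Here `stream_updates` holds one snapshot per event, while `final_result` equals the last snapshot.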
6. Runtime
Table API & SQL
Relational
DataStream API
Stream Processing
DataSet API
Batch Processing
Runtime
Distributed Streaming DataFlow
Local
Single JVM
Cluster
Standalone/YARN
Cloud
GCE/EC2
DataStream API
Stream Processing
DataSet API
Batch Processing
Transformation
StreamGraph
Operator Tree
Batch Plan
Optimized Plan
Job Graph
Stream Task & OP Batch Task & Driver
Very different code for Stream & Batch
Different API for Stream and Batch Processing
Flink Architecture
API
7. Apache Flink is the most sophisticated open-source Stream Processor
Stream Processing Engines
Stream Processing
samza Apex
8. Can Apache Flink become the most sophisticated open-source batch processor?
Batch Processing
Batch Processing Engines
9. Batch Micro-benchmark
[Bar chart. Engines compared: Hadoop-2.7.1, Tez-0.7.0, Spark-1.5.1, Flink-0.9.1. Bar labels: 1,480; 2,171; 1,887; 2,157]
Flink is the fastest due to its pipelined execution
Tez and Spark do not overlap 1st and 2nd stages
MapReduce is slow despite overlapping stages
Result of sorting 80GB/node (3.2TB)
A Comparative Performance Evaluation of Flink, Dongwon Kim, POSTECH, Flink Forward 2015
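The difference between pipelined and staged execution can be illustrated with a toy model. This sketch is an assumption-laden illustration, not a model of the benchmark above: `stage_one`, `stage_two`, `run_pipelined` and `run_staged` are hypothetical names, and a real sort is partly blocking. It only shows the order in which two stages touch records when the second stage consumes the first lazily versus after full materialization.

```python
from typing import Iterable, Iterator, List, Tuple


def stage_one(records: Iterable[int], log: List[str]) -> Iterator[int]:
    """First stage: doubles each record and logs when it ran."""
    for r in records:
        log.append(f"s1:{r}")
        yield r * 2


def stage_two(records: Iterable[int], log: List[str]) -> Iterator[int]:
    """Second stage: adds one to each record and logs when it ran."""
    for r in records:
        log.append(f"s2:{r}")
        yield r + 1


def run_pipelined(records: Iterable[int]) -> Tuple[List[int], List[str]]:
    """Stage two pulls from stage one record by record, so the stages overlap."""
    log: List[str] = []
    out = list(stage_two(stage_one(records, log), log))
    return out, log


def run_staged(records: Iterable[int]) -> Tuple[List[int], List[str]]:
    """Stage one is fully materialized before stage two starts."""
    log: List[str] = []
    intermediate = list(stage_one(records, log))
    out = list(stage_two(intermediate, log))
    return out, log


pipelined_out, pipelined_log = run_pipelined([1, 2])
staged_out, staged_log = run_staged([1, 2])
```

Both runs produce the same output; only the interleaving in the two logs differs.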
16. Unified Operator Framework
Unified Operator Abstraction
• Operators can choose inputs
• Operators can be chained easily
• Helps batch as well as streaming
Unified Operator
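One way to picture "operators can choose inputs" is a hash join that reads its build side to the end before touching the probe side. The sketch below is a simplified illustration under stated assumptions: `HashJoinOperator`, `next_input`, `end_input` and `run_two_input` are hypothetical names, not Flink's operator API, and the driver loop stands in for the runtime.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

BUILD, PROBE = 0, 1
Record = Tuple[int, str]


class HashJoinOperator:
    """Two-input operator that tells the runtime which input it wants next."""

    def __init__(self) -> None:
        self.table: Dict[int, List[str]] = defaultdict(list)
        self.build_finished = False

    def next_input(self) -> int:
        # Read the build side completely before any probe record.
        return PROBE if self.build_finished else BUILD

    def process(self, input_id: int, record: Record) -> List[Tuple[int, str, str]]:
        key, value = record
        if input_id == BUILD:
            self.table[key].append(value)
            return []
        return [(key, b, value) for b in self.table.get(key, [])]

    def end_input(self, input_id: int) -> None:
        if input_id == BUILD:
            self.build_finished = True


def run_two_input(op: HashJoinOperator, build: Iterable[Record],
                  probe: Iterable[Record]) -> List[Tuple[int, str, str]]:
    """Hypothetical runtime loop: always reads from the input the operator selects."""
    inputs = {BUILD: iter(build), PROBE: iter(probe)}
    open_inputs = {BUILD, PROBE}
    output: List[Tuple[int, str, str]] = []
    while open_inputs:
        chosen = op.next_input()
        try:
            record = next(inputs[chosen])
        except StopIteration:
            op.end_input(chosen)
            open_inputs.discard(chosen)
            continue
        output.extend(op.process(chosen, record))
    return output


joined = run_two_input(
    HashJoinOperator(),
    build=[(1, "a"), (2, "b")],
    probe=[(2, "x"), (1, "y"), (3, "z")],
)
```

Because the operator controls the read order, the probe side never has to be buffered while the hash table is still being built.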
17. Table API & SQL
Relational
DataStream API
Stream Processing
DataSet API
Batch Processing
Runtime
Distributed Streaming DataFlow
Local
Single JVM
Cluster
Standalone/YARN
Cloud
GCE/EC2
New Design / Old Design
Flink Architecture: New Design
DataSet
API
Runtime
DAG API & Stream Operators
Local
Single JVM
Cluster
Standalone/YARN
Cloud
ECS/EC2
Query Processor
Query Optimization & Query Execution
DataStream
API
SQL/TableAPI
Relational
22. A B
Only One-Input Operators can be chained
A B
Multi-Input Operators can also be chained
Flexible Chaining
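Chaining can be thought of as fusing operators into one task so that records pass by direct method call instead of crossing a network or serialization boundary. The sketch below is a minimal illustration with hypothetical classes (`Map`, `TaggingUnion`), not Flink's chaining implementation: a one-input operator A is chained into the left input of a two-input operator B, which is itself chained to a downstream map.

```python
from typing import Any, Callable, List, Tuple


class Map:
    """One-input operator: applies fn and hands the result straight to downstream."""

    def __init__(self, fn: Callable[[Any], Any], downstream: Callable[[Any], None]) -> None:
        self.fn = fn
        self.downstream = downstream

    def process(self, record: Any) -> None:
        self.downstream(self.fn(record))


class TaggingUnion:
    """Two-input operator: tags each record with the side it arrived on."""

    def __init__(self, downstream: Callable[[Tuple[str, Any]], None]) -> None:
        self.downstream = downstream

    def process_left(self, record: Any) -> None:
        self.downstream(("L", record))

    def process_right(self, record: Any) -> None:
        self.downstream(("R", record))


# Build the chain back to front: A -> B(left) -> C -> sink, all in one "task".
sink: List[str] = []
c = Map(lambda t: f"{t[0]}:{t[1]}", sink.append)
b = TaggingUnion(c.process)
a = Map(lambda x: x * 10, b.process_left)

a.process(1)        # flows through A, B and C by plain function calls
b.process_right(7)  # the second input of B enters the same chain
a.process(2)
```

After these calls `sink` holds the records in arrival order, each produced without leaving the chain.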
23. Fixed length part: 0x000… | 321 | 32L 7 | 39L 5
Variable length part: “awesome” “flink”
Record Format
• Introduced new row format: BinaryRow
• Tight integration with memory management
• Avoid deserialization cost
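The fixed-plus-variable layout can be sketched with Python's `struct` module. This is a simplified illustration, not the actual BinaryRow implementation: the 8-byte header, the 8-byte slots, and the packing of offset and length as two 32-bit integers are assumptions chosen so that the example row reproduces the offsets on the slide (32 and 39). All function names are hypothetical.

```python
import struct

HEADER_SIZE = 8  # assumed: space for null bits, all zero here
SLOT_SIZE = 8    # assumed: one 8-byte slot per field


def encode_row(number: int, first: str, second: str) -> bytes:
    """Encode (int, string, string) as header + fixed slots + variable part."""
    fixed_size = HEADER_SIZE + 3 * SLOT_SIZE
    var_part = bytearray()
    slots = [struct.pack("<q", number)]  # the int lives directly in its slot
    for text in (first, second):
        data = text.encode("utf-8")
        offset = fixed_size + len(var_part)
        # A string slot stores only (offset, length) into the variable part.
        slots.append(struct.pack("<II", offset, len(data)))
        var_part += data
    return bytes(HEADER_SIZE) + b"".join(slots) + bytes(var_part)


def read_int(row: bytes, index: int) -> int:
    """Read one int field without touching any other field."""
    return struct.unpack_from("<q", row, HEADER_SIZE + index * SLOT_SIZE)[0]


def read_string(row: bytes, index: int) -> str:
    """Follow the slot's (offset, length) pointer and decode only that field."""
    offset, length = struct.unpack_from("<II", row, HEADER_SIZE + index * SLOT_SIZE)
    return row[offset:offset + length].decode("utf-8")


row = encode_row(321, "awesome", "flink")
second_slot = struct.unpack_from("<II", row, HEADER_SIZE + 1 * SLOT_SIZE)
third_slot = struct.unpack_from("<II", row, HEADER_SIZE + 2 * SLOT_SIZE)
```

Because every field sits at a computable position, a single field can be read in place, which is where the saved deserialization cost comes from.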
28. Reliability Improvements
Failover
• Region Based Failover
• JM Failover
• Blacklist
• ….
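Region-based failover can be sketched as a graph problem: tasks connected by pipelined edges must restart together, while a blocking (materialized) edge separates regions. The code below is a simplified illustration with hypothetical names and a made-up five-task job; it ignores cases where a lost intermediate result forces an upstream region to rerun as well.

```python
from typing import Dict, Iterable, List, Set, Tuple


def failover_regions(tasks: Iterable[str],
                     pipelined_edges: Iterable[Tuple[str, str]]) -> List[Set[str]]:
    """Group tasks into connected components over pipelined edges (union-find)."""
    task_list = list(tasks)
    parent: Dict[str, str] = {t: t for t in task_list}

    def find(t: str) -> str:
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path halving
            t = parent[t]
        return t

    for a, b in pipelined_edges:
        parent[find(a)] = find(b)

    regions: Dict[str, Set[str]] = {}
    for t in task_list:
        regions.setdefault(find(t), set()).add(t)
    return list(regions.values())


def tasks_to_restart(failed: str, tasks: Iterable[str],
                     pipelined_edges: Iterable[Tuple[str, str]]) -> Set[str]:
    """Return only the region containing the failed task."""
    for region in failover_regions(tasks, pipelined_edges):
        if failed in region:
            return region
    raise KeyError(failed)


# Hypothetical job: two map->sort pipelines feeding a sink over blocking edges,
# so the sort->sink edges are not listed as pipelined.
job_tasks = ["map1", "sort1", "map2", "sort2", "sink"]
job_pipelined = [("map1", "sort1"), ("map2", "sort2")]
restart_set = tasks_to_restart("sort1", job_tasks, job_pipelined)
```

In this example a failure of `sort1` restarts two tasks instead of all five.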
More Details? "Runtime Improvements for Flink Batch Processing", Feng Wang, Alibaba
Sept 4th, 2018, 5:10 PM - 5:50 PM, Maschinenhaus
Shuffle
• Decoupled from TM
• Yarn Shuffle Service
• Async mode
29. Next Steps
Grand Unification of Data Processing
• Switch between batch processing and streaming seamlessly
Flink Machine Learning/AI
• PyFlink, TableAPI, DL Integration, Flink ML Improvements
30. Already the best stream processor
Becoming the best batch processor
Unified approach benefits both batch & stream processing
Working on seamless experience for AI
Takeaway
31. Flink Forward China
Dec 20th-21st @ Beijing National Conference Center
First Flink Forward Conference in Asia, 3000+ participants expected
Flink Community
Joint efforts by all major players in Flink community from China
Call For Talks & Sponsors
Submit your talks to flink-forward-china@list.alibaba-inc.com