A Deep Dive into Flink SQL
Jark Wu
Software Engineer at Alibaba
Apache Flink Committer & PMC member
Content
Flink SQL Architecture
How Flink SQL Works?
Flink SQL Optimizations
Summary and Futures
This is a joint effort with the entire
Apache Flink community!
Architecture before Flink 1.9
Does this look unified?
A Step Closer
Different APIs for
streaming and batch
Different translation
path
Different codes for
streaming and batch
What we want
Future Architecture
Flink Task Runtime
Planner
Table API & SQL
Stream Transformation
Stream Operator
stream & batch
Architecture since Flink 1.9+
Flink Task Runtime
Blink Planner
Table API & SQL
Stream Transformation
Stream Operator
Flink Planner
DataSet
Driver
stream & batch    batch    stream
How Flink SQL Works?
SQL
Table API
Logical Plan Physical Plan Transformations JobGraph
Configurable optimizer phases
Catalog
Hive
Metastore
Code Generation
Optimizer
SubQuery
Decorrelation
Filter/Project
PushDown
Join
Reorder
…
Code Optimizations    State-of-the-art Operators    Resource Optimizations
Generated operators
JVM intrinsic
Declarative expressions
Operate on binary data
Cache efficient sorter
Compact binary hash map
Hybrid hash join
Fully managed memory
IO Manager
Off-Heap memory
Flink Cluster
Submit Job
An Example
SELECT
t1.id, 1 + 2 + t1.value AS v
FROM t1 JOIN t2
WHERE
t1.id = t2.id AND
t2.id < 1000
Scan (t1) Scan (t2)
Join
Filter
Project
t1.id = t2.id
t2.id < 1000
t1.id,
1+2+t1.value
SQL Query    Logical Plan
Expression Reduce
Scan (t1) Scan (t2)
Join
Filter
Project
t1.id = t2.id
t2.id < 1000
t1.id,
1+2+t1.value
Logical Plan
Literal(1) Literal(2)
Plus Field(t1.value)
Plus
Expression Tree
Literal(3) Field(t1.value)
Plus
1+2+t1.value 3+t1.value
Evaluate 1+2 for every row
Reduce constant
expressions
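A minimal sketch of the constant reduction above, assuming a toy expression tree with only Literal, Field and Plus nodes. The class and function names are illustrative and are not the planner's own classes; the real optimizer handles arbitrary deterministic expressions.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Literal:
    value: int


@dataclass(frozen=True)
class Field:
    name: str


@dataclass(frozen=True)
class Plus:
    left: "Expr"
    right: "Expr"


Expr = Union[Literal, Field, Plus]


def reduce_constants(expr: Expr) -> Expr:
    """Fold every Plus whose two operands are literals, working bottom-up."""
    if isinstance(expr, Plus):
        left = reduce_constants(expr.left)
        right = reduce_constants(expr.right)
        if isinstance(left, Literal) and isinstance(right, Literal):
            return Literal(left.value + right.value)
        return Plus(left, right)
    return expr  # literals and field references are already minimal


# 1 + 2 + t1.value parses as Plus(Plus(1, 2), t1.value)
tree = Plus(Plus(Literal(1), Literal(2)), Field("t1.value"))
reduced = reduce_constants(tree)
```

The folding happens once at planning time, so the generated operator only evaluates 3 + t1.value per row.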
Filter Push Down
Scan (t1) Scan (t2)
Join
Filter
Project
t1.id = t2.id
t2.id < 1000
t1.id,
3+t1.value
Scan (t1) Scan (t2)
Filter
Join
Project
t1.id = t2.id
t1.id,
3+t1.value
t2.id < 1000
1 million
1 thousand
1 million    1 billion
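A small sketch of why the push down pays off, assuming in-memory lists stand in for the two scans. The table sizes (5000 rows each) and the hash_join helper are made up for illustration and are far smaller than the row counts on the slide.

```python
def hash_join(left, right):
    """Equi-join on id. Returns the joined pairs and the number of input rows."""
    index = {}
    for row in right:
        index.setdefault(row["id"], []).append(row)
    pairs = [(l, r) for l in left for r in index.get(l["id"], [])]
    return pairs, len(left) + len(right)


t1 = [{"id": i, "value": i * 10} for i in range(5000)]
t2 = [{"id": i} for i in range(5000)]

# Plan A: join first, filter afterwards.
joined_a, join_input_a = hash_join(t1, t2)
result_a = [(l, r) for l, r in joined_a if r["id"] < 1000]

# Plan B: the filter is pushed below the join, onto the scan of t2.
t2_filtered = [r for r in t2 if r["id"] < 1000]
result_b, join_input_b = hash_join(t1, t2_filtered)
```

Both plans return the same rows, but plan B feeds fewer rows into the join.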
Projection Push Down
Scan (t1) Scan (t2)
Filter
Join
Project
t1.id = t2.id
t1.id,
3+t1.value
t2.id < 1000
Scan (t1) Scan (t2)
Filter
Join
Project
t1.id = t2.id
t1.id,
3+t1.value
t2.id < 1000
Project Project t2.id
t1.id,
t1.value
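A sketch of the projection push down, assuming rows are plain dictionaries. The extra columns name and payload are hypothetical; they only exist to show that columns the query never reads can be dropped before the join without changing the result.

```python
def project(rows, columns):
    """Keep only the named columns of every row."""
    return [{c: row[c] for c in columns} for row in rows]


def query(t1_rows, t2_rows):
    """Filter t2, join on id, then compute t1.id and 3 + t1.value."""
    t2_filtered = [r for r in t2_rows if r["id"] < 1000]
    return [
        (a["id"], 3 + a["value"])
        for a in t1_rows
        for b in t2_filtered
        if a["id"] == b["id"]
    ]


t1 = [{"id": 1, "value": 10, "name": "a"}, {"id": 2, "value": 20, "name": "b"}]
t2 = [{"id": 1, "payload": "x"}, {"id": 3, "payload": "y"}]

without_pruning = query(t1, t2)
with_pruning = query(project(t1, ["id", "value"]), project(t2, ["id"]))
```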
Physical Planning (Batch)
Optimized Logical Plan
Scan (t1) Scan (t2)
BroadcastHashJoin
Calc
t1.id,
3+t1.value
t2.id < 1000
Calc Calc t2.id t1.id,
t1.value
Physical Plan
Scan (t1) Scan (t2)
Filter
Join
Project
t1.id = t2.id
t1.id,
3+t1.value
t2.id < 1000
Project Project t2.id
t1.id,
t1.value
1 thousand
1 million
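A sketch of the idea behind a broadcast hash join, assuming the small filtered side (t2) is the build side and the large side (t1) is already split into partitions. The function name and sample rows are illustrative only, not the planner's operator.

```python
from collections import defaultdict


def broadcast_hash_join(probe_partitions, build_rows, key):
    """Join every probe partition against a full copy of the small build side."""
    # The hash table is built from the small side. In a cluster each parallel
    # task would receive its own copy of it, which is the "broadcast".
    table = defaultdict(list)
    for row in build_rows:
        table[row[key]].append(row)

    results = []
    for partition in probe_partitions:  # one iteration per parallel task
        for row in partition:
            for match in table.get(row[key], []):
                results.append((row, match))
    return results


t1_partitions = [
    [{"id": 1, "value": 10}, {"id": 2, "value": 20}],
    [{"id": 3, "value": 30}, {"id": 2000, "value": 40}],
]
t2_small = [{"id": 1}, {"id": 3}]

joined = broadcast_hash_join(t1_partitions, t2_small, "id")
ids = [left["id"] for left, _ in joined]
```

The large side never has to be shuffled by key, which is why this join is attractive when one input is small.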
Translation & Code Generation
Scan (t1) Scan (t2)
BroadcastHashJoin
Calc
Calc Calc
Physical Plan
Source
BroadcastHashJoin
Calc
Calc Calc
Source
t2.id < 1000
t2.id
Transformation Tree
code generation
Physical Planning (Stream)
What is a changelog and why do we need it?
Special things for streaming: Changelog Mechanism
aka Retraction Mechanism
Physical Planning (Stream): Changelog Mechanism
SELECT cnt, COUNT(cnt) as freq
FROM (
SELECT word, COUNT(*) as cnt
FROM words
GROUP BY word)
GROUP BY cnt
word
Hello
World
Hello
Source
cnt freq
1 2
2 1
1 1
Expected Result
Physical Planning (Stream): Changelog Mechanism
word
Hello word cnt
Hello 1
word_count
World 1
Hello 2
Hello, 1
World, 1
Hello, 2
SELECT
word,
COUNT(*) as cnt
FROM words
GROUP BY word
World
Hello
Source
cnt freq
1 2
2 1
SELECT
cnt,
COUNT(cnt) as freq
FROM word_count
GROUP BY cnt
1 2
Should be
“1”
Count Frequency
without changelog
Word Count
Physical Planning (Stream): Changelog Mechanism
word
Hello word cnt
Hello 1
word_count
World 1
Hello 2
SELECT
word,
COUNT(*) as cnt
FROM words
GROUP BY word
World
Hello
cnt freq
1 2
2 1
1 1
SELECT
cnt,
COUNT(cnt) as freq
FROM word_count
GROUP BY cnt
with changelog
① Changelog makes the streaming query result correct
② Query optimizer determines whether update_before is needed
③ Users are not aware of it
Hello, 1    insert
World, 1    insert
Hello, 1    update_before
Hello, 2    update_after
Hello    insert
World    insert
Hello    insert
Source Word Count Count Frequency
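The two cases above can be simulated with a short sketch, assuming each change is a (kind, word, cnt) tuple. The function names and the use_retraction flag are illustrative; they are not Flink operators.

```python
from collections import Counter


def word_count(words):
    """Changelog of SELECT word, COUNT(*) AS cnt FROM words GROUP BY word."""
    counts = Counter()
    for word in words:
        old = counts[word]
        counts[word] = old + 1
        if old == 0:
            yield ("insert", word, 1)
        else:
            yield ("update_before", word, old)      # retract the old row
            yield ("update_after", word, old + 1)   # emit the new row


def count_frequency(changelog, use_retraction):
    """SELECT cnt, COUNT(cnt) AS freq FROM word_count GROUP BY cnt."""
    freq = Counter()
    for kind, _word, cnt in changelog:
        if kind == "update_before":
            if use_retraction:
                freq[cnt] -= 1  # undo the contribution of the outdated row
        else:
            freq[cnt] += 1
    return {cnt: n for cnt, n in freq.items() if n > 0}


words = ["Hello", "World", "Hello"]
without_changelog = count_frequency(word_count(words), use_retraction=False)
with_changelog = count_frequency(word_count(words), use_retraction=True)
```

Ignoring update_before leaves the stale row (Hello, 1) counted, so cnt = 1 ends up with freq 2 instead of 1.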
Physical Planning (Stream): Changelog Mechanism
Calc
Source
Aggregate
Aggregate
UpsertSink
[I]:Insert
[U]:Update
[D]: Delete
produce
[INSERT]
produce
[INSERT, UPDATE]
produce
[INSERT, UPDATE]
produce
[INSERT, UPDATE, DELETE]
[I]
[I,U,D]
[I,U]
[I,U]
Step 1: determine what changes
a node will produce
words
word,
count(*) as cnt
cnt
cnt, count(*)
Physical Plan
Physical Planning (Stream): Changelog Mechanism
Calc
Source
Aggregate
Aggregate
UpsertSink
produce
[UPDATE_BEFORE+UPDATE_AFTER]
require
only UPDATE_AFTER
[I]
[I,U,D]
[I,U]
[I,U]
require
UPDATE_BEFORE + UPDATE_AFTER
require
UPDATE_BEFORE + UPDATE_AFTER
require nothing
produce
[UPDATE_BEFORE+UPDATE_AFTER]
produce
[UPDATE_AFTER]
[UB]: Update_Before
[UA]: Update_After
Step 2: determine how to produce updates
[UB+UA]
[UB+UA]
[UA]
Physical Planning (Stream): Changelog Mechanism
Calc
Source
Aggregate
Aggregate
UpsertSink
[I]
[I,U,D]
[I,U]
[I,U]
[UB+UA]
[UB+UA]
[UA]
Simple COUNT implementation,
generates UPDATE_BEFORE
COUNT with retract() implementation,
does not generate UPDATE_BEFORE
words
word,
count(*) as cnt
cnt
cnt, count(*)
Final Physical Plan
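The two steps can be sketched as a toy inference over the linear plan from the slides. This is a simplified model under loud assumptions: the node names follow the slides, but the rules hard-coded here (an aggregate over updating input may delete, an aggregate always asks its input for UPDATE_BEFORE) are illustrative and not the planner's actual rule set.

```python
# Toy model only: plan order is source first, sink last.
PLAN = ["Source", "Aggregate", "Calc", "Aggregate", "UpsertSink"]


def infer_produced_changes(plan):
    """Step 1, source to sink: which change kinds does each node produce?"""
    produced, upstream = [], set()
    for node in plan:
        if node == "Source":
            out = {"I"}
        elif node == "Aggregate":
            # Updating input can shrink a group to nothing, hence the delete.
            out = {"I", "U", "D"} if "U" in upstream else {"I", "U"}
        else:
            out = set(upstream)  # Calc and sink forward what they receive
        produced.append(out)
        upstream = out
    return produced


def infer_update_kinds(plan, produced, sink_requires="UA"):
    """Step 2, sink to source: how must each updating node encode updates?"""
    kinds = [None] * len(plan)
    required = sink_requires
    for i in range(len(plan) - 2, -1, -1):  # walk upstream, skipping the sink
        if "U" in produced[i]:
            kinds[i] = required
        if plan[i] == "Aggregate":
            required = "UB+UA"  # an aggregate needs the old value to retract it
    return kinds


produced = infer_produced_changes(PLAN)
kinds = infer_update_kinds(PLAN, produced)
```

With an upsert sink that only requires UPDATE_AFTER, the last aggregate emits [UA] while the word-count aggregate and the Calc above it emit [UB+UA], matching the final plan on the slide.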
Flink SQL Optimizations
• Internal Data Structure (BinaryRow)
• Mini-Batch Processing
• Aggregation Skew Handling
• Plan Rewrite
Old Planner: Row
Object[]
Integer(2019)
String(“Flink”)
String(“Forward”)
Row
• Space inefficiency (object header)
• Boxing and unboxing
• Serialization and deserialization cost, especially when we want to access fields
randomly
• Row(2019, “Flink”, “Forward”)
• Deeply integrated with MemorySegment
• No need to deserialize / Compact layout / Random accessible
• Also have BinaryString, BinaryArray, BinaryMap
New Blink Planner: BinaryRow
2019 pointer pointer 5 Flink 7 Forward
Memory Segment
Null info    Fixed-length part    Variable-length part
Header (Row Kind)
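A rough sketch of the layout idea, assuming a simplified format: one 8-byte word for header and null bits, one 8-byte slot per field, and string bytes appended at the end. Storing the length inside the pointer slot is a simplification chosen for this sketch; the exact byte layout of the real BinaryRow differs.

```python
import struct

HEADER_SIZE = 8  # row kind + null bits, all zero in this sketch
SLOT_SIZE = 8    # every field owns one fixed-length slot


def write_row(year, first, second):
    """Pack (int, str, str) into header + three slots + variable-length part."""
    fixed_size = HEADER_SIZE + 3 * SLOT_SIZE
    var_part = b""
    slots = [struct.pack("<q", year)]  # the int lives directly in its slot
    for text in (first, second):
        data = text.encode("utf-8")
        offset = fixed_size + len(var_part)
        slots.append(struct.pack("<II", offset, len(data)))  # the "pointer"
        var_part += data
    return struct.pack("<q", 0) + b"".join(slots) + var_part


def read_int(buf, field_index):
    return struct.unpack_from("<q", buf, HEADER_SIZE + field_index * SLOT_SIZE)[0]


def read_string(buf, field_index):
    """Random access: read one string without decoding any other field."""
    offset, length = struct.unpack_from(
        "<II", buf, HEADER_SIZE + field_index * SLOT_SIZE
    )
    return buf[offset:offset + length].decode("utf-8")


row = write_row(2019, "Flink", "Forward")
```

Because every slot sits at a known offset, reading the third field does not require deserializing the first two.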
Blink planner is 54.6% faster than the old planner when object reuse is enabled:
https://www.ververica.com/blog/a-journey-to-beating-flinks-sql-performance
• Each record would cost:
• One state reading and writing
• One deserialization and serialization
• One output
Mini-Batch Processing
Normal aggregation:
SELECT SUM(num) FROM T GROUP BY color
• Use heap memory to hold bundle
• In-memory aggregation before
accessing states and serde operations
• Also ease the downstream loads
Mini-Batch Processing
Mini-Batch aggregation:
table.exec.mini-batch.enabled = true
table.exec.mini-batch.allow-latency = "5000 ms"
table.exec.mini-batch.size = 1000
SELECT SUM(num) FROM T GROUP BY color
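A minimal sketch of the bundling idea, assuming a plain dictionary stands in for keyed state and that the bundle is flushed by size only. The class name and the state_accesses counter are invented for illustration, and the latency-based flush (allow-latency) is left out.

```python
class MiniBatchSum:
    """SUM(num) GROUP BY color with an in-memory bundle in front of the state."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.bundle = {}         # heap buffer: key -> partial sum
        self.buffered = 0
        self.state = {}          # stands in for keyed state
        self.state_accesses = 0

    def process(self, key, num):
        self.bundle[key] = self.bundle.get(key, 0) + num
        self.buffered += 1
        if self.buffered >= self.batch_size:
            return self.flush()
        return []

    def flush(self):
        out = []
        for key, partial in self.bundle.items():
            self.state_accesses += 1  # one state round trip per key, not per record
            self.state[key] = self.state.get(key, 0) + partial
            out.append((key, self.state[key]))
        self.bundle.clear()
        self.buffered = 0
        return out


op = MiniBatchSum(batch_size=3)
emitted = []
for color in ["red", "red", "blue", "red", "blue", "blue"]:
    emitted.extend(op.process(color, 1))
```

Six records cause four state accesses and four outputs instead of six, and the saving grows with the number of records per key in a bundle.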
Aggregation Skew Handling
SELECT SUM(num) FROM T GROUP BY color
table.optimizer.agg-phase-strategy = TWO_PHASE
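A sketch of what two-phase aggregation does with a skewed key, assuming each list is one upstream partition of (color, num) records. The function names and sample data are illustrative only.

```python
from collections import Counter


def local_aggregate(partition):
    """Phase 1: pre-aggregate inside each upstream partition."""
    partial = Counter()
    for color, num in partition:
        partial[color] += num
    return partial


def global_aggregate(partials):
    """Phase 2: merge the partial sums after the shuffle by key."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds the values per key
    return total


# "red" is the hot key: it dominates every partition.
partitions = [
    [("red", 1)] * 4 + [("blue", 2)],
    [("red", 1)] * 3,
    [("red", 1)] * 2 + [("green", 5)],
]

partials = [local_aggregate(p) for p in partitions]
totals = global_aggregate(partials)

raw_records = sum(len(p) for p in partitions)
shuffled_records = sum(len(p) for p in partials)
```

The task that owns "red" receives three partial sums instead of nine raw records, which is how the local phase eases the skew.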
• It’s impractical to do a global streaming sort
• But it becomes possible if user only cares about the top n elements
• E.g. Calculate the top 3 shops for each category
Plan Rewrite (Top-N)
SELECT *
FROM (
SELECT *, -- shopId and other columns are available from this
ROW_NUMBER() OVER (PARTITION BY category ORDER BY sales DESC) AS rowNum
FROM shop_sales)
WHERE rowNum <= 3
OverAggregate
Calc
…
…
Rank
…
…
Original Plan Optimized Plan
Plan Rewrite (Top-N)
rownum <= 3
ROW_NUMBER
partition key = category
sort key = sales
partition key = category
sort key = sales
top = 3
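A sketch of why Top-N avoids a global sort, assuming append-only input rows of (category, shop_id, sales). The helper below keeps a small min-heap per partition key; it is illustrative only, and a real Rank operator also has to handle updates and retractions.

```python
import heapq
from collections import defaultdict


def top_n_per_category(rows, n):
    """Keep only the n best sales per category instead of sorting everything."""
    heaps = defaultdict(list)  # category -> min-heap of (sales, shop_id)
    for category, shop_id, sales in rows:
        heap = heaps[category]
        if len(heap) < n:
            heapq.heappush(heap, (sales, shop_id))
        elif (sales, shop_id) > heap[0]:
            # The new row beats the current n-th place: replace it.
            heapq.heapreplace(heap, (sales, shop_id))
    return {
        category: [(shop, sales) for sales, shop in sorted(heap, reverse=True)]
        for category, heap in heaps.items()
    }


rows = [
    ("food", "s1", 50),
    ("food", "s2", 70),
    ("food", "s3", 20),
    ("food", "s4", 90),
    ("toys", "t1", 5),
]
top3 = top_n_per_category(rows, 3)
```

State per category is bounded by n entries, regardless of how many rows arrive.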
Summary & Futures
• Flink took a big step towards a truly unified architecture
• Introduced how Flink SQL works step by step.
• Flink SQL does a lot of optimizations for users automatically
• Future (Flink 1.11+)
• Blink planner will be the default planner and ready for production
• New TableSource and TableSink interfaces (FLIP-95)
• Support to read changelogs (FLIP-105)
• Unified Batch and Streaming Filesystem connector (FLIP-115)
• Hive DDL & DML compatible (FLIP-123)
Thank You!
Questions?
Virtual Flink Forward 2020: A deep dive into Flink SQL - Jark Wu

 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 

Recently uploaded (20)

Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 

Virtual Flink Forward 2020: A deep dive into Flink SQL - Jark Wu

  • 1. A Deep Dive into Flink SQL. Jark Wu, Software Engineer at Alibaba, Apache Flink Committer & PMC member
  • 2. Content: Flink SQL Architecture; How Flink SQL Works; Flink SQL Optimizations; Summary and Futures
  • 3. This is a joint effort with the entire Apache Flink community!
  • 4. Architecture before Flink 1.9: Does this look unified?
  • 5. A Step Closer: different APIs for streaming and batch; different translation paths; different code for streaming and batch. (Diagram label: What we want)
  • 6. Future Architecture (diagram layers): Table API & SQL; Planner; Stream Transformation; Stream Operator; Flink Task Runtime. A single path serves stream & batch.
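  • 7. Architecture since Flink 1.9+ (diagram): Table API & SQL sits on two planners. Blink Planner: stream & batch through Stream Transformation and Stream Operator. Flink Planner: batch through DataSet and Driver, stream through Stream Transformation and Stream Operator. Everything runs on the Flink Task Runtime.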
  • 8. How Flink SQL Works?
  • 9. How Flink SQL Works? Pipeline: SQL / Table API → Logical Plan → Physical Plan → Transformations → JobGraph → Submit Job → Flink Cluster. Catalog: Hive Metastore. Optimizer (configurable optimizer phases): SubQuery Decorrelation, Filter/Project PushDown, Join Reorder, … Code Generation. Code Optimizations: generated operators, JVM intrinsics, declarative expressions. State-of-the-art Operators: operate on binary data, cache-efficient sorter, compact binary hash map, hybrid hash join. Resource Optimizations: fully managed memory, IO Manager, off-heap memory.
  • 10. An Example. SQL Query: SELECT t1.id, 1 + 2 + t1.value AS v FROM t1 JOIN t2 WHERE t1.id = t2.id AND t2.id < 1000. Logical Plan (bottom-up): Scan (t1), Scan (t2) → Join → Filter (t1.id = t2.id, t2.id < 1000) → Project (t1.id, 1+2+t1.value)
  • 11. Expression Reduce. Logical Plan: Scan (t1), Scan (t2) → Join → Filter (t1.id = t2.id, t2.id < 1000) → Project (t1.id, 1+2+t1.value). Expression Tree for 1+2+t1.value: Plus(Plus(Literal(1), Literal(2)), Field(t1.value)), which would evaluate 1+2 for every row. Reducing constant expressions gives Plus(Literal(3), Field(t1.value)), i.e. 3+t1.value.
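
The constant reduction on slide 11 can be illustrated with a small sketch. This is not Flink's or Calcite's implementation; the `Literal`, `Field`, and `Plus` classes and the `reduce_constants` function are hypothetical names that mirror the node labels on the slide, and only integer addition is handled.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Literal:
    value: int


@dataclass(frozen=True)
class Field:
    name: str


@dataclass(frozen=True)
class Plus:
    left: "Expr"
    right: "Expr"


Expr = Union[Literal, Field, Plus]


def reduce_constants(expr: Expr) -> Expr:
    """Fold Plus nodes whose operands are both literals, working bottom-up."""
    if isinstance(expr, Plus):
        left = reduce_constants(expr.left)
        right = reduce_constants(expr.right)
        if isinstance(left, Literal) and isinstance(right, Literal):
            # Both sides are known at planning time, so compute the sum once.
            return Literal(left.value + right.value)
        return Plus(left, right)
    # Literals and field references are already in their simplest form.
    return expr


# 1 + 2 + t1.value parses as Plus(Plus(1, 2), t1.value).
original = Plus(Plus(Literal(1), Literal(2)), Field("t1.value"))
reduced = reduce_constants(original)
print(reduced)
```

The reduced tree adds the literal 3 to `t1.value`, so the addition 1+2 is done once during planning rather than once per row.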
  • 12. Filter Push Down. Before: Scan (t1), Scan (t2) → Join → Filter (t1.id = t2.id, t2.id < 1000) → Project (t1.id, 3+t1.value). After: Scan (t2) → Filter (t2.id < 1000); Scan (t1) and the filtered t2 → Join (t1.id = t2.id) → Project (t1.id, 3+t1.value). Row-count annotations on the diagram: 1 million, 1 thousand, 1 million, 1 billion.
  • 13. Projection Push Down. Before: Scan (t2) → Filter (t2.id < 1000); Scan (t1) and the filtered t2 → Join (t1.id = t2.id) → Project (t1.id, 3+t1.value). After: a Project (t1.id, t1.value) is added on the t1 side and a Project (t2.id) on the t2 side, below the Join (t1.id = t2.id) → Project (t1.id, 3+t1.value).
  • 14. Physical Planning (Batch). Optimized Logical Plan: Scan (t1) → Project (t1.id, t1.value); Scan (t2) → Filter (t2.id < 1000) → Project (t2.id); Join (t1.id = t2.id) → Project (t1.id, 3+t1.value). Physical Plan: Scan (t1) → Calc (t1.id, t1.value); Scan (t2) → Calc (t2.id < 1000, t2.id); BroadcastHashJoin → Calc (t1.id, 3+t1.value). Row-count annotations on the diagram: 1 thousand, 1 million.
  • 15. Translation & Code Generation. Physical Plan: Scan (t1), Scan (t2), Calc, Calc, BroadcastHashJoin, Calc. Transformation Tree: Source, Source, Calc, Calc, BroadcastHashJoin, Calc. Code generation is illustrated on the Calc for t2 (t2.id < 1000, t2.id).
  • 16. Physical Planning (Stream). Special things for streaming: the Changelog Mechanism, aka the Retraction Mechanism. What is a changelog and why do we need it?
  • 17. Physical Planning (Stream): Changelog Mechanism. Query: SELECT cnt, COUNT(cnt) AS freq FROM (SELECT word, COUNT(*) AS cnt FROM words GROUP BY word) GROUP BY cnt. Source (word): Hello, World, Hello. Expected Result table (cnt, freq) as printed on the slide: 1 2, 2 1, 1 1; the final values for this input are (1, 1) and (2, 1).
  • 18. Physical Planning (Stream): Changelog Mechanism, without changelog. Source (word): Hello, World, Hello. Word Count (SELECT word, COUNT(*) AS cnt FROM words GROUP BY word) emits (Hello, 1), (World, 1), (Hello, 2) into word_count. Count Frequency (SELECT cnt, COUNT(cnt) AS freq FROM word_count GROUP BY cnt) yields (cnt 1, freq 2) and (cnt 2, freq 1); the freq for cnt 1 should be “1”.
  • 19. Physical Planning (Stream): Changelog Mechanism, with changelog. Source emits insert(Hello), insert(World), insert(Hello). Word Count (SELECT word, COUNT(*) AS cnt FROM words GROUP BY word) emits insert(Hello, 1), insert(World, 1), update_before(Hello, 1), update_after(Hello, 2) into word_count. Count Frequency (SELECT cnt, COUNT(cnt) AS freq FROM word_count GROUP BY cnt) result table (cnt, freq) as printed on the slide: 1 2, 2 1, 1 1; the final values are (1, 1) and (2, 1). ① Changelog makes the streaming query result correct ② The query optimizer determines whether update_before (retraction) is needed ③ Users are not aware of it
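
The difference between slides 18 and 19 can be reproduced with a short simulation. It is a sketch of the idea rather than Flink's operators: `word_count` and `count_frequency` are hypothetical functions, and the change kinds are plain strings named after the labels on the slide.

```python
from collections import defaultdict


def word_count(words, emit_update_before):
    """Yield (kind, word, cnt) changes for SELECT word, COUNT(*) ... GROUP BY word."""
    counts = defaultdict(int)
    for word in words:
        old = counts[word]
        counts[word] = old + 1
        if old == 0:
            yield ("insert", word, 1)
        else:
            if emit_update_before:
                # Tell downstream that the previous (word, old) row is no longer valid.
                yield ("update_before", word, old)
            yield ("update_after", word, old + 1)


def count_frequency(changes):
    """Apply the changes to SELECT cnt, COUNT(cnt) ... GROUP BY cnt; return the final table."""
    freq = defaultdict(int)
    for kind, _word, cnt in changes:
        if kind == "update_before":
            freq[cnt] -= 1  # retract the stale row
            if freq[cnt] == 0:
                del freq[cnt]
        else:
            freq[cnt] += 1  # insert and update_after both add a row
    return dict(freq)


words = ["Hello", "World", "Hello"]
without_changelog = count_frequency(word_count(words, emit_update_before=False))
with_changelog = count_frequency(word_count(words, emit_update_before=True))
print(without_changelog)  # {1: 2, 2: 1}
print(with_changelog)     # {1: 1, 2: 1}
```

Without `update_before`, the stale row (Hello, 1) is never retracted, so cnt 1 keeps a frequency of 2; with it, the downstream aggregate can subtract the old row before adding the new one.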
  • 20. Physical Planning (Stream): Changelog Mechanism. Step 1: determine what changes each node will produce. Legend: [I] Insert, [U] Update, [D] Delete. Physical Plan: Source (words) produces [INSERT] ([I]) → Aggregate (word, count(*) as cnt) produces [INSERT, UPDATE] ([I,U]) → Calc (cnt) produces [INSERT, UPDATE] ([I,U]) → Aggregate (cnt, count(*)) produces [INSERT, UPDATE, DELETE] ([I,U,D]) → UpsertSink.
  • 21. Physical Planning (Stream): Changelog Mechanism. Step 2: determine how to produce updates. Legend: [UB] Update_Before, [UA] Update_After. Requirements, from the sink upwards: UpsertSink requires only UPDATE_AFTER; the second Aggregate requires UPDATE_BEFORE + UPDATE_AFTER; Calc requires UPDATE_BEFORE + UPDATE_AFTER; the first Aggregate requires nothing. Resulting output: first Aggregate produces [UPDATE_BEFORE+UPDATE_AFTER] ([I,U] with [UB+UA]); Calc produces [UPDATE_BEFORE+UPDATE_AFTER] ([I,U] with [UB+UA]); second Aggregate produces [UPDATE_AFTER] ([I,U,D] with [UA]); Source stays [I].
  • 22. Physical Planning (Stream): Changelog Mechanism. Final Physical Plan: Source (words) [I] → Aggregate (word, count(*) as cnt) [I,U] [UB+UA]: simple COUNT implementation, generates UPDATE_BEFORE → Calc (cnt) [I,U] [UB+UA] → Aggregate (cnt, count(*)) [I,U,D] [UA]: COUNT with retract() implementation, does not generate UPDATE_BEFORE → UpsertSink.
  • 23. Flink SQL Optimizations • Internal Data Structure (BinaryRow) • Mini-Batch Processing • Aggregation Skew Handling • Plan Rewrite
  • 24. Old Planner: Row. Row(2019, “Flink”, “Forward”) is held as a Row wrapping an Object[] of Integer(2019), String(“Flink”), String(“Forward”). • Space inefficiency (object headers) • Boxing and unboxing • Serialization and deserialization cost, especially when fields are accessed randomly
  • 25. New Blink Planner: BinaryRow. • Deeply integrated with MemorySegment • No need to deserialize / compact layout / randomly accessible • Also has BinaryString, BinaryArray, BinaryMap. Layout inside a Memory Segment: Header (Row Kind), null info, fixed-length part (2019, pointer, pointer), variable-length part (5 Flink, 7 Forward). The Blink planner shows a +54.6% performance gain over the old planner when object reuse is enabled: https://www.ververica.com/blog/a-journey-to-beating-flinks-sql-performance
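
To make the BinaryRow idea on slide 25 concrete, here is a deliberately simplified sketch of a row with a null bitmap, a fixed-length part, and a variable-length part. The byte layout is an assumption chosen for illustration and is not Flink's actual BinaryRow format (the row-kind header is omitted, for instance); `encode_row`, `is_null`, `read_int`, and `read_string` are hypothetical helpers.

```python
import struct


def encode_row(fields):
    """Pack ints inline and strings as (offset, length) slots followed by a byte tail."""
    fixed_size = 8 + 8 * len(fields)  # 8-byte null bitmap, then one 8-byte slot per field
    null_bits = 0
    slots = []
    tail = b""
    for i, value in enumerate(fields):
        if value is None:
            null_bits |= 1 << i
            slots.append(0)
        elif isinstance(value, int):
            slots.append(value)  # fixed-length value stored directly in its slot
        else:
            data = value.encode("utf-8")
            offset = fixed_size + len(tail)
            slots.append((offset << 32) | len(data))  # "pointer" into the tail
            tail += data
    return struct.pack(f"<Q{len(fields)}q", null_bits, *slots) + tail


def is_null(buf, index):
    (null_bits,) = struct.unpack_from("<Q", buf, 0)
    return bool((null_bits >> index) & 1)


def read_int(buf, index):
    """Read one int field directly at its slot, without decoding any other field."""
    (value,) = struct.unpack_from("<q", buf, 8 + 8 * index)
    return value


def read_string(buf, index):
    """Follow the (offset, length) slot into the variable-length part."""
    (slot,) = struct.unpack_from("<q", buf, 8 + 8 * index)
    offset, length = slot >> 32, slot & 0xFFFFFFFF
    return buf[offset:offset + length].decode("utf-8")


row = encode_row([2019, "Flink", "Forward"])
print(len(row), read_int(row, 0), read_string(row, 2))
```

Because every field has a slot at a known position, `read_string(row, 2)` reaches “Forward” without touching the other fields, which is the random-access property the slide highlights.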
  • 26. Mini-Batch Processing. Normal aggregation: SELECT SUM(num) FROM T GROUP BY color. Each record costs: • one state read and one state write • one deserialization and one serialization • one output
  • 27. Mini-Batch Processing. Mini-batch aggregation: table.exec.mini-batch.enabled = true; table.exec.mini-batch.allow-latency = "5000 ms"; table.exec.mini-batch.size = 1000; SELECT SUM(num) FROM T GROUP BY color. • Uses heap memory to hold the bundle • Aggregates in memory before accessing state and doing serde operations • Also eases the load on downstream operators
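
The saving described on slides 26 and 27 can be sketched by counting state accesses. This is only a model of the idea: `CountingState` is a hypothetical dict-backed stand-in for keyed state, and bundles are cut by size only, whereas the configuration on the slide also bounds latency.

```python
from collections import defaultdict


class CountingState:
    """Dict-backed stand-in for keyed state that counts reads and writes."""

    def __init__(self):
        self.data = {}
        self.accesses = 0

    def get(self, key):
        self.accesses += 1
        return self.data.get(key, 0)

    def put(self, key, value):
        self.accesses += 1
        self.data[key] = value


def per_record_sum(records, state):
    """Normal aggregation: one state read and one state write per record."""
    for color, num in records:
        state.put(color, state.get(color) + num)


def mini_batch_sum(records, state, batch_size):
    """Mini-batch: pre-aggregate each bundle on the heap, then touch state once per key."""
    for start in range(0, len(records), batch_size):
        bundle = defaultdict(int)
        for color, num in records[start:start + batch_size]:
            bundle[color] += num
        for color, partial in bundle.items():
            state.put(color, state.get(color) + partial)


records = [("red", 1), ("blue", 2), ("red", 3), ("red", 4), ("blue", 5), ("red", 6)]
plain, batched = CountingState(), CountingState()
per_record_sum(records, plain)
mini_batch_sum(records, batched, batch_size=3)
print(plain.data, plain.accesses)      # {'red': 14, 'blue': 7} 12
print(batched.data, batched.accesses)  # {'red': 14, 'blue': 7} 8
```

Both variants end with the same sums, but the bundled version touches state once per distinct key per bundle instead of once per record, so the gain grows with the number of repeated keys in each bundle.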
  • 28. Aggregation Skew Handling SELECT SUM(num) FROM T GROUP BY color table.optimizer.agg-phase-strategy = TWO_PHASE
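
A two-phase (local then global) aggregation, as enabled by the TWO_PHASE setting on slide 28, can be sketched as follows. The partition contents and the function names `local_aggregate` and `global_aggregate` are assumptions made for the example; the point is only that a hot key is collapsed before the shuffle.

```python
from collections import defaultdict


def local_aggregate(partition):
    """Phase 1: each upstream task pre-aggregates its own records per key."""
    partial = defaultdict(int)
    for color, num in partition:
        partial[color] += num
    return dict(partial)


def global_aggregate(partials):
    """Phase 2: merge the per-task partial sums after the shuffle by key."""
    total = defaultdict(int)
    for partial in partials:
        for color, subtotal in partial.items():
            total[color] += subtotal
    return dict(total)


# Two hypothetical upstream partitions; "red" is the hot (skewed) key.
partitions = [
    [("red", 1)] * 1000 + [("blue", 1)],
    [("red", 1)] * 1000 + [("green", 1)],
]
raw_rows = sum(len(p) for p in partitions)
partials = [local_aggregate(p) for p in partitions]
shuffled_rows = sum(len(p) for p in partials)
result = global_aggregate(partials)
print(raw_rows, shuffled_rows, result)
```

In this example the task that owns "red" receives two partial sums instead of 2000 raw records, which is how the local phase relieves the skewed key.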
  • 29. Plan Rewrite (Top-N). • It’s impractical to do a global streaming sort • But it becomes possible if the user only cares about the top n elements • E.g. calculate the top 3 shops for each category: SELECT * FROM ( SELECT *, /* shopId and other columns are available here */ ROW_NUMBER() OVER (PARTITION BY category ORDER BY sales DESC) AS rowNum FROM shop_sales) WHERE rowNum <= 3
  • 30. Plan Rewrite (Top-N). Original Plan: … → OverAggregate (ROW_NUMBER, partition key = category, sort key = sales) → Calc (rownum <= 3) → … Optimized Plan: … → Rank (partition key = category, sort key = sales, top = 3) → …
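
The reason a dedicated Rank operator is cheaper than a full sort followed by a filter can be sketched with a bounded heap per partition key. This is an illustration under simplifying assumptions, not Flink's Rank operator: each shop is assumed to appear once (no later sales updates), ties are broken by shop id, and `top_n_per_category` is a hypothetical name.

```python
import heapq
from collections import defaultdict


def top_n_per_category(rows, n):
    """Keep only the n best-selling shops per category while scanning the rows once."""
    heaps = defaultdict(list)  # category -> min-heap of (sales, shop_id), at most n entries
    for category, shop_id, sales in rows:
        heap = heaps[category]
        if len(heap) < n:
            heapq.heappush(heap, (sales, shop_id))
        elif (sales, shop_id) > heap[0]:
            # The new row beats the current n-th best, so it replaces it.
            heapq.heapreplace(heap, (sales, shop_id))
    # Emit (category, shop_id, sales, row_num) with row_num ordered by sales DESC.
    result = []
    for category in sorted(heaps):
        ranked = sorted(heaps[category], reverse=True)
        for row_num, (sales, shop_id) in enumerate(ranked, start=1):
            result.append((category, shop_id, sales, row_num))
    return result


rows = [
    ("books", "s1", 50), ("books", "s2", 90), ("books", "s3", 70),
    ("books", "s4", 10), ("toys", "t1", 30), ("toys", "t2", 40),
]
top3 = top_n_per_category(rows, n=3)
for row in top3:
    print(row)
```

Only n entries per category are retained at any time, so memory stays bounded by the number of categories times n instead of growing with the whole input.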
  • 32. Summary & Futures. • Flink took a big step towards a truly unified architecture • The talk introduced how Flink SQL works, step by step • Flink SQL performs many optimizations for users automatically • Future (Flink 1.11+): • Blink planner will be the default planner and ready for production • New TableSource and TableSink interfaces (FLIP-95) • Support for reading changelogs (FLIP-105) • Unified batch and streaming filesystem connector (FLIP-115) • Hive DDL & DML compatibility (FLIP-123)