SlideShare a Scribd company logo
1 of 72
Hive Bucketing in
Apache Spark
Tejas Patil
Facebook
Agenda
• Why bucketing ?
• Why is shuffle bad ?
• How to avoid shuffle ?
• When to use bucketing ?
• Spark’s bucketing support
• Bucketing semantics of Spark vs Hive
• Hive bucketing support in Spark
• SQL Planner improvements
Why bucketing ?
Table A Table B
. . . . . . . . . . . .JOIN
Table A Table B
. . . . . . . . . . . .JOIN
Broadcast hash join
- Ship smaller table to all nodes
- Stream the other table
Table A Table B
. . . . . . . . . . . .JOIN
Broadcast hash join Shuffle hash join
- Ship smaller table to all nodes
- Stream the other table
- Shuffle both tables,
- Hash smaller one, stream the
bigger one
Table A Table B
. . . . . . . . . . . .JOIN
Broadcast hash join Shuffle hash join
- Ship smaller table to all nodes
- Stream the other table
- Shuffle both tables,
- Hash smaller one, stream the
bigger one
Sort merge join
- Shuffle both tables,
- Sort both tables, buffer one,
stream the bigger one
Table A Table B
. . . . . . . . . . . .JOIN
Broadcast hash join Shuffle hash join Sort merge join
- Ship smaller table to all nodes
- Stream the other table
- Shuffle both tables,
- Hash smaller one, stream the
bigger one
- Shuffle both tables,
- Sort both tables, buffer one,
stream the bigger one
Table A Table B
. . . . . . . . . . . .
Sort Merge Join
Table A Table B
. . . . . . . . . . . .
Shuffle ShuffleShuffleShuffle
Sort Merge Join
Table A Table B
. . . . . . . . . . . .
Sort Sort Sort Sort
Shuffle ShuffleShuffleShuffle
Sort Merge Join
Table A Table B
. . . . . . . . . . . .
Sort
Join
Sort
Join
Sort
Join
Sort
Join
Shuffle ShuffleShuffleShuffle
Sort Merge Join
Sort Merge Join
Why is shuffle bad ?
Table A Table B
. . . . . . . . . . . .
Disk Disk
Disk IO for shuffle outputs
Why is shuffle bad ?
Table A Table B
. . . . . . . . . . . .
Shuffle Shuffle Shuffle Shuffle
M * R network connections
Why is shuffle bad ?
How to avoid shuffle ?
student s JOIN attendance a
ON s.id = a.student_id
student s JOIN results r
ON s.id = r.student_id
student s JOIN course_registration c
ON s.id = c.student_id
……..
……..
student s JOIN attendance a
ON s.id = a.student_id
student s JOIN results r
ON s.id = r.student_id
student s JOIN attendance a
ON s.id = a.student_id
student s JOIN results r
ON s.id = r.student_id
Pre-compute at
table creation
Bucketing =
pre-(shuffle + sort) inputs
on join keys
Without
bucketing
Job(s) populating input table
Without
bucketing
With
bucketing
Job(s) populating input table
Performance comparison
0
100
200
300
400
500
600
700
800
900
Before After Bucketing
Reserved CPU time (days)
0
1
2
3
4
5
6
7
8
Before After Bucketing
Latency (hours)
When to use bucketing ?
• Tables used frequently in JOINs with same key
• Student -> student_id
• Employee -> employee_id
• Users -> user_id
When to use bucketing ?
• Tables used frequently in JOINs with same key
• Student -> student_id
• Employee -> employee_id
• Users -> user_id
…. => Dimension tables
When to use bucketing ?
• Tables used frequently in JOINs with same key
• Student -> student_id
• Employee -> employee_id
• Users -> user_id
• Loading daily cumulative tables
• Both base and delta tables could be bucketed on a common column
When to use bucketing ?
• Tables used frequently in JOINs with same key
• Student -> student_id
• Employee -> employee_id
• Users -> user_id
• Loading daily cumulative tables
• Both base and delta tables could be bucketed on a common column
• Indexing capability
Spark’s bucketing support
Creation of bucketed tables
df.write
.bucketBy(numBuckets, "col1", ...)
.sortBy("col1", ...)
.saveAsTable("bucketed_table")
via Dataframe API
Creation of bucketed tables
CREATE TABLE bucketed_table(
column1 INT,
...
) USING parquet
CLUSTERED BY (column1, …)
SORTED BY (column1, …)
INTO `n` BUCKETS
via SQL statement
Check bucketing spec
scala> sparkContext.sql("DESC FORMATTED student").collect.foreach(println)
[# col_name,data_type,comment]
[student_id,int,null]
[name,int,null]
[# Detailed Table Information,,]
[Database,default,]
[Table,table1,]
[Owner,tejas,]
[Created,Fri May 12 08:06:33 PDT 2017,]
[Type,MANAGED,]
[Num Buckets,64,]
[Bucket Columns,[`student_id`],]
[Sort Columns,[`student_id`],]
[Properties,[serialization.format=1],]
[Serde Library,org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe,]
[InputFormat,org.apache.hadoop.mapred.SequenceFileInputFormat,]
Bucketing config
SET spark.sql.sources.bucketing.enabled=true
[SPARK-15453] Extract bucketing info in
FileSourceScanExec
SELECT * FROM tableA JOIN tableB ON tableA.i= tableB.i AND tableA.j= tableB.j
[SPARK-15453] Extract bucketing info in
FileSourceScanExec
SELECT * FROM tableA JOIN tableB ON tableA.i= tableB.i AND tableA.j= tableB.j
Sort Merge Join
Sort
Exchange
TableScan
Sort
Exchange
TableScan Already bucketed on columns (i, j)
Shuffle on columns (i, j)
Sort on columns (i, j)
Join on [i,j], [i,j] respectively of relations
[SPARK-15453] Extract bucketing info in
FileSourceScanExec
SELECT * FROM tableA JOIN tableB ON tableA.i= tableB.i AND tableA.j= tableB.j
Sort Merge Join
Sort
Exchange
TableScan
Sort
Exchange
TableScan Already bucketed on columns (i, j)
Shuffle on columns (i, j)
Sort on columns (i, j)
Join on [i,j], [i,j] respectively of relations
[SPARK-15453] Extract bucketing info in
FileSourceScanExec
SELECT * FROM tableA JOIN tableB ON tableA.i= tableB.i AND tableA.j= tableB.j
Sort Merge Join
TableScan TableScan
Sort Merge Join
TableScan TableScanSort
Exchange
Sort
Exchange
Bucketing semantics of
Spark vs Hive
Bucketing semantics of Spark vs Hive
bucket 0 bucket 1
bucket
(n-1)
…..
Hive
my_bucketed_table
Bucketing semantics of Spark vs Hive
bucket 0 bucket 1
bucket
(n-1)
…..
Hive Spark
bucket 0 bucket 1
bucket
(n-1)
…..
bucket 0
bucket 0
bucket 1
bucket
(n-1)bucket
(n-1)bucket
(n-1)
my_bucketed_tablemy_bucketed_table
Bucketing semantics of Spark vs Hive
bucket 0 bucket 1
bucket
(n-1)
…..
Hive
M M M M M
R R R
…..
Bucketing semantics of Spark vs Hive
bucket 0 bucket 1
bucket
(n-1)
…..
Hive Spark
M M M M M
R R R
…..
bucket 0 bucket 1
bucket
(n-1)
…..
M M M M M…..
bucket 0
bucket 0
bucket 1
bucket
(n-1)bucket
(n-1)bucket
(n-1)
Bucketing semantics of Spark vs Hive
Hive Spark
Model Optimizes reads, writes are costly Writes are cheaper, reads are
costlier
Bucketing semantics of Spark vs Hive
Hive Spark
Model Optimizes reads, writes are costly Writes are cheaper, reads are
costlier
Unit of bucketing A single file represents a single
bucket
A collection of files together
comprise a single bucket
Bucketing semantics of Spark vs Hive
Hive Spark
Model Optimizes reads, writes are costly Writes are cheaper, reads are
costlier
Unit of bucketing A single file represents a single
bucket
A collection of files together
comprise a single bucket
Sort ordering Each bucket is sorted globally. This
makes it easy to optimize queries
reading this data
Each file is individually sorted but
the “bucket” as a whole is not
globally sorted
Bucketing semantics of Spark vs Hive
Hive Spark
Model Optimizes reads, writes are costly Writes are cheaper, reads are
costlier
Unit of bucketing A single file represents a single
bucket
A collection of files together
comprise a single bucket
Sort ordering Each bucket is sorted globally. This
makes it easy to optimize queries
reading this data
Each file is individually sorted but
the “bucket” as a whole is not
globally sorted
Hashing function Hive’s inbuilt hash Murmur3Hash
[SPARK-19256] Hive bucketing support
• Introduce Hive’s hashing function [SPARK-17495]
• Enable creating hive bucketed tables [SPARK-17729]
• Support Hive’s `Bucketed file format`
• Propagate Hive bucketing information to planner [SPARK-17654]
• Expressing outputPartitioning and requiredChildDistribution
• Creating empty bucket files when no data for a bucket
• Allow Sort Merge join over tables with number of buckets
multiple of each other
• Support N-way Sort Merge Join
[SPARK-19256] Hive bucketing support
• Introduce Hive’s hashing function [SPARK-17495]
• Enable creating hive bucketed tables [SPARK-17729]
• Support Hive’s `Bucketed file format`
• Propagate Hive bucketing information to planner [SPARK-17654]
• Expressing outputPartitioning and requiredChildDistribution
• Creating empty bucket files when no data for a bucket
• Allow Sort Merge join over tables with number of buckets
multiple of each other
• Support N-way Sort Merge Join
Merged in
upstream
[SPARK-19256] Hive bucketing support
• Introduce Hive’s hashing function [SPARK-17495]
• Enable creating hive bucketed tables [SPARK-17729]
• Support Hive’s `Bucketed file format`
• Propagate Hive bucketing information to planner [SPARK-17654]
• Expressing outputPartitioning and requiredChildDistribution
• Creating empty bucket files when no data for a bucket
• Allow Sort Merge join over tables with number of buckets
multiple of each other
• Support N-way Sort Merge Join
FB-prod (6 months)
SQL Planner improvements
[SPARK-19122] Unnecessary shuffle+sort added if
join predicates ordering differ from bucketing and
sorting order
SELECT * FROM a JOIN b ON a.x=b.x AND a.y=b.y
[SPARK-19122] Unnecessary shuffle+sort added if
join predicates ordering differ from bucketing and
sorting order
Sort Merge Join
TableScan TableScan
SELECT * FROM a JOIN b ON a.x=b.x AND a.y=b.y
[SPARK-19122] Unnecessary shuffle+sort added if
join predicates ordering differ from bucketing and
sorting order
SELECT * FROM a JOIN b ON a.x=b.x AND a.y=b.y
Sort Merge Join
TableScan TableScan
SELECT * FROM a JOIN b ON a.y=b.y AND a.x=b.x
[SPARK-19122] Unnecessary shuffle+sort added if
join predicates ordering differ from bucketing and
sorting order
Sort Merge Join
Exchange
TableScan
Sort
Exchange
TableScan
Sort
Sort Merge Join
TableScan TableScan
SELECT * FROM a JOIN b ON a.x=b.x AND a.y=b.y SELECT * FROM a JOIN b ON a.y=b.y AND a.x=b.x
[SPARK-19122] Unnecessary shuffle+sort added if
join predicates ordering differ from bucketing and
sorting order
Sort Merge Join
Exchange
TableScan
Sort
Exchange
TableScan
Sort
Sort Merge Join
TableScan TableScan
SELECT * FROM a JOIN b ON a.x=b.x AND a.y=b.y SELECT * FROM a JOIN b ON a.y=b.y AND a.x=b.x
[SPARK-19122] Unnecessary shuffle+sort added if
join predicates ordering differ from bucketing and
sorting order
Sort Merge Join
TableScan TableScan
SELECT * FROM a JOIN b ON a.x=b.x AND a.y=b.y SELECT * FROM a JOIN b ON a.y=b.y AND a.x=b.x
Sort Merge Join
TableScan TableScan
[SPARK-17271] Planner adds un-necessary Sort
even if child ordering is semantically same as
required ordering
SELECT * FROM table1 a JOIN table2 b ON a.col1=b.col1
[SPARK-17271] Planner adds un-necessary Sort
even if child ordering is semantically same as
required ordering
SELECT * FROM table1 a JOIN table2 b ON a.col1=b.col1
Sort Merge Join
Sort
TableScan
Sort
TableScan Already bucketed on column `col1`
Sort on columns (a.col1 and b.col1)
Join on [col1], [col1] respectively of relations
[SPARK-17271] Planner adds un-necessary Sort
even if child ordering is semantically same as
required ordering
SELECT * FROM table1 a JOIN table2 b ON a.col1=b.col1
Sort Merge Join
TableScan TableScan Already bucketed on column `col1`
Sort on columns (a.col1 and b.col1)Sort Sort
Join on [col1], [col1] respectively of relations
[SPARK-17271] Planner adds un-necessary Sort
even if child ordering is semantically same as
required ordering
SELECT * FROM table1 a JOIN table2 b ON a.col1=b.col1
Sort Merge Join
TableScan TableScan
Sort Sort
Sort Merge Join
TableScan TableScan
[SPARK-17698] Join predicates should not
contain filter clauses
SELECT a.id, b.id FROM table1 a FULL OUTER JOIN table2 b ON a.id = b.id AND a.id='1' AND b.id='1'
[SPARK-17698] Join predicates should not
contain filter clauses
SELECT a.id, b.id FROM table1 a FULL OUTER JOIN table2 b ON a.id = b.id AND a.id='1' AND b.id='1'
Sort Merge Join
TableScan TableScan
Sort
Exchange
Sort
Exchange
Already bucketed on column [id]
Shuffle on columns [id, 1, id] and [id, id, 1]
Sort on columns [id, id, 1] and [id, 1, id]
Join on [id, id, 1], [id, 1, id] respectively of relations
[SPARK-17698] Join predicates should not
contain filter clauses
SELECT a.id, b.id FROM table1 a FULL OUTER JOIN table2 b ON a.id = b.id AND a.id='1' AND b.id='1'
Sort Merge Join
TableScan TableScan
Sort
Exchange
Sort
Exchange
Already bucketed on column [id]
Shuffle on columns [id, 1, id] and [id, id, 1]
Sort on columns [id, id, 1] and [id, 1, id]
Join on [id, id, 1], [id, 1, id] respectively of relations
[SPARK-17698] Join predicates should not
contain filter clauses
SELECT a.id, b.id FROM table1 a FULL OUTER JOIN table2 b ON a.id = b.id AND a.id='1' AND b.id='1'
Sort Merge Join
TableScan TableScan
Sort
Exchange
Sort
Exchange
Already bucketed on column [id]
Shuffle on columns [id, 1, id] and [id, id, 1]
Sort on columns [id, id, 1] and [id, 1, id]
Join on [id, id, 1], [id, 1, id] respectively of relations
[SPARK-17698] Join predicates should not
contain filter clauses
SELECT a.id, b.id FROM table1 a FULL OUTER JOIN table2 b ON a.id = b.id AND a.id='1' AND b.id='1'
Sort Merge Join
TableScan TableScan
Sort Merge Join
TableScan TableScanSort
Exchange
Sort
Exchange
[SPARK-13450] Introduce
ExternalAppendOnlyUnsafeRowArray
[SPARK-13450] Introduce
ExternalAppendOnlyUnsafeRowArray
Buffered
relation
Streamed
relation
If there are a lot of rows to be
buffered (eg. Skew),
the query would OOM
row with
key=`foo`
rows with
key=`foo`
[SPARK-13450] Introduce
ExternalAppendOnlyUnsafeRowArray
Buffered
relation
Streamed
relation
row with
key=`foo`
rows with
key=`foo`
Disk
buffer would spill to disk
after reaching a threshold
spill
Summary
• Shuffle and sort is costly due to disk and network IO
• Bucketing will pre-(shuffle and sort) the inputs
• Expect at least 2-5x gains after bucketing input tables for joins
• Candidates for bucketing:
• Tables used frequently in JOINs with same key
• Loading of data in cumulative tables
• Spark’s bucketing is not compatible with Hive’s bucketing
Questions ?

More Related Content

What's hot

The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesDatabricks
 
Apache Flink in the Cloud-Native Era
Apache Flink in the Cloud-Native EraApache Flink in the Cloud-Native Era
Apache Flink in the Cloud-Native EraFlink Forward
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache SparkSamy Dindane
 
Beyond SQL: Speeding up Spark with DataFrames
Beyond SQL: Speeding up Spark with DataFramesBeyond SQL: Speeding up Spark with DataFrames
Beyond SQL: Speeding up Spark with DataFramesDatabricks
 
Adaptive Query Execution: Speeding Up Spark SQL at Runtime
Adaptive Query Execution: Speeding Up Spark SQL at RuntimeAdaptive Query Execution: Speeding Up Spark SQL at Runtime
Adaptive Query Execution: Speeding Up Spark SQL at RuntimeDatabricks
 
Bucketing 2.0: Improve Spark SQL Performance by Removing Shuffle
Bucketing 2.0: Improve Spark SQL Performance by Removing ShuffleBucketing 2.0: Improve Spark SQL Performance by Removing Shuffle
Bucketing 2.0: Improve Spark SQL Performance by Removing ShuffleDatabricks
 
Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)Ryan Blue
 
Deep Dive into the New Features of Apache Spark 3.0
Deep Dive into the New Features of Apache Spark 3.0Deep Dive into the New Features of Apache Spark 3.0
Deep Dive into the New Features of Apache Spark 3.0Databricks
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache SparkRahul Jain
 
C* Summit 2013: The World's Next Top Data Model by Patrick McFadin
C* Summit 2013: The World's Next Top Data Model by Patrick McFadinC* Summit 2013: The World's Next Top Data Model by Patrick McFadin
C* Summit 2013: The World's Next Top Data Model by Patrick McFadinDataStax Academy
 
Introduction to spark
Introduction to sparkIntroduction to spark
Introduction to sparkDuyhai Doan
 
Presto on Apache Spark: A Tale of Two Computation Engines
Presto on Apache Spark: A Tale of Two Computation EnginesPresto on Apache Spark: A Tale of Two Computation Engines
Presto on Apache Spark: A Tale of Two Computation EnginesDatabricks
 
Apache Spark Introduction
Apache Spark IntroductionApache Spark Introduction
Apache Spark Introductionsudhakara st
 
Spark SQL Join Improvement at Facebook
Spark SQL Join Improvement at FacebookSpark SQL Join Improvement at Facebook
Spark SQL Join Improvement at FacebookDatabricks
 
Hudi architecture, fundamentals and capabilities
Hudi architecture, fundamentals and capabilitiesHudi architecture, fundamentals and capabilities
Hudi architecture, fundamentals and capabilitiesNishith Agarwal
 
Spark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in Spark
Spark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in SparkSpark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in Spark
Spark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in SparkBo Yang
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudNoritaka Sekiyama
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guideRyan Blue
 
Apache Spark Core—Deep Dive—Proper Optimization
Apache Spark Core—Deep Dive—Proper OptimizationApache Spark Core—Deep Dive—Proper Optimization
Apache Spark Core—Deep Dive—Proper OptimizationDatabricks
 

What's hot (20)

The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization Opportunities
 
Apache Flink in the Cloud-Native Era
Apache Flink in the Cloud-Native EraApache Flink in the Cloud-Native Era
Apache Flink in the Cloud-Native Era
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Spark
 
Beyond SQL: Speeding up Spark with DataFrames
Beyond SQL: Speeding up Spark with DataFramesBeyond SQL: Speeding up Spark with DataFrames
Beyond SQL: Speeding up Spark with DataFrames
 
Adaptive Query Execution: Speeding Up Spark SQL at Runtime
Adaptive Query Execution: Speeding Up Spark SQL at RuntimeAdaptive Query Execution: Speeding Up Spark SQL at Runtime
Adaptive Query Execution: Speeding Up Spark SQL at Runtime
 
Bucketing 2.0: Improve Spark SQL Performance by Removing Shuffle
Bucketing 2.0: Improve Spark SQL Performance by Removing ShuffleBucketing 2.0: Improve Spark SQL Performance by Removing Shuffle
Bucketing 2.0: Improve Spark SQL Performance by Removing Shuffle
 
Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)
 
Deep Dive into the New Features of Apache Spark 3.0
Deep Dive into the New Features of Apache Spark 3.0Deep Dive into the New Features of Apache Spark 3.0
Deep Dive into the New Features of Apache Spark 3.0
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Spark
 
C* Summit 2013: The World's Next Top Data Model by Patrick McFadin
C* Summit 2013: The World's Next Top Data Model by Patrick McFadinC* Summit 2013: The World's Next Top Data Model by Patrick McFadin
C* Summit 2013: The World's Next Top Data Model by Patrick McFadin
 
Introduction to spark
Introduction to sparkIntroduction to spark
Introduction to spark
 
Presto on Apache Spark: A Tale of Two Computation Engines
Presto on Apache Spark: A Tale of Two Computation EnginesPresto on Apache Spark: A Tale of Two Computation Engines
Presto on Apache Spark: A Tale of Two Computation Engines
 
Apache Spark Introduction
Apache Spark IntroductionApache Spark Introduction
Apache Spark Introduction
 
Spark SQL Join Improvement at Facebook
Spark SQL Join Improvement at FacebookSpark SQL Join Improvement at Facebook
Spark SQL Join Improvement at Facebook
 
Hudi architecture, fundamentals and capabilities
Hudi architecture, fundamentals and capabilitiesHudi architecture, fundamentals and capabilities
Hudi architecture, fundamentals and capabilities
 
Spark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in Spark
Spark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in SparkSpark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in Spark
Spark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in Spark
 
Apache Spark Architecture
Apache Spark ArchitectureApache Spark Architecture
Apache Spark Architecture
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guide
 
Apache Spark Core—Deep Dive—Proper Optimization
Apache Spark Core—Deep Dive—Proper OptimizationApache Spark Core—Deep Dive—Proper Optimization
Apache Spark Core—Deep Dive—Proper Optimization
 

Similar to Hive Bucketing in Apache Spark

Spark SQL Bucketing at Facebook
 Spark SQL Bucketing at Facebook Spark SQL Bucketing at Facebook
Spark SQL Bucketing at FacebookDatabricks
 
Scala in practice - 3 years later
Scala in practice - 3 years laterScala in practice - 3 years later
Scala in practice - 3 years laterpatforna
 
Scala in-practice-3-years by Patric Fornasier, Springr, presented at Pune Sca...
Scala in-practice-3-years by Patric Fornasier, Springr, presented at Pune Sca...Scala in-practice-3-years by Patric Fornasier, Springr, presented at Pune Sca...
Scala in-practice-3-years by Patric Fornasier, Springr, presented at Pune Sca...Thoughtworks
 
Migrating Apache Hive Workload to Apache Spark: Bridge the Gap with Zhan Zhan...
Migrating Apache Hive Workload to Apache Spark: Bridge the Gap with Zhan Zhan...Migrating Apache Hive Workload to Apache Spark: Bridge the Gap with Zhan Zhan...
Migrating Apache Hive Workload to Apache Spark: Bridge the Gap with Zhan Zhan...Databricks
 
Table Retrieval and Generation
Table Retrieval and GenerationTable Retrieval and Generation
Table Retrieval and Generationkrisztianbalog
 
Paris jug ksql - 2018-06-28
Paris jug ksql - 2018-06-28Paris jug ksql - 2018-06-28
Paris jug ksql - 2018-06-28Florent Ramiere
 
SQL Server 2014 Memory Optimised Tables - Advanced
SQL Server 2014 Memory Optimised Tables - AdvancedSQL Server 2014 Memory Optimised Tables - Advanced
SQL Server 2014 Memory Optimised Tables - AdvancedTony Rogerson
 
Apache Hive, data segmentation and bucketing
Apache Hive, data segmentation and bucketingApache Hive, data segmentation and bucketing
Apache Hive, data segmentation and bucketingearnwithme2522
 
Cassandra Summit Sept 2015 - Real Time Advanced Analytics with Spark and Cass...
Cassandra Summit Sept 2015 - Real Time Advanced Analytics with Spark and Cass...Cassandra Summit Sept 2015 - Real Time Advanced Analytics with Spark and Cass...
Cassandra Summit Sept 2015 - Real Time Advanced Analytics with Spark and Cass...Chris Fregly
 
Apache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the CoversApache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the CoversScyllaDB
 
Database Programming
Database ProgrammingDatabase Programming
Database ProgrammingHenry Osborne
 
Mahout Introduction BarCampDC
Mahout Introduction BarCampDCMahout Introduction BarCampDC
Mahout Introduction BarCampDCDrew Farris
 
Data_base.pptx
Data_base.pptxData_base.pptx
Data_base.pptxMohit89650
 
Scaling Machine Learning Feature Engineering in Apache Spark at Facebook
Scaling Machine Learning Feature Engineering in Apache Spark at FacebookScaling Machine Learning Feature Engineering in Apache Spark at Facebook
Scaling Machine Learning Feature Engineering in Apache Spark at FacebookDatabricks
 
Aidan's PhD Viva
Aidan's PhD VivaAidan's PhD Viva
Aidan's PhD VivaAidan Hogan
 
Hive_An Brief Introduction to HIVE_BIGDATAANALYTICS
Hive_An Brief Introduction to HIVE_BIGDATAANALYTICSHive_An Brief Introduction to HIVE_BIGDATAANALYTICS
Hive_An Brief Introduction to HIVE_BIGDATAANALYTICSRUHULAMINHAZARIKA
 
Performing Data Science with HBase
Performing Data Science with HBasePerforming Data Science with HBase
Performing Data Science with HBaseWibiData
 
Hive User Meeting March 2010 - Hive Team
Hive User Meeting March 2010 - Hive TeamHive User Meeting March 2010 - Hive Team
Hive User Meeting March 2010 - Hive TeamZheng Shao
 
AWS SSA Webinar 20 - Getting Started with Data Warehouses on AWS
AWS SSA Webinar 20 - Getting Started with Data Warehouses on AWSAWS SSA Webinar 20 - Getting Started with Data Warehouses on AWS
AWS SSA Webinar 20 - Getting Started with Data Warehouses on AWSCobus Bernard
 

Similar to Hive Bucketing in Apache Spark (20)

Spark SQL Bucketing at Facebook
 Spark SQL Bucketing at Facebook Spark SQL Bucketing at Facebook
Spark SQL Bucketing at Facebook
 
Scala in practice - 3 years later
Scala in practice - 3 years laterScala in practice - 3 years later
Scala in practice - 3 years later
 
Scala in-practice-3-years by Patric Fornasier, Springr, presented at Pune Sca...
Scala in-practice-3-years by Patric Fornasier, Springr, presented at Pune Sca...Scala in-practice-3-years by Patric Fornasier, Springr, presented at Pune Sca...
Scala in-practice-3-years by Patric Fornasier, Springr, presented at Pune Sca...
 
Migrating Apache Hive Workload to Apache Spark: Bridge the Gap with Zhan Zhan...
Migrating Apache Hive Workload to Apache Spark: Bridge the Gap with Zhan Zhan...Migrating Apache Hive Workload to Apache Spark: Bridge the Gap with Zhan Zhan...
Migrating Apache Hive Workload to Apache Spark: Bridge the Gap with Zhan Zhan...
 
Table Retrieval and Generation
Table Retrieval and GenerationTable Retrieval and Generation
Table Retrieval and Generation
 
Paris jug ksql - 2018-06-28
Paris jug ksql - 2018-06-28Paris jug ksql - 2018-06-28
Paris jug ksql - 2018-06-28
 
SQL Server 2014 Memory Optimised Tables - Advanced
SQL Server 2014 Memory Optimised Tables - AdvancedSQL Server 2014 Memory Optimised Tables - Advanced
SQL Server 2014 Memory Optimised Tables - Advanced
 
Apache Hive, data segmentation and bucketing
Apache Hive, data segmentation and bucketingApache Hive, data segmentation and bucketing
Apache Hive, data segmentation and bucketing
 
Cassandra Summit Sept 2015 - Real Time Advanced Analytics with Spark and Cass...
Cassandra Summit Sept 2015 - Real Time Advanced Analytics with Spark and Cass...Cassandra Summit Sept 2015 - Real Time Advanced Analytics with Spark and Cass...
Cassandra Summit Sept 2015 - Real Time Advanced Analytics with Spark and Cass...
 
Apache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the CoversApache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the Covers
 
Database Programming
Database ProgrammingDatabase Programming
Database Programming
 
Mahout Introduction BarCampDC
Mahout Introduction BarCampDCMahout Introduction BarCampDC
Mahout Introduction BarCampDC
 
Data_base.pptx
Data_base.pptxData_base.pptx
Data_base.pptx
 
Scaling Machine Learning Feature Engineering in Apache Spark at Facebook
Scaling Machine Learning Feature Engineering in Apache Spark at FacebookScaling Machine Learning Feature Engineering in Apache Spark at Facebook
Scaling Machine Learning Feature Engineering in Apache Spark at Facebook
 
Aidan's PhD Viva
Aidan's PhD VivaAidan's PhD Viva
Aidan's PhD Viva
 
Hive_An Brief Introduction to HIVE_BIGDATAANALYTICS
Hive_An Brief Introduction to HIVE_BIGDATAANALYTICSHive_An Brief Introduction to HIVE_BIGDATAANALYTICS
Hive_An Brief Introduction to HIVE_BIGDATAANALYTICS
 
Apache Spark Overview
Apache Spark OverviewApache Spark Overview
Apache Spark Overview
 
Performing Data Science with HBase
Performing Data Science with HBasePerforming Data Science with HBase
Performing Data Science with HBase
 
Hive User Meeting March 2010 - Hive Team
Hive User Meeting March 2010 - Hive TeamHive User Meeting March 2010 - Hive Team
Hive User Meeting March 2010 - Hive Team
 
AWS SSA Webinar 20 - Getting Started with Data Warehouses on AWS
AWS SSA Webinar 20 - Getting Started with Data Warehouses on AWSAWS SSA Webinar 20 - Getting Started with Data Warehouses on AWS
AWS SSA Webinar 20 - Getting Started with Data Warehouses on AWS
 

Recently uploaded

Correctly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleCorrectly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleAlluxio, Inc.
 
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor CatchersTechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catcherssdickerson1
 
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsyncWhy does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsyncssuser2ae721
 
Churning of Butter, Factors affecting .
Churning of Butter, Factors affecting  .Churning of Butter, Factors affecting  .
Churning of Butter, Factors affecting .Satyam Kumar
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxk795866
 
Work Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvWork Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvLewisJB
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)Dr SOUNDIRARAJ N
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxDeepakSakkari2
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEroselinkalist12
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxJoão Esperancinha
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort servicejennyeacort
 
Introduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECHIntroduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECHC Sai Kiran
 
8251 universal synchronous asynchronous receiver transmitter
8251 universal synchronous asynchronous receiver transmitter8251 universal synchronous asynchronous receiver transmitter
8251 universal synchronous asynchronous receiver transmitterShivangiSharma879191
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
 
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...asadnawaz62
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...VICTOR MAESTRE RAMIREZ
 

Recently uploaded (20)

POWER SYSTEMS-1 Complete notes examples
POWER SYSTEMS-1 Complete notes  examplesPOWER SYSTEMS-1 Complete notes  examples
POWER SYSTEMS-1 Complete notes examples
 
Correctly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleCorrectly Loading Incremental Data at Scale
Correctly Loading Incremental Data at Scale
 
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor CatchersTechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
 
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsyncWhy does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
 
Churning of Butter, Factors affecting .
Churning of Butter, Factors affecting  .Churning of Butter, Factors affecting  .
Churning of Butter, Factors affecting .
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptx
 
Work Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvWork Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvv
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptx
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
 
Introduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECHIntroduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECH
 
8251 universal synchronous asynchronous receiver transmitter
8251 universal synchronous asynchronous receiver transmitter8251 universal synchronous asynchronous receiver transmitter
8251 universal synchronous asynchronous receiver transmitter
 
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...
 

Hive Bucketing in Apache Spark

  • 1. Hive Bucketing in Apache Spark Tejas Patil Facebook
  • 2. Agenda • Why bucketing ? • Why is shuffle bad ? • How to avoid shuffle ? • When to use bucketing ? • Spark’s bucketing support • Bucketing semantics of Spark vs Hive • Hive bucketing support in Spark • SQL Planner improvements
  • 4. Table A Table B . . . . . . . . . . . .JOIN
  • 5. Table A Table B . . . . . . . . . . . .JOIN Broadcast hash join - Ship smaller table to all nodes - Stream the other table
  • 6. Table A Table B . . . . . . . . . . . .JOIN Broadcast hash join Shuffle hash join - Ship smaller table to all nodes - Stream the other table - Shuffle both tables, - Hash smaller one, stream the bigger one
  • 7. Table A Table B . . . . . . . . . . . .JOIN Broadcast hash join Shuffle hash join - Ship smaller table to all nodes - Stream the other table - Shuffle both tables, - Hash smaller one, stream the bigger one Sort merge join - Shuffle both tables, - Sort both tables, buffer one, stream the bigger one
  • 8. Table A Table B . . . . . . . . . . . .JOIN Broadcast hash join Shuffle hash join Sort merge join - Ship smaller table to all nodes - Stream the other table - Shuffle both tables, - Hash smaller one, stream the bigger one - Shuffle both tables, - Sort both tables, buffer one, stream the bigger one
  • 9. Table A Table B . . . . . . . . . . . . Sort Merge Join
  • 10. Table A Table B . . . . . . . . . . . . Shuffle ShuffleShuffleShuffle Sort Merge Join
  • 11. Table A Table B . . . . . . . . . . . . Sort Sort Sort Sort Shuffle ShuffleShuffleShuffle Sort Merge Join
  • 12. Table A Table B . . . . . . . . . . . . Sort Join Sort Join Sort Join Sort Join Shuffle ShuffleShuffleShuffle Sort Merge Join
  • 14. Why is shuffle bad ?
  • 15. Table A Table B . . . . . . . . . . . . Disk Disk Disk IO for shuffle outputs Why is shuffle bad ?
  • 16. Table A Table B . . . . . . . . . . . . Shuffle Shuffle Shuffle Shuffle M * R network connections Why is shuffle bad ?
  • 17. How to avoid shuffle ?
  • 18.
  • 19. student s JOIN attendance a ON s.id = a.student_id student s JOIN results r ON s.id = r.student_id student s JOIN course_registration c ON s.id = c.student_id …….. ……..
  • 20. student s JOIN attendance a ON s.id = a.student_id student s JOIN results r ON s.id = r.student_id
  • 21. student s JOIN attendance a ON s.id = a.student_id student s JOIN results r ON s.id = r.student_id Pre-compute at table creation
  • 22. Bucketing = pre-(shuffle + sort) inputs on join keys
  • 25. Performance comparison 0 100 200 300 400 500 600 700 800 900 Before After Bucketing Reserved CPU time (days) 0 1 2 3 4 5 6 7 8 Before After Bucketing Latency (hours)
  • 26. When to use bucketing ? • Tables used frequently in JOINs with same key • Student -> student_id • Employee -> employee_id • Users -> user_id
  • 27. When to use bucketing ? • Tables used frequently in JOINs with same key • Student -> student_id • Employee -> employee_id • Users -> user_id …. => Dimension tables
  • 28. When to use bucketing ? • Tables used frequently in JOINs with same key • Student -> student_id • Employee -> employee_id • Users -> user_id • Loading daily cumulative tables • Both base and delta tables could be bucketed on a common column
  • 29. When to use bucketing ? • Tables used frequently in JOINs with same key • Student -> student_id • Employee -> employee_id • Users -> user_id • Loading daily cumulative tables • Both base and delta tables could be bucketed on a common column • Indexing capability
  • 31.
  • 32. Creation of bucketed tables df.write .bucketBy(numBuckets, "col1", ...) .sortBy("col1", ...) .saveAsTable("bucketed_table") via Dataframe API
  • 33. Creation of bucketed tables CREATE TABLE bucketed_table( column1 INT, ... ) USING parquet CLUSTERED BY (column1, …) SORTED BY (column1, …) INTO `n` BUCKETS via SQL statement
  • 34. Check bucketing spec scala> sparkContext.sql("DESC FORMATTED student").collect.foreach(println) [# col_name,data_type,comment] [student_id,int,null] [name,int,null] [# Detailed Table Information,,] [Database,default,] [Table,table1,] [Owner,tejas,] [Created,Fri May 12 08:06:33 PDT 2017,] [Type,MANAGED,] [Num Buckets,64,] [Bucket Columns,[`student_id`],] [Sort Columns,[`student_id`],] [Properties,[serialization.format=1],] [Serde Library,org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe,] [InputFormat,org.apache.hadoop.mapred.SequenceFileInputFormat,]
  • 36. [SPARK-15453] Extract bucketing info in FileSourceScanExec SELECT * FROM tableA JOIN tableB ON tableA.i= tableB.i AND tableA.j= tableB.j
  • 37. [SPARK-15453] Extract bucketing info in FileSourceScanExec SELECT * FROM tableA JOIN tableB ON tableA.i= tableB.i AND tableA.j= tableB.j Sort Merge Join Sort Exchange TableScan Sort Exchange TableScan Already bucketed on columns (i, j) Shuffle on columns (i, j) Sort on columns (i, j) Join on [i,j], [i,j] respectively of relations
  • 38. [SPARK-15453] Extract bucketing info in FileSourceScanExec SELECT * FROM tableA JOIN tableB ON tableA.i= tableB.i AND tableA.j= tableB.j Sort Merge Join Sort Exchange TableScan Sort Exchange TableScan Already bucketed on columns (i, j) Shuffle on columns (i, j) Sort on columns (i, j) Join on [i,j], [i,j] respectively of relations
  • 39. [SPARK-15453] Extract bucketing info in FileSourceScanExec SELECT * FROM tableA JOIN tableB ON tableA.i= tableB.i AND tableA.j= tableB.j Sort Merge Join TableScan TableScan Sort Merge Join TableScan TableScanSort Exchange Sort Exchange
  • 41. Bucketing semantics of Spark vs Hive bucket 0 bucket 1 bucket (n-1) ….. Hive my_bucketed_table
  • 42. Bucketing semantics of Spark vs Hive bucket 0 bucket 1 bucket (n-1) ….. Hive Spark bucket 0 bucket 1 bucket (n-1) ….. bucket 0 bucket 0 bucket 1 bucket (n-1)bucket (n-1)bucket (n-1) my_bucketed_tablemy_bucketed_table
  • 43. Bucketing semantics of Spark vs Hive bucket 0 bucket 1 bucket (n-1) ….. Hive M M M M M R R R …..
  • 44. Bucketing semantics of Spark vs Hive bucket 0 bucket 1 bucket (n-1) ….. Hive Spark M M M M M R R R ….. bucket 0 bucket 1 bucket (n-1) ….. M M M M M….. bucket 0 bucket 0 bucket 1 bucket (n-1)bucket (n-1)bucket (n-1)
  • 45. Bucketing semantics of Spark vs Hive Hive Spark Model Optimizes reads, writes are costly Writes are cheaper, reads are costlier
  • 46. Bucketing semantics of Spark vs Hive Hive Spark Model Optimizes reads, writes are costly Writes are cheaper, reads are costlier Unit of bucketing A single file represents a single bucket A collection of files together comprise a single bucket
  • 47. Bucketing semantics of Spark vs Hive Hive Spark Model Optimizes reads, writes are costly Writes are cheaper, reads are costlier Unit of bucketing A single file represents a single bucket A collection of files together comprise a single bucket Sort ordering Each bucket is sorted globally. This makes it easy to optimize queries reading this data Each file is individually sorted but the “bucket” as a whole is not globally sorted
  • 48. Bucketing semantics of Spark vs Hive Hive Spark Model Optimizes reads, writes are costly Writes are cheaper, reads are costlier Unit of bucketing A single file represents a single bucket A collection of files together comprise a single bucket Sort ordering Each bucket is sorted globally. This makes it easy to optimize queries reading this data Each file is individually sorted but the “bucket” as a whole is not globally sorted Hashing function Hive’s inbuilt hash Murmur3Hash
  • 49. [SPARK-19256] Hive bucketing support • Introduce Hive’s hashing function [SPARK-17495] • Enable creating hive bucketed tables [SPARK-17729] • Support Hive’s `Bucketed file format` • Propagate Hive bucketing information to planner [SPARK-17654] • Expressing outputPartitioning and requiredChildDistribution • Creating empty bucket files when no data for a bucket • Allow Sort Merge join over tables with number of buckets multiple of each other • Support N-way Sort Merge Join
  • 50. [SPARK-19256] Hive bucketing support • Introduce Hive’s hashing function [SPARK-17495] • Enable creating hive bucketed tables [SPARK-17729] • Support Hive’s `Bucketed file format` • Propagate Hive bucketing information to planner [SPARK-17654] • Expressing outputPartitioning and requiredChildDistribution • Creating empty bucket files when no data for a bucket • Allow Sort Merge join over tables with number of buckets multiple of each other • Support N-way Sort Merge Join Merged in upstream
  • 51. [SPARK-19256] Hive bucketing support • Introduce Hive’s hashing function [SPARK-17495] • Enable creating hive bucketed tables [SPARK-17729] • Support Hive’s `Bucketed file format` • Propagate Hive bucketing information to planner [SPARK-17654] • Expressing outputPartitioning and requiredChildDistribution • Creating empty bucket files when no data for a bucket • Allow Sort Merge join over tables with number of buckets multiple of each other • Support N-way Sort Merge Join FB-prod (6 months)
  • 53. [SPARK-19122] Unnecessary shuffle+sort added if join predicates ordering differ from bucketing and sorting order SELECT * FROM a JOIN b ON a.x=b.x AND a.y=b.y
  • 54. [SPARK-19122] Unnecessary shuffle+sort added if join predicates ordering differ from bucketing and sorting order Sort Merge Join TableScan TableScan SELECT * FROM a JOIN b ON a.x=b.x AND a.y=b.y
  • 55. [SPARK-19122] Unnecessary shuffle+sort added if join predicates ordering differ from bucketing and sorting order SELECT * FROM a JOIN b ON a.x=b.x AND a.y=b.y Sort Merge Join TableScan TableScan SELECT * FROM a JOIN b ON a.y=b.y AND a.x=b.x
  • 56. [SPARK-19122] Unnecessary shuffle+sort added if join predicates ordering differ from bucketing and sorting order Sort Merge Join Exchange TableScan Sort Exchange TableScan Sort Sort Merge Join TableScan TableScan SELECT * FROM a JOIN b ON a.x=b.x AND a.y=b.y SELECT * FROM a JOIN b ON a.y=b.y AND a.x=b.x
  • 57. [SPARK-19122] Unnecessary shuffle+sort added if join predicates ordering differ from bucketing and sorting order Sort Merge Join Exchange TableScan Sort Exchange TableScan Sort Sort Merge Join TableScan TableScan SELECT * FROM a JOIN b ON a.x=b.x AND a.y=b.y SELECT * FROM a JOIN b ON a.y=b.y AND a.x=b.x
  • 58. [SPARK-19122] Unnecessary shuffle+sort added if join predicates ordering differ from bucketing and sorting order Sort Merge Join TableScan TableScan SELECT * FROM a JOIN b ON a.x=b.x AND a.y=b.y SELECT * FROM a JOIN b ON a.y=b.y AND a.x=b.x Sort Merge Join TableScan TableScan
  • 59. [SPARK-17271] Planner adds un-necessary Sort even if child ordering is semantically same as required ordering SELECT * FROM table1 a JOIN table2 b ON a.col1=b.col1
  • 60. [SPARK-17271] Planner adds un-necessary Sort even if child ordering is semantically same as required ordering SELECT * FROM table1 a JOIN table2 b ON a.col1=b.col1 Sort Merge Join Sort TableScan Sort TableScan Already bucketed on column `col1` Sort on columns (a.col1 and b.col1) Join on [col1], [col1] respectively of relations
  • 61. [SPARK-17271] Planner adds un-necessary Sort even if child ordering is semantically same as required ordering SELECT * FROM table1 a JOIN table2 b ON a.col1=b.col1 Sort Merge Join TableScan TableScan Already bucketed on column `col1` Sort on columns (a.col1 and b.col1)Sort Sort Join on [col1], [col1] respectively of relations
  • 62. [SPARK-17271] Planner adds un-necessary Sort even if child ordering is semantically same as required ordering SELECT * FROM table1 a JOIN table2 b ON a.col1=b.col1 Sort Merge Join TableScan TableScan Sort Sort Sort Merge Join TableScan TableScan
  • 63. [SPARK-17698] Join predicates should not contain filter clauses SELECT a.id, b.id FROM table1 a FULL OUTER JOIN table2 b ON a.id = b.id AND a.id='1' AND b.id='1'
  • 64. [SPARK-17698] Join predicates should not contain filter clauses SELECT a.id, b.id FROM table1 a FULL OUTER JOIN table2 b ON a.id = b.id AND a.id='1' AND b.id='1' Sort Merge Join TableScan TableScan Sort Exchange Sort Exchange Already bucketed on column [id] Shuffle on columns [id, 1, id] and [id, id, 1] Sort on columns [id, id, 1] and [id, 1, id] Join on [id, id, 1], [id, 1, id] respectively of relations
  • 65. [SPARK-17698] Join predicates should not contain filter clauses SELECT a.id, b.id FROM table1 a FULL OUTER JOIN table2 b ON a.id = b.id AND a.id='1' AND b.id='1' Sort Merge Join TableScan TableScan Sort Exchange Sort Exchange Already bucketed on column [id] Shuffle on columns [id, 1, id] and [id, id, 1] Sort on columns [id, id, 1] and [id, 1, id] Join on [id, id, 1], [id, 1, id] respectively of relations
  • 66. [SPARK-17698] Join predicates should not contain filter clauses SELECT a.id, b.id FROM table1 a FULL OUTER JOIN table2 b ON a.id = b.id AND a.id='1' AND b.id='1' Sort Merge Join TableScan TableScan Sort Exchange Sort Exchange Already bucketed on column [id] Shuffle on columns [id, 1, id] and [id, id, 1] Sort on columns [id, id, 1] and [id, 1, id] Join on [id, id, 1], [id, 1, id] respectively of relations
  • 67. [SPARK-17698] Join predicates should not contain filter clauses SELECT a.id, b.id FROM table1 a FULL OUTER JOIN table2 b ON a.id = b.id AND a.id='1' AND b.id='1' Sort Merge Join TableScan TableScan Sort Merge Join TableScan TableScanSort Exchange Sort Exchange
  • 69. [SPARK-13450] Introduce ExternalAppendOnlyUnsafeRowArray Buffered relation Streamed relation If there are a lot of rows to be buffered (eg. Skew), the query would OOM row with key=`foo` rows with key=`foo`
  • 70. [SPARK-13450] Introduce ExternalAppendOnlyUnsafeRowArray Buffered relation Streamed relation row with key=`foo` rows with key=`foo` Disk buffer would spill to disk after reaching a threshold spill
  • 71. Summary • Shuffle and sort is costly due to disk and network IO • Bucketing will pre-(shuffle and sort) the inputs • Expect at least 2-5x gains after bucketing input tables for joins • Candidates for bucketing: • Tables used frequently in JOINs with same key • Loading of data in cumulative tables • Spark’s bucketing is not compatible with Hive’s bucketing