T R E A S U R E D A T A
USER DEFINED PARTITIONING
A New Partitioning Strategy accelerating CDP Workload
Kai Sasaki
Software Engineer at Treasure Data
ABOUT ME
- Kai Sasaki (@Lewuathe)
- Software Engineer at Treasure Data since 2015
Working in the Query Engine Team (managing Hive and Presto at Treasure Data)
- Contributor to Hadoop, Spark, Presto
TOPICS
PlazmaDB
PlazmaDB is a metadata store for all log data in
Treasure Data. It supports import, export, INSERT
INTO, CREATE TABLE, DELETE, etc. on top of the
PostgreSQL transaction mechanism.
Time Index Partitioning
Log data is partitioned by the time at which each log
was generated, which Treasure Data stores in the “time”
column. This lets queries skip reading unnecessary
partitions.
User Defined Partitioning (New!)
In addition to the “time” column, any column can be
used as a partitioning key. This gives a more flexible
partitioning strategy that fits CDP workloads.
OVERVIEW OF QUERY ENGINE IN TD
PRESTO IN TREASURE DATA
• Multiple clusters with 50~60 workers each
• Presto 0.188
Stats
• 4.3+ million queries / month
• 400 trillion records / month
• 6+ PB / month
At the end of 2017
HIVE AND PRESTO ON PLAZMADB
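[Diagram: Bulk Import, Fluentd, and Mobile SDK feed data into PlazmaDB, which is backed by Amazon S3; Presto and Hive sit on top of PlazmaDB and serve SQL and CDP workloads.]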
PLAZMADB
PlazmaDB
Amazon S3
    id       data_set_id  first_index_key  last_index_key  record_count  path
P1  3065124  187250       1412323028       1412385139      109           abcdefg-1234567-abcdefg-1234567
P2  3065125  187250       1412323030       1412324030      209           abcdefg-1234567-abcdefg-9182841
P3  3065126  187250       1412327028       1412328028      31            abcdefg-1234567-abcdefg-5818231
P4  3065127  187250       1412325011       1412326001      102           abcdefg-1234567-abcdefg-7271828
P5  3065128  281254       1412324214       1412325210      987           abcdefg-1234567-abcdefg-6717284
P6  3065129  281254       1412325123       1412329800      541           abcdefg-1234567-abcdefg-5717274
Multi Column Indexes
s3://plazma-partitions/…
1-hour partitioning
PLAZMADB
PlazmaDB
Amazon S3
Realtime Storage
Amazon S3
Archive Storage
MapReduce
Periodically maintains the 1-hour partitioning.
Time-Indexed Partitioning
PROBLEM
• Time index partitioning is efficient only when a “time” value is specified. Filtering on other columns causes a full scan, which can make performance worse.
• The number of records in a partition depends heavily on the table type and on how the user uses it.
SELECT COUNT(1)
FROM table
WHERE user_id = 1;
    id       data_set_id  first_index_key  last_index_key  record_count  path
P1  3065124  100          1412323028       1412385139      1             abcdefg-1234567-abcdefg-1234567
P2  3065125  100          1412323030       1412324030      1             abcdefg-1234567-abcdefg-9182841
P3  3065126  100          1412327028       1412328028      1             abcdefg-1234567-abcdefg-5818231
P4  3065127  200          1412325011       1412326001      101021        abcdefg-1234567-abcdefg-7271828
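The limitation can be illustrated with a small sketch. It is not PlazmaDB code: the `Partition` class and the `prune_by_time` function are hypothetical names, and the time ranges are copied from the first four rows of the partition table on the PLAZMADB slide. A time predicate lets the metadata lookup skip partitions whose [first_index_key, last_index_key] range does not overlap the query window, while a predicate on any other column gives the time index nothing to prune on.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Partition:
    name: str
    first_index_key: int  # earliest "time" value stored in the partition
    last_index_key: int   # latest "time" value stored in the partition


# Time ranges taken from the example metadata rows P1..P4.
PARTITIONS = [
    Partition("P1", 1412323028, 1412385139),
    Partition("P2", 1412323030, 1412324030),
    Partition("P3", 1412327028, 1412328028),
    Partition("P4", 1412325011, 1412326001),
]


def prune_by_time(
    partitions: List[Partition],
    time_from: Optional[int] = None,
    time_to: Optional[int] = None,
) -> List[Partition]:
    """Keep only partitions whose time range overlaps [time_from, time_to].

    When the query has no time bounds, every partition overlaps, so the
    whole table has to be scanned.
    """
    lo = float("-inf") if time_from is None else time_from
    hi = float("inf") if time_to is None else time_to
    return [
        p for p in partitions
        if p.first_index_key <= hi and p.last_index_key >= lo
    ]
```

For a query window of 1412327000 to 1412328000, only P1 and P3 overlap, so P2 and P4 are skipped. A query such as `WHERE user_id = 1` supplies no time bounds, so all four partitions are read.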
USER DEFINED PARTITIONING
USER DEFINED PARTITIONING
• Users can specify a partitioning strategy that fits their usage by choosing a partitioning key column and a max time range.
[Figure: partitions laid out on a grid of 1h time ranges (time axis) against the values v1, v2, v3 of column c1, illustrating a query of the form … WHERE c1 = 'v1' AND time = …]
USER DEFINED PARTITIONING
CREATE TABLE via Presto or Hive
Insert data partitioned by the configured partitioning key
Set user defined configuration
The number of buckets, hash function, and partitioning key
Read the data from UDP table
The UDP table is now visible via Presto and Hive
LOG
USER DEFINED CONFIGURATION
• We need to set the columns to be used as the partitioning key and the number of partitions. This should be configurable by each user.
    user_table_id  columns                                      bucket_count  partition_function
T1  141849         [["o_orderkey","long"]]                      32            hash
T2  141850         [["user_id","long"]]                         32            hash
T3  141910         [["item_id","long"]]                         16            hash
T4  151242         [["region_id","long"],["device_id","long"]]  256           hash
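A minimal sketch of how one configuration row could turn key values into a bucket. The slides only name the partition function as "hash", so the use of CRC32 here is an assumption for illustration, as are the function name `bucket_number`, the text encoding of the key, and the 0-based bucket numbering.

```python
import zlib
from typing import Sequence


def bucket_number(key_values: Sequence[object], bucket_count: int) -> int:
    """Map the partitioning key values of one row to a bucket in [0, bucket_count).

    Assumption: CRC32 over a simple text encoding stands in for the
    unspecified "hash" partition function from the configuration table.
    """
    # Join the key columns with a separator so ("1", "23") and ("12", "3") differ.
    encoded = "\x1f".join(repr(v) for v in key_values).encode("utf-8")
    return zlib.crc32(encoded) % bucket_count


# Single-column key, as in rows T1..T3 (bucket_count = 32).
single = bucket_number([12345], 32)

# Two-column key, as in row T4 (region_id, device_id; bucket_count = 256).
composite = bucket_number([7, 3], 256)
```

Whatever hash is used, the property that matters is that the same key always lands in the same bucket, so a lookup by key only needs to touch that one bucket.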
CREATE UDP TABLE VIA PRESTO
• Presto and Hive support CREATE TABLE/INSERT INTO on UDP tables
CREATE TABLE udp_customer
WITH (
  bucketed_on = array['customer_id'],
  bucket_count = 128
)
AS SELECT * FROM normal_customer;
CREATE UDP TABLE VIA PRESTO
• Override ConnectorPageSink to write MPC1 files based on the user defined partitioning key.
[Figure: writer hierarchy: Page → PlazmaPageSink → PartitionedMPCWriter → TimeRangeMPCWriter (×3, …) → BufferedMPCWriter (×3, …), with bucket labels b1, b2, b3 and 1h time-range labels.]
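One plausible reading of the writer hierarchy is that each row is routed first by bucket and then by 1-hour time range before being buffered. The sketch below illustrates that routing only; it is not the PlazmaPageSink implementation. The function names are hypothetical, rows are modeled as plain dictionaries, and CRC32 again stands in for the unspecified hash function.

```python
import zlib
from collections import defaultdict
from typing import Dict, List, Tuple

HOUR = 3600  # seconds in one 1h time range


def bucket_of(value: object, bucket_count: int) -> int:
    # Assumption: CRC32 stands in for the unspecified hash function.
    return zlib.crc32(repr(value).encode("utf-8")) % bucket_count


def route_rows(
    rows: List[dict], key: str, bucket_count: int
) -> Dict[Tuple[int, int], List[dict]]:
    """Group rows into buffers keyed by (bucket, start of the 1h time range)."""
    buffers: Dict[Tuple[int, int], List[dict]] = defaultdict(list)
    for row in rows:
        bucket = bucket_of(row[key], bucket_count)
        hour_start = row["time"] - row["time"] % HOUR
        buffers[(bucket, hour_start)].append(row)
    return dict(buffers)


rows = [
    {"time": 1412323028, "customer_id": 1},
    {"time": 1412323030, "customer_id": 1},
    {"time": 1412327028, "customer_id": 1},
]
buffers = route_rows(rows, "customer_id", 128)
```

The three rows share one customer_id, so they fall into a single bucket, but they span two 1-hour ranges and therefore end up in two buffers.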
CREATE UDP TABLE VIA PRESTO
    id       data_set_id  first_index_key  last_index_key  record_count  path                             bucket_number
P1  3065124  187250       1412323028       1412385139      109           abcdefg-1234567-abcdefg-1234567  1
P2  3065125  187250       1412323030       1412324030      209           abcdefg-1234567-abcdefg-9182841  2
P3  3065126  187250       1412327028       1412328028      31            abcdefg-1234567-abcdefg-5818231  3
P4  3065127  187250       1412325011       1412326001      102           abcdefg-1234567-abcdefg-7271828  2
P5  3065128  281254       1412324214       1412325210      987           abcdefg-1234567-abcdefg-6717284  16
P6  3065129  281254       1412325123       1412329800      541           abcdefg-1234567-abcdefg-5717274  14
• A new bucket_number column is added to the partition records in PlazmaDB.
READ DATA FROM UDP TABLE
ConnectorSplitManager#getSplits
Returns the data source splits to be read by the Presto cluster.
Decide target bucket from constraint
The constraint specifies the range that should be read from the table. ConnectorSplitManager asks PlazmaDB for the partitions in the target bucket.
Override Presto Connector to data source
Presto provides a plugin mechanism to connect to any data source flexibly. The connector provides metadata, the location of the real data source, and UDFs.
Receive constraint as TupleDomain
TupleDomain is created from the query plan and passed through TableLayout, which is available in ConnectorSplitManager.
READ DATA FROM UDP TABLE
[Figure: SQL → constraint (Map<ColumnHandle, Domain>) → TableLayout → SplitManager → PlazmaDB; Distribute PageSource]
… WHERE bucket_number in () …
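The read path can be sketched as follows. This is an illustration under simplifying assumptions, not the connector's code: the constraint is modeled as a plain dictionary from column name to a set of allowed values rather than a real TupleDomain, partition records are dictionaries, and `target_buckets`, `select_partitions`, and the CRC32-based `bucket_of` are hypothetical names.

```python
import zlib
from typing import Dict, List, Optional, Set


def bucket_of(value: object, bucket_count: int) -> int:
    # Assumption: CRC32 stands in for the unspecified hash function.
    return zlib.crc32(repr(value).encode("utf-8")) % bucket_count


def target_buckets(
    constraint: Dict[str, Set[object]], key: str, bucket_count: int
) -> Optional[Set[int]]:
    """Return the buckets to read, or None if the key is unconstrained."""
    if key not in constraint:
        return None  # no predicate on the partitioning key: read every bucket
    return {bucket_of(v, bucket_count) for v in constraint[key]}


def select_partitions(
    partitions: List[dict], buckets: Optional[Set[int]]
) -> List[dict]:
    """Mimic the metadata filter on bucket_number."""
    if buckets is None:
        return list(partitions)
    return [p for p in partitions if p["bucket_number"] in buckets]


# 256 partition records spread evenly over 128 buckets.
partitions = [{"id": i, "bucket_number": i % 128} for i in range(256)]

pinned = target_buckets({"customer_id": {1}}, "customer_id", 128)
selected = select_partitions(partitions, pinned)
everything = select_partitions(partitions, target_buckets({}, "customer_id", 128))
```

An equality predicate on the partitioning key narrows the read to one bucket (2 of the 256 records here), while a query without such a predicate still has to consider every bucket.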
PERFORMANCE
PERFORMANCE COMPARISON
SQLs on TPC-H (scale factor=1000)
[Bar chart: elapsed time (sec), axis 0 to 300 sec, comparing NORMAL and UDP tables for the queries count1_filter, groupby, and hashjoin. Data labels in extraction order: 87.279, 36.569, 1.04, 266.71, 69.374, 19.478.]
COLOCATED JOIN
[Figure: Distributed Join vs. Colocated Join. Left-table partitions (l1, l2, l3) and right-table partitions (r1, r2, r3) are shown along the time axis for each join strategy.]
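The idea behind a colocated join can be sketched in a few lines. When both tables are bucketed on the join key with the same hash function and the same bucket count, rows with equal keys always share a bucket number, so each pair of buckets can be joined on its own without redistributing rows. The code below is an illustration of that property only; the function names are hypothetical and CRC32 stands in for the unspecified hash function.

```python
import zlib
from collections import defaultdict
from typing import Dict, List


def bucket_of(value: object, bucket_count: int) -> int:
    # Assumption: CRC32 stands in for the unspecified hash function.
    return zlib.crc32(repr(value).encode("utf-8")) % bucket_count


def bucketize(rows: List[dict], key: str, bucket_count: int) -> Dict[int, List[dict]]:
    """Split rows into buckets by the hash of the join key."""
    buckets: Dict[int, List[dict]] = defaultdict(list)
    for row in rows:
        buckets[bucket_of(row[key], bucket_count)].append(row)
    return buckets


def colocated_join(
    left: List[dict], right: List[dict], key: str, bucket_count: int
) -> List[dict]:
    """Inner join that only ever pairs bucket i of left with bucket i of right."""
    left_buckets = bucketize(left, key, bucket_count)
    right_buckets = bucketize(right, key, bucket_count)
    joined: List[dict] = []
    # Each shared bucket number is an independent unit of work.
    for b in left_buckets.keys() & right_buckets.keys():
        index: Dict[object, List[dict]] = defaultdict(list)
        for r in right_buckets[b]:
            index[r[key]].append(r)
        for l in left_buckets[b]:
            for r in index.get(l[key], []):
                joined.append({**l, **r})
    return joined


left = [{"id": i, "l": i * 10} for i in range(5)]
right = [{"id": i, "r": i * 100} for i in (1, 3, 5)]
result = sorted(colocated_join(left, right, "id", 16), key=lambda row: row["id"])
```

The per-bucket result is the same as a plain inner join on `id`; the difference is that no bucket ever needs rows from another bucket.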
PERFORMANCE COMPARISON
SQLs on TPC-H (scale factor=1000)
[Bar chart: elapsed time, axis 0 to 80 sec, comparing NORMAL and UDP tables for the queries between, mod_predicate, and count_distinct.]
USER DEFINED PARTITIONING
[Figure: the grid of 1h time ranges (time axis) against the values v1, v2, v3 of column c1, shown twice, for a query of the form … WHERE time = …]
FUTURE WORKS
• Maintaining an efficient partitioning structure
• Developing a Stella job to rearrange the partitioning schema flexibly using Presto resources.
• Various kinds of pipelines (streaming import, etc.) should support UDP tables.
• Documentation
T R E A S U R E D A T A

Mine Environment II Lab_MI10448MI__________.pptx
 
70 POWER PLANT IAE V2500 technical training
70 POWER PLANT IAE V2500 technical training70 POWER PLANT IAE V2500 technical training
70 POWER PLANT IAE V2500 technical training
 
Input Output Management in Operating System
Input Output Management in Operating SystemInput Output Management in Operating System
Input Output Management in Operating System
 
Novel 3D-Printed Soft Linear and Bending Actuators
Novel 3D-Printed Soft Linear and Bending ActuatorsNovel 3D-Printed Soft Linear and Bending Actuators
Novel 3D-Printed Soft Linear and Bending Actuators
 
『澳洲文凭』买麦考瑞大学毕业证书成绩单办理澳洲Macquarie文凭学位证书
『澳洲文凭』买麦考瑞大学毕业证书成绩单办理澳洲Macquarie文凭学位证书『澳洲文凭』买麦考瑞大学毕业证书成绩单办理澳洲Macquarie文凭学位证书
『澳洲文凭』买麦考瑞大学毕业证书成绩单办理澳洲Macquarie文凭学位证书
 
Stork Webinar | APM Transformational planning, Tool Selection & Performance T...
Stork Webinar | APM Transformational planning, Tool Selection & Performance T...Stork Webinar | APM Transformational planning, Tool Selection & Performance T...
Stork Webinar | APM Transformational planning, Tool Selection & Performance T...
 
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdfDEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdf
 
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.pptROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
 
CS 3251 Programming in c all unit notes pdf
CS 3251 Programming in c all unit notes pdfCS 3251 Programming in c all unit notes pdf
CS 3251 Programming in c all unit notes pdf
 
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTIONTHE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
 
KCD Costa Rica 2024 - Nephio para parvulitos
KCD Costa Rica 2024 - Nephio para parvulitosKCD Costa Rica 2024 - Nephio para parvulitos
KCD Costa Rica 2024 - Nephio para parvulitos
 
Prach: A Feature-Rich Platform Empowering the Autism Community
Prach: A Feature-Rich Platform Empowering the Autism CommunityPrach: A Feature-Rich Platform Empowering the Autism Community
Prach: A Feature-Rich Platform Empowering the Autism Community
 
FUNCTIONAL AND NON FUNCTIONAL REQUIREMENT
FUNCTIONAL AND NON FUNCTIONAL REQUIREMENTFUNCTIONAL AND NON FUNCTIONAL REQUIREMENT
FUNCTIONAL AND NON FUNCTIONAL REQUIREMENT
 
SOFTWARE ESTIMATION COCOMO AND FP CALCULATION
SOFTWARE ESTIMATION COCOMO AND FP CALCULATIONSOFTWARE ESTIMATION COCOMO AND FP CALCULATION
SOFTWARE ESTIMATION COCOMO AND FP CALCULATION
 
List of Accredited Concrete Batching Plant.pdf
List of Accredited Concrete Batching Plant.pdfList of Accredited Concrete Batching Plant.pdf
List of Accredited Concrete Batching Plant.pdf
 
Python Programming for basic beginners.pptx
Python Programming for basic beginners.pptxPython Programming for basic beginners.pptx
Python Programming for basic beginners.pptx
 

User Defined Partitioning on PlazmaDB

  • 1. TREASURE DATA USER DEFINED PARTITIONING: A New Partitioning Strategy Accelerating CDP Workload. Kai Sasaki, Software Engineer at Treasure Data
  • 2. ABOUT ME - Kai Sasaki (@Lewuathe) - Software Engineer at Treasure Data since 2015, working on the Query Engine Team (managing Hive and Presto at Treasure Data) - Contributor to Hadoop, Spark, Presto
  • 3. TOPICS PlazmaDB: PlazmaDB is a metadata storage for all log data in Treasure Data. It supports import, export, INSERT INTO, CREATE TABLE, DELETE, etc. on top of the PostgreSQL transaction mechanism. Time Index Partitioning: Partitions log data by the time the log was generated, which is stored in the “time” column in Treasure Data. It lets us skip reading unnecessary partitions. User Defined Partitioning (New!): In addition to the “time” column, we can use any column as a partitioning key. This gives us a more flexible partitioning strategy that fits CDP workloads.
  • 4. OVERVIEW OF QUERY ENGINE IN TD
  • 5. PRESTO IN TREASURE DATA • Multiple clusters with 50~60 workers per cluster • Presto 0.188. Stats (at the end of 2017): • 4.3+ million queries / month • 400 trillion records / month • 6+ PB / month
  • 6. HIVE AND PRESTO ON PLAZMADB [Diagram labels: Bulk Import, Fluentd, Mobile SDK; PlazmaDB; Amazon S3; Presto, Hive; SQL, CDP]
  • 7. PLAZMADB PlazmaDB, Amazon S3
         id       data_set_id  first_index_key  last_index_key  record_count  path
      P1 3065124  187250       1412323028       1412385139      109           abcdefg-1234567-abcdefg-1234567
      P2 3065125  187250       1412323030       1412324030      209           abcdefg-1234567-abcdefg-9182841
      P3 3065126  187250       1412327028       1412328028      31            abcdefg-1234567-abcdefg-5818231
      P4 3065127  187250       1412325011       1412326001      102           abcdefg-1234567-abcdefg-7271828
      P5 3065128  281254       1412324214       1412325210      987           abcdefg-1234567-abcdefg-6717284
      P6 3065129  281254       1412325123       1412329800      541           abcdefg-1234567-abcdefg-5717274
      Multi Column Indexes. s3://plazma-partitions/… 1-hour partitioning
  • 8. PLAZMADB [Diagram labels: PlazmaDB; Amazon S3 Realtime Storage; Amazon S3 Archive Storage; MapReduce] Keeps 1-hour partitioning periodically. Time-Indexed Partitioning
  • 9. PROBLEM • Time index partitioning is efficient only when a “time” value is specified. Specifying other columns causes a full scan, which can make performance worse. • The number of records in a partition depends heavily on the table type and on user usage. SELECT COUNT(1) FROM table WHERE user_id = 1;
         id       data_set_id  first_index_key  last_index_key  record_count  path
      P1 3065124  100          1412323028       1412385139      1             abcdefg-1234567-abcdefg-1234567
      P2 3065125  100          1412323030       1412324030      1             abcdefg-1234567-abcdefg-9182841
      P3 3065126  100          1412327028       1412328028      1             abcdefg-1234567-abcdefg-5818231
      P4 3065127  200          1412325011       1412326001      101021        abcdefg-1234567-abcdefg-7271828
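The pruning behaviour described on slide 9 can be sketched in Python. This is a minimal illustration, not PlazmaDB's implementation: the `Partition` class and both function names are hypothetical, and only the metadata columns and row values are taken from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    # Field names follow the partition metadata columns shown on the slide.
    id: int
    data_set_id: int
    first_index_key: int  # earliest "time" value stored in the partition
    last_index_key: int   # latest "time" value stored in the partition
    record_count: int


# Rows copied from the slide 9 table (the path column is omitted).
PARTITIONS = [
    Partition(3065124, 100, 1412323028, 1412385139, 1),
    Partition(3065125, 100, 1412323030, 1412324030, 1),
    Partition(3065126, 100, 1412327028, 1412328028, 1),
    Partition(3065127, 200, 1412325011, 1412326001, 101021),
]


def prune_by_time(partitions, data_set_id, start, end):
    """Keep only partitions whose time range overlaps [start, end]."""
    return [
        p for p in partitions
        if p.data_set_id == data_set_id
        and p.first_index_key <= end
        and p.last_index_key >= start
    ]


def prune_by_other_column(partitions, data_set_id):
    """A predicate such as user_id = 1 cannot use the time index,
    so every partition of the table has to be read."""
    return [p for p in partitions if p.data_set_id == data_set_id]
```

With a time predicate the metadata alone rules out the second partition of data set 100, whereas a predicate on another column leaves all three partitions of that data set to be scanned.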
  • 11. USER DEFINED PARTITIONING • Users can specify a partitioning strategy based on their usage, using a partitioning key column and a max time range. [Diagram: time axis in 1h ranges against column c1 with values v1, v2, v3]
  • 12. USER DEFINED PARTITIONING • Users can specify a partitioning strategy based on their usage, using a partitioning key column and a max time range. [Diagram: time axis in 1h ranges against column c1 with values v1, v2, v3] … WHERE c1 = 'v1' AND time = …
  • 13. USER DEFINED PARTITIONING • Users can specify a partitioning strategy based on their usage, using a partitioning key column and a max time range. [Diagram, shown twice: time axis in 1h ranges against column c1 with values v1, v2, v3] … WHERE c1 = 'v1' AND time = …
  • 14. USER DEFINED PARTITIONING • CREATE TABLE via Presto or Hive • Insert data partitioned by the configured partitioning key • Set user defined configuration: the number of buckets, hash function, partitioning key • Read the data from the UDP table: the UDP table is now visible via Presto and Hive [Diagram label: LOG]
  • 15. USER DEFINED CONFIGURATION • We need to set the columns to be used as the partitioning key and the number of partitions. This configuration should be customizable per user.
         user_table_id  columns                                          bucket_count  partiton_function
      T1 141849         [["o_orderkey","long"]]                          32            hash
      T2 141850         [["user_id","long"]]                             32            hash
      T3 141910         [["item_id","long"]]                             16            hash
      T4 151242         [["region_id","long"],["device_id","long"]]      256           hash
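How a configuration row like the ones above could map a record to a bucket can be sketched as follows. The slides only say that the partition function is `hash`. The use of `zlib.crc32`, the separator byte, and the function name `bucket_number` are assumptions made for illustration, not the hash Treasure Data actually uses.

```python
import zlib


def bucket_number(key_values, bucket_count):
    """Map the partitioning key values of one record to a bucket.

    Assumption: CRC32 stands in for the unspecified "hash" partition
    function. Any deterministic hash would illustrate the same idea.
    """
    # Join the key columns with a separator so ("1", "23") != ("12", "3").
    payload = "\x1f".join(str(v) for v in key_values).encode("utf-8")
    return zlib.crc32(payload) % bucket_count


# T2-style configuration: one key column (user_id), 32 buckets.
single_key_bucket = bucket_number((42,), 32)

# T4-style configuration: two key columns (region_id, device_id), 256 buckets.
composite_key_bucket = bucket_number((7, 1001), 256)
```

Because the function is deterministic, every record with the same key values lands in the same bucket, which is what allows a reader to skip all other buckets later.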
  • 16. CREATE UDP TABLE VIA PRESTO • Presto and Hive support CREATE TABLE/INSERT INTO on UDP tables. CREATE TABLE udp_customer WITH ( bucketed_on = array['customer_id'], bucket_count = 128 ) AS SELECT * FROM normal_customer;
  • 17. CREATE UDP TABLE VIA PRESTO • Override ConnectorPageSink to write MPC1 files based on the user defined partitioning key. [Diagram labels: Page; PlazmaPageSink; PartitionedMPCWriter; TimeRangeMPCWriter (×3, …); BufferedMPCWriter (×3); b1, b2, b3; 1h, 1h, 1h]
  • 18. CREATE UDP TABLE VIA PRESTO • Override ConnectorPageSink to write MPC1 files based on the user defined partitioning key. [Diagram labels: Page; PlazmaPageSink; PartitionedMPCWriter; TimeRangeMPCWriter (×3, …); BufferedMPCWriter (×3)]
  • 19. CREATE UDP TABLE VIA PRESTO • A new bucket_number column is added to the partition record in PlazmaDB.
         id       data_set_id  first_index_key  last_index_key  record_count  path                             bucket_number
      P1 3065124  187250       1412323028       1412385139      109           abcdefg-1234567-abcdefg-1234567  1
      P2 3065125  187250       1412323030       1412324030      209           abcdefg-1234567-abcdefg-9182841  2
      P3 3065126  187250       1412327028       1412328028      31            abcdefg-1234567-abcdefg-5818231  3
      P4 3065127  187250       1412325011       1412326001      102           abcdefg-1234567-abcdefg-7271828  2
      P5 3065128  281254       1412324214       1412325210      987           abcdefg-1234567-abcdefg-6717284  16
      P6 3065129  281254       1412325123       1412329800      541           abcdefg-1234567-abcdefg-5717274  14
  • 20. READ DATA FROM UDP TABLE ConnectorSplitManager#getSplits returns the data source splits to be read by the Presto cluster. • Decide target bucket from constraint: The constraint specifies the range that should be read from the table. ConnectorSplitManager asks PlazmaDB for the partitions in the target bucket. • Override Presto Connector to data source: Presto provides a plugin mechanism to connect to any data source flexibly. The connector provides information about metadata, the location of the real data source, and UDFs. • Receive constraint as TupleDomain: TupleDomain is created from the query plan and passed through TableLayout, which is available in ConnectorSplitManager.
  • 21. READ DATA FROM UDP TABLE [Diagram labels: SQL; constraint Map<ColumnHandle, Domain>; TableLayout; SplitManager; PlazmaDB; Distribute; PageSource] … WHERE bucket_number IN () …
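The "decide target bucket from constraint" step can be sketched in Python. This is only an illustration of the idea, not the Presto connector code: the constraint is modelled as a plain dict from column name to a set of discrete values (a simplified stand-in for Presto's TupleDomain), and `bucket_number` with its CRC32 hash and `target_buckets` are hypothetical helpers.

```python
import itertools
import zlib


def bucket_number(key_values, bucket_count):
    # Assumption: CRC32 stands in for the unspecified hash function.
    payload = "\x1f".join(str(v) for v in key_values).encode("utf-8")
    return zlib.crc32(payload) % bucket_count


def target_buckets(constraint, partition_columns, bucket_count):
    """Return the set of buckets a query must read, or None for all buckets.

    constraint maps a column name to the set of values allowed by the
    WHERE clause. Range predicates are not handled in this sketch.
    """
    try:
        value_sets = [constraint[column] for column in partition_columns]
    except KeyError:
        # A partitioning column is unconstrained, so no bucket can be skipped.
        return None
    return {
        bucket_number(combo, bucket_count)
        for combo in itertools.product(*value_sets)
    }


# WHERE user_id IN (1, 2) on a table bucketed on user_id with 32 buckets.
buckets_for_user_filter = target_buckets({"user_id": {1, 2}}, ["user_id"], 32)

# WHERE time = ... only: the partitioning key is not constrained.
buckets_for_time_filter = target_buckets({"time": {1412323028}}, ["user_id"], 32)
```

The resulting bucket set is what would fill the `bucket_number IN ()` filter sent to PlazmaDB, and a `None` result corresponds to reading every bucket.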
  • 23. PERFORMANCE COMPARISON SQLs on TPC-H (scale factor = 1000) [Bar chart: elapsed time (sec), NORMAL vs. UDP, for queries count1_filter, groupby, hashjoin; data labels: 87.279, 36.569, 1.04, 266.71, 69.374, 19.478]
  • 24. COLOCATED JOIN [Diagram comparing Distributed Join and Colocated Join over left and right tables along the time axis; labels: left, right, l1, r1, l2, r2, l3, r3]
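The idea behind a colocated join can be sketched as follows, assuming both tables are bucketed on the join key with the same bucket count and hash function. All names here are hypothetical and the CRC32 hash is a stand-in. The sketch shows only the principle that bucket i of the left table needs to be joined with bucket i of the right table and nothing else.

```python
import zlib
from collections import defaultdict


def bucket_of(key, bucket_count):
    # Assumption: CRC32 stands in for the shared partition hash.
    return zlib.crc32(str(key).encode("utf-8")) % bucket_count


def bucketize(rows, bucket_count):
    """Group (key, value) rows by the bucket of their key."""
    buckets = defaultdict(list)
    for key, value in rows:
        buckets[bucket_of(key, bucket_count)].append((key, value))
    return buckets


def colocated_join(left, right, bucket_count):
    """Inner join on key, matching rows only within the same bucket."""
    left_buckets = bucketize(left, bucket_count)
    right_buckets = bucketize(right, bucket_count)
    joined = []
    # Buckets present on only one side cannot produce matches.
    for bucket in left_buckets.keys() & right_buckets.keys():
        index = defaultdict(list)
        for key, right_value in right_buckets[bucket]:
            index[key].append(right_value)
        for key, left_value in left_buckets[bucket]:
            for right_value in index.get(key, []):
                joined.append((key, left_value, right_value))
    return joined


LEFT = [(1, "a"), (2, "b"), (3, "c")]
RIGHT = [(1, "x"), (3, "y"), (3, "z"), (4, "w")]
RESULT = sorted(colocated_join(LEFT, RIGHT, 4))
```

Since equal keys always hash to the same bucket, the per-bucket join returns the same rows as a join over the whole tables, while each bucket pair can be processed without moving rows between buckets.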
  • 25. PERFORMANCE COMPARISON SQLs on TPC-H (scale factor = 1000) [Bar chart: elapsed time (sec), NORMAL vs. UDP, for queries between, mod_predicate, count_distinct]
  • 26. USER DEFINED PARTITIONING [Diagram, shown twice: time axis in 1h ranges against column c1 with values v1, v2, v3] … WHERE time = …
  • 27. FUTURE WORK • Maintaining an efficient partitioning structure • Developing a Stella job to rearrange the partitioning schema flexibly using Presto resources • Various kinds of pipelines (streaming import, etc.) should support UDP tables • Documentation
  • 28. TREASURE DATA