Presto in Treasure Data
Mitsunori Komatsu, Treasure Data
Who am I?
• Mitsunori Komatsu, Software engineer @ Treasure Data
• Presto, Hive, Plazma, td-android-sdk, td-ios-sdk, Mobile SDK backend, embedded-sdk
• github: komamitsu, msgpack-java committer, Presto contributor, etc…
Today’s talk
• What's Presto?
• Pros & Cons
• Architecture
• Who uses Presto?
• How do we use Presto?
What’s Presto?
Fast
• Distributed SQL query engine (MPP)
• Low latency and good performance
• No disk IO (intermediate data stays in memory)
• Pipelined execution (not MapReduce)
• Compiles query plans down to Java bytecode
• Off-heap memory
• Suitable for ad-hoc queries
Pluggable
• Pluggable backends (“connectors”)
• Cassandra / Hive / JMX / Kafka / MySQL / PostgreSQL / System / TPCH
• We can add a new connector by extending the SPI
• Treasure Data has developed a connector to access our storage
What kind of SQL?
• Supports ANSI SQL (not HiveQL)
• Easier to use than HiveQL
• Structural types: Map, Array, JSON, Row
• Window functions
• Approximate queries (http://blinkdb.org/)
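
For instance, a minimal sketch combining a window function with an approximate aggregation, assuming an access_logs table with code and user_id columns (the table and column names are illustrative):

-- Rank status codes by hit count and estimate distinct users per code.
-- approx_distinct() is an approximate aggregation; rank() is a window function.
select
  code,
  count(1) as hits,
  approx_distinct(user_id) as approx_users,
  rank() over (order by count(1) desc) as hit_rank
from access_logs
group by code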
Limitations
• Fails with huge JOINs
• In-memory only (broadcast / distributed JOIN)
• No grace / hybrid hash join
• No fault tolerance
• Coordinator is a SPOF
• No “cost based” optimization, so join order matters (see the sketch below)
• No authentication / authorization
• No native ODBC => Prestogres
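
Because there is no cost-based optimization, joins run in the order written, and the right side of a join is the build side held in workers' memory. A minimal sketch (table names are illustrative):

-- Put the smaller table on the right: it becomes the in-memory build side,
-- while the larger table streams through as the probe side.
select e.user_id, u.name
from huge_events e
join small_users u
  on e.user_id = u.id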
Architectural overview
https://prestodb.io/overview.html
With Hive connector
Query plan

Output[nationkey, _col1] => [nationkey:bigint, count:bigint]
        - _col1 := count
    Exchange[GATHER] => nationkey:bigint, count:bigint
        Aggregate(FINAL)[nationkey] => [nationkey:bigint, count:bigint]
                - count := "count"("count_15")
            Exchange[REPARTITION] => nationkey:bigint, count_15:bigint
                Aggregate(PARTIAL)[nationkey] => [nationkey:bigint, count_15:bigint]
                        - count_15 := "count"("expr")
                    Project => [nationkey:bigint, expr:bigint]
                            - expr := 1
                        InnerJoin[("custkey" = "custkey_0")] => [custkey:bigint, custkey_0:bigint, nationkey:bigint]
                            Project => [custkey:bigint]
                                Filter[("orderpriority" = '1-URGENT')] => [custkey:bigint, orderpriority:varchar]
                                    TableScan[tpch:tpch:orders:sf0.01, original constraint=('1-URGENT' = "orderpriority")] => [custkey:bigint, orderpriority:varchar]
                                            - custkey := tpch:custkey:1
                                            - orderpriority := tpch:orderpriority:5
                            Exchange[REPLICATE] => custkey_0:bigint, nationkey:bigint
                                TableScan[tpch:tpch:customer:sf0.01, original constraint=true] => [custkey_0:bigint, nationkey:bigint]
                                        - custkey_0 := tpch:custkey:0
                                        - nationkey := tpch:nationkey:3

select
  c.nationkey,
  count(1)
from orders o
join customer c
  on o.custkey = c.custkey
where
  o.orderpriority = '1-URGENT'
group by c.nationkey

The plan is cut into stages at each Exchange: Stage 0 (Output / GATHER), Stage 1 (final aggregation), Stage 2 (scan, filter, join, partial aggregation), and Stage 3 (the replicated customer scan).
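
A plan like this can be printed with Presto's EXPLAIN statement; a minimal sketch, assuming the built-in TPC-H catalog is configured (TYPE DISTRIBUTED shows the split into stages/fragments):

explain (type distributed)
select
  c.nationkey,
  count(1)
from orders o
join customer c
  on o.custkey = c.custkey
where
  o.orderpriority = '1-URGENT'
group by c.nationkey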
Query, stage, task and split

A query is divided into stages; each stage runs as one or more tasks spread across workers (task i.j is task j of stage i, e.g. Task 0.0; Task 1.0, 1.1, 1.2; Task 2.0, 2.1, 2.2); and each task processes one or more splits.

For example, a simple scan-and-aggregate query maps to TableScan (FROM) → Aggregation (GROUP BY) → Output, with each stage's tasks running on different workers (e.g. @worker#2, @worker#3, @worker#0).
What does each component do?

Presto CLI: sends the query to the coordinator.
Coordinator:
- Parses the query
- Analyzes the query
- Creates the query plan
- Executes the query (a query contains stages)
- Executes stages (a stage contains tasks)
- Issues tasks to the workers
Discovery Service: where the coordinator and workers find each other.
Worker:
- Executes tasks
- Converts the query to Java bytecode (Operator)
- Executes Operators
Connector (on each worker, in front of the storage; metadata may come from an external metadata store):
- MetaData (Table, Column, …)
- SplitManager (Split, …)
- RecordSetProvider (RecordSet, RecordCursor) — reads the storage
Who uses Presto?
• Facebook (http://www.slideshare.net/dain1/presto-meetup-2015)
• Dropbox
• Airbnb
As a service…
• Qubole (SaaS)
• Treasure Data (SaaS)
• Teradata (new! — commercial support)
Today’s talk
• What's Presto?
• How do we use Presto?
• What’s Treasure Data
• Architecture
• How we manage Presto
How do we use Presto?

We…? Treasure Data
Time to Value

Acquire — Web logs, app logs, sensors, CRM, ERP, RDBMS, and POS data are collected via Treasure Agent (server), SDKs (JS, Android, iOS, Unity), streaming collectors, and bulk uploaders (Embulk, TD Toolbelt). [Connectivity]

Store — Data lands in Plazma DB: flexible, scalable, columnar storage, running @AWS or @IDCF. [Economy & Flexibility]

Analyze — SQL-based queries (SQL, Pig) cover both batch (reliability) and ad-hoc (low latency) workloads. [Simple & Supported]

Result Push — Query results are sent to KPI dashboards (Metric Insights), BI tools (Tableau, Motion Board, etc.), and other products (RDBMS, Google Docs, AWS S3, FTP servers, etc.) via REST API and ODBC/JDBC.
Architecture in Treasure Data

A query (e.g. “select user_id, count(1) from …”) arrives at the api server, which handles authentication / authorization and enqueues it in the worker queue (MySQL). A td worker process picks it up, retries failed queries if needed, and submits it to the Presto coordinator, which fans the work out to the Presto workers. Workers read plazmadb (PostgreSQL + S3/RiakCS; columnar file format, schema-less) through the td-presto connector, and the query result is written to the result bucket (S3).
Schema on read

access_logs table:

time                 | code  | method | user_id
2015-06-01 10:07:11  | 200   | GET    |
2015-06-01 10:10:12  | “200” | GET    |
2015-06-01 10:10:20  | 200   | GET    |
2015-06-01 10:11:30  | 200   | POST   |
2015-06-01 10:20:45  | 200   | GET    |
2015-06-01 10:33:50  | 400   | GET    | 206
2015-06-01 10:40:11  | 200   | GET    | 852
2015-06-01 10:51:32  | 200   | PUT    | 1223
2015-06-01 10:58:02  | 200   | GET    | 5118
2015-06-01 11:02:11  | 404   | GET    | 12
2015-06-01 11:14:27  | 200   | GET    | 3447

The user added a new column “user_id” to the imported data, and can select that column just by adding it to the schema (without reconstructing the table).
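
A minimal sketch of what that looks like, assuming “user_id” has just been added to the schema; rows imported before the column existed simply come back as NULL:

-- user_id is queryable immediately after the schema change,
-- over both old and new data (no table rebuild).
select
  user_id,
  count(1) as requests
from access_logs
where user_id is not null
group by user_id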
Columnar file format

The same access_logs table is stored column by column: time, code, method, and user_id each live in their own column file. This query therefore reads only the code column:

select code, count(1)
from tbl
group by code
td-presto connector
• MessagePack v0.7
• Off-heap memory
• Async IO with Jetty client
• Scheduling & resource management
How we manage Presto
• Blue-Green Deployment
• Stress test tool
• Monitoring with DataDog
Blue-Green Deployment

Two full Presto clusters (each with its own coordinator and workers) share the same api server, worker queue (MySQL), td worker process, plazmadb (PostgreSQL + S3/RiakCS), and result bucket (S3): one serves production, the other is a release candidate (rc).
The rc cluster is exercised first — Test, Test, Test!
Once the rc cluster checks out, traffic is switched over and it becomes production!
Stress test tool
• Collects queries that have ever caused issues.
• A new query is added just by appending an entry:
  - job_id: 28889999
  - result: 227d16d801a9a43148c2b7149ce4657c
  - job_id: 28889999
• The tool issues each query, gets the result, and computes the result digest automatically.
• We can send all the queries, including very heavy ones (around 6,000 stages), to Presto.
Monitoring with DataDog

A td-agent process sits next to the Presto coordinator and workers; its in_presto_metrics input polls each Presto process over /v1/jmx/mbean, /v1/query, and /v1/node, and out_metricsense forwards the metrics to DataDog.
Monitoring with DataDog

Query stalled time
- The most important metric for us: it triggers alert calls to us…
- It mainly increases due to td-presto connector problems, most of them race condition issues.
How many queries are processed?
- More than 10,000 queries / day
- Most queries finish within 1 min
Check: treasuredata.com
       treasure-data.hateblo.jp/ (Japanese blog)
Cloud service for the entire data pipeline

More Related Content

What's hot

Getting Started with MongoDB and NodeJS
Getting Started with MongoDB and NodeJSGetting Started with MongoDB and NodeJS
Getting Started with MongoDB and NodeJSMongoDB
 
Building ClickHouse and Making Your First Contribution: A Tutorial_06.10.2021
Building ClickHouse and Making Your First Contribution: A Tutorial_06.10.2021Building ClickHouse and Making Your First Contribution: A Tutorial_06.10.2021
Building ClickHouse and Making Your First Contribution: A Tutorial_06.10.2021Altinity Ltd
 
ClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei MilovidovClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei MilovidovAltinity Ltd
 
Using Cerberus and PySpark to validate semi-structured datasets
Using Cerberus and PySpark to validate semi-structured datasetsUsing Cerberus and PySpark to validate semi-structured datasets
Using Cerberus and PySpark to validate semi-structured datasetsBartosz Konieczny
 
Creating Beautiful Dashboards with Grafana and ClickHouse
Creating Beautiful Dashboards with Grafana and ClickHouseCreating Beautiful Dashboards with Grafana and ClickHouse
Creating Beautiful Dashboards with Grafana and ClickHouseAltinity Ltd
 
New features in ProxySQL 2.0 (updated to 2.0.9) by Rene Cannao (ProxySQL)
New features in ProxySQL 2.0 (updated to 2.0.9) by Rene Cannao (ProxySQL)New features in ProxySQL 2.0 (updated to 2.0.9) by Rene Cannao (ProxySQL)
New features in ProxySQL 2.0 (updated to 2.0.9) by Rene Cannao (ProxySQL)Altinity Ltd
 
Time Series Meetup: Virtual Edition | July 2020
Time Series Meetup: Virtual Edition | July 2020Time Series Meetup: Virtual Edition | July 2020
Time Series Meetup: Virtual Edition | July 2020InfluxData
 
Apache Spark Structured Streaming + Apache Kafka = ♡
Apache Spark Structured Streaming + Apache Kafka = ♡Apache Spark Structured Streaming + Apache Kafka = ♡
Apache Spark Structured Streaming + Apache Kafka = ♡Bartosz Konieczny
 
Programming with Python and PostgreSQL
Programming with Python and PostgreSQLProgramming with Python and PostgreSQL
Programming with Python and PostgreSQLPeter Eisentraut
 
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEOClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEOAltinity Ltd
 
Accelerating Local Search with PostgreSQL (KNN-Search)
Accelerating Local Search with PostgreSQL (KNN-Search)Accelerating Local Search with PostgreSQL (KNN-Search)
Accelerating Local Search with PostgreSQL (KNN-Search)Jonathan Katz
 
ClickHouse and the Magic of Materialized Views, By Robert Hodges and Altinity...
ClickHouse and the Magic of Materialized Views, By Robert Hodges and Altinity...ClickHouse and the Magic of Materialized Views, By Robert Hodges and Altinity...
ClickHouse and the Magic of Materialized Views, By Robert Hodges and Altinity...Altinity Ltd
 
InfluxDB IOx Tech Talks: Intro to the InfluxDB IOx Read Buffer - A Read-Optim...
InfluxDB IOx Tech Talks: Intro to the InfluxDB IOx Read Buffer - A Read-Optim...InfluxDB IOx Tech Talks: Intro to the InfluxDB IOx Read Buffer - A Read-Optim...
InfluxDB IOx Tech Talks: Intro to the InfluxDB IOx Read Buffer - A Read-Optim...InfluxData
 
ClickHouse Monitoring 101: What to monitor and how
ClickHouse Monitoring 101: What to monitor and howClickHouse Monitoring 101: What to monitor and how
ClickHouse Monitoring 101: What to monitor and howAltinity Ltd
 
Obtaining the Perfect Smoke By Monitoring Your BBQ with InfluxDB and Telegraf
Obtaining the Perfect Smoke By Monitoring Your BBQ with InfluxDB and TelegrafObtaining the Perfect Smoke By Monitoring Your BBQ with InfluxDB and Telegraf
Obtaining the Perfect Smoke By Monitoring Your BBQ with InfluxDB and TelegrafInfluxData
 
Apache Spark in your likeness - low and high level customization
Apache Spark in your likeness - low and high level customizationApache Spark in your likeness - low and high level customization
Apache Spark in your likeness - low and high level customizationBartosz Konieczny
 
Extending Flux to Support Other Databases and Data Stores | Adam Anthony | In...
Extending Flux to Support Other Databases and Data Stores | Adam Anthony | In...Extending Flux to Support Other Databases and Data Stores | Adam Anthony | In...
Extending Flux to Support Other Databases and Data Stores | Adam Anthony | In...InfluxData
 
Using Apache Spark to Solve Sessionization Problem in Batch and Streaming
Using Apache Spark to Solve Sessionization Problem in Batch and StreamingUsing Apache Spark to Solve Sessionization Problem in Batch and Streaming
Using Apache Spark to Solve Sessionization Problem in Batch and StreamingDatabricks
 
Webinar slides: Adding Fast Analytics to MySQL Applications with Clickhouse
Webinar slides: Adding Fast Analytics to MySQL Applications with ClickhouseWebinar slides: Adding Fast Analytics to MySQL Applications with Clickhouse
Webinar slides: Adding Fast Analytics to MySQL Applications with ClickhouseAltinity Ltd
 

What's hot (20)

Getting Started with MongoDB and NodeJS
Getting Started with MongoDB and NodeJSGetting Started with MongoDB and NodeJS
Getting Started with MongoDB and NodeJS
 
Building ClickHouse and Making Your First Contribution: A Tutorial_06.10.2021
Building ClickHouse and Making Your First Contribution: A Tutorial_06.10.2021Building ClickHouse and Making Your First Contribution: A Tutorial_06.10.2021
Building ClickHouse and Making Your First Contribution: A Tutorial_06.10.2021
 
ClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei MilovidovClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei Milovidov
 
Using Cerberus and PySpark to validate semi-structured datasets
Using Cerberus and PySpark to validate semi-structured datasetsUsing Cerberus and PySpark to validate semi-structured datasets
Using Cerberus and PySpark to validate semi-structured datasets
 
Creating Beautiful Dashboards with Grafana and ClickHouse
Creating Beautiful Dashboards with Grafana and ClickHouseCreating Beautiful Dashboards with Grafana and ClickHouse
Creating Beautiful Dashboards with Grafana and ClickHouse
 
New features in ProxySQL 2.0 (updated to 2.0.9) by Rene Cannao (ProxySQL)
New features in ProxySQL 2.0 (updated to 2.0.9) by Rene Cannao (ProxySQL)New features in ProxySQL 2.0 (updated to 2.0.9) by Rene Cannao (ProxySQL)
New features in ProxySQL 2.0 (updated to 2.0.9) by Rene Cannao (ProxySQL)
 
Time Series Meetup: Virtual Edition | July 2020
Time Series Meetup: Virtual Edition | July 2020Time Series Meetup: Virtual Edition | July 2020
Time Series Meetup: Virtual Edition | July 2020
 
Apache Spark Structured Streaming + Apache Kafka = ♡
Apache Spark Structured Streaming + Apache Kafka = ♡Apache Spark Structured Streaming + Apache Kafka = ♡
Apache Spark Structured Streaming + Apache Kafka = ♡
 
Programming with Python and PostgreSQL
Programming with Python and PostgreSQLProgramming with Python and PostgreSQL
Programming with Python and PostgreSQL
 
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEOClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
 
Accelerating Local Search with PostgreSQL (KNN-Search)
Accelerating Local Search with PostgreSQL (KNN-Search)Accelerating Local Search with PostgreSQL (KNN-Search)
Accelerating Local Search with PostgreSQL (KNN-Search)
 
ClickHouse and the Magic of Materialized Views, By Robert Hodges and Altinity...
ClickHouse and the Magic of Materialized Views, By Robert Hodges and Altinity...ClickHouse and the Magic of Materialized Views, By Robert Hodges and Altinity...
ClickHouse and the Magic of Materialized Views, By Robert Hodges and Altinity...
 
InfluxDB IOx Tech Talks: Intro to the InfluxDB IOx Read Buffer - A Read-Optim...
InfluxDB IOx Tech Talks: Intro to the InfluxDB IOx Read Buffer - A Read-Optim...InfluxDB IOx Tech Talks: Intro to the InfluxDB IOx Read Buffer - A Read-Optim...
InfluxDB IOx Tech Talks: Intro to the InfluxDB IOx Read Buffer - A Read-Optim...
 
ClickHouse Monitoring 101: What to monitor and how
ClickHouse Monitoring 101: What to monitor and howClickHouse Monitoring 101: What to monitor and how
ClickHouse Monitoring 101: What to monitor and how
 
Obtaining the Perfect Smoke By Monitoring Your BBQ with InfluxDB and Telegraf
Obtaining the Perfect Smoke By Monitoring Your BBQ with InfluxDB and TelegrafObtaining the Perfect Smoke By Monitoring Your BBQ with InfluxDB and Telegraf
Obtaining the Perfect Smoke By Monitoring Your BBQ with InfluxDB and Telegraf
 
Hazelcast
HazelcastHazelcast
Hazelcast
 
Apache Spark in your likeness - low and high level customization
Apache Spark in your likeness - low and high level customizationApache Spark in your likeness - low and high level customization
Apache Spark in your likeness - low and high level customization
 
Extending Flux to Support Other Databases and Data Stores | Adam Anthony | In...
Extending Flux to Support Other Databases and Data Stores | Adam Anthony | In...Extending Flux to Support Other Databases and Data Stores | Adam Anthony | In...
Extending Flux to Support Other Databases and Data Stores | Adam Anthony | In...
 
Using Apache Spark to Solve Sessionization Problem in Batch and Streaming
Using Apache Spark to Solve Sessionization Problem in Batch and StreamingUsing Apache Spark to Solve Sessionization Problem in Batch and Streaming
Using Apache Spark to Solve Sessionization Problem in Batch and Streaming
 
Webinar slides: Adding Fast Analytics to MySQL Applications with Clickhouse
Webinar slides: Adding Fast Analytics to MySQL Applications with ClickhouseWebinar slides: Adding Fast Analytics to MySQL Applications with Clickhouse
Webinar slides: Adding Fast Analytics to MySQL Applications with Clickhouse
 

Viewers also liked

[db tech showcase Tokyo 2015] D22:インメモリープラットホームSAP HANAのご紹介と最新情報 by SAPジャパン株式...
[db tech showcase Tokyo 2015] D22:インメモリープラットホームSAP HANAのご紹介と最新情報 by SAPジャパン株式...[db tech showcase Tokyo 2015] D22:インメモリープラットホームSAP HANAのご紹介と最新情報 by SAPジャパン株式...
[db tech showcase Tokyo 2015] D22:インメモリープラットホームSAP HANAのご紹介と最新情報 by SAPジャパン株式...Insight Technology, Inc.
 
[db tech showcase Tokyo 2015] D23:MySQLはドキュメントデータベースになり、HTTPもしゃべる - MySQL Lab...
[db tech showcase Tokyo 2015] D23:MySQLはドキュメントデータベースになり、HTTPもしゃべる - MySQL Lab...[db tech showcase Tokyo 2015] D23:MySQLはドキュメントデータベースになり、HTTPもしゃべる - MySQL Lab...
[db tech showcase Tokyo 2015] D23:MySQLはドキュメントデータベースになり、HTTPもしゃべる - MySQL Lab...Insight Technology, Inc.
 
[db tech showcase Tokyo 2015] D32:HPの全方位インメモリDB化に向けた取り組みとSAP HANAインメモリDB の効果を...
[db tech showcase Tokyo 2015] D32:HPの全方位インメモリDB化に向けた取り組みとSAP HANAインメモリDB の効果を...[db tech showcase Tokyo 2015] D32:HPの全方位インメモリDB化に向けた取り組みとSAP HANAインメモリDB の効果を...
[db tech showcase Tokyo 2015] D32:HPの全方位インメモリDB化に向けた取り組みとSAP HANAインメモリDB の効果を...Insight Technology, Inc.
 
[db tech showcase Tokyo 2015] B34:データの仮想化を具体化するIBMのロジカルデータウェアハウス by 日本アイ・ビー・エ...
[db tech showcase Tokyo 2015] B34:データの仮想化を具体化するIBMのロジカルデータウェアハウス by 日本アイ・ビー・エ...[db tech showcase Tokyo 2015] B34:データの仮想化を具体化するIBMのロジカルデータウェアハウス by 日本アイ・ビー・エ...
[db tech showcase Tokyo 2015] B34:データの仮想化を具体化するIBMのロジカルデータウェアハウス by 日本アイ・ビー・エ...Insight Technology, Inc.
 
[db tech showcase Tokyo 2015] D13:PCIeフラッシュで、高可用性高性能データベースシステム?! by 株式会社HGSTジ...
[db tech showcase Tokyo 2015] D13:PCIeフラッシュで、高可用性高性能データベースシステム?! by 株式会社HGSTジ...[db tech showcase Tokyo 2015] D13:PCIeフラッシュで、高可用性高性能データベースシステム?! by 株式会社HGSTジ...
[db tech showcase Tokyo 2015] D13:PCIeフラッシュで、高可用性高性能データベースシステム?! by 株式会社HGSTジ...Insight Technology, Inc.
 
[db tech showcase Tokyo 2015] D16:マイケルストーンブレーカー発の超高速データベースで実現する分析基盤の簡単構築・運用ステ...
[db tech showcase Tokyo 2015] D16:マイケルストーンブレーカー発の超高速データベースで実現する分析基盤の簡単構築・運用ステ...[db tech showcase Tokyo 2015] D16:マイケルストーンブレーカー発の超高速データベースで実現する分析基盤の簡単構築・運用ステ...
[db tech showcase Tokyo 2015] D16:マイケルストーンブレーカー発の超高速データベースで実現する分析基盤の簡単構築・運用ステ...Insight Technology, Inc.
 
Mongodb x business
Mongodb x businessMongodb x business
Mongodb x businessemin_press
 
Dbts2015 tokyo vector_in_hadoop_vortex
Dbts2015 tokyo vector_in_hadoop_vortexDbts2015 tokyo vector_in_hadoop_vortex
Dbts2015 tokyo vector_in_hadoop_vortexKoji Shinkubo
 
[db tech showcase Tokyo 2015] C16:Oracle Disaster Recovery at New Zealand sto...
[db tech showcase Tokyo 2015] C16:Oracle Disaster Recovery at New Zealand sto...[db tech showcase Tokyo 2015] C16:Oracle Disaster Recovery at New Zealand sto...
[db tech showcase Tokyo 2015] C16:Oracle Disaster Recovery at New Zealand sto...Insight Technology, Inc.
 
[db tech showcase Tokyo 2015] C33:ビッグデータ・IoT時代のキーテクノロジー、CEPの「今」を掴む! by 株式会社日立...
[db tech showcase Tokyo 2015] C33:ビッグデータ・IoT時代のキーテクノロジー、CEPの「今」を掴む! by 株式会社日立...[db tech showcase Tokyo 2015] C33:ビッグデータ・IoT時代のキーテクノロジー、CEPの「今」を掴む! by 株式会社日立...
[db tech showcase Tokyo 2015] C33:ビッグデータ・IoT時代のキーテクノロジー、CEPの「今」を掴む! by 株式会社日立...Insight Technology, Inc.
 
DBTS2015 Tokyo DBAが知っておくべき最新テクノロジー
DBTS2015 Tokyo DBAが知っておくべき最新テクノロジーDBTS2015 Tokyo DBAが知っておくべき最新テクノロジー
DBTS2015 Tokyo DBAが知っておくべき最新テクノロジーMasaya Ishikawa
 
[db tech showcase Tokyo 2015] C32:「データ一貫性にこだわる日立のインメモリ分散KVS~こだわりの理由と実現方法とは~」 ...
[db tech showcase Tokyo 2015] C32:「データ一貫性にこだわる日立のインメモリ分散KVS~こだわりの理由と実現方法とは~」 ...[db tech showcase Tokyo 2015] C32:「データ一貫性にこだわる日立のインメモリ分散KVS~こだわりの理由と実現方法とは~」 ...
[db tech showcase Tokyo 2015] C32:「データ一貫性にこだわる日立のインメモリ分散KVS~こだわりの理由と実現方法とは~」 ...Insight Technology, Inc.
 
[DB tech showcase Tokyo 2015] B37 :オンプレミスからAWS上のSAP HANAまで高信頼DBシステム構築にHAクラスタリ...
[DB tech showcase Tokyo 2015] B37 :オンプレミスからAWS上のSAP HANAまで高信頼DBシステム構築にHAクラスタリ...[DB tech showcase Tokyo 2015] B37 :オンプレミスからAWS上のSAP HANAまで高信頼DBシステム構築にHAクラスタリ...
[DB tech showcase Tokyo 2015] B37 :オンプレミスからAWS上のSAP HANAまで高信頼DBシステム構築にHAクラスタリ...Funada Yasunobu
 
Apache Hiveの今とこれから
Apache Hiveの今とこれからApache Hiveの今とこれから
Apache Hiveの今とこれからYifeng Jiang
 
[db tech showcase Tokyo 2015] C25:HP NonStop SQLはなぜグローバルに分散DBを構築できるのか、 データの整合...
[db tech showcase Tokyo 2015] C25:HP NonStop SQLはなぜグローバルに分散DBを構築できるのか、 データの整合...[db tech showcase Tokyo 2015] C25:HP NonStop SQLはなぜグローバルに分散DBを構築できるのか、 データの整合...
[db tech showcase Tokyo 2015] C25:HP NonStop SQLはなぜグローバルに分散DBを構築できるのか、 データの整合...Insight Technology, Inc.
 
[db tech showcase Tokyo 2015] E35: Web, IoT, モバイル時代のデータベース、Apache Cassandraを学ぼう
[db tech showcase Tokyo 2015] E35: Web, IoT, モバイル時代のデータベース、Apache Cassandraを学ぼう[db tech showcase Tokyo 2015] E35: Web, IoT, モバイル時代のデータベース、Apache Cassandraを学ぼう
[db tech showcase Tokyo 2015] E35: Web, IoT, モバイル時代のデータベース、Apache Cassandraを学ぼうdatastaxjp
 
[db tech showcase Tokyo 2015] A33:Amazon DynamoDB Deep Dive by アマゾン データ サービス ...
[db tech showcase Tokyo 2015] A33:Amazon DynamoDB Deep Dive by アマゾン データ サービス ...[db tech showcase Tokyo 2015] A33:Amazon DynamoDB Deep Dive by アマゾン データ サービス ...
[db tech showcase Tokyo 2015] A33:Amazon DynamoDB Deep Dive by アマゾン データ サービス ...Insight Technology, Inc.
 
[db tech showcase Tokyo 2015] C14:30万のユーザ部門を抱える日立、情シスの「理想と現実」 by 株式会社日立製作所 情報...
[db tech showcase Tokyo 2015] C14:30万のユーザ部門を抱える日立、情シスの「理想と現実」 by 株式会社日立製作所 情報...[db tech showcase Tokyo 2015] C14:30万のユーザ部門を抱える日立、情シスの「理想と現実」 by 株式会社日立製作所 情報...
[db tech showcase Tokyo 2015] C14:30万のユーザ部門を抱える日立、情シスの「理想と現実」 by 株式会社日立製作所 情報...Insight Technology, Inc.
 
Couchbase introduction-20150611
Couchbase introduction-20150611Couchbase introduction-20150611
Couchbase introduction-20150611Couchbase Japan KK
 
[db tech showcase Tokyo 2015] B36:Hitachi Advanced Data Binder 実践SQLチューニング方法 ...
[db tech showcase Tokyo 2015] B36:Hitachi Advanced Data Binder 実践SQLチューニング方法 ...[db tech showcase Tokyo 2015] B36:Hitachi Advanced Data Binder 実践SQLチューニング方法 ...
[db tech showcase Tokyo 2015] B36:Hitachi Advanced Data Binder 実践SQLチューニング方法 ...Insight Technology, Inc.
 

Viewers also liked (20)

[db tech showcase Tokyo 2015] D22:インメモリープラットホームSAP HANAのご紹介と最新情報 by SAPジャパン株式...
[db tech showcase Tokyo 2015] D22:インメモリープラットホームSAP HANAのご紹介と最新情報 by SAPジャパン株式...[db tech showcase Tokyo 2015] D22:インメモリープラットホームSAP HANAのご紹介と最新情報 by SAPジャパン株式...
[db tech showcase Tokyo 2015] D22:インメモリープラットホームSAP HANAのご紹介と最新情報 by SAPジャパン株式...
 
[db tech showcase Tokyo 2015] D23:MySQLはドキュメントデータベースになり、HTTPもしゃべる - MySQL Lab...
[db tech showcase Tokyo 2015] D23:MySQLはドキュメントデータベースになり、HTTPもしゃべる - MySQL Lab...[db tech showcase Tokyo 2015] D23:MySQLはドキュメントデータベースになり、HTTPもしゃべる - MySQL Lab...
[db tech showcase Tokyo 2015] D23:MySQLはドキュメントデータベースになり、HTTPもしゃべる - MySQL Lab...
 
[db tech showcase Tokyo 2015] D32:HPの全方位インメモリDB化に向けた取り組みとSAP HANAインメモリDB の効果を...
[db tech showcase Tokyo 2015] D32:HPの全方位インメモリDB化に向けた取り組みとSAP HANAインメモリDB の効果を...[db tech showcase Tokyo 2015] D32:HPの全方位インメモリDB化に向けた取り組みとSAP HANAインメモリDB の効果を...
[db tech showcase Tokyo 2015] D32:HPの全方位インメモリDB化に向けた取り組みとSAP HANAインメモリDB の効果を...
 
[db tech showcase Tokyo 2015] B34:データの仮想化を具体化するIBMのロジカルデータウェアハウス by 日本アイ・ビー・エ...
[db tech showcase Tokyo 2015] B34:データの仮想化を具体化するIBMのロジカルデータウェアハウス by 日本アイ・ビー・エ...[db tech showcase Tokyo 2015] B34:データの仮想化を具体化するIBMのロジカルデータウェアハウス by 日本アイ・ビー・エ...
[db tech showcase Tokyo 2015] B34:データの仮想化を具体化するIBMのロジカルデータウェアハウス by 日本アイ・ビー・エ...
 
[db tech showcase Tokyo 2015] D13:PCIeフラッシュで、高可用性高性能データベースシステム?! by 株式会社HGSTジ...
[db tech showcase Tokyo 2015] D13:PCIeフラッシュで、高可用性高性能データベースシステム?! by 株式会社HGSTジ...[db tech showcase Tokyo 2015] D13:PCIeフラッシュで、高可用性高性能データベースシステム?! by 株式会社HGSTジ...
[db tech showcase Tokyo 2015] D13:PCIeフラッシュで、高可用性高性能データベースシステム?! by 株式会社HGSTジ...
 
[db tech showcase Tokyo 2015] D16:マイケルストーンブレーカー発の超高速データベースで実現する分析基盤の簡単構築・運用ステ...
[db tech showcase Tokyo 2015] D16:マイケルストーンブレーカー発の超高速データベースで実現する分析基盤の簡単構築・運用ステ...[db tech showcase Tokyo 2015] D16:マイケルストーンブレーカー発の超高速データベースで実現する分析基盤の簡単構築・運用ステ...
[db tech showcase Tokyo 2015] D16:マイケルストーンブレーカー発の超高速データベースで実現する分析基盤の簡単構築・運用ステ...
 
Mongodb x business
Mongodb x businessMongodb x business
Mongodb x business
 
Dbts2015 tokyo vector_in_hadoop_vortex
Dbts2015 tokyo vector_in_hadoop_vortexDbts2015 tokyo vector_in_hadoop_vortex
Dbts2015 tokyo vector_in_hadoop_vortex
 
[db tech showcase Tokyo 2015] C16:Oracle Disaster Recovery at New Zealand sto...
[db tech showcase Tokyo 2015] C16:Oracle Disaster Recovery at New Zealand sto...[db tech showcase Tokyo 2015] C16:Oracle Disaster Recovery at New Zealand sto...
[db tech showcase Tokyo 2015] C16:Oracle Disaster Recovery at New Zealand sto...
 
[db tech showcase Tokyo 2015] C33:ビッグデータ・IoT時代のキーテクノロジー、CEPの「今」を掴む! by 株式会社日立...
[db tech showcase Tokyo 2015] C33:ビッグデータ・IoT時代のキーテクノロジー、CEPの「今」を掴む! by 株式会社日立...[db tech showcase Tokyo 2015] C33:ビッグデータ・IoT時代のキーテクノロジー、CEPの「今」を掴む! by 株式会社日立...
[db tech showcase Tokyo 2015] C33:ビッグデータ・IoT時代のキーテクノロジー、CEPの「今」を掴む! by 株式会社日立...
 
DBTS2015 Tokyo DBAが知っておくべき最新テクノロジー
DBTS2015 Tokyo DBAが知っておくべき最新テクノロジーDBTS2015 Tokyo DBAが知っておくべき最新テクノロジー
DBTS2015 Tokyo DBAが知っておくべき最新テクノロジー
 
[db tech showcase Tokyo 2015] C32:「データ一貫性にこだわる日立のインメモリ分散KVS~こだわりの理由と実現方法とは~」 ...
[db tech showcase Tokyo 2015] C32:「データ一貫性にこだわる日立のインメモリ分散KVS~こだわりの理由と実現方法とは~」 ...[db tech showcase Tokyo 2015] C32:「データ一貫性にこだわる日立のインメモリ分散KVS~こだわりの理由と実現方法とは~」 ...
[db tech showcase Tokyo 2015] C32:「データ一貫性にこだわる日立のインメモリ分散KVS~こだわりの理由と実現方法とは~」 ...
 
[DB tech showcase Tokyo 2015] B37 :オンプレミスからAWS上のSAP HANAまで高信頼DBシステム構築にHAクラスタリ...
[DB tech showcase Tokyo 2015] B37 :オンプレミスからAWS上のSAP HANAまで高信頼DBシステム構築にHAクラスタリ...[DB tech showcase Tokyo 2015] B37 :オンプレミスからAWS上のSAP HANAまで高信頼DBシステム構築にHAクラスタリ...
[DB tech showcase Tokyo 2015] B37 :オンプレミスからAWS上のSAP HANAまで高信頼DBシステム構築にHAクラスタリ...
 
Apache Hiveの今とこれから
Apache Hiveの今とこれからApache Hiveの今とこれから
Apache Hiveの今とこれから
 
[db tech showcase Tokyo 2015] C25:HP NonStop SQLはなぜグローバルに分散DBを構築できるのか、 データの整合...
[db tech showcase Tokyo 2015] C25:HP NonStop SQLはなぜグローバルに分散DBを構築できるのか、 データの整合...[db tech showcase Tokyo 2015] C25:HP NonStop SQLはなぜグローバルに分散DBを構築できるのか、 データの整合...
[db tech showcase Tokyo 2015] C25:HP NonStop SQLはなぜグローバルに分散DBを構築できるのか、 データの整合...
 
[db tech showcase Tokyo 2015] E35: Web, IoT, モバイル時代のデータベース、Apache Cassandraを学ぼう
[db tech showcase Tokyo 2015] E35: Web, IoT, モバイル時代のデータベース、Apache Cassandraを学ぼう[db tech showcase Tokyo 2015] E35: Web, IoT, モバイル時代のデータベース、Apache Cassandraを学ぼう
[db tech showcase Tokyo 2015] E35: Web, IoT, モバイル時代のデータベース、Apache Cassandraを学ぼう
 
[db tech showcase Tokyo 2015] A33:Amazon DynamoDB Deep Dive by アマゾン データ サービス ...
[db tech showcase Tokyo 2015] A33:Amazon DynamoDB Deep Dive by アマゾン データ サービス ...[db tech showcase Tokyo 2015] A33:Amazon DynamoDB Deep Dive by アマゾン データ サービス ...
[db tech showcase Tokyo 2015] A33:Amazon DynamoDB Deep Dive by アマゾン データ サービス ...
 
[db tech showcase Tokyo 2015] C14:30万のユーザ部門を抱える日立、情シスの「理想と現実」 by 株式会社日立製作所 情報...
[db tech showcase Tokyo 2015] C14:30万のユーザ部門を抱える日立、情シスの「理想と現実」 by 株式会社日立製作所 情報...[db tech showcase Tokyo 2015] C14:30万のユーザ部門を抱える日立、情シスの「理想と現実」 by 株式会社日立製作所 情報...
[db tech showcase Tokyo 2015] C14:30万のユーザ部門を抱える日立、情シスの「理想と現実」 by 株式会社日立製作所 情報...
 
Couchbase introduction-20150611
Couchbase introduction-20150611Couchbase introduction-20150611
Couchbase introduction-20150611
 
[db tech showcase Tokyo 2015] B36:Hitachi Advanced Data Binder 実践SQLチューニング方法 ...
[db tech showcase Tokyo 2015] B36:Hitachi Advanced Data Binder 実践SQLチューニング方法 ...[db tech showcase Tokyo 2015] B36:Hitachi Advanced Data Binder 実践SQLチューニング方法 ...
[db tech showcase Tokyo 2015] B36:Hitachi Advanced Data Binder 実践SQLチューニング方法 ...
 

Similar to Presto in Treasure Data

Introduction to Presto at Treasure Data
Introduction to Presto at Treasure DataIntroduction to Presto at Treasure Data
Introduction to Presto at Treasure DataTaro L. Saito
 
MongoDB Analytics
MongoDB AnalyticsMongoDB Analytics
MongoDB Analyticsdatablend
 
Timeseries - data visualization in Grafana
Timeseries - data visualization in GrafanaTimeseries - data visualization in Grafana
Timeseries - data visualization in GrafanaOCoderFest
 
Webinar: Applikationsentwicklung mit MongoDB : Teil 5: Reporting & Aggregation
Webinar: Applikationsentwicklung mit MongoDB: Teil 5: Reporting & AggregationWebinar: Applikationsentwicklung mit MongoDB: Teil 5: Reporting & Aggregation
Webinar: Applikationsentwicklung mit MongoDB : Teil 5: Reporting & AggregationMongoDB
 
#include customer.h#include heap.h#include iostream.docx
#include customer.h#include heap.h#include iostream.docx#include customer.h#include heap.h#include iostream.docx
#include customer.h#include heap.h#include iostream.docxAASTHA76
 
Horizontally Scalable Relational Databases with Spark: Spark Summit East talk...
Horizontally Scalable Relational Databases with Spark: Spark Summit East talk...Horizontally Scalable Relational Databases with Spark: Spark Summit East talk...
Horizontally Scalable Relational Databases with Spark: Spark Summit East talk...Spark Summit
 
Java/Scala Lab: Анатолий Кметюк - Scala SubScript: Алгебра для реактивного пр...
Java/Scala Lab: Анатолий Кметюк - Scala SubScript: Алгебра для реактивного пр...Java/Scala Lab: Анатолий Кметюк - Scala SubScript: Алгебра для реактивного пр...
Java/Scala Lab: Анатолий Кметюк - Scala SubScript: Алгебра для реактивного пр...GeeksLab Odessa
 
User Defined Aggregation in Apache Spark: A Love Story
User Defined Aggregation in Apache Spark: A Love StoryUser Defined Aggregation in Apache Spark: A Love Story
User Defined Aggregation in Apache Spark: A Love StoryDatabricks
 
User Defined Aggregation in Apache Spark: A Love Story
User Defined Aggregation in Apache Spark: A Love StoryUser Defined Aggregation in Apache Spark: A Love Story
User Defined Aggregation in Apache Spark: A Love StoryDatabricks
 
Write Faster SQL with Trino.pdf
Write Faster SQL with Trino.pdfWrite Faster SQL with Trino.pdf
Write Faster SQL with Trino.pdfEric Xiao
 
The Ring programming language version 1.9 book - Part 78 of 210
The Ring programming language version 1.9 book - Part 78 of 210The Ring programming language version 1.9 book - Part 78 of 210
The Ring programming language version 1.9 book - Part 78 of 210Mahmoud Samir Fayed
 
The Ring programming language version 1.2 book - Part 48 of 84
The Ring programming language version 1.2 book - Part 48 of 84The Ring programming language version 1.2 book - Part 48 of 84
The Ring programming language version 1.2 book - Part 48 of 84Mahmoud Samir Fayed
 
Compose Async with RxJS
Compose Async with RxJSCompose Async with RxJS
Compose Async with RxJSKyung Yeol Kim
 
Query for json databases
Query for json databasesQuery for json databases
Query for json databasesBinh Le
 
リローダブルClojureアプリケーション
リローダブルClojureアプリケーションリローダブルClojureアプリケーション
リローダブルClojureアプリケーションKenji Nakamura
 
CANTEEN MANAGEMENT SYSTEM IN PYTHON
CANTEEN MANAGEMENT SYSTEM IN PYTHONCANTEEN MANAGEMENT SYSTEM IN PYTHON
CANTEEN MANAGEMENT SYSTEM IN PYTHONvikram mahendra
 
Mysql handle socket
Mysql handle socketMysql handle socket
Mysql handle socketPhilip Zhong
 
Chainer-Compiler 動かしてみた
Chainer-Compiler 動かしてみたChainer-Compiler 動かしてみた
Chainer-Compiler 動かしてみたAkira Maruoka
 

Similar to Presto in Treasure Data (20)

Introduction to Presto at Treasure Data
Introduction to Presto at Treasure DataIntroduction to Presto at Treasure Data
Introduction to Presto at Treasure Data
 
MongoDB Analytics
MongoDB AnalyticsMongoDB Analytics
MongoDB Analytics
 
Timeseries - data visualization in Grafana
Timeseries - data visualization in GrafanaTimeseries - data visualization in Grafana
Timeseries - data visualization in Grafana
 
Webinar: Applikationsentwicklung mit MongoDB : Teil 5: Reporting & Aggregation
Webinar: Applikationsentwicklung mit MongoDB: Teil 5: Reporting & AggregationWebinar: Applikationsentwicklung mit MongoDB: Teil 5: Reporting & Aggregation
Webinar: Applikationsentwicklung mit MongoDB : Teil 5: Reporting & Aggregation
 
#include customer.h#include heap.h#include iostream.docx
#include customer.h#include heap.h#include iostream.docx#include customer.h#include heap.h#include iostream.docx
#include customer.h#include heap.h#include iostream.docx
 
Horizontally Scalable Relational Databases with Spark: Spark Summit East talk...
Horizontally Scalable Relational Databases with Spark: Spark Summit East talk...Horizontally Scalable Relational Databases with Spark: Spark Summit East talk...
Horizontally Scalable Relational Databases with Spark: Spark Summit East talk...
 
Java/Scala Lab: Анатолий Кметюк - Scala SubScript: Алгебра для реактивного пр...
Java/Scala Lab: Анатолий Кметюк - Scala SubScript: Алгебра для реактивного пр...Java/Scala Lab: Анатолий Кметюк - Scala SubScript: Алгебра для реактивного пр...
Java/Scala Lab: Анатолий Кметюк - Scala SubScript: Алгебра для реактивного пр...
 
User Defined Aggregation in Apache Spark: A Love Story
User Defined Aggregation in Apache Spark: A Love StoryUser Defined Aggregation in Apache Spark: A Love Story
User Defined Aggregation in Apache Spark: A Love Story
 
User Defined Aggregation in Apache Spark: A Love Story
User Defined Aggregation in Apache Spark: A Love StoryUser Defined Aggregation in Apache Spark: A Love Story
User Defined Aggregation in Apache Spark: A Love Story
 
Write Faster SQL with Trino.pdf
Write Faster SQL with Trino.pdfWrite Faster SQL with Trino.pdf
Write Faster SQL with Trino.pdf
 
The Ring programming language version 1.9 book - Part 78 of 210
The Ring programming language version 1.9 book - Part 78 of 210The Ring programming language version 1.9 book - Part 78 of 210
The Ring programming language version 1.9 book - Part 78 of 210
 
Implementing stack
Implementing stackImplementing stack
Implementing stack
 
The Ring programming language version 1.2 book - Part 48 of 84
The Ring programming language version 1.2 book - Part 48 of 84The Ring programming language version 1.2 book - Part 48 of 84
The Ring programming language version 1.2 book - Part 48 of 84
 
Compose Async with RxJS
Compose Async with RxJSCompose Async with RxJS
Compose Async with RxJS
 
Query for json databases
Query for json databasesQuery for json databases
Query for json databases
 
リローダブルClojureアプリケーション
リローダブルClojureアプリケーションリローダブルClojureアプリケーション
リローダブルClojureアプリケーション
 
CANTEEN MANAGEMENT SYSTEM IN PYTHON
CANTEEN MANAGEMENT SYSTEM IN PYTHONCANTEEN MANAGEMENT SYSTEM IN PYTHON
CANTEEN MANAGEMENT SYSTEM IN PYTHON
 
Solving the n + 1 query problem
Solving the n + 1 query problemSolving the n + 1 query problem
Solving the n + 1 query problem
 
Mysql handle socket
Mysql handle socketMysql handle socket
Mysql handle socket
 
Chainer-Compiler 動かしてみた
Chainer-Compiler 動かしてみたChainer-Compiler 動かしてみた
Chainer-Compiler 動かしてみた
 

Recently uploaded

Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about usDynamic Netsoft
 
Test Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendTest Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendArshad QA
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AIABDERRAOUF MEHENNI
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
Clustering techniques data mining book ....
Clustering techniques data mining book ....Clustering techniques data mining book ....
Clustering techniques data mining book ....ShaimaaMohamedGalal
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 

Recently uploaded (20)

Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications

Presto in Treasure Data

  • 1. Presto in Treasure Data Mitsunori Komatsu, Treasure Data
  • 2. Who am I? • Mitsunori Komatsu, Software engineer @ Treasure Data. • Presto, Hive, Plazma, td-android-sdk, td-ios-sdk, Mobile SDK backend, embedded-sdk • github:komamitsu, msgpack-java committer, Presto contributor, etc…
  • 3. Today’s talk • What's Presto? • Pros & Cons • Architecture • Who uses Presto? • How do we use Presto?
  • 5. Fast • Distributed SQL query engine (MPP) • Low latency and good performance • No disk IO • Pipelined execution (not MapReduce) • Compiles a query plan down to bytecode • Off-heap memory • Suitable for ad-hoc queries
  • 6. Pluggable • Pluggable backends (“connectors”) • Cassandra / Hive / JMX / Kafka / MySQL / PostgreSQL / System / TPCH • We can add a new connector by extending the SPI • Treasure Data has developed a connector to access our storage
  • 7. What kind of SQL • Supports ANSI SQL (not HiveQL) • Easier to use than HiveQL • Structural types: Map, Array, JSON, Row • Window functions • Approximate queries • http://blinkdb.org/
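As a quick illustration, window functions and approximate aggregations look like this (a sketch against the TPC-H orders table used later in the deck; approx_distinct is a built-in Presto aggregate):

  -- window function: rank each customer's orders by total price
  select custkey, orderkey,
    rank() over (partition by custkey order by totalprice desc) as rnk
  from orders;

  -- approximate query: estimated distinct customers per order priority
  select orderpriority, approx_distinct(custkey) as approx_customers
  from orders
  group by orderpriority;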
  • 8. Limitations • Fails with huge JOIN • In memory only (broadcast / distributed JOIN) • No grace / hybrid hash join • No fault tolerance • Coordinator is SPOF • No “cost based” optimization • No authentication / authorization • No native ODBC => Prestogres
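For the join limitations in particular, the strategy can be flipped from the default broadcast join to a partitioned (distributed) hash join per session; a sketch, assuming a Presto version of that era where the session property was named distributed_join (newer releases renamed it join_distribution_type):

  -- partition both sides on the join key instead of broadcasting the build side
  set session distributed_join = true;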
  • 10. Query plan

Output[nationkey, _col1] => [nationkey:bigint, count:bigint]
  - _col1 := count
Exchange[GATHER] => nationkey:bigint, count:bigint
Aggregate(FINAL)[nationkey] => [nationkey:bigint, count:bigint]
  - count := "count"("count_15")
Exchange[REPARTITION] => nationkey:bigint, count_15:bigint
Aggregate(PARTIAL)[nationkey] => [nationkey:bigint, count_15:bigint]
  - count_15 := "count"("expr")
Project => [nationkey:bigint, expr:bigint]
  - expr := 1
InnerJoin[("custkey" = "custkey_0")] => [custkey:bigint, custkey_0:bigint, nationkey:bigint]
Project => [custkey:bigint]
Filter[("orderpriority" = '1-URGENT')] => [custkey:bigint, orderpriority:varchar]
TableScan[tpch:tpch:orders:sf0.01, original constraint=('1-URGENT' = "orderpriority")] => [custkey:bigint, orderpriority:varchar]
  - custkey := tpch:custkey:1
  - orderpriority := tpch:orderpriority:5
Exchange[REPLICATE] => custkey_0:bigint, nationkey:bigint
TableScan[tpch:tpch:customer:sf0.01, original constraint=true] => [custkey_0:bigint, nationkey:bigint]
  - custkey_0 := tpch:custkey:0
  - nationkey := tpch:nationkey:3

The plan above corresponds to this query:

select
  c.nationkey,
  count(1)
from orders o
join customer c
  on o.custkey = c.custkey
where o.orderpriority = '1-URGENT'
group by c.nationkey
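A plan like this can be printed from the Presto CLI with EXPLAIN (EXPLAIN is standard Presto; the exact rendering, and variants such as a distributed plan, depend on the version):

  explain
  select c.nationkey, count(1)
  from orders o
  join customer c on o.custkey = c.custkey
  where o.orderpriority = '1-URGENT'
  group by c.nationkey;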
  • 11. Query, stage, task and split [diagram: a query breaks down into stages (Stage 0, 1, 2); each stage contains tasks (Task 0.0, Task 1.0-1.2, Task 2.0-2.2) placed on workers, and each task processes splits. For example: TableScan (FROM) -> Aggregation (GROUP BY) -> Output, spread across @worker#2, @worker#3 and @worker#0.]
  • 12.–16. Query plan (animation): the same plan and query as slide 10, repeated on each slide while the stages are highlighted one by one, from the leaf stages down to the output stage: Stage 3, then Stage 2, then Stage 1, then Stage 0.
  • 17.–21. What does each component do? (repeated over five slides, highlighting one component at a time)
  - Presto CLI
  - Coordinator: parses and analyzes the query, creates the query plan, and executes the query; a query contains stages, a stage contains tasks, and the coordinator issues the tasks
  - Discovery Service
  - Worker: executes tasks by converting the query to Java bytecode (Operator) and executing the operators
  - Connector, metadata side: Metadata (Table, Column, …) and SplitManager (Split, …), possibly backed by external metadata
  - Connector, data side: RecordSetProvider, RecordSet and RecordCursor, which read the storage
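These components can also be inspected from SQL through the system connector listed on slide 6 (a sketch; these runtime tables exist in Presto, though their columns differ across versions):

  -- workers currently registered with the discovery service
  select * from system.runtime.nodes;

  -- queries and the tasks the coordinator has issued
  select * from system.runtime.queries;
  select * from system.runtime.tasks;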
  • 22. Who uses Presto? • Facebook http://www.slideshare.net/dain1/presto-meetup-2015
  • 25. Who uses Presto? As a service… • Qubole (SaaS) • Treasure Data (SaaS) • Teradata (new! commercial support)
  • 26. Today’s talk • What's Presto? • How do we use Presto? • What’s Treasure Data? • Architecture • How we manage Presto
  • 27. How do we use Presto? We…?
  • 30. Treasure Data [diagram: the end-to-end pipeline, “Time to Value”: acquire, store, analyze, and push the query result. Data sources (web logs, app logs, sensors, CRM, ERP, RDBMS, POS) are acquired through Treasure Agent (server), the SDKs (JS, Android, iOS, Unity), a streaming collector (batch / reliability) and bulk uploaders (Embulk, TD Toolbelt). Data is stored in Plazma DB, a flexible, scalable, columnar storage running @AWS or @IDCF, and analyzed with SQL-based queries (SQL, Pig; ad-hoc / low latency) through the REST API or ODBC / JDBC. Results are pushed to KPI dashboards and BI tools (Metric Insights, Tableau, Motion Board, etc.) and to other products (RDBMS, Google Docs, AWS S3, FTP server, etc.). Taglines: connectivity, economy & flexibility, simple & supported.]
  • 31.–38. Architecture in Treasure Data [diagram, walked through over eight slides: the api server (handling authentication / authorization) enqueues a query such as “select user_id, count(1) from …” on the worker queue (MySQL); a td worker process picks it up and retries the failed query if needed; the query runs on the Presto coordinator and its Presto workers, which read plazmadb (PostgreSQL + S3/RiakCS; columnar file format, schema-less) through the td-presto connector and write the result to the result bucket (S3).]
  • 39. Schema on read — access_logs table:

  time                 code   method  user_id
  2015-06-01 10:07:11  200    GET
  2015-06-01 10:10:12  “200”  GET
  2015-06-01 10:10:20  200    GET
  2015-06-01 10:11:30  200    POST
  2015-06-01 10:20:45  200    GET
  2015-06-01 10:33:50  400    GET     206
  2015-06-01 10:40:11  200    GET     852
  2015-06-01 10:51:32  200    PUT     1223
  2015-06-01 10:58:02  200    GET     5118
  2015-06-01 11:02:11  404    GET     12
  2015-06-01 11:14:27  200    GET     3447

  The user added a new column “user_id” to the imported data, and can select it just by adding it to the schema, without rebuilding the table: schema on read.
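Concretely, once “user_id” has been added to the schema, older rows simply return NULL for it and the column is immediately queryable; a sketch against the access_logs table above:

  select user_id, count(1)
  from access_logs
  where user_id is not null
  group by user_id;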
  • 40. Columnar file format [the same access_logs table, laid out column by column: time, code, method, user_id]. This query accesses only the code column:

  select code, count(1) from tbl group by code
  • 41. td-presto connector • MessagePack v0.7 • off-heap • Async IO with Jetty client • Scheduling & resource management
  • 42. How we manage Presto • Blue-Green Deployment • Stress test tool • Monitoring with DataDog
  • 43.–45. Blue-Green Deployment [diagram, over three slides: two full Presto clusters (coordinator + workers) sit behind the same api server, worker queue (MySQL), plazmadb and result bucket (S3). The production cluster keeps serving “select user_id, count(1) from …” while the release-candidate (rc) cluster is tested; once the rc passes the tests, it is promoted to production.]
  • 46.–49. Stress test tool (repeated over four slides)
  • Collects queries that have ever caused issues.
  • A new query can be added just by appending an entry like the one below.
  • The tool issues the query, gets the result, and records a calculated digest automatically.
  • We can send all the queries, including very heavy ones (around 6000 stages), to Presto.

  - job_id: 28889999
    result: 227d16d801a9a43148c2b7149ce4657c
  - job_id: 28889999
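The result digest is a fingerprint of the query output. One way to compute a stable, order-independent fingerprint inside Presto itself is the checksum aggregate (a sketch; assumes a Presto version that ships checksum, and uses the access_logs table from slide 39):

  select checksum(code), checksum(user_id)
  from access_logs;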
  • 50. Monitoring with DataDog [diagram: td-agent runs alongside each Presto process (coordinator and workers), polls the REST endpoints /v1/jmx/mbean, /v1/query and /v1/node via an in_presto_metrics input plugin, and forwards the metrics to DataDog through out_metricsense.]
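The same JMX data is also queryable in SQL through the jmx connector listed on slide 6 (a sketch; the schema has been named both jmx and current depending on the Presto version, and the available MBean tables vary):

  -- one row per node, straight from the java.lang:type=Memory MBean
  select *
  from jmx.current."java.lang:type=memory";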
  • 51. Monitoring with DataDog: query stalled time • The most important metric for us; it triggers alert calls to us… • It is mainly increased by td-presto connector problems, most of them race condition issues.
  • 52. How many queries are processed? More than 10,000 queries / day
  • 53. How many queries are processed? Most queries finish within 1 min
  • 54. Check: treasuredata.com and treasure-data.hateblo.jp (Japanese blog). Cloud service for the entire data pipeline.