© Copyright 2019 Pivotal Software, Inc. All rights Reserved.
Ivan Novick
@NovickGreenplum
March 2019
Present & Future of Greenplum Database
A massively parallel Postgres Database
Greenplum Database v5
Mission Critical Analytical Database Platform
GPDB v5: Mission Critical Analytical Database Platform
Well rounded and proven feature set:
● Proven in Mission Critical Use Cases
● ORCA Optimizer
● Resource Groups & PGBouncer for Concurrency
● In-Database Analytics
● External Data Federation Ecosystem
● Pivotal Greenplum Command Center 4.x
● Updated Backup and Migration Tooling
“Pivotal Greenplum is often used in mission-critical use cases, where downtime is not well-tolerated.” -- Gartner MQ 2019
Greenplum Database V6
Massive Postgres Power
GPDB v6: Massive Postgres Power
What if Greenplum were a superset and not a subset of Postgres?
● Postgres 9.4 merged
● WAL Replication
● Row Level Locking for Updates/Deletes
● Foreign Data Wrapper API
● PG Extensions: e.g. pgaudit
● Recursive CTE
● JSON, JSONB, FTS, GIN Index
“Customers frequently called out the open-source alignment with PostgreSQL as a strong and cost-effective positive” -- Gartner MQ 2019
GPDB v6: OLTP Performance with Greenplum
Up to 50x performance gain on pgbench in early testing
● Greenplum has always been ACID with transaction semantics
● Many analytical systems require a mix of analytical and OLTP queries
● Remove Table Lock on Updates & Deletes
● Distributed Deadlock Detector introduced
● Concurrent OLTP Operations allowed
“Customers frequently called out the open-source alignment with PostgreSQL as a strong and cost-effective positive” -- Gartner MQ 2019
V6: Big Data Features
#ScaleMatters
● Online Expansion w/ Jump Consistent Hash
● Star-Schema DW with Replicated Tables
● Join Aggregate Query Perf with Eager Aggregation Optimizations
● zStandard compression
“Reference customers for Pivotal praised the overall performance and scalability of Pivotal Greenplum” -- Gartner MQ 2019
GP v5 Expand Example
Detailed Call Records Example, Distributed by Call ID
Before GPEXPAND (3 segments):
Segment 1: Call id 1, 4, 7, 10
Segment 2: Call id 2, 5, 8, 11
Segment 3: Call id 3, 6, 9, 12
After GPEXPAND (4 segments): RESHUFFLE ALL
Segment 1: Call id 1, 5, 9
Segment 2: Call id 2, 6, 10
Segment 3: Call id 3, 7, 11
Segment 4: Call id 4, 8, 12
GP v6 Online Expand w/ Jump Consistent Hash
Detailed Call Records Example, Distributed by Call ID
Before GPEXPAND (3 segments):
Segment 1: Call id 1, 4, 7, 10
Segment 2: Call id 2, 5, 8, 11
Segment 3: Call id 3, 6, 9, 12
After GPEXPAND (4 segments): MINIMAL DATA MOVEMENT
Segment 1: Call id 1, 4, 7
Segment 2: Call id 2, 5, 8
Segment 3: Call id 3, 6, 9
Segment 4: Call id 10, 11, 12
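The difference between the two expand slides can be sketched in a few lines of Python. This is a minimal illustration, not Greenplum's source: `jump_consistent_hash` follows the published Lamping and Veach algorithm, while `key_for`, `modulo_placement`, `jump_placement`, and `moved_keys` are hypothetical helpers, and SHA-256 merely stands in for whatever hash the database applies to the distribution column.

```python
import hashlib


def jump_consistent_hash(key: int, num_buckets: int) -> int:
    """Map a 64-bit key to a bucket in [0, num_buckets) (Lamping and Veach)."""
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    key &= 0xFFFFFFFFFFFFFFFF
    b, j = -1, 0
    while j < num_buckets:
        b = j
        # 64-bit linear congruential step used by the published algorithm
        key = (key * 2862933555777941757 + 1) & 0xFFFFFFFFFFFFFFFF
        j = int((b + 1) * ((1 << 31) / ((key >> 33) + 1)))
    return b


def key_for(call_id: int) -> int:
    """Hypothetical stand-in for the database's hash of the distribution column."""
    digest = hashlib.sha256(str(call_id).encode()).digest()
    return int.from_bytes(digest[:8], "big")


def modulo_placement(call_id: int, n: int) -> int:
    # Simple modulo placement: most rows change segment when n changes.
    return call_id % n


def jump_placement(call_id: int, n: int) -> int:
    # Jump consistent hash placement: rows either stay put or move to a new segment.
    return jump_consistent_hash(key_for(call_id), n)


def moved_keys(call_ids, old_n, new_n, placement):
    """Return the call ids whose segment changes when growing old_n -> new_n."""
    return [c for c in call_ids if placement(c, old_n) != placement(c, new_n)]


call_ids = range(10_000)
moved_modulo = moved_keys(call_ids, 3, 4, modulo_placement)
moved_jump = moved_keys(call_ids, 3, 4, jump_placement)
print(f"modulo: {len(moved_modulo) / 10_000:.0%} of rows move")
print(f"jump:   {len(moved_jump) / 10_000:.0%} of rows move")
```

Growing from 3 to 4 segments, modulo placement relocates roughly three quarters of the rows, while jump consistent hashing relocates roughly one quarter, and every relocated row lands on the new fourth segment.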
GP v6 Replicated Tables
Distributed Fact Table JOIN Replicated Dimension Table, per segment:
SEGMENT 1
Fact: Call 1, Caller 1; Call 5, Caller 2; Call 9, Caller 1; Call 13, Caller 3
Dimension: CallerID 1, CallerID 2, CallerID 3
SEGMENT 2
Fact: Call 2, Caller 1; Call 6, Caller 3; Call 10, Caller 3; Call 14, Caller 3
Dimension: CallerID 1, CallerID 2, CallerID 3
SEGMENT 3
Fact: Call 3, Caller 3; Call 7, Caller 3; Call 11, Caller 1; Call 15, Caller 1
Dimension: CallerID 1, CallerID 2, CallerID 3
SEGMENT 4
Fact: Call 4, Caller 2; Call 8, Caller 3; Call 12, Caller 1; Call 16, Caller 1
Dimension: CallerID 1, CallerID 2, CallerID 3
-- Column types below are assumed for illustration
CREATE TABLE CallerUser (CallerId int, Attribute text) DISTRIBUTED REPLICATED;
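As a rough illustration of why a replicated dimension table helps, the Python sketch below simulates the four segments from the slide. The call and caller ids are taken from the slide; the attribute strings, the `distribute` and `local_join` helpers, and the round-robin placement by call id are assumptions made only for this example.

```python
# Fact rows from the slide: (call_id, caller_id)
CALLS = [
    (1, 1), (2, 1), (3, 3), (4, 2), (5, 2), (6, 3), (7, 3), (8, 3),
    (9, 1), (10, 3), (11, 1), (12, 1), (13, 3), (14, 3), (15, 1), (16, 1),
]
# Dimension rows; the attribute values are made up for illustration.
CALLERS = {1: "caller-one", 2: "caller-two", 3: "caller-three"}
NUM_SEGMENTS = 4


def distribute(calls, n):
    """Spread fact rows over n segments by call id, matching the slide layout."""
    segments = [[] for _ in range(n)]
    for call_id, caller_id in calls:
        segments[(call_id - 1) % n].append((call_id, caller_id))
    return segments


def local_join(fact_rows, dimension):
    """Join fact rows to a dimension copy held on the same segment."""
    return [(call_id, caller_id, dimension[caller_id])
            for call_id, caller_id in fact_rows]


segments = distribute(CALLS, NUM_SEGMENTS)
# Every segment holds a full copy of the dimension table.
replicas = [dict(CALLERS) for _ in range(NUM_SEGMENTS)]
per_segment = [local_join(rows, dim) for rows, dim in zip(segments, replicas)]
combined = sorted(row for part in per_segment for row in part)
print(combined[:3])
```

Because each segment already has every dimension row, the join completes segment by segment with no rows shipped between segments, and the union of the per-segment results equals the join over the whole table.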
Eager-Agg Optimization in GPDB v6
create table foo (j1 int, g1 int, s1 int);
insert into foo select i%10000, i%1000, i from generate_series(1,100000000) i;
● 10,000 unique join values (j1)
● 1,000 unique grouping values (g1)
create table bar (j2 int, g2 int, s2 int);
insert into bar select i%100, i%1000, i from generate_series(1,100000) i;
● 100 unique join values (j2)
● 1,000 unique grouping values (g2)
Query:
select sum(s1)
from foo, bar
where j1 = j2 and s1%2 = 0
group by g1, g2;
Greenplum v5: 63.8 seconds
Greenplum v6: 7.4 seconds
~9X Improvement
Aggregate Queries over Join GPDB v5
Find the loss per line item for all returned items
Join the line items to the orders
Group them by store and compute the aggregate loss
Straightforward translation of the query into the query plan
If each order has a large number of line items, the join results can be quite large and expensive
Query plan (bottom up):
LINEITEM: σ (L_RETURNFLAG = “R”)
π L_LOSS: L_EXTENDEDPRICE * (1 - L_DISCOUNT)
⨝ (L_ORDERKEY = O_ORDERKEY) with ORDERS
γ O_STORE (SUM(L_LOSS))
Eager Agg Optimization GPDB v6
● Find the loss of revenue for each order
● Join the aggregated view with table ORDERS
● Compute the total loss for each store
● Benefit: Inner group-by reduces the number of rows fed to the join
[Yan95] W. P. Yan and P. Larson, "Eager Aggregation and Lazy Aggregation", VLDB 1995
Query plan (bottom up):
LINEITEM: σ (L_RETURNFLAG = “R”)
π L_LOSS: L_EXTENDEDPRICE * (1 - L_DISCOUNT)
γ L_ORDERKEY (L_ORDERLOSS: SUM(L_LOSS))
⨝ (L_ORDERKEY = O_ORDERKEY) with ORDERS
γ O_STORE (SUM(L_ORDERLOSS))
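The two plans can be written as equivalent SQL and checked on a toy data set. The sketch below uses Python's built-in sqlite3 purely to show that the rewrite preserves results; it says nothing about Greenplum's optimizer internals. The table contents, the store names, and the `build_db` helper are invented for the example, and the rewrite relies on o_orderkey being unique in ORDERS.

```python
import sqlite3

# Lazy form: join every returned line item to its order, then aggregate by store.
LAZY_SQL = """
SELECT o_store, SUM(l_extendedprice * (1 - l_discount)) AS loss
FROM lineitem JOIN orders ON l_orderkey = o_orderkey
WHERE l_returnflag = 'R'
GROUP BY o_store
ORDER BY o_store
"""

# Eager form: aggregate line items per order first, then join the smaller result.
EAGER_SQL = """
SELECT o_store, SUM(l_orderloss) AS loss
FROM (
    SELECT l_orderkey,
           SUM(l_extendedprice * (1 - l_discount)) AS l_orderloss
    FROM lineitem
    WHERE l_returnflag = 'R'
    GROUP BY l_orderkey
) AS per_order
JOIN orders ON per_order.l_orderkey = o_orderkey
GROUP BY o_store
ORDER BY o_store
"""


def build_db():
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE orders (o_orderkey INTEGER PRIMARY KEY, o_store TEXT);
        CREATE TABLE lineitem (
            l_orderkey INTEGER, l_extendedprice REAL,
            l_discount REAL, l_returnflag TEXT
        );
    """)
    conn.executemany("INSERT INTO orders VALUES (?, ?)",
                     [(1, "north"), (2, "north"), (3, "south")])
    conn.executemany("INSERT INTO lineitem VALUES (?, ?, ?, ?)", [
        (1, 100.0, 0.25, "R"),
        (1, 40.0, 0.0, "R"),
        (1, 60.0, 0.5, "N"),   # not returned, filtered out
        (2, 80.0, 0.5, "R"),
        (3, 200.0, 0.25, "R"),
        (3, 10.0, 0.0, "N"),   # not returned, filtered out
    ])
    return conn


conn = build_db()
lazy = conn.execute(LAZY_SQL).fetchall()
eager = conn.execute(EAGER_SQL).fetchall()
print(lazy)
print(eager)
```

Both queries return the same per-store totals; the eager form sends one row per order into the join instead of one row per returned line item.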
GPDB v6 zStd Compression
Same or more for less
● Open Source
● Lower CPU Cycles with same or better compression
● Originated at Facebook
CREATE TABLE call_data_records(callid int4, calldetails json)
WITH (appendonly=true, compresstype=zstd, orientation=column)
DISTRIBUTED BY (callid);
Pivotal Greenplum 6 Roadmap
Containerized Greenplum w/ GPDB v6
● GP embedded in containers for portability and dependency management
● Containers managed by Kubernetes for higher availability and elasticity
● Kubernetes operator used for automation
[Diagram: an Operator container automating Greenplum pods]
Greenplum Database V7
BEYOND THE CLUSTER
GPDB v7: Beyond the Cluster
We have all this Postgres infrastructure in GPDB v6; now let's use it
● Postgres 9.6 target
● DB Snapshots / Backup
● Streaming Replication
● Log Shipping and Reconciliation
● Greenplum as a source for Kafka
● Greenplum as a source for CDC Tools
● Greenplum to Greenplum Inter Cluster Queries
“You do this and you can beat Oracle” -- US Federal Customer, 2018
GPDB v7: Thought Leadership in Database AI
Define Artificial Intelligence. Does it make sense to integrate intelligence into an analytical platform?
● 2019 Apache MADlib is focused on Deep Learning and GPU processing
● 2019 Pivotal’s GPText Solution will add more cognitive intelligence of human language
● Combine with existing functions: PostGIS Geospatial; Apache MADlib Machine Learning & Graph; Python, R libraries, SQL at scale
● This is a platform for modern AI!
“With the Apache MADlib analytics libraries, Pivotal Greenplum has capable in-database analytics that allow for predictive modeling and ML to be applied to relational data.” -- Gartner MQ 2019
“Greenplum Database, soar with us to new heights”
#ScaleMatters
SpringOne Tour: The Influential Software Engineer
 
SpringOne Tour: Domain-Driven Design: Theory vs Practice
SpringOne Tour: Domain-Driven Design: Theory vs PracticeSpringOne Tour: Domain-Driven Design: Theory vs Practice
SpringOne Tour: Domain-Driven Design: Theory vs Practice
 
SpringOne Tour: Spring Recipes: A Collection of Common-Sense Solutions
SpringOne Tour: Spring Recipes: A Collection of Common-Sense SolutionsSpringOne Tour: Spring Recipes: A Collection of Common-Sense Solutions
SpringOne Tour: Spring Recipes: A Collection of Common-Sense Solutions
 

Recently uploaded

Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Mater
 
What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...Technogeeks
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxTier1 app
 
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdfExploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdfkalichargn70th171
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesŁukasz Chruściel
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Angel Borroy López
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtimeandrehoraa
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsSafe Software
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfMarharyta Nedzelska
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesPhilip Schwarz
 
VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Developmentvyaparkranti
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWave PLM
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Matt Ray
 
Comparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfComparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfDrew Moseley
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanyChristoph Pohl
 
Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringHironori Washizaki
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEEVICTOR MAESTRE RAMIREZ
 

Recently uploaded (20)

Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)
 
What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
 
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdfExploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New Features
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtime
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data Streams
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdf
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a series
 
VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Development
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need It
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
 
Comparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfComparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdf
 
Advantages of Odoo ERP 17 for Your Business
Advantages of Odoo ERP 17 for Your BusinessAdvantages of Odoo ERP 17 for Your Business
Advantages of Odoo ERP 17 for Your Business
 
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort ServiceHot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
 
2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva
 
Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their Engineering
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEE
 

GPDB v6: Massive Postgres Power for Analytics

  • 1.
  • 2. Present & Future of Greenplum Database: A massively parallel Postgres Database. Ivan Novick (@NovickGreenplum), March 2019. © Copyright 2019 Pivotal Software, Inc. All rights Reserved.
  • 3. Greenplum Database v5: Mission Critical Analytical Database Platform
  • 4. GPDB v5: Mission Critical Analytical Database Platform
    Well rounded and proven feature set:
    ● Proven in Mission Critical Use Cases
    ● ORCA Optimizer
    ● Resource Groups & PGBouncer for Concurrency
    ● In-Database Analytics
    ● External Data Federation Ecosystem
    ● Pivotal Greenplum Command Center 4.x
    ● Updated Backup and Migration Tooling
    “Pivotal Greenplum is often used in mission-critical use cases, where downtime is not well-tolerated.” -- Gartner MQ 2019
  • 5. Greenplum Database V6: Massive Postgres Power
  • 6. GPDB v6: Massive Postgres Power
    What if Greenplum was a Superset and not a subset of Postgres?
    ● Postgres 9.4 merged
    ● WAL Replication
    ● Row Level Locking for Updates/Deletes
    ● Foreign Data Wrapper API
    ● PG Extensions: e.g. pgaudit
    ● Recursive CTE
    ● JSON, JSONB, FTS, GIN Index
    “Customers frequently called out the open-source alignment with PostgreSQL as a strong and cost-effective positive” -- Gartner MQ 2019
  • 7. GPDB v6: OLTP Performance with Greenplum
    Up to 50x Performance gain on pgbench in early testing
    ● Greenplum has always been ACID with Transaction semantics
    ● Many Analytical Systems Require a Mix of Analytical and OLTP Queries
    ● Remove Table Lock on Updates & Deletes
    ● Distributed Deadlock Detector introduced
    ● Concurrent OLTP Operations allowed
    “Customers frequently called out the open-source alignment with PostgreSQL as a strong and cost-effective positive” -- Gartner MQ 2019
  • 8. V6: Big Data Features #ScaleMatters
    ● Online Expansion w/ Jump Consistent Hash
    ● Star-Schema DW with Replicated Tables
    ● Join Aggregate Query Perf with Eager Aggregation Optimizations
    ● Zstandard compression
    “Reference customers for Pivotal praised the overall performance and scalability of Pivotal Greenplum” -- Gartner MQ 2019
  • 9. GP v5 Expand Example
    Detailed Call Records Example, distributed by Call ID
    Before GPEXPAND (3 segments):
      Segment 1: Call id 1, 4, 7, 10
      Segment 2: Call id 2, 5, 8, 11
      Segment 3: Call id 3, 6, 9, 12
    After GPEXPAND (4 segments), RESHUFFLE ALL:
      Segment 1: Call id 1, 5, 9
      Segment 2: Call id 2, 6, 10
      Segment 3: Call id 3, 7, 11
      Segment 4: Call id 4, 8, 12
  • 10. GP v6 Online Expand w/ Jump Consistent Hash
    Detailed Call Records Example, distributed by Call ID
    Before GPEXPAND (3 segments):
      Segment 1: Call id 1, 4, 7, 10
      Segment 2: Call id 2, 5, 8, 11
      Segment 3: Call id 3, 6, 9, 12
    After GPEXPAND (4 segments), MINIMAL DATA MOVEMENT:
      Segment 1: Call id 1, 4, 7
      Segment 2: Call id 2, 5, 8
      Segment 3: Call id 3, 6, 9
      Segment 4: Call id 10, 11, 12
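The contrast between slides 9 and 10 can be sketched in a few lines of Python. This is an illustration only, not Greenplum's code: `jump_consistent_hash` follows the published jump consistent hash algorithm, `modulo_assign` mirrors the placement drawn on the v5 slide, and the helper names and the 10,000 sample call ids are assumptions made for the example.

```python
def jump_consistent_hash(key: int, num_buckets: int) -> int:
    """Map a 64-bit integer key to a bucket in [0, num_buckets)."""
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    key &= 0xFFFFFFFFFFFFFFFF
    b, j = -1, 0
    while j < num_buckets:
        b = j
        # 64-bit linear congruential step, as in the published algorithm.
        key = (key * 2862933555777941757 + 1) & 0xFFFFFFFFFFFFFFFF
        j = int((b + 1) * (float(1 << 31) / float((key >> 33) + 1)))
    return b


def modulo_assign(key: int, num_buckets: int) -> int:
    """Placement drawn on the v5 slide: call id 1 -> first segment, 2 -> second, ..."""
    return (key - 1) % num_buckets


def moved_keys(assign, keys, old_n, new_n):
    """Return the keys whose bucket changes when growing from old_n to new_n buckets."""
    return [k for k in keys if assign(k, old_n) != assign(k, new_n)]


call_ids = range(1, 10001)  # hypothetical sample of call ids
moved_mod = moved_keys(modulo_assign, call_ids, 3, 4)
moved_jump = moved_keys(jump_consistent_hash, call_ids, 3, 4)

print(f"modulo placement: {len(moved_mod)} of {len(call_ids)} rows change segment")
print(f"jump consistent hash: {len(moved_jump)} of {len(call_ids)} rows change segment")
```

With jump consistent hash, a row either stays where it was or moves to the newly added segment, so existing segments never exchange rows with each other. That property is what keeps data movement small during expansion.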
  • 11. GP v6 Replicated Tables
    Distributed Fact Table JOIN Replicated Dimension Table
      SEGMENT 1: fact rows (Call 1, Caller 1), (Call 5, Caller 2), (Call 9, Caller 1), (Call 13, Caller 3); dimension rows CallerID 1, CallerID 2, CallerID 3
      SEGMENT 2: fact rows (Call 2, Caller 1), (Call 6, Caller 3), (Call 10, Caller 3), (Call 14, Caller 3); dimension rows CallerID 1, CallerID 2, CallerID 3
      SEGMENT 3: fact rows (Call 3, Caller 3), (Call 7, Caller 3), (Call 11, Caller 1), (Call 15, Caller 1); dimension rows CallerID 1, CallerID 2, CallerID 3
      SEGMENT 4: fact rows (Call 4, Caller 2), (Call 8, Caller 3), (Call 12, Caller 1), (Call 16, Caller 1); dimension rows CallerID 1, CallerID 2, CallerID 3
    CREATE TABLE CallerUser (x CallerId, y Attribute) DISTRIBUTED REPLICATED;
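Why a replicated dimension table helps can be shown with a small simulation. The sketch below is not Greenplum code: it reuses the call and caller ids from slide 11, while the attribute strings and the function names are hypothetical. Each segment joins its local fact rows against its own full copy of the dimension, and the union of those local joins equals the global join, so no rows need to cross segments.

```python
NUM_SEGMENTS = 4

# Fact rows from the slide: call id -> caller id.
CALLS = {
    1: 1, 2: 1, 3: 3, 4: 2,
    5: 2, 6: 3, 7: 3, 8: 3,
    9: 1, 10: 3, 11: 1, 12: 1,
    13: 3, 14: 3, 15: 1, 16: 1,
}

# Dimension rows; the attribute values are made up for the example.
CALLERS = {1: "attr-of-caller-1", 2: "attr-of-caller-2", 3: "attr-of-caller-3"}


def distribute_fact(calls, num_segments):
    """Spread fact rows across segments by call id, as drawn on the slide."""
    segments = [[] for _ in range(num_segments)]
    for call_id, caller_id in sorted(calls.items()):
        segments[(call_id - 1) % num_segments].append((call_id, caller_id))
    return segments


def replicate_dimension(callers, num_segments):
    """Give every segment its own full copy of the dimension table."""
    return [dict(callers) for _ in range(num_segments)]


def local_join(fact_rows, dim_copy):
    """Join one segment's fact rows against that segment's dimension copy."""
    return [(call_id, caller_id, dim_copy[caller_id]) for call_id, caller_id in fact_rows]


fact_segments = distribute_fact(CALLS, NUM_SEGMENTS)
dim_segments = replicate_dimension(CALLERS, NUM_SEGMENTS)

# Every segment works only on its own data; results are simply concatenated.
joined = [
    row
    for seg in range(NUM_SEGMENTS)
    for row in local_join(fact_segments[seg], dim_segments[seg])
]

# Reference result: the same join computed over the whole data set at once.
reference = [(call_id, caller_id, CALLERS[caller_id]) for call_id, caller_id in sorted(CALLS.items())]
print(sorted(joined) == reference)
```

The trade-off is storage: the dimension is stored once per segment, which is why the slide pairs replication with small star-schema dimension tables rather than fact tables.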
  • 12. Eager-Agg Optimization in GPDB v6
    create table foo (j1 int, g1 int, s1 int);
    insert into foo select i%10000, i%1000, i from generate_series(1,100000000) i;
    ● 10,000 unique join values (j1)
    ● 1,000 unique grouping values (g1)
    create table bar (j2 int, g2 int, s2 int);
    insert into bar select i%100, i%1000, i from generate_series(1,100000) i;
    ● 100 unique join values (j2)
    ● 1,000 unique grouping values (g2)
    Query: select sum(s1) from foo, bar where j1 = j2 and s1%2 = 0 group by g1, g2;
    Greenplum v5: 63.8 seconds
    Greenplum v6: 7.4 seconds (~9X improvement)
  • 13. Aggregate Queries over Join GPDB v5
    ● Find the loss per line item for all returned items
    ● Join the line items to the orders
    ● Group them by store and compute the aggregate loss
    ● Straightforward translation of the query into the query plan
    ● If each order has a large number of line items, the join results can be quite large and expensive
    Query plan, evaluated bottom-up:
      1. σ (L_RETURNFLAG = “R”) on LINEITEM
      2. π L_LOSS: L_EXTENDEDPRICE * (1 - L_DISCOUNT)
      3. ⨝ (L_ORDERKEY = O_ORDERKEY) with ORDERS
      4. γ O_STORE (SUM(L_LOSS))
  • 14. Eager Agg Optimization GPDB v6
    ● Find the loss of revenue for each order
    ● Join the aggregated view with table ORDERS
    ● Compute the total loss for each store
    ● Benefit: the inner group-by reduces the number of rows fed to the join
    Query plan, evaluated bottom-up:
      1. σ (L_RETURNFLAG = “R”) on LINEITEM
      2. π L_LOSS: L_EXTENDEDPRICE * (1 - L_DISCOUNT)
      3. γ L_ORDERKEY (L_ORDERLOSS: SUM(L_LOSS))
      4. ⨝ (L_ORDERKEY = O_ORDERKEY) with ORDERS
      5. γ O_STORE (SUM(L_ORDERLOSS))
    [Yan95] W. P. Yan and P. Larson, "Eager Aggregation and Lazy Aggregation", VLDB 1995
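The rewrite on slides 13 and 14 can be checked on a toy data set. The sketch below uses Python's built-in `sqlite3` module rather than Greenplum, so it only demonstrates that the two query shapes return the same answer and that the eager form feeds fewer rows to the join. The table contents are generated for the example, and prices and discounts are stored as integers (discount in whole percent) so the sums compare exactly; both choices are assumptions, not part of the original slides.

```python
import sqlite3

# Slide 13 shape: filter, join every returned line item, then aggregate by store.
LAZY_SQL = """
SELECT o_store, SUM(l_extendedprice * (100 - l_discount)) AS loss
FROM lineitem JOIN orders ON l_orderkey = o_orderkey
WHERE l_returnflag = 'R'
GROUP BY o_store
ORDER BY o_store
"""

# Slide 14 shape: aggregate per order first, then join and aggregate by store.
EAGER_SQL = """
SELECT o_store, SUM(l_orderloss) AS loss
FROM (
    SELECT l_orderkey, SUM(l_extendedprice * (100 - l_discount)) AS l_orderloss
    FROM lineitem
    WHERE l_returnflag = 'R'
    GROUP BY l_orderkey
) AS per_order
JOIN orders ON per_order.l_orderkey = orders.o_orderkey
GROUP BY o_store
ORDER BY o_store
"""


def build_db():
    """Create an in-memory database with 100 orders and 2,000 line items."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE orders (o_orderkey INTEGER PRIMARY KEY, o_store INTEGER)")
    con.execute(
        "CREATE TABLE lineitem (l_orderkey INTEGER, l_extendedprice INTEGER, "
        "l_discount INTEGER, l_returnflag TEXT)"
    )
    con.executemany("INSERT INTO orders VALUES (?, ?)", [(o, o % 5) for o in range(1, 101)])
    line_items = [
        (1 + i % 100, 1000 + i, i % 10, "R" if i % 3 == 0 else "N")
        for i in range(2000)
    ]
    con.executemany("INSERT INTO lineitem VALUES (?, ?, ?, ?)", line_items)
    return con


con = build_db()
lazy_result = con.execute(LAZY_SQL).fetchall()
eager_result = con.execute(EAGER_SQL).fetchall()

# Rows reaching the join: every returned line item vs. one row per order.
lazy_join_rows = con.execute(
    "SELECT COUNT(*) FROM lineitem WHERE l_returnflag = 'R'"
).fetchone()[0]
eager_join_rows = con.execute(
    "SELECT COUNT(DISTINCT l_orderkey) FROM lineitem WHERE l_returnflag = 'R'"
).fetchone()[0]

print("same answer:", lazy_result == eager_result)
print("rows into join (lazy vs eager):", lazy_join_rows, eager_join_rows)
```

The rewrite is valid here because each line item joins to exactly one order, so summing per order before the join cannot change the per-store totals.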
  • 15. GPDB v6 zStd Compression
    Same or more for less
    ● Open Source
    ● Lower CPU Cycles with same or better compression
    ● Originated at Facebook
    CREATE TABLE call_data_records(callid int4, calldetails json) WITH (appendonly=true, compresstype=zstd, orientation=column) DISTRIBUTED BY (callid);
  • 17. Containerized Greenplum w/ GPDB v6
    ● GP embedded in containers for portability and dependency management
    ● Containers managed by Kubernetes for higher availability and elasticity
    ● Kubernetes operator used for automation
    [Diagram: a container operator automating pods]
  • 18. Greenplum Database V7: BEYOND THE CLUSTER
  • 19. GPDB v7: Beyond the Cluster
    We have all this Postgres infrastructure in GPDB v6; now let's use it
    ● Postgres 9.6 target
    ● DB Snapshots / Backup
    ● Streaming Replication
    ● Log Shipping and Reconciliation
    ● Greenplum as a source for Kafka
    ● Greenplum as a source for CDC Tools
    ● Greenplum to Greenplum Inter Cluster Queries
    “You do this and you can beat Oracle” -- US Federal Customer, 2018
  • 20. GPDB v7: Thought Leadership in Database AI
    Define Artificial Intelligence. Does it make sense to integrate intelligence into an analytical platform?
    ● 2019 Apache MADlib is focused on Deep Learning and GPU processing
    ● 2019 Pivotal’s GPText Solution will add more cognitive intelligence of human language
    ● Combine with existing functions: PostGIS Geospatial; Apache MADlib Machine Learning & Graph; Python, R libraries, SQL at scale
    ● This is a platform for modern AI!
    “With the Apache MADlib analytics libraries, Pivotal Greenplum has capable in-database analytics that allow for predictive modeling and ML to be applied to relational data.” -- Gartner MQ 2019
  • 21. “Greenplum Database, soar with us to new heights”
  • 22. #ScaleMatters