3. Goals for Cloud Deployments
■ FAST
■ Leverage the Cloud
■ Same Experience Across Clouds
■ Secure
4. Goal 1 - Fast
● Companies use Greenplum for SPEED
● Cloud Deployments Must be Fast too
5. Performance Tuning
What is Tuned?
● Virtual Machine
● Operating System
● Disk
● Memory
● Network
● Marketplace Template
How is it Measured?
● "gpcheckperf" (Greenplum Utility) for Network and Disk
● TPC-DS Benchmark
● Cloud Vendor Specs
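As a rough illustration of the "gpcheckperf" measurement step, the sketch below only builds the command lines for a disk test and a network test; it does not run them. The flags shown (-f, -d, -r ds, -r N, -D) are as recalled from the Greenplum utility reference and should be checked against your installed version. The host file names and directories are placeholders, not values from this deck.

```python
import shlex


def gpcheckperf_disk_cmd(hostfile, data_dirs):
    """Build a gpcheckperf command for the disk I/O and memory stream tests (-r ds)."""
    # -D asks for per-host results; assumed here, verify for your version.
    cmd = ["gpcheckperf", "-f", hostfile, "-r", "ds", "-D"]
    for directory in data_dirs:
        cmd += ["-d", directory]  # one -d per data directory to test
    return cmd


def gpcheckperf_network_cmd(hostfile, temp_dir):
    """Build a gpcheckperf command for the parallel-pair network test (-r N)."""
    return ["gpcheckperf", "-f", hostfile, "-r", "N", "-d", temp_dir]


if __name__ == "__main__":
    # Placeholder host files and directories, for illustration only.
    print(shlex.join(gpcheckperf_disk_cmd("hostfile_gpcheckperf", ["/data1", "/data2"])))
    print(shlex.join(gpcheckperf_network_cmd("hostfile_gpchecknet", "/tmp")))
```

Building the command as a list keeps it safe to pass to subprocess.run later without shell quoting problems.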
6. TPC-DS Performance Test
Score
● Transaction Processing Performance Council (TPC)
● Members include:
○ Pivotal, Cloudera, HP, IBM, Microsoft, MapR, Oracle, Red Hat, Teradata, Intel, VMware, Dell, and many others
● Decision Support (DS): Standard for Big Data / Data Warehousing
● Star Schema with 24 Tables and 99 Queries
● 3TB of data
● 1 and 5 Users
https://github.com/pivotalguru/tpc-ds
7. TPC-DS Performance Test
● Score is a Function of Duration and Hardware
● Larger Score = Faster
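The deck does not give the scoring formula, so the sketch below is purely hypothetical. It only illustrates the two stated properties: the score depends on both duration and hardware, and a larger score means a faster result. The function name, the "hardware_units" measure, and the scale constant are all assumptions made for illustration.

```python
def illustrative_score(duration_seconds, hardware_units, scale=1_000_000):
    """Hypothetical score: higher when the run is shorter or uses less hardware.

    This is NOT the formula used in the deck; it is one simple function
    with the stated properties.
    """
    if duration_seconds <= 0 or hardware_units <= 0:
        raise ValueError("duration and hardware must be positive")
    return scale / (duration_seconds * hardware_units)


if __name__ == "__main__":
    # Same hardware, half the duration: the score doubles.
    print(illustrative_score(200, 10), illustrative_score(100, 10))
```

Normalizing by hardware is what allows clusters of different sizes to be compared on one scale.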
9. Goal 2 - Leverage The Cloud
● Take Advantage of Cloud-Only Features
○ On-Demand Provisioning
○ Node Replacement
○ Disk Snapshots
○ Upgrades
○ Optional Installations
○ Web Based
10. On-Demand Provisioning
● Deployments Take less than 1 Hour to Complete
● Removes Barriers to Evaluate and Buy
● Empowers Business Units
Azure Resource Group Deployment
AWS CloudFormation
GCP Deployment Manager
11. Node Replacement
Pivotal Greenplum Self-Healing
● ANY Node Failure gets Automatically Replaced and Recovered
● Full Recovery in as little as 5 Minutes
○ On-Premises Recovery can last for Days!
● Online Recovery for Standby and Segment Hosts
● pgBouncer pause before Rebalance
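The "pause before Rebalance" ordering can be sketched as follows. This is a minimal illustration, not the deck's self-healing implementation. It assumes a caller-supplied run_admin function that sends a command to the PgBouncer admin console (PAUSE and RESUME are real PgBouncer admin commands) and a caller-supplied rebalance function (for example, one that runs "gprecoverseg -r"). Both function names are hypothetical.

```python
from contextlib import contextmanager


@contextmanager
def pgbouncer_paused(run_admin):
    """Pause PgBouncer for the duration of the block, then always resume."""
    run_admin("PAUSE")
    try:
        yield
    finally:
        # Resume even if the rebalance fails, so clients are not left blocked.
        run_admin("RESUME")


def rebalance_with_pause(run_admin, rebalance):
    """Run the rebalance step while client connections are paused."""
    with pgbouncer_paused(run_admin):
        rebalance()


if __name__ == "__main__":
    log = []
    # Stand-ins that only record what would be executed.
    rebalance_with_pause(log.append, lambda: log.append("gprecoverseg -r"))
    print(log)
```

Pausing at the connection pooler lets in-flight queries finish and holds new ones, so clients see a short wait rather than dropped connections during the rebalance.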
[Diagram: cluster of VMs with one failed VM marked X]
Demos in Pivotal
Booth!
12. Node Replacement
Pivotal Greenplum Self-Healing
Single Master
● Maintains High Availability
● No Performance Loss
● Fast Recovery with Self-Healing
● Save $$ on Infrastructure and Licensing Costs
[Diagram: Master on mdw connected over the Interconnect to segment hosts sdw1 (Standby, Seg1-Seg4), sdw2 (Seg5-Seg8), sdw3 (Seg9-Seg12), ...]
13. Disk Snapshots
gpsnap
● Schedule, Create, List, Delete, and Restore Snapshots with "gpsnap" and "gpcronsnap"
● IaaS Snapshots Provide Fast Backup of a Volume
● Full Cluster Backup Measured in Minutes
● Automatically Configured to take a Weekly Snapshot Backup
● Snapshots are executed in Parallel so they are very FAST!
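To show why running snapshots in parallel keeps a full-cluster backup short, here is a minimal sketch using a thread pool. It is not how "gpsnap" is implemented. The snapshot_volume callable stands in for a cloud provider's snapshot API call, and the volume IDs are made up.

```python
from concurrent.futures import ThreadPoolExecutor


def snapshot_all(volume_ids, snapshot_volume, max_workers=8):
    """Snapshot every volume concurrently; return {volume_id: snapshot_id}."""
    if not volume_ids:
        return {}
    workers = min(max_workers, len(volume_ids))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with volume_ids.
        snapshot_ids = list(pool.map(snapshot_volume, volume_ids))
    return dict(zip(volume_ids, snapshot_ids))


def fake_snapshot(volume_id):
    """Hypothetical stand-in for an IaaS snapshot request."""
    return f"snap-of-{volume_id}"


if __name__ == "__main__":
    print(snapshot_all(["vol-a", "vol-b", "vol-c"], fake_snapshot))
```

Because each snapshot request mostly waits on the cloud API, total time is close to the slowest single snapshot rather than the sum of all of them.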
[Diagram: Data Volume, Snapshot, Restore]
14. Upgrades
gprelease
● Notification of New Version Availability with gpcronrelease (Executes Weekly)
● Installation of New Version with gprelease
● Existing Optional Packages (MADlib, PostGIS, Command Center, etc.) Re-Installed and Upgraded if Needed
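The weekly "is a new version available?" check comes down to a version comparison. The sketch below is an assumption about how such a check might look, not the logic inside "gpcronrelease". It assumes purely numeric dotted version strings, and the version numbers are examples only.

```python
def parse_version(text):
    """Turn a dotted version like '5.10.0' into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.split("."))


def newer_version_available(installed, latest):
    """True when the latest published version is strictly newer than the installed one."""
    return parse_version(latest) > parse_version(installed)


if __name__ == "__main__":
    # Numeric comparison matters: as plain strings, "5.10.0" sorts before "5.9.0".
    print(newer_version_available("5.9.0", "5.10.0"))
```

Comparing tuples of integers avoids the string-sorting trap that makes "5.10" look older than "5.9".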
15. Optional Installations
gpoptional
● Deployment Parameters to Install Components
● Or Post-Deployment Tool "gpoptional"
● Included Packages
○ Command Center
○ Data Science R and Python
○ MADlib
○ PostGIS
○ PL/R
17. Goal 3 - Same Experience Across Clouds
● Similar Deployment
● Same Tools
● Same Software Versions
18. Parameters - Basics
Parameter         | AWS                     | Azure                     | GCP
Name?             | Stack Name              | Deployment Name           | Deployment Name
Where Deployed?   | Availability Zone       | Resource Group + Location | Zone
SSH Key?          | Key Name                | SSH Public Key            | N/A
Who Can Access?   | SSH Location            | SSH Location              | SSH Location
Subnet CIDR       | ClusterSubnet           | Subnet                    | Subnet
Instance Type?    | Instance Type + Storage | Instance Type + Storage   | Instance Type
Instance Storage? | N/A                     | N/A                       | Node Storage
How Many?         | Instance Count          | Instance Count            | Node Count
● GCP SSH Key is Managed Automatically
● Azure Deployments are in a Resource Group as well as in a Location
● AWS and Azure Storage is set by Instance Type for Optimal Performance
● GCP Disk Size does not impact performance
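The cross-cloud parameter table above can be expressed as a small lookup. This is a sketch for illustration only. The labels are copied from the table; the dictionary layout and the parameters_for helper are assumptions, not part of any marketplace template.

```python
# Parameter labels per cloud, copied from the "Parameters - Basics" table.
PARAMETER_NAMES = {
    "Name?": {"AWS": "Stack Name", "Azure": "Deployment Name", "GCP": "Deployment Name"},
    "Where Deployed?": {"AWS": "Availability Zone", "Azure": "Resource Group + Location", "GCP": "Zone"},
    "SSH Key?": {"AWS": "Key Name", "Azure": "SSH Public Key", "GCP": "N/A"},
    "Who Can Access?": {"AWS": "SSH Location", "Azure": "SSH Location", "GCP": "SSH Location"},
    "Subnet CIDR": {"AWS": "ClusterSubnet", "Azure": "Subnet", "GCP": "Subnet"},
    "Instance Type?": {"AWS": "Instance Type + Storage", "Azure": "Instance Type + Storage", "GCP": "Instance Type"},
    "Instance Storage?": {"AWS": "N/A", "Azure": "N/A", "GCP": "Node Storage"},
    "How Many?": {"AWS": "Instance Count", "Azure": "Instance Count", "GCP": "Node Count"},
}


def parameters_for(cloud):
    """Return the parameters a given cloud actually asks for (skipping N/A rows)."""
    return {
        question: names[cloud]
        for question, names in PARAMETER_NAMES.items()
        if names[cloud] != "N/A"
    }


if __name__ == "__main__":
    for question, label in parameters_for("GCP").items():
        print(f"{question:18} {label}")
```

Filtering out the N/A rows reproduces the per-cloud parameter lists: GCP drops the SSH key, while AWS and Azure drop separate instance storage.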
19. Parameters - AWS
Parameter       | AWS
Name?           | Stack Name
SSH Key?        | Key Name
Who Can Access? | SSH Location
Where Deployed? | Availability Zone
Subnet CIDR     | Subnet
Instance Type?  | Instance Type + Storage
How Many?       | Instance Count
20. Parameters - Azure
Parameter       | Azure
Name?           | Deployment Name
SSH Key?        | SSH Public Key
Who Can Access? | SSH Location
Subnet          | Subnet
Where Deployed? | Resource Group + Location
22. Parameters - GCP
Parameter         | GCP
Name?             | Deployment Name
Where Deployed?   | Zone
Subnet            | Subnet
Instance Type?    | Instance Type
How Many?         | Node Count
Instance Storage? | Node Storage
Who Can Access?   | SSH Location
● Dynamic SSH Keys
23. Parameters - Optional Installs
Parameter | AWS                 | Azure               | GCP
Install?  | Command Center      | Command Center      | Command Center
Install?  | MADlib              | MADlib              | MADlib
Install?  | Data Science Python | Data Science Python | Data Science Python
Install?  | Data Science R      | Data Science R      | Data Science R
Install?  | PL/R                | PL/R                | PL/R
Install?  | PostGIS             | PostGIS             | PostGIS
● Optional Installs performed by "gpoptional"
30. Documentation
● Release Notes
○ Detailed Information
○ Located on Each Marketplace Listing
● Overview
○ One Pager
○ Located on Each Marketplace Listing