The document discusses Greenplum for Kubernetes, which allows Greenplum databases to be deployed on Kubernetes. It can be deployed on public clouds, private clouds, or bare metal. Greenplum is packaged as containers for portability and managed by Kubernetes for high availability and elasticity. Benefits include speed of deployment, savings from using existing Kubernetes skills and hardware, security, stability, and scalability. Use cases include agile analytics, workbenches with curated tool stacks, and automatic data platforms with day-2 operations automation.
3. Greenplum for Kubernetes
Public Cloud, Private Cloud, Bare-Metal
Deploy workloads on any infrastructure
Other Kubernetes (on VMs or not)
Google Container Engine
Greenplum Building Blocks
• Pivotal blueprint + Dell reference hardware configs
• Superior price/performance; no expensive proprietary hardware
• The most performant way to run Greenplum on premises
• Certified and supported by Pivotal
It’s the same Greenplum in all environments, including hybrid deployments
via Kubernetes
Enterprise & Essentials (OSS K8s)
4. Kubernetes Intro
[Figure: timeline (not to scale) with markers at 1979, 2002, 2007, 2014 and 2016, showing the Linux kernel features chroot, namespaces and cgroups accumulating over time, followed by Kubernetes]
Container building blocks for
● resource management
● process isolation
● storage separation
Pod: the atomic unit that Kubernetes manages.
A pod is a group of containers.
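To make the "group of containers" idea concrete, here is a minimal sketch of a pod manifest built as a plain Python dictionary. The `apiVersion`, `kind`, `metadata` and `spec.containers` fields are standard Kubernetes manifest fields; the pod name, container names and images are hypothetical and are not taken from the slides.

```python
import json

# Hypothetical names and images, for illustration only.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "example-pod"},
    "spec": {
        # One pod, two containers: Kubernetes schedules and manages
        # them together as a single unit.
        "containers": [
            {"name": "database", "image": "example/database:1.0"},
            {"name": "log-shipper", "image": "example/log-shipper:1.0"},
        ]
    },
}

# Print the manifest as JSON, a format Kubernetes accepts alongside YAML.
print(json.dumps(pod, indent=2))
```

Both containers in the sketch belong to the same pod, so they would be placed on the same worker and started or removed together.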
[Figure: a K8s cluster containing pods; a K8s cluster containing pods and an operator pod]
5. GP embedded in containers for portability and dependency management
Containers managed by Kubernetes for higher availability and elasticity
Kubernetes operator used for automation
Container
Operator
Massively Parallel Postgres Databases & Kubernetes
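The automation role of an operator can be illustrated with a generic reconcile step: compare the desired state with what is actually running and work out the actions that close the gap. The sketch below is conceptual only and is not the Greenplum operator's code; the `reconcile` function and the `segment-N` pod naming scheme are assumptions made for illustration.

```python
from typing import List


def reconcile(desired_count: int, running: List[str]) -> List[str]:
    """Return the actions needed to move from `running` to the desired state.

    Hypothetical naming scheme: pods are called segment-0 .. segment-(n-1).
    """
    wanted = [f"segment-{i}" for i in range(desired_count)]
    actions = []
    # Create any pod that should exist but does not.
    for name in wanted:
        if name not in running:
            actions.append(f"create {name}")
    # Delete any pod that exists but is no longer wanted.
    for name in running:
        if name not in wanted:
            actions.append(f"delete {name}")
    return actions


# A pod has failed: the reconcile step asks for it to be recreated.
print(reconcile(3, ["segment-0", "segment-2"]))
# The cluster was scaled down: the extra pod is removed.
print(reconcile(1, ["segment-0", "segment-1"]))
```

A real operator would run a step like this repeatedly against the Kubernetes API, which is how the same mechanism can cover both recovery and scaling.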
6. Why Greenplum for Kubernetes?
Speed
Security
Stability
Savings
Scalability
8. Operational efficiency
Run anywhere, on any K8s
Run on any HW, storage
Leverage org’s K8s skills
Quick new-user ramp-up
Spin up a cluster for root cause analysis (RCA)
Why Greenplum for Kubernetes
Savings
17. Standardized end-to-end Data Science with the Greenplum/Postgres stack
Experimentation
Initial code development and testing, model experimentation on samples.
Modeling at Scale
Heavy compute tasks such as model training across big data
Deployment
Production deployment of models to feed downstream applications and reports
Artificial Intelligence: Closed Loop Machine Learning
18. Deployment Topology Options
If paying for physical servers => many pods per server
[Figure: K8s cluster with K8s worker 1 through K8s worker n, several pods on each worker]
If single worker => all pods on that worker
[Figure: K8s cluster with a single K8s worker 1 holding all pods]
If paying per VM => 1 pod per VM
[Figure: K8s cluster with K8s worker 1 through K8s worker n, one pod on each worker]
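The three topology rules of thumb above can be written down as a small planning helper. This is only a sketch of the slide's heuristics: the `plan_topology` function, its parameters and the billing labels `"physical"` and `"per_vm"` are hypothetical, and spreading pods evenly across physical servers is an assumption the slide does not state.

```python
import math


def plan_topology(total_pods: int, billing: str, workers: int) -> dict:
    """Suggest a pod layout from the slide's rules of thumb (hypothetical helper)."""
    if workers == 1:
        # Single worker: every pod lands on that worker.
        return {"workers": 1, "pods_per_worker": total_pods}
    if billing == "per_vm":
        # Paying per VM: one pod per VM, so one worker per pod.
        return {"workers": total_pods, "pods_per_worker": 1}
    if billing == "physical":
        # Paying for physical servers: pack many pods onto each server.
        # Even spreading is an assumption, not something the slide specifies.
        return {"workers": workers, "pods_per_worker": math.ceil(total_pods / workers)}
    raise ValueError(f"unknown billing model: {billing}")


print(plan_topology(8, "physical", 2))
print(plan_topology(8, "per_vm", 4))
print(plan_topology(8, "physical", 1))
```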
19. Kubernetes provides that flexibility
There is a growing number of storage classes:
○ local for performance
○ remote for flexibility
○ others with features such as dynamic growth
Users can choose the best storage class for their needs
GCEPersistentDisk
AWSElasticBlockStore
PortworxVolume
VsphereVolume
(formerly ScaleIO)
Azure Disk
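In Kubernetes terms, choosing a storage class usually means naming it in a PersistentVolumeClaim. The sketch below builds such a claim as a Python dictionary; `storageClassName`, `accessModes` and `resources.requests.storage` are standard PersistentVolumeClaim fields, while the `make_pvc` helper, the claim name, the class name `local-fast` and the size are hypothetical and do not come from the slides.

```python
import json


def make_pvc(name: str, storage_class: str, size: str) -> dict:
    """Build a PersistentVolumeClaim manifest for the given storage class."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            # The storage class is the single knob the user turns to pick
            # local, remote, or feature-rich storage.
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


# Hypothetical values: a 100Gi claim against a class named "local-fast".
claim = make_pvc("segment-data", "local-fast", "100Gi")
print(json.dumps(claim, indent=2))
```

Switching from local to remote storage in this sketch only changes the `storage_class` argument; the rest of the claim stays the same.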
Allow users to specify a storage class because...