SlideShare a Scribd company logo
1 of 43
Download to read offline
PGConf.ASIA 2019
Indonesia, Bali
PostgreSQL on K8S at
Zalando: two+ years in
production
ALEXANDER KUKUSHKIN
10-09-2019
2
ABOUT ME
Alexander Kukushkin
Database Engineer @ZalandoTech
The Patroni guy
alexander.kukushkin@zalando.de
Twitter: @cyberdemn
3
Typical problems and horror stories
Brief introduction to Kubernetes
Spilo & Patroni
Postgres-Operator
AGENDA
4
WE BRING FASHION TO PEOPLE IN 17 COUNTRIES
17 markets
7 fulfillment centers
26.4 million active customers
5.4 billion € net sales 2018
250 million visits per month
15,000 employees in Europe
5
PostgreSQL at Zalando
> 300
In the data centers
> 160
Databases on AWS
Managed by DB team
> 1000
Databases in other
Kubernetes clusters
> 190
Run in the ACID’s
Kubernetes cluster
6
What is Kubernetes?
● A set of open-source components running on one or more servers
● A container orchestration system
● An abstraction layer over your real or virtualized hardware
● An “infrastructure as a code” system
● Automatic resource allocation
● Next step after Ansible/Chef/Puppet
7
What Kubernetes is not?
● An operating system
● A magical way to make your infrastructure scalable
● An excuse to fire your devops (someone has to
configure and manage it)
● A good solution for running 2-3 servers
8
Terminology
Traditional infrastructure
● Physical server
● Virtual machine
● Individual application
● NAS/SAN
● Load balancer
● Application registry/hardware information
● Password files, certificates
Kubernetes
● Node
● Pod
● Container (typically Docker)
● Persistent Volumes
● Service/Endpoint
● Labels
● Secrets
9
Kubernetes overview
10
Stateful applications on Kubernetes
● PersistentVolumes
○ Abstracts details how storage is provisioned
○ Supports many different storage types via plugins:
■ EBS, AzureDisk, iSCSI, NFS, CEPH, Glusterfs and so on
● StatefulSets
○ Guarantied number of Pods with stable (and unique) identifiers
○ Ordered deployment and scaling
○ Connecting Pods with corresponding persistent storage
(PersistentVolume+PersistentVolumeClaim)
11
● All supported versions of PostgreSQL inside the single image
● Plenty of extensions (pg_partman, pg_cron, postgis, timescaledb, etc)
● Additional tools (pgq, pgbouncer, wal-e/wal-g)
● PGDATA on an external volume
● Patroni for HA
● Environment-variables based configuration
● Lightweight!
Spilo Docker image
12
● Automatic failover solution for PostgreSQL streaming replication
● A python daemon that manages one PostgreSQL instance
● Keeps the cluster state in a DCS (Etcd, Zookeeper, Consul, Kubernetes)
○ Uses DCS for leader elections
■ Makes PostgreSQL 1st class citizen on Kubernetes!
● Helps to automate a lot of things like:
○ A new cluster deployment
○ Scaling out and in
○ PostgreSQL configuration management
What is Patroni
13
Spilo & Patroni on K8S
Node2
Pod: demo-0
role: master
PersistentVolume
PersistentVolume
Node1
StatefulSet: demo
Pod: demo-1
role: replica
Secret: demo
WATCH()
UPDATE()
Service: demo-replica
labelSelector: role=replica
Service: demo
Endpoint: demoS3
14
● A few long YAML manifests to write
● Different parts of PostgreSQL configuration spread over multiple
manifests
● No easy way to work with a cluster as a whole (update, delete)
● Manual generation of DB objects, i.e. users, and their passwords.
Manual deployment to Kubernetes
15
● Encapsulates knowledge of a human operating the service
● Implement a controller application to act on custom resources
● CRD (custom resource definitions) to describe a domain-specific object
(i.e. a Postgres cluster)
https://coreos.com/blog/introducing-operators.html
Kubernetes operator pattern
16
● Defines a custom Postgresql resource
● Watches instances of Postgresql, creates/updates/deletes corresponding
Kubernetes objects
● Allows updating running-cluster resources (memory, cpu, volumes), postgres
configuration
● Creates databases, users and automatically generates passwords
● Auto-repairs, smart rolling updates (switchover to replicas before updating
the master)
Zalando Postgres-Operator
17
apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
name: acid-minimal-cluster
spec:
teamId: "ACID" # is used to provision human users
volume:
size: 1Gi
numberOfInstances: 2
users:
zalando: # database owner
- createrole
- createdb
foo_app_user: # role for application foo
databases: # name->owner
foo: zalando
postgresql:
version: "11"
Postgresql manifest
18
deploycluster
manifest
Stateful set
Spilo pod
Kubernetes cluster
PATRONI
operator
pod
Endpoint
Service
Client
application
operator
config map
Cluster
secrets
DB
deployer
create
create
create
watch
Infrastructure
roles
19
https://www.reddit.com/r/ProgrammerHumor/comments/9fhvyl/writing_yaml/
20
21
Zalando Postgres-Operator features
● Rolling updates on Postgres cluster changes
● Volume resize without Pod restarts
● Cloning Postgres clusters
● Logical Backups to S3 Bucket
● Standby cluster from S3 WAL archive
● Configurable for non-cloud environments
● UI to create and edit Postgres cluster manifests
22
Kubernetes rolling upgrade
● Rotates all worker nodes in the K8s cluster
○ Does it in a rolling matter, one-by-one
○ If you are unlucky, it will cause the number of failover equal number
of pods in your postgres cluster
● How Postgres-Operator deals with it?
○ Detect the to-be-decommissioned node by lack of the ready label
and SchedulingDisabled status
○ Move all master pods:
■ Move replicas to the already updated node
■ Trigger switchover to those replicas
23
Availability Zone 1
Node
cluster: A
primary
cluster: B
primary
cluster: C
replica
Availability Zone 2 Availability Zone 3
Kubernetes rolling upgrade (start)
Node Node
cluster: A
replica
cluster: B
replica
cluster: C
primary
Node Node
cluster: A
replica
cluster: B
replica
cluster: C
replica
Node
Node (to-be-decommissioned) Node (new) Terminated PodActive Pod
24
Availability Zone 1
Node
cluster: A
primary
cluster: B
primary
cluster: C
replica
Availability Zone 2 Availability Zone 3
Kubernetes rolling upgrade (step 1)
Node Node
cluster: A
replica
cluster: B
replica
cluster: C
primary
Node Node
cluster: A
replica
cluster: B
replica
cluster: C
replica
Node
cluster: A
replica
cluster: B
replica
cluster: A
replica
cluster: B
replica
cluster: C
replica
cluster: C
replica
Node (to-be-decommissioned) Node (new) Terminated PodActive Pod
25
Availability Zone 1
Node
cluster: A
primary
cluster: B
primary
Availability Zone 2 Availability Zone 3
Kubernetes rolling upgrade (step 1)
Node Node
cluster: C
primary
Node Node
cluster: A
replica
cluster: B
replica
cluster: A
replica
cluster: B
replica
cluster: C
replica
cluster: C
replica
Node (to-be-decommissioned) Node (new) Terminated PodActive Pod
26
Availability Zone 1
Node
cluster: A
primary
cluster: B
primary
Availability Zone 2 Availability Zone 3
Kubernetes rolling upgrade (switchover)
Node Node
cluster: C
primary
Node Node
cluster: A
replica
cluster: B
replica
cluster: A
replica
cluster: B
replica
cluster: C
replica
cluster: C
replica
Node (to-be-decommissioned) Node (new) Terminated PodActive Pod
27
Availability Zone 1
Node
cluster: A
replica
cluster: B
replica
Availability Zone 2 Availability Zone 3
Kubernetes rolling upgrade (switchover)
Node Node
cluster: C
replica
Node Node
cluster: A
primary
cluster: B
replica
cluster: A
replica
cluster: B
primary
cluster: C
replica
cluster: C
primary
Node (to-be-decommissioned) Node (new) Terminated PodActive Pod
28
Availability Zone 1
Node
Availability Zone 2 Availability Zone 3
Kubernetes rolling upgrade (finish)
Node
cluster: A
replica
cluster: B
replica
Node Node
cluster: C
replica
Node
cluster: A
primary
cluster: B
replica
cluster: A
replica
cluster: B
primary
cluster: C
replica
cluster: C
primary
Node (to-be-decommissioned) Node (new) Terminated PodActive Pod
cluster: A
replica
cluster: B
replica
cluster: C
replica
Most common issues
on K8s
30
Problems with AWS infrastructure
● AWS API Rate Limit Exceeded
○ Prevents or delays attaching/detaching persistent volumes (EBS)
to/from Pods
■ Delays recovery of failed Pods
○ Might delay a deployment of a new cluster
● Sometimes EC2 instances fail and being shutdown by AWS
○ Shutdown might take ages
○ All EBS volumes remain attached until instance is shutted down
■ Pods can’t be rescheduled
31
KUBE2IAM
● AWS IAM + Instance Profile is very flexible, convenient, and powerful
technology
● AWS tools and libraries can use temporary credentials obtained from the
Instance Profile to access other AWS services, for example S3
○ We are using S3 for backups and continuous archiving (wal-e/wal-g)
● Instance Profile is configured per EC2 Instance (K8s Node)!
● kube2iam - provides different IAM roles for Pods running on a single Node
● kube2iam might stop serving the Pod which is being shutdown
○ wal-e is unable to get credentials to access S3
■ Last WAL segments remain unarchived! :(
32
Lack of Disk space
● Single volume for PGDATA, pg_wal and logs
● FATAL,53100,could not write to file
"pg_wal/xlogtemp.22993": No space left on device
○ Usually ends up with postgres being self shutdown
● Patroni tries to recover the primary which isn’t running
○ “start->promote->No space left->shutdown” loop
Disk space MUST be monitored!
33
Disk space problems root cause or why
we don’t implement volume autoextend
● Excessive logging
○ slow queries, human access, application errors, connections/disconnections
● pg_wal growth
○ archive_command is slow/failing
○ Unconsumed changes on the replication slot
■ Replica is not streaming? Replica is slow?
■ Logical replication slot?
○ checkpoints taking too long due to throttled IOPS
● PGDATA growth
○ Table and index bloat!
■ Useless updates of unchanged data?
■ Autovacuum tuning? Zheap?
○ Natural growth of data
■ Lack of retention policies?
■ Broken cleanup jobs?
34
ORM can cause wal-e to fail!
wal_e.main ERROR MSG: Attempted to archive a file that is too
large. HINT: There is a file in the postgres database
directory that is larger than 1610612736 bytes. If no such
file exists, please report this as a bug. In particular,
check pg_stat/pg_stat_statements.stat.tmp, which appears to
be 2010822591 bytes
Meanwhile in pg_stat_statements:
UPDATE foo SET bar = $1 WHERE id IN ($2, $3, $4, …, $10500);
UPDATE foo SET bar = $1 WHERE id IN ($2, $3, $4, …, $100500);
…. and so on
35
Exclusive backup issues
PANIC,XX000,"online backup was canceled, recovery cannot
continue",,,,,"xlog redo at D45/EB000028 for
XLOG/CHECKPOINT_SHUTDOWN: redo D45/EB000028; tli 237; prev
tli 237; fpw true; xid 0:105446371; oid 187558; multi 1;
offset 0; oldest xid 544 in DB 1; oldest multi 1 in DB 1;
oldest/newest commit timestamp xid: 0/0; oldest running xid
0; shutdown",,,,""
● There is no way to join back such failed primary as a replica without
rebuilding (reinitializing) it!
○ wal-g supports non-exclusive backups, but not yet stable enough
36
● Postgres log:
server process (PID 10810) was terminated by signal 9: Killed
● dmesg -T:
[Wed Jul 31 01:35:35 2019] Memory cgroup out of memory: Kill process 14208
(postgres) score 606 or sacrifice child
[Wed Jul 31 01:35:35 2019] Killed process 14208 (postgres) total-vm:2972124kB,
anon-rss:68724kB, file-rss:1304kB, shmem-rss:2691844kB
[Wed Jul 31 01:35:35 2019] oom_reaper: reaped process 14208 (postgres), now
anon-rss:0kB, file-rss:0kB, shmem-rss:2691844kB
Out-Of-Memory Killer
37
Out-Of-Memory Killer
● Pids in the container (10810) and on the host are different (14208)
○ Sometimes it is hard to investigate what happened
● oom_score_adj trick doesn’t really make sense in the container
○ There are only Patroni+PostgreSQL running
● It is not really clear how memory accounting in the container works:
○ memory: usage 8388392kB, limit 8388608kB, failcnt 1
○ cache:2173896KB rss:6019692KB rss_huge:0KB shmem:2173428KB
mapped_file:2173512KB dirty:132KB writeback:0KB swap:0KB
inactive_anon:15732KB active_anon:8177696KB inactive_file:320KB active_file:184KB
unevictable:0KB
38
Kubernetes+Docker
● ERROR: could not resize shared memory
segment "/PostgreSQL.1384046013" to 8388608
bytes: No space left on device
● PostgreSQL 11 (due to the “parallel hash join”)
● Docker limits /dev/shm to 64MB by default
● How to fix?
○ Mount custom dshm tmpfs volume to /dev/shm
■ Or set enableShmVolume: true in the cluster
manifest
39
Problems with PostgreSQL
● Logical decoding on the replica? Failover slots?
○ Patroni does sort of a hack by not allowing connections until
logical slot is created.
■ Consumer might still lose some events.
● “FATAL too many connections”
○ Prevents replica from starting streaming
■ Solved in PostgreSQL 12 (wal_senders not count as
part of max_connection)
○ Built-in connection pooler?
40
Human errors
● YAML formatting :)
● Inadequate resource and limit requests
○ Pod can’t be scheduled due to the node weakness
○ Processes are terminated by oom-killer
● Deleted Postgres-Operator/Spilo ServiceAccount by
employees
41
Conclusion
● Postgres-Operator helps us to manage more than 1000
PostgreSQL clusters distributed in 80+ K8s accounts with
minimal effort.
○ It wouldn’t be possible without high level of
automation
● In the cloud and on K8s you have to be ready to deal with
absolutely new problems and failure scenarios
○ Find the solution and implement it in the operator!
42
● Postgres-operator: https://github.com/zalando/postgres-operator
● Patroni: https://github.com/zalando/patroni
● Spilo: https://github.com/zalando/spilo
Open-source
Thank you!
Questions?

More Related Content

What's hot

A guide of PostgreSQL on Kubernetes
A guide of PostgreSQL on KubernetesA guide of PostgreSQL on Kubernetes
A guide of PostgreSQL on Kubernetest8kobayashi
 
PGConf.ASIA 2019 - High Availability, 10 Seconds Failover - Lucky Haryadi
PGConf.ASIA 2019 - High Availability, 10 Seconds Failover - Lucky HaryadiPGConf.ASIA 2019 - High Availability, 10 Seconds Failover - Lucky Haryadi
PGConf.ASIA 2019 - High Availability, 10 Seconds Failover - Lucky HaryadiEqunix Business Solutions
 
Deploying PostgreSQL on Kubernetes
Deploying PostgreSQL on KubernetesDeploying PostgreSQL on Kubernetes
Deploying PostgreSQL on KubernetesJimmy Angelakos
 
Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kacz...
Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kacz...Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kacz...
Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kacz...Spark Summit
 
Operating PostgreSQL at Scale with Kubernetes
Operating PostgreSQL at Scale with KubernetesOperating PostgreSQL at Scale with Kubernetes
Operating PostgreSQL at Scale with KubernetesJonathan Katz
 
Online Upgrade Using Logical Replication.
Online Upgrade Using Logical Replication.Online Upgrade Using Logical Replication.
Online Upgrade Using Logical Replication.EDB
 
Multi Master PostgreSQL Cluster on Kubernetes
Multi Master PostgreSQL Cluster on KubernetesMulti Master PostgreSQL Cluster on Kubernetes
Multi Master PostgreSQL Cluster on KubernetesOhyama Masanori
 
Beyond unit tests: Deployment and testing for Hadoop/Spark workflows
Beyond unit tests: Deployment and testing for Hadoop/Spark workflowsBeyond unit tests: Deployment and testing for Hadoop/Spark workflows
Beyond unit tests: Deployment and testing for Hadoop/Spark workflowsDataWorks Summit
 
PGConf.ASIA 2019 Bali - Modern PostgreSQL Monitoring & Diagnostics - Mahadeva...
PGConf.ASIA 2019 Bali - Modern PostgreSQL Monitoring & Diagnostics - Mahadeva...PGConf.ASIA 2019 Bali - Modern PostgreSQL Monitoring & Diagnostics - Mahadeva...
PGConf.ASIA 2019 Bali - Modern PostgreSQL Monitoring & Diagnostics - Mahadeva...Equnix Business Solutions
 
Patroni - HA PostgreSQL made easy
Patroni - HA PostgreSQL made easyPatroni - HA PostgreSQL made easy
Patroni - HA PostgreSQL made easyAlexander Kukushkin
 
GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)Kohei KaiGai
 
PGConf.ASIA 2019 Bali - Performance Analysis at Full Power - Julien Rouhaud
PGConf.ASIA 2019 Bali - Performance Analysis at Full Power - Julien RouhaudPGConf.ASIA 2019 Bali - Performance Analysis at Full Power - Julien Rouhaud
PGConf.ASIA 2019 Bali - Performance Analysis at Full Power - Julien RouhaudEqunix Business Solutions
 
PGConf.ASIA 2019 Bali - Patroni on GitLab.com - Jose Cores Finnoto
PGConf.ASIA 2019 Bali - Patroni on GitLab.com - Jose Cores FinnotoPGConf.ASIA 2019 Bali - Patroni on GitLab.com - Jose Cores Finnoto
PGConf.ASIA 2019 Bali - Patroni on GitLab.com - Jose Cores FinnotoEqunix Business Solutions
 
patroni-based citrus high availability environment deployment
patroni-based citrus high availability environment deploymentpatroni-based citrus high availability environment deployment
patroni-based citrus high availability environment deploymenthyeongchae lee
 
PGConf APAC 2018 - PostgreSQL performance comparison in various clouds
PGConf APAC 2018 - PostgreSQL performance comparison in various cloudsPGConf APAC 2018 - PostgreSQL performance comparison in various clouds
PGConf APAC 2018 - PostgreSQL performance comparison in various cloudsPGConf APAC
 
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike SteenbergenMeet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergendistributed matters
 
Greenplum Kontained: Coordinating Many PostgreSQL Instances on Kubernetes: Cl...
Greenplum Kontained: Coordinating Many PostgreSQL Instances on Kubernetes: Cl...Greenplum Kontained: Coordinating Many PostgreSQL Instances on Kubernetes: Cl...
Greenplum Kontained: Coordinating Many PostgreSQL Instances on Kubernetes: Cl...VMware Tanzu
 
PGConf.ASIA 2019 Bali - Your Business Continuity Matrix and PostgreSQL's Disa...
PGConf.ASIA 2019 Bali - Your Business Continuity Matrix and PostgreSQL's Disa...PGConf.ASIA 2019 Bali - Your Business Continuity Matrix and PostgreSQL's Disa...
PGConf.ASIA 2019 Bali - Your Business Continuity Matrix and PostgreSQL's Disa...Equnix Business Solutions
 
Greenplum Overview for Postgres Hackers - Greenplum Summit 2018
Greenplum Overview for Postgres Hackers - Greenplum Summit 2018Greenplum Overview for Postgres Hackers - Greenplum Summit 2018
Greenplum Overview for Postgres Hackers - Greenplum Summit 2018VMware Tanzu
 

What's hot (20)

A guide of PostgreSQL on Kubernetes
A guide of PostgreSQL on KubernetesA guide of PostgreSQL on Kubernetes
A guide of PostgreSQL on Kubernetes
 
PGConf.ASIA 2019 - High Availability, 10 Seconds Failover - Lucky Haryadi
PGConf.ASIA 2019 - High Availability, 10 Seconds Failover - Lucky HaryadiPGConf.ASIA 2019 - High Availability, 10 Seconds Failover - Lucky Haryadi
PGConf.ASIA 2019 - High Availability, 10 Seconds Failover - Lucky Haryadi
 
Deploying PostgreSQL on Kubernetes
Deploying PostgreSQL on KubernetesDeploying PostgreSQL on Kubernetes
Deploying PostgreSQL on Kubernetes
 
Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kacz...
Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kacz...Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kacz...
Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kacz...
 
Operating PostgreSQL at Scale with Kubernetes
Operating PostgreSQL at Scale with KubernetesOperating PostgreSQL at Scale with Kubernetes
Operating PostgreSQL at Scale with Kubernetes
 
Online Upgrade Using Logical Replication.
Online Upgrade Using Logical Replication.Online Upgrade Using Logical Replication.
Online Upgrade Using Logical Replication.
 
Multi Master PostgreSQL Cluster on Kubernetes
Multi Master PostgreSQL Cluster on KubernetesMulti Master PostgreSQL Cluster on Kubernetes
Multi Master PostgreSQL Cluster on Kubernetes
 
Beyond unit tests: Deployment and testing for Hadoop/Spark workflows
Beyond unit tests: Deployment and testing for Hadoop/Spark workflowsBeyond unit tests: Deployment and testing for Hadoop/Spark workflows
Beyond unit tests: Deployment and testing for Hadoop/Spark workflows
 
PGConf.ASIA 2019 Bali - Modern PostgreSQL Monitoring & Diagnostics - Mahadeva...
PGConf.ASIA 2019 Bali - Modern PostgreSQL Monitoring & Diagnostics - Mahadeva...PGConf.ASIA 2019 Bali - Modern PostgreSQL Monitoring & Diagnostics - Mahadeva...
PGConf.ASIA 2019 Bali - Modern PostgreSQL Monitoring & Diagnostics - Mahadeva...
 
Patroni - HA PostgreSQL made easy
Patroni - HA PostgreSQL made easyPatroni - HA PostgreSQL made easy
Patroni - HA PostgreSQL made easy
 
GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)
 
PGConf.ASIA 2019 Bali - Performance Analysis at Full Power - Julien Rouhaud
PGConf.ASIA 2019 Bali - Performance Analysis at Full Power - Julien RouhaudPGConf.ASIA 2019 Bali - Performance Analysis at Full Power - Julien Rouhaud
PGConf.ASIA 2019 Bali - Performance Analysis at Full Power - Julien Rouhaud
 
PGConf.ASIA 2019 Bali - Patroni on GitLab.com - Jose Cores Finnoto
PGConf.ASIA 2019 Bali - Patroni on GitLab.com - Jose Cores FinnotoPGConf.ASIA 2019 Bali - Patroni on GitLab.com - Jose Cores Finnoto
PGConf.ASIA 2019 Bali - Patroni on GitLab.com - Jose Cores Finnoto
 
PostgreSQL with OpenCL
PostgreSQL with OpenCLPostgreSQL with OpenCL
PostgreSQL with OpenCL
 
patroni-based citrus high availability environment deployment
patroni-based citrus high availability environment deploymentpatroni-based citrus high availability environment deployment
patroni-based citrus high availability environment deployment
 
PGConf APAC 2018 - PostgreSQL performance comparison in various clouds
PGConf APAC 2018 - PostgreSQL performance comparison in various cloudsPGConf APAC 2018 - PostgreSQL performance comparison in various clouds
PGConf APAC 2018 - PostgreSQL performance comparison in various clouds
 
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike SteenbergenMeet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
 
Greenplum Kontained: Coordinating Many PostgreSQL Instances on Kubernetes: Cl...
Greenplum Kontained: Coordinating Many PostgreSQL Instances on Kubernetes: Cl...Greenplum Kontained: Coordinating Many PostgreSQL Instances on Kubernetes: Cl...
Greenplum Kontained: Coordinating Many PostgreSQL Instances on Kubernetes: Cl...
 
PGConf.ASIA 2019 Bali - Your Business Continuity Matrix and PostgreSQL's Disa...
PGConf.ASIA 2019 Bali - Your Business Continuity Matrix and PostgreSQL's Disa...PGConf.ASIA 2019 Bali - Your Business Continuity Matrix and PostgreSQL's Disa...
PGConf.ASIA 2019 Bali - Your Business Continuity Matrix and PostgreSQL's Disa...
 
Greenplum Overview for Postgres Hackers - Greenplum Summit 2018
Greenplum Overview for Postgres Hackers - Greenplum Summit 2018Greenplum Overview for Postgres Hackers - Greenplum Summit 2018
Greenplum Overview for Postgres Hackers - Greenplum Summit 2018
 

Similar to PGConf.ASIA 2019 Bali - PostgreSQL on K8S at Zalando - Alexander Kukushkin

PGConf APAC 2018 - Patroni: Kubernetes-native PostgreSQL companion
PGConf APAC 2018 - Patroni: Kubernetes-native PostgreSQL companionPGConf APAC 2018 - Patroni: Kubernetes-native PostgreSQL companion
PGConf APAC 2018 - Patroni: Kubernetes-native PostgreSQL companionPGConf APAC
 
Patroni: Kubernetes-native PostgreSQL companion
Patroni: Kubernetes-native PostgreSQL companionPatroni: Kubernetes-native PostgreSQL companion
Patroni: Kubernetes-native PostgreSQL companionAlexander Kukushkin
 
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...javier ramirez
 
Keynote #Tech - Google : aperçu de la gestion des services distribués chez Go...
Keynote #Tech - Google : aperçu de la gestion des services distribués chez Go...Keynote #Tech - Google : aperçu de la gestion des services distribués chez Go...
Keynote #Tech - Google : aperçu de la gestion des services distribués chez Go...Paris Open Source Summit
 
Lessons learned with kubernetes in production at PlayPass
Lessons learned with kubernetes in productionat PlayPassLessons learned with kubernetes in productionat PlayPass
Lessons learned with kubernetes in production at PlayPassPeter Vandenabeele
 
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)Kevin Lynch
 
Introduction to kubernetes
Introduction to kubernetesIntroduction to kubernetes
Introduction to kubernetesRishabh Indoria
 
[KubeCon EU 2020] containerd Deep Dive
[KubeCon EU 2020] containerd Deep Dive[KubeCon EU 2020] containerd Deep Dive
[KubeCon EU 2020] containerd Deep DiveAkihiro Suda
 
Kubernetes - State of the Union (Q1-2016)
Kubernetes - State of the Union (Q1-2016)Kubernetes - State of the Union (Q1-2016)
Kubernetes - State of the Union (Q1-2016)DoiT International
 
Customize and Secure the Runtime and Dependencies of Your Procedural Language...
Customize and Secure the Runtime and Dependencies of Your Procedural Language...Customize and Secure the Runtime and Dependencies of Your Procedural Language...
Customize and Secure the Runtime and Dependencies of Your Procedural Language...VMware Tanzu
 
Container & kubernetes
Container & kubernetesContainer & kubernetes
Container & kubernetesTed Jung
 
Kubernetes: The Next Research Platform
Kubernetes: The Next Research PlatformKubernetes: The Next Research Platform
Kubernetes: The Next Research PlatformBob Killen
 
PuppetConf 2017: Kubernetes in the Cloud w/ Puppet + Google Container Engine-...
PuppetConf 2017: Kubernetes in the Cloud w/ Puppet + Google Container Engine-...PuppetConf 2017: Kubernetes in the Cloud w/ Puppet + Google Container Engine-...
PuppetConf 2017: Kubernetes in the Cloud w/ Puppet + Google Container Engine-...Puppet
 
Kubernetes Basis: Pods, Deployments, and Services
Kubernetes Basis: Pods, Deployments, and ServicesKubernetes Basis: Pods, Deployments, and Services
Kubernetes Basis: Pods, Deployments, and ServicesJian-Kai Wang
 
Scylla on Kubernetes: Introducing the Scylla Operator
Scylla on Kubernetes: Introducing the Scylla OperatorScylla on Kubernetes: Introducing the Scylla Operator
Scylla on Kubernetes: Introducing the Scylla OperatorScyllaDB
 
Kubernetes @ Squarespace: Kubernetes in the Datacenter
Kubernetes @ Squarespace: Kubernetes in the DatacenterKubernetes @ Squarespace: Kubernetes in the Datacenter
Kubernetes @ Squarespace: Kubernetes in the DatacenterKevin Lynch
 
[WSO2Con EU 2018] Deploying Applications in K8S and Docker
[WSO2Con EU 2018] Deploying Applications in K8S and Docker[WSO2Con EU 2018] Deploying Applications in K8S and Docker
[WSO2Con EU 2018] Deploying Applications in K8S and DockerWSO2
 

Similar to PGConf.ASIA 2019 Bali - PostgreSQL on K8S at Zalando - Alexander Kukushkin (20)

PGConf APAC 2018 - Patroni: Kubernetes-native PostgreSQL companion
PGConf APAC 2018 - Patroni: Kubernetes-native PostgreSQL companionPGConf APAC 2018 - Patroni: Kubernetes-native PostgreSQL companion
PGConf APAC 2018 - Patroni: Kubernetes-native PostgreSQL companion
 
Patroni: Kubernetes-native PostgreSQL companion
Patroni: Kubernetes-native PostgreSQL companionPatroni: Kubernetes-native PostgreSQL companion
Patroni: Kubernetes-native PostgreSQL companion
 
Kubernetes 101
Kubernetes 101Kubernetes 101
Kubernetes 101
 
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
 
Keynote #Tech - Google : aperçu de la gestion des services distribués chez Go...
Keynote #Tech - Google : aperçu de la gestion des services distribués chez Go...Keynote #Tech - Google : aperçu de la gestion des services distribués chez Go...
Keynote #Tech - Google : aperçu de la gestion des services distribués chez Go...
 
Lessons learned with kubernetes in production at PlayPass
Lessons learned with kubernetes in productionat PlayPassLessons learned with kubernetes in productionat PlayPass
Lessons learned with kubernetes in production at PlayPass
 
Containers > VMs
Containers > VMsContainers > VMs
Containers > VMs
 
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
 
Introduction to kubernetes
Introduction to kubernetesIntroduction to kubernetes
Introduction to kubernetes
 
[KubeCon EU 2020] containerd Deep Dive
[KubeCon EU 2020] containerd Deep Dive[KubeCon EU 2020] containerd Deep Dive
[KubeCon EU 2020] containerd Deep Dive
 
Kubernetes - State of the Union (Q1-2016)
Kubernetes - State of the Union (Q1-2016)Kubernetes - State of the Union (Q1-2016)
Kubernetes - State of the Union (Q1-2016)
 
App container rkt
App container rktApp container rkt
App container rkt
 
Customize and Secure the Runtime and Dependencies of Your Procedural Language...
Customize and Secure the Runtime and Dependencies of Your Procedural Language...Customize and Secure the Runtime and Dependencies of Your Procedural Language...
Customize and Secure the Runtime and Dependencies of Your Procedural Language...
 
Container & kubernetes
Container & kubernetesContainer & kubernetes
Container & kubernetes
 
Kubernetes: The Next Research Platform
Kubernetes: The Next Research PlatformKubernetes: The Next Research Platform
Kubernetes: The Next Research Platform
 
PuppetConf 2017: Kubernetes in the Cloud w/ Puppet + Google Container Engine-...
PuppetConf 2017: Kubernetes in the Cloud w/ Puppet + Google Container Engine-...PuppetConf 2017: Kubernetes in the Cloud w/ Puppet + Google Container Engine-...
PuppetConf 2017: Kubernetes in the Cloud w/ Puppet + Google Container Engine-...
 
Kubernetes Basis: Pods, Deployments, and Services
Kubernetes Basis: Pods, Deployments, and ServicesKubernetes Basis: Pods, Deployments, and Services
Kubernetes Basis: Pods, Deployments, and Services
 
Scylla on Kubernetes: Introducing the Scylla Operator
Scylla on Kubernetes: Introducing the Scylla OperatorScylla on Kubernetes: Introducing the Scylla Operator
Scylla on Kubernetes: Introducing the Scylla Operator
 
Kubernetes @ Squarespace: Kubernetes in the Datacenter
Kubernetes @ Squarespace: Kubernetes in the DatacenterKubernetes @ Squarespace: Kubernetes in the Datacenter
Kubernetes @ Squarespace: Kubernetes in the Datacenter
 
[WSO2Con EU 2018] Deploying Applications in K8S and Docker
[WSO2Con EU 2018] Deploying Applications in K8S and Docker[WSO2Con EU 2018] Deploying Applications in K8S and Docker
[WSO2Con EU 2018] Deploying Applications in K8S and Docker
 

More from Equnix Business Solutions

Yang perlu kita ketahui Untuk memahami aspek utama IT dalam bisnis_.pdf
Yang perlu kita ketahui Untuk memahami aspek utama IT dalam bisnis_.pdfYang perlu kita ketahui Untuk memahami aspek utama IT dalam bisnis_.pdf
Yang perlu kita ketahui Untuk memahami aspek utama IT dalam bisnis_.pdfEqunix Business Solutions
 
Kebocoran Data_ Tindakan Hacker atau Kriminal_ Bagaimana kita mengantisipasi...
Kebocoran Data_  Tindakan Hacker atau Kriminal_ Bagaimana kita mengantisipasi...Kebocoran Data_  Tindakan Hacker atau Kriminal_ Bagaimana kita mengantisipasi...
Kebocoran Data_ Tindakan Hacker atau Kriminal_ Bagaimana kita mengantisipasi...Equnix Business Solutions
 
Kuliah Tamu - Dari Proses Bisnis Menuju Struktur Data.pdf
Kuliah Tamu - Dari Proses Bisnis Menuju Struktur Data.pdfKuliah Tamu - Dari Proses Bisnis Menuju Struktur Data.pdf
Kuliah Tamu - Dari Proses Bisnis Menuju Struktur Data.pdfEqunix Business Solutions
 
EWTT22_ Apakah Open Source Cocok digunakan dalam Korporasi_.pdf
EWTT22_ Apakah Open Source Cocok digunakan dalam Korporasi_.pdfEWTT22_ Apakah Open Source Cocok digunakan dalam Korporasi_.pdf
EWTT22_ Apakah Open Source Cocok digunakan dalam Korporasi_.pdfEqunix Business Solutions
 
Oracle to PostgreSQL, Challenges to Opportunity.pdf
Oracle to PostgreSQL, Challenges to Opportunity.pdfOracle to PostgreSQL, Challenges to Opportunity.pdf
Oracle to PostgreSQL, Challenges to Opportunity.pdfEqunix Business Solutions
 
[EWTT2022] Strategi Implementasi Database dalam Microservice Architecture.pdf
[EWTT2022] Strategi Implementasi Database dalam Microservice Architecture.pdf[EWTT2022] Strategi Implementasi Database dalam Microservice Architecture.pdf
[EWTT2022] Strategi Implementasi Database dalam Microservice Architecture.pdfEqunix Business Solutions
 
Webinar2021 - Does HA Can Help You Balance Your Load-.pdf
Webinar2021 - Does HA Can Help You Balance Your Load-.pdfWebinar2021 - Does HA Can Help You Balance Your Load-.pdf
Webinar2021 - Does HA Can Help You Balance Your Load-.pdfEqunix Business Solutions
 
Webinar2021 - In-Memory Database, is it really faster-.pdf
Webinar2021 - In-Memory Database, is it really faster-.pdfWebinar2021 - In-Memory Database, is it really faster-.pdf
Webinar2021 - In-Memory Database, is it really faster-.pdfEqunix Business Solutions
 
equpos - General Presentation v20230420.pptx
equpos - General Presentation v20230420.pptxequpos - General Presentation v20230420.pptx
equpos - General Presentation v20230420.pptxEqunix Business Solutions
 
Equnix Appliance- Jawaban terbaik untuk kebutuhan komputasi yang mumpuni.pdf
Equnix Appliance- Jawaban terbaik untuk kebutuhan komputasi yang mumpuni.pdfEqunix Appliance- Jawaban terbaik untuk kebutuhan komputasi yang mumpuni.pdf
Equnix Appliance- Jawaban terbaik untuk kebutuhan komputasi yang mumpuni.pdfEqunix Business Solutions
 
OSPX - Professional PostgreSQL Certification Scheme v20201111.pdf
OSPX - Professional PostgreSQL Certification Scheme v20201111.pdfOSPX - Professional PostgreSQL Certification Scheme v20201111.pdf
OSPX - Professional PostgreSQL Certification Scheme v20201111.pdfEqunix Business Solutions
 
PGConf.ASIA 2019 - The Future of TDEforPG - Taiki Kondo
PGConf.ASIA 2019 - The Future of TDEforPG - Taiki KondoPGConf.ASIA 2019 - The Future of TDEforPG - Taiki Kondo
PGConf.ASIA 2019 - The Future of TDEforPG - Taiki KondoEqunix Business Solutions
 
PGConf.ASIA 2019 - PGSpider High Performance Cluster Engine - Shigeo Hirose
PGConf.ASIA 2019 - PGSpider High Performance Cluster Engine - Shigeo HirosePGConf.ASIA 2019 - PGSpider High Performance Cluster Engine - Shigeo Hirose
PGConf.ASIA 2019 - PGSpider High Performance Cluster Engine - Shigeo HiroseEqunix Business Solutions
 
PGConf.ASIA 2019 Bali - Mission Critical Production High Availability Postgre...
PGConf.ASIA 2019 Bali - Mission Critical Production High Availability Postgre...PGConf.ASIA 2019 Bali - Mission Critical Production High Availability Postgre...
PGConf.ASIA 2019 Bali - Mission Critical Production High Availability Postgre...Equnix Business Solutions
 
PGConf.ASIA 2019 Bali - Keynote Speech 3 - Kohei KaiGai
PGConf.ASIA 2019 Bali - Keynote Speech 3 - Kohei KaiGaiPGConf.ASIA 2019 Bali - Keynote Speech 3 - Kohei KaiGai
PGConf.ASIA 2019 Bali - Keynote Speech 3 - Kohei KaiGaiEqunix Business Solutions
 
PGConf.ASIA 2019 Bali - Keynote Speech 2 - Ivan Pachenko
PGConf.ASIA 2019 Bali - Keynote Speech 2 - Ivan PachenkoPGConf.ASIA 2019 Bali - Keynote Speech 2 - Ivan Pachenko
PGConf.ASIA 2019 Bali - Keynote Speech 2 - Ivan PachenkoEqunix Business Solutions
 
PGConf.ASIA 2019 Bali - Keynote Speech 1 - Bruce Momjian
PGConf.ASIA 2019 Bali - Keynote Speech 1 - Bruce MomjianPGConf.ASIA 2019 Bali - Keynote Speech 1 - Bruce Momjian
PGConf.ASIA 2019 Bali - Keynote Speech 1 - Bruce MomjianEqunix Business Solutions
 

More from Equnix Business Solutions (20)

Yang perlu kita ketahui Untuk memahami aspek utama IT dalam bisnis_.pdf
Yang perlu kita ketahui Untuk memahami aspek utama IT dalam bisnis_.pdfYang perlu kita ketahui Untuk memahami aspek utama IT dalam bisnis_.pdf
Yang perlu kita ketahui Untuk memahami aspek utama IT dalam bisnis_.pdf
 
Kebocoran Data_ Tindakan Hacker atau Kriminal_ Bagaimana kita mengantisipasi...
Kebocoran Data_  Tindakan Hacker atau Kriminal_ Bagaimana kita mengantisipasi...Kebocoran Data_  Tindakan Hacker atau Kriminal_ Bagaimana kita mengantisipasi...
Kebocoran Data_ Tindakan Hacker atau Kriminal_ Bagaimana kita mengantisipasi...
 
Kuliah Tamu - Dari Proses Bisnis Menuju Struktur Data.pdf
Kuliah Tamu - Dari Proses Bisnis Menuju Struktur Data.pdfKuliah Tamu - Dari Proses Bisnis Menuju Struktur Data.pdf
Kuliah Tamu - Dari Proses Bisnis Menuju Struktur Data.pdf
 
EWTT22_ Apakah Open Source Cocok digunakan dalam Korporasi_.pdf
EWTT22_ Apakah Open Source Cocok digunakan dalam Korporasi_.pdfEWTT22_ Apakah Open Source Cocok digunakan dalam Korporasi_.pdf
EWTT22_ Apakah Open Source Cocok digunakan dalam Korporasi_.pdf
 
Oracle to PostgreSQL, Challenges to Opportunity.pdf
Oracle to PostgreSQL, Challenges to Opportunity.pdfOracle to PostgreSQL, Challenges to Opportunity.pdf
Oracle to PostgreSQL, Challenges to Opportunity.pdf
 
[EWTT2022] Strategi Implementasi Database dalam Microservice Architecture.pdf
[EWTT2022] Strategi Implementasi Database dalam Microservice Architecture.pdf[EWTT2022] Strategi Implementasi Database dalam Microservice Architecture.pdf
[EWTT2022] Strategi Implementasi Database dalam Microservice Architecture.pdf
 
PostgreSQL as Enterprise Solution v1.1.pdf
PostgreSQL as Enterprise Solution v1.1.pdfPostgreSQL as Enterprise Solution v1.1.pdf
PostgreSQL as Enterprise Solution v1.1.pdf
 
Webinar2021 - Does HA Can Help You Balance Your Load-.pdf
Webinar2021 - Does HA Can Help You Balance Your Load-.pdfWebinar2021 - Does HA Can Help You Balance Your Load-.pdf
Webinar2021 - Does HA Can Help You Balance Your Load-.pdf
 
Webinar2021 - In-Memory Database, is it really faster-.pdf
Webinar2021 - In-Memory Database, is it really faster-.pdfWebinar2021 - In-Memory Database, is it really faster-.pdf
Webinar2021 - In-Memory Database, is it really faster-.pdf
 
EQUNIX - PPT 11DB-Postgres™.pdf
EQUNIX - PPT 11DB-Postgres™.pdfEQUNIX - PPT 11DB-Postgres™.pdf
EQUNIX - PPT 11DB-Postgres™.pdf
 
equpos - General Presentation v20230420.pptx
equpos - General Presentation v20230420.pptxequpos - General Presentation v20230420.pptx
equpos - General Presentation v20230420.pptx
 
Equnix Appliance- Jawaban terbaik untuk kebutuhan komputasi yang mumpuni.pdf
Equnix Appliance- Jawaban terbaik untuk kebutuhan komputasi yang mumpuni.pdfEqunix Appliance- Jawaban terbaik untuk kebutuhan komputasi yang mumpuni.pdf
Equnix Appliance- Jawaban terbaik untuk kebutuhan komputasi yang mumpuni.pdf
 
OSPX - Professional PostgreSQL Certification Scheme v20201111.pdf
OSPX - Professional PostgreSQL Certification Scheme v20201111.pdfOSPX - Professional PostgreSQL Certification Scheme v20201111.pdf
OSPX - Professional PostgreSQL Certification Scheme v20201111.pdf
 
Equnix Company Profile v20230329.pdf
Equnix Company Profile v20230329.pdfEqunix Company Profile v20230329.pdf
Equnix Company Profile v20230329.pdf
 
PGConf.ASIA 2019 - The Future of TDEforPG - Taiki Kondo
PGConf.ASIA 2019 - The Future of TDEforPG - Taiki KondoPGConf.ASIA 2019 - The Future of TDEforPG - Taiki Kondo
PGConf.ASIA 2019 - The Future of TDEforPG - Taiki Kondo
 
PGConf.ASIA 2019 - PGSpider High Performance Cluster Engine - Shigeo Hirose
PGConf.ASIA 2019 - PGSpider High Performance Cluster Engine - Shigeo HirosePGConf.ASIA 2019 - PGSpider High Performance Cluster Engine - Shigeo Hirose
PGConf.ASIA 2019 - PGSpider High Performance Cluster Engine - Shigeo Hirose
 
PGConf.ASIA 2019 Bali - Mission Critical Production High Availability Postgre...
PGConf.ASIA 2019 Bali - Mission Critical Production High Availability Postgre...PGConf.ASIA 2019 Bali - Mission Critical Production High Availability Postgre...
PGConf.ASIA 2019 Bali - Mission Critical Production High Availability Postgre...
 
PGConf.ASIA 2019 Bali - Keynote Speech 3 - Kohei KaiGai
PGConf.ASIA 2019 Bali - Keynote Speech 3 - Kohei KaiGaiPGConf.ASIA 2019 Bali - Keynote Speech 3 - Kohei KaiGai
PGConf.ASIA 2019 Bali - Keynote Speech 3 - Kohei KaiGai
 
PGConf.ASIA 2019 Bali - Keynote Speech 2 - Ivan Pachenko
PGConf.ASIA 2019 Bali - Keynote Speech 2 - Ivan PachenkoPGConf.ASIA 2019 Bali - Keynote Speech 2 - Ivan Pachenko
PGConf.ASIA 2019 Bali - Keynote Speech 2 - Ivan Pachenko
 
PGConf.ASIA 2019 Bali - Keynote Speech 1 - Bruce Momjian
PGConf.ASIA 2019 Bali - Keynote Speech 1 - Bruce MomjianPGConf.ASIA 2019 Bali - Keynote Speech 1 - Bruce Momjian
PGConf.ASIA 2019 Bali - Keynote Speech 1 - Bruce Momjian
 

Recently uploaded

Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 

Recently uploaded (20)

Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 

PGConf.ASIA 2019 Bali - PostgreSQL on K8S at Zalando - Alexander Kukushkin

  • 1. PGConf.ASIA 2019 Indonesia, Bali PostgreSQL on K8S at Zalando: two+ years in production ALEXANDER KUKUSHKIN 10-09-2019
  • 2. 2 ABOUT ME Alexander Kukushkin Database Engineer @ZalandoTech The Patroni guy alexander.kukushkin@zalando.de Twitter: @cyberdemn
  • 3. 3 Typical problems and horror stories Brief introduction to Kubernetes Spilo & Patroni Postgres-Operator AGENDA
  • 4. 4 WE BRING FASHION TO PEOPLE IN 17 COUNTRIES 17 markets 7 fulfillment centers 26.4 million active customers 5.4 billion € net sales 2018 250 million visits per month 15,000 employees in Europe
  • 5. 5 PostgreSQL at Zalando > 300 In the data centers > 160 Databases on AWS Managed by DB team > 1000 Databases in other Kubernetes clusters > 190 Run in the ACID’s Kubernetes cluster
  • 6. 6 What is Kubernetes? ● A set of open-source components running on one or more servers ● A container orchestration system ● An abstraction layer over your real or virtualized hardware ● An “infrastructure as a code” system ● Automatic resource allocation ● Next step after Ansible/Chef/Puppet
  • 7. 7 What Kubernetes is not? ● An operating system ● A magical way to make your infrastructure scalable ● An excuse to fire your devops (someone has to configure and manage it) ● A good solution for running 2-3 servers
  • 8. 8 Terminology Traditional infrastructure ● Physical server ● Virtual machine ● Individual application ● NAS/SAN ● Load balancer ● Application registry/hardware information ● Password files, certificates Kubernetes ● Node ● Pod ● Container (typically Docker) ● Persistent Volumes ● Service/Endpoint ● Labels ● Secrets
  • 10. 10 Stateful applications on Kubernetes ● PersistentVolumes ○ Abstracts details how storage is provisioned ○ Supports many different storage types via plugins: ■ EBS, AzureDisk, iSCSI, NFS, CEPH, Glusterfs and so on ● StatefulSets ○ Guarantied number of Pods with stable (and unique) identifiers ○ Ordered deployment and scaling ○ Connecting Pods with corresponding persistent storage (PersistentVolume+PersistentVolumeClaim)
  • 11. 11 ● All supported versions of PostgreSQL inside the single image ● Plenty of extensions (pg_partman, pg_cron, postgis, timescaledb, etc) ● Additional tools (pgq, pgbouncer, wal-e/wal-g) ● PGDATA on an external volume ● Patroni for HA ● Environment-variables based configuration ● Lightweight! Spilo Docker image
  • 12. 12 ● Automatic failover solution for PostgreSQL streaming replication ● A python daemon that manages one PostgreSQL instance ● Keeps the cluster state in a DCS (Etcd, Zookeeper, Consul, Kubernetes) ○ Uses DCS for leader elections ■ Makes PostgreSQL 1st class citizen on Kubernetes! ● Helps to automate a lot of things like: ○ A new cluster deployment ○ Scaling out and in ○ PostgreSQL configuration management What is Patroni
  • 13. 13 Spilo & Patroni on K8S Node2 Pod: demo-0 role: master PersistentVolume PersistentVolume Node1 StatefulSet: demo Pod: demo-1 role: replica Secret: demo WATCH() UPDATE() Service: demo-replica labelSelector: role=replica Service: demo Endpoint: demoS3
  • 14. 14 ● A few long YAML manifests to write ● Different parts of PostgreSQL configuration spread over multiple manifests ● No easy way to work with a cluster as a whole (update, delete) ● Manual generation of DB objects, i.e. users, and their passwords. Manual deployment to Kubernetes
  • 15. 15 ● Encapsulates knowledge of a human operating the service ● Implement a controller application to act on custom resources ● CRD (custom resource definitions) to describe a domain-specific object (i.e. a Postgres cluster) https://coreos.com/blog/introducing-operators.html Kubernetes operator pattern
  • 16. 16 ● Defines a custom Postgresql resource ● Watches instances of Postgresql, creates/updates/deletes corresponding Kubernetes objects ● Allows updating running-cluster resources (memory, cpu, volumes), postgres configuration ● Creates databases, users and automatically generates passwords ● Auto-repairs, smart rolling updates (switchover to replicas before updating the master) Zalando Postgres-Operator
  • 17. 17 apiVersion: "acid.zalan.do/v1" kind: postgresql metadata: name: acid-minimal-cluster spec: teamId: "ACID" # is used to provision human users volume: size: 1Gi numberOfInstances: 2 users: zalando: # database owner - createrole - createdb foo_app_user: # role for application foo databases: # name->owner foo: zalando postgresql: version: "11" Postgresql manifest
  • 18. 18 deploycluster manifest Stateful set Spilo pod Kubernetes cluster PATRONI operator pod Endpoint Service Client application operator config map Cluster secrets DB deployer create create create watch Infrastructure roles
  • 20. 20
  • 21. 21 Zalando Postgres-Operator features ● Rolling updates on Postgres cluster changes ● Volume resize without Pod restarts ● Cloning Postgres clusters ● Logical Backups to S3 Bucket ● Standby cluster from S3 WAL archive ● Configurable for non-cloud environments ● UI to create and edit Postgres cluster manifests
  • 22. 22 Kubernetes rolling upgrade ● Rotates all worker nodes in the K8s cluster ○ Does it in a rolling matter, one-by-one ○ If you are unlucky, it will cause the number of failover equal number of pods in your postgres cluster ● How Postgres-Operator deals with it? ○ Detect the to-be-decommissioned node by lack of the ready label and SchedulingDisabled status ○ Move all master pods: ■ Move replicas to the already updated node ■ Trigger switchover to those replicas
  • 23. 23 Availability Zone 1 Node cluster: A primary cluster: B primary cluster: C replica Availability Zone 2 Availability Zone 3 Kubernetes rolling upgrade (start) Node Node cluster: A replica cluster: B replica cluster: C primary Node Node cluster: A replica cluster: B replica cluster: C replica Node Node (to-be-decommissioned) Node (new) Terminated PodActive Pod
  • 24. 24 Availability Zone 1 Node cluster: A primary cluster: B primary cluster: C replica Availability Zone 2 Availability Zone 3 Kubernetes rolling upgrade (step 1) Node Node cluster: A replica cluster: B replica cluster: C primary Node Node cluster: A replica cluster: B replica cluster: C replica Node cluster: A replica cluster: B replica cluster: A replica cluster: B replica cluster: C replica cluster: C replica Node (to-be-decommissioned) Node (new) Terminated PodActive Pod
  • 25. 25 Availability Zone 1 Node cluster: A primary cluster: B primary Availability Zone 2 Availability Zone 3 Kubernetes rolling upgrade (step 1) Node Node cluster: C primary Node Node cluster: A replica cluster: B replica cluster: A replica cluster: B replica cluster: C replica cluster: C replica Node (to-be-decommissioned) Node (new) Terminated PodActive Pod
  • 26. 26 Availability Zone 1 Node cluster: A primary cluster: B primary Availability Zone 2 Availability Zone 3 Kubernetes rolling upgrade (switchover) Node Node cluster: C primary Node Node cluster: A replica cluster: B replica cluster: A replica cluster: B replica cluster: C replica cluster: C replica Node (to-be-decommissioned) Node (new) Terminated PodActive Pod
  • 27. 27 Availability Zone 1 Node cluster: A replica cluster: B replica Availability Zone 2 Availability Zone 3 Kubernetes rolling upgrade (switchover) Node Node cluster: C replica Node Node cluster: A primary cluster: B replica cluster: A replica cluster: B primary cluster: C replica cluster: C primary Node (to-be-decommissioned) Node (new) Terminated PodActive Pod
  • 28. 28 Availability Zone 1 Node Availability Zone 2 Availability Zone 3 Kubernetes rolling upgrade (finish) Node cluster: A replica cluster: B replica Node Node cluster: C replica Node cluster: A primary cluster: B replica cluster: A replica cluster: B primary cluster: C replica cluster: C primary Node (to-be-decommissioned) Node (new) Terminated PodActive Pod cluster: A replica cluster: B replica cluster: C replica
  • 30. 30 Problems with AWS infrastructure ● AWS API Rate Limit Exceeded ○ Prevents or delays attaching/detaching persistent volumes (EBS) to/from Pods ■ Delays recovery of failed Pods ○ Might delay a deployment of a new cluster ● Sometimes EC2 instances fail and being shutdown by AWS ○ Shutdown might take ages ○ All EBS volumes remain attached until instance is shutted down ■ Pods can’t be rescheduled
  • 31. 31 KUBE2IAM ● AWS IAM + Instance Profile is very flexible, convenient, and powerful technology ● AWS tools and libraries can use temporary credentials obtained from the Instance Profile to access other AWS services, for example S3 ○ We are using S3 for backups and continuous archiving (wal-e/wal-g) ● Instance Profile is configured per EC2 Instance (K8s Node)! ● kube2iam - provides different IAM roles for Pods running on a single Node ● kube2iam might stop serving the Pod which is being shutdown ○ wal-e is unable to get credentials to access S3 ■ Last WAL segments remain unarchived! :(
  • 32. 32 Lack of Disk space ● Single volume for PGDATA, pg_wal and logs ● FATAL,53100,could not write to file "pg_wal/xlogtemp.22993": No space left on device ○ Usually ends up with postgres being self shutdown ● Patroni tries to recover the primary which isn’t running ○ “start->promote->No space left->shutdown” loop Disk space MUST be monitored!
  • 33. 33 Disk space problems root cause or why we don’t implement volume autoextend ● Excessive logging ○ slow queries, human access, application errors, connections/disconnections ● pg_wal growth ○ archive_command is slow/failing ○ Unconsumed changes on the replication slot ■ Replica is not streaming? Replica is slow? ■ Logical replication slot? ○ checkpoints taking too long due to throttled IOPS ● PGDATA growth ○ Table and index bloat! ■ Useless updates of unchanged data? ■ Autovacuum tuning? Zheap? ○ Natural growth of data ■ Lack of retention policies? ■ Broken cleanup jobs?
  • 34. 34 ORM can cause wal-e to fail! wal_e.main ERROR MSG: Attempted to archive a file that is too large. HINT: There is a file in the postgres database directory that is larger than 1610612736 bytes. If no such file exists, please report this as a bug. In particular, check pg_stat/pg_stat_statements.stat.tmp, which appears to be 2010822591 bytes Meanwhile in pg_stat_statements: UPDATE foo SET bar = $1 WHERE id IN ($2, $3, $4, …, $10500); UPDATE foo SET bar = $1 WHERE id IN ($2, $3, $4, …, $100500); …. and so on
  • 35. 35 Exclusive backup issues PANIC,XX000,"online backup was canceled, recovery cannot continue",,,,,"xlog redo at D45/EB000028 for XLOG/CHECKPOINT_SHUTDOWN: redo D45/EB000028; tli 237; prev tli 237; fpw true; xid 0:105446371; oid 187558; multi 1; offset 0; oldest xid 544 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 0; shutdown",,,,"" ● There is no way to join back such failed primary as a replica without rebuilding (reinitializing) it! ○ wal-g supports non-exclusive backups, but not yet stable enough
  • 36. 36 ● Postgres log: server process (PID 10810) was terminated by signal 9: Killed ● dmesg -T: [Wed Jul 31 01:35:35 2019] Memory cgroup out of memory: Kill process 14208 (postgres) score 606 or sacrifice child [Wed Jul 31 01:35:35 2019] Killed process 14208 (postgres) total-vm:2972124kB, anon-rss:68724kB, file-rss:1304kB, shmem-rss:2691844kB [Wed Jul 31 01:35:35 2019] oom_reaper: reaped process 14208 (postgres), now anon-rss:0kB, file-rss:0kB, shmem-rss:2691844kB Out-Of-Memory Killer
  • 37. 37 Out-Of-Memory Killer ● Pids in the container (10810) and on the host are different (14208) ○ Sometimes it is hard to investigate what happened ● oom_score_adj trick doesn’t really make sense in the container ○ There are only Patroni+PostgreSQL running ● It is not really clear how memory accounting in the container works: ○ memory: usage 8388392kB, limit 8388608kB, failcnt 1 ○ cache:2173896KB rss:6019692KB rss_huge:0KB shmem:2173428KB mapped_file:2173512KB dirty:132KB writeback:0KB swap:0KB inactive_anon:15732KB active_anon:8177696KB inactive_file:320KB active_file:184KB unevictable:0KB
  • 38. 38 Kubernetes+Docker ● ERROR: could not resize shared memory segment "/PostgreSQL.1384046013" to 8388608 bytes: No space left on device ● PostgreSQL 11 (due to the “parallel hash join”) ● Docker limits /dev/shm to 64MB by default ● How to fix? ○ Mount custom dshm tmpfs volume to /dev/shm ■ Or set enableShmVolume: true in the cluster manifest
  • 39. 39 Problems with PostgreSQL ● Logical decoding on the replica? Failover slots? ○ Patroni does sort of a hack by not allowing connections until logical slot is created. ■ Consumer might still lose some events. ● “FATAL too many connections” ○ Prevents replica from starting streaming ■ Solved in PostgreSQL 12 (wal_senders not count as part of max_connection) ○ Built-in connection pooler?
  • 40. 40 Human errors ● YAML formatting :) ● Inadequate resource and limit requests ○ Pod can’t be scheduled due to the node weakness ○ Processes are terminated by oom-killer ● Deleted Postgres-Operator/Spilo ServiceAccount by employees
  • 41. 41 Conclusion ● Postgres-Operator helps us to manage more than 1000 PostgreSQL clusters distributed in 80+ K8s accounts with minimal effort. ○ It wouldn’t be possible without high level of automation ● In the cloud and on K8s you have to be ready to deal with absolutely new problems and failure scenarios ○ Find the solution and implement it in the operator!
  • 42. 42 ● Postgres-operator: https://github.com/zalando/postgres-operator ● Patroni: https://github.com/zalando/patroni ● Spilo: https://github.com/zalando/spilo Open-source