Moving Our Entire Stack to K8S Within a Year - 7 Lessons Learned
October 12, 2018
Chris Homer
Co-Founder & CTO at thredUP
● Largest Consignment Store
● $130M+ invested
● 1000+ employees
● 4 distribution centers
● Kiev & SF Engineering Offices
● We’re Hiring!
Co-Founder & CTO at thredUP
Solution Specialist at Microsoft
Princeton University & Harvard Business School
Chris Homer - @chrishomer
Confidential 4
The thredUP Marketplace
● Convenient Pre-Paid Bag
● Earn Cash or Donate
● Do Good
● Amazing prices
● Wide assortment
● Fresh selection everyday
Visualizing The thredUP Marketplace
Operating the Marketplace
Augmenting the Marketplace
● Supplier Scoring
● Partners
● Supply Lifecycle
● Quality + Expected Value
● Proprietary Pricing Algorithm
● Personalization
● Search
● Notifications
● Discounting Algorithm
● Marketing
● K8S Migration Begins
Infrastructure Timeline
A little history of our journey towards the promised land
2009 2010 ... 2014 2015 2016 2017 2018 2019
● Slicehost
● Manual Config
● Capistrano Deploy
● Manual Tests
● AWS Hosted
● Manual Saved AMI’s
● Staging & Dev - cleansed prod copy
● “Outsourcing DevOps”
● Back to Chef
● “Microservices”
● Hand-crafted Staging
● Chef
● Ansible all the things
● “Insourcing DevOps”
● Back to Ansible - One Source of Truth
● Infrastructure Team
● DevOps is about Culture
● Security Assessment
● Terraform
● Ansible Hardening
● Dynamic Staging
● Service Mesh
● DevSecOps
● Docker & ECS “Attempt”
The Current Infrastructure Stack
After the migration, the picture is getting clearer and increasingly rational
prod staging dev
Why Docker & Kubernetes?
● Obviously because it’s cool & hype :)
● Popularity - widely supported
● Scalable & fault-tolerant out of the box
● Flexibility & deep control
● Standardization & ownership
● Speed up development lifecycle
● Encourage more & smaller services
● Linux Foundation & CNCF
Learning #1 - Fear, Uncertainty & Doubt => Excitement & Ownership
● Not everyone will be on board
● Share the vision, explain the advantages, pains and shortcomings
● A simple demo application helps “make it real”
● Emphasize that success requires app team and infra team ownership
● Cultivate champions and use their help
● Momentum is your friend
● Milestones are important for larger services
● Technical debt opportunities
● Knowledge sharing & workshops along the way and after
Learning #2 - Pay close attention to performance
➢ Set up a k8s VPC that is peered with the prod VPC
○ Redis
○ Memcached
○ Aurora
➢ scale haproxy instances
➢ update kubernetes nodes to c5.2xlarge
➢ disable ingress controller
➢ disable kubeDNS
[Chart: p90 response time, ec2 vs. k8s]
Learning #2 cont’d - Internal communication is way faster
[Chart: access by cluster IP vs. access by public DNS name]
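A minimal sketch of what "access by cluster IP" can look like, assuming a hypothetical service named service-a in a hypothetical namespace named shop (neither name, nor the ports, come from the talk). A Service of type ClusterIP gets a stable in-cluster address, so callers inside the cluster can use it, or its internal DNS name, instead of the public DNS name:

```yaml
# All names and ports below are hypothetical, for illustration only.
apiVersion: v1
kind: Service
metadata:
  name: service-a
  namespace: shop
spec:
  type: ClusterIP            # the default type; reachable only from inside the cluster
  selector:
    app: service-a           # must match the labels on the Service A pods
  ports:
    - name: http
      port: 80               # port exposed on the cluster IP
      targetPort: 8080       # port the container listens on (assumed)
---
# A caller in the same cluster can be pointed at the internal address
# rather than the public DNS name.
apiVersion: v1
kind: ConfigMap
metadata:
  name: caller-config
  namespace: shop
data:
  # Assumes the default cluster domain, cluster.local.
  SERVICE_A_URL: "http://service-a.shop.svc.cluster.local"
```

The internal name resolves to the cluster IP, so requests between services do not have to leave the cluster and come back in through the public entry point.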
Learning #3 - Liveness probe is not always your friend
[Chart: response time over time for "External Request" and "Our Code", plotted against the k8s healthcheck timeout]
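One possible reading of this lesson: if the health endpoint waits on an external request, a slow dependency can push the response past the probe timeout, and Kubernetes restarts a pod whose own code is fine. A minimal sketch of a more forgiving setup follows. The endpoints /healthz and /ready, the image name, and all timings are assumptions for illustration, not values from the talk:

```yaml
# Hypothetical pod; paths, image and timings are illustrative only.
apiVersion: v1
kind: Pod
metadata:
  name: service-a
spec:
  containers:
    - name: app
      image: example/service-a:1.0   # hypothetical image
      ports:
        - containerPort: 8080
      livenessProbe:
        httpGet:
          path: /healthz             # assumed to answer without calling external services
          port: 8080
        initialDelaySeconds: 15
        periodSeconds: 10
        timeoutSeconds: 5            # the default is 1 second
        failureThreshold: 3          # restart only after 3 consecutive failures
      readinessProbe:
        httpGet:
          path: /ready               # assumed to check dependencies
          port: 8080
        periodSeconds: 5
        timeoutSeconds: 2
        failureThreshold: 2          # a failing pod is taken out of rotation, not restarted
```

Splitting the two probes this way keeps dependency slowness a readiness concern, so a slow external request removes the pod from load balancing instead of killing it.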
Learning #4 – DNS
Many DNS errors and ~5-second delays
● It’s a well-known issue with UDP & Dynamic NAT
● It has a bug report - https://github.com/kubernetes/kubernetes/issues/56903
● And a good explanation of the problem - https://www.weave.works/blog/racy-conntrack-and-dns-lookup-timeouts
Solution – use TCP as the protocol
dnsConfig:
  options:
    - name: use-vc
dnsPolicy: ClusterFirst
Another Solution
dnsConfig:
  options:
    - name: single-request-reopen
dnsPolicy: ClusterFirst
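For context, a minimal sketch of where these fields sit in a manifest. Both dnsPolicy and dnsConfig belong to the pod spec (in a Deployment that is spec.template.spec). The pod and image names are hypothetical:

```yaml
# Hypothetical pod showing the placement of the DNS settings from the slide.
apiVersion: v1
kind: Pod
metadata:
  name: dns-example                  # hypothetical name
spec:
  dnsPolicy: ClusterFirst
  dnsConfig:
    options:
      - name: use-vc                 # resolver option: send DNS queries over TCP
      # Alternative from the slide, instead of use-vc:
      # - name: single-request-reopen
  containers:
    - name: app
      image: example/app:1.0         # hypothetical image
```

The listed options end up in the container's /etc/resolv.conf. They are glibc resolver options, so it is worth checking that the base image's libc honors them before relying on this fix.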
Learning #5 - Too many open files
Ok, Google =)
max_user_watches=8192 → this looks too low, let's bump it a little!
That did seem to help ... for some time.
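The slide does not say how the limit was raised. One commonly used pattern, sketched here under assumptions, is a DaemonSet whose privileged init container sets the node-level sysctl fs.inotify.max_user_watches on every worker. The resource name, namespace, images and the value 524288 are illustrative and not from the talk:

```yaml
# Sketch only: names, images and the chosen limit are assumptions.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: inotify-tuning               # hypothetical name
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: inotify-tuning
  template:
    metadata:
      labels:
        app: inotify-tuning
    spec:
      initContainers:
        - name: sysctl
          image: busybox:1.36
          securityContext:
            privileged: true         # needed to write a node-level sysctl
          command: ["sysctl", "-w", "fs.inotify.max_user_watches=524288"]
      containers:
        - name: pause                # keeps the pod alive after the init step
          image: registry.k8s.io/pause:3.9
```

As the slide notes, raising the limit only helped for a while, so this treats the symptom rather than the cause.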
Spikes seem to correlate with pod crash loops. Why?
[Diagram: Docker keeps a log file per container, and the log aggregator client holds a file descriptor (fd) on each log file]
[Diagram: the same picture with more fds than live containers; the log aggregator client's fds on the extra container log file are labeled "These are still opened"]
Learning #6 - Pod Distribution after Cluster Maintenance
Worker node#1 Worker node#2 Worker node#3
Service A pod Service A pod Service A pod
[Diagram sequence across Worker node#1, Worker node#2 and Worker node#3: a node is marked "Under Maintenance", then "Alive and functioning"; the Service A pods end up unevenly spread across the nodes, with the label "All traffic goes here"]
Solution: Redeploy to redistribute pods
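The scheduler only places pods when they are created and does not move running pods back onto a node that returns from maintenance, which is why a redeploy is needed. A minimal sketch of one way to do that: changing anything in the pod template, such as an annotation, triggers a rolling update. The annotation key, names and image are hypothetical. The preferred pod anti-affinity is an addition that is not on the slide; it asks the scheduler to spread the recreated pods across nodes:

```yaml
# Sketch only: names, image and the annotation key are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: service-a
spec:
  replicas: 3
  selector:
    matchLabels:
      app: service-a
  template:
    metadata:
      labels:
        app: service-a
      annotations:
        # Hypothetical key; changing the value alters the pod template
        # and therefore starts a rolling redeploy.
        example.com/redeployed-at: "2018-10-12T00:00:00Z"
    spec:
      affinity:
        podAntiAffinity:
          # Not from the slide: a soft preference to avoid placing two
          # Service A pods on the same node.
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                topologyKey: kubernetes.io/hostname
                labelSelector:
                  matchLabels:
                    app: service-a
      containers:
        - name: app
          image: example/service-a:1.0   # hypothetical image
```

Because the anti-affinity is only a preference, pods can still share a node when there is no room elsewhere, so the redeploy remains the actual fix.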
Worker node#1 Worker node#2 Worker node#3
Service A pod Service A pod Service A pod
Learning #7 - Building Docker Images within the K8s Cluster
[Diagram: a Kubernetes worker node runs a Docker daemon with Container A and Container B. The Jenkins slave container uses the Docker CLI through the node's docker.sock, so "docker build ..." from the Jenkinsfile runs on the node's own daemon]
[Diagram: the same setup, with the Jenkinsfile running "docker rm ..." against the node's own daemon]
[Diagram: the Docker daemon used by the Jenkins slave's Docker CLI runs on a separate ec2 instance]
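A minimal sketch of how a build container can talk to a Docker daemon on a separate instance instead of mounting the worker node's docker.sock. The talk does not show its configuration; the host name, image, secret name and the use of TLS are assumptions. DOCKER_HOST, DOCKER_TLS_VERIFY and DOCKER_CERT_PATH are the standard Docker CLI environment variables:

```yaml
# Sketch only: host, image and secret names are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: jenkins-agent
spec:
  containers:
    - name: docker-cli
      image: example/jenkins-agent-docker-cli:1.0   # hypothetical image with the Docker CLI
      env:
        - name: DOCKER_HOST
          value: tcp://docker-build.internal.example:2376   # hypothetical build host
        - name: DOCKER_TLS_VERIFY
          value: "1"
        - name: DOCKER_CERT_PATH
          value: /certs/client       # expects ca.pem, cert.pem and key.pem
      volumeMounts:
        - name: docker-client-certs
          mountPath: /certs/client
          readOnly: true
  volumes:
    - name: docker-client-certs
      secret:
        secretName: docker-client-certs   # hypothetical secret holding the client certs
  # Note: no hostPath mount of /var/run/docker.sock, so build commands
  # cannot touch the containers that Kubernetes runs on the worker node.
```

With this layout a stray "docker rm" in a Jenkinsfile can only affect containers on the build instance.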
Was it worth it? YES!
● Deployment time more than halved (main service: from 12 min to 5 min)
● Rollback is very easy and fast (nearly instant)
● Provisioned hardware decreased by a factor of 3
● Pod autoscaling eliminated manual work to support traffic spikes
● System-level upgrades are now non-blocking and easy to execute
● Time to provision and deploy a new service in production changed from days/weeks to minutes/hours
● Each project has its own simple Helm chart in its project repo; ~3200 Ansible config files deprecated
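As an illustration of the pod autoscaling point, a minimal HorizontalPodAutoscaler sketch. The target name, replica bounds and CPU threshold are hypothetical, and it uses the current autoscaling/v2 API; clusters from the time of the talk would have used an older API version:

```yaml
# Sketch only: names and numbers are assumptions, not the talk's settings.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: service-a
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: service-a                  # hypothetical Deployment to scale
  minReplicas: 3
  maxReplicas: 12
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70     # add pods when average CPU use exceeds 70% of requests
```

CPU utilization is measured against the containers' CPU requests, so the target Deployment needs those requests set for this to work.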
What’s next?
● Dynamic Staging Environments
○ Encourage better development workflow
○ Easily enable cross-team review with design, marketing and others
● Telepresence for Complex Local Development
○ Easier onboarding & dev env refresh
○ More consistent behavior with production
● End-to-end integration suite
● Iterate for Improvements
○ Faster builds
○ Cluster Performance
○ Observability
○ Cost Improvements
● Service mesh with Istio
Thank You!
chris@thredup.com
@chrishomer
PS. We’re Hiring :)

 
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnvESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
 
Q-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITWQ-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITW
 
Faculty Profile prashantha K EEE dept Sri Sairam college of Engineering
Faculty Profile prashantha K EEE dept Sri Sairam college of EngineeringFaculty Profile prashantha K EEE dept Sri Sairam college of Engineering
Faculty Profile prashantha K EEE dept Sri Sairam college of Engineering
 
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptxBIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
 
4.9.24 School Desegregation in Boston.pptx
4.9.24 School Desegregation in Boston.pptx4.9.24 School Desegregation in Boston.pptx
4.9.24 School Desegregation in Boston.pptx
 
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
 
Sulphonamides, mechanisms and their uses
Sulphonamides, mechanisms and their usesSulphonamides, mechanisms and their uses
Sulphonamides, mechanisms and their uses
 
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptxDecoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx
 
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITWQ-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
 
Unraveling Hypertext_ Analyzing Postmodern Elements in Literature.pptx
Unraveling Hypertext_ Analyzing  Postmodern Elements in  Literature.pptxUnraveling Hypertext_ Analyzing  Postmodern Elements in  Literature.pptx
Unraveling Hypertext_ Analyzing Postmodern Elements in Literature.pptx
 
ARTERIAL BLOOD GAS ANALYSIS........pptx
ARTERIAL BLOOD  GAS ANALYSIS........pptxARTERIAL BLOOD  GAS ANALYSIS........pptx
ARTERIAL BLOOD GAS ANALYSIS........pptx
 
Narcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfNarcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdf
 
CLASSIFICATION OF ANTI - CANCER DRUGS.pptx
CLASSIFICATION OF ANTI - CANCER DRUGS.pptxCLASSIFICATION OF ANTI - CANCER DRUGS.pptx
CLASSIFICATION OF ANTI - CANCER DRUGS.pptx
 
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxQ4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
 
How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17
 

Chris Homer - Moving the entire stack to k8s within a year – lessons learned

  • 1. Moving Our Entire Stack to K8S Within a Year - 7 Lessons Learned October 12, 2018
  • 2. Chris Homer - @chrishomer: Co-Founder & CTO at thredUP; Solution Specialist at Microsoft; Princeton University & Harvard Business School. thredUP: ● Largest Consignment Store ● $130M+ invested ● 1000+ employees ● 4 distribution centers ● Kiev & SF Engineering Offices ● We’re Hiring!
  • 3.
  • 4. The thredUP Marketplace ● Convenient Pre-Paid Bag ● Earn Cash or Donate ● Do Good ● Amazing prices ● Wide assortment ● Fresh selection every day
  • 5. Visualizing The thredUP Marketplace
  • 7. Augmenting the Marketplace ● Supplier Scoring ● Partners ● Supply Lifecycle ● Quality + Expected Value ● Proprietary Pricing Algorithm ● Personalization ● Search ● Notifications ● Discounting Algorithm ● Marketing
  • 8. Infrastructure Timeline: A little history of our journey towards the promised land (timeline axis: 2009, 2010, 2014, 2015, 2016, 2017, 2018, 2019, ...) ● K8S Migration Begins ● Slicehost ● Manual Config ● Capistrano Deploy ● Manual Tests ● AWS Hosted ● Manual Saved AMIs ● Staging & Dev - cleansed prod copy ● “Outsourcing DevOps” ● Back to Chef ● “Microservices” ● Hand-crafted Staging ● Chef ● Ansible all the things ● “Insourcing DevOps” ● Back to Ansible - One Source of Truth ● Infrastructure Team ● DevOps is about Culture ● Security Assessment ● Terraform ● Ansible Hardening ● Dynamic Staging ● Service Mesh ● DevSecOps ● Docker & ECS “Attempt”
  • 9. The Current Infrastructure Stack: After the migration, the picture is getting clearer and increasingly rational (diagram environments: prod, staging, dev)
  • 10. Why Docker & Kubernetes? ● Obviously because it’s cool & hype :) ● Popularity - widely supported ● Scalable & fault-tolerant out of the box ● Flexibility & deep control ● Standardization & ownership ● Speed up development lifecycle ● Encourage more & smaller services ● Linux Foundation & CNCF
  • 11. Learning #1 - Fear, Uncertainty & Doubt => Excitement & Ownership ● Not everyone will be on board ● Share the vision, explain the advantages, pains and shortcomings ● A simple demo application helps “make it real” ● Emphasize that success requires app team and infra team ownership ● Cultivate champions and use their help ● Momentum is your friend ● Milestones are important for larger services ● Technical debt opportunities ● Knowledge sharing & workshops along the way and after
  • 12. Learning #2 - Pay close attention to performance ➢ Set up a k8s VPC that is peered with the prod VPC ○ Redis ○ Memcached ○ Aurora ➢ Scale haproxy instances ➢ Update Kubernetes nodes to c5.2xlarge ➢ Disable ingress controller ➢ Disable kubeDNS
  • 13. Learning #2 - Pay close attention to performance (charts: ec2 response time p90; k8s response time p90)
  • 14. Learning #2 cont’d - Internal communication is way faster (charts: access by cluster IP; access by public DNS name)
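The comparison on slide 14 (access by cluster IP vs. access by public DNS name) suggests pointing in-cluster clients at a Service's internal address instead of its public hostname. The following is a minimal sketch, not the configuration used in the talk: the names `service-a` and `client`, the ports, the image, and the `SERVICE_A_URL` variable are all hypothetical, and the DNS name assumes the default `cluster.local` cluster domain.

```yaml
# Hypothetical ClusterIP Service; names and ports are illustrative only.
apiVersion: v1
kind: Service
metadata:
  name: service-a
  namespace: default
spec:
  type: ClusterIP
  selector:
    app: service-a
  ports:
    - name: http
      port: 80          # port exposed on the cluster IP
      targetPort: 8080  # assumed container port
---
# A client that reaches service-a through its in-cluster DNS name,
# so requests stay inside the cluster network.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: client
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: client
  template:
    metadata:
      labels:
        app: client
    spec:
      containers:
        - name: client
          image: example.com/client:latest  # placeholder image
          env:
            - name: SERVICE_A_URL  # hypothetical variable read by the app
              value: http://service-a.default.svc.cluster.local
```

The in-cluster name resolves to the Service's cluster IP, which avoids the round trip through a public load balancer that a public DNS name would typically imply.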
  • 15. Learning #3 - Liveness probe is not always your friend (chart: response time over time, with the k8s healthcheck timeout marked; labels: External Request, Our Code)
  • 16. Learning #3 - Liveness probe is not always your friend (chart: response time over time)
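One reading of the charts on slides 15 and 16 is that when response time climbs above the healthcheck timeout, the liveness probe fails and the container is restarted while it is merely slow. The sketch below shows one common way to soften that, under assumptions that do not come from the talk: the `/healthz` path, port 8080, the image name, and every numeric value are hypothetical. It pairs a readiness probe (which only removes the pod from Service endpoints) with a more tolerant liveness probe (which restarts the container).

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: service-a        # hypothetical name
  labels:
    app: service-a
spec:
  containers:
    - name: app
      image: example.com/service-a:latest  # placeholder image
      ports:
        - containerPort: 8080
      # Readiness failure stops traffic to the pod but does not restart it.
      readinessProbe:
        httpGet:
          path: /healthz   # assumed lightweight endpoint
          port: 8080
        periodSeconds: 10
        timeoutSeconds: 2
        failureThreshold: 3
      # Liveness failure restarts the container, so it is given a longer
      # timeout and more retries than the readiness probe.
      livenessProbe:
        httpGet:
          path: /healthz
          port: 8080
        initialDelaySeconds: 30
        periodSeconds: 10
        timeoutSeconds: 5   # the Kubernetes default is 1 second
        failureThreshold: 6
```

With these illustrative values a slow pod is taken out of rotation after about 30 seconds of failed readiness checks, but is only restarted after about a minute of failed liveness checks.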
  • 17. Learning #4 – DNS: Many DNS errors and ~5 second delays
  • 18. Learning #4 – DNS: Many DNS errors and ~5 second delays ● It’s a well-known issue with UDP & Dynamic NAT ● It has a bug report - https://github.com/kubernetes/kubernetes/issues/56903 ● And a good explanation of the problem - https://www.weave.works/blog/racy-conntrack-and-dns-lookup-timeouts ● Solution: use TCP as the protocol - dnsConfig: { options: [ { name: use-vc } ] }, dnsPolicy: ClusterFirst ● Another solution - dnsConfig: { options: [ { name: single-request-reopen } ] }, dnsPolicy: ClusterFirst
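The two `dnsConfig` fragments on slide 18 belong at the pod spec level. A minimal sketch of where they sit in a complete manifest, assuming a hypothetical pod name and image:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: dns-example      # hypothetical name
spec:
  dnsPolicy: ClusterFirst
  dnsConfig:
    options:
      # Option from the slide: ask the resolver to use TCP for lookups.
      - name: use-vc
      # Alternative from the slide; shown commented out so only one
      # of the two workarounds is active at a time.
      # - name: single-request-reopen
  containers:
    - name: app
      image: example.com/app:latest  # placeholder image
```

Kubernetes writes these entries into the `options` line of the pod's `/etc/resolv.conf`, so whether they take effect depends on the resolver library inside the container image. In a Deployment the same two fields go under `spec.template.spec`.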
  • 19. Learning #5 - Too many open files ● Ok, Google =) ● max_user_watches=8192 → this looks too low, let's bump it a little! ● That did seem to help ... for some time ...
  • 20. Learning #5 - Too many open files ● Spikes seem to correlate with pod crash loops? Why?
  • 21. Learning #5 - Too many open files (diagram: Docker with three containers, each with a log file; log aggregator client holding one fd per log file)
  • 22. Learning #5 - Too many open files (diagram: Docker, containers and their log files; log aggregator client holding additional fds; annotation: “These are still opened”)
  • 23. Learning #6 - Pod Distribution after Cluster Maintenance (diagram: Worker node #1, #2, #3; 3 Service A pods)
  • 24. Learning #6 - Pod Distribution after Cluster Maintenance (diagram: Worker node #1, #2, #3; label “Under Maintenance”; 4 Service A pods)
  • 25. Learning #6 - Pod Distribution after Cluster Maintenance (diagram: Worker node #1, #2, #3; label “Alive and functioning”; 3 Service A pods)
  • 26. Learning #6 - Pod Distribution after Cluster Maintenance (diagram: Worker node #1, #2, #3; 3 Service A pods)
  • 27. Learning #6 - Pod Distribution after Cluster Maintenance (diagram: Worker node #1, #2, #3; 5 Service A pods; annotation: “All traffic goes here”)
  • 28. Learning #6 - Pod Distribution after Cluster Maintenance (diagram: Worker node #1, #2, #3; 5 Service A pods; annotation: “All traffic goes here”) ● Solution: Redeploy to redistribute pods
  • 29. Learning #6 - Pod Distribution after Cluster Maintenance (diagram: Worker node #1, #2, #3; 3 Service A pods)
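The fix named on slide 28 is to redeploy so the scheduler places the pods again. A complementary technique, not mentioned in the talk, is a preferred pod anti-affinity rule that nudges the scheduler to spread replicas across nodes whenever pods are (re)created. A minimal sketch, assuming a hypothetical Deployment `service-a` with an `app: service-a` label:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: service-a        # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: service-a
  template:
    metadata:
      labels:
        app: service-a
    spec:
      affinity:
        podAntiAffinity:
          # "preferred" keeps this a soft rule: pods still schedule
          # when fewer nodes than replicas are available.
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: service-a
                # Treat each node as its own topology domain.
                topologyKey: kubernetes.io/hostname
      containers:
        - name: app
          image: example.com/service-a:latest  # placeholder image
```

The rule is evaluated only at scheduling time ("IgnoredDuringExecution"), so pods that piled onto one node during maintenance stay there until they are recreated. A redeploy, as on the slide, is still what triggers the rebalancing.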
  • 30. Learning #7 - Building Docker Images within the K8s Cluster (diagram: Kubernetes worker node running a Docker daemon with docker.sock; containers: Jenkins slave with Docker CLI, Container A, Container B; Jenkinsfile runs “docker build ...”)
  • 31. Learning #7 - Building Docker Images within the K8s Cluster (diagram: Kubernetes worker node running a Docker daemon with docker.sock; containers: Jenkins slave with Docker CLI, Container A, Container B; Jenkinsfile runs “docker rm ...”)
  • 32. Learning #7 - Building Docker Images within the K8s Cluster (diagram: Kubernetes worker node with containers: Jenkins slave with Docker CLI, Container A, Container B; Docker daemon for builds on a separate ec2 instance)
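Slide 32 moves image builds off the worker node's Docker daemon and onto a separate EC2 instance. The talk does not say how the Jenkins slave reaches that daemon. One plausible sketch uses the Docker CLI's standard `DOCKER_HOST`, `DOCKER_TLS_VERIFY` and `DOCKER_CERT_PATH` environment variables. The hostname, image, secret name and paths below are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: jenkins-agent    # hypothetical name
spec:
  containers:
    - name: jenkins-agent
      image: example.com/jenkins-agent-with-docker-cli:latest  # placeholder
      env:
        # Point the Docker CLI at a remote daemon instead of mounting
        # the worker node's /var/run/docker.sock.
        - name: DOCKER_HOST
          value: tcp://docker-build.internal.example.com:2376  # assumed host
        - name: DOCKER_TLS_VERIFY
          value: "1"
        - name: DOCKER_CERT_PATH
          value: /certs/client
      volumeMounts:
        - name: docker-client-certs
          mountPath: /certs/client
          readOnly: true
  volumes:
    - name: docker-client-certs
      secret:
        secretName: docker-client-certs  # hypothetical secret with ca.pem, cert.pem, key.pem
```

With this layout, `docker build ...` and `docker rm ...` steps in a Jenkinsfile act on the dedicated build host, so they cannot touch containers that Kubernetes manages on the worker node.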
  • 33. Was it worth it? YES! ● Deployment time roughly halved (main service: from 12 min to 5 min) ● Rollback is very easy and fast (nearly instant) ● Hardware provisioned decreased by a factor of 3 ● Pod autoscaling eliminated manual work to support traffic spikes ● System-level upgrades are now non-blocking and easy to execute ● Time to provision and deploy a new service in production changed from days/weeks to minutes/hours ● Each project has its own simple Helm chart in its project repo ● ~3200 Ansible config files deprecated
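The pod autoscaling credited on slide 33 is typically expressed as a HorizontalPodAutoscaler. The sketch below is illustrative only: the target name, replica bounds and CPU threshold are hypothetical, and it uses the current `autoscaling/v2` API, whereas a 2018 cluster would have used an earlier beta version of this API.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: service-a        # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: service-a      # Deployment to scale
  minReplicas: 3         # illustrative bounds
  maxReplicas: 12
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70  # scale out above ~70% of requested CPU
```

CPU utilization is measured against each container's CPU request, so the target Deployment needs `resources.requests.cpu` set for this metric to work.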
  • 34. What’s next? ● Dynamic Staging Environments ○ Encourage better development workflow ○ Easily enable cross-team review with design, marketing and others ● Telepresence for Complex Local Development ○ Easier onboarding & dev env refresh ○ More consistent behavior with production ● End-to-end integration suite ● Iterate for Improvements ○ Faster builds ○ Cluster Performance ○ Observability ○ Cost Improvements ● Service mesh with Istio