Welcome to
&
Dock8s Meetup
Robert Werlich
Site Reliability Engineer
robert.werlich@verivox.com
Marlen Blaube
Senior HR Business Partner
marlen.blaube@verivox.com
Cloud Journey: Lifting a Major
Product to Kubernetes
Dock8s Meetup Heidelberg
Feb 27th 2019
Martin Danielsson, Haufe Group, Freiburg
@donmartin76 (Twitter, GitHub)
Dock8s Meetup Heidelberg, February 27th 2019
whoami
C:\> WINDOWS.EXE
C/C++/C# Background
10+ years
$ docker ps
Containers & Kubernetes
For ~4 years
wicked.haufe.io maintainer
OSS API Management
Solution
Architect
Developer
since 2006
Agenda
Operations
Planning & Politics
We‘ll set the scene a little.
Some numbers
100+ active
git repos
874k LOC
10-15 Developers
200-500
concurrent users
Typically
100 req/s
448 GB RAM
56 Cores
Major revenue
Strategic move
to containers
Modular
Architecture
Without Container
Experience
Hosted with Hoster
(€€€)
Long Release
cycles
(LOTS of) Manual
Work for Releases
Little Operations
Insight
Error tracking
very difficult
Non-Parity
Dev/Test/Prod
(Cost!)
Legacy Web App
(Java based)
Vision – Goals
Enabling
CI/CD
Automatic
Provisioning
Full Insight
Minimize
Ops
Let‘s go DevOps
in the Cloud!
Project
Interfaces
Technology
Processes
HR Topics
(Operations)
Stakeholder
Management
Stakeholder Management
CONVINCE THEM,
DON‘T PERSUADE
THEM
COMMUNICATE
OFTEN AND
CLEARLY
DON‘T
UNDERESTIMATE
TASKS AT HAND
BE TRANSPARENT
SHARE
SUCCESSES
BUT ALSO
FAILURES!
Team Setup – Vision
100% DevOps Engineers
T-Shaped Engineers
No dedicated manual testers
Automate!
YBI, YRI (you build it, you run it).
Ops experience?
Some HR topics
Release
Managers?
Operations
Responsibility?
Quality
Engineers
(testers)?
On Call Duty?
Technology
This is what you came for…?
Technology Stack
Kubernetes
Azure (Public Cloud)
Steps to DevOps Happiness
Provision
Deploy
CI/CD
Weekly for Production, Daily for Dev/Test
Ship when ready!
Wait, uh, what…?
Target
“No-Ops”
No long-running
systems
Enable validation of
3rd Party component
upgrades
Incremental
changes
Practice Disaster
Recovery Daily
100% Reproducible
Deployments
On-demand Production
Identical Environments
So, it's all…
Code & Pipelines
… and pipelines are also code
Incremental Backend Development
Merge feature to master
• After code review
• Including test suite changes
Build master branch
• Includes unit testing
• First integration tests
Deploy to integration system
• Blue/Green with integration tests
Deploy to Production
• Blue/Green with integration tests
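The Blue/Green step in the stages above could be sketched roughly as follows. This is a minimal illustration, not the team's actual pipeline: the function name `blue_green_deploy` and the three callables (`deploy`, `run_integration_tests`, `switch_traffic`) are hypothetical placeholders for whatever the real pipeline uses.

```python
from typing import Callable


def blue_green_deploy(
    live: str,
    deploy: Callable[[str], None],
    run_integration_tests: Callable[[str], bool],
    switch_traffic: Callable[[str], None],
) -> str:
    """Deploy to the idle colour, test it, and switch traffic only on success.

    Returns the colour that serves traffic afterwards.
    All callables are hypothetical hooks supplied by the pipeline.
    """
    idle = "green" if live == "blue" else "blue"
    deploy(idle)  # the new version goes to the slot that is not serving traffic
    if not run_integration_tests(idle):
        return live  # tests failed: traffic was never moved
    switch_traffic(idle)  # tests passed: the idle slot becomes live
    return idle
```

Because traffic only moves after the tests pass, a failed build never reaches users and no explicit rollback is needed for this stage.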
Incremental Frontend Development
Merge feature to master
• After code review
• Including test suite changes
Build master branch
• Includes unit testing
• First integration tests
Deploy to integration system
• Run e2e integration tests
• Rollback if failing
Deploy to Production
• Run e2e integration tests
• Rollback if failing
Stateless Components
Stateful Components
Full Provisioning
Create backup
Provision new infrastructure
• From backups
• Same as disaster recovery!
Deploy components
• Using deployment pipelines
• Partly parallelized
Top level DNS switch
• Using DNS traffic manager
Destroy old infrastructure
• If tests succeed
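The provisioning sequence above can be sketched as one function. This is a hedged illustration only: `ProvisioningSteps` and all of its hooks are hypothetical names, and the behaviour on failing tests (pointing DNS back at the old environment and keeping it) is an assumption, since the slide only says the old infrastructure is destroyed if tests succeed.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ProvisioningSteps:
    """Hypothetical hooks; each would wrap real tooling in practice."""
    create_backup: Callable[[], str]               # returns a backup id
    provision_from_backup: Callable[[str], str]    # returns the new environment name
    deploy_components: Callable[[str], None]
    switch_dns: Callable[[str], None]
    run_tests: Callable[[str], bool]
    destroy: Callable[[str], None]


def full_provisioning(steps: ProvisioningSteps, old_env: str) -> str:
    """Run the steps in slide order and return the environment left serving traffic."""
    backup_id = steps.create_backup()
    new_env = steps.provision_from_backup(backup_id)  # same path as disaster recovery
    steps.deploy_components(new_env)
    steps.switch_dns(new_env)
    if steps.run_tests(new_env):
        steps.destroy(old_env)  # old infrastructure is removed only after success
        return new_env
    # Assumption: on failure, traffic goes back to the old environment, which is kept.
    steps.switch_dns(old_env)
    return old_env
```

Running this on a schedule exercises the restore-from-backup path every time, which is what makes it double as a disaster recovery drill.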
Persistence Options
Roll your own persistence
• Self managed VMs (incl. NFS)
• Gluster/Ceph FS (cluster)
Persistence “as a service”
• Managed Disks (AWS EBS, Azure Managed Disks)
• DBaaS (many options)
• Files as a service (AWS EFS, Azure Files)
iDesk2 Deployment Architecture
Resource Group
Kubernetes
Cluster
k8s Master
k8s Agent
k8s Agent n
…
NFS
VM(s)
Postgres
VM(s)
Disks
Disks
• Azure Files not fast enough
• Legacy components depend on
UNIX rights (Azure Files is SMB)
• Azure Disks only ReadWriteOnce
• Azure PGaaS was not yet available
• More „bang for your buck“
• PG Admin knowledge in Team
Endless Variants
Some hints…
Assess your Persistence
Needs early on
If possible, use DBaaS
(avoid NIH syndrome)
Externalize Configuration
Shared File Storage is not
“Cloud Native”
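As one possible reading of "Externalize Configuration", settings can be taken from environment variables instead of being baked into the image. The sketch below is an assumption about how that might look; the variable names `APP_DATABASE_URL` and `APP_REQUEST_TIMEOUT_S` and the `Settings` class are invented for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    database_url: str
    request_timeout_s: float


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build settings from the environment (hypothetical variable names).

    A missing APP_DATABASE_URL raises KeyError, so a misconfigured
    container fails at startup instead of at the first request.
    """
    return Settings(
        database_url=env["APP_DATABASE_URL"],
        request_timeout_s=float(env.get("APP_REQUEST_TIMEOUT_S", "5")),
    )
```

With this shape the same image can run in Dev, Test, and Production, and only the injected environment differs.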
Operations
No. You don‘t get around it. Sorry.
Now that we have Kubernetes…?
Self healing
Robust
Production Ready
Battle proven
“Vertrauen ist gut...
… Kontrolle ist besser!”
(“Trust is good... control is better!”)
Complex
Additional Abstraction
Layer
“Kontrolle” (control) - What do you mean?
Detecting these things is a start...
Fail: Lyin’ Monitors
End-to-End Monitoring
ALL GOOD
People logging in
500
… an entire weekend.
Instrument
Monitor and Alert
Enable Insight
Prometheus
A
Metrics
Endpoint
http://A:8080/metrics
JVM Metrics
Node.js Metrics
VM Exporters
(node_exporter)
DB Exporters
(pg_exporter)
Kubernetes Statistics
Prometheus Client based
Custom Exporters
...
B
Time Series DB
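To make the metrics endpoint idea concrete, here is a small sketch of a service exposing a counter at `/metrics` in the Prometheus text exposition format, using only the Python standard library. It is an illustration under assumptions: the metric name `app_login_failures_total` and the classes `Counter` and `MetricsHandler` are invented here, and a real service would more likely use an official Prometheus client library.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Counter:
    """A minimal, thread-safe, monotonically increasing counter."""

    def __init__(self, name: str, help_text: str) -> None:
        self.name = name
        self.help_text = help_text
        self._value = 0
        self._lock = threading.Lock()

    def inc(self, amount: int = 1) -> None:
        with self._lock:
            self._value += amount

    def render(self) -> str:
        """Render the counter in the Prometheus text exposition format."""
        with self._lock:
            value = self._value
        return (
            f"# HELP {self.name} {self.help_text}\n"
            f"# TYPE {self.name} counter\n"
            f"{self.name} {value}\n"
        )


# Hypothetical metric for illustration.
LOGIN_FAILURES = Counter("app_login_failures_total", "Failed login attempts.")


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = LOGIN_FAILURES.render().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Block forever, serving /metrics on the given port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` would make the counter scrapeable at a URL of the same shape as `http://A:8080/metrics`; Prometheus pulls from it on its own schedule and stores the samples in its time series database.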
Alertmanager
Metrics
White Box
Black Box
Counters
Gauges
Histograms
Summaries
Application
Network
Latencies
Errors
Timeouts
Infrastructure
Disk Space
CPU
Memory
Pod Status
Friday 9 o’clock
Newsletter
205’886
Alerting?
On what?
Availability
Infrastructure
https://www.zazzle.com/nines_dont_matter_t_shirt-235118578582589495
Charity Majors says…
(@mipsytipsy)
Service Level Indicators and Service Level Agreements
• Percentage of document retrieval requests served within 0.25 and 1s: 95% and 98.5%
• Percentage of search requests answered within 1, 3 and 7.5s: 50%, 95% and 98.5%
• Percentage of Error Pages: <1%
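A latency-based indicator of this kind can be computed from raw request durations as sketched below. This is an illustration, not the team's implementation: it assumes the thresholds and percentages on the slide pair up in order (for document retrieval, 95% within 0.25 s and 98.5% within 1 s), and the names `fraction_within`, `meets_targets`, and `DOCUMENT_RETRIEVAL_TARGETS` are invented here.

```python
from typing import Sequence, Tuple


def fraction_within(latencies_s: Sequence[float], threshold_s: float) -> float:
    """Share of requests that finished within threshold_s seconds."""
    if not latencies_s:
        return 1.0  # assumption: a window without traffic counts as meeting the target
    served_in_time = sum(1 for latency in latencies_s if latency <= threshold_s)
    return served_in_time / len(latencies_s)


def meets_targets(
    latencies_s: Sequence[float],
    targets: Sequence[Tuple[float, float]],
) -> bool:
    """True if every (threshold_s, minimum_fraction) target is satisfied."""
    return all(
        fraction_within(latencies_s, threshold_s) >= minimum_fraction
        for threshold_s, minimum_fraction in targets
    )


# Assumed pairing of the slide's numbers for document retrieval.
DOCUMENT_RETRIEVAL_TARGETS = [(0.25, 0.95), (1.0, 0.985)]
```

In a Prometheus setup the same ratio would typically be derived from histogram buckets rather than raw samples, but the arithmetic is the same: requests under the threshold divided by all requests.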
Holistic View
Instrument early (and lots)
Deployments easier
Less fear of change
We (hope and think we) are in control!
Fails: Resiliency Issues
VMs are sometimes patched and restarted. Or they just die. So will any service on them.
Networks are unreliable. Connections will fail. Use (libraries for) circuit breakers and retries.
Re-establishing TLS on each call to external services is expensive… and the service will hate you. Use Keep-Alive.
SPOFs will eventually fail. Assess and act.
Learn how to detect problems.
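The advice on retries and circuit breakers can be illustrated with a small hand-rolled sketch. In practice a library would usually be the better choice, as the slide suggests; the classes and functions below (`CircuitBreaker`, `CircuitOpenError`, `retry`) and their default values are invented for illustration.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    """Stop calling a failing dependency for a while after repeated errors."""

    def __init__(
        self,
        max_failures: int = 3,
        reset_after_s: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_failures = max_failures
        self.reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after_s:
                raise CircuitOpenError("circuit open, call skipped")
            self._opened_at = None  # half-open: let one trial call through
        try:
            result = fn()
        except Exception:
            self._failures += 1
            if self._failures >= self.max_failures:
                self._opened_at = self._clock()  # open (or re-open) the circuit
            raise
        self._failures = 0  # any success closes the circuit again
        return result


def retry(
    fn: Callable[[], T],
    attempts: int = 3,
    base_delay_s: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn up to `attempts` times with exponential backoff between tries."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except CircuitOpenError:
            raise  # retrying an open circuit only adds load
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay_s * 2 ** attempt)
    raise AssertionError("unreachable")
```

The two pieces compose: `retry(lambda: breaker.call(fetch))` retries transient errors but gives up immediately once the breaker has opened.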
Conclusion
Was it worth it?
Would we do it again?
Key Performance
Indicators
• >70% Cost Saving
• Release Effort down >98% via automation
• Higher Release Pace (3-5/y to 15-20/mo)
• Performance measurable
• Faster Reaction to Issues
• Unlocks Cloud Technology
k8s Ops possible as a Team
Requires full automation (also test)
Team dedication
Rethinking ops is challenging
No Silver Bullet
Assess your requirements
Some Links…
kubernetes.io
prometheus.io
grafana.com
azure.com
aws.amazon.com
Twitter @donmartin76
GitHub donmartin76
We’re hiring!
www.haufegroup.com/en/career

Cloud Journey: Lifting a Major Product to Kubernetes

  • 1. Welcome to & Dock8s Meetup Robert Werlich Site Reliability Engineer robert.werlich@verivox.com Marlen Blaube Senior HR Business Partner marlen.blaube@verivox.com
  • 2. Cloud Journey: Lifting a Major Product to Kubernetes Dock8s Meetup Heidelberg Feb 27th 2019 Martin Danielsson, Haufe Group, Freiburg @donmartin76 (Twitter, Github)
  • 3. Dock8s Meetup Heidelberg, February 27th 2019 whoami C:> WINDOWS.EXE C/C++/C# Background 10+ years $ docker ps Containers & Kubernetes Since ~4 years wicked.haufe.io maintainer OSS API Management Solution Architect Developer since 2006
  • 4. Dock8s Meetup Heidelberg, February 27th 2019
  • 5. Dock8s Meetup Heidelberg, February 27th 2019 Agenda Operations
  • 6. Planning & Politics We‘ll set the scene a little.
  • 7. Dock8s Meetup Heidelberg, February 27th 2019 Some numbers 100+ active git repos 874k LOC 10-15 Developers 200-500 concur- rent users Typically 100 req/s 448 GB RAM 56 Cores
  • 8.
  • 9. Major revenue Strategic move to containers Modular Architecture Without Container Experience Hosted with Hoster (€€€) Long Release cycles (LOTS of) Manual Work for Releases Little Operations Insight Error tracking very difficult Non-Parity Dev/Test/Prod (Cost!) Legacy Web App (Java based)
  • 10. Dock8s Meetup Heidelberg, February 27th 2019 Vision – Goals Enabling CI/CD Automatic Provisioning Full Insight Minimize Ops
  • 11. Dock8s Meetup Heidelberg, February 27th 2019 Let‘s go DevOps in the Cloud!
  • 12. Dock8s Meetup Heidelberg, February 27th 2019 Project Interfaces Technology Processes HR Topics (Operations) Stakeholder Management
  • 13. Dock8s Meetup Heidelberg, February 27th 2019 Stakeholder Management CONVINCE THEM, DON‘T PERSUADE THEM COMMUNICATE OFTEN AND CLEARLY DON‘T UNDERESTIMATE TASKS AT HAND BE TRANSPARENT SHARE SUCCESSES BUT ALSO FAILURES!
  • 14. Dock8s Meetup Heidelberg, February 27th 2019 Team Setup – Vision 100% DevOps Engineers T-Shaped Engineers No dedicated manual testers Automate! YBI, YRI. Ops experience?
  • 15. Dock8s Meetup Heidelberg, February 27th 2019 Some HR topics Release Managers? Operations Responsibility? Quality Engineers (testers)? On Call Duty?
  • 16. Technology This is what you came for…?
  • 17. Dock8s Meetup Heidelberg, February 27th 2019 Technology Stack Kubernetes Azure (Public Cloud)
  • 18. Dock8s Meetup Heidelberg, February 27th 2019 Steps to DevOps Happiness Provision Deploy CI/CD Weekly for Production, Daily for Dev/Test Ship when ready!
  • 19. Dock8s Meetup Heidelberg, February 27th 2019 Wait, uh, what…? Target “No-Ops” No long-running systems Enable validation of 3rd Party component upgrades Incremental changes Practice Disaster Recovery Daily 100% Reproducible Deployments On-demand Production Identical Environments
  • 20. Dock8s Meetup Heidelberg, February 27th 2019 Code & Pipelines So, it‘s all… … and pipelines are also code
  • 21. Dock8s Meetup Heidelberg, February 27th 2019 Incremental Backend Development Merge feature to master •After code review •Including test suite changes Build master branch •Includes unit testing •First integration tests Deploy to integration system •Blue/Green with integration tests Deploy to Production •Blue/Green with integration tests
  • 22. Dock8s Meetup Heidelberg, February 27th 2019 Incremental Frontend Development Merge feature to master •After code review •Including test suite changes Build master branch •Includes unit testing •First integration tests Deploy to integration system •Run e2e integration tests •Rollback if failing Deploy to Production •Run e2e integration tests •Rollback if failing
  • 23. Stateless Components / Stateful Components
  • 24. Full Provisioning: Create backup; Provision new infrastructure (from backups; same as disaster recovery!); Deploy components (using deployment pipelines; partly parallelized); Top level DNS switch (using DNS traffic manager); Destroy old infrastructure (if tests succeed)
  • 25. Persistence Options: Roll your own persistence; Persistence “as a service”; Self managed VMs (incl. NFS); Managed Disks (AWS EBS, Azure Managed Disks); DBaaS (many options); Files as a service (AWS EFS, Azure Files); Gluster/Ceph FS (cluster)
  • 26. iDesk2 Deployment Architecture: Resource Group; Kubernetes Cluster (k8s Master, k8s Agent, k8s Agent n, …); NFS VM(s) with Disks; Postgres VM(s) with Disks • Azure Files not fast enough • Legacy components depend on UNIX rights (Azure Files is SMB) • Azure Disks only ReadWriteOnce • Azure PGaaS was not yet available • More “bang for your buck” • PG Admin knowledge in Team
  • 27. Endless Variants
  • 28. Some hints… Assess your Persistence Needs early on. If possible, use DBaaS (avoid NIH syndrome). Externalize Configuration. Shared File Storage is not “Cloud Native”.
  • 29. Operations: No. You don’t get around it. Sorry.
  • 30. Now that we have Kubernetes…? Self healing; Robust; Production Ready; Battle proven. “Vertrauen ist gut … Kontrolle ist besser!” (“Trust is good … control is better!”) Complex; Additional Abstraction Layer
  • 31. “Kontrolle” (control) - What do you mean? Detecting these things is a start...
  • 32. Fail: Lyin’ Monitors. End-to-End Monitoring: ALL GOOD. People logging in: 500 … an entire weekend.
  • 33. Instrument; Monitor and Alert; Enable Insight
  • 34. Prometheus: A; Metrics Endpoint http://A:8080/metrics; JVM Metrics; Node.js Metrics; VM Exporters (node_exporter); DB Exporters (pg_exporter); Kubernetes Statistics; Prometheus Client based Custom Exporters; ...; B; Time Series DB
  • 35. Alertmanager
  • 36. Metrics: White Box; Black Box; Counters; Gauges; Histograms; Summaries; Application; Network (Latencies, Errors, Timeouts); Infrastructure (Disk Space, CPU, Memory, Pod Status)
  • 37.
  • 38.
  • 39. Friday 9 o’clock Newsletter
  • 40.
  • 41. 205’886
  • 42. Alerting? On what?
  • 43. Availability; Infrastructure
  • 45.
  • 46. Service Level Indicators and Agreements: Percentage of document retrieval requests served within 0.25 s and 1 s (agreement: 95% and 98.5%); percentage of search requests answered within 1 s, 3 s and 7.5 s (agreement: 50%, 95% and 98.5%); percentage of error pages (agreement: <1%)
  • 47.
  • 48. Holistic View; Instrument early (and lots); Deployments easier; Less fear of change; We (hope and think we) are in control!
  • 49. Fails: Resiliency Issues. VMs are sometimes patched and restarted. Or they just die. So will any service on them. Networks are unreliable. Connections will fail. Use (libraries for) circuit breakers and retries. Re-establishing TLS on each call to external services is expensive. … and the service will hate you. Use Keep-Alive. SPOFs will eventually fail. Assess and act. Learn how to detect problems.
  • 51. Would we do it again?
  • 52. Key Performance Indicators • >70% Cost Saving • Release Effort down >98% via automation • Higher Release Pace (3-5/y to 15-20/mo) • Performance measurable • Faster Reaction to Issues • Unlocks Cloud Technology
  • 53. k8s Ops possible as a Team; Requires full automation (also test); Team dedication; Rethinking ops is challenging; No Silver Bullet; Assess your requirements
  • 54. Some Links… kubernetes.io; prometheus.io; grafana.com; azure.com; aws.amazon.com. Twitter @donmartin76; GitHub donmartin76. We’re hiring! www.haufegroup.com/en/career
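
The service level indicators on slide 46 boil down to simple ratios: the share of requests that finished within a latency threshold, compared against a target. The following is a minimal Python sketch of that arithmetic, not the speaker's implementation. It assumes raw request latencies are available as a plain list and reads the slide as pairing thresholds with targets in order (95% within 0.25 s, 98.5% within 1 s for document retrieval). The names `fraction_within`, `meets_objectives` and `DOCUMENT_RETRIEVAL` are hypothetical.

```python
from typing import Dict, Sequence


def fraction_within(latencies_s: Sequence[float], threshold_s: float) -> float:
    """Return the share (0..1) of requests that finished within threshold_s."""
    if not latencies_s:
        raise ValueError("need at least one observed request")
    fast_enough = sum(1 for t in latencies_s if t <= threshold_s)
    return fast_enough / len(latencies_s)


def meets_objectives(
    latencies_s: Sequence[float], objectives: Dict[float, float]
) -> Dict[float, bool]:
    """Check each (threshold in seconds -> required share) objective."""
    return {
        threshold_s: fraction_within(latencies_s, threshold_s) >= target
        for threshold_s, target in objectives.items()
    }


# Assumed reading of the slide: 95% within 0.25 s, 98.5% within 1 s.
DOCUMENT_RETRIEVAL = {0.25: 0.95, 1.0: 0.985}

if __name__ == "__main__":
    # 96 fast requests, 3 medium ones and a single slow outlier (made-up data).
    sample = [0.1] * 96 + [0.5] * 3 + [2.0]
    print(meets_objectives(sample, DOCUMENT_RETRIEVAL))
```

In a Prometheus-based setup such ratios would more likely be derived from histogram metrics than from raw lists; the comparison against the target stays the same.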

Editor's Notes

  1. YBIYRI = You build it, you run it.
  2. Could just as well have been AWS; Azure was investigated first as we didn’t know whether we would need to go to Azure Germany (this was 2017).
  3. This has a couple of implications: You need backups for persistent data inside the cluster. You must be able to restore them automatically. You will also get a certain amount of “non-persisted” time (time during which you cannot persist user changes); for Aurora, this is around 90 minutes early each Tuesday morning. This is acceptable for us, but may not be acceptable for other teams.
  4. Instrument your components to expose (possibly) interesting metrics. Better to instrument more: if you do it from the start, it doesn’t hurt much, and adding metrics later is also rather easy. Monitor and alert on anticipated failures or known previous issues (if for some reason you cannot find or fix the root cause). Under Monitor and Alert, I subsume Logging and Tracing here. Enable insight and visualization - or “debugging” if you will - to see inside your system what might have gone wrong.
  5. “This is what you would call ‘instrumenting’ your code” - exporting metrics from it. You would use a client library (there are client libraries for most programming languages). The library takes your application’s current state of all tracked metrics, transforms it into a format that Prometheus understands, and exposes it via an HTTP endpoint, which Prometheus scrapes at regular intervals. There are a number of libraries and servers which help in exporting existing metrics from third-party systems as Prometheus metrics. This is useful for cases where it is not feasible to instrument a given system with Prometheus metrics directly.
  6. What can we do with that data? Two examples: dashboarding and alerting. Grafana, for example, can use Prometheus as a data source via the Prometheus Query Language to display time series as graphs on a dashboard. Simultaneously, Prometheus can evaluate certain expressions to see whether alerts have to be triggered. These are then passed on to another component of Prometheus, the Alertmanager, which in turn makes sure the alerts are delivered to wherever they should go. For us, that’s (both) Rocket Chat and e-mail.
  7. One step back: what kinds of metrics exist? Let’s look at a couple of categories - first, white box and black box. That’s where the metrics come from - do you measure them inside your stack (white box), or do you probe from the outside (black box)? Hint: You should do both. Bottom left you see the metric types that Prometheus specifically supports - Counters (things which only increase), Gauges (things which go up and down), Histograms (to see a discrete distribution) and Summaries (for seeing quantiles). Bottom right you see the sources of metrics - infrastructure (things like disk space, CPU and memory utilization), network (latencies, errors, timeouts and such) and perhaps the most interesting bit - your own application metrics. Recall - there is no automatic way of retrieving all of your application-specific metrics - this is the instrumentation bit. It was in parts an eye opener to us when we started looking at metrics...
  8. By simply inspecting response times on various endpoints, we could pinpoint issues we weren’t really aware of, which helped us deliver an even better experience on our web site. Mind you - all of these things were already in the logs - but who reads logs when you don’t REALLY have a problem? Takers?
  9. Typical “Newsletter Friday” - The editors of one of the largest products send out newsletters each Friday, which we immediately see in the login numbers.
  10. So, what’s this number? It’s the number of individual time series we collect from our production system. Prometheus can do lots more, up to millions, but it’s still quite a number of things to look at and evaluate.
  11. OK, so, great. We have a bunch of metrics. What do we do with those?
  12. Of course you should alert on infrastructure failure - if the failure entails any need for intervention. If you can automatically recover - no need to alert. Rule of thumb: Alerts should be ACTIONABLE. If there’s an alert - you should have to do something (even if it’s just investigating). If an alert doesn’t require any action - chances are good you should not alert on it (and just collect statistics). We have found out that this is dang hard though. The other thing that is just plain clear is that you must make sure that your application is available - probably by using some black box type of end-to-end test. If your application isn’t available - getting it back up and running must be your top priority (but that’s obvious). Is that enough though?
  13. Let’s say we have 99.99% availability; does that mean everything is fine? No. We must find additional metrics to measure how well we are doing.
  14. Actually, we would like to measure user happiness. We are doing that with NPS and the “Kundenbarometer” (customer barometer), but we’d like to have at least an approximation in real time. Well, you can’t measure it directly, but you can approximate it via the functional and non-functional requirements you know (or at least assume) are important for customer happiness. Typical things are latencies or expected runtimes, and of course that your application does what it’s intended to do. This takes us back to metrics and calculated metrics, in other words KPIs, or SLIs, Service Level Indicators. Disclaimer: This is not an exact science, but always a guesstimate. The rule of thumb should at least be: If these indicators are off, the customer will definitely be UNHAPPY.
  15. And in addition to these, we of course also track the availability, where we also have an SLA. So, as these are the values to which we will be held accountable, we better also alert on these.
  16. We have gathered a more holistic view on our application - we no longer just look at what has to be developed, we also, from the start, look at how the components will behave at runtime, and how we can observe them. We don’t have to think very hard about how and where to run things - we have solved most tricky problems using Kubernetes and the toolset around it; we just have to re-apply patterns, relying on the fact that most things aren’t so complicated that nobody has solved them yet. We have a lot less fear of changing things. Since everything is built up as code, everything is easily and fairly quickly reproducible, and we can efficiently test changes up front. We have gathered a feeling that we are in control. At least, we hope and think we are in control. And that’s a nice feeling.
  17. Restarted VMs: Redis cluster failed after restart; AppServer could not reconnect to Redis. Pods running only once? → SPOF. Expect failures. Circuit breakers: Currently Hystrix, investigating Istio/linkerd. TLS: External semantic search – clogged up their load balancer after a couple of hours of traffic.
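
The advice on slide 49 to use circuit breakers and retries can be illustrated with a small sketch. This is a simplified Python illustration of the general pattern, not Hystrix, Istio or linkerd, and not the team's code. The class and function names (`CircuitBreaker`, `CircuitOpenError`, `call_with_retries`) and all default values are assumptions made for the example.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when the breaker refuses a call without attempting it."""


class CircuitBreaker:
    """Stop calling a failing dependency for a cool-down period."""

    def __init__(self, max_failures: int = 3, reset_after_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.reset_after_s = reset_after_s
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, func: Callable[..., T], *args, **kwargs) -> T:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_s:
                raise CircuitOpenError("circuit open, call skipped")
            # Cool-down elapsed: let one trial call through ("half-open").
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        self.failures = 0  # success closes the circuit again
        self.opened_at = None
        return result


def call_with_retries(func: Callable[[], T], attempts: int = 3,
                      base_delay_s: float = 0.1,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Retry func with exponential backoff; never retry an open circuit."""
    for attempt in range(attempts):
        try:
            return func()
        except CircuitOpenError:
            raise
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay_s * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

A caller would typically combine both, for example `call_with_retries(lambda: breaker.call(fetch))` with a hypothetical `fetch` function, so that transient errors are retried while a dependency that keeps failing is left alone until the cool-down has passed.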