SlideShare a Scribd company logo
1 of 30
Download to read offline
Chaos Engineering
Hamburg
Marvin Hoffmann | Computer Scientist
15.12.2015
1. AWS Basics and Intro
2. Evolution of Chaos Testing
3. Tooling
4. Chaos Engineering
Agenda
Europe West (Ireland)US East (N. Virginia)
Regions
AZs Instances
AWS Basics
Chaos? -
What do we mean?
“A way to improve availability is
to install proven hardware and
software, and then leave it alone”
Jim Gray
Why Do Computers Stop and What Can Be Done About It?
• Systems need to be reliable
• Nuklear weapon arsenal, heart rate monitoring,
World of Warcraft servers, Streaming business
• Third party dependencies (software and
hardware)
Be reliable!
DynamoDB Outage US-East
• “… there was a brief network disruption that impacted a
portion of DynamoDB’s storage servers.”
• 2:19am until 7:10am PDT
• “There are several other AWS services that use
DynamoDB that experienced problems during the event.”
• SQS, EC2 auto scaling, CloudWatch
Source: https://aws.amazon.com/message/5467D2/
• Deployments themselves may cause issues
• Unpredicted behaviour after a change has been
rolled out
• Issues during rollback
• Change in client / user behaviour
It’s not always the infrastructure
Evolution of
Chaos Testing
Do the simplest thing first
• Prepare for your machines to die
• “Cattle, not pets” (Adrian Cockcroft)
• Resilience through redundancy
• Stateless machines
Deal with infrastructure issues
• Latency between instances
• Package loss
• Ports blocked
• or even outages of an entire AZ
Think big!
• Remember that DynamoDB failure?
• Outage of an entire AWS region!
• You’ll need more than one region in the first place
• Re-routing of entire traffic from one region to another
• Any region needs to be able to scale to take the load of
two regions
Tooling
(meet the Monkeys)
Chaos Monkey
Kills random instances in your account
Chaos Gorilla
Kills a random AZ in your account
Chaos Kong
Kills an entire AWS region in your account
What’s in it?
• A compilation of scripts
• Scripts mess with your AWS account
• Thus, they are very AWS specific
• If not on AWS, get inspired and build your toolset around
these ideas
• Not a comprehensive toolset
• Latency Monkey
• Conformity Monkey
• Security Monkey
• Doctor Monkey
• 10-18 Monkey
Simian Army
Chaos
Engineering
• Systematic approach to Chaos Testing
• Started by Netflix
• Talk about it a lot to attract talent
• Many other companies doing similar things in that field
• Want to grow a community around it
Chaos Engineering
“Experiment on a distributed system
in order to build confidence in the
system’s capability to withstand
turbulent conditions in production.”
Netflix
Four Principles of
Chaos Engineering
Know your system
• Operational insight
• What is “normal”? What does a failure look like?
Four Principles of
Chaos Engineering
1.Build a hypothesis around steady-state behaviour
The “Happy Path”
• Trace through code
where nothing bad
happens
• usually testing happens
first on the happy path
• Bad things usually
happen off the happy
path
Source: https://bethtrissel.files.wordpress.com/2014/06/176869567.jpg
Four Principles of
Chaos Engineering
1.Build a hypothesis around steady-state behaviour
2.Vary real-world events
Laboratory
• “Works on my machine” (or “works in stage env.”)
Source: http://www.memegasms.com/media/created/vhyfxm.jpg
Four Principles of
Chaos Engineering
1.Build a hypothesis around steady-state behaviour
2.Vary real-world events
3.Run experiments in production
Four Principles of
Chaos Engineering
1.Build a hypothesis around steady-state behaviour
2.Vary real-world events
3.Run experiments in production
4.Automate experiments to run continuously
Chaos Engineering Culture
• http://principlesofchaos.com
• More resources:
• https://github.com/Netflix/SimianArmy
• https://github.com/Netflix/atlas
• https://www.youtube.com/watch?v=vq4QZ4_YDok

More Related Content

What's hot

DevOpsSec: Appling DevOps Principles to Security, DevOpsDays Austin 2012
DevOpsSec: Appling DevOps Principles to Security, DevOpsDays Austin 2012DevOpsSec: Appling DevOps Principles to Security, DevOpsDays Austin 2012
DevOpsSec: Appling DevOps Principles to Security, DevOpsDays Austin 2012Nick Galbreath
 
Overcoming Security Challenges in DevOps
Overcoming Security Challenges in DevOpsOvercoming Security Challenges in DevOps
Overcoming Security Challenges in DevOpsAlert Logic
 
DevSecCon KeyNote London 2015
DevSecCon KeyNote London 2015DevSecCon KeyNote London 2015
DevSecCon KeyNote London 2015Shannon Lietz
 
How to adapt the SDLC to the era of DevSecOps
How to adapt the SDLC to the era of DevSecOpsHow to adapt the SDLC to the era of DevSecOps
How to adapt the SDLC to the era of DevSecOpsZane Lackey
 
Finding Security a Home in a DevOps World
Finding Security a Home in a DevOps WorldFinding Security a Home in a DevOps World
Finding Security a Home in a DevOps WorldShannon Lietz
 
Effective approaches to web application security
Effective approaches to web application security Effective approaches to web application security
Effective approaches to web application security Zane Lackey
 
The Joy of Proactive Security
The Joy of Proactive SecurityThe Joy of Proactive Security
The Joy of Proactive SecurityAndy Hoernecke
 
Security as Code owasp
Security as  Code owaspSecurity as  Code owasp
Security as Code owaspShannon Lietz
 
DevSecOps - The big picture
DevSecOps - The big pictureDevSecOps - The big picture
DevSecOps - The big pictureDevSecOpsSg
 
The Rise of DevSecOps - Fabian Lim - DevSecOpsSg
The Rise of DevSecOps - Fabian Lim - DevSecOpsSgThe Rise of DevSecOps - Fabian Lim - DevSecOpsSg
The Rise of DevSecOps - Fabian Lim - DevSecOpsSgDevSecOpsSg
 
From Gates to Guardrails: Alternate Approaches to Product Security
From Gates to Guardrails: Alternate Approaches to Product SecurityFrom Gates to Guardrails: Alternate Approaches to Product Security
From Gates to Guardrails: Alternate Approaches to Product SecurityJason Chan
 
Cloud Application Security: Lessons Learned
Cloud Application Security: Lessons LearnedCloud Application Security: Lessons Learned
Cloud Application Security: Lessons LearnedJason Chan
 
Chaos Engineering and Systems Reliability
Chaos Engineering and Systems ReliabilityChaos Engineering and Systems Reliability
Chaos Engineering and Systems ReliabilitySylvain Hellegouarch
 
DevSecCon London 2017: when good containers go bad by Tim Mackey
DevSecCon London 2017: when good containers go bad by Tim MackeyDevSecCon London 2017: when good containers go bad by Tim Mackey
DevSecCon London 2017: when good containers go bad by Tim MackeyDevSecCon
 
Accelerating Innovation and Time-to-Market @ Camp Devops Houston 2015
Accelerating Innovation and Time-to-Market @ Camp Devops Houston 2015 Accelerating Innovation and Time-to-Market @ Camp Devops Houston 2015
Accelerating Innovation and Time-to-Market @ Camp Devops Houston 2015 Ariel Tseitlin
 
2019 DevSecOps Reference Architectures
2019 DevSecOps Reference Architectures2019 DevSecOps Reference Architectures
2019 DevSecOps Reference ArchitecturesSonatype
 
DevOps, Common use cases, Architectures, Best Practices
DevOps, Common use cases, Architectures, Best PracticesDevOps, Common use cases, Architectures, Best Practices
DevOps, Common use cases, Architectures, Best PracticesShiva Narayanaswamy
 
Security & DevOps- Ways To Make Sure Your Apps & Infrastructure Are Secure
Security & DevOps- Ways To Make Sure Your Apps & Infrastructure Are SecureSecurity & DevOps- Ways To Make Sure Your Apps & Infrastructure Are Secure
Security & DevOps- Ways To Make Sure Your Apps & Infrastructure Are SecurePuppet
 

What's hot (20)

DevOpsSec: Appling DevOps Principles to Security, DevOpsDays Austin 2012
DevOpsSec: Appling DevOps Principles to Security, DevOpsDays Austin 2012DevOpsSec: Appling DevOps Principles to Security, DevOpsDays Austin 2012
DevOpsSec: Appling DevOps Principles to Security, DevOpsDays Austin 2012
 
Overcoming Security Challenges in DevOps
Overcoming Security Challenges in DevOpsOvercoming Security Challenges in DevOps
Overcoming Security Challenges in DevOps
 
DevSecCon KeyNote London 2015
DevSecCon KeyNote London 2015DevSecCon KeyNote London 2015
DevSecCon KeyNote London 2015
 
The Journey to DevSecOps
The Journey to DevSecOpsThe Journey to DevSecOps
The Journey to DevSecOps
 
How to adapt the SDLC to the era of DevSecOps
How to adapt the SDLC to the era of DevSecOpsHow to adapt the SDLC to the era of DevSecOps
How to adapt the SDLC to the era of DevSecOps
 
Finding Security a Home in a DevOps World
Finding Security a Home in a DevOps WorldFinding Security a Home in a DevOps World
Finding Security a Home in a DevOps World
 
Effective approaches to web application security
Effective approaches to web application security Effective approaches to web application security
Effective approaches to web application security
 
Introduction to DevSecOps
Introduction to DevSecOpsIntroduction to DevSecOps
Introduction to DevSecOps
 
The Joy of Proactive Security
The Joy of Proactive SecurityThe Joy of Proactive Security
The Joy of Proactive Security
 
Security as Code owasp
Security as  Code owaspSecurity as  Code owasp
Security as Code owasp
 
DevSecOps - The big picture
DevSecOps - The big pictureDevSecOps - The big picture
DevSecOps - The big picture
 
The Rise of DevSecOps - Fabian Lim - DevSecOpsSg
The Rise of DevSecOps - Fabian Lim - DevSecOpsSgThe Rise of DevSecOps - Fabian Lim - DevSecOpsSg
The Rise of DevSecOps - Fabian Lim - DevSecOpsSg
 
From Gates to Guardrails: Alternate Approaches to Product Security
From Gates to Guardrails: Alternate Approaches to Product SecurityFrom Gates to Guardrails: Alternate Approaches to Product Security
From Gates to Guardrails: Alternate Approaches to Product Security
 
Cloud Application Security: Lessons Learned
Cloud Application Security: Lessons LearnedCloud Application Security: Lessons Learned
Cloud Application Security: Lessons Learned
 
Chaos Engineering and Systems Reliability
Chaos Engineering and Systems ReliabilityChaos Engineering and Systems Reliability
Chaos Engineering and Systems Reliability
 
DevSecCon London 2017: when good containers go bad by Tim Mackey
DevSecCon London 2017: when good containers go bad by Tim MackeyDevSecCon London 2017: when good containers go bad by Tim Mackey
DevSecCon London 2017: when good containers go bad by Tim Mackey
 
Accelerating Innovation and Time-to-Market @ Camp Devops Houston 2015
Accelerating Innovation and Time-to-Market @ Camp Devops Houston 2015 Accelerating Innovation and Time-to-Market @ Camp Devops Houston 2015
Accelerating Innovation and Time-to-Market @ Camp Devops Houston 2015
 
2019 DevSecOps Reference Architectures
2019 DevSecOps Reference Architectures2019 DevSecOps Reference Architectures
2019 DevSecOps Reference Architectures
 
DevOps, Common use cases, Architectures, Best Practices
DevOps, Common use cases, Architectures, Best PracticesDevOps, Common use cases, Architectures, Best Practices
DevOps, Common use cases, Architectures, Best Practices
 
Security & DevOps- Ways To Make Sure Your Apps & Infrastructure Are Secure
Security & DevOps- Ways To Make Sure Your Apps & Infrastructure Are SecureSecurity & DevOps- Ways To Make Sure Your Apps & Infrastructure Are Secure
Security & DevOps- Ways To Make Sure Your Apps & Infrastructure Are Secure
 

Similar to Principles of Chaos Engineering

20140708 - Jeremy Edberg: How Netflix Delivers Software
20140708 - Jeremy Edberg: How Netflix Delivers Software20140708 - Jeremy Edberg: How Netflix Delivers Software
20140708 - Jeremy Edberg: How Netflix Delivers SoftwareDevOps Chicago
 
Chirp 2010: Scaling Twitter
Chirp 2010: Scaling TwitterChirp 2010: Scaling Twitter
Chirp 2010: Scaling TwitterJohn Adams
 
Chaos engineering & Gameday on AWS
Chaos engineering & Gameday on AWSChaos engineering & Gameday on AWS
Chaos engineering & Gameday on AWSBilal Aybar
 
Hack-Proof Your Cloud: Responding to 2016 Threats | AWS Public Sector Summit ...
Hack-Proof Your Cloud: Responding to 2016 Threats | AWS Public Sector Summit ...Hack-Proof Your Cloud: Responding to 2016 Threats | AWS Public Sector Summit ...
Hack-Proof Your Cloud: Responding to 2016 Threats | AWS Public Sector Summit ...Amazon Web Services
 
The economies of scaling software - Abdel Remani
The economies of scaling software - Abdel RemaniThe economies of scaling software - Abdel Remani
The economies of scaling software - Abdel Remanijaxconf
 
Do you lose sleep at night?
Do you lose sleep at night?Do you lose sleep at night?
Do you lose sleep at night?Nathan Van Gheem
 
The Economies of Scaling Software
The Economies of Scaling SoftwareThe Economies of Scaling Software
The Economies of Scaling SoftwareAbdelmonaim Remani
 
Using AWS to Build a Scalable Big Data Management & Processing Service (BDT40...
Using AWS to Build a Scalable Big Data Management & Processing Service (BDT40...Using AWS to Build a Scalable Big Data Management & Processing Service (BDT40...
Using AWS to Build a Scalable Big Data Management & Processing Service (BDT40...Amazon Web Services
 
Microservices in action at the Dutch National Police - Bert Jan Schrijver - C...
Microservices in action at the Dutch National Police - Bert Jan Schrijver - C...Microservices in action at the Dutch National Police - Bert Jan Schrijver - C...
Microservices in action at the Dutch National Police - Bert Jan Schrijver - C...Codemotion
 
CodeMotion Amsterdam 2018 - Microservices in action at the Dutch National Police
CodeMotion Amsterdam 2018 - Microservices in action at the Dutch National PoliceCodeMotion Amsterdam 2018 - Microservices in action at the Dutch National Police
CodeMotion Amsterdam 2018 - Microservices in action at the Dutch National PoliceBert Jan Schrijver
 
AWS Meetup - Nordstrom Data Lab and the AWS Cloud
AWS Meetup - Nordstrom Data Lab and the AWS CloudAWS Meetup - Nordstrom Data Lab and the AWS Cloud
AWS Meetup - Nordstrom Data Lab and the AWS CloudNordstromDataLab
 
Hacklu2011 tricaud
Hacklu2011 tricaudHacklu2011 tricaud
Hacklu2011 tricaudstricaud
 
Meetup #3: Migrate a fast scale system to AWS
Meetup #3: Migrate a fast scale system to AWSMeetup #3: Migrate a fast scale system to AWS
Meetup #3: Migrate a fast scale system to AWSAWS Vietnam Community
 
Using AWS WAF and Lambda for Automatic Protection
Using AWS WAF and Lambda for Automatic ProtectionUsing AWS WAF and Lambda for Automatic Protection
Using AWS WAF and Lambda for Automatic ProtectionAmazon Web Services
 

Similar to Principles of Chaos Engineering (20)

20140708 - Jeremy Edberg: How Netflix Delivers Software
20140708 - Jeremy Edberg: How Netflix Delivers Software20140708 - Jeremy Edberg: How Netflix Delivers Software
20140708 - Jeremy Edberg: How Netflix Delivers Software
 
Mini-Training: Netflix Simian Army
Mini-Training: Netflix Simian ArmyMini-Training: Netflix Simian Army
Mini-Training: Netflix Simian Army
 
Inrastructure as Code
Inrastructure as CodeInrastructure as Code
Inrastructure as Code
 
Elatt Presentation
Elatt PresentationElatt Presentation
Elatt Presentation
 
Chirp 2010: Scaling Twitter
Chirp 2010: Scaling TwitterChirp 2010: Scaling Twitter
Chirp 2010: Scaling Twitter
 
Chaos engineering & Gameday on AWS
Chaos engineering & Gameday on AWSChaos engineering & Gameday on AWS
Chaos engineering & Gameday on AWS
 
Hack-Proof Your Cloud: Responding to 2016 Threats | AWS Public Sector Summit ...
Hack-Proof Your Cloud: Responding to 2016 Threats | AWS Public Sector Summit ...Hack-Proof Your Cloud: Responding to 2016 Threats | AWS Public Sector Summit ...
Hack-Proof Your Cloud: Responding to 2016 Threats | AWS Public Sector Summit ...
 
The economies of scaling software - Abdel Remani
The economies of scaling software - Abdel RemaniThe economies of scaling software - Abdel Remani
The economies of scaling software - Abdel Remani
 
Do you lose sleep at night?
Do you lose sleep at night?Do you lose sleep at night?
Do you lose sleep at night?
 
The Economies of Scaling Software
The Economies of Scaling SoftwareThe Economies of Scaling Software
The Economies of Scaling Software
 
Using AWS to Build a Scalable Big Data Management & Processing Service (BDT40...
Using AWS to Build a Scalable Big Data Management & Processing Service (BDT40...Using AWS to Build a Scalable Big Data Management & Processing Service (BDT40...
Using AWS to Build a Scalable Big Data Management & Processing Service (BDT40...
 
Microservices in action at the Dutch National Police - Bert Jan Schrijver - C...
Microservices in action at the Dutch National Police - Bert Jan Schrijver - C...Microservices in action at the Dutch National Police - Bert Jan Schrijver - C...
Microservices in action at the Dutch National Police - Bert Jan Schrijver - C...
 
CodeMotion Amsterdam 2018 - Microservices in action at the Dutch National Police
CodeMotion Amsterdam 2018 - Microservices in action at the Dutch National PoliceCodeMotion Amsterdam 2018 - Microservices in action at the Dutch National Police
CodeMotion Amsterdam 2018 - Microservices in action at the Dutch National Police
 
AWS Meetup - Nordstrom Data Lab and the AWS Cloud
AWS Meetup - Nordstrom Data Lab and the AWS CloudAWS Meetup - Nordstrom Data Lab and the AWS Cloud
AWS Meetup - Nordstrom Data Lab and the AWS Cloud
 
Azure Service Fabric Mesh
Azure Service Fabric MeshAzure Service Fabric Mesh
Azure Service Fabric Mesh
 
Dev Ops without the Ops
Dev Ops without the OpsDev Ops without the Ops
Dev Ops without the Ops
 
Hacklu2011 tricaud
Hacklu2011 tricaudHacklu2011 tricaud
Hacklu2011 tricaud
 
Meetup #3: Migrate a fast scale system to AWS
Meetup #3: Migrate a fast scale system to AWSMeetup #3: Migrate a fast scale system to AWS
Meetup #3: Migrate a fast scale system to AWS
 
Migrating to aws
Migrating to awsMigrating to aws
Migrating to aws
 
Using AWS WAF and Lambda for Automatic Protection
Using AWS WAF and Lambda for Automatic ProtectionUsing AWS WAF and Lambda for Automatic Protection
Using AWS WAF and Lambda for Automatic Protection
 

Recently uploaded

chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptkotipi9215
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)OPEN KNOWLEDGE GmbH
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...soniya singh
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...stazi3110
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityNeo4j
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number SystemsJheuzeDellosa
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideChristina Lin
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio, Inc.
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfkalichargn70th171
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...Christina Lin
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about usDynamic Netsoft
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 

Recently uploaded (20)

chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.ppt
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number Systems
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
 
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about us
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 

Principles of Chaos Engineering

  • 1. Chaos Engineering Hamburg Marvin Hoffmann | Computer Scientist 15.12.2015
  • 2. 1. AWS Basics and Intro 2. Evolution of Chaos Testing 3. Tooling 4. Chaos Engineering Agenda
  • 3. Europe West (Ireland)US East (N. Virginia) Regions AZs Instances AWS Basics
  • 4. Chaos? - What do we mean?
  • 5. “A way to improve availability is to install proven hardware and software, and then leave it alone” Jim Gray Why Do Computers Stop and What Can Be Done About It?
  • 6. • Systems need to be reliable • Nuklear weapon arsenal, heart rate monitoring, World of Warcraft servers, Streaming business • Third party dependencies (software and hardware) Be reliable!
  • 7. DynamoDB Outage US-East • “… there was a brief network disruption that impacted a portion of DynamoDB’s storage servers.” • 2:19am until 7:10am PDT • “There are several other AWS services that use DynamoDB that experienced problems during the event.” • SQS, EC2 auto scaling, CloudWatch Source: https://aws.amazon.com/message/5467D2/
  • 8. • Deployments themselves may cause issues • Unpredicted behaviour after a change has been rolled out • Issues during rollback • Change in client / user behaviour It’s not always the infrastructure
  • 10. Do the simplest thing first • Prepare for your machines to die • “Cattle, not pets” (Adrian Cockcroft) • Resilience through redundancy • Stateless machines
  • 11. Deal with infrastructure issues • Latency between instances • Package loss • Ports blocked • or even outages of an entire AZ
  • 12. Think big! • Remember that DynamoDB failure? • Outage of an entire AWS region! • You’ll need more than one region in the first place • Re-routing of entire traffic from one region to another • Any region needs to be able to scale to take the load of two regions
  • 14. Chaos Monkey Kills random instances in your account
  • 15. Chaos Gorilla Kills a random AZ in your account
  • 16. Chaos Kong Kills an entire AWS region in your account
  • 17. What’s in it? • A compilation of scripts • Scripts mess with your AWS account • Thus, they are very AWS specific • If not on AWS, get inspired and build your toolset around these ideas • Not a comprehensive toolset
  • 18. • Latency Monkey • Conformity Monkey • Security Monkey • Doctor Monkey • 10-18 Monkey Simian Army
  • 20. • Systematic approach to Chaos Testing • Started by Netflix • Talk about it a lot to attract talent • Many other companies doing similar things in that field • Want to grow a community around it Chaos Engineering
  • 21. “Experiment on a distributed system in order to build confidence in the system’s capability to withstand turbulent conditions in production.” Netflix
  • 23. Know your system • Operational insight • What is “normal”? What does a failure look like?
  • 24. Four Principles of Chaos Engineering 1.Build a hypothesis around steady-state behaviour
  • 25. The “Happy Path” • Trace through code where nothing bad happens • usually testing happens first on the happy path • Bad things usually happen off the happy path Source: https://bethtrissel.files.wordpress.com/2014/06/176869567.jpg
  • 26. Four Principles of Chaos Engineering 1.Build a hypothesis around steady-state behaviour 2.Vary real-world events
  • 27. Laboratory • “Works on my machine” (or “works in stage env.”) Source: http://www.memegasms.com/media/created/vhyfxm.jpg
  • 28. Four Principles of Chaos Engineering 1.Build a hypothesis around steady-state behaviour 2.Vary real-world events 3.Run experiments in production
  • 29. Four Principles of Chaos Engineering 1.Build a hypothesis around steady-state behaviour 2.Vary real-world events 3.Run experiments in production 4.Automate experiments to run continuously
  • 30. Chaos Engineering Culture • http://principlesofchaos.com • More resources: • https://github.com/Netflix/SimianArmy • https://github.com/Netflix/atlas • https://www.youtube.com/watch?v=vq4QZ4_YDok