SlideShare a Scribd company logo
1 of 66
Download to read offline
WHISPERS IN CHAOS
J. PAUL REED
RELEASE ENGINEERING APPROACHES
MONITORAMA, 2017
Greetings, Pacific Power!
@jpaulreed #monitorama
“CHAOS?!”
@jpaulreed #monitorama
“CHAOS?!”
(INCIDENTS)
@jpaulreed #monitorama
HOW DO YOU KNOW
AN INCIDENT
IS OCCURRING?
@jpaulreed #monitorama
MONITORING!
@jpaulreed #monitorama
MONITORING!
(Not a trick question.)
@jpaulreed #monitorama
HOW DO YOU KNOW
WHAT TO DO
WHEN AN INCIDENT
IS OCCURRING?
@jpaulreed #monitorama
J. PAUL REED
• @JPAULREED ON
• @SHIPSHOWPODCAST ALUM
• 15+ YEARS IN BUILD/RELEASE
ENGINEERING
• NOW, A DEVOPS CONSULTANT™
• MASTERS OF SCIENCE
CANDIDATE IN HUMAN FACTORS
AND SYSTEMS SAFETY
@jpaulreed #monitorama
HOW DO YOU KNOW
WHAT TO DO
WHEN AN INCIDENT
IS OCCURRING?
@jpaulreed #monitorama
Two Brain Systems
“Automatic” / Quick

Little to no effort

No sense of voluntary
control
“System One”
@jpaulreed #monitorama
Two Brain Systems
“Automatic” / Quick

Little to no effort

No sense of voluntary
control
“Effortful”

Complex computations

“Associated with the
subjective experience
of agency, choice, and
concentration”
“System One” “System Two”
@jpaulreed #monitorama
@jpaulreed #monitorama
Two Problem Types
Orient to the source of a
sudden sound

Complete: “bread and…”

2 + 2 = ?

Find a strong move in chess

(but only if you’re a chess
master!)
Focus on a particular voice
in a crowded room

Count the occurrence of
the letter ‘a’ on this slide

Fill out a tax form

Check the validity of a
complex logical argument
“System One” “System Two”
@jpaulreed #monitorama
@jpaulreed #monitorama
TRADE-OFFS UNDER PRESSURE:
HEURISTICS AND
OBSERVATIONS OF TEAMS
RESOLVING INTERNET SERVICE
OUTAGES
John Allspaw
LUND UNIVERSITY
SWEDEN
Date of submission: 2015-09-07
@jpaulreed #monitorama
“THE INCIDENT”
On December 4th, 2014,
during the busy holiday
shopping season, it was
reported at 1:06 PM EST
that the personalized
homepage for logged-in
users was experiencing
loading issues.
@jpaulreed #monitorama
“Timelines, yadda, yadda”
The Field Guide to Understanding Human Error
Dekker@jpaulreed #monitorama
Figure 19 - Infrastructure Engineer 1 timeline
Diagnostic
Activity
Taking
Action/Response
HOLD is placed on
the push queue
ProdEng1 re-enables
the sidebar,
with blog turned off
13:06:44 13:15:00 13:30:00 13:45:00 14:00:00 14:15:00 14:30:00
ProdEng2 turns off
homepage
sidebar module
HOLD is removed on
the push queue
Dashboard
Access
Staff Directory
Access
Princess
Requests
Production
Site Requests
@jpaulreed #monitorama
Software: A Team Sport
38
Figure 8 - Timeline view of utterances in IRC, by participant
Combined IRC utterances
@jpaulreed #monitorama
ALLSPAW IDENTIFIED THREE
“MONITORS” (HEURISTICS)
ENGINEERS USE TO WORK
INCIDENTS
@jpaulreed #monitorama
Heuristic #1: Change
“What has changed since the system was in a known-good state?”
@jpaulreed #monitorama
Heuristic #2: “Go Wide”
Widen the search to any potential contributors imagined
@jpaulreed #monitorama
Heuristic #3: Convergent Searching
Confirm/disqualify diagnoses by matching signals/symptoms
@jpaulreed #monitorama
Heuristic #3: Convergent Searching
Confirm / Disqualify…
…that comes to mind by matching signals or
symptoms that appear similar
A specific and past diagnosis

A general and recent diagnosis
@jpaulreed #monitorama
Heuristic #3: Convergent Searching
Confirm / Disqualify…
…that comes to mind by matching signals or
symptoms that appear similar
A really painful incident-memory

An incident still in your L1 cache
@jpaulreed #monitorama
“THE INCIDENT”
The page load time
increase was caused by:
Figure 5 - Signed-in homepage with sidebar components
CDN cache misses…
Due to an HTTP 400

status in an API…
From a “closed store”…
Referenced by a blog post

in the sidebar
@jpaulreed #monitorama
IE2
PE2
IE5
IE1
IE1
PE3
IE3
PE3
PE3
ProdEng1 re-enables
the sidebar,
with blog turned off
ProdEng2 turns off
homepage
sidebar module
disable a
CDN?
Load
balancer
changes?
Network
changes?
Wordpress
issue?
Frozen shop?
Featured
shop?
PE1PE1
Varnish
queuing?
Featured
staff shop?
Sidebar loading
staff shop?
IE1IE1IE1IE1IE1IE1IE1
Varnish
not caching?
IE3
Database
schema change?
IE2 IE2
IE1Errors from
Homepage
sidebar
IE2400 response
code
IE2
PublicShops_GetShopCards
API method
PE3
Featured
shop loading
OK
IE2
“Shop 1234567
does not exist”
Varnish queuing,
not caching
400 responses?
Stated hypothesis
Critical relayed
observation
@jpaulreed #monitorama
Bonus Heuristic: Testing the Fix
@jpaulreed #monitorama
b. 5 = I ALWAYS wait for tests to finish, I don't care how much time pressure
there is.
The results of question one were: 29 Yes, 3 No. (n=32)
The results of question two can be seen in Figure 18.
Figure 18 - Survey results: waiting for automated tests to finish
Some follow-up discussion with one of the respondents about the questions helped to provide
Bonus Heuristic: Testing the Fix
@jpaulreed #monitorama
b. 5 = I ALWAYS wait for tests to finish, I don't care how much time pressure
there is.
The results of question one were: 29 Yes, 3 No. (n=32)
The results of question two can be seen in Figure 18.
Figure 18 - Survey results: waiting for automated tests to finish
Some follow-up discussion with one of the respondents about the questions helped to provide
Bonus Heuristic: Testing the Fix
YOLO,
Every Day,
Twice on
Sundays?
@jpaulreed #monitorama
HOW DO YOU GET BETTER
AT DETECTING
AN INCIDENT
IS OCCURRING?
@jpaulreed #monitorama
MONITOR
THINGS
BETTER!
@jpaulreed #monitorama
MONITOR
THINGS
BETTER!
(Still not a trick question.)
@jpaulreed #monitorama
HOW DO YOU
GET BETTER AT
KNOWING WHAT TO DO
WHEN AN INCIDENT
IS OCCURRING?@jpaulreed #monitorama
Elements of “Expertise”
Experts use their knowledge base to

Recognize typicality
Make fine discriminations

Use mental simulation
Knowledge base also used to apply higher level rules
@jpaulreed #monitorama
“Seeing the Invisible”
With experience, a person gains
the ability to visualize how a
situation developed and how to
imagine how it’s going to turn out.
Experts can see what is not there.
Seeing the Invisible: Perceptual-Cognitive Aspects of Expertise
Klein & Hoffman@jpaulreed #monitorama
10,000 HOUR RULE
@jpaulreed #monitorama
@jpaulreed #monitorama
“Yeah, but Malcolm Gladwell…”
Psychological Review
1993, Vol.100. No. 3, 363-406
Copyright 1993 by the American Psychological Association, Inc.
0033-295X/93/S3.00
The Role of Deliberate Practice in the Acquisition of Expert Performance
K. Anders Ericsson, Ralf Th. Krampe, and Clemens Tesch-Romer
The theoretical framework presented in thisarticle explainsexpert performanceasthe end resultof
individuals' prolonged efforts to improve performance while negotiatingmotivational and external
constraints. In most domains of expertise, individuals begin in their childhood a regimen of
effortful activities (deliberate practice) designed to optimize improvement. Individual differences,
even among elite performers, are closely related to assessed amounts of deliberate practice. Many
characteristics once believed to reflect innate talent are actually the result of intense practice
extended for a minimumof 10years. Analysisof expert performanceprovides uniqueevidence on
the potential and limitsof extreme environmental adaptation and learning.
Our civilization has always recognized exceptional individ-
uals, whose performance in sports, the arts, and science is
vastly superior to that of the rest of the population. Specula-
tions on the causes of these individuals' extraordinary abilities
and performanceare as old as the first records of their achieve-
ments. Early accounts commonly attribute these individuals'
outstanding performance to divine intervention, such as the
influence of the stars or organs in their bodies, or to special
gifts (Murray, 1989). As science progressed, these explanations
became less acceptable. Contemporary accounts assert that the
characteristics responsible for exceptional performance are in-
nate and are genetically transmitted.
The simplicity of these accounts is attractive, but more is
because observed behavior is the result of interactions between
environmental factors and genes during the extended period of
development. Therefore, to better understand expert and ex-
ceptional performance, we must require that the account spec-
ify the different environmental factors that could selectively
promote and facilitate the achievement ofsuch performance. In
addition, recent research on expert performance and expertise
(Chi, Glaser, & Farr, 1988; Ericsson &Smith, 199la) has shown
that important characteristics ofexperts' superior performance
are acquired through experience and that the effect of practice
on performance is larger than earlier believed possible. For this
reason, an account of exceptional performance must specify
the environmental circumstances, such as the duration and
@jpaulreed #monitorama
“Yeah, but Malcolm Gladwell…”
Why expert performance is special and cannot be extrapolated
from studies of performance in the general population:
A response to criticisms☆
K. Anders Ericsson
Department of Psychology, Florida State University, Tallahassee, FL 32306-1270, USA
a r t i c l e i n f o a b s t r a c t
Article history:
Received 1 December 2013
Accepted 1 December 2013
Available online 23 December 2013
Many misunderstandings about the expert-performance approach can be attributed to its
unique methodology and theoretical concepts. This approach was established with case
studies of the acquisition of expert memory with detailed experimental analysis of the
mediating mechanisms. In contrast the traditional individual difference approach starts with
the assumption of underlying general latent factors of cognitive ability and personality that
correlate with performance across levels of acquired skill. My review rejects the assumption
that data on large samples of beginners can be extrapolated to samples of elite and expert
performers. Once we can agree on the criteria for reproducible objective expert performance
and acceptable methodologies for collecting valid data. I believe that scientists will recognize
the need for expert-performance approach to the study of expert performance, especially at
the very highest levels of achievement.
© 2013 Elsevier Inc. All rights reserved.
Keywords:
Expert performance
Deliberate practice
Long-term working memory
Innate talent
IQ
Intelligence 45 (2014) 81–103
Contents lists available at ScienceDirect
Intelligence
@jpaulreed #monitorama
Expert Performance
Psychological Review
1993, Vol.100. No. 3, 363-406
Copyright 1993 by the American Psychological Association, Inc.
0033-295X/93/S3.00
The Role of Deliberate Practice in the Acquisition of Expert Performance
K. Anders Ericsson, Ralf Th. Krampe, and Clemens Tesch-Romer
The theoretical framework presented in thisarticle explainsexpert performanceasthe end resultof
individuals' prolonged efforts to improve performance while negotiatingmotivational and external
constraints. In most domains of expertise, individuals begin in their childhood a regimen of
effortful activities (deliberate practice) designed to optimize improvement. Individual differences,
even among elite performers, are closely related to assessed amounts of deliberate practice. Many
characteristics once believed to reflect innate talent are actually the result of intense practice
extended for a minimumof 10years. Analysisof expert performanceprovides uniqueevidence on
the potential and limitsof extreme environmental adaptation and learning.
Our civilization has always recognized exceptional individ-
uals, whose performance in sports, the arts, and science is
vastly superior to that of the rest of the population. Specula-
tions on the causes of these individuals' extraordinary abilities
and performanceare as old as the first records of their achieve-
ments. Early accounts commonly attribute these individuals'
outstanding performance to divine intervention, such as the
influence of the stars or organs in their bodies, or to special
gifts (Murray, 1989). As science progressed, these explanations
became less acceptable. Contemporary accounts assert that the
characteristics responsible for exceptional performance are in-
nate and are genetically transmitted.
The simplicity of these accounts is attractive, but more is
because observed behavior is the result of interactions between
environmental factors and genes during the extended period of
development. Therefore, to better understand expert and ex-
ceptional performance, we must require that the account spec-
ify the different environmental factors that could selectively
promote and facilitate the achievement ofsuch performance. In
addition, recent research on expert performance and expertise
(Chi, Glaser, & Farr, 1988; Ericsson &Smith, 199la) has shown
that important characteristics ofexperts' superior performance
are acquired through experience and that the effect of practice
on performance is larger than earlier believed possible. For this
reason, an account of exceptional performance must specify
the environmental circumstances, such as the duration and
@jpaulreed #monitorama
Expertise in Other Crafts
Immediately
starting the APU

Taking control of
the airplane

Not attempting to
land at La Guardia
Airport
@jpaulreed #monitorama
Expertise
in Ops:
A Haiku
@jpaulreed #monitorama
Transforming Experience
into Expertise
Personal Experiences: “the opportunity to be
continually challenged”

Directed Experiences: Receiving tutoring so as to
be able to tutor
Manufactured Experiences: training / simulation

Vicarious Experiences: painful / memorable events
we craft into stories we tell others
@jpaulreed #monitorama
Transforming Experience
into Expertise
Personal Experiences: “On-call”

Directed Experiences: Training / Code Review / Pair
Programming / Wikis+Runbooks

Manufactured Experiences: Chaos Engineering /
Game Days

Vicarious Experiences: “I remember this one
incident… where it was DNS.”
@jpaulreed #monitorama
EXPLORE
DISCRETIONARY
SPACES
@jpaulreed #monitorama
The Rasmussen Triangle
@jpaulreed #monitorama
Boundaryof
Econom
icFailure
Boundary of
Unacceptable Workload
BoundaryofFunctionally
AcceptablePerformance/
AcceptableRisk
The Rasmussen Triangle
@jpaulreed #monitorama
The Rasmussen Triangle
@jpaulreed #monitorama
“Cheaper,
Better,
Faster”
The Rasmussen Triangle
@jpaulreed #monitorama
“Cheaper,
Better,
Faster”
Maximum Work for
the Least Effort
The Rasmussen Triangle
@jpaulreed #monitorama
“Cheaper,
Better,
Faster”
Maximum Work for
the Least Effort
The Rasmussen Triangle
@jpaulreed #monitorama
“Cheaper,
Better,
Faster”
Maximum Work for
the Least Effort
The Rasmussen Triangle
@jpaulreed #monitorama
“Cheaper,
Better,
Faster”
Maximum Work for
the Least Effort
The Rasmussen Triangle
@jpaulreed #monitorama
Maslow’s SRE Hierarchy
Figure III-1. Service Reliability Hierarchy
Monitoring
Site Reliability Engineering: How Google Runs Production Systems
@jpaulreed #monitorama
Just Two Questions
Did at least one person learn
one thing that will affect how
they work in the future?

Did at least half of the
attendees say they would
attend another debrief in the
future?
Debriefing
Facilitation Guide
Leading Groups at Etsy to Learn From Accidents
Authors: John Allspaw, Morgan Evans, Daniel Schauenberg
@jpaulreed #monitorama
HOW DO YOU
GET BETTER AT
KNOWING WHAT TO DO
WHEN AN INCIDENT
IS OCCURRING?@jpaulreed #monitorama
CREATE SPACE & EXPERIENCES TO
FACILITATE THE CULTIVATION OF
OURSELVES AND OUR TEAMS SO AS
TO IMPROVE OUR HEURISTICS AT DETECTING
WEAK SIGNALS AND AMBIGUITY IN THE
COMPLEX SOCIO-TECHNICAL SYSTEMS WE
OPERATE AND IN WHICH WE EXIST
@jpaulreed #monitorama
PRACTICE MAKES… BETTER
@jpaulreed #monitorama
EXPERTISE TAKES TIME.
AND SPACE.
MAKE TIME AND SPACE.
@jpaulreed #monitorama
IT’S JUST US OUT HERE
@jpaulreed #monitorama
BE GOOD TO EACH OTHER
ON OUR JOURNEY TO EXPERTISE@jpaulreed #monitorama
J. Paul Reed
preed@release-approaches.com
@jpaulreed
http://jpaulreed.com
Anonymous Feedback
http://sayat.me/jpaulreed
Bibliography
Allspaw, J. (2015). Trade-offs under pressure: heuristics and observations of teams resolving
Internet service outages (Unpublished master’s thesis). Lund University, Lund, Sweden.

Allspaw, J., Evans, M., & Shauenberg, D. (2016). Debriefing facilitation guide: leading groups at
Etsy to learn from accidents. Retrieved January 23, 2017, from https://extfiles.etsy.com/
DebriefingFacilitationGuide.pdf

Beyer, B., Jones, C., Petoff, J., & Murphy, N. R. (Eds). (2016). Site reliability engineering: how
Google runs production systems. Sebastopol, California: O’Reilly Media.

Ericsson, K. A., Trampe, R. T., & Tesch-Römer, C. (1993). The role of deliberate practice in the
acquisition of expert performance. Psychological Review, 100(3), pp. 363-406.

Ericsson, K. A. (2013). Why expert performance is special and cannot be extrapolated from studies
of performance in the general population: a response to criticisms. Intelligence, 45, pp. 81-103.
@jpaulreed #monitorama
Bibliography
Gladwell, M. (2008). Outliers: the story of success. New York, New York: Little, Brown and
Company.

Kahneman, D. (2011). Thinking, fast and slow. New York, New York: Farrar, Straus and Giroux.

Klein, G. A., & Hoffman, R. R. (1992). Seeing the invisible: perceptual-cognitive aspects of
expertise. In M. Rabinowitz (Ed.), Cognitive science foundations of instruction (pp. 203-226).
Mahwah, New Jersey: Erlbaum.

Rasmussen, J. (1997). Risk management in a dynamic society: a modelling problem. Safety
Science, 27(2-3), pp. 183-213.

Sullenberger, C. & Zaslow, J. (2009). Highest duty: my search for what really matters. New
York, New York: Harper Collins.
@jpaulreed #monitorama

More Related Content

What's hot

Business Development for Startup Founders (DevCon Cebu 2018 keynote)
Business Development for Startup Founders (DevCon Cebu 2018 keynote)Business Development for Startup Founders (DevCon Cebu 2018 keynote)
Business Development for Startup Founders (DevCon Cebu 2018 keynote)Alistair Israel
 
Technology Trends, Consumer Experience @MICA 2016
Technology Trends, Consumer Experience @MICA 2016Technology Trends, Consumer Experience @MICA 2016
Technology Trends, Consumer Experience @MICA 2016Ravi Pal
 
Sedgwick e0498336-d0105-sp7-module 01-30537a-assessment 01
Sedgwick e0498336-d0105-sp7-module 01-30537a-assessment 01Sedgwick e0498336-d0105-sp7-module 01-30537a-assessment 01
Sedgwick e0498336-d0105-sp7-module 01-30537a-assessment 01Colleen Sedgwick
 
Rp2-2015 - Experience - Tools n Methods
Rp2-2015 - Experience - Tools n MethodsRp2-2015 - Experience - Tools n Methods
Rp2-2015 - Experience - Tools n MethodsRavi Pal
 
Rp2-2015-technology trends enriching consumer experience
Rp2-2015-technology trends enriching consumer experienceRp2-2015-technology trends enriching consumer experience
Rp2-2015-technology trends enriching consumer experienceRavi Pal
 
Data Science Popup Austin: Conflict in Growing Data Science Organizations
Data Science Popup Austin: Conflict in Growing Data Science Organizations Data Science Popup Austin: Conflict in Growing Data Science Organizations
Data Science Popup Austin: Conflict in Growing Data Science Organizations Domino Data Lab
 
Rp2-2015 - technology driven macro trends in marketing space
Rp2-2015 -  technology driven macro trends in marketing space Rp2-2015 -  technology driven macro trends in marketing space
Rp2-2015 - technology driven macro trends in marketing space Ravi Pal
 
5 Deadly Workplace Legal Risks
5 Deadly Workplace Legal Risks5 Deadly Workplace Legal Risks
5 Deadly Workplace Legal Risksperformanceweb
 
ICTON 2020 KeyNote: Evolving Network Security & Resilience
ICTON 2020 KeyNote:  Evolving Network Security & ResilienceICTON 2020 KeyNote:  Evolving Network Security & Resilience
ICTON 2020 KeyNote: Evolving Network Security & ResilienceUniversity of Hertfordshire
 
What does mapping controversies mean and why controversies are charted and vi...
What does mapping controversies mean and why controversies are charted and vi...What does mapping controversies mean and why controversies are charted and vi...
What does mapping controversies mean and why controversies are charted and vi...Khalid Md Saifuddin
 
Its the Product. Not the Project May 17 2017
Its the Product.  Not the Project   May 17 2017Its the Product.  Not the Project   May 17 2017
Its the Product. Not the Project May 17 2017Kathleen Leach, PMP
 
Data Science Popup Austin: Privilege and Supervised Machine Learning
Data Science Popup Austin: Privilege and Supervised Machine LearningData Science Popup Austin: Privilege and Supervised Machine Learning
Data Science Popup Austin: Privilege and Supervised Machine LearningDomino Data Lab
 
Critical Thinking for Software Testers
Critical Thinking for Software TestersCritical Thinking for Software Testers
Critical Thinking for Software TestersTechWell
 

What's hot (20)

Business Development for Startup Founders (DevCon Cebu 2018 keynote)
Business Development for Startup Founders (DevCon Cebu 2018 keynote)Business Development for Startup Founders (DevCon Cebu 2018 keynote)
Business Development for Startup Founders (DevCon Cebu 2018 keynote)
 
Technologies That Will Change Everything
Technologies That Will Change EverythingTechnologies That Will Change Everything
Technologies That Will Change Everything
 
Quantifying Machine Intelligence Mathematically
Quantifying Machine Intelligence MathematicallyQuantifying Machine Intelligence Mathematically
Quantifying Machine Intelligence Mathematically
 
Cyber Security in a Fully Mobile World
Cyber Security in a Fully Mobile WorldCyber Security in a Fully Mobile World
Cyber Security in a Fully Mobile World
 
Technology Trends, Consumer Experience @MICA 2016
Technology Trends, Consumer Experience @MICA 2016Technology Trends, Consumer Experience @MICA 2016
Technology Trends, Consumer Experience @MICA 2016
 
Sedgwick e0498336-d0105-sp7-module 01-30537a-assessment 01
Sedgwick e0498336-d0105-sp7-module 01-30537a-assessment 01Sedgwick e0498336-d0105-sp7-module 01-30537a-assessment 01
Sedgwick e0498336-d0105-sp7-module 01-30537a-assessment 01
 
Rp2-2015 - Experience - Tools n Methods
Rp2-2015 - Experience - Tools n MethodsRp2-2015 - Experience - Tools n Methods
Rp2-2015 - Experience - Tools n Methods
 
Rp2-2015-technology trends enriching consumer experience
Rp2-2015-technology trends enriching consumer experienceRp2-2015-technology trends enriching consumer experience
Rp2-2015-technology trends enriching consumer experience
 
Data Science Popup Austin: Conflict in Growing Data Science Organizations
Data Science Popup Austin: Conflict in Growing Data Science Organizations Data Science Popup Austin: Conflict in Growing Data Science Organizations
Data Science Popup Austin: Conflict in Growing Data Science Organizations
 
Rp2-2015 - technology driven macro trends in marketing space
Rp2-2015 -  technology driven macro trends in marketing space Rp2-2015 -  technology driven macro trends in marketing space
Rp2-2015 - technology driven macro trends in marketing space
 
5 Deadly Workplace Legal Risks
5 Deadly Workplace Legal Risks5 Deadly Workplace Legal Risks
5 Deadly Workplace Legal Risks
 
ICTON 2020 KeyNote: Evolving Network Security & Resilience
ICTON 2020 KeyNote:  Evolving Network Security & ResilienceICTON 2020 KeyNote:  Evolving Network Security & Resilience
ICTON 2020 KeyNote: Evolving Network Security & Resilience
 
What does mapping controversies mean and why controversies are charted and vi...
What does mapping controversies mean and why controversies are charted and vi...What does mapping controversies mean and why controversies are charted and vi...
What does mapping controversies mean and why controversies are charted and vi...
 
Cyber Security - Thinking Like The Enemy
Cyber Security - Thinking Like The EnemyCyber Security - Thinking Like The Enemy
Cyber Security - Thinking Like The Enemy
 
Its the Product. Not the Project May 17 2017
Its the Product.  Not the Project   May 17 2017Its the Product.  Not the Project   May 17 2017
Its the Product. Not the Project May 17 2017
 
Data Science Popup Austin: Privilege and Supervised Machine Learning
Data Science Popup Austin: Privilege and Supervised Machine LearningData Science Popup Austin: Privilege and Supervised Machine Learning
Data Science Popup Austin: Privilege and Supervised Machine Learning
 
Critical Thinking for Software Testers
Critical Thinking for Software TestersCritical Thinking for Software Testers
Critical Thinking for Software Testers
 
Wireless Past Present Future
Wireless Past Present FutureWireless Past Present Future
Wireless Past Present Future
 
MSP Automation - Application and Execution
MSP Automation - Application and ExecutionMSP Automation - Application and Execution
MSP Automation - Application and Execution
 
Smart Materials and Structures
Smart Materials and StructuresSmart Materials and Structures
Smart Materials and Structures
 

Similar to Release Engineering Approaches to Resolving Internet Outages

2-3-Research-Experiences-and-Importance-of-Research_STUDENT-COPY.pptx
2-3-Research-Experiences-and-Importance-of-Research_STUDENT-COPY.pptx2-3-Research-Experiences-and-Importance-of-Research_STUDENT-COPY.pptx
2-3-Research-Experiences-and-Importance-of-Research_STUDENT-COPY.pptxTenkiTanjo
 
The Relationship Between Body Image And The Media
The Relationship Between Body Image And The MediaThe Relationship Between Body Image And The Media
The Relationship Between Body Image And The MediaJessica Myers
 
Revenue Optimization: The Science of Sales and Customer Success - Julie Weill...
Revenue Optimization: The Science of Sales and Customer Success - Julie Weill...Revenue Optimization: The Science of Sales and Customer Success - Julie Weill...
Revenue Optimization: The Science of Sales and Customer Success - Julie Weill...Traction Conf
 
People Analytics_Introduction
People Analytics_IntroductionPeople Analytics_Introduction
People Analytics_IntroductionEdith Soghomonyan
 
IRJET- Technology Related Anxiety- The Deepest Contributor to Stress
IRJET- Technology Related Anxiety- The Deepest Contributor to StressIRJET- Technology Related Anxiety- The Deepest Contributor to Stress
IRJET- Technology Related Anxiety- The Deepest Contributor to StressIRJET Journal
 
PRACTICAL RESEARCH 1 LESSONs 11.-10pptx
PRACTICAL RESEARCH  1 LESSONs 11.-10pptxPRACTICAL RESEARCH  1 LESSONs 11.-10pptx
PRACTICAL RESEARCH 1 LESSONs 11.-10pptxsherylduenas
 
Kostogryzov-for china-2013
 Kostogryzov-for china-2013 Kostogryzov-for china-2013
Kostogryzov-for china-2013Mathmodels Net
 
Demystifying Gamification in Learning
Demystifying Gamification in LearningDemystifying Gamification in Learning
Demystifying Gamification in LearningAxonify
 
A data view of the data science process
A data view of the data science processA data view of the data science process
A data view of the data science processMathieu d'Aquin
 
CORE: Quantitative Research Methodology: An Overview
CORE: Quantitative Research Methodology: An OverviewCORE: Quantitative Research Methodology: An Overview
CORE: Quantitative Research Methodology: An OverviewTrident University
 
Research method
Research methodResearch method
Research methodCh Irfan
 
What's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper SeaboldWhat's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper SeaboldPyData
 

Similar to Release Engineering Approaches to Resolving Internet Outages (20)

2-3-Research-Experiences-and-Importance-of-Research_STUDENT-COPY.pptx
2-3-Research-Experiences-and-Importance-of-Research_STUDENT-COPY.pptx2-3-Research-Experiences-and-Importance-of-Research_STUDENT-COPY.pptx
2-3-Research-Experiences-and-Importance-of-Research_STUDENT-COPY.pptx
 
The Relationship Between Body Image And The Media
The Relationship Between Body Image And The MediaThe Relationship Between Body Image And The Media
The Relationship Between Body Image And The Media
 
Revenue Optimization: The Science of Sales and Customer Success - Julie Weill...
Revenue Optimization: The Science of Sales and Customer Success - Julie Weill...Revenue Optimization: The Science of Sales and Customer Success - Julie Weill...
Revenue Optimization: The Science of Sales and Customer Success - Julie Weill...
 
Third year ebm 2 2013
Third year ebm 2 2013Third year ebm 2 2013
Third year ebm 2 2013
 
People Analytics_Introduction
People Analytics_IntroductionPeople Analytics_Introduction
People Analytics_Introduction
 
Nov1 webinar intro_slides v
Nov1 webinar intro_slides vNov1 webinar intro_slides v
Nov1 webinar intro_slides v
 
IRJET- Technology Related Anxiety- The Deepest Contributor to Stress
IRJET- Technology Related Anxiety- The Deepest Contributor to StressIRJET- Technology Related Anxiety- The Deepest Contributor to Stress
IRJET- Technology Related Anxiety- The Deepest Contributor to Stress
 
Introduction to PPDAC cycle
Introduction to PPDAC cycleIntroduction to PPDAC cycle
Introduction to PPDAC cycle
 
Detection Of Drowning Victims
Detection Of Drowning VictimsDetection Of Drowning Victims
Detection Of Drowning Victims
 
PRACTICAL RESEARCH 1 LESSONs 11.-10pptx
PRACTICAL RESEARCH  1 LESSONs 11.-10pptxPRACTICAL RESEARCH  1 LESSONs 11.-10pptx
PRACTICAL RESEARCH 1 LESSONs 11.-10pptx
 
Desarrollo del cliente MBA
Desarrollo del cliente MBA Desarrollo del cliente MBA
Desarrollo del cliente MBA
 
Observational studies in social media
Observational studies in social mediaObservational studies in social media
Observational studies in social media
 
Kostogryzov-for china-2013
 Kostogryzov-for china-2013 Kostogryzov-for china-2013
Kostogryzov-for china-2013
 
Demystifying Gamification in Learning
Demystifying Gamification in LearningDemystifying Gamification in Learning
Demystifying Gamification in Learning
 
A data view of the data science process
A data view of the data science processA data view of the data science process
A data view of the data science process
 
Issues in Experimental Design
Issues in Experimental DesignIssues in Experimental Design
Issues in Experimental Design
 
E. Jenkins' Thesis
E. Jenkins' Thesis E. Jenkins' Thesis
E. Jenkins' Thesis
 
CORE: Quantitative Research Methodology: An Overview
CORE: Quantitative Research Methodology: An OverviewCORE: Quantitative Research Methodology: An Overview
CORE: Quantitative Research Methodology: An Overview
 
Research method
Research methodResearch method
Research method
 
What's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper SeaboldWhat's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper Seabold
 

More from J. Paul Reed

AllDayDevOps: Crossing the CD Chasm
AllDayDevOps: Crossing the CD ChasmAllDayDevOps: Crossing the CD Chasm
AllDayDevOps: Crossing the CD ChasmJ. Paul Reed
 
Tyranny of the SLA
Tyranny of the SLATyranny of the SLA
Tyranny of the SLAJ. Paul Reed
 
The Blameless Cloud: Bringing Actionable Retrospectives to Salesforce
The Blameless Cloud: Bringing Actionable Retrospectives to SalesforceThe Blameless Cloud: Bringing Actionable Retrospectives to Salesforce
The Blameless Cloud: Bringing Actionable Retrospectives to SalesforceJ. Paul Reed
 
Tools, Culture, and Aesthetics: The Art of DevOps
Tools, Culture, and Aesthetics: The Art of DevOpsTools, Culture, and Aesthetics: The Art of DevOps
Tools, Culture, and Aesthetics: The Art of DevOpsJ. Paul Reed
 
Fixing Your Org By Continually Breaking It
Fixing Your Org By Continually Breaking ItFixing Your Org By Continually Breaking It
Fixing Your Org By Continually Breaking ItJ. Paul Reed
 
Devops at 5,016 Feet
Devops at 5,016 FeetDevops at 5,016 Feet
Devops at 5,016 FeetJ. Paul Reed
 
"DevOps" in a Post-DevOps World
"DevOps" in a Post-DevOps World"DevOps" in a Post-DevOps World
"DevOps" in a Post-DevOps WorldJ. Paul Reed
 
Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)
Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)
Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)J. Paul Reed
 

More from J. Paul Reed (8)

AllDayDevOps: Crossing the CD Chasm
AllDayDevOps: Crossing the CD ChasmAllDayDevOps: Crossing the CD Chasm
AllDayDevOps: Crossing the CD Chasm
 
Tyranny of the SLA
Tyranny of the SLATyranny of the SLA
Tyranny of the SLA
 
The Blameless Cloud: Bringing Actionable Retrospectives to Salesforce
The Blameless Cloud: Bringing Actionable Retrospectives to SalesforceThe Blameless Cloud: Bringing Actionable Retrospectives to Salesforce
The Blameless Cloud: Bringing Actionable Retrospectives to Salesforce
 
Tools, Culture, and Aesthetics: The Art of DevOps
Tools, Culture, and Aesthetics: The Art of DevOpsTools, Culture, and Aesthetics: The Art of DevOps
Tools, Culture, and Aesthetics: The Art of DevOps
 
Fixing Your Org By Continually Breaking It
Fixing Your Org By Continually Breaking ItFixing Your Org By Continually Breaking It
Fixing Your Org By Continually Breaking It
 
Devops at 5,016 Feet
Devops at 5,016 FeetDevops at 5,016 Feet
Devops at 5,016 Feet
 
"DevOps" in a Post-DevOps World
"DevOps" in a Post-DevOps World"DevOps" in a Post-DevOps World
"DevOps" in a Post-DevOps World
 
Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)
Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)
Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)
 

Recently uploaded

The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 

Recently uploaded (20)

The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 

Release Engineering Approaches to Resolving Internet Outages

  • 1. WHISPERS IN CHAOS J. PAUL REED RELEASE ENGINEERING APPROACHES MONITORAMA, 2017
  • 5. HOW DO YOU KNOW AN INCIDENT IS OCCURRING? @jpaulreed #monitorama
  • 7. MONITORING! (Not a trick question.) @jpaulreed #monitorama
  • 8. HOW DO YOU KNOW WHAT TO DO WHEN AN INCIDENT IS OCCURRING? @jpaulreed #monitorama
  • 9. J. PAUL REED • @JPAULREED ON • @SHIPSHOWPODCAST ALUM • 15+ YEARS IN BUILD/RELEASE ENGINEERING • NOW, A DEVOPS CONSULTANT™ • MASTERS OF SCIENCE CANDIDATE IN HUMAN FACTORS AND SYSTEMS SAFETY @jpaulreed #monitorama
  • 10. HOW DO YOU KNOW WHAT TO DO WHEN AN INCIDENT IS OCCURRING? @jpaulreed #monitorama
  • 11. Two Brain Systems “Automatic” / Quick Little to no effort No sense of voluntary control “System One” @jpaulreed #monitorama
  • 12. Two Brain Systems “Automatic” / Quick Little to no effort No sense of voluntary control “Effortful” Complex computations “Associated with the subjective experience of agency, choice, and concentration” “System One” “System Two” @jpaulreed #monitorama
  • 14. Two Problem Types Orient to the source of a sudden sound Complete: “bread and…” 2 + 2 = ? Find a strong move in chess
 (but only if you’re a chess master!) Focus on a particular voice in a crowded room Count the occurrence of the letter ‘a’ on this slide Fill out a tax form Check the validity of a complex logical argument “System One” “System Two” @jpaulreed #monitorama
  • 16. TRADE-OFFS UNDER PRESSURE: HEURISTICS AND OBSERVATIONS OF TEAMS RESOLVING INTERNET SERVICE OUTAGES John Allspaw LUND UNIVERSITY SWEDEN Date of submission: 2015-09-07 @jpaulreed #monitorama
  • 17. “THE INCIDENT” On December 4th, 2014, during the busy holiday shopping season, it was reported at 1:06 PM EST that the personalized homepage for logged-in users was experiencing loading issues. @jpaulreed #monitorama
  • 18. “Timelines, yadda, yadda” The Field Guide to Understanding Human Error Dekker@jpaulreed #monitorama
  • 19. Figure 19 - Infrastructure Engineer 1 timeline Diagnostic Activity Taking Action/Response HOLD is placed on the push queue ProdEng1 re-enables the sidebar, with blog turned off 13:06:44 13:15:00 13:30:00 13:45:00 14:00:00 14:15:00 14:30:00 ProdEng2 turns off homepage sidebar module HOLD is removed on the push queue Dashboard Access Staff Directory Access Princess Requests Production Site Requests @jpaulreed #monitorama
  • 20. Software: A Team Sport 38 Figure 8 - Timeline view of utterances in IRC, by participant Combined IRC utterances @jpaulreed #monitorama
  • 21. ALLSPAW IDENTIFIED THREE “MONITORS” (HEURISTICS) ENGINEERS USE TO WORK INCIDENTS @jpaulreed #monitorama
  • 22. Heuristic #1: Change “What has changed since the system was in a known-good state?” @jpaulreed #monitorama
  • 23. Heuristic #2: “Go Wide” Widen the search to any potential contributors imagined @jpaulreed #monitorama
  • 24. Heuristic #3: Convergent Searching Confirm/disqualify diagnoses by matching signals/symptoms @jpaulreed #monitorama
  • 25. Heuristic #3: Convergent Searching Confirm / Disqualify… …that comes to mind by matching signals or symptoms that appear similar A specific and past diagnosis A general and recent diagnosis @jpaulreed #monitorama
  • 26. Heuristic #3: Convergent Searching Confirm / Disqualify… …that comes to mind by matching signals or symptoms that appear similar A really painful incident-memory An incident still in your L1 cache @jpaulreed #monitorama
  • 27. “THE INCIDENT” The page load time increase was caused by: Figure 5 - Signed-in homepage with sidebar components CDN cache misses… Due to an HTTP 400
 status in an API… From a “closed store”… Referenced by a blog post
 in the sidebar @jpaulreed #monitorama
  • 28. IE2 PE2 IE5 IE1 IE1 PE3 IE3 PE3 PE3 ProdEng1 re-enables the sidebar, with blog turned off ProdEng2 turns off homepage sidebar module disable a CDN? Load balancer changes? Network changes? Wordpress issue? Frozen shop? Featured shop? PE1PE1 Varnish queuing? Featured staff shop? Sidebar loading staff shop? IE1IE1IE1IE1IE1IE1IE1 Varnish not caching? IE3 Database schema change? IE2 IE2 IE1Errors from Homepage sidebar IE2400 response code IE2 PublicShops_GetShopCards API method PE3 Featured shop loading OK IE2 “Shop 1234567 does not exist” Varnish queuing, not caching 400 responses? Stated hypothesis Critical relayed observation @jpaulreed #monitorama
  • 29. Bonus Heuristic: Testing the Fix @jpaulreed #monitorama
  • 30. b. 5 = I ALWAYS wait for tests to finish, I don't care how much time pressure there is. The results of question one were: 29 Yes, 3 No. (n=32) The results of question two can be seen in Figure 18. Figure 18 - Survey results: waiting for automated tests to finish Some follow-up discussion with one of the respondents about the questions helped to provide Bonus Heuristic: Testing the Fix @jpaulreed #monitorama
  • 31. b. 5 = I ALWAYS wait for tests to finish, I don't care how much time pressure there is. The results of question one were: 29 Yes, 3 No. (n=32) The results of question two can be seen in Figure 18. Figure 18 - Survey results: waiting for automated tests to finish Some follow-up discussion with one of the respondents about the questions helped to provide Bonus Heuristic: Testing the Fix YOLO, Every Day, Twice on Sundays? @jpaulreed #monitorama
  • 32. HOW DO YOU GET BETTER AT DETECTING AN INCIDENT IS OCCURRING? @jpaulreed #monitorama
  • 34. MONITOR THINGS BETTER! (Still not a trick question.) @jpaulreed #monitorama
  • 35. HOW DO YOU GET BETTER AT KNOWING WHAT TO DO WHEN AN INCIDENT IS OCCURRING?@jpaulreed #monitorama
  • 36. Elements of “Expertise” Experts use their knowledge base to Recognize typicality Make fine discriminations Use mental simulation Knowledge base also used to apply higher level rules @jpaulreed #monitorama
  • 37. “Seeing the Invisible” With experience, a person gains the ability to visualize how a situation developed and how to imagine how it’s going to turn out. Experts can see what is not there. Seeing the Invisible: Perceptual-Cognitive Aspects of Expertise Klein & Hoffman@jpaulreed #monitorama
  • 40. “Yeah, but Malcolm Gladwell…” Psychological Review 1993, Vol.100. No. 3, 363-406 Copyright 1993 by the American Psychological Association, Inc. 0033-295X/93/S3.00 The Role of Deliberate Practice in the Acquisition of Expert Performance K. Anders Ericsson, Ralf Th. Krampe, and Clemens Tesch-Romer The theoretical framework presented in thisarticle explainsexpert performanceasthe end resultof individuals' prolonged efforts to improve performance while negotiatingmotivational and external constraints. In most domains of expertise, individuals begin in their childhood a regimen of effortful activities (deliberate practice) designed to optimize improvement. Individual differences, even among elite performers, are closely related to assessed amounts of deliberate practice. Many characteristics once believed to reflect innate talent are actually the result of intense practice extended for a minimumof 10years. Analysisof expert performanceprovides uniqueevidence on the potential and limitsof extreme environmental adaptation and learning. Our civilization has always recognized exceptional individ- uals, whose performance in sports, the arts, and science is vastly superior to that of the rest of the population. Specula- tions on the causes of these individuals' extraordinary abilities and performanceare as old as the first records of their achieve- ments. Early accounts commonly attribute these individuals' outstanding performance to divine intervention, such as the influence of the stars or organs in their bodies, or to special gifts (Murray, 1989). As science progressed, these explanations became less acceptable. Contemporary accounts assert that the characteristics responsible for exceptional performance are in- nate and are genetically transmitted. The simplicity of these accounts is attractive, but more is because observed behavior is the result of interactions between environmental factors and genes during the extended period of development. Therefore, to better understand expert and ex- ceptional performance, we must require that the account spec- ify the different environmental factors that could selectively promote and facilitate the achievement ofsuch performance. In addition, recent research on expert performance and expertise (Chi, Glaser, & Farr, 1988; Ericsson &Smith, 199la) has shown that important characteristics ofexperts' superior performance are acquired through experience and that the effect of practice on performance is larger than earlier believed possible. For this reason, an account of exceptional performance must specify the environmental circumstances, such as the duration and @jpaulreed #monitorama
  • 41. “Yeah, but Malcolm Gladwell…” Why expert performance is special and cannot be extrapolated from studies of performance in the general population: A response to criticisms☆ K. Anders Ericsson Department of Psychology, Florida State University, Tallahassee, FL 32306-1270, USA a r t i c l e i n f o a b s t r a c t Article history: Received 1 December 2013 Accepted 1 December 2013 Available online 23 December 2013 Many misunderstandings about the expert-performance approach can be attributed to its unique methodology and theoretical concepts. This approach was established with case studies of the acquisition of expert memory with detailed experimental analysis of the mediating mechanisms. In contrast the traditional individual difference approach starts with the assumption of underlying general latent factors of cognitive ability and personality that correlate with performance across levels of acquired skill. My review rejects the assumption that data on large samples of beginners can be extrapolated to samples of elite and expert performers. Once we can agree on the criteria for reproducible objective expert performance and acceptable methodologies for collecting valid data. I believe that scientists will recognize the need for expert-performance approach to the study of expert performance, especially at the very highest levels of achievement. © 2013 Elsevier Inc. All rights reserved. Keywords: Expert performance Deliberate practice Long-term working memory Innate talent IQ Intelligence 45 (2014) 81–103 Contents lists available at ScienceDirect Intelligence @jpaulreed #monitorama
  • 42. Expert Performance Psychological Review 1993, Vol.100. No. 3, 363-406 Copyright 1993 by the American Psychological Association, Inc. 0033-295X/93/S3.00 The Role of Deliberate Practice in the Acquisition of Expert Performance K. Anders Ericsson, Ralf Th. Krampe, and Clemens Tesch-Romer The theoretical framework presented in thisarticle explainsexpert performanceasthe end resultof individuals' prolonged efforts to improve performance while negotiatingmotivational and external constraints. In most domains of expertise, individuals begin in their childhood a regimen of effortful activities (deliberate practice) designed to optimize improvement. Individual differences, even among elite performers, are closely related to assessed amounts of deliberate practice. Many characteristics once believed to reflect innate talent are actually the result of intense practice extended for a minimumof 10years. Analysisof expert performanceprovides uniqueevidence on the potential and limitsof extreme environmental adaptation and learning. Our civilization has always recognized exceptional individ- uals, whose performance in sports, the arts, and science is vastly superior to that of the rest of the population. Specula- tions on the causes of these individuals' extraordinary abilities and performanceare as old as the first records of their achieve- ments. Early accounts commonly attribute these individuals' outstanding performance to divine intervention, such as the influence of the stars or organs in their bodies, or to special gifts (Murray, 1989). As science progressed, these explanations became less acceptable. Contemporary accounts assert that the characteristics responsible for exceptional performance are in- nate and are genetically transmitted. The simplicity of these accounts is attractive, but more is because observed behavior is the result of interactions between environmental factors and genes during the extended period of development. Therefore, to better understand expert and ex- ceptional performance, we must require that the account spec- ify the different environmental factors that could selectively promote and facilitate the achievement ofsuch performance. In addition, recent research on expert performance and expertise (Chi, Glaser, & Farr, 1988; Ericsson &Smith, 199la) has shown that important characteristics ofexperts' superior performance are acquired through experience and that the effect of practice on performance is larger than earlier believed possible. For this reason, an account of exceptional performance must specify the environmental circumstances, such as the duration and @jpaulreed #monitorama
  • 43. Expertise in Other Crafts Immediately starting the APU Taking control of the airplane Not attempting to land at La Guardia Airport @jpaulreed #monitorama
  • 45. Transforming Experience into Expertise Personal Experiences: “the opportunity to be continually challenged” Directed Experiences: Receiving tutoring so as to be able to tutor Manufactured Experiences: training / simulation Vicarious Experiences: painful / memorable events we craft into stories we tell others @jpaulreed #monitorama
  • 46. Transforming Experience into Expertise Personal Experiences: “On-call” Directed Experiences: Training / Code Review / Pair Programming / Wikis+Runbooks Manufactured Experiences: Chaos Engineering / Game Days Vicarious Experiences: “I remember this one incident… where it was DNS.” @jpaulreed #monitorama
  • 52. “Cheaper, Better, Faster” Maximum Work for the Least Effort The Rasmussen Triangle @jpaulreed #monitorama
  • 53. “Cheaper, Better, Faster” Maximum Work for the Least Effort The Rasmussen Triangle @jpaulreed #monitorama
  • 54. “Cheaper, Better, Faster” Maximum Work for the Least Effort The Rasmussen Triangle @jpaulreed #monitorama
  • 55. “Cheaper, Better, Faster” Maximum Work for the Least Effort The Rasmussen Triangle @jpaulreed #monitorama
  • 56. Maslow’s SRE Hierarchy Figure III-1. Service Reliability Hierarchy Monitoring Site Reliability Engineering: How Google Runs Production Systems @jpaulreed #monitorama
  • 57. Just Two Questions Did at least one person learn one thing that will affect how they work in the future? Did at least half of the attendees say they would attend another debrief in the future? Debriefing Facilitation Guide Leading Groups at Etsy to Learn From Accidents Authors: John Allspaw, Morgan Evans, Daniel Schauenberg @jpaulreed #monitorama
  • 58. HOW DO YOU GET BETTER AT KNOWING WHAT TO DO WHEN AN INCIDENT IS OCCURRING?@jpaulreed #monitorama
  • 59. CREATE SPACE & EXPERIENCES TO FACILITATE THE CULTIVATION OF OURSELVES AND OUR TEAMS SO AS TO IMPROVE OUR HEURISTICS AT DETECTING WEAK SIGNALS AND AMBIGUITY IN THE COMPLEX SOCIO-TECHNICAL SYSTEMS WE OPERATE AND IN WHICH WE EXIST @jpaulreed #monitorama
  • 61. EXPERTISE TAKES TIME. AND SPACE. MAKE TIME AND SPACE. @jpaulreed #monitorama
  • 62. IT’S JUST US OUT HERE @jpaulreed #monitorama
  • 63. BE GOOD TO EACH OTHER ON OUR JOURNEY TO EXPERTISE@jpaulreed #monitorama
  • 65. Bibliography Allspaw, J. (2015). Trade-offs under pressure: heuristics and observations of teams resolving Internet service outages (Unpublished master’s thesis). Lund University, Lund, Sweden. Allspaw, J., Evans, M., & Shauenberg, D. (2016). Debriefing facilitation guide: leading groups at Etsy to learn from accidents. Retrieved January 23, 2017, from https://extfiles.etsy.com/ DebriefingFacilitationGuide.pdf Beyer, B., Jones, C., Petoff, J., & Murphy, N. R. (Eds). (2016). Site reliability engineering: how Google runs production systems. Sebastopol, California: O’Reilly Media. Ericsson, K. A., Trampe, R. T., & Tesch-Römer, C. (1993). The role of deliberate practice in the acquisition of expert performance. Psychological Review, 100(3), pp. 363-406. Ericsson, K. A. (2013). Why expert performance is special and cannot be extrapolated from studies of performance in the general population: a response to criticisms. Intelligence, 45, pp. 81-103. @jpaulreed #monitorama
  • 66. Bibliography Gladwell, M. (2008). Outliers: the story of success. New York, New York: Little, Brown and Company. Kahneman, D. (2011). Thinking, fast and slow. New York, New York: Farrar, Straus and Giroux. Klein, G. A., & Hoffman, R. R. (1992). Seeing the invisible: perceptual-cognitive aspects of expertise. In M. Rabinowitz (Ed.), Cognitive science foundations of instruction (pp. 203-226). Mahwah, New Jersey: Erlbaum. Rasmussen, J. (1997). Risk management in a dynamic society: a modelling problem. Safety Science, 27(2-3), pp. 183-213. Sullenberger, C. & Zaslow, J. (2009). Highest duty: my search for what really matters. New York, New York: Harper Collins. @jpaulreed #monitorama