The document discusses metrics-driven continuous delivery, focusing on the use of metrics throughout the development and delivery process. It emphasizes using architectural metrics in addition to functional metrics to determine, before deploying to production, whether a new version is likely to cause catastrophic failures. It also argues that continuous delivery pipelines should extend beyond production deployment, to evaluate user experience and gather feedback on new features beyond purely technical metrics.
2. Me me me!
• Been on both sides of the Dev…Ops fence
• Done the high-level strategy stuff and also regularly get my hands dirty
• Open-source contributor
• Regular meetup, conference etc. speaker and contributor to IT publications
• Co-organizer of DynamicInfraDays
• A bunch of other crazy stuff
3. The Plan
• Lightning intro to CD
• Re-framing Continuous Delivery
• Metrics, metrics, metrics…and some microservices
• Beyond go-live
6. Lightning Intro
• Ongoing flow of ideas/features/whatever to your users
• I.e. regular delivery of code to production systems
• Move away from “big bang” releases
9. Lightning Intro
Why?
• Faster feedback to developers
• More responsive engagement with customers
• Reduced risk of failures in production
10. 700 deployments / year
10+ deployments / day
50–60 deployments / day
Every 11.6 seconds
11. • Waterfall to agile: 3 years
• 220 Apps - 1 deployment per month
• “Every manual tester does automation”
• “We don’t log bugs. We fix them.”
• Measures are built in & visible to everyone
• Promote your wins! Educate your peers.
• Everyone can do continuous delivery.
12–14. Lightning Intro
Why?
• Faster feedback to developers
• More responsive engagement with customers
• Reduced risk of failures in production
• More engaged/user-focused teams
• “Experimental organization”
29. Re-framing CD
• Need to relate the pipeline to the flow of ideas through the system, not just the flow of commits
• Ideas can be commit-scoped (“change the sort order of the displayed list”) or much bigger (“support Japanese as a language in the application”)
32. Re-framing CD
• This is tricky, because teams will work on different parts of the feature at different speeds
• Can’t always “just put it live and let it wait there”
35. Re-framing CD
• No easy answers
• Dark launches (see the sketch below)
• “So maybe the idea of software projects is wrong.”
– Dave Farley, http://blog.xebialabs.com/2016/04/07/theres-no-c-devops/
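Dark launches are one way to keep an unfinished idea flowing to production: run the new code path alongside the old one on real traffic, serve the old result, and only log how the new one behaves. A minimal sketch, where the legacy_search/new_search implementations are hypothetical stand-ins:

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("dark-launch")

def legacy_search(query):
    # current production implementation
    return ["result-a", "result-b"]

def new_search(query):
    # new implementation being dark-launched
    return ["result-a", "result-b", "result-c"]

def search(query):
    """Serve the legacy result; exercise the new path in the dark."""
    result = legacy_search(query)
    try:
        candidate = new_search(query)
        if candidate != result:
            log.info("dark launch mismatch for %r: %r vs %r",
                     query, result, candidate)
    except Exception:
        # A failure in the dark path must never affect real users.
        log.exception("dark launch code path failed for %r", query)
    return result

print(search("shoes"))
```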
36–38. Re-framing CD
• Even if the idea of software projects is wrong, ongoing tweaking only goes so far
39. Re-framing CD
• You’ll likely have multiple pipeline patterns in your organization
• Both across teams and within a team, over time
• Think about your approach to tweaking vs. innovation
43–44. Metrics, metrics, metrics
• Goal of the CD pipeline: get enough data for you to be confident that you can go live (see the sketch below)
– …and then actually deploy the candidate if you decide to go ahead
• What can we be confident of?
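One way to make that decision explicit is a final pipeline stage that aggregates whatever evidence the earlier stages produced into a single go/no-go verdict. A minimal sketch; the check names and the evidence layout are illustrative assumptions, not any particular CI product's API:

```python
# Final pipeline stage: turn the evidence gathered by earlier stages
# into a single, explicit go/no-go decision.

def go_no_go(evidence: dict[str, bool]) -> bool:
    """Return True only if every piece of evidence supports going live."""
    for check, passed in evidence.items():
        print(f"{check:<28} {'PASS' if passed else 'FAIL'}")
    return all(evidence.values())

if __name__ == "__main__":
    evidence = {
        "functional tests green": True,
        "no new exception types": True,
        "SQL count within baseline": False,  # e.g. build 19 below
    }
    print("GO" if go_no_go(evidence) else "NO-GO")
```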
48–50. Metrics, metrics, metrics
• Confidence that the new version will be “better” ≠ confidence that we will avoid catastrophic failure
• Determining accurately whether the new version will be better is hard
• Functional and code-level metrics are poor indicators of catastrophic failure avoidance!
51–53. Metrics, metrics, metrics
• Idea: use architectural metrics to look for indications of catastrophic failure risk
• Trick: don’t need production-equivalent values – can use deviation from the baseline! (sketch below)
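The trick works because absolute metric values in a test environment are not comparable to production, but relative changes against the previous build are meaningful anywhere. A minimal sketch, assuming per-build metrics have already been collected; the metric names and the 20% tolerance are illustrative assumptions:

```python
# Compare a candidate build's architectural metrics against a baseline
# build, flagging any metric that deviates beyond a relative tolerance.
# Absolute values need not match production; only the deviation matters.

BASELINE = {"sql_statements": 12, "exceptions": 0, "cpu_ms": 120}
CANDIDATE = {"sql_statements": 75, "exceptions": 0, "cpu_ms": 230}

def deviations(baseline, candidate, tolerance=0.20):
    """Yield (metric, base, new) for every metric outside tolerance."""
    for metric, base in baseline.items():
        new = candidate[metric]
        # Guard against a zero baseline: any increase counts as deviation.
        limit = base * (1 + tolerance) if base else 0
        if new > limit:
            yield metric, base, new

for metric, base, new in deviations(BASELINE, CANDIDATE):
    print(f"architectural regression: {metric} went from {base} to {new}")
```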
54–61. Metrics, metrics, metrics
Test & Monitoring Framework Results + Architectural Data

Build #  | Test Case    | Status | # SQL | # Exceptions | CPU
Build 17 | testPurchase | OK     | 12    | 0            | 120ms
Build 17 | testSearch   | OK     | 3     | 1            | 68ms
Build 18 | testPurchase | FAILED | 12    | 5            | 60ms
Build 18 | testSearch   | OK     | 3     | 1            | 68ms
Build 19 | testPurchase | OK     | 75    | 0            | 230ms
Build 19 | testSearch   | OK     | 3     | 1            | 68ms
Build 20 | testPurchase | OK     | 12    | 0            | 120ms
Build 20 | testSearch   | OK     | 3     | 1            | 68ms

• Build 18: the test results identified a functional regression
• Looking behind the scenes, the five new exceptions are probably the reason for the failed test
• Build 19: the functional problem is solved, but now we have an architectural regression – 75 SQL statements and 230ms CPU for a test that previously needed 12 and 120ms (see the sketch below)
• Build 20: now we have both the functional and the architectural confidence
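The build 19 row is the instructive one: every test is green, yet testPurchase suddenly issues 75 SQL statements. A minimal sketch of catching exactly that case from per-test metrics, using the numbers from the table above; the data layout, the 20% tolerance, and the function names are illustrative assumptions, not a specific framework's API:

```python
# Detect architectural regressions that functional results alone miss:
# build 19 is green, but testPurchase suddenly issues 75 SQL statements.

HISTORY = {
    17: {"testPurchase": {"status": "OK",     "sql": 12, "exceptions": 0, "cpu_ms": 120},
         "testSearch":   {"status": "OK",     "sql": 3,  "exceptions": 1, "cpu_ms": 68}},
    18: {"testPurchase": {"status": "FAILED", "sql": 12, "exceptions": 5, "cpu_ms": 60},
         "testSearch":   {"status": "OK",     "sql": 3,  "exceptions": 1, "cpu_ms": 68}},
    19: {"testPurchase": {"status": "OK",     "sql": 75, "exceptions": 0, "cpu_ms": 230},
         "testSearch":   {"status": "OK",     "sql": 3,  "exceptions": 1, "cpu_ms": 68}},
}

def architectural_regressions(history, build, reference, tolerance=0.20):
    """Compare a green build's per-test metrics against a known-good build."""
    for test, current in history[build].items():
        if current["status"] != "OK":
            continue  # functional failures are already caught
        good = history[reference][test]
        for metric in ("sql", "exceptions", "cpu_ms"):
            if current[metric] > (1 + tolerance) * good[metric]:
                yield test, metric, good[metric], current[metric]

for test, metric, old, new in architectural_regressions(HISTORY, 19, 17):
    print(f"{test}: {metric} {old} -> {new}")
# testPurchase: sql 12 -> 75
# testPurchase: cpu_ms 120 -> 230
```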
63. What’s commonly measured:
• # Test Failures
• Overall Duration

What’s out there:
• Execution Time per test
• # Log Messages
• # HTTP 4xx/5xx
• Request/Response Size
• Page Load/Rendering Time
• # calls to API
• # executed SQL statements
• # Web Service Calls
• # JMS Messages
• # Objects Allocated
• # Exceptions
• …
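Many of the metrics in the second list can be collected with plain counters hooked into the test run, without an APM product. A minimal sketch, assuming a hypothetical execute_sql() choke point that the code under test funnels database access through:

```python
import time
from contextlib import contextmanager

class TestMetrics:
    """Per-test counters for a few of the metrics listed above."""
    def __init__(self):
        self.sql_statements = 0
        self.exceptions = 0
        self.duration_ms = 0.0

@contextmanager
def measured(metrics: TestMetrics):
    """Collect metrics around a block of test code."""
    start = time.perf_counter()
    try:
        yield metrics
    except Exception:
        metrics.exceptions += 1
        raise
    finally:
        metrics.duration_ms = (time.perf_counter() - start) * 1000

def execute_sql(metrics, statement):
    metrics.sql_statements += 1  # count every statement the test triggers
    # ... actually run the statement against the test database ...

m = TestMetrics()
with measured(m):
    execute_sql(m, "SELECT * FROM products")
    execute_sql(m, "SELECT * FROM prices")
print(m.sql_statements, m.exceptions, round(m.duration_ms, 1))
```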
65. Metrics, metrics, metrics
• 26.7s execution time
• 33 calls to the same web service
• 171 SQL queries through LINQ, requesting similar data for each call
• Direct access to the DB from frontend logic
69. Metrics, metrics, metrics
• Pipelines are about gathering data to allow you to make a confident go/no-go decision
• Confidence that the new version will improve things is harder to come by than confidence that it won’t go bang
• Architectural metrics are very useful for determining whether the new version is likely to go bang
• You probably already have the tools you need to integrate this kind of data into your pipeline
71. Beyond go-live
1. “CD: ongoing flow of ideas to your users”
2. “Gaining confidence that the new version is better than the current one before production is hard”
77. Beyond go-live
• Plenty of technical metrics available, but how do these relate to user experience?
• Client-side instrumentation?
• Feedback popups?
• Tracking social mentions?
• “Traditional” UX testing?
79. Beyond go-live
• This isn’t easy!
• “Our experience at Microsoft is no different: only about 1/3 of ideas improve the metrics they were designed to improve [...] at Amazon, for example, it is a common practice to evaluate every new feature, yet the success rate is below 50%.”
– http://robotics.stanford.edu/~ronnyk/ExPThinkWeek2009Public.pdf
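Evaluating whether an idea “improved the metrics it was designed to improve” usually means a controlled experiment on real users. A minimal sketch of comparing a conversion metric between control and treatment groups with a two-proportion z-test; the counts are invented for illustration:

```python
# Minimal controlled-experiment evaluation: did the new feature move the
# metric it was designed to move? Two-proportion z-test, no dependencies.

from math import erf, sqrt

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Return (uplift, two-sided p-value) for two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Normal CDF via erf; two-sided p-value.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return p_b - p_a, p_value

uplift, p = two_proportion_z(conv_a=480, n_a=10_000,   # control
                             conv_b=540, n_b=10_000)   # new feature
print(f"uplift: {uplift:+.2%}, p-value: {p:.3f}")
```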
81. Takeaways
• CD is about “idea-to-users”, not just “commit-to-prod”
• Depending on the relationship between ideas and commits, you’re likely to have different pipeline patterns across teams, and within the same team over time
• “Tweaking phase” vs. “Innovation phase”
82. Takeaways (2)
• The automation phase of your CD pipeline is about gathering metrics to allow you to make an accurate go/no-go decision
• What kind of confidence are you looking for? Confidence that things won’t “go bang” is easier to obtain than confidence that things will improve
• Architectural metrics are a big help in determining whether things are likely to go bang
83. Takeaways (3)
• The capability to release quickly and with confidence that CD provides is the foundation for a more user-focused approach to building software
• Think about how you would measure user experience in your production systems
• Moving from a “Just Build What We Tell You” culture to an experimental organization requires significant culture change
84. Just for fun ;-)
• The “zero-width joiner”
– https://en.wikipedia.org/wiki/Zero-width_joiner
• Unicode character U+200D
• Causes characters to be connected
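To make the joiner visible: a quick illustration of U+200D gluing emoji into one glyph on platforms that define the ZWJ sequence.

```python
# The zero-width joiner (U+200D) between emoji asks the renderer to
# display them as one combined glyph where a ZWJ sequence is defined.
zwj = "\u200d"
family = "\N{MAN}" + zwj + "\N{WOMAN}" + zwj + "\N{GIRL}"
print(family)       # renders as a single family emoji on supporting platforms
print(len(family))  # 5 code points: three emoji plus two joiners
```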