The document discusses metrics-driven continuous delivery, focusing on the use of metrics throughout the development and delivery process. It emphasizes using architectural metrics in addition to functional metrics to determine, before deploying to production, whether a new version is likely to cause catastrophic failures. It also argues that continuous delivery pipelines should extend beyond production deployment, to evaluate user experience and gather feedback on new features beyond purely technical metrics.
2. Me me me!
• Been on both sides of the Dev…Ops fence
• Done the high-level strategy stuff and also regularly get my hands dirty
• Open-source contributor
• Regular meetup, conference etc. speaker and contributor to IT publications
• Co-organizer of DynamicInfraDays
• A bunch of other crazy stuff
3. The Plan
• Lightning intro to CD
• Re-framing Continuous Delivery
• Metrics, metrics, metrics…and some microservices
• Beyond go-live
6. Lightning Intro
• Ongoing flow of ideas/features/whatever to your users
• I.e. regular delivery of code to production systems
• Move away from “big bang” releases
9. Lightning Intro
Why?
• Faster feedback to developers
• More responsive engagement with customers
• Reduced risk of failures in production
10. 700 deployments / year
10+ deployments / day
50–60 deployments / day
Every 11.6 seconds
11. • Waterfall to agile: 3 years
• 220 Apps - 1 deployment per month
• “Every manual tester does automation”
• “We don’t log bugs. We fix them.”
• Measures are built in & visible to everyone
• Promote your wins! Educate your peers.
• Everyone can do continuous delivery.
12–14. Lightning Intro
Why?
• Faster feedback to developers
• More responsive engagement with customers
• Reduced risk of failures in production
• More engaged/user-focused teams
• “Experimental organization”
29. Re-framing CD
• Need to relate the pipeline to the flow of ideas through the system, not just the flow of commits
• Ideas can be commit-scoped (“change the sort order of the displayed list”) or much bigger (“support Japanese as a language in the application”)
32. Re-framing CD
• This is tricky, because teams will work on different parts of the feature at different speeds
• Can’t always “just put it live and let it wait there”
35. Re-framing CD
• No easy answers
• Dark launches (see the sketch below)
• “So maybe the idea of software projects is wrong.”
– Dave Farley, http://blog.xebialabs.com/2016/04/07/theres-no-c-devops/
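Dark launches are one way to keep an unfinished idea flowing to production: run the new code path alongside the old one on real traffic, serve the old result, and only log how the new one behaves. A minimal sketch, where the legacy_search/new_search implementations are hypothetical stand-ins:

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("dark-launch")

def legacy_search(query):
    # current production implementation
    return ["result-a", "result-b"]

def new_search(query):
    # new implementation being dark-launched
    return ["result-a", "result-b", "result-c"]

def search(query):
    """Serve the legacy result; exercise the new path in the dark."""
    result = legacy_search(query)
    try:
        candidate = new_search(query)
        if candidate != result:
            log.info("dark launch mismatch for %r: %r vs %r",
                     query, result, candidate)
    except Exception:
        # A failure in the dark path must never affect real users.
        log.exception("dark launch code path failed for %r", query)
    return result

print(search("shoes"))
```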
36–38. Re-framing CD
• Even if the idea of software projects is wrong, ongoing tweaking only goes so far
39. Re-framing CD
• You’ll likely have multiple pipeline patterns in your organization
• Both across teams and within a team, over time
• Think about your approach to tweaking vs. innovation
43–44. Metrics, metrics, metrics
• Goal of the CD pipeline: get enough data for you to be confident that you can go live (see the sketch below)
– …and then actually deploy the candidate if you decide to go ahead
• What can we be confident of?
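One way to make that decision explicit is a final pipeline stage that aggregates whatever evidence the earlier stages produced into a single go/no-go verdict. A minimal sketch; the check names and the evidence layout are illustrative assumptions, not any particular CI product's API:

```python
# Final pipeline stage: turn the evidence gathered by earlier stages
# into a single, explicit go/no-go decision.

def go_no_go(evidence: dict[str, bool]) -> bool:
    """Return True only if every piece of evidence supports going live."""
    for check, passed in evidence.items():
        print(f"{check:<28} {'PASS' if passed else 'FAIL'}")
    return all(evidence.values())

if __name__ == "__main__":
    evidence = {
        "functional tests green": True,
        "no new exception types": True,
        "SQL count within baseline": False,  # e.g. build 19 below
    }
    print("GO" if go_no_go(evidence) else "NO-GO")
```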
48–50. Metrics, metrics, metrics
• Confidence that the new version will be “better” ≠ confidence that we will avoid catastrophic failure
• Determining accurately whether the new version will be better is hard
• Functional and code-level metrics are poor indicators of catastrophic failure avoidance!
51–53. Metrics, metrics, metrics
• Idea: use architectural metrics to look for indications of catastrophic failure risk
• Trick: don’t need production-equivalent values – can use deviation from the baseline! (sketch below)
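The trick works because absolute metric values in a test environment are not comparable to production, but relative changes against the previous build are meaningful anywhere. A minimal sketch, assuming per-build metrics have already been collected; the metric names and the 20% tolerance are illustrative assumptions:

```python
# Compare a candidate build's architectural metrics against a baseline
# build, flagging any metric that deviates beyond a relative tolerance.
# Absolute values need not match production; only the deviation matters.

BASELINE = {"sql_statements": 12, "exceptions": 0, "cpu_ms": 120}
CANDIDATE = {"sql_statements": 75, "exceptions": 0, "cpu_ms": 230}

def deviations(baseline, candidate, tolerance=0.20):
    """Yield (metric, base, new) for every metric outside tolerance."""
    for metric, base in baseline.items():
        new = candidate[metric]
        # Guard against a zero baseline: any increase counts as deviation.
        limit = base * (1 + tolerance) if base else 0
        if new > limit:
            yield metric, base, new

for metric, base, new in deviations(BASELINE, CANDIDATE):
    print(f"architectural regression: {metric} went from {base} to {new}")
```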
54–61. Metrics, metrics, metrics
Test & Monitoring Framework Results + Architectural Data

Build #  | Test Case    | Status | # SQL | # Exceptions | CPU
Build 17 | testPurchase | OK     | 12    | 0            | 120ms
Build 17 | testSearch   | OK     | 3     | 1            | 68ms
Build 18 | testPurchase | FAILED | 12    | 5            | 60ms
Build 18 | testSearch   | OK     | 3     | 1            | 68ms
Build 19 | testPurchase | OK     | 75    | 0            | 230ms
Build 19 | testSearch   | OK     | 3     | 1            | 68ms
Build 20 | testPurchase | OK     | 12    | 0            | 120ms
Build 20 | testSearch   | OK     | 3     | 1            | 68ms

• Build 18: the test results identified a functional regression
• Looking behind the scenes, the five new exceptions are probably the reason for the failed test
• Build 19: the functional problem is solved, but now we have an architectural regression – 75 SQL statements and 230ms CPU for a test that previously needed 12 and 120ms (see the sketch below)
• Build 20: now we have both the functional and the architectural confidence
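The build 19 row is the instructive one: every test is green, yet testPurchase suddenly issues 75 SQL statements. A minimal sketch of catching exactly that case from per-test metrics, using the numbers from the table above; the data layout, the 20% tolerance, and the function names are illustrative assumptions, not a specific framework's API:

```python
# Detect architectural regressions that functional results alone miss:
# build 19 is green, but testPurchase suddenly issues 75 SQL statements.

HISTORY = {
    17: {"testPurchase": {"status": "OK",     "sql": 12, "exceptions": 0, "cpu_ms": 120},
         "testSearch":   {"status": "OK",     "sql": 3,  "exceptions": 1, "cpu_ms": 68}},
    18: {"testPurchase": {"status": "FAILED", "sql": 12, "exceptions": 5, "cpu_ms": 60},
         "testSearch":   {"status": "OK",     "sql": 3,  "exceptions": 1, "cpu_ms": 68}},
    19: {"testPurchase": {"status": "OK",     "sql": 75, "exceptions": 0, "cpu_ms": 230},
         "testSearch":   {"status": "OK",     "sql": 3,  "exceptions": 1, "cpu_ms": 68}},
}

def architectural_regressions(history, build, reference, tolerance=0.20):
    """Compare a green build's per-test metrics against a known-good build."""
    for test, current in history[build].items():
        if current["status"] != "OK":
            continue  # functional failures are already caught
        good = history[reference][test]
        for metric in ("sql", "exceptions", "cpu_ms"):
            if current[metric] > (1 + tolerance) * good[metric]:
                yield test, metric, good[metric], current[metric]

for test, metric, old, new in architectural_regressions(HISTORY, 19, 17):
    print(f"{test}: {metric} {old} -> {new}")
# testPurchase: sql 12 -> 75
# testPurchase: cpu_ms 120 -> 230
```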
63. What’s commonly measured:
• # Test Failures
• Overall Duration

What’s out there:
• Execution Time per test
• # Log Messages
• # HTTP 4xx/5xx
• Request/Response Size
• Page Load/Rendering Time
• # calls to API
• # executed SQL statements
• # Web Service Calls
• # JMS Messages
• # Objects Allocated
• # Exceptions
• …
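Many of the metrics in the second list can be collected with plain counters hooked into the test run, without an APM product. A minimal sketch, assuming a hypothetical execute_sql() choke point that the code under test funnels database access through:

```python
import time
from contextlib import contextmanager

class TestMetrics:
    """Per-test counters for a few of the metrics listed above."""
    def __init__(self):
        self.sql_statements = 0
        self.exceptions = 0
        self.duration_ms = 0.0

@contextmanager
def measured(metrics: TestMetrics):
    """Collect metrics around a block of test code."""
    start = time.perf_counter()
    try:
        yield metrics
    except Exception:
        metrics.exceptions += 1
        raise
    finally:
        metrics.duration_ms = (time.perf_counter() - start) * 1000

def execute_sql(metrics, statement):
    metrics.sql_statements += 1  # count every statement the test triggers
    # ... actually run the statement against the test database ...

m = TestMetrics()
with measured(m):
    execute_sql(m, "SELECT * FROM products")
    execute_sql(m, "SELECT * FROM prices")
print(m.sql_statements, m.exceptions, round(m.duration_ms, 1))
```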
65. Metrics, metrics, metrics
• 26.7s execution time
• 33 calls to the same web service
• 171 SQL queries through LINQ, requesting similar data for each call
• Direct access to the DB from frontend logic
69. Metrics, metrics, metrics
• Pipelines are about gathering data to allow you to make a confident go/no-go decision
• Confidence that the new version will improve things is harder to come by than confidence that it won’t go bang
• Architectural metrics are very useful for determining whether the new version is likely to go bang
• You probably already have the tools you need to integrate this kind of data into your pipeline
71. Beyond go-live
1. “CD: ongoing flow of ideas to your users”
2. “Gaining confidence that the new version is better than the current one before production is hard”
77. Beyond go-live
• Plenty of technical metrics available, but how do these relate to user experience?
• Client-side instrumentation?
• Feedback popups?
• Tracking social mentions?
• “Traditional” UX testing?
79. Beyond go-live
• This isn’t easy!
• “Our experience at Microsoft is no different: only about 1/3 of ideas improve the metrics they were designed to improve [...] at Amazon, for example, it is a common practice to evaluate every new feature, yet the success rate is below 50%.”
– http://robotics.stanford.edu/~ronnyk/ExPThinkWeek2009Public.pdf
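Evaluating whether an idea “improved the metrics it was designed to improve” usually means a controlled experiment on real users. A minimal sketch of comparing a conversion metric between control and treatment groups with a two-proportion z-test; the counts are invented for illustration:

```python
# Minimal controlled-experiment evaluation: did the new feature move the
# metric it was designed to move? Two-proportion z-test, no dependencies.

from math import erf, sqrt

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Return (uplift, two-sided p-value) for two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Normal CDF via erf; two-sided p-value.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return p_b - p_a, p_value

uplift, p = two_proportion_z(conv_a=480, n_a=10_000,   # control
                             conv_b=540, n_b=10_000)   # new feature
print(f"uplift: {uplift:+.2%}, p-value: {p:.3f}")
```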
81. Takeaways
• CD is about “idea-to-users”, not just “commit-to-prod”
• Depending on the relationship between ideas and commits, you’re likely to have different pipeline patterns across teams, and within the same team over time
• “Tweaking phase” vs. “Innovation phase”
82. Takeaways (2)
• The automation phase of your CD pipeline is about gathering metrics to allow you to make an accurate go/no-go decision
• What kind of confidence are you looking for? Confidence that things won’t “go bang” is easier to obtain than confidence that things will improve
• Architectural metrics are a big help in determining whether things are likely to go bang
83. Takeaways (3)
• The capability to release quickly and with confidence that CD provides is the foundation for a more user-focused approach to building software
• Think about how you would measure user experience in your production systems
• Moving from a “Just Build What We Tell You” culture to an experimental organization requires significant culture change
84. Just for fun ;-)
• The “zero-width joiner”
– https://en.wikipedia.org/wiki/Zero-width_joiner
• Unicode character U+200D
• Causes characters to be connected
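To make the joiner visible: a quick illustration of U+200D gluing emoji into one glyph on platforms that define the ZWJ sequence.

```python
# The zero-width joiner (U+200D) between emoji asks the renderer to
# display them as one combined glyph where a ZWJ sequence is defined.
zwj = "\u200d"
family = "\N{MAN}" + zwj + "\N{WOMAN}" + zwj + "\N{GIRL}"
print(family)       # renders as a single family emoji on supporting platforms
print(len(family))  # 5 code points: three emoji plus two joiners
```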