Summary
Advanced planning techniques that deliver on the promise of predictability based on empirical evidence and improve organizational agility.
Outline
Two things are certain about estimates:
Estimates are always wrong
You will spend more time estimating than you should, time you could otherwise have used to do the work.
The Agile Manifesto's values and principles do not mention “estimates” anywhere, not even once. Yet the rapid adoption of techniques labeled “Agile Estimation” puzzles me. In my experience as a practitioner, advisor, and coach, I have seen very limited benefits from estimating and often find that estimates do more harm than good. There are, however, legitimate business concerns that need active management. Estimates hinder real business agility by offering temporary comfort through plausible but highly improbable plans.
The following is the outline of my talk:
Opening and Introduction
So you think you can estimate: Overview of estimating biases with references to current research in software context.
Anchoring
Impact of irrelevant and misleading information
Temporal distance: the further out in the future you estimate, the more optimistic your estimate
Relative Size estimation is prone to Directional bias and Assimilation Effect
Sequence reference bias: Biases introduced depending on number sequence used for story pointing
Recollection bias (flawed memory)
Motivational bias
Exposure to biases is unavoidably high and there is no escaping it.
Estimates anchor benefits - Why estimates make me frown?
Applicability of Story point estimates.
Story points are applicable only in fully cross-functional teams that can move a request from business to production all by themselves, or in scaled contexts where teams are fully cross-functional feature teams. In all other cases story points are inapplicable.
Inapplicability in scaled contexts with many dependent teams
Introduction to cycle time
How to gather empirical evidence in non-ideal contexts? - Single team
What happens in a multi-team environment where teams cannot be fully cross-functional and have shared dependencies?
I will share principles via a case study in which I used cycle time measurements and a dependency management board to actively develop empirical cycle time evidence and track a major game release.
Conclusion
Q&A
Note: This 45-minute talk is fast-paced and assumes that participants are sound on their fundamentals.
3. 2 things are certain about estimating
You will always be WRONG.
You will spend more time estimating, time that you could have used to do the work instead.
7. EXPERT JUDGEMENT BASED
STATISTICAL ANALYSIS BASED
EXPRESSED AS A “RANGE”
EXPRESSED IN PROBABILITIES OR PERCENTILE CONFIDENCE
Guesses on Known-Knowns and Tolerance for Known-Unknown
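A minimal sketch of what a statistical forecast "expressed in percentile confidence" could look like. The cycle-time samples are hypothetical, and the nearest-rank percentile is just one common choice, not necessarily the method used in the talk.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical cycle times (in days) of recently finished stories.
cycle_times_days = [3, 5, 4, 8, 6, 13, 5, 7, 4, 9]

p50 = percentile(cycle_times_days, 50)  # half of the stories finished within this many days
p85 = percentile(cycle_times_days, 85)  # 85% of the stories finished within this many days
print(f"50% confidence: {p50} days, 85% confidence: {p85} days")
```

Reading the result as "85% of past stories finished within p85 days" gives a range with a stated confidence instead of a single-point guess.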
8. WHAT YOU KNOW YOU KNOW
What you know you don’t know
What you don’t know, you don’t know
10. Diminishing Returns
When people are asked to think in detail about how they plan to finish a task, there is an ever greater tendency to underestimate its duration
- Buehler & Griffin, 2003
[Chart: Accuracy (up to 100%) versus Effort Spent Estimating]
11. THE CONFIDENCE ESTIMATORS
HAVE IN THEIR OWN ESTIMATES IS
UNJUSTIFIABLY HIGH
- JØRGENSEN, M., TEIGEN, K. J., AND MØLOKKEN-ØSTVOLD, K. BETTER SURE THAN SAFE?
OVERCONFIDENCE IN JUDGEMENT BASED SOFTWARE DEVELOPMENT EFFORT PREDICTION INTERVALS.
THE JOURNAL OF SYSTEMS AND SOFTWARE, 70, (2004)
12. Anchors
WHEN ASKED TO MAKE A DECISION UNDER UNCERTAINTY, WE (HUMANS) GRASP AT ANY INFORMATION OFFERED (THE ANCHOR). FURTHER ADJUSTMENTS TO ARRIVE AT THE DECISION (E.G. AN ESTIMATE) ARE MADE WITH REFERENCE TO THIS ANCHOR, AND THESE ADJUSTMENTS ARE UNABLE TO COMPENSATE FOR THE NEGATIVE EFFECTS OF THE ANCHOR.
ANCHORS OPERATE UNINTENTIONALLY AND WORK EVEN WHEN PEOPLE ARE FOREWARNED.
It's just ... a button
14. I’m a Lead, I’m better than them
Therefore estimate for lower size
The lead will obviously work on this story
my estimate does not matter
I will go with the “team” .. (lead)
COMPETENCE
15. YOUR ESTIMATE FOR THE WORK CLEARLY IMPLICATES YOU
An estimate is often treated as a reflection of a person’s ability. People frequently under-estimate to “impress”.
SEE: THE SOCIAL IMPLICATIONS OF PLANNING: HOW PUBLIC PREDICTIONS BIAS FUTURE PLANS: STEPHANIE P. PEZZO, MARK V. PEZZO, AND ERIC R. STONE: JOURNAL OF EXPERIMENTAL SOCIAL PSYCHOLOGY, 20
16. Avoid Irrelevant and Misleading Information
Goldilocks User Story
As a Goldilocks User Story Card
I want to have written on me - not too little, and not too much, but just right
So that my author speaks - for not too little, and for not too long, but just enough
“HOW TO AVOID IMPACT FROM IRRELEVANT AND MISLEADING INFORMATION WHEN ESTIMATING SOFTWARE DEVELOPMENT EFFORT”: JØRGENSEN AND GRIMSTAD 2008
17. Temporal Distance
Case 1:
Three months from today, you have to write your name on a white index card with a blue fountain pen.
Estimate: ______ ?
An abstract, decontextualized plan (find pen, find index card, write name, done!) biases estimates toward optimism.
Case 2:
Right now!! You have to write your name on a white index card with a blue fountain pen.
Estimate: ______ ?
Increased specificity makes the task contextualized: you think in terms of impediments. (Do I have a blue ink pen? Where is my “white” index card?)
18. Bye, Bye, Big Upfront Estimating!
It is un-safe (intended) to make plans based on estimates of backlog items that will not be worked on until further into the future (> 2 months).
Prefer shorter releases or, as Scrum says, always be “potentially shippable” at the end of every sprint. Easier said than done, I know!
Don’t keep a large number of backlog items in your product backlog.
Don’t blindly believe in the “magic” of the Fibonacci sequence.
“But my management wants long-term predictions” - they have no business being in management
19.
20. AN ELLIPSE IS MORE SIMILAR TO A CIRCLE THAN A CIRCLE IS TO AN ELLIPSE
21.
22. MYTH: “RESEARCH SHOWS PEOPLE ARE BETTER AT RELATIVE THAN ABSOLUTE ESTIMATION”
http://guide.agilealliance.org/guide/relative.html
23. Relative Size Estimation
Directional Bias
- YOU ARE LIKELY TO GET A DIFFERENT RESULT IF YOU SWITCH YOUR ‘TARGET’ STORY FOR ESTIMATION WITH THE REFERENCE STORY FOR COMPARISON.
Assimilation Effect: Is an 8-point User Story 8 times more .. <waving hands here!> than a 1-point User Story?
- REMEDY: CREATE A REFERENCE STORY SET OF 2 OR 3 USER STORIES FOR EACH SIZE IN YOUR SET AND REFRESH YOUR REFERENCES EVERY FEW SPRINTS.
Sequence Reference Bias: Your previous estimate of a task/story acts as an implicit ‘reference’ for the next story/task you estimate.
- “RELATIVE ESTIMATION OF SOFTWARE DEVELOPMENT EFFORT: IT MATTERS WITH WHAT AND HOW YOU COMPARE”: JØRGENSEN (2013)
24. “EVEN IF THEY ARE AWARE THAT THEY DID NOT FINISH THE
TASK WHEN PLANNED, THEIR MEMORY OF HOW LONG IT
TOOK TO COMPLETE THE TASK ITSELF MIGHT STILL BE AN
UNDERESTIMATION OF THE ACTUAL DURATION”
ROY, CHRISTENFELD, AND MCKENZIE, 2005: UNDERESTIMATING THE DURATION OF FUTURE EVENTS: MEMORY INCORRECTLY USED OR MEMORY BIAS?
25. [Chart: benefit over time. Labels: estimate; “expected” or “accepted” tolerance zone; time spent to estimate; cost to estimate; start “work”]
ANCHORING EFFECT
Awareness of the estimate has a “pinching” effect, eroding benefits from outcomes when work is completed outside of the tolerance zone.
26. WITH WHAT YOU KNOW NOW,
WOULD YOU STILL GO BACK AND
ASK FOR ESTIMATES?
27. "Teams first struggle with estimation, then get quite good at it, then
reach a point where they don't need it” - @martinfowler
29. Story Points : Applicability
Fully Cross-Functional Teams
Idea to Delivery can be achieved by a single team
In Scaled environments: Feature Teams
All of the above implies that teams have DevOps capability or healthy processes and practices with Operations teams, and of course no manual testing hand-offs.
REMEMBER
Story Points are not comparable between teams. (Apple points and Orange points)
Story points from different teams cannot be added together to project an overall release.
30. ANALYSIS
DESIGN & DEV
TESTING
OPS AND RELEASE READY
FRONT-END
MIDDLE TIER
INTEGRATION TIER
BACK-END
End-to-End Functionality
Potentially Shippable Product Increment
5 3 8 5
21 Story Points
31. 5 3 8 5
21 Story Points/Sprint
110 Story Points
~ 5-6 Sprints
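The arithmetic behind this projection, as a small sketch. The numbers come from the slide; the function name is hypothetical, and the constant-velocity assumption is exactly what makes such a projection fragile.

```python
import math

def sprints_needed(backlog_points, velocity_per_sprint):
    """Whole sprints needed to finish the backlog, assuming constant velocity."""
    if velocity_per_sprint <= 0:
        raise ValueError("velocity must be positive")
    return math.ceil(backlog_points / velocity_per_sprint)

story_estimates = [5, 3, 8, 5]    # stories finished in one sprint (from the slide)
velocity = sum(story_estimates)   # 21 story points per sprint
backlog = 110                     # story points remaining (from the slide)

exact = backlog / velocity                 # a little over 5 sprints
whole = sprints_needed(backlog, velocity)  # rounded up to whole sprints
print(f"{backlog} points at {velocity}/sprint: {exact:.2f} sprints, so plan for {whole}")
```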
33. Shift from Story Points to Story Counting
Agreement on priority of item in
product backlog
Agreement on acceptance criteria
“Conversation” about User Stories
Strong Definition of Done.
Use “yesterday’s weather” technique
Agreement on priority of item in
product backlog
Agreement on acceptance criteria
“Conversation” about User Stories
Strong Definition of Done.
Use “yesterday’s weather” technique
Planning poker to arrive at consensus on the estimate, and ONLY splitting stories when greater than ___ (13) points
Split all stories. A sprint has at least 4-5 User Stories that have discrete business value.
35. Weak Definition of Done
Definition of Done < Potentially Shippable Product Increment
Examples
Development team defers many aspects of quality to later sprints
Development team relies on a manual QA team for the full regression test pass.
Business Analyst team works ahead of Development and Test teams
Business receives functionality from the scrum team that is UAT’ed before release to end-users.
36. Serialization of backlogs
New Items from PO
Product Backlog
Delivery Team
Un-tested Product
Testing Team
Defects/Bugs detected
Potentially Shippable Product Increment
37. Backlog Items Remaining Development Team
Backlog Items Remaining Testing Team
Serialization of backlogs
38. Every source of delay between Development and Test whiplashes the final “GO-LIVE” date.
Development Test
Delay
39. DELIVER WORKING SOFTWARE
FREQUENTLY, FROM A COUPLE OF
WEEKS TO A COUPLE OF MONTHS,
WITH A PREFERENCE TO THE SHORTER
TIMESCALE.
- Principles Behind Agile Manifesto
40. THE MOST POWERFUL WAY TO
REDUCE VARIABILITY IN FORECASTS IS
TO SHORTEN OUR PLANNING
HORIZONS
- Donald G. Reinertsen: The Principles of Product Development FLOW
41. Gathering Empirical Data
How many stories can we develop if we must be shippable at least every two months?
Destabilizing Stories (N) Stabilizing Stories (?)
Example: 20 stories require 15 stories' worth of stabilization
Improve planning For Future Releases
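The ratio in the example can be turned into a rough capacity split for the next release. This is only a sketch: the total capacity of 35 stories per two-month release is an assumed figure, chosen so that it reproduces the 20-to-15 example, and the function name is hypothetical.

```python
def new_stories_capacity(total_capacity, destabilizing, stabilizing):
    """Stories of new (destabilizing) work that fit in a release once the
    matching stabilization work is accounted for.

    Uses the observed ratio: `destabilizing` new stories needed
    `stabilizing` stories' worth of stabilization.
    """
    if destabilizing <= 0 or stabilizing < 0:
        raise ValueError("need destabilizing > 0 and stabilizing >= 0")
    # Integer arithmetic: share of capacity left for new work, rounded down.
    return total_capacity * destabilizing // (destabilizing + stabilizing)

observed_new, observed_stabilizing = 20, 15   # from the slide's example
release_capacity = 35                         # assumed: stories per two-month release

planned_new = new_stories_capacity(release_capacity, observed_new, observed_stabilizing)
smaller_release = new_stories_capacity(28, observed_new, observed_stabilizing)
print(planned_new, smaller_release)
```

Re-measuring the ratio after each release is what lets the plan for the following release improve.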
43. 2 Laws of Dependency Management
Do not create dependencies.
If you must, then schedule dependent work ONLY with knowledge of the dependent service team’s PULL schedule and expected cycle time.
45. Go Look at <alm tool> does not work!
How is the project doing?
Do you have capacity to take this on?
What is the status of Feature/Initiative?
Will I get Feature X?
When will I get Feature X?
Go look at online tool
Easier to answer for single X-functional teams
Blob of many interdependent teams
46. Team Level Metrics
Impediments from daily scrum
Velocity (# of Story Points or # of Stories)
Cost $$ of funding a sprint
etc. etc.
ALM Tool
Program Level
Team Level Metrics
Team Level Metrics
Team Level Metrics
Team Level Metrics
47. Team Level Metrics
Impediments from daily scrum
Velocity (# of Story Points or # of Stories)
Cost $$ of funding a sprint
etc. etc.
ALM Tool
Team Level Metrics
Program Level
Inter-connection of dependent teams
System throughput capacity
Cycle time @ Feature/Epic level
Team Level Metrics
Team Level Metrics
Team Level Metrics
How many active dependencies can the system handle?
How many dependencies can a team handle?
48. Team A Team B Team C
Epic 1
Epic 2
Epic 3
Most at Risk Epic
Most involved team
Dependency Board
I can see the big picture
First Order Information
49. Dependency Board Purpose
To determine if features are being developed too late or too early.
To see which features are related to each other so as to make trade-off decisions.
To identify unknown and duplicate work.
To identify at-risk features.
To allow teams to pick up the right work.
50. Epic 3
Team A Team B
Cycle Time
Delay
Delays in resolving dependencies contribute the most to the project delivery timeline.
Start
Shippable
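A small sketch of how the end-to-end cycle time of an epic breaks down into working time and delay. All dates are hypothetical; the point is only that the wait between Team A finishing and Team B starting counts toward the epic's cycle time even though it shows up in neither team's velocity.

```python
from datetime import date

# Hypothetical timeline for one epic that needs Team A and then Team B.
team_a_start = date(2024, 1, 8)    # "Start"
team_a_done = date(2024, 1, 19)
team_b_start = date(2024, 2, 5)    # Team B pulls the dependent work
team_b_done = date(2024, 2, 16)    # "Shippable"

team_a_days = (team_a_done - team_a_start).days
delay_days = (team_b_start - team_a_done).days   # waiting on the dependency
team_b_days = (team_b_done - team_b_start).days
cycle_time_days = (team_b_done - team_a_start).days

delay_share = delay_days / cycle_time_days
print(f"cycle time {cycle_time_days} days, of which {delay_days} days "
      f"({delay_share:.0%}) is delay between teams")
```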
51. Program Cumulative Flow Diagram (CFD): Queue Sizes
[Chart: cumulative # of epics over time, showing epics shippable and WIP]
Second Order Information
52. Program Cumulative Flow Diagram (CFD) : Time
[Chart: # of epics over time, showing cycle time, delay, Team A velocity, and Team B velocity]
Second Order Information
53. Epic 3
Team A Team B
Cycle Time
Delay
Start
Shippable
Optimize the interconnected team system to reduce this.
54. Actively Manage “delay” caused by Dependencies
Prioritize Epics at program level. Work in prioritized order.
Stop starting, start finishing
Actively monitor dependency queues
Create Kanban slots to constrain # of Dependent items for a service team.
55. Where there is delay there are backlog items
DEPENDENCIES WAITING TO BE RESOLVED
“BACKLOG”
WAIT TIME OR “DELAY”
QUEUE SIZE IS LEADING INDICATOR
THROUGHPUT IS LAGGING INDICATOR
56. Controlling Queue Size directly controls cycle time
Little's Law: Average Cycle Time = (Average Number of Items in Queue) / (Average Processing Rate)
Do not let dependencies accumulate.
Set Explicit Kanban limits on Dependencies between teams.
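A worked example of Little's Law as stated above. The queue sizes and the processing rate are hypothetical; the sketch only illustrates that, at the same processing rate, halving the queue halves the average cycle time.

```python
def average_cycle_time(avg_items_in_queue, avg_processing_rate):
    """Little's Law: average cycle time = average queue size / average processing rate."""
    if avg_processing_rate <= 0:
        raise ValueError("processing rate must be positive")
    return avg_items_in_queue / avg_processing_rate

rate_per_week = 3   # hypothetical: dependencies a service team resolves per week

unlimited = average_cycle_time(12, rate_per_week)   # 12 dependencies waiting on average
limited = average_cycle_time(6, rate_per_week)      # queue capped at 6 by a Kanban limit
print(f"{unlimited} weeks without a limit, {limited} weeks with the limit")
```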
57. C
Team B has Two (fixed) Kanban Slots for “Servicing” external dependencies
Team A Team B
Kanban Dependency Slots
Let's work on these two stories
Look, Team B has only one of the two dependency slots open.
So we should not create more dependencies
And PULL only our highest priority story that requires dependent work
We can now focus on finishing dependent work
We can predict that one more dependent item will be added soon
And we are always working on high priority for teams.
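The slot mechanism can be sketched as a tiny model. The class, method, and story names are hypothetical illustrations, not part of any tool mentioned in the talk; the limit of two slots follows the slide.

```python
class DependencySlots:
    """Fixed number of Kanban slots a service team reserves for external dependencies."""

    def __init__(self, capacity=2):
        self.capacity = capacity
        self.in_progress = []

    def open_slots(self):
        return self.capacity - len(self.in_progress)

    def pull(self, item):
        """Accept a dependent item only if a slot is open; otherwise refuse it."""
        if self.open_slots() == 0:
            return False
        self.in_progress.append(item)
        return True

    def finish(self, item):
        """Finishing an item frees its slot for the next highest-priority request."""
        self.in_progress.remove(item)

team_b = DependencySlots(capacity=2)
team_b.pull("Epic 1: payment API change")          # one slot taken, one still open

# Team A sees a single open slot, so it requests only its highest-priority story.
accepted = team_b.pull("Epic 3: login story")      # fills the last slot
rejected = team_b.pull("Epic 2: reporting story")  # no slot open, must wait

team_b.finish("Epic 1: payment API change")
accepted_later = team_b.pull("Epic 2: reporting story")
```

Because the limit is explicit, Team A can see when not to create new dependencies, and Team B always works on a small, prioritized set.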
58. Summary
At best, estimates are educated guesses.
Story points are no better than simply counting stories per sprint.
For a cross-functional team that produces potentially shippable code every sprint, velocity is a good forecasting measure.
In all other (real-world) cases, focus on getting to a “shippable” state. You will discover that planning is emergent and estimating is no longer necessary.
Increase action on dependencies by creating a team-level dependency Kanban limit.
To forecast:
Your queue (backlog) size is a leading indicator of challenges ahead.
End-to-end measurements of cycle time are better for forecasting on programs with dependent teams. Team-level velocities do not account for the “wait” period in between backlogs.