The case for lower noise software estimates
2023-Mar-12
Software estimates are noisy and expensive, but we can help.
What is noise in software estimates?
Noise is the variability in human judgements, either between people or when people disagree with themselves. With software estimates, this is the difference between people's estimates, or unjustified changes to an estimate.
For example, I led a small project with 2 other devs to add a feature to a system. At the outset we wrote a short design doc, and I asked the other devs to estimate how long it would take until the customer got the feature. With the same doc, our estimates were 3 months, 6 months, and 10 months. Each estimate came with some "error bars" too.
While we would need to complete the project to know the error (what was the truth, and how close were we?), we could immediately see the noise in our estimates.
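If you want to put a number on that spread, here's a minimal sketch. The estimates are the ones from the example; the noise metric (standard deviation relative to the mean) is just one reasonable choice, not the only one.

```python
# Quantifying the spread ("noise") in the three estimates above.
from statistics import mean, stdev

estimates_months = [3, 6, 10]  # three devs, same design doc

spread = stdev(estimates_months)              # sample standard deviation
relative_spread = spread / mean(estimates_months)

print(f"mean: {mean(estimates_months):.1f} months")
print(f"spread: {spread:.1f} months ({relative_spread:.0%} of the mean)")
```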
In our case, the noise came from two sources (a sketch contrasting them follows the list):
- Different estimating methods. The lowest estimate came from the developer who broke the work down into tasks, estimated each, and summed them.
- Different choice of comparison classes. The other two devs looked at different sets of data. One looked at all projects for the team, and one looked at "similar" projects. There were fewer "similar" projects, but their average time-to-delivery was higher.
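Here's a hypothetical sketch of the two methods side by side. The task breakdown and the project history are made-up numbers for illustration only; the point is how far apart the methods can land from the same design doc.

```python
# Two estimating methods applied to the same piece of work (illustrative data).
from statistics import mean

# Method 1: bottom-up -- break the work into tasks, estimate each, sum them.
task_estimates_months = [0.5, 1.0, 0.5, 0.5, 0.5]
bottom_up = sum(task_estimates_months)

# Method 2: reference class -- look at how long comparable projects took.
similar_projects_months = [7, 9, 12, 10]
reference_class = mean(similar_projects_months)

print(f"bottom-up:       {bottom_up:.1f} months")
print(f"reference class: {reference_class:.1f} months")
```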
Noise is different to uncertainty. In the example above, each developer gave their estimate as a range. If the process were noise-free, we'd expect the same range from each developer, even though we'd still be uncertain.
What's the cost of noise?
Noise is expensive, but it's difficult to see it because the expense is diffuse. Noise makes all decisions riskier, and risk eventually materialises on the bottom line.
You might decline a project that would have been profitable, or take on one that is unprofitable. You might allocate too many developers to a project, taking resources away from a more important one. You might allocate too few, meaning you need to crunch at the end or disappoint customers.
Another cost that's harder to see: noise makes process improvement slower. You must run more trials to see if a change helped in a noisy system, so you can't improve as quickly.
These costs are hard to see, but real. Project overruns are easy to measure. Not taking on a profitable project, however, is an opportunity cost, and opportunity costs are hard to measure because you need the counterfactual. Slow process improvement shows up as higher overheads. These costs lower return on invested capital, reduce free cash flow, and eventually lower growth.
How do I lower noise in my estimates?
Is noise a problem in your organisation?
If you're not sure how expensive noise is for your organisation you can measure it. You don't even need to wait for work to finish to measure it.
First, decide how much noise is acceptable. This is the hardest step. It's easy to dismiss the results if you don't know the threshold ahead of time. You'll probably want to do some napkin maths about how much under- and over-estimation cost you, and turn that into a concrete threshold.
Give your design docs (or stories, epics, or whatever you use) to a few engineers and ask them to estimate them privately, without conferring.
When you get the answers back, compare the spread against your threshold. You'll then know whether you need to do anything else.
This is called a "noise audit." It's surprisingly cheap because you can do it within your normal process: set the threshold, then ask for estimates as usual (with the no-conferring caveat).
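Here's a minimal sketch of the audit's arithmetic. The metric and the 30% threshold are assumptions, stand-ins for whatever your napkin maths produced.

```python
# A noise audit: is the spread of private estimates above our threshold?
from statistics import mean, stdev

def noise_audit(private_estimates: list[float], threshold: float = 0.30) -> bool:
    """Return True if the relative spread of independent estimates exceeds the threshold."""
    relative_spread = stdev(private_estimates) / mean(private_estimates)
    print(f"relative spread: {relative_spread:.0%} (threshold {threshold:.0%})")
    return relative_spread > threshold

# e.g. five engineers privately estimated the same design doc, in dev-weeks
too_noisy = noise_audit([4, 6, 5, 11, 7])
```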
Use several independent assessments
One common source of noise is who speaks first. If that person is high-status or confident, then others may keep their opinions hidden. Even if others give their opinions, the next people to speak might anchor their estimates on the first person's. You need the unvarnished opinions to see how much noise there is.
The estimator's mood and recent experiences also affect their estimates. Folks who have had a bad time in a project recently may be more pessimistic, even if the experience is not transferable.
Using several independent estimators controls for this, to some degree. Not everyone will have had a bad day, and there's no "who speaks first" anchor. After that, you can average the estimates [1]. You can add more estimators for a more stable estimate; small projects might use 3 - 5 estimators, very high value (e.g. "bet the company") estimates could include everyone. The more independent estimators, the lower the noise.
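Aggregating is mechanical once the estimates are in. A minimal sketch, using the median because it's robust to a single wild estimate (see footnote [1] for the trade-offs):

```python
# Combine independently-collected estimates into one number.
from statistics import mean, median

independent_estimates_months = [3, 6, 10, 5, 7]  # gathered privately, no conferring

print(f"mean:   {mean(independent_estimates_months):.1f} months")
print(f"median: {median(independent_estimates_months):.1f} months")
```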
Avoid vague verbiage
Different interpretations of words add noise, and so do estimation units that lack a consistent definition, e.g. "T-shirt sizes". For example, one person may think a 30 dev-hr task is "large" while another person with the same underlying estimate might call it "extra large."
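One way to remove that ambiguity is to pin each size to an explicit range. A sketch, where the dev-hour boundaries are my assumption; the point is only that everyone uses the same ones:

```python
# Pin T-shirt sizes to explicit dev-hour boundaries so "large" means one thing.
SIZE_UPPER_BOUNDS_DEV_HOURS = {
    "S": 8,
    "M": 24,
    "L": 40,
    "XL": 80,
}

def tshirt_size(dev_hours: float) -> str:
    for size, upper_bound in SIZE_UPPER_BOUNDS_DEV_HOURS.items():
        if dev_hours <= upper_bound:
            return size
    return "too big -- split this work up"

print(tshirt_size(30))  # "L" for everyone, never "large" vs "extra large"
```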
Even asking for "an estimate" is vague. There are several types of estimates: dev-hours, cycle time, time-to-delivery, etc. Each type of estimate needs different methods, and has different outcomes. Be specific about what type of estimate you want.
Structure the process
Differing methods, while useful for getting different perspectives, are also a source of noise. Staff might use an inappropriate method for the problem at hand. You can pre-ordain the methods and let estimators find and interpret the data.
You can ask staff to use multiple methods and base their judgement on that. This gives you the benefit of multiple types of analysis without unnecessary between-estimator noise.
Finally, structuring the process lets you improve the process. If estimators consistently forget some necessary component, you can write a checklist. This reduces noise and bias.
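As a sketch of what structure might mean in practice, here's a hypothetical estimate record that fixes the fields every estimator must fill in. The field names are illustrative, not a standard:

```python
# A structured estimate: everyone records the same fields, in the same units.
from dataclasses import dataclass

@dataclass
class Estimate:
    estimator: str
    method: str                # e.g. "bottom-up" or "reference class"
    comparison_class: str      # which past projects were consulted, if any
    low_months: float
    likely_months: float
    high_months: float
    checklist_complete: bool   # e.g. "did you include testing and rollout?"

e = Estimate("dev-a", "reference class", "team projects, last 2 years",
             4.0, 6.0, 9.0, checklist_complete=True)
```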
Remove irrelevant information
Folks incorporate irrelevant information into their estimates, even subconsciously. Since this information is irrelevant and can vary, the variance in the irrelevant information causes noise.
For example, you might find the implementation language has no effect on delivery time. Some developers might believe that Python projects are easier to implement than Java projects and use this belief in their estimates. Withholding the implementation language stops one source of noise in this case.
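A sketch of one way to do this, assuming your briefs are structured enough to filter. Which fields count as irrelevant is a judgement you'd make from your own data:

```python
# Strip fields you've decided are irrelevant before estimators see the brief.
IRRELEVANT_FIELDS = {"implementation_language", "requesting_team"}

def redact(brief: dict) -> dict:
    return {k: v for k, v in brief.items() if k not in IRRELEVANT_FIELDS}

brief = {
    "summary": "Add export-to-CSV",
    "acceptance_criteria": "...",
    "implementation_language": "Python",
    "requesting_team": "Sales",
}
print(redact(brief))  # estimators never see the language, so it can't add noise
```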
When does noise not matter?
Routine work generally does not need human estimating. You can get the data directly from your issue tracker, which gives you an estimate free of estimator bias and noise. You'll also see the uncertainty in the work -- maybe the routine work takes between 2 and 3 days according to the tracking system.
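A minimal sketch, assuming you can export start and finish dates per ticket; the data below is invented:

```python
# Estimate routine work straight from tracker data: no estimator, no estimator noise.
from datetime import date
from statistics import quantiles

tickets = [  # (started, finished) per ticket -- illustrative, not a real export
    (date(2023, 1, 2), date(2023, 1, 4)),
    (date(2023, 1, 3), date(2023, 1, 6)),
    (date(2023, 1, 9), date(2023, 1, 11)),
    (date(2023, 1, 10), date(2023, 1, 13)),
    (date(2023, 1, 16), date(2023, 1, 18)),
]

cycle_times_days = [(done - started).days for started, done in tickets]
p25, p50, p75 = quantiles(cycle_times_days, n=4)
print(f"routine work typically takes {p25:.0f}-{p75:.0f} days (median {p50:.0f})")
```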
You might not care about the noise if it's not going to affect a decision. If you have a task which you will do regardless [2] of the cost (e.g. compliance work), then the noise in the estimate is irrelevant [3].
Small work might not need this level of attention. An estimate's value can't exceed the value of the task you're estimating [4]. Therefore, the value of clamping down on the noise is limited for low-value work. In these cases, you might go the "whole hog" and estimate without involving people by applying a model, e.g. just counting the tasks and assuming they're all approximately average [5]. That saves on estimating time and avoids the noise.
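A sketch of that task-counting model, with the historical average and spread standing in for numbers you'd pull from your own tracker:

```python
# "Count the tasks and assume they're average": a mechanical estimate, no humans involved.
import math

task_count = 12
avg_task_days = 2.5      # historical average duration for this kind of task
task_stdev_days = 1.0    # historical spread per task

estimate_days = task_count * avg_task_days
# If task durations are uncorrelated, the spread of the total grows with sqrt(n)
# -- a big "if"; footnote [5] notes the error bars get wide.
spread_days = math.sqrt(task_count) * task_stdev_days

print(f"roughly {estimate_days:.0f} days, give or take {spread_days:.0f}")
```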
Conclusion
Noise in software estimates can be expensive. Underestimating leads to crunch, project overruns, disappointed customers, and unprofitable projects. Overestimating makes us misallocate resources, missing other opportunities.
We get noise from a few different places: differences in methods, recent events skewing folks' moods, and irrelevant information. Fortunately, it's easy to see the noise, even without waiting for the work to finish. You just need to get independent estimates from a variety of developers, and the noise is self-evident.
Fixing noisy estimates does raise costs, but it can be done relatively cheaply. Get rid of vague verbiage and irrelevant information, then have your estimators independently assess the work using a structured or standard process.
You can avoid these costs for small work, for decisions the estimates won't change, or by creating the estimates mechanically.
Further reading
- Kahneman, Sibony, and Sunstein, Noise: A Flaw in Human Judgment, 2021
- Satopää, Salikhov, Tetlock and Mellers, Bias, Information, Noise: The BIN Model of Forecasting, 2021
- Grimstad and Jørgensen, Inconsistency of expert judgment-based estimates of software development effort, 2007
[1] | You can use almost any "average" process -- mean, median, or mode. They all have different properties that are useful at different times. The mean minimises "mean squared error", but is susceptible to outliers. Choose wisely. |
[2] | To a point. If the work will bankrupt the company then it might be a sign you need to just close down or stop the activity. |
[3] | This is a reasonable rule of thumb: if the decision is already made and you cannot change it, then the whole exercise of estimating is waste. |
[4] | Opportunity cost comes in here, though. The estimate's value might not exceed the value of this task, but if you're deciding between two tasks it might, because the other task could be so much more valuable. Doing the wrong task is then an opportunity cost. |
[5] | This works surprisingly well, although the error bars do get quite big because you're assuming task durations are uncorrelated. It still suffers from the planning fallacy: you might miss tasks entirely. It's fine for most gross calculations. |