After several years of experience working with, training and coaching teams on Scrum, I have no doubt that agile (empirical) estimation techniques are better than classic, predictive estimation techniques. One question that kept nagging me over time is whether using story points, T-shirt sizes, or ideal hours truly explains the increased success of Scrum-based techniques.
It took me a while, but I finally figured it out. And the answer is a definite ‘NO’!
All Estimation Techniques are Merely Guesses
At the end of the day (or at least, at the end of the planning session), any estimation is just a guess: sometimes it’s more, and sometimes it’s less. This means that the distribution of the actual time versus the estimation should be pretty much normal: in large numbers, about half of the actual results will be less than the estimation, and half will be more. On average, according to the Law of Large Numbers, it evens out.
The better the estimation technique is, the steeper the curve will be, i.e. the closer actual results will be to the estimation.
My experience is that agile techniques are in fact better at predicting the actual results, but the fact remains that if you make enough of these predictions, then regardless of the technique, the variance should cancel itself out and the sum should settle on the average!
This means that the quality of the technique used to estimate each task has little or no impact on the success of estimating the time to complete the entire project!
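The cancelling-out argument can be checked with a small simulation. This is only a sketch under assumptions that are not in the post: every task is estimated at 1 unit, actual durations are drawn independently from a normal distribution centred on the estimate, and the two `sigma` values (0.1 for a "precise" technique, 0.3 for a "rough" one) are made up for illustration. The function name `total_relative_errors` is likewise hypothetical.

```python
import random
import statistics

def total_relative_errors(n_tasks, sigma, n_projects=2000, seed=1):
    """Relative error of the project total across many simulated projects.

    Assumption: each task is estimated at 1 unit and its actual duration
    is an independent draw from Normal(1, sigma).
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(n_projects):
        actual_total = sum(rng.gauss(1.0, sigma) for _ in range(n_tasks))
        estimated_total = n_tasks * 1.0
        errors.append((actual_total - estimated_total) / estimated_total)
    return errors

# Two made-up techniques: a "precise" one and a "rough" one.
for sigma in (0.1, 0.3):
    errs = total_relative_errors(n_tasks=100, sigma=sigma)
    print(f"sigma={sigma}: mean error {statistics.mean(errs):+.4f}, "
          f"spread {statistics.pstdev(errs):.4f}")
```

With 100 independent tasks, the mean error of the total should sit close to zero for both techniques, and the spread of the total should be roughly `sigma / 10`, so even the rough technique leaves the project total within a few percent of the estimate.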
Then Why Do Estimations Fail?
If estimations behave “normally”, whole-project estimations should not fail. I don’t buy into all that “programmers are notoriously optimistic” sentiment; I’m a developer myself and there is not a single soul alive that will call me optimistic, and yet the non-agile projects I worked on overran their deadlines, like everybody else’s. Something must be out there that either changes the normal distribution, or invalidates the law of large numbers, or both. And it has to be something that is present in classic waterfall-ish projects but not in agile ones.
It Is All About the Independence!
The main difference, in my opinion, between agile and non-agile projects is how they are planned and constructed. An agile project plan is basically a stack of user stories, with no* dependencies between them, whereas a waterfall project plan is a Gantt chart describing tasks and their interdependencies.
Why Is This Important?
Let’s assume a task, Task C, that has two dependencies, Task A and Task B. Each was estimated, and each has a normal distribution of the actual amount of time to complete vs. the estimated time. The following figure describes such a “plan”:
Each task has a 50% chance of being completed early and a 50% chance of being completed late (there is, of course, no statistical chance of being completed exactly “on time”). Now, let’s say we’re interested in calculating the probability of Task C being completed early, or at least not late. For the sake of ease, we’ll assume all three tasks have the same estimated duration and the same distribution around it. For Task C to begin early, both of its dependencies need to complete early. Since the chance of each completing early is 50%, the chance of both doing so is (0.5 * 0.5) = 0.25, or 25%. The chance of this not happening, i.e. of at least one of them being completed late, is 1 - 0.25, or 75%. This means that there’s a 75% chance that Task C starts late. Task C’s own duration can sometimes make up for a late start, but under these assumptions the chance of Task C completing late (and making the whole project run late) still works out to about two in three, well above the 50% of a task with no dependencies!
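The arithmetic for Task C can be checked with a short Monte Carlo simulation. This is a sketch under assumptions of my own: Tasks A and B start together, the deviation of each task from its estimate is an independent draw from the same normal distribution (mean 0, standard deviation 1, in arbitrary units), and Task C starts as soon as the slower of its two dependencies finishes. The function name `simulate_task_c` is hypothetical.

```python
import random

def simulate_task_c(trials=200_000, seed=42):
    """Estimate how often Task C starts late and finishes late.

    Assumption: each task's deviation from its estimate (actual minus
    estimated) is an independent Normal(0, 1) draw; positive means late.
    """
    rng = random.Random(seed)
    start_late = 0
    finish_late = 0
    for _ in range(trials):
        dev_a = rng.gauss(0.0, 1.0)
        dev_b = rng.gauss(0.0, 1.0)
        dev_c = rng.gauss(0.0, 1.0)
        # C cannot start until the slower of A and B is done.
        start_delay = max(dev_a, dev_b)
        if start_delay > 0:
            start_late += 1
        # C's own deviation is added on top of its start delay.
        if start_delay + dev_c > 0:
            finish_late += 1
    return start_late / trials, finish_late / trials

p_start_late, p_finish_late = simulate_task_c()
print(f"Task C starts late:   {p_start_late:.3f}")
print(f"Task C finishes late: {p_finish_late:.3f}")
```

Under these assumptions the start-late share should land near 1 - 0.5 * 0.5 = 0.75, and the finish-late share near 2/3, because Task C's own deviation sometimes absorbs a late start. Either way, the result is well above the 50% of a task with no dependencies.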
It is easy to see that the more complex a project is, and the more dependencies each task has, the greater the chance of being late. The distribution of the actual duration of tasks with dependencies is shifted and skewed toward lateness.
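To see how quickly this grows, here is a small sketch that tabulates the chances for a task with n parallel dependencies. It assumes, as in the example above, that all deviations are independent and follow the same continuous, symmetric distribution. The start-late figure is 1 - 0.5^n. The finish-late figure n / (n + 1) follows from symmetry: the task finishes on time only when its own (negated) deviation is the largest of n + 1 equally distributed values. The function names are hypothetical.

```python
from fractions import Fraction

def p_start_late(n_deps):
    """Chance that at least one of n_deps independent dependencies is late."""
    return 1 - Fraction(1, 2) ** n_deps

def p_finish_late(n_deps):
    """Chance the dependent task itself finishes late.

    Assumption: all n_deps + 1 deviations are independent draws from the
    same continuous, symmetric distribution.
    """
    return Fraction(n_deps, n_deps + 1)

for n in (1, 2, 3, 5, 10):
    print(f"{n:>2} dependencies: "
          f"starts late {float(p_start_late(n)):.3f}, "
          f"finishes late {float(p_finish_late(n)):.3f}")
```

A single dependency (plain sequence) leaves the finish-late chance at one half, while every additional parallel dependency pushes it further toward certainty.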
Scrum User Stories Have No Dependencies
With this in mind, Scrum user stories have no* interdependencies. If two stories do have a dependency, they should be joined and treated as one (and if the result is too big, it can and should be split again, though in a way that doesn’t create dependencies). With no dependency (or, if you wish, a single dependency that is the completion of the prior story), we get a plan that looks like this:
The chance of B starting early, which is the same as the chance of A ending early, is 50%. The chance of B completing early is still 50%! Even better, the variance relative to the total estimate is reduced, which means that there is an increased probability of the total project duration being close to the estimate (i.e. a steep probability distribution).
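The sequential case can be simulated the same way. This sketch assumes each story is estimated at 1 unit and that its deviation from the estimate is an independent normal draw with a made-up standard deviation of 0.3 units. The function name `simulate_chain` is hypothetical.

```python
import random
import statistics

def simulate_chain(n_stories, trials=10_000, sigma=0.3, seed=7):
    """Simulate a chain of stories done one after another.

    Assumption: every story is estimated at 1 unit and deviates from that
    estimate by an independent Normal(0, sigma) draw.
    Returns (share of runs finishing early, spread relative to the total).
    """
    rng = random.Random(seed)
    total_deviations = [
        sum(rng.gauss(0.0, sigma) for _ in range(n_stories))
        for _ in range(trials)
    ]
    p_early = sum(d < 0 for d in total_deviations) / trials
    total_estimate = n_stories * 1.0
    relative_spread = statistics.pstdev(total_deviations) / total_estimate
    return p_early, relative_spread

for n in (1, 4, 16, 64):
    p_early, spread = simulate_chain(n)
    print(f"{n:>2} stories: finishes early {p_early:.3f}, "
          f"relative spread {spread:.3f}")
```

The share of runs finishing early should stay close to 50% however long the chain gets, while the spread relative to the total estimate shrinks roughly with the square root of the number of stories.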
What Does This Mean?
This means that while a project with multi-dependent tasks has an increased skew towards lateness with each added task, a project with no dependency (beyond sequence) has an increased tendency towards the mean, or the estimate.
This means that by the very nature of what we are estimating, agile projects tend to succeed (i.e. be on budget and on time) more often than non-agile ones.
Moreover, the actual estimation technique doesn’t matter – it is the estimation of independent units of work that makes the difference.
Therefore, my conclusion and advice is that the very first change a team needs to make is to start working on independent features – call them user stories, features, or MMFs – it doesn’t matter; just keep them independent of each other!
Hope this helps,
Assaf.
This view aligns with the object-oriented development view that the components of a system should be highly cohesive and loosely coupled. The loose coupling enables developers to minimize dependencies and, thus, build the components with significant independence.
These are worthwhile goals that are hard to reach in practice. User stories can never be completely independent but keeping the goal in mind surely helps.
Interesting comparison! I must admit I never thought of it. I agree that true independence is a goal to strive for, but I do think that if you find that a dependency can't be broken, you should try to make it cohesive. It's better to fail at estimating a large task than to fail due to the skewed distribution of complex interdependencies.
Thanks for the input! If you don't mind, I'll use the OO analogy to convince some old school guys I know.
Assaf
I'm not so sure that your analysis stacks up. In your simple C depends on A and B scenario, it seems to be suggesting that when B is late to finish, the guy(s) who have already completed A are just sitting on their butts doing nothing.
I still agree that the agile approach is better, but I'm not convinced that this analysis is the full story.
Hi Richard, thanks for your input.
In a real world scenario, according to Brooks's law, adding a developer to an already late project (i.e. Task B) will only make it later. Also remember that this is just a simple case to illustrate the problem. In reality, the dependencies are far more complex. I won't go out and calculate it, but it is obvious that dependencies increase the probability of delay, and thus cumulatively delay the whole project.
I agree that there are other agile techniques that contribute to a project's overall success, but it is my experience that removing dependencies is the single largest contributor to "success", where success is defined as delivering on scope, on budget and on time.
Assaf.
Wow, you've got a point with: "a project with no dependency (beyond sequence) has an increased tendency towards the mean, or the estimate." Very bright observation indeed.
Thanks for this article.
It seems to me that the weight given here to dependency of 'content' as a cause for 'Agile' success is too high.
Content management (including estimation), whether it is technically oriented (usually in waterfall) or business oriented ('Agile' user stories), will not eliminate the fact that there are dependencies.
The success of 'Agile' (when correctly deployed) is a result of the target that ‘Agile’ methodologies are aiming at.
‘Agile’ is a framework which is mostly aimed at pushing for early feedback (QA feedback and customer feedback).
Of course, it also pushes the development team to act early on this feedback (‘Done’ is done before moving forward).
This target (early feedback, early treatment) makes the development force more efficient (fixing defects early is much cheaper; fixing design early is much cheaper) and ensures the final result is of high quality.
Regarding project prediction: my experience shows me that the quality of the project prediction in ‘Agile’ is due to the fact that the forward prediction is based on the velocity so far, which is based on ‘Done’ content (and not on ‘checked-in’ content).