Estimating the effort | DevOps

The team wants to know how much effort it will take to deliver the (part of the) IT system that the product owner and other stakeholders require. So, they need to estimate. Based on this estimate the product owner, together with the team, can prioritize and decide in what order the tasks will be put on the backlog.

What is a good estimate? A good estimate is an estimate that stakeholders find sufficiently reliable to use for making decisions. A good estimate also includes all relevant aspects, by involving (or representing) all relevant people. An estimate indicates a range rather than one number. It is an approximate calculation of what the answer is likely to be. For example, the realization of this user story is half the size of a reference user story that was 13 story points and took 20 hours, so this will be 8 story points and take somewhere between 8 and 12 hours, probably around 10 hours. An estimate is never exact. When making an estimate, preferably use methods based on historical data.
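The hours in this example can be reproduced with a small calculation. The sketch below is only an illustration: the function name `estimate_hours` and the 20% margin around the expected value are assumptions chosen so that the result matches the range mentioned above.

```python
def estimate_hours(reference_hours, relative_size, margin=0.2):
    """Return (low, expected, high) hours for an item sized relative to a reference item.

    margin is an assumed uncertainty band around the expected value.
    """
    expected = reference_hours * relative_size
    return expected * (1 - margin), expected, expected * (1 + margin)


# Reference story took 20 hours; the new story is half its size.
low, expected, high = estimate_hours(reference_hours=20, relative_size=0.5)
print(f"between {low:.0f} and {high:.0f} hours, probably around {expected:.0f} hours")
# prints: between 8 and 12 hours, probably around 10 hours
```

The point of returning three values is that the estimate stays a range; the single "expected" number is only its midpoint.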

A characteristic of most DevOps (and Agile) estimation techniques is that they estimate not in exact hours but in points. Estimating in points gives an estimate relative to an activity from the past. So, instead of estimating whether something is sixteen or thirty-two hours of work, the team estimates that it is twice as much work as a previously realized user story. To arrive at a good estimate, it is essential to estimate on a relative scale with the whole team (whole-team approach). Estimates are better supported when a consensus is reached, and an open conversation between the team members is essential to reach it. Such an approach has a built-in learning curve: the team learns from the previous estimate when making the next one, which makes the next estimate more accurate. In this way, estimating leads not only to a more accurate estimate but also to knowledge transfer and commitment of the entire team!

Besides the estimation techniques, there is one other important aspect when planning a sprint (see Planning the delivery): the concept of "velocity". The velocity is the average amount of work, often measured in story points, that the team can perform in one sprint.

Based on experience, the team only takes as many items from the backlog into a sprint as the velocities of the previous sprints indicate it can deliver. The estimation is therefore automatically refined every sprint.
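As a minimal sketch of how velocity can limit what goes into a sprint, the code below averages the story points of recent sprints and then takes backlog items in priority order until the next item no longer fits. The three-sprint window, the function names and the backlog items are all hypothetical choices for illustration, not a prescribed method.

```python
from statistics import mean


def velocity(points_per_sprint, window=3):
    """Average story points delivered over the most recent sprints (window is an assumption)."""
    return mean(points_per_sprint[-window:])


def select_for_sprint(backlog, capacity):
    """Take items in priority order until the next item would exceed the capacity."""
    selected, total = [], 0
    for name, points in backlog:
        if total + points > capacity:
            break
        selected.append(name)
        total += points
    return selected, total


history = [8, 16, 12, 14, 15, 16]  # hypothetical story points per past sprint
backlog = [("login", 5), ("search", 8), ("export", 3), ("audit log", 5)]  # hypothetical

capacity = velocity(history)  # average of 14, 15 and 16
print(select_for_sprint(backlog, capacity))
```

Because the velocity is recalculated from the latest sprints each time, the capacity used for planning follows the team's actual performance.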

In DevOps, or in Agile approaches generally, there are various estimation techniques. In this section we cover four popular techniques:

  • Planning poker
  • T-shirt sizing
  • Swim lane sizing (bucket sizing)
  • Dot voting

Let the team choose its own estimation approach, provided the team has a common understanding of what must be estimated, e.g. analysis, design, development, testing (including test design, test execution, test automation if applicable), etc.

The more traditional estimation techniques are often also applicable and therefore an overview of these techniques is provided at the end of this section.

When estimating the effort, the team members must take into account that the effort for QA & testing activities will differ depending on the quality risk of the test object. Where the development effort for two units of software that have the same size will be similar, the QA & testing effort may differ because with a high risk there will be more effort for QA & testing than with a low risk. So, the total number of points in the estimation will differ. The team needs to be aware of this when applying any of the estimation techniques in this section.

DevOps (Agile) estimating techniques

Estimating in DevOps is quite different from estimating in sequential IT delivery. Where sequential IT delivery teams try to make detailed estimates of the entire project, which may span many months or even years, DevOps teams estimate the workload of the upcoming sprint and maybe a part of the backlog for future sprints. Also, DevOps teams don't strive for exact numbers but for workable insight into the effort needed. Therefore, DevOps teams use a variety of easy-to-use estimation techniques.

In this chapter we provide an explanation of the following estimation techniques: planning poker, T-shirt sizing, swim lane sizing (bucket sizing) and dot voting. Note that regardless of which technique is used, the item is estimated against one or more reference items. For example, reference items with the amount of – realized – work of 1, 3 and 8 points respectively. This gives the team members direction to estimate other items.

All development activities are included in this estimate. This therefore also includes quality measures. Because high risk does not necessarily mean a lot of coding, it is advised to explicitly identify the risk. High risk does have an impact on the total estimate of effort. A valuable technique to make identified risks explicit is risk poker. For an explanation of this technique see "Quality risk analysis & test strategy".

Sometimes team members, often inexperienced ones, try to convert story points into days and hours. In such a situation, you could try T-shirt sizing, swim lane sizing or dot voting first. These methods are more abstract than story points, which means that it is harder to convert them into units of time. Once the team has learned in practice how to compare items without using days and hours, you can always switch over to planning poker if wanted.

Planning poker

Planning poker is an effective way to assign relative story points to items (e.g. user stories, features, spikes, etc.) in a meeting with all team members using cards that have specific values to represent a number of story points.

The impact of the quality risk level must be taken into consideration. This can be done during the planning poker as described below. But the team can also make the effect of the risk level explicit by applying risk poker first. See “Quality risk analysis & test strategy” for more information on risk poker.

Often, the planning poker set contains cards with the following denominations: 0, ½, 1, 2, 3, 5, 8, 13, 20, 40, 100, + card and coffee break card.

Story points are a relative measure of the quantity of work required to realize the items of the product backlog. The estimate is not made in terms of time, but in terms of the complexity, effort and risk of the items relative to one another. Planning poker is not about obtaining a perfectly accurate estimate. The main thing is that the entire team deliberates on the item in order to achieve mutual consensus.

Game rules:

  1. Each team member gets a set of cards (usually not the scrum master and product owner, unless they also have a different role in the team).
  2. One person, preferably the product owner, will read – out loud – the item to be estimated.
  3. The team discusses the item.
  4. Each team member decides individually which card represents the correct amount of work (considering complexity, risk and effort). A team member who is not able to make a choice selects the 0 card. A team member who thinks the amount of work is huge selects the + card.
  5. Once everyone has made a choice, the cards are placed on the table simultaneously.
  6. If all the estimates correspond, the story point estimation for that item is considered done.
  7. If the story point estimates differ, the differences will be discussed within the group (the discussion will focus on the deviating values).
  8. Repeat this until consensus has been achieved and move on to the next item.

If many items require estimating, the coffee break card can be played to allow people to relax a bit. A coffee break card should always be respected.
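The evaluation of one round (steps 5 to 8) can be sketched in code. This is only an illustration under stated assumptions: the function name `evaluate_round` is hypothetical, the + card and coffee break card are left out, and "deviating" is interpreted here as any vote that differs from the most common vote, which is one possible reading of the rules.

```python
from collections import Counter

# Numeric cards from the set described above (+ card and coffee break card omitted).
CARDS = [0, 0.5, 1, 2, 3, 5, 8, 13, 20, 40, 100]


def evaluate_round(votes):
    """votes maps team member -> chosen card value.

    Returns (agreed value, []) when all cards match, otherwise
    (None, members whose card differs from the most common card).
    """
    for member, value in votes.items():
        if value not in CARDS:
            raise ValueError(f"{member} played an unknown card: {value}")
    counts = Counter(votes.values())
    if len(counts) == 1:
        return next(iter(counts)), []
    most_common_value, _ = counts.most_common(1)[0]
    deviating = sorted(m for m, v in votes.items() if v != most_common_value)
    return None, deviating


# Hypothetical round: no consensus yet, so Bob and Dee explain their choice first.
print(evaluate_round({"Ann": 5, "Bob": 8, "Cem": 5, "Dee": 13}))
# A later round in which everyone agrees.
print(evaluate_round({"Ann": 8, "Bob": 8, "Cem": 8, "Dee": 8}))
```

The function deliberately returns no number when the cards differ: the rules ask for discussion and a new round, not for an average.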

T-shirt sizing

In the case of T-shirt sizing, items are classified into T-shirt sizes instead of story points. The range of T-shirt sizes used is up to the team to determine. In this example the following sizes are used:

  • Extra Small (XS)
  • Small (S)
  • Medium (M)
  • Large (L)
  • Extra Large (XL)

Using T-shirt sizing, the team members make an estimation whether they think a story is extra small, small, medium, large or extra-large. Due to absence of numbers, the team members have to think in a more creative manner about estimating the amount of work of a story.

One way to proceed is the same as planning poker, see Planning poker. Obviously, when applying this technique, the team does not use poker cards with 1, 2, etc., but cards with XS, S, and so on (or just uses an app).

Another way to proceed is to arrange the sizes in a row on a table or board. You can use the same index cards you would use on the scrum board. Place the pile of story cards on the table in front of the row. The team then works together to organize the story cards under the headings XS to XL. Of course, differences will be discussed within the team until consensus has been achieved.

The sizes can, if necessary, be given – ranges of – story points after the estimation is finished. Two suggestions:

[Table columns: T-shirt size | Story points range | Story points]

Swim lane sizing (bucket sizing)

For swim lane sizing (sometimes referred to as "bucket sizing") the team needs a table so that team members can walk around. Make eight columns on the table ("swim lanes"). Also put all the story cards to be estimated on the table. No value has yet been assigned to the swim lanes.

At the start, the team members together place two story cards in the swim lanes, with the columns going from small (left) to large (right), so that everyone immediately has two reference points. The remaining story cards are distributed among the team members, who then have five minutes to place them in the swim lanes, without consultation.

After all the story cards have been laid down, the team members – without consultation – look at the distribution of the story cards for ten minutes and possibly shift story cards from one swim lane to another. Team members may move story cards back and forth. If that happens too often, the story card in question is taken out. After ten minutes, the story cards that have been taken out are discussed. If no consensus is reached on the choice of the swim lane, the largest swim lane is often chosen for that specific story card.

The team members then determine whether there could be a smaller user story than the user stories in the leftmost swim lane. If not, this first swim lane gets the number 1. If the team members think that smaller user stories are possible, the first swim lane gets the number 2, 3 or 5, depending on what the team decides. The rest of the swim lanes are then numbered based on the Fibonacci sequence. In the example below, the team decided there are no smaller stories possible than the ones given in the first swim lane.

[Figure: swim lane sizing example in which the first swim lane gets the number 1]
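The numbering rule for the swim lanes can be sketched as follows. The function name `lane_values` is hypothetical, and the sketch assumes the plain Fibonacci sequence (1, 2, 3, 5, 8, ...) rather than the adjusted values found on some planning poker cards.

```python
def lane_values(first, lanes=8):
    """Return the numbers for the swim lanes, starting at `first` (1, 2, 3 or 5)."""
    if first not in (1, 2, 3, 5):
        raise ValueError("the first swim lane gets the number 1, 2, 3 or 5")
    a, b = 1, 2
    # Skip ahead in the sequence until the chosen starting value is reached.
    while a < first:
        a, b = b, a + b
    values = []
    for _ in range(lanes):
        values.append(a)
        a, b = b, a + b
    return values


print(lane_values(1))  # no smaller stories possible: [1, 2, 3, 5, 8, 13, 21, 34]
print(lane_values(3))  # room left for smaller stories: [3, 5, 8, 13, 21, 34, 55, 89]
```

Starting the first lane at a higher number keeps the values 1 and 2 free for smaller stories that may appear later.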

Dot voting

Dot voting is a straightforward technique. All story cards are put on a wall. All team members receive dots (stickers) in various colors. Each color represents a size estimate. For example, purple is 1 point, dark-green 2, red 3, orange 5, blue 8, light-green 13, etc. The team members then stick one of their colored dots on each card. The result is evaluated after ten minutes. Story cards with dots of various colors must be further discussed within the team until consensus has been achieved about the story points for each of the user stories.
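Evaluating the wall after ten minutes amounts to checking, per story card, whether all dots have the same color. A minimal sketch, using the example color scheme above; the function name `evaluate_dots` and the story card names are hypothetical.

```python
# Color scheme taken from the example in the text.
DOT_POINTS = {
    "purple": 1, "dark-green": 2, "red": 3,
    "orange": 5, "blue": 8, "light-green": 13,
}


def evaluate_dots(dots_by_card):
    """dots_by_card maps story card -> list of dot colors stuck on it.

    Returns (agreed points per card, cards that need further discussion).
    """
    agreed, to_discuss = {}, []
    for card, colors in dots_by_card.items():
        distinct = set(colors)
        if len(distinct) == 1:
            agreed[card] = DOT_POINTS[distinct.pop()]
        else:
            # Mixed colors (or no dots at all): the team has to talk this one through.
            to_discuss.append(card)
    return agreed, to_discuss


wall = {
    "reset password": ["red", "red", "red"],
    "export report": ["orange", "blue", "orange"],
}
print(evaluate_dots(wall))
# prints: ({'reset password': 3}, ['export report'])
```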

Adapted traditional estimating techniques

In addition to the DevOps (Agile) estimating techniques, more traditional approaches are also available, whereby testing is estimated separately. In this section, two techniques are adapted to DevOps and briefly described:

  • Estimation based on ratios
  • Extrapolation

A detailed explanation is available separately for the abovementioned techniques when used in a traditional setting (such as sequential IT delivery) and for the five techniques below:

  • Estimation based on test object size
  • Work breakdown structure
  • Proportionate estimation
  • Review estimation approach
  • Test point analysis

Estimation based on ratios

To use ratios as a basis to create an estimate for the user story tasks, it is important to collect the greatest possible amount of experience figures. This makes it possible to derive "standard" ratios for similar user stories.

An example of a ratio is:

Design: Build: Test = 2 : 5 : 3

This means that about 30% of the work to realize a user story is spent on testing tasks (3 out of 2 + 5 + 3 = 10 parts).
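The share of each activity follows directly from the ratio. A small sketch of the arithmetic; the function name `split_by_ratio` and the 20-hour total are hypothetical.

```python
def split_by_ratio(total, ratio):
    """Divide `total` over the activities in proportion to their ratio weights."""
    parts = sum(ratio.values())
    return {activity: total * weight / parts for activity, weight in ratio.items()}


ratio = {"design": 2, "build": 5, "test": 3}

print(split_by_ratio(100, ratio))  # shares in percent: design 20, build 50, test 30
print(split_by_ratio(20, ratio))   # a 20-hour user story: design 4, build 10, test 6 hours
```

The same function works in the other direction of use: once the build effort of a similar user story is known, the ratio gives a first indication of the design and test effort.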


Extrapolation

This technique is used to start building up experience numbers right from the first sprint. Based on the development of these numbers over time, it is possible to make an estimate, by extrapolation, for future sprints. The more numbers are known, the more accurate the estimate.

In this example, the technique is used to estimate the velocity of the team. During the first sprints the realized story points vary greatly (between eight and sixteen). In other words, the team sometimes performs twice as much work in one sprint as in another. At a given moment, as the team gets better at estimating and executing, the number of story points achieved converges. With this team, after fifteen sprints, the velocity appears to be between fourteen and seventeen story points per sprint. Because the team members become better attuned to one another with each sprint and start working more efficiently, an upward trend is often visible in the number of story points achieved per sprint.
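A simple form of extrapolation takes the lowest and highest velocity of the most recent sprints and projects them onto the remaining work. The sketch below is an assumption-laden illustration: the history is invented to resemble the description above, and the five-sprint window and the function names are hypothetical.

```python
import math


def velocity_range(history, window=5):
    """Lowest and highest story points achieved in the most recent sprints."""
    recent = history[-window:]
    return min(recent), max(recent)


def sprints_needed(remaining_points, history, window=5):
    """Optimistic and pessimistic number of sprints to finish the remaining work."""
    low, high = velocity_range(history, window)
    return math.ceil(remaining_points / high), math.ceil(remaining_points / low)


# Hypothetical story points achieved in fifteen sprints: erratic at first, converging later.
history = [8, 16, 9, 15, 11, 16, 12, 13, 15, 14, 16, 14, 15, 17, 16]

print(velocity_range(history))       # (14, 17)
print(sprints_needed(100, history))  # (6, 8): between six and eight sprints for 100 points
```

Looking only at recent sprints keeps the erratic early numbers from distorting the forecast, at the cost of reacting more slowly to a gradual upward trend.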



As an example of common practice: some teams use the rule of thumb that they have done well when the actual story points achieved per sprint are between 80% and 120% of the planned story points per sprint. In other words, a margin of 20% compared to the planned number of story points.
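This rule of thumb is easy to check per sprint. A minimal sketch; the function name `within_margin` and the sprint numbers are hypothetical, and the 20% margin is the one mentioned above.

```python
def within_margin(planned, actual, margin=0.20):
    """True when the actual story points deviate at most `margin` from the planned points."""
    return abs(actual - planned) <= margin * planned


# Hypothetical sprints as (planned, actual) story points.
for planned, actual in [(15, 13), (15, 20), (16, 17)]:
    verdict = "within" if within_margin(planned, actual) else "outside"
    print(f"planned {planned}, achieved {actual}: {verdict} the 20% margin")
```

Note that the rule flags overshooting as well as undershooting: achieving far more than planned also indicates that the estimate or the planning was off.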