Ten DevOps & Agility Metrics to Check at the Team Level

When I coach teams that are getting into the DevOps and Continuous Delivery mindset, a common question that comes up is "What should we measure?"

Measuring is a core piece of change - how do you know you're progressing without measuring anything?

Here are ten ideas for things you can measure to see if your team is getting closer to a DevOps and continuous delivery skillset. It's important to realize that what we are measuring are end symptoms - results. The core behaviors that need to change can vary quite a bit, but at the end of the day, we want to see real progress in things that matter to us from a continuous delivery perspective.


  1. Cycle time (you want to see this number going down). If you put a GoPro on a user story, from the moment it enters the mind of a customer or PO, and track everything it goes through until it is active in production, you get a calendar-time number that represents your core delivery cycle time. It could take weeks, months and sometimes years in large organizations. It usually is a big surprise. I'll write about this more in a separate blog post. The idea is to see cycle time reduced over time, so you actually deliver faster and become more competitive.
  2. Time from red build to green build (you want to see this number going down). Take the most recent red-to-green instances (count from the first red build until the first green build after it) and average them to get how long it takes to make a red build green. This shows how effective your team is at dealing with a build failure. Build failures are a good thing - they tell us what's really going on. We should not avoid them. But we should be taking care of them quickly and efficiently (for example, you can set up "build keeper" shifts - every day someone else is in charge of investigating the build and pushing the issue to the right people in the team).
  3. Number of open pull requests on a daily basis and number of closed pull requests, coupled with the average time a pull request has been open (you want to see closed requests going up, request time going down, and open requests being stable or going down). This gives us a measure of team communication and collaboration - how often does code get reviewed, and how often is code stuck waiting for a review? A trend of open pull requests going up could mean the team has a bottleneck in the code review area. The same is true for very long pull request times.
  4. Frequency of merges to trunk (this should be going up or staying stable). If your code gets merged to trunk only every few days or weeks, it means that whatever your build pipeline is building and delivering is days-old or weeks-old code. It is also a path to many types of risk, such as not getting feedback fast enough on how your code integrates with everyone else's, and your code not being deployed and available to turn on with a feature flag. Generally, it's a pathway for people who are afraid of exposing their work to the world, potentially creating hours and sometimes days of pain down the line.
  5. Test code coverage, coupled with test reviews (you want to see this go up or stay stable at a high level, while watching closely for the quality of code reviews). I always like to say that low code coverage means only one thing - you are missing tests. But high code coverage is meaningless unless the tests are code reviewed, because human nature leads us to fulfill whatever we are measured on. So sometimes you can see teams writing tests with no asserts just to get high code coverage. This is where the code reviews come in.
  6. Number of tests (this should obviously be going up as you add new functionality to your product).
  7. Pipeline run time (this should be declining or staying at a low level). The slower your automated build pipeline is, the slower your feedback is. This helps you know whether the steps you are taking are also speeding up the feedback cycle.
  8. Pipeline visibility in team rooms (you want to see this go up or stay stable at a high level). This is a metric that tells you about commitment to visual indicators, information radiators, etc. It's a small but important part of the team's nonverbal communication, and it increases the team's ability to respond quickly to important events.
  9. Team pairing time (should be going up or staying stable at a medium or high level). We can measure this to see if we have knowledge sharing going on.
  10. Number of feature flags (should be going up as the team learns about feature flags, and then stay stable). If it continues to increase, it means you're not getting rid of feature flags fast enough, which can lead to trouble down the line.
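
As an illustration of metric 2, here is a minimal Python sketch of the red-to-green calculation. It assumes you can export your build history as a time-ordered list of (timestamp, status) pairs; the function names and the sample data are hypothetical and not tied to any particular CI server.

```python
from datetime import datetime, timedelta


def red_to_green_durations(builds):
    """Return one duration per red streak: first red build -> next green build.

    `builds` is a list of (finished_at, status) tuples sorted by time,
    where status is "red" or "green" (an assumed export format).
    """
    durations = []
    first_red = None  # timestamp of the first red build in the current streak
    for finished_at, status in builds:
        if status == "red" and first_red is None:
            first_red = finished_at  # a new red streak starts here
        elif status == "green" and first_red is not None:
            durations.append(finished_at - first_red)
            first_red = None  # streak is over, wait for the next red build
    return durations


def mean_duration(durations):
    """Average a list of timedeltas; None when there were no red streaks."""
    if not durations:
        return None
    return sum(durations, timedelta()) / len(durations)


# Made-up sample history, for illustration only.
builds = [
    (datetime(2024, 1, 1, 9, 0), "green"),
    (datetime(2024, 1, 1, 10, 0), "red"),
    (datetime(2024, 1, 1, 10, 30), "red"),   # still the same red streak
    (datetime(2024, 1, 1, 12, 0), "green"),  # fixed after 2 hours
    (datetime(2024, 1, 2, 9, 0), "red"),
    (datetime(2024, 1, 2, 10, 0), "green"),  # fixed after 1 hour
]

print(mean_duration(red_to_green_durations(builds)))  # 1:30:00
```

Tracking this average week over week is what shows the trend; a single value on its own says little.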

Two bonus metrics:

  1. Feature size estimate (should be staying stable or going down) - helps to track how well the team estimates feature sizes, or to check the variance of the feature sizes you estimate.
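  2. Bus factor count (should be going down and staying down) - how many people are bus factors, meaning the only person who knows a critical part of the system, so that work would stall if they suddenly became unavailable.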
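
One rough way to put a number on the bus factor count is sketched below in Python. It assumes you keep (or can derive) a map from each area of the system to the people who know it well enough to work on it; the map, the names in it, and the function name are all hypothetical.

```python
def bus_factor_people(knowledge):
    """Return the set of people who are the only one who knows some area.

    `knowledge` maps an area name to the set of people who know it
    (an assumed, hand-maintained structure).
    """
    sole_owners = set()
    for area, people in knowledge.items():
        if len(people) == 1:  # exactly one person knows this area
            sole_owners.update(people)
    return sole_owners


# Made-up sample data, for illustration only.
knowledge = {
    "billing": {"dana"},
    "deploy scripts": {"dana", "lee"},
    "search": {"sam"},
    "login": {"lee", "sam", "dana"},
}

people = bus_factor_people(knowledge)
print(len(people), sorted(people))  # 2 ['dana', 'sam']
```

Pairing time (metric 9) is one of the things that should push this count down, since each area ends up with more than one person who knows it.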