9 Metrics DevOps Teams Should Be Tracking
Payal Chakravarty from IBM wrote an article today on DevOps.com describing the lessons learned from her team’s transformation from Agile to DevOps over the past year, and the resulting “DevOps scorecard” they’ve developed to track indicators of the team’s progress (see article here). Change is never easy, but when beginning a transformation it’s important to identify the objectives of the project and how those objectives will translate into appropriate metrics for tracking progress. For Payal’s group, this was one of the biggest challenges at the outset, as team members asked how success would be defined and achieved. Ultimately these questions led them to document their chosen success criterion for the transformation – “Ship code frequently without causing a customer outage.”
Over time and through their experiences, the team found a “more granular way to track success,” and broke down their self-defined mantra into “quantifiable success metrics that could be represented in a scorecard.” Here, then, are the 9 metrics Payal’s team uses to track their continued progress:
1. Deployment Frequency – How often is the team deploying new code? “This metric should trend up or remain stable from week to week.”
2. Change Volume – How many user stories and new lines of code are being deployed? Payal suggests that another important parameter to track alongside this metric is the complexity of the change.
3. Lead Time (from development to deployment) – This is the cycle time from when new code starts development to when it is successfully deployed into production. Cycle time is an important indicator of the efficiency of the process – when tracked using value stream mapping, it can help the team visualize areas of the process that need improvement, such as handoff times between work centers. “Lead time should reduce as the team gets a better hold of the lifecycle.”
4. Percentage of Failed Deployments – What percentage of deployments caused an outage or a negative user reaction? DevOps’ emphasis on building quality in from the beginning should reduce this metric over time. Payal suggests reviewing this metric together with change volume. “If the change volume is low or remained the same but the percent of failed deployments increased, then there may be a dysfunction somewhere.”
5. Mean Time To Recovery (MTTR) – When failure does occur, how long does it take the team to recover from the issue? According to Payal, this is a “true indicator of how good [the team is] getting with handling change.” Spikes in MTTR are fine for complex issues that the team has never encountered before, but the overall trend for this metric should decrease over time.
6. Customer Ticket Volume – “This is a basic indicator of customer satisfaction,” and an insightful metric to track. By the team’s own success criteria, the goal of their DevOps transformation was to ship more frequently without causing customer outages, and the number of tickets generated by users is a good indicator of how well they’re achieving that goal.
7. % Change in User Volume – “Number of new users signing up, interacting with [the team’s] service and generating traffic.” Tracking this metric can help ensure that your infrastructure is able to meet demand.
8. Availability – Were any SLAs violated, and what is the overall uptime for the product or service? If you can maintain healthy uptime even given fluctuations in user volume, you’re tracking pretty well.
9. Performance (Response Time) – “This metric should remain stable irrespective of % change in user volume or any new deployment,” and indicates that the product or service is operating within predetermined thresholds.
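Several of these metrics fall straight out of a team’s deployment history. As a minimal sketch – using made-up deployment records, not anything from Payal’s scorecard – here is how deployment frequency, percentage of failed deployments, and MTTR might be computed:

```python
from datetime import datetime

# Hypothetical deployment log: (timestamp, failed?, minutes to recover if failed)
deployments = [
    (datetime(2015, 3, 2), False, 0),
    (datetime(2015, 3, 4), True, 45),
    (datetime(2015, 3, 6), False, 0),
    (datetime(2015, 3, 9), True, 30),
    (datetime(2015, 3, 11), False, 0),
]

# Deployment frequency: deployments per week over the observed window
span_days = (deployments[-1][0] - deployments[0][0]).days or 1
per_week = len(deployments) * 7 / span_days

# Percentage of failed deployments
failures = [d for d in deployments if d[1]]
failure_pct = 100 * len(failures) / len(deployments)

# Mean Time To Recovery: average recovery time across failed deployments
mttr = sum(d[2] for d in failures) / len(failures)

print(f"{per_week:.1f} deploys/week, {failure_pct:.0f}% failed, MTTR {mttr:.1f} min")
# → 3.9 deploys/week, 40% failed, MTTR 37.5 min
```

In practice these numbers would be pulled from a CI/CD pipeline and incident-tracking system rather than a hard-coded list, and trended week over week on the scorecard rather than computed once.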
Interested in learning more about DevOps? Check out our white paper The Business Case for DevOps.