DevOps metrics and KPIs are important for two main reasons. First, they show how well our current DevOps practices are working. Second, they point to what we can do better. In other words, they help us make data-driven decisions to improve our work.
They’re critical to DevOps success.
At MentorMate, we strongly believe in DevOps metrics and KPIs and invest in their development.
The goal of every software development company is to have better software delivery and operational (SDO) performance. That means faster software delivery and a shorter time to market (the period from when an idea is born to the moment it reaches users).
SDO performance is measured with four DevOps metrics and KPIs. We track these industry-standard metrics, which show how well a team performs in its field.
4 Technical DevOps Metrics and KPIs
Generally, there are four types of DevOps metrics and KPIs that we measure — deployment frequency, lead time for changes, time to restore service, and change failure rate.
Deployment Frequency
Deployment frequency shows how often the organization deploys code into production, i.e. how often the organization releases it to end-users. The authors of the annual Accelerate State of DevOps Report present a comparison between elite, high, medium, and low performing organizations.
According to their findings, elite organizations deploy code multiple times per day. They routinely deploy on-demand. These results are consistent for this elite group over the last several years.
At the other end of the spectrum, low performers deploy between two and twelve times per year. This result shows a decrease in performance in comparison with the previous year’s results.
Accelerate found that elite performers deploy code 208 times more frequently than low performers.
At MentorMate, we deploy multiple releases per day. To achieve this, we use Continuous Integration/Continuous Delivery (CI/CD) pipelines to automate the software delivery process. These pipelines help remove errors and thus give us more confidence in the product.
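As a rough illustration of how deployment frequency could be computed from a deployment log, here is a minimal Python sketch. The function name `deployments_per_day` and the sample dates are hypothetical; they are not taken from Accelerate or from MentorMate's tooling. The sketch assumes each production deployment is recorded with its calendar date.

```python
from datetime import date


def deployments_per_day(deploy_dates, period_start, period_end):
    """Average production deployments per calendar day (period is inclusive)."""
    days = (period_end - period_start).days + 1
    if days <= 0:
        raise ValueError("period_end must not be before period_start")
    # Count only the deployments that fall inside the reporting period.
    in_period = [d for d in deploy_dates if period_start <= d <= period_end]
    return len(in_period) / days


# Hypothetical deployment log: four deployments over four days.
deploys = [date(2024, 3, 1), date(2024, 3, 1), date(2024, 3, 2), date(2024, 3, 4)]
rate = deployments_per_day(deploys, date(2024, 3, 1), date(2024, 3, 4))
print(f"{rate:.2f} deployments per day")  # prints "1.00 deployments per day"
```

A value above 1 would correspond to deploying multiple times per day, while a value near 0.01 would correspond to only a handful of deployments per year.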
Lead Time for Changes
According to Accelerate, lead time for changes is a key metric that shows “how long it takes to go from code committed to code successfully running in production”.
For elite performing companies, that lead time is less than a single day. For low performers, it’s between one and six months. As a result, elite performers have 106 times faster change lead times than low performers.
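To make the definition concrete, the following Python sketch computes a median lead time from pairs of commit and deployment timestamps. Using the median rather than another summary statistic is an assumption made here for illustration, and `median_lead_time` and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta
from statistics import median


def median_lead_time(changes):
    """Median time from commit to running in production.

    changes: list of (committed_at, deployed_at) datetime pairs.
    """
    if not changes:
        raise ValueError("at least one change is required")
    seconds = [(deployed - committed).total_seconds() for committed, deployed in changes]
    return timedelta(seconds=median(seconds))


# Hypothetical changes with lead times of 2 hours, 5 hours, and 26 hours.
changes = [
    (datetime(2024, 3, 1, 9, 0), datetime(2024, 3, 1, 11, 0)),
    (datetime(2024, 3, 1, 10, 0), datetime(2024, 3, 1, 15, 0)),
    (datetime(2024, 3, 2, 8, 0), datetime(2024, 3, 3, 10, 0)),
]
lead_time = median_lead_time(changes)
print(lead_time)  # prints "5:00:00"
```

With these sample values the median lead time is five hours, which would fall into the "less than a single day" band cited above.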
Time to Restore Service
Time to restore service shows how long it takes to restore service if and when the software breaks or a defect occurs that impacts users. Such incidents could be an unplanned outage or a service impairment.
According to Accelerate, elite performers report that it takes them less than one hour to restore service. Low performers, on the other hand, report that it takes them between one week and one month.
The calculations show that elite performers recover from incidents 2,604 times faster than low performers. It is also interesting to note that high and medium performers report the same time to restore service: less than one day.
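One way to track time to restore service is to average the duration of user-impacting incidents. The sketch below is a minimal Python example under that assumption; `mean_time_to_restore` and the incident timestamps are hypothetical, and a real incident log would likely need more fields (severity, affected service, and so on).

```python
from datetime import datetime, timedelta


def mean_time_to_restore(incidents):
    """Average duration between an incident starting and service being restored.

    incidents: list of (started_at, restored_at) datetime pairs.
    """
    if not incidents:
        raise ValueError("at least one incident is required")
    durations = [restored - started for started, restored in incidents]
    # Summing timedeltas needs an explicit timedelta start value.
    return sum(durations, timedelta()) / len(durations)


# Hypothetical incidents lasting 30 minutes and 90 minutes.
incidents = [
    (datetime(2024, 3, 5, 14, 0), datetime(2024, 3, 5, 14, 30)),
    (datetime(2024, 3, 9, 2, 0), datetime(2024, 3, 9, 3, 30)),
]
restore_time = mean_time_to_restore(incidents)
print(restore_time)  # prints "1:00:00"
```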
Change Failure Rate
The change failure rate shows how often new changes and updates to the software break or fail. In other words, “what percentage of changes to production or released to users result in degraded service (e.g., lead to service impairment or service outage) and subsequently require remediation (e.g. require a hotfix, rollback, fix forward, patch)”.
According to the report, elite performers report a change failure rate between 0% and 15%, while low performers report between 46% and 60%. These statistics show that elite performers have change failure rates seven times lower than low performers.
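The change failure rate is a simple percentage: failed changes divided by all changes released. A minimal Python sketch, with the hypothetical function name `change_failure_rate` and made-up counts, could look like this:

```python
def change_failure_rate(total_changes, failed_changes):
    """Percentage of production changes that degraded service and needed remediation."""
    if total_changes <= 0:
        raise ValueError("total_changes must be positive")
    if not 0 <= failed_changes <= total_changes:
        raise ValueError("failed_changes must be between 0 and total_changes")
    return 100 * failed_changes / total_changes


# Hypothetical month: 20 changes released, 3 required a hotfix or rollback.
cfr = change_failure_rate(total_changes=20, failed_changes=3)
print(f"{cfr:.1f}%")  # prints "15.0%"
```

In this made-up example the result sits at the upper edge of the 0 to 15% band that the report associates with elite performers.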
At MentorMate, we invest in automation. In this way, we’re able to decrease human error rates. We believe this leads to more predictable results and fewer failed deployments.
As the analysis above shows, elite, high, medium, and low performers operate at different levels of speed and stability. However, high speed and strong stability are both possible at the same time.
We have also observed that when a company moves from low performing to high performing, the main metrics tend to get worse initially. This is mainly because the old way of doing things goes away and a new way takes its place. This usually leads to initial chaos until things stabilize again.
Plotted on a graph, these metrics tend to form the letter “J”. First the metrics decline, and then the results improve rapidly.
Non-Technical DevOps Metrics
There are also several types of so-called non-technical metrics or non-technical key elements. They represent the culture of psychological safety in a company. According to Accelerate, “this culture of psychological safety is predictive of software delivery performance, organizational performance, and productivity”.
There are several ways in which this culture of psychological safety affects SDO performance in a company. They can be represented as a chain, in which every piece affects the next one.
Here’s how this chain works in general and how it is reflected in MentorMate’s everyday work and values:
Psychological Safety
Psychological safety is at the top of the chain and means that team members feel safe taking risks and can be open with one another.
At MentorMate, we feel the constant support of our managers and team members and work in a highly psychologically safe environment. This is achieved by following one of our core values: integrity, which helps us build trust and be moral.
Dependability
Dependability follows psychological safety and means that team members get things done on time and tasks are distributed evenly among them. No one does most of the work at the expense of others.
We strongly believe in collaboration as one of our core values. We can’t achieve dependability without it: distributing tasks evenly and cooperating well is what allows team members to get things done on time.
Structure and Clarity
Structure and clarity mean that team members have clear roles, goals, and responsibilities. This piece of the chain is well represented in MentorMate’s values: working intelligently and striving for results. We work intelligently, which results in a clear structure of processes and clear roles and responsibilities within the team as well as within the company as a whole.
At the same time, we set clear goals for ourselves and strive to achieve them, while remaining flexible towards our clients’ needs. Along the way, we modify processes based on those needs.
Meaning
Meaning refers to team members finding their work meaningful. MentorMate’s clients come from many different industries: healthcare, education, manufacturing, finance, and agriculture, to name a few. All of our projects in these fields are meaningful and create lasting value for our clients and their users.
Impact
Impact means that team members believe their work is impactful, i.e. it leads to positive change in the surrounding community. MentorMate’s portfolio not only spans different industries but also has a positive effect on the community as a whole: it matters and creates change.
This is how MentorMate approaches the non-technical key elements and creates a culture of psychological safety. As a consequence of doing well on these non-technical elements, productivity and SDO performance improve too.
At MentorMate, we believe in continuous improvement through investing time and effort in automation, traceability, and monitoring. What’s more, we offer clients the whole spectrum of services: business analysis, project management, development across the whole range of practices, as well as ongoing support after the product release. Our aim is for the released product to be able to handle thousands of users.
According to the graphic above, automated testing leads to continuous integration, which in turn leads to continuous delivery. Cloud services, along with continuous delivery and the culture of psychological safety, all generally contribute to better SDO performance at MentorMate. This in turn is the reason for improved organizational performance on a larger scale.
At MentorMate, we use both technical and non-technical metrics and KPIs to provide better products and services to our clients and their users. We also invest effort, time, and money in automation wherever possible. This allows us to reduce human error and achieve better results.
This post was written in collaboration with MentorMate’s Cloud & DevOps Manager Daniel Rankov.