This article is day 20 of the Motivation Cloud Series Advent Calendar 2020.
After joining the engineering organization, I spent about a year developing web applications, and in September of this year I moved to the SRE team with no prior SRE experience.
Because the SRE team covers a wide scope and handles a wide range of tasks, right after joining I had no idea how to prioritize them and kept wondering, "What should I start with?"
The book "Site Reliability Engineering" states that SRE is responsible for the "availability", "latency", "performance", "efficiency", "change management", "monitoring", "emergency response", and "capacity planning" of a service. In addition to these, our team is also responsible for "security risk", "cost", and "development productivity".
Item | Contents |
---|---|
Security risk | Reduce security risks so that customers can use our services with peace of mind. |
Cost | Optimize the cost of operating the system, and manage spending so that it does not exceed the set budget. |
Development productivity | Improve development productivity by eliminating toil and improving the deployment flow. |
In this article, I summarize how our team came up with a way to prioritize the various day-to-day SRE tasks described above. I hope you find it helpful.
In order to prioritize tasks, we first organized the tasks from the following perspectives.
Item | Contents | Remarks |
---|---|---|
Target of influence | Who is affected? | ・ Customer impact ・ Internal impact |
Impact | How far does the impact extend? | ・ Large impact ・ Normal impact ・ Small impact |
Frequency | How often does the problem occur? | ・ It is occurring right now ・ It may occur ・ There is almost no possibility of it occurring |
Cost | How long does it take to respond? | ・ Within 1 day ・ Within a few days ・ 1 week or more |
We gave each perspective a weighted score and combined the scores into a priority score that decides the priority. The scores and the reasoning behind each are explained below.
Our company puts the customer first, so we believe that tasks affecting customers should be resolved before tasks that only affect us internally. We therefore set the scores as follows.
Item | Score |
---|---|
Customer impact | 5 |
Internal impact | 3 |
Impact has many facets, ranging from the field to the development organization and customers, and dividing each of them too finely makes it impossible to settle on a priority. Keeping the granularity coarse makes the priority easier to decide, and because the degree of impact carries a lot of weight, the scores are assigned as follows.
Item | Score |
---|---|
Large impact | 5 |
Normal impact | 3 |
Small impact | 1 |
If a task is not handled quickly, you may end up spending time dealing with a problem that is happening right now. We therefore determine the score by considering whether the problem is currently occurring.
Item | Score |
---|---|
A problem is occurring right now | 3 |
A problem may occur | 2 |
There is almost no possibility of a problem | 1 |
Cost refers to the time it takes to complete the task, and tasks that can be finished sooner receive a higher score. When deciding on a score, do not estimate on your own; agree within the team on how long the task will take. Assign the score so that the deadline will be met no matter who takes on the task.
Item | Score |
---|---|
Within 1 day | 3 |
Within a few days | 2 |
1 week or more | 1 |
We defined the priority score as follows, where Category is the score for the target of influence:
Priority score = Category × Impact × Frequency × Cost
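As a rough illustration, the four score tables and the formula above can be sketched in Python. This is only a minimal sketch: the dictionary keys and the `priority_score` function name are hypothetical labels chosen for this example, not something our team actually uses.

```python
from math import prod

# Score tables transcribed from the article.
# The key names are hypothetical labels made up for this sketch.
TARGET_SCORES = {"customer": 5, "internal": 3}
IMPACT_SCORES = {"large": 5, "normal": 3, "small": 1}
FREQUENCY_SCORES = {"occurring_now": 3, "may_occur": 2, "unlikely": 1}
COST_SCORES = {"within_1_day": 3, "within_a_few_days": 2, "1_week_or_more": 1}


def priority_score(target: str, impact: str, frequency: str, cost: str) -> int:
    """Multiply the four per-perspective scores into one priority score."""
    return prod((
        TARGET_SCORES[target],
        IMPACT_SCORES[impact],
        FREQUENCY_SCORES[frequency],
        COST_SCORES[cost],
    ))


# A customer-facing, large-impact problem that is occurring now
# and takes a few days to fix: 5 * 5 * 3 * 2 = 150
print(priority_score("customer", "large", "occurring_now", "within_a_few_days"))
```

With these tables the score ranges from 3 (internal, small impact, unlikely, 1 week or more) to 225 (customer, large impact, occurring now, within 1 day).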
By focusing on tasks with a priority score of 70 or higher and building our schedule from among them, it became clear what we should work on.
Until now, each team member had their own sense of priority and these did not always match, but the team's priorities are now aligned. As a result, I no longer wonder what to do today within the short timeline of a week, and my performance has improved.
When you are unsure what to do as an SRE, try organizing and prioritizing your daily tasks. Once it is clear what to focus on today, you will no longer feel lost.
It may be tempting to neglect organizing and prioritizing tasks every day, but it is important to do it deliberately.
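The selection step could look something like the following sketch. It assumes each task already carries a priority score computed with the formula above; the `Task` type, the `tasks_to_schedule` function, and the backlog entries are all hypothetical and exist only for illustration. Sorting by score is an extra convenience added here, not something the rule itself requires.

```python
from typing import NamedTuple

PRIORITY_THRESHOLD = 70  # threshold taken from the article


class Task(NamedTuple):
    name: str   # hypothetical task title
    score: int  # priority score already computed with the formula


def tasks_to_schedule(
    tasks: list[Task], threshold: int = PRIORITY_THRESHOLD
) -> list[Task]:
    """Keep tasks scoring at or above the threshold, highest score first."""
    selected = [t for t in tasks if t.score >= threshold]
    return sorted(selected, key=lambda t: t.score, reverse=True)


# Made-up backlog; each score is Category * Impact * Frequency * Cost.
backlog = [
    Task("Fix failing customer login", 5 * 5 * 3 * 3),   # 225
    Task("Tidy internal dashboard", 3 * 1 * 1 * 2),      # 6
    Task("Rotate expiring certificate", 5 * 5 * 2 * 3),  # 150
    Task("Automate manual deploy step", 3 * 3 * 3 * 3),  # 81
]

for task in tasks_to_schedule(backlog):
    print(task.score, task.name)
```

In this made-up backlog, the dashboard cleanup (score 6) falls below the threshold and stays out of the schedule, while the other three tasks are listed from highest to lowest score.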