SLOs and SLIs: How do I go about it?
- Sit down with stakeholders and write SLO statements for the critical paths they care about.
- Translate those critical paths into SLIs by identifying metrics that matter.
- Agree on priorities and secure firm commitments to add the metrics and instrumentation needed to measure these SLIs.
- Negotiate SLO targets with the team so that, once SLIs are in place, rules and alerts can be created.
- With SLOs defined, SLIs implemented, and alerts configured, move on to discussions around short-term and long-term error budgets.
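To make the error-budget discussion in the last step concrete, here is a minimal Python sketch of the arithmetic, assuming the SLI is a simple ratio of good events to total events within one window. The function name `error_budget`, its parameters, and the sample numbers are illustrative assumptions, not part of any monitoring platform.

```python
def error_budget(slo_target: float, total_events: int, bad_events: int) -> dict:
    """Summarise error-budget usage for one SLO window.

    slo_target is a fraction such as 0.999 (that is, Y% divided by 100).
    All names here are hypothetical and chosen for illustration only.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_events <= 0 or not 0 <= bad_events <= total_events:
        raise ValueError("need total_events > 0 and 0 <= bad_events <= total_events")

    # SLI: fraction of events that were good in this window.
    sli = (total_events - bad_events) / total_events
    # Error budget: how many bad events the SLO tolerates in this window.
    allowed_bad = (1.0 - slo_target) * total_events
    return {
        "sli": sli,
        "slo_met": sli >= slo_target,
        "allowed_bad_events": allowed_bad,
        # 1.0 means the budget is untouched; 0.0 or below means it is spent.
        "budget_remaining_fraction": 1.0 - bad_events / allowed_bad,
    }


# Made-up numbers: a 99.9% target, one million requests, 400 failures.
report = error_budget(slo_target=0.999, total_events=1_000_000, bad_events=400)
print(report)
```

With these made-up inputs the budget allows roughly 1,000 bad events, so 400 failures leave about 60% of the budget for the rest of the window. A short-term budget and a long-term budget can reuse the same calculation with different window sizes.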
SLO Statement
“The value for X remains as expected for Y% over Z time-window”
Rationale
“X represents a metric or attribute of key interest because of (specific reason)”
“Y was calculated based on (specific reason)”
“Z is the time window considered for X because of (specific reason)”
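One way to keep the statement and its rationale together is to store them as a single record. The sketch below is a hypothetical Python dataclass; the class name `SLOStatement`, its field names, and every example value are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SLOStatement:
    """Hypothetical container for one SLO statement and its rationale."""

    metric: str              # X: the metric or attribute of key interest
    target_percent: float    # Y: the agreed target percentage
    window: str              # Z: the time window the target applies to
    metric_rationale: str    # why X matters
    target_rationale: str    # how Y was calculated
    window_rationale: str    # why Z was chosen

    def render(self) -> str:
        # Fill the X / Y / Z slots of the statement template.
        return (
            f"The value for {self.metric} remains as expected "
            f"for {self.target_percent}% over {self.window}"
        )


# Every value below is invented purely to show the shape of a filled-in template.
statement = SLOStatement(
    metric="checkout request success rate",
    target_percent=99.9,
    window="a rolling 30-day window",
    metric_rationale="failed checkouts directly block purchases",
    target_rationale="agreed with stakeholders from historical performance",
    window_rationale="matches the monthly reporting cycle",
)
print(statement.render())
```

Keeping the three rationale fields mandatory means a statement cannot be recorded without explaining where X, Y, and Z came from.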
SLI
The metric that expresses X is: (some valid metric)
This metric is located in: (dataset or monitoring platform) and is sourced from (client or server).
Alert
This metric triggers an alert if its value is above or below the expected value for X by more than (threshold) for (defined duration).
This alert is configured at: (location of the alert based on the system).
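The alert condition above can be sketched as a small check over recent samples. This is a simplified Python illustration, assuming evenly spaced samples so that the defined duration can be expressed as a count of consecutive samples. The function `should_alert` and its parameters are hypothetical, and a real monitoring platform would express the same rule in its own alerting syntax.

```python
from typing import Sequence


def should_alert(
    samples: Sequence[float],
    expected: float,
    threshold: float,
    min_consecutive: int,
) -> bool:
    """Return True when the latest `min_consecutive` samples all deviate
    from `expected` by more than `threshold`, in either direction.

    Hypothetical helper: `min_consecutive` stands in for the defined
    duration, assuming samples arrive at a fixed interval.
    """
    if min_consecutive <= 0:
        raise ValueError("min_consecutive must be positive")
    if len(samples) < min_consecutive:
        return False  # not enough data yet to cover the duration

    recent = samples[-min_consecutive:]
    return all(abs(value - expected) > threshold for value in recent)


# Made-up latency samples in milliseconds, expected value 200, threshold 100.
sustained = should_alert([200, 450, 460, 470], expected=200, threshold=100, min_consecutive=3)
transient = should_alert([200, 450, 210, 470], expected=200, threshold=100, min_consecutive=3)
print(sustained, transient)
```

Requiring the deviation to hold for the whole duration keeps a single spike from paging anyone, which is the purpose of the (defined duration) slot in the template.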