SLOs and SLIs: How do I go about it?
- Sit down with stakeholders and define SLO statements for the critical paths they care about.
- Translate those critical paths into SLIs by identifying the metrics that matter.
- Set priorities and get firm agreement on adding the metrics and instrumentation
  needed to measure these SLIs.
- Negotiate SLO targets with the team so that, once the SLIs are in place, rules and alerts
  can be created.
- With SLOs defined, SLIs implemented, and alerts configured, move on to discussions
  around short-term and long-term error budgets.
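
The error budget in the last step follows directly from the SLO target and its time window. The sketch below assumes a time-based SLO, where the budget is the share of the window in which the SLO may be violated; the function names are illustrative and not part of any monitoring tool.

```python
def error_budget_minutes(slo_target_percent: float, window_days: int) -> float:
    """Return how many minutes of the window may violate the SLO."""
    if not 0 < slo_target_percent <= 100:
        raise ValueError("SLO target must be in the range (0, 100]")
    window_minutes = window_days * 24 * 60
    # The budget is whatever share of the window the target leaves uncovered.
    return (100 - slo_target_percent) / 100 * window_minutes


def budget_remaining_minutes(
    slo_target_percent: float, window_days: int, bad_minutes: float
) -> float:
    """Return the unspent budget; a negative value means the budget is exhausted."""
    return error_budget_minutes(slo_target_percent, window_days) - bad_minutes
```

For example, a 99.9% target over a 30-day window leaves about 43.2 minutes of budget. A short-term budget can be derived the same way by passing a smaller window.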
SLO Statement
“The value of X remains as expected for Y% of the time window Z”
Rationale
“X represents a metric or attribute of key interest because of (specific reason)”
“Y was calculated based on (specific reason)”
“Z is the time window considered for X because of (specific reason)”
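
As a rough illustration, the statement template can be captured as a small record and checked against measurements. This sketch assumes each measurement has already been reduced to a boolean ("X was as expected" or not); `SloStatement`, `attained_percent`, and `is_met` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class SloStatement:
    metric: str            # X: the metric or attribute of key interest
    target_percent: float  # Y: the agreed target
    window: str            # Z: the time window, e.g. "30 days"

    def render(self) -> str:
        """Fill in the statement template with this SLO's values."""
        return (
            f"The value of {self.metric} remains as expected for "
            f"{self.target_percent}% of the time window {self.window}"
        )


def attained_percent(samples: Sequence[bool]) -> float:
    """Percentage of samples in the window where X was as expected."""
    if not samples:
        raise ValueError("at least one sample is required")
    return 100 * sum(samples) / len(samples)


def is_met(slo: SloStatement, samples: Sequence[bool]) -> bool:
    """True when the attained percentage reaches the target Y."""
    return attained_percent(samples) >= slo.target_percent
```

Keeping X, Y, and Z as explicit fields makes it easier to record the rationale for each one next to the value it justifies.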
SLI
The metric that expresses X is: (some valid metric).
This metric is located in (dataset or monitoring platform) and is sourced from (client or server).
Alert
This metric triggers an alert if its value is above or below the expected value of X
by more than (threshold) for (defined duration).
This alert is configured at: (location of the alert based on the system).
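
The alert condition can be sketched as follows. This is a minimal illustration, not how any particular monitoring platform evaluates rules: it assumes samples arrive as time-ordered `(timestamp_seconds, value)` pairs and that the alert fires only when the deviation persists for the whole defined duration. The name `should_alert` and its parameters are hypothetical.

```python
from typing import Iterable, Optional, Tuple


def should_alert(
    samples: Iterable[Tuple[float, float]],
    expected: float,
    threshold: float,
    duration_seconds: float,
) -> bool:
    """Return True if the value stays outside expected +/- threshold
    for at least duration_seconds without recovering in between."""
    breach_start: Optional[float] = None
    for timestamp, value in samples:
        if abs(value - expected) > threshold:
            # Remember when the current run of breaching samples began.
            if breach_start is None:
                breach_start = timestamp
            if timestamp - breach_start >= duration_seconds:
                return True
        else:
            # Any in-range sample resets the run.
            breach_start = None
    return False
```

Requiring the breach to persist for the defined duration keeps a single outlier from paging anyone, at the cost of a slower reaction to real incidents.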