SLI
A Service Level Indicator is a numerical measurement of one specific dimension of a service's behaviour (request latency, error rate, throughput, availability), taken over a defined time window. SLIs are the raw measurements that SLOs reference: an SLI says 'p99 latency this hour was 412 ms'; the SLO says 'p99 latency should stay under 500 ms in 99.9% of measurement windows'.
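The split between measurement and target can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `LatencySlo`, `compliant_ratio`, `slo_met` and the hourly readings are all hypothetical names and made-up values, and it assumes the SLO is judged by counting the windows whose p99 was under the threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LatencySlo:
    """Hypothetical SLO: p99 must be under threshold_ms in at least target_ratio of windows."""
    threshold_ms: float
    target_ratio: float


def compliant_ratio(p99_per_window_ms: list[float], slo: LatencySlo) -> float:
    """Share of windows whose measured p99 (the SLI) was under the threshold."""
    if not p99_per_window_ms:
        raise ValueError("need at least one window")
    good = sum(1 for value in p99_per_window_ms if value < slo.threshold_ms)
    return good / len(p99_per_window_ms)


def slo_met(p99_per_window_ms: list[float], slo: LatencySlo) -> bool:
    """True when enough windows were compliant to satisfy the objective."""
    return compliant_ratio(p99_per_window_ms, slo) >= slo.target_ratio


slo = LatencySlo(threshold_ms=500.0, target_ratio=0.999)
hourly_p99_ms = [412.0, 398.0, 530.0, 441.0]  # made-up SLI readings, one per hour

print(compliant_ratio(hourly_p99_ms, slo))  # 0.75
print(slo_met(hourly_p99_ms, slo))          # False
```

The SLI values are plain numbers; only the `LatencySlo` object carries an opinion about what is acceptable.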
Good SLIs share three properties: they measure what users actually experience (not internal system metrics), they change smoothly as the service degrades (a percentile or ratio rather than a raw count or pass/fail flag), and they're computable from telemetry the service already emits. Common SLIs include availability (successful requests ÷ total requests), latency percentiles (p50, p95, p99), and freshness (time since last successful refresh of derived data). Error budget burn rate is often tracked alongside them, though it is derived from an SLI and its SLO rather than measured directly. A common SLI mistake is measuring success where the server sees it rather than where the user does: a 200 OK logged at the load balancer means little if the response took 30 seconds to arrive.
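As a rough sketch of how these SLIs might be computed from per-request telemetry, the Python below derives availability, a latency-aware "good request" ratio, and latency percentiles. The `Request` record, the function names, the 5xx-means-failure rule, the 1000 ms threshold and the sample data are all assumptions made for illustration; percentiles use the nearest-rank method, which is one of several conventions.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Hypothetical telemetry record for a single request."""
    status: int
    latency_ms: float


def availability(requests: list[Request]) -> float:
    """Successful requests / total requests; 5xx counts as failure (an assumption)."""
    if not requests:
        raise ValueError("need at least one request")
    return sum(1 for r in requests if r.status < 500) / len(requests)


def good_ratio(requests: list[Request], max_latency_ms: float) -> float:
    """Like availability, but a request is only 'good' if it was also fast enough."""
    if not requests:
        raise ValueError("need at least one request")
    good = sum(1 for r in requests if r.status < 500 and r.latency_ms <= max_latency_ms)
    return good / len(requests)


def percentile(latencies_ms: list[float], p: float) -> float:
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    if not latencies_ms:
        raise ValueError("need at least one sample")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(latencies_ms)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


# Made-up sample: eight fast successes, one 30-second success, one fast 503.
sample = [Request(200, 100.0 + 10 * i) for i in range(8)]
sample += [Request(200, 30_000.0), Request(503, 50.0)]
latencies = [r.latency_ms for r in sample]

print(availability(sample))        # 0.9
print(good_ratio(sample, 1000.0))  # 0.8
print(percentile(latencies, 50))   # 130.0
print(percentile(latencies, 99))   # 30000.0
```

The gap between 0.9 and 0.8 is the 30-second response: it counts as a success by status code alone, but not once latency is part of the definition of "good".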
Related terms
- SLO
A Service Level Objective is a target value or range for an SLI, typically expressed as a percentage over a time window.
- Error budget
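An error budget is the amount of unreliability the SLO permits over its window: 100% minus the SLO target, so a 99.9% SLO leaves a 0.1% budget. It is defined against the SLO (operational target), not the SLA (customer contract), which is usually set looser than the SLO.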
- Latency percentile
A latency percentile (p50, p95, p99, p999) is the response time at or below which that percentage of requests completed; p999 denotes the 99.9th percentile.
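The arithmetic behind an error budget can be sketched as follows, assuming the widely used convention that the budget is the share of requests the SLO allows to fail (1 minus the target) over the window. The function names and the request counts are hypothetical.

```python
def allowed_failures(slo_target: float, total_requests: int) -> float:
    """Number of requests the SLO permits to fail in the window."""
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1")
    return (1 - slo_target) * total_requests


def budget_remaining(slo_target: float, total_requests: int, failed_requests: int) -> float:
    """Fraction of the error budget still unspent; negative means the SLO was missed."""
    allowed = allowed_failures(slo_target, total_requests)
    if allowed == 0:
        raise ValueError("no traffic in the window, so the budget is undefined")
    return 1 - failed_requests / allowed


# Made-up numbers: a 99.9% SLO over one million requests permits about 1000 failures.
remaining = budget_remaining(0.999, 1_000_000, 250)
print(f"{remaining:.1%}")  # 75.0%
```

A negative result means more requests failed than the SLO allows for the window.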