Four golden signals
The four golden signals (latency, traffic, errors, and saturation) are the minimum monitoring set Google SRE recommends for any user-facing service. Together they answer four questions: 'is the service responding quickly enough?', 'how much load is it handling?', 'how often is it failing?', and 'how close is it to its capacity ceiling?'.
The framing comes from chapter 6 of Google's Site Reliability Engineering book. Latency is the time taken to serve a request, and it should be tracked separately for successful and failed requests: a fast error is still an error, and folding it into the overall figure makes the service look healthier than it is. Traffic is the demand placed on the service (RPS, concurrent users, queue depth). Errors are the rate of failed requests, broken down by failure mode. Saturation is the fullness of the most-constrained resource (CPU, memory, IOPS, queue depth) and is a leading indicator of upcoming capacity problems. A service instrumented for the four golden signals plus a small set of business-specific SLIs covers most operational diagnosis needs without the dashboard sprawl that comes from monitoring everything.
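As a rough illustration of the definitions above, the following Python sketch derives all four signals from a batch of request records collected over one window. The Request record, the golden_signals function, the choice of the 99th percentile, and the utilization mapping are all assumptions made for this example; a real service would normally read these values from its metrics system rather than compute them by hand.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Request:
    """One observed request (hypothetical record shape for this sketch)."""
    duration_ms: float
    ok: bool


def percentile(values: List[float], pct: float) -> float:
    """Nearest-rank percentile; returns 0.0 when there are no samples."""
    if not values:
        return 0.0
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def golden_signals(
    requests: List[Request],
    window_s: float,
    utilization: Dict[str, float],
) -> Dict[str, float]:
    """Summarise one observation window as the four golden signals.

    utilization maps a resource name to its used fraction (0.0 to 1.0);
    the resource names are illustrative, not a fixed schema.
    """
    ok_latencies = [r.duration_ms for r in requests if r.ok]
    error_latencies = [r.duration_ms for r in requests if not r.ok]
    total = len(requests)
    return {
        # Latency: successes and failures are kept apart on purpose.
        "latency_p99_ok_ms": percentile(ok_latencies, 99),
        "latency_p99_error_ms": percentile(error_latencies, 99),
        # Traffic: demand expressed as requests per second.
        "traffic_rps": total / window_s,
        # Errors: fraction of requests that failed.
        "error_rate": len(error_latencies) / total if total else 0.0,
        # Saturation: fullness of the most-constrained resource.
        "saturation": max(utilization.values(), default=0.0),
    }
```

Calling golden_signals with the requests seen in the last minute, window_s=60, and a mapping such as {"cpu": 0.4, "memory": 0.9} would report a saturation of 0.9, because memory is the tighter resource. Breaking errors down by failure mode, as the entry recommends, would need an extra field on each record and is left out of this sketch.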
Related terms
- Observability
Observability is the property of a system that lets engineers understand its internal state from external outputs — answering questions about how the system is behaving without modifying it.
- SLI
A Service Level Indicator is a numerical measurement of one specific dimension of a service's behaviour — request latency, error rate, throughput, availability — expressed over a defined window.
- Saturation
Saturation is the measure of how full the most-constrained resource of a system is — CPU, memory, IOPS, network bandwidth, queue depth, file descriptors.
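The SLI and Saturation entries above can be made concrete with a small Python sketch. It assumes events arrive as (timestamp, ok) pairs and that resource usage is available as (used, capacity) pairs; the function names availability_sli and most_constrained are hypothetical and chosen only for this example.

```python
from typing import Dict, List, Optional, Tuple


def availability_sli(
    events: List[Tuple[float, bool]],
    window_start: float,
    window_end: float,
) -> Optional[float]:
    """Fraction of successful events with window_start <= timestamp < window_end.

    Returns None when the window holds no events, since the indicator
    is undefined without measurements.
    """
    in_window = [ok for ts, ok in events if window_start <= ts < window_end]
    if not in_window:
        return None
    return sum(in_window) / len(in_window)


def most_constrained(usage: Dict[str, Tuple[float, float]]) -> Tuple[str, float]:
    """Return (resource, used fraction) for the fullest resource.

    usage maps a resource name to (used, capacity); entries with a
    non-positive capacity are ignored.
    """
    ratios = {name: used / cap for name, (used, cap) in usage.items() if cap > 0}
    if not ratios:
        raise ValueError("no resource with a positive capacity")
    name = max(ratios, key=lambda key: ratios[key])
    return name, ratios[name]
```

With usage of {"cpu": (2.0, 8.0), "memory": (12.0, 16.0), "file_descriptors": (900, 1024)}, most_constrained picks file descriptors at roughly 88% full, even though CPU and memory are the resources most dashboards lead with. This is why saturation is defined against the most-constrained resource rather than a fixed one.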