SLI
Service Level Indicator: a quantitative measure of service behavior, like the proportion of requests faster than 300ms. The raw metric that feeds SLOs.
What is SLI?
SLI is an advanced concept in the Reliability & Resilience area of system design. Engineers reach for it when they need to reason about reliability trade-offs using measured behavior rather than intuition. Production systems at companies like Netflix, Amazon, and Google make these decisions every day.
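As a rough illustration of the definition above, the sketch below computes a latency SLI as the proportion of requests faster than a threshold. The function name `latency_sli`, the sample latencies, and the 300 ms default are illustrative assumptions, not part of any particular monitoring tool.

```python
from typing import Iterable


def latency_sli(latencies_ms: Iterable[float], threshold_ms: float = 300.0) -> float:
    """Return the fraction of requests that completed faster than threshold_ms.

    Hypothetical helper for illustration; real systems usually derive this
    from counters or histograms rather than raw per-request lists.
    """
    latencies = list(latencies_ms)
    if not latencies:
        # An SLI over zero requests is undefined, so fail loudly.
        raise ValueError("no requests in the measurement window")
    good = sum(1 for latency in latencies if latency < threshold_ms)
    return good / len(latencies)


# Made-up request latencies (milliseconds) for one measurement window.
sample = [120.0, 250.0, 310.0, 95.0, 480.0, 299.0, 150.0, 305.0, 210.0, 180.0]

sli = latency_sli(sample)
print(f"{sli:.1%}")  # 7 of 10 requests are under 300 ms -> 70.0%
```

The result is a ratio of good events to total events, which is the form an SLO target can be compared against directly.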
If you want to go deeper than this definition, with diagrams, code, and a quiz to lock it in, work through the "SLI" lesson linked below. It walks through the why, the mechanism, the trade-offs, and how large companies use it in production.
Learn SLI in depth
Full interactive lesson with diagrams, code examples, real-world references, and a quiz.
Open the SLI lesson
Related lessons
Lessons that touch on SLI as part of a larger topic.
SLI (Service Level Indicators)
The specific measurements that tell you whether your service is healthy, the numbers behind your promises
intermediate · observability monitoring
SLA/SLO/SLI Overview
Putting it all together, how SLIs, SLOs, and SLAs work as a unified reliability framework
intermediate · observability monitoring
Sliding Windows
Windows that slide with every event, continuous recalculation over the most recent N events or T time
advanced · stream batch processing
Sliding Window
Rate limit with a moving time window, smoother than fixed windows and more accurate
intermediate · api design protocols
Routing Slip
Attach a travel itinerary to the message itself, each service processes it and forwards to the next stop
intermediate · messaging event systems
See also
Related glossary terms you might want to look up next.
SLO
Service Level Objective: a target value for an SLI, like '99.9% of requests under 300ms.' The internal engineering goal that drives reliability investment.
SLA
Service Level Agreement: a contractual commitment between provider and customer specifying uptime, response time, and penalties for breaches. The business version of an SLO.
Metrics
Numerical measurements collected over time that describe system behavior: request rate, error rate, latency percentiles, CPU utilization. Prometheus is a widely used open-source collector.
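To connect the SLO entry above back to the SLI, here is a minimal sketch of checking a measured SLI against a target such as 99.9%. The function name `meets_slo` and the request counts are hypothetical and chosen only for illustration.

```python
def meets_slo(good_events: int, total_events: int, target: float = 0.999) -> bool:
    """Return True when the measured SLI (good / total) meets the SLO target.

    Hypothetical helper: `good_events` counts requests that satisfied the
    indicator (for example, completed under 300 ms) in the window.
    """
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    return good_events / total_events >= target


# Made-up counts for two measurement windows of 100,000 requests each.
print(meets_slo(99_950, 100_000))  # SLI 0.9995 against a 0.999 target -> True
print(meets_slo(99_800, 100_000))  # SLI 0.9980 against a 0.999 target -> False
```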