Load Shedding
Deliberately dropping low-priority requests during overload to protect the system's ability to serve high-priority traffic. Better to serve some requests than crash serving none.
What is Load Shedding?
Load Shedding is an advanced concept in the Reliability & Resilience area of system design. Engineers reach for it when a service receives more traffic than it can handle and they have to decide which requests to serve and which to reject. This is not only a matter of textbook correctness: production systems at companies like Netflix, Amazon, and Google make these decisions every day.
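To make the definition concrete, here is a minimal sketch of priority-based shedding in Python. Everything in it is an assumption made for illustration: the `Priority` tiers, the `ADMIT_BELOW` thresholds, and the `LoadShedder` class are hypothetical names, and the overload signal is simply the number of in-flight requests relative to an assumed capacity. Real systems often use other signals such as CPU usage, queue depth, or latency.

```python
from enum import IntEnum


class Priority(IntEnum):
    # Hypothetical tiers; a lower value means more important traffic.
    CRITICAL = 0
    NORMAL = 1
    LOW = 2


# Assumed thresholds: a request is admitted only while utilization is
# strictly below the value for its priority. Low-priority traffic is
# shed first, critical traffic last.
ADMIT_BELOW = {
    Priority.CRITICAL: 1.0,
    Priority.NORMAL: 0.8,
    Priority.LOW: 0.6,
}


class LoadShedder:
    """Single-threaded sketch; a real server would need locking or atomics."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity  # assumed max concurrent requests
        self.in_flight = 0

    def try_admit(self, priority: Priority) -> bool:
        """Return True and count the request, or False to shed it."""
        utilization = self.in_flight / self.capacity
        if utilization >= ADMIT_BELOW[priority]:
            return False  # shed: the caller should reject the request quickly
        self.in_flight += 1
        return True

    def release(self) -> None:
        """Call when an admitted request finishes."""
        self.in_flight = max(0, self.in_flight - 1)


if __name__ == "__main__":
    shedder = LoadShedder(capacity=10)
    for _ in range(6):
        shedder.try_admit(Priority.LOW)
    # At 60% utilization, low-priority requests are shed while
    # critical requests are still served.
    print(shedder.try_admit(Priority.LOW))       # False
    print(shedder.try_admit(Priority.CRITICAL))  # True
```

The design choice here is that rejection is cheap: a shed request costs one comparison, so the capacity that remains goes to the traffic that matters most.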
If you want to go deeper than this definition — with diagrams, code, and a quiz to lock it in — work through the "Load Shedding" lesson linked below. It walks through the why, the mechanism, the trade-offs, and how the giants actually use it in production.
Learn Load Shedding in depth
Full interactive lesson with diagrams, code examples, real-world references, and a quiz.
Open the Load Shedding lesson
See also
Related glossary terms you might want to look up next.
Rate Limiting
Controlling how many requests a client can make in a given time window. Protects your API from abuse and ensures fair usage.
Back Pressure
A flow control mechanism where a slow consumer signals upstream producers to slow down. Prevents systems from being overwhelmed by data they can't process.
Graceful Degradation
A strategy where a system continues to function with reduced capability when a component fails, instead of crashing entirely. For example, show cached results when the database is down.