Latency
The delay between sending a request and receiving its response. Amazon reportedly found that every 100 ms of extra latency cost it about 1% in sales.
What is Latency?
Latency is a foundational concept in the Core Fundamentals area of system design. Engineers use it to reason about real-world trade-offs, not just for textbook correctness: production systems at companies like Netflix, Amazon, and Google make latency decisions every day.
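As a rough illustration of the definition, latency can be measured by recording a timestamp before a request and another once the response arrives. The sketch below is a minimal example, not a benchmark harness: `measure_latency_ms` and `fake_request` are hypothetical names, and `fake_request` only sleeps for about 5 ms as a stand-in for a real network call.

```python
import statistics
import time


def measure_latency_ms(operation, samples=20):
    """Time each call to `operation` and return the delays in milliseconds."""
    latencies = []
    for _ in range(samples):
        start = time.perf_counter()  # moment the "request" is sent
        operation()                  # block until the "response" comes back
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def fake_request():
    """Hypothetical stand-in for a network call; sleeps for about 5 ms."""
    time.sleep(0.005)


latencies = measure_latency_ms(fake_request)
print(f"median: {statistics.median(latencies):.1f} ms")
print(f"max:    {max(latencies):.1f} ms")
```

Reporting the median alongside the maximum is a deliberate choice here: a single slow call can hide behind an average, while the worst case is often what users notice.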
To go deeper than this definition, with diagrams, code, and a quiz to lock it in, work through the "Latency" lesson linked below. It walks through the why, the mechanism, the trade-offs, and how large companies use it in production.
Learn Latency in depth
Full interactive lesson with diagrams, code examples, real-world references, and a quiz.
Open the Latency lesson
Related lessons
Lessons that touch on Latency as part of a larger topic.
Latency-Based Routing
Route traffic to the backend with the lowest measured network latency
foundation · load balancing proxies
Network Latency Optimization
Practical techniques to reduce network latency in cloud architectures, from protocol tuning to geographic placement
intermediate · cloud infrastructure
Garbage Collection
Automatic memory reclamation: how GC algorithms work and why they cause latency spikes in distributed systems
advanced · consistency models
Geographic Distribution
Spread data across physical locations to reduce latency and survive regional disasters
intermediate · data replication distribution
Global Tables
Fully replicated tables available in every region: low-latency reads worldwide, with the trade-off of slower writes
intermediate · database types storage
See also
Related glossary terms you might want to look up next.
Throughput
The number of operations a system can handle per unit of time. Think of it as how many cars a highway can move per hour.
Bandwidth
The maximum amount of data that can be transferred over a network in a given time. It's the width of the pipe, not how fast the water flows.
CDN
A network of servers distributed globally that caches content close to users. Netflix uses CDNs to stream video from servers near you, not from one central location.