Token Bucket
A rate limiting algorithm where tokens are added to a bucket at a fixed rate. Each request consumes a token; requests are rejected when the bucket is empty. Allows short bursts.
What is Token Bucket?
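Token bucket is a rate limiting algorithm built around a bucket that holds tokens up to a fixed capacity and is refilled at a constant rate. Each request removes one token; when the bucket is empty, the request is rejected. Because unused tokens accumulate up to the capacity, a client can send a short burst, while its long-run average rate stays capped at the refill rate.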
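The mechanism described above can be sketched in a few lines of Python. This is a minimal, single-process illustration rather than a production implementation: the class name `TokenBucket`, the parameters `capacity`, `refill_rate`, and `clock`, and the method `allow` are all hypothetical names chosen for this example. It also assumes "lazy" refill, where tokens are topped up when a request arrives instead of by a background timer.

```python
import time


class TokenBucket:
    """Illustrative single-process token bucket (names are assumptions)."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity        # maximum tokens, i.e. the largest burst
        self.refill_rate = refill_rate  # tokens added per second
        self.clock = clock              # injectable time source, useful in tests
        self.tokens = float(capacity)   # start full so an initial burst is allowed
        self.last_refill = clock()

    def _refill(self):
        # Lazy refill: credit the tokens earned since the last call,
        # never exceeding the bucket's capacity.
        now = self.clock()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now

    def allow(self, cost=1):
        # Return True and consume `cost` tokens if enough are available;
        # otherwise return False and leave the bucket unchanged.
        self._refill()
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With `TokenBucket(capacity=5, refill_rate=1.0)`, a client could make up to five calls to `allow()` back to back, after which further calls are rejected until roughly one second per token has passed. A shared or distributed limiter would additionally need locking or an external store, which this sketch leaves out.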
Token Bucket is an intermediate-level concept in the Security Architecture area of system design. Engineers reach for it when they need to reason about real-world trade-offs in that space, not just for textbook correctness: production systems at companies like Netflix, Amazon, and Google make these decisions every day.
If you want to go deeper than this definition, with diagrams, code, and a quiz to lock it in, work through the "Token Bucket" lesson linked below. It walks through the why, the mechanism, the trade-offs, and how large companies use it in production.
Learn Token Bucket in depth
Full interactive lesson with diagrams, code examples, real-world references, and a quiz.
Open the Token Bucket lesson
Related lessons
Lessons that touch on Token Bucket as part of a larger topic.
Token Bucket Algorithm
A bucket of tokens that refills at a steady rate; one of the most widely used rate limiting algorithms in production
intermediate · api design protocols
Rate Limiting
Protect your API from abuse and overload by controlling how many requests each consumer can make
intermediate · api design protocols
Rate Limiting for Resilience
Protect services from abuse and overload with token bucket, sliding window, and distributed rate limiting
advanced · reliability resilience
Design a Rate Limiter
Design a distributed rate limiting system: token bucket, sliding window, and protecting services at massive scale
capstone · capstone
See also
Related glossary terms you might want to look up next.
Rate Limiting
Controlling how many requests a client can make in a given time window. Protects your API from abuse and ensures fair usage.
Sliding Window
A rate limiting algorithm that tracks requests in a rolling time window. More accurate than fixed windows because it smooths out spikes at window boundaries.
Throttling
Slowing down the rate of processing requests instead of rejecting them outright. The gentler cousin of rate limiting.