Is this a video course?

No. This is an interactive, slide-based learning platform. Each lesson has rich text, animated diagrams, live code editors, and quizzes. You learn by reading, interacting, and doing, not by watching videos passively.

How long do I have access?

Forever. Both pricing tiers are one-time payments with lifetime access. This includes all current 766 lessons and any future content we add.

What level of experience do I need?

None. We start from absolute basics like 'What is latency?' and build up to distributed consensus protocols. The Foundation level assumes zero prior knowledge of system design.

How much does the system design course cost?

7.99 US dollars for lifetime access globally, or 499 Indian rupees for lifetime access in India. One-time payment, no subscription, no hidden fees. 11 lessons are free with no signup required.

What technologies are covered?

Everything from DNS and load balancers to Kubernetes, Kafka, distributed databases, consensus protocols, stream processing, security architecture, and observability. We cover principles and real-world implementations used at Netflix, Google, Amazon, Uber, Stripe, and more.

Is this useful for system design interview preparation?

Yes. The lessons are structured around the topics asked in system design interviews at FAANG and top-tier companies. Interactive diagrams help you practice whiteboard-style explanations. The lessons cover everything from URL shortener design to distributed payment systems.

How is this different from ByteByteGo or Educative?

766 interactive lessons (4x more than most competitors), 16 different diagram types that build step by step, real production examples from Netflix, Google, Amazon, Uber, and Stripe, and lifetime access for a one-time payment of $7.99 instead of annual subscriptions costing 100 to 200 dollars per year.

How does Uber match a rider with a driver so quickly?

Uber maintains an in-memory geospatial index keyed by H3 hexagon. When a request arrives, the matching service queries the index for idle drivers in nearby hexagons, ranks them by ETA, and dispatches an offer over WebSocket. The whole loop runs in under 5 seconds at the 99th percentile.

What is H3 and why does Uber use it instead of geohash?

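H3 is a hexagonal hierarchical index in which almost every cell has exactly 6 neighbors at roughly uniform distance (each resolution also contains 12 pentagons, which have 5). Geohash uses rectangles, whose 8 neighbors sit at uneven distances (edge neighbors are closer than corner neighbors) and whose string prefixes break at major boundaries: two points a few meters apart on opposite sides of the equator or prime meridian share no common prefix. H3 makes radius queries simpler and ETA estimates more uniform.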
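The prefix problem with geohash can be seen with a small encoder. This is a minimal sketch of the standard geohash bit-interleaving scheme written from scratch for illustration; the function names and the sample coordinates near Greenwich are assumptions, not code from Uber or from any library.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash_encode(lat: float, lng: float, precision: int = 8) -> str:
    """Encode a point by alternately bisecting the longitude and latitude ranges."""
    lat_lo, lat_hi = -90.0, 90.0
    lng_lo, lng_hi = -180.0, 180.0
    chars = []
    bits, bit_count = 0, 0
    use_lng = True  # geohash starts with a longitude bit
    while len(chars) < precision:
        if use_lng:
            mid = (lng_lo + lng_hi) / 2
            if lng >= mid:
                bits = (bits << 1) | 1
                lng_lo = mid
            else:
                bits <<= 1
                lng_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lng = not use_lng
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading characters two geohashes have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


# Two points a few meters apart, on either side of the prime meridian.
west = geohash_encode(51.4779, -0.0001)
east = geohash_encode(51.4779, 0.0001)
print(west, east, shared_prefix_len(west, east))
```

The two hashes differ from the very first character, so a prefix scan around one point misses the other. A geohash-based index has to query neighbor cells explicitly to cover cases like this.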

How does Uber prevent double-booking a driver?

When the matching service dispatches an offer, it acquires a distributed lock keyed by driver_id with a short TTL. Only the first acceptor that successfully takes the lock wins the trip. Other parallel offers are rejected with a clear reason code.

Why is surge pricing capped?

Without a cap, surge creates a runaway feedback loop: high surge scares riders away, which lowers demand, which removes surge, which brings riders back, which spikes surge again. A clamp and a smoothing window stabilize the multiplier.

How do you store millions of driver GPS pings?

You do not. Live pings live only in the in-memory geo index. Downsampled traces (one point every 30 seconds) are logged to S3 or BigQuery for analytics. Idle-driver pings are discarded entirely after they expire.

Design Uber: System Design Interview Guide

Uber handles 25 million rides per day across 70+ countries, matching 5 million active drivers with riders in under 15 seconds on average.

Designing Uber means solving real-time location streaming, low-latency geospatial matching, and a strict trip state machine that survives driver disconnects, GPS gaps, and surge events. It is one of the most asked system design problems at FAANG and ride-hailing companies.

Where it shows up

Commonly asked at Meta, Google, Amazon, Uber, Lyft, DoorDash, Grab, and Ola. It is the canonical real-time geospatial system design problem.

Why this question is asked

Interviewers like Design Uber because it forces you to make decisions about geospatial indexing, real-time bidirectional streaming, a multi-party state machine (rider, driver, server), and dynamic pricing in the same hour. You cannot bluff your way through it with a generic three-tier diagram.

Requirements

Always clarify these in the first 5 minutes of the interview. Do not start drawing boxes until both lists are agreed on.

Functional requirements

Riders can request a ride from point A to point B
Drivers nearby see the request and can accept within a short window
Both parties see the other party on a live map during pickup and trip
Fare is calculated at the start (estimate) and end (final), including surge
Trip state transitions: requested, matched, en route, in trip, completed, canceled
Payment is captured automatically when the trip ends
Rider and driver can rate each other after the trip

Non-functional requirements

Match latency (request received to offer dispatched) under 5 seconds at the 99th percentile
Location updates every 4 seconds per active driver
99.99% availability for the matching and trip services
Strong consistency on payment and trip completion
Horizontal scaling to 5 million registered drivers, with about 1.5 million concurrently active at peak
Geographic distribution across 70+ countries with regional data residency

Back-of-envelope scale estimates

Show your math. Pulling numbers from thin air signals you have not thought about the load.

Daily active riders

20M

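Taken as roughly 80% of publicly reported Uber rider numbers. Assume 2 to 4 ride attempts per rider per active day. Not every attempt becomes a completed ride, which is why this exceeds the 25M rides per day used below.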

Active drivers (concurrent peak)

1.5M

5 million total drivers, peak concurrency around 30%. Each emits one location update every 4 seconds.

Location updates per second

375K

1.5M drivers / 4 seconds. This is the steady-state write load on the location pipeline.

Rides per second (global peak)

800

25M rides per day divided by 86,400 seconds is about 290 rides per second on average. A peak factor of about 3x gives roughly 870, rounded down here to 800.

Storage per ride record

10 KB

Includes pickup, drop, route waypoints (sampled), fare breakdown, driver and rider IDs. 25M rides per day equals roughly 90 TB per year.
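The estimates above can be reproduced in a few lines. This is a sketch of the arithmetic only; the variable names are illustrative, the inputs are the figures stated in this section, and 10 KB is taken as 10,000 bytes.

```python
SECONDS_PER_DAY = 86_400

# Inputs, as stated in the estimates above.
total_drivers = 5_000_000
peak_concurrency = 0.30
ping_interval_s = 4
rides_per_day = 25_000_000
peak_factor = 3
ride_record_bytes = 10_000  # 10 KB per ride record, decimal units

concurrent_drivers = total_drivers * peak_concurrency
location_writes_per_s = concurrent_drivers / ping_interval_s
avg_rides_per_s = rides_per_day / SECONDS_PER_DAY
peak_rides_per_s = avg_rides_per_s * peak_factor
storage_tb_per_year = rides_per_day * ride_record_bytes * 365 / 1e12

print(f"concurrent drivers:    {concurrent_drivers:,.0f}")
print(f"location writes/sec:   {location_writes_per_s:,.0f}")
print(f"rides/sec (average):   {avg_rides_per_s:,.0f}")
print(f"rides/sec (peak):      {peak_rides_per_s:,.0f}")
print(f"ride storage TB/year:  {storage_tb_per_year:,.1f}")
```

Writing the inputs down first makes it easy to rerun the numbers when the interviewer changes an assumption, such as the ping interval or the peak factor.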

High-level architecture

Clients (rider and driver apps) talk to an API Gateway over HTTPS for REST calls and over a persistent WebSocket for real-time updates.

Behind the gateway, three primary services own the workflow. A Location Service ingests driver locations at a 4-second cadence and writes to a sharded in-memory geo index (Redis with geohash or a quadtree service). A Matching Service consumes incoming ride requests, queries the geo index for nearby idle drivers, and pushes offers over WebSocket. A Trip Service owns the trip state machine, writes the durable trip record to a sharded SQL store, and emits events to Kafka.

Downstream consumers handle pricing (including surge), payments (idempotent capture against Stripe or Adyen), notifications, and analytics. A separate Pricing Service reads recent supply and demand from a stream processor (Flink) to compute surge multipliers per H3 hexagon every minute.

In a real interview, sketch this on the whiteboard before diving into any single box.

Core components

Walk through each service. The interviewer wants to hear what each one owns, not just the names.

Location Service

Receives location updates from drivers every 4 seconds and writes them into a sharded geo index. Uses H3 hexagons or geohashes as the shard key so that a query for nearby drivers hits one or a few shards instead of fanning out globally.

Geo Index

Redis with GEOADD, or a custom service backed by quadtrees, depending on scale. The index answers radius queries in single-digit milliseconds. It is in-memory because the working set is small (just live drivers) and freshness is non-negotiable.
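To make the idea of a cell-keyed in-memory index concrete, here is a minimal single-process sketch. It uses square latitude/longitude grid cells as a stand-in for H3 cells or Redis geo sets, so it is not how Redis or H3 work internally. The class name, the cell size, and the sample coordinates are all assumptions, and the sketch ignores wrap-around at the antimeridian.

```python
import math
from collections import defaultdict

CELL_DEG = 0.01  # hypothetical cell size: about 1.1 km of latitude


def cell_of(lat, lng):
    """Map a point to its grid cell (a stand-in for an H3 cell ID)."""
    return (math.floor(lat / CELL_DEG), math.floor(lng / CELL_DEG))


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance between two points in kilometers."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


class GeoIndex:
    def __init__(self):
        self.cells = defaultdict(set)  # cell -> driver IDs currently in it
        self.positions = {}            # driver ID -> (lat, lng)

    def upsert(self, driver_id, lat, lng):
        """Record a new ping, moving the driver between cells if needed."""
        old = self.positions.get(driver_id)
        if old is not None:
            self.cells[cell_of(*old)].discard(driver_id)
        self.positions[driver_id] = (lat, lng)
        self.cells[cell_of(lat, lng)].add(driver_id)

    def nearby(self, lat, lng, radius_km):
        """Return driver IDs within radius_km, nearest first."""
        # Conservative bounding box in degrees (111 km per degree of latitude).
        dlat = radius_km / 111.0
        edge_lat = min(abs(lat) + dlat, 89.0)
        dlng = radius_km / (111.0 * math.cos(math.radians(edge_lat)))
        lo = cell_of(lat - dlat, lng - dlng)
        hi = cell_of(lat + dlat, lng + dlng)
        hits = []
        for ci in range(lo[0], hi[0] + 1):
            for cj in range(lo[1], hi[1] + 1):
                for driver_id in self.cells.get((ci, cj), ()):
                    dist = haversine_km(lat, lng, *self.positions[driver_id])
                    if dist <= radius_km:
                        hits.append((dist, driver_id))
        return [driver_id for _, driver_id in sorted(hits)]


index = GeoIndex()
index.upsert("d1", 12.9716, 77.5946)
index.upsert("d2", 12.9800, 77.6000)  # about 1 km away
index.upsert("d3", 13.1000, 77.6000)  # about 14 km away
print(index.nearby(12.9716, 77.5946, radius_km=3.0))
```

Only the handful of cells that overlap the search box are scanned, which is the property that lets a sharded version of this index answer a query from one or a few shards.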

Matching Service

Receives a ride request, calls the geo index for candidate drivers, ranks them by ETA and rating, and dispatches offers in parallel or sequentially. Handles driver acceptance, rejection, and timeout. Uses a distributed lock keyed by driver ID to prevent double-booking.

Trip Service

Owns the canonical trip state machine. Each transition (matched, picked up, completed, canceled) is a durable write to a sharded SQL store partitioned by city. Publishes events to Kafka for downstream consumers.
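A guarded state machine can be as small as a table of allowed transitions. The sketch below uses the six states named in this guide; the specific set of allowed transitions (for example, no cancellation once the trip is in progress) is an assumption made for illustration, and the in-memory events list stands in for the durable trip_events table.

```python
from enum import Enum


class TripState(Enum):
    REQUESTED = "requested"
    MATCHED = "matched"
    EN_ROUTE_TO_PICKUP = "en_route_to_pickup"
    IN_TRIP = "in_trip"
    COMPLETED = "completed"
    CANCELED = "canceled"


S = TripState

# Assumed transition table; a real product would define its own rules.
ALLOWED = {
    S.REQUESTED: {S.MATCHED, S.CANCELED},
    S.MATCHED: {S.EN_ROUTE_TO_PICKUP, S.CANCELED},
    S.EN_ROUTE_TO_PICKUP: {S.IN_TRIP, S.CANCELED},
    S.IN_TRIP: {S.COMPLETED},
    S.COMPLETED: set(),  # terminal
    S.CANCELED: set(),   # terminal
}


class Trip:
    def __init__(self, trip_id):
        self.trip_id = trip_id
        self.state = S.REQUESTED
        self.events = [S.REQUESTED]  # stand-in for the trip_events audit table

    def transition(self, new_state):
        """Apply a transition, rejecting anything the table does not allow."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"illegal transition {self.state.value} -> {new_state.value}"
            )
        self.state = new_state
        self.events.append(new_state)


trip = Trip("t1")
for step in (S.MATCHED, S.EN_ROUTE_TO_PICKUP, S.IN_TRIP, S.COMPLETED):
    trip.transition(step)
print(trip.state.value, len(trip.events))
```

In a real service the state check and the write would happen in one database transaction, so two concurrent requests cannot both move the same trip.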

Pricing and Surge Service

A Flink job consumes the live driver location and ride request streams, computes per-hexagon supply and demand every 60 seconds, and writes the surge multiplier to a low-latency KV store. The Matching Service reads the multiplier when quoting a fare.

Payment Service

Captures the fare from the saved payment method at trip end. Uses idempotency keys so retries do not double-charge. Settlement to drivers runs as a separate daily batch.

Notification Gateway

Pushes WebSocket events for in-app updates (offer, pickup imminent, arriving). Falls back to APNs and FCM for background pushes when the app is killed.

Data model

Pick the right store per table. Justify each choice with the access pattern, not by reflex.

drivers

driver_id (PK)
current_status (idle, on_trip, offline)
vehicle_type
rating_avg
home_city

Static driver profile. Sharded by driver_id. Hot status field is also cached in Redis for fast matching reads.

driver_locations

driver_id (PK)
lat
lng
heading
h3_index
updated_at

In-memory only. Driver locations are not durably persisted at full fidelity. A downsampled trace is logged for completed trips only.

trips

trip_id (PK)
rider_id
driver_id
pickup_lat, pickup_lng
drop_lat, drop_lng
state
fare_cents
currency
created_at
completed_at

Sharded by city or by trip_id hash. Append-mostly. State changes are written to a separate trip_events table to preserve audit history.

payments

payment_id (PK)
trip_id (FK)
rider_id
amount_cents
status
idempotency_key
provider_charge_id

Strong consistency on this table. Indexed by trip_id and idempotency_key. Webhooks from the payment provider update status asynchronously.

Deep dives

These are the conversations the interviewer is steering you toward. Practice each one until you can talk through it without notes.

Matching riders with drivers in real time

When a ride request arrives, you need to find idle drivers within a small radius (say 3 km) in under 100 ms. Brute force scanning every driver record is out of the question at 1.5M concurrent drivers. The standard answer is a geospatial index: H3 hexagons (Uber's own published library) or geohash strings as the shard key into Redis. A radius query expands to a small set of neighbor hexagons and merges results. Once you have candidates, you rank by ETA (using a routing service) and acceptance rate, then dispatch offers. Offers can be parallel (broadcast to top 5 drivers, first acceptor wins, others get a rejection notice) or sequential (offer to one driver at a time with a 10-second window). Parallel reduces match latency but causes wasted notifications. Sequential is cleaner but slower. Most production systems pick parallel with a tie-breaker held by a Redis distributed lock on driver_id.
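The driver lock described above can be sketched as follows. In production this would typically be a Redis SET with the NX and PX options; the class below is an in-process stand-in so the example runs without a server. The class name, key format, and injectable clock are assumptions made for illustration.

```python
import threading
import time


class TTLLockTable:
    """In-process stand-in for a distributed lock with a short TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._mutex = threading.Lock()
        self._held = {}  # key -> (owner, expires_at)

    def acquire(self, key, owner, ttl_s):
        """Take the lock if it is free or expired; return True on success."""
        now = self._clock()
        with self._mutex:
            current = self._held.get(key)
            if current is not None and current[1] > now:
                return False  # someone else holds an unexpired lock
            self._held[key] = (owner, now + ttl_s)
            return True

    def release(self, key, owner):
        """Release only if the caller still owns the lock."""
        with self._mutex:
            current = self._held.get(key)
            if current is not None and current[0] == owner:
                del self._held[key]
                return True
            return False


# A controllable clock keeps the example deterministic.
now = [0.0]
locks = TTLLockTable(clock=lambda: now[0])

first = locks.acquire("driver:42", "trip:A", ttl_s=10)   # wins the driver
second = locks.acquire("driver:42", "trip:B", ttl_s=10)  # rejected
now[0] = 11.0                                            # the TTL has passed
third = locks.acquire("driver:42", "trip:B", ttl_s=10)   # lock expired, succeeds
print(first, second, third)
```

The TTL matters because a matching worker can crash while holding a lock. Without expiry, that driver would stay unmatchable until someone cleaned up by hand.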

Handling driver location updates at 375K writes per second

Every driver emits a location every 4 seconds. At 1.5M concurrent drivers that is 375K writes per second of GPS data. You do not write this to your durable SQL store. You write to a sharded in-memory geo index (Redis with GEOADD or a custom service). The shard key is the H3 cell so writes for the same area land on the same shard, which keeps proximity queries fast. For audit and analytics, downsample the trace (one point every 30 seconds) and write it to a Kafka topic for batch storage in S3 or BigQuery.
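The downsampling step is simple enough to show directly. This is a minimal sketch assuming each ping is a (timestamp_seconds, lat, lng) tuple arriving in time order; the function name and tuple layout are illustrative, not part of any Uber pipeline.

```python
def downsample(points, min_gap_s=30):
    """Keep the first ping, then each ping at least min_gap_s after the last kept one."""
    kept = []
    last_ts = None
    for ts, lat, lng in points:
        if last_ts is None or ts - last_ts >= min_gap_s:
            kept.append((ts, lat, lng))
            last_ts = ts
    return kept


# Two minutes of pings at the 4-second cadence: 30 raw points.
raw = [(ts, 12.97 + ts * 1e-5, 77.59) for ts in range(0, 120, 4)]
trace = downsample(raw)
print(len(raw), len(trace), [ts for ts, _, _ in trace])
```

Because 30 is not a multiple of 4, the kept points land 32 seconds apart in this example, which cuts the stored volume by a factor of 8 relative to the raw stream.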

Surge pricing without runaway feedback loops

Surge is a multiplier applied to the base fare when demand exceeds supply in a small geographic cell. A Flink job aggregates supply (idle drivers in cell) and demand (incoming ride requests in cell) per minute per H3 cell. The multiplier is published to a Redis cache keyed by cell ID. The Matching Service reads the multiplier when quoting fares. To prevent runaway loops where surge causes riders to abandon, which lowers demand, which removes surge, the multiplier is smoothed over a 5-minute window and clamped to a max (e.g., 5x). The multiplier is locked in at request time so the rider sees the same price they accepted.
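The smoothing and clamping step can be sketched as a moving average with a ceiling. The raw signal used here (requests per idle driver, floored at 1.0) is a hypothetical formula chosen for illustration; the guide does not specify how the raw multiplier is derived. The 5-sample window and 5x cap follow the numbers in the text, assuming one sample per minute.

```python
from collections import deque


class SurgeSmoother:
    """Per-cell surge multiplier: moving average over recent samples, then clamp."""

    def __init__(self, window=5, max_multiplier=5.0):
        self.samples = deque(maxlen=window)  # one raw sample per minute
        self.max_multiplier = max_multiplier

    def update(self, demand, supply):
        # Hypothetical raw signal: ride requests per idle driver, never below 1.0.
        raw = max(demand / max(supply, 1), 1.0)
        self.samples.append(raw)
        smoothed = sum(self.samples) / len(self.samples)
        return min(smoothed, self.max_multiplier)


cell = SurgeSmoother()
history = [
    cell.update(demand=10, supply=10),   # balanced
    cell.update(demand=100, supply=10),  # sudden spike
    cell.update(demand=10, supply=10),   # demand drops back
]
print(history)
```

The spike is capped at the maximum instead of jumping to 10x, and the multiplier decays gradually after demand falls, which is what dampens the oscillation described above.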

Trip state machine and exactly-once payment

A trip has roughly 6 states: requested, matched, en_route_to_pickup, in_trip, completed, canceled. State transitions are guarded by a finite state machine and persisted atomically. The tricky one is completed: the server must capture payment exactly once even if the driver app retries the complete call due to a network glitch. The fix is idempotency: the client generates an idempotency key when it first issues the complete call, the server stores it on first success, and any retry with the same key returns the cached result without re-charging. The payment row also has its own idempotency key so the call to Stripe is itself idempotent.
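The idempotency check can be enforced by the database itself with a unique constraint on the key. The sketch below uses an in-memory SQLite table as a stand-in for the payments store and a plain list as a stand-in for calls to the payment provider. The capture function and the reduced column set are assumptions for illustration, and a real flow would also record a pending state before calling the provider.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE payments (
        payment_id INTEGER PRIMARY KEY,
        trip_id TEXT NOT NULL,
        amount_cents INTEGER NOT NULL,
        status TEXT NOT NULL,
        idempotency_key TEXT NOT NULL UNIQUE
    )
    """
)

charges_sent = []  # stand-in for requests sent to the payment provider


def capture(trip_id, amount_cents, idempotency_key):
    """Charge once per idempotency key; retries return the stored result."""
    with conn:  # one transaction: commit on success, roll back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO payments "
            "(trip_id, amount_cents, status, idempotency_key) "
            "VALUES (?, ?, 'captured', ?)",
            (trip_id, amount_cents, idempotency_key),
        )
        if cur.rowcount == 1:
            # First time this key is seen: this is the only place a charge is issued.
            charges_sent.append((trip_id, amount_cents))
        return conn.execute(
            "SELECT payment_id, status FROM payments WHERE idempotency_key = ?",
            (idempotency_key,),
        ).fetchone()


first = capture("t1", 1850, "complete-t1-abc")
retry = capture("t1", 1850, "complete-t1-abc")  # the driver app retried
print(first, retry, len(charges_sent))
```

The retry gets back the same payment row and no second charge is issued, because the unique constraint turns the duplicate insert into a no-op.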

Trade-offs to discuss

Every senior interviewer expects you to surface at least 3 of these. Pick the decisions, state the alternatives, and justify your choice.

Redis vs custom quadtree service for geo indexing

Redis with GEOADD is operationally simple and gives you radius queries in the low single-digit milliseconds up to a few million keys per shard. A custom quadtree service is more flexible (dynamic cell sizing, denser cells in cities) but is a multi-month build. Start with Redis. Move to a custom service only after Redis fails at scale.

WebSocket vs long polling for real-time updates

WebSocket gives lower latency and lower battery cost than long polling. The downside is sticky connections, which complicate load balancing and rolling deploys. Use WebSocket on a dedicated gateway with connection draining. Fall back to FCM and APNs background pushes when the app is killed.

Sharded SQL vs DynamoDB for trips

Trips need strong consistency for the state machine and idempotent payment. DynamoDB offers strong consistency but its transaction story is more limited than Postgres. Sharded Postgres (by city) gives you joins for analytics queries and is easier to reason about. The cost is sharding ops. At Uber scale, sharded Postgres or MySQL is the proven path.

Parallel vs sequential driver offers

Parallel gets you lower match latency at the cost of wasted notifications to the drivers who lose the race. Sequential is cleaner socially but blows match latency past 30 seconds in dense markets. A common hybrid offers in parallel to the top 1-3 drivers and guards acceptance with a lock so only the first acceptor wins.

Storing every GPS ping vs downsampled traces

Storing every ping means about 32 billion points per day (375K per second times 86,400 seconds), and almost nothing reads it. Downsample to one point every 30 seconds for completed trips. Discard idle-driver pings entirely after they expire from the in-memory index.

How Uber actually does it

Uber published H3, its open-source hexagonal hierarchical geospatial index, which is now widely used in ride-hailing and delivery (DoorDash, Grab, and others use it). Uber's matching system was originally a Node.js service called Dispatch that has since been rewritten in Go. The trip state is stored in Schemaless, Uber's MySQL-based KV layer. Payments use Cherami, Uber's durable task queue, for queueing, and Stripe and Braintree for capture. Location streaming uses Apache Kafka for fan-out into Flink jobs that compute surge and ETA features.

Lessons to study before this interview

If any of these topics are fuzzy, the interviewer will catch it. Each lesson is 15 to 60 minutes with diagrams, code, and a quiz.

Geospatial Indexing

intermediate / database types storage

Distributed Locks

advanced / distributed systems core

WebSockets

intermediate / messaging event systems

Database Sharding

foundation / database fundamentals

Idempotency

foundation / core fundamentals

Capstone: Design Uber Ride-Sharing

capstone / capstone
