Design Uber: System Design Interview Guide
Uber handles 25 million rides per day across 70+ countries, matching 5 million active drivers with riders in under 15 seconds on average.
Designing Uber means solving real-time location streaming, low-latency geospatial matching, and a strict trip state machine that survives driver disconnects, GPS gaps, and surge events. It is one of the most asked system design problems at FAANG and ride-hailing companies.
Asked at: Meta, Google, Amazon, Uber, Lyft, DoorDash, Grab, and Ola. It is the canonical real-time geospatial system design problem.
Why this question is asked
Interviewers like Design Uber because it forces you to make decisions about geospatial indexing, real-time bidirectional streaming, a multi-party state machine (rider, driver, server), and dynamic pricing in the same hour. You cannot bluff your way through it with a generic three-tier diagram.
Requirements
Always clarify these in the first 5 minutes of the interview. Do not start drawing boxes until both lists are agreed.
Functional requirements
- Riders can request a ride from point A to point B
- Drivers nearby see the request and can accept within a short window
- Both parties see the other party on a live map during pickup and trip
- Fare is calculated at the start (estimate) and end (final), including surge
- Trip state transitions: requested, matched, en route, in trip, completed, canceled
- Payment is captured automatically when the trip ends
- Rider and driver can rate each other after the trip
Non-functional requirements
- Match latency under 5 seconds at the 99th percentile
- Location updates every 4 seconds per active driver
- 99.99% availability for the matching and trip services
- Strong consistency on payment and trip completion
- Horizontal scaling to 5 million active drivers, with about 1.5 million of them online concurrently at peak
- Geographic distribution across 70+ countries with regional data residency
Back-of-envelope scale estimates
Show your math. Pulling numbers from thin air signals you have not thought about the load.
- Daily active riders: 20M. Roughly 80% of publicly reported Uber rider numbers. Assume 2 to 4 ride attempts per rider per active day.
- Active drivers (concurrent peak): 1.5M. 5 million total drivers x 30% peak concurrency. Each emits one location update every 4 seconds.
- Location updates per second: 375K. 1.5M drivers / 4 seconds. This is the steady-state write load on the location pipeline.
- Rides per second (global peak): about 800. 25M rides per day / 86,400 seconds is about 290 rides per second on average; a peak factor of about 3x puts the peak in the 800 to 900 range.
- Storage per ride record: 10 KB. Includes pickup, drop, route waypoints (sampled), fare breakdown, driver and rider IDs. 25M rides per day x 10 KB is about 250 GB per day, or roughly 90 TB per year.
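If you want to check the arithmetic above rather than trust it, a few lines of Python are enough. This is only a sketch: the inputs are the figures quoted in this section, and the variable names are made up for the example.

```python
# Back-of-envelope inputs, all taken from the estimates in this section.
TOTAL_DRIVERS = 5_000_000
PEAK_CONCURRENCY_PCT = 30        # share of drivers online at peak
UPDATE_INTERVAL_S = 4            # one location ping every 4 seconds
RIDES_PER_DAY = 25_000_000
PEAK_FACTOR = 3                  # assumed peak-to-average ratio
RIDE_RECORD_BYTES = 10_000       # 10 KB per ride record, decimal units
SECONDS_PER_DAY = 86_400

concurrent_drivers = TOTAL_DRIVERS * PEAK_CONCURRENCY_PCT // 100
location_writes_per_s = concurrent_drivers / UPDATE_INTERVAL_S
avg_rides_per_s = RIDES_PER_DAY / SECONDS_PER_DAY
peak_rides_per_s = avg_rides_per_s * PEAK_FACTOR
storage_tb_per_year = RIDES_PER_DAY * RIDE_RECORD_BYTES * 365 / 1e12

print(f"concurrent drivers:     {concurrent_drivers:,}")
print(f"location writes/sec:    {location_writes_per_s:,.0f}")
print(f"rides/sec (avg, peak):  {avg_rides_per_s:,.0f}, {peak_rides_per_s:,.0f}")
print(f"ride storage, TB/year:  {storage_tb_per_year:,.2f}")
```

The peak ride rate comes out a little under 900 per second, the same order of magnitude as the figure quoted above. In an interview, the order of magnitude is what matters.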
High-level architecture
Clients (rider and driver apps) talk to an API Gateway over HTTPS for REST calls and over a persistent WebSocket for real-time updates. Behind the gateway, three primary services own the workflow:
- The Location Service ingests driver locations at a 4-second cadence and writes them to a sharded in-memory geo index (Redis with geohash or a quadtree service).
- The Matching Service consumes incoming ride requests, queries the geo index for nearby idle drivers, and pushes offers over WebSocket.
- The Trip Service owns the trip state machine, writes the durable trip record to a sharded SQL store, and emits events to Kafka.
Downstream consumers handle pricing (including surge), payments (idempotent capture against Stripe or Adyen), notifications, and analytics. A separate Pricing Service reads recent supply and demand from a stream processor (Flink) to compute surge multipliers per H3 hexagon every minute.
In a real interview, sketch this on the whiteboard before diving into any single box.
Core components
Walk through each service. The interviewer wants to hear what each one owns, not just the names.
Location Service
Receives location updates from drivers every 4 seconds and writes them into a sharded geo index. Uses H3 hexagons (Uber's open-source library) or geohashes as the shard key so that a query for nearby drivers hits one or a few shards instead of fanning out globally.
Geo Index
Redis with GEOADD, or a custom service backed by quadtrees, depending on scale. The index answers radius queries in single-digit milliseconds. It is in-memory because the working set is small (just live drivers) and freshness is non-negotiable.
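To make the idea of a cell-based index concrete, here is a minimal in-memory sketch. It is not Redis, H3, or a quadtree: it buckets drivers into a fixed latitude/longitude grid, which is an assumption made to keep the example dependency-free. The class name, `CELL_DEG`, and the method names are all hypothetical.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0
CELL_DEG = 0.05  # hypothetical cell size, about 5.5 km of latitude


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance between two points in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


class GridGeoIndex:
    """Toy index: drivers are bucketed by a fixed lat/lng grid cell."""

    def __init__(self):
        self.cells = defaultdict(set)   # (row, col) -> driver ids
        self.positions = {}             # driver id -> (lat, lng)

    @staticmethod
    def cell_of(lat, lng):
        return (math.floor(lat / CELL_DEG), math.floor(lng / CELL_DEG))

    def upsert(self, driver_id, lat, lng):
        """Move a driver to its new cell, removing it from the old one."""
        old = self.positions.get(driver_id)
        if old is not None:
            self.cells[self.cell_of(*old)].discard(driver_id)
        self.positions[driver_id] = (lat, lng)
        self.cells[self.cell_of(lat, lng)].add(driver_id)

    def nearby(self, lat, lng, radius_km):
        """Return (driver_id, distance_km) pairs within radius_km, nearest first."""
        # Number of cells to scan in each direction so the radius is covered.
        # Uses about 111 km per degree and ignores antimeridian wrap-around.
        lat_span = radius_km / 111.0
        lng_span = radius_km / (111.0 * max(math.cos(math.radians(lat)), 0.01))
        rows = math.ceil(lat_span / CELL_DEG)
        cols = math.ceil(lng_span / CELL_DEG)
        row0, col0 = self.cell_of(lat, lng)
        hits = []
        for row in range(row0 - rows, row0 + rows + 1):
            for col in range(col0 - cols, col0 + cols + 1):
                for driver_id in self.cells.get((row, col), ()):
                    d = haversine_km(lat, lng, *self.positions[driver_id])
                    if d <= radius_km:
                        hits.append((driver_id, d))
        return sorted(hits, key=lambda hit: hit[1])
```

A radius query only touches the handful of cells around the rider, which is the same property that lets a sharded index answer from one or a few shards.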
Matching Service
Receives a ride request, calls the geo index for candidate drivers, ranks them by ETA and rating, and dispatches offers in parallel or sequentially. Handles driver acceptance, rejection, and timeout. Uses a distributed lock keyed by driver ID to prevent double-booking.
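The driver lock can be sketched without a real lock service. The class below is an in-memory stand-in for a store that supports "set if absent, with expiry" (in Redis, `SET` with the `NX` and `PX` options). The class and method names are hypothetical, and the code is single-threaded; a shared store would have to provide the atomicity.

```python
import time
import uuid


class DriverLocks:
    """In-memory stand-in for a lock store with set-if-absent and a TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._locks = {}  # driver_id -> (token, expires_at)

    def acquire(self, driver_id, ttl_s):
        """Return a token if the driver was free, otherwise None."""
        now = self._clock()
        held = self._locks.get(driver_id)
        if held is not None and held[1] > now:
            return None  # another offer currently holds this driver
        token = uuid.uuid4().hex
        self._locks[driver_id] = (token, now + ttl_s)
        return token

    def release(self, driver_id, token):
        """Release only if the caller still owns the lock."""
        held = self._locks.get(driver_id)
        if held is not None and held[0] == token:
            del self._locks[driver_id]
            return True
        return False
```

The TTL matters: if the matching process crashes while holding a driver, the lock expires on its own instead of leaving the driver unmatchable. The token check stops a slow caller from releasing a lock that has already expired and been taken by someone else.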
Trip Service
Owns the canonical trip state machine. Each transition (matched, picked up, completed, canceled) is a durable write to a sharded SQL store partitioned by city. Publishes events to Kafka for downstream consumers.
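A transition table is the simplest way to show the interviewer what "owns the state machine" means. The sketch below uses the trip states listed in the requirements. Which edges are legal is an assumption made for the example (for instance, that a trip cannot be canceled once it is in progress), so adjust the table to the rules you agree on.

```python
from enum import Enum


class TripState(Enum):
    REQUESTED = "requested"
    MATCHED = "matched"
    EN_ROUTE_TO_PICKUP = "en_route_to_pickup"
    IN_TRIP = "in_trip"
    COMPLETED = "completed"
    CANCELED = "canceled"


# Assumed legal transitions. Terminal states map to an empty set.
ALLOWED = {
    TripState.REQUESTED: {TripState.MATCHED, TripState.CANCELED},
    TripState.MATCHED: {TripState.EN_ROUTE_TO_PICKUP, TripState.CANCELED},
    TripState.EN_ROUTE_TO_PICKUP: {TripState.IN_TRIP, TripState.CANCELED},
    TripState.IN_TRIP: {TripState.COMPLETED},
    TripState.COMPLETED: set(),
    TripState.CANCELED: set(),
}


class InvalidTransition(Exception):
    """Raised when a requested state change is not in the table."""


def transition(current, target):
    """Return the new state, or raise if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise InvalidTransition(f"{current.value} -> {target.value}")
    return target
```

In the real service the check and the durable write would happen in one transaction, so two concurrent requests cannot both move the same trip.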
Pricing and Surge Service
A Flink job consumes the live driver location and ride request streams, computes per-hexagon supply and demand every 60 seconds, and writes the surge multiplier to a low-latency KV store. The Matching Service reads the multiplier when quoting a fare.
Payment Service
Captures the fare from the saved payment method at trip end. Uses idempotency keys so retries do not double-charge. Settlement to drivers runs as a separate daily batch.
Notification Gateway
Pushes WebSocket events for in-app updates (offer, pickup imminent, arriving). Falls back to APNs and FCM for background pushes when the app is killed.
Data model
Pick the right store per table. Justify each choice with the access pattern, not by reflex.
drivers
- driver_id (PK)
- current_status (idle, on_trip, offline)
- vehicle_type
- rating_avg
- home_city
Static driver profile. Sharded by driver_id. Hot status field is also cached in Redis for fast matching reads.
driver_locations
- driver_id (PK)
- lat
- lng
- heading
- h3_index
- updated_at
In-memory only. Driver locations are not durably persisted at full fidelity. A downsampled trace is logged for completed trips only.
trips
- trip_id (PK)
- rider_id
- driver_id
- pickup_lat, pickup_lng
- drop_lat, drop_lng
- state
- fare_cents
- currency
- created_at
- completed_at
Sharded by city or by trip_id hash. Append-mostly. State changes are written to a separate trip_events table to preserve audit history.
payments
- payment_id (PK)
- trip_id (FK)
- rider_id
- amount_cents
- status
- idempotency_key
- provider_charge_id
Strong consistency on this table. Indexed by trip_id and idempotency_key. Webhooks from the payment provider update status asynchronously.
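The relationship between trips and trip_events is easier to explain with a small example. The sketch below uses SQLite from the Python standard library purely as a stand-in for the sharded SQL store. The trips columns follow the table above with the coordinates omitted for brevity. The trip_events columns (`event_id`, `from_state`, `to_state`, `occurred_at`) and the `change_state` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE trips (
    trip_id      TEXT PRIMARY KEY,
    rider_id     TEXT NOT NULL,
    driver_id    TEXT,
    state        TEXT NOT NULL,
    fare_cents   INTEGER,
    currency     TEXT,
    created_at   TEXT NOT NULL,
    completed_at TEXT
);
-- Append-only audit log. SQLite enforces the REFERENCES clause only
-- when PRAGMA foreign_keys is switched on.
CREATE TABLE trip_events (
    event_id    INTEGER PRIMARY KEY AUTOINCREMENT,
    trip_id     TEXT NOT NULL REFERENCES trips(trip_id),
    from_state  TEXT,
    to_state    TEXT NOT NULL,
    occurred_at TEXT NOT NULL
);
""")


def change_state(conn, trip_id, to_state, occurred_at):
    """Update the current state and append an audit row in one transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        row = conn.execute(
            "SELECT state FROM trips WHERE trip_id = ?", (trip_id,)
        ).fetchone()
        if row is None:
            raise KeyError(trip_id)
        conn.execute(
            "UPDATE trips SET state = ? WHERE trip_id = ?", (to_state, trip_id)
        )
        conn.execute(
            "INSERT INTO trip_events (trip_id, from_state, to_state, occurred_at) "
            "VALUES (?, ?, ?, ?)",
            (trip_id, row[0], to_state, occurred_at),
        )


with conn:
    conn.execute(
        "INSERT INTO trips (trip_id, rider_id, state, created_at) VALUES (?, ?, ?, ?)",
        ("t1", "r1", "requested", "2024-01-01T00:00:00Z"),
    )
change_state(conn, "t1", "matched", "2024-01-01T00:00:20Z")
```

The trips row always holds the current state for fast reads, while trip_events keeps every step for audits and dispute handling.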
Deep dives
These are the conversations the interviewer is steering you toward. Practice each one until you can talk through it without notes.
Matching riders with drivers in real time
When a ride request arrives, you need to find idle drivers within a small radius (say 3 km) in under 100 ms. Brute force scanning every driver record is out of the question at 1.5M concurrent drivers. The standard answer is a geospatial index: H3 hexagons (Uber's own published library) or geohash strings as the shard key into Redis. A radius query expands to a small set of neighbor hexagons and merges results.
Once you have candidates, you rank by ETA (using a routing service) and acceptance rate, then dispatch offers. Offers can be parallel (broadcast to top 5 drivers, first acceptor wins, others get a rejection notice) or sequential (offer to one driver at a time with a 10-second window). Parallel reduces match latency but causes wasted notifications. Sequential is cleaner but slower.
A common choice is parallel, with the tie broken by an atomic first-writer-wins claim on the ride request, plus a Redis distributed lock on driver_id so that one driver is never committed to two rides at once.
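Ranking and first-acceptor-wins fit in a few lines. In this sketch the ordering rule (ETA first, acceptance rate as the tie-breaker) is an assumption, since the text only names the two signals. `Candidate`, `OfferBoard`, and `request_id` are hypothetical names, and a plain dict stands in for a shared store with an atomic set-if-absent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    driver_id: str
    eta_s: float            # seconds to pickup, as returned by a routing service
    acceptance_rate: float  # between 0.0 and 1.0


def rank(candidates, top_n=5):
    """Lowest ETA first; a higher acceptance rate breaks ties."""
    ordered = sorted(candidates, key=lambda c: (c.eta_s, -c.acceptance_rate))
    return ordered[:top_n]


class OfferBoard:
    """Records which driver claimed each ride request. First acceptor wins."""

    def __init__(self):
        self._winner = {}  # request_id -> driver_id

    def accept(self, request_id, driver_id):
        """Return True if this driver holds the request, False if another won."""
        # setdefault stores the value only when the key is absent, mirroring
        # an atomic set-if-absent in a shared store.
        return self._winner.setdefault(request_id, driver_id) == driver_id
```

Because `accept` returns True again for the winning driver, a retried accept call from the same driver is harmless, while every other driver gets a clean rejection.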
Handling driver location updates at 375K writes per second
Every driver emits a location every 4 seconds. At 1.5M concurrent drivers that is 375K writes per second of GPS data. You do not write this to your durable SQL store. You write to a sharded in-memory geo index (Redis with GEOADD or a custom service). The shard key is the H3 cell so writes for the same area land on the same shard, which keeps proximity queries fast. For audit and analytics, downsample the trace (one point every 30 seconds) and write it to a Kafka topic for batch storage in S3 or BigQuery.
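Downsampling the trace is a simple filter. The sketch below assumes each point is a `(timestamp_s, lat, lng)` tuple and that points arrive sorted by time; the function name and tuple layout are hypothetical.

```python
def downsample(points, min_gap_s=30):
    """Keep the first point, then each point at least min_gap_s after the last kept one.

    points: iterable of (timestamp_s, lat, lng) tuples sorted by timestamp.
    """
    kept = []
    last_ts = None
    for ts, lat, lng in points:
        if last_ts is None or ts - last_ts >= min_gap_s:
            kept.append((ts, lat, lng))
            last_ts = ts
    return kept


# One ping every 4 seconds for two minutes.
trace = [(t, 37.0 + t * 1e-5, -122.0) for t in range(0, 121, 4)]
sampled = downsample(trace)
```

With pings on a 4-second cadence, the kept points land 32 seconds apart, because 32 is the first multiple of 4 at or past the 30-second gap.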
Surge pricing without runaway feedback loops
Surge is a multiplier applied to the base fare when demand exceeds supply in a small geographic cell. A Flink job aggregates supply (idle drivers in cell) and demand (incoming ride requests in cell) per minute per H3 cell. The multiplier is published to a Redis cache keyed by cell ID. The Matching Service reads the multiplier when quoting fares. To prevent runaway loops where surge causes riders to abandon, which lowers demand, which removes surge, the multiplier is smoothed over a 5-minute window and clamped to a max (e.g., 5x). The multiplier is locked in at request time so the rider sees the same price they accepted.
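The smoothing and clamping step can be sketched as a moving average per cell. The 5-minute window and the 5x cap come from the paragraph above. The raw pricing curve (demand divided by supply, floored at 1.0) is a placeholder assumption, and `SurgeSmoother` and `raw_multiplier` are hypothetical names.

```python
from collections import defaultdict, deque

MAX_MULTIPLIER = 5.0   # clamp
WINDOW_MINUTES = 5     # smoothing window, one sample per minute


def raw_multiplier(demand, supply):
    """Placeholder curve: demand/supply ratio, clamped to [1.0, MAX_MULTIPLIER]."""
    if demand <= 0:
        return 1.0
    ratio = demand / max(supply, 1)  # avoid dividing by zero idle drivers
    return min(MAX_MULTIPLIER, max(1.0, ratio))


class SurgeSmoother:
    """Keeps the last WINDOW_MINUTES raw multipliers per cell and averages them."""

    def __init__(self):
        self._history = defaultdict(lambda: deque(maxlen=WINDOW_MINUTES))

    def update(self, cell_id, demand, supply):
        """Record one minute of counts for a cell and return the published multiplier."""
        window = self._history[cell_id]
        window.append(raw_multiplier(demand, supply))
        return sum(window) / len(window)
```

Clamping each raw sample before averaging means one extreme minute cannot pin a cell at the cap for the whole window. The averaging itself is what stops the multiplier from flipping on and off as riders react to the price.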
Trip state machine and exactly-once payment
A trip has roughly 6 states: requested, matched, en_route_to_pickup, in_trip, completed, canceled. State transitions are guarded by a finite state machine and persisted atomically. The tricky one is completed: the server must capture payment exactly once even if the driver app retries the complete call due to a network glitch. The fix is idempotency: the client generates an idempotency key when it first issues the complete call, the server stores it on first success, and any retry with the same key returns the cached result without re-charging. The payment row also has its own idempotency key so the call to Stripe is itself idempotent.
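The retry behavior is worth showing end to end. This sketch keeps the idempotency records in a dict and uses a fake gateway that counts charges. Both are stand-ins, and the class and field names are hypothetical.

```python
class FakeGateway:
    """Stand-in for a payment provider; records every charge it performs."""

    def __init__(self):
        self.charges = []

    def charge(self, trip_id, amount_cents):
        self.charges.append((trip_id, amount_cents))
        return f"ch_{len(self.charges)}"


class TripCompleter:
    """Completes a trip at most once per idempotency key."""

    def __init__(self, gateway):
        self._gateway = gateway
        self._results = {}  # idempotency_key -> stored response

    def complete(self, idempotency_key, trip_id, amount_cents):
        cached = self._results.get(idempotency_key)
        if cached is not None:
            return cached  # retry: replay the stored response, do not charge again
        charge_id = self._gateway.charge(trip_id, amount_cents)
        result = {"trip_id": trip_id, "state": "completed", "charge_id": charge_id}
        self._results[idempotency_key] = result
        return result


gateway = FakeGateway()
completer = TripCompleter(gateway)
first = completer.complete("key-123", "t1", 1850)
retry = completer.complete("key-123", "t1", 1850)
```

In a real service the key lookup, the state change, and the stored response would live in one database transaction behind a unique constraint, so two concurrent retries cannot both reach the gateway.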
Trade-offs to discuss
Every senior interviewer expects you to surface at least 3 of these. Pick the decisions, state the alternatives, and justify your choice.
Redis vs custom quadtree service for geo indexing
Redis geospatial commands (GEOADD to write, GEOSEARCH to query) are operationally simple and answer radius queries in low single-digit milliseconds at up to a few million keys per shard. A custom quadtree service is more flexible (dynamic cell sizing, denser cells in cities) but is a multi-month build. Start with Redis. Move to a custom service only once Redis stops meeting your latency or memory targets.
WebSocket vs long polling for real-time updates
WebSocket gives lower latency and lower battery cost than long polling. The downside is long-lived sticky connections, which complicate load balancing and rolling deploys. Use WebSocket on a dedicated gateway with connection draining. Fall back to FCM and APNs background pushes when the app is killed.
Sharded SQL vs DynamoDB for trips
Trips need strong consistency for the state machine and idempotent payment. DynamoDB offers strong consistency but its transaction story is more limited than Postgres. Sharded Postgres (by city) gives you joins for analytics queries and is easier to reason about. The cost is sharding ops. At Uber scale, sharded Postgres or MySQL is the proven path.
Parallel vs sequential driver offers
Parallel gets you lower match latency at the cost of a wasted notification for every driver who does not win the offer. Sequential is cleaner for drivers but can push match latency past 30 seconds when the first few drivers decline or time out. A common hybrid is to offer in parallel to the top 1 to 3 drivers and guard acceptance with a lock on the ride request so that only the first acceptor wins.
Storing every GPS ping vs downsampled traces
Storing every ping means roughly 32 billion points per day (375K per second x 86,400 seconds), which dwarfs the trip records themselves, and almost nothing reads it. Downsample to one point every 30 seconds for completed trips. Discard idle-driver pings entirely after they expire from the in-memory index.
How Uber actually does it
Uber published H3, its open-source hexagonal hierarchical geospatial index, which is now widely used in ride-hailing and delivery (DoorDash, Grab, and others use it). Uber's matching system was originally a Node.js service called Dispatch that has since been rewritten in Go. The trip state is stored in Schemaless, Uber's MySQL-based KV layer. Payments use Cherami, Uber's in-house message queue, for queueing and Stripe and Braintree for capture. Location streaming uses Apache Kafka for fan-out into Flink jobs that compute surge and ETA features.
Lessons to study before this interview
If any of these topics are fuzzy, the interviewer will catch it. Each lesson is 15 to 60 minutes with diagrams, code, and a quiz.
- Geospatial Indexing (intermediate / database types storage)
- Distributed Locks (advanced / distributed systems core)
- WebSockets (intermediate / messaging event systems)
- Database Sharding (foundation / database fundamentals)
- Idempotency (foundation / core fundamentals)
- Capstone: Design Uber Ride-Sharing (capstone / capstone)