Zepto System Design Interview: 10-Minute Delivery at Scale
Zepto promises groceries in about 10 minutes by keeping its own stock in hundreds of dark stores placed within a couple of kilometers of customers, and packing an order in under 75 seconds. It reported growing past half a million orders a day and an annualized order value near 4 billion dollars, which forced it to rebuild its order pipeline to survive the peak-hour rush.
Designing Zepto is a quick-commerce problem, which differs from food delivery. Zepto does not pick from restaurants or third-party shops; it stocks its own small warehouses, called dark stores, placed close to customers, and delivers in about 10 minutes. That promise drives everything: dense stores, a small curated set of high-demand products, in-store picking in under 75 seconds, and real-time inventory per store so it never sells what a specific store does not have. This walkthrough centers on the dark-store model and the data architecture Zepto has published, including a purpose-built order pipeline that splits fast draft orders from durable confirmed orders. Where it covers store placement and dispatch, it describes the general quick-commerce pattern rather than Zepto-published internals.
Asked at: Zepto, Blinkit, Swiggy Instamart, and other quick-commerce and retail teams. The general forms of the question (design a grocery-delivery app, a warehouse or inventory system, or a hyperlocal fulfillment system) show up at most product companies in SDE2 and SDE3 rounds. Zepto is a good question because the 10-minute promise turns a delivery problem into a tightly coupled inventory, warehouse, and fulfillment problem with a strict latency budget.
Why this question is asked
Quick commerce forces choices that a normal e-commerce or food-delivery design does not. The 10-minute promise means you cannot rely on a courier picking from a distant store; you need your own stock physically near the customer, which turns store placement, per-store inventory, and in-store picking into first-class system problems. Real-time inventory becomes a hard correctness constraint, because selling an item a specific dark store has run out of breaks the promise. And the order volume concentrates into peak hours, which stresses the order database in a very specific way. Interviewers use Zepto to check whether you understand the dark-store model, can design hyperlocal real-time inventory, and can reason about an order pipeline that stays fast and correct under peak-hour contention, rather than just drawing a generic cart and checkout.
Requirements
Always clarify these in the first 5 minutes of the interview. Do not start drawing boxes until both lists are agreed.
Functional requirements
- Customer enters an address and sees the products actually available at the dark store that serves them
- Customer browses a curated catalog, builds a cart, and places an order, paying by UPI, card, or wallet
- The order is assigned to the serving dark store, picked, packed, and delivered in about 10 minutes
- In-store staff get a picking task that assembles the order quickly, in under about 75 seconds
- A rider is dispatched to carry the order from the dark store to the customer
- Inventory per dark store is tracked in real time, so the app only offers what is in stock at that store
- Order tracking, and handling of items that turn out to be unavailable
- Absorb evening and weekend demand peaks
Non-functional requirements
- Real-time, per-store inventory accuracy, so a customer is never sold an item their store does not have
- The whole flow, order to doorstep, fits inside roughly a 10-minute budget
- Fast product and availability reads on the storefront, which is browse-heavy
- The order write path stays fast and correct during peak hours, when orders concentrate
- High availability on the order and inventory paths, since downtime directly loses orders
- Scale to well over half a million orders a day across hundreds of stores
- Efficient enough per order to work at quick-commerce margins
Back-of-envelope scale estimates
Show your math. Pulling numbers from thin air signals you have not thought about the load.
Orders per day: 550K+ (2024), millions (per Zepto)
Analyst estimates put Zepto around 550,000 orders a day in early 2024, rising past 700,000 later that year, and Zepto's own engineering writing refers to millions of orders a day. This concentrated volume is what stresses the order pipeline.
Dark stores: 300+ (early 2024) to 550+
Zepto reportedly ran more than 300 dark stores in early 2024, growing past 550, each doing on the order of 1,500 orders a day. Store count and density are what make 10-minute delivery physically possible.
Delivery and packing time: ~10 min delivery, under 75 s packing
Zepto's promise is delivery in about 10 minutes, and it reported assembling an order inside a dark store in under 75 seconds. The picking time is a real, published constraint that shapes store layout and the picking flow.
Annualized order value: ~$1.2B (2024) to ~$4B (2025)
Zepto's annualized gross order value was reported around 1.2 billion dollars in early 2024 and near 4 billion dollars by early 2025, a roughly threefold rise. It shows how fast the volume the systems carry has grown.
Catalog per store: curated, thousands of SKUs
A dark store carries a curated set of high-demand products, in the low thousands of SKUs, rather than a full supermarket range, because the store is small and the assortment is chosen for turnover. The overall catalog is larger, in the tens of thousands.
Order API performance: sub-10 ms draft writes
After rebuilding its order intake on DynamoDB, Zepto reported the create-order path improving average throughput by about 60 percent with roughly 40 percent better p99 latency and sub-10-millisecond writes. This is the peak-survival number.
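To show the math rather than quote it, the reported daily figures can be turned into per-second rates. The sketch below is a rough estimate only: the 550,000 orders a day, the store counts, and the 1,500 orders per store come from the figures above, while the peak share (half the day's orders) and the peak window (four hours) are assumptions chosen for illustration, not Zepto numbers.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def avg_orders_per_second(orders_per_day: int) -> float:
    """Average write rate if orders were spread evenly over the day."""
    return orders_per_day / SECONDS_PER_DAY


def peak_orders_per_second(orders_per_day: int, peak_share: float, peak_hours: float) -> float:
    """Write rate if `peak_share` of the day's orders land inside `peak_hours` hours."""
    return orders_per_day * peak_share / (peak_hours * 3600)


ORDERS_PER_DAY = 550_000   # reported analyst estimate, early 2024
PEAK_SHARE = 0.5           # assumption: half of daily orders arrive in the peak window
PEAK_HOURS = 4             # assumption: evening peak lasts about four hours

avg_rate = avg_orders_per_second(ORDERS_PER_DAY)
peak_rate = peak_orders_per_second(ORDERS_PER_DAY, PEAK_SHARE, PEAK_HOURS)

# Cross-check the daily total against stores x orders per store.
low_estimate = 300 * 1_500    # 300 stores in early 2024
high_estimate = 550 * 1_500   # 550 stores later on

print(f"average: {avg_rate:.1f} orders/s")   # average: 6.4 orders/s
print(f"peak:    {peak_rate:.1f} orders/s")  # peak:    19.1 orders/s
print(low_estimate, high_estimate)           # 450000 825000
```

The point to make in the interview is that the average rate is modest and the peak is about three times higher under these assumptions, so the order write path should be sized for the peak window, and every order attempt (not just every confirmed order) counts toward it.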
High-level architecture
Design Zepto around the dark store, because the 10-minute promise makes physical fulfillment the center of the system, not an afterthought. Zepto runs on AWS and has co-published a fair amount of its data architecture, while the store-placement, picking, and dispatch algorithms are the general quick-commerce pattern and are described as such.
The model is owned micro-fulfillment. Unlike food delivery, where a courier picks up from a restaurant, Zepto stocks its own small warehouses, dark stores, each holding a curated set of high-demand products and placed within a couple of kilometers of the customers it serves. When a customer opens the app, the system maps their address to the dark store that serves that area and shows only what that specific store has in stock. This is why real-time, per-store inventory is a correctness requirement: the catalog a customer sees is really the shelf of one nearby store, and offering an out-of-stock item breaks the promise. The store-to-address mapping and store placement are the standard hyperlocal approach and are not something Zepto has published in detail.
The data architecture is where Zepto has published specifics. The product catalog and availability run on MongoDB, which Zepto adopted for the catalog and product service because the data is naturally nested and document-shaped and MongoDB's in-memory serving replaced an older setup of relational storage plus large caches, cutting critical read latency by about 40 percent and multiplying the traffic it could handle. Product search and discovery run on OpenSearch. The transactional core, user management, payments, and confirmed orders, runs on Amazon Aurora PostgreSQL, where the confirmed-orders table had grown to billions of rows over the years.
The order pipeline is the standout. Zepto found that at peak hours, funneling every incoming order straight into the giant Aurora orders table caused lock and maintenance contention that hurt latency exactly when volume was highest. So it split the pipeline. Incoming orders are first written to a draft-order service on DynamoDB, which accepts them with single-digit-millisecond writes and absorbs the peak, and only confirmed, paid orders then flow into the Aurora order-management service for durable transactional handling. This two-phase design, a fast key-value front for intake and a relational store for the committed record, improved the create-order path's throughput by about 60 percent with better tail latency.
Around all this, managed Kafka carries events between services and streams change-data-capture to a data lake on S3, and analytics flow through Kafka into ClickHouse. The services run on Kubernetes.
In a real interview, sketch this on the whiteboard before diving into any single box.
Core components
Walk through each service. The interviewer wants to hear what each one owns, not just the names.
Serviceability and dark-store mapping
Maps a customer's address to the dark store that serves their area, so the app shows that store's real stock. Dark stores are small warehouses placed within a couple of kilometers of customers. The mapping and placement follow the standard hyperlocal pattern rather than a Zepto-published design.
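One way to talk through the mapping is a nearest-store lookup with a service radius. This is a minimal sketch of the general hyperlocal pattern, not Zepto's published design: the `DarkStore` fields, the 2 km radius, the coordinates, and the linear scan are all assumptions for illustration. A production system would more likely use a geospatial index or precomputed service polygons.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Optional

EARTH_RADIUS_KM = 6371.0


@dataclass(frozen=True)
class DarkStore:
    store_id: str
    lat: float
    lon: float
    radius_km: float   # hypothetical service radius
    active: bool = True


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def serving_store(lat: float, lon: float, stores: Iterable[DarkStore]) -> Optional[DarkStore]:
    """Return the closest active store whose radius covers the address, or None."""
    best, best_km = None, float("inf")
    for store in stores:
        if not store.active:
            continue
        km = haversine_km(lat, lon, store.lat, store.lon)
        if km <= store.radius_km and km < best_km:
            best, best_km = store, km
    return best


# Made-up example stores and addresses.
STORES = [
    DarkStore("store-a", 12.9716, 77.5946, radius_km=2.0),
    DarkStore("store-b", 12.9352, 77.6245, radius_km=2.0),
]

near_a = serving_store(12.9700, 77.5950, STORES)
near_b = serving_store(12.9360, 77.6240, STORES)
far_away = serving_store(13.2000, 77.6000, STORES)
print(near_a.store_id, near_b.store_id, far_away)  # store-a store-b None
```

Returning `None` is the "not serviceable" case: the app should say so up front instead of showing a catalog it cannot deliver in time.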
Catalog and product service (MongoDB)
Serves the product catalog and availability. Zepto built this on MongoDB because product data is nested and document-shaped, and MongoDB's in-memory serving replaced an older relational-plus-cache setup, cutting critical read latency by about 40 percent and handling several times the traffic. The browse-heavy storefront reads from here.
Search and discovery (OpenSearch)
Powers product search and discovery over the catalog. Read-heavy and tuned for fast, relevant results as customers look for items, sitting alongside the catalog service.
Draft-order service (DynamoDB)
The fast front of the order pipeline. Every incoming order is written here first with single-digit-millisecond writes, which absorbs the peak-hour rush without contending on the main relational orders table. Zepto introduced this to fix peak-time contention, improving create-order throughput by about 60 percent.
Order management (Aurora PostgreSQL)
The durable transactional core for confirmed, paid orders, plus user management and payments, on Amazon Aurora PostgreSQL. The confirmed-orders table had grown to billions of rows. Only committed orders flow here from the draft service, keeping the heavy relational store off the peak intake path.
Real-time inventory
Tracks stock per SKU per dark store in near real time, so the catalog only offers what a store actually has and picking works against accurate counts. Zepto's low-latency data layer serves this live stock state to the app, packers, and riders. The exact decrement and reservation mechanism is not published, so it is described as the standard hyperlocal-inventory pattern.
In-store picking and fulfillment
The dark-store operation that turns an order into a packed bag in under about 75 seconds, then hands it to a rider for the short trip to the customer. Path-optimized picking and rider dispatch follow the general quick-commerce pattern; the sub-75-second packing time is a figure Zepto reported.
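Zepto has not published its picking algorithm, so any concrete example is illustrative. The sketch below uses a serpentine (S-shaped) route, a common warehouse heuristic: visit aisles in order and reverse direction on every other aisle so the picker never walks back to the start of an aisle. The `PickLine` type and the aisle and slot numbering are hypothetical.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List


@dataclass(frozen=True)
class PickLine:
    sku_id: str
    aisle: int   # hypothetical aisle number, counted from the packing station
    slot: int    # hypothetical position along the aisle


def serpentine_route(lines: List[PickLine]) -> List[PickLine]:
    """Order pick lines aisle by aisle, reversing direction on alternate visited aisles."""
    ordered = sorted(lines, key=lambda line: (line.aisle, line.slot))
    route: List[PickLine] = []
    for index, (_, group) in enumerate(groupby(ordered, key=lambda line: line.aisle)):
        items = list(group)
        if index % 2 == 1:
            items.reverse()  # walk this aisle in the opposite direction
        route.extend(items)
    return route


order_lines = [
    PickLine("milk", aisle=2, slot=5),
    PickLine("bread", aisle=1, slot=3),
    PickLine("eggs", aisle=2, slot=1),
    PickLine("chips", aisle=1, slot=8),
    PickLine("soap", aisle=4, slot=2),
]

route = [line.sku_id for line in serpentine_route(order_lines)]
print(route)  # ['bread', 'chips', 'milk', 'eggs', 'soap']
```

With only a handful of items per order and a small store, a simple heuristic like this is usually enough; the larger win is arranging high-turnover items close to the packing station.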
Event backbone and analytics
Managed Kafka carries events between services and streams change-data-capture into a data lake on S3, with change capture via database streams and stream processing. Analytics flow through Kafka into ClickHouse for real-time insight. This decouples slow and analytical work from the live order path.
Data model
Pick the right store per table. Justify each choice with the access pattern, not by reflex.
dark_stores
Fields: store_id (PK), geo, service_area, status
The micro-warehouse and the area it serves. An address maps to one serving store. Placement and density are what make 10-minute delivery possible.
products
Fields: product_id (PK), name, category, attributes (nested), price_paise
The catalog, held on MongoDB as nested documents. Read-heavy and served with in-memory speed. The overall catalog is large, but each store stocks a curated subset.
store_inventory
Fields: store_id, sku_id, available_qty, updated_at
Stock per SKU per dark store, tracked in near real time. This is the correctness core of quick commerce: the app must only offer what this store has, and picking must work against accurate counts.
draft_orders
Fields: draft_id (PK), user_id, store_id, items (list), amount_paise, created_at
The fast intake record on DynamoDB, written for every incoming order with single-digit-millisecond latency. Absorbs the peak so the relational store is not hit on every attempt. Confirmed drafts become orders.
orders
Fields: order_id (PK), user_id, store_id, items (jsonb), amount_paise, payment_id, state, created_at
The durable confirmed order on Aurora PostgreSQL, with billions of rows over time. Strong consistency on the money and the committed order. Only paid orders land here, from the draft service.
deliveries
Fields: delivery_id (PK), order_id (FK), rider_id, picked_at, delivered_at, status
The fulfillment record: picking, rider assignment, and the short trip. The tight time budget, packing under about 75 seconds and delivery in about 10 minutes, lives here.
Deep dives
These are the conversations the interviewer is steering you toward. Practice each one until you can talk through it without notes.
The dark-store model and why it is different from food delivery
The defining choice in quick commerce is owning the fulfillment. Food-delivery apps route a courier to pick up from a restaurant or a third-party store, so their hard problems are matching and dispatch. Zepto instead stocks its own small warehouses, dark stores, each carrying a curated set of high-demand products and placed within a couple of kilometers of customers, and it delivers from there. This changes the whole system. Store placement, what to stock in each store, and mapping a customer to their serving store become first-class problems, and the catalog a customer sees is really the shelf of one nearby store rather than a global marketplace. The 10-minute promise is only physically possible because a store is close and already holds the goods. In an interview, the key insight is that quick commerce trades the asset-light marketplace model for owned, dense, hyperlocal inventory, and every other design decision follows from that trade.
Hyperlocal real-time inventory as a correctness constraint
Because the customer is really shopping one nearby store's shelf, per-store inventory accuracy is not a nicety; it is a correctness requirement. If the app shows an item that this specific dark store has just run out of, either the order fails after the customer has paid or the promise is broken. So stock has to be tracked per SKU per dark store in near real time, updated as orders are picked, and reflected in what the app offers for that address. Zepto rebuilt its data layer partly for this, using a low-latency store to serve live stock state to the app, the packers, and the riders. The design tension is between perfect accuracy on one side and cost and speed on the other: truly real-time global inventory is expensive, so the practical approach is fast per-store counts with reservation at checkout, so two orders cannot claim the last unit, plus reconciliation. Zepto has not published the exact mechanism, so describe this as the standard hyperlocal-inventory pattern while noting that serving live stock state is a published requirement.
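Reservation at checkout is usually explained as a conditional decrement: subtract the quantity only if enough stock remains, in one atomic statement. The sketch below illustrates that standard pattern and is not Zepto's mechanism, which is unpublished. It uses an in-memory SQLite table as a stand-in for the real inventory store, with a trimmed version of the store_inventory fields; the `reserve` and `release` helpers are hypothetical names.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """In-memory stand-in for the per-store inventory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE store_inventory (
               store_id      TEXT NOT NULL,
               sku_id        TEXT NOT NULL,
               available_qty INTEGER NOT NULL CHECK (available_qty >= 0),
               PRIMARY KEY (store_id, sku_id)
           )"""
    )
    return conn


def reserve(conn: sqlite3.Connection, store_id: str, sku_id: str, qty: int) -> bool:
    """Atomically take `qty` units; return False if the store lacks the stock."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE store_inventory SET available_qty = available_qty - ? "
            "WHERE store_id = ? AND sku_id = ? AND available_qty >= ?",
            (qty, store_id, sku_id, qty),
        )
        return cur.rowcount == 1


def release(conn: sqlite3.Connection, store_id: str, sku_id: str, qty: int) -> None:
    """Return units to stock, for example after a failed payment."""
    with conn:
        conn.execute(
            "UPDATE store_inventory SET available_qty = available_qty + ? "
            "WHERE store_id = ? AND sku_id = ?",
            (qty, store_id, sku_id),
        )


conn = make_db()
conn.execute("INSERT INTO store_inventory VALUES ('store-1', 'milk-1l', 1)")

first = reserve(conn, "store-1", "milk-1l", 1)    # takes the last unit
second = reserve(conn, "store-1", "milk-1l", 1)   # nothing left to take
print(first, second)  # True False
```

Because the stock check and the decrement happen in the same statement, two concurrent checkouts cannot both claim the last unit. Key-value stores offer the same guarantee through conditional writes.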
Hitting 10 minutes: density, curation, and sub-75-second packing
The 10-minute budget is spent across several steps: the customer's order reaching the store, picking and packing, and the rider's trip. Zepto attacks each. Store density and the couple-of-kilometer radius keep the rider leg short. A curated, limited assortment of high-turnover products, rather than a full supermarket, keeps stores small enough to be dense and makes picking fast, because there are fewer items and they are arranged for speed. And the in-store picking itself is optimized to assemble an order in under about 75 seconds, a figure Zepto reported, using a path-optimized picking flow so a picker walks the shortest route through the store. The lesson is that a hard end-to-end latency budget has to be decomposed and each segment attacked separately, and that some of the biggest wins are physical and operational, store placement and assortment, not just software.
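Decomposing the budget is easy to show with arithmetic. In the sketch below, only the 10-minute total and the 75-second packing ceiling are reported figures; the 15 seconds for routing the order to the store, the 30 seconds for rider handoff, and the 2 km trip length are assumptions picked to make the example concrete.

```python
BUDGET_S = 10 * 60  # the 10-minute promise, in seconds

# Seconds spent before the rider leaves the store.
segments_s = {
    "order_to_store": 15,    # assumption
    "pick_and_pack": 75,     # reported upper bound for packing
    "handoff_to_rider": 30,  # assumption
}

rider_budget_s = BUDGET_S - sum(segments_s.values())


def required_speed_kmh(distance_km: float, seconds: float) -> float:
    """Average speed needed to cover `distance_km` within `seconds`."""
    return distance_km * 3600 / seconds


TRIP_KM = 2.0  # assumption: a store "within a couple of kilometers"

print(rider_budget_s)                               # 480
print(required_speed_kmh(TRIP_KM, rider_budget_s))  # 15.0
```

Under these assumptions the rider has 8 minutes and must average 15 km/h door to door, which is why the store radius matters so much: doubling the distance doubles the required speed, and no amount of software optimization recovers that.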
The order pipeline: splitting fast intake from the durable record
This is Zepto's clearest published engineering decision. Its confirmed-orders table on Aurora PostgreSQL had grown to billions of rows, and at peak hours, writing every incoming order attempt straight into that table caused lock contention and database-maintenance contention that hurt latency exactly when order volume was highest. Zepto split the pipeline into two phases. Incoming orders are first written to a draft-order service backed by DynamoDB, a key-value store that accepts them with single-digit-millisecond writes and scales to absorb the peak, decoupled from the heavy relational table. Only once an order is confirmed and paid does it flow into the Aurora order-management service as the durable, transactional record. This gave a reported improvement of about 60 percent in create-order throughput with better tail latency, and sub-10-millisecond writes on the draft path. The general principle is powerful: put a fast, horizontally scalable store in front to absorb high-volume intake, and reserve the strongly consistent relational store for the committed record, so peak load does not fight with the durable write path.
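The two-phase flow can be sketched in a few lines. This is an illustration of the pattern, not Zepto's code: a plain dictionary stands in for the DynamoDB draft table, in-memory SQLite stands in for Aurora PostgreSQL, and the `draft_id` column on `orders` (used to make confirmation idempotent) is an assumption that does not appear in the data model above.

```python
import json
import sqlite3
import uuid


class DraftStore:
    """In-memory stand-in for a key-value draft-order table."""

    def __init__(self):
        self._items = {}

    def put(self, draft: dict) -> None:
        self._items[draft["draft_id"]] = draft

    def get(self, draft_id: str):
        return self._items.get(draft_id)

    def __len__(self) -> int:
        return len(self._items)


def make_orders_db() -> sqlite3.Connection:
    """In-memory stand-in for the relational confirmed-orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE orders (
               order_id     TEXT PRIMARY KEY,
               draft_id     TEXT NOT NULL UNIQUE,  -- assumption: idempotency key
               user_id      TEXT NOT NULL,
               store_id     TEXT NOT NULL,
               items        TEXT NOT NULL,
               amount_paise INTEGER NOT NULL,
               payment_id   TEXT NOT NULL,
               state        TEXT NOT NULL
           )"""
    )
    return conn


def create_draft(drafts: DraftStore, user_id: str, store_id: str, items: list, amount_paise: int) -> str:
    """Phase 1: every order attempt is a cheap key-value write."""
    draft_id = str(uuid.uuid4())
    drafts.put({"draft_id": draft_id, "user_id": user_id, "store_id": store_id,
                "items": items, "amount_paise": amount_paise})
    return draft_id


def confirm_order(drafts: DraftStore, conn: sqlite3.Connection, draft_id: str, payment_id: str) -> str:
    """Phase 2: only a paid draft becomes a durable relational row."""
    draft = drafts.get(draft_id)
    if draft is None:
        raise KeyError(f"unknown draft {draft_id}")
    with conn:
        # A retry hits the UNIQUE draft_id and is ignored, so no duplicate order.
        conn.execute(
            "INSERT OR IGNORE INTO orders VALUES (?, ?, ?, ?, ?, ?, ?, 'CONFIRMED')",
            (str(uuid.uuid4()), draft_id, draft["user_id"], draft["store_id"],
             json.dumps(draft["items"]), draft["amount_paise"], payment_id),
        )
    row = conn.execute("SELECT order_id FROM orders WHERE draft_id = ?", (draft_id,)).fetchone()
    return row[0]


drafts = DraftStore()
orders_db = make_orders_db()

paid = create_draft(drafts, "u1", "store-1", [{"sku_id": "milk-1l", "qty": 2}], 12000)
abandoned = create_draft(drafts, "u2", "store-1", [{"sku_id": "bread", "qty": 1}], 4500)

order_id = confirm_order(drafts, orders_db, paid, "pay-123")
retry_id = confirm_order(drafts, orders_db, paid, "pay-123")  # e.g. a retried payment callback

confirmed = orders_db.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(len(drafts), confirmed, order_id == retry_id)  # 2 1 True
```

The abandoned draft never touches the relational table, which is the point of the split: the large table only sees committed orders, and a retried confirmation cannot create a duplicate.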
Choosing the right store for each workload
Zepto's stack is a good example of matching each workload to a store rather than forcing one database to do everything. The product catalog runs on MongoDB, because product data is naturally nested and document-shaped and MongoDB's in-memory serving let Zepto retire an older relational-plus-large-cache setup, cutting critical read latency by about 40 percent and handling several times the traffic. Search runs on OpenSearch. Order intake runs on DynamoDB for fast, scalable writes. The durable transactional core, users, payments, and confirmed orders, runs on Aurora PostgreSQL for its relational guarantees. Managed Kafka moves events between services and streams change-data-capture to a data lake on S3, and analytics flow through Kafka into ClickHouse. The interview value is the reasoning: document data to a document store, high-volume intake to a fast key-value store, money and committed state to a relational store, and analytics to a columnar store, each chosen for its access pattern.
Surviving the evening and weekend peaks
Quick-commerce demand concentrates into peaks, evenings and weekends, so like other spiky systems the design must target the peak, not the average. Zepto has not published a full surge-handling playbook, but its order-pipeline rebuild is direct evidence of designing for the peak: the whole reason to put a DynamoDB draft-order service in front of Aurora was that peak-hour concentration was causing contention on the relational store. The general levers apply on top of that: scale the stateless services out for known busy windows, serve the browse-heavy catalog and availability from fast, cacheable stores so reads do not overwhelm the backend, and keep inventory reservations correct under contention so popular items in a store are not oversold. The honest framing is to describe the published pipeline split as the concrete peak-survival measure and the rest as the standard spike-handling approach.
Trade-offs to discuss
Every senior interviewer expects you to surface at least 3 of these. Pick the decisions, state the alternatives, and justify your choice.
Owned dark stores versus a marketplace pickup model
A marketplace model, picking from existing restaurants or shops, is asset-light and avoids holding inventory, which is how food delivery works. Zepto instead owns and stocks dense micro-warehouses, which is the only way to guarantee 10-minute delivery, but it means holding inventory, running warehouses, and taking on the risk of unsold stock. The trade is capital and operational complexity in exchange for the speed and control that the quick-commerce promise requires.
A curated, limited assortment versus a full catalog per store
Stocking a full supermarket range in each store would offer more choice, but it would make stores large, sparse, and slow to pick from. A curated set of high-turnover products keeps stores small enough to be dense and near customers and keeps picking fast. The cost is that not everything is available everywhere, accepted because density and speed are what the model sells, and demand data guides what each store carries.
A DynamoDB draft-order front versus writing straight to Aurora
Writing every incoming order directly to the relational orders table is simpler, but at peak, contention on a billions-row Aurora table hurt latency when it mattered most. Putting a DynamoDB draft-order service in front absorbs the peak with fast, scalable writes, and only confirmed orders hit Aurora. The cost is a two-phase pipeline and moving orders between stores, accepted because it improved create-order throughput by about 60 percent and protected the durable path under load.
MongoDB for the catalog versus relational storage plus caches
A relational catalog with large caches in front is a common setup, but Zepto found its product data was naturally nested and that the cache tier had grown unwieldy. Moving the catalog to MongoDB, with document storage and in-memory serving, cut critical read latency by about 40 percent and simplified the stack. The cost is operating another datastore and modeling for it, accepted because it matched the catalog's shape and access pattern better.
Dense small stores versus fewer large warehouses
Fewer large warehouses are cheaper to run per unit of stock and simpler to manage, but they are far from customers, which breaks a 10-minute promise. Many small, dense dark stores put goods within a couple of kilometers of customers, at the cost of more sites, more duplicated inventory, and harder logistics. Quick commerce accepts that higher cost because proximity is the product.
Near-real-time per-store inventory versus perfectly real-time global stock
Perfectly real-time, globally consistent inventory would be ideal but is expensive at this store count and update rate. Fast per-store counts, backed by reservation at checkout and reconciliation, keep the promise that a customer is not sold what their store lacks while staying affordable. The cost is handling rare edge cases through reservation rather than perfect consistency, which is the standard hyperlocal trade.
How Zepto actually does it
Zepto's data architecture is unusually well documented for a young company, through a case study co-authored with AWS and a MongoDB customer story, while the store-placement, picking, and dispatch algorithms are the general quick-commerce pattern rather than Zepto-published designs. Zepto runs on AWS.
It reported building its product catalog and availability on MongoDB, adopted for nested document data with in-memory serving in place of an older relational-plus-cache setup, cutting critical read latency by about 40 percent and handling several times the traffic. It reported running search on OpenSearch, and its transactional core, user management, payments, and confirmed orders, on Amazon Aurora PostgreSQL, where the confirmed-orders table had grown to billions of rows. Its most notable published decision is the order pipeline: to fix peak-hour contention on the Aurora orders table, it introduced a draft-order service on DynamoDB that accepts every incoming order with single-digit-millisecond writes, with only confirmed orders flowing into Aurora, improving the create-order path's throughput by about 60 percent with better tail latency. It uses managed Kafka for inter-service events and change-data-capture to a data lake on S3, and streams analytics through Kafka into ClickHouse, with services on Kubernetes.
On scale and operations, it reported delivering in about 10 minutes with in-store packing under 75 seconds, running hundreds of dark stores that each handle on the order of 1,500 orders a day, and growing past half a million orders a day, with annualized order value rising from about 1.2 billion dollars in early 2024 toward 4 billion by early 2025.
Three accuracy notes for the interview. First, several widely circulated numbers, such as specific daily-order counts above a million, exact store counts near a thousand, a precise average order value, and any peak-concurrency figure, are not from primary sources, so treat them as unverified. Second, the store-to-address mapping, in-store picking algorithm, rider dispatch and batching, and per-store demand forecasting are described here as the standard quick-commerce pattern, not as Zepto-published internals. Third, the exact real-time inventory decrement and reservation mechanism is not public, though serving live stock state is a published requirement.
Sources
- AWS Database Blog, How Zepto scales to millions of orders per day using Amazon DynamoDB: the draft-order and Aurora split, MongoDB catalog, OpenSearch, managed Kafka, and change-data-capture (Zepto co-authored)
- MongoDB, Commerce at scale: Zepto reduces latency by 40 percent with MongoDB: the catalog datastore, in-memory serving, and under-75-second packing
- Zepto Engineering, ClickHouse ingestion at scale: streaming events through Kafka into ClickHouse for analytics
- TechCrunch, Zepto reaches about 1.2 billion dollars in annualized sales per Goldman Sachs: around 550,000 orders a day, 300-plus dark stores, about 1,500 orders per store
- Entrackr, Zepto FY24 financials: revenue, more than 550 dark stores, over 700,000 orders a day, and the 10-minute promise
- Upstox, Zepto nearing 4 billion dollars annualized gross order value with cash burn halved: founder statement, April 2025