Why does Netflix run its own CDN?

At 280 Tbps of peak egress, third-party CDN bills would cost hundreds of millions per year. Open Connect appliances are placed inside ISP networks for free, so the ISP saves transit fees and Netflix saves CDN fees. Users get the bytes from their local POP.

What is adaptive bitrate streaming?

The client downloads a manifest listing every encoded variant of a title (bitrate, codec, resolution). It picks the best variant for its current bandwidth, downloads short segments (4 to 10 seconds), and re-evaluates after each segment. This keeps the stream playing even when bandwidth drops.

Why does Netflix use Cassandra?

Cassandra handles huge write throughput (every bookmark update, every viewing event) and scales horizontally without manual sharding. The trade-off is no joins and, at the low consistency levels used here, only eventual consistency, which Netflix accepts for watch history and recommendations.

How does Netflix recommend what you watch next?

Most rows on your home screen are precomputed by overnight Spark jobs running deep learning models. A thin online service patches in fresh signals (just-watched, time of day) before returning the result. The split keeps heavy ML offline and the online layer fast.

How does DRM work in this design?

The client requests a license from the DRM service after authenticating. The license contains per-title decryption keys, signed for a specific device. The DRM system depends on the platform: Widevine on Android and Chrome, FairPlay on iOS and Safari, and PlayReady on Windows and Xbox.

Design Netflix: System Design Interview Guide

Netflix streams 250+ million hours of video per day to 270+ million subscribers, and at peak accounts for roughly 15% of global downstream internet traffic.

Designing Netflix forces you to think about video encoding pipelines, multi-CDN delivery with adaptive bitrate, a recommendation system that drives 80% of watched content, and a microservices architecture that has to stay up while half a million users press play in the same minute.

Where it shows up

Commonly asked at Meta, Google, Amazon, Netflix, Disney+, Hulu, and YouTube. It is the canonical video streaming system design problem.

Why this question is asked

Design Netflix tests whether you understand video pipelines (encoding, packaging, DRM), CDN economics, HTTP-based adaptive streaming protocols (HLS, DASH), and personalization at scale. It is also one of the few problems where you should bring up cost as an explicit constraint, since bandwidth dominates the bill.

Requirements

Always clarify these in the first 5 minutes of the interview. Do not start drawing boxes until both lists are agreed.

Functional requirements

Users browse a catalog with personalized rows
Users play any title and resume from where they left off
Video adapts bitrate to the user's bandwidth in real time
Users get personalized recommendations on the home screen
Subtitles, audio tracks, and multiple languages are supported
Users can download titles for offline viewing on mobile
Account sharing is limited through household detection

Non-functional requirements

Start-up latency under 2 seconds at the 95th percentile
Buffering ratio below 0.5% of watch time
99.99% availability for the playback service
Global delivery from the nearest edge with no manual region selection
Cost per stream optimized through CDN tiering and codec selection
DRM enforced so content is not trivially extractable
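
It can help to turn the percentage targets above into absolute budgets before discussing them. The sketch below is only illustrative arithmetic: the function names are hypothetical, and it assumes a 365-day year and a two-hour viewing session as the example.

```python
# Convert the non-functional targets into absolute budgets.
# Function names are illustrative, not part of any real API.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, assuming a 365-day year


def downtime_budget_minutes(availability: float) -> float:
    """Minutes per year the playback service may be unavailable."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def buffering_budget_seconds(watch_seconds: float, max_ratio: float = 0.005) -> float:
    """Seconds of rebuffering allowed for a given amount of watch time."""
    return watch_seconds * max_ratio


if __name__ == "__main__":
    # 99.99% availability leaves under an hour of downtime per year.
    print(f"{downtime_budget_minutes(0.9999):.2f} minutes/year")
    # A 0.5% buffering ratio over a two-hour movie.
    print(f"{buffering_budget_seconds(2 * 3600):.0f} seconds of buffering")
```

Stated this way, 99.99% availability is about 52.6 minutes of downtime per year, and a 0.5% buffering ratio is about 36 seconds over a two-hour movie.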

Back-of-envelope scale estimates

Show your math. Pulling numbers from thin air signals you have not thought about the load.

Subscribers

270M

Public Q3 2024 reporting. Assume 1.5 profiles per account on average.

Concurrent streams (peak)

70M

Roughly 25% of subscribers stream at the same time during global peak events.

Average bitrate per stream

4 Mbps

Weighted across SD, HD, and 4K. 4K is 15 Mbps, HD is 5 Mbps, SD is 1 Mbps, with most viewing in HD.

Peak egress bandwidth

280 Tbps

70M concurrent streams times 4 Mbps. This is why CDN placement matters. Netflix runs Open Connect inside ISPs to avoid paying transit on this volume.

Catalog storage per encoded title

1 to 6 TB

Each title is encoded into 100+ variants (codec, resolution, bitrate, HDR, audio language). A two-hour movie is 1 TB after encoding; a series season is 3 to 6 TB.
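
The estimates above can be reproduced in a few lines, which is a useful way to check the units. This is a sketch of the arithmetic only. The subscriber, stream, and bitrate figures come from the estimates above; the SD/HD/4K viewing mix is a hypothetical split chosen so that the weighted average lands on 4 Mbps, not a published number.

```python
# Back-of-envelope check of the scale estimates.
SUBSCRIBERS = 270_000_000
PEAK_CONCURRENT_STREAMS = 70_000_000
AVG_BITRATE_MBPS = 4
MBPS_PER_TBPS = 1_000_000

# Per-tier bitrates from the estimate above.
TIER_MBPS = {"SD": 1, "HD": 5, "4K": 15}
# ASSUMPTION: an illustrative viewing mix with most viewing in HD.
ASSUMED_MIX = {"SD": 0.375, "HD": 0.575, "4K": 0.05}


def weighted_bitrate_mbps(mix: dict, tiers: dict) -> float:
    """Average bitrate given the share of streams in each quality tier."""
    return sum(share * tiers[tier] for tier, share in mix.items())


def peak_egress_tbps(streams: int, avg_mbps: float) -> float:
    """Peak egress in Tbps for a number of concurrent streams."""
    return streams * avg_mbps / MBPS_PER_TBPS


if __name__ == "__main__":
    share = PEAK_CONCURRENT_STREAMS / SUBSCRIBERS
    print(f"concurrent share of subscribers: {share:.1%}")
    print(f"weighted bitrate: {weighted_bitrate_mbps(ASSUMED_MIX, TIER_MBPS):.2f} Mbps")
    print(f"peak egress: {peak_egress_tbps(PEAK_CONCURRENT_STREAMS, AVG_BITRATE_MBPS):.0f} Tbps")
```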

High-level architecture

The control plane (catalog, recommendations, user profile, billing) runs as microservices on AWS, fronted by Zuul, Netflix's API gateway. The data plane (the actual video bytes) is served from Open Connect, Netflix's own CDN, with appliances embedded inside ISP networks.

When a user presses play, the client asks the Playback API for a manifest. The manifest lists the available bitrates and the URLs of the nearest Open Connect appliances. The client uses adaptive bitrate streaming (HLS or DASH) to switch quality based on observed throughput.

Personalization (rows on the home page, ranking within rows) is precomputed offline by ML jobs in Spark and pushed to a Cassandra cluster fronted by EVCache for low-latency reads. Real-time signals (last watched, bookmark position) are written to a Cassandra column family with low consistency requirements.

In a real interview, sketch this on the whiteboard before diving into any single box.

Core components

Walk through each service. The interviewer wants to hear what each one owns, not just the names.

Encoding Pipeline

Ingests source masters, then encodes each title into 100+ variants (codecs: H.264, HEVC, AV1; resolutions: 240p to 4K; HDR profiles; audio tracks per language). Runs as a fleet of GPU and CPU jobs on AWS. The output is packaged into HLS and DASH manifests.

Open Connect (CDN)

Netflix's own CDN, with appliances deployed inside ISP networks. Each appliance caches the most-watched titles for its region. When a user requests a stream, the playback service returns URLs of the nearest appliances. This avoids paying transit on hundreds of terabits.

Playback API

Issues a signed manifest to the client at start. The manifest lists bitrate variants and CDN URLs. Also enforces DRM by issuing a license to authorized clients.

Catalog and Metadata Service

Owns title metadata: titles, descriptions, cast, genres, artwork URLs, parental ratings. Backed by Cassandra. Reads are heavily cached in EVCache because catalog churn is slow.

Recommendation Service

Precomputes ranked rows per user using collaborative filtering and a deep learning model. The top N rows per user are stored in Cassandra. Real-time refinements (just-watched signal, time-of-day re-ranking) happen in a thin online service.

Bookmark and Watch History Service

Writes the current playback position every 5 seconds during a stream. Backed by Cassandra with low consistency, since a slightly stale bookmark is harmless. The position is read back when the user resumes a title.
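
The write load this implies is worth computing, because it is the main argument for a write-optimized store. A rough sketch, assuming every one of the 70M peak concurrent streams from the scale estimates writes a bookmark every 5 seconds with no client-side batching (a simplifying assumption):

```python
# Peak bookmark write rate, assuming one write per stream per interval.
def bookmark_writes_per_second(concurrent_streams: int, interval_seconds: int) -> float:
    """Steady-state writes per second generated by periodic bookmark updates."""
    return concurrent_streams / interval_seconds


if __name__ == "__main__":
    rate = bookmark_writes_per_second(70_000_000, 5)
    print(f"{rate:,.0f} bookmark writes/second at peak")
```

Under these assumptions the service sees about 14 million writes per second at peak, which is why a longer write interval or client-side coalescing is a reasonable follow-up to raise in the interview.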

DRM and License Service

Issues Widevine, PlayReady, or FairPlay licenses to authenticated clients. Keys are per-title and per-device, with rotation policies.

Data model

Pick the right store per table. Justify each choice with the access pattern, not by reflex.

titles

title_id (PK), name, description, release_year, duration_seconds, rating, genres, artwork_url

Cassandra. Heavily cached. Updated by editorial workflows, not user actions.

title_variants

variant_id (PK), title_id (FK), codec, resolution, bitrate_kbps, hdr_profile, manifest_url

One row per encoded variant. The Playback API picks the right rows based on client capabilities and constructs the manifest.
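
A minimal sketch of that selection step follows. It is not Netflix's implementation: the `Variant` fields mirror the title_variants columns, but `ClientCapabilities`, `build_manifest`, the integer encoding of resolution, and the shape of the returned dictionary are all assumptions made for illustration. Manifest signing and DRM licensing are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Variant:
    variant_id: str
    title_id: str
    codec: str                  # e.g. "h264", "hevc", "av1"
    resolution: int             # vertical pixels (assumed encoding of the column)
    bitrate_kbps: int
    hdr_profile: Optional[str]  # None for SDR
    manifest_url: str


@dataclass(frozen=True)
class ClientCapabilities:       # hypothetical description of the device
    codecs: frozenset
    max_resolution: int
    supports_hdr: bool


def build_manifest(title_id, variants, caps, appliance_urls):
    """Keep only the variants this client can decode, ordered by bitrate."""
    playable = [
        v for v in variants
        if v.title_id == title_id
        and v.codec in caps.codecs
        and v.resolution <= caps.max_resolution
        and (v.hdr_profile is None or caps.supports_hdr)
    ]
    playable.sort(key=lambda v: v.bitrate_kbps)
    return {
        "title_id": title_id,
        "appliance_urls": list(appliance_urls),
        "variants": [
            {"codec": v.codec, "resolution": v.resolution,
             "bitrate_kbps": v.bitrate_kbps, "url": v.manifest_url}
            for v in playable
        ],
    }
```

Filtering on the server keeps the manifest small and means an older device never sees a codec or HDR profile it cannot play.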

viewing_history

user_id (PK partition), title_id (clustering), last_position_seconds, completed, updated_at

Cassandra. Partitioned by user_id so the resume query is one partition read. Eventually consistent.
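
To make the access pattern concrete, here is a small in-memory stand-in for this table. It is a sketch, not a Cassandra client: the `ViewingHistory` class and its method names are hypothetical, and it uses `updated_at` to model last-write-wins conflict resolution, which is a simplification of how Cassandra resolves concurrent writes by timestamp.

```python
from collections import defaultdict


class ViewingHistory:
    """In-memory model: one partition per user_id, rows keyed by title_id."""

    def __init__(self):
        self._partitions = defaultdict(dict)  # user_id -> {title_id: row}

    def upsert(self, user_id, title_id, last_position_seconds, completed, updated_at):
        """Last write wins: an older update never overwrites a newer one."""
        rows = self._partitions[user_id]
        existing = rows.get(title_id)
        if existing is None or updated_at >= existing["updated_at"]:
            rows[title_id] = {
                "last_position_seconds": last_position_seconds,
                "completed": completed,
                "updated_at": updated_at,
            }

    def resume_position(self, user_id, title_id) -> int:
        """Single-partition lookup; start from 0 if the title was never played."""
        row = self._partitions.get(user_id, {}).get(title_id)
        return row["last_position_seconds"] if row else 0
```

Because every read and write names a user_id, no query ever has to touch more than one partition, which is what makes the eventually consistent model workable here.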

user_recommendations

user_id (PK), row_index, title_ids[], row_label, generated_at

Precomputed by Spark jobs nightly, then patched online by recent activity. Read by the home screen on every load.

Deep dives

These are the conversations the interviewer is steering you toward. Practice each one until you can talk through it without notes.

Adaptive bitrate streaming and the manifest

When a user presses play, the client downloads a manifest (HLS .m3u8 or DASH .mpd) that lists every bitrate variant available for that title. The client estimates its current bandwidth and picks a variant. It downloads short segments (typically 4 to 10 seconds each) and re-evaluates after every segment. If bandwidth drops, it steps down. If buffer health is good and bandwidth is high, it steps up. The encoding pipeline produces a ladder of variants (240p at 300 kbps, 360p at 600 kbps, 480p at 1.2 Mbps, 720p at 2.5 Mbps, 1080p at 5 Mbps, 4K at 15 Mbps) so the client always has a sane choice. The clever part is the manifest: it embeds CDN URLs of the nearest Open Connect appliances, so the client never has to do a separate CDN lookup.
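
The step-up and step-down rule described above can be sketched as a small function. This is a deliberately simple heuristic, not the algorithm any real player ships: the `safety` factor, the `healthy_buffer_seconds` threshold, and the one-rung-at-a-time step-up are assumptions chosen for illustration. The ladder values are the ones listed above.

```python
# Bitrate ladder from the text: (label, kbps), lowest first.
LADDER = [
    ("240p", 300), ("360p", 600), ("480p", 1200),
    ("720p", 2500), ("1080p", 5000), ("4K", 15000),
]


def pick_variant(throughput_kbps: float, buffer_seconds: float, current_index: int,
                 safety: float = 0.8, healthy_buffer_seconds: float = 15.0) -> int:
    """Return the ladder index to use for the next segment."""
    # Only spend a fraction of measured throughput to leave headroom.
    budget = throughput_kbps * safety
    target = 0
    for i, (_, kbps) in enumerate(LADDER):
        if kbps <= budget:
            target = i
    if target < current_index:
        return target                 # bandwidth dropped: step down at once
    if target > current_index and buffer_seconds >= healthy_buffer_seconds:
        return current_index + 1      # buffer is healthy: step up one rung
    return current_index              # otherwise hold the current quality


if __name__ == "__main__":
    index = pick_variant(throughput_kbps=4000, buffer_seconds=20, current_index=2)
    print(LADDER[index][0])
```

Stepping down immediately but stepping up cautiously trades a little picture quality for fewer stalls, which matches the buffering-ratio requirement.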

Why Netflix runs its own CDN (Open Connect)

At 280 Tbps of peak egress, paying a third-party CDN like Akamai or Cloudflare would cost hundreds of millions per year. Netflix built Open Connect: rack-mounted appliances loaded with the most-watched 95% of the catalog, deployed for free inside ISP networks. The ISP saves on transit (they would have paid for that bandwidth anyway). Netflix saves on CDN fees. The user gets lower latency because the bytes are now in their ISP's local POP. The remaining 5% of long-tail catalog is served from Netflix's central S3-backed origin.

Precomputed vs real-time recommendations

Most recommendation rows on the home screen are precomputed once a day by Spark jobs running on AWS. The output (top N titles per row per user) is written to Cassandra. This works because user preferences shift slowly. The exception is the Continue Watching row and the Because You Watched X row: these need to respond to events from the last few minutes. A small online service reads the precomputed base rows and patches in fresh signals from the bookmark service before returning to the client. The split keeps the heavy ML offline (cheap, GPU-batched) and the online layer thin (latency-bounded).
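
A sketch of the online patch step is shown below. The function name, the row dictionaries, and the bookmark fields are assumptions modeled on the data model in this guide. Dropping already-completed titles from the precomputed rows is an extra illustrative rule, not something the source describes.

```python
def patch_home_rows(base_rows, bookmarks, max_titles=10):
    """Merge precomputed rows with fresh bookmark signals at request time.

    base_rows: list of {"row_label": str, "title_ids": [str, ...]}
    bookmarks: list of {"title_id", "last_position_seconds", "completed", "updated_at"}
    """
    # Continue Watching: unfinished titles, most recently watched first.
    in_progress = sorted(
        (b for b in bookmarks if not b["completed"]),
        key=lambda b: b["updated_at"],
        reverse=True,
    )
    continue_ids = [b["title_id"] for b in in_progress][:max_titles]

    # ASSUMPTION: hide titles the user has already finished.
    finished = {b["title_id"] for b in bookmarks if b["completed"]}
    patched = [
        {**row, "title_ids": [t for t in row["title_ids"] if t not in finished]}
        for row in base_rows
    ]

    if continue_ids:
        patched.insert(0, {"row_label": "Continue Watching", "title_ids": continue_ids})
    return patched
```

The online path does only sorting and filtering over a handful of rows, so its latency stays bounded regardless of how heavy the offline models are.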

Handling a Thursday Stranger Things drop

When a new season drops at midnight Pacific, you get a coordinated traffic spike: 10x the normal Thursday load within 30 minutes. The CDN is preheated by replicating all variants to every Open Connect appliance hours before the drop. The Playback API auto-scales based on RPS. The recommendation service is cache-warmed with the new title pre-injected into Trending Now rows. The catch is preventing a thundering herd of logins at midnight: sessions authenticated earlier in the day still hold valid JWTs, so most clients reach the API gateway without a fresh login.

Trade-offs to discuss

Every senior interviewer expects you to surface at least 3 of these. Pick the decisions, state the alternatives, and justify your choice.

Own CDN vs third-party CDN

Owning is cheaper at extreme scale and gives you tighter ISP partnerships. The cost is the engineering investment. Below ~50 Tbps, just use Cloudflare or Akamai. At 280 Tbps, Netflix is past that threshold by more than 5x.

Cassandra vs SQL for watch history

Cassandra wins on write throughput and easy horizontal scaling. The cost is no joins and weak consistency. For watch history, both costs are acceptable: a slightly stale bookmark is harmless, and the only query is by user_id.

Precomputed recommendations vs real-time

Precomputed is cheap, batch-friendly, and lets you run heavy ML models offline. The cost is staleness. Real-time is the opposite. Netflix splits the problem: precompute the rows, patch in real-time signals just before serving.

HLS vs DASH

HLS is mandatory on iOS. DASH has better codec flexibility (AV1, HEVC HDR) and a cleaner manifest format. Netflix delivers both. The client picks based on platform.

Encode every codec or pick one and migrate

Encoding into AV1, HEVC, and H.264 means 3x the storage and 3x the encoding cost. But AV1 saves ~30% bandwidth versus H.264, which dwarfs the storage cost at Netflix's egress volume. The math forces you to encode everything.
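
The bandwidth side of this trade-off is easy to quantify. The sketch below assumes the 280 Tbps peak from the scale estimates is an all-H.264 baseline and that the ~30% saving applies uniformly to whatever share of streams can play AV1. Both are simplifying assumptions, and the function name is hypothetical.

```python
def egress_with_av1_tbps(baseline_tbps: float, av1_share: float,
                         av1_saving: float = 0.30) -> float:
    """Peak egress when a fraction of streams switch from H.264 to AV1."""
    return baseline_tbps * (1.0 - av1_share * av1_saving)


if __name__ == "__main__":
    baseline = 280.0
    for share in (0.25, 0.5, 1.0):
        after = egress_with_av1_tbps(baseline, share)
        print(f"AV1 on {share:.0%} of streams: {after:.0f} Tbps "
              f"(saves {baseline - after:.0f} Tbps)")
```

Even partial AV1 adoption removes tens of terabits from the peak under these assumptions, while the extra encodes are a one-time cost per title.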

How Netflix actually does it

Netflix runs Open Connect, its own CDN, with appliances inside thousands of ISPs. Encoding uses a homegrown pipeline that runs on AWS and produces hundreds of variants per title. Personalization is built on a combination of matrix factorization, deep learning, and contextual bandits. The microservices run on Spring Boot in Java. Fault tolerance was historically handled by Hystrix, which is now in maintenance mode, with Resilience4j as the recommended replacement. Service discovery uses Eureka. The main data store for user state is Cassandra, fronted by EVCache (Netflix's distributed cache built on Memcached).

Lessons to study before this interview

If any of these topics are fuzzy, the interviewer will catch it. Each lesson is 15 to 60 minutes with diagrams, code, and a quiz.

Content Delivery Network (foundation / load balancing proxies)
Adaptive Bitrate Streaming (intermediate / web content delivery)
Video Encoding (intermediate / web content delivery)
Capstone: Design a CDN (capstone)
Capstone: Design YouTube and Netflix (capstone)
