API Design and Protocols
An API is the contract between your system and everyone who depends on it. Get it right and a mobile app, a partner integration, and an internal service can all talk to your backend for years without anyone touching the code that serves them. Get it wrong and a single careless change breaks paying customers in production, a missing rate limit lets one bad client take down the whole service, or a retried payment charges a user twice. The stakes are concrete: Stripe processes billions of dollars through APIs where a duplicated request can mean a real double charge that someone has to refund, and many major platforms build their product on an API before they ever ship a screen.
This category covers how to design those contracts and the protocols that carry them. You will learn the request and response styles teams actually choose between (SOAP, GraphQL, gRPC), the wire formats that encode the bytes (Protocol Buffers, Apache Avro, Apache Thrift, MessagePack), the HTTP transports underneath (HTTP/2 multiplexing, HTTP/3 QUIC), and the operational controls that keep an API healthy in production: rate limiting, throttling, quotas, idempotency keys, versioning, CORS, and API keys. The goal is for you to make these decisions on purpose instead of copying whatever the last tutorial used.
What an API contract really is
An API is a promise about inputs, outputs, and behavior. A client sends a request shaped a certain way and expects a response shaped a certain way, with predictable status codes and errors. The moment a real client depends on that shape, you have a contract you cannot quietly break. That is why API Versioning exists: when you need to change the contract, you give old clients a path that still works while new clients adopt the new one. Skipping versioning is the most common way teams lock themselves into a design they regret.
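As a minimal sketch of that idea, assume a hypothetical service that serves two versions of a user resource side by side. The handler names, the "v1" and "v2" labels, and the field names are illustrative assumptions, not taken from any real API.

```python
from typing import Callable, Dict

# Hypothetical internal record; the field names are illustrative.
USER = {"id": 7, "first_name": "Ada", "last_name": "Lovelace"}


def get_user_v1(user: dict) -> dict:
    # v1 contract: a single "name" field that existing clients depend on.
    return {"id": user["id"], "name": f'{user["first_name"]} {user["last_name"]}'}


def get_user_v2(user: dict) -> dict:
    # v2 contract: the name is split in two. New clients opt in explicitly.
    return {
        "id": user["id"],
        "first_name": user["first_name"],
        "last_name": user["last_name"],
    }


# Both versions stay registered, so old clients keep a path that works.
HANDLERS: Dict[str, Callable[[dict], dict]] = {"v1": get_user_v1, "v2": get_user_v2}


def handle(version: str, user: dict) -> dict:
    handler = HANDLERS.get(version)
    if handler is None:
        raise ValueError(f"unsupported API version: {version}")
    return handler(user)
```

Whether the version travels in the URL path, a header, or an account setting is a separate choice; the sketch only shows that the old shape keeps being served while the new one exists.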
The contract is more than the data shape. It includes who is allowed to call the API (API Keys identify and authenticate a caller), which web pages may call it from a browser (CORS tells the browser which origins are allowed to read the service's responses), and what happens when the same request arrives twice (Idempotency Keys let a client safely retry without causing duplicate side effects). These four concepts, versioning, keys, CORS, and idempotency, are the difference between an API that survives contact with real traffic and one that generates support tickets.
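The idempotency part of that contract can be sketched on the server side as follows. This is a toy in-memory version under stated assumptions: the class name PaymentService, the method create_charge, and the response fields are hypothetical, and a real service would store keys durably and handle concurrent requests.

```python
from typing import Dict


class PaymentService:
    """Toy service that remembers the response for each idempotency key."""

    def __init__(self) -> None:
        self._responses: Dict[str, dict] = {}
        self.charges_made = 0

    def create_charge(self, idempotency_key: str, amount_cents: int) -> dict:
        # A repeated key means a retry: replay the stored response
        # instead of performing the side effect a second time.
        cached = self._responses.get(idempotency_key)
        if cached is not None:
            return cached

        self.charges_made += 1  # the side effect happens exactly once per key
        response = {
            "charge_id": f"ch_{self.charges_made}",
            "amount_cents": amount_cents,
        }
        self._responses[idempotency_key] = response
        return response


service = PaymentService()
first = service.create_charge("order-123", 5000)
retry = service.create_charge("order-123", 5000)  # e.g. after a network timeout
```

Here the retry returns the same charge as the first call and charges_made stays at 1, which is the property that makes client retries safe.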
A useful habit is to treat your API documentation as the source of truth and your implementation as something that must conform to it. When the contract is explicit, every other decision in this category, which protocol, which encoding, which limits, becomes a choice you can reason about instead of an accident.
Choosing a protocol and an encoding
The big architectural choice is the request style. SOAP is the older, heavily standardized approach built on XML and strict contracts, still common in enterprise, banking, and legacy integrations where rigor and tooling matter more than speed. GraphQL lets the client ask for exactly the fields it needs in a single query, which is why front-end heavy products lean on it, and GraphQL Schema is where you define the types and relationships that make that flexibility safe. gRPC uses a compact binary protocol over HTTP/2 and is built for fast service-to-service calls inside a backend, where latency and throughput dominate.
Underneath the protocol sits the encoding, the actual bytes on the wire. Protocol Buffers, Apache Avro, Apache Thrift, and MessagePack are compact binary formats that are typically smaller and faster to parse than JSON. They trade human readability for size and speed, which is usually the right trade between your own services and the wrong one for a public API meant to be easy to consume. A Schema Registry keeps everyone agreeing on the structure of those binary messages over time so a producer and a consumer do not drift apart on what a field means.
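To make the size trade concrete, here is a rough sketch that encodes one record as JSON and as a fixed binary layout. It uses Python's standard struct module as a stand-in, not Protocol Buffers, Avro, Thrift, or MessagePack; those formats add field tags, schemas, and evolution rules that this hand-rolled layout lacks. The record and its field names are made up for illustration.

```python
import json
import struct

# Hypothetical sensor reading; the fields are illustrative.
reading = {"sensor_id": 42, "temperature": 21.5, "ok": True}

# Text encoding: self-describing and readable, but field names travel
# with every message.
json_bytes = json.dumps(reading).encode("utf-8")

# Binary encoding: both sides must already agree on the layout.
# "<" = little-endian, no padding; I = uint32, d = float64, ? = bool.
LAYOUT = struct.Struct("<Id?")
binary_bytes = LAYOUT.pack(reading["sensor_id"], reading["temperature"], reading["ok"])

# Decoding only works because the reader knows the same layout (13 bytes).
sensor_id, temperature, ok = LAYOUT.unpack(binary_bytes)

print(len(json_bytes), len(binary_bytes))
```

The binary form is a fraction of the JSON size, but nobody can read it without the layout, which is the shared knowledge a schema and a schema registry exist to manage.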
The transport layer matters too. HTTP/2 Multiplexing sends many requests over one connection without head-of-line blocking at the application layer, although a lost TCP packet can still stall every stream on that connection. HTTP/3 QUIC pushes further by running over UDP, which shortens connection setup and keeps a lost packet on one stream from stalling the others on lossy networks. You rarely pick these by hand, but knowing what they fix explains why a modern API can feel faster than an old one on identical hardware.
Protecting an API in production
A public API is a shared resource, and shared resources get abused, accidentally or on purpose. Rate Limiting caps how many requests a client may make in a window, while Throttling slows clients down rather than rejecting them outright. The classic algorithms behind these are worth knowing by name: the Token Bucket Algorithm allows controlled bursts, the Leaky Bucket Algorithm smooths traffic into a steady rate, and the Sliding Window counts requests over a moving time range to avoid the unfair edges of fixed windows. API Rate Limiting and Quota Management apply these ideas at the level of individual customers or plans, which is also how usage-based pricing works.
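Two of the algorithms named above can be sketched in a few lines each. These are minimal single-process versions, assuming one limiter object per client; the class names, parameter names, and the injectable clock are choices made for this illustration, and a production limiter would usually keep its state in a shared store.

```python
import time
from collections import deque
from typing import Callable, Deque


class TokenBucket:
    """Allows bursts up to `capacity`, refilled at a steady rate."""

    def __init__(self, capacity: float, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock
        self._tokens = capacity  # start full so an initial burst is allowed
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self._clock()
        # Add tokens for the time that passed, never exceeding capacity.
        gained = (now - self._last) * self.refill_per_second
        self._tokens = min(self.capacity, self._tokens + gained)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


class SlidingWindowLog:
    """Allows at most `limit` requests in any trailing `window_seconds`."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._clock = clock
        self._stamps: Deque[float] = deque()

    def allow(self) -> bool:
        now = self._clock()
        # Drop timestamps that have slid out of the window.
        while self._stamps and self._stamps[0] <= now - self.window_seconds:
            self._stamps.popleft()
        if len(self._stamps) < self.limit:
            self._stamps.append(now)
            return True
        return False
```

Passing the clock in as a parameter keeps the limiters testable without sleeping. The sliding window log stores one timestamp per accepted request, so it costs more memory than the token bucket, which only keeps a counter and a timestamp.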
The other half of production health is efficiency on the wire. Keep-Alive Connections and Connection Reuse avoid paying the cost of opening a new TCP and TLS connection for every call, which is one of the cheapest large wins available. Request Batching and Response Batching combine many small operations into one round trip, cutting overhead when a client needs a lot of small things at once.
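Request batching can be sketched as grouping many small lookups into a few calls. The functions chunked and fetch_all and the fetch_batch callback are hypothetical names for this illustration; fetch_batch stands in for whatever batch endpoint a real API would offer.

```python
from typing import Callable, Iterable, Iterator, List


def chunked(items: Iterable[int], size: int) -> Iterator[List[int]]:
    """Yield consecutive groups of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[int] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final partial group
        yield batch


def fetch_all(ids: Iterable[int],
              fetch_batch: Callable[[List[int]], List[dict]],
              batch_size: int) -> List[dict]:
    """Fetch every id using one round trip per batch instead of per id."""
    results: List[dict] = []
    for batch in chunked(ids, batch_size):
        results.extend(fetch_batch(batch))
    return results


# Stand-in for a network call; it counts round trips so the saving is visible.
round_trips = 0


def fake_fetch_batch(batch: List[int]) -> List[dict]:
    global round_trips
    round_trips += 1
    return [{"id": i} for i in batch]


items = fetch_all(range(7), fake_fetch_batch, batch_size=3)
```

Seven lookups become three round trips here. The right batch size is a trade-off: larger batches cut overhead but make each request slower and a failure more expensive to retry.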
These controls interact. A good API publishes its limits, returns clear errors when a client crosses them, and supports retries that are safe because of idempotency keys. Together they let one service stay up while one misbehaving client is contained, instead of everyone going down together.
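How these pieces meet on the client side can be sketched as a retry loop. It assumes a hypothetical send(payload, headers) callable that returns a (status, headers, body) tuple, and it assumes the server accepts an "Idempotency-Key" request header and may send "Retry-After" in seconds; the header names follow common practice but should be checked against the API being called.

```python
import time
import uuid
from typing import Any, Callable, Dict, Tuple

Response = Tuple[int, Dict[str, str], Any]


def call_with_retries(send: Callable[[Any, Dict[str, str]], Response],
                      payload: Any,
                      max_attempts: int = 4,
                      sleep: Callable[[float], None] = time.sleep) -> Tuple[int, Any]:
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    # One key for the whole logical operation, reused on every attempt,
    # so the server can recognise a retry and avoid a duplicate side effect.
    headers = {"Idempotency-Key": str(uuid.uuid4())}

    for attempt in range(1, max_attempts + 1):
        status, response_headers, body = send(payload, headers)
        retryable = status == 429 or status >= 500
        if not retryable or attempt == max_attempts:
            return status, body
        # Prefer the server's published hint; otherwise back off exponentially.
        fallback = 2 ** (attempt - 1)
        sleep(float(response_headers.get("Retry-After", fallback)))

    raise AssertionError("unreachable")
```

The loop retries only on 429 and 5xx responses, which is a simplification; real clients usually also add jitter to the delay and retry on connection errors.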
How real companies put it together
Most large systems do not pick one protocol and stop. A common pattern is gRPC for internal service-to-service traffic, where speed wins, and a friendlier REST or GraphQL layer at the edge for clients you do not control. To make that edge layer pleasant, teams use API Composition to gather data from several backend services into one response, and Backend for Frontend (BFF), where each client type, web, iOS, Android, gets its own tailored API rather than forcing one generic shape on all of them.
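A rough sketch of both ideas, with in-process functions standing in for backend services: compose_account plays the role of API Composition, and the two view functions play the role of per-client BFF layers. Every name and field here is hypothetical.

```python
from typing import Dict, List

# Stand-ins for two separate backend services.
def fetch_profile(user_id: int) -> Dict[str, object]:
    return {"id": user_id, "name": "Ada", "email": "ada@example.com"}


def fetch_orders(user_id: int) -> List[Dict[str, object]]:
    return [
        {"order_id": 1, "total_cents": 1200, "items": ["book"]},
        {"order_id": 2, "total_cents": 800, "items": ["pen", "ink"]},
    ]


def compose_account(user_id: int) -> Dict[str, object]:
    # API Composition: one edge call fans out to several services
    # and merges their answers into a single response.
    return {"user": fetch_profile(user_id), "orders": fetch_orders(user_id)}


def web_view(account: Dict[str, object]) -> Dict[str, object]:
    # Web BFF: the browser has room for the full detail.
    return account


def mobile_view(account: Dict[str, object]) -> Dict[str, object]:
    # Mobile BFF: a trimmed shape tailored to a small screen.
    user = account["user"]
    orders = account["orders"]
    return {
        "name": user["name"],
        "order_count": len(orders),
        "latest_order_id": orders[-1]["order_id"] if orders else None,
    }


account = compose_account(7)
mobile = mobile_view(account)
```

In a real system the two fetches would be network calls, often issued concurrently, and each BFF would be its own deployable service owned by the team that builds that client.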
Stripe is the textbook example of API discipline: dated API versions that keep old integrations working, idempotency keys on payment creation so a retried request does not double-charge, and clear rate limits with documented errors. Netflix is a well-known example of the BFF idea, because a TV app driven by a remote and a phone have very different needs from the same catalog. GitHub runs both a REST API and a GraphQL API side by side so integrators can choose the style that fits their use case. The lesson across all of them is that protocol, encoding, limits, and shape are separate decisions, and the strongest APIs make each one deliberately.