Picture this. You walk into a coffee shop every single morning and order the exact same thing: a medium cappuccino with oat milk. And every single morning, the barista asks you: "What would you like?" They write it down on a fresh cup. They look up the price. They start from scratch. Every. Single. Time.
Now imagine a different barista. After the third morning, she sees you walk in and just starts making your cappuccino. She remembered. No questions, no lookups, no waiting. That's a local cache.
In software, your application does the same dumb thing as the first barista, all the time. A function computes the same result. A service calls the same API endpoint. A database query returns the same rows. Over and over and over. And each time, it starts from zero.
A local cache is the simplest fix: store the result right there in memory, in the same process, and hand it back instantly the next time someone asks for it. No network call. No disk read. Just memory. Nanoseconds instead of milliseconds.
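For the "a function computes the same result" case, Python's standard library already ships an in-process cache as a decorator. The sketch below is only an illustration: `price_of` and its price table are made-up stand-ins for a slow lookup, and `call_count` exists purely to show how often the real work runs.

```python
from functools import lru_cache

call_count = 0  # counts how often the function body actually runs


@lru_cache(maxsize=128)  # keep up to 128 results in this process's memory
def price_of(drink: str) -> float:
    """Hypothetical slow lookup; stands in for an API or database call."""
    global call_count
    call_count += 1
    prices = {"cappuccino": 4.50, "latte": 4.75}
    return prices[drink]


price_of("cappuccino")  # first call: a miss, the body runs
price_of("cappuccino")  # second call: a hit, answered from memory
info = price_of.cache_info()  # hit/miss statistics kept by lru_cache
```

After these two calls the body has run once, and `info` reports one hit and one miss.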
A local cache is a data store that lives inside the same process as the application that uses it. It's not a separate service. It's not on another server. It's right there, in the same chunk of RAM that your application is already using.
When your code needs a piece of data, it checks the local cache first. If the data is there, called a cache hit, it gets returned immediately. If it's not there, a cache miss, the code fetches it from the original source (database, API, filesystem) and stores a copy in the cache for next time.
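The hit-or-miss read path described above can be sketched in a few lines. This is a minimal illustration, not a production cache: `LocalCache` and `fake_database` are hypothetical names, and the "database" is just a function that builds a string.

```python
from typing import Callable, Dict


class LocalCache:
    """In-process cache that falls back to a loader function on a miss."""

    def __init__(self, load_from_source: Callable[[str], str]) -> None:
        self._entries: Dict[str, str] = {}
        self._load = load_from_source
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> str:
        if key in self._entries:      # cache hit: return straight from memory
            self.hits += 1
            return self._entries[key]
        self.misses += 1              # cache miss: go to the original source
        value = self._load(key)
        self._entries[key] = value    # keep a copy for next time
        return value


def fake_database(key: str) -> str:
    """Stand-in for a real database, API, or filesystem read."""
    return f"row-for-{key}"


cache = LocalCache(fake_database)
first = cache.get("user:42")   # miss: loaded from fake_database
second = cache.get("user:42")  # hit: served from the dictionary
```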
| Property | Local Cache |
|---|---|
| Location | Same process, same memory space |
| Speed | Nanoseconds (memory access) |
| Scope | Only visible to this one process |
| Lifetime | Dies when the process dies |
| Size | Limited by process memory |
| Shared? | No, each instance has its own copy |
The key thing to understand: a local cache is not shared. If you're running three instances of your application behind a load balancer, each instance has its own separate local cache. They don't know about each other. This has consequences, and we'll get into that when we talk about application-level caching.
Your application, the local cache, and the database form a simple triangle. Every read follows the same path: check the cache first, fall back to the database on a miss, and store the result for next time.
Local caches work great on a single server. But the moment you deploy multiple instances behind a load balancer, things get interesting, and not in a good way.
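To see why, here is a small simulation. It is a sketch under loud assumptions: real instances would be separate processes on separate machines, while here `AppInstance` is a hypothetical class and all four "instances" live in one script, each with its own private dictionary.

```python
from typing import Dict, Tuple


class AppInstance:
    """One simulated application instance with its own private local cache."""

    def __init__(self, name: str, database: Dict[str, str]) -> None:
        self.name = name
        self._database = database         # shared source of truth
        self._cache: Dict[str, str] = {}  # NOT shared with other instances

    def get_user(self, user_id: int) -> Tuple[str, str]:
        key = f"user:{user_id}"
        if key in self._cache:
            return self._cache[key], "hit"
        value = self._database[key]
        self._cache[key] = value
        return value, "miss"


database = {"user:42": "Ada"}
instances = [AppInstance(f"instance-{n}", database) for n in range(1, 5)]

_, first = instances[0].get_user(42)   # instance 1: miss, now cached there
_, second = instances[2].get_user(42)  # instance 3: miss again, its cache is empty
_, third = instances[0].get_user(42)   # instance 1: hit

# The data changes at the source, but instance 1 still holds the old copy.
database["user:42"] = "Ada Lovelace"
stale_value, _ = instances[0].get_user(42)  # served from instance 1's cache
fresh_value, _ = instances[1].get_user(42)  # instance 2 reads the source
```

Two things go wrong here: the work done on instance 1 does nothing for instance 3, and after the update, two instances disagree about the same user.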
At its core, a local cache is just a hash map. A dictionary. A key-value store sitting in memory.
You call cache.get("user:42"). The hash map does a constant-time lookup, O(1) on average, and either returns the value or tells you it's not there. That's it. There's no magic. It's the same data structure you learned in your first algorithms class, put to spectacularly practical use.
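One detail worth sketching: "tells you it's not there" has to be distinguishable from "the cached value happens to be empty". The example below assumes a plain Python `dict` as the hash map; `_MISSING` and `lookup` are illustrative names, not part of any library.

```python
from typing import Any, Dict, Tuple

_MISSING = object()  # unique marker meaning "no entry for this key"

# None is a legitimate cached value here, e.g. "we checked, this user has no profile".
cache: Dict[str, Any] = {"user:42": {"name": "Ada"}, "user:7": None}


def lookup(key: str) -> Tuple[bool, Any]:
    """Return (found, value) using a single average-case O(1) dictionary lookup."""
    value = cache.get(key, _MISSING)
    if value is _MISSING:
        return False, None
    return True, value


found_42 = lookup("user:42")  # present, with a real value
found_7 = lookup("user:7")    # present, and the cached value is None
found_99 = lookup("user:99")  # absent
```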
The typical flow looks like this:
Most implementations also set a TTL (Time-to-Live) on each entry. After 60 seconds (or whatever you configure), the entry expires and gets evicted. This prevents you from serving stale data forever. We'll cover TTL in depth in a dedicated lesson.
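As a rough sketch of how per-entry expiry can work, the cache below stores an expiry timestamp next to each value and checks it on read. `TTLCache` is a hypothetical class, and the injectable `clock` parameter is an assumption added so the example can fast-forward time instead of sleeping for a minute.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Dictionary-backed cache whose entries expire after ttl_seconds."""

    def __init__(
        self,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[Any, float]] = {}

    def set(self, key: str, value: Any) -> None:
        # Store the value together with the moment it stops being valid.
        self._entries[key] = (value, self._clock() + self._ttl)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None                   # never cached
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]        # expired: evict and report a miss
            return None
        return value


now = [0.0]  # fake clock so the example does not have to wait
cache = TTLCache(ttl_seconds=60, clock=lambda: now[0])
cache.set("user:42", "Ada")
fresh = cache.get("user:42")    # still within the 60-second window
now[0] = 61.0                   # pretend 61 seconds have passed
expired = cache.get("user:42")  # entry has expired and is evicted
```

This version only evicts lazily, when an expired key is read. Entries that are never read again stay in memory, which is one reason real implementations often add a size limit or a periodic sweep.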
Where local caches shine:
Where they fall apart:
In practice, a local cache is rarely the only cache. Most production systems stack multiple layers, each one trading speed for broader visibility. The local cache sits at the very top of that stack: the fastest layer, but also the most limited.
Local caches are dead simple, and that's both their greatest strength and their biggest limitation.
Use a local cache when:
Do NOT use a local cache when:
The honest truth is that most production systems outgrow local caches quickly. As soon as you scale to multiple servers, you'll want a shared cache like Redis or Memcached. But that doesn't make local caches useless. Far from it. Even in distributed systems, a local cache often sits as the first layer in a multi-tier strategy. Check the local cache first (nanoseconds), then the shared cache (milliseconds), then the database (tens of milliseconds). Each layer catches what the previous one missed.
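The layered lookup can be sketched as a single function. Everything here is a stand-in: plain dictionaries play the role of the local cache, the shared cache (where Redis or Memcached would sit), and the database, and `tiered_get` is a hypothetical helper that also reports which layer answered.

```python
from typing import Dict, Tuple


def tiered_get(
    key: str,
    local: Dict[str, str],
    shared: Dict[str, str],
    database: Dict[str, str],
) -> Tuple[str, str]:
    """Return (value, layer), checking the fastest layer first."""
    if key in local:
        return local[key], "local"
    if key in shared:
        local[key] = shared[key]  # promote into the local cache
        return local[key], "shared"
    value = database[key]         # slowest path: the source of truth
    shared[key] = value           # fill both cache layers on the way back
    local[key] = value
    return value, "database"


database = {"user:42": "Ada"}
shared_cache: Dict[str, str] = {}    # one shared cache for all instances
local_a: Dict[str, str] = {}         # instance A's private cache
local_b: Dict[str, str] = {}         # instance B's private cache

r1 = tiered_get("user:42", local_a, shared_cache, database)  # answered by the database
r2 = tiered_get("user:42", local_b, shared_cache, database)  # answered by the shared cache
r3 = tiered_get("user:42", local_a, shared_cache, database)  # answered by the local cache
```

Instance B never touches the database, because instance A already filled the shared layer. Only the very first request pays the full cost.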
Where does a local cache store its data?
You deploy 4 instances of your app behind a load balancer. User A's data gets cached on Instance 1. User A's next request hits Instance 3. What happens?
What is the typical time complexity of a local cache lookup?