Caching

Every time your application queries a database, calls a third-party API, or runs a complex computation, it does work. Work takes time. If you do the same expensive work on every request, your system is slower than it needs to be. Caching is the solution: store the result the first time, return the stored result every time after.

[Diagram: Cache Miss vs Cache Hit. Both panels show Client, App Server, Cache (Redis), and Database. Cache miss: slow, had to hit the database. Cache hit: fast, returned from cache.]

Cache Hit vs Cache Miss

When a request arrives, your application checks the cache first.

A cache hit means the data is there. You return it immediately — no database query, no computation. The response is fast, the database is untouched. This is the goal.

A cache miss means the data isn't in the cache. Your application falls through to the source — the database, an external API, wherever the data actually lives. It fetches the result, serves it to the user, and stores it in the cache so the next request can hit instead of miss.

The first request for any piece of data always misses. Every request after that hits until the entry expires, is evicted, or is invalidated. A well-designed cache serves the vast majority of reads as hits.
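The hit and miss paths described above can be sketched in a few lines of Python. This is a minimal illustration, not a production cache: the `SimpleCache` class and the `slow_lookup` function are hypothetical names, and a plain dictionary stands in for a real cache server.

```python
from typing import Callable, Dict, List


class SimpleCache:
    """Tiny in-memory cache that counts hits and misses (illustrative only)."""

    def __init__(self, load_from_source: Callable[[str], str]) -> None:
        self._load = load_from_source   # stand-in for a database query or API call
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> str:
        if key in self._store:          # cache hit: the source is not touched
            self.hits += 1
            return self._store[key]
        self.misses += 1                # cache miss: fall through to the source
        value = self._load(key)
        self._store[key] = value        # populate so the next request hits
        return value

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


source_calls: List[str] = []


def slow_lookup(key: str) -> str:
    """Hypothetical expensive lookup; records each call so we can count them."""
    source_calls.append(key)
    return f"profile-for-{key}"


cache = SimpleCache(slow_lookup)
for _ in range(5):
    cache.get("u123")

print(cache.hits, cache.misses, len(source_calls))  # 4 1 1
```

Five requests for the same key reach the source only once. The remaining four are served from the cache, which gives a hit rate of 0.8 in this toy run.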

TTL — Time to Live

Cached data doesn't live forever. Entries are normally stored with a TTL (Time to Live): a duration after which the entry expires and is no longer served. The next request for that data will miss and refetch from the source.

TTL is how you control staleness. A user profile might be fine cached for five minutes — if it changes, users won't notice a brief lag. Stock prices need a TTL of seconds. Static configuration data might be cached for hours.

Setting TTL too short means you don't get much benefit from caching — you're back to hitting the database constantly. Setting it too long means users see stale data. The right TTL depends entirely on how often your data changes and how much staleness your users can tolerate.

user:profile:u123     →  { name: "Ada", ... }  →  TTL: 5 minutes
exchange:BTC/USD      →  { price: 67432 }      →  TTL: 10 seconds
config:feature-flags  →  { ... }               →  TTL: 1 hour
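To make the expiry behaviour concrete, here is a small sketch of a TTL cache in Python. The `TTLCache` class is a hypothetical illustration, not a real library: it expires entries lazily on read, and it accepts an injectable clock so the example can move time forward without sleeping.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """In-memory cache whose entries expire after a per-key TTL (illustrative)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # key -> (value, absolute expiry time according to the clock)
        self._entries: Dict[str, Tuple[Any, float]] = {}

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._entries[key] = (value, self._clock() + ttl_seconds)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None                  # never cached: a miss
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]       # lazily drop the expired entry
            return None                  # expired: also a miss
        return value


# A fake clock makes the expiry visible without waiting in real time.
now = [0.0]
cache = TTLCache(clock=lambda: now[0])
cache.set("user:profile:u123", {"name": "Ada"}, ttl_seconds=300)
cache.set("exchange:BTC/USD", {"price": 67432}, ttl_seconds=10)

now[0] = 11.0                            # 11 seconds later
price_after_11s = cache.get("exchange:BTC/USD")       # expired
profile_after_11s = cache.get("user:profile:u123")    # still fresh

now[0] = 301.0                           # just past five minutes
profile_after_301s = cache.get("user:profile:u123")   # expired
```

The price entry disappears after its 10-second TTL while the profile survives until the five-minute mark, which mirrors the three example keys above.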

Eviction Policies

A cache has finite memory. When it's full and new data needs to be stored, something has to go. The eviction policy decides what.

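LRU (Least Recently Used) is the most common policy. When the cache is full, it evicts the entry that hasn't been accessed for the longest time. The logic is intuitive: if you haven't needed it recently, you probably won't need it soon. Redis supports LRU through its maxmemory-policy setting (for example, allkeys-lru), using an approximation based on sampling keys, but its default policy is noeviction, so LRU has to be enabled explicitly.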
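As a rough sketch of the idea, LRU can be built on Python's `collections.OrderedDict`, which remembers insertion order and lets an entry be moved to the end on access. The `LRUCache` class here is a hypothetical teaching example and is not how any particular cache server implements the policy.

```python
from collections import OrderedDict
from typing import Any, Optional


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._entries: "OrderedDict[str, Any]" = OrderedDict()

    def get(self, key: str) -> Optional[Any]:
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)          # mark as most recently used
        return self._entries[key]

    def set(self, key: str, value: Any) -> None:
        if key in self._entries:
            self._entries.move_to_end(key)      # an update also counts as a use
        self._entries[key] = value
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)   # evict the least recently used


lru = LRUCache(capacity=2)
lru.set("a", 1)
lru.set("b", 2)
lru.get("a")       # "a" is now more recently used than "b"
lru.set("c", 3)    # the cache is full, so "b" is evicted
```

After the last line, "a" and "c" remain and "b" is gone, because "b" was the entry touched longest ago.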

LFU (Least Frequently Used) evicts the entry that has been accessed the fewest times overall. Better for workloads where some data is repeatedly hot and some is rarely touched, but more expensive to track.
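A minimal LFU sketch in Python shows where the extra tracking cost comes from: every entry needs an access counter. The `LFUCache` class is hypothetical and deliberately naive. It scans all counters on eviction, which is O(n), whereas real implementations use cleverer structures or approximations.

```python
from typing import Any, Dict, Optional


class LFUCache:
    """Fixed-capacity cache that evicts the least frequently used entry."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._values: Dict[str, Any] = {}
        self._counts: Dict[str, int] = {}       # key -> number of accesses

    def get(self, key: str) -> Optional[Any]:
        if key not in self._values:
            return None
        self._counts[key] += 1
        return self._values[key]

    def set(self, key: str, value: Any) -> None:
        if key not in self._values and len(self._values) >= self._capacity:
            # Linear scan for the lowest count; ties go to the oldest key.
            victim = min(self._counts, key=self._counts.get)
            del self._values[victim]
            del self._counts[victim]
        self._values[key] = value
        self._counts[key] = self._counts.get(key, 0) + 1   # a write counts as an access


lfu = LFUCache(capacity=2)
lfu.set("hot", 1)
for _ in range(3):
    lfu.get("hot")     # "hot" now has 4 accesses
lfu.set("cold", 2)
lfu.get("cold")        # "cold" has 2 accesses, and was touched most recently
lfu.set("new", 3)      # evicts "cold", the least frequently used
```

Note the contrast with LRU: "cold" was the most recently touched entry, so an LRU cache would have evicted "hot" here instead.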

FIFO (First In, First Out) evicts the oldest entry regardless of how often it's been used. Simple to implement but rarely the right choice — age alone is a poor predictor of future usefulness.
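For comparison, a FIFO sketch in Python is the shortest of the three, because reads never change the eviction order. The `FIFOCache` class is a hypothetical example.

```python
from collections import OrderedDict
from typing import Any, Optional


class FIFOCache:
    """Fixed-capacity cache that evicts the oldest inserted entry."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._entries: "OrderedDict[str, Any]" = OrderedDict()

    def get(self, key: str) -> Optional[Any]:
        return self._entries.get(key)           # reads do not affect eviction order

    def set(self, key: str, value: Any) -> None:
        if key not in self._entries and len(self._entries) >= self._capacity:
            self._entries.popitem(last=False)   # evict the oldest insertion
        self._entries[key] = value              # updating a key keeps its position


fifo = FIFOCache(capacity=2)
fifo.set("a", 1)
fifo.set("b", 2)
for _ in range(10):
    fifo.get("a")      # "a" is read constantly...
fifo.set("c", 3)       # ...but is still evicted, because it is the oldest
```

The heavily read key "a" is the one thrown away, which is the weakness described above.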

For most applications, LRU is the right default. You don't need to think deeply about eviction policy until cache hit rate becomes a problem.

What to Cache

Not all data is worth caching. The best candidates are results that are:

Expensive to compute — database queries that scan millions of rows, aggregations, joins across multiple tables
Frequently requested — the same data fetched by many users or many requests
Rarely changing — data that stays the same long enough for the cache to be useful

Common examples:

Database query results — the leaderboard for your game, the list of products in a category, a user's order history
Computation results — a recommendation engine's output, a rendered report, an aggregated analytics summary
Session data — authentication tokens, user preferences, shopping cart contents
Third-party API responses — weather data, exchange rates, external search results you're proxying

What Not to Cache

Some data is a bad fit for caching:

Frequently changing data: if a value changes every second, caching it for even a minute causes user-visible inconsistency. Examples include price-sensitive financial data, real-time inventory counts, and live scores during a sports match.
Sensitive data: personally identifiable information, payment details, access tokens. Caches are often shared infrastructure, so sensitive data shouldn't sit in them any longer than necessary. Where it has to be cached, as with session tokens, keep the TTL short and restrict who can read the cache.
Unique per-user data that changes often: a live activity feed, notification counts, anything that needs to be exactly right right now.

A useful test: if a user saw the cached version rather than the real one, would they notice or care? If the answer is "yes and it would be a problem", don't cache it.

Tools — Redis vs Memcached

Redis is the default choice for most caching use cases today. It's an in-memory data structure store with persistence options, support for rich data types (strings, lists, sets, sorted sets, hashes), pub/sub, per-key TTLs, and configurable eviction policies including LRU. It's fast, battle-tested, and has mature client libraries for most major languages.

Memcached is simpler — a pure key-value cache with no persistence, no rich types, and a multi-threaded architecture that can slightly outperform Redis on high-core-count machines at extreme scale. It's the right choice in a narrow set of circumstances: you need maximum raw throughput, your values are simple strings, and you don't need persistence or any of Redis's advanced features.

In practice, reach for Redis. The additional capabilities — sorted sets for leaderboards, pub/sub for real-time notifications, persistence for durability — often become useful as your system grows, and the performance difference only matters at a scale most applications never reach.

Cache Invalidation

Cache invalidation is notoriously difficult to get right. Phil Karlton famously said: "There are only two hard things in computer science: cache invalidation and naming things."

The problem: when the underlying data changes, cached copies become stale. You need to either expire them or actively remove them — without serving incorrect data in the window between the change and the expiry.

Three main strategies:

TTL expiry — the simplest approach. Set a TTL and accept that data may be stale for up to that duration. When the TTL expires, the next request fetches fresh data. Works well when brief staleness is acceptable. No coordination required.

Event-based invalidation — when data changes, explicitly delete or update the corresponding cache entry. A user updates their profile → your application deletes user:profile:u123 from Redis → the next request fetches fresh data. This keeps the cache accurate but requires your write path to know what to invalidate, which adds complexity and coupling.
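A minimal Python sketch of event-based invalidation follows. It assumes two plain dictionaries as stand-ins for the database and for Redis, and the function names `get_profile` and `update_profile` are hypothetical.

```python
from typing import Dict, Optional

database: Dict[str, dict] = {"u123": {"name": "Ada"}}   # stand-in for the real database
cache: Dict[str, dict] = {}                             # stand-in for Redis


def cache_key(user_id: str) -> str:
    return f"user:profile:{user_id}"


def get_profile(user_id: str) -> Optional[dict]:
    key = cache_key(user_id)
    if key in cache:
        return cache[key]                   # hit
    profile = database.get(user_id)         # miss: read from the source
    if profile is not None:
        cache[key] = dict(profile)          # populate for later reads
    return profile


def update_profile(user_id: str, new_profile: dict) -> None:
    database[user_id] = dict(new_profile)   # 1. write to the source of truth
    cache.pop(cache_key(user_id), None)     # 2. invalidate the cached copy


first = get_profile("u123")                 # miss, populates the cache
update_profile("u123", {"name": "Ada Lovelace"})
key_present_after_update = cache_key("u123") in cache
second = get_profile("u123")                # miss again, returns fresh data
```

The coupling mentioned above is visible in `update_profile`: the write path has to know which cache key belongs to the data it changed.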

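Write-through cache: on every write, update the database and the cache as part of the same operation, so reads after a successful write see the new value. More writes go to the cache (even for data that may never be read again), and because the two updates are not atomic, a failure between them can still leave the cache out of step. In exchange, you close most of the stale-read window that TTL expiry leaves open.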
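A write-through path might look roughly like the following Python sketch. The `WriteThroughStore` class is hypothetical, dictionaries stand in for the database and the cache, and error handling is omitted: a real system has to decide what happens if the second update fails after the first has succeeded.

```python
from typing import Any, Dict, Optional


class WriteThroughStore:
    """Writes go to the database and the cache together (illustrative only)."""

    def __init__(self) -> None:
        self.database: Dict[str, Any] = {}   # stand-in for the real database
        self.cache: Dict[str, Any] = {}      # stand-in for Redis
        self.database_reads = 0

    def write(self, key: str, value: Any) -> None:
        self.database[key] = value           # write to the source of truth first
        self.cache[key] = value              # then refresh the cached copy

    def read(self, key: str) -> Optional[Any]:
        if key in self.cache:
            return self.cache[key]           # served from the cache
        self.database_reads += 1             # only reached for keys never written here
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value
        return value


store = WriteThroughStore()
store.write("user:profile:u123", {"name": "Ada"})
store.write("user:profile:u123", {"name": "Ada Lovelace"})
latest = store.read("user:profile:u123")
```

The read after the second write returns the new value straight from the cache, and `store.database_reads` stays at zero.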

Most production systems use a combination: TTL expiry where brief staleness is acceptable, event-based invalidation for data where accuracy matters, and write-through for the hottest, most-read data.

Real-World Example

A social platform caches user profiles in Redis. Here's how it works end to end:

A request arrives for user u123's profile
The app checks Redis: GET user:profile:u123
Cache hit: Redis returns the profile JSON, typically within about a millisecond on a local network. The database is untouched.
Cache miss: Redis returns nil. The app queries the database, gets the profile, writes it to Redis with a 5-minute TTL, and serves it.

When the user updates their profile:

The app writes the new profile to the database
The app deletes user:profile:u123 from Redis (event-based invalidation)
The next request for this profile misses the cache, fetches fresh data, and repopulates it
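The read and update flows above can be sketched end to end in Python. To keep the example runnable without a Redis server, it uses `FakeRedis`, a hypothetical in-memory stand-in whose `get`, `set(..., ex=...)`, and `delete` methods are modelled on the style of the redis-py client. It records the TTL but does not enforce it. `ProfileService` and the dictionary used as a database are also assumptions made for illustration.

```python
import json
from typing import Dict, Optional

PROFILE_TTL_SECONDS = 300   # the 5-minute TTL from the example


class FakeRedis:
    """In-memory stand-in for a Redis client; records TTLs but never expires keys."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}
        self.ttls: Dict[str, Optional[int]] = {}

    def get(self, name: str) -> Optional[str]:
        return self.data.get(name)           # None plays the role of nil

    def set(self, name: str, value: str, ex: Optional[int] = None) -> None:
        self.data[name] = value
        self.ttls[name] = ex

    def delete(self, name: str) -> int:
        self.ttls.pop(name, None)
        return 1 if self.data.pop(name, None) is not None else 0


class ProfileService:
    def __init__(self, cache: FakeRedis, database: Dict[str, dict]) -> None:
        self.cache = cache
        self.database = database             # dict standing in for the real database
        self.db_queries = 0

    def get_profile(self, user_id: str) -> Optional[dict]:
        key = f"user:profile:{user_id}"
        cached = self.cache.get(key)
        if cached is not None:               # cache hit: database untouched
            return json.loads(cached)
        self.db_queries += 1                 # cache miss: query the database
        profile = self.database.get(user_id)
        if profile is not None:
            self.cache.set(key, json.dumps(profile), ex=PROFILE_TTL_SECONDS)
        return profile

    def update_profile(self, user_id: str, profile: dict) -> None:
        self.database[user_id] = profile                  # write to the database
        self.cache.delete(f"user:profile:{user_id}")      # event-based invalidation


service = ProfileService(FakeRedis(), {"u123": {"name": "Ada"}})
service.get_profile("u123")                  # miss: one database query
service.get_profile("u123")                  # hit: no database query
queries_before_update = service.db_queries
service.update_profile("u123", {"name": "Ada Lovelace"})
fresh = service.get_profile("u123")          # miss again, repopulates the cache
```

Two reads cost one database query, the update removes the cached entry, and the following read fetches the new profile and caches it again with the same TTL.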

This pattern — cache on read, invalidate on write, TTL as a safety net — handles the vast majority of production caching requirements.

Key Takeaways

Caching stores expensive results so you don't recompute them on every request; a cache hit is typically much faster than going to the source
A cache miss falls through to the source, fetches fresh data, and populates the cache for future requests
TTL controls staleness — set it based on how often data changes and how much delay your users can tolerate
LRU eviction is the right default for most workloads — evict what hasn't been used recently
Cache database query results, computation results, session data, and third-party API responses; don't cache frequently changing data, sensitive data, or data that must always be current
Redis is the default caching tool — use Memcached only when you need its specific performance characteristics at extreme scale
Cache invalidation is hard — use TTL for acceptable staleness, event-based invalidation when accuracy matters, and write-through for the hottest data

What's Next

Caching reduces the load on your servers and database — but as traffic grows, a single server handling all requests becomes the bottleneck. In the next lesson we'll look at load balancers — how they distribute traffic across multiple servers, and how they're the first line of defence against traffic spikes.