Caching Strategies

Medium · 25 min read

Why Caching Matters

The Problem: Database queries and API calls are slow and expensive. If 1,000 users request the same product page, fetching it from the database 1,000 times is wasteful.

The Solution: Store frequently accessed data in fast, in-memory storage (cache). Serve subsequent requests from the cache instead of the slow data source.

Real Impact: Facebook uses Memcached to handle billions of requests. A cache hit takes ~1ms vs ~50ms for a database query -- a 50x improvement.

Real-World Analogy

Think of caching like a chef's prep station:

  • Database = The walk-in refrigerator in the back (lots of storage, slow to access)
  • Cache = The prep counter right in front of the chef (limited space, instant access)
  • Cache hit = Ingredient is already on the prep counter -- grab it immediately
  • Cache miss = Need to walk to the fridge, get it, bring it to the counter
  • Eviction = Counter is full -- remove something to make space for new ingredients

Cache Hierarchy

The Cache Hierarchy (Fastest to Slowest)
  • L1 CPU cache: ~0.5 ns | 64 KB
  • L2 CPU cache: ~7 ns | 256 KB
  • L3 CPU cache: ~30 ns | 8-32 MB
  • Application cache (Redis / Memcached): ~0.1 ms | GBs of RAM
  • CDN cache (edge servers): ~10-50 ms | TBs distributed globally
  • Database (disk storage): ~1-100 ms | TBs-PBs on disk

Caching Strategies

There are several patterns for how your application interacts with the cache. The right choice depends on your read/write ratio and consistency requirements.

Cache-Aside (Lazy Loading)

Application checks cache first. On miss, fetches from DB and stores in cache. Most common pattern. Simple to implement. Cache only contains requested data.

Write-Through

Every write goes to both cache and DB synchronously. Cache is always consistent with DB. Higher write latency but no stale data. Good for read-heavy workloads.

Write-Back (Write-Behind)

Writes go to cache only, then asynchronously flushed to DB. Very fast writes but risk of data loss if cache crashes before flush. Good for write-heavy workloads.
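
The write-back flow can be sketched in plain Python (a minimal, single-process sketch; the `persist` callable stands in for whatever bulk database write you would actually use):

```python
import threading

class WriteBackCache:
    """Minimal write-back sketch: writes land in memory and are
    flushed to the database later, in batches."""

    def __init__(self):
        self.cache = {}      # key -> value (latest write wins)
        self.dirty = set()   # keys written but not yet persisted
        self.lock = threading.Lock()

    def write(self, key, value):
        # Fast path: memory only, no database round-trip
        with self.lock:
            self.cache[key] = value
            self.dirty.add(key)

    def flush(self, persist):
        """Persist all dirty keys in one batch via persist(items).
        If the process dies before flush(), these writes are lost --
        the core trade-off of write-back."""
        with self.lock:
            items = {k: self.cache[k] for k in self.dirty}
            self.dirty.clear()
        persist(items)  # e.g. one bulk UPDATE instead of N single-row writes
        return len(items)
```

Note that repeated writes to the same key before a flush collapse into one database write, which is where the 100x+ reduction in write pressure comes from.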

Write-Around

Writes go directly to DB, bypassing cache. Cache only populated on reads (cache-aside for reads). Prevents cache pollution from data that is written but rarely read.

Cache-Aside Pattern Flow
  1. Application checks the cache (Redis, ~0.1 ms).
  2. On a miss, the application queries the database (~10 ms).
  3. The result is stored in the cache for next time.

The next request for the same data returns a cache hit at step 1 (fast!).

Cache Eviction Policies

Caches have limited memory. When the cache is full and a new item needs to be stored, an eviction policy decides which existing item to remove.

Policy   Strategy                                             Best For                         Complexity
LRU      Least Recently Used -- evict oldest accessed item    General purpose, most common     O(1)
LFU      Least Frequently Used -- evict least accessed item   Workloads with stable hot set    O(log n)
FIFO     First In, First Out -- evict oldest item             Simple, time-based data          O(1)
TTL      Time To Live -- expire after fixed duration          Session data, tokens             O(1)
Random   Remove a random item                                 Surprisingly effective, simple   O(1)
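
LRU, the most common policy, can be sketched in a few lines with Python's `collections.OrderedDict`: move a key to the end on every access, and pop from the front when over capacity.

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache: O(1) get/put using an ordered dict."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None  # cache miss
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")         # "a" is now most recently used
cache.put("c", 3)      # over capacity: evicts "b", the LRU key
print(cache.get("b"))  # None (evicted)
print(cache.get("a"))  # 1 (survived because it was accessed)
```

Production caches like Redis use approximated LRU (sampling a few keys) to avoid the bookkeeping overhead, but the eviction logic is the same idea.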

Distributed Caching with Redis

Redis is the most popular distributed caching solution. It is an in-memory data store that supports strings, hashes, lists, sets, and sorted sets with built-in TTL support.

redis_caching.py
import redis
import json
import time

# Connect to Redis
cache = redis.Redis(host="localhost", port=6379, db=0)

def get_user_profile(user_id):
    """Cache-aside pattern for user profiles."""
    cache_key = f"user:{user_id}"

    # Step 1: Check cache
    cached = cache.get(cache_key)
    if cached:
        print("Cache HIT")
        return json.loads(cached)

    # Step 2: Cache miss - fetch from database
    # (db stands in for your database client; not defined in this snippet)
    print("Cache MISS - querying database")
    user = db.query("SELECT * FROM users WHERE id = %s", user_id)

    # Step 3: Store in cache with 5 minute TTL
    cache.setex(
        cache_key,
        300,  # TTL in seconds (5 minutes)
        json.dumps(user)
    )

    return user

def update_user_profile(user_id, new_data):
    """Write-through: update DB and cache together."""
    # Update database first
    db.execute(
        "UPDATE users SET name=%s, email=%s WHERE id=%s",
        (new_data["name"], new_data["email"], user_id)
    )

    # Invalidate cache (or update it)
    cache_key = f"user:{user_id}"
    cache.delete(cache_key)  # Invalidation approach
    # OR: cache.setex(cache_key, 300, json.dumps(new_data))

def get_leaderboard(limit=10):
    """Use Redis sorted set for a real-time leaderboard."""
    # Seed demo scores (in a real app, call zadd whenever a score changes)
    cache.zadd("leaderboard", {"alice": 1500, "bob": 2100, "carol": 1800})

    # Get top N players (descending)
    top_players = cache.zrevrange("leaderboard", 0, limit - 1, withscores=True)
    return [(name.decode(), score) for name, score in top_players]

# Measure cache performance
start = time.time()
profile = get_user_profile(42)  # Miss: ~10ms
print(f"First call: {(time.time()-start)*1000:.1f}ms")

start = time.time()
profile = get_user_profile(42)  # Hit: ~0.1ms
print(f"Second call: {(time.time()-start)*1000:.1f}ms")
Output
First call: 12.3ms   (cache MISS - fetched from database)
Second call: 0.2ms   (cache HIT - served from Redis)
Speedup: 61.5x faster!

Common Pitfall: Cache Stampede

Problem: A popular cache key expires. Hundreds of requests simultaneously hit the database to regenerate it.

Solutions:

  • Locking: Only one request fetches from DB; others wait for cache to be populated
  • Early refresh: Refresh the cache before it expires (e.g., at 80% of TTL)
  • Jittered TTL: Add random variance to TTL so keys do not expire simultaneously

Key Takeaway: Cache stampede is the most dangerous caching failure mode. Protect popular keys with locking (only one request regenerates), jittered TTLs (stagger expirations), or early refresh (regenerate before expiry).
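
The jittered-TTL defense is one line of code. A minimal sketch (the `jitter_fraction` default of ±10% is an assumption; tune it to your workload):

```python
import random

def jittered_ttl(base_seconds, jitter_fraction=0.10):
    """Return base_seconds with up to +/- jitter_fraction random variance,
    so keys written at the same time do not all expire at the same time."""
    jitter = base_seconds * jitter_fraction
    return int(base_seconds + random.uniform(-jitter, jitter))

# Instead of cache.setex(key, 300, value), stagger expirations:
# cache.setex(key, jittered_ttl(300), value)  # expires somewhere in 270-330 s
```

With a fixed 300-second TTL, keys cached together during a traffic spike all expire together 5 minutes later, recreating the stampede; the jitter spreads that expiry over a 60-second window.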

Practice Problems

Easy Choose the Caching Strategy

For each scenario, which caching strategy is best and why?

  1. A product catalog page (read 99%, write 1%)
  2. A logging service writing millions of events/second
  3. A user session store

Consider the read/write ratio and consistency needs. Cache-aside works for reads, write-back for writes, write-through for consistency.

# 1. Product catalog: Cache-Aside
#    - Read-heavy (99% reads)
#    - Stale data for a few minutes is OK
#    - Set TTL to 5-10 minutes
#    - Products rarely change

# 2. Logging service: Write-Back
#    - Extremely write-heavy
#    - Buffer writes in cache, flush to DB in batches
#    - Acceptable to lose a few seconds of logs
#    - Reduces DB write pressure by 100x+

# 3. Session store: Write-Through with TTL
#    - Every session update must be persisted
#    - Reads need to be fast (every request)
#    - TTL = session expiry time (e.g., 30 min)
#    - Redis is perfect for this use case

Medium Cache Hit Ratio Optimization

Your cache has a 60% hit ratio. Each cache hit costs 1ms and each miss costs 50ms (cache check + DB query). With 10,000 requests/second:

  1. What is the average response time?
  2. If you increase the hit ratio to 90%, what is the new average?
  3. What strategies would you use to improve from 60% to 90%?

Average = (hit_ratio * hit_time) + (miss_ratio * miss_time). To improve hit ratio: increase cache size, better TTL, pre-warm cache, analyze access patterns.

# 1. Average with 60% hit ratio:
avg_60 = (0.60 * 1) + (0.40 * 50)
print(f"60% hit: {avg_60}ms avg")  # 20.6ms

# 2. Average with 90% hit ratio:
avg_90 = (0.90 * 1) + (0.10 * 50)
print(f"90% hit: {avg_90}ms avg")  # 5.9ms

# That's a 3.5x improvement!

# 3. Strategies to improve hit ratio:
# a) Increase cache size (more items fit)
# b) Tune TTL (too short = unnecessary misses)
# c) Pre-warm cache on startup with popular items
# d) Use cache-aside + write-through together
# e) Analyze access patterns, cache hot data only
# f) Use consistent hashing for distributed cache

Medium Design a Caching Layer

Design a multi-level caching layer for a news feed that serves 1M active users:

  1. What would you cache at each level?
  2. What TTL for each cache level?
  3. How do you handle cache invalidation when a new post is created?

Use CDN for static assets, Redis for feed data, local memory for hot users. Invalidation can use pub/sub to notify cache nodes.

# Multi-level caching for news feed

# Level 1: CDN (static assets)
# - Profile images, CSS, JS, video thumbnails
# - TTL: 24 hours (versioned URLs)
# - Invalidate via CDN purge API

# Level 2: Redis Cluster (feed data)
# - Pre-computed feed: user_id -> [post_ids]
# - Post content: post_id -> {title, body, author}
# - TTL: 5 minutes for feeds, 1 hour for posts

# Level 3: App Server Memory (hot data)
# - Top 100 trending posts (shared across requests)
# - TTL: 30 seconds

# Cache invalidation on new post:
# 1. Write post to database
# 2. Publish event to Redis Pub/Sub channel
# 3. Feed workers subscribe to channel:
#    - Update cached feeds for post author's followers
#    - Use fan-out-on-write for users with <10K followers
#    - Use fan-out-on-read for celebrities (>10K followers)

Deep Dive: Cache-Aside vs Write-Through Consistency

Cache-aside (lazy loading) only populates the cache on read misses. This means after a write, the cache contains stale data until the TTL expires or the key is explicitly invalidated. Write-through updates the cache synchronously with every write, guaranteeing consistency but adding latency to writes. In practice, most teams use cache-aside with explicit invalidation on writes: update the database, then delete (not update) the cache key. Deleting is safer than updating because it avoids race conditions where a concurrent read could overwrite the delete with stale data.

Common Mistake

Wrong: Caching database query results without a TTL

Why it fails: Without a TTL, cached data never expires. If the underlying data changes, users see stale results forever. Memory usage also grows unbounded until the cache runs out of memory and starts evicting unpredictably.

Instead: Always set a TTL appropriate to your data freshness needs. Product catalog: 5-10 minutes. User sessions: 30 minutes. Static content: 24 hours.

Quick Reference

Caching Strategy Selection Guide

Strategy        Read Perf              Write Perf           Consistency   Best For
Cache-Aside     High (after warm-up)   N/A (reads only)     Eventual      Read-heavy, general purpose
Write-Through   High                   Lower (dual write)   Strong        Read-heavy with consistency
Write-Back      High                   Very High            Eventual      Write-heavy workloads
Write-Around    High (for hot data)    High                 Eventual      Write-once, read-later data

Key Takeaways

  • Cache-aside is the most common and safest starting point
  • Always set a TTL -- unbounded caches lead to stale data and memory issues
  • LRU is the default eviction policy for most use cases
  • Cache hit ratio above 90% is the target; below 80% investigate your access patterns
  • Guard against cache stampede, especially for popular keys