5 Caching
5.1 What is Caching?
Caching is the process of storing frequently accessed data in fast memory (cache) for quick retrieval. Instead of fetching data from slower storage (disk, database) every time, the system first checks if it exists in the cache.
5.2 The 80/20 Rule
A famous principle in caching:
20% of data is accessed 80% of the time
This means:
- Small subset of data accounts for most requests
- Cache this “hot” data for maximum benefit
- Don’t need to cache everything
5.3 Why Cache?
Benefits:
- ✅ Reduced latency (faster response times)
- ✅ Reduced database load
- ✅ Lower bandwidth usage
- ✅ Cost savings (fewer database reads)
- ✅ Better scalability
- ✅ Improved user experience
Trade-offs:
- ❌ Additional complexity
- ❌ Potential data staleness
- ❌ Memory costs
- ❌ Cache invalidation challenges
5.4 Cache Hierarchy
Speed vs Size trade-off:
CPU Cache (L1/L2/L3) → Fastest, Smallest
RAM → Fast, Medium
SSD → Medium, Large
HDD → Slow, Largest
Network/Database → Slowest, Distributed
5.5 Cache Write Strategies
“There are only two hard things in Computer Science: cache invalidation and naming things.” — Phil Karlton
5.5.1 1. Write-Through Cache
How it works: Data is written to both cache and database simultaneously.
Write Request → [Cache + Database] → Response
Read Request → Cache (if hit) or Database (if miss)
Characteristics:
- ✅ Cache always consistent with database
- ✅ Data not lost on cache failure
- ✅ Simple to implement
- ❌ Higher write latency (write to both)
- ❌ Unnecessary cache writes for rarely-read data
When to use:
- Applications requiring strong consistency
- Read-heavy workloads
- When write latency is acceptable
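To make the flow concrete, here is a minimal write-through sketch in Python. The `cache` and `database` dictionaries are in-memory stand-ins for a real cache (e.g. Redis) and backing store, not any particular library.

```python
cache = {}      # stand-in for the cache
database = {}   # stand-in for the backing store (source of truth)

def write_through(key, value):
    # Write to the database first, then the cache, so a failed database
    # write never leaves the cache ahead of the source of truth.
    database[key] = value
    cache[key] = value

def read(key):
    if key in cache:               # cache hit
        return cache[key]
    value = database.get(key)      # cache miss: fall back to the database
    if value is not None:
        cache[key] = value         # populate the cache for the next read
    return value
```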

5.5.2 2. Write-Around Cache
How it works: Data is written only to database, bypassing cache. Cache is invalidated/marked stale.
Write Request → Database → Response
(Cache entry marked invalid)
Read Request → Database (cache miss) → Update Cache
Characteristics:
- ✅ Reduces cache churn for infrequently-read data
- ✅ No unnecessary cache writes
- ❌ Read penalty: the first read after a write is a cache miss and must go to the database
When to use:
- Write-heavy workloads
- Data written but not immediately read
- When cache space is limited
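A minimal write-around sketch, again with dictionary stand-ins; the key point is that the write path invalidates the cached copy rather than updating it, so the first read afterwards pays the miss.

```python
cache = {}      # stand-in for the cache
database = {}   # stand-in for the backing store

def write_around(key, value):
    database[key] = value     # write goes only to the database
    cache.pop(key, None)      # drop any cached copy instead of refreshing it

def read(key):
    if key in cache:
        return cache[key]              # hit
    value = database.get(key)          # first read after a write lands here (miss)
    if value is not None:
        cache[key] = value             # lazily repopulate the cache
    return value
```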
5.5.3 3. Write-Back Cache (Write-Behind)
How it works: Data written to cache first, asynchronously written to database later.
Write Request → Cache → Response (immediate)
Background Process → Database (asynchronous)
Characteristics:
- ✅ Very fast writes
- ✅ Reduced database load
- ✅ Better write performance
- ❌ Risk of data loss (if cache fails before DB write)
- ❌ More complex to implement
- ❌ Eventual consistency
When to use:
- High write throughput required
- Write latency critical
- Acceptable to lose recent writes in failure
- Social media updates, view counts
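A minimal write-back sketch that uses an in-process queue and a background thread as the asynchronous flusher; a production system would typically rely on a durable queue or the cache's own write-behind support so queued writes survive a crash.

```python
import queue
import threading

cache = {}                   # stand-in for the cache
database = {}                # stand-in for the backing store
write_queue = queue.Queue()  # writes waiting to be flushed to the database

def write_back(key, value):
    cache[key] = value               # acknowledge as soon as the cache is updated
    write_queue.put((key, value))    # database write is deferred

def flush_worker():
    # Background thread draining queued writes to the database.
    while True:
        key, value = write_queue.get()
        database[key] = value
        write_queue.task_done()

threading.Thread(target=flush_worker, daemon=True).start()
```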
5.6 Cache Eviction Policies
When cache is full, which item should be removed?
5.6.1 1. Least Recently Used (LRU)
Strategy: Remove the item that hasn’t been accessed for the longest time.
Implementation: Doubly-linked list + hash map
Pros:
- Effective for most use cases
- Balances recency and frequency
- Industry standard
Cons:
- Moderate implementation complexity
- Handles periodic or scanning access patterns poorly (items can be evicted just before they are needed again)
Example:
Cache: [A, B, C, D] (D is most recent)
Access: E
Result: [B, C, D, E] (A evicted)
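A minimal LRU implementation in Python: `collections.OrderedDict` plays the role of the doubly-linked list + hash map, and the usage at the bottom reproduces the A–E example above.

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()   # keys kept in access order

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)   # evict the least recently used item

lru = LRUCache(4)
for k in "ABCD":
    lru.put(k, k)
lru.put("E", "E")
print(list(lru.items))   # ['B', 'C', 'D', 'E'] -- A was evicted
```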
5.6.2 2. Least Frequently Used (LFU)
Strategy: Remove the item accessed least frequently.
Pros:
- Good for items with varying access patterns
- Retains popular items
Cons:
- New items may be evicted quickly
- Stale popular items stay too long
- More complex to implement
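A minimal LFU sketch that keeps a per-key access count and evicts the key with the lowest count. The linear scan on eviction is for brevity; real implementations use frequency buckets to evict in O(1).

```python
class LFUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.values = {}
        self.counts = {}   # access count per key

    def get(self, key):
        if key not in self.values:
            return None
        self.counts[key] += 1
        return self.values[key]

    def put(self, key, value):
        if key not in self.values and len(self.values) >= self.capacity:
            victim = min(self.counts, key=self.counts.get)  # least frequently used
            del self.values[victim]
            del self.counts[victim]
        self.values[key] = value
        self.counts[key] = self.counts.get(key, 0) + 1
```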
5.6.3 3. First In First Out (FIFO)
Strategy: Remove the oldest item regardless of access.
Pros:
- Simple to implement
- Low overhead
Cons:
- Ignores access patterns
- May evict frequently-used items
When to use:
- Simple caching needs
- Items have similar access patterns
5.6.4 4. Most Recently Used (MRU)
Strategy: Remove the most recently accessed item.
When to use:
- Specific access patterns (sequential scans)
- Opposite of LRU behavior needed
5.6.5 5. Random Replacement (RR)
Strategy: Randomly select item to evict.
Pros:
- Simplest to implement
- No overhead
- Surprisingly effective at scale
Cons:
- Unpredictable
- May evict hot data
5.6.6 6. Last In First Out (LIFO)
Strategy: Remove the most recently added item.
Rarely used in practice.

5.7 Caching Layers
5.7.1 Client-Side Caching
Browser cache:
- HTTP cache headers (Cache-Control, ETag)
- LocalStorage / SessionStorage
- Service Workers
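As a concrete example, a handler can set these headers on its responses. The sketch below assumes Flask purely for illustration; the max-age and ETag values are arbitrary.

```python
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/profile/<int:user_id>")
def profile(user_id):
    response = jsonify({"id": user_id, "name": "John Doe"})
    # Let the browser (and shared caches) reuse the response for 10 minutes.
    response.headers["Cache-Control"] = "public, max-age=600"
    # ETag lets the browser revalidate cheaply with If-None-Match.
    response.headers["ETag"] = f'"profile-{user_id}-v1"'
    return response
```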
5.7.2 CDN Caching
- Caches static assets geographically
- Reduces origin server load
- Improves global performance
5.7.3 Application-Level Caching
In-memory cache within application:
- Process-level cache (in-process data structures such as maps)
- Shared cache (Redis, Memcached)
5.7.4 Database Caching
- Query result cache
- Object cache
- Prepared statement cache
5.8 Distributed Caching
For large-scale systems, cache must be distributed across multiple servers.

Popular distributed caches:
- Redis: In-memory, supports data structures, persistence
- Memcached: Simple, fast, key-value store
- Amazon ElastiCache: Managed Redis/Memcached
Challenges:
- Cache coherence across nodes
- Consistent hashing for distribution
- Handling node failures
- Network latency between cache nodes
5.8.1 Consistent Hashing
Distributes cache keys across nodes such that:
- Adding/removing nodes minimizes key redistribution
- Limits the burst of cache misses (and resulting database load) when the cluster topology changes
- Ensures even distribution
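A minimal consistent-hash ring with virtual nodes is sketched below; the node names, virtual-node count, and MD5 hash are illustrative choices, not requirements.

```python
import bisect
import hashlib

class ConsistentHashRing:
    def __init__(self, nodes, vnodes=100):
        self.ring = {}          # hash position on the ring -> node name
        self.sorted_keys = []   # sorted hash positions for binary search
        for node in nodes:
            self.add_node(node, vnodes)

    def _hash(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node, vnodes=100):
        # Each physical node owns many points on the ring, which keeps the
        # key distribution even and limits remapping when nodes change.
        for i in range(vnodes):
            position = self._hash(f"{node}#{i}")
            self.ring[position] = node
            bisect.insort(self.sorted_keys, position)

    def get_node(self, key):
        position = self._hash(key)
        idx = bisect.bisect(self.sorted_keys, position) % len(self.sorted_keys)
        return self.ring[self.sorted_keys[idx]]

ring = ConsistentHashRing(["cache-1", "cache-2", "cache-3"])
print(ring.get_node("user:123"))   # the node responsible for this key
```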
5.9 Cache Patterns
5.9.1 Cache-Aside (Lazy Loading)
Flow:
- Application checks cache
- If miss, query database
- Store result in cache
- Return data
Pros:
- Only requested data cached
- Cache failures don’t bring system down
Cons:
- Initial request slow (cache miss)
- Stale data possible
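A cache-aside sketch using the redis-py client; the connection details, key format, TTL, and the `query_user_from_db` helper are illustrative assumptions.

```python
import json
import redis   # assumes the redis-py client

r = redis.Redis(host="localhost", port=6379)

def query_user_from_db(user_id):
    # Hypothetical database lookup, stubbed for illustration.
    return {"id": user_id, "name": "John Doe"}

def get_user(user_id):
    key = f"user:{user_id}"
    cached = r.get(key)                        # 1. check the cache
    if cached is not None:
        return json.loads(cached)              #    hit: return the cached copy
    user = query_user_from_db(user_id)         # 2. miss: query the database
    if user is not None:
        r.set(key, json.dumps(user), ex=3600)  # 3. store the result with a TTL
    return user                                # 4. return data
```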
5.9.2 Read-Through
Flow:
- Application queries cache
- Cache is responsible for loading from database on miss
- Return data
Difference from cache-aside: Cache manages database interaction.
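A minimal read-through sketch: the cache object owns the loader, so callers never touch the database directly. The `load_user_from_db` stub is hypothetical.

```python
def load_user_from_db(user_id):
    # Hypothetical database lookup, stubbed for illustration.
    return {"id": user_id, "name": "John Doe"}

class ReadThroughCache:
    def __init__(self, loader):
        self.loader = loader   # the cache, not the caller, knows how to load data
        self.store = {}

    def get(self, key):
        if key not in self.store:
            self.store[key] = self.loader(key)   # cache loads from the database on a miss
        return self.store[key]

users = ReadThroughCache(loader=load_user_from_db)
user = users.get(123)   # caller only ever talks to the cache
```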
5.9.3 Refresh-Ahead
Flow:
- Cache automatically refreshes data before expiration
- Based on access patterns
Pros:
- Reduced latency
- No cache miss delays
Cons:
- May refresh unused data
- More complex
5.10 Cache Invalidation Strategies
5.10.1 Time-To-Live (TTL)
Strategy: Data expires after fixed time period.
SET user:123 "John Doe" EX 3600 # Expires in 1 hour
When to use:
- Data changes predictably
- Stale data acceptable
- Simplicity preferred
5.10.2 Event-Based Invalidation
Strategy: Invalidate cache when data changes.
# On user update: write to the database, then evict the cached entry
UPDATE users SET name = 'Jane' WHERE id = 123;
DEL cache:user:123
When to use:
- Strong consistency required
- Update frequency moderate
5.11 Cache Monitoring
Key metrics:
- Hit Rate: Cache hits / Total requests (aim for >80%)
- Miss Rate: Cache misses / Total requests
- Eviction Rate: Items evicted / Time period
- Memory Usage: Current cache size / Max size
- Latency: Cache response time
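A minimal sketch of tracking hit rate inside application code; real deployments would export these counters to a metrics system such as Prometheus rather than keep them in memory.

```python
class CacheMetrics:
    def __init__(self):
        self.hits = 0
        self.misses = 0

    def record(self, hit):
        if hit:
            self.hits += 1
        else:
            self.misses += 1

    @property
    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

metrics = CacheMetrics()
metrics.record(hit=True)
metrics.record(hit=False)
print(metrics.hit_rate)   # 0.5 -- alert if this drops well below the ~0.8 target
```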
5.12 Best Practices
5.12.1 1. Choose Appropriate TTL
- Short TTL: Frequently changing data
- Long TTL: Rarely changing data
- No expiry: Static/immutable data
5.12.2 2. Cache at Multiple Levels
- Browser cache
- CDN
- Application cache
- Database query cache
5.12.3 3. Monitor Cache Performance
- Track hit/miss ratio
- Alert on sudden changes
- Analyze access patterns
5.12.4 4. Handle Cache Failures Gracefully
- Fallback to database
- Circuit breaker pattern
- Don’t let cache failure bring down system
5.12.5 5. Avoid Cache Stampede
Problem: Many requests for same expired key hit database simultaneously.
Solutions:
- Probabilistic early expiration
- Lock-based loading
- Refresh-ahead caching
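A lock-based loading sketch for a single process, with dictionaries standing in for the cache and database; in a distributed setup the per-key lock would be a distributed lock (for example a Redis lock) rather than `threading.Lock`.

```python
import threading

cache = {}                           # stand-in for the cache
database = {"user:123": "John Doe"}  # stand-in for the backing store
load_locks = {}                      # one lock per key
lock_guard = threading.Lock()        # protects the load_locks dict itself

def get_with_stampede_protection(key):
    value = cache.get(key)
    if value is not None:
        return value
    with lock_guard:
        key_lock = load_locks.setdefault(key, threading.Lock())
    # Only one thread per key rebuilds the entry; the rest wait here and
    # then read the freshly cached value instead of hitting the database.
    with key_lock:
        value = cache.get(key)          # re-check after acquiring the lock
        if value is None:
            value = database.get(key)   # a single database read per expiry
            cache[key] = value
        return value
```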
5.12.6 6. Serialize Appropriately
- Use efficient formats (Protocol Buffers, MessagePack)
- Consider compression for large values
- Balance size vs. serialization cost
5.13 Common Pitfalls
5.13.1 1. Caching Too Much
- Wastes memory
- Reduces hit rate
- Increases evictions
5.13.2 2. Caching Too Little
- Misses optimization opportunities
- Higher database load
5.13.3 3. Inconsistent Cache Keys
- Use standardized key naming
- Include version in key if needed
- Document key structure
5.13.4 4. Not Handling Cache Misses
- Always have fallback logic
- Implement retry mechanisms
- Log and monitor misses
5.14 Cache Size Calculation
Example calculation:
Average object size: 1 KB
Request rate: 1000 req/sec
Hit rate goal: 80%
Working set: 20% of total data
Required cache size ≈ (1000 req/sec × 3600 sec × 1 KB) × 0.2
= 720 MB of hot data per hour of unique requests
The 0.2 factor is the working-set fraction from the 80/20 rule; keeping this hot subset in the cache is what supports the 80% hit-rate goal.
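The same estimate as a quick calculation:

```python
request_rate = 1000         # requests per second
window_seconds = 3600       # one hour of traffic
avg_object_size_kb = 1      # average cached object size
working_set_fraction = 0.2  # share of data that is "hot" (80/20 rule)

cache_size_kb = request_rate * window_seconds * avg_object_size_kb * working_set_fraction
print(f"{cache_size_kb / 1000:.0f} MB")   # 720 MB, matching the estimate above
```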
5.15 Summary
Caching is one of the most effective techniques for improving system performance and scalability. The key is to:
- Cache the right data (80/20 rule)
- Choose appropriate invalidation strategy
- Monitor and tune cache performance
- Handle failures gracefully
- Use distributed caching for scale
Properly implemented caching can reduce latency by 10-100x and dramatically reduce database load.