5 Caching
5.1 What is Caching?
Caching is the process of storing frequently accessed data in fast memory (cache) for quick retrieval. Instead of fetching data from slower storage (disk, database) every time, the system first checks if it exists in the cache.
5.2 The 80/20 Rule
A famous principle in caching:
20% of data is accessed 80% of the time
This means:
- Small subset of data accounts for most requests
- Cache this “hot” data for maximum benefit
- Don’t need to cache everything
5.3 Why Cache?
Benefits:
- ✅ Reduced latency (faster response times)
- ✅ Reduced database load
- ✅ Lower bandwidth usage
- ✅ Cost savings (fewer database reads)
- ✅ Better scalability
- ✅ Improved user experience
Trade-offs:
- ❌ Additional complexity
- ❌ Potential data staleness
- ❌ Memory costs
- ❌ Cache invalidation challenges
5.4 Cache Hierarchy
Speed vs Size trade-off:
CPU Cache (L1/L2/L3) → Fastest, Smallest
RAM → Fast, Medium
SSD → Medium, Large
HDD → Slow, Largest
Network/Database → Slowest, Distributed
5.5 Cache Write Strategies
“There are only two hard things in Computer Science: cache invalidation and naming things.” — Phil Karlton
5.5.1 1. Write-Through Cache
How it works: Data is written to both cache and database simultaneously.
Write Request → [Cache + Database] → Response
Read Request → Cache (if hit) or Database (if miss)
Characteristics:
- ✅ Cache always consistent with database
- ✅ Data not lost on cache failure
- ✅ Simple to implement
- ❌ Higher write latency (write to both)
- ❌ Unnecessary cache writes for rarely-read data
When to use:
- Applications requiring strong consistency
- Read-heavy workloads
- When write latency is acceptable
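To make the flow concrete, here is a minimal write-through sketch in Python. The `cache` and `database` dictionaries are in-memory stand-ins for a real cache (e.g. Redis) and backing store, not any particular library.

```python
cache = {}      # stand-in for the cache
database = {}   # stand-in for the backing store (source of truth)

def write_through(key, value):
    # Write to the database first, then the cache, so a failed database
    # write never leaves the cache ahead of the source of truth.
    database[key] = value
    cache[key] = value

def read(key):
    if key in cache:               # cache hit
        return cache[key]
    value = database.get(key)      # cache miss: fall back to the database
    if value is not None:
        cache[key] = value         # populate the cache for the next read
    return value
```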

5.5.2 2. Write-Around Cache
How it works: Data is written only to database, bypassing cache. Cache is invalidated/marked stale.
Write Request → Database → Response
(Cache entry marked invalid)
Read Request → Database (cache miss) → Update Cache
Characteristics:
- ✅ Reduces cache churn for infrequently-read data
- ✅ No unnecessary cache writes
- ❌ Read penalty: the first read after a write is a cache miss and must go to the database
When to use:
- Write-heavy workloads
- Data written but not immediately read
- When cache space is limited
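A minimal write-around sketch, again with dictionary stand-ins; the key point is that the write path invalidates the cached copy rather than updating it, so the first read afterwards pays the miss.

```python
cache = {}      # stand-in for the cache
database = {}   # stand-in for the backing store

def write_around(key, value):
    database[key] = value     # write goes only to the database
    cache.pop(key, None)      # drop any cached copy instead of refreshing it

def read(key):
    if key in cache:
        return cache[key]              # hit
    value = database.get(key)          # first read after a write lands here (miss)
    if value is not None:
        cache[key] = value             # lazily repopulate the cache
    return value
```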
5.5.3 3. Write-Back Cache (Write-Behind)
How it works: Data written to cache first, asynchronously written to database later.
Write Request → Cache → Response (immediate)
Background Process → Database (asynchronous)
Characteristics:
- ✅ Very fast writes
- ✅ Reduced database load
- ✅ Better write performance
- ❌ Risk of data loss (if cache fails before DB write)
- ❌ More complex to implement
- ❌ Eventual consistency
When to use:
- High write throughput required
- Write latency critical
- Acceptable to lose recent writes in failure
- Social media updates, view counts
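A minimal write-back sketch that uses an in-process queue and a background thread as the asynchronous flusher; a production system would typically rely on a durable queue or the cache's own write-behind support so queued writes survive a crash.

```python
import queue
import threading

cache = {}                   # stand-in for the cache
database = {}                # stand-in for the backing store
write_queue = queue.Queue()  # writes waiting to be flushed to the database

def write_back(key, value):
    cache[key] = value               # acknowledge as soon as the cache is updated
    write_queue.put((key, value))    # database write is deferred

def flush_worker():
    # Background thread draining queued writes to the database.
    while True:
        key, value = write_queue.get()
        database[key] = value
        write_queue.task_done()

threading.Thread(target=flush_worker, daemon=True).start()
```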
5.6 Cache Eviction Policies
When cache is full, which item should be removed?
5.6.1 1. Least Recently Used (LRU)
Strategy: Remove the item that hasn’t been accessed for the longest time.
Implementation: Doubly-linked list + hash map
Pros:
- Effective for most use cases
- Balances recency and frequency
- Industry standard
Cons:
- Moderate implementation complexity
- Handles periodic or scanning access patterns poorly (items can be evicted just before they are needed again)
Example:
Cache: [A, B, C, D] (D is most recent)
Access: E
Result: [B, C, D, E] (A evicted)
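A minimal LRU implementation in Python: `collections.OrderedDict` plays the role of the doubly-linked list + hash map, and the usage at the bottom reproduces the A–E example above.

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()   # keys kept in access order

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)   # evict the least recently used item

lru = LRUCache(4)
for k in "ABCD":
    lru.put(k, k)
lru.put("E", "E")
print(list(lru.items))   # ['B', 'C', 'D', 'E'] -- A was evicted
```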
5.6.2 2. Least Frequently Used (LFU)
Strategy: Remove the item accessed least frequently.
Pros:
- Good for items with varying access patterns
- Retains popular items
Cons:
- New items may be evicted quickly
- Stale popular items stay too long
- More complex to implement
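A minimal LFU sketch that keeps a per-key access count and evicts the key with the lowest count. The linear scan on eviction is for brevity; real implementations use frequency buckets to evict in O(1).

```python
class LFUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.values = {}
        self.counts = {}   # access count per key

    def get(self, key):
        if key not in self.values:
            return None
        self.counts[key] += 1
        return self.values[key]

    def put(self, key, value):
        if key not in self.values and len(self.values) >= self.capacity:
            victim = min(self.counts, key=self.counts.get)  # least frequently used
            del self.values[victim]
            del self.counts[victim]
        self.values[key] = value
        self.counts[key] = self.counts.get(key, 0) + 1
```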
5.6.3 3. First In First Out (FIFO)
Strategy: Remove the oldest item regardless of access.
Pros:
- Simple to implement
- Low overhead
Cons:
- Ignores access patterns
- May evict frequently-used items
When to use:
- Simple caching needs
- Items have similar access patterns
5.6.4 4. Most Recently Used (MRU)
Strategy: Remove the most recently accessed item.
When to use:
- Specific access patterns (sequential scans)
- Opposite of LRU behavior needed
5.6.5 5. Random Replacement (RR)
Strategy: Randomly select item to evict.
Pros:
- Simplest to implement
- No overhead
- Surprisingly effective at scale
Cons:
- Unpredictable
- May evict hot data
5.6.6 6. Last In First Out (LIFO)
Strategy: Remove the most recently added item.
Rarely used in practice.

5.7 Caching Layers
5.7.1 Client-Side Caching
Browser cache:
- HTTP cache headers (Cache-Control, ETag)
- LocalStorage / SessionStorage
- Service Workers
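As a concrete example, a handler can set these headers on its responses. The sketch below assumes Flask purely for illustration; the max-age and ETag values are arbitrary.

```python
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/profile/<int:user_id>")
def profile(user_id):
    response = jsonify({"id": user_id, "name": "John Doe"})
    # Let the browser (and shared caches) reuse the response for 10 minutes.
    response.headers["Cache-Control"] = "public, max-age=600"
    # ETag lets the browser revalidate cheaply with If-None-Match.
    response.headers["ETag"] = f'"profile-{user_id}-v1"'
    return response
```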
5.7.2 CDN Caching
- Caches static assets geographically
- Reduces origin server load
- Improves global performance
5.7.3 Application-Level Caching
In-memory cache within application:
- Process-level cache (in-process data structures such as maps)
- Shared cache (Redis, Memcached)
5.7.4 Database Caching
- Query result cache
- Object cache
- Prepared statement cache
5.8 Distributed Caching
For large-scale systems, cache must be distributed across multiple servers.

Popular distributed caches:
- Redis: In-memory, supports data structures, persistence
- Memcached: Simple, fast, key-value store
- Amazon ElastiCache: Managed Redis/Memcached
Challenges:
- Cache coherence across nodes
- Consistent hashing for distribution
- Handling node failures
- Network latency between cache nodes
5.8.1 Consistent Hashing
Distributes cache keys across nodes such that:
- Adding/removing nodes minimizes key redistribution
- Limits the burst of cache misses (and resulting database load) when the cluster topology changes
- Ensures even distribution
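A minimal consistent-hash ring with virtual nodes is sketched below; the node names, virtual-node count, and MD5 hash are illustrative choices, not requirements.

```python
import bisect
import hashlib

class ConsistentHashRing:
    def __init__(self, nodes, vnodes=100):
        self.ring = {}          # hash position on the ring -> node name
        self.sorted_keys = []   # sorted hash positions for binary search
        for node in nodes:
            self.add_node(node, vnodes)

    def _hash(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node, vnodes=100):
        # Each physical node owns many points on the ring, which keeps the
        # key distribution even and limits remapping when nodes change.
        for i in range(vnodes):
            position = self._hash(f"{node}#{i}")
            self.ring[position] = node
            bisect.insort(self.sorted_keys, position)

    def get_node(self, key):
        position = self._hash(key)
        idx = bisect.bisect(self.sorted_keys, position) % len(self.sorted_keys)
        return self.ring[self.sorted_keys[idx]]

ring = ConsistentHashRing(["cache-1", "cache-2", "cache-3"])
print(ring.get_node("user:123"))   # the node responsible for this key
```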
5.9 Cache Patterns
5.9.1 Cache-Aside (Lazy Loading)
Flow:
- Application checks cache
- If miss, query database
- Store result in cache
- Return data
Pros:
- Only requested data cached
- Cache failures don’t bring system down
Cons:
- Initial request slow (cache miss)
- Stale data possible
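A cache-aside sketch using the redis-py client; the connection details, key format, TTL, and the `query_user_from_db` helper are illustrative assumptions.

```python
import json
import redis   # assumes the redis-py client

r = redis.Redis(host="localhost", port=6379)

def query_user_from_db(user_id):
    # Hypothetical database lookup, stubbed for illustration.
    return {"id": user_id, "name": "John Doe"}

def get_user(user_id):
    key = f"user:{user_id}"
    cached = r.get(key)                        # 1. check the cache
    if cached is not None:
        return json.loads(cached)              #    hit: return the cached copy
    user = query_user_from_db(user_id)         # 2. miss: query the database
    if user is not None:
        r.set(key, json.dumps(user), ex=3600)  # 3. store the result with a TTL
    return user                                # 4. return data
```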
5.9.2 Read-Through
Flow:
- Application queries cache
- Cache is responsible for loading from database on miss
- Return data
Difference from cache-aside: Cache manages database interaction.
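A minimal read-through sketch: the cache object owns the loader, so callers never touch the database directly. The `load_user_from_db` stub is hypothetical.

```python
def load_user_from_db(user_id):
    # Hypothetical database lookup, stubbed for illustration.
    return {"id": user_id, "name": "John Doe"}

class ReadThroughCache:
    def __init__(self, loader):
        self.loader = loader   # the cache, not the caller, knows how to load data
        self.store = {}

    def get(self, key):
        if key not in self.store:
            self.store[key] = self.loader(key)   # cache loads from the database on a miss
        return self.store[key]

users = ReadThroughCache(loader=load_user_from_db)
user = users.get(123)   # caller only ever talks to the cache
```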
5.9.3 Refresh-Ahead
Flow:
- Cache automatically refreshes data before expiration
- Based on access patterns
Pros:
- Reduced latency
- No cache miss delays
Cons:
- May refresh unused data
- More complex
5.10 Cache Invalidation Strategies
5.10.1 Time-To-Live (TTL)
Strategy: Data expires after fixed time period.
SET user:123 "John Doe" EX 3600 # Expires in 1 hour
When to use:
- Data changes predictably
- Stale data acceptable
- Simplicity preferred
5.10.2 Event-Based Invalidation
Strategy: Invalidate cache when data changes.
# On user update: write to the database, then evict the cached entry
UPDATE users SET name = 'Jane' WHERE id = 123;
DEL cache:user:123
When to use:
- Strong consistency required
- Update frequency moderate
5.11 Cache Monitoring
Key metrics:
- Hit Rate: Cache hits / Total requests (aim for >80%)
- Miss Rate: Cache misses / Total requests
- Eviction Rate: Items evicted / Time period
- Memory Usage: Current cache size / Max size
- Latency: Cache response time
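A minimal sketch of tracking hit rate inside application code; real deployments would export these counters to a metrics system such as Prometheus rather than keep them in memory.

```python
class CacheMetrics:
    def __init__(self):
        self.hits = 0
        self.misses = 0

    def record(self, hit):
        if hit:
            self.hits += 1
        else:
            self.misses += 1

    @property
    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

metrics = CacheMetrics()
metrics.record(hit=True)
metrics.record(hit=False)
print(metrics.hit_rate)   # 0.5 -- alert if this drops well below the ~0.8 target
```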
5.12 Best Practices
5.12.1 1. Choose Appropriate TTL
- Short TTL: Frequently changing data
- Long TTL: Rarely changing data
- No expiry: Static/immutable data
5.12.2 2. Cache at Multiple Levels
- Browser cache
- CDN
- Application cache
- Database query cache
5.12.3 3. Monitor Cache Performance
- Track hit/miss ratio
- Alert on sudden changes
- Analyze access patterns
5.12.4 4. Handle Cache Failures Gracefully
- Fallback to database
- Circuit breaker pattern
- Don’t let cache failure bring down system
5.12.5 5. Avoid Cache Stampede
Problem: Many requests for same expired key hit database simultaneously.
Solutions:
- Probabilistic early expiration
- Lock-based loading
- Refresh-ahead caching
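A lock-based loading sketch for a single process, with dictionaries standing in for the cache and database; in a distributed setup the per-key lock would be a distributed lock (for example a Redis lock) rather than `threading.Lock`.

```python
import threading

cache = {}                           # stand-in for the cache
database = {"user:123": "John Doe"}  # stand-in for the backing store
load_locks = {}                      # one lock per key
lock_guard = threading.Lock()        # protects the load_locks dict itself

def get_with_stampede_protection(key):
    value = cache.get(key)
    if value is not None:
        return value
    with lock_guard:
        key_lock = load_locks.setdefault(key, threading.Lock())
    # Only one thread per key rebuilds the entry; the rest wait here and
    # then read the freshly cached value instead of hitting the database.
    with key_lock:
        value = cache.get(key)          # re-check after acquiring the lock
        if value is None:
            value = database.get(key)   # a single database read per expiry
            cache[key] = value
        return value
```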
5.12.6 6. Serialize Appropriately
- Use efficient formats (Protocol Buffers, MessagePack)
- Consider compression for large values
- Balance size vs. serialization cost
5.13 Common Pitfalls
5.13.1 1. Caching Too Much
- Wastes memory
- Reduces hit rate
- Increases evictions
5.13.2 2. Caching Too Little
- Misses optimization opportunities
- Higher database load
5.13.3 3. Inconsistent Cache Keys
- Use standardized key naming
- Include version in key if needed
- Document key structure
5.13.4 4. Not Handling Cache Misses
- Always have fallback logic
- Implement retry mechanisms
- Log and monitor misses
5.14 Cache Size Calculation
Example calculation:
Average object size: 1 KB
Request rate: 1000 req/sec
Hit rate goal: 80%
Working set: 20% of total data
Required cache size ≈ (1000 req/sec × 3600 sec × 1 KB) × 0.2
= 720 MB of hot data per hour of unique requests
The 0.2 factor is the working-set fraction from the 80/20 rule; keeping this hot subset in the cache is what supports the 80% hit-rate goal.
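The same estimate as a quick calculation:

```python
request_rate = 1000         # requests per second
window_seconds = 3600       # one hour of traffic
avg_object_size_kb = 1      # average cached object size
working_set_fraction = 0.2  # share of data that is "hot" (80/20 rule)

cache_size_kb = request_rate * window_seconds * avg_object_size_kb * working_set_fraction
print(f"{cache_size_kb / 1000:.0f} MB")   # 720 MB, matching the estimate above
```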
5.15 Summary
Caching is one of the most effective techniques for improving system performance and scalability. The key is to:
- Cache the right data (80/20 rule)
- Choose appropriate invalidation strategy
- Monitor and tune cache performance
- Handle failures gracefully
- Use distributed caching for scale
Properly implemented caching can reduce latency by 10-100x and dramatically reduce database load.