3 Scaling Strategies – Foundations of System Design

3.1 Introduction to Scaling

Large enterprises handle massive amounts of data. As data volume grows, the system must scale to keep handling it with acceptable performance and cost. Scaling is the process of adding or removing system resources to match a varying load.

3.2 Types of Scaling

3.2.1 Vertical Scaling (Scale Up)

Vertical scaling means adding more resources (CPU, memory, storage, or network capacity) to a single existing machine, or replacing it with a more powerful one.

Examples:

Adding more RAM
Upgrading to faster CPUs
Adding more storage (HDD/SSD)
Increasing network bandwidth

Characteristics:

✅ Simpler to implement
✅ Usually no application code changes required
✅ Data consistency is easier to maintain (all data lives on one machine)
✅ No need for load balancers
❌ Hardware limits (maximum capacity ceiling)
❌ Single point of failure
❌ Upgrades usually require downtime
❌ Can be expensive at high end

When to use:

Early-stage applications
Applications with strict consistency requirements
Databases that are difficult to shard
When horizontal scaling complexity isn’t justified

3.2.2 Horizontal Scaling (Scale Out)

Horizontal scaling means adding more machines (servers) and distributing the load across them.

Examples:

Adding more web servers behind a load balancer
Deploying multiple application server instances
Sharding databases across multiple machines
Using distributed caching systems

Characteristics:

✅ No single-machine hardware ceiling (capacity grows by adding nodes)
✅ Resilient to system failures
✅ Uses commodity hardware (cost-effective)
✅ Can scale incrementally
❌ Requires load balancers
❌ More complex architecture
❌ Potential data inconsistency
❌ Network overhead
❌ Application must be designed for distribution

When to use:

Large-scale applications
High availability requirements
Unpredictable or spiky traffic
Cloud-native applications
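
The resilience point in the characteristics above can be made concrete with a little arithmetic. The sketch below assumes that nodes fail independently and that the load balancer in front of them is not itself a single point of failure; the 99% per-node availability is a hypothetical figure chosen only for illustration.

```python
def combined_availability(node_availability: float, nodes: int) -> float:
    """Probability that at least one of `nodes` independent nodes is up."""
    if not 0.0 <= node_availability <= 1.0:
        raise ValueError("node_availability must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    # The service is down only if every node is down at the same time.
    return 1 - (1 - node_availability) ** nodes


# Hypothetical per-node availability of 99%.
for count in (1, 2, 3):
    print(count, round(combined_availability(0.99, count), 6))
```

Under these assumptions each extra node multiplies the chance of a total outage by the per-node failure probability. Correlated failures (a shared rack, region, or bad deployment) weaken the effect in practice.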

3.3 Comparison: Horizontal vs Vertical Scaling

Aspect	Vertical Scaling	Horizontal Scaling
Load Balancer	Not necessary	Required
Failure Resilience	Single point of failure	Resilient to failures
Data Consistency	Maintained easily	Can be challenging
Scaling Limit	Hardware ceiling	Virtually unlimited
Cost at Scale	Expensive	More economical
Implementation	Simpler	More complex
Downtime	Required for upgrades	Zero-downtime possible

3.4 Scaling Different Components

3.4.1 Web Servers

Horizontal scaling approach:

Deploy multiple web server instances
Use a load balancer to distribute traffic
Keep servers stateless (store session data externally)
Use shared storage for uploaded files
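
The first three points above can be sketched together as follows. This is a minimal illustration, not a production load balancer: the backend names are hypothetical, round-robin is only one of several possible distribution policies, and a plain dictionary stands in for an external session store such as Redis.

```python
import itertools


class RoundRobinBalancer:
    """Hands out backends in a fixed rotation."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        self._cycle = itertools.cycle(backends)

    def next_backend(self):
        return next(self._cycle)


# Stand-in for an external session store (hypothetical, in-memory).
SESSION_STORE = {}


def handle_request(backend, session_id):
    """Any backend can serve any request because the session lives outside the server."""
    session = SESSION_STORE.setdefault(session_id, {"hits": 0})
    session["hits"] += 1
    return f"{backend} served request {session['hits']} for {session_id}"


balancer = RoundRobinBalancer(["web-1", "web-2", "web-3"])
responses = [handle_request(balancer.next_backend(), "alice") for _ in range(4)]
for line in responses:
    print(line)
```

Because the hit counter lives in the shared store, it keeps increasing even though consecutive requests land on different servers. If each server kept sessions in its own memory, the count would restart whenever the balancer switched backends.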

3.4.2 Application Servers

Horizontal scaling approach:

Package the application as containers (Docker)
Use an orchestrator (Kubernetes, ECS)
Configure auto-scaling based on metrics
Externalize configuration
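
Externalized configuration can be sketched as reading settings from the environment instead of hard-coding them in the image, so the same container runs unchanged in every environment. The variable names (APP_DATABASE_URL, APP_CACHE_HOST, APP_WORKER_COUNT) and the defaults are assumptions made up for this example.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    database_url: str
    cache_host: str
    worker_count: int


def load_settings(env=None):
    """Build settings from an environment mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    return Settings(
        database_url=env.get("APP_DATABASE_URL", "postgresql://localhost/app"),
        cache_host=env.get("APP_CACHE_HOST", "localhost"),
        worker_count=int(env.get("APP_WORKER_COUNT", "4")),
    )


# Passing a mapping explicitly keeps the example independent of the real environment.
settings = load_settings({"APP_CACHE_HOST": "cache.internal", "APP_WORKER_COUNT": "8"})
print(settings)
```

An orchestrator would normally inject these variables per deployment, which lets every replica of the same image pick up identical settings.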

3.4.3 Databases

Options for scaling databases:

Read Replicas: Route read queries to replica databases
Sharding: Partition data across multiple databases
Vertical Scaling: Upgrade database hardware (limited by the hardware ceiling)
NoSQL: Use databases designed for horizontal scaling
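
The first two options can be illustrated with a small routing sketch. It assumes hash-based sharding with a fixed shard count and a naive rule that treats any statement starting with SELECT as a read; the shard and server names are hypothetical, and real drivers or proxies make these decisions with far more care.

```python
import hashlib
import itertools

SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]  # hypothetical shard names


def shard_for(key: str) -> str:
    """Map a key to a shard with a stable hash (built-in hash() of str is salted per process)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]


class ReplicaRouter:
    """Send writes to the primary and spread reads across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        replicas = list(replicas)
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql: str) -> str:
        is_read = sql.lstrip().lower().startswith("select")
        if is_read and self._replicas is not None:
            return next(self._replicas)
        return self.primary


router = ReplicaRouter("db-primary", ["db-replica-1", "db-replica-2"])
print(shard_for("user:42"))
print(router.route("SELECT * FROM users WHERE id = 42"))
print(router.route("UPDATE users SET name = 'x' WHERE id = 42"))
```

Two caveats apply to this sketch. Modulo-based sharding remaps most keys when the shard count changes, and replicas can lag behind the primary, so a read issued right after a write may return stale data.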

3.4.4 Caching Layer

Horizontal scaling approach:

Use distributed caching (Redis Cluster, Memcached)
Use consistent hashing to distribute keys across cache nodes
Replicate cache data for redundancy
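
Consistent hashing, mentioned above, can be sketched as a hash ring with virtual nodes. This is a minimal illustration under assumed parameters (SHA-256 as the hash, 100 virtual nodes per server, hypothetical node names); it is not how any particular cache client implements it.

```python
import bisect
import hashlib


class HashRing:
    """Consistent hash ring: a key belongs to the first node point clockwise from its hash."""

    def __init__(self, nodes=(), vnodes=100):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (point, node) tuples
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value: str) -> int:
        return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")

    def add(self, node: str) -> None:
        # Each node is placed on the ring many times to even out the key distribution.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def node_for(self, key: str) -> str:
        if not self._ring:
            raise LookupError("ring is empty")
        index = bisect.bisect_right(self._ring, (self._hash(key), ""))
        return self._ring[index % len(self._ring)][1]  # wrap around past the last point


ring = HashRing(["cache-1", "cache-2", "cache-3"])
keys = [f"user:{i}" for i in range(1000)]
before = {key: ring.node_for(key) for key in keys}

ring.add("cache-4")
moved = [key for key in keys if ring.node_for(key) != before[key]]
print(f"{len(moved)} of {len(keys)} keys moved after adding a node")
```

The property worth noticing is that every key that moves goes to the newly added node, and the rest stay where they were. With plain modulo hashing, changing the node count would instead reshuffle most keys and invalidate most of the cache.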

3.5 Auto-Scaling

Modern cloud platforms can scale capacity automatically based on metrics.

Triggers for scaling:

CPU utilization above a threshold (e.g., 70%)
Memory usage above a threshold (e.g., 80%)
Request queue length
Custom application metrics

Scaling policies:

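Target tracking: Maintain a specific metric value (e.g., 70% average CPU)
Step scaling: Add or remove capacity in steps sized by how far the metric breaches a threshold
Scheduled scaling: Change capacity at set times for predictable load patterns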

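Example (AWS Auto Scaling target-tracking policy in CloudFormation syntax; WebServerGroup is a hypothetical Auto Scaling group assumed to be configured with MinSize 2 and MaxSize 10):

CpuTargetTrackingPolicy:
  Type: AWS::AutoScaling::ScalingPolicy
  Properties:
    AutoScalingGroupName: !Ref WebServerGroup
    PolicyType: TargetTrackingScaling
    TargetTrackingConfiguration:
      PredefinedMetricSpecification:
        PredefinedMetricType: ASGAverageCPUUtilization
      TargetValue: 70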
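
The target-tracking example above (70% CPU target, between 2 and 10 instances) can be approximated by a simple proportional rule. This is a sketch of the idea, not the algorithm AWS actually runs, which also involves alarms, cooldowns, and instance warm-up; the function name and parameters are made up for illustration. The formula has the same shape as the one documented for the Kubernetes Horizontal Pod Autoscaler.

```python
import math


def desired_capacity(current_instances, metric_value, target_value,
                     min_instances=2, max_instances=10):
    """Scale capacity in proportion to how far the metric is from its target, then clamp."""
    if current_instances < 1:
        raise ValueError("current_instances must be at least 1")
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    raw = math.ceil(current_instances * metric_value / target_value)
    return max(min_instances, min(max_instances, raw))


print(desired_capacity(4, 90, 70))   # above target: scale out
print(desired_capacity(4, 35, 70))   # below target: scale in
print(desired_capacity(8, 95, 70))   # capped by max_instances
print(desired_capacity(2, 10, 70))   # floored by min_instances
```

Rounding up errs on the side of spare capacity, and the clamp keeps a noisy metric from scaling the fleet to zero or past the budgeted maximum.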

3.6 Best Practices

3.6.1 Design for Horizontal Scaling

Stateless Services: Store session data externally (Redis, database)
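Loose Coupling: Services should be independently deployable and scalable
Asynchronous Communication: Use message queues to decouple producers from consumers
Idempotent Operations: Make operations safe to retry without duplicating their effects
Distributed Tracing: Trace requests across services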
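
Idempotent operations are often implemented with a client-supplied idempotency key. The sketch below assumes a hypothetical payment service and keeps processed keys in an in-memory dictionary; a real distributed deployment would need a shared store and an atomic check-and-write.

```python
class PaymentService:
    """Toy service whose charge() can be retried safely with the same idempotency key."""

    def __init__(self):
        self._results = {}      # idempotency key -> stored result
        self.charges_made = 0   # number of charges actually executed

    def charge(self, idempotency_key, account, amount_cents):
        # A repeated key means the client is retrying: return the original result.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges_made += 1
        result = {
            "charge_id": self.charges_made,
            "account": account,
            "amount_cents": amount_cents,
        }
        self._results[idempotency_key] = result
        return result


service = PaymentService()
first = service.charge("order-1001", "acct-7", 500)
retry = service.charge("order-1001", "acct-7", 500)  # e.g. resent after a timeout
print(first == retry, service.charges_made)
```

The retry returns the stored result instead of charging again, which is what makes it safe for a load balancer, queue consumer, or client library to resend a request when a response is lost.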

3.6.2 Cost Optimization

Right-size instances: Don’t over-provision
Reserved instances: Use for the steady baseline load
Spot instances: Use for fault-tolerant, interruptible workloads
Auto-scaling: Match capacity to demand
Monitor and optimize: Continuously review usage

3.7 Summary

Choosing between vertical and horizontal scaling depends on your specific requirements, budget, and growth expectations. Most modern large-scale systems use a combination of both strategies, vertically scaling individual components while horizontally scaling the overall architecture.

The key is to design your application with horizontal scaling in mind from the start, because retrofitting an application for a distributed architecture is significantly harder than building it that way initially.
