Horizontal vs. Vertical Scaling
Overview
Scaling is the process of increasing a system's capacity to handle more load. There are two fundamental approaches: vertical scaling (scaling up) and horizontal scaling (scaling out).
Quick Reference
Vertical Scaling (Scale Up)
- Add more power (CPU, RAM, disk) to an existing machine
- Simple to implement, no code changes needed
- Limitations:
- Hard ceiling on hardware capacity
- Single point of failure
- Expensive at high specs (cost grows faster than linearly)
- Downtime during upgrades
Horizontal Scaling (Scale Out)
- Add more servers to distribute load
- No theoretical ceiling, better fault tolerance
- Challenges:
- Requires stateless application design
- Data consistency across nodes
- More operational complexity
Database Scaling
- Vertical first: Upgrade primary instance (e.g., AWS RDS supports up to 24TB RAM)
- Horizontal (sharding): Split data across multiple servers by shard key
- Sharding challenges:
- Resharding when data grows unevenly (use consistent hashing)
- Hotspot/celebrity problem (popular data on same shard)
- Cross-shard joins become difficult (de-normalize instead)
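The resharding challenge above is why consistent hashing is the usual fix: each server owns several points ("virtual nodes") on a hash ring, so adding a server only remaps the keys in its arcs. A minimal sketch — node names like `shard-0`, the MD5 hash, and 100 virtual nodes per server are illustrative choices, not a specific library's API:

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Minimal consistent-hash ring (illustrative sketch)."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._keys = []   # sorted virtual-node hashes
        self._map = {}    # virtual-node hash -> node name
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value: str) -> int:
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def add_node(self, node: str) -> None:
        # Place several virtual nodes to spread the node around the ring.
        for i in range(self.vnodes):
            h = self._hash(f"{node}#{i}")
            bisect.insort(self._keys, h)
            self._map[h] = node

    def remove_node(self, node: str) -> None:
        self._keys = [h for h in self._keys if self._map[h] != node]
        self._map = {h: n for h, n in self._map.items() if n != node}

    def get_node(self, key: str) -> str:
        # A key belongs to the first virtual node clockwise from its hash.
        if not self._keys:
            raise ValueError("ring is empty")
        idx = bisect.bisect_right(self._keys, self._hash(key)) % len(self._keys)
        return self._map[self._keys[idx]]
```

With N shards, adding one more remaps roughly 1/(N+1) of the keys — and only to the new shard — instead of reshuffling almost everything as naive `hash(key) % N` would.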
Key Principles
- Keep web tier stateless for easy horizontal scaling
- Build redundancy at every tier
- Cache aggressively to reduce database load
- Use CDN for static assets
- Scale data tier by sharding when vertical limits are reached
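"Cache aggressively" usually means the cache-aside pattern: check the cache first and only hit the database on a miss. A minimal sketch, with a plain dict standing in for Redis or Memcached (function and variable names are illustrative):

```python
_cache = {}  # stands in for Redis/Memcached


def get_user(user_id, db_lookup):
    """Cache-aside read: serve from cache, fall back to the database on a miss."""
    if user_id in _cache:
        return _cache[user_id]        # cache hit: no database load
    value = db_lookup(user_id)        # cache miss: one database read
    _cache[user_id] = value           # populate for subsequent readers
    return value
```

A production version would also set a TTL and invalidate entries on writes, which this sketch omits.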
Questions
Q1: What is horizontal vs vertical scaling?
Detailed Explanation
Vertical Scaling:
- Upgrade the hardware of a single server
- Increase CPU cores, RAM, or storage capacity
- No changes to application architecture required
- Has physical limits (you can only make a server so big)
Horizontal Scaling:
- Add more servers to a pool
- Distribute workload across multiple machines
- Requires application design considerations (statelessness, data consistency)
- Theoretically unlimited scaling potential
Example
Consider a web application experiencing slow response times:
- Vertical approach: Upgrade from a 4-core server with 16GB RAM to a 16-core server with 64GB RAM
- Horizontal approach: Add 3 more identical servers behind a load balancer
Q2: When would you choose one over the other?
Detailed Explanation
Choose Vertical Scaling when:
- Your application is stateful and difficult to distribute
- You need a quick fix without architectural changes
- Your workload is I/O-bound or single-threaded and benefits directly from faster hardware
- Cost of re-architecting exceeds cost of bigger hardware
- You're dealing with legacy systems
Choose Horizontal Scaling when:
- You need high availability and fault tolerance
- Your traffic is unpredictable and requires elasticity
- You've hit the limits of vertical scaling
- Your application is stateless or can be made stateless
- You need geographic distribution
Example
- Database primary: Often scaled vertically first (bigger instance) because writes typically go to a single node
- Web/API servers: Usually scaled horizontally because they're stateless and can easily run in parallel
- Cache layer: Can go either way—Redis can scale vertically to a point, then requires clustering (horizontal)
Q3: What are the pros and cons of each?
Detailed Explanation
Vertical Scaling:
| Pros | Cons |
|---|---|
| Simple to implement | Hardware limits (ceiling) |
| No code changes needed | Single point of failure |
| Lower operational complexity | Downtime during upgrades |
| Better for ACID transactions | Cost grows faster than linearly |
| Simpler data consistency | Vendor lock-in risk |
Horizontal Scaling:
| Pros | Cons |
|---|---|
| No theoretical ceiling | Complex architecture |
| High availability/fault tolerance | Data consistency challenges |
| Cost-effective at scale | Network latency between nodes |
| Geographic distribution | Requires stateless design |
| Pay for what you use | Operational overhead |
Example
Cost comparison at scale:
- Vertical: A server with 2x the CPU often costs more than 2x as much
- Horizontal: Two standard servers typically cost roughly 2x, sometimes less with reserved or spot pricing
Q4: Give real-world examples of each
Detailed Explanation
Vertical Scaling Examples:
- Instagram (early days): Ran on a single PostgreSQL server that was continuously upgraded before eventually sharding
- Stack Overflow: Famous for scaling vertically—runs on a surprisingly small number of powerful servers
- Most startup MVPs: Begin with a single beefy database server
Horizontal Scaling Examples:
- Netflix: Thousands of microservices distributed globally
- Google Search: Distributes queries across massive server farms
- Facebook: Memcached clusters with thousands of nodes
- Amazon: Auto-scaling groups that add/remove EC2 instances based on demand
Example
Stack Overflow's approach: They handle billions of page views with just a handful of servers by heavily optimizing their code and using powerful hardware. This is a counterexample to "always scale horizontally"—sometimes vertical scaling with good engineering is the right choice.
Q5: What challenges come with horizontal scaling?
Detailed Explanation
Key Challenges:
- Data Consistency
  - Keeping data synchronized across nodes
  - Handling distributed transactions
  - Dealing with eventual consistency
- Session Management
  - User sessions must be shared or externalized
  - Sticky sessions vs. stateless design
  - Token-based authentication becomes preferred
- Service Discovery
  - How do services find each other?
  - Dynamic IP addresses as instances scale
  - Tools: Consul, etcd, Kubernetes DNS
- Load Balancing
  - Even distribution of traffic
  - Health checks and failover
  - Algorithm selection (round-robin, least connections, etc.)
- Operational Complexity
  - More servers = more things to monitor
  - Distributed logging and tracing
  - Configuration management at scale
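The load-balancing algorithms mentioned above can be sketched in a few lines each. Server names are illustrative; a real balancer would also fold in the health checks noted above:

```python
import itertools


class RoundRobinBalancer:
    """Hand out servers in a fixed rotation."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastConnectionsBalancer:
    """Send each request to the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {s: 0 for s in servers}

    def pick(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1      # caller must release() when the request ends
        return server

    def release(self, server):
        self.active[server] -= 1
```

Round-robin is fine when requests are uniform; least-connections adapts when some requests are much slower than others.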
Example
Session management evolution:
- Phase 1: Sessions stored on server (breaks with horizontal scaling)
- Phase 2: Sticky sessions via load balancer (limits flexibility)
- Phase 3: Centralized session store (Redis)
- Phase 4: Stateless JWT tokens (no server-side session needed)
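Phase 4 can be illustrated with a JWT-style signed token: any server that holds the key can verify a request, so no shared session store is needed. A simplified HMAC sketch — a real deployment would use a proper JWT library and key management, and the secret here is a placeholder:

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"demo-secret"  # illustrative; load from a key-management system


def issue_token(user_id: str, ttl: int = 3600) -> str:
    """Sign a JWT-like payload; verification needs only the key, not server state."""
    payload = json.dumps({"sub": user_id, "exp": int(time.time()) + ttl}).encode()
    body = base64.urlsafe_b64encode(payload).decode()
    sig = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    return f"{body}.{sig}"


def verify_token(token: str):
    """Return the claims if the signature and expiry check out, else None."""
    body, sig = token.rsplit(".", 1)
    expected = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None  # tampered, or signed with a different key
    claims = json.loads(base64.urlsafe_b64decode(body))
    return claims if claims["exp"] > time.time() else None
```

Because the token carries its own proof of authenticity, any instance behind the load balancer can serve the request — the property horizontal scaling needs.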
Q6: Can you combine both approaches?
Detailed Explanation
Hybrid Scaling Strategy:
- Application Layer: Scale horizontally
  - Stateless web servers behind load balancers
  - Easy to add/remove instances
- Cache Layer: Start vertical, then horizontal
  - Single Redis instance initially
  - Redis Cluster when you outgrow it
- Database Layer: Vertical first, then horizontal
  - Upgrade primary instance as long as possible
  - Add read replicas (horizontal for reads)
  - Eventually shard (horizontal for writes)
- File Storage: Horizontal from the start
  - Object storage (S3) is inherently distributed
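The database-layer pattern above (writes to the primary, reads spread across replicas) is often implemented as a small routing layer. A sketch under simple assumptions — the connection names are placeholders, not a real driver API, and a real router would inspect queries more carefully than a prefix check:

```python
import itertools


class ReadWriteRouter:
    """Route writes to the primary, round-robin reads across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql: str) -> str:
        is_read = sql.lstrip().lower().startswith("select")
        if is_read and self._replicas is not None:
            return next(self._replicas)   # reads scale out across replicas
        return self.primary               # all writes go to the single primary
```

This is why the read path scales horizontally long before the write path does: adding a replica is one more entry in the rotation, while writes stay pinned to one node until you shard.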
Example
Typical e-commerce architecture:
- 10+ stateless API servers (horizontal)
- 1 large primary database + 5 read replicas (vertical + horizontal)
- Redis cluster for sessions and caching (horizontal)
- CDN for static assets (horizontal by nature)
Q7: How does cloud computing affect this decision?
Detailed Explanation
Cloud Advantages for Scaling:
- Auto-Scaling
  - Automatically add/remove instances based on metrics
  - Far less up-front capacity planning
  - Pay only for what you use
- Managed Services
  - RDS handles database scaling complexity
  - ElastiCache manages Redis clustering
  - Reduces operational burden
- Global Infrastructure
  - Multiple regions and availability zones
  - Built-in redundancy
  - CDN integration
- Instance Variety
  - Can scale vertically with one click
  - Wide range of instance sizes
  - Specialized instances (compute, memory, storage optimized)
Cloud-Native Patterns:
- Serverless (Lambda): Automatic horizontal scaling to zero
- Containers (ECS/EKS): Easy horizontal scaling with orchestration
- Spot instances: Cost-effective horizontal scaling
Example
Before cloud:
- Horizontal scaling required purchasing, racking, and configuring new servers
- Lead time: weeks to months
- Requires accurate capacity forecasting
With cloud:
- Horizontal scaling is an API call or configuration change
- Lead time: minutes
- Can react to actual demand in real-time
Q8: What metrics determine when to scale?
Detailed Explanation
Primary Scaling Metrics:
| Metric | When to Scale | Typical Threshold |
|---|---|---|
| CPU Utilization | High compute load | 70-80% sustained |
| Memory Usage | Memory pressure | 80-85% |
| Request Latency | Slow responses | p95 > SLA target |
| Queue Depth | Backlog building | Growing trend |
| Error Rate | System stress | Above baseline |
| Connection Count | Connection exhaustion | Near pool limit |
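A scale-out decision based on the thresholds in the table might look like the following sketch. Metric names, the exact cutoffs, and the default SLA/baseline values are illustrative assumptions:

```python
def should_scale_out(metrics: dict) -> bool:
    """Return True if any supplied metric breaches its threshold.

    Thresholds mirror the table above; only metrics present in the
    dict are checked, so partial telemetry still works.
    """
    rules = {
        "cpu_percent":    lambda v: v >= 75,   # "70-80% sustained"
        "memory_percent": lambda v: v >= 80,   # "80-85%"
        "p95_latency_ms": lambda v: v > metrics.get("sla_ms", 500),
        "error_rate":     lambda v: v > metrics.get("baseline_error_rate", 0.01),
    }
    return any(check(metrics[name]) for name, check in rules.items()
               if name in metrics)
```

In practice "sustained" matters: production systems evaluate these rules over a window (e.g., 5 minutes of datapoints) rather than on a single sample, to avoid scaling on noise.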
Scaling Strategies:
- Reactive Scaling
  - Respond to current metrics
  - Risk: Lag time before new capacity is ready
- Predictive Scaling
  - Use historical patterns
  - Scale before the traffic arrives
  - Better for known events (sales, launches)
- Scheduled Scaling
  - Based on time of day/week
  - Good for predictable patterns
Example
Setting up auto-scaling on AWS:
- Target tracking policy: Maintain average CPU at 60%
- Step scaling: Add 2 instances if CPU > 80%, add 4 if > 90%
- Cooldown period: 300 seconds to prevent thrashing
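The step-scaling policy above, expressed as a sketch. The thresholds and the 300-second cooldown mirror the example; this is illustrative logic, not the AWS API:

```python
def step_scale(cpu: float, last_scale_at: float, now: float,
               cooldown: float = 300.0) -> int:
    """Return how many instances to add for the given CPU reading.

    Returns 0 while inside the cooldown window, so a burst of high
    readings triggers one scaling action instead of several (thrashing).
    """
    if now - last_scale_at < cooldown:
        return 0          # still cooling down from the previous action
    if cpu > 90:
        return 4          # severe breach: add capacity aggressively
    if cpu > 80:
        return 2          # moderate breach: add a smaller step
    return 0              # below thresholds: no change
```

The higher-threshold check runs first on purpose: step policies apply the largest matching step, so a 95% reading adds 4 instances, not 2.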