Horizontal Scaling Operations Guide
Audience: DevOps Engineers, System Administrators
Last Updated: November 2025
Applies To: MapleFile Backend
Table of Contents
- Overview
- Understanding Scaling
- Prerequisites
- Scaling Up (Adding Replicas)
- Scaling Down (Removing Replicas)
- Adding Worker Nodes
- Monitoring Scaled Services
- Common Scenarios
- Troubleshooting
- Best Practices
Overview
What is Horizontal Scaling?
Horizontal scaling means adding more servers (replicas) to handle increased load, rather than making existing servers more powerful (vertical scaling).
Example:
- Before: 1 server handling 100 requests/second
- After: 3 servers each handling 33 requests/second
Why Scale Horizontally?
- Higher availability: If one server fails, others keep serving traffic
- Better performance: Load distributed across multiple servers
- Handle traffic spikes: Scale up during peak times, scale down during quiet times
- Zero downtime deployments: Update servers one at a time
Current Architecture
Single-Server Setup (Current):
Worker-8: Backend (1 replica) + Cassandra + Redis
↓
100% of traffic
Multi-Server Setup (After Scaling):
Worker-8: Backend (replica 1) + Cassandra + Redis
Worker-10: Backend (replica 2)
Worker-11: Backend (replica 3)
↓ ↓ ↓
33% 33% 34% of traffic
Understanding Scaling
Vertical vs Horizontal Scaling
| Aspect | Vertical Scaling | Horizontal Scaling |
|---|---|---|
| Method | Bigger server | More servers |
| Cost | Expensive (high-tier droplets) | Cheaper (many small droplets) |
| Limit | Hardware ceiling (max CPU/RAM) | Practically unlimited (add more servers) |
| Downtime | Required (resize server) | Zero downtime |
| Complexity | Simple | More complex (load balancing) |
| Failure | Single point of failure | High availability |
Example:
- Vertical: Upgrade from $12/mo (2 vCPU, 2GB RAM) to $48/mo (8 vCPU, 16GB RAM)
- Horizontal: Add 3x $12/mo droplets = $36/mo total for 6 vCPU, 6GB RAM
When to Scale
Scale up when:
- CPU usage consistently above 70%
- Memory usage consistently above 80%
- Response times increasing
- Error rates increasing
- Traffic growing steadily
Scale down when:
- CPU usage consistently below 30%
- Memory usage consistently below 50%
- Traffic decreased
- Cost optimization needed
How Docker Swarm Handles Scaling
Docker Swarm automatically:
- Load balances traffic across all replicas
- Runs health checks on each replica
- Restarts failed replicas
- Distributes replicas across worker nodes
- Updates replicas with zero downtime
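A quick way to watch most of these behaviors is a throwaway service. This sketch uses a generic nginx image and a demo service name, neither of which is part of the MapleFile stack:
# Create a 3-replica demo service published through the routing mesh
docker service create --name demo --replicas 3 --publish 8080:80 nginx:alpine
# Replicas are spread across available nodes
docker service ps demo
# Port 8080 on any node load balances across all replicas
curl -s http://localhost:8080 > /dev/null && echo OK
# Clean up
docker service rm demo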
Prerequisites
Before Scaling Up
Ensure your application supports horizontal scaling:
✅ MapleFile Backend is Ready
MapleFile backend is designed for horizontal scaling:
- ✅ Stateless: No local state (uses Cassandra/Redis for shared state)
- ✅ Leader election: Scheduled tasks run only on one instance
- ✅ Shared database: All replicas use same Cassandra cluster
- ✅ Shared cache: All replicas use same Redis instance
- ✅ Session storage: JWT tokens are stateless (no session store needed)
⚠️ Check Your Application
If you were scaling a different app, verify:
- No local file storage (use S3 instead)
- No in-memory sessions (use Redis instead)
- No local caching (use Redis instead)
- Database supports concurrent connections
- No port conflicts (don't bind to host ports)
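Two quick inspections of an existing service can catch the most common blockers; the service name below is a placeholder:
# Host-mode port bindings prevent running multiple replicas on one node
docker service inspect <service-name> --format '{{json .Spec.EndpointSpec.Ports}}'
# Bind mounts or node-local volumes suggest local state that replicas won't share
docker service inspect <service-name> --format '{{json .Spec.TaskTemplate.ContainerSpec.Mounts}}'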
Scaling Up (Adding Replicas)
Method 1: Quick Scale (Same Worker)
Scale to multiple replicas on the same worker node.
Step 1: SSH to Manager
ssh dockeradmin@<MANAGER_IP>
Step 2: Scale the Service
# Scale MapleFile backend from 1 to 3 replicas
docker service scale maplefile_backend=3
# Or use update command
docker service update --replicas 3 maplefile_backend
Step 3: Monitor Scaling
# Watch replicas start
watch docker service ls
# Expected output:
# NAME REPLICAS IMAGE
# maplefile_backend 3/3 ...maplefile-backend:prod
3/3 means all 3 desired replicas are running (the column reads running/desired)
Step 4: Verify All Replicas Running
# Check where replicas are running
docker service ps maplefile_backend
# Output:
# NAME NODE CURRENT STATE
# maplefile_backend.1 worker-8 Running 5 minutes ago
# maplefile_backend.2 worker-8 Running 30 seconds ago
# maplefile_backend.3 worker-8 Running 30 seconds ago
Step 5: Check Logs
# Check logs from all replicas
docker service logs maplefile_backend --tail 50
# Look for successful startup from each replica
Step 6: Test Load Balancing
# Make multiple requests - should be distributed across replicas
for i in {1..10}; do
  curl -s https://maplefile.ca/health
done
# Check logs to see different replicas handling requests
docker service logs maplefile_backend --tail 20
Method 2: Scale Across Multiple Workers
Scale replicas across different worker nodes for better availability.
Step 1: Add Worker Nodes (If Needed)
See the Adding Worker Nodes section below.
Step 2: Label Worker Nodes
# Label worker-10 as backend node
docker node update --label-add maplefile-backend=true mapleopentech-swarm-worker-10-prod
# Label worker-11 as backend node
docker node update --label-add maplefile-backend=true mapleopentech-swarm-worker-11-prod
# Verify labels
docker node inspect mapleopentech-swarm-worker-10-prod --format '{{.Spec.Labels}}'
Step 3: Update Stack File
# Edit stack file
nano ~/stacks/maplefile-stack.yml
Change deployment configuration:
services:
  backend:
    deploy:
      replicas: 3  # Change from 1 to 3
      placement:
        constraints:
          # Remove single-node constraint
          - node.labels.maplefile-backend == true  # Now matches multiple workers
        preferences:
          # Spread replicas across different nodes
          - spread: node.hostname
Step 4: Redeploy Stack
cd ~/stacks
docker stack deploy -c maplefile-stack.yml maplefile
Step 5: Verify Distribution
# Check which nodes replicas are running on
docker service ps maplefile_backend --format "table {{.Name}}\t{{.Node}}\t{{.CurrentState}}"
# Expected output (distributed across nodes):
# NAME NODE CURRENT STATE
# maplefile_backend.1 worker-8 Running
# maplefile_backend.2 worker-10 Running
# maplefile_backend.3 worker-11 Running
Method 3: Auto-Scaling (Advanced)
Note: Docker Swarm doesn't have built-in auto-scaling. You would need to implement custom auto-scaling using:
- Prometheus for metrics
- Custom script to monitor CPU/memory
- Script to scale service based on thresholds
Example auto-scale script:
#!/bin/bash
# auto-scale.sh - Example only, not production-ready

# Get average CPU usage across all maplefile replicas
# (include the container name in the format so grep has something to match)
CPU_AVG=$(docker stats --no-stream --format "{{.Name}} {{.CPUPerc}}" \
  | grep maplefile \
  | awk '{gsub(/%/, "", $2); sum += $2; count++} END {if (count) print sum / count; else print 0}')

# Scale up if CPU > 70%
if (( $(echo "$CPU_AVG > 70" | bc -l) )); then
  CURRENT=$(docker service inspect maplefile_backend --format '{{.Spec.Mode.Replicated.Replicas}}')
  NEW=$((CURRENT + 1))
  docker service scale maplefile_backend=$NEW
  echo "Scaled up to $NEW replicas (CPU: $CPU_AVG%)"
fi

# Scale down if CPU < 30% and more than 1 replica
if (( $(echo "$CPU_AVG < 30" | bc -l) )); then
  CURRENT=$(docker service inspect maplefile_backend --format '{{.Spec.Mode.Replicated.Replicas}}')
  if [ "$CURRENT" -gt 1 ]; then
    NEW=$((CURRENT - 1))
    docker service scale maplefile_backend=$NEW
    echo "Scaled down to $NEW replicas (CPU: $CPU_AVG%)"
  fi
fi
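To run the check on a schedule, a cron entry like this would work (the five-minute interval, script location, and log path are assumptions; adjust them to your setup):
# Check load and scale every 5 minutes
*/5 * * * * /home/dockeradmin/stacks/auto-scale.sh >> /var/log/maplefile-autoscale.log 2>&1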
Scaling Down (Removing Replicas)
When to Scale Down
Scale down to save costs when:
- Traffic decreased
- CPU/memory usage consistently low
- Cost optimization needed
- Testing showed fewer replicas handle load fine
Step 1: SSH to Manager
ssh dockeradmin@<MANAGER_IP>
Step 2: Scale Down Service
# Scale from 3 replicas to 1
docker service scale maplefile_backend=1
# Or update
docker service update --replicas 1 maplefile_backend
Step 3: Monitor Scaling Down
# Watch replicas stop
watch docker service ls
# Expected output:
# NAME REPLICAS IMAGE
# maplefile_backend 1/1 ...maplefile-backend:prod
Step 4: Verify Which Replica Was Kept
# Check which replica is still running
docker service ps maplefile_backend
# Output:
# NAME NODE CURRENT STATE
# maplefile_backend.1 worker-8 Running 10 minutes ago
# maplefile_backend.2 worker-10 Shutdown 10 seconds ago
# maplefile_backend.3 worker-11 Shutdown 10 seconds ago
Step 5: Test Service Still Works
# Test endpoint
curl https://maplefile.ca/health
# Check logs
docker service logs maplefile_backend --tail 20
Adding Worker Nodes
When to Add Worker Nodes
Add worker nodes when:
- Want to distribute backend across multiple servers
- Current worker at capacity
- Need better high availability
- Planning for growth
Step 1: Create New DigitalOcean Droplet
From DigitalOcean dashboard or CLI:
# Create worker-10 droplet (Ubuntu 22.04, $12/mo)
doctl compute droplet create mapleopentech-swarm-worker-10-prod \
--region nyc3 \
--size s-2vcpu-2gb \
--image ubuntu-22-04-x64 \
--ssh-keys <your-ssh-key-id> \
--tag-names production,swarm-worker,maplefile
# Get IP address
doctl compute droplet get mapleopentech-swarm-worker-10-prod --format PublicIPv4
Step 2: Install Docker on New Worker
# SSH to new worker
ssh root@<worker-10-ip>
# Install Docker
curl -fsSL https://get.docker.com -o get-docker.sh
sh get-docker.sh
# Verify Docker installed
docker --version
Step 3: Join Worker to Swarm
On manager node:
# Get join token
ssh dockeradmin@<MANAGER_IP>
docker swarm join-token worker
# Output:
# docker swarm join --token SWMTKN-xxx <MANAGER_IP>:2377
On new worker:
# Join swarm (use token from above)
docker swarm join --token SWMTKN-xxx <MANAGER_IP>:2377
# Output:
# This node joined a swarm as a worker.
Step 4: Verify Worker Joined
On manager:
# List all nodes
docker node ls
# Output should include new worker:
# ID HOSTNAME STATUS AVAILABILITY
# xyz123 mapleopentech-swarm-manager-1-prod Ready Active Leader
# abc456 mapleopentech-swarm-worker-8-prod Ready Active
# def789 mapleopentech-swarm-worker-10-prod Ready Active ← New!
Step 5: Label New Worker
# Label worker-10 for backend workloads
docker node update --label-add maplefile-backend=true mapleopentech-swarm-worker-10-prod
# Verify label
docker node inspect mapleopentech-swarm-worker-10-prod --format '{{.Spec.Labels}}'
Step 6: Join to Private Network
Important: Workers must be able to reach Cassandra and Redis.
# Add worker-10 to maple-private-prod network
# This is done automatically when services start on the worker
# But verify connectivity:
# On worker-10, test Redis connectivity
ssh root@<worker-10-ip>
docker run --rm --network maple-private-prod redis:7.0-alpine redis-cli -h redis ping
# Should output: PONG
Step 7: Scale Service to Use New Worker
# On manager
docker service update --replicas 2 maplefile_backend
# Check distribution
docker service ps maplefile_backend --format "table {{.Name}}\t{{.Node}}\t{{.CurrentState}}"
# Should show replicas on both worker-8 and worker-10
Monitoring Scaled Services
Real-Time Monitoring
Watch service status:
# All services
watch docker service ls
# Specific service
watch 'docker service ps maplefile_backend --format "table {{.Name}}\t{{.Node}}\t{{.CurrentState}}"'
Monitor resource usage:
# CPU and memory of all replicas
docker stats --no-stream --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}" | grep maplefile
# Continuous monitoring (docker stats' live view redraws the screen, which breaks when piped; use watch instead)
watch 'docker stats --no-stream --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}" | grep maplefile'
Check Load Distribution
See which replica handled request:
# Follow logs from all replicas
docker service logs -f maplefile_backend
# Filter for specific endpoint
docker service logs -f maplefile_backend | grep "/api/v1/users"
# You should see different replica IDs in logs
Prometheus Monitoring (If Configured)
Query metrics:
# Average CPU usage across all replicas
avg(rate(container_cpu_usage_seconds_total{service="maplefile_backend"}[5m]))
# Request rate per replica
sum(rate(http_requests_total{service="maplefile_backend"}[5m])) by (instance)
# P95 response time
histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))
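The same expressions can be evaluated from a script via Prometheus's HTTP API; the host and port below are assumptions for your setup:
# Evaluate a query through the HTTP API (POST form-encodes the expression safely)
curl -s 'http://<prometheus-host>:9090/api/v1/query' \
  --data-urlencode 'query=avg(rate(container_cpu_usage_seconds_total{service="maplefile_backend"}[5m]))'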
Common Scenarios
Scenario 1: Handling Traffic Spike
Sudden traffic increase - need to scale quickly.
# SSH to manager
ssh dockeradmin@<MANAGER_IP>
# Check current load
docker stats --no-stream --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}" | grep maplefile
# Scale from 1 to 5 replicas immediately
docker service scale maplefile_backend=5
# Monitor scaling
watch docker service ls
# Wait for all replicas healthy (5/5)
# Verify load distributed
docker stats --no-stream --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}" | grep maplefile
# After traffic spike ends, scale back down
docker service scale maplefile_backend=2
Scenario 2: Planned Scaling for Event
You know a marketing campaign will increase traffic.
Day Before Event:
# Add worker nodes if needed (see Adding Worker Nodes section)
# Scale up gradually
docker service scale maplefile_backend=3
# Verify all healthy
docker service ps maplefile_backend
# Load test
# Run load tests to verify system handles expected traffic
During Event:
# Monitor continuously
watch docker service ls
# Scale up more if needed
docker service scale maplefile_backend=5
# Check logs for errors
docker service logs maplefile_backend --tail 100 | grep -i error
After Event:
# Scale back down gradually
docker service scale maplefile_backend=3
# Monitor for 1 hour
# Scale to normal
docker service scale maplefile_backend=1
Scenario 3: Zero-Downtime Deployment with Scaling
Deploy new version with zero downtime using scaled replicas.
# 1. Scale up to 3 replicas BEFORE deploying
docker service scale maplefile_backend=3
# Wait for all healthy
docker service ps maplefile_backend
# 2. Deploy new image
docker service update --image registry.digitalocean.com/ssp/maplefile-backend:prod maplefile_backend
# Docker Swarm will:
# - Update replica 1, wait for health check
# - Update replica 2, wait for health check
# - Update replica 3, wait for health check
# Always at least 2 replicas serving traffic
# 3. Monitor update
docker service ps maplefile_backend
# 4. After successful deployment, scale back down if desired
docker service scale maplefile_backend=1
Scenario 4: High Availability Setup
Run 3 replicas across 3 workers for maximum availability.
# Ensure 3 worker nodes labeled
docker node update --label-add maplefile-backend=true mapleopentech-swarm-worker-8-prod
docker node update --label-add maplefile-backend=true mapleopentech-swarm-worker-10-prod
docker node update --label-add maplefile-backend=true mapleopentech-swarm-worker-11-prod
# Update stack file for HA
nano ~/stacks/maplefile-stack.yml
Stack file HA configuration:
services:
  backend:
    deploy:
      replicas: 3
      placement:
        constraints:
          - node.labels.maplefile-backend == true
        preferences:
          # Spread across different nodes
          - spread: node.hostname
        max_replicas_per_node: 1  # Only 1 replica per node
      update_config:
        parallelism: 1      # Update 1 replica at a time
        delay: 10s
        failure_action: rollback
        monitor: 60s
        order: start-first  # Start new before stopping old
Deploy HA stack:
docker stack deploy -c maplefile-stack.yml maplefile
# Verify distribution
docker service ps maplefile_backend --format "table {{.Name}}\t{{.Node}}\t{{.CurrentState}}"
# Should show 1 replica on each worker
Scenario 5: Cost Optimization
Running 3 replicas but only need 1 during off-peak hours.
Create scale-down script:
# Create script
cat > ~/stacks/scale-schedule.sh << 'EOF'
#!/bin/bash
HOUR=$(date +%H)

# Scale up during business hours (9 AM - 6 PM)
if [ "$HOUR" -ge 9 ] && [ "$HOUR" -lt 18 ]; then
  docker service scale maplefile_backend=3
  echo "$(date): Scaled to 3 replicas (business hours)"
# Scale down during off-peak (6 PM - 9 AM)
else
  docker service scale maplefile_backend=1
  echo "$(date): Scaled to 1 replica (off-peak)"
fi
EOF
chmod +x ~/stacks/scale-schedule.sh
Add to crontab:
# Run every hour
crontab -e
# Add (use the absolute path where you saved the script and a log path the cron user can write):
0 * * * * /home/dockeradmin/stacks/scale-schedule.sh >> /var/log/maplefile-scaling.log 2>&1
Troubleshooting
Problem: Replica Won't Start
Symptom: Service shows 2/3 replicas (one missing)
Diagnosis:
# Check service tasks
docker service ps maplefile_backend --no-trunc
# Look for ERROR or FAILED states
# Common errors:
# - "no suitable node"
# - "resource constraints not met"
# - "starting container failed"
Solutions:
If "no suitable node":
# Check node availability
docker node ls
# Check placement constraints
docker service inspect maplefile_backend --format '{{.Spec.TaskTemplate.Placement}}'
# Fix: Add more worker nodes or adjust constraints
If "resource constraints":
# Check worker resources
docker node inspect <worker-name> --format '{{.Description.Resources}}'
# Fix: Add more memory/CPU or scale down other services
If "container failed to start":
# Check logs
docker service logs maplefile_backend --tail 100
# Fix: Resolve application error (database connection, etc.)
Problem: Uneven Load Distribution
Symptom: One replica handling more traffic than others
Diagnosis:
# Check CPU/memory per replica
docker stats --no-stream --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}" | grep maplefile
# Check request logs
docker service logs maplefile_backend | grep "HTTP request"
Causes:
- External load balancer pinning connections
- Long-lived connections (WebSockets)
- Some replicas slower (different hardware)
Solution:
# Ensure using Docker Swarm's built-in load balancer (ingress network)
# Check service network mode
docker service inspect maplefile_backend --format '{{.Spec.EndpointSpec.Mode}}'
# Should be: vip (virtual IP for load balancing)
# If not, update service
docker service update --endpoint-mode vip maplefile_backend
Problem: Replica on Wrong Node
Symptom: Replica running on node without required labels
Diagnosis:
# Check where replicas are running
docker service ps maplefile_backend --format "table {{.Name}}\t{{.Node}}"
# Check node labels
docker node inspect <node-name> --format '{{.Spec.Labels}}'
Solution:
# Add label to node
docker node update --label-add maplefile-backend=true <node-name>
# Or force replica to move
docker service update --force maplefile_backend
Problem: Can't Scale Down
Symptom: docker service scale hangs or fails
Diagnosis:
# Check service update status
docker service inspect maplefile_backend --format '{{.UpdateStatus.State}}'
# Check for stuck tasks
docker service ps maplefile_backend --no-trunc
Solution:
# Cancel stuck update
docker service update --rollback maplefile_backend
# Force scale
docker service update --replicas 1 --force maplefile_backend
Problem: Leader Election Issues (Multiple Leaders)
Symptom: Scheduled tasks running multiple times
Diagnosis:
# Check logs for leader election messages
docker service logs maplefile_backend | grep -i "leader"
# Should see only one "Elected as leader"
Cause: Redis connection issues or split-brain
Solution:
# Restart all replicas to re-elect leader
docker service update --force maplefile_backend
# Verify single leader in logs
docker service logs maplefile_backend --tail 50 | grep -i "leader"
Best Practices
1. Start Small, Scale Gradually
# Don't go from 1 to 10 replicas immediately
# Scale gradually:
docker service scale maplefile_backend=2 # Test with 2
# Monitor for 30 minutes
docker service scale maplefile_backend=3 # Increase to 3
# Monitor for 30 minutes
docker service scale maplefile_backend=5 # Increase to 5
2. Always Scale Before Deploying
# Scale up for safer deployments
docker service scale maplefile_backend=3
docker service update --image ...new-image... maplefile_backend
# Can scale back down after deployment succeeds
3. Use Health Checks
Ensure stack file has health checks:
services:
  backend:
    healthcheck:
      test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:8000/health"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 60s
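Once deployed, you can confirm checks are passing on any node that runs a replica; Docker appends the health state to the container status:
# On the worker node - STATUS shows "(healthy)" when checks pass
docker ps --filter "name=maplefile_backend" --format "table {{.Names}}\t{{.Status}}"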
4. Monitor Resource Usage
# Check BEFORE scaling
docker stats --no-stream --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}"
# If CPU < 50%, probably don't need to scale yet
# If CPU > 70%, scale up
# If CPU > 90%, scale urgently
5. Document Scaling Decisions
Keep a scaling log:
## Scaling Log
### 2025-11-14 - Scaled to 3 Replicas
- **Reason:** Marketing campaign expected to 3x traffic
- **Duration:** 2025-11-14 to 2025-11-16
- **Command:** `docker service scale maplefile_backend=3`
- **Result:** Successfully handled 3x traffic, CPU avg 45%
- **Cost:** +$24/mo for 2 extra droplets
### 2025-11-16 - Scaled back to 1 Replica
- **Reason:** Campaign ended, traffic back to normal
- **Command:** `docker service scale maplefile_backend=1`
- **Result:** Single replica handling load fine, CPU avg 35%
6. Test Scaling in Non-Production First
If you have QA environment:
# Test scaling in QA
ssh qa-manager
docker service scale maplefile_backend_qa=3
# Verify works correctly
# - Load balancing
# - Leader election
# - Database connections
# - Performance
# Then apply to production
ssh dockeradmin@<MANAGER_IP>
docker service scale maplefile_backend=3
7. Plan for Database Connections
Each replica needs database connections:
# If you have 3 replicas with 2 connections each = 6 total connections
# Ensure Cassandra can handle this
# Check Cassandra connection limit (default: high)
# Check Redis connection limit (default: 10000)
# If scaling to 10+ replicas, verify database can handle connections
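For Redis, the current client count and the configured limit can be checked directly; this sketch reuses the maple-private-prod network from the worker connectivity test above:
# Current client connections
docker run --rm --network maple-private-prod redis:7.0-alpine redis-cli -h redis info clients
# Configured connection limit
docker run --rm --network maple-private-prod redis:7.0-alpine redis-cli -h redis config get maxclients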
8. Consider Cost vs Performance
Calculate costs:
# Current: 1 replica on worker-8 ($12/mo)
# Total: $12/mo
# Scaled: 3 replicas across 3 workers ($12/mo each)
# Total: $36/mo (+$24/mo)
# Is the performance gain worth $24/mo?
# - If traffic justifies it: Yes
# - If just for redundancy: Maybe use 2 replicas instead
9. Use Placement Strategies
Spread across nodes for HA:
deploy:
  placement:
    preferences:
      - spread: node.hostname  # Spread across different nodes
Note that Swarm has no built-in "pack" strategy for replicated services; a spread preference over node.id spreads replicas just like node.hostname does. To concentrate replicas on fewer nodes for cost, restrict which nodes are eligible with placement constraints instead:
deploy:
  placement:
    constraints:
      - node.hostname == mapleopentech-swarm-worker-8-prod  # Pin to one node
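The same restriction can be applied to a running service without editing the stack file; the hostname below is one of the workers from earlier sections:
# Pin the running service to a single node
docker service update --constraint-add 'node.hostname == mapleopentech-swarm-worker-8-prod' maplefile_backend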
10. Set Resource Limits
Prevent one replica from using all resources:
services:
  backend:
    deploy:
      resources:
        limits:
          memory: 1G    # Max 1GB per replica
          cpus: '0.5'   # Max 50% of 1 CPU
        reservations:
          memory: 512M  # Reserve 512MB
          cpus: '0.25'  # Reserve 25% of 1 CPU
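After redeploying, you can confirm the limits were applied by reading them back from the service spec:
# Show configured limits and reservations
docker service inspect maplefile_backend --format '{{json .Spec.TaskTemplate.Resources}}'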
Quick Reference
Essential Commands
# Scale service
docker service scale maplefile_backend=3
docker service update --replicas 3 maplefile_backend
# Check replicas
docker service ls | grep maplefile
docker service ps maplefile_backend
# Monitor resources
docker stats --no-stream --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}" | grep maplefile
# Check distribution
docker service ps maplefile_backend --format "table {{.Name}}\t{{.Node}}\t{{.CurrentState}}"
# Scale down
docker service scale maplefile_backend=1
# Force update (re-distribute replicas)
docker service update --force maplefile_backend
Scaling Decision Matrix
| CPU Usage | Memory Usage | Action |
|---|---|---|
| < 30% | < 50% | Scale down or keep current |
| 30-70% | 50-80% | Keep current (optimal) |
| 70-85% | 80-90% | Scale up soon (planned) |
| > 85% | > 90% | Scale up now (urgent) |
Replica Count Guidelines
| Traffic Level | Suggested Replicas | Cost |
|---|---|---|
| Development | 1 | $12/mo |
| Low (< 1000 req/day) | 1 | $12/mo |
| Medium (1000-10000 req/day) | 2-3 | $24-36/mo |
| High (10000-100000 req/day) | 5-10 | $60-120/mo |
| Very High (> 100000 req/day) | 10+ | $120+/mo |
Questions?
- Check service status: docker service ls | grep maplefile
- Check replica distribution: docker service ps maplefile_backend
- Monitor resources: docker stats | grep maplefile
Last Updated: November 2025