Production Infrastructure Setup Guide
Audience: DevOps Engineers, Infrastructure Team, Junior Engineers
Purpose: Complete step-by-step deployment of Maple Open Technologies production infrastructure from scratch
Time to Complete: 6-8 hours (first-time deployment)
Prerequisites: DigitalOcean account, basic Linux knowledge, SSH access
Overview
This directory contains comprehensive guides for deploying Maple Open Technologies production infrastructure on DigitalOcean from a completely fresh start. Follow these guides in sequential order to build a complete, production-ready infrastructure.
What you'll build:
- Docker Swarm cluster (7+ nodes)
- High-availability databases (Cassandra 3-node cluster)
- Caching layer (Redis)
- Search engine (Meilisearch)
- Backend API (Go application)
- Frontend (React SPA)
- Automatic HTTPS with SSL certificates
- Multi-application architecture (MaplePress, MapleFile)
Infrastructure at completion:
Internet (HTTPS)
├─ getmaplepress.ca → Backend API (worker-6)
└─ getmaplepress.com → Frontend (worker-7)
↓
Backend Services (maple-public-prod + maple-private-prod)
↓
Databases (maple-private-prod only)
├─ Cassandra: 3-node cluster (workers 2,3,4) - RF=3, QUORUM
├─ Redis: Single instance (worker-1/manager)
└─ Meilisearch: Single instance (worker-5)
↓
Object Storage: DigitalOcean Spaces (S3-compatible)
Setup Guides (In Order)
Phase 0: Planning & Prerequisites (30 minutes)
00-getting-started.md - Local workspace setup
- DigitalOcean account setup
- API token configuration
- SSH key generation
- .env file initialization
- Command-line tools verification
00-network-architecture.md - Network design
- Network segmentation strategy (maple-private-prod vs maple-public-prod)
- Security principles (defense in depth)
- Service communication patterns
- Firewall rules overview
00-multi-app-architecture.md - Multi-app strategy
- Naming conventions for services, stacks, hostnames
- Shared infrastructure design (Cassandra/Redis/Meilisearch)
- Application isolation patterns
- Scaling to multiple apps (MaplePress, MapleFile)
Prerequisites checklist:
- DigitalOcean account with billing enabled
- DigitalOcean API token (read + write permissions)
- SSH key pair generated (~/.ssh/id_rsa.pub)
- Domain names registered (e.g., getmaplepress.ca, getmaplepress.com)
- Local machine: git, ssh, curl installed
- .env file created from .env.template
Total time: 30 minutes
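The .env handling above can be sketched as a small helper. This is a minimal sketch, not the guides' own script: `load_env` is a hypothetical name, and it assumes the file contains plain `KEY=value` lines. It refuses to load a file whose mode is not exactly 600 and exports every variable so that child processes can see it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: load a .env file only if it is private to its owner.
load_env() {
  local file="${1:-.env}"
  # Prefix bare names with ./ so that "." does not search PATH for the file.
  case "$file" in
    */*) ;;
    *) file="./$file" ;;
  esac
  # find prints the path only when the mode is exactly 600.
  if [ -z "$(find "$file" -maxdepth 0 -perm 600 2>/dev/null)" ]; then
    echo "refusing to load $file: expected mode 600" >&2
    return 1
  fi
  set -a        # export every variable assigned while sourcing
  . "$file"
  set +a
}
```

Usage: run `chmod 600 .env` once, then `load_env .env`. A plain `source .env` only exports variables if the file itself uses `export`, which is why the sketch wraps the load in `set -a`.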
Phase 1: Infrastructure Foundation (3-4 hours)
01_init_docker_swarm.md - Docker Swarm cluster
- Create 7+ DigitalOcean droplets (Ubuntu 24.04)
- Install Docker on all nodes
- Initialize Docker Swarm (1 manager, 6+ workers)
- Configure private networking (VPC)
- Set up firewall rules
- Verify cluster connectivity
What you'll have:
- Manager node (worker-1): Swarm orchestration
- Worker nodes (2-7+): Application/database hosts
- Private network: 10.116.0.0/16
- All nodes communicating securely
Total time: 1-1.5 hours
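As a rough sketch of the two bootstrap steps in this guide, the functions below wrap the standard `docker swarm init` and `docker swarm join` commands. The function names are hypothetical, and the IP address and token are placeholders supplied by the caller; the detailed guide remains the reference for the actual procedure.

```shell
#!/usr/bin/env bash
# Hypothetical wrappers around the Swarm bootstrap commands.

# Run on the manager: advertise the private (VPC) address to the cluster.
init_swarm() {
  local manager_ip="$1"
  docker swarm init --advertise-addr "$manager_ip"
}

# Run on each worker: join using the token printed by the manager.
# 2377 is the Swarm cluster-management port.
join_as_worker() {
  local token="$1" manager_ip="$2"
  docker swarm join --token "$token" "${manager_ip}:2377"
}
```

Usage: on the manager, `init_swarm <manager-private-ip>`, then `docker swarm join-token -q worker` to print the token; on each worker, `join_as_worker <token> <manager-private-ip>`.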
02_cassandra.md - Cassandra database cluster
- Deploy 3-node Cassandra cluster (workers 2, 3, 4)
- Configure replication (RF=3, QUORUM consistency)
- Create keyspace and initial schema
- Verify cluster health (nodetool status)
- Performance tuning for production
What you'll have:
- Highly available database cluster
- Automatic failover (survives 1 node failure)
- QUORUM reads/writes for consistency
- Ready for application data
Total time: 1-1.5 hours
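The claim that RF=3 with QUORUM survives one node failure follows from the quorum formula, floor(RF / 2) + 1. The short sketch below works through the arithmetic; the function names are illustrative only.

```shell
#!/usr/bin/env bash
# Replicas that must acknowledge a QUORUM read or write: floor(RF / 2) + 1.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Replicas that can be down while QUORUM operations still succeed.
tolerated_failures() {
  echo $(( $1 - $(quorum "$1") ))
}

echo "RF=3: quorum=$(quorum 3), tolerated failures=$(tolerated_failures 3)"
# prints: RF=3: quorum=2, tolerated failures=1
```

With three replicas, two must respond, so one Cassandra node can be offline without failing QUORUM requests.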
03_redis.md - Redis cache server
- Deploy Redis on manager node (worker-1)
- Configure persistence (RDB + AOF)
- Set up password authentication
- Test connectivity from other services
What you'll have:
- High-performance caching layer
- Session storage
- Rate limiting storage
- Persistent cache (survives restarts)
Total time: 30 minutes
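To illustrate what "RDB + AOF" persistence with password authentication looks like, here is a minimal sketch that prints a redis.conf fragment. The directives are standard Redis options, but the helper name, the snapshot interval, and passing the password as an argument are assumptions; the actual guide may configure Redis differently (for example through a Docker secret).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a redis.conf fragment enabling both persistence modes.
redis_conf() {
  local password="$1"
  cat <<EOF
# RDB snapshot: save if at least 1 key changed within 900 seconds (illustrative value)
save 900 1
# AOF: append every write to a log, fsync once per second
appendonly yes
appendfsync everysec
# Require clients to authenticate before running commands
requirepass ${password}
EOF
}
```

Usage: `redis_conf "$REDIS_PASSWORD" > redis.conf`, where `REDIS_PASSWORD` is a placeholder variable.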
04_meilisearch.md - Search engine
- Deploy Meilisearch on worker-5
- Configure API key authentication
- Create initial indexes
- Test search functionality
What you'll have:
- Fast full-text search engine
- Typo-tolerant search
- Faceted filtering
- Ready for content indexing
Total time: 30 minutes
04.5_spaces.md - Object storage
- Create DigitalOcean Spaces bucket
- Configure access keys
- Set up CORS policies
- Create Docker secrets for Spaces credentials
- Test upload/download
What you'll have:
- S3-compatible object storage
- Secure credential management
- Ready for file uploads
- CDN-backed storage
Total time: 30 minutes
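Creating a Docker secret from a credential can be sketched as follows. `create_secret` and the secret name in the usage line are hypothetical; `docker secret create NAME -` is the standard form that reads the value from stdin.

```shell
#!/usr/bin/env bash
# Hypothetical helper: store a value as a Docker secret without a trailing newline.
create_secret() {
  local name="$1" value="$2"
  # printf avoids the newline that echo would append to the stored secret.
  printf '%s' "$value" | docker secret create "$name" -
}
```

Usage (on a manager node): `create_secret spaces_access_key "$SPACES_ACCESS_KEY"`, where both names are placeholders. Piping through stdin keeps the value out of the process list.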
Phase 2: Application Deployment (2-3 hours)
05_maplepress_backend.md - Backend API deployment (Part 1)
- Create worker-6 droplet
- Join worker-6 to Docker Swarm
- Configure DNS (point domain to worker-6)
- Authenticate with DigitalOcean Container Registry
- Create Docker secrets (JWT, encryption keys)
- Deploy backend service (Go application)
- Connect to databases (Cassandra, Redis, Meilisearch)
- Verify health checks
What you'll have:
- Backend API running on worker-6
- Connected to all databases
- Docker secrets configured
- Health checks passing
- Ready for reverse proxy
Total time: 1-1.5 hours
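For secrets such as a JWT signing key, one common approach is to generate random bytes locally before storing them as a Docker secret. The sketch below is an assumption about how such a value could be produced, not the method the backend guide prescribes; `random_hex` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print N random bytes (default 32) as lowercase hex.
random_hex() {
  local bytes="${1:-32}"
  # od prints space-separated hex bytes; tr strips the spaces and newlines.
  head -c "$bytes" /dev/urandom | od -An -v -tx1 | tr -d ' \n'
}
```

Usage: `random_hex 32` yields a 64-character hex string that can be piped into `docker secret create <name> -`.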
06_maplepress_caddy.md - Backend reverse proxy (Part 2)
- Configure Caddy reverse proxy
- Set up automatic SSL/TLS (Let's Encrypt)
- Configure security headers
- Enable HTTP to HTTPS redirect
- Preserve CORS headers for frontend
- Test SSL certificate acquisition
What you'll have:
- Backend accessible at https://getmaplepress.ca
- Automatic SSL certificate management
- Zero-downtime certificate renewals
- Security headers configured
- CORS configured for frontend
Total time: 30 minutes
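A reverse-proxy Caddyfile of the kind described here might look like the output of the sketch below. The helper name, the upstream address, and the specific headers are assumptions for illustration; the repository's own Caddyfile.template is authoritative. When the site address is a domain name, Caddy obtains certificates and redirects HTTP to HTTPS by default.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a minimal Caddyfile for a reverse-proxied backend.
render_caddyfile() {
  local domain="$1" upstream="$2"
  cat <<EOF
${domain} {
	reverse_proxy ${upstream}
	header {
		Strict-Transport-Security "max-age=31536000"
		X-Content-Type-Options "nosniff"
	}
}
EOF
}
```

Usage: `render_caddyfile getmaplepress.ca maplepress_backend:8000 > Caddyfile`, where the service name and port 8000 are placeholders.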
07_maplepress_frontend.md - Frontend deployment
- Create worker-7 droplet
- Join worker-7 to Docker Swarm
- Install Node.js on worker-7
- Clone repository and build React app
- Configure production environment (API URL)
- Deploy Caddy for static file serving
- Configure SPA routing
- Set up automatic SSL for frontend domain
What you'll have:
- Frontend accessible at https://getmaplepress.com
- React app built with production API URL
- Automatic HTTPS
- SPA routing working
- Static asset caching
- Complete end-to-end application
Total time: 1 hour
Phase 3: Optional Enhancements (1 hour)
99_extra.md - Extra operations
- Domain changes (backend and/or frontend)
- Horizontal scaling (multiple backend replicas)
- SSL certificate management
- Load balancing verification
Total time: As needed
Quick Start (Experienced Engineers)
If you're familiar with Docker Swarm and don't need detailed explanations:
# 1. Prerequisites (5 min)
cd cloud/infrastructure/production
cp .env.template .env
vi .env # Add DIGITALOCEAN_TOKEN
source .env
# 2. Infrastructure (1 hour)
# Follow 01_init_docker_swarm.md - create 7 droplets, init swarm
# SSH to manager, run quick verification
# 3. Databases (1 hour)
# Deploy Cassandra (02), Redis (03), Meilisearch (04), Spaces (04.5)
# Verify all services: docker service ls
# 4. Applications (1 hour)
# Deploy backend (05), backend-caddy (06), frontend (07)
# Test: curl https://getmaplepress.ca/health
# curl https://getmaplepress.com
# 5. Verify (15 min)
docker service ls # All services 1/1
docker node ls # All nodes Ready
# Test in browser: https://getmaplepress.com
Total time for experienced: ~3 hours
Directory Structure
setup/
├── README.md # This file
│
├── 00-getting-started.md # Prerequisites & workspace setup
├── 00-network-architecture.md # Network design principles
├── 00-multi-app-architecture.md # Multi-app naming & strategy
│
├── 01_init_docker_swarm.md # Docker Swarm cluster
├── 02_cassandra.md # Cassandra database cluster
├── 03_redis.md # Redis cache server
├── 04_meilisearch.md # Meilisearch search engine
├── 04.5_spaces.md # DigitalOcean Spaces (object storage)
│
├── 05_maplepress_backend.md # Backend API deployment
├── 06_maplepress_caddy.md # Backend reverse proxy (Caddy + SSL)
├── 07_maplepress_frontend.md # Frontend deployment (React + Caddy)
│
├── 99_extra.md # Domain changes, scaling, extras
│
└── templates/ # Configuration templates
    ├── cassandra-stack.yml.template
    ├── redis-stack.yml.template
    ├── backend-stack.yml.template
    └── Caddyfile.template
Infrastructure Specifications
Hardware Requirements
| Component | Droplet Size | vCPUs | RAM | Disk | Monthly Cost |
|---|---|---|---|---|---|
| Manager (worker-1) + Redis | Basic | 2 | 2 GB | 50 GB | $18 |
| Cassandra Node 1 (worker-2) | General Purpose | 2 | 4 GB | 80 GB | $48 |
| Cassandra Node 2 (worker-3) | General Purpose | 2 | 4 GB | 80 GB | $48 |
| Cassandra Node 3 (worker-4) | General Purpose | 2 | 4 GB | 80 GB | $48 |
| Meilisearch (worker-5) | Basic | 2 | 2 GB | 50 GB | $18 |
| Backend (worker-6) | Basic | 2 | 2 GB | 50 GB | $18 |
| Frontend (worker-7) | Basic | 1 | 1 GB | 25 GB | $6 |
| Total | - | 13 | 19 GB | 415 GB | ~$204/mo |
Additional costs:
- DigitalOcean Spaces: $5/mo (250 GB storage + 1 TB transfer)
- Bandwidth: Included (1 TB per droplet)
- Backups (optional): +20% of droplet cost
Total estimated: ~$210-250/month
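The estimate above can be reproduced from the table with integer arithmetic. The sketch uses only the prices listed in this section and truncates the backup surcharge to whole dollars.

```shell
#!/usr/bin/env bash
# Monthly droplet prices in USD, taken from the hardware table.
droplets=$(( 18 + 48 * 3 + 18 + 18 + 6 ))   # manager, 3x Cassandra, Meilisearch, backend, frontend
spaces=5                                    # DigitalOcean Spaces
backups=$(( droplets * 20 / 100 ))          # optional backups: 20% of droplet cost, truncated

echo "without backups: \$$(( droplets + spaces ))"
echo "with backups:    \$$(( droplets + spaces + backups ))"
# prints:
# without backups: $209
# with backups:    $249
```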
Software Versions
| Software | Version | Notes |
|---|---|---|
| Ubuntu | 24.04 LTS | Base OS |
| Docker | 27.x+ | Container runtime |
| Docker Swarm | Built-in | Orchestration |
| Cassandra | 4.1.x | Database |
| Redis | 7.x-alpine | Cache |
| Meilisearch | v1.5+ | Search |
| Caddy | 2-alpine | Reverse proxy |
| Go | 1.21+ | Backend runtime |
| Node.js | 20 LTS | Frontend build |
Key Concepts
Docker Swarm Architecture
Manager node (worker-1):
- Orchestrates all services
- Schedules tasks to workers
- Maintains cluster state
- Runs Redis (colocated)
Worker nodes (2-7+):
- Execute service tasks (containers)
- Report health to manager
- Isolated workloads via labels
Node labels:
- backend=true: Backend deployment target (worker-6)
- maplepress-frontend=true: Frontend target (worker-7)
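Labels like these are applied from a manager node with `docker node update --label-add`. The wrapper below is a hypothetical convenience function, and the node names in the usage line assume the droplets' hostnames match the worker names used in this guide.

```shell
#!/usr/bin/env bash
# Hypothetical helper: attach a key=value label to a Swarm node.
label_node() {
  local node="$1" label="$2"
  docker node update --label-add "$label" "$node"
}
```

Usage: `label_node worker-6 backend=true` and `label_node worker-7 maplepress-frontend=true`. A service is then pinned to the labeled node with a placement constraint such as `node.labels.backend == true` in its stack file.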
Network Architecture
maple-private-prod (overlay network):
- All databases (Cassandra, Redis, Meilisearch)
- Backend services (access to databases)
- No internet access (security)
- Internal-only communication
maple-public-prod (overlay network):
- Caddy reverse proxies
- Backend services (receive HTTP requests)
- Ports 80/443 exposed to internet
Backends join BOTH networks:
- Receive requests from Caddy (public network)
- Access databases (private network)
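Creating the two overlay networks might look like the sketch below. `create_overlay` is a hypothetical wrapper. Using `--internal` for the private network is an assumption: it is a standard `docker network create` flag that restricts external access, but this guide does not state how the isolation is actually achieved.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create an overlay network, passing any extra flags through.
create_overlay() {
  local name="$1"
  shift
  docker network create --driver overlay "$@" "$name"
}
```

Usage (on a manager node): `create_overlay maple-public-prod` and `create_overlay maple-private-prod --internal`. A backend service that lists both networks in its stack file can then talk to Caddy on one and to the databases on the other.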
Multi-Application Pattern
Shared infrastructure (workers 1-5):
- Cassandra, Redis, Meilisearch serve ALL apps
- Cost-efficient (1 infrastructure for unlimited apps)
Per-application deployment (workers 6+):
- Each app gets dedicated workers
- Independent scaling and deployment
- Clear isolation
Example: Adding MapleFile
- Worker-8: maplefile_backend + maplefile_backend-caddy
- Worker-9: maplefile-frontend_caddy
- Uses same Cassandra/Redis/Meilisearch
- No changes to infrastructure
Common Commands Reference
Swarm Management
# List all nodes
docker node ls
# List all services
docker service ls
# View service logs
docker service logs -f maplepress_backend
# Scale service
docker service scale maplepress_backend=3
# Update service (rolling restart)
docker service update --force maplepress_backend
# Remove service
docker service rm maplepress_backend
Stack Management
# Deploy stack
docker stack deploy -c stack.yml stack-name
# List stacks
docker stack ls
# View stack services
docker stack services maplepress
# Remove stack
docker stack rm maplepress
Troubleshooting
# Check service status
docker service ps maplepress_backend
# View container logs
docker logs <container-id>
# Inspect service
docker service inspect maplepress_backend
# Check network
docker network inspect maple-private-prod
# List configs
docker config ls
# List secrets
docker secret ls
Deployment Checklist
Use this checklist to track your progress:
Phase 0: Prerequisites
- DigitalOcean account created
- API token generated and saved
- SSH keys generated (ssh-keygen)
- SSH key added to DigitalOcean
- Domain names registered
- .env file created from template
- .env file has correct permissions (600)
- Git repository cloned locally
Phase 1: Infrastructure
- 7 droplets created (workers 1-7)
- Docker Swarm initialized
- All workers joined swarm
- Private networking configured (VPC)
- Firewall rules configured on all nodes
- Cassandra 3-node cluster deployed
- Cassandra cluster healthy (nodetool status)
- Redis deployed on manager
- Redis authentication configured
- Meilisearch deployed on worker-5
- Meilisearch API key configured
- DigitalOcean Spaces bucket created
- Spaces access keys stored as Docker secrets
Phase 2: Applications
- Worker-6 created and joined swarm
- Worker-6 labeled for backend
- DNS pointing backend domain to worker-6
- Backend Docker secrets created (JWT, IP encryption)
- Backend service deployed
- Backend health check passing
- Backend Caddy deployed
- Backend SSL certificate obtained
- Backend accessible at https://domain.ca
- Worker-7 created and joined swarm
- Worker-7 labeled for frontend
- DNS pointing frontend domain to worker-7
- Node.js installed on worker-7
- Repository cloned on worker-7
- Frontend built with production API URL
- Frontend Caddy deployed
- Frontend SSL certificate obtained
- Frontend accessible at https://domain.com
- CORS working (frontend can call backend)
Phase 3: Verification
- All services show 1/1 replicas (docker service ls)
- All nodes show Ready (docker node ls)
- Backend health endpoint returns 200
- Frontend loads in browser
- Frontend can call backend API (no CORS errors)
- SSL certificates valid (green padlock)
- HTTP redirects to HTTPS
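The replica check in this phase can be automated with a small filter. This is a sketch under the assumption that input lines have the form `NAME RUNNING/DESIRED`, which is what `docker service ls --format '{{.Name}} {{.Replicas}}'` produces; the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "NAME RUNNING/DESIRED" lines on stdin and
# print the names of services running fewer replicas than desired.
unhealthy_services() {
  awk '{ split($2, r, "/"); if (r[1] + 0 < r[2] + 0) print $1 }'
}
```

Usage: `docker service ls --format '{{.Name}} {{.Replicas}}' | unhealthy_services`. Empty output means every service has reached its desired replica count. The health endpoint can be checked separately with `curl -s -o /dev/null -w '%{http_code}' https://getmaplepress.ca/health`.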
Next Steps
- Set up monitoring (see ../operations/02_monitoring_alerting.md)
- Configure backups (see ../operations/01_backup_recovery.md)
- Review incident runbooks (see ../operations/03_incident_response.md)
Troubleshooting Guide
Problem: Docker Swarm Join Fails
Symptoms: Worker can't join swarm, connection refused
Check:
# On manager, verify swarm is initialized
docker info | grep "Swarm: active"
# Verify firewall allows swarm ports
sudo ufw status | grep -E "2377|7946|4789"
# Get new join token
docker swarm join-token worker
Problem: Service Won't Start
Symptoms: Service stuck at 0/1 replicas
Check:
# View service events
docker service ps service-name --no-trunc
# Common issues:
# - Image not found: Authenticate with registry
# - Network not found: Create network first
# - Secret not found: Create secrets
# - No suitable node: Check node labels
Problem: DNS Not Resolving
Symptoms: Domain doesn't resolve to correct IP
Check:
# Test DNS resolution
dig yourdomain.com +short
# Should return worker IP
# If not, wait 5-60 minutes for propagation
# Or check DNS provider settings
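The DNS check can also be scripted so that it returns a pass or fail status. A minimal sketch, with a hypothetical function name and placeholder arguments:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the domain's A records include the expected IP.
resolves_to() {
  local domain="$1" expected_ip="$2"
  # -F: fixed string, -x: whole-line match, -q: quiet (exit status only)
  dig +short "$domain" A | grep -Fxq "$expected_ip"
}
```

Usage: `resolves_to yourdomain.com <worker-ip> && echo "DNS OK" || echo "DNS not pointing at worker yet"`.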
Problem: SSL Certificate Not Obtained
Symptoms: HTTPS not working, certificate errors
Check:
# Verify DNS points to correct server
dig yourdomain.com +short
# Verify port 80 accessible (Let's Encrypt challenge)
curl http://yourdomain.com
# Check Caddy logs
docker service logs service-name --tail 100 | grep -i certificate
# Common issues:
# - DNS not pointing to server
# - Port 80 blocked by firewall
# - Rate limited (Let's Encrypt allows 5 certificates per exact set of hostnames per week)
Problem: Services Can't Communicate
Symptoms: Backend can't reach database
Check:
# Verify both services on same network
docker service inspect backend --format '{{.Spec.TaskTemplate.Networks}}'
docker service inspect database --format '{{.Spec.TaskTemplate.Networks}}'
# Test DNS resolution from container
docker exec <container> nslookup database-hostname
# Verify firewall allows internal traffic
sudo ufw status | grep 10.116.0.0/16
Getting Help
Documentation Resources
Within this repository:
- This directory (setup/): Initial deployment guides
- ../operations/: Day-to-day operational procedures
- ../reference/: Architecture diagrams, capacity planning
- ../automation/: Scripts for common tasks
External resources:
- Docker Swarm: https://docs.docker.com/engine/swarm/
- Cassandra: https://cassandra.apache.org/doc/latest/
- DigitalOcean: https://docs.digitalocean.com/
- Caddy: https://caddyserver.com/docs/
Common Questions
Q: Can I use a different cloud provider (AWS, GCP, Azure)?
A: Yes, but you'll need to adapt the networking and object storage sections. The Docker Swarm and application deployment sections remain the same.
Q: Can I deploy with fewer nodes?
A: Minimum viable: 3 nodes (1 manager + 2 workers). Run Cassandra in single-node mode (not recommended for production) and colocate services on the same workers.
Q: How do I add a new application (e.g., MapleFile)?
A: Follow 00-multi-app-architecture.md. Add 2 workers (backend + frontend), deploy new stacks. Reuse existing databases.
Q: What if I only have one domain?
A: Use subdomains: api.yourdomain.com (backend), app.yourdomain.com (frontend). Update DNS and Caddyfiles accordingly.
Security Best Practices
Implemented by these guides:
- ✅ Firewall configured (UFW) on all nodes
- ✅ SSH key-based authentication (no passwords)
- ✅ Docker secrets for sensitive values
- ✅ Network segmentation (private vs public)
- ✅ Automatic HTTPS with Let's Encrypt
- ✅ Security headers configured in Caddy
- ✅ Database authentication (Redis password, Meilisearch API key)
- ✅ Private Docker registry authentication
Additional recommendations:
- Rotate secrets quarterly (see ../operations/07_security_operations.md)
- Enable 2FA on DigitalOcean account
- Regular security updates (Ubuntu unattended-upgrades)
- Monitor for unauthorized access attempts
- Backup encryption (GPG for backup files)
Maintenance Schedule
After deployment, establish these routines:
Daily:
- Check service health (docker service ls)
- Review monitoring dashboards
- Check backup completion logs
Weekly:
- Review security logs
- Check disk space across all nodes
- Verify SSL certificate expiry dates
Monthly:
- Apply security updates (apt update && apt upgrade)
- Review capacity and performance metrics
- Test backup restore procedures
- Rotate non-critical secrets
Quarterly:
- Full disaster recovery drill
- Review and update documentation
- Capacity planning review
- Security audit
What's Next?
After completing setup:
- Configure Operations (../operations/)
  - Set up monitoring and alerting
  - Configure automated backups
  - Review incident response runbooks
- Optimize Performance
  - Tune database settings
  - Configure caching strategies
  - Load test your infrastructure
- Add Redundancy
  - Scale critical services
  - Set up failover procedures
  - Implement health checks
- Automate
  - CI/CD pipeline for deployments
  - Automated testing
  - Infrastructure as Code (Terraform)
Last Updated: January 2025
Maintained By: Infrastructure Team
Review Frequency: Quarterly
Feedback: Found an issue or have a suggestion? Open an issue on Codeberg or contact the infrastructure team.
Success! 🎉
If you've completed all guides in this directory, you now have:
- ✅ Production-ready infrastructure on DigitalOcean
- ✅ High-availability database cluster (Cassandra RF=3)
- ✅ Caching and search infrastructure (Redis, Meilisearch)
- ✅ Secure backend API with automatic HTTPS
- ✅ React frontend with automatic SSL
- ✅ Multi-application architecture ready to scale
- ✅ Network segmentation for security
- ✅ Docker Swarm orchestration
Welcome to production operations! 🚀
Now head to ../operations/ to learn how to run and maintain your infrastructure.