Initial commit: Open sourcing all of the Maple Open Technologies code.
commit 755d54a99d
2010 changed files with 448675 additions and 0 deletions
745
cloud/infrastructure/production/setup/README.md
Normal file
@@ -0,0 +1,745 @@
# Production Infrastructure Setup Guide

**Audience**: DevOps Engineers, Infrastructure Team, Junior Engineers
**Purpose**: Complete step-by-step deployment of Maple Open Technologies production infrastructure from scratch
**Time to Complete**: 6-8 hours (first-time deployment)
**Prerequisites**: DigitalOcean account, basic Linux knowledge, SSH access

---

## Overview

This directory contains comprehensive guides for deploying Maple Open Technologies production infrastructure on DigitalOcean from a **completely fresh start**. Follow these guides in sequential order to build a complete, production-ready infrastructure.

**What you'll build:**
- Docker Swarm cluster (7+ nodes)
- High-availability databases (Cassandra 3-node cluster)
- Caching layer (Redis)
- Search engine (Meilisearch)
- Backend API (Go application)
- Frontend (React SPA)
- Automatic HTTPS with SSL certificates
- Multi-application architecture (MaplePress, MapleFile)

**Infrastructure at completion:**
```
Internet (HTTPS)
├─ getmaplepress.ca → Backend API (worker-6)
└─ getmaplepress.com → Frontend (worker-7)
↓
Backend Services (maple-public-prod + maple-private-prod)
↓
Databases (maple-private-prod only)
├─ Cassandra: 3-node cluster (workers 2,3,4) - RF=3, QUORUM
├─ Redis: Single instance (worker-1/manager)
└─ Meilisearch: Single instance (worker-5)
↓
Object Storage: DigitalOcean Spaces (S3-compatible)
```
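
The `RF=3, QUORUM` note in the diagram comes down to simple arithmetic: a QUORUM read or write needs a majority of the replicas, `floor(RF / 2) + 1`. The sketch below works that out for this cluster; the function names are illustrative and not part of Cassandra or of these guides. With RF=3 the quorum is 2, which is why QUORUM requests can still succeed with one node down.

```bash
#!/usr/bin/env bash
# Sketch: quorum arithmetic for a replicated Cassandra keyspace.
# quorum_size and tolerated_failures are hypothetical helper names.

# Replicas that must respond for a QUORUM read or write: floor(RF / 2) + 1
quorum_size() {
  local rf=$1
  echo $(( rf / 2 + 1 ))
}

# Replicas that can be down while QUORUM operations still succeed
tolerated_failures() {
  local rf=$1
  echo $(( rf - (rf / 2 + 1) ))
}

echo "RF=3: quorum=$(quorum_size 3), tolerated failures=$(tolerated_failures 3)"
```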
---

## Setup Guides (In Order)

### Phase 0: Planning & Prerequisites (30 minutes)

**[00-getting-started.md](00-getting-started.md)** - Local workspace setup
- DigitalOcean account setup
- API token configuration
- SSH key generation
- `.env` file initialization
- Command-line tools verification

**[00-network-architecture.md](00-network-architecture.md)** - Network design
- Network segmentation strategy (`maple-private-prod` vs `maple-public-prod`)
- Security principles (defense in depth)
- Service communication patterns
- Firewall rules overview

**[00-multi-app-architecture.md](00-multi-app-architecture.md)** - Multi-app strategy
- Naming conventions for services, stacks, hostnames
- Shared infrastructure design (Cassandra/Redis/Meilisearch)
- Application isolation patterns
- Scaling to multiple apps (MaplePress, MapleFile)

**Prerequisites checklist:**
- [ ] DigitalOcean account with billing enabled
- [ ] DigitalOcean API token (read + write permissions)
- [ ] SSH key pair generated (`~/.ssh/id_rsa.pub`)
- [ ] Domain names registered (e.g., `getmaplepress.ca`, `getmaplepress.com`)
- [ ] Local machine: git, ssh, curl installed
- [ ] `.env` file created from `.env.template`

**Total time: 30 minutes**

---
### Phase 1: Infrastructure Foundation (3-4 hours)

**[01_init_docker_swarm.md](01_init_docker_swarm.md)** - Docker Swarm cluster
- Create 7+ DigitalOcean droplets (Ubuntu 24.04)
- Install Docker on all nodes
- Initialize Docker Swarm (1 manager, 6+ workers)
- Configure private networking (VPC)
- Set up firewall rules
- Verify cluster connectivity

**What you'll have:**
- Manager node (worker-1): Swarm orchestration
- Worker nodes (2-7+): Application/database hosts
- Private network: 10.116.0.0/16
- All nodes communicating securely

**Total time: 1-1.5 hours**

---

**[02_cassandra.md](02_cassandra.md)** - Cassandra database cluster
- Deploy 3-node Cassandra cluster (workers 2, 3, 4)
- Configure replication (RF=3, QUORUM consistency)
- Create keyspace and initial schema
- Verify cluster health (`nodetool status`)
- Performance tuning for production

**What you'll have:**
- Highly available database cluster
- Automatic failover (survives 1 node failure)
- QUORUM reads/writes for consistency
- Ready for application data

**Total time: 1-1.5 hours**

---

**[03_redis.md](03_redis.md)** - Redis cache server
- Deploy Redis on manager node (worker-1)
- Configure persistence (RDB + AOF)
- Set up password authentication
- Test connectivity from other services

**What you'll have:**
- High-performance caching layer
- Session storage
- Rate limiting storage
- Persistent cache (survives restarts)

**Total time: 30 minutes**

---

**[04_meilisearch.md](04_meilisearch.md)** - Search engine
- Deploy Meilisearch on worker-5
- Configure API key authentication
- Create initial indexes
- Test search functionality

**What you'll have:**
- Fast full-text search engine
- Typo-tolerant search
- Faceted filtering
- Ready for content indexing

**Total time: 30 minutes**

---

**[04.5_spaces.md](04.5_spaces.md)** - Object storage
- Create DigitalOcean Spaces bucket
- Configure access keys
- Set up CORS policies
- Create Docker secrets for Spaces credentials
- Test upload/download

**What you'll have:**
- S3-compatible object storage
- Secure credential management
- Ready for file uploads
- CDN-backed storage

**Total time: 30 minutes**

---
### Phase 2: Application Deployment (2-3 hours)

**[05_maplepress_backend.md](05_maplepress_backend.md)** - Backend API deployment (Part 1)
- Create worker-6 droplet
- Join worker-6 to Docker Swarm
- Configure DNS (point domain to worker-6)
- Authenticate with DigitalOcean Container Registry
- Create Docker secrets (JWT, encryption keys)
- Deploy backend service (Go application)
- Connect to databases (Cassandra, Redis, Meilisearch)
- Verify health checks

**What you'll have:**
- Backend API running on worker-6
- Connected to all databases
- Docker secrets configured
- Health checks passing
- Ready for reverse proxy

**Total time: 1-1.5 hours**

---

**[06_maplepress_caddy.md](06_maplepress_caddy.md)** - Backend reverse proxy (Part 2)
- Configure Caddy reverse proxy
- Set up automatic SSL/TLS (Let's Encrypt)
- Configure security headers
- Enable HTTP to HTTPS redirect
- Preserve CORS headers for frontend
- Test SSL certificate acquisition

**What you'll have:**
- Backend accessible at `https://getmaplepress.ca`
- Automatic SSL certificate management
- Zero-downtime certificate renewals
- Security headers configured
- CORS configured for frontend

**Total time: 30 minutes**

---

**[07_maplepress_frontend.md](07_maplepress_frontend.md)** - Frontend deployment
- Create worker-7 droplet
- Join worker-7 to Docker Swarm
- Install Node.js on worker-7
- Clone repository and build React app
- Configure production environment (API URL)
- Deploy Caddy for static file serving
- Configure SPA routing
- Set up automatic SSL for frontend domain

**What you'll have:**
- Frontend accessible at `https://getmaplepress.com`
- React app built with production API URL
- Automatic HTTPS
- SPA routing working
- Static asset caching
- Complete end-to-end application

**Total time: 1 hour**

---

### Phase 3: Optional Enhancements (1 hour)

**[99_extra.md](99_extra.md)** - Extra operations
- Domain changes (backend and/or frontend)
- Horizontal scaling (multiple backend replicas)
- SSL certificate management
- Load balancing verification

**Total time: As needed**

---
## Quick Start (Experienced Engineers)

**If you're familiar with Docker Swarm and don't need detailed explanations:**

```bash
# 1. Prerequisites (5 min)
cd cloud/infrastructure/production
cp .env.template .env
vi .env  # Add DIGITALOCEAN_TOKEN
source .env

# 2. Infrastructure (1 hour)
# Follow 01_init_docker_swarm.md - create 7 droplets, init swarm
# SSH to manager, run quick verification

# 3. Databases (1 hour)
# Deploy Cassandra (02), Redis (03), Meilisearch (04), Spaces (04.5)
# Verify all services: docker service ls

# 4. Applications (1 hour)
# Deploy backend (05), backend-caddy (06), frontend (07)
# Test: curl https://getmaplepress.ca/health
#       curl https://getmaplepress.com

# 5. Verify (15 min)
docker service ls  # All services 1/1
docker node ls     # All nodes Ready
# Test in browser: https://getmaplepress.com
```

**Total time for experienced: ~3 hours**

---
## Directory Structure

```
setup/
├── README.md                       # This file
│
├── 00-getting-started.md           # Prerequisites & workspace setup
├── 00-network-architecture.md      # Network design principles
├── 00-multi-app-architecture.md    # Multi-app naming & strategy
│
├── 01_init_docker_swarm.md         # Docker Swarm cluster
├── 02_cassandra.md                 # Cassandra database cluster
├── 03_redis.md                     # Redis cache server
├── 04_meilisearch.md               # Meilisearch search engine
├── 04.5_spaces.md                  # DigitalOcean Spaces (object storage)
│
├── 05_maplepress_backend.md        # Backend API deployment
├── 06_maplepress_caddy.md          # Backend reverse proxy (Caddy + SSL)
├── 07_maplepress_frontend.md       # Frontend deployment (React + Caddy)
│
├── 99_extra.md                     # Domain changes, scaling, extras
│
└── templates/                      # Configuration templates
    ├── cassandra-stack.yml.template
    ├── redis-stack.yml.template
    ├── backend-stack.yml.template
    └── Caddyfile.template
```

---
## Infrastructure Specifications

### Hardware Requirements

| Component | Droplet Size | vCPUs | RAM | Disk | Monthly Cost |
|-----------|--------------|-------|-----|------|--------------|
| Manager (worker-1) + Redis | Basic | 2 | 2 GB | 50 GB | $18 |
| Cassandra Node 1 (worker-2) | General Purpose | 2 | 4 GB | 80 GB | $48 |
| Cassandra Node 2 (worker-3) | General Purpose | 2 | 4 GB | 80 GB | $48 |
| Cassandra Node 3 (worker-4) | General Purpose | 2 | 4 GB | 80 GB | $48 |
| Meilisearch (worker-5) | Basic | 2 | 2 GB | 50 GB | $18 |
| Backend (worker-6) | Basic | 2 | 2 GB | 50 GB | $18 |
| Frontend (worker-7) | Basic | 1 | 1 GB | 25 GB | $6 |
| **Total** | - | **13** | **19 GB** | **415 GB** | **~$204/mo** |

**Additional costs:**
- DigitalOcean Spaces: $5/mo (250 GB storage + 1 TB transfer)
- Bandwidth: Included (1 TB per droplet)
- Backups (optional): +20% of droplet cost

**Total estimated: ~$210-250/month**

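
As a sanity check, the monthly estimate can be reproduced from the table with shell arithmetic. This is a rough sketch: the prices are copied from the table above, backups are assumed to be enabled on every droplet, and integer division rounds the backup cost down to whole dollars. The two ends of the `~$210-250` range correspond to running without and with backups.

```bash
#!/usr/bin/env bash
# Sketch: reproduce the monthly cost estimate (all values in USD).

# Monthly droplet prices, in table order (worker-1 through worker-7)
droplet_costs=(18 48 48 48 18 18 6)
spaces_cost=5

droplet_total=0
for cost in "${droplet_costs[@]}"; do
  droplet_total=$(( droplet_total + cost ))
done

# Optional backups add 20% of the droplet cost (rounded down)
backup_cost=$(( droplet_total * 20 / 100 ))

base_total=$(( droplet_total + spaces_cost ))
with_backups=$(( base_total + backup_cost ))

echo "Droplets:          \$${droplet_total}/mo"
echo "Droplets + Spaces: \$${base_total}/mo"
echo "With backups:      \$${with_backups}/mo"
```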
### Software Versions

| Software | Version | Notes |
|----------|---------|-------|
| Ubuntu | 24.04 LTS | Base OS |
| Docker | 27.x+ | Container runtime |
| Docker Swarm | Built-in | Orchestration |
| Cassandra | 4.1.x | Database |
| Redis | 7.x-alpine | Cache |
| Meilisearch | v1.5+ | Search |
| Caddy | 2-alpine | Reverse proxy |
| Go | 1.21+ | Backend runtime |
| Node.js | 20 LTS | Frontend build |

---
## Key Concepts

### Docker Swarm Architecture

**Manager node (worker-1):**
- Orchestrates all services
- Schedules tasks to workers
- Maintains cluster state
- Runs Redis (co-located)

**Worker nodes (2-7+):**
- Execute service tasks (containers)
- Report health to manager
- Isolated workloads via labels

**Node labels:**
- `backend=true`: Backend deployment target (worker-6)
- `maplepress-frontend=true`: Frontend target (worker-7)

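
Node labels are set on the manager with `docker node update --label-add`, and a service is then pinned to labeled nodes with a placement constraint such as `node.labels.backend == true`. The helper below is a minimal sketch, assuming the node hostnames match the worker names used here (`worker-6`, `worker-7`). `label_node`, `show_node_labels`, and the `DOCKER` override are illustrative names; the individual guides remain the authority on the exact labels.

```bash
#!/usr/bin/env bash
# Sketch: label swarm nodes so services can target them.
# DOCKER can be overridden (for example DOCKER=echo) to preview commands.
DOCKER="${DOCKER:-docker}"

# Attach a key=value label to a swarm node (run on the manager)
label_node() {
  local node=$1 label=$2
  "$DOCKER" node update --label-add "$label" "$node"
}

# Show the labels currently set on a node
show_node_labels() {
  local node=$1
  "$DOCKER" node inspect "$node" --format '{{ .Spec.Labels }}'
}

# Example (hostnames are assumptions):
#   label_node worker-6 backend=true
#   label_node worker-7 maplepress-frontend=true
#   show_node_labels worker-6
```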
### Network Architecture

**`maple-private-prod` (overlay network):**
- All databases (Cassandra, Redis, Meilisearch)
- Backend services (access to databases)
- **No internet access** (security)
- Internal-only communication

**`maple-public-prod` (overlay network):**
- Caddy reverse proxies
- Backend services (receive HTTP requests)
- Ports 80/443 exposed to internet

**Backends join BOTH networks:**
- Receive requests from Caddy (public network)
- Access databases (private network)

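
One way to get this split is to create both overlay networks on the manager before deploying any stack. The sketch below is an assumption about how that could look, not a copy of the guides: it uses Docker's `--internal` flag to keep the private network isolated from external traffic, and `create_overlay` and the `DOCKER` override are illustrative names. See `00-network-architecture.md` for the design the guides actually follow.

```bash
#!/usr/bin/env bash
# Sketch: create the two overlay networks described above.
# DOCKER can be overridden (for example DOCKER=echo) to preview commands.
DOCKER="${DOCKER:-docker}"

# Create an overlay network; pass "internal" as the second argument to
# restrict external access to the network
create_overlay() {
  local name=$1 mode=${2:-}
  if [ "$mode" = "internal" ]; then
    "$DOCKER" network create --driver overlay --internal "$name"
  else
    "$DOCKER" network create --driver overlay "$name"
  fi
}

# Example (run on the manager):
#   create_overlay maple-private-prod internal
#   create_overlay maple-public-prod
```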
### Multi-Application Pattern

**Shared infrastructure (workers 1-5):**
- Cassandra, Redis, Meilisearch serve ALL apps
- Cost-efficient (1 infrastructure for unlimited apps)

**Per-application deployment (workers 6+):**
- Each app gets dedicated workers
- Independent scaling and deployment
- Clear isolation

**Example: Adding MapleFile**
- Worker-8: `maplefile_backend` + `maplefile_backend-caddy`
- Worker-9: `maplefile-frontend_caddy`
- Uses same Cassandra/Redis/Meilisearch
- No changes to infrastructure

---
## Common Commands Reference

### Swarm Management

```bash
# List all nodes
docker node ls

# List all services
docker service ls

# View service logs
docker service logs -f maplepress_backend

# Scale service
docker service scale maplepress_backend=3

# Update service (rolling restart)
docker service update --force maplepress_backend

# Remove service
docker service rm maplepress_backend
```

### Stack Management

```bash
# Deploy stack
docker stack deploy -c stack.yml stack-name

# List stacks
docker stack ls

# View stack services
docker stack services maplepress

# Remove stack
docker stack rm maplepress
```

### Troubleshooting

```bash
# Check service status
docker service ps maplepress_backend

# View container logs
docker logs <container-id>

# Inspect service
docker service inspect maplepress_backend

# Check network
docker network inspect maple-private-prod

# List configs
docker config ls

# List secrets
docker secret ls
```

---
## Deployment Checklist

**Use this checklist to track your progress:**

### Phase 0: Prerequisites
- [ ] DigitalOcean account created
- [ ] API token generated and saved
- [ ] SSH keys generated (`ssh-keygen`)
- [ ] SSH key added to DigitalOcean
- [ ] Domain names registered
- [ ] `.env` file created from template
- [ ] `.env` file has correct permissions (600)
- [ ] Git repository cloned locally

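
The `.env` permissions item can be checked mechanically. A minimal sketch, assuming GNU `stat` (as shipped on Ubuntu; the macOS `stat` uses different flags); `secure_env_file` is an illustrative name, not something the guides define.

```bash
#!/usr/bin/env bash
# Sketch: restrict a secrets file to its owner and confirm the result.

# Sets mode 600 on the file and succeeds only if the mode was applied.
# Relies on GNU stat's -c option to read the octal mode back.
secure_env_file() {
  local file=$1
  chmod 600 "$file" || return 1
  [ "$(stat -c '%a' "$file")" = "600" ]
}

# Example:
#   secure_env_file .env && echo ".env is readable by its owner only"
```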
### Phase 1: Infrastructure
- [ ] 7 droplets created (workers 1-7)
- [ ] Docker Swarm initialized
- [ ] All workers joined swarm
- [ ] Private networking configured (VPC)
- [ ] Firewall rules configured on all nodes
- [ ] Cassandra 3-node cluster deployed
- [ ] Cassandra cluster healthy (`nodetool status`)
- [ ] Redis deployed on manager
- [ ] Redis authentication configured
- [ ] Meilisearch deployed on worker-5
- [ ] Meilisearch API key configured
- [ ] DigitalOcean Spaces bucket created
- [ ] Spaces access keys stored as Docker secrets

### Phase 2: Applications
- [ ] Worker-6 created and joined swarm
- [ ] Worker-6 labeled for backend
- [ ] DNS pointing backend domain to worker-6
- [ ] Backend Docker secrets created (JWT, IP encryption)
- [ ] Backend service deployed
- [ ] Backend health check passing
- [ ] Backend Caddy deployed
- [ ] Backend SSL certificate obtained
- [ ] Backend accessible at `https://domain.ca`
- [ ] Worker-7 created and joined swarm
- [ ] Worker-7 labeled for frontend
- [ ] DNS pointing frontend domain to worker-7
- [ ] Node.js installed on worker-7
- [ ] Repository cloned on worker-7
- [ ] Frontend built with production API URL
- [ ] Frontend Caddy deployed
- [ ] Frontend SSL certificate obtained
- [ ] Frontend accessible at `https://domain.com`
- [ ] CORS working (frontend can call backend)

### Phase 3: Verification
- [ ] All services show 1/1 replicas (`docker service ls`)
- [ ] All nodes show Ready (`docker node ls`)
- [ ] Backend health endpoint returns 200
- [ ] Frontend loads in browser
- [ ] Frontend can call backend API (no CORS errors)
- [ ] SSL certificates valid (green padlock)
- [ ] HTTP redirects to HTTPS

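
The replica check in the verification list can be scripted by comparing running and desired replica counts. The sketch below assumes input in the form produced by `docker service ls --format '{{.Name}} {{.Replicas}}'`; `find_unhealthy_services` is an illustrative name, not part of Docker. It prints nothing when every service is fully replicated.

```bash
#!/usr/bin/env bash
# Sketch: list services whose running replica count differs from the
# desired count. Reads "name running/desired" lines on stdin.
find_unhealthy_services() {
  local name replicas rest running desired
  while read -r name replicas rest; do
    [ -z "$name" ] && continue
    running=${replicas%%/*}   # text before the slash
    desired=${replicas##*/}   # text after the slash
    if [ "$running" != "$desired" ]; then
      echo "$name $replicas"
    fi
  done
}

# Example (run on the manager):
#   docker service ls --format '{{.Name}} {{.Replicas}}' | find_unhealthy_services
```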
### Next Steps
- [ ] Set up monitoring (see `../operations/02_monitoring_alerting.md`)
- [ ] Configure backups (see `../operations/01_backup_recovery.md`)
- [ ] Review incident runbooks (see `../operations/03_incident_response.md`)

---
## Troubleshooting Guide

### Problem: Docker Swarm Join Fails

**Symptoms:** Worker can't join swarm, connection refused

**Check:**
```bash
# On manager, verify swarm is initialized
docker info | grep "Swarm: active"

# Verify firewall allows swarm ports
sudo ufw status | grep -E "2377|7946|4789"

# Get new join token
docker swarm join-token worker
```

### Problem: Service Won't Start

**Symptoms:** Service stuck at 0/1 replicas

**Check:**
```bash
# View service events
docker service ps service-name --no-trunc

# Common issues:
# - Image not found: Authenticate with registry
# - Network not found: Create network first
# - Secret not found: Create secrets
# - No suitable node: Check node labels
```

### Problem: DNS Not Resolving

**Symptoms:** Domain doesn't resolve to correct IP

**Check:**
```bash
# Test DNS resolution
dig yourdomain.com +short

# Should return worker IP
# If not, wait 5-60 minutes for propagation
# Or check DNS provider settings
```
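
The DNS check can be wrapped in a small function that compares `dig` output against the IP you expect. A sketch, assuming `dig` is installed and the domain uses an A record; `dns_points_to` is an illustrative name and `203.0.113.10` is a documentation placeholder address, not a real worker IP.

```bash
#!/usr/bin/env bash
# Sketch: succeed when the domain's A records include the expected address.
dns_points_to() {
  local domain=$1 expected_ip=$2
  # -F: fixed string, -x: match the whole line, -q: no output
  dig +short "$domain" A | grep -Fxq "$expected_ip"
}

# Example (placeholder address):
#   if dns_points_to getmaplepress.ca 203.0.113.10; then
#     echo "DNS OK"
#   else
#     echo "DNS not pointing at the worker yet"
#   fi
```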
### Problem: SSL Certificate Not Obtained

**Symptoms:** HTTPS not working, certificate errors

**Check:**
```bash
# Verify DNS points to correct server
dig yourdomain.com +short

# Verify port 80 accessible (Let's Encrypt challenge)
curl http://yourdomain.com

# Check Caddy logs
docker service logs service-name --tail 100 | grep -i certificate

# Common issues:
# - DNS not pointing to server
# - Port 80 blocked by firewall
# - Rate limited by Let's Encrypt (5 duplicate certificates per week for the same set of hostnames)
```

### Problem: Services Can't Communicate

**Symptoms:** Backend can't reach database

**Check:**
```bash
# Verify both services on same network
docker service inspect backend --format '{{.Spec.TaskTemplate.Networks}}'
docker service inspect database --format '{{.Spec.TaskTemplate.Networks}}'

# Test DNS resolution from container
docker exec <container> nslookup database-hostname

# Verify firewall allows internal traffic (-F matches the CIDR literally)
sudo ufw status | grep -F "10.116.0.0/16"
```

---
## Getting Help

### Documentation Resources

**Within this repository:**
- This directory (`setup/`): Initial deployment guides
- `../operations/`: Day-to-day operational procedures
- `../reference/`: Architecture diagrams, capacity planning
- `../automation/`: Scripts for common tasks

**External resources:**
- Docker Swarm: https://docs.docker.com/engine/swarm/
- Cassandra: https://cassandra.apache.org/doc/latest/
- DigitalOcean: https://docs.digitalocean.com/
- Caddy: https://caddyserver.com/docs/

### Common Questions

**Q: Can I use a different cloud provider (AWS, GCP, Azure)?**
A: Yes, but you'll need to adapt the networking and object storage sections. The Docker Swarm and application deployment sections remain the same.

**Q: Can I deploy with fewer nodes?**
A: Minimum viable: 3 nodes (1 manager + 2 workers). Run Cassandra in single-node mode (not recommended for production) and co-locate services on the same workers.

**Q: How do I add a new application (e.g., MapleFile)?**
A: Follow `00-multi-app-architecture.md`. Add 2 workers (backend + frontend) and deploy the new stacks. Reuse the existing databases.

**Q: What if I only have one domain?**
A: Use subdomains: `api.yourdomain.com` (backend), `app.yourdomain.com` (frontend). Update DNS and Caddyfiles accordingly.

---
## Security Best Practices

**Implemented by these guides:**
- ✅ Firewall configured (UFW) on all nodes
- ✅ SSH key-based authentication (no passwords)
- ✅ Docker secrets for sensitive values
- ✅ Network segmentation (private vs public)
- ✅ Automatic HTTPS with Let's Encrypt
- ✅ Security headers configured in Caddy
- ✅ Database authentication (Redis password, Meilisearch API key)
- ✅ Private Docker registry authentication

**Additional recommendations:**
- Rotate secrets quarterly (see `../operations/07_security_operations.md`)
- Enable 2FA on DigitalOcean account
- Regular security updates (Ubuntu unattended-upgrades)
- Monitor for unauthorized access attempts
- Backup encryption (GPG for backup files)

---
## Maintenance Schedule

**After deployment, establish these routines:**

**Daily:**
- Check service health (`docker service ls`)
- Review monitoring dashboards
- Check backup completion logs

**Weekly:**
- Review security logs
- Check disk space across all nodes
- Verify SSL certificate expiry dates

**Monthly:**
- Apply security updates (`apt update && apt upgrade`)
- Review capacity and performance metrics
- Test backup restore procedures
- Rotate non-critical secrets

**Quarterly:**
- Full disaster recovery drill
- Review and update documentation
- Capacity planning review
- Security audit

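
The weekly disk-space check lends itself to a small script. A minimal sketch, assuming it runs on each node (for example over SSH or from cron); the function names and the 80% threshold are illustrative choices, not values taken from these guides.

```bash
#!/usr/bin/env bash
# Sketch: warn when a filesystem is nearly full.

# Print the usage percentage (without the % sign) of the filesystem that
# holds the given path. df -P produces stable POSIX-format columns, where
# the fifth column is the capacity.
disk_usage_pct() {
  local path=${1:-/}
  df -P "$path" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed when usage is at or above the threshold (default 80)
disk_over_threshold() {
  local path=${1:-/} threshold=${2:-80}
  [ "$(disk_usage_pct "$path")" -ge "$threshold" ]
}

# Example:
#   disk_over_threshold / 80 && echo "WARNING: / is at $(disk_usage_pct /)%"
```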
---

## What's Next?

**After completing setup:**

1. **Configure Operations** (`../operations/`)
   - Set up monitoring and alerting
   - Configure automated backups
   - Review incident response runbooks

2. **Optimize Performance**
   - Tune database settings
   - Configure caching strategies
   - Load test your infrastructure

3. **Add Redundancy**
   - Scale critical services
   - Set up failover procedures
   - Implement health checks

4. **Automate**
   - CI/CD pipeline for deployments
   - Automated testing
   - Infrastructure as Code (Terraform)

---
**Last Updated**: January 2025
**Maintained By**: Infrastructure Team
**Review Frequency**: Quarterly

**Feedback**: Found an issue or have a suggestion? Open an issue on Codeberg or contact the infrastructure team.

---
## Success! 🎉

If you've completed all guides in this directory, you now have:

✅ Production-ready infrastructure on DigitalOcean
✅ High-availability database cluster (Cassandra RF=3)
✅ Caching and search infrastructure (Redis, Meilisearch)
✅ Secure backend API with automatic HTTPS
✅ React frontend with automatic SSL
✅ Multi-application architecture ready to scale
✅ Network segmentation for security
✅ Docker Swarm orchestration

**Welcome to production operations!** 🚀

Now head to `../operations/` to learn how to run and maintain your infrastructure.