Initial commit: Open sourcing all of the Maple Open Technologies code.

Bartlomiej Mika 2025-12-02 14:33:08 -05:00
commit 755d54a99d
2010 changed files with 448675 additions and 0 deletions


@@ -0,0 +1,101 @@
# 🏗️ Infrastructure
Infrastructure setup for running and deploying Maple Open Technologies software (the MaplePress backend, MapleFile backend, etc.), organized for both development and production environments.
---
## 📂 Directory Structure
```
infrastructure/
├── development/              # Local development infrastructure
│   ├── docker-compose.dev.yml
│   ├── Taskfile.yml
│   └── README.md             # Development setup instructions
└── production/               # Production deployment infrastructure
    ├── docker-compose.yml
    ├── .env.sample
    ├── README.md             # Production deployment guide
    ├── nginx/                # Reverse proxy configuration
    ├── monitoring/           # Prometheus + Grafana
    ├── backup/                # Backup automation
    └── scripts/              # Deployment automation
```
---
## 🚀 Quick Start
### For Local Development
If you're a **contributor** or want to **run the project locally**:
👉 **Go to [`development/README.md`](./development/README.md)**
This gives you:
- Local Cassandra cluster (3 nodes)
- Redis cache
- Meilisearch for search
- SeaweedFS for object storage
- WordPress for plugin testing
- All pre-configured for local development
**Quick start:**
```bash
cd development
task dev:start
```
### For Production Deployment
If you're **self-hosting** or **deploying to production**:
👉 **Go to [`production/README.md`](./production/README.md)**
This provides:
- Production-ready Docker Compose setup
- SSL/TLS configuration with Let's Encrypt
- Nginx reverse proxy
- Monitoring with Prometheus + Grafana
- Automated backups
- Security hardening
- Deployment automation
**⚠️ Note:** Production setup requires:
- A server (VPS, cloud instance, or dedicated server)
- A domain name with DNS configured
- Basic Linux administration knowledge
---
## 🎯 Which One Should I Use?
| Scenario | Use This | Location |
|----------|----------|----------|
| Contributing to the project | **Development** | [`development/`](./development/) |
| Running locally for testing | **Development** | [`development/`](./development/) |
| Learning the architecture | **Development** | [`development/`](./development/) |
| Self-hosting for personal use | **Production** | [`production/`](./production/) |
| Deploying for others to use | **Production** | [`production/`](./production/) |
| Running a SaaS business | **Production** | [`production/`](./production/) |
---
## 📚 Documentation
- **Development Setup:** [`development/README.md`](./development/README.md)
- **Production Deployment:** [`production/README.md`](./production/README.md)
- **Architecture Overview:** [`../../CLAUDE.md`](../../CLAUDE.md)
---
## 🤝 Contributing
Found a bug? Want to improve the infrastructure? Please create an [issue](https://codeberg.org/mapleopentech/monorepo/issues/new).
---
## 📝 License
This infrastructure is licensed under the [**GNU Affero General Public License v3.0**](https://opensource.org/license/agpl-v3). See [LICENSE](../../LICENSE) for more information.


@@ -0,0 +1,387 @@
# 🏗️ MapleFile (Development) Infrastructure
> Shared development infrastructure for all MapleFile projects. Start once, use everywhere.
## 📖 What is this?
Think of this as your **local cloud environment**. Instead of each MapleFile project (maplefile-backend, maplepress-backend, etc.) running its own database, cache, and storage, they all share this common infrastructure - just like production apps share AWS/cloud services.
**What you get:**
- Database (Cassandra) - stores your data
- Cache (Redis) - makes things fast
- Search (Meilisearch) - powers search features
- File Storage (SeaweedFS) - stores uploaded files
- WordPress (for plugin testing)
**Why shared?**
- Start infrastructure once, restart your apps quickly (seconds vs minutes)
- Closer to real production setup
- Learn proper microservices architecture
**No environment variables needed here** - this project is already configured for local development. The apps that connect to it will have their own `.env` files.
## ⚡ TL;DR
```bash
task dev:start # Start everything (takes 2-3 minutes first time)
task dev:status # Verify all services show "healthy"
```
**Then:** Navigate to a backend project (`../maplepress-backend/` or `../maplefile-backend/`) and follow its README to set up and start the backend. See [What's Next?](#whats-next) section below.
## 📋 Prerequisites
You need these tools installed before starting. Don't worry - they're free and easy to install.
### 1. Docker Desktop
**What is Docker?** A tool that runs software in isolated containers. Think of it as lightweight virtual machines that start instantly.
**Download & Install:**
- **macOS:** [Docker Desktop for Mac](https://www.docker.com/products/docker-desktop/) (includes docker-compose)
- **Windows:** [Docker Desktop for Windows](https://www.docker.com/products/docker-desktop/)
- **Linux:** Follow instructions at [docs.docker.com/engine/install](https://docs.docker.com/engine/install/)
**Verify installation:**
```bash
docker --version # Should show: Docker version 20.x or higher
docker compose version # Should show: Docker Compose version 2.x or higher
```
**What is Docker Compose?** A tool for running multiple Docker containers together. It's **included with Docker Desktop** - you don't need to install it separately! When you install Docker Desktop, you automatically get Docker Compose.
**Note on Docker Compose versions:**
- **Docker Compose v1** (older): Uses `docker-compose` command (hyphen)
- **Docker Compose v2** (current): Uses `docker compose` command (space)
- Our Taskfile **automatically detects** which version you have and uses the correct command
- If you're on Linux with Docker Compose v2, use `docker compose version` (not `docker-compose --version`)
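If you want the same fallback behaviour in your own shell scripts, a minimal sketch (independent of the Taskfile, purely illustrative) looks like this:

```shell
# Sketch: define a `dc` wrapper that prefers Compose v2 ("docker compose")
# but falls back to the legacy v1 binary ("docker-compose").
if docker compose version >/dev/null 2>&1; then
  dc() { docker compose "$@"; }
else
  dc() { docker-compose "$@"; }
fi

# Now `dc up -d`, `dc logs -f`, etc. work regardless of Compose version.
```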
### 2. Task (Task Runner)
**What is Task?** A simple command runner (like `make` but better). We use it instead of typing long docker commands.
**Install:**
- **macOS:** `brew install go-task`
- **Windows:** `choco install go-task` (using [Chocolatey](https://chocolatey.org/))
- **Linux:** `snap install task --classic`
- **Manual install:** Download from [taskfile.dev](https://taskfile.dev/installation/)
**Verify installation:**
```bash
task --version # Should show: Task version 3.x or higher
```
### 3. All other services (Cassandra, Redis, etc.)
**Do I need to install them?** **NO!** Docker will automatically download and run everything. You don't install Cassandra, Redis, or any database directly on your computer.
**What happens when you run `task dev:start`:**
1. Docker downloads required images (first time only - takes a few minutes)
2. Starts all services in containers
3. That's it - everything is ready to use!
## ❓ Common Questions
**Q: Do I need to configure environment variables or create a `.env` file?**
A: **No!** This infrastructure project is pre-configured for local development. However, the application projects that connect to it (like `maplefile-backend`) will need their own `.env` files - check their READMEs.
**Q: Do I need to install Cassandra, Redis, or other databases?**
A: **No!** Docker handles everything. You only install Docker and Task, nothing else.
**Q: Will this mess up my computer or conflict with other projects?**
A: **No!** Everything runs in isolated Docker containers. You can safely remove it all with `task dev:clean` and `docker system prune`.
**Q: How much disk space does this use?**
A: Initial download: ~2-3 GB. Running services + data: ~5-10 GB depending on usage.
**Q: Can I use this on Windows?**
A: **Yes!** Docker Desktop works on Windows. Just make sure to use PowerShell or Git Bash for commands.
**Q: What is Docker Compose? Do I need to install it separately?**
A: **No!** Docker Compose is included with Docker Desktop automatically. When you install Docker Desktop, you get both `docker` and `docker compose` commands.
**Q: I'm getting "docker-compose: command not found" on Linux. What should I do?**
A: You likely have Docker Compose v2, which uses `docker compose` (space) instead of `docker-compose` (hyphen). Our Taskfile automatically detects and uses the correct command. Just run `task dev:start` and it will work on both Mac and Linux.
## 🚀 Quick Start
### 1. Start Infrastructure
```bash
task dev:start
```
Wait for: `✅ Infrastructure ready!`
### 2. Verify Everything Works
```bash
task dev:status
```
**Expected output:** All services show `Up X minutes (healthy)`
```
NAMES STATUS PORTS
maple-cassandra-1-dev Up 2 minutes (healthy) 0.0.0.0:9042->9042/tcp
maple-redis-dev Up 2 minutes (healthy) 0.0.0.0:6379->6379/tcp
maple-wordpress-dev Up 2 minutes (healthy) 0.0.0.0:8081->80/tcp
...
```
### 3. Start Your App
Now navigate to your app directory (e.g., `maplefile-backend`) and run its `task dev` command. Your app will automatically connect to this infrastructure.
### 4. Stop Infrastructure (End of Day)
```bash
task dev:stop # Stops services, keeps data
```
## 🎯 What's Next?
🎉 **Infrastructure is running!** Now set up a backend:
- **MaplePress Backend:** [`../maplepress-backend/README.md`](../maplepress-backend/README.md)
- **MapleFile Backend:** [`../maplefile-backend/README.md`](../maplefile-backend/README.md)
Pick one, navigate to its directory, and follow its setup instructions.
## 📅 Daily Commands
```bash
# Morning - start infrastructure
task dev:start
# Check if everything is running
task dev:status
# Evening - stop infrastructure (keeps data)
task dev:stop
# Nuclear option - delete everything and start fresh
task dev:clean # ⚠️ DELETES ALL DATA
```
## 🔍 Troubleshooting
### Service shows unhealthy or won't start
```bash
# View logs for a specific service (streams in real time)
task dev:logs -- cassandra-1
task dev:logs -- redis
task dev:logs -- wordpress
```
**Service names:** `cassandra-1`, `cassandra-2`, `cassandra-3`, `redis`, `meilisearch`, `seaweedfs`, `mariadb`, `wordpress`
### Port already in use
Another service is using the required ports. Check:
- Port 9042 (Cassandra)
- Port 6379 (Redis)
- Port 8081 (WordPress)
- Port 3306 (MariaDB)
Find and stop the conflicting service:
```bash
lsof -i :9042 # macOS/Linux
```
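If `lsof` isn't available, a quick sketch using bash's built-in `/dev/tcp` also works (the helper name is made up; the ports match the list above):

```shell
# Sketch: probe each dev port using bash's built-in /dev/tcp,
# so no extra tools (lsof, netstat) are needed.
check_port() {
  local port="$1"
  # The subshell opens (and automatically closes) a TCP connection.
  if (exec 3<>"/dev/tcp/127.0.0.1/${port}") 2>/dev/null; then
    echo "port ${port}: in use"
  else
    echo "port ${port}: free"
  fi
}

for port in 9042 6379 8081 3306; do
  check_port "$port"
done
```

A port reported "in use" before you start the infrastructure means something else is bound to it and must be stopped first.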
### Want to reset everything
```bash
task dev:clean # Removes all containers and data
task dev:start # Fresh start
```
## 🌐 What's Running?
When you start infrastructure, you get these services:
| Service | Port | Purpose | Access |
|---------|------|---------|--------|
| Cassandra Cluster | 9042 | Database (3-node cluster) | `task cql` |
| Redis | 6379 | Cache & sessions | `task redis` |
| Meilisearch | 7700 | Search engine | http://localhost:7700 |
| SeaweedFS | 8333, 9333 | S3-compatible storage | http://localhost:9333 |
| MariaDB | 3306 | WordPress database | - |
| WordPress | 8081 | Plugin testing | http://localhost:8081 |
## 🔧 Common Operations
### Working with Cassandra
```bash
# Open CQL shell
task cql
# List all keyspaces
task cql:keyspaces
# List tables in a keyspace
task cql:tables -- maplepress
# Check cluster health
task cql:status
```
**Available keyspaces:**
- `maplefile` - MapleFile backend (Redis DB: 1)
- `maplepress` - MaplePress backend (Redis DB: 0)
### Working with Redis
```bash
# Open Redis CLI
task redis
# Then inside Redis CLI:
# SELECT 0 # Switch to maplepress database
# SELECT 1 # Switch to maplefile database
# KEYS * # List all keys
```
### Working with WordPress
**Access:** http://localhost:8081
**First-time setup:**
1. Visit http://localhost:8081
2. Complete WordPress installation wizard
3. Use any credentials (this is a dev site)
**Credentials for WordPress database:**
- Host: `mariadb:3306`
- Database: `wordpress`
- User: `wordpress`
- Password: `wordpress`
**View debug logs:**
```bash
docker exec -it maple-wordpress-dev tail -f /var/www/html/wp-content/debug.log
```
### Working with SeaweedFS (S3 Storage)
**Web UI:** http://localhost:9333
**S3 Configuration for your apps:**
```bash
S3_ENDPOINT=http://seaweedfs:8333
S3_REGION=us-east-1
S3_ACCESS_KEY=any
S3_SECRET_KEY=any
```
## 💻 Development Workflow
**Typical daily flow:**
1. **Morning:** `task dev:start` (in this directory)
2. **Start app:** `cd ../maplefile-backend && task dev`
3. **Work on code** - restart app as needed (fast!)
4. **Infrastructure keeps running** - no need to restart
5. **Evening:** `task dev:stop` (optional - can leave running)
**Why this approach?**
- Infrastructure takes 2-3 minutes to start (Cassandra cluster is slow)
- Your app restarts in seconds
- Start infrastructure once, restart apps freely
## 💾 Data Persistence
All data is stored in Docker volumes and survives restarts:
- `maple-cassandra-1-dev`, `maple-cassandra-2-dev`, `maple-cassandra-3-dev`
- `maple-redis-dev`
- `maple-meilisearch-dev`
- `maple-seaweedfs-dev`
- `maple-mariadb-dev`
- `maple-wordpress-dev`
**To completely reset (deletes all data):**
```bash
task dev:clean
```
## 🎓 Advanced Topics
> **⚠️ SKIP THIS SECTION FOR INITIAL SETUP!**
>
> These topics are for **future use** - after you've successfully set up and used the infrastructure. You don't need to read or do anything here when setting up for the first time.
>
> Come back here only when you need to:
> - Add a new project to the infrastructure (not needed now - the maplefile and maplepress keyspaces are already configured)
> - Understand Cassandra cluster architecture (curiosity only)
> - Learn why we chose this approach (optional reading)
### Adding a New Project
**When do I need this?** Only if you're creating a brand new project (not maplefile-backend or maplepress-backend - those are already set up).
To add a new project to shared infrastructure:
1. Add keyspace to `cassandra/init-scripts/01-create-keyspaces.cql`:
```cql
CREATE KEYSPACE IF NOT EXISTS mynewproject
WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
```
2. Configure your project's `docker-compose.dev.yml`:
```yaml
networks:
maple-dev:
external: true
services:
app:
environment:
- DATABASE_HOSTS=cassandra-1:9042,cassandra-2:9042,cassandra-3:9042
- DATABASE_KEYSPACE=mynewproject
- DATABASE_CONSISTENCY=QUORUM
- DATABASE_REPLICATION=3
- REDIS_HOST=redis
- REDIS_DB=2 # Use next available: 0=maplepress, 1=maplefile
networks:
- maple-dev
```
3. Restart infrastructure:
```bash
task dev:restart
```
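The keyspace statement from step 1 can also be generated and applied from the shell. A sketch (the helper and project names are hypothetical; the replication settings match the existing keyspaces):

```shell
# Hypothetical helper: emit the CREATE KEYSPACE statement for a new
# project, using the same replication settings as maplefile/maplepress.
new_keyspace_cql() {
  printf "CREATE KEYSPACE IF NOT EXISTS %s WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};\n" "$1"
}

new_keyspace_cql mynewproject

# With the infrastructure running, pipe it into the cluster:
#   new_keyspace_cql mynewproject | docker exec -i maple-cassandra-1-dev cqlsh
```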
### Cassandra Cluster Details
- **3-node cluster** for high availability
- **Replication factor: 3** (data on all nodes)
- **Consistency level: QUORUM** (2 of 3 nodes must agree)
- **Seed nodes:** all three nodes are listed as seeds (`CASSANDRA_SEEDS=cassandra-1,cassandra-2,cassandra-3`), so a restarting node can rejoin via any peer
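The QUORUM number comes from a simple formula, floor(RF / 2) + 1, which you can sanity-check in the shell:

```shell
# QUORUM = floor(replication_factor / 2) + 1 nodes must acknowledge
# a read or write before it succeeds.
quorum() { echo $(( $1 / 2 + 1 )); }

echo "RF=3 -> QUORUM=$(quorum 3)"   # 2 of 3 nodes
echo "RF=5 -> QUORUM=$(quorum 5)"   # 3 of 5 nodes
```

With RF=3, the cluster keeps serving QUORUM reads and writes even when one node is down.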
### Architecture Decision: Why Separate Infrastructure?
**Benefits:**
- Faster app restarts (seconds vs minutes)
- Share infrastructure across multiple projects
- Closer to production architecture
- Learn proper service separation
**Trade-off:**
- One extra terminal/directory to manage
- Slightly more complex than monolithic docker-compose
We chose speed and realism over simplicity.
## Contributing
Found a bug? Want to improve the infrastructure? Please create an [issue](https://codeberg.org/mapleopentech/monorepo/issues/new).
## License
This application is licensed under the [**GNU Affero General Public License v3.0**](https://opensource.org/license/agpl-v3). See [LICENSE](../../LICENSE) for more information.


@@ -0,0 +1,168 @@
version: '3'
# Variables for Docker Compose command detection
vars:
  DOCKER_COMPOSE_CMD:
    sh: |
      if command -v docker-compose >/dev/null 2>&1; then
        echo "docker-compose"
      elif docker compose version >/dev/null 2>&1; then
        echo "docker compose"
      else
        echo "docker-compose"
      fi

tasks:
  dev:start:
    desc: Start all infrastructure services for development
    cmds:
      - "{{.DOCKER_COMPOSE_CMD}} -f docker-compose.dev.yml up -d"
      - echo "⏳ Waiting for services to be healthy..."
      - task: dev:wait
      - task: dev:init
      - echo ""
      - echo "✅ Infrastructure ready!"
      - echo ""
      - echo "📊 Running Services:"
      - docker ps --filter "name=maple-"

  dev:wait:
    desc: Wait for all services to be healthy
    silent: true
    cmds:
      - |
        for node in 1 2 3; do
          echo "Waiting for Cassandra Node $node..."
          for i in {1..30}; do
            if docker exec "maple-cassandra-$node-dev" cqlsh -e "describe cluster" >/dev/null 2>&1; then
              echo "✅ Cassandra Node $node is ready"
              break
            fi
            echo "  ... ($i/30)"
            sleep 2
          done
        done
      - |
        echo "Waiting for Redis..."
        for i in {1..10}; do
          if docker exec maple-redis-dev redis-cli ping >/dev/null 2>&1; then
            echo "✅ Redis is ready"
            break
          fi
          sleep 1
        done
      - |
        echo "Waiting for SeaweedFS..."
        for i in {1..10}; do
          if docker exec maple-seaweedfs-dev /usr/bin/wget -q --spider http://127.0.0.1:9333/cluster/status 2>/dev/null; then
            echo "✅ SeaweedFS is ready"
            break
          fi
          sleep 1
        done

  dev:init:
    desc: Initialize keyspaces and databases
    cmds:
      - |
        echo "📦 Initializing Cassandra keyspaces..."
        docker exec -i maple-cassandra-1-dev cqlsh < cassandra/init-scripts/01-create-keyspaces.cql
        echo "✅ Keyspaces initialized with replication_factor=3"

  dev:status:
    desc: Show status of all infrastructure services
    cmds:
      - |
        echo "📊 Infrastructure Status:"
        docker ps --filter "name=maple-" --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"

  dev:stop:
    desc: Stop all infrastructure services (keeps data)
    cmds:
      - "{{.DOCKER_COMPOSE_CMD}} -f docker-compose.dev.yml down"
      - echo "✅ Infrastructure stopped (data preserved)"

  dev:restart:
    desc: Restart all infrastructure services
    cmds:
      - task: dev:stop
      - task: dev:start

  dev:logs:
    desc: "View infrastructure logs (usage: task dev:logs -- cassandra-1)"
    cmds:
      - "{{.DOCKER_COMPOSE_CMD}} -f docker-compose.dev.yml logs -f {{.CLI_ARGS}}"

  dev:clean:
    desc: Stop services and remove all data (DESTRUCTIVE!)
    prompt: This will DELETE ALL DATA in Cassandra, Redis, Meilisearch, and SeaweedFS. Continue?
    cmds:
      - "{{.DOCKER_COMPOSE_CMD}} -f docker-compose.dev.yml down -v"
      - echo "✅ Infrastructure cleaned (all data removed)"

  dev:clean:keyspace:
    desc: "Drop and recreate a specific Cassandra keyspace (usage: task dev:clean:keyspace -- maplefile)"
    prompt: This will DELETE ALL DATA in the {{.CLI_ARGS}} keyspace. Continue?
    cmds:
      - |
        KEYSPACE={{.CLI_ARGS}}
        if [ -z "$KEYSPACE" ]; then
          echo "❌ Error: Please specify a keyspace name"
          echo "Usage: task dev:clean:keyspace -- maplefile"
          exit 1
        fi
        echo "🗑️ Dropping keyspace: $KEYSPACE"
        docker exec maple-cassandra-1-dev cqlsh -e "DROP KEYSPACE IF EXISTS $KEYSPACE;"
        echo "📦 Recreating keyspace: $KEYSPACE"
        docker exec maple-cassandra-1-dev cqlsh -e "CREATE KEYSPACE IF NOT EXISTS $KEYSPACE WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 3} AND DURABLE_WRITES = true;"
        echo "✅ Keyspace $KEYSPACE cleaned and recreated"

  # Cassandra-specific tasks
  cql:
    desc: Open Cassandra CQL shell (connects to node 1)
    cmds:
      - docker exec -it maple-cassandra-1-dev cqlsh

  cql:keyspaces:
    desc: List all keyspaces
    cmds:
      - docker exec maple-cassandra-1-dev cqlsh -e "DESCRIBE KEYSPACES;"

  cql:tables:
    desc: "List tables in a keyspace (usage: task cql:tables -- maplepress)"
    cmds:
      - docker exec maple-cassandra-1-dev cqlsh -e "USE {{.CLI_ARGS}}; DESCRIBE TABLES;"

  cql:status:
    desc: Show Cassandra cluster status
    cmds:
      - docker exec maple-cassandra-1-dev nodetool status

  # Redis-specific tasks
  redis:
    desc: Open Redis CLI
    cmds:
      - docker exec -it maple-redis-dev redis-cli

  redis:info:
    desc: Show Redis info
    cmds:
      - docker exec maple-redis-dev redis-cli INFO


@@ -0,0 +1,30 @@
-- Maple Infrastructure - Keyspace Initialization
-- This creates keyspaces for all Maple projects with replication factor 3

-- MaplePress Backend
CREATE KEYSPACE IF NOT EXISTS maplepress
  WITH REPLICATION = {
    'class': 'SimpleStrategy',
    'replication_factor': 3
  }
  AND DURABLE_WRITES = true;

-- MapleFile Backend
CREATE KEYSPACE IF NOT EXISTS maplefile
  WITH REPLICATION = {
    'class': 'SimpleStrategy',
    'replication_factor': 3
  }
  AND DURABLE_WRITES = true;

-- Future projects can be added here
-- Example:
-- CREATE KEYSPACE IF NOT EXISTS mapleanalytics
--   WITH REPLICATION = {
--     'class': 'SimpleStrategy',
--     'replication_factor': 1
--   };

-- Verify keyspaces were created
DESCRIBE KEYSPACES;


@@ -0,0 +1,250 @@
# Shared network for all Maple services in development
networks:
  maple-dev:
    name: maple-dev
    driver: bridge

# Persistent volumes for development data
volumes:
  cassandra-1-dev-data:
    name: maple-cassandra-1-dev
  cassandra-2-dev-data:
    name: maple-cassandra-2-dev
  cassandra-3-dev-data:
    name: maple-cassandra-3-dev
  redis-dev-data:
    name: maple-redis-dev
  meilisearch-dev-data:
    name: maple-meilisearch-dev
  seaweedfs-dev-data:
    name: maple-seaweedfs-dev
  mariadb-dev-data:
    name: maple-mariadb-dev
  wordpress-dev-data:
    name: maple-wordpress-dev

services:
  cassandra-1:
    image: cassandra:5.0.4
    container_name: maple-cassandra-1-dev
    hostname: cassandra-1
    ports:
      - "9042:9042" # CQL native transport
      - "9160:9160" # Thrift (legacy, optional)
    environment:
      - CASSANDRA_CLUSTER_NAME=maple-dev-cluster
      - CASSANDRA_DC=datacenter1
      - CASSANDRA_ENDPOINT_SNITCH=GossipingPropertyFileSnitch
      - CASSANDRA_SEEDS=cassandra-1,cassandra-2,cassandra-3
      - MAX_HEAP_SIZE=512M
      - HEAP_NEWSIZE=128M
    volumes:
      - cassandra-1-dev-data:/var/lib/cassandra
      - ./cassandra/init-scripts:/init-scripts:ro
    networks:
      - maple-dev
    healthcheck:
      test: ["CMD-SHELL", "cqlsh -e 'describe cluster' || exit 1"]
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 80s
    restart: unless-stopped

  cassandra-2:
    image: cassandra:5.0.4
    container_name: maple-cassandra-2-dev
    hostname: cassandra-2
    environment:
      - CASSANDRA_CLUSTER_NAME=maple-dev-cluster
      - CASSANDRA_DC=datacenter1
      - CASSANDRA_ENDPOINT_SNITCH=GossipingPropertyFileSnitch
      - CASSANDRA_SEEDS=cassandra-1,cassandra-2,cassandra-3
      - MAX_HEAP_SIZE=512M
      - HEAP_NEWSIZE=128M
    volumes:
      - cassandra-2-dev-data:/var/lib/cassandra
    networks:
      - maple-dev
    depends_on:
      - cassandra-1
    healthcheck:
      test: ["CMD-SHELL", "cqlsh -e 'describe cluster' || exit 1"]
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 80s
    restart: unless-stopped

  cassandra-3:
    image: cassandra:5.0.4
    container_name: maple-cassandra-3-dev
    hostname: cassandra-3
    environment:
      - CASSANDRA_CLUSTER_NAME=maple-dev-cluster
      - CASSANDRA_DC=datacenter1
      - CASSANDRA_ENDPOINT_SNITCH=GossipingPropertyFileSnitch
      - CASSANDRA_SEEDS=cassandra-1,cassandra-2,cassandra-3
      - MAX_HEAP_SIZE=512M
      - HEAP_NEWSIZE=128M
    volumes:
      - cassandra-3-dev-data:/var/lib/cassandra
    networks:
      - maple-dev
    depends_on:
      - cassandra-1
    healthcheck:
      test: ["CMD-SHELL", "cqlsh -e 'describe cluster' || exit 1"]
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 80s
    restart: unless-stopped

  redis:
    image: redis:7-alpine
    container_name: maple-redis-dev
    hostname: redis
    ports:
      - "6379:6379"
    volumes:
      - redis-dev-data:/data
      - ./redis/redis.dev.conf:/usr/local/etc/redis/redis.conf:ro
    networks:
      - maple-dev
    command: redis-server /usr/local/etc/redis/redis.conf
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 3s
      retries: 3
    restart: unless-stopped

  meilisearch:
    image: getmeili/meilisearch:v1.5
    container_name: maple-meilisearch-dev
    hostname: meilisearch
    ports:
      - "7700:7700"
    environment:
      - MEILI_ENV=development
      - MEILI_MASTER_KEY=maple-dev-master-key-change-in-production
      - MEILI_NO_ANALYTICS=true
    volumes:
      - meilisearch-dev-data:/meili_data
    networks:
      - maple-dev
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:7700/health"]
      interval: 10s
      timeout: 3s
      retries: 3
    restart: unless-stopped

  seaweedfs:
    image: chrislusf/seaweedfs:latest
    container_name: maple-seaweedfs-dev
    hostname: seaweedfs
    ports:
      - "8333:8333" # S3 API
      - "9333:9333" # Master server (web UI)
      - "8080:8080" # Volume server
    environment:
      - WEED_MASTER_VOLUME_SIZE_LIMIT_MB=1024
    volumes:
      - seaweedfs-dev-data:/data
    networks:
      - maple-dev
    command: server -s3 -dir=/data -s3.port=8333 -ip=0.0.0.0
    healthcheck:
      test: ["CMD", "/usr/bin/wget", "-q", "--spider", "http://127.0.0.1:9333/cluster/status"]
      interval: 10s
      timeout: 3s
      retries: 3
      start_period: 15s
    restart: unless-stopped

  # Nginx - CORS proxy for SeaweedFS
  # Access: localhost:8334 (proxies to seaweedfs:8333 with CORS headers)
  # Use this endpoint from frontend for file uploads
  nginx-s3-proxy:
    image: nginx:alpine
    container_name: maple-nginx-s3-proxy-dev
    hostname: nginx-s3-proxy
    ports:
      - "8334:8334" # CORS-enabled S3 API proxy
    volumes:
      - ./nginx/seaweedfs-cors.conf:/etc/nginx/conf.d/default.conf:ro
    networks:
      - maple-dev
    depends_on:
      - seaweedfs
    healthcheck:
      test: ["CMD", "wget", "-q", "--spider", "http://localhost:8334/"]
      interval: 10s
      timeout: 3s
      retries: 3
    restart: unless-stopped

  # MariaDB - WordPress database
  # Access: localhost:3306
  # Credentials: wordpress/wordpress (root: maple-dev-root-password)
  mariadb:
    image: mariadb:11.2
    container_name: maple-mariadb-dev
    hostname: mariadb
    ports:
      - "3306:3306"
    environment:
      - MARIADB_ROOT_PASSWORD=maple-dev-root-password
      - MARIADB_DATABASE=wordpress
      - MARIADB_USER=wordpress
      - MARIADB_PASSWORD=wordpress
    volumes:
      - mariadb-dev-data:/var/lib/mysql
    networks:
      - maple-dev
    healthcheck:
      test: ["CMD", "healthcheck.sh", "--connect", "--innodb_initialized"]
      interval: 10s
      timeout: 3s
      retries: 5
      start_period: 30s
    restart: unless-stopped

  # WordPress - Plugin development and testing
  # Access: http://localhost:8081
  # Plugin auto-mounted from: native/wordpress/maplepress-plugin
  # Debug logs: docker exec -it maple-wordpress-dev tail -f /var/www/html/wp-content/debug.log
  wordpress:
    image: wordpress:latest
    container_name: maple-wordpress-dev
    hostname: wordpress
    ports:
      - "8081:80"
    environment:
      - WORDPRESS_DB_HOST=mariadb:3306
      - WORDPRESS_DB_USER=wordpress
      - WORDPRESS_DB_PASSWORD=wordpress
      - WORDPRESS_DB_NAME=wordpress
      - WORDPRESS_DEBUG=1
      - WORDPRESS_CONFIG_EXTRA=
        define('WP_DEBUG', true);
        define('WP_DEBUG_LOG', true);
        define('WP_DEBUG_DISPLAY', false);
    volumes:
      - wordpress-dev-data:/var/www/html
      # MaplePress plugin - mounted read-only for live development
      - ../../../native/wordpress/maplepress-plugin:/var/www/html/wp-content/plugins/maplepress-plugin:ro
    networks:
      - maple-dev
    depends_on:
      mariadb:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:80/"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 60s
    restart: unless-stopped


@@ -0,0 +1,51 @@
server {
    listen 8334;
    server_name localhost;

    # Dynamically choose the CORS origin based on the request origin.
    # This allows multiple localhost ports for development.
    set $cors_origin "";
    if ($http_origin ~* "^http://localhost:(5173|5174|5175|3000|8080)$") {
        set $cors_origin $http_origin;
    }

    # Proxy to SeaweedFS S3 endpoint
    location / {
        # Hide CORS headers from upstream SeaweedFS (to prevent duplicates)
        proxy_hide_header 'Access-Control-Allow-Origin';
        proxy_hide_header 'Access-Control-Allow-Methods';
        proxy_hide_header 'Access-Control-Allow-Headers';
        proxy_hide_header 'Access-Control-Expose-Headers';
        proxy_hide_header 'Access-Control-Max-Age';
        proxy_hide_header 'Access-Control-Allow-Credentials';

        # CORS headers for development - set dynamically from the request origin
        add_header 'Access-Control-Allow-Origin' $cors_origin always;
        add_header 'Access-Control-Allow-Methods' 'GET, PUT, POST, DELETE, HEAD, OPTIONS' always;
        add_header 'Access-Control-Allow-Headers' '*' always;
        add_header 'Access-Control-Expose-Headers' 'ETag, Content-Length, Content-Type' always;
        add_header 'Access-Control-Max-Age' '3600' always;

        # Handle preflight requests
        if ($request_method = 'OPTIONS') {
            add_header 'Access-Control-Allow-Origin' $cors_origin always;
            add_header 'Access-Control-Allow-Methods' 'GET, PUT, POST, DELETE, HEAD, OPTIONS' always;
            add_header 'Access-Control-Allow-Headers' '*' always;
            add_header 'Access-Control-Max-Age' '3600' always;
            add_header 'Content-Type' 'text/plain; charset=utf-8' always;
            add_header 'Content-Length' '0' always;
            return 204;
        }

        # Proxy to SeaweedFS
        proxy_pass http://seaweedfs:8333;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Important for large file uploads
        proxy_request_buffering off;
        client_max_body_size 1G;
    }
}


@@ -0,0 +1,23 @@
# Maple Infrastructure - Redis Development Configuration
# Network
bind 0.0.0.0
port 6379
protected-mode no
# Persistence
save 900 1
save 300 10
save 60 10000
appendonly yes
appendfilename "appendonly.aof"
# Limits
maxmemory 256mb
maxmemory-policy allkeys-lru
# Logging
loglevel notice
# Databases (default 16)
databases 16


@@ -0,0 +1,28 @@
# Claude Code Ignore File
# Prevents sensitive files from being read by Claude Code (LLMs)
#
# SECURITY: This file protects production secrets and infrastructure details
# from being accidentally exposed to AI assistants.
# Environment files (contain real secrets)
.env
.env.*
!.env.template
*.env.backup
*.env.bak
# Old documentation (may contain real infrastructure details)
_md/
# Backup files
*.backup
*.bak
*~
# Sensitive logs
*.log
# Any files with "secret" or "private" in the name
*secret*
*private*
*credential*


@@ -0,0 +1,195 @@
# ==============================================================================
# Maple Open Technologies - Production Infrastructure Configuration Template
# ==============================================================================
#
# INSTRUCTIONS:
# 1. Copy this file to .env: cp .env.template .env
# 2. Replace all CHANGEME values with your actual infrastructure details
# 3. Never commit .env to Git (it's in .gitignore)
# 4. Keep .env file permissions secure: chmod 600 .env
#
# SECURITY WARNING:
# This file will contain sensitive information including:
# - IP addresses
# - API tokens
# - Passwords
# - Join tokens
# Treat it like a password file!
#
# ==============================================================================
# ------------------------------------------------------------------------------
# DigitalOcean API Access
# ------------------------------------------------------------------------------
# Get this from: https://cloud.digitalocean.com/account/api/tokens
DIGITALOCEAN_TOKEN=CHANGEME
# ------------------------------------------------------------------------------
# Infrastructure Region & VPC
# ------------------------------------------------------------------------------
# Region where all resources are deployed (e.g., tor1, nyc1, sfo3)
SWARM_REGION=CHANGEME
# VPC Network name (usually default-[region], e.g., default-tor1)
SWARM_VPC_NAME=CHANGEME
# VPC Private network subnet in CIDR notation (e.g., 10.116.0.0/16)
SWARM_VPC_SUBNET=CHANGEME
# ------------------------------------------------------------------------------
# Docker Swarm - Manager Node
# ------------------------------------------------------------------------------
SWARM_MANAGER_1_HOSTNAME=maplefile-swarm-manager-1-prod
SWARM_MANAGER_1_PUBLIC_IP=CHANGEME
SWARM_MANAGER_1_PRIVATE_IP=CHANGEME
# ------------------------------------------------------------------------------
# Docker Swarm - Worker Nodes
# ------------------------------------------------------------------------------
# Worker 1
SWARM_WORKER_1_HOSTNAME=maplefile-swarm-worker-1-prod
SWARM_WORKER_1_PUBLIC_IP=CHANGEME
SWARM_WORKER_1_PRIVATE_IP=CHANGEME
# Worker 2 (Cassandra Node 1)
SWARM_WORKER_2_HOSTNAME=maplefile-swarm-worker-2-prod
SWARM_WORKER_2_PUBLIC_IP=CHANGEME
SWARM_WORKER_2_PRIVATE_IP=CHANGEME
# Worker 3 (Cassandra Node 2)
SWARM_WORKER_3_HOSTNAME=maplefile-swarm-worker-3-prod
SWARM_WORKER_3_PUBLIC_IP=CHANGEME
SWARM_WORKER_3_PRIVATE_IP=CHANGEME
# Worker 4 (Cassandra Node 3)
SWARM_WORKER_4_HOSTNAME=maplefile-swarm-worker-4-prod
SWARM_WORKER_4_PUBLIC_IP=CHANGEME
SWARM_WORKER_4_PRIVATE_IP=CHANGEME
# Worker 5 (Meilisearch - SHARED by all apps)
SWARM_WORKER_5_HOSTNAME=maplefile-swarm-worker-5-prod
SWARM_WORKER_5_PUBLIC_IP=CHANGEME
SWARM_WORKER_5_PRIVATE_IP=CHANGEME
# Worker 6 (MaplePress Backend + Backend Caddy)
SWARM_WORKER_6_HOSTNAME=maplefile-swarm-worker-6-prod
SWARM_WORKER_6_PUBLIC_IP=CHANGEME
SWARM_WORKER_6_PRIVATE_IP=CHANGEME
# Worker 7 (MaplePress Frontend + Frontend Caddy)
SWARM_WORKER_7_HOSTNAME=maplefile-swarm-worker-7-prod
SWARM_WORKER_7_PUBLIC_IP=CHANGEME
SWARM_WORKER_7_PRIVATE_IP=CHANGEME
# ------------------------------------------------------------------------------
# Docker Swarm - Cluster Configuration
# ------------------------------------------------------------------------------
# Join token for adding new worker nodes
# Get this from manager: docker swarm join-token worker -q
SWARM_JOIN_TOKEN=CHANGEME
# ==============================================================================
# SHARED INFRASTRUCTURE (Used by ALL Apps)
# ==============================================================================
# ------------------------------------------------------------------------------
# Cassandra Configuration (3-node cluster) - SHARED
# ------------------------------------------------------------------------------
# Cluster settings
CASSANDRA_CLUSTER_NAME=CHANGEME
CASSANDRA_DC=CHANGEME
CASSANDRA_REPLICATION_FACTOR=3
# Node IPs (private IPs from workers 2, 3, 4)
CASSANDRA_NODE_1_IP=CHANGEME
CASSANDRA_NODE_2_IP=CHANGEME
CASSANDRA_NODE_3_IP=CHANGEME
# Connection settings
# Comma-separated list of node IPs, e.g. 10.116.0.4,10.116.0.5,10.116.0.6
CASSANDRA_CONTACT_POINTS=CHANGEME
CASSANDRA_CQL_PORT=9042
# ------------------------------------------------------------------------------
# Redis Configuration - SHARED
# ------------------------------------------------------------------------------
# Generated in 03_redis.md setup guide
REDIS_HOST=redis
REDIS_PORT=6379
REDIS_PASSWORD=CHANGEME
# ------------------------------------------------------------------------------
# Meilisearch Configuration - SHARED
# ------------------------------------------------------------------------------
# Generated in 04_app_meilisearch.md setup guide
MEILISEARCH_HOST=meilisearch
MEILISEARCH_PORT=7700
MEILISEARCH_MASTER_KEY=CHANGEME
MEILISEARCH_URL=http://meilisearch:7700
# ------------------------------------------------------------------------------
# DigitalOcean Spaces (S3-Compatible Object Storage) - SHARED
# ------------------------------------------------------------------------------
# Generated in 04.5_spaces.md setup guide
# Access keys from DigitalOcean dashboard: API → Spaces access keys
# Note: Each app can have its own bucket, but shares the same access keys
SPACES_ACCESS_KEY=CHANGEME
SPACES_SECRET_KEY=CHANGEME
# Endpoint, e.g. nyc3.digitaloceanspaces.com
SPACES_ENDPOINT=CHANGEME
# Region, e.g. nyc3, sfo3, sgp1
SPACES_REGION=CHANGEME
# ==============================================================================
# MAPLEPRESS APPLICATION
# ==============================================================================
# ------------------------------------------------------------------------------
# MaplePress Backend Configuration
# ------------------------------------------------------------------------------
# Generated in 05_backend.md setup guide
# Domain for backend API
MAPLEPRESS_BACKEND_DOMAIN=getmaplepress.ca
# Spaces bucket (app-specific)
MAPLEPRESS_SPACES_BUCKET=maplepress-prod
# JWT Secret (generated via: openssl rand -base64 64 | tr -d '\n')
# Stored as Docker secret: maplepress_jwt_secret
MAPLEPRESS_JWT_SECRET=CHANGEME
# IP Encryption Key (generated via: openssl rand -hex 16)
# Stored as Docker secret: maplepress_ip_encryption_key
MAPLEPRESS_IP_ENCRYPTION_KEY=CHANGEME
# ------------------------------------------------------------------------------
# MaplePress Frontend Configuration
# ------------------------------------------------------------------------------
# Configured in 07_frontend.md setup guide
# Domain for frontend
MAPLEPRESS_FRONTEND_DOMAIN=getmaplepress.com
# API endpoint (backend URL)
MAPLEPRESS_FRONTEND_API_URL=https://getmaplepress.ca
# ==============================================================================
# MAPLEFILE APPLICATION (Future)
# ==============================================================================
# ------------------------------------------------------------------------------
# MapleFile Backend Configuration (Future)
# ------------------------------------------------------------------------------
# MAPLEFILE_BACKEND_DOMAIN=maplefile.ca
# MAPLEFILE_SPACES_BUCKET=maplefile-prod
# MAPLEFILE_JWT_SECRET=CHANGEME
# MAPLEFILE_IP_ENCRYPTION_KEY=CHANGEME
# ------------------------------------------------------------------------------
# MapleFile Frontend Configuration (Future)
# ------------------------------------------------------------------------------
# MAPLEFILE_FRONTEND_DOMAIN=maplefile.com
# MAPLEFILE_FRONTEND_API_URL=https://maplefile.ca
# ==============================================================================
# END OF CONFIGURATION
# ==============================================================================


@ -0,0 +1,17 @@
# Environment configuration (contains secrets)
.env
.env.production
# Backup files
*.env.backup
*.env.bak
# OS and editor files
.DS_Store
*~
*.swp
*.swo
# Logs
*.log


@ -0,0 +1,129 @@
# Maple Open Technologies - Production Infrastructure
This directory contains configuration and documentation for deploying Maple Open Technologies to production on DigitalOcean.
## Quick Start
```bash
# 1. Copy environment template
cp .env.template .env
# 2. Edit .env and replace all CHANGEME values
nano .env
# 3. Set secure permissions
chmod 600 .env
# 4. Verify .env is gitignored
git check-ignore -v .env
# 5. Start with setup documentation
cd setup/
cat 00-getting-started.md
```
## Directory Structure
```
production/
├── .env.template # Template with CHANGEME placeholders (safe to commit)
├── .env # Your actual config (gitignored, NEVER commit)
├── .gitignore # Ensures .env is never committed to Git
├── .claudeignore # Protects secrets from LLMs/AI assistants
├── README.md # This file
└── setup/ # Step-by-step deployment guides
├── 00-getting-started.md
├── 01_init_docker_swarm.md
└── ... (more guides)
```
## Environment Configuration
### `.env.template` vs `.env`
| File | Purpose | Git Status | Contains |
|------|---------|------------|----------|
| `.env.template` | Template for team | ✅ Committed | `CHANGEME` placeholders |
| `.env` | Your actual config | ❌ Gitignored | Real IPs, passwords, tokens |
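When the template gains new variables, your `.env` can silently fall behind. A quick way to catch that drift is to diff the variable names between the two files — a sketch (not part of the setup guides; it assumes bash, since it uses process substitution):

```shell
# List variables present in .env.template but missing from .env
comm -23 \
  <(grep -oE '^[A-Z0-9_]+=' .env.template | sort -u) \
  <(grep -oE '^[A-Z0-9_]+=' .env | sort -u)
```

Any names printed exist only in the template, which usually means `.env` predates a template change.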
### Security Rules
🔒 **DO:**
- Keep `.env` file with `chmod 600` permissions
- Store backups of `.env` securely (encrypted)
- Use `.env.template` to share config structure
- Verify `.env` is gitignored before adding secrets
- Trust `.claudeignore` to protect secrets from AI assistants
🚫 **DON'T:**
- Commit `.env` to Git
- Share `.env` via email/Slack/unencrypted channels
- Use world-readable permissions (644, 777)
- Hardcode values from `.env` in documentation
### Multi-Layer Security Protection
This directory uses **three layers** of secret protection:
1. **`.gitignore`** - Prevents committing secrets to Git repository
2. **`.claudeignore`** - Prevents LLMs/AI assistants from reading secrets
3. **File permissions** - `chmod 600` prevents other users from reading secrets
All three layers work together to protect your production infrastructure.
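Layer 3 is easy to verify on the spot. A minimal check (GNU `stat`; on macOS use `stat -f '%Lp'` instead):

```shell
# Lock down the file, then confirm owner-only permissions
chmod 600 .env
stat -c '%a' .env   # prints: 600
```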
## Setup Guides
Follow these guides in order:
1. **[00-getting-started.md](setup/00-getting-started.md)**
- Local workspace setup
- DigitalOcean API token configuration
- `.env` file initialization
2. **[01_init_docker_swarm.md](setup/01_init_docker_swarm.md)**
- Create DigitalOcean droplets (Ubuntu 24.04)
- Install Docker on nodes
- Configure Docker Swarm with private networking
- Verify cluster connectivity
3. **More guides coming...**
- Cassandra deployment
- Redis setup
- Application deployment
- SSL/HTTPS configuration
## Infrastructure Overview
### Naming Convention
Format: `{company}-{role}-{sequential-number}-{environment}`
Examples:
- `mapleopentech-swarm-manager-1-prod`
- `mapleopentech-swarm-worker-1-prod`
- `mapleopentech-swarm-worker-2-prod`
**Why this pattern?**
- Simple sequential numbering (never reused)
- No role-specific prefixes (use Docker labels instead)
- Easy to scale (just add worker-N)
- Flexible (can repurpose servers without renaming)
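The pattern is regular enough to script against. For example, a sketch that emits the expected hostnames for one manager and three workers:

```shell
COMPANY=mapleopentech
ENV=prod
echo "${COMPANY}-swarm-manager-1-${ENV}"
for i in 1 2 3; do
  echo "${COMPANY}-swarm-worker-${i}-${ENV}"
done
```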
## Getting Help
### Documentation
- Setup guides in `setup/` directory
- `.env.template` has inline comments for all variables
- Each guide includes troubleshooting section
### Common Issues
1. **`.env` file missing**: Run `cp .env.template .env`
2. **Variables not loading**: Run `set -a && source .env && set +a` so the values are exported to child processes
3. **Git showing .env**: It shouldn't be - check `.gitignore`
---
**Last Updated**: November 3, 2025
**Maintained By**: Infrastructure Team


@ -0,0 +1,693 @@
# Automation Scripts and Tools
**Audience**: DevOps Engineers, Automation Teams
**Purpose**: Automated scripts, monitoring configs, and CI/CD pipelines for production infrastructure
**Prerequisites**: Infrastructure deployed, basic scripting knowledge
---
## Overview
This directory contains automation tools, scripts, and configurations to reduce manual operational overhead and ensure consistency across deployments.
**What's automated:**
- Backup procedures (scheduled)
- Deployment workflows (CI/CD)
- Monitoring and alerting (Prometheus/Grafana configs)
- Common maintenance tasks (scripts)
- Infrastructure health checks
---
## Directory Structure
```
automation/
├── README.md # This file
├── scripts/ # Operational scripts
│ ├── backup-all.sh # Master backup orchestrator
│ ├── backup-cassandra.sh # Cassandra snapshot + upload
│ ├── backup-redis.sh # Redis RDB/AOF backup
│ ├── backup-meilisearch.sh # Meilisearch dump export
│ ├── deploy-backend.sh # Backend deployment automation
│ ├── deploy-frontend.sh # Frontend deployment automation
│ ├── health-check.sh # Infrastructure health verification
│ ├── rotate-secrets.sh # Secret rotation automation
│ └── cleanup-docker.sh # Docker cleanup (images, containers)
├── monitoring/ # Monitoring configurations
│ ├── prometheus.yml # Prometheus scrape configs
│ ├── alertmanager.yml # Alert routing and receivers
│ ├── alert-rules.yml # Prometheus alert definitions
│ └── grafana-dashboards/ # JSON dashboard exports
│ ├── infrastructure.json
│ ├── maplepress.json
│ └── databases.json
└── ci-cd/ # CI/CD pipeline examples
├── github-actions.yml # GitHub Actions workflow
├── gitlab-ci.yml # GitLab CI pipeline
└── deployment-pipeline.md # CI/CD setup guide
```
---
## Scripts
### Backup Scripts
All backup scripts are designed to be run via cron. They:
- Create local snapshots/dumps
- Compress and upload to DigitalOcean Spaces
- Clean up old backups (retention policy)
- Log to `/var/log/`
- Exit with appropriate codes for monitoring
**See `../operations/01_backup_recovery.md` for complete script contents and setup instructions.**
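The retention step shared by these scripts looks roughly like this (the directory and the 14-day window are illustrative defaults, not values from the operations guide):

```shell
# Delete local backup archives older than the retention window
BACKUP_DIR=${BACKUP_DIR:-/var/backups/maple}
RETENTION_DAYS=${RETENTION_DAYS:-14}
find "$BACKUP_DIR" -type f -name '*.tar.gz' -mtime +"$RETENTION_DAYS" -print -delete
```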
**Installation:**
```bash
# On manager node
ssh dockeradmin@<manager-ip>
# Copy scripts (once scripts are created in this directory)
sudo cp automation/scripts/backup-*.sh /usr/local/bin/
sudo chmod +x /usr/local/bin/backup-*.sh
# Schedule via cron
sudo crontab -e
# 0 2 * * * /usr/local/bin/backup-all.sh >> /var/log/backup-all.log 2>&1
```
### Deployment Scripts
**`deploy-backend.sh`** - Automated backend deployment
```bash
#!/bin/bash
# Purpose: Deploy new backend version with zero downtime
# Usage: ./deploy-backend.sh [tag]
# Example: ./deploy-backend.sh prod
set -e
TAG=${1:-prod}
echo "=== Deploying Backend: Tag $TAG ==="
# Step 1: Build and push (from local dev machine)
echo "Building and pushing image..."
cd ~/go/src/codeberg.org/mapleopentech/monorepo/cloud/mapleopentech-backend
task deploy
# Step 2: Force pull on worker-6
echo "Forcing fresh pull on worker-6..."
ssh dockeradmin@<worker-6-ip> \
"docker pull registry.digitalocean.com/ssp/maplepress_backend:$TAG"
# Step 3: Redeploy stack
echo "Redeploying stack..."
ssh dockeradmin@<manager-ip> << 'ENDSSH'
cd ~/stacks
docker stack rm maplepress
sleep 10
docker config rm maplepress_caddyfile 2>/dev/null || true
docker stack deploy -c maplepress-stack.yml maplepress
ENDSSH
# Step 4: Verify deployment
echo "Verifying deployment..."
sleep 30
ssh dockeradmin@<manager-ip> << 'ENDSSH'
docker service ps maplepress_backend | head -5
docker service logs maplepress_backend --tail 20
ENDSSH
# Step 5: Health check
echo "Testing health endpoint..."
curl -f https://getmaplepress.ca/health || { echo "Health check failed!"; exit 1; }
echo "✅ Backend deployment complete!"
```
**`deploy-frontend.sh`** - Automated frontend deployment
```bash
#!/bin/bash
# Purpose: Deploy new frontend build
# Usage: ./deploy-frontend.sh
set -e
echo "=== Deploying Frontend ==="
# SSH to worker-7 and run deployment
ssh dockeradmin@<worker-7-ip> << 'ENDSSH'
cd /var/www/monorepo
echo "Pulling latest code..."
git pull origin main
cd web/maplepress-frontend
echo "Configuring production environment..."
cat > .env.production << 'EOF'
VITE_API_BASE_URL=https://getmaplepress.ca
NODE_ENV=production
EOF
echo "Installing dependencies..."
npm install
echo "Building frontend..."
npm run build
echo "Verifying build..."
if grep -q "getmaplepress.ca" dist/assets/*.js 2>/dev/null; then
echo "✅ Production API URL confirmed"
else
echo "⚠️ Warning: Production URL not found in build"
fi
ENDSSH
# Test frontend
echo "Testing frontend..."
curl -f https://getmaplepress.com || { echo "Frontend test failed!"; exit 1; }
echo "✅ Frontend deployment complete!"
```
### Health Check Script
**`health-check.sh`** - Comprehensive infrastructure health verification
```bash
#!/bin/bash
# Purpose: Check health of all infrastructure components
# Usage: ./health-check.sh
# Exit codes: 0=healthy, 1=warnings, 2=critical
WARNINGS=0
CRITICAL=0
echo "=== Infrastructure Health Check ==="
echo "Started: $(date)"
echo ""
# Check all services
echo "--- Docker Services ---"
SERVICES_DOWN=$(docker service ls --format '{{.Name}} {{.Replicas}}' | awk -F'[ /]' '$2 != $3' | wc -l)
if [ "$SERVICES_DOWN" -gt 0 ]; then
echo "⚠️ WARNING: $SERVICES_DOWN services not at full replica count"
docker service ls --format '{{.Name}} {{.Replicas}}' | awk -F'[ /]' '$2 != $3'
WARNINGS=$((WARNINGS + 1))
else
echo "✅ All services at their desired replica count"
fi
# Check all nodes
echo ""
echo "--- Docker Nodes ---"
NODES_DOWN=$(docker node ls | grep -v "Ready" | grep -v "STATUS" | wc -l)
if [ $NODES_DOWN -gt 0 ]; then
echo "🔴 CRITICAL: $NODES_DOWN nodes not ready!"
docker node ls | grep -v "Ready" | grep -v "STATUS"
CRITICAL=$((CRITICAL + 1))
else
echo "✅ All nodes ready"
fi
# Check disk space
echo ""
echo "--- Disk Space ---"
for NODE in worker-1 worker-2 worker-3 worker-4 worker-5 worker-6 worker-7; do
DISK_USAGE=$(ssh -o StrictHostKeyChecking=no dockeradmin@$NODE "df -h / | tail -1 | awk '{print \$5}' | tr -d '%'")
if [ $DISK_USAGE -gt 85 ]; then
echo "🔴 CRITICAL: $NODE disk usage: ${DISK_USAGE}%"
CRITICAL=$((CRITICAL + 1))
elif [ $DISK_USAGE -gt 75 ]; then
echo "⚠️ WARNING: $NODE disk usage: ${DISK_USAGE}%"
WARNINGS=$((WARNINGS + 1))
else
echo "✅ $NODE disk usage: ${DISK_USAGE}%"
fi
done
# Check endpoints
echo ""
echo "--- HTTP Endpoints ---"
if curl -sf https://getmaplepress.ca/health > /dev/null; then
echo "✅ Backend health check passed"
else
echo "🔴 CRITICAL: Backend health check failed!"
CRITICAL=$((CRITICAL + 1))
fi
if curl -sf https://getmaplepress.com > /dev/null; then
echo "✅ Frontend accessible"
else
echo "🔴 CRITICAL: Frontend not accessible!"
CRITICAL=$((CRITICAL + 1))
fi
# Summary
echo ""
echo "=== Summary ==="
echo "Warnings: $WARNINGS"
echo "Critical: $CRITICAL"
if [ $CRITICAL -gt 0 ]; then
echo "🔴 Status: CRITICAL"
exit 2
elif [ $WARNINGS -gt 0 ]; then
echo "⚠️ Status: WARNING"
exit 1
else
echo "✅ Status: HEALTHY"
exit 0
fi
```
---
## Monitoring Configuration Files
### Prometheus Configuration
**Located at**: `monitoring/prometheus.yml`
```yaml
# See ../operations/02_monitoring_alerting.md for complete configuration
# This file should be copied to ~/stacks/monitoring-config/ on manager node
global:
scrape_interval: 15s
evaluation_interval: 15s
alerting:
alertmanagers:
- static_configs:
- targets: ['alertmanager:9093']
rule_files:
- /etc/prometheus/alert-rules.yml
scrape_configs:
- job_name: 'prometheus'
static_configs:
- targets: ['localhost:9090']
- job_name: 'node-exporter'
dns_sd_configs:
- names: ['tasks.node-exporter']
type: 'A'
port: 9100
- job_name: 'cadvisor'
dns_sd_configs:
- names: ['tasks.cadvisor']
type: 'A'
port: 8080
- job_name: 'maplepress-backend'
static_configs:
- targets: ['maplepress-backend:8000']
metrics_path: '/metrics'
```
### Alert Rules
**Located at**: `monitoring/alert-rules.yml`
See `../operations/02_monitoring_alerting.md` for complete alert rule configurations.
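As a taste of what the rules look like, a minimal `alert-rules.yml` sketch — the rule names and thresholds here are illustrative, not taken from the operations guide:

```yaml
groups:
  - name: infrastructure
    rules:
      - alert: NodeDown
        expr: up{job="node-exporter"} == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Node exporter target {{ $labels.instance }} is down"
      - alert: HighDiskUsage
        expr: node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"} < 0.15
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Less than 15% disk space left on {{ $labels.instance }}"
```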
### Grafana Dashboards
**Dashboard exports** (JSON format) should be stored in `monitoring/grafana-dashboards/`.
**To import:**
1. Access Grafana via SSH tunnel: `ssh -L 3000:localhost:3000 dockeradmin@<manager-ip>`
2. Open http://localhost:3000
3. Dashboards → Import → Upload JSON file
**Recommended dashboards:**
- Infrastructure Overview (node metrics, disk, CPU, memory)
- MaplePress Application (HTTP metrics, errors, latency)
- Database Metrics (Cassandra, Redis, Meilisearch)
---
## CI/CD Pipelines
### GitHub Actions Example
**File:** `ci-cd/github-actions.yml`
```yaml
name: Deploy to Production
on:
push:
branches:
- main
paths:
- 'cloud/mapleopentech-backend/**'
jobs:
build-and-deploy:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v3
- name: Set up Go
uses: actions/setup-go@v4
with:
go-version: '1.21'
- name: Run tests
run: |
cd cloud/mapleopentech-backend
go test ./...
- name: Install doctl
uses: digitalocean/action-doctl@v2
with:
token: ${{ secrets.DIGITALOCEAN_TOKEN }}
- name: Build and push Docker image
run: |
cd cloud/mapleopentech-backend
doctl registry login
docker build -t registry.digitalocean.com/ssp/maplepress_backend:prod .
docker push registry.digitalocean.com/ssp/maplepress_backend:prod
- name: Deploy to production
uses: appleboy/ssh-action@master
with:
host: ${{ secrets.MANAGER_IP }}
username: dockeradmin
key: ${{ secrets.SSH_PRIVATE_KEY }}
script: |
# Force pull on worker-6
ssh dockeradmin@${{ secrets.WORKER_6_IP }} \
"docker pull registry.digitalocean.com/ssp/maplepress_backend:prod"
# Redeploy stack
cd ~/stacks
docker stack rm maplepress
sleep 10
docker config rm maplepress_caddyfile || true
docker stack deploy -c maplepress-stack.yml maplepress
# Wait and verify
sleep 30
docker service ps maplepress_backend | head -5
- name: Health check
run: |
curl -f https://getmaplepress.ca/health || exit 1
- name: Notify deployment
if: always()
uses: 8398a7/action-slack@v3
with:
status: ${{ job.status }}
text: 'Backend deployment ${{ job.status }}'
webhook_url: ${{ secrets.SLACK_WEBHOOK }}
```
### GitLab CI Example
**File:** `ci-cd/gitlab-ci.yml`
```yaml
stages:
- test
- build
- deploy
variables:
DOCKER_IMAGE: registry.digitalocean.com/ssp/maplepress_backend
DOCKER_TAG: prod
test:
stage: test
image: golang:1.21
script:
- cd cloud/mapleopentech-backend
- go test ./...
build:
stage: build
image: docker:latest
services:
- docker:dind
before_script:
- docker login registry.digitalocean.com -u $DIGITALOCEAN_TOKEN -p $DIGITALOCEAN_TOKEN
script:
- cd cloud/mapleopentech-backend
- docker build -t $DOCKER_IMAGE:$DOCKER_TAG .
- docker push $DOCKER_IMAGE:$DOCKER_TAG
only:
- main
deploy:
stage: deploy
image: alpine:latest
before_script:
- apk add --no-cache openssh-client
- eval $(ssh-agent -s)
- echo "$SSH_PRIVATE_KEY" | tr -d '\r' | ssh-add -
- mkdir -p ~/.ssh
- chmod 700 ~/.ssh
- ssh-keyscan -H $MANAGER_IP >> ~/.ssh/known_hosts
script:
# Force pull on worker-6
- ssh dockeradmin@$WORKER_6_IP "docker pull $DOCKER_IMAGE:$DOCKER_TAG"
# Redeploy stack
- |
ssh dockeradmin@$MANAGER_IP << 'EOF'
cd ~/stacks
docker stack rm maplepress
sleep 10
docker config rm maplepress_caddyfile || true
docker stack deploy -c maplepress-stack.yml maplepress
EOF
# Verify deployment
- sleep 30
- ssh dockeradmin@$MANAGER_IP "docker service ps maplepress_backend | head -5"
# Health check
- apk add --no-cache curl
- curl -f https://getmaplepress.ca/health
only:
- main
environment:
name: production
url: https://getmaplepress.ca
```
---
## Usage Examples
### Running Scripts Manually
```bash
# Backup all services
ssh dockeradmin@<manager-ip>
sudo /usr/local/bin/backup-all.sh
# Health check
ssh dockeradmin@<manager-ip>
sudo /usr/local/bin/health-check.sh
echo "Exit code: $?"
# 0 = healthy, 1 = warnings, 2 = critical
# Deploy backend
cd ~/monorepo/cloud/infrastructure/production
./automation/scripts/deploy-backend.sh prod
# Deploy frontend
./automation/scripts/deploy-frontend.sh
```
### Scheduling Scripts with Cron
```bash
# Edit crontab on manager
ssh dockeradmin@<manager-ip>
sudo crontab -e
# Add these lines:
# Backup all services daily at 2 AM
0 2 * * * /usr/local/bin/backup-all.sh >> /var/log/backup-all.log 2>&1
# Health check every hour
0 * * * * /usr/local/bin/health-check.sh >> /var/log/health-check.log 2>&1
# Docker cleanup weekly (Sunday 3 AM)
0 3 * * 0 /usr/local/bin/cleanup-docker.sh >> /var/log/docker-cleanup.log 2>&1
# Secret rotation monthly (1st of month, 4 AM)
0 4 1 * * /usr/local/bin/rotate-secrets.sh >> /var/log/secret-rotation.log 2>&1
```
### Monitoring Script Execution
```bash
# View cron logs
sudo grep CRON /var/log/syslog | tail -20
# View specific script logs
tail -f /var/log/backup-all.log
tail -f /var/log/health-check.log
# Check a script's exit code immediately after it runs
/usr/local/bin/health-check.sh; echo "Health check exit code: $?"
```
---
## Best Practices
### Script Development
1. **Always use `set -e`**: Exit on first error
2. **Log everything**: Redirect to `/var/log/`
3. **Use exit codes**: 0=success, 1=warning, 2=critical
4. **Idempotent**: Safe to run multiple times
5. **Document**: Comments and usage instructions
6. **Test**: Verify on staging before production
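Points 1 and 3 combine into a common skeleton. A sketch — the check itself is a placeholder, and scripts that detect critical failures would bump `CRITICAL` instead:

```shell
#!/bin/bash
set -e
WARNINGS=0
CRITICAL=0

# Each check bumps a counter instead of exiting immediately,
# so a single run reports every problem at once.
run_check() {
  "$@" || WARNINGS=$((WARNINGS + 1))
}
run_check true   # placeholder: replace with a real check

if [ "$CRITICAL" -gt 0 ]; then exit 2
elif [ "$WARNINGS" -gt 0 ]; then exit 1
else exit 0
fi
```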
### Secret Management
**Never hardcode secrets in scripts!**
```bash
# ❌ Bad
REDIS_PASSWORD="mysecret123"
# ✅ Good
REDIS_PASSWORD=$(docker exec redis cat /run/secrets/redis_password)
# ✅ Even better
REDIS_PASSWORD=$(cat /run/secrets/redis_password 2>/dev/null || echo "")
if [ -z "$REDIS_PASSWORD" ]; then
echo "Error: Redis password not found"
exit 1
fi
```
### Error Handling
```bash
# Check command success
if ! docker service ls > /dev/null 2>&1; then
echo "Error: Cannot connect to Docker"
exit 2
fi
# Trap errors
trap 'echo "Script failed on line $LINENO"' ERR
# Verify prerequisites
for COMMAND in docker ssh s3cmd; do
if ! command -v $COMMAND &> /dev/null; then
echo "Error: $COMMAND not found"
exit 1
fi
done
```
---
## Troubleshooting
### Script Won't Execute
```bash
# Check permissions
ls -la /usr/local/bin/script.sh
# Should be: -rwxr-xr-x (executable)
# Fix permissions
sudo chmod +x /usr/local/bin/script.sh
# Check shebang
head -1 /usr/local/bin/script.sh
# Should be: #!/bin/bash
```
### Cron Job Not Running
```bash
# Check cron service
sudo systemctl status cron
# Check cron logs
sudo grep CRON /var/log/syslog | tail -20
# Test cron environment
* * * * * /usr/bin/env > /tmp/cron-env.txt
# Wait 1 minute, then check /tmp/cron-env.txt
```
### SSH Issues in Scripts
```bash
# Add SSH keys to ssh-agent
eval $(ssh-agent)
ssh-add ~/.ssh/id_rsa
# Disable strict host checking (only for internal network)
ssh -o StrictHostKeyChecking=no user@host "command"
# Use SSH config
cat >> ~/.ssh/config << EOF
Host worker-*
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
EOF
```
---
## Contributing
**When adding new automation:**
1. Place scripts in `automation/scripts/`
2. Document usage in header comments
3. Follow naming convention: `verb-noun.sh`
4. Test thoroughly on staging
5. Update this README with script description
6. Add to appropriate cron schedule if applicable
---
## Future Automation Ideas
**Not yet implemented, but good candidates:**
- [ ] Automatic SSL certificate monitoring (separate from Caddy)
- [ ] Database performance metrics collection
- [ ] Automated capacity planning reports
- [ ] Self-healing scripts (restart failed services)
- [ ] Traffic spike detection and auto-scaling
- [ ] Automated security vulnerability scanning
- [ ] Log aggregation and analysis
- [ ] Cost optimization recommendations
---
**Last Updated**: January 2025
**Maintained By**: Infrastructure Team
**Note**: Scripts in this directory are templates. Customize IP addresses, domains, and credentials for your specific environment before use.


@ -0,0 +1,148 @@
# Backend Access & Database Operations
## Access Backend Container
```bash
# Find which node runs the backend
ssh dockeradmin@<manager-ip>
docker service ps maplefile_backend --filter "desired-state=running"
# Note the NODE column
# SSH to that worker
ssh dockeradmin@<worker-ip>
# Get container ID
export BACKEND_CONTAINER=$(docker ps --filter "name=maplefile.*backend" -q | head -1)
# Open shell
docker exec -it $BACKEND_CONTAINER sh
# Or run single command
docker exec $BACKEND_CONTAINER ./maplefile-backend --help
```
## View Logs
```bash
# Follow logs
docker logs -f $BACKEND_CONTAINER
# Last 100 lines
docker logs --tail 100 $BACKEND_CONTAINER
# Search for errors
docker logs $BACKEND_CONTAINER 2>&1 | grep -i error
```
## Database Operations
### Run Migrations (Safe)
```bash
docker exec $BACKEND_CONTAINER ./maplefile-backend migrate up
```
Migrations also run automatically on backend startup when `DATABASE_AUTO_MIGRATE=true` (the default in the stack file).
### Rollback Last Migration (Destructive)
```bash
docker exec $BACKEND_CONTAINER ./maplefile-backend migrate down
```
Only rolls back 1 migration. Run multiple times for multiple rollbacks.
### Reset Database (Full Wipe)
```bash
# 1. SSH to any Cassandra node (any of the 3 nodes works)
ssh dockeradmin@<cassandra-node-ip>
# 2. Find the Cassandra container ID
export CASSANDRA_CONTAINER=$(docker ps --filter "name=cassandra" -q | head -1)
# 3. Drop keyspace (DELETES ALL DATA - propagates to all 3 nodes)
docker exec -it $CASSANDRA_CONTAINER cqlsh -e "DROP KEYSPACE IF EXISTS maplefile;"
# 4. Wait for schema to propagate across cluster
sleep 5
# 5. Recreate keyspace (propagates to all 3 nodes)
docker exec -it $CASSANDRA_CONTAINER cqlsh -e "
CREATE KEYSPACE IF NOT EXISTS maplefile
WITH replication = {
'class': 'SimpleStrategy',
'replication_factor': 3
};"
# 6. Wait for schema agreement across cluster
sleep 5
# 7. Verify keyspace exists
docker exec -it $CASSANDRA_CONTAINER cqlsh -e "DESCRIBE KEYSPACE maplefile;"
# 8. Restart backend to run migrations
# You must pull the new image on the worker node first
# Find which worker runs the service:
ssh dockeradmin@<manager-ip>
docker service ps maplefile_backend
# Note the worker node name
# Pull image on the worker:
ssh dockeradmin@<worker-ip>
docker pull registry.digitalocean.com/ssp/maplefile-backend:prod
exit
# Force restart on manager:
ssh dockeradmin@<manager-ip>
docker service update --force maplefile_backend
# Verify new version is running:
docker service logs maplefile_backend --tail 50
# Look for: 📝 Git Commit: <commit-sha>
```
## Troubleshooting
### Container Not Found
```bash
# Check service status
docker service ps maplefile_backend
# List all backend containers
docker ps | grep backend
```
### Wrong Container (MaplePress vs MapleFile)
```bash
# Verify you have MapleFile (not MaplePress)
docker ps | grep $BACKEND_CONTAINER
# Should show "maplefile-backend" in image name
```
### Migration Fails
```bash
# Check environment (from worker node)
docker exec $BACKEND_CONTAINER env | grep DATABASE
# Check Cassandra connectivity
docker exec $BACKEND_CONTAINER nc -zv cassandra-1 9042
```
## Configuration
Environment variables are in `~/stacks/maplefile-stack.yml` on manager node, not `.env` files.
To change config:
1. Edit `~/stacks/maplefile-stack.yml`
2. Pull the new image on the worker: `ssh dockeradmin@<worker-ip> "docker pull registry.digitalocean.com/ssp/maplefile-backend:prod"`
3. Force restart on manager: `docker service update --force maplefile_backend`
**Important**: Worker nodes cache images locally. You MUST pull the new image on the worker node before restarting the service. The `--resolve-image always` and `--with-registry-auth` flags do NOT reliably force worker nodes to pull new images.
---
**Last Updated**: November 2025


@ -0,0 +1,196 @@
# Docker Image Updates & Deployment
**Quick Reference for MapleFile & MaplePress Backend**
## Images
- MapleFile: `registry.digitalocean.com/ssp/maplefile-backend:prod`
- MaplePress: `registry.digitalocean.com/ssp/maplepress-backend:prod`
## Build & Push
```bash
cd ~/go/src/codeberg.org/mapleopentech/monorepo/cloud/maplefile-backend
task deploy
# Note the Image ID and git commit from output
docker images registry.digitalocean.com/ssp/maplefile-backend:prod
```
## Deploy to Production
**CRITICAL**: Docker Swarm caches images. You MUST verify Image IDs match across all nodes.
### Step 1: Note Your Local Image ID
```bash
docker images registry.digitalocean.com/ssp/maplefile-backend:prod
# Example: IMAGE ID = 74b2fafb1f69
```
### Step 2: Find Worker Node & Pull Images
```bash
# SSH to manager
ssh dockeradmin@<MANAGER_IP>
# Find which worker runs the service
docker service ps maplefile_backend
# Note: NODE column (e.g., mapleopentech-swarm-worker-8-prod)
# Pull on manager
docker pull registry.digitalocean.com/ssp/maplefile-backend:prod
# Pull on worker
ssh dockeradmin@<WORKER_NODE>
docker pull registry.digitalocean.com/ssp/maplefile-backend:prod
exit
```
### Step 3: Verify Image IDs Match
```bash
# On manager
docker images registry.digitalocean.com/ssp/maplefile-backend:prod
# On worker
ssh dockeradmin@<WORKER_NODE>
docker images registry.digitalocean.com/ssp/maplefile-backend:prod
exit
# ALL THREE (local, manager, worker) must show SAME Image ID
```
### Step 4: Remove & Recreate Service
```bash
# On manager - remove service
docker service rm maplefile_backend
# Redeploy stack
cd ~/stacks
docker stack deploy -c maplefile-stack.yml maplefile
```
### Step 5: Verify Deployment
```bash
docker service logs maplefile_backend --tail 50
# Confirm these match your build:
# 🚀 Starting MapleFile Backend v0.1.0
# 📝 Git Commit: <your-commit-sha>
# 🕐 Build Time: <your-build-timestamp>
```
## For MaplePress
Same process, replace `maplefile` with `maplepress`:
```bash
docker service ps maplepress_backend
# Pull on both nodes
docker service rm maplepress_backend
docker stack deploy -c maplepress-stack.yml maplepress
```
## Why Remove & Recreate?
Docker Swarm's `docker service update --force` does NOT reliably use new images even after pulling. The `--resolve-image always` and `--with-registry-auth` flags also fail with mutable `:prod` tags.
**Only remove & recreate guarantees the new image is used.**
## Rollback
### Quick Rollback
```bash
# Automatic rollback to previous version
docker service rollback maplefile_backend
```
### Rollback to Specific Version
```bash
# Find previous image digest
docker service ps maplefile_backend --no-trunc
# Rollback to specific digest
docker service update --image registry.digitalocean.com/ssp/maplefile-backend:prod@sha256:def456... maplefile_backend
```
## Troubleshooting
### Health Check Failures
```bash
# Check logs
docker service logs maplefile_backend --tail 100
# Rollback if needed
docker service rollback maplefile_backend
```
### Image Pull Authentication Error
```bash
# Re-authenticate
doctl registry login
# Retry
docker service update --image registry.digitalocean.com/ssp/maplefile-backend:prod maplefile_backend
```
### Service Stuck Starting
```bash
# Common causes: database migrations failing, missing env vars, health check issues
# Check logs
docker service logs maplefile_backend --tail 50
# Rollback if urgent
docker service rollback maplefile_backend
```
## Standard Deployment Workflow
```bash
# 1. Local: Build & push (note the git commit and Image ID)
cd ~/go/src/codeberg.org/mapleopentech/monorepo/cloud/maplefile-backend
task deploy
# Example output: "Deployed version d90b6e2b - use this to verify on production"
# Note the local Image ID for verification
docker images registry.digitalocean.com/ssp/maplefile-backend:prod
# Example: IMAGE ID = 74b2fafb1f69
# 2. Find which worker is running the service
ssh dockeradmin@<MANAGER_IP>
docker service ps maplefile_backend
# Note the worker node (e.g., mapleopentech-swarm-worker-8-prod)
# 3. Pull the new image on the MANAGER node
docker pull registry.digitalocean.com/ssp/maplefile-backend:prod
# 4. Pull the new image on the WORKER node
ssh dockeradmin@<WORKER_NODE>
docker pull registry.digitalocean.com/ssp/maplefile-backend:prod
exit
# 5. Force restart on manager
ssh dockeradmin@<MANAGER_IP>
docker service update --force maplefile_backend
# 6. Verify git commit matches what you deployed
docker service logs maplefile_backend --tail 50
# Look for: 📝 Git Commit: d90b6e2b...
```
**Key points**:
- You MUST pull the image on **BOTH manager and worker nodes**
- Use `docker images` to verify Image ID matches your local build
- Use `docker service update --force` to restart with the new image
- Check startup logs for Git Commit to verify correct version is running
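The key points above (pull on both nodes, then force-restart) can be wrapped in a small helper script. This is a sketch, not part of the repository: the `MANAGER`/`WORKER` hostnames are placeholders you would override, and with the default `DRY_RUN=1` it only prints the commands it would run instead of executing them.

```bash
#!/usr/bin/env bash
# deploy-backend.sh -- sketch of the manual workflow above.
# MANAGER/WORKER are illustrative placeholders; override via environment.
set -euo pipefail

IMAGE="registry.digitalocean.com/ssp/maplefile-backend:prod"
MANAGER="${MANAGER:-dockeradmin@manager.example.internal}"
WORKER="${WORKER:-dockeradmin@worker-8.example.internal}"
DRY_RUN="${DRY_RUN:-1}"   # default: only print the commands

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"           # show what would run
  else
    "$@"
  fi
}

deploy_backend() {
  # Pull the image on BOTH the manager and the worker node.
  run ssh "$MANAGER" docker pull "$IMAGE"
  run ssh "$WORKER" docker pull "$IMAGE"
  # Force a restart so the service picks up the freshly pulled image.
  run ssh "$MANAGER" docker service update --force maplefile_backend
  # Show recent logs so the Git Commit line can be verified.
  run ssh "$MANAGER" docker service logs maplefile_backend --tail 50
}

deploy_backend
```

Run with `DRY_RUN=0 MANAGER=... WORKER=... ./deploy-backend.sh` once the printed plan looks right.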
---
**Last Updated**: November 2025

---
To see the console log of our backend, log in to the specific worker node, then run in the console:
```shell
docker ps | grep backend
```
and then:
```shell
docker logs -f <CONTAINER_ID>   # use the ID from the previous command
```

---
# Frontend Updates & Deployment
**Quick Reference for MapleFile Frontend**
## Overview
The frontend runs on Worker Node 9 as a static site built with Vite/React. Updates are deployed by pulling the latest code and rebuilding.
## Prerequisites
- SSH access to worker-9 as `dockeradmin`
- Node.js and npm installed on the server
## Quick Deploy
```bash
# SSH to worker-9 and run deploy script
ssh dockeradmin@<WORKER_9_IP>
~/deploy-frontend.sh
```
## Manual Deploy
```bash
# 1. SSH to worker-9
ssh dockeradmin@<WORKER_9_IP>
# 2. Navigate to monorepo
cd /var/www/monorepo
# 3. Pull latest changes (includes .env.production from git)
git pull origin main
# 4. Navigate to frontend
cd web/maplefile-frontend
# 5. Install dependencies (if package.json changed)
npm install
# 6. Build production bundle
npm run build
```
## Verify Deployment
```bash
# Check build output exists
ls -la /var/www/monorepo/web/maplefile-frontend/dist/
# Check build timestamp
stat /var/www/monorepo/web/maplefile-frontend/dist/index.html
```
## Rollback
```bash
# SSH to worker-9
ssh dockeradmin@<WORKER_9_IP>
# Navigate to monorepo
cd /var/www/monorepo
# Reset to previous commit
git log --oneline -10 # Find the commit to rollback to
git checkout <COMMIT_SHA>  # note: detached HEAD; return later with 'git checkout main'
# Rebuild
cd web/maplefile-frontend
npm install
npm run build
```
## Troubleshooting
### Build Fails
```bash
# Clear node_modules and rebuild
cd /var/www/monorepo/web/maplefile-frontend
rm -rf node_modules
npm install
npm run build
```
### Check Node.js Version
```bash
node --version
npm --version
# If outdated, update Node.js
```
### Permission Issues
```bash
# Ensure correct ownership
sudo chown -R dockeradmin:dockeradmin /var/www/monorepo
```
## Standard Deployment Workflow
```bash
# 1. Local: Commit and push your changes
cd ~/go/src/codeberg.org/mapleopentech/monorepo/web/maplefile-frontend
git add .
git commit -m "feat: your changes"
git push origin main
# 2. Deploy to production
ssh dockeradmin@<WORKER_9_IP>
cd /var/www/monorepo
git pull origin main
cd web/maplefile-frontend
npm install
npm run build
# 3. Verify by visiting the site
# https://maplefile.app (or your domain)
```
---
**Last Updated**: November 2025

---
# Reference Documentation
**Audience**: All infrastructure team members, architects, management
**Purpose**: High-level architecture, capacity planning, cost analysis, and strategic documentation
**Prerequisites**: Familiarity with deployed infrastructure
---
## Overview
This directory contains reference materials that provide the "big picture" view of your infrastructure. Unlike operational procedures (setup, operations, automation), these documents focus on **why** decisions were made, **what** the architecture looks like, and **how** to plan for the future.
**Contents:**
- Architecture diagrams and decision records
- Capacity planning and performance baselines
- Cost analysis and optimization strategies
- Security compliance documentation
- Technology choices and trade-offs
- Glossary of terms
---
## Directory Contents
### Architecture Documentation
**`architecture-overview.md`** - High-level system architecture
- Infrastructure topology
- Component interactions
- Data flow diagrams
- Network architecture
- Security boundaries
- Design principles and rationale
**`architecture-decisions.md`** - Architecture Decision Records (ADRs)
- Why Docker Swarm over Kubernetes?
- Why Cassandra over PostgreSQL?
- Why Caddy over NGINX?
- Multi-application architecture rationale
- Network segmentation strategy
- Service discovery approach
### Capacity Planning
**`capacity-planning.md`** - Growth planning and scaling strategies
- Current capacity baseline
- Performance benchmarks
- Growth projections
- Scaling thresholds
- Bottleneck analysis
- Future infrastructure needs
**`performance-baselines.md`** - Performance metrics and SLOs
- Response time percentiles
- Throughput measurements
- Database performance
- Resource utilization baselines
- Service Level Objectives (SLOs)
- Service Level Indicators (SLIs)
### Financial Planning
**`cost-analysis.md`** - Infrastructure costs and optimization
- Monthly cost breakdown
- Cost per service/application
- Cost trends and projections
- Optimization opportunities
- Reserved capacity vs on-demand
- TCO (Total Cost of Ownership)
**`cost-optimization.md`** - Strategies to reduce costs
- Right-sizing recommendations
- Idle resource identification
- Reserved instances opportunities
- Storage optimization
- Bandwidth optimization
- Alternative architecture considerations
### Security & Compliance
**`security-architecture.md`** - Security design and controls
- Defense-in-depth layers
- Authentication and authorization
- Secrets management approach
- Network security controls
- Data encryption (at rest and in transit)
- Security monitoring and logging
**`security-checklist.md`** - Security verification checklist
- Infrastructure hardening checklist
- Compliance requirements (GDPR, SOC2, etc.)
- Security audit procedures
- Vulnerability management
- Incident response readiness
**`compliance.md`** - Regulatory compliance documentation
- GDPR compliance measures
- Data residency requirements
- Audit trail procedures
- Privacy by design implementation
- Data retention policies
- Right to be forgotten procedures
### Technology Stack
**`technology-stack.md`** - Complete technology inventory
- Software versions and update policy
- Third-party services and dependencies
- Library and framework choices
- Language and runtime versions
- Tooling and development environment
**`technology-decisions.md`** - Why we chose each technology
- Database selection rationale
- Programming language choices
- Cloud provider selection
- Deployment tooling decisions
- Monitoring stack selection
### Operational Reference
**`runbook-index.md`** - Quick reference to all runbooks
- Emergency procedures quick links
- Common tasks reference
- Escalation contacts
- Critical command cheat sheet
**`glossary.md`** - Terms and definitions
- Docker Swarm terminology
- Database concepts (Cassandra RF, QUORUM, etc.)
- Network terms (overlay, ingress, etc.)
- Monitoring terminology
- Infrastructure jargon decoder
---
## Quick Reference Materials
### Architecture At-a-Glance
**Current Infrastructure (January 2025):**
```
Production Environment: maplefile-prod
Region: DigitalOcean Toronto (tor1)
Nodes: 7 (1 manager + 6 workers)
Applications: MaplePress (deployed), MapleFile (deployed)
Orchestration: Docker Swarm
Container Registry: DigitalOcean Container Registry (registry.digitalocean.com/ssp)
Object Storage: DigitalOcean Spaces (nyc3)
DNS: [Your DNS provider]
SSL: Let's Encrypt (automatic via Caddy)
Networks:
- maple-private-prod: Databases and internal services
- maple-public-prod: Public-facing services (Caddy + backends)
Databases:
- Cassandra: 3-node cluster, RF=3, QUORUM consistency
- Redis: Single instance, RDB + AOF persistence
- Meilisearch: Single instance
Applications:
- MaplePress Backend: Go 1.21+, Port 8000, Domain: getmaplepress.ca
- MaplePress Frontend: React 19 + Vite, Domain: getmaplepress.com
```
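The "RDB + AOF persistence" noted for Redis above maps to `redis.conf` directives like the following (illustrative values, not necessarily our production settings):

```conf
# RDB: snapshot to disk if at least 1 key changed in 900s, or 10 keys in 300s
save 900 1
save 300 10

# AOF: log every write, fsync once per second (durability/performance balance)
appendonly yes
appendfsync everysec
```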
### Key Metrics Baseline (Example)
**As of [Date]:**
| Metric | Value | Threshold |
|--------|-------|-----------|
| Backend p95 Response Time | 150ms | < 500ms |
| Frontend Load Time | 1.2s | < 3s |
| Backend Throughput | 500 req/min | 5000 req/min capacity |
| Database Read Latency | 5ms | < 20ms |
| Database Write Latency | 10ms | < 50ms |
| Redis Hit Rate | 95% | > 90% |
| CPU Utilization (avg) | 35% | Alert at 80% |
| Memory Utilization (avg) | 50% | Alert at 85% |
| Disk Usage (avg) | 40% | Alert at 75% |
### Monthly Cost Breakdown (Example)
| Service | Monthly Cost | Notes |
|---------|--------------|-------|
| Droplets (7x) | $204 | See breakdown in cost-analysis.md |
| Spaces Storage | $5 | 250GB included |
| Additional Bandwidth | $0 | Within free tier |
| Container Registry | $0 | Included |
| DNS | $0 | Using [provider] |
| Monitoring (optional) | $0 | Self-hosted Prometheus |
| **Total** | **~$209/mo** | Can scale to ~$300/mo with growth |
### Technology Stack Summary
| Layer | Technology | Version | Purpose |
|-------|------------|---------|---------|
| **OS** | Ubuntu | 24.04 LTS | Base operating system |
| **Orchestration** | Docker Swarm | Built-in | Container orchestration |
| **Container Runtime** | Docker | 27.x+ | Container execution |
| **Database** | Cassandra | 4.1.x | Distributed database |
| **Cache** | Redis | 7.x | In-memory cache/sessions |
| **Search** | Meilisearch | v1.5+ | Full-text search |
| **Reverse Proxy** | Caddy | 2-alpine | HTTPS termination |
| **Backend** | Go | 1.21+ | Application runtime |
| **Frontend** | React + Vite | 19 + 5.x | Web UI |
| **Object Storage** | Spaces | S3-compatible | File storage |
| **Monitoring** | Prometheus + Grafana | Latest | Metrics & dashboards |
| **CI/CD** | TBD | - | GitHub Actions / GitLab CI |
---
## Architecture Decision Records (ADRs)
### ADR-001: Docker Swarm vs Kubernetes
**Decision**: Use Docker Swarm for orchestration
**Context**: Need container orchestration for production deployment
**Rationale**:
- Simpler to set up and maintain (< 1 hour vs days for k8s)
- Built into Docker (no additional components)
- Sufficient for our scale (< 100 services)
- Lower operational overhead
- Easier to troubleshoot
- Team familiarity with Docker
**Trade-offs**:
- Less ecosystem tooling than Kubernetes
- Limited advanced scheduling features
- Smaller community
- May need migration to k8s if we scale dramatically (> 50 nodes)
**Status**: Accepted
---
### ADR-002: Cassandra for Distributed Database
**Decision**: Use Cassandra for primary datastore
**Context**: Need highly available, distributed database with linear scalability
**Rationale**:
- Write-heavy workload (user-generated content)
- Geographic distribution possible (multi-region)
- Proven at scale (Instagram, Netflix)
- No single point of failure (RF=3, QUORUM)
- Linear scalability (add nodes for capacity)
- Excellent write performance
**Trade-offs**:
- Higher complexity than PostgreSQL
- Eventually consistent (tunable)
- Schema migrations more complex
- Higher resource usage (3 nodes minimum)
- Steeper learning curve
**Alternatives Considered**:
- PostgreSQL + Patroni: Simpler but less scalable
- MongoDB: Similar capabilities, but we prefer Cassandra's tunable consistency model
- MySQL Cluster: Oracle licensing concerns
**Status**: Accepted
---
### ADR-003: Caddy for Reverse Proxy
**Decision**: Use Caddy instead of NGINX
**Context**: Need HTTPS termination and reverse proxy
**Rationale**:
- Automatic HTTPS with Let's Encrypt (zero configuration)
- Automatic certificate renewal (no cron jobs)
- Simpler configuration (10 lines vs 200+)
- Built-in HTTP/2 and HTTP/3
- Security by default
- Active development
**Trade-offs**:
- Less mature than NGINX (but production-ready)
- Smaller community
- Fewer third-party modules
- Slightly higher memory usage (negligible)
**Performance**: Equivalent for our use case (< 10k req/sec)
**Status**: Accepted
---
### ADR-004: Multi-Application Shared Infrastructure
**Decision**: Share database infrastructure across multiple applications
**Context**: Planning to deploy multiple applications (MaplePress, MapleFile)
**Rationale**:
- Cost efficiency (one 3-node Cassandra cluster vs 3 separate clusters)
- Operational efficiency (one set of database procedures)
- Resource utilization (databases rarely at capacity)
- Simplified backups (one backup process)
- Consistent data layer
**Isolation Strategy**:
- Separate keyspaces per application
- Separate workers for application backends
- Independent scaling per application
- Separate deployment pipelines
**Trade-offs**:
- Blast radius: One database failure affects all apps
- Resource contention possible (mitigated by capacity planning)
- Schema migration coordination needed
**Status**: Accepted
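The separate-keyspaces isolation strategy above can be sketched in CQL. The keyspace names are illustrative, and the `replication_factor` shorthand for `NetworkTopologyStrategy` requires Cassandra 4.0+:

```sql
-- One keyspace per application on the shared 3-node cluster, RF=3 each
CREATE KEYSPACE IF NOT EXISTS maplepress
  WITH replication = {'class': 'NetworkTopologyStrategy', 'replication_factor': 3};

CREATE KEYSPACE IF NOT EXISTS maplefile
  WITH replication = {'class': 'NetworkTopologyStrategy', 'replication_factor': 3};
```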
---
## Capacity Planning Guidelines
### Current Capacity
**Worker specifications:**
- Manager + Redis: 2 vCPU, 2 GB RAM
- Cassandra nodes (3x): 2 vCPU, 4 GB RAM each
- Meilisearch: 2 vCPU, 2 GB RAM
- Backend: 2 vCPU, 2 GB RAM
- Frontend: 1 vCPU, 1 GB RAM
**Total:** 13 vCPUs, 19 GB RAM
### Scaling Triggers
**When to scale:**
| Metric | Threshold | Action |
|--------|-----------|--------|
| CPU > 80% sustained | 5 minutes | Add worker or scale vertically |
| Memory > 85% sustained | 5 minutes | Increase droplet RAM |
| Disk > 75% full | Any node | Clear space or increase disk |
| Backend p95 > 1s | Consistent | Scale backend horizontally |
| Database latency > 50ms | Consistent | Add Cassandra node or tune |
| Request rate approaching capacity | 80% of max | Scale backend replicas |
### Scaling Options
**Horizontal Scaling (preferred):**
- Backend: Add replicas (`docker service scale maplepress_backend=3`)
- Cassandra: Add fourth node (increases capacity + resilience)
- Frontend: Add CDN or edge caching
**Vertical Scaling:**
- Resize droplets (requires brief restart)
- Increase memory limits in stack files
- Optimize application code first
**Cost vs Performance:**
- Horizontal: More resilient, linear cost increase
- Vertical: Simpler, better price/performance up to a point
---
## Cost Optimization Strategies
### Quick Wins
1. **Reserved Instances**: DigitalOcean doesn't offer reserved pricing, but consider annual contracts for discounts
2. **Right-sizing**: Monitor actual usage, downsize oversized droplets
3. **Cleanup**: Regular docker system prune, clear old snapshots
4. **Compression**: Enable gzip in Caddy (already done)
5. **Caching**: Maximize cache hit rates (Redis, CDN)
### Medium-term Optimizations
1. **CDN for static assets**: Offload frontend static files to CDN
2. **Object storage lifecycle**: Auto-delete old backups
3. **Database tuning**: Optimize queries to reduce hardware needs
4. **Spot instances**: Not available on DigitalOcean; consider other providers for batch workloads
### Alternative Architectures
**If cost becomes primary concern:**
- Single-node PostgreSQL instead of Cassandra cluster (-$96/mo)
- Collocate services on fewer droplets (-$50-100/mo)
- Use managed databases (different cost model)
**Trade-off**: Lower cost, higher operational risk
---
## Security Architecture
### Defense in Depth Layers
1. **Network**: VPC, firewalls, private overlay networks
2. **Transport**: TLS 1.3 for all external connections
3. **Application**: Authentication, authorization, input validation
4. **Data**: Encryption at rest (object storage), encryption in transit
5. **Monitoring**: Audit logs, security alerts, intrusion detection
### Key Security Controls
**Implemented:**
- ✅ SSH key-based authentication (no passwords)
- ✅ UFW firewall on all nodes
- ✅ Docker secrets for sensitive values
- ✅ Network segmentation (private vs public)
- ✅ Automatic HTTPS with perfect forward secrecy
- ✅ Security headers (HSTS, X-Frame-Options, etc.)
- ✅ Database authentication (passwords, API keys)
- ✅ Minimal attack surface (only ports 22, 80, 443 exposed)
**Planned:**
- [ ] fail2ban for SSH brute-force protection
- [ ] Intrusion detection system (IDS)
- [ ] Regular security scanning (Trivy for containers)
- [ ] Secret rotation automation
- [ ] Audit logging aggregation
---
## Compliance Considerations
### GDPR
**If processing EU user data:**
- Data residency: Deploy EU region workers
- Right to deletion: Implement user data purge
- Data portability: Export user data functionality
- Privacy by design: Minimal data collection
- Audit trail: Log all data access
### SOC2
**If pursuing SOC2 compliance:**
- Access controls: Role-based access, MFA
- Change management: All changes via git, reviewed
- Monitoring: Comprehensive logging and alerting
- Incident response: Documented procedures
- Business continuity: Backup and disaster recovery tested
**Document in**: `compliance.md`
---
## Glossary
### Docker Swarm Terms
**Manager node**: Swarm orchestrator, schedules tasks, maintains cluster state
**Worker node**: Executes tasks (containers) assigned by manager
**Service**: Definition of containers to run (image, replicas, network)
**Task**: Single container instance of a service
**Stack**: Group of related services deployed together
**Overlay network**: Virtual network spanning all swarm nodes
**Ingress network**: Built-in load balancing for published ports
**Node label**: Key-value tag for task placement constraints
### Cassandra Terms
**RF (Replication Factor)**: Number of copies of data (RF=3 = 3 copies)
**QUORUM**: Majority of replicas (2 out of 3 for RF=3)
**Consistency Level**: How many replicas must respond (ONE, QUORUM, ALL)
**Keyspace**: Database namespace (like database in SQL)
**SSTable**: Immutable data file on disk
**Compaction**: Merging SSTables to reclaim space
**Repair**: Synchronize data across replicas
**Nodetool**: Command-line tool for Cassandra administration
### Monitoring Terms
**Prometheus**: Time-series database and metrics collection
**Grafana**: Visualization and dashboarding
**Alertmanager**: Alert routing and notification
**Exporter**: Metrics collection agent (node_exporter, etc.)
**Scrape**: Prometheus collecting metrics from target
**Time series**: Sequence of data points over time
**PromQL**: Prometheus query language
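As a concrete PromQL example, the 80% CPU alert threshold used elsewhere in this document could be expressed roughly as follows (illustrative; assumes node_exporter metric names):

```
100 * (1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m]))) > 80
```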
---
## Related Documentation
**For initial deployment:**
- `../setup/` - Step-by-step infrastructure deployment
**For day-to-day operations:**
- `../operations/` - Backup, monitoring, incident response
**For automation:**
- `../automation/` - Scripts, CI/CD, monitoring configs
**External resources:**
- Docker Swarm: https://docs.docker.com/engine/swarm/
- Cassandra: https://cassandra.apache.org/doc/latest/
- DigitalOcean: https://docs.digitalocean.com/
---
## Contributing to Reference Docs
**When to update reference documentation:**
- Major architecture changes
- New technology adoption
- Significant cost changes
- Security incidents (document lessons learned)
- Compliance requirements change
- Quarterly review cycles
**Document format:**
- Use Markdown
- Include decision date
- Link to related ADRs
- Update index/glossary as needed
---
## Document Maintenance
**Review schedule:**
- **Architecture docs**: Quarterly or when major changes
- **Capacity planning**: Monthly (update with metrics)
- **Cost analysis**: Monthly (track trends)
- **Security checklist**: Quarterly or after incidents
- **Technology stack**: When versions change
- **Glossary**: As needed when new terms introduced
**Responsibility**: Infrastructure lead reviews quarterly, team contributes ongoing updates.
---
**Last Updated**: January 2025
**Maintained By**: Infrastructure Team
**Next Review**: April 2025
**Purpose**: These documents answer "why" and "what if" questions. They provide context for decisions and guidance for future planning.

---
# Getting Started with Production Deployment
**Audience**: Junior DevOps Engineers, Infrastructure Team
**Time to Complete**: 10-15 minutes (one-time setup)
**Prerequisites**:
- Basic Linux command line knowledge
- DigitalOcean account with billing enabled
- SSH access to your local machine
---
## Overview
This guide prepares your local machine for deploying Maple Open Technologies infrastructure to DigitalOcean **from scratch**. You'll set up your workspace and prepare to create servers (droplets), databases, and networking—all through command-line tools.
**What you'll accomplish:**
- Set up your local workspace
- Prepare the `.env` configuration file
- Understand how to store and use infrastructure details as you create them
- Get ready to run deployment scripts that create resources on DigitalOcean
**What you WON'T need:** Existing secrets or passwords (you'll generate these as you go)
---
## Table of Contents
1. [Prerequisites Check](#prerequisites-check)
2. [Setting Up Your Local Workspace](#setting-up-your-local-workspace)
3. [Understanding the `.env` File](#understanding-the-env-file)
4. [Next Steps](#next-steps)
---
## Prerequisites Check
Before starting, verify you have:
### 1. DigitalOcean Account
```bash
# You should be able to log in to DigitalOcean
# Visit: https://cloud.digitalocean.com/
```
**Need an account?** Sign up at https://www.digitalocean.com/
### 2. DigitalOcean API Token
You'll need a Personal Access Token to create resources from command line.
**Create one:**
1. Log into DigitalOcean: https://cloud.digitalocean.com/
2. Click **API** in left sidebar
3. Click **Generate New Token**
4. Name: "Production Deployment"
5. Scopes: Check **Read** and **Write**
6. Click **Generate Token**
7. **COPY THE TOKEN IMMEDIATELY** (you can't see it again)
Save this token somewhere safe - you'll add it to `.env` shortly.
### 3. SSH Key Pair
You need SSH keys to access the servers you'll create.
**Check if you already have keys:**
```bash
ls -la ~/.ssh/id_rsa.pub
# If you see the file, you're good! Skip to next section.
```
**Don't have keys? Create them:**
```bash
# Generate new SSH key pair
ssh-keygen -t rsa -b 4096 -C "your_email@example.com"
# Press Enter to accept default location
# Enter a passphrase (optional but recommended)
# Verify creation
ls -la ~/.ssh/id_rsa.pub
# Should show the file exists
```
**Add SSH key to DigitalOcean:**
1. Copy your public key:
```bash
cat ~/.ssh/id_rsa.pub
# Copy the entire output
```
2. Go to DigitalOcean: https://cloud.digitalocean.com/
3. Click **Settings** → **Security**
4. Click **Add SSH Key**
5. Paste your public key
6. Name it: "My Local Machine"
7. Click **Add SSH Key**
### 4. Command Line Tools
Verify you have these installed:
```bash
# Check git
git --version
# Should show: git version 2.x.x
# Check ssh
ssh -V
# Should show: OpenSSH_x.x
# Check curl
curl --version
# Should show: curl 7.x.x or 8.x.x
```
**Missing tools?** Install them:
- **macOS**: Tools are usually pre-installed; otherwise install via `brew install git`
- **Linux**: `sudo apt install git curl openssh-client` (Ubuntu/Debian)
- **Windows**: Use WSL2 (Windows Subsystem for Linux)
---
## Setting Up Your Local Workspace
### Step 1: Clone the Repository
```bash
# Navigate to where you keep code projects
cd ~/Projects # or wherever you prefer
# Clone the monorepo
git clone https://codeberg.org/mapleopentech/monorepo.git
# Navigate to infrastructure directory
cd monorepo/cloud/infrastructure/production
# Verify you're in the right place
pwd
# Should show: /Users/yourname/Projects/monorepo/cloud/infrastructure/production
```
### Step 2: Create Your `.env` File from Template
The repository includes a `.env.template` file with all configuration variables you'll need. Your actual `.env` file (with real values) is gitignored and will never be committed to the repository.
```bash
# Copy the template to create your .env file
cp .env.template .env
# The .env file is automatically gitignored (safe from accidental commits)
# Verify it was created
ls -la .env
# Should show: -rw-r--r-- ... .env
# Also verify .env is gitignored
git check-ignore -v .env
# Should show: .gitignore:2:.env .env
```
**What's the difference?**
- `.env.template` = Safe to commit, contains `CHANGEME` placeholders
- `.env` = Your private file, contains real IPs/passwords/tokens, NEVER commit!
### Step 3: Set Secure File Permissions
**Important**: This file will contain sensitive information.
```bash
# Make it readable/writable only by you
chmod 600 .env
# Verify permissions changed
ls -la .env
# Should show: -rw------- 1 youruser youruser ... .env
```
### Step 4: Add Your DigitalOcean API Token
```bash
# Open .env file in your editor
nano .env
# Or: vim .env
# Or: code .env
```
**Find this line:**
```bash
DIGITALOCEAN_TOKEN=CHANGEME
```
**Replace with your token:**
```bash
DIGITALOCEAN_TOKEN=dop_v1_abc123xyz789yourtoken
```
**Save and close** the file (in nano: `Ctrl+X`, then `Y`, then `Enter`)
### Step 5: Verify Gitignore Protection
**Critical**: Verify `.env` won't be committed to Git:
```bash
# Check if .env is ignored
git check-ignore -v .env
# Expected output:
# .gitignore:XX:.env .env
# Also verify git status doesn't show .env
git status
# .env should NOT appear in untracked files
```
⚠️ If `.env` appears in `git status`, STOP and check your `.gitignore` file
---
## Understanding the `.env` File
### What is the `.env` File For?
The `.env` file is your **infrastructure notebook**. As you create resources (servers, databases, etc.) on DigitalOcean, you'll record important details here:
- IP addresses of servers you create
- Passwords you generate
- API keys and tokens
- Configuration values
**Think of it like a worksheet**: You start with `CHANGEME` placeholders and fill them in as you build your infrastructure.
### What is "source .env"?
**Simple explanation**: The `source` command reads your `.env` file and loads all values into your current terminal session so deployment scripts can use them.
**Analogy**: It's like loading ingredients onto your kitchen counter before cooking. The `.env` file is your pantry (storage), and `source` puts everything on the counter (active memory).
**Example**:
```bash
# BEFORE running "source .env"
echo $DIGITALOCEAN_TOKEN
# Output: (blank - doesn't exist yet)
# AFTER running "source .env"
source .env
echo $DIGITALOCEAN_TOKEN
# Output: dop_v1_abc123... (the value from your .env file)
```
**Important**: You need to run `source .env` in EVERY new terminal window before running deployment commands.
### The `.env` File Structure
Your `.env` file looks like this:
```bash
# Each line is: VARIABLE_NAME=value
DIGITALOCEAN_TOKEN=dop_v1_abc123xyz
CASSANDRA_NODE1_IP=CHANGEME
CASSANDRA_ADMIN_PASSWORD=CHANGEME
# Lines starting with # are comments (ignored)
# Blank lines are also ignored
```
**Initial state**: Most values are `CHANGEME`
**As you deploy**: You'll replace `CHANGEME` with actual values
**Final state**: All `CHANGEME` values replaced with real infrastructure details
### How You'll Use It During Deployment
Here's the workflow you'll follow in the next guides:
1. **Create a resource** (e.g., create a database server on DigitalOcean)
2. **Note important details** (e.g., IP address: 10.137.0.11, password: abc123)
3. **Update `.env` file** (replace `CASSANDRA_NODE1_IP=CHANGEME` with `CASSANDRA_NODE1_IP=10.137.0.11`)
4. **Load the values** (run `source .env`)
5. **Run next deployment script** (which uses those values)
### Using Environment Variables
Every time you start a new terminal for deployment work:
```bash
# Step 1: Go to infrastructure directory
cd ~/monorepo/cloud/infrastructure/production
# Step 2: Load all values from .env into this terminal session
source .env
# Step 3: Verify it worked (check one variable)
echo "DigitalOcean Token: ${DIGITALOCEAN_TOKEN:0:15}..."
# Should show: DigitalOcean Token: dop_v1_abc123...
```
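If you want a stricter check than eyeballing one variable, a small function like this (a hypothetical helper, not part of the repository's scripts) fails fast when any required variable is unset or still a placeholder:

```bash
# check_env: verify each named variable is set and not still "CHANGEME".
# Hypothetical helper -- not part of the repository's scripts.
check_env() {
  local missing=0 name value
  for name in "$@"; do
    # Bash indirect expansion: read the variable whose name is in $name.
    value="${!name:-}"
    if [ -z "$value" ] || [ "$value" = "CHANGEME" ]; then
      echo "ERROR: $name is unset or still CHANGEME" >&2
      missing=1
    fi
  done
  return $missing
}

# Example: run after "source .env"
# check_env DIGITALOCEAN_TOKEN CASSANDRA_NODE1_IP || echo "Fix .env before deploying"
```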
### Quick Verification
Test that your DigitalOcean token loaded correctly:
```bash
# Make sure you ran "source .env" first!
# Check the token (shows first 20 characters only)
echo "DIGITALOCEAN_TOKEN: ${DIGITALOCEAN_TOKEN:0:20}..."
# Expected output:
# DIGITALOCEAN_TOKEN: dop_v1_abc123xyz789...
```
**If you see blank output:**
1. Did you run `source .env`? (run it now)
2. Did you add your token to `.env`? (check Step 4 above)
3. Did you save the file after editing?
### Variable Naming Convention
We use consistent prefixes to organize variables:
| Prefix | Purpose | Example |
|--------|---------|---------|
| `CASSANDRA_*` | Cassandra database | `CASSANDRA_NODE1_IP` |
| `REDIS_*` | Redis cache | `REDIS_PASSWORD` |
| `AWS_*` | Cloud storage | `AWS_ACCESS_KEY_ID` |
| `BACKEND_*` | Application backend | `BACKEND_ADMIN_HMAC_SECRET` |
| `*_MAILGUN_*` | Email services | `MAPLEFILE_MAILGUN_API_KEY` |
---
## Common Mistakes to Avoid
### ❌ Mistake 1: Committing .env File
```bash
# NEVER DO THIS!
git add .env
git commit -m "Add environment variables"
# Always check before committing
git status
git diff --cached
```
### ❌ Mistake 2: Forgetting to Load Variables
**Symptom**: Scripts fail with errors like "variable not set" or blank values
```bash
# ❌ Wrong - running script without loading .env first
./scripts/create-droplet.sh
# Error: DIGITALOCEAN_TOKEN: variable not set
# ✅ Correct - load .env first, then run script
source .env
./scripts/create-droplet.sh
# ✅ Also correct - load and run in one line
(source .env && ./scripts/create-droplet.sh)
```
**Remember**: Each new terminal needs `source .env` run again!
### ❌ Mistake 3: Using Wrong Permissions
```bash
# Too permissive - others can read your secrets!
chmod 644 .env # ❌ Wrong
# Correct - only you can read/write
chmod 600 .env # ✅ Correct
```
### ❌ Mistake 4: Leaving CHANGEME Values
```bash
# ❌ Wrong - still has placeholder
CASSANDRA_NODE1_IP=CHANGEME
# ✅ Correct - replaced with actual value after creating server
CASSANDRA_NODE1_IP=10.137.0.11
```
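A quick scan before running any deployment script catches leftover placeholders. This is a simple grep wrapper (hypothetical, not part of the repository); adjust the path if your file lives elsewhere:

```bash
# count_changeme: print how many lines of a file still contain "CHANGEME".
# 0 means every placeholder has been filled in.
count_changeme() {
  # grep -c exits non-zero when there are no matches; that's fine here.
  grep -c "CHANGEME" "${1:-.env}" 2>/dev/null || true
}

# Example: count_changeme .env
```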
---
## Troubleshooting
### Problem: "Permission denied" when reading .env
**Cause**: File permissions too restrictive or wrong owner
**Solution**:
```bash
# Check current permissions and owner
ls -la .env
# Fix permissions
chmod 600 .env
# If you're not the owner, fix ownership
sudo chown $(whoami):$(whoami) .env
```
### Problem: Variables not loading
**Symptoms**: Scripts fail with "variable not set" errors or echo commands show blank
**Solution - Check each step**:
```bash
# Step 1: Verify .env file exists in current directory
ls -la .env
# Should show: -rw------- 1 youruser youruser ... .env
# Step 2: Check it has content (not empty)
head .env
# Should show lines like: DIGITALOCEAN_TOKEN=dop_v1_abc123...
# Step 3: Load variables into current terminal
source .env
# (no output is normal - silence means success)
# Step 4: Verify loading worked by printing a variable
echo "DIGITALOCEAN_TOKEN is: ${DIGITALOCEAN_TOKEN:0:20}..."
# Should print: DIGITALOCEAN_TOKEN is: dop_v1_abc123xyz789...
# NOT: DIGITALOCEAN_TOKEN is: (blank)
```
**Still not working?** Check these:
- Are you in the correct directory? Run `pwd` to verify
- Is the `.env` file formatted correctly? No spaces around `=` sign
- Did you save the file after editing?
- Did you replace `CHANGEME` with actual values?
### Problem: Git showing .env file
**Symptoms**: `git status` shows `.env` as untracked or modified
**Solution**:
```bash
# Verify gitignore is working
git check-ignore -v .env
# If not ignored, check .gitignore exists
cat .gitignore | grep "\.env"
# If needed, manually add to gitignore
echo ".env" >> .gitignore
```
### Problem: Accidentally committed secrets
**⚠️ CRITICAL - Act immediately!**
**If not yet pushed**:
```bash
# Remove from staging
git reset HEAD .env
# Or undo last commit
git reset --soft HEAD~1
```
**If already pushed**:
1. **DO NOT PANIC** - but act quickly
2. Immediately contact team lead
3. All secrets in that file must be rotated (changed)
4. Team lead will help remove from Git history
---
## Quick Reference Commands
### Daily Workflow (Copy-Paste Template)
Every time you open a new terminal for deployment work, run these commands in order:
```bash
# Step 1: Go to the infrastructure directory
cd ~/monorepo/cloud/infrastructure/production
# Step 2: Load configuration into this terminal session
source .env
# Step 3: Verify token loaded correctly
echo "Token loaded: ${DIGITALOCEAN_TOKEN:0:15}..."
# Step 4: Now you can run deployment commands
# (You'll use these in the next guides)
```
**Why these steps?**
- Step 1: Ensures you're in the right folder where `.env` exists
- Step 2: Loads your DigitalOcean token and other config values
- Step 3: Confirms everything loaded correctly
- Step 4: Ready to create infrastructure!
### One-time Setup Summary
```bash
# Clone repository
git clone https://codeberg.org/mapleopentech/monorepo.git
cd monorepo/cloud/infrastructure/production
# Create .env file
cp .env.template .env
# Add your DigitalOcean token
nano .env
# Set permissions
chmod 600 .env
# Verify gitignored
git check-ignore -v .env
# Load and verify
source .env
echo "Token: ${DIGITALOCEAN_TOKEN:0:15}..."
```
---
## Next Steps
✅ **You've completed:**
- Local workspace setup
- `.env` file creation
- DigitalOcean API token configuration
- Understanding of how to use environment variables
**Next, you'll create infrastructure on DigitalOcean:**
1. **[Initialize Docker Swarm](01_init_docker_swarm.md)** - Create Docker Swarm cluster
2. **[Deploy Cassandra](02_cassandra.md)** - Set up Cassandra database cluster
3. **[Deploy Redis](03_redis.md)** - Set up Redis cache server
4. **[Deploy Meilisearch](04_meilisearch.md)** - Set up Meilisearch search engine
5. **[Configure Spaces](04.5_spaces.md)** - Set up DigitalOcean Spaces object storage
6. **[Deploy Backend](05_maplepress_backend.md)** - Deploy backend application
7. **[Setup Caddy](06_maplepress_caddy.md)** - Configure automatic SSL/TLS with Caddy
---
## Important Notes
### You're Building From Scratch
- **No existing infrastructure**: You'll create everything step by step
- **Generate secrets as needed**: Each guide will tell you when to create passwords/keys
- **Update `.env` as you go**: After creating each resource, add details to `.env`
- **Keep notes**: Write down IPs, passwords as you create them
### The `.env` File Will Grow
**Right now:** Only has `DIGITALOCEAN_TOKEN`
**After creating droplets:** Will have server IP addresses
**After setting up databases:** Will have passwords and connection strings
**At the end:** Will have all infrastructure details documented
---
## Security Reminders
🔒 **Always**:
- Verify `.env` is gitignored (check this NOW: `git check-ignore -v .env`)
- Use `chmod 600` for `.env` files
- Run `source .env` before running deployment scripts
- Keep `.env` file backed up securely (encrypted backup)
🚫 **Never**:
- Commit `.env` files to Git
- Share `.env` via email or Slack
- Use permissive file permissions (644, 777)
- Leave `CHANGEME` values in production
---
## Quick Pre-flight Check
Before continuing to the next guide, verify:
```bash
# 1. You're in the right directory
pwd
# Should show: .../monorepo/cloud/infrastructure/production
# 2. .env file exists with correct permissions
ls -la .env
# Should show: -rw------- ... .env
# 3. Your token is loaded
source .env
echo "Token: ${DIGITALOCEAN_TOKEN:0:15}..."
# Should show: Token: dop_v1_abc123...
# 4. Git won't commit .env
git check-ignore -v .env
# Should show: .gitignore:XX:.env .env
```
**All checks passed?** Continue to [Initialize Docker Swarm](01_init_docker_swarm.md)
---
**Document Version**: 2.0 (From-Scratch Edition)
**Last Updated**: November 3, 2025
**Maintained By**: Infrastructure Team

# Multi-Application Architecture & Naming Conventions
**Audience**: DevOps Engineers, Infrastructure Team, Developers
**Status**: Architecture Reference Document
**Last Updated**: November 2025
---
## Overview
This document defines the **multi-application architecture** for Maple Open Technologies production infrastructure. The infrastructure is designed to support **multiple independent applications** (MaplePress, MapleFile, mapleopentech) sharing common infrastructure (Cassandra, Redis, Meilisearch) while maintaining clear boundaries and isolation.
---
## Architecture Principles
### 1. Shared Infrastructure, Isolated Applications
```
┌─────────────────────────────────────────────────────────────────┐
│ SHARED INFRASTRUCTURE │
│ (Used by ALL apps: MaplePress, MapleFile, mapleopentech) │
├─────────────────────────────────────────────────────────────────┤
│ Infrastructure Workers (1-5): │
│ - Manager Node (worker-1): Redis │
│ - Cassandra Cluster (workers 2,3,4) │
│ - Meilisearch (worker 5) │
│ │
│ Networks: │
│ - maple-private-prod (databases, cache, search) │
│ - maple-public-prod (reverse proxies + backends) │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ APPLICATION: MAPLEPRESS │
├─────────────────────────────────────────────────────────────────┤
│ Worker 6 - MaplePress Backend + Proxy: │
│ Stack: maplepress │
│ │
│ Service: maplepress_backend │
│ Hostname: maplepress-backend │
│ Port: 8000 │
│ Networks: maple-private-prod + maple-public-prod │
│ Connects to: Cassandra, Redis, Meilisearch, Spaces │
│ │
│ Service: maplepress_backend-caddy │
│ Hostname: caddy │
│ Domain: getmaplepress.ca (API) │
│ Proxies to: maplepress-backend:8000 │
│ │
│ Worker 7 - MaplePress Frontend: │
│ Stack: maplepress-frontend │
│ Service: maplepress-frontend_caddy │
│ Hostname: frontend-caddy │
│ Domain: getmaplepress.com (Web UI) │
│ Serves: /var/www/maplepress-frontend/ │
│ Calls: https://getmaplepress.ca (backend API) │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ APPLICATION: MAPLEFILE (Future) │
├─────────────────────────────────────────────────────────────────┤
│ Worker 8 - MapleFile Backend + Proxy: │
│ Stack: maplefile │
│ │
│ Service: maplefile_backend │
│ Hostname: maplefile-backend │
│ Port: 8000 │
│ Networks: maple-private-prod + maple-public-prod │
│ Connects to: Cassandra, Redis, Meilisearch, Spaces │
│ │
│ Service: maplefile_backend-caddy │
│ Hostname: maplefile-backend-caddy │
│ Domain: maplefile.ca (API) │
│ Proxies to: maplefile-backend:8000 │
│ │
│ Worker 9 - MapleFile Frontend: │
│ Stack: maplefile-frontend │
│ Service: maplefile-frontend_caddy │
│ Hostname: maplefile-frontend-caddy │
│ Domain: maplefile.com (Web UI) │
│ Serves: /var/www/maplefile-frontend/ │
│ Calls: https://maplefile.ca (backend API) │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ APPLICATION: mapleopentech (Future) │
├─────────────────────────────────────────────────────────────────┤
│ Worker 10 - mapleopentech Backend + Proxy: │
│ Stack: mapleopentech │
│ │
│ Service: mapleopentech_backend │
│ Hostname: mapleopentech-backend │
│ Port: 8000 │
│ Networks: maple-private-prod + maple-public-prod │
│ Connects to: Cassandra, Redis, Meilisearch, Spaces │
│ │
│ Service: mapleopentech_backend-caddy │
│ Hostname: mapleopentech-backend-caddy │
│ Domain: api.mapleopentech.io (API) │
│ Proxies to: mapleopentech-backend:8000 │
│ │
│ Worker 11 - mapleopentech Frontend: │
│ Stack: mapleopentech-frontend │
│ Service: mapleopentech-frontend_caddy │
│ Hostname: mapleopentech-frontend-caddy │
│ Domain: mapleopentech.io (Web UI) │
│ Serves: /var/www/mapleopentech-frontend/ │
│ Calls: https://api.mapleopentech.io (backend API) │
└─────────────────────────────────────────────────────────────────┘
```
---
## Naming Conventions
### Pattern: **Option C - Hybrid Stacks**
**Strategy**:
- Backend + Backend Caddy in **one stack** (deployed together)
- Frontend Caddy in **separate stack** (independent deployment)
**Why this pattern?**
- Backend and its reverse proxy are tightly coupled → deploy together
- Frontend is independent → deploy separately
- Avoids redundant naming like `maplepress-backend_backend`
- Clean service names: `maplepress_backend`, `maplepress_backend-caddy`, `maplepress-frontend_caddy`
### Stack Names
| Application | Stack Name | Services in Stack | Purpose |
|-------------|-----------------------|---------------------------------|--------------------------------------|
| MaplePress | `maplepress` | `backend`, `backend-caddy` | Backend API + reverse proxy |
| MaplePress | `maplepress-frontend` | `caddy` | Frontend static files |
| MapleFile | `maplefile` | `backend`, `backend-caddy` | Backend API + reverse proxy |
| MapleFile | `maplefile-frontend` | `caddy` | Frontend static files |
| mapleopentech | `mapleopentech` | `backend`, `backend-caddy` | Backend API + reverse proxy |
| mapleopentech | `mapleopentech-frontend` | `caddy` | Frontend static files |
### Service Names (Docker Auto-Generated)
Docker Swarm automatically creates service names from: `{stack-name}_{service-name}`
| Stack Name | Service in YAML | Full Service Name | Purpose |
|-----------------------|------------------|-----------------------------|----------------------------------|
| `maplepress` | `backend` | `maplepress_backend` | Go backend API |
| `maplepress` | `backend-caddy` | `maplepress_backend-caddy` | Backend reverse proxy |
| `maplepress-frontend` | `caddy` | `maplepress-frontend_caddy` | Frontend static file server |
| `maplefile` | `backend` | `maplefile_backend` | Go backend API |
| `maplefile` | `backend-caddy` | `maplefile_backend-caddy` | Backend reverse proxy |
| `maplefile-frontend` | `caddy` | `maplefile-frontend_caddy` | Frontend static file server |
**View services:**
```bash
docker service ls
# Output:
# maplepress_backend 1/1
# maplepress_backend-caddy 1/1
# maplepress-frontend_caddy 1/1
# maplefile_backend 1/1
# maplefile_backend-caddy 1/1
# maplefile-frontend_caddy 1/1
```
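As a sketch of how these names arise, a trimmed-down `maplepress-stack.yml` might look like this (the image name, ports, and omitted settings are illustrative assumptions, not the real stack file):

```yaml
# Deployed with: docker stack deploy -c maplepress-stack.yml maplepress
version: "3.8"
services:
  backend:                  # becomes service "maplepress_backend"
    image: example/maplepress-backend:latest   # hypothetical image
    hostname: maplepress-backend               # DNS name Caddy proxies to
    networks:
      - maple-private-prod
      - maple-public-prod
  backend-caddy:            # becomes service "maplepress_backend-caddy"
    image: caddy:2
    hostname: caddy
    ports:
      - "80:80"
      - "443:443"
    networks:
      - maple-public-prod
networks:
  maple-private-prod:
    external: true
  maple-public-prod:
    external: true
```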
### Hostnames (DNS Resolution Within Networks)
Hostnames are defined in the stack YAML (`hostname: ...`) and used for container-to-container communication.
| Application | Component | Hostname | Used By |
|-------------|-----------------|-------------------------------|----------------------------------|
| MaplePress | Backend | `maplepress-backend` | Caddy proxy, other services |
| MaplePress | Backend Caddy | `caddy` | Internal reference (rarely used) |
| MaplePress | Frontend Caddy | `frontend-caddy` | Internal reference (rarely used) |
| MapleFile | Backend | `maplefile-backend` | Caddy proxy, other services |
| MapleFile   | Backend Caddy   | `maplefile-backend-caddy`     | Internal reference (rarely used) |
| MapleFile   | Frontend Caddy  | `maplefile-frontend-caddy`    | Internal reference (rarely used) |
**Example - Caddyfile for MaplePress backend:**
```caddy
getmaplepress.ca www.getmaplepress.ca {
reverse_proxy maplepress-backend:8000 # Uses hostname, not service name
}
```
**Example - Caddyfile for MapleFile backend:**
```caddy
maplefile.ca www.maplefile.ca {
reverse_proxy maplefile-backend:8000
}
```
### Docker Configs (Auto-Generated with Stack Prefix)
| Stack Name | Config in YAML | Full Config Name |
|-------------------------|----------------|-------------------------------------|
| `maplepress` | `caddyfile` | `maplepress_caddyfile` |
| `maplepress-frontend` | `caddyfile` | `maplepress-frontend_caddyfile` |
| `maplefile` | `caddyfile` | `maplefile_caddyfile` |
| `maplefile-frontend` | `caddyfile` | `maplefile-frontend_caddyfile` |
**View configs:**
```bash
docker config ls
# Output:
# maplepress_caddyfile
# maplepress-frontend_caddyfile
# maplefile_caddyfile
# maplefile-frontend_caddyfile
```
### File Paths
| Application | Component | Path |
|-------------|-----------|---------------------------------------|
| MaplePress | Frontend | `/var/www/maplepress-frontend/` |
| MaplePress | Backend | `/var/www/monorepo/cloud/mapleopentech-backend/` |
| MapleFile | Frontend | `/var/www/maplefile-frontend/` |
| MapleFile | Backend | `/var/www/monorepo/cloud/mapleopentech-backend/` |
---
## Resource Allocation
### Workers 1-5: Shared Infrastructure (ALL Apps)
| Worker | Role | Services | Shared By |
|--------|-----------------------|---------------------------------------|----------------|
| 1 | Manager + Redis | Swarm manager, Redis cache | All apps |
| 2 | Cassandra Node 1 | cassandra-1 | All apps |
| 3 | Cassandra Node 2 | cassandra-2 | All apps |
| 4 | Cassandra Node 3 | cassandra-3 | All apps |
| 5 | Meilisearch | meilisearch (full-text search) | All apps |
### Workers 6-7: MaplePress Application
| Worker | Role | Services |
|--------|----------------------------|---------------------------------------------|
| 6 | MaplePress Backend + Proxy | maplepress_backend, maplepress_backend-caddy |
| 7 | MaplePress Frontend | maplepress-frontend_caddy |
### Workers 8-9: MapleFile Application (Future)
| Worker | Role | Services |
|--------|---------------------------|--------------------------------------------|
| 8 | MapleFile Backend + Proxy | maplefile_backend, maplefile_backend-caddy |
| 9 | MapleFile Frontend | maplefile-frontend_caddy |
### Workers 10-11: mapleopentech Application (Future)
| Worker | Role | Services |
|--------|----------------------------|---------------------------------------------|
| 10 | mapleopentech Backend + Proxy | mapleopentech_backend, mapleopentech_backend-caddy |
| 11 | mapleopentech Frontend | mapleopentech-frontend_caddy |
---
## Network Topology
### maple-private-prod (Shared by ALL Apps)
**Purpose**: Private backend services - databases, cache, search
**Services**:
- Cassandra cluster (3 nodes)
- Redis
- Meilisearch
- **All backend services** (maplepress-backend, maplefile-backend, mapleopentech-backend)
**Security**: No ingress ports, no internet access, internal-only
### maple-public-prod (Per-App Reverse Proxies + Backends)
**Purpose**: Internet-facing services - reverse proxies and backends
**Services**:
- **All backend services** (join both networks)
- **All Caddy reverse proxies** (backend + frontend)
**Security**: Ports 80/443 exposed on workers running Caddy
---
## Deployment Commands
### MaplePress
```bash
# Backend + Backend Caddy (deployed together in one stack)
docker stack deploy -c maplepress-stack.yml maplepress
# Frontend (deployed separately)
docker stack deploy -c maplepress-frontend-stack.yml maplepress-frontend
```
### MapleFile (Future)
```bash
# Backend + Backend Caddy (deployed together in one stack)
docker stack deploy -c maplefile-stack.yml maplefile
# Frontend (deployed separately)
docker stack deploy -c maplefile-frontend-stack.yml maplefile-frontend
```
### mapleopentech (Future)
```bash
# Backend + Backend Caddy (deployed together in one stack)
docker stack deploy -c mapleopentech-stack.yml mapleopentech
# Frontend (deployed separately)
docker stack deploy -c mapleopentech-frontend-stack.yml mapleopentech-frontend
```
---
## Verification Commands
### List All Stacks
```bash
docker stack ls
# Expected output:
# NAME SERVICES
# cassandra 3
# maplepress 2 (backend + backend-caddy)
# maplepress-frontend 1 (frontend caddy)
# maplefile 2 (future)
# maplefile-frontend 1 (future)
# meilisearch 1
# redis 1
```
### List All Services
```bash
docker service ls | sort
# Expected output (partial):
# cassandra_cassandra-1 1/1
# cassandra_cassandra-2 1/1
# cassandra_cassandra-3 1/1
# maplepress_backend 1/1
# maplepress_backend-caddy 1/1
# maplepress-frontend_caddy 1/1
# meilisearch_meilisearch 1/1
# redis_redis 1/1
```
### List All Configs
```bash
docker config ls
# Expected output:
# maplepress_caddyfile
# maplepress-frontend_caddyfile
```
---
## Adding a New Application
To add a new application (e.g., "MaplePortal"):
### 1. Update .env.template
```bash
# Add new section
# ==============================================================================
# MAPLEPORTAL APPLICATION
# ==============================================================================
# Backend Configuration
MAPLEPORTAL_BACKEND_DOMAIN=api.mapleportal.io
MAPLEPORTAL_SPACES_BUCKET=mapleportal-prod
MAPLEPORTAL_JWT_SECRET=CHANGEME
MAPLEPORTAL_IP_ENCRYPTION_KEY=CHANGEME
# Frontend Configuration
MAPLEPORTAL_FRONTEND_DOMAIN=mapleportal.io
MAPLEPORTAL_FRONTEND_API_URL=https://api.mapleportal.io
```
### 2. Create New Workers
```bash
# Worker 12 - Backend + Backend Caddy
# Worker 13 - Frontend Caddy
```
### 3. Follow Naming Convention
- Stack names: `mapleportal` (backend + backend-caddy), `mapleportal-frontend`
- Service names: `mapleportal_backend`, `mapleportal_backend-caddy`, `mapleportal-frontend_caddy`
- Hostnames: `mapleportal-backend`, `mapleportal-backend-caddy`, `mapleportal-frontend-caddy`
- Domains: `api.mapleportal.io` (backend), `mapleportal.io` (frontend)
- Paths: `/var/www/mapleportal-frontend/`
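Because the convention is mechanical, all of these names can be derived from the app name alone; a quick sketch using the hypothetical `mapleportal` app from this section:

```shell
# Derive every conventional name from the application name
app=mapleportal

backend_stack="$app"
frontend_stack="${app}-frontend"

echo "stacks:    $backend_stack, $frontend_stack"
echo "services:  ${backend_stack}_backend, ${backend_stack}_backend-caddy, ${frontend_stack}_caddy"
echo "hostnames: ${app}-backend, ${app}-backend-caddy, ${app}-frontend-caddy"
echo "web root:  /var/www/${app}-frontend/"
```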
### 4. Deploy Services
```bash
# Backend + backend-caddy in one stack
docker stack deploy -c mapleportal-stack.yml mapleportal
# Frontend in separate stack
docker stack deploy -c mapleportal-frontend-stack.yml mapleportal-frontend
```
---
## Benefits of This Architecture
### 1. Clear Separation
- Each app has dedicated workers and services
- No naming conflicts between apps
- Easy to identify which services belong to which app
### 2. Shared Infrastructure Efficiency
- Single Cassandra cluster serves all apps
- Single Redis instance (or sharded by app)
- Single Meilisearch instance with app-prefixed indexes
- Cost savings: 5 workers for infrastructure vs 15+ if each app had its own
### 3. Independent Scaling
- Scale MaplePress without affecting MapleFile
- Deploy new apps without touching existing ones
- Remove apps without impacting infrastructure
### 4. Operational Clarity
```bash
# View only MaplePress services
docker service ls | grep maplepress
# View only MapleFile services
docker service ls | grep maplefile
# Restart MaplePress backend
docker service update --force maplepress_backend
# Remove MapleFile entirely (if needed)
docker stack rm maplefile
docker stack rm maplefile-frontend
```
### 5. Developer Friendly
- Developers instantly know which app they're working with
- No ambiguous "backend" or "frontend" names
- Service discovery is intuitive: `maplepress-backend:8000`
---
## Migration Checklist (For Existing Deployments)
If you deployed with old naming (`caddy`, `maplepress`, `frontend-caddy`), migrate like this:
### Step 1: Update Configuration Files Locally
```bash
cd ~/monorepo/cloud/infrastructure/production
# Update all YAML files to use new naming
# - maplepress → maplepress-backend
# - caddy → maplepress-backend-caddy
# - frontend-caddy → maplepress-frontend-caddy
# Update Caddyfiles to use new hostnames
# - backend:8000 → maplepress-backend:8000
```
### Step 2: Remove Old Stacks
```bash
# On manager node
docker stack rm maplepress
docker stack rm caddy
docker stack rm frontend-caddy
# Wait for cleanup
sleep 10
# Remove old configs
docker config rm caddy_caddyfile
docker config rm frontend-caddy_caddyfile
```
### Step 3: Deploy New Stacks
```bash
# Deploy with new names (Option C naming)
docker stack deploy -c maplepress-stack.yml maplepress
docker stack deploy -c maplepress-frontend-stack.yml maplepress-frontend
```
### Step 4: Verify
```bash
docker service ls
# Should show:
# maplepress_backend
# maplepress_backend-caddy
# maplepress-frontend_caddy
docker config ls
# Should show:
# maplepress_caddyfile
# maplepress-frontend_caddyfile
```
---
**Last Updated**: November 2025
**Maintained By**: Infrastructure Team
**Status**: Production Standard - Follow for All New Applications

# Network Architecture Overview
This document explains the network strategy for Maple Open Technologies production infrastructure.
**See Also**: `00-multi-app-architecture.md` for application naming conventions and multi-app strategy.
## Network Segmentation Strategy
We use a **multi-network architecture** following industry best practices for security and isolation. This infrastructure supports **multiple independent applications** (MaplePress, MapleFile) sharing common infrastructure.
### Network Topology
```
┌─────────────────────────────────────────────────────────────────┐
│ Docker Swarm Cluster │
│ │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ maple-private-prod (Overlay Network) │ │
│ │ No Internet Access | Internal Services Only │ │
│ │ SHARED by ALL applications │ │
│ ├────────────────────────────────────────────────────────────┤ │
│ │ Infrastructure Services: │ │
│ │ ├── Cassandra (3 nodes) - Shared database cluster │ │
│ │ ├── Redis - Shared cache │ │
│ │ └── Meilisearch - Shared search │ │
│ │ │ │
│ │ Application Backends (Join BOTH Networks): │ │
│ │ ├── maplepress-backend:8000 │ │
│ │ ├── maplefile-backend:8000 (future) │ │
│ │ └── mapleopentech-backend:8000 (future) │ │
│ └────────────────────────────────────────────────────────────┘ │
│ │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ maple-public-prod (Overlay Network) │ │
│ │ Internet-Facing | Public Services │ │
│ ├────────────────────────────────────────────────────────────┤ │
│ │ Reverse Proxies (Caddy - ports 80/443): │ │
│ │ ├── maplepress-backend-caddy (getmaplepress.ca) │ │
│ │ ├── maplepress-frontend-caddy (getmaplepress.com) │ │
│ │ ├── maplefile-backend-caddy (maplefile.ca) │ │
│ │ ├── maplefile-frontend-caddy (maplefile.com) │ │
│ │ └── ... (future apps) │ │
│ │ │ │
│ │ Application Backends (Join BOTH Networks): │ │
│ │ ├── maplepress-backend:8000 │ │
│ │ ├── maplefile-backend:8000 (future) │ │
│ │ └── mapleopentech-backend:8000 (future) │ │
│ └────────────────────────────────────────────────────────────┘ │
│ │
│ Note: Application backends join BOTH networks: │
│ - Receive requests from Caddy on maple-public-prod │
│ - Access databases/cache on maple-private-prod │
└─────────────────────────────────────────────────────────────────┘
```
## Networks Explained
### 1. `maple-private-prod` (Current)
**Purpose**: Backend services that should NEVER be exposed to the internet.
**Characteristics:**
- Overlay network (Docker Swarm managed)
- No ingress ports exposed
- No public IP access
- Service-to-service communication only
**Services:**
- Cassandra cluster (3 nodes - shared database) - databases never touch internet
- Redis (shared cache layer)
- Meilisearch (shared search engine)
- **All application backends** (maplepress-backend, maplefile-backend, mapleopentech-backend)
**Security Benefits:**
- Attack surface minimization
- No direct internet access to databases
- Compliance with data protection regulations (PCI-DSS, HIPAA, SOC2)
- Defense in depth architecture
**Service Discovery:**
```bash
# Services can reach each other by hostname
redis:6379
cassandra-1:9042
cassandra-2:9042
cassandra-3:9042
```
### 2. `maple-public-prod` (Current - In Use)
**Purpose**: Internet-facing services that handle external traffic.
**Characteristics:**
- Overlay network with ingress
- Ports 80/443 exposed to internet
- TLS/SSL termination via Caddy
- Automatic Let's Encrypt certificates
- Rate limiting and security headers
**Services:**
- **Caddy reverse proxies** (one per app component):
- `maplepress-backend-caddy` → serves `getmaplepress.ca` → proxies to `maplepress-backend:8000`
- `maplepress-frontend-caddy` → serves `getmaplepress.com` → static React files
- `maplefile-backend-caddy` (future) → serves `maplefile.ca` → proxies to `maplefile-backend:8000`
- `maplefile-frontend-caddy` (future) → serves `maplefile.com` → static React files
- **All application backends** (join both networks):
- `maplepress-backend`
- `maplefile-backend` (future)
- `mapleopentech-backend` (future)
**Routing Flow:**
```
Internet → Caddy Reverse Proxy (maple-public-prod)
→ Application Backend (maple-public-prod + maple-private-prod)
→ Databases/Cache (maple-private-prod only)
Example (MaplePress):
https://getmaplepress.ca → maplepress-backend-caddy
→ maplepress-backend:8000
→ cassandra/redis/meilisearch
```
## Why This Architecture?
### Industry Standard
This pattern is used by:
- **Netflix**: `backend-network` + `edge-network`
- **Spotify**: `data-plane` + `control-plane`
- **AWS**: VPC with `private-subnet` + `public-subnet`
- **Google Cloud**: VPC with internal + external networks
### Security Benefits
1. **Defense in Depth**: Multiple security layers
2. **Least Privilege**: Services only access what they need
3. **Attack Surface Reduction**: Databases never exposed to internet
4. **Network Segmentation**: Compliance requirement for SOC2, PCI-DSS
5. **Blast Radius Containment**: Breach of public network doesn't compromise data layer
### Operational Benefits
1. **Clear Boundaries**: Easy to understand what's exposed
2. **Independent Scaling**: Scale public/private networks separately
3. **Flexible Firewall Rules**: Different rules for different networks
4. **Service Discovery**: DNS-based discovery within each network
5. **Testing**: Can test private services without public exposure
## Network Creation
### Current Setup
Both networks are created and in use:
```bash
# Create private network (done in 02_cassandra.md - shared by ALL apps)
docker network create \
--driver overlay \
--attachable \
maple-private-prod
# Create public network (done in 06_caddy.md - used by reverse proxies)
docker network create \
--driver overlay \
--attachable \
maple-public-prod
# Verify both exist
docker network ls | grep maple
# Should show:
# maple-private-prod
# maple-public-prod
```
### Multi-App Pattern
- **All application backends** join BOTH networks
- **Each app** gets its own Caddy reverse proxy instances
- **Infrastructure services** (Cassandra, Redis, Meilisearch) only on private network
- **Shared efficiently**: 5 infrastructure workers serve unlimited apps
## Service Connection Examples
### Go Backend Connecting to Services
**On `maple-private-prod` network:**
```go
// Redis connection
redisClient := redis.NewClient(&redis.Options{
Addr: "redis:6379", // Resolves via Docker DNS
Password: os.Getenv("REDIS_PASSWORD"),
})
// Cassandra connection
cluster := gocql.NewCluster("cassandra-1", "cassandra-2", "cassandra-3")
cluster.Port = 9042
```
**Docker Stack File for Backend:**
```yaml
version: '3.8'
services:
backend:
image: your-backend:latest
networks:
- maple-private-prod # Access to databases
- maple-public-prod # Receive HTTP requests (when deployed)
environment:
- REDIS_HOST=redis
- CASSANDRA_HOSTS=cassandra-1,cassandra-2,cassandra-3
networks:
maple-private-prod:
external: true
maple-public-prod:
external: true
```
## Firewall Rules
### Private Network
```bash
# On worker nodes
# Only allow traffic from other swarm nodes (10.116.0.0/16)
sudo ufw allow from 10.116.0.0/16 to any port 2377 proto tcp # Swarm
sudo ufw allow from 10.116.0.0/16 to any port 7946 # Gossip
sudo ufw allow from 10.116.0.0/16 to any port 4789 proto udp # Overlay
sudo ufw allow from 10.116.0.0/16 to any port 6379 proto tcp # Redis
sudo ufw allow from 10.116.0.0/16 to any port 9042 proto tcp # Cassandra
```
### Public Network (Caddy Nodes)
```bash
# On workers running Caddy (worker-6, worker-7, worker-8, worker-9, etc.)
sudo ufw allow 80/tcp # HTTP (Let's Encrypt challenge + redirect to HTTPS)
sudo ufw allow 443/tcp # HTTPS (TLS/SSL traffic)
```
## Troubleshooting
### Check Which Networks a Service Uses
```bash
# Inspect service networks
docker service inspect your_service --format '{{.Spec.TaskTemplate.Networks}}'
# Should show network IDs
# Compare with: docker network ls
```
### Test Connectivity Between Networks
```bash
# From a container on maple-private-prod
docker exec -it <container> ping redis
docker exec -it <container> nc -zv cassandra-1 9042
# Should work if on same network
```
### View All Services on a Network
```bash
docker network inspect maple-private-prod --format '{{range .Containers}}{{.Name}} {{end}}'
```
## Migration Path
### Current Status
- ✅ `maple-private-prod` created (Cassandra, Redis, Meilisearch)
- ✅ `maple-public-prod` created (Caddy reverse proxies)
- ✅ MaplePress backend and frontend deployed
- ⏳ MapleFile and mapleopentech applications (future)
### Adding Future Applications
When you are ready to bring a new application online:
1. Deploy its backend on both networks
2. Deploy its Caddy reverse proxy on `maple-public-prod`
3. Point its DNS at the worker running that proxy
Until then, new services can stay on the private network only.
---
**Last Updated**: November 3, 2025
**Status**: Active Architecture
**Maintained By**: Infrastructure Team

# Setting Up Docker Swarm Cluster
**Audience**: Junior DevOps Engineers, Infrastructure Team
**Time to Complete**: 45-60 minutes
**Prerequisites**:
- Completed [00-getting-started.md](00-getting-started.md)
- DigitalOcean account with API token configured
- SSH key added to your DigitalOcean account
---
## Overview
This guide walks you through creating a **Docker Swarm cluster** with 2 DigitalOcean droplets from scratch. You'll create two Ubuntu 24.04 servers, install Docker on both, and configure them as a swarm with private networking.
**What you'll build:**
- **1 Swarm Manager** (`mapleopentech-swarm-manager-1-prod`) - Controls the cluster
- **1 Swarm Worker** (`mapleopentech-swarm-worker-1-prod`) - Runs containers
- **Private networking** - Nodes communicate via DigitalOcean private IPs
**What is Docker Swarm?**
Docker Swarm is a container orchestration tool that lets you run containers across multiple servers as if they were one system. The manager tells workers what containers to run.
**Naming Convention:**
We use simple sequential numbering for servers. Roles (what each server does) are managed through Docker labels and tags, not hardcoded in hostnames. This makes it easy to repurpose servers as needs change.
---
## Table of Contents
1. [Create DigitalOcean Droplets](#create-digitalocean-droplets)
2. [Configure the Swarm Manager](#configure-the-swarm-manager)
3. [Configure the Swarm Worker](#configure-the-swarm-worker)
4. [Verify the Cluster](#verify-the-cluster)
5. [Update Your .env File](#update-your-env-file)
6. [Troubleshooting](#troubleshooting)
---
## Create DigitalOcean Droplets
### Step 1: Create the Swarm Manager Droplet
Log into DigitalOcean: https://cloud.digitalocean.com/
1. Click **Create****Droplets** (top right corner)
2. **Choose Region:**
- Select **Toronto 1**
- This tutorial uses Toronto - you'll use the `default-tor1` VPC
3. **Choose an Image:**
- Select **Ubuntu**
- Choose **24.04 (LTS) x64**
4. **Choose Size:**
- **Droplet Type**: Basic
- **CPU Options**: Regular
- **Size**: $12/month (2 GB RAM / 1 vCPU / 50 GB SSD)
5. **VPC Network:**
- Select **default-tor1** (auto-created by DigitalOcean)
6. **Authentication:**
- Select **SSH Key**
- Check the SSH key you added earlier
7. **Finalize:**
- **Hostname**: `mapleopentech-swarm-manager-1-prod`
- **Tags**: `production`, `swarm`, `manager`
- **Monitoring**: Enable
8. Click **Create Droplet** button (bottom right)
9. **Wait 1-2 minutes** for droplet to be created
10. **Record IPs:**
- Copy **Public IP** from droplet list
- Click droplet → copy **Private IPv4**
**✅ Checkpoint - Update your `.env` file now:**
```bash
# Open .env
nano ~/monorepo/cloud/infrastructure/production/.env
# Add these values (replace with YOUR actual IPs):
SWARM_REGION=tor1
SWARM_VPC_NAME=default-tor1
SWARM_MANAGER_1_HOSTNAME=mapleopentech-swarm-manager-1-prod
SWARM_MANAGER_1_PUBLIC_IP=159.65.123.45 # Your manager's public IP
SWARM_MANAGER_1_PRIVATE_IP=10.116.0.2 # Your manager's private IP
```
### Step 2: Create the Swarm Worker Droplet
Same settings as manager, except:
- **Hostname**: `mapleopentech-swarm-worker-1-prod`
- **Tags**: `production`, `swarm`, `worker`
Click **Create Droplet** and record both IPs.
**✅ Checkpoint - Update your `.env` file:**
```bash
# Add worker info (replace with YOUR actual IPs):
SWARM_WORKER_1_HOSTNAME=mapleopentech-swarm-worker-1-prod
SWARM_WORKER_1_PUBLIC_IP=159.65.123.46 # Your worker's public IP
SWARM_WORKER_1_PRIVATE_IP=10.116.0.3 # Your worker's private IP
```
### Step 3: Verify Private Networking
Check both droplets are in `default-tor1` VPC:
1. **Networking****VPC** → Click `default-tor1`
2. Both droplets should be listed
3. Note the subnet (e.g., `10.116.0.0/16`)
Private IPs should start with the same prefix (e.g., `10.116.0.2` and `10.116.0.3`).
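Since a `/16` subnet means both private IPs must share their first two octets, a quick sketch can sanity-check the values before you continue (the IPs here are the example values from this guide; substitute your own):

```shell
# Sanity-check that both droplets' private IPs fall inside the VPC /16
manager_ip=10.116.0.2   # example value; substitute yours
worker_ip=10.116.0.3
vpc_prefix=10.116.      # first two octets of 10.116.0.0/16

ok=yes
for ip in "$manager_ip" "$worker_ip"; do
  case "$ip" in
    "$vpc_prefix"*) echo "$ip: in VPC" ;;
    *) echo "$ip: NOT in VPC"; ok=no ;;
  esac
done
echo "same subnet: $ok"
```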
**✅ Checkpoint - Update your `.env` file:**
```bash
# On your local machine, add:
SWARM_REGION=tor1
SWARM_VPC_NAME=default-tor1
SWARM_VPC_SUBNET=10.116.0.0/16 # Use YOUR actual subnet from VPC dashboard
```
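If you'd rather derive the subnet value than read it off the dashboard, shell parameter expansion can build it from either private IP, and a comparison confirms both nodes share it. This sketch assumes a `/16` VPC (DigitalOcean's default); the IPs are placeholders to replace with yours.

```shell
MANAGER_PRIVATE_IP=10.116.0.2   # replace with YOUR manager's private IP
WORKER_PRIVATE_IP=10.116.0.3    # replace with YOUR worker's private IP

# Strip the last two octets, then append .0.0/16
SWARM_VPC_SUBNET="${MANAGER_PRIVATE_IP%.*.*}.0.0/16"
echo "SWARM_VPC_SUBNET=${SWARM_VPC_SUBNET}"

# Both nodes must share the prefix, or they are in different VPCs
if [ "${MANAGER_PRIVATE_IP%.*.*}" = "${WORKER_PRIVATE_IP%.*.*}" ]; then
  echo "Nodes share the ${SWARM_VPC_SUBNET} subnet"
else
  echo "WARNING: nodes are on different subnets - check your VPC!"
fi
```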
---
## Configure the Swarm Manager
### Step 1: Initial SSH as Root
```bash
# SSH as root (replace with YOUR manager's public IP)
ssh root@159.65.123.45
# Type 'yes' if asked about fingerprint
# You should now see: root@mapleopentech-swarm-manager-1-prod:~#
```
### Step 2: System Updates and Create Admin User
```bash
# Update and upgrade system
apt update && apt upgrade -y
# Install essential packages
apt install -y curl wget apt-transport-https ca-certificates gnupg lsb-release
# Create dedicated Docker admin user
adduser dockeradmin
# Enter a strong password when prompted
# Press Enter for other prompts (or fill them in)
# Add to sudo group
usermod -aG sudo dockeradmin
# Copy SSH keys to new user
rsync --archive --chown=dockeradmin:dockeradmin ~/.ssh /home/dockeradmin
```
**✅ Checkpoint - Update your `.env` file:**
```bash
# On your local machine, add:
DOCKERADMIN_PASSWORD=your_strong_password_here # The password you just created
```
### Step 3: Secure SSH Configuration
```bash
# Edit SSH config
nano /etc/ssh/sshd_config
```
Find and update these lines (use `Ctrl+W` to search):
```
PermitRootLogin no
PasswordAuthentication no
PubkeyAuthentication yes
MaxAuthTries 3
LoginGraceTime 60
```
Save and exit (`Ctrl+X`, `Y`, `Enter`), then restart SSH:
```bash
systemctl restart ssh
```
### Step 4: Reconnect as dockeradmin
```bash
# Exit root session
exit
# On your local machine, reconnect as dockeradmin:
ssh dockeradmin@159.65.123.45 # Your manager's public IP
# You should now see: dockeradmin@mapleopentech-swarm-manager-1-prod:~#
```
### Step 5: Install Docker
```bash
# Install Docker using official convenience script
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
# Add dockeradmin to docker group (no sudo needed for docker commands)
sudo usermod -aG docker dockeradmin
# Reload groups (or logout/login)
newgrp docker
# Verify Docker is installed
docker --version
# Should show: Docker version 24.x.x or 25.x.x
# Enable Docker to start on boot
sudo systemctl enable docker
# Check Docker is running
sudo systemctl status docker
# Should show: "active (running)" in green
# Press 'q' to exit
```
### Step 6: Configure Firewall for Docker Swarm
Docker Swarm needs specific ports open on the **PRIVATE network**:
```bash
# Install UFW (firewall) if not already installed
sudo apt install ufw -y
# Allow SSH (important - don't lock yourself out!)
sudo ufw allow 22/tcp
# Allow Docker Swarm ports on private network
# Port 2377: Cluster management (TCP)
sudo ufw allow from 10.116.0.0/16 to any port 2377 proto tcp
# Port 7946: Node communication (TCP and UDP)
sudo ufw allow from 10.116.0.0/16 to any port 7946
# Port 4789: Overlay network traffic (UDP)
sudo ufw allow from 10.116.0.0/16 to any port 4789 proto udp
# Enable firewall
sudo ufw --force enable
# Check firewall status
sudo ufw status verbose
```
**IMPORTANT**: Replace `10.116.0.0/16` with your actual private network subnet. If your private IPs are `10.116.x.x`, use `10.116.0.0/16`. If they're `10.108.x.x`, use `10.108.0.0/16`.
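To avoid typing the subnet three times (and mistyping one of them), you can generate the rules from a single variable and review them before running. A sketch - it only prints the commands; copy-paste them (or pipe to `sh`) once they look right.

```shell
SUBNET=10.116.0.0/16   # YOUR private VPC subnet

# Print one ufw rule per Docker Swarm port; review, then run them
printf 'sudo ufw allow from %s to any port 2377 proto tcp\n' "$SUBNET"
printf 'sudo ufw allow from %s to any port 7946\n'           "$SUBNET"
printf 'sudo ufw allow from %s to any port 4789 proto udp\n' "$SUBNET"
```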
### Step 7: Initialize Docker Swarm
```bash
# Get the private IP of this manager droplet
ip addr show eth1 | grep "inet " | awk '{print $2}' | cut -d/ -f1
# Expected output: 10.116.0.2 (or similar)
# This is your PRIVATE IP - copy it
# Initialize swarm using PRIVATE IP
# Replace 10.116.0.2 with YOUR manager's private IP
docker swarm init --advertise-addr 10.116.0.2
# Expected output:
# Swarm initialized: current node (abc123...) is now a manager.
#
# To add a worker to this swarm, run the following command:
#
# docker swarm join --token SWMTKN-1-xxx... 10.116.0.2:2377
```
### Step 8: Save the Join Token
Copy the join command from the output above.
**✅ Checkpoint - Update your `.env` file:**
```bash
# Extract just the token part and add to .env:
SWARM_JOIN_TOKEN=SWMTKN-1-4abc123xyz789verylongtoken # Your actual token
# To get token again if needed:
# docker swarm join-token worker -q
```
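On the manager, `docker swarm join-token worker -q` prints the bare token, which makes it easy to append to `.env` without hand-copying. The sketch below uses a placeholder token and a `.env` in the current directory; on your machine you would point `ENV_FILE` at the production path shown earlier.

```shell
# On the manager you would capture the real token with:
#   TOKEN=$(docker swarm join-token worker -q)
TOKEN=SWMTKN-1-4abc123xyz789verylongtoken   # placeholder value

ENV_FILE=.env   # in practice: ~/monorepo/cloud/infrastructure/production/.env
echo "SWARM_JOIN_TOKEN=${TOKEN}" >> "$ENV_FILE"

# Confirm it landed
grep SWARM_JOIN_TOKEN "$ENV_FILE"
```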
### Step 9: Verify Manager Status
```bash
# Check swarm status
docker info | grep Swarm
# Should show: Swarm: active
# List nodes in the cluster
docker node ls
# Expected output:
# ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
# abc123... * swarm-manager Ready Active Leader
```
✅ **Success!** The manager is now running. Keep this SSH session open or note the join command.
---
## Configure the Swarm Worker
### Step 1: Initial SSH as Root
**Open a NEW terminal window** on your local machine (keep manager terminal open):
```bash
# Replace with YOUR swarm worker's PUBLIC IP
ssh root@159.65.123.46
# Type 'yes' if asked about fingerprint
# You should now see: root@mapleopentech-swarm-worker-1-prod:~#
```
### Step 2: System Updates and Create Admin User
```bash
# Update and upgrade system
apt update && apt upgrade -y
# Install essential packages
apt install -y curl wget apt-transport-https ca-certificates gnupg lsb-release
# Create dedicated Docker admin user
adduser dockeradmin
# Enter a strong password when prompted (use the SAME password as manager)
# Add to sudo group
usermod -aG sudo dockeradmin
# Copy SSH keys to new user
rsync --archive --chown=dockeradmin:dockeradmin ~/.ssh /home/dockeradmin
```
**✅ Checkpoint - Verify `.env` has:**
```bash
DOCKERADMIN_PASSWORD=YourStrongPasswordHere # Same as manager
```
### Step 3: Secure SSH Configuration
```bash
# Edit SSH configuration
nano /etc/ssh/sshd_config
# Find and update these lines:
PermitRootLogin no
PasswordAuthentication no
PubkeyAuthentication yes
MaxAuthTries 3
LoginGraceTime 60
# Save: Ctrl+X, then Y, then Enter
# Restart SSH service
systemctl restart ssh
```
### Step 4: Reconnect as dockeradmin
**Exit current session and reconnect:**
```bash
# Exit root session
exit
# SSH back in as dockeradmin
ssh dockeradmin@159.65.123.46 # Replace with YOUR worker's PUBLIC IP
# You should now see: dockeradmin@mapleopentech-swarm-worker-1-prod:~$
```
### Step 5: Install Docker
```bash
# Install Docker
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
# Add dockeradmin to docker group
sudo usermod -aG docker dockeradmin
# Reload groups
newgrp docker
# Verify Docker is installed
docker --version
# Should show: Docker version 24.x.x or 25.x.x
# Enable Docker to start on boot
sudo systemctl enable docker
# Check Docker is running
sudo systemctl status docker
# Should show: "active (running)" in green
# Press 'q' to exit
```
### Step 6: Configure Firewall
```bash
# Install UFW
sudo apt install ufw -y
# Allow SSH
sudo ufw allow 22/tcp
# Allow Docker Swarm ports on private network
# (Use YOUR network subnet - e.g., 10.116.0.0/16)
sudo ufw allow from 10.116.0.0/16 to any port 2377 proto tcp
sudo ufw allow from 10.116.0.0/16 to any port 7946
sudo ufw allow from 10.116.0.0/16 to any port 4789 proto udp
# Enable firewall
sudo ufw --force enable
# Check firewall status
sudo ufw status verbose
```
### Step 7: Join the Swarm
**Use the join command you saved from Step 8 of the manager setup:**
```bash
# Paste the ENTIRE command you copied earlier
# Example (use YOUR actual token and manager private IP):
docker swarm join --token SWMTKN-1-4abc123xyz789verylongtoken 10.116.0.2:2377
# Expected output:
# This node joined a swarm as a worker.
```
✅ **Success!** The worker has joined the swarm.
---
## Verify the Cluster
### Step 1: Check Nodes from Manager
**Go back to your manager terminal** (or SSH back in if you closed it):
```bash
ssh dockeradmin@159.65.123.45 # Your manager's PUBLIC IP
```
Run this command:
```bash
# List all nodes in the swarm
docker node ls
# Expected output (2 nodes):
# ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
# abc123... * mapleopentech-swarm-manager-1-prod Ready Active Leader
# def456... mapleopentech-swarm-worker-1-prod Ready Active
```
**What you should see:**
- ✅ Both nodes listed
- ✅ Both have STATUS = "Ready"
- ✅ Manager shows "Leader"
- ✅ Worker shows nothing in MANAGER STATUS (this is correct)
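For scripted checks (say, a deploy-pipeline gate) you can count Ready nodes instead of eyeballing the table. The block below parses a captured sample of `docker node ls` output so it runs anywhere; on the manager you would pipe the real command into the same `grep`.

```shell
# Sample output; on the manager run:  docker node ls | grep -c ' Ready '
sample='ID        HOSTNAME                             STATUS  AVAILABILITY  MANAGER STATUS
abc123 *  mapleopentech-swarm-manager-1-prod   Ready   Active        Leader
def456    mapleopentech-swarm-worker-1-prod    Ready   Active'

# Count lines containing the Ready status column
ready=$(echo "$sample" | grep -c ' Ready ')
echo "${ready} nodes Ready"
```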
### Step 2: Test Private Network Communication
**From the manager**, ping the worker's private IP:
```bash
# Replace 10.116.0.3 with YOUR worker's private IP
ping -c 4 10.116.0.3
# Expected output:
# 64 bytes from 10.116.0.3: icmp_seq=1 ttl=64 time=0.5 ms
# 64 bytes from 10.116.0.3: icmp_seq=2 ttl=64 time=0.3 ms
# ...
# 4 packets transmitted, 4 received, 0% packet loss
```
**From the worker**, ping the manager's private IP:
```bash
# Switch to worker terminal and run:
# Replace 10.116.0.2 with YOUR manager's private IP
ping -c 4 10.116.0.2
# Expected output:
# 64 bytes from 10.116.0.2: icmp_seq=1 ttl=64 time=0.5 ms
# ...
# 4 packets transmitted, 4 received, 0% packet loss
```
✅ **If pings work:** Private networking is correctly configured!
❌ **If pings fail:** See [Troubleshooting](#troubleshooting) section below
### Step 3: Deploy a Test Service
Let's verify the swarm actually works by deploying a test container:
**From the manager:**
```bash
# Create a simple nginx web server service
docker service create \
--name test-web \
--replicas 2 \
--publish 8080:80 \
nginx:latest
# Check service status
docker service ls
# Expected output:
# ID NAME MODE REPLICAS IMAGE
# xyz123 test-web replicated 2/2 nginx:latest
# See which nodes are running the containers
docker service ps test-web
# Expected output shows containers on both manager and worker
```
**Test the service:**
```bash
# From your local machine, test the service via manager's PUBLIC IP
curl http://159.65.123.45:8080
# Should show HTML output from nginx
# Example: <!DOCTYPE html> <html> ...
# Also test via worker's PUBLIC IP
curl http://159.65.123.46:8080
# Should also work - Swarm's ingress routing mesh publishes the port on every node
```
**Clean up the test service:**
```bash
# Remove the test service (from manager)
docker service rm test-web
# Verify it's gone
docker service ls
# Should show: (empty)
```
✅ **If the test service worked:** Your Docker Swarm cluster is fully operational!
---
## ✅ Final Checkpoint - Verify Your `.env` File
Your `.env` should now have all swarm configuration. Verify it:
```bash
# On your local machine:
cd ~/monorepo/cloud/infrastructure/production
# Check .env has all variables:
grep "SWARM" .env
# Expected output (with YOUR actual values):
# SWARM_REGION=tor1
# SWARM_VPC_NAME=default-tor1
# SWARM_VPC_SUBNET=10.116.0.0/16
# SWARM_MANAGER_1_HOSTNAME=mapleopentech-swarm-manager-1-prod
# SWARM_MANAGER_1_PUBLIC_IP=159.65.123.45
# SWARM_MANAGER_1_PRIVATE_IP=10.116.0.2
# SWARM_WORKER_1_HOSTNAME=mapleopentech-swarm-worker-1-prod
# SWARM_WORKER_1_PUBLIC_IP=159.65.123.46
# SWARM_WORKER_1_PRIVATE_IP=10.116.0.3
# SWARM_JOIN_TOKEN=SWMTKN-1-...
# Load and test:
source .env
echo "✓ Manager: ${SWARM_MANAGER_1_HOSTNAME}"
echo "✓ Worker: ${SWARM_WORKER_1_HOSTNAME}"
```
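Beyond eyeballing `grep` output, a small loop can fail loudly when a swarm variable is missing. The values below are this guide's placeholders so the sketch is self-contained; in practice you would `source .env` first instead of setting them inline.

```shell
# Placeholder values - in practice, run `source .env` instead of setting these
SWARM_REGION=tor1
SWARM_VPC_SUBNET=10.116.0.0/16
SWARM_MANAGER_1_PRIVATE_IP=10.116.0.2
SWARM_WORKER_1_PRIVATE_IP=10.116.0.3
SWARM_JOIN_TOKEN=SWMTKN-1-example

missing=0
for var in SWARM_REGION SWARM_VPC_SUBNET SWARM_MANAGER_1_PRIVATE_IP \
           SWARM_WORKER_1_PRIVATE_IP SWARM_JOIN_TOKEN; do
  eval "val=\${$var}"          # indirect lookup: read the variable named in $var
  if [ -z "$val" ]; then
    echo "MISSING: $var"
    missing=1
  fi
done
[ "$missing" -eq 0 ] && echo "All swarm variables present"
```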
---
## Troubleshooting
### Problem: Worker Cannot Join Swarm
**Symptom**: `docker swarm join` fails with "connection refused" or timeout
**Solutions:**
1. **Check firewall on manager:**
```bash
# On manager:
sudo ufw status verbose
# Should show rules allowing port 2377 from private network
# If missing, add it:
sudo ufw allow from 10.116.0.0/16 to any port 2377 proto tcp
```
2. **Verify you're using PRIVATE IP in join command:**
```bash
# Join command should use PRIVATE IP (10.x.x.x), not PUBLIC IP
# WRONG: docker swarm join --token ... 159.65.123.45:2377
# RIGHT: docker swarm join --token ... 10.116.0.2:2377
```
3. **Check both nodes are in same VPC:**
**From DigitalOcean dashboard:**
- Go to **Networking** → **VPC**
- Click on your VPC (e.g., `default-tor1`)
- Both droplets should be listed as members
**From command line (on each node):**
```bash
# On manager:
ip addr show eth1 | grep "inet "
# Should show: 10.116.0.2/16
# On worker:
ip addr show eth1 | grep "inet "
# Should show: 10.116.0.3/16 (same 10.116 prefix)
# If prefix is different, they're in different regions/VPCs!
```
### Problem: Nodes Cannot Ping Each Other
**Symptom**: `ping` command fails between nodes
**Solutions:**
1. **Check firewall allows ICMP (ping):**
```bash
# On both nodes:
sudo ufw allow from 10.116.0.0/16
```
2. **Verify private IPs are correct:**
```bash
# On each node, check private IP:
ip addr show eth1
# Should show inet 10.x.x.x
# If you only see eth0, you don't have private networking enabled
```
3. **Check DigitalOcean VPC settings:**
- Go to DigitalOcean dashboard
- Click **Networking** → **VPC**
- Click on your VPC (e.g., `default-tor1`)
- Verify both droplets are listed as members
- If not, you created them in different regions - delete and recreate in the same region
- **Remember:** Each region has its own default VPC (Toronto = `default-tor1`, NYC = `default-nyc1`, etc.)
### Problem: Docker Not Installed
**Symptom**: `docker: command not found`
**Solution:**
```bash
# Verify Docker installation
which docker
# Should show: /usr/bin/docker
# If not found, reinstall:
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
sudo systemctl start docker
sudo systemctl enable docker
```
### Problem: Lost Join Token
**Solution:**
```bash
# On manager, regenerate token:
docker swarm join-token worker
# Copy the full command shown in output
```
### Problem: Wrong Node Joined as Manager
**Symptom**: Both nodes show as managers
**Solution:**
```bash
# On the node that should be a worker:
docker swarm leave --force
# Then re-join with worker token (from manager):
docker swarm join-token worker
# Copy and run the command shown
```
### Problem: Firewall Locked You Out
**Symptom**: Cannot SSH after enabling UFW
**Solution:**
- Use DigitalOcean console:
1. Go to droplet in DigitalOcean dashboard
2. Click **Access** → **Launch Droplet Console**
3. Log in as root
4. Fix firewall:
```bash
ufw allow 22/tcp
ufw reload
```
---
## Quick Reference Commands
### Check Swarm Status (on manager)
```bash
# List all nodes
docker node ls
# Show detailed node info
docker node inspect mapleopentech-swarm-worker-1-prod
# Check swarm status
docker info | grep -A 5 Swarm
```
### Add More Workers (in future)
```bash
# On manager, get join token:
docker swarm join-token worker
# Copy output, then on new worker (e.g., worker-2):
# Create droplet with hostname: mapleopentech-swarm-worker-2-prod
# Install Docker, then paste the join command
```
### Remove a Worker
```bash
# On worker:
docker swarm leave
# On manager (replace with actual hostname):
docker node rm mapleopentech-swarm-worker-1-prod
```
### View Service Logs
```bash
# On manager:
docker service logs <service-name>
```
---
## Security Best Practices
### 🔒 Recommendations
1. **Use SSH keys only** - Disable password authentication:
```bash
# On both nodes:
sudo nano /etc/ssh/sshd_config
# Set: PasswordAuthentication no
sudo systemctl restart ssh
```
2. **Enable automatic security updates:**
```bash
# On both nodes:
sudo apt install unattended-upgrades -y
sudo dpkg-reconfigure -plow unattended-upgrades
```
3. **Limit SSH to specific IPs** (if you have static IP):
```bash
# On both nodes:
sudo ufw delete allow 22/tcp
sudo ufw allow from YOUR.HOME.IP.ADDRESS to any port 22 proto tcp
```
4. **Regular backups** - Enable DigitalOcean droplet backups (Settings → Backups)
5. **Monitor logs:**
```bash
# On both nodes:
sudo journalctl -u docker -f
```
---
## Next Steps
✅ **You've completed:**
- Created 2 DigitalOcean droplets (Ubuntu 24.04)
- Installed Docker on both
- Configured Docker Swarm with private networking
- Verified cluster connectivity
- Updated `.env` file with infrastructure details
**Next:**
- **[Deploy Cassandra](02_cassandra.md)** - Set up Cassandra database cluster on the swarm
- **[Deploy Redis](03_redis.md)** - Set up Redis cache server
- **[Deploy Meilisearch](04_meilisearch.md)** - Set up Meilisearch search engine
---
## Summary of What You Built
```
┌──────────────────────────────────────────────────────────────────┐
│                        DigitalOcean Cloud                        │
│                                                                  │
│  ┌──────────────────────────┐      ┌──────────────────────────┐  │
│  │ mapleopentech-swarm-     │      │ mapleopentech-swarm-     │  │
│  │ manager-1-prod (Leader)  │◄────►│ worker-1-prod (Worker)   │  │
│  │                          │      │                          │  │
│  │ Public:  159.65.123.45   │      │ Public:  159.65.123.46   │  │
│  │ Private: 10.116.0.2      │      │ Private: 10.116.0.3      │  │
│  └──────────────────────────┘      └──────────────────────────┘  │
│               │                                 │                │
│               └───────────────VPC───────────────┘                │
│                        (Private Network)                         │
│                                                                  │
│  Future: add mapleopentech-swarm-worker-2-prod, etc.             │
└──────────────────────────────────────────────────────────────────┘
```
---
**Document Version**: 1.0 (From-Scratch Edition)
**Last Updated**: November 3, 2025
**Maintained By**: Infrastructure Team
# Redis Setup (Single Instance)
**Prerequisites**: Complete [01_init_docker_swarm.md](01_init_docker_swarm.md) first
**Time to Complete**: 15-20 minutes
**What You'll Build**:
- Single Redis instance on existing worker-1
- Password-protected with Docker secrets
- Private network communication only (maple-private-prod overlay)
- Persistent data with AOF + RDB
- Ready for Go application connections
---
## Table of Contents
1. [Overview](#overview)
2. [Label Worker Node](#label-worker-node)
3. [Create Redis Password Secret](#create-redis-password-secret)
4. [Deploy Redis](#deploy-redis)
5. [Verify Redis Health](#verify-redis-health)
6. [Connect from Application](#connect-from-application)
7. [Redis Management](#redis-management)
8. [Troubleshooting](#troubleshooting)
---
## Overview
### Architecture
```
Docker Swarm Cluster:
├── mapleopentech-swarm-manager-1-prod (10.116.0.2)
│ └── Orchestrates cluster
├── mapleopentech-swarm-worker-1-prod (10.116.0.3)
│ └── Redis (single instance)
│ ├── Network: maple-private-prod (overlay, shared)
│ ├── Port: 6379 (private only)
│ ├── Auth: Password (Docker secret)
│ └── Data: Persistent volume
└── mapleopentech-swarm-worker-2,3,4-prod
└── Cassandra Cluster (3 nodes)
└── Same network: maple-private-prod
Shared Network (maple-private-prod):
├── All services can communicate
├── Service discovery by name (redis, cassandra-1, etc.)
└── No public internet access
Future Application:
└── mapleopentech-swarm-worker-X-prod
└── Go Backend → Connects to redis:6379 and cassandra:9042 on maple-private-prod
```
### Redis Configuration
- **Version**: Redis 7 (Alpine)
- **Memory**: 512MB max (with LRU eviction)
- **Persistence**: AOF (every second) + RDB snapshots
- **Network**: Private overlay network only
- **Authentication**: Required via Docker secret
- **Security**: Dangerous commands disabled (FLUSHALL, CONFIG, etc.)
### Why Worker-1?
- Already exists from Docker Swarm setup
- Available capacity (2GB RAM droplet)
- Keeps costs down (no new droplet needed)
- Sufficient for caching workload
---
## Label Worker Node
We'll use Docker node labels to ensure Redis always deploys to worker-1.
**On your manager node:**
```bash
# SSH to manager
ssh dockeradmin@<manager-public-ip>
# Label worker-1 for Redis placement
docker node update --label-add redis=true mapleopentech-swarm-worker-1-prod
# Verify label
docker node inspect mapleopentech-swarm-worker-1-prod --format '{{.Spec.Labels}}'
# Should show: map[redis:true]
```
---
## Create Redis Password Secret
Redis will use Docker secrets for password authentication.
### Step 1: Generate Strong Password
**On your manager node:**
```bash
# Generate a random 32-character password
REDIS_PASSWORD=$(openssl rand -base64 32 | tr -d "=+/" | cut -c1-32)
# Display it (SAVE THIS IN YOUR PASSWORD MANAGER!)
echo $REDIS_PASSWORD
# Example output: a8K9mP2nQ7rT4vW5xY6zB3cD1eF0gH8i
```
**⚠️ IMPORTANT**: Save this password in your password manager now! You'll need it for:
- Application configuration
- Manual Redis CLI connections
- Troubleshooting
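If the `openssl` pipeline ever yields fewer than 32 characters (stripping `=+/` can shorten it), an alternative that always produces exactly 32 alphanumeric characters reads from `/dev/urandom` directly. A sketch - either approach makes a fine Redis password.

```shell
# Draw random bytes, keep only A-Za-z0-9, take exactly 32 characters
REDIS_PASSWORD=$(tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 32)

# Sanity-check the length
echo "length: ${#REDIS_PASSWORD}"
```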
### Step 2: Create Docker Secret
```bash
# Create secret from the password
echo $REDIS_PASSWORD | docker secret create redis_password -
# Verify secret was created
docker secret ls
# Should show:
# ID NAME CREATED
# abc123... redis_password About a minute ago
```
### Step 3: Update .env File
**On your local machine**, update your `.env` file:
```bash
# Add to cloud/infrastructure/production/.env
REDIS_HOST=redis
REDIS_PORT=6379
REDIS_PASSWORD=<paste-the-password-here>
```
---
## Deploy Redis
### Step 1: Create Redis Stack File
**On your manager node:**
```bash
# Create directory for stack files (if not exists)
mkdir -p ~/stacks
cd ~/stacks
# Create Redis stack file
vi redis-stack.yml
```
Copy and paste the following:
```yaml
version: '3.8'
networks:
maple-private-prod:
external: true
volumes:
redis-data:
secrets:
redis_password:
external: true
services:
redis:
image: redis:7-alpine
hostname: redis
networks:
- maple-private-prod
volumes:
- redis-data:/data
secrets:
- redis_password
# Command with password from secret
command: >
sh -c '
redis-server
--requirepass "$$(cat /run/secrets/redis_password)"
--bind 0.0.0.0
--port 6379
--protected-mode no
--save 900 1
--save 300 10
--save 60 10000
--appendonly yes
--appendfilename "appendonly.aof"
--appendfsync everysec
--maxmemory 512mb
--maxmemory-policy allkeys-lru
--loglevel notice
--databases 16
--timeout 300
--tcp-keepalive 300
--io-threads 2
--io-threads-do-reads yes
--slowlog-log-slower-than 10000
--slowlog-max-len 128
--activerehashing yes
--maxclients 10000
--rename-command FLUSHDB ""
--rename-command FLUSHALL ""
--rename-command CONFIG ""
'
deploy:
replicas: 1
placement:
constraints:
- node.labels.redis == true
restart_policy:
condition: on-failure
delay: 5s
max_attempts: 3
resources:
limits:
memory: 768M
reservations:
memory: 512M
healthcheck:
test: ["CMD", "sh", "-c", "redis-cli -a $$(cat /run/secrets/redis_password) ping | grep PONG"]
interval: 10s
timeout: 3s
retries: 3
start_period: 10s
```
Save and exit (`:wq` in vi).
### Step 2: Verify Shared Overlay Network
**Check if the maple-private-prod network exists:**
```bash
docker network ls | grep maple-private-prod
```
**You should see:**
```
abc123... maple-private-prod overlay swarm
```
**If you completed 02_cassandra.md** (Step 4), the network already exists and you're good to go!
**If the network doesn't exist**, create it now:
```bash
# Create the shared maple-private-prod network
docker network create \
--driver overlay \
--attachable \
maple-private-prod
# Verify it was created
docker network ls | grep maple-private-prod
```
**What is this network?**
- Shared by all Maple services (Cassandra, Redis, your Go backend)
- Enables private communication between services
- Service names act as hostnames (e.g., `redis`, `cassandra-1`)
- No public exposure - overlay network is internal only
### Step 3: Deploy Redis Stack
```bash
# Deploy Redis
docker stack deploy -c redis-stack.yml redis
# Expected output:
# Creating service redis_redis
```
### Step 4: Verify Deployment
```bash
# Check service status
docker service ls
# Should show:
# ID NAME REPLICAS IMAGE
# xyz... redis_redis 1/1 redis:7-alpine
# Check which node it's running on
docker service ps redis_redis
# Should show mapleopentech-swarm-worker-1-prod
# Watch logs
docker service logs -f redis_redis
# Should see: "Ready to accept connections"
# Press Ctrl+C when done
```
Redis should be up and running in ~10-15 seconds.
---
## Verify Redis Health
### Step 1: Test Redis Connection
**SSH to worker-1:**
```bash
# Get worker-1's public IP from your .env
ssh dockeradmin@<worker-1-public-ip>
# Get Redis container ID
REDIS_CONTAINER=$(docker ps -q --filter "name=redis_redis")
# Test connection (replace PASSWORD with your actual password)
docker exec -it $REDIS_CONTAINER redis-cli -a YOUR_REDIS_PASSWORD ping
# Should return: PONG
```
### Step 2: Test Basic Operations
```bash
# Set a test key
docker exec -it $REDIS_CONTAINER redis-cli -a YOUR_REDIS_PASSWORD SET test:key "Hello Redis"
# Returns: OK
# Get the test key
docker exec -it $REDIS_CONTAINER redis-cli -a YOUR_REDIS_PASSWORD GET test:key
# Returns: "Hello Redis"
# Check Redis info
docker exec -it $REDIS_CONTAINER redis-cli -a YOUR_REDIS_PASSWORD INFO server
# Shows Redis version, uptime, etc.
# Check memory usage
docker exec -it $REDIS_CONTAINER redis-cli -a YOUR_REDIS_PASSWORD INFO memory
# Shows memory stats
```
---
## Redis Management
### Restarting Redis
```bash
# On manager node
docker service update --force redis_redis
# Wait for restart (10-15 seconds)
docker service ps redis_redis
```
### Stopping Redis
```bash
# Remove Redis stack (data persists in volume)
docker stack rm redis
# Verify it's stopped
docker service ls | grep redis
# Should show nothing
```
### Starting Redis After Stop
```bash
# Redeploy the stack
cd ~/stacks
docker stack deploy -c redis-stack.yml redis
# Data is intact from previous volume
```
### Viewing Logs
```bash
# Recent logs
docker service logs redis_redis --tail 50
# Follow logs in real-time
docker service logs -f redis_redis
```
### Backing Up Redis Data
```bash
# SSH to worker-1
ssh dockeradmin@<worker-1-public-ip>
# Get container ID
REDIS_CONTAINER=$(docker ps -q --filter "name=redis_redis")
# Trigger manual save
docker exec $REDIS_CONTAINER redis-cli -a YOUR_PASSWORD BGSAVE
# Copy RDB file to host
docker cp $REDIS_CONTAINER:/data/dump.rdb ~/redis-backup-$(date +%Y%m%d).rdb
# Download to local machine (from your local terminal)
scp dockeradmin@<worker-1-public-ip>:~/redis-backup-*.rdb ./
```
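Backups accumulate, so a small rotation step keeps only the most recent few. This sketch builds a demo directory with nine fake dated backups and prunes to the newest seven; pointed at your real backup directory (a hypothetical `~/redis-backups`), the same `ls | sort | tail | xargs` line does the pruning, since the date-stamped names sort chronologically.

```shell
BACKUP_DIR=redis-backups-demo
mkdir -p "$BACKUP_DIR"

# Create nine fake dated backups for the demo
for day in 01 02 03 04 05 06 07 08 09; do
  touch "$BACKUP_DIR/redis-backup-202511${day}.rdb"
done

# Newest-first sort, skip the first 7, delete the rest
ls "$BACKUP_DIR"/redis-backup-*.rdb | sort -r | tail -n +8 | xargs rm

ls "$BACKUP_DIR" | wc -l   # 7 backups remain
```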
### Clearing All Data (Dangerous!)
Since FLUSHALL is disabled, you need to remove and recreate the volume:
```bash
# On manager node
docker stack rm redis
# Wait for service to stop
sleep 10
# SSH to worker-1
ssh dockeradmin@<worker-1-public-ip>
# Remove volume (THIS DELETES ALL DATA!)
docker volume rm redis_redis-data
# Exit and redeploy from manager
exit
docker stack deploy -c redis-stack.yml redis
```
---
## Troubleshooting
### Problem: Network Not Found During Deployment
**Symptom**: `network "maple-private-prod" is declared as external, but could not be found`
**Solution:**
Create the shared `maple-private-prod` network first:
```bash
# Create the network
docker network create \
--driver overlay \
--attachable \
maple-private-prod
# Verify it exists
docker network ls | grep maple-private-prod
# Should show: maple-private-prod overlay swarm
# Then deploy Redis
docker stack deploy -c redis-stack.yml redis
```
**Why this happens:**
- You haven't completed Step 2 (verify network)
- The network was deleted
- First time deploying any Maple service
**Note**: This network is shared by all services (Cassandra, Redis, backend). You only need to create it once, before deploying your first service.
### Problem: Service Won't Start
**Symptom**: `docker service ls` shows `0/1` replicas
**Solutions:**
1. **Check logs:**
```bash
docker service logs redis_redis --tail 50
```
2. **Verify secret exists:**
```bash
docker secret ls | grep redis_password
# Must show the secret
```
3. **Check node label:**
```bash
docker node inspect mapleopentech-swarm-worker-1-prod --format '{{.Spec.Labels}}'
# Must show: map[redis:true]
```
4. **Verify maple-private-prod network exists:**
```bash
docker network ls | grep maple-private-prod
# Should show: maple-private-prod overlay swarm
```
### Problem: Can't Connect (Authentication Failed)
**Symptom**: `NOAUTH Authentication required` or `ERR invalid password`
**Solutions:**
1. **Verify you're using the correct password:**
```bash
# View the secret (from manager node)
docker secret inspect redis_password
# Compare ID with what you saved
```
2. **Test with password from secret file:**
```bash
# SSH to worker-1
REDIS_CONTAINER=$(docker ps -q --filter "name=redis_redis")
docker exec $REDIS_CONTAINER sh -c 'redis-cli -a $(cat /run/secrets/redis_password) ping'
# Should return: PONG
```
### Problem: Container Keeps Restarting
**Symptom**: `docker service ps redis_redis` shows multiple restarts
**Solutions:**
1. **Check memory:**
```bash
# On worker-1
free -h
# Should have at least 1GB free
```
2. **Check logs for errors:**
```bash
docker service logs redis_redis
# Look for "Out of memory" or permission errors
```
3. **Verify volume permissions:**
```bash
# On worker-1
docker volume inspect redis_redis-data
# Check mountpoint permissions
```
### Problem: Can't Connect from Application
**Symptom**: Application can't reach Redis on port 6379
**Solutions:**
1. **Verify both services on same network:**
```bash
# Check your app is on maple-private-prod network
docker service inspect your_app --format '{{.Spec.TaskTemplate.Networks}}'
# Should show maple-private-prod
```
2. **Test DNS resolution:**
```bash
# From your app container
nslookup redis
# Should resolve to Redis container IP
```
3. **Test connectivity:**
```bash
# From your app container (install redis-cli first)
redis-cli -h redis -a YOUR_PASSWORD ping
```
### Problem: Slow Performance
**Symptom**: Redis responds slowly or times out
**Solutions:**
1. **Check slow log:**
```bash
docker exec $(docker ps -q --filter "name=redis_redis") \
redis-cli -a YOUR_PASSWORD SLOWLOG GET 10
```
2. **Check memory usage:**
```bash
docker exec $(docker ps -q --filter "name=redis_redis") \
redis-cli -a YOUR_PASSWORD INFO memory
# Look at used_memory_human and maxmemory_human
```
3. **Check for evictions:**
```bash
docker exec $(docker ps -q --filter "name=redis_redis") \
redis-cli -a YOUR_PASSWORD INFO stats | grep evicted_keys
# High number means you need more memory
```
### Problem: Data Lost After Restart
**Symptom**: Data disappears when container restarts
**Verification:**
```bash
# On worker-1, check if volume exists
docker volume ls | grep redis
# Should show: redis_redis-data
# Check volume is mounted
docker inspect $(docker ps -q --filter "name=redis_redis") --format '{{.Mounts}}'
# Should show /data mounted to volume
```
**This shouldn't happen** if volume is properly configured. If it does:
1. Check AOF/RDB files exist: `docker exec <container> ls -lh /data/`
2. Check Redis config: `docker exec <container> redis-cli -a PASSWORD CONFIG GET dir`
---
## Next Steps
✅ **You now have:**
- Redis instance running on worker-1
- Password-protected access
- Persistent data storage (AOF + RDB)
- Private network connectivity
- Ready for application integration
**Next guides:**
- **04_app_backend.md** - Deploy your Go backend application
- Connect backend to Redis and Cassandra
- Set up NGINX reverse proxy
---
## Performance Notes
### Current Setup (2GB RAM Worker)
**Capacity:**
- 512MB max Redis memory
- Suitable for: ~50k-100k small keys
- Cache hit rate: Monitor with `INFO stats`
- Throughput: ~10,000-50,000 ops/sec
**Limitations:**
- Single instance (no redundancy)
- No Redis Cluster (no automatic sharding)
- Limited to 512MB (maxmemory setting)
### Upgrade Path
**For Production with High Load:**
1. **Increase memory** (resize worker-1 to 4GB):
- Update maxmemory to 2GB
- Better for larger datasets
2. **Add Redis replica** (for redundancy):
- Deploy second Redis on another worker
- Configure replication
- High availability with Sentinel
3. **Redis Cluster** (for very high scale):
- 3+ worker nodes
- Automatic sharding
- Handles millions of keys
For most applications starting out, **single instance with 512MB is sufficient**.
---
**Last Updated**: November 3, 2025
**Maintained By**: Infrastructure Team
# DigitalOcean Spaces Setup (S3-Compatible Object Storage)
**Audience**: Junior DevOps Engineers, Infrastructure Team
**Time to Complete**: 15-20 minutes
**Prerequisites**: DigitalOcean account with billing enabled
---
## Overview
This guide sets up **DigitalOcean Spaces** - an S3-compatible object storage service for storing files, uploads, and media for your MaplePress backend.
**What You'll Build:**
- DigitalOcean Space (bucket) for file storage
- API keys (access key + secret key) for programmatic access
- Docker Swarm secrets for secure credential storage
- Configuration ready for backend integration
**Why DigitalOcean Spaces?**
- S3-compatible API (works with AWS SDK)
- Simple pricing: $5/mo for 250GB + 1TB transfer
- CDN included (speeds up file delivery globally)
- No egress fees within same region
- Integrated with your existing DigitalOcean infrastructure
---
## Table of Contents
1. [Create DigitalOcean Space](#step-1-create-digitalocean-space)
2. [Generate API Keys](#step-2-generate-api-keys)
3. [Create Docker Secrets](#step-3-create-docker-secrets)
4. [Verify Configuration](#step-4-verify-configuration)
5. [Test Access](#step-5-test-access)
6. [Troubleshooting](#troubleshooting)
---
## Step 1: Create DigitalOcean Space
### 1.1 Create Space via Dashboard
1. Log into DigitalOcean dashboard: https://cloud.digitalocean.com
2. Click **Manage** → **Spaces Object Storage** in the left sidebar
3. Click **Create a Space**
4. Configure:
- **Choose a datacenter region**: Select the same region as your droplets (e.g., `tor1` - Toronto)
- **Enable CDN**: ✅ Yes (recommended - improves performance globally)
- **Choose a unique name**: `maplepress` (must be globally unique)
- **Select a project**: Your project (e.g., "MaplePress Production")
5. Click **Create a Space**
**Expected output:**
- Space created successfully
- You'll see the space URL: `https://maplepress.tor1.digitaloceanspaces.com`
### 1.2 Record Space Information
**Save these values** (you'll need them later):
```bash
# Space Name
SPACE_NAME=maplepress
# Endpoint (without https://)
SPACE_ENDPOINT=tor1.digitaloceanspaces.com
# Region code
SPACE_REGION=tor1
# Full URL (for reference)
SPACE_URL=https://maplepress.tor1.digitaloceanspaces.com
```
**Region codes for reference:**
- Toronto: `tor1.digitaloceanspaces.com`
- San Francisco 3: `sfo3.digitaloceanspaces.com`
- Singapore: `sgp1.digitaloceanspaces.com`
- Amsterdam: `ams3.digitaloceanspaces.com`
- Frankfurt: `fra1.digitaloceanspaces.com`
**✅ Checkpoint:** Space created and URL recorded
---
## Step 2: Generate API Keys
### 2.1 Create Spaces Access Keys
1. In DigitalOcean dashboard, go to **API** in left sidebar
2. Scroll down to **Spaces access keys** section
3. Click **Generate New Key**
4. Configure:
- **Name**: `maplepress-backend-prod`
- **Description**: "Backend service access to Spaces" (optional)
5. Click **Generate Key**
**⚠️ CRITICAL:** The secret key is **only shown once**! Copy it immediately.
### 2.2 Save Credentials Securely
You'll see:
- **Access Key**: `DO00ABC123XYZ...` (20 characters)
- **Secret Key**: `abc123def456...` (40 characters)
**SAVE BOTH IN YOUR PASSWORD MANAGER NOW!**
Example:
```
DigitalOcean Spaces - MaplePress Production
Access Key: DO00ABC123XYZ456
Secret Key: abc123def456ghi789jkl012mno345pqr678stu901
Endpoint: tor1.digitaloceanspaces.com
Bucket: maplepress
```
### 2.3 Update Local .env File
**On your local machine:**
```bash
# Navigate to production infrastructure
cd ~/monorepo/cloud/infrastructure/production
# Edit .env file
vi .env
# Add these lines:
SPACES_ACCESS_KEY=DO00ABC123XYZ456
SPACES_SECRET_KEY=abc123def456ghi789jkl012mno345pqr678stu901
SPACES_ENDPOINT=tor1.digitaloceanspaces.com
SPACES_REGION=tor1
SPACES_BUCKET=maplepress
```
Save: `Esc`, `:wq`, `Enter`
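It's worth confirming the `.env` file actually defines every key the later steps read. A self-contained sketch - it builds a sample `.env` in a temp directory rather than touching your real one; point `ENV_FILE` at your actual file to use it for real:

```shell
# Check that a .env file defines all required Spaces keys.
tmpdir=$(mktemp -d)
ENV_FILE="$tmpdir/.env"
cat > "$ENV_FILE" << 'EOF'
SPACES_ACCESS_KEY=DO00ABC123XYZ456
SPACES_SECRET_KEY=abc123def456
SPACES_ENDPOINT=tor1.digitaloceanspaces.com
SPACES_REGION=tor1
SPACES_BUCKET=maplepress
EOF

missing=""
for key in SPACES_ACCESS_KEY SPACES_SECRET_KEY SPACES_ENDPOINT SPACES_REGION SPACES_BUCKET; do
  grep -q "^${key}=" "$ENV_FILE" || missing="$missing $key"
done

if [ -z "$missing" ]; then
  result="ok"
  echo "all required keys present"
else
  result="missing:$missing"
  echo "missing keys:$missing"
fi
rm -rf "$tmpdir"
```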
**✅ Checkpoint:** API keys saved securely in password manager and `.env` file
---
## Step 3: Create Docker Secrets
**On manager node:**
```bash
# SSH to manager
ssh dockeradmin@<manager-public-ip>
```
### 3.1 Create Spaces Access Key Secret
```bash
# Create secret for access key
echo -n "DO00ABC123XYZ456" | docker secret create spaces_access_key -
# Verify
docker secret ls | grep spaces_access_key
# Should show: spaces_access_key About a minute ago
```
**Important:** Replace `DO00ABC123XYZ456` with your actual access key!
### 3.2 Create Spaces Secret Key Secret
```bash
# Create secret for secret key
echo -n "abc123def456ghi789jkl012mno345pqr678stu901" | docker secret create spaces_secret_key -
# Verify
docker secret ls | grep spaces_secret_key
# Should show: spaces_secret_key About a minute ago
```
**Important:** Replace with your actual secret key!
### 3.3 Verify All Secrets
```bash
# List all secrets
docker secret ls
```
**You should see:**
```
ID NAME CREATED
abc123... maplepress_jwt_secret from 05_backend.md
abc124... maplepress_ip_encryption_key from 05_backend.md
def456... redis_password from 03_redis.md
ghi789... meilisearch_master_key from 04_meilisearch.md
jkl012... spaces_access_key NEW!
mno345... spaces_secret_key NEW!
```
**✅ Checkpoint:** All secrets created successfully
---
## Step 4: Verify Configuration
### 4.1 Test Space Access from Local Machine
**Install AWS CLI (if not already installed):**
```bash
# On your local machine (Mac)
brew install awscli
# Or on Linux:
sudo apt install awscli
```
**Configure AWS CLI for DigitalOcean Spaces:**
```bash
# Create AWS credentials file
mkdir -p ~/.aws
vi ~/.aws/credentials
# Add this profile:
[digitalocean]
aws_access_key_id = DO00ABC123XYZ456
aws_secret_access_key = abc123def456ghi789jkl012mno345pqr678stu901
```
Save: `Esc`, `:wq`, `Enter`
### 4.2 Test Listing Space Contents
```bash
# List contents of your space
aws s3 ls s3://maplepress \
--endpoint-url https://tor1.digitaloceanspaces.com \
--profile digitalocean
# Should show empty (new space) or list existing files
```
### 4.3 Test File Upload
```bash
# Create test file
echo "Hello from MaplePress!" > test-file.txt
# Upload to space
aws s3 cp test-file.txt s3://maplepress/test-file.txt \
--endpoint-url https://tor1.digitaloceanspaces.com \
--profile digitalocean \
--acl public-read
# Should show: upload: ./test-file.txt to s3://maplepress/test-file.txt
```
### 4.4 Test File Download
```bash
# Download from space
aws s3 cp s3://maplepress/test-file.txt downloaded-test.txt \
--endpoint-url https://tor1.digitaloceanspaces.com \
--profile digitalocean
# Verify content
cat downloaded-test.txt
# Should show: Hello from MaplePress!
# Clean up
rm test-file.txt downloaded-test.txt
```
### 4.5 Test Public URL Access
```bash
# Try accessing via browser or curl
curl https://maplepress.tor1.digitaloceanspaces.com/test-file.txt
# Should show: Hello from MaplePress!
```
**✅ Checkpoint:** Successfully uploaded, listed, downloaded, and accessed file
---
## Step 5: Test Access
### 5.1 Verify Endpoint Resolution
```bash
# Test DNS resolution
dig tor1.digitaloceanspaces.com +short
# Should return IP addresses (e.g., 192.81.xxx.xxx)
```
### 5.2 Test HTTPS Connection
```bash
# Test SSL/TLS connection
curl -I https://tor1.digitaloceanspaces.com
# Should return:
# HTTP/2 403 (Forbidden is OK - means endpoint is reachable)
```
### 5.3 Check Space Permissions
1. Go to DigitalOcean dashboard → Spaces
2. Click on your space (`maplepress`)
3. Click **Settings** tab
4. Check **File Listing**: Should be ❌ Restricted (recommended for security)
5. Individual files can be made public via ACL when uploading
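You can also check a single object's ACL from the CLI. The sketch below classifies a captured `aws s3api get-object-acl` response as public or private by looking for an `AllUsers` grant - the JSON here is a representative sample, not live output:

```shell
# Classify an object ACL as public or private.
# In production: acl_json=$(aws s3api get-object-acl --bucket maplepress --key file.txt \
#   --endpoint-url https://tor1.digitaloceanspaces.com --profile digitalocean)
acl_json='{
  "Grants": [
    {"Grantee": {"Type": "CanonicalUser", "ID": "owner"}, "Permission": "FULL_CONTROL"},
    {"Grantee": {"Type": "Group", "URI": "http://acs.amazonaws.com/groups/global/AllUsers"}, "Permission": "READ"}
  ]
}'

# A grant to the AllUsers group means anyone can fetch the object.
# (A stricter check would also verify the Permission is READ.)
if printf '%s' "$acl_json" | grep -q 'global/AllUsers'; then
  visibility="public-read"
else
  visibility="private"
fi
echo "object is: $visibility"
```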
**✅ Checkpoint:** Spaces endpoint is accessible and working
---
## Troubleshooting
### Problem: "Space name already exists"
**Symptom:** Can't create space with chosen name
**Cause:** Space names are globally unique across all DigitalOcean customers
**Solution:**
Try these naming patterns:
- `maplepress-<your-company>`
- `maplepress-<random-string>`
- `mp-prod-<date>` (e.g., `mp-prod-2025`)
Check availability by trying different names in the creation form.
### Problem: "Access Denied" When Testing
**Symptom:** AWS CLI returns `AccessDenied` error
**Causes and Solutions:**
1. **Wrong credentials:**
```bash
# Verify credentials in ~/.aws/credentials match DigitalOcean dashboard
cat ~/.aws/credentials
```
2. **Wrong endpoint:**
```bash
# Make sure endpoint matches your space region
# TOR1: tor1.digitaloceanspaces.com
# NYC3: nyc3.digitaloceanspaces.com
# SFO3: sfo3.digitaloceanspaces.com
```
3. **Wrong bucket name:**
```bash
# Verify bucket name matches space name exactly
aws s3 ls --endpoint-url https://tor1.digitaloceanspaces.com --profile digitalocean
# Should list your space
```
### Problem: "NoSuchBucket" Error
**Symptom:** AWS CLI says bucket doesn't exist
**Check:**
```bash
# List all spaces in your account
aws s3 ls --endpoint-url https://tor1.digitaloceanspaces.com --profile digitalocean
# Make sure your space appears in the list
```
**If space is missing:**
- Check you're in the correct DigitalOcean account
- Check space wasn't accidentally deleted
- Check endpoint URL matches space region
### Problem: Files Not Publicly Accessible
**Symptom:** Get 403 Forbidden when accessing file URL
**Cause:** File ACL is private (default)
**Solution:**
```bash
# Upload with public-read ACL
aws s3 cp file.txt s3://maplepress/file.txt \
--endpoint-url https://tor1.digitaloceanspaces.com \
--profile digitalocean \
--acl public-read
# Or make existing file public
aws s3api put-object-acl \
--bucket maplepress \
--key file.txt \
--acl public-read \
--endpoint-url https://tor1.digitaloceanspaces.com \
--profile digitalocean
```
**Note:** Your backend will control ACLs programmatically. Public access should only be granted to files that need to be publicly accessible (e.g., user-uploaded images for display).
### Problem: CDN Not Working
**Symptom:** Files load slowly or CDN URL doesn't work
**Check:**
1. Verify CDN is enabled:
- DigitalOcean dashboard → Spaces → Your space → Settings
- **CDN** should show: ✅ Enabled
2. Use CDN URL instead of direct URL:
```bash
# Direct URL (slower):
https://maplepress.tor1.digitaloceanspaces.com/file.txt
# CDN URL (faster):
https://maplepress.tor1.cdn.digitaloceanspaces.com/file.txt
```
3. Clear CDN cache if needed:
- Spaces → Your space → Settings → CDN
- Click **Purge Cache**
### Problem: High Storage Costs
**Symptom:** Unexpected charges for Spaces
**Check:**
```bash
# Calculate total space usage
aws s3 ls s3://maplepress --recursive --human-readable --summarize \
--endpoint-url https://tor1.digitaloceanspaces.com \
--profile digitalocean
# Shows: Total Size: X.XX GB
```
**Pricing reference:**
- $5/mo includes 250GB storage + 1TB outbound transfer
- Additional storage: $0.02/GB per month
- Additional transfer: $0.01/GB
**Optimization tips:**
- Delete old/unused files regularly
- Use CDN to reduce direct space access
- Compress images before uploading
- Set up lifecycle policies to auto-delete old files
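The last tip - lifecycle policies - can be set through the same S3 API. A hedged sketch: the 30-day expiry on a `tmp/` prefix is an example policy, and the `aws` apply call is left commented because it needs live credentials; the block only builds and validates the JSON:

```shell
# Write a lifecycle policy that expires objects under tmp/ after 30 days.
policy_file=$(mktemp)
cat > "$policy_file" << 'EOF'
{
  "Rules": [
    {
      "ID": "expire-tmp-uploads",
      "Filter": { "Prefix": "tmp/" },
      "Status": "Enabled",
      "Expiration": { "Days": 30 }
    }
  ]
}
EOF

# Sanity-check the JSON before applying it
python3 -m json.tool "$policy_file" > /dev/null && valid=yes || valid=no
echo "policy JSON valid: $valid"

# Apply to the space (requires live credentials):
# aws s3api put-bucket-lifecycle-configuration \
#   --bucket maplepress \
#   --lifecycle-configuration "file://$policy_file" \
#   --endpoint-url https://tor1.digitaloceanspaces.com \
#   --profile digitalocean
rm -f "$policy_file"
```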
---
## Next Steps
✅ **You now have:**
- DigitalOcean Space created and configured
- API keys generated and secured
- Docker Swarm secrets created
- Verified access from local machine
**Next guide:**
- **05_backend.md** - Deploy MaplePress backend
- Backend will use these Spaces credentials automatically
- Files uploaded via backend API will be stored in your Space
**Space Configuration for Backend:**
The backend will use these environment variables (configured in 05_backend.md):
```yaml
environment:
- AWS_ACCESS_KEY_FILE=/run/secrets/spaces_access_key
- AWS_SECRET_KEY_FILE=/run/secrets/spaces_secret_key
- AWS_ENDPOINT=https://tor1.digitaloceanspaces.com
- AWS_REGION=tor1
- AWS_BUCKET_NAME=maplepress
```
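The `*_FILE` variables above follow the Docker secrets convention: the container reads the credential from a file under `/run/secrets` instead of from the environment. A minimal sketch of how a startup script might resolve them - the helper name and fallback behaviour are illustrative, not the backend's actual code, and `/run/secrets` is simulated with a temp file so the sketch is self-contained:

```shell
# Resolve a credential from a *_FILE path if set, else from a plain env var.
tmpdir=$(mktemp -d)
printf 'DO00ABC123XYZ456' > "$tmpdir/spaces_access_key"
AWS_ACCESS_KEY_FILE="$tmpdir/spaces_access_key"

read_secret() {
  # $1 = value of the *_FILE variable, $2 = fallback plain value
  if [ -n "$1" ] && [ -f "$1" ]; then
    cat "$1"
  else
    printf '%s' "$2"
  fi
}

access_key=$(read_secret "$AWS_ACCESS_KEY_FILE" "$AWS_ACCESS_KEY")
echo "loaded access key of length ${#access_key}"
rm -rf "$tmpdir"
```

This pattern keeps secrets out of `docker inspect` output and process environments, which is why the stack file mounts them as files.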
**Useful Commands:**
```bash
# List all files in space
aws s3 ls s3://maplepress --recursive \
--endpoint-url https://tor1.digitaloceanspaces.com \
--profile digitalocean
# Get space size
aws s3 ls s3://maplepress --recursive --summarize \
--endpoint-url https://tor1.digitaloceanspaces.com \
--profile digitalocean
# Delete test file
aws s3 rm s3://maplepress/test-file.txt \
--endpoint-url https://tor1.digitaloceanspaces.com \
--profile digitalocean
# Sync local directory to space
aws s3 sync ./local-folder s3://maplepress/uploads/ \
--endpoint-url https://tor1.digitaloceanspaces.com \
--profile digitalocean
```
---
**Last Updated**: January 2025
**Maintained By**: Infrastructure Team
**Changelog:**
- January 2025: Initial DigitalOcean Spaces setup guide for MaplePress production deployment

# DigitalOcean Spaces Setup for MapleFile
**Audience**: Junior DevOps Engineers, Infrastructure Team
**Time to Complete**: 20-30 minutes
**Prerequisites**:
- Completed guide 04.5_spaces.md (DigitalOcean Spaces basics)
- AWS CLI configured with DigitalOcean profile
- DigitalOcean Spaces API keys
---
## Overview
This guide configures a **separate DigitalOcean Space for MapleFile** with the required CORS settings for browser-based file uploads.
**What You'll Build:**
- New DigitalOcean Space for MapleFile file storage
- CORS configuration to allow browser uploads from frontend
- Docker Swarm secrets for MapleFile backend
- Verified upload/download functionality
**Why a Separate Space?**
- MapleFile stores encrypted user files (different from MaplePress uploads)
- Different CORS requirements (frontend uploads directly to Spaces)
- Separate billing and storage tracking
- Independent lifecycle management
---
## Table of Contents
1. [Create MapleFile Space](#step-1-create-maplefile-space)
2. [Configure CORS for Browser Uploads](#step-2-configure-cors-for-browser-uploads)
3. [Create Docker Secrets](#step-3-create-docker-secrets)
4. [Verify Configuration](#step-4-verify-configuration)
5. [Troubleshooting](#troubleshooting)
---
## Step 1: Create MapleFile Space
### 1.1 Create Space via Dashboard
1. Log into DigitalOcean dashboard: https://cloud.digitalocean.com
2. Click **Manage** → **Spaces Object Storage** in the left sidebar
3. Click **Create a Space**
4. Configure:
- **Choose a datacenter region**: Same as your droplets (e.g., `tor1` - Toronto)
- **Enable CDN**: ✅ Yes (recommended)
- **Choose a unique name**: `maplefile` (must be globally unique)
- **Select a project**: Your project (e.g., "mapleopentech Production")
5. Click **Create a Space**
**Expected output:**
- Space created successfully
- Space URL: `https://maplefile.tor1.digitaloceanspaces.com`
### 1.2 Record Space Information
**Save these values:**
```bash
# Space Name
SPACE_NAME=maplefile
# Endpoint (without https://)
SPACE_ENDPOINT=tor1.digitaloceanspaces.com
# Region code
SPACE_REGION=tor1
# Full URL
SPACE_URL=https://maplefile.tor1.digitaloceanspaces.com
```
**✅ Checkpoint:** Space created and URL recorded
---
## Step 2: Configure CORS for Browser Uploads
**CRITICAL**: This step is required for browser-based file uploads. Without CORS configuration, users will get errors when trying to upload files from the MapleFile frontend.
### 2.1 Create CORS Configuration File
**On your local machine:**
```bash
# Create CORS configuration
cat > /tmp/maplefile-cors.json << 'EOF'
{
"CORSRules": [
{
"AllowedOrigins": [
"http://localhost:5173",
"http://localhost:3000",
"https://maplefile.ca",
"https://www.maplefile.ca"
],
"AllowedMethods": [
"GET",
"PUT",
"HEAD",
"DELETE"
],
"AllowedHeaders": [
"*"
],
"MaxAgeSeconds": 3600
}
]
}
EOF
```
**Note:** Update `AllowedOrigins` to include:
- Your development URLs (`http://localhost:5173`)
- Your production domain(s) (`https://maplefile.ca`)
### 2.2 Apply CORS Configuration
```bash
# Apply CORS to MapleFile space
aws s3api put-bucket-cors \
--bucket maplefile \
--cors-configuration file:///tmp/maplefile-cors.json \
--endpoint-url https://tor1.digitaloceanspaces.com \
--profile digitalocean
# Should return no output (success)
```
### 2.3 Verify CORS Configuration
```bash
# Check current CORS settings
aws s3api get-bucket-cors \
--bucket maplefile \
--endpoint-url https://tor1.digitaloceanspaces.com \
--profile digitalocean
```
**Expected output:**
```json
{
"CORSRules": [
{
"AllowedHeaders": ["*"],
"AllowedMethods": ["GET", "PUT", "HEAD", "DELETE"],
"AllowedOrigins": [
"http://localhost:5173",
"http://localhost:3000",
"https://maplefile.ca",
"https://www.maplefile.ca"
],
"MaxAgeSeconds": 3600
}
]
}
```
### 2.4 Test CORS with Preflight Request
```bash
# Test OPTIONS preflight request
curl -I -X OPTIONS \
-H "Origin: http://localhost:5173" \
-H "Access-Control-Request-Method: PUT" \
"https://maplefile.tor1.digitaloceanspaces.com/test"
# Should return headers like:
# access-control-allow-origin: http://localhost:5173
# access-control-allow-methods: GET, PUT, HEAD, DELETE
```
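To make the preflight check repeatable, you can scan the response headers programmatically. A self-contained sketch over a captured response - swap in real `curl -sI -X OPTIONS …` output (as in the command above) to run it against your space:

```shell
# Verify a CORS preflight response allows our origin and the PUT method.
# In production: headers=$(curl -sI -X OPTIONS -H "Origin: http://localhost:5173" \
#   -H "Access-Control-Request-Method: PUT" "https://maplefile.tor1.digitaloceanspaces.com/test")
headers='HTTP/2 200
access-control-allow-origin: http://localhost:5173
access-control-allow-methods: GET, PUT, HEAD, DELETE
access-control-max-age: 3600'

origin_ok=no
method_ok=no
printf '%s\n' "$headers" | grep -qi '^access-control-allow-origin: http://localhost:5173' && origin_ok=yes
printf '%s\n' "$headers" | grep -qi 'access-control-allow-methods:.*PUT' && method_ok=yes
echo "origin allowed: $origin_ok, PUT allowed: $method_ok"
```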
**✅ Checkpoint:** CORS configuration applied and verified
---
## Step 3: Create Docker Secrets
If you're using separate API keys for MapleFile (recommended), create new secrets.
**On manager node:**
```bash
ssh dockeradmin@<manager-public-ip>
```
### 3.1 Create MapleFile Spaces Secrets
```bash
# Create secret for access key
echo -n "YOUR_ACCESS_KEY" | docker secret create maplefile_spaces_access_key -
# Create secret for secret key
echo -n "YOUR_SECRET_KEY" | docker secret create maplefile_spaces_secret_key -
# Verify
docker secret ls | grep maplefile_spaces
```
**If using same API keys as MaplePress**, you can reuse the existing secrets:
- `spaces_access_key`
- `spaces_secret_key`
### 3.2 Verify All MapleFile Secrets
```bash
docker secret ls | grep -E "maplefile|spaces"
```
**You should see:**
```
ID NAME CREATED
abc123... maplefile_jwt_secret from 09_maplefile_backend.md
def456... maplefile_ip_encryption_key from 09_maplefile_backend.md
ghi789... maplefile_spaces_access_key NEW!
jkl012... maplefile_spaces_secret_key NEW!
```
**✅ Checkpoint:** Docker secrets created
---
## Step 4: Verify Configuration
### 4.1 Test File Upload
```bash
# Create test file
echo "MapleFile test upload" > /tmp/maplefile-test.txt
# Upload to space
aws s3 cp /tmp/maplefile-test.txt s3://maplefile/test/test-file.txt \
--endpoint-url https://tor1.digitaloceanspaces.com \
--profile digitalocean
# Should show: upload: /tmp/maplefile-test.txt to s3://maplefile/test/test-file.txt
```
### 4.2 Test File Download
```bash
# Download from space
aws s3 cp s3://maplefile/test/test-file.txt /tmp/downloaded-test.txt \
--endpoint-url https://tor1.digitaloceanspaces.com \
--profile digitalocean
# Verify content
cat /tmp/downloaded-test.txt
# Should show: MapleFile test upload
```
### 4.3 Test Presigned URL Generation
The backend will generate presigned URLs for secure uploads and downloads. Note that the AWS CLI's `aws s3 presign` only produces GET (download) URLs - presigned PUT uploads are generated by the backend via the SDK - but it is enough to verify that request signing works:
```bash
# Generate presigned download URL (valid for 1 hour)
aws s3 presign s3://maplefile/test/presigned-test.txt \
--expires-in 3600 \
--endpoint-url https://tor1.digitaloceanspaces.com \
--profile digitalocean
# Returns a URL like:
# https://maplefile.tor1.digitaloceanspaces.com/test/presigned-test.txt?X-Amz-Algorithm=...
```
### 4.4 Clean Up Test Files
```bash
# Delete test files
aws s3 rm s3://maplefile/test/ --recursive \
--endpoint-url https://tor1.digitaloceanspaces.com \
--profile digitalocean
# Clean up local files
rm /tmp/maplefile-test.txt /tmp/downloaded-test.txt
```
**✅ Checkpoint:** Upload, download, and presigned URLs working
---
## Troubleshooting
### Problem: CORS Error on File Upload
**Symptom:** Browser console shows:
```
Access to fetch at 'https://maplefile.tor1.digitaloceanspaces.com/...' from origin 'http://localhost:5173' has been blocked by CORS policy
```
**Causes and Solutions:**
1. **CORS not configured:**
```bash
# Check CORS settings
aws s3api get-bucket-cors \
--bucket maplefile \
--endpoint-url https://tor1.digitaloceanspaces.com \
--profile digitalocean
# If empty or error, apply CORS configuration (Step 2)
```
2. **Origin not in AllowedOrigins:**
```bash
# Update CORS to include your frontend URL
# Edit /tmp/maplefile-cors.json and re-apply
```
3. **Missing HTTP method:**
- Ensure `PUT` is in `AllowedMethods` (required for presigned URL uploads)
### Problem: "AccessDenied" on Upload
**Symptom:** Presigned URL returns 403 Forbidden
**Causes:**
1. **Presigned URL expired:**
- URLs have expiration time (default: 15 minutes)
- Generate new URL and retry
2. **Wrong bucket in URL:**
- Verify bucket name matches exactly
3. **Incorrect content type:**
- Ensure Content-Type header matches what was signed
### Problem: "SignatureDoesNotMatch" Error
**Symptom:** Upload fails with signature error
**Causes:**
1. **Modified request headers:**
- Don't add extra headers not in the signed request
2. **Wrong region in endpoint:**
- Ensure endpoint matches bucket region
3. **Clock skew:**
- Ensure system clock is synchronized
### Problem: Files Upload but Can't Download
**Symptom:** Upload succeeds but download returns 403
**Causes:**
1. **ACL not set:**
- For public files, ensure ACL is set correctly
- MapleFile uses private files with presigned download URLs
2. **Wrong presigned URL:**
- Generate download URL, not upload URL
### Problem: CORS Works in Dev but Not Production
**Symptom:** Uploads work locally but fail in production
**Solution:**
```bash
# Add production domain to CORS
# Edit /tmp/maplefile-cors.json:
"AllowedOrigins": [
"http://localhost:5173",
"https://maplefile.ca",
"https://www.maplefile.ca",
"https://app.maplefile.ca" # Add your production URLs
]
# Re-apply CORS
aws s3api put-bucket-cors \
--bucket maplefile \
--cors-configuration file:///tmp/maplefile-cors.json \
--endpoint-url https://tor1.digitaloceanspaces.com \
--profile digitalocean
```
---
## Backend Configuration
The MapleFile backend uses these environment variables:
```yaml
# In maplefile-stack.yml
environment:
- S3_ENDPOINT=https://tor1.digitaloceanspaces.com
- S3_PUBLIC_ENDPOINT=https://maplefile.tor1.digitaloceanspaces.com
- S3_BUCKET=maplefile
- S3_REGION=tor1
- S3_USE_SSL=true
secrets:
- maplefile_spaces_access_key
- maplefile_spaces_secret_key
```
---
## Next Steps
**You now have:**
- DigitalOcean Space for MapleFile
- CORS configured for browser uploads
- Docker secrets created
- Verified upload/download functionality
**Next guide:**
- Continue with **09_maplefile_backend.md** to deploy the backend
**Useful Commands:**
```bash
# List all files in MapleFile space
aws s3 ls s3://maplefile --recursive \
--endpoint-url https://tor1.digitaloceanspaces.com \
--profile digitalocean
# Get total space usage
aws s3 ls s3://maplefile --recursive --summarize --human-readable \
--endpoint-url https://tor1.digitaloceanspaces.com \
--profile digitalocean
# Update CORS (after editing JSON)
aws s3api put-bucket-cors \
--bucket maplefile \
--cors-configuration file:///tmp/maplefile-cors.json \
--endpoint-url https://tor1.digitaloceanspaces.com \
--profile digitalocean
# View current CORS
aws s3api get-bucket-cors \
--bucket maplefile \
--endpoint-url https://tor1.digitaloceanspaces.com \
--profile digitalocean
```
---
**Last Updated**: November 2025
**Maintained By**: Infrastructure Team
**Changelog:**
- November 2025: Initial MapleFile Spaces setup guide with CORS configuration for browser uploads

# Deploy Caddy Reverse Proxy with Automatic SSL: Part 2
**Audience**: Junior DevOps Engineers, Infrastructure Team
**Time to Complete**: 20-30 minutes
**Prerequisites**:
- ✅ Completed guide **09_maplefile_backend.md** (Backend deployed and running)
- ✅ Backend service accessible on `maple-public-prod` network
- ✅ Domain name `maplefile.ca` pointing to worker-8 public IP
- ✅ Email address for Let's Encrypt SSL certificate notifications
---
## Overview
This guide configures **Caddy** as a reverse proxy with automatic SSL/TLS certificate management for your MapleFile backend.
### What is a Reverse Proxy?
Think of a reverse proxy as a "receptionist" for your backend:
1. **Internet user** → Makes request to `https://maplefile.ca`
2. **Caddy (receptionist)** → Receives the request
- Handles SSL/TLS (HTTPS encryption)
- Checks rate limits
- Adds security headers
3. **Caddy forwards** → Sends request to your backend at `http://maplefile-backend:8000`
4. **Backend** → Processes request, sends response back
5. **Caddy** → Returns response to user
**Why use a reverse proxy?**
- Your backend doesn't need to handle SSL certificates
- One place to manage security, rate limiting, and headers
- Can load balance across multiple backend instances
- Protects your backend from direct internet exposure
### Why Caddy Instead of NGINX?
**Caddy's killer feature: Automatic HTTPS**
- Caddy automatically gets SSL certificates from Let's Encrypt
- Automatically renews them before expiry (no cron jobs!)
- Zero manual certificate management
- Simpler configuration (10 lines vs 200+ for NGINX)
**What you'll build:**
- Caddy reverse proxy on worker-8
- Automatic SSL certificate from Let's Encrypt
- HTTP to HTTPS automatic redirection
- Security headers and rate limiting
- Zero-downtime certificate renewals (automatic)
**Architecture:**
```
Internet
↓ HTTPS (port 443)
Caddy (worker-8)
↓ HTTP (port 8000, internal network only)
Backend (worker-8)
↓ Private network
Databases (Cassandra, Redis on other workers)
```
**Key concept:** Caddy and Backend are both on worker-8, connected via the `maple-public-prod` Docker overlay network. Caddy can reach Backend by the hostname `maplefile-backend` - Docker's built-in DNS resolves this to the backend container's IP automatically.
---
## Step 1: Verify DNS Configuration
Before deploying Caddy, your domain must point to worker-8 (where Caddy will run).
### 1.1 Check Current DNS
**From your local machine:**
```bash
# Check where your domain currently points
dig maplefile.ca +short
# Should return worker-8's public IP (e.g., 143.110.212.253)
# If it returns nothing or wrong IP, continue to next step
```
### 1.2 Update DNS Records
**If DNS is not configured or points to wrong server:**
1. Log into your domain registrar (where you bought `maplefile.ca`)
2. Find DNS settings / DNS management / Manage DNS
3. Add or update these A records:
| Type | Name | Value | TTL |
|------|------|-------|-----|
| A | @ | `143.110.212.253` | 3600 |
| A | www | `143.110.212.253` | 3600 |
**Replace `143.110.212.253` with YOUR worker-8 public IP!**
**What this does:**
- `@` record: Makes `maplefile.ca` point to worker-8
- `www` record: Makes `www.maplefile.ca` point to worker-8
- Both domains will work with Caddy
### 1.3 Wait for DNS Propagation
DNS changes take 5-10 minutes (sometimes up to 1 hour).
**Test from your local machine:**
```bash
# Test root domain
dig maplefile.ca +short
# Should return: 143.110.212.253 (your worker-8 IP)
# Test www subdomain
dig www.maplefile.ca +short
# Should return: 143.110.212.253 (your worker-8 IP)
# Alternative test
nslookup maplefile.ca
# Should show: Address: 143.110.212.253
```
**Keep testing every minute until both commands return worker-8's public IP.**
⚠️ **CRITICAL:** Do NOT proceed until DNS resolves correctly! Caddy cannot get SSL certificates if DNS doesn't point to the right server.
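The manual retesting above can be scripted as a polling loop. A sketch with an injectable lookup function so it runs without network access - replace the stub with a real `dig maplefile.ca +short` call (and `sleep 60`) for actual propagation checks:

```shell
# Poll until the domain resolves to the expected IP, with a retry cap.
expected_ip="143.110.212.253"   # your worker-8 public IP

# Stub resolver so the sketch is self-contained; for real use:
# resolve() { dig maplefile.ca +short; }
resolve() { echo "143.110.212.253"; }

attempts=0
resolved=no
while [ $attempts -lt 60 ]; do
  if [ "$(resolve)" = "$expected_ip" ]; then
    resolved=yes
    break
  fi
  attempts=$((attempts + 1))
  sleep 1   # use sleep 60 for real DNS propagation checks
done
echo "DNS resolved correctly: $resolved"
```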
### 1.4 Verify Firewall Allows HTTP/HTTPS
**On worker-8, check firewall:**
```bash
# SSH to worker-8
ssh dockeradmin@143.110.212.253
# Check firewall rules
sudo ufw status | grep -E "80|443"
# Should show:
# 80/tcp ALLOW Anywhere
# 443/tcp ALLOW Anywhere
```
**If ports are NOT open:**
```bash
# Allow HTTP (needed for Let's Encrypt)
sudo ufw allow 80/tcp
# Allow HTTPS (needed for encrypted traffic)
sudo ufw allow 443/tcp
# Verify
sudo ufw status | grep -E "80|443"
# Exit back to local machine
exit
```
**✅ Checkpoint:** DNS resolves to worker-8, ports 80 and 443 are open
---
## Step 2: Prepare Caddy Configuration
### 2.1 Create Caddy Config Directory
**On manager node:**
```bash
# SSH to manager
ssh dockeradmin@143.110.210.162
# Create directory for Caddy config
cd ~/stacks
mkdir -p maplefile-caddy-config
cd maplefile-caddy-config
```
### 2.2 Create Caddyfile
The **Caddyfile** is Caddy's configuration file. It's much simpler than NGINX config.
```bash
vi Caddyfile
```
**Paste this configuration:**
```caddy
{
# Global options
email your-email@example.com
# Use Let's Encrypt production (not staging)
# Staging is for testing - production is for real certificates
acme_ca https://acme-v02.api.letsencrypt.org/directory
}
# Your domain configuration
maplefile.ca www.maplefile.ca {
# Reverse proxy all requests to backend
reverse_proxy maplefile-backend:8000 {
# Forward real client IP to backend
header_up X-Real-IP {remote_host}
header_up X-Forwarded-For {remote_host}
header_up X-Forwarded-Proto {scheme}
header_up X-Forwarded-Host {host}
# Preserve Origin header for CORS (required for frontend)
header_up Origin {http.request.header.Origin}
}
# Logging
log {
output stdout
format json
level INFO
}
# Security headers (Caddy adds many by default)
header {
# Prevent clickjacking
X-Frame-Options "SAMEORIGIN"
# Prevent MIME type sniffing
X-Content-Type-Options "nosniff"
# Enable XSS protection
X-XSS-Protection "1; mode=block"
# HSTS - Force HTTPS for 1 year
Strict-Transport-Security "max-age=31536000; includeSubDomains"
# Control referrer information
Referrer-Policy "strict-origin-when-cross-origin"
# Remove Server header (security by obscurity)
-Server
}
# Rate limiting (requires Caddy plugin - see note below)
# For basic setups, you can skip this or add later
}
```
**Important replacements:**
1. Replace `your-email@example.com` with your real email (Let's Encrypt sends expiry warnings here)
2. Domain names are already set to `maplefile.ca` and `www.maplefile.ca`
3. Backend hostname is already set to `maplefile-backend:8000`
Save: `Esc`, then `:wq`, then `Enter`
**Understanding the config:**
- **`maplefile-backend:8000`** - This is how Caddy reaches your backend
- `maplefile-backend` = hostname of your backend service (Docker DNS resolves this)
- `8000` = port your backend listens on
- No IP address needed - Docker overlay network handles it!
**Important: Service Name vs Hostname**
When you run `docker service ls`, you see:
```
maplefile_backend 1/1 registry.digitalocean.com/ssp/maplefile-backend:prod
```
But in the Caddyfile, we use `maplefile-backend:8000`, not `maplefile_backend:8000`. Why?
- **Service name** (`maplefile_backend`): How Docker Swarm identifies the service
- Used in: `docker service ls`, `docker service logs maplefile_backend`
- Format: `{stack-name}_{service-name}`
- **Hostname** (`maplefile-backend`): How containers reach each other on the network
- Used in: Caddyfile, application configs, container-to-container communication
- Defined in the stack file: `hostname: maplefile-backend`
**Think of it like this:**
- Service name = The employee's official HR name (full legal name)
- Hostname = The nickname everyone uses in the office
Other containers don't care about the service name - they use the hostname for DNS resolution.
- **`header_up`** - Passes information to your backend about the real client
- Without this, backend would think all requests come from Caddy
- Your backend can log real client IPs for security/debugging
- **Security headers** - Tell browsers how to handle your site securely
- HSTS: Forces browsers to always use HTTPS
- X-Frame-Options: Prevents your site from being embedded in iframes (clickjacking protection)
- X-Content-Type-Options: Prevents MIME confusion attacks
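Once Caddy is serving traffic, you can confirm these headers are actually present. A sketch over a captured `curl -sI https://maplefile.ca` response - the sample headers below are illustrative, not live output:

```shell
# Check that the expected security headers appear in an HTTPS response.
# In production: headers=$(curl -sI https://maplefile.ca)
headers='HTTP/2 200
strict-transport-security: max-age=31536000; includeSubDomains
x-frame-options: SAMEORIGIN
x-content-type-options: nosniff
referrer-policy: strict-origin-when-cross-origin'

missing=""
for h in strict-transport-security x-frame-options x-content-type-options referrer-policy; do
  printf '%s\n' "$headers" | grep -qi "^$h:" || missing="$missing $h"
done

if [ -z "$missing" ]; then
  headers_ok=yes
  echo "all security headers present"
else
  headers_ok=no
  echo "missing:$missing"
fi
```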
### 2.3 Understanding the Automatic SSL Magic
**What happens when Caddy starts:**
1. Caddy sees `maplefile.ca` in the Caddyfile
2. Caddy checks if domain points to this server (DNS check)
3. Caddy requests SSL certificate from Let's Encrypt
4. Let's Encrypt does a challenge (HTTP-01 via port 80)
5. Caddy receives certificate and stores it in `/data/caddy`
6. Caddy automatically serves HTTPS on port 443
7. Caddy automatically redirects HTTP → HTTPS
**You don't have to:**
- Manually run certbot commands
- Stop the server to renew certificates
- Set up cron jobs
- Mount certificate directories
**Caddy handles ALL of this automatically!**
---
## Step 3: Deploy Caddy Service
### 3.1 Update Stack File to Add Caddy
We need to UPDATE the existing `maplefile-stack.yml` file to add the `backend-caddy` service.
**On manager node:**
```bash
cd ~/stacks
vi maplefile-stack.yml
```
**Add the following sections to your existing stack file:**
**First, add volumes section after networks (if not already there):**
```yaml
volumes:
caddy_data:
# Caddy stores certificates here
caddy_config:
# Caddy stores config cache here
```
**Then, add configs section after volumes:**
```yaml
configs:
caddyfile:
file: ./maplefile-caddy-config/Caddyfile
```
**Finally, add the backend-caddy service after the backend service:**
```yaml
backend-caddy:
image: caddy:2.9.1-alpine
hostname: maplefile-caddy
networks:
- maple-public-prod
ports:
# Port 80 - HTTP (for Let's Encrypt challenges and HTTP→HTTPS redirect)
# Using mode: host to bind directly to worker-8's network interface
- target: 80
published: 80
protocol: tcp
mode: host
# Port 443 - HTTPS (encrypted traffic)
- target: 443
published: 443
protocol: tcp
mode: host
# Port 443 UDP - HTTP/3 support (optional, modern protocol)
- target: 443
published: 443
protocol: udp
mode: host
configs:
# Docker config - automatically distributed to worker-8
- source: caddyfile
target: /etc/caddy/Caddyfile
volumes:
# Persistent storage for certificates
- caddy_data:/data
# Persistent storage for config cache
- caddy_config:/config
deploy:
replicas: 1
placement:
constraints:
# Deploy on same node as backend (worker-8)
- node.labels.maplefile-backend == true
restart_policy:
condition: on-failure
delay: 5s
# Note: No max_attempts - Docker will keep trying indefinitely
# This prevents the service from scaling to 0 after a few failures
update_config:
# Rolling updates (zero downtime)
parallelism: 1
delay: 10s
order: start-first
resources:
limits:
# Caddy is lightweight - 256MB is plenty
memory: 256M
reservations:
memory: 128M
# Note: No healthcheck - Caddy's built-in health monitoring is sufficient
# Docker healthchecks can cause SIGTERM shutdowns during startup or cert renewal
```
Save: `Esc`, then `:wq`, then `Enter`
**Understanding the stack file:**
- **`maple-public-prod` network**: Shared network with backend
- Both Caddy and Backend are connected here
- Allows Caddy to reach Backend by hostname
- `external: true` means we created this network earlier (in 09_maplefile_backend.md)
- **Ports** (using `mode: host`):
- Port 80 (HTTP) - Needed for Let's Encrypt certificate challenges
- Port 443 (HTTPS TCP) - Encrypted traffic
- Port 443 (HTTPS UDP) - HTTP/3 support
- **Why `mode: host`?** Binds directly to worker-8's network interface
- `mode: ingress` (default) uses Docker Swarm routing mesh (any node can accept traffic)
- `mode: host` binds only on the specific node running Caddy
- Since we're pinning Caddy to worker-8 anyway, `host` mode is more reliable
- Prevents potential routing issues with Let's Encrypt challenges
- **Configs** (not volumes for Caddyfile):
- `caddyfile` - Docker config that's automatically distributed to worker-8
- Why not a volume mount? Because the file is on the manager, but Caddy runs on worker-8
- Docker configs solve this: they're stored in the swarm and sent to the right node
- Configs are immutable - to update, you must redeploy the stack
- **Volumes**:
- `caddy_data` - Stores SSL certificates (persists across restarts)
- `caddy_config` - Stores runtime config cache (persists across restarts)
- Why separate from backend data? So certificate renewals don't affect backend storage
- Volumes persist even if Caddy container is recreated
- **Placement constraint**:
- `node.labels.maplefile-backend == true` - Same as backend (worker-8)
- Caddy is pinned to worker-8 so its host-mode ports (80/443) bind on the node your DNS records point to
- The overlay network itself spans nodes, but colocating Caddy with the backend keeps proxy traffic on a single host
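The placement constraint only matches if worker-8 actually carries the `maplefile-backend=true` label (applied when the backend was deployed in 09_maplefile_backend.md). A small sketch for verifying it from the manager; the node name `worker-8` is an assumption:

```shell
# Apply the label once from the manager, if it is not already set:
#   docker node update --label-add maplefile-backend=true worker-8

# Check that a node's label JSON contains the expected key/value pair.
has_backend_label() {
  # $1 = output of: docker node inspect worker-8 --format '{{json .Spec.Labels}}'
  echo "$1" | grep -q '"maplefile-backend":"true"'
}

# Live usage:
#   has_backend_label "$(docker node inspect worker-8 --format '{{json .Spec.Labels}}')" \
#     && echo "worker-8 is labeled" || echo "label missing"
```

If the label is missing, the Caddy service will sit at 0/1 replicas with a "no suitable node" scheduling error.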
### 3.2 Deploy Updated Stack
**On manager node:**
```bash
# Deploy the updated stack
docker stack deploy -c ~/stacks/maplefile-stack.yml maplefile
# Check both services are running
docker service ls | grep maplefile
# Should show:
# maplefile_backend 1/1 registry.digitalocean.com/ssp/maplefile-backend:prod
# maplefile_backend-caddy 1/1 caddy:2.9.1-alpine
```
**Expected output:**
```
yexoj87lb67j maplefile_backend replicated 1/1 registry.digitalocean.com/ssp/maplefile-backend:prod
abc123xyz456 maplefile_backend-caddy replicated 1/1 caddy:2.9.1-alpine
```
### 3.3 Watch Caddy Start and Get SSL Certificate
**This is the exciting part - watch Caddy automatically get your SSL certificate!**
```bash
# Watch Caddy logs (real-time)
docker service logs -f maplefile_backend-caddy
# You'll see something like this:
# {"level":"info","msg":"using provided configuration","config_file":"/etc/caddy/Caddyfile"}
# {"level":"info","msg":"obtaining certificate","domain":"maplefile.ca"}
# {"level":"info","msg":"validating authorization","domain":"maplefile.ca","challenge":"http-01"}
# {"level":"info","msg":"authorization finalized","domain":"maplefile.ca"}
# {"level":"info","msg":"certificate obtained successfully","domain":"maplefile.ca"}
# {"level":"info","msg":"serving initial configuration"}
```
**Press `Ctrl+C` to exit log streaming when you see "certificate obtained successfully"**
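If you would rather script the wait than watch the stream, the log lines can be classified mechanically. A sketch; the matched strings mirror the example output above, but exact Caddy log wording may vary between versions:

```shell
# Classify one Caddy log line: obtained, failed, or still pending.
cert_status() {
  case "$1" in
    *"certificate obtained successfully"*) echo "obtained" ;;
    *"failed to obtain"*|*"challenge failed"*) echo "failed" ;;
    *) echo "pending" ;;
  esac
}

# Live usage: scan recent log lines for a terminal state.
#   docker service logs maplefile_backend-caddy --tail 50 2>&1 | while read -r line; do
#     cert_status "$line"
#   done | grep -m1 -E 'obtained|failed'
```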
**What just happened?**
1. Caddy loaded the Caddyfile
2. Caddy saw `maplefile.ca` and checked DNS
3. Caddy requested a certificate from Let's Encrypt
4. Let's Encrypt sent an HTTP challenge to port 80
5. Caddy responded to the challenge
6. Let's Encrypt verified ownership and issued the certificate
7. Caddy stored the certificate in the `caddy_data` volume
8. Caddy started serving HTTPS on port 443
**All of this happened in ~10-30 seconds, completely automatically!**
---
## Step 4: Test Your HTTPS Site
### 4.1 Test HTTP to HTTPS Redirect
**From your local machine:**
```bash
# Test HTTP (port 80) - should redirect to HTTPS
curl -I http://maplefile.ca
# Expected response:
# HTTP/1.1 308 Permanent Redirect
# Location: https://maplefile.ca/
```
**What this means:**
- Caddy received HTTP request
- Caddy automatically redirected to HTTPS
- Browser will follow redirect and use HTTPS
### 4.2 Test HTTPS Connection
```bash
# Test HTTPS (port 443)
curl -I https://maplefile.ca/health
# Expected response:
# HTTP/2 200
# Content-Type: application/json
# (Your backend's response)
```
**If you see HTTP/2 200, congratulations! Your site is:**
- ✅ Serving over HTTPS
- ✅ Using HTTP/2 (faster than HTTP/1.1)
- ✅ Protected by a valid Let's Encrypt SSL certificate
- ✅ Automatically redirecting HTTP to HTTPS
### 4.3 Test in Browser
**Open your browser and visit:**
1. `http://maplefile.ca/version` - Should redirect to HTTPS
2. `https://maplefile.ca/version` - Should show your backend's response
3. `https://www.maplefile.ca/version` - Should also work (www subdomain)
**Click the padlock icon in your browser address bar:**
- Should show "Connection is secure"
- Certificate issued by: Let's Encrypt
- Valid for: `maplefile.ca` and `www.maplefile.ca`
- Expires in: ~90 days (Caddy will auto-renew at 60 days)
### 4.4 Test SSL Certificate
**Use SSL Labs to test your certificate (optional but recommended):**
1. Visit: https://www.ssllabs.com/ssltest/
2. Enter: `maplefile.ca`
3. Click "Submit"
4. Wait 2-3 minutes for the test
**Expected grade: A or A+**
If you get less than A, check:
- Security headers in Caddyfile
- HSTS header is present
- No insecure protocols enabled
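The header checks can be automated with a short scan over a `curl -sI` dump. A sketch; the header names match the ones this guide's Caddyfile sets:

```shell
# Report which of the expected security headers are absent from a
# curl -sI response dump. Prints nothing when all are present.
missing_security_headers() {
  dump="$1"
  missing=""
  for h in Strict-Transport-Security X-Frame-Options X-Content-Type-Options; do
    printf '%s\n' "$dump" | grep -qi "^$h:" || missing="$missing $h"
  done
  echo "${missing# }"
}

# Live usage:
#   missing_security_headers "$(curl -sI https://maplefile.ca)"
```

An empty result means all three headers are being served; anything listed is a likely cause of a sub-A SSL Labs grade.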
---
## Step 5: Verify Services
### 5.1 Check All Services Running
```bash
# List all maplefile services
docker service ls | grep maplefile
# Expected output:
# maplefile_backend 1/1 registry.digitalocean.com/ssp/maplefile-backend:prod
# maplefile_backend-caddy 1/1 caddy:2.9.1-alpine
```
### 5.2 Check Service Tasks
```bash
# Check backend tasks
docker service ps maplefile_backend
# Check caddy tasks
docker service ps maplefile_backend-caddy
# Both should show:
# CURRENT STATE: Running X minutes ago
# No ERROR messages
```
### 5.3 Test Backend Health
```bash
# Test over plain HTTP (Caddy answers with a redirect, so follow it with -L)
curl -L http://maplefile.ca/health
# Expected: {"status":"healthy"} or similar
# Test through Caddy (HTTPS)
curl https://maplefile.ca/health
# Should return the same response
```
---
## Troubleshooting
### Problem: Caddy Can't Get SSL Certificate
**Symptom:** Caddy logs show "failed to obtain certificate" or "challenge failed"
**Causes and fixes:**
1. **DNS not pointing to worker-8**
```bash
# Test DNS
dig maplefile.ca +short
# Should return worker-8's public IP (143.110.212.253)
# If wrong, update DNS records and wait for propagation (5-60 min)
```
2. **Port 80 not accessible**
```bash
# Test from outside
curl -I http://maplefile.ca
# If connection refused, check firewall
ssh dockeradmin@143.110.212.253
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
```
3. **Caddyfile has wrong domain**
```bash
# Check Caddyfile on manager
cat ~/stacks/maplefile-caddy-config/Caddyfile
# Should show: maplefile.ca www.maplefile.ca
# If wrong, edit and redeploy
vi ~/stacks/maplefile-caddy-config/Caddyfile
docker stack deploy -c ~/stacks/maplefile-stack.yml maplefile
```
4. **Let's Encrypt rate limit (5 duplicate certificates per week)**
```bash
# Check Caddy logs for rate limit message
docker service logs maplefile_backend-caddy | grep -i "rate limit"
# If rate limited, wait 7 days or use staging for testing
# Edit Caddyfile to use staging:
# acme_ca https://acme-staging-v02.api.letsencrypt.org/directory
```
### Problem: HTTP Not Redirecting to HTTPS
**Symptom:** `http://maplefile.ca` doesn't redirect
**Fix:**
```bash
# Check Caddy is running
docker service ps maplefile_backend-caddy
# Check Caddy logs
docker service logs maplefile_backend-caddy --tail 50
# Caddy should automatically redirect HTTP to HTTPS
# If not, check Caddyfile syntax
```
### Problem: Backend Not Reachable Through Caddy
**Symptom:** HTTPS works but returns 502 Bad Gateway
**Causes:**
1. **Backend not running**
```bash
docker service ps maplefile_backend
# Should show: Running
```
2. **Backend not on maple-public-prod network**
```bash
# Check backend networks
docker service inspect maplefile_backend --format '{{json .Spec.TaskTemplate.Networks}}'
# Should include maple-public-prod
```
3. **Wrong hostname in Caddyfile**
```bash
# Check Caddyfile
cat ~/stacks/maplefile-caddy-config/Caddyfile | grep reverse_proxy
# Should show: reverse_proxy maplefile-backend:8000
# NOT: maplefile_backend:8000 (wrong - underscore instead of hyphen)
```
### Problem: Certificate Renewal Fails
**Symptom:** Certificate expires or renewal warnings in logs
**Fix:**
```bash
# Check Caddy logs for renewal attempts
docker service logs maplefile_backend-caddy | grep -i renew
# Caddy renews at 60 days (certificate valid for 90 days)
# If renewal fails, check:
# 1. DNS still points to worker-8
# 2. Port 80 still open
# 3. Caddy service still running
# Force renewal (if needed)
# Restart Caddy service
docker service update --force maplefile_backend-caddy
```
---
## Maintenance
### Updating Caddyfile
When you need to change Caddy configuration:
```bash
# 1. Edit Caddyfile on manager
ssh dockeradmin@143.110.210.162
vi ~/stacks/maplefile-caddy-config/Caddyfile
# 2. Redeploy stack
docker stack deploy -c ~/stacks/maplefile-stack.yml maplefile
# 3. Watch Caddy reload
docker service logs -f maplefile_backend-caddy
# Caddy will gracefully reload with zero downtime
```
### Monitoring SSL Certificate Expiry
```bash
# Check certificate expiry
echo | openssl s_client -servername maplefile.ca -connect maplefile.ca:443 2>/dev/null | openssl x509 -noout -dates
# Returns:
# notBefore=Jan 15 12:00:00 2025 GMT
# notAfter=Apr 15 12:00:00 2025 GMT
# Caddy automatically renews at 60 days (30 days before expiry)
```
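To turn that `notAfter` date into a number suitable for alerting, convert it to days remaining. A sketch using GNU `date` syntax (BSD/macOS `date` needs different flags):

```shell
# Days until a certificate's notAfter date (GNU date syntax).
days_until_expiry() {
  # $1 = date string, e.g. "Apr 15 12:00:00 2025 GMT"
  expiry=$(date -d "$1" +%s)
  now=$(date +%s)
  echo $(( (expiry - now) / 86400 ))
}

# Live usage: extract the expiry date, then alert if renewal seems overdue.
#   not_after=$(echo | openssl s_client -servername maplefile.ca -connect maplefile.ca:443 2>/dev/null \
#     | openssl x509 -noout -enddate | cut -d= -f2)
#   days=$(days_until_expiry "$not_after")
#   [ "$days" -lt 14 ] && echo "WARNING: certificate expires in $days days"
```

Since Caddy renews around 30 days before expiry, anything under ~14 days remaining suggests renewal is failing and warrants a look at the logs.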
### Viewing Caddy Access Logs
```bash
# Real-time logs
docker service logs -f maplefile_backend-caddy
# Last 100 lines
docker service logs maplefile_backend-caddy --tail 100
# Filter for errors
docker service logs maplefile_backend-caddy | grep -i error
```
---
## Security Best Practices
### 1. Keep Caddy Updated
```bash
# Check current version
docker service inspect maplefile_backend-caddy --format '{{.Spec.TaskTemplate.ContainerSpec.Image}}'
# Update to latest (in stack file)
vi ~/stacks/maplefile-stack.yml
# Change: image: caddy:2.9.1-alpine
# To: image: caddy:2.10.0-alpine (or latest version)
# Redeploy
docker stack deploy -c ~/stacks/maplefile-stack.yml maplefile
```
### 2. Monitor Certificate Health
Set up monitoring to alert before certificate expiry:
- Let's Encrypt certificates expire in 90 days
- Caddy renews at 60 days
- Monitor renewal attempts in logs
- Set up alerts if renewal fails
### 3. Review Access Logs Regularly
```bash
# Check for suspicious access patterns
docker service logs maplefile_backend-caddy | grep -E "404|403|500"
# Look for unusual traffic spikes
docker service logs maplefile_backend-caddy | grep -i "POST\|PUT\|DELETE"
```
---
## Summary
**What you've accomplished:**
✅ Deployed Caddy reverse proxy on worker-8
✅ Obtained automatic SSL certificate from Let's Encrypt
✅ Configured HTTPS with HTTP/2 support
✅ Set up automatic HTTP → HTTPS redirects
✅ Added security headers (HSTS, X-Frame-Options, etc.)
✅ Configured Caddy to forward client IPs to backend
✅ Set up automatic certificate renewal (Caddy renews at 60 days, 30 days before expiry)
**Your MapleFile backend is now:**
- Publicly accessible at `https://maplefile.ca`
- Protected by SSL/TLS encryption
- Behind a reverse proxy for security
- Automatically renewing certificates
- Serving HTTP/2 for better performance
**Next steps:**
- Deploy MapleFile frontend (connects to this backend)
- Set up monitoring and alerting
- Configure backups for Caddy volumes
- Review and tune security headers
- Set up rate limiting (if needed)
**Important URLs:**
- Backend API: `https://maplefile.ca`
- Health check: `https://maplefile.ca/health`
- SSL Labs test: https://www.ssllabs.com/ssltest/analyze.html?d=maplefile.ca
---
## Quick Reference
### Common Commands
```bash
# View Caddy logs
docker service logs -f maplefile_backend-caddy
# Restart Caddy (zero downtime)
docker service update --force maplefile_backend-caddy
# Update Caddyfile and reload
vi ~/stacks/maplefile-caddy-config/Caddyfile
docker stack deploy -c ~/stacks/maplefile-stack.yml maplefile
# Check SSL certificate
echo | openssl s_client -servername maplefile.ca -connect maplefile.ca:443 2>/dev/null | openssl x509 -noout -dates
# Test HTTPS
curl -I https://maplefile.ca
# Check service status
docker service ps maplefile_backend-caddy
```
### File Locations
- Caddyfile: `~/stacks/maplefile-caddy-config/Caddyfile`
- Stack file: `~/stacks/maplefile-stack.yml`
- Certificates: Stored in `caddy_data` Docker volume
- Config cache: Stored in `caddy_config` Docker volume
---
**🎉 Congratulations!** Your MapleFile backend is now securely accessible over HTTPS with automatic SSL certificate management!

# Extra Operations and Domain Changes
**Audience**: DevOps Engineers, Infrastructure Team
**Time to Complete**: Varies by operation
**Prerequisites**: Completed guides 01-07 (full MaplePress deployment)
---
## Overview
This guide covers additional operations and changes that you might need to perform on your production infrastructure:
1. **Domain Changes**
- Changing backend domain (e.g., `getmaplepress.ca``getmaplepress.net`)
- Changing frontend domain (e.g., `getmaplepress.com``getmaplepress.app`)
2. **SSL Certificate Management**
3. **Scaling Operations**
4. **Backup and Recovery**
---
## Table of Contents
1. [Change Backend Domain](#operation-1-change-backend-domain)
2. [Change Frontend Domain](#operation-2-change-frontend-domain)
3. [Change Both Domains](#operation-3-change-both-domains-at-once)
4. [Force SSL Certificate Renewal](#operation-4-force-ssl-certificate-renewal)
5. [Scale Backend Horizontally](#operation-5-scale-backend-horizontally)
---
## Operation 1: Change Backend Domain
**Scenario:** Changing backend API domain from `getmaplepress.ca``getmaplepress.net`
**Impact:**
- ✅ Backend becomes available at new domain
- ❌ Old domain stops working
- ⚠️ Frontend needs CORS update to allow new backend domain
- ⚠️ SSL certificate automatically obtained for new domain
- ⚠️ Downtime: ~2-5 minutes during redeployment
### Step 1: DNS Configuration
**First, point the new domain to worker-6:**
1. Log into your DNS provider (DigitalOcean, Cloudflare, etc.)
2. Create DNS A records for new domain:
```
Type: A Record
Name: getmaplepress.net
Value: <worker-6-public-ip>
TTL: 300 (5 minutes)
Type: A Record
Name: www.getmaplepress.net
Value: <worker-6-public-ip>
TTL: 300
```
3. Wait for DNS propagation (5-60 minutes):
```bash
# Test DNS from your local machine
dig getmaplepress.net +short
# Should show: <worker-6-public-ip>
dig www.getmaplepress.net +short
# Should show: <worker-6-public-ip>
# Alternative test
nslookup getmaplepress.net
```
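Rather than re-running `dig` by hand, propagation can be polled in a loop. A sketch with the resolver passed in as a function name so the loop itself is easy to test; for live use, uncomment the `sleep`:

```shell
# Poll until the resolver function reports the expected IP, or give up.
wait_for_dns() {
  # $1 = expected IP, $2 = max attempts, $3 = name of a resolver function
  i=0
  while [ "$i" -lt "$2" ]; do
    [ "$("$3")" = "$1" ] && { echo "propagated"; return 0; }
    i=$((i + 1))
    # sleep 30   # uncomment for live use (checks every 30 seconds)
  done
  echo "timed out"
  return 1
}

# Live usage (replace <worker-6-public-ip> with the real address):
#   resolver() { dig +short getmaplepress.net | head -1; }
#   wait_for_dns "<worker-6-public-ip>" 120 resolver
```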
### Step 2: Update Backend Caddyfile
**On manager node:**
```bash
ssh dockeradmin@<manager-public-ip>
cd ~/stacks/caddy-config
# Backup old Caddyfile
cp Caddyfile Caddyfile.backup.$(date +%Y%m%d)
# Edit Caddyfile
vi Caddyfile
```
**Change this:**
```caddy
# OLD DOMAIN
getmaplepress.ca www.getmaplepress.ca {
reverse_proxy maplepress-backend:8000 {
# ... config ...
}
}
```
**To this:**
```caddy
# NEW DOMAIN
getmaplepress.net www.getmaplepress.net {
reverse_proxy maplepress-backend:8000 {
header_up X-Real-IP {remote_host}
header_up X-Forwarded-For {remote_host}
header_up X-Forwarded-Proto {scheme}
header_up X-Forwarded-Host {host}
# IMPORTANT: Preserve Origin header for CORS
header_up Origin {http.request.header.Origin}
}
log {
output stdout
format json
level INFO
}
header {
X-Frame-Options "SAMEORIGIN"
X-Content-Type-Options "nosniff"
X-XSS-Protection "1; mode=block"
Strict-Transport-Security "max-age=31536000; includeSubDomains"
Referrer-Policy "strict-origin-when-cross-origin"
-Server
}
}
```
Save: `Esc`, `:wq`, `Enter`
### Step 3: Update CORS Configuration
**Update the stack file to allow the frontend to call the new backend domain:**
```bash
# Still on manager node
cd ~/stacks
vi maplepress-stack.yml
```
**Find this line:**
```yaml
- SECURITY_CORS_ALLOWED_ORIGINS=https://getmaplepress.com,https://www.getmaplepress.com
```
**No change needed** - `SECURITY_CORS_ALLOWED_ORIGINS` lists which origins may call the backend, not the backend's own domain. The frontend (`getmaplepress.com`) will now call `getmaplepress.net` instead of `getmaplepress.ca`.
### Step 4: Redeploy Backend Stack
```bash
# Remove old stack
docker stack rm maplepress
sleep 10
# Remove old config (contains old domain)
docker config rm maplepress_caddyfile
# Deploy with new domain
docker stack deploy -c maplepress-stack.yml maplepress
# Watch services come up
docker service ps maplepress_backend
docker service ps maplepress_backend-caddy
```
### Step 5: Verify SSL Certificate
**Caddy will automatically obtain SSL certificates for the new domain:**
```bash
# Watch Caddy logs for certificate acquisition
docker service logs -f maplepress_backend-caddy
# You should see logs like:
# "certificate obtained successfully"
# "serving https://getmaplepress.net"
```
**Test from local machine:**
```bash
# Test new domain with HTTPS
curl -I https://getmaplepress.net/health
# Should return: HTTP/2 200
# Verify SSL certificate
curl -vI https://getmaplepress.net/health 2>&1 | grep "subject:"
# Should show: subject: CN=getmaplepress.net
# Test CORS
curl -v -H "Origin: https://getmaplepress.com" https://getmaplepress.net/health 2>&1 | grep "access-control-allow-origin"
# Should show: access-control-allow-origin: https://getmaplepress.com
```
### Step 6: Update Frontend to Use New Backend Domain
**On your local machine:**
```bash
cd ~/go/src/codeberg.org/mapleopentech/monorepo/web/maplepress-frontend
# Update production environment file
vi .env.production
```
**Change:**
```bash
# OLD
VITE_API_BASE_URL=https://getmaplepress.ca
# NEW
VITE_API_BASE_URL=https://getmaplepress.net
```
**Rebuild and redeploy frontend:**
```bash
# Build with new backend URL
npm run build
# Verify the new URL is in the build
grep -r "getmaplepress.net" dist/assets/*.js | head -2
# Should show: getmaplepress.net
# SSH to worker-7 and update the frontend build
ssh dockeradmin@<worker-7-public-ip>
cd /var/www/monorepo/web/maplepress-frontend
# Pull latest code
git pull origin main
# Rebuild
npm run build
# Verify symlink
ls -la /var/www/maplepress-frontend
# Should point to: /var/www/monorepo/web/maplepress-frontend/dist
exit
```
### Step 7: Test End-to-End
```bash
# Visit frontend in browser
open https://getmaplepress.com
# Open DevTools (F12) → Network tab
# Verify API calls now go to: https://getmaplepress.net
# Verify status: 200 (not 0 or CORS errors)
```
### Step 8: (Optional) Keep Old Domain Working
If you want both domains to work temporarily:
```bash
# Edit Caddyfile to include BOTH domains
vi ~/stacks/caddy-config/Caddyfile
```
```caddy
# Support both old and new domains
getmaplepress.ca www.getmaplepress.ca, getmaplepress.net www.getmaplepress.net {
reverse_proxy maplepress-backend:8000 {
# ... same config ...
}
}
```
Then redeploy as in Step 4.
### Rollback Procedure
If something goes wrong:
```bash
# 1. Restore old Caddyfile
cd ~/stacks/caddy-config
cp Caddyfile.backup.YYYYMMDD Caddyfile
# 2. Redeploy
cd ~/stacks
docker stack rm maplepress
sleep 10
docker config rm maplepress_caddyfile
docker stack deploy -c maplepress-stack.yml maplepress
# 3. Restore frontend .env.production
cd ~/go/src/codeberg.org/mapleopentech/monorepo/web/maplepress-frontend
# Change back to: VITE_API_BASE_URL=https://getmaplepress.ca
# Rebuild and redeploy
```
**✅ Backend domain change complete!**
---
## Operation 2: Change Frontend Domain
**Scenario:** Changing frontend domain from `getmaplepress.com``getmaplepress.app`
**Impact:**
- ✅ Frontend becomes available at new domain
- ❌ Old domain stops working
- ⚠️ Backend CORS needs update to allow new frontend domain
- ⚠️ SSL certificate automatically obtained for new domain
- ⚠️ Downtime: ~2-5 minutes during redeployment
### Step 1: DNS Configuration
**Point the new domain to worker-7:**
```
Type: A Record
Name: getmaplepress.app
Value: <worker-7-public-ip>
TTL: 300
Type: A Record
Name: www.getmaplepress.app
Value: <worker-7-public-ip>
TTL: 300
```
**Test DNS propagation:**
```bash
dig getmaplepress.app +short
# Should show: <worker-7-public-ip>
nslookup getmaplepress.app
```
### Step 2: Update Frontend Caddyfile
**On manager node:**
```bash
ssh dockeradmin@<manager-public-ip>
cd ~/stacks/maplepress-frontend-caddy-config
# Backup
cp Caddyfile Caddyfile.backup.$(date +%Y%m%d)
# Edit
vi Caddyfile
```
**Change this:**
```caddy
# OLD DOMAIN
getmaplepress.com www.getmaplepress.com {
root * /var/www/maplepress-frontend
# ... config ...
}
```
**To this:**
```caddy
# NEW DOMAIN
getmaplepress.app www.getmaplepress.app {
root * /var/www/maplepress-frontend
file_server
try_files {path} /index.html
encode gzip
log {
output stdout
format json
level INFO
}
header {
X-Frame-Options "SAMEORIGIN"
X-Content-Type-Options "nosniff"
X-XSS-Protection "1; mode=block"
Strict-Transport-Security "max-age=31536000; includeSubDomains"
Referrer-Policy "strict-origin-when-cross-origin"
-Server
}
@static {
path *.js *.css *.png *.jpg *.jpeg *.gif *.svg *.woff *.woff2 *.ttf *.eot *.ico
}
header @static Cache-Control "public, max-age=31536000, immutable"
}
```
Save: `Esc`, `:wq`, `Enter`
### Step 3: Update Backend CORS Configuration
**CRITICAL:** The backend needs to allow the new frontend domain:
```bash
cd ~/stacks
vi maplepress-stack.yml
```
**Find this line:**
```yaml
- SECURITY_CORS_ALLOWED_ORIGINS=https://getmaplepress.com,https://www.getmaplepress.com
```
**Change to:**
```yaml
- SECURITY_CORS_ALLOWED_ORIGINS=https://getmaplepress.app,https://www.getmaplepress.app
```
**If you want to support BOTH old and new domains temporarily:**
```yaml
- SECURITY_CORS_ALLOWED_ORIGINS=https://getmaplepress.com,https://www.getmaplepress.com,https://getmaplepress.app,https://www.getmaplepress.app
```
### Step 4: Redeploy Backend (for CORS update)
```bash
# Backend CORS config changed, must redeploy
docker stack rm maplepress
sleep 10
docker config rm maplepress_caddyfile
docker stack deploy -c maplepress-stack.yml maplepress
# Verify backend running
docker service ps maplepress_backend
```
### Step 5: Redeploy Frontend
```bash
# Remove frontend stack
docker stack rm maplepress-frontend
sleep 10
docker config rm maplepress-frontend_caddyfile
# Deploy with new domain
docker stack deploy -c maplepress-frontend-stack.yml maplepress-frontend
# Watch it come up
docker service ps maplepress-frontend_caddy
```
### Step 6: Verify SSL Certificate
**Test from local machine:**
```bash
# Test new frontend domain
curl -I https://getmaplepress.app
# Should return: HTTP/2 200
# Verify SSL certificate
curl -vI https://getmaplepress.app 2>&1 | grep "subject:"
# Should show: subject: CN=getmaplepress.app
```
### Step 7: Test CORS from New Frontend
```bash
# Visit new frontend in browser
open https://getmaplepress.app
# Open DevTools (F12)
# Network tab: Verify API calls succeed
# Console tab: Should be NO CORS errors
```
### Step 8: Verify Backend Accepts New Origin
```bash
# Test CORS from backend perspective
curl -v -H "Origin: https://getmaplepress.app" https://getmaplepress.ca/health 2>&1 | grep "access-control-allow-origin"
# Should show: access-control-allow-origin: https://getmaplepress.app
```
### Rollback Procedure
```bash
# 1. Restore old frontend Caddyfile
cd ~/stacks/maplepress-frontend-caddy-config
cp Caddyfile.backup.YYYYMMDD Caddyfile
# 2. Restore old backend CORS config
cd ~/stacks
vi maplepress-stack.yml
# Change back to: https://getmaplepress.com,https://www.getmaplepress.com
# 3. Redeploy both
docker stack rm maplepress
docker stack rm maplepress-frontend
sleep 10
docker config rm maplepress_caddyfile
docker config rm maplepress-frontend_caddyfile
docker stack deploy -c maplepress-stack.yml maplepress
docker stack deploy -c maplepress-frontend-stack.yml maplepress-frontend
```
**✅ Frontend domain change complete!**
---
## Operation 3: Change Both Domains at Once
**Scenario:** Changing both domains simultaneously:
- Backend: `getmaplepress.ca``api.maplepress.io`
- Frontend: `getmaplepress.com``app.maplepress.io`
**Benefits:**
- Single maintenance window
- Coordinated cutover
- Clean brand migration
**Downtime:** ~5-10 minutes
### Complete Process
```bash
# ==============================================================================
# STEP 1: DNS Configuration (Do this first, wait for propagation)
# ==============================================================================
# Backend DNS:
# A Record: api.maplepress.io → <worker-6-public-ip>
# A Record: www.api.maplepress.io → <worker-6-public-ip>
# Frontend DNS:
# A Record: app.maplepress.io → <worker-7-public-ip>
# A Record: www.app.maplepress.io → <worker-7-public-ip>
# Test DNS
dig api.maplepress.io +short # Should show worker-6 IP
dig app.maplepress.io +short # Should show worker-7 IP
# ==============================================================================
# STEP 2: Update Backend Caddyfile
# ==============================================================================
ssh dockeradmin@<manager-public-ip>
cd ~/stacks/caddy-config
cp Caddyfile Caddyfile.backup.$(date +%Y%m%d)
vi Caddyfile
# Change domain from getmaplepress.ca to api.maplepress.io
# (Keep all other config the same)
# ==============================================================================
# STEP 3: Update Frontend Caddyfile
# ==============================================================================
cd ~/stacks/maplepress-frontend-caddy-config
cp Caddyfile Caddyfile.backup.$(date +%Y%m%d)
vi Caddyfile
# Change domain from getmaplepress.com to app.maplepress.io
# (Keep all other config the same)
# ==============================================================================
# STEP 4: Update Backend CORS for New Frontend Domain
# ==============================================================================
cd ~/stacks
vi maplepress-stack.yml
# Change:
# - SECURITY_CORS_ALLOWED_ORIGINS=https://app.maplepress.io,https://www.app.maplepress.io
# ==============================================================================
# STEP 5: Update Frontend .env.production for New Backend
# ==============================================================================
ssh dockeradmin@<worker-7-public-ip>
cd /var/www/monorepo/web/maplepress-frontend
vi .env.production
# Change:
# VITE_API_BASE_URL=https://api.maplepress.io
# Rebuild
npm run build
# Verify new URL in build
grep -r "api.maplepress.io" dist/assets/*.js | head -2
exit
# ==============================================================================
# STEP 6: Coordinated Deployment (Back on Manager)
# ==============================================================================
ssh dockeradmin@<manager-public-ip>
cd ~/stacks
# Remove both stacks
docker stack rm maplepress
docker stack rm maplepress-frontend
sleep 10
# Remove configs
docker config rm maplepress_caddyfile
docker config rm maplepress-frontend_caddyfile
# Deploy both stacks
docker stack deploy -c maplepress-stack.yml maplepress
docker stack deploy -c maplepress-frontend-stack.yml maplepress-frontend
# ==============================================================================
# STEP 7: Verify Both Services
# ==============================================================================
docker service ls | grep maplepress
# Should show 3 services all 1/1:
# maplepress_backend
# maplepress_backend-caddy
# maplepress-frontend_caddy
# ==============================================================================
# STEP 8: Test End-to-End (Local Machine)
# ==============================================================================
# Test backend
curl -I https://api.maplepress.io/health
# Should return: HTTP/2 200
# Test frontend
curl -I https://app.maplepress.io
# Should return: HTTP/2 200
# Test CORS
curl -v -H "Origin: https://app.maplepress.io" https://api.maplepress.io/health 2>&1 | grep "access-control"
# Should show: access-control-allow-origin: https://app.maplepress.io
# Test in browser
open https://app.maplepress.io
# DevTools → Network: Verify calls to api.maplepress.io succeed
```
**✅ Both domain changes complete!**
---
## Operation 4: Force SSL Certificate Renewal
**Scenario:** You need to manually renew SSL certificates (rarely needed - Caddy auto-renews)
### When You Might Need This
- Testing certificate renewal process
- Certificate was revoked
- Manual intervention required after failed auto-renewal
### Backend Certificate Renewal
```bash
# SSH to worker-6
ssh dockeradmin@<worker-6-public-ip>
# Get Caddy container ID
docker ps | grep maplepress_backend-caddy
# Access Caddy container
docker exec -it <container-id> sh
# Inside container - force certificate renewal
caddy reload --config /etc/caddy/Caddyfile --force
# Or restart Caddy to trigger renewal
exit
# Back on worker-6
docker service update --force maplepress_backend-caddy
# Watch logs for certificate acquisition
docker service logs -f maplepress_backend-caddy | grep -i certificate
```
### Frontend Certificate Renewal
```bash
# SSH to worker-7
ssh dockeradmin@<worker-7-public-ip>
# Get Caddy container ID
docker ps | grep maplepress-frontend
# Force reload
docker exec <container-id> caddy reload --config /etc/caddy/Caddyfile --force
# Or force restart
exit
docker service update --force maplepress-frontend_caddy
# Watch logs
docker service logs -f maplepress-frontend_caddy | grep -i certificate
```
### Verify New Certificate
```bash
# From local machine
openssl s_client -connect getmaplepress.ca:443 -servername getmaplepress.ca < /dev/null 2>/dev/null | openssl x509 -noout -dates
# Should show:
# notBefore=Nov 5 12:00:00 2025 GMT
# notAfter=Feb 3 12:00:00 2026 GMT
```
---
## Operation 5: Scale Backend Horizontally
**Scenario:** Your backend needs to handle more traffic - add more replicas
### Considerations
- Each replica needs database connections
- Cassandra can handle the load (QUORUM with 3 nodes)
- Redis connections are pooled
- Stateless design allows easy horizontal scaling
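A quick sanity check before scaling: total database connection demand grows linearly with replicas. A back-of-the-envelope sketch (the per-replica pool size of 10 is an assumption; substitute your backend's actual pool setting):

```shell
# Total connection demand = replicas × per-replica pool size.
total_connections() {
  echo $(( $1 * $2 ))
}

# With an assumed pool of 10 connections per replica:
total_connections 3 10   # 3 replicas
```

Compare the result against your Cassandra and Redis connection limits before raising the replica count.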
### Scale to 3 Replicas
```bash
# On manager node
cd ~/stacks
vi maplepress-stack.yml
# Find backend service, change replicas
# FROM:
# deploy:
# replicas: 1
# TO:
# deploy:
# replicas: 3
# Redeploy
docker stack deploy -c maplepress-stack.yml maplepress
# Watch replicas come up
watch docker service ps maplepress_backend
# Press Ctrl+C when all show Running
# Verify all healthy
docker service ps maplepress_backend --filter "desired-state=running"
# Should show 3 replicas
```
### Load Balancing
Load balancing happens automatically: Caddy proxies to the service hostname, and Docker Swarm's virtual IP spreads connections across the replicas:
```bash
# Test load balancing
for i in {1..10}; do
curl -s https://getmaplepress.ca/health
sleep 1
done
# Check which replicas handled requests
docker service logs maplepress_backend | grep "GET /health" | tail -20
# You should see different container IDs handling requests
```
### Scale Back Down
```bash
# Edit stack file
vi ~/stacks/maplepress-stack.yml
# Change back to replicas: 1
# Redeploy
docker stack deploy -c maplepress-stack.yml maplepress
# Verify
docker service ps maplepress_backend
# Should show only 1 replica running, others Shutdown
```
---
## Quick Reference: Domain Change Checklist
### Backend Domain Change
- [ ] Update DNS A records (point new domain to worker-6)
- [ ] Wait for DNS propagation (5-60 minutes)
- [ ] Backup Caddyfile: `cp Caddyfile Caddyfile.backup.$(date +%Y%m%d)`
- [ ] Update backend Caddyfile with new domain
- [ ] Redeploy backend stack
- [ ] Verify SSL certificate obtained for new domain
- [ ] Update frontend `.env.production` with new backend URL
- [ ] Rebuild and redeploy frontend
- [ ] Test CORS end-to-end
### Frontend Domain Change
- [ ] Update DNS A records (point new domain to worker-7)
- [ ] Wait for DNS propagation
- [ ] Backup frontend Caddyfile
- [ ] Update frontend Caddyfile with new domain
- [ ] **Update backend CORS** in `maplepress-stack.yml`
- [ ] Redeploy backend (for CORS)
- [ ] Redeploy frontend stack
- [ ] Verify SSL certificate
- [ ] Test in browser (no CORS errors)
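For reference, the origin list the backend checks lives in its service environment. A sketch of the relevant stack-file fragment, assuming `SECURITY_CORS_ALLOWED_ORIGINS` accepts a comma-separated list (the domains below are placeholders):

```yaml
services:
  backend:
    environment:
      # Must include every frontend origin, old and new, during migration
      - SECURITY_CORS_ALLOWED_ORIGINS=https://old-frontend.example.com,https://new-frontend.example.com
```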
---
## Troubleshooting Domain Changes
### Problem: SSL Certificate Not Obtained
**Symptom:** After domain change, HTTPS doesn't work
```bash
# Check Caddy logs
docker service logs maplepress_backend-caddy --tail 100 | grep -i "acme\|certificate"
# Common issues:
# 1. DNS not propagated - wait longer
# 2. Port 80 not accessible - check firewall
# 3. Let's Encrypt rate limit - wait 1 hour
```
**Fix:**
```bash
# Verify DNS resolves
dig <new-domain> +short
# Must show correct worker IP
# Verify port 80 accessible
curl http://<new-domain>
# Should redirect to HTTPS
# If rate limited, wait and retry
# Let's Encrypt duplicate-certificate limit: 5 per week for the same exact domain set
```
### Problem: CORS Errors After Domain Change
**Symptom:** Frontend shows CORS errors in browser console
**Cause:** Forgot to update backend CORS configuration
**Fix:**
```bash
# Check backend CORS config
cat ~/stacks/maplepress-stack.yml | grep CORS
# Should include NEW frontend domain
# Update if needed
vi ~/stacks/maplepress-stack.yml
# Add new frontend domain to SECURITY_CORS_ALLOWED_ORIGINS
# Redeploy backend
docker stack rm maplepress
sleep 10
docker config rm maplepress_caddyfile
docker stack deploy -c maplepress-stack.yml maplepress
# Test CORS
curl -v -H "Origin: https://<new-frontend-domain>" https://<backend-domain>/health 2>&1 | grep "access-control"
```
### Problem: Old Domain Still Works
**Symptom:** Both old and new domains work
**Cause:** Caddyfile includes both domains
**Expected Behavior:** This is fine during migration; you can serve both domains temporarily
**To Remove Old Domain:**
```bash
# Edit Caddyfile and remove old domain
vi ~/stacks/caddy-config/Caddyfile
# Remove old domain from the domain list
# Redeploy
docker stack rm maplepress
sleep 10
docker config rm maplepress_caddyfile
docker stack deploy -c maplepress-stack.yml maplepress
```
---
**Last Updated**: November 2025
**Maintained By**: Infrastructure Team

# Production Infrastructure Setup Guide
**Audience**: DevOps Engineers, Infrastructure Team, Junior Engineers
**Purpose**: Complete step-by-step deployment of Maple Open Technologies production infrastructure from scratch
**Time to Complete**: 6-8 hours (first-time deployment)
**Prerequisites**: DigitalOcean account, basic Linux knowledge, SSH access
---
## Overview
This directory contains comprehensive guides for deploying Maple Open Technologies production infrastructure on DigitalOcean from a **completely fresh start**. Follow these guides in sequential order to build a complete, production-ready infrastructure.
**What you'll build:**
- Docker Swarm cluster (7+ nodes)
- High-availability databases (Cassandra 3-node cluster)
- Caching layer (Redis)
- Search engine (Meilisearch)
- Backend API (Go application)
- Frontend (React SPA)
- Automatic HTTPS with SSL certificates
- Multi-application architecture (MaplePress, MapleFile)
**Infrastructure at completion:**
```
Internet (HTTPS)
├─ getmaplepress.ca  → Backend API (worker-6)
└─ getmaplepress.com → Frontend (worker-7)

Backend Services (maple-public-prod + maple-private-prod)

Databases (maple-private-prod only)
├─ Cassandra: 3-node cluster (workers 2,3,4) - RF=3, QUORUM
├─ Redis: Single instance (worker-1/manager)
└─ Meilisearch: Single instance (worker-5)

Object Storage: DigitalOcean Spaces (S3-compatible)
```
---
## Setup Guides (In Order)
### Phase 0: Planning & Prerequisites (30 minutes)
**[00-getting-started.md](00-getting-started.md)** - Local workspace setup
- DigitalOcean account setup
- API token configuration
- SSH key generation
- `.env` file initialization
- Command-line tools verification
**[00-network-architecture.md](00-network-architecture.md)** - Network design
- Network segmentation strategy (`maple-private-prod` vs `maple-public-prod`)
- Security principles (defense in depth)
- Service communication patterns
- Firewall rules overview
**[00-multi-app-architecture.md](00-multi-app-architecture.md)** - Multi-app strategy
- Naming conventions for services, stacks, hostnames
- Shared infrastructure design (Cassandra/Redis/Meilisearch)
- Application isolation patterns
- Scaling to multiple apps (MaplePress, MapleFile)
**Prerequisites checklist:**
- [ ] DigitalOcean account with billing enabled
- [ ] DigitalOcean API token (read + write permissions)
- [ ] SSH key pair generated (`~/.ssh/id_rsa.pub`)
- [ ] Domain names registered (e.g., `getmaplepress.ca`, `getmaplepress.com`)
- [ ] Local machine: git, ssh, curl installed
- [ ] `.env` file created from `.env.template`
**Total time: 30 minutes**
---
### Phase 1: Infrastructure Foundation (3-4 hours)
**[01_init_docker_swarm.md](01_init_docker_swarm.md)** - Docker Swarm cluster
- Create 7+ DigitalOcean droplets (Ubuntu 24.04)
- Install Docker on all nodes
- Initialize Docker Swarm (1 manager, 6+ workers)
- Configure private networking (VPC)
- Set up firewall rules
- Verify cluster connectivity
**What you'll have:**
- Manager node (worker-1): Swarm orchestration
- Worker nodes (2-7+): Application/database hosts
- Private network: 10.116.0.0/16
- All nodes communicating securely
**Total time: 1-1.5 hours**
---
**[02_cassandra.md](02_cassandra.md)** - Cassandra database cluster
- Deploy 3-node Cassandra cluster (workers 2, 3, 4)
- Configure replication (RF=3, QUORUM consistency)
- Create keyspace and initial schema
- Verify cluster health (`nodetool status`)
- Performance tuning for production
**What you'll have:**
- Highly available database cluster
- Automatic failover (survives 1 node failure)
- QUORUM reads/writes for consistency
- Ready for application data
**Total time: 1-1.5 hours**
---
**[03_redis.md](03_redis.md)** - Redis cache server
- Deploy Redis on manager node (worker-1)
- Configure persistence (RDB + AOF)
- Set up password authentication
- Test connectivity from other services
**What you'll have:**
- High-performance caching layer
- Session storage
- Rate limiting storage
- Persistent cache (survives restarts)
**Total time: 30 minutes**
---
**[04_meilisearch.md](04_meilisearch.md)** - Search engine
- Deploy Meilisearch on worker-5
- Configure API key authentication
- Create initial indexes
- Test search functionality
**What you'll have:**
- Fast full-text search engine
- Typo-tolerant search
- Faceted filtering
- Ready for content indexing
**Total time: 30 minutes**
---
**[04.5_spaces.md](04.5_spaces.md)** - Object storage
- Create DigitalOcean Spaces bucket
- Configure access keys
- Set up CORS policies
- Create Docker secrets for Spaces credentials
- Test upload/download
**What you'll have:**
- S3-compatible object storage
- Secure credential management
- Ready for file uploads
- CDN-backed storage
**Total time: 30 minutes**
---
### Phase 2: Application Deployment (2-3 hours)
**[05_maplepress_backend.md](05_maplepress_backend.md)** - Backend API deployment (Part 1)
- Create worker-6 droplet
- Join worker-6 to Docker Swarm
- Configure DNS (point domain to worker-6)
- Authenticate with DigitalOcean Container Registry
- Create Docker secrets (JWT, encryption keys)
- Deploy backend service (Go application)
- Connect to databases (Cassandra, Redis, Meilisearch)
- Verify health checks
**What you'll have:**
- Backend API running on worker-6
- Connected to all databases
- Docker secrets configured
- Health checks passing
- Ready for reverse proxy
**Total time: 1-1.5 hours**
---
**[06_maplepress_caddy.md](06_maplepress_caddy.md)** - Backend reverse proxy (Part 2)
- Configure Caddy reverse proxy
- Set up automatic SSL/TLS (Let's Encrypt)
- Configure security headers
- Enable HTTP to HTTPS redirect
- Preserve CORS headers for frontend
- Test SSL certificate acquisition
**What you'll have:**
- Backend accessible at `https://getmaplepress.ca`
- Automatic SSL certificate management
- Zero-downtime certificate renewals
- Security headers configured
- CORS configured for frontend
**Total time: 30 minutes**
---
**[07_maplepress_frontend.md](07_maplepress_frontend.md)** - Frontend deployment
- Create worker-7 droplet
- Join worker-7 to Docker Swarm
- Install Node.js on worker-7
- Clone repository and build React app
- Configure production environment (API URL)
- Deploy Caddy for static file serving
- Configure SPA routing
- Set up automatic SSL for frontend domain
**What you'll have:**
- Frontend accessible at `https://getmaplepress.com`
- React app built with production API URL
- Automatic HTTPS
- SPA routing working
- Static asset caching
- Complete end-to-end application
**Total time: 1 hour**
---
### Phase 3: Optional Enhancements (1 hour)
**[99_extra.md](99_extra.md)** - Extra operations
- Domain changes (backend and/or frontend)
- Horizontal scaling (multiple backend replicas)
- SSL certificate management
- Load balancing verification
**Total time: As needed**
---
## Quick Start (Experienced Engineers)
**If you're familiar with Docker Swarm and don't need detailed explanations:**
```bash
# 1. Prerequisites (5 min)
cd cloud/infrastructure/production
cp .env.template .env
vi .env # Add DIGITALOCEAN_TOKEN
source .env
# 2. Infrastructure (1 hour)
# Follow 01_init_docker_swarm.md - create 7 droplets, init swarm
# SSH to manager, run quick verification
# 3. Databases (1 hour)
# Deploy Cassandra (02), Redis (03), Meilisearch (04), Spaces (04.5)
# Verify all services: docker service ls
# 4. Applications (1 hour)
# Deploy backend (05), backend-caddy (06), frontend (07)
# Test: curl https://getmaplepress.ca/health
# curl https://getmaplepress.com
# 5. Verify (15 min)
docker service ls # All services 1/1
docker node ls # All nodes Ready
# Test in browser: https://getmaplepress.com
```
**Total time for experienced: ~3 hours**
---
## Directory Structure
```
setup/
├── README.md # This file
├── 00-getting-started.md # Prerequisites & workspace setup
├── 00-network-architecture.md # Network design principles
├── 00-multi-app-architecture.md # Multi-app naming & strategy
├── 01_init_docker_swarm.md # Docker Swarm cluster
├── 02_cassandra.md # Cassandra database cluster
├── 03_redis.md # Redis cache server
├── 04_meilisearch.md # Meilisearch search engine
├── 04.5_spaces.md # DigitalOcean Spaces (object storage)
├── 05_maplepress_backend.md      # Backend API deployment
├── 06_maplepress_caddy.md        # Backend reverse proxy (Caddy + SSL)
├── 07_maplepress_frontend.md     # Frontend deployment (React + Caddy)
├── 99_extra.md                   # Domain changes, scaling, extras
└── templates/                    # Configuration templates
    ├── cassandra-stack.yml.template
    ├── redis-stack.yml.template
    ├── backend-stack.yml.template
    └── Caddyfile.template
```
---
## Infrastructure Specifications
### Hardware Requirements
| Component | Droplet Size | vCPUs | RAM | Disk | Monthly Cost |
|-----------|--------------|-------|-----|------|--------------|
| Manager (worker-1) + Redis | Basic | 2 | 2 GB | 50 GB | $18 |
| Cassandra Node 1 (worker-2) | General Purpose | 2 | 4 GB | 80 GB | $48 |
| Cassandra Node 2 (worker-3) | General Purpose | 2 | 4 GB | 80 GB | $48 |
| Cassandra Node 3 (worker-4) | General Purpose | 2 | 4 GB | 80 GB | $48 |
| Meilisearch (worker-5) | Basic | 2 | 2 GB | 50 GB | $18 |
| Backend (worker-6) | Basic | 2 | 2 GB | 50 GB | $18 |
| Frontend (worker-7) | Basic | 1 | 1 GB | 25 GB | $6 |
| **Total** | - | **13** | **19 GB** | **415 GB** | **~$204/mo** |
**Additional costs:**
- DigitalOcean Spaces: $5/mo (250 GB storage + 1 TB transfer)
- Bandwidth: Included (1 TB per droplet)
- Backups (optional): +20% of droplet cost
**Total estimated: ~$210-250/month**
### Software Versions
| Software | Version | Notes |
|----------|---------|-------|
| Ubuntu | 24.04 LTS | Base OS |
| Docker | 27.x+ | Container runtime |
| Docker Swarm | Built-in | Orchestration |
| Cassandra | 4.1.x | Database |
| Redis | 7.x-alpine | Cache |
| Meilisearch | v1.5+ | Search |
| Caddy | 2-alpine | Reverse proxy |
| Go | 1.21+ | Backend runtime |
| Node.js | 20 LTS | Frontend build |
---
## Key Concepts
### Docker Swarm Architecture
**Manager node (worker-1):**
- Orchestrates all services
- Schedules tasks to workers
- Maintains cluster state
- Runs Redis (collocated)
**Worker nodes (2-7+):**
- Execute service tasks (containers)
- Report health to manager
- Isolated workloads via labels
**Node labels:**
- `backend=true`: Backend deployment target (worker-6)
- `maplepress-frontend=true`: Frontend target (worker-7)
### Network Architecture
**`maple-private-prod` (overlay network):**
- All databases (Cassandra, Redis, Meilisearch)
- Backend services (access to databases)
- **No internet access** (security)
- Internal-only communication
**`maple-public-prod` (overlay network):**
- Caddy reverse proxies
- Backend services (receive HTTP requests)
- Ports 80/443 exposed to internet
**Backends join BOTH networks:**
- Receive requests from Caddy (public network)
- Access databases (private network)
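In a stack file, this dual-homing is just two entries under the service's `networks` key; a fragment consistent with the backend stack template in `templates/`:

```yaml
services:
  backend:
    networks:
      - maple-public-prod    # receive requests from the reverse proxy
      - maple-private-prod   # reach Cassandra, Redis, Meilisearch

networks:
  maple-public-prod:
    external: true
  maple-private-prod:
    external: true
```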
### Multi-Application Pattern
**Shared infrastructure (workers 1-5):**
- Cassandra, Redis, Meilisearch serve ALL apps
- Cost-efficient (1 infrastructure for unlimited apps)
**Per-application deployment (workers 6+):**
- Each app gets dedicated workers
- Independent scaling and deployment
- Clear isolation
**Example: Adding MapleFile**
- Worker-8: `maplefile_backend` + `maplefile_backend-caddy`
- Worker-9: `maplefile-frontend_caddy`
- Uses same Cassandra/Redis/Meilisearch
- No changes to infrastructure
---
## Common Commands Reference
### Swarm Management
```bash
# List all nodes
docker node ls
# List all services
docker service ls
# View service logs
docker service logs -f maplepress_backend
# Scale service
docker service scale maplepress_backend=3
# Update service (rolling restart)
docker service update --force maplepress_backend
# Remove service
docker service rm maplepress_backend
```
### Stack Management
```bash
# Deploy stack
docker stack deploy -c stack.yml stack-name
# List stacks
docker stack ls
# View stack services
docker stack services maplepress
# Remove stack
docker stack rm maplepress
```
### Troubleshooting
```bash
# Check service status
docker service ps maplepress_backend
# View container logs
docker logs <container-id>
# Inspect service
docker service inspect maplepress_backend
# Check network
docker network inspect maple-private-prod
# List configs
docker config ls
# List secrets
docker secret ls
```
---
## Deployment Checklist
**Use this checklist to track your progress:**
### Phase 0: Prerequisites
- [ ] DigitalOcean account created
- [ ] API token generated and saved
- [ ] SSH keys generated (`ssh-keygen`)
- [ ] SSH key added to DigitalOcean
- [ ] Domain names registered
- [ ] `.env` file created from template
- [ ] `.env` file has correct permissions (600)
- [ ] Git repository cloned locally
### Phase 1: Infrastructure
- [ ] 7 droplets created (workers 1-7)
- [ ] Docker Swarm initialized
- [ ] All workers joined swarm
- [ ] Private networking configured (VPC)
- [ ] Firewall rules configured on all nodes
- [ ] Cassandra 3-node cluster deployed
- [ ] Cassandra cluster healthy (`nodetool status`)
- [ ] Redis deployed on manager
- [ ] Redis authentication configured
- [ ] Meilisearch deployed on worker-5
- [ ] Meilisearch API key configured
- [ ] DigitalOcean Spaces bucket created
- [ ] Spaces access keys stored as Docker secrets
### Phase 2: Applications
- [ ] Worker-6 created and joined swarm
- [ ] Worker-6 labeled for backend
- [ ] DNS pointing backend domain to worker-6
- [ ] Backend Docker secrets created (JWT, IP encryption)
- [ ] Backend service deployed
- [ ] Backend health check passing
- [ ] Backend Caddy deployed
- [ ] Backend SSL certificate obtained
- [ ] Backend accessible at `https://domain.ca`
- [ ] Worker-7 created and joined swarm
- [ ] Worker-7 labeled for frontend
- [ ] DNS pointing frontend domain to worker-7
- [ ] Node.js installed on worker-7
- [ ] Repository cloned on worker-7
- [ ] Frontend built with production API URL
- [ ] Frontend Caddy deployed
- [ ] Frontend SSL certificate obtained
- [ ] Frontend accessible at `https://domain.com`
- [ ] CORS working (frontend can call backend)
### Phase 3: Verification
- [ ] All services show 1/1 replicas (`docker service ls`)
- [ ] All nodes show Ready (`docker node ls`)
- [ ] Backend health endpoint returns 200
- [ ] Frontend loads in browser
- [ ] Frontend can call backend API (no CORS errors)
- [ ] SSL certificates valid (green padlock)
- [ ] HTTP redirects to HTTPS
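The replica checks above can be scripted. A small sketch that reads `docker service ls` output and flags any service below its desired replica count (`all_replicas_healthy` is a hypothetical name; the parsing assumes the plain `current/desired` replica format):

```bash
# all_replicas_healthy: read "<name> <current>/<desired>" lines on stdin
# (as produced by `docker service ls --format '{{.Name}} {{.Replicas}}'`)
# and fail if any service is below its desired replica count.
all_replicas_healthy() {
  rc=0
  while read -r name replicas; do
    current="${replicas%/*}"
    desired="${replicas#*/}"
    if [ "$current" != "$desired" ]; then
      echo "UNHEALTHY: $name ($replicas)"
      rc=1
    fi
  done
  return $rc
}

# Usage:
# docker service ls --format '{{.Name}} {{.Replicas}}' | all_replicas_healthy && echo "All healthy"
```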
### Next Steps
- [ ] Set up monitoring (see `../operations/02_monitoring_alerting.md`)
- [ ] Configure backups (see `../operations/01_backup_recovery.md`)
- [ ] Review incident runbooks (see `../operations/03_incident_response.md`)
---
## Troubleshooting Guide
### Problem: Docker Swarm Join Fails
**Symptoms:** Worker can't join swarm, connection refused
**Check:**
```bash
# On manager, verify swarm is initialized
docker info | grep "Swarm: active"
# Verify firewall allows swarm ports
sudo ufw status | grep -E "2377|7946|4789"
# Get new join token
docker swarm join-token worker
```
### Problem: Service Won't Start
**Symptoms:** Service stuck at 0/1 replicas
**Check:**
```bash
# View service events
docker service ps service-name --no-trunc
# Common issues:
# - Image not found: Authenticate with registry
# - Network not found: Create network first
# - Secret not found: Create secrets
# - No suitable node: Check node labels
```
### Problem: DNS Not Resolving
**Symptoms:** Domain doesn't resolve to correct IP
**Check:**
```bash
# Test DNS resolution
dig yourdomain.com +short
# Should return worker IP
# If not, wait 5-60 minutes for propagation
# Or check DNS provider settings
```
### Problem: SSL Certificate Not Obtained
**Symptoms:** HTTPS not working, certificate errors
**Check:**
```bash
# Verify DNS points to correct server
dig yourdomain.com +short
# Verify port 80 accessible (Let's Encrypt challenge)
curl http://yourdomain.com
# Check Caddy logs
docker service logs service-name --tail 100 | grep -i certificate
# Common issues:
# - DNS not pointing to server
# - Port 80 blocked by firewall
# - Rate limited (Let's Encrypt duplicate-certificate limit: 5/week)
```
### Problem: Services Can't Communicate
**Symptoms:** Backend can't reach database
**Check:**
```bash
# Verify both services on same network
docker service inspect backend --format '{{.Spec.TaskTemplate.Networks}}'
docker service inspect database --format '{{.Spec.TaskTemplate.Networks}}'
# Test DNS resolution from container
docker exec <container> nslookup database-hostname
# Verify firewall allows internal traffic
sudo ufw status | grep 10.116.0.0/16
```
---
## Getting Help
### Documentation Resources
**Within this repository:**
- This directory (`setup/`): Initial deployment guides
- `../operations/`: Day-to-day operational procedures
- `../reference/`: Architecture diagrams, capacity planning
- `../automation/`: Scripts for common tasks
**External resources:**
- Docker Swarm: https://docs.docker.com/engine/swarm/
- Cassandra: https://cassandra.apache.org/doc/latest/
- DigitalOcean: https://docs.digitalocean.com/
- Caddy: https://caddyserver.com/docs/
### Common Questions
**Q: Can I use a different cloud provider (AWS, GCP, Azure)?**
A: Yes, but you'll need to adapt networking and object storage sections. The Docker Swarm and application deployment sections remain the same.
**Q: Can I deploy with fewer nodes?**
A: Minimum viable: 3 nodes (1 manager + 2 workers). Run Cassandra in single-node mode (not recommended for production) and colocate services on the same workers.
**Q: How do I add a new application (e.g., MapleFile)?**
A: Follow `00-multi-app-architecture.md`. Add 2 workers (backend + frontend), deploy new stacks. Reuse existing databases.
**Q: What if I only have one domain?**
A: Use subdomains: `api.yourdomain.com` (backend), `app.yourdomain.com` (frontend). Update DNS and Caddyfiles accordingly.
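As a sketch, the two Caddy site blocks for the subdomain approach might look like this (the upstream `backend:8000` matches the backend service settings in these guides; the frontend root path is a placeholder):

```
api.yourdomain.com {
    reverse_proxy backend:8000
}

app.yourdomain.com {
    root * /srv
    try_files {path} /index.html   # SPA routing
    file_server
}
```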
---
## Security Best Practices
**Implemented by these guides:**
- ✅ Firewall configured (UFW) on all nodes
- ✅ SSH key-based authentication (no passwords)
- ✅ Docker secrets for sensitive values
- ✅ Network segmentation (private vs public)
- ✅ Automatic HTTPS with Let's Encrypt
- ✅ Security headers configured in Caddy
- ✅ Database authentication (Redis password, Meilisearch API key)
- ✅ Private Docker registry authentication
**Additional recommendations:**
- Rotate secrets quarterly (see `../operations/07_security_operations.md`)
- Enable 2FA on DigitalOcean account
- Regular security updates (Ubuntu unattended-upgrades)
- Monitor for unauthorized access attempts
- Backup encryption (GPG for backup files)
---
## Maintenance Schedule
**After deployment, establish these routines:**
**Daily:**
- Check service health (`docker service ls`)
- Review monitoring dashboards
- Check backup completion logs
**Weekly:**
- Review security logs
- Check disk space across all nodes
- Verify SSL certificate expiry dates
**Monthly:**
- Apply security updates (`apt update && apt upgrade`)
- Review capacity and performance metrics
- Test backup restore procedures
- Rotate non-critical secrets
**Quarterly:**
- Full disaster recovery drill
- Review and update documentation
- Capacity planning review
- Security audit
---
## What's Next?
**After completing setup:**
1. **Configure Operations** (`../operations/`)
- Set up monitoring and alerting
- Configure automated backups
- Review incident response runbooks
2. **Optimize Performance**
- Tune database settings
- Configure caching strategies
- Load test your infrastructure
3. **Add Redundancy**
- Scale critical services
- Set up failover procedures
- Implement health checks
4. **Automate**
- CI/CD pipeline for deployments
- Automated testing
- Infrastructure as Code (Terraform)
---
**Last Updated**: January 2025
**Maintained By**: Infrastructure Team
**Review Frequency**: Quarterly
**Feedback**: Found an issue or have a suggestion? Open an issue on Codeberg or contact the infrastructure team.
---
## Success! 🎉
If you've completed all guides in this directory, you now have:
✅ Production-ready infrastructure on DigitalOcean
✅ High-availability database cluster (Cassandra RF=3)
✅ Caching and search infrastructure (Redis, Meilisearch)
✅ Secure backend API with automatic HTTPS
✅ React frontend with automatic SSL
✅ Multi-application architecture ready to scale
✅ Network segmentation for security
✅ Docker Swarm orchestration
**Welcome to production operations!** 🚀
Now head to `../operations/` to learn how to run and maintain your infrastructure.

version: '3.8'

networks:
  maple-private-prod:
    external: true
  maple-public-prod:
    external: true

secrets:
  maplepress_jwt_secret:
    external: true
  redis_password:
    external: true
  meilisearch_master_key:
    external: true
  # Uncomment if using S3/SeaweedFS:
  # s3_access_key:
  #   external: true
  # s3_secret_key:
  #   external: true

services:
  backend:
    image: registry.digitalocean.com/ssp/maplepress_backend:latest
    hostname: backend
    networks:
      - maple-public-prod   # Receive requests from NGINX
      - maple-private-prod  # Access databases
    secrets:
      - maplepress_jwt_secret
      - redis_password
      - meilisearch_master_key
      # Uncomment if using S3:
      # - s3_access_key
      # - s3_secret_key
    environment:
      # Application Configuration
      - APP_ENVIRONMENT=production
      - APP_VERSION=${APP_VERSION:-1.0.0}
      # HTTP Server Configuration
      - SERVER_HOST=0.0.0.0
      - SERVER_PORT=8000
      # Cassandra Database Configuration
      # Use all 3 Cassandra nodes for high availability
      - DATABASE_HOSTS=cassandra-1:9042,cassandra-2:9042,cassandra-3:9042
      - DATABASE_KEYSPACE=maplepress
      - DATABASE_CONSISTENCY=QUORUM
      - DATABASE_REPLICATION=3
      - DATABASE_MIGRATIONS_PATH=file://migrations
      # Meilisearch Configuration
      - MEILISEARCH_HOST=http://meilisearch:7700
      # Logger Configuration
      - LOGGER_LEVEL=info
      - LOGGER_FORMAT=json
      # S3/Object Storage Configuration (if using)
      # - AWS_ENDPOINT=https://your-region.digitaloceanspaces.com
      # - AWS_REGION=us-east-1
      # - AWS_BUCKET_NAME=maplepress-prod
    # Read secrets and set as environment variables using entrypoint
    entrypoint: ["/bin/sh", "-c"]
    command:
      - |
        export APP_JWT_SECRET=$$(cat /run/secrets/maplepress_jwt_secret)
        export CACHE_PASSWORD=$$(cat /run/secrets/redis_password)
        export MEILISEARCH_API_KEY=$$(cat /run/secrets/meilisearch_master_key)
        # Uncomment if using S3:
        # export AWS_ACCESS_KEY=$$(cat /run/secrets/s3_access_key)
        # export AWS_SECRET_KEY=$$(cat /run/secrets/s3_secret_key)
        # Set Redis configuration
        export CACHE_HOST=redis
        export CACHE_PORT=6379
        export CACHE_DB=0
        # Start the backend
        exec /app/maplepress-backend
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.labels.backend == true
      restart_policy:
        condition: on-failure
        delay: 10s
        max_attempts: 3
      resources:
        limits:
          memory: 1G
          cpus: '1.0'
        reservations:
          memory: 512M
          cpus: '0.5'
      update_config:
        parallelism: 1
        delay: 10s
        failure_action: rollback
        order: start-first  # Zero-downtime: start new before stopping old
    healthcheck:
      test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "--header=X-Tenant-ID: healthcheck", "http://localhost:8000/health"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 60s

version: '3.8'

networks:
  maple-private-prod:
    external: true

volumes:
  cassandra-1-data:
  cassandra-2-data:
  cassandra-3-data:

services:
  cassandra-1:
    image: cassandra:5.0.4
    hostname: cassandra-1
    networks:
      - maple-private-prod
    environment:
      - CASSANDRA_CLUSTER_NAME=maple-prod-cluster
      - CASSANDRA_DC=datacenter1
      - CASSANDRA_ENDPOINT_SNITCH=GossipingPropertyFileSnitch
      - CASSANDRA_SEEDS=cassandra-1,cassandra-2,cassandra-3
      - MAX_HEAP_SIZE=512M
      - HEAP_NEWSIZE=128M
    volumes:
      - cassandra-1-data:/var/lib/cassandra
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.labels.cassandra == node1
      restart_policy:
        condition: on-failure
        delay: 10s
        max_attempts: 3
    healthcheck:
      test: ["CMD-SHELL", "cqlsh -e 'describe cluster' || exit 1"]
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 120s

  cassandra-2:
    image: cassandra:5.0.4
    hostname: cassandra-2
    networks:
      - maple-private-prod
    environment:
      - CASSANDRA_CLUSTER_NAME=maple-prod-cluster
      - CASSANDRA_DC=datacenter1
      - CASSANDRA_ENDPOINT_SNITCH=GossipingPropertyFileSnitch
      - CASSANDRA_SEEDS=cassandra-1,cassandra-2,cassandra-3
      - MAX_HEAP_SIZE=512M
      - HEAP_NEWSIZE=128M
    volumes:
      - cassandra-2-data:/var/lib/cassandra
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.labels.cassandra == node2
      restart_policy:
        condition: on-failure
        delay: 10s
        max_attempts: 3
    healthcheck:
      test: ["CMD-SHELL", "cqlsh -e 'describe cluster' || exit 1"]
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 120s

  cassandra-3:
    image: cassandra:5.0.4
    hostname: cassandra-3
    networks:
      - maple-private-prod
    environment:
      - CASSANDRA_CLUSTER_NAME=maple-prod-cluster
      - CASSANDRA_DC=datacenter1
      - CASSANDRA_ENDPOINT_SNITCH=GossipingPropertyFileSnitch
      - CASSANDRA_SEEDS=cassandra-1,cassandra-2,cassandra-3
      - MAX_HEAP_SIZE=512M
      - HEAP_NEWSIZE=128M
    volumes:
      - cassandra-3-data:/var/lib/cassandra
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.labels.cassandra == node3
      restart_policy:
        condition: on-failure
        delay: 10s
        max_attempts: 3
    healthcheck:
      test: ["CMD-SHELL", "cqlsh -e 'describe cluster' || exit 1"]
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 120s

#!/bin/bash
#
# Cassandra Cluster Sequential Deployment Script
# This script deploys Cassandra nodes sequentially to avoid race conditions
# during cluster formation.
#
set -e

STACK_NAME="cassandra"
STACK_FILE="cassandra-stack.yml"

echo "=== Cassandra Cluster Sequential Deployment ==="
echo ""

# Check if stack file exists
if [ ! -f "$STACK_FILE" ]; then
    echo "ERROR: $STACK_FILE not found in current directory"
    exit 1
fi

echo "Step 1: Deploying cassandra-1 (seed node)..."
docker stack deploy -c "$STACK_FILE" "$STACK_NAME"

# Scale down cassandra-2 and cassandra-3 temporarily.
# `|| true` prevents `set -e` from aborting if the services are not
# registered yet while the stack is still converging.
docker service scale "${STACK_NAME}_cassandra-2=0" > /dev/null 2>&1 || true
docker service scale "${STACK_NAME}_cassandra-3=0" > /dev/null 2>&1 || true

echo "Waiting for cassandra-1 to become healthy (this takes ~5-8 minutes)..."
echo "Checking every 30 seconds..."

# Wait for cassandra-1 to be running
COUNTER=0
MAX_WAIT=20  # 20 * 30 seconds = 10 minutes max
while [ $COUNTER -lt $MAX_WAIT ]; do
    REPLICAS=$(docker service ls --filter "name=${STACK_NAME}_cassandra-1" --format "{{.Replicas}}")
    if [ "$REPLICAS" = "1/1" ]; then
        echo "✓ cassandra-1 is running"
        # Give it extra time to fully initialize
        echo "Waiting additional 2 minutes for cassandra-1 to fully initialize..."
        sleep 120
        break
    fi
    echo "  cassandra-1 status: $REPLICAS (waiting...)"
    sleep 30
    COUNTER=$((COUNTER + 1))
done

if [ $COUNTER -eq $MAX_WAIT ]; then
    echo "ERROR: cassandra-1 failed to start within 10 minutes"
    echo "Check logs with: docker service logs ${STACK_NAME}_cassandra-1"
    exit 1
fi

echo ""
echo "Step 2: Starting cassandra-2..."
docker service scale "${STACK_NAME}_cassandra-2=1"

echo "Waiting for cassandra-2 to become healthy (this takes ~5-8 minutes)..."
COUNTER=0
while [ $COUNTER -lt $MAX_WAIT ]; do
    REPLICAS=$(docker service ls --filter "name=${STACK_NAME}_cassandra-2" --format "{{.Replicas}}")
    if [ "$REPLICAS" = "1/1" ]; then
        echo "✓ cassandra-2 is running"
        echo "Waiting additional 2 minutes for cassandra-2 to join cluster..."
        sleep 120
        break
    fi
    echo "  cassandra-2 status: $REPLICAS (waiting...)"
    sleep 30
    COUNTER=$((COUNTER + 1))
done

if [ $COUNTER -eq $MAX_WAIT ]; then
    echo "ERROR: cassandra-2 failed to start within 10 minutes"
    echo "Check logs with: docker service logs ${STACK_NAME}_cassandra-2"
    exit 1
fi

echo ""
echo "Step 3: Starting cassandra-3..."
docker service scale "${STACK_NAME}_cassandra-3=1"

echo "Waiting for cassandra-3 to become healthy (this takes ~5-8 minutes)..."
COUNTER=0
while [ $COUNTER -lt $MAX_WAIT ]; do
    REPLICAS=$(docker service ls --filter "name=${STACK_NAME}_cassandra-3" --format "{{.Replicas}}")
    if [ "$REPLICAS" = "1/1" ]; then
        echo "✓ cassandra-3 is running"
        echo "Waiting additional 2 minutes for cassandra-3 to join cluster..."
        sleep 120
        break
    fi
    echo "  cassandra-3 status: $REPLICAS (waiting...)"
    sleep 30
    COUNTER=$((COUNTER + 1))
done

if [ $COUNTER -eq $MAX_WAIT ]; then
    echo "ERROR: cassandra-3 failed to start within 10 minutes"
    echo "Check logs with: docker service logs ${STACK_NAME}_cassandra-3"
    exit 1
fi

echo ""
echo "=== Deployment Complete ==="
echo ""
echo "All 3 Cassandra nodes should now be running and forming a cluster."
echo ""
echo "Verify cluster status by SSH'ing to any worker node and running:"
echo "  docker exec -it \$(docker ps -q --filter \"name=cassandra\") nodetool status"
echo ""
echo "You should see 3 nodes with status 'UN' (Up Normal)."
echo ""

version: '3.8'

networks:
  maple-private-prod:
    external: true

volumes:
  meilisearch-data:

secrets:
  meilisearch_master_key:
    external: true

services:
  meilisearch:
    image: getmeili/meilisearch:v1.5
    hostname: meilisearch
    networks:
      - maple-private-prod
    volumes:
      - meilisearch-data:/meili_data
    secrets:
      - meilisearch_master_key
    entrypoint: ["/bin/sh", "-c"]
    command:
      - |
        export MEILI_MASTER_KEY=$$(cat /run/secrets/meilisearch_master_key)
        exec meilisearch
    environment:
      - MEILI_ENV=production
      - MEILI_NO_ANALYTICS=true
      - MEILI_DB_PATH=/meili_data
      - MEILI_HTTP_ADDR=0.0.0.0:7700
      - MEILI_LOG_LEVEL=INFO
      - MEILI_MAX_INDEXING_MEMORY=512mb
      - MEILI_MAX_INDEXING_THREADS=2
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.labels.meilisearch == true
      restart_policy:
        condition: on-failure
        delay: 10s
        max_attempts: 3
      resources:
        limits:
          memory: 1G
        reservations:
          memory: 768M
    healthcheck:
      test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:7700/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 30s
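The `meilisearch_master_key` secret is declared `external: true`, so it must exist before the stack deploys. A hedged sketch of generating one (the key length and format here are our choice; Meilisearch itself only requires a master key of at least 16 bytes in production):

```shell
# Generate a random master key for the Docker secret the meilisearch
# service reads from /run/secrets/meilisearch_master_key.
KEY=$(openssl rand -base64 32)

# 32 random bytes base64-encode to a 44-character string, comfortably
# above Meilisearch's 16-byte minimum.
echo "generated key length: ${#KEY}"

# Run on a Swarm manager node to actually create the secret:
# printf '%s' "$KEY" | docker secret create meilisearch_master_key -
```

Using `printf '%s'` rather than `echo` avoids a trailing newline sneaking into the stored secret.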


@ -0,0 +1,71 @@
version: '3.8'

networks:
  maple-public-prod:
    external: true

volumes:
  nginx-ssl-certs:
  nginx-ssl-www:

services:
  nginx:
    image: nginx:alpine
    hostname: nginx
    networks:
      - maple-public-prod
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - nginx-ssl-certs:/etc/letsencrypt
      - nginx-ssl-www:/var/www/certbot
      - /var/run/docker.sock:/tmp/docker.sock:ro # Only needed by nginx-proxy/docker-gen tooling; safe to remove for plain nginx
    configs:
      - source: nginx_config
        target: /etc/nginx/nginx.conf
      - source: nginx_site_config
        target: /etc/nginx/conf.d/default.conf
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.labels.backend == true # Same node as backend
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
      resources:
        limits:
          memory: 256M
          cpus: '0.5'
        reservations:
          memory: 128M
          cpus: '0.25'
    healthcheck:
      test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:80/health"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 10s

  certbot:
    image: certbot/certbot:latest
    hostname: certbot
    volumes:
      - nginx-ssl-certs:/etc/letsencrypt
      - nginx-ssl-www:/var/www/certbot
    entrypoint: "/bin/sh -c 'trap exit TERM; while :; do certbot renew; sleep 12h & wait $${!}; done;'"
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.labels.backend == true
      restart_policy:
        condition: on-failure

configs:
  nginx_config:
    file: ./nginx.conf
  nginx_site_config:
    file: ./site.conf


@ -0,0 +1,55 @@
user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;

events {
    worker_connections 2048;
    use epoll;
    multi_accept on;
}

http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    # Logging
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for" '
                    'rt=$request_time uct="$upstream_connect_time" '
                    'uht="$upstream_header_time" urt="$upstream_response_time"';
    access_log /var/log/nginx/access.log main;

    # Performance
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;
    client_max_body_size 100M;

    # Gzip
    gzip on;
    gzip_vary on;
    gzip_min_length 1024;
    gzip_comp_level 6;
    gzip_types text/plain text/css text/xml text/javascript
               application/json application/javascript application/xml+rss
               application/rss+xml application/atom+xml image/svg+xml
               text/x-component application/x-font-ttf font/opentype;

    # Security headers (default)
    add_header X-Frame-Options "SAMEORIGIN" always;
    add_header X-Content-Type-Options "nosniff" always;
    add_header X-XSS-Protection "1; mode=block" always;

    # Rate limiting zones
    limit_req_zone $binary_remote_addr zone=general:10m rate=10r/s;
    limit_req_zone $binary_remote_addr zone=api:10m rate=100r/s;
    limit_req_status 429;

    # Include site configurations
    include /etc/nginx/conf.d/*.conf;
}
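The `10m` in each `limit_req_zone` directive is shared memory for per-client state, not a request buffer. The nginx documentation puts one megabyte at roughly 16,000 states for 64-byte entries (as used with `$binary_remote_addr` on 64-bit builds), so a rough capacity check looks like:

```shell
# Rough capacity of a 10 MB limit_req zone keyed on $binary_remote_addr.
# Per nginx docs, about 16,000 states fit per MB (64-byte states).
ZONE_MB=10
STATES_PER_MB=16000
echo $(( ZONE_MB * STATES_PER_MB ))   # ~160000 concurrently tracked client IPs per zone
```

When a zone fills, nginx evicts the oldest states and, failing that, rejects requests with the `limit_req_status` code, so 10 MB per zone leaves ample headroom for a single-node deployment.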


@ -0,0 +1,73 @@
version: '3.8'

networks:
  maple-private-prod:
    external: true

volumes:
  redis-data:

secrets:
  redis_password:
    external: true

services:
  redis:
    image: redis:7-alpine
    hostname: redis
    networks:
      - maple-private-prod
    volumes:
      - redis-data:/data
    secrets:
      - redis_password
    # Command with password from secret
    command: >
      sh -c '
      redis-server
      --requirepass "$$(cat /run/secrets/redis_password)"
      --bind 0.0.0.0
      --port 6379
      --protected-mode no
      --save 900 1
      --save 300 10
      --save 60 10000
      --appendonly yes
      --appendfilename "appendonly.aof"
      --appendfsync everysec
      --maxmemory 512mb
      --maxmemory-policy allkeys-lru
      --loglevel notice
      --databases 16
      --timeout 300
      --tcp-keepalive 300
      --io-threads 2
      --io-threads-do-reads yes
      --slowlog-log-slower-than 10000
      --slowlog-max-len 128
      --activerehashing yes
      --maxclients 10000
      --rename-command FLUSHDB ""
      --rename-command FLUSHALL ""
      --rename-command CONFIG ""
      '
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.labels.redis == true
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
      resources:
        limits:
          memory: 768M
        reservations:
          memory: 512M
    healthcheck:
      test: ["CMD", "sh", "-c", "redis-cli -a $$(cat /run/secrets/redis_password) ping | grep PONG"]
      interval: 10s
      timeout: 3s
      retries: 3
      start_period: 10s


@ -0,0 +1,161 @@
# Maple Infrastructure - Redis Production Configuration
# This file is used by the Redis Docker container
# ==============================================================================
# NETWORK
# ==============================================================================
# Bind to all interfaces (Docker networking handles access control)
bind 0.0.0.0
# Default Redis port
port 6379
# Protected mode disabled (we rely on Docker network isolation)
# Only containers on the maple-private-prod overlay network can access
protected-mode no
# ==============================================================================
# PERSISTENCE
# ==============================================================================
# RDB Snapshots (background saves)
# Save if at least 1 key changed in 900 seconds (15 min)
save 900 1
# Save if at least 10 keys changed in 300 seconds (5 min)
save 300 10
# Save if at least 10000 keys changed in 60 seconds (1 min)
save 60 10000
# Stop writes if RDB snapshot fails (data safety)
stop-writes-on-bgsave-error yes
# Compress RDB files
rdbcompression yes
# Checksum RDB files
rdbchecksum yes
# RDB filename
dbfilename dump.rdb
# Working directory for RDB and AOF files
dir /data
# ==============================================================================
# APPEND-ONLY FILE (AOF) - Additional Durability
# ==============================================================================
# Enable AOF for better durability
appendonly yes
# AOF filename
appendfilename "appendonly.aof"
# Sync strategy: fsync every second (good balance)
# Options: always, everysec, no
appendfsync everysec
# Don't fsync during rewrite (prevents blocking)
no-appendfsync-on-rewrite no
# Auto-rewrite AOF when it grows 100% larger
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
# ==============================================================================
# MEMORY MANAGEMENT
# ==============================================================================
# Maximum memory (adjust based on your droplet RAM)
# For 2GB droplet with Redis only: 1.5GB safe limit
# For 2GB droplet with other services: 512MB-1GB
maxmemory 512mb
# Eviction policy when maxmemory reached
# allkeys-lru: Evict least recently used keys (good for cache)
# volatile-lru: Only evict keys with TTL set
# noeviction: Return errors when memory limit reached
maxmemory-policy allkeys-lru
# LRU/LFU algorithm precision (higher = more accurate, more CPU)
maxmemory-samples 5
# ==============================================================================
# SECURITY
# ==============================================================================
# Require password for all operations
# IMPORTANT: This is loaded from Docker secret in production
# requirepass will be set via command line argument
# Disable dangerous commands in production
rename-command FLUSHDB ""
rename-command FLUSHALL ""
rename-command CONFIG ""
# ==============================================================================
# LOGGING
# ==============================================================================
# Log level: debug, verbose, notice, warning
loglevel notice
# Log to stdout (Docker captures logs)
logfile ""
# ==============================================================================
# DATABASES
# ==============================================================================
# Number of databases (default 16)
databases 16
# ==============================================================================
# PERFORMANCE TUNING
# ==============================================================================
# Timeout for idle client connections (0 = disabled)
timeout 300
# TCP keepalive
tcp-keepalive 300
# Number of I/O threads (use for high load)
# 0 = auto-detect, 1 = single-threaded
io-threads 2
io-threads-do-reads yes
# ==============================================================================
# SLOW LOG
# ==============================================================================
# Log queries slower than 10ms
slowlog-log-slower-than 10000
# Keep last 128 slow queries
slowlog-max-len 128
# ==============================================================================
# ADVANCED
# ==============================================================================
# Enable active rehashing
activerehashing yes
# Client output buffer limits
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit replica 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
# Max number of clients
maxclients 10000
# ==============================================================================
# NOTES
# ==============================================================================
# This configuration is optimized for:
# - Production caching workload
# - 2GB RAM droplet
# - Single Redis instance (not clustered)
# - AOF + RDB persistence
# - Docker Swarm networking
#
# Monitoring commands:
# - INFO: Get server stats
# - SLOWLOG GET: View slow queries
# - MEMORY STATS: Memory usage breakdown
# - CLIENT LIST: Connected clients
# ==============================================================================
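The two AOF rewrite settings above interact: Redis triggers an automatic rewrite when the AOF grows `auto-aof-rewrite-percentage` beyond its size after the last rewrite, but never while the file is below `auto-aof-rewrite-min-size`. A small sketch of the resulting trigger point (the helper name is ours, not a Redis command):

```shell
# Compute the AOF size (in MB) at which Redis triggers an automatic rewrite,
# given auto-aof-rewrite-percentage 100 and auto-aof-rewrite-min-size 64mb.
aof_rewrite_threshold_mb() {
  base_mb=$1   # AOF size right after the last rewrite
  pct=100      # auto-aof-rewrite-percentage
  min_mb=64    # auto-aof-rewrite-min-size
  grown=$(( base_mb + base_mb * pct / 100 ))
  if [ "$grown" -lt "$min_mb" ]; then
    echo "$min_mb"   # the min-size floor dominates for small files
  else
    echo "$grown"
  fi
}

aof_rewrite_threshold_mb 10    # small base: the 64 MB floor applies -> 64
aof_rewrite_threshold_mb 100   # 100% growth over a 100 MB base -> 200
```

The 64 MB floor keeps Redis from wasting CPU rewriting a tiny AOF over and over early in the instance's life.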


@ -0,0 +1,108 @@
# Upstream backend service
upstream backend {
    server backend:8000;
    keepalive 32;
}

# HTTP server - redirect to HTTPS
server {
    listen 80;
    listen [::]:80;
    server_name getmaplepress.ca www.getmaplepress.ca;

    # Let's Encrypt challenge location
    location /.well-known/acme-challenge/ {
        root /var/www/certbot;
    }

    # Health check endpoint (for load balancer)
    location /health {
        access_log off;
        return 200 "healthy\n";
        add_header Content-Type text/plain;
    }

    # Redirect all other HTTP traffic to HTTPS
    location / {
        return 301 https://$host$request_uri;
    }
}

# HTTPS server
server {
    listen 443 ssl;
    listen [::]:443 ssl;
    http2 on;  # the 'listen ... http2' form is deprecated since nginx 1.25.1
    server_name getmaplepress.ca www.getmaplepress.ca;

    # SSL certificates (Let's Encrypt)
    ssl_certificate /etc/letsencrypt/live/getmaplepress.ca/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/getmaplepress.ca/privkey.pem;

    # SSL configuration (Mozilla Intermediate)
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384;
    ssl_prefer_server_ciphers off;
    ssl_session_cache shared:SSL:10m;
    ssl_session_timeout 10m;
    ssl_stapling on;
    ssl_stapling_verify on;
    resolver 127.0.0.11 valid=300s;  # Docker's embedded DNS; required for OCSP stapling lookups

    # Security headers
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
    add_header X-Frame-Options "SAMEORIGIN" always;
    add_header X-Content-Type-Options "nosniff" always;
    add_header X-XSS-Protection "1; mode=block" always;
    add_header Referrer-Policy "strict-origin-when-cross-origin" always;

    # Logging
    access_log /var/log/nginx/access.log main;
    error_log /var/log/nginx/error.log warn;

    # Proxy settings
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_set_header X-Forwarded-Host $host;
    proxy_set_header X-Forwarded-Port $server_port;

    # Timeouts
    proxy_connect_timeout 60s;
    proxy_send_timeout 60s;
    proxy_read_timeout 60s;

    # Buffer settings
    proxy_buffering on;
    proxy_buffer_size 4k;
    proxy_buffers 8 4k;
    proxy_busy_buffers_size 8k;

    # API endpoints (rate limited)
    location /api/ {
        limit_req zone=api burst=20 nodelay;
        proxy_pass http://backend;
    }

    # All other requests
    location / {
        limit_req zone=general burst=5 nodelay;
        proxy_pass http://backend;
    }

    # Health check (internal)
    location /health {
        access_log off;
        proxy_pass http://backend/health;
    }

    # Metrics endpoint (if exposed)
    location /metrics {
        access_log off;
        deny all; # Only allow from monitoring systems
        # allow 10.116.0.0/16; # Uncomment to allow from VPC
        proxy_pass http://backend/metrics;
    }
}