Initial commit: Open sourcing all of the Maple Open Technologies code.
This commit is contained in:
commit
755d54a99d
2010 changed files with 448675 additions and 0 deletions
671
cloud/infrastructure/production/setup/03_redis.md
Normal file
# Redis Setup (Single Instance)

**Prerequisites**: Complete [01_init_docker_swarm.md](01_init_docker_swarm.md) first

**Time to Complete**: 15-20 minutes

**What You'll Build**:
- Single Redis instance on existing worker-1
- Password-protected with Docker secrets
- Private network communication only (maple-private-prod overlay)
- Persistent data with AOF + RDB
- Ready for Go application connections

---

## Table of Contents

1. [Overview](#overview)
2. [Label Worker Node](#label-worker-node)
3. [Create Redis Password Secret](#create-redis-password-secret)
4. [Deploy Redis](#deploy-redis)
5. [Verify Redis Health](#verify-redis-health)
6. [Redis Management](#redis-management)
7. [Troubleshooting](#troubleshooting)
8. [Next Steps](#next-steps)
9. [Performance Notes](#performance-notes)

---

## Overview

### Architecture

```
Docker Swarm Cluster:
├── mapleopentech-swarm-manager-1-prod (10.116.0.2)
│   └── Orchestrates cluster
│
├── mapleopentech-swarm-worker-1-prod (10.116.0.3)
│   └── Redis (single instance)
│       ├── Network: maple-private-prod (overlay, shared)
│       ├── Port: 6379 (private only)
│       ├── Auth: Password (Docker secret)
│       └── Data: Persistent volume
│
└── mapleopentech-swarm-worker-2,3,4-prod
    └── Cassandra Cluster (3 nodes)
        └── Same network: maple-private-prod

Shared Network (maple-private-prod):
├── All services can communicate
├── Service discovery by name (redis, cassandra-1, etc.)
└── No public internet access

Future Application:
└── mapleopentech-swarm-worker-X-prod
    └── Go Backend → Connects to redis:6379 and cassandra:9042 on maple-private-prod
```

### Redis Configuration

- **Version**: Redis 7 (Alpine)
- **Memory**: 512MB max (with LRU eviction)
- **Persistence**: AOF (every second) + RDB snapshots
- **Network**: Private overlay network only
- **Authentication**: Required via Docker secret
- **Security**: Dangerous commands disabled (FLUSHALL, CONFIG, etc.)

### Why Worker-1?

- Already exists from Docker Swarm setup
- Available capacity (2GB RAM droplet)
- Keeps costs down (no new droplet needed)
- Sufficient for caching workload

---

## Label Worker Node

We'll use Docker node labels to ensure Redis always deploys to worker-1.

**On your manager node:**

```bash
# SSH to manager
ssh dockeradmin@<manager-public-ip>

# Label worker-1 for Redis placement
docker node update --label-add redis=true mapleopentech-swarm-worker-1-prod

# Verify label
docker node inspect mapleopentech-swarm-worker-1-prod --format '{{.Spec.Labels}}'
# Should show: map[redis:true]
```
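A placement constraint that matches no node leaves the service pending instead of failing loudly, so it can help to check the label before deploying. The following is a minimal sketch: the function name `node_has_redis_label` is hypothetical, and it only wraps the `docker node inspect` call with a narrower template.

```bash
# Hypothetical helper: succeeds only if the node carries the label redis=true.
node_has_redis_label() {
  node=$1
  value=$(docker node inspect "$node" --format '{{ index .Spec.Labels "redis" }}') || return 1
  [ "$value" = "true" ]
}

# Usage (on the manager node):
#   node_has_redis_label mapleopentech-swarm-worker-1-prod && echo "label OK"
```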

---

## Create Redis Password Secret

Redis will use Docker secrets for password authentication.

### Step 1: Generate Strong Password

**On your manager node:**

```bash
# Generate a random 32-character password
REDIS_PASSWORD=$(openssl rand -base64 32 | tr -d "=+/" | cut -c1-32)

# Display it (SAVE THIS IN YOUR PASSWORD MANAGER!)
echo "$REDIS_PASSWORD"

# Example output: a8K9mP2nQ7rT4vW5xY6zB3cD1eF0gH8i
```

**⚠️ IMPORTANT**: Save this password in your password manager now! You'll need it for:
- Application configuration
- Manual Redis CLI connections
- Troubleshooting

### Step 2: Create Docker Secret

```bash
# Create secret from the password
# (printf avoids the trailing newline that echo would store in the secret)
printf '%s' "$REDIS_PASSWORD" | docker secret create redis_password -

# Verify secret was created
docker secret ls
# Should show:
# ID           NAME             CREATED
# abc123...    redis_password   About a minute ago
```
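Whatever is piped into `docker secret create` is stored byte for byte, and `echo` appends a trailing newline. Command substitution such as `$(cat /run/secrets/redis_password)` strips that newline, but an application that reads the secret file directly would see it. The sketch below uses the sample value from Step 1 (not a real password) to show the difference and how `printf '%s'` avoids it.

```bash
# Sample value for illustration only
sample="a8K9mP2nQ7rT4vW5xY6zB3cD1eF0gH8i"

# Count the bytes each command would hand to `docker secret create ... -`
with_echo=$(echo "$sample" | wc -c | tr -d ' ')
with_printf=$(printf '%s' "$sample" | wc -c | tr -d ' ')

echo "echo:   $with_echo bytes"    # 33 (32 characters + newline)
echo "printf: $with_printf bytes"  # 32
```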

### Step 3: Update .env File

**On your local machine**, update your `.env` file:

```bash
# Add to cloud/infrastructure/production/.env
REDIS_HOST=redis
REDIS_PORT=6379
REDIS_PASSWORD=<paste-the-password-here>
```
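A forgotten placeholder in `.env` tends to surface later as an authentication error, so a quick sanity check can save time. This is a sketch only: the function name `check_redis_env` is hypothetical, and it assumes plain `KEY=value` lines without quoting.

```bash
# Hypothetical helper: verifies that a .env file defines the Redis settings
# and that the password placeholder has been replaced.
check_redis_env() {
  env_file=$1
  for key in REDIS_HOST REDIS_PORT REDIS_PASSWORD; do
    # Take the last assignment of the key, everything after the first "="
    value=$(grep "^${key}=" "$env_file" | tail -n 1 | cut -d= -f2-)
    if [ -z "$value" ]; then
      echo "missing: $key" >&2
      return 1
    fi
  done
  if grep -q '^REDIS_PASSWORD=<' "$env_file"; then
    echo "REDIS_PASSWORD still contains the placeholder" >&2
    return 1
  fi
}

# Usage (on your local machine):
#   check_redis_env cloud/infrastructure/production/.env && echo ".env looks complete"
```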

---

## Deploy Redis

### Step 1: Create Redis Stack File

**On your manager node:**

```bash
# Create directory for stack files (if not exists)
mkdir -p ~/stacks
cd ~/stacks

# Create Redis stack file
vi redis-stack.yml
```

Copy and paste the following:

```yaml
version: '3.8'

networks:
  maple-private-prod:
    external: true

volumes:
  redis-data:

secrets:
  redis_password:
    external: true

services:
  redis:
    image: redis:7-alpine
    hostname: redis
    networks:
      - maple-private-prod
    volumes:
      - redis-data:/data
    secrets:
      - redis_password
    # Command with password from secret
    command: >
      sh -c '
      redis-server
      --requirepass "$$(cat /run/secrets/redis_password)"
      --bind 0.0.0.0
      --port 6379
      --protected-mode no
      --save 900 1
      --save 300 10
      --save 60 10000
      --appendonly yes
      --appendfilename "appendonly.aof"
      --appendfsync everysec
      --maxmemory 512mb
      --maxmemory-policy allkeys-lru
      --loglevel notice
      --databases 16
      --timeout 300
      --tcp-keepalive 300
      --io-threads 2
      --io-threads-do-reads yes
      --slowlog-log-slower-than 10000
      --slowlog-max-len 128
      --activerehashing yes
      --maxclients 10000
      --rename-command FLUSHDB ""
      --rename-command FLUSHALL ""
      --rename-command CONFIG ""
      '
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.labels.redis == true
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
      resources:
        limits:
          memory: 768M
        reservations:
          memory: 512M
    healthcheck:
      test: ["CMD", "sh", "-c", "redis-cli -a $$(cat /run/secrets/redis_password) ping | grep PONG"]
      interval: 10s
      timeout: 3s
      retries: 3
      start_period: 10s
```

Save and exit (`:wq` in vi).
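The three `--save` rules in the stack file are evaluated independently: an RDB snapshot is triggered when, for any one rule, at least that many seconds have passed since the last save and at least that many write operations have occurred. The sketch below only models that rule in shell to make the thresholds concrete. The function name is hypothetical; Redis performs this check internally.

```bash
# Models the stack file's rules: --save 900 1, --save 300 10, --save 60 10000
# $1 = seconds since the last successful save, $2 = write operations since then
rdb_snapshot_due() {
  elapsed=$1
  changes=$2
  for rule in "900 1" "300 10" "60 10000"; do
    seconds=${rule% *}      # part before the space
    min_changes=${rule#* }  # part after the space
    if [ "$elapsed" -ge "$seconds" ] && [ "$changes" -ge "$min_changes" ]; then
      return 0
    fi
  done
  return 1
}

rdb_snapshot_due 60 10000 && echo "10000 writes in 1 minute: snapshot"
rdb_snapshot_due 120 50   || echo "50 writes in 2 minutes: not yet"
rdb_snapshot_due 300 50   && echo "50 writes in 5 minutes: snapshot"
```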

### Step 2: Verify Shared Overlay Network

**Check if the maple-private-prod network exists:**

```bash
docker network ls | grep maple-private-prod
```

**You should see:**

```
abc123... maple-private-prod overlay swarm
```

**If you completed 02_cassandra.md** (Step 4), the network already exists and you're good to go!

**If the network doesn't exist**, create it now:

```bash
# Create the shared maple-private-prod network
docker network create \
  --driver overlay \
  --attachable \
  maple-private-prod

# Verify it was created
docker network ls | grep maple-private-prod
```

**What is this network?**
- Shared by all Maple services (Cassandra, Redis, your Go backend)
- Enables private communication between services
- Service names act as hostnames (e.g., `redis`, `cassandra-1`)
- No public exposure - overlay network is internal only

### Step 3: Deploy Redis Stack

```bash
# Deploy Redis
docker stack deploy -c redis-stack.yml redis

# Expected output:
# Creating service redis_redis
```

### Step 4: Verify Deployment

```bash
# Check service status
docker service ls
# Should show:
# ID       NAME          REPLICAS   IMAGE
# xyz...   redis_redis   1/1        redis:7-alpine

# Check which node it's running on
docker service ps redis_redis
# Should show mapleopentech-swarm-worker-1-prod

# Watch logs
docker service logs -f redis_redis
# Should see: "Ready to accept connections"
# Press Ctrl+C when done
```

Redis should be up and running in ~10-15 seconds.
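Rather than re-running `docker service ls` by hand, the wait can be scripted. The following is a minimal polling sketch: the function name `wait_for_replicas` and its defaults (15 attempts, 2 seconds apart) are assumptions, and it relies only on the `--filter` and `--format` options of `docker service ls`.

```bash
# Hypothetical helper: polls until the service reports the expected replica count.
# $1 = service name, $2 = expected value (e.g. 1/1), $3 = attempts, $4 = delay in seconds
wait_for_replicas() {
  service=$1
  expected=$2
  attempts=${3:-15}
  delay=${4:-2}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    current=$(docker service ls --filter "name=$service" --format '{{.Replicas}}')
    if [ "$current" = "$expected" ]; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "$service did not reach $expected (last seen: ${current:-none})" >&2
  return 1
}

# Usage (on the manager node):
#   wait_for_replicas redis_redis 1/1 && echo "Redis is up"
```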

---

## Verify Redis Health

### Step 1: Test Redis Connection

**SSH to worker-1:**

```bash
# Get worker-1's public IP from your .env
ssh dockeradmin@<worker-1-public-ip>

# Get Redis container ID
REDIS_CONTAINER=$(docker ps -q --filter "name=redis_redis")

# Test connection (replace YOUR_REDIS_PASSWORD with your actual password)
docker exec -it $REDIS_CONTAINER redis-cli -a YOUR_REDIS_PASSWORD ping
# Should return: PONG
# (redis-cli also warns that -a exposes the password on the command line; this is expected)
```

### Step 2: Test Basic Operations

```bash
# Set a test key
docker exec -it $REDIS_CONTAINER redis-cli -a YOUR_REDIS_PASSWORD SET test:key "Hello Redis"
# Returns: OK

# Get the test key
docker exec -it $REDIS_CONTAINER redis-cli -a YOUR_REDIS_PASSWORD GET test:key
# Returns: "Hello Redis"

# Check Redis info
docker exec -it $REDIS_CONTAINER redis-cli -a YOUR_REDIS_PASSWORD INFO server
# Shows Redis version, uptime, etc.

# Check memory usage
docker exec -it $REDIS_CONTAINER redis-cli -a YOUR_REDIS_PASSWORD INFO memory
# Shows memory stats
```
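Typing the password after `-a` leaves it in your shell history. One alternative, sketched below, reads the secret inside the container and hands it to `redis-cli` through the `REDISCLI_AUTH` environment variable. The wrapper name `redis_cli` is hypothetical, and it assumes the secret is mounted at `/run/secrets/redis_password`, as in the stack file.

```bash
# Hypothetical wrapper: runs a Redis command without typing the password.
redis_cli() {
  container=$(docker ps -q --filter "name=redis_redis")
  if [ -z "$container" ]; then
    echo "no redis_redis container found on this node" >&2
    return 1
  fi
  # The inner shell reads the secret; "$@" forwards the Redis command unchanged
  docker exec "$container" sh -c \
    'REDISCLI_AUTH="$(cat /run/secrets/redis_password)" redis-cli "$@"' sh "$@"
}

# Usage (on worker-1):
#   redis_cli ping
#   redis_cli GET test:key
```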

---

## Redis Management

### Restarting Redis

```bash
# On manager node
docker service update --force redis_redis

# Wait for restart (10-15 seconds)
docker service ps redis_redis
```

### Stopping Redis

```bash
# Remove Redis stack (data persists in volume)
docker stack rm redis

# Verify it's stopped
docker service ls | grep redis
# Should show nothing
```

### Starting Redis After Stop

```bash
# Redeploy the stack
cd ~/stacks
docker stack deploy -c redis-stack.yml redis

# Data is intact from previous volume
```

### Viewing Logs

```bash
# Recent logs
docker service logs redis_redis --tail 50

# Follow logs in real-time
docker service logs -f redis_redis
```

### Backing Up Redis Data

```bash
# SSH to worker-1
ssh dockeradmin@<worker-1-public-ip>

# Get container ID
REDIS_CONTAINER=$(docker ps -q --filter "name=redis_redis")

# Trigger manual save (runs in the background; let it finish before copying)
docker exec $REDIS_CONTAINER redis-cli -a YOUR_PASSWORD BGSAVE

# Copy RDB file to host
docker cp $REDIS_CONTAINER:/data/dump.rdb ~/redis-backup-$(date +%Y%m%d).rdb

# Download to local machine (from your local terminal)
scp dockeradmin@<worker-1-public-ip>:~/redis-backup-*.rdb ./
```
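`BGSAVE` returns immediately and writes the snapshot in the background, so copying `dump.rdb` right away can pick up the previous snapshot. Below is a minimal sketch of a completion check that reads the output of `INFO persistence`. The function name `bgsave_finished` is hypothetical; `rdb_bgsave_in_progress` and `rdb_last_bgsave_status` are standard `INFO` fields.

```bash
# Reads `INFO persistence` output on stdin.
# Succeeds when no background save is running and the last one finished OK.
bgsave_finished() {
  # INFO lines end in \r\n, so strip carriage returns before matching
  info=$(tr -d '\r')
  echo "$info" | grep -qx 'rdb_bgsave_in_progress:0' &&
    echo "$info" | grep -qx 'rdb_last_bgsave_status:ok'
}

# Usage (on worker-1, after BGSAVE):
#   until docker exec "$REDIS_CONTAINER" redis-cli -a YOUR_PASSWORD INFO persistence \
#         | bgsave_finished; do sleep 1; done
```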

### Clearing All Data (Dangerous!)

Since FLUSHALL is disabled, you need to remove and recreate the volume:

```bash
# On manager node
docker stack rm redis

# Wait for service to stop
sleep 10

# SSH to worker-1
ssh dockeradmin@<worker-1-public-ip>

# Remove volume (THIS DELETES ALL DATA!)
docker volume rm redis_redis-data

# Exit and redeploy from manager
exit
cd ~/stacks
docker stack deploy -c redis-stack.yml redis
```

---

## Troubleshooting

### Problem: Network Not Found During Deployment

**Symptom**: `network "maple-private-prod" is declared as external, but could not be found`

**Solution:**

Create the shared `maple-private-prod` network first:

```bash
# Create the network
docker network create \
  --driver overlay \
  --attachable \
  maple-private-prod

# Verify it exists
docker network ls | grep maple-private-prod
# Should show: maple-private-prod overlay swarm

# Then deploy Redis
docker stack deploy -c redis-stack.yml redis
```

**Why this happens:**
- You haven't completed Step 2 (verify network)
- The network was deleted
- First time deploying any Maple service

**Note**: This network is shared by all services (Cassandra, Redis, backend). You only need to create it once, before deploying your first service.

### Problem: Service Won't Start

**Symptom**: `docker service ls` shows `0/1` replicas

**Solutions:**

1. **Check logs:**
   ```bash
   docker service logs redis_redis --tail 50
   ```

2. **Verify secret exists:**
   ```bash
   docker secret ls | grep redis_password
   # Must show the secret
   ```

3. **Check node label:**
   ```bash
   docker node inspect mapleopentech-swarm-worker-1-prod --format '{{.Spec.Labels}}'
   # Must show: map[redis:true]
   ```

4. **Verify maple-private-prod network exists:**
   ```bash
   docker network ls | grep maple-private-prod
   # Should show: maple-private-prod overlay swarm
   ```

### Problem: Can't Connect (Authentication Failed)

**Symptom**: `NOAUTH Authentication required` or `WRONGPASS invalid username-password pair or user is disabled`

**Solutions:**

1. **Verify you're using the correct password:**
   ```bash
   # Confirm the secret exists (from manager node)
   docker secret inspect redis_password
   # Shows metadata only (ID, name, timestamps). Docker never displays a secret's
   # value, so compare against the password saved in your password manager.
   ```

2. **Test with password from secret file:**
   ```bash
   # SSH to worker-1
   REDIS_CONTAINER=$(docker ps -q --filter "name=redis_redis")
   docker exec $REDIS_CONTAINER sh -c 'redis-cli -a $(cat /run/secrets/redis_password) ping'
   # Should return: PONG
   ```

### Problem: Container Keeps Restarting

**Symptom**: `docker service ps redis_redis` shows multiple restarts

**Solutions:**

1. **Check memory:**
   ```bash
   # On worker-1
   free -h
   # Should have at least 1GB free
   ```

2. **Check logs for errors:**
   ```bash
   docker service logs redis_redis
   # Look for "Out of memory" or permission errors
   ```

3. **Verify volume permissions:**
   ```bash
   # On worker-1
   docker volume inspect redis_redis-data
   # Check mountpoint permissions
   ```

### Problem: Can't Connect from Application

**Symptom**: Application can't reach Redis on port 6379

**Solutions:**

1. **Verify both services on same network:**
   ```bash
   # Check your app is on maple-private-prod network
   docker service inspect your_app --format '{{.Spec.TaskTemplate.Networks}}'
   # Typically lists network IDs rather than names; one should match:
   docker network inspect maple-private-prod --format '{{.Id}}'
   ```

2. **Test DNS resolution:**
   ```bash
   # From your app container
   nslookup redis
   # Should resolve to the Redis service's virtual IP on the overlay network
   ```

3. **Test connectivity:**
   ```bash
   # From your app container (install redis-cli first)
   redis-cli -h redis -a YOUR_PASSWORD ping
   ```
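If the application image has no `redis-cli`, a throwaway container attached to the same overlay network can run the same test. This is a sketch under two assumptions: the network was created with `--attachable`, and the command runs on a swarm node. The function name `ping_redis_from_network` is hypothetical.

```bash
# Hypothetical helper: pings Redis by service name from a one-off container.
# REDISCLI_AUTH must be exported first; `-e REDISCLI_AUTH` passes it through by
# name, so the password does not appear on the docker command line.
ping_redis_from_network() {
  if [ -z "${REDISCLI_AUTH:-}" ]; then
    echo "export REDISCLI_AUTH first" >&2
    return 1
  fi
  docker run --rm --network maple-private-prod -e REDISCLI_AUTH \
    redis:7-alpine redis-cli -h redis ping
}

# Usage (on a swarm node):
#   export REDISCLI_AUTH=YOUR_PASSWORD
#   ping_redis_from_network   # expect PONG
```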

### Problem: Slow Performance

**Symptom**: Redis responds slowly or times out

**Solutions:**

1. **Check slow log:**
   ```bash
   docker exec $(docker ps -q --filter "name=redis_redis") \
     redis-cli -a YOUR_PASSWORD SLOWLOG GET 10
   ```

2. **Check memory usage:**
   ```bash
   docker exec $(docker ps -q --filter "name=redis_redis") \
     redis-cli -a YOUR_PASSWORD INFO memory
   # Look at used_memory_human and maxmemory_human
   ```

3. **Check for evictions:**
   ```bash
   docker exec $(docker ps -q --filter "name=redis_redis") \
     redis-cli -a YOUR_PASSWORD INFO stats | grep evicted_keys
   # High number means you need more memory
   ```

### Problem: Data Lost After Restart

**Symptom**: Data disappears when container restarts

**Verification:**

```bash
# On worker-1, check if volume exists
docker volume ls | grep redis
# Should show: redis_redis-data

# Check volume is mounted
docker inspect $(docker ps -q --filter "name=redis_redis") --format '{{.Mounts}}'
# Should show /data mounted to volume
```

**This shouldn't happen** if volume is properly configured. If it does:
1. Check AOF/RDB files exist: `docker exec <container> ls -lh /data/`
2. Check persistence status: `docker exec <container> redis-cli -a PASSWORD INFO persistence` (`CONFIG GET dir` is not available here because this setup disables `CONFIG`; the official image's working directory is `/data`)

---

## Next Steps

✅ **You now have:**
- Redis instance running on worker-1
- Password-protected access
- Persistent data storage (AOF + RDB)
- Private network connectivity
- Ready for application integration

**Next guides:**
- **04_app_backend.md** - Deploy your Go backend application
- Connect backend to Redis and Cassandra
- Set up NGINX reverse proxy

---

## Performance Notes

### Current Setup (2GB RAM Worker)

**Capacity:**
- 512MB max Redis memory
- Suitable for: roughly 50k-100k keys at about 5-10 KB each (more if values are smaller)
- Cache hit rate: Monitor with `INFO stats`
- Throughput: ~10,000-50,000 ops/sec
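The hit rate is not reported directly. It can be derived from the `keyspace_hits` and `keyspace_misses` counters in `INFO stats` as hits / (hits + misses). The sketch below does that arithmetic on `INFO stats` output read from stdin; the function name `cache_hit_rate` is hypothetical and the counters in the example are made up.

```bash
# Reads `INFO stats` output on stdin and prints the cache hit rate.
cache_hit_rate() {
  # INFO lines end in \r\n, so strip carriage returns before parsing
  tr -d '\r' | awk -F: '
    $1 == "keyspace_hits"   { hits = $2 }
    $1 == "keyspace_misses" { misses = $2 }
    END {
      total = hits + misses
      if (total == 0) { print "n/a"; exit }
      printf "%.1f%%\n", 100 * hits / total
    }'
}

# Example with made-up counters:
printf 'keyspace_hits:900\r\nkeyspace_misses:100\r\n' | cache_hit_rate   # prints 90.0%
```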

**Limitations:**
- Single instance (no redundancy)
- No Redis Cluster (no automatic sharding)
- Limited to 512MB (maxmemory setting)

### Upgrade Path

**For Production with High Load:**

1. **Increase memory** (resize worker-1 to 4GB):
   - Update maxmemory to 2GB
   - Better for larger datasets

2. **Add Redis replica** (for redundancy):
   - Deploy second Redis on another worker
   - Configure replication
   - High availability with Sentinel

3. **Redis Cluster** (for very high scale):
   - 3+ worker nodes
   - Automatic sharding
   - Handles millions of keys

For most applications starting out, **single instance with 512MB is sufficient**.

---

**Last Updated**: November 3, 2025
**Maintained By**: Infrastructure Team