
Redis Setup (Single Instance)

Prerequisites: Complete 01_init_docker_swarm.md first

Time to Complete: 15-20 minutes

What You'll Build:

  • Single Redis instance on existing worker-1
  • Password-protected with Docker secrets
  • Private network communication only (mapleopentech-private-prod overlay)
  • Persistent data with AOF + RDB
  • Ready for Go application connections

Table of Contents

  1. Overview
  2. Label Worker Node
  3. Create Redis Password Secret
  4. Deploy Redis
  5. Verify Redis Health
  6. Redis Management
  7. Troubleshooting
  8. Next Steps
  9. Performance Notes

Overview

Architecture

Docker Swarm Cluster:
├── mapleopentech-swarm-manager-1-prod (10.116.0.2)
│   └── Orchestrates cluster
│
├── mapleopentech-swarm-worker-1-prod (10.116.0.3)
│   └── Redis (single instance)
│       ├── Network: mapleopentech-private-prod (overlay, shared)
│       ├── Port: 6379 (private only)
│       ├── Auth: Password (Docker secret)
│       └── Data: Persistent volume
│
└── mapleopentech-swarm-worker-2,3,4-prod
    └── Cassandra Cluster (3 nodes)
        └── Same network: mapleopentech-private-prod

Shared Network (mapleopentech-private-prod):
├── All services can communicate
├── Service discovery by name (redis, cassandra-1, etc.)
└── No public internet access

Future Application:
└── mapleopentech-swarm-worker-X-prod
    └── Go Backend → Connects to redis:6379 and cassandra:9042 on mapleopentech-private-prod

Redis Configuration

  • Version: Redis 7 (Alpine)
  • Memory: 512MB max (with LRU eviction)
  • Persistence: AOF (every second) + RDB snapshots
  • Network: Private overlay network only
  • Authentication: Required via Docker secret
  • Security: Dangerous commands disabled (FLUSHALL, CONFIG, etc.)

Why Worker-1?

  • Already exists from Docker Swarm setup
  • Available capacity (2GB RAM droplet)
  • Keeps costs down (no new droplet needed)
  • Sufficient for caching workload

Label Worker Node

We'll use Docker node labels to ensure Redis always deploys to worker-1.

On your manager node:

# SSH to manager
ssh dockeradmin@<manager-public-ip>

# Label worker-1 for Redis placement
docker node update --label-add redis=true mapleopentech-swarm-worker-1-prod

# Verify label
docker node inspect mapleopentech-swarm-worker-1-prod --format '{{.Spec.Labels}}'
# Should show: map[redis:true]

Create Redis Password Secret

Redis will use Docker secrets for password authentication.

Step 1: Generate Strong Password

On your manager node:

# Generate a random 32-character password
REDIS_PASSWORD=$(openssl rand -base64 32 | tr -d "=+/" | cut -c1-32)

# Display it (SAVE THIS IN YOUR PASSWORD MANAGER!)
echo "$REDIS_PASSWORD"

# Example output: a8K9mP2nQ7rT4vW5xY6zB3cD1eF0gH8i

⚠️ IMPORTANT: Save this password in your password manager now! You'll need it for:

  • Application configuration
  • Manual Redis CLI connections
  • Troubleshooting
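If you repeat this setup elsewhere, the generation step can be wrapped in a small helper. A minimal sketch; `gen_redis_password` is a hypothetical name, and the 48-byte input (rather than 32) is an assumption chosen so that 32 characters always remain after stripping `=`, `+`, and `/`:

```shell
# Hypothetical helper: 32-character alphanumeric password.
# 48 random bytes -> 64 base64 chars, so 32 characters remain
# even after stripping the '=', '+', and '/' characters.
gen_redis_password() {
  openssl rand -base64 48 | tr -d '=+/' | cut -c1-32
}

gen_redis_password
```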

Step 2: Create Docker Secret

# Create secret from the password
# (printf, not echo, so no trailing newline ends up in the secret)
printf '%s' "$REDIS_PASSWORD" | docker secret create redis_password -

# Verify secret was created
docker secret ls
# Should show:
# ID            NAME             CREATED
# abc123...     redis_password   About a minute ago

Step 3: Update .env File

On your local machine, update your .env file:

# Add to cloud/infrastructure/production/.env
REDIS_HOST=redis
REDIS_PORT=6379
REDIS_PASSWORD=<paste-the-password-here>
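Before moving on, it can help to confirm all three keys actually made it into the file. A minimal sketch; `check_env` is a hypothetical helper, not part of the guide's tooling:

```shell
# Hypothetical check: fail fast if any required Redis key is missing.
check_env() {
  # $1 = path to the .env file
  for key in REDIS_HOST REDIS_PORT REDIS_PASSWORD; do
    grep -q "^${key}=" "$1" || { echo "missing: $key"; return 1; }
  done
  echo "env ok"
}

# check_env cloud/infrastructure/production/.env   # prints "env ok" when complete
```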

Deploy Redis

Step 1: Create Redis Stack File

On your manager node:

# Create directory for stack files (if not exists)
mkdir -p ~/stacks
cd ~/stacks

# Create Redis stack file
vi redis-stack.yml

Copy and paste the following:

version: '3.8'

networks:
  mapleopentech-private-prod:
    external: true

volumes:
  redis-data:

secrets:
  redis_password:
    external: true

services:
  redis:
    image: redis:7-alpine
    hostname: redis
    networks:
      - mapleopentech-private-prod
    volumes:
      - redis-data:/data
    secrets:
      - redis_password
    # Command with password from secret
    command: >
      sh -c '
      redis-server
      --requirepass "$$(cat /run/secrets/redis_password)"
      --bind 0.0.0.0
      --port 6379
      --protected-mode no
      --save 900 1
      --save 300 10
      --save 60 10000
      --appendonly yes
      --appendfilename "appendonly.aof"
      --appendfsync everysec
      --maxmemory 512mb
      --maxmemory-policy allkeys-lru
      --loglevel notice
      --databases 16
      --timeout 300
      --tcp-keepalive 300
      --io-threads 2
      --io-threads-do-reads yes
      --slowlog-log-slower-than 10000
      --slowlog-max-len 128
      --activerehashing yes
      --maxclients 10000
      --rename-command FLUSHDB ""
      --rename-command FLUSHALL ""
      --rename-command CONFIG ""
      '
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.labels.redis == true
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
      resources:
        limits:
          memory: 768M
        reservations:
          memory: 512M
    healthcheck:
      test: ["CMD", "sh", "-c", "redis-cli -a $$(cat /run/secrets/redis_password) ping | grep PONG"]
      interval: 10s
      timeout: 3s
      retries: 3
      start_period: 10s

Save and exit (:wq in vi).

Step 2: Verify Shared Overlay Network

Check if the mapleopentech-private-prod network exists:

docker network ls | grep mapleopentech-private-prod

You should see:

abc123...      mapleopentech-private-prod   overlay   swarm

If you completed 02_cassandra.md (Step 4), the network already exists and you're good to go!

If the network doesn't exist, create it now:

# Create the shared mapleopentech-private-prod network
docker network create \
  --driver overlay \
  --attachable \
  mapleopentech-private-prod

# Verify it was created
docker network ls | grep mapleopentech-private-prod

What is this network?

  • Shared by all Maple services (Cassandra, Redis, your Go backend)
  • Enables private communication between services
  • Service names act as hostnames (e.g., redis, cassandra-1)
  • No public exposure - overlay network is internal only

Step 3: Deploy Redis Stack

# Deploy Redis
docker stack deploy -c redis-stack.yml redis

# Expected output:
# Creating service redis_redis

Step 4: Verify Deployment

# Check service status
docker service ls
# Should show:
# ID        NAME          REPLICAS   IMAGE
# xyz...    redis_redis   1/1        redis:7-alpine

# Check which node it's running on
docker service ps redis_redis
# Should show mapleopentech-swarm-worker-1-prod

# Watch logs
docker service logs -f redis_redis
# Should see: "Ready to accept connections"
# Press Ctrl+C when done

Redis should be up and running in ~10-15 seconds.
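If you script this step, polling the replica count is more reliable than a fixed sleep. A minimal sketch; `replicas_ready` is a hypothetical helper that parses the `N/M` string `docker service ls` prints:

```shell
# Hypothetical helper: true when a "running/desired" replica string
# (e.g. "1/1" from `docker service ls`) is fully converged.
replicas_ready() {
  [ "${1%%/*}" = "${1##*/}" ] && [ "${1%%/*}" != "0" ]
}

# On the manager node, poll until the service is up:
#   until replicas_ready "$(docker service ls --filter name=redis_redis --format '{{.Replicas}}')"; do
#     sleep 2
#   done
```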


Verify Redis Health

Step 1: Test Redis Connection

SSH to worker-1:

# Get worker-1's public IP from your .env
ssh dockeradmin@<worker-1-public-ip>

# Get Redis container ID
REDIS_CONTAINER=$(docker ps -q --filter "name=redis_redis")

# Test connection (replace PASSWORD with your actual password)
docker exec -it $REDIS_CONTAINER redis-cli -a YOUR_REDIS_PASSWORD ping
# Should return: PONG

Step 2: Test Basic Operations

# Set a test key
docker exec -it $REDIS_CONTAINER redis-cli -a YOUR_REDIS_PASSWORD SET test:key "Hello Redis"
# Returns: OK

# Get the test key
docker exec -it $REDIS_CONTAINER redis-cli -a YOUR_REDIS_PASSWORD GET test:key
# Returns: "Hello Redis"

# Check Redis info
docker exec -it $REDIS_CONTAINER redis-cli -a YOUR_REDIS_PASSWORD INFO server
# Shows Redis version, uptime, etc.

# Check memory usage
docker exec -it $REDIS_CONTAINER redis-cli -a YOUR_REDIS_PASSWORD INFO memory
# Shows memory stats
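INFO output is line-oriented (`field:value`, with CRLF line endings), so single fields are easy to pull out in scripts. A minimal sketch; `info_field` is a hypothetical helper, and `SAMPLE` stands in for real redis-cli output:

```shell
# Hypothetical helper: extract one field from `redis-cli INFO` text.
info_field() {
  # $1 = INFO output, $2 = field name
  printf '%s\n' "$1" | tr -d '\r' | awk -F: -v k="$2" '$1 == k { print $2 }'
}

SAMPLE='# Memory
used_memory:1048576
used_memory_human:1.00M
maxmemory:536870912'

info_field "$SAMPLE" used_memory_human
```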

Redis Management

Restarting Redis

# On manager node
docker service update --force redis_redis

# Wait for restart (10-15 seconds)
docker service ps redis_redis

Stopping Redis

# Remove Redis stack (data persists in volume)
docker stack rm redis

# Verify it's stopped
docker service ls | grep redis
# Should show nothing

Starting Redis After Stop

# Redeploy the stack
cd ~/stacks
docker stack deploy -c redis-stack.yml redis

# Data is intact from previous volume

Viewing Logs

# Recent logs
docker service logs redis_redis --tail 50

# Follow logs in real-time
docker service logs -f redis_redis

Backing Up Redis Data

# SSH to worker-1
ssh dockeradmin@<worker-1-public-ip>

# Get container ID
REDIS_CONTAINER=$(docker ps -q --filter "name=redis_redis")

# Trigger manual save
docker exec $REDIS_CONTAINER redis-cli -a YOUR_PASSWORD BGSAVE

# Copy RDB file to host
docker cp $REDIS_CONTAINER:/data/dump.rdb ~/redis-backup-$(date +%Y%m%d).rdb

# Download to local machine (from your local terminal)
scp dockeradmin@<worker-1-public-ip>:~/redis-backup-*.rdb ./
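For recurring backups, date-stamped names plus a retention sweep keep the host tidy. A minimal sketch; both helpers and the 7-day window are assumptions, not part of the guide:

```shell
# Hypothetical helper: date-stamped backup file name.
backup_name() {
  printf 'redis-backup-%s.rdb' "$(date +%Y%m%d)"
}

# Hypothetical helper: delete backups older than 7 days in a directory.
prune_backups() {
  find "$1" -name 'redis-backup-*.rdb' -mtime +7 -delete
}

backup_name
```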

Clearing All Data (Dangerous!)

Since FLUSHALL is disabled, you need to remove and recreate the volume:

# On manager node
docker stack rm redis

# Wait for service to stop
sleep 10

# SSH to worker-1
ssh dockeradmin@<worker-1-public-ip>

# Remove volume (THIS DELETES ALL DATA!)
docker volume rm redis_redis-data

# Exit and redeploy from manager
exit
docker stack deploy -c redis-stack.yml redis

Troubleshooting

Problem: Network Not Found During Deployment

Symptom: network "mapleopentech-private-prod" is declared as external, but could not be found

Solution:

Create the shared mapleopentech-private-prod network first:

# Create the network
docker network create \
  --driver overlay \
  --attachable \
  mapleopentech-private-prod

# Verify it exists
docker network ls | grep mapleopentech-private-prod
# Should show: mapleopentech-private-prod   overlay   swarm

# Then deploy Redis
docker stack deploy -c redis-stack.yml redis

Why this happens:

  • You haven't completed Step 2 (verify network)
  • The network was deleted
  • First time deploying any Maple service

Note: This network is shared by all services (Cassandra, Redis, backend). You only need to create it once, before deploying your first service.

Problem: Service Won't Start

Symptom: docker service ls shows 0/1 replicas

Solutions:

  1. Check logs:

    docker service logs redis_redis --tail 50
    
  2. Verify secret exists:

    docker secret ls | grep redis_password
    # Must show the secret
    
  3. Check node label:

    docker node inspect mapleopentech-swarm-worker-1-prod --format '{{.Spec.Labels}}'
    # Must show: map[redis:true]
    
  4. Verify mapleopentech-private-prod network exists:

    docker network ls | grep mapleopentech-private-prod
    # Should show: mapleopentech-private-prod   overlay   swarm
    

Problem: Can't Connect (Authentication Failed)

Symptom: NOAUTH Authentication required or ERR invalid password

Solutions:

  1. Verify you're using the correct password:

    # Confirm the secret exists (from manager node); Docker never
    # shows secret contents, only metadata
    docker secret inspect redis_password
    # Compare the password you are using against your password manager entry
    
  2. Test with password from secret file:

    # SSH to worker-1
    REDIS_CONTAINER=$(docker ps -q --filter "name=redis_redis")
    docker exec $REDIS_CONTAINER sh -c 'redis-cli -a $(cat /run/secrets/redis_password) ping'
    # Should return: PONG
    

Problem: Container Keeps Restarting

Symptom: docker service ps redis_redis shows multiple restarts

Solutions:

  1. Check memory:

    # On worker-1
    free -h
    # Should have at least 1GB free
    
  2. Check logs for errors:

    docker service logs redis_redis
    # Look for "Out of memory" or permission errors
    
  3. Verify volume permissions:

    # On worker-1
    docker volume inspect redis_redis-data
    # Check mountpoint permissions
    

Problem: Can't Connect from Application

Symptom: Application can't reach Redis on port 6379

Solutions:

  1. Verify both services on same network:

    # Check your app is on mapleopentech-private-prod network
    docker service inspect your_app --format '{{.Spec.TaskTemplate.Networks}}'
    # Should show mapleopentech-private-prod
    
  2. Test DNS resolution:

    # From your app container
    nslookup redis
    # Should resolve to Redis container IP
    
  3. Test connectivity:

    # From your app container (install redis-cli first)
    redis-cli -h redis -a YOUR_PASSWORD ping
    

Problem: Slow Performance

Symptom: Redis responds slowly or times out

Solutions:

  1. Check slow log:

    docker exec $(docker ps -q --filter "name=redis_redis") \
      redis-cli -a YOUR_PASSWORD SLOWLOG GET 10
    
  2. Check memory usage:

    docker exec $(docker ps -q --filter "name=redis_redis") \
      redis-cli -a YOUR_PASSWORD INFO memory
    # Look at used_memory_human and maxmemory_human
    
  3. Check for evictions:

    docker exec $(docker ps -q --filter "name=redis_redis") \
      redis-cli -a YOUR_PASSWORD INFO stats | grep evicted_keys
    # High number means you need more memory
    

Problem: Data Lost After Restart

Symptom: Data disappears when container restarts

Verification:

# On worker-1, check if volume exists
docker volume ls | grep redis
# Should show: redis_redis-data

# Check volume is mounted
docker inspect $(docker ps -q --filter "name=redis_redis") --format '{{.Mounts}}'
# Should show /data mounted to volume

This shouldn't happen if the volume is properly configured. If it does:

  1. Check AOF/RDB files exist: docker exec <container> ls -lh /data/
  2. Check Redis config: docker exec <container> redis-cli -a PASSWORD CONFIG GET dir

Next Steps

You now have:

  • Redis instance running on worker-1
  • Password-protected access
  • Persistent data storage (AOF + RDB)
  • Private network connectivity
  • Ready for application integration

Next guides:

  • 04_app_backend.md - Deploy your Go backend application
  • Connect backend to Redis and Cassandra
  • Set up NGINX reverse proxy

Performance Notes

Current Setup (2GB RAM Worker)

Capacity:

  • 512MB max Redis memory
  • Suitable for: ~50k-100k small keys
  • Cache hit rate: Monitor with INFO stats
  • Throughput: ~10,000-50,000 ops/sec
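The hit rate mentioned above comes from the keyspace_hits and keyspace_misses counters in INFO stats. A minimal sketch of the arithmetic; `hit_rate` is a hypothetical helper:

```shell
# Hypothetical helper: cache hit rate (%) from INFO stats counters.
hit_rate() {
  # $1 = keyspace_hits, $2 = keyspace_misses
  awk -v h="$1" -v m="$2" 'BEGIN {
    t = h + m
    if (t == 0) print "0.0"
    else printf "%.1f\n", 100 * h / t
  }'
}

hit_rate 900 100
```

On the live instance the two counters can be read with `redis-cli -a YOUR_PASSWORD INFO stats`; a steadily falling rate usually means the working set no longer fits in the 512MB limit.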

Limitations:

  • Single instance (no redundancy)
  • No Redis Cluster (no automatic sharding)
  • Limited to 512MB (maxmemory setting)

Upgrade Path

For Production with High Load:

  1. Increase memory (resize worker-1 to 4GB):

    • Update maxmemory to 2GB
    • Better for larger datasets
  2. Add Redis replica (for redundancy):

    • Deploy second Redis on another worker
    • Configure replication
    • High availability with Sentinel
  3. Redis Cluster (for very high scale):

    • 3+ worker nodes
    • Automatic sharding
    • Handles millions of keys

For most applications starting out, single instance with 512MB is sufficient.


Last Updated: November 3, 2025
Maintained By: Infrastructure Team