Initial commit: Open sourcing all of the Maple Open Technologies code.
Commit 755d54a99d: 2010 changed files with 448675 additions and 0 deletions

cloud/infrastructure/production/automation/README.md (new file, 693 lines)

# Automation Scripts and Tools

**Audience**: DevOps Engineers, Automation Teams
**Purpose**: Automated scripts, monitoring configs, and CI/CD pipelines for production infrastructure
**Prerequisites**: Infrastructure deployed, basic scripting knowledge

---

## Overview

This directory contains automation tools, scripts, and configurations to reduce manual operational overhead and ensure consistency across deployments.

**What's automated:**
- Backup procedures (scheduled)
- Deployment workflows (CI/CD)
- Monitoring and alerting (Prometheus/Grafana configs)
- Common maintenance tasks (scripts)
- Infrastructure health checks

---

## Directory Structure

```
automation/
├── README.md                    # This file
│
├── scripts/                     # Operational scripts
│   ├── backup-all.sh            # Master backup orchestrator
│   ├── backup-cassandra.sh      # Cassandra snapshot + upload
│   ├── backup-redis.sh          # Redis RDB/AOF backup
│   ├── backup-meilisearch.sh    # Meilisearch dump export
│   ├── deploy-backend.sh        # Backend deployment automation
│   ├── deploy-frontend.sh       # Frontend deployment automation
│   ├── health-check.sh          # Infrastructure health verification
│   ├── rotate-secrets.sh        # Secret rotation automation
│   └── cleanup-docker.sh        # Docker cleanup (images, containers)
│
├── monitoring/                  # Monitoring configurations
│   ├── prometheus.yml           # Prometheus scrape configs
│   ├── alertmanager.yml         # Alert routing and receivers
│   ├── alert-rules.yml          # Prometheus alert definitions
│   └── grafana-dashboards/      # JSON dashboard exports
│       ├── infrastructure.json
│       ├── maplepress.json
│       └── databases.json
│
└── ci-cd/                       # CI/CD pipeline examples
    ├── github-actions.yml       # GitHub Actions workflow
    ├── gitlab-ci.yml            # GitLab CI pipeline
    └── deployment-pipeline.md   # CI/CD setup guide
```

---

## Scripts

### Backup Scripts

All backup scripts are designed to be run via cron. They:
- Create local snapshots/dumps
- Compress and upload to DigitalOcean Spaces
- Clean up old backups (retention policy)
- Log to `/var/log/`
- Exit with appropriate codes for monitoring

**See `../operations/01_backup_recovery.md` for complete script contents and setup instructions.**

**Installation:**

```bash
# On manager node
ssh dockeradmin@<manager-ip>

# Copy scripts (once scripts are created in this directory)
sudo cp automation/scripts/backup-*.sh /usr/local/bin/
sudo chmod +x /usr/local/bin/backup-*.sh

# Schedule via cron
sudo crontab -e
# 0 2 * * * /usr/local/bin/backup-all.sh >> /var/log/backup-all.log 2>&1
```
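
The retention step listed above could look roughly like the sketch below. It is only an illustration: the function name `prune_old_backups`, the `*.tar.gz` naming, and the flat directory layout are assumptions, not the contents of the real backup scripts (those are documented in `../operations/01_backup_recovery.md`). It relies on the `-maxdepth` and `-delete` options found in GNU and BSD `find`.

```bash
#!/bin/bash
# Sketch: delete local backup archives older than a retention window.
# prune_old_backups, the *.tar.gz pattern, and the directory layout are
# hypothetical; adapt them to the real backup scripts.

prune_old_backups() {
    local backup_dir="$1"
    local retention_days="$2"

    if [ ! -d "$backup_dir" ]; then
        echo "Error: backup directory not found: $backup_dir" >&2
        return 2
    fi

    # -mtime +N matches files whose age in whole days is greater than N.
    # -print lists each file as it is removed, which is useful in cron logs.
    find "$backup_dir" -maxdepth 1 -type f -name '*.tar.gz' \
        -mtime "+${retention_days}" -print -delete
}
```

A backup script might call `prune_old_backups /var/backups/cassandra 7` after a successful upload, so that a failed upload never removes the only remaining copy. The path and the 7-day window are placeholders.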

### Deployment Scripts

**`deploy-backend.sh`** - Automated backend deployment

```bash
#!/bin/bash
# Purpose: Deploy new backend version (the stack is removed and redeployed,
#          so expect a brief outage while services restart)
# Usage: ./deploy-backend.sh [tag]
# Example: ./deploy-backend.sh prod

set -e

TAG=${1:-prod}

# Template placeholders: export these for your environment before running
MANAGER_IP="${MANAGER_IP:?Set MANAGER_IP to the manager node address}"
WORKER_6_IP="${WORKER_6_IP:?Set WORKER_6_IP to the worker-6 address}"

echo "=== Deploying Backend: Tag $TAG ==="

# Step 1: Build and push (from local dev machine)
echo "Building and pushing image..."
cd ~/go/src/codeberg.org/mapleopentech/monorepo/cloud/mapleopentech-backend
task deploy

# Step 2: Force pull on worker-6
echo "Forcing fresh pull on worker-6..."
ssh "dockeradmin@$WORKER_6_IP" \
    "docker pull registry.digitalocean.com/ssp/maplepress_backend:$TAG"

# Step 3: Redeploy stack
echo "Redeploying stack..."
ssh "dockeradmin@$MANAGER_IP" << 'ENDSSH'
cd ~/stacks
docker stack rm maplepress
sleep 10
docker config rm maplepress_caddyfile 2>/dev/null || true
docker stack deploy -c maplepress-stack.yml maplepress
ENDSSH

# Step 4: Verify deployment
echo "Verifying deployment..."
sleep 30
ssh "dockeradmin@$MANAGER_IP" << 'ENDSSH'
docker service ps maplepress_backend | head -5
docker service logs maplepress_backend --tail 20
ENDSSH

# Step 5: Health check
echo "Testing health endpoint..."
curl -f https://getmaplepress.ca/health || { echo "Health check failed!"; exit 1; }

echo "✅ Backend deployment complete!"
```
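
The fixed `sleep 30` before verification can be too short on a slow start and wasteful on a fast one. One possible alternative, sketched here with a hypothetical helper named `retry`, is to poll the health endpoint until it responds or the attempts run out.

```bash
#!/bin/bash
# Sketch: retry a command a fixed number of times with a delay between tries.
# retry is a hypothetical helper, not part of the repository's scripts.
# Usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]

retry() {
    local attempts="$1"
    local delay="$2"
    shift 2

    local i
    for ((i = 1; i <= attempts; i++)); do
        # Running the command as an `if` condition keeps `set -e` from aborting
        if "$@"; then
            return 0
        fi
        if [ "$i" -lt "$attempts" ]; then
            echo "Attempt $i/$attempts failed; retrying in ${delay}s..." >&2
            sleep "$delay"
        fi
    done
    return 1
}
```

In the deployment script this might replace the final check with `retry 12 5 curl -sf https://getmaplepress.ca/health`, which allows up to about a minute for the backend to come up. The attempt count and delay are illustrative values.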

**`deploy-frontend.sh`** - Automated frontend deployment

```bash
#!/bin/bash
# Purpose: Deploy new frontend build
# Usage: ./deploy-frontend.sh

set -e

# Template placeholder: export this for your environment before running
WORKER_7_IP="${WORKER_7_IP:?Set WORKER_7_IP to the worker-7 address}"

echo "=== Deploying Frontend ==="

# SSH to worker-7 and run deployment
ssh "dockeradmin@$WORKER_7_IP" << 'ENDSSH'
# The local `set -e` does not apply to the remote shell, so enable it here too
set -e
cd /var/www/monorepo

echo "Pulling latest code..."
git pull origin main

cd web/maplepress-frontend

echo "Configuring production environment..."
cat > .env.production << 'EOF'
VITE_API_BASE_URL=https://getmaplepress.ca
NODE_ENV=production
EOF

echo "Installing dependencies..."
npm install

echo "Building frontend..."
npm run build

echo "Verifying build..."
if grep -q "getmaplepress.ca" dist/assets/*.js 2>/dev/null; then
echo "✅ Production API URL confirmed"
else
echo "⚠️ Warning: Production URL not found in build"
fi
ENDSSH

# Test frontend
echo "Testing frontend..."
curl -f https://getmaplepress.com || { echo "Frontend test failed!"; exit 1; }

echo "✅ Frontend deployment complete!"
```

### Health Check Script

**`health-check.sh`** - Comprehensive infrastructure health verification

```bash
#!/bin/bash
# Purpose: Check health of all infrastructure components
# Usage: ./health-check.sh
# Exit codes: 0=healthy, 1=warnings, 2=critical

WARNINGS=0
CRITICAL=0

echo "=== Infrastructure Health Check ==="
echo "Started: $(date)"
echo ""

# Check all services: compare running vs desired replicas, so that services
# scaled to 2/2 or 3/3 are not reported as degraded
echo "--- Docker Services ---"
DEGRADED=$(docker service ls --format '{{.Name}} {{.Replicas}}' \
    | awk '{ split($2, r, "/"); if (r[1] != r[2]) print }')
if [ -n "$DEGRADED" ]; then
    SERVICES_DOWN=$(echo "$DEGRADED" | wc -l)
    echo "⚠️ WARNING: $SERVICES_DOWN services not at full capacity"
    echo "$DEGRADED"
    WARNINGS=$((WARNINGS + 1))
else
    echo "✅ All services at desired replica count"
fi

# Check all nodes
echo ""
echo "--- Docker Nodes ---"
NODES_NOT_READY=$(docker node ls --format '{{.Hostname}} {{.Status}}' | grep -v ' Ready$')
if [ -n "$NODES_NOT_READY" ]; then
    NODES_DOWN=$(echo "$NODES_NOT_READY" | wc -l)
    echo "🔴 CRITICAL: $NODES_DOWN nodes not ready!"
    echo "$NODES_NOT_READY"
    CRITICAL=$((CRITICAL + 1))
else
    echo "✅ All nodes ready"
fi

# Check disk space
echo ""
echo "--- Disk Space ---"
for NODE in worker-1 worker-2 worker-3 worker-4 worker-5 worker-6 worker-7; do
    DISK_USAGE=$(ssh -o StrictHostKeyChecking=no "dockeradmin@$NODE" "df -h / | tail -1 | awk '{print \$5}' | tr -d '%'")
    if ! [[ "$DISK_USAGE" =~ ^[0-9]+$ ]]; then
        # SSH or df failed, so there is no number to compare
        echo "🔴 CRITICAL: $NODE disk usage unavailable"
        CRITICAL=$((CRITICAL + 1))
    elif [ "$DISK_USAGE" -gt 85 ]; then
        echo "🔴 CRITICAL: $NODE disk usage: ${DISK_USAGE}%"
        CRITICAL=$((CRITICAL + 1))
    elif [ "$DISK_USAGE" -gt 75 ]; then
        echo "⚠️ WARNING: $NODE disk usage: ${DISK_USAGE}%"
        WARNINGS=$((WARNINGS + 1))
    else
        echo "✅ $NODE disk usage: ${DISK_USAGE}%"
    fi
done

# Check endpoints
echo ""
echo "--- HTTP Endpoints ---"
if curl -sf https://getmaplepress.ca/health > /dev/null; then
    echo "✅ Backend health check passed"
else
    echo "🔴 CRITICAL: Backend health check failed!"
    CRITICAL=$((CRITICAL + 1))
fi

if curl -sf https://getmaplepress.com > /dev/null; then
    echo "✅ Frontend accessible"
else
    echo "🔴 CRITICAL: Frontend not accessible!"
    CRITICAL=$((CRITICAL + 1))
fi

# Summary
echo ""
echo "=== Summary ==="
echo "Warnings: $WARNINGS"
echo "Critical: $CRITICAL"

if [ "$CRITICAL" -gt 0 ]; then
    echo "🔴 Status: CRITICAL"
    exit 2
elif [ "$WARNINGS" -gt 0 ]; then
    echo "⚠️ Status: WARNING"
    exit 1
else
    echo "✅ Status: HEALTHY"
    exit 0
fi
```
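
Because the script reports its result through exit codes (0=healthy, 1=warnings, 2=critical), a caller can turn that code into a one-line status record. The sketch below shows one way to do that. The helper names `health_status_label` and `run_health_check` are hypothetical, and the summary format is an assumption rather than something the monitoring stack expects.

```bash
#!/bin/bash
# Sketch: map the health-check exit code to a status line.
# Both function names are hypothetical helpers for illustration.

health_status_label() {
    case "$1" in
        0) echo "HEALTHY" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;
    esac
}

# Usage: run_health_check COMMAND [ARGS...]
# Runs the command, discards its output, prints a timestamped summary line,
# and returns the command's own exit code so callers can still branch on it.
run_health_check() {
    local rc=0
    "$@" > /dev/null 2>&1 || rc=$?
    echo "$(date '+%Y-%m-%d %H:%M:%S') health-check status=$(health_status_label "$rc") exit=$rc"
    return "$rc"
}
```

A cron entry or alerting hook could call `run_health_check /usr/local/bin/health-check.sh` and act only when the return code is 2.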

---

## Monitoring Configuration Files

### Prometheus Configuration

**Located at**: `monitoring/prometheus.yml`

```yaml
# See ../operations/02_monitoring_alerting.md for complete configuration
# This file should be copied to ~/stacks/monitoring-config/ on manager node

global:
  scrape_interval: 15s
  evaluation_interval: 15s

alerting:
  alertmanagers:
    - static_configs:
        - targets: ['alertmanager:9093']

rule_files:
  - /etc/prometheus/alert-rules.yml

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'node-exporter'
    dns_sd_configs:
      - names: ['tasks.node-exporter']
        type: 'A'
        port: 9100

  - job_name: 'cadvisor'
    dns_sd_configs:
      - names: ['tasks.cadvisor']
        type: 'A'
        port: 8080

  - job_name: 'maplepress-backend'
    static_configs:
      - targets: ['maplepress-backend:8000']
    metrics_path: '/metrics'
```

### Alert Rules

**Located at**: `monitoring/alert-rules.yml`

See `../operations/02_monitoring_alerting.md` for complete alert rule configurations.

### Grafana Dashboards

**Dashboard exports** (JSON format) should be stored in `monitoring/grafana-dashboards/`.

**To import:**
1. Access Grafana via SSH tunnel: `ssh -L 3000:localhost:3000 dockeradmin@<manager-ip>`
2. Open http://localhost:3000
3. Dashboards → Import → Upload JSON file

**Recommended dashboards:**
- Infrastructure Overview (node metrics, disk, CPU, memory)
- MaplePress Application (HTTP metrics, errors, latency)
- Database Metrics (Cassandra, Redis, Meilisearch)

---

## CI/CD Pipelines

### GitHub Actions Example

**File:** `ci-cd/github-actions.yml`

```yaml
name: Deploy to Production

on:
  push:
    branches:
      - main
    paths:
      - 'cloud/mapleopentech-backend/**'

jobs:
  build-and-deploy:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Set up Go
        uses: actions/setup-go@v4
        with:
          go-version: '1.21'

      - name: Run tests
        run: |
          cd cloud/mapleopentech-backend
          go test ./...

      - name: Install doctl
        uses: digitalocean/action-doctl@v2
        with:
          token: ${{ secrets.DIGITALOCEAN_TOKEN }}

      - name: Build and push Docker image
        run: |
          cd cloud/mapleopentech-backend
          doctl registry login
          docker build -t registry.digitalocean.com/ssp/maplepress_backend:prod .
          docker push registry.digitalocean.com/ssp/maplepress_backend:prod

      - name: Deploy to production
        uses: appleboy/ssh-action@master
        with:
          host: ${{ secrets.MANAGER_IP }}
          username: dockeradmin
          key: ${{ secrets.SSH_PRIVATE_KEY }}
          script: |
            # Force pull on worker-6
            ssh dockeradmin@${{ secrets.WORKER_6_IP }} \
              "docker pull registry.digitalocean.com/ssp/maplepress_backend:prod"

            # Redeploy stack
            cd ~/stacks
            docker stack rm maplepress
            sleep 10
            docker config rm maplepress_caddyfile || true
            docker stack deploy -c maplepress-stack.yml maplepress

            # Wait and verify
            sleep 30
            docker service ps maplepress_backend | head -5

      - name: Health check
        run: |
          curl -f https://getmaplepress.ca/health || exit 1

      - name: Notify deployment
        if: always()
        uses: 8398a7/action-slack@v3
        with:
          status: ${{ job.status }}
          text: 'Backend deployment ${{ job.status }}'
          webhook_url: ${{ secrets.SLACK_WEBHOOK }}
```

### GitLab CI Example

**File:** `ci-cd/gitlab-ci.yml`

```yaml
stages:
  - test
  - build
  - deploy

variables:
  DOCKER_IMAGE: registry.digitalocean.com/ssp/maplepress_backend
  DOCKER_TAG: prod

test:
  stage: test
  image: golang:1.21
  script:
    - cd cloud/mapleopentech-backend
    - go test ./...

build:
  stage: build
  image: docker:latest
  services:
    - docker:dind
  before_script:
    - docker login registry.digitalocean.com -u $DIGITALOCEAN_TOKEN -p $DIGITALOCEAN_TOKEN
  script:
    - cd cloud/mapleopentech-backend
    - docker build -t $DOCKER_IMAGE:$DOCKER_TAG .
    - docker push $DOCKER_IMAGE:$DOCKER_TAG
  only:
    - main

deploy:
  stage: deploy
  image: alpine:latest
  before_script:
    - apk add --no-cache openssh-client
    - eval $(ssh-agent -s)
    - echo "$SSH_PRIVATE_KEY" | tr -d '\r' | ssh-add -
    - mkdir -p ~/.ssh
    - chmod 700 ~/.ssh
    # Both hosts are contacted over SSH below, so record both host keys
    - ssh-keyscan -H $MANAGER_IP $WORKER_6_IP >> ~/.ssh/known_hosts
  script:
    # Force pull on worker-6
    - ssh dockeradmin@$WORKER_6_IP "docker pull $DOCKER_IMAGE:$DOCKER_TAG"

    # Redeploy stack
    - |
      ssh dockeradmin@$MANAGER_IP << 'EOF'
      cd ~/stacks
      docker stack rm maplepress
      sleep 10
      docker config rm maplepress_caddyfile || true
      docker stack deploy -c maplepress-stack.yml maplepress
      EOF

    # Verify deployment
    - sleep 30
    - ssh dockeradmin@$MANAGER_IP "docker service ps maplepress_backend | head -5"

    # Health check
    - apk add --no-cache curl
    - curl -f https://getmaplepress.ca/health
  only:
    - main
  environment:
    name: production
    url: https://getmaplepress.ca
```

---

## Usage Examples

### Running Scripts Manually

```bash
# Backup all services
ssh dockeradmin@<manager-ip>
sudo /usr/local/bin/backup-all.sh

# Health check
ssh dockeradmin@<manager-ip>
sudo /usr/local/bin/health-check.sh
echo "Exit code: $?"
# 0 = healthy, 1 = warnings, 2 = critical

# Deploy backend
cd ~/monorepo/cloud/infrastructure/production
./automation/scripts/deploy-backend.sh prod

# Deploy frontend
./automation/scripts/deploy-frontend.sh
```

### Scheduling Scripts with Cron

```bash
# Edit crontab on manager
ssh dockeradmin@<manager-ip>
sudo crontab -e

# Add these lines:

# Backup all services daily at 2 AM
0 2 * * * /usr/local/bin/backup-all.sh >> /var/log/backup-all.log 2>&1

# Health check every hour
0 * * * * /usr/local/bin/health-check.sh >> /var/log/health-check.log 2>&1

# Docker cleanup weekly (Sunday 3 AM)
0 3 * * 0 /usr/local/bin/cleanup-docker.sh >> /var/log/docker-cleanup.log 2>&1

# Secret rotation monthly (1st of month, 4 AM)
0 4 1 * * /usr/local/bin/rotate-secrets.sh >> /var/log/secret-rotation.log 2>&1
```

### Monitoring Script Execution

```bash
# View cron logs
sudo grep CRON /var/log/syslog | tail -20

# View specific script logs
tail -f /var/log/backup-all.log
tail -f /var/log/health-check.log

# Check a script's exit code ($? holds the status of the most recent command,
# so read it immediately after running the script)
sudo /usr/local/bin/backup-all.sh
echo "Last backup exit code: $?"
```

---

## Best Practices

### Script Development

1. **Use `set -e`**: Exit on first error (`health-check.sh` omits it, which lets the remaining checks run after one fails)
2. **Log everything**: Redirect to `/var/log/`
3. **Use exit codes**: 0=success, 1=warning, 2=critical
4. **Idempotent**: Safe to run multiple times
5. **Document**: Comments and usage instructions
6. **Test**: Verify on staging before production
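
As a rough starting point, a new script following these practices might be laid out like the sketch below. Everything named here (`example-task`, `LOG_FILE`, `WORK_DIR`, the `log` helper) is hypothetical and only illustrates the structure; it also adds `-u` and `-o pipefail` to `set -e`, which is a stricter choice than the scripts in this README make.

```bash
#!/bin/bash
# Purpose: Template sketch for a new automation script (hypothetical example)
# Usage: ./example-task.sh
# Exit codes: 0=success, 1=warning, 2=critical

set -euo pipefail

# Assumed defaults; override through the environment
LOG_FILE="${LOG_FILE:-/var/log/example-task.log}"
WORK_DIR="${WORK_DIR:-/tmp/example-task}"

log() {
    local line
    line="$(date '+%Y-%m-%d %H:%M:%S') $*"
    echo "$line"
    # Also append to the log file when it is writable; otherwise stdout only
    { echo "$line" >> "$LOG_FILE"; } 2>/dev/null || true
}

main() {
    log "Starting example task"
    # Idempotent step: mkdir -p succeeds whether or not the directory exists
    mkdir -p "$WORK_DIR"
    log "Finished example task"
}

main "$@"
```

Running the script twice in a row gives the same result, which is what makes it safe to schedule from cron.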

### Secret Management

**Never hardcode secrets in scripts!**

```bash
# ❌ Bad
REDIS_PASSWORD="mysecret123"

# ✅ Good
REDIS_PASSWORD=$(docker exec redis cat /run/secrets/redis_password)

# ✅ Even better
REDIS_PASSWORD=$(cat /run/secrets/redis_password 2>/dev/null || echo "")
if [ -z "$REDIS_PASSWORD" ]; then
    echo "Error: Redis password not found"
    exit 1
fi
```
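
When several scripts need secrets, the read-and-verify pattern above can be wrapped in a small helper. The sketch below assumes secrets are files under `/run/secrets`. The function name `read_secret` and the `SECRETS_DIR` override are hypothetical and exist only for illustration.

```bash
#!/bin/bash
# Sketch: read a secret from a file and fail loudly if it is missing or empty.
# read_secret and SECRETS_DIR are hypothetical names, not part of the repository.

SECRETS_DIR="${SECRETS_DIR:-/run/secrets}"

read_secret() {
    local name="$1"
    local path="${SECRETS_DIR}/${name}"
    local value

    if [ ! -r "$path" ]; then
        echo "Error: secret '$name' not readable at $path" >&2
        return 1
    fi

    value=$(cat "$path")
    if [ -z "$value" ]; then
        echo "Error: secret '$name' is empty" >&2
        return 1
    fi

    # Print without a trailing newline so callers get the exact value
    printf '%s' "$value"
}
```

A script could then write `REDIS_PASSWORD=$(read_secret redis_password) || exit 1`. The error messages go to stderr, so they never end up inside the captured value.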

### Error Handling

```bash
# Check command success
if ! docker service ls > /dev/null 2>&1; then
    echo "Error: Cannot connect to Docker"
    exit 2
fi

# Trap errors
trap 'echo "Script failed on line $LINENO"' ERR

# Verify prerequisites
for COMMAND in docker ssh s3cmd; do
    if ! command -v "$COMMAND" &> /dev/null; then
        echo "Error: $COMMAND not found"
        exit 1
    fi
done
```
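
The prerequisite loop stops at the first missing tool. A variant that reports every missing command in one pass can save a few round trips when setting up a new node. The sketch below uses a hypothetical function name, `require_commands`.

```bash
#!/bin/bash
# Sketch: verify that all required commands exist, reporting every missing one.
# require_commands is a hypothetical helper for illustration.

require_commands() {
    local missing=0
    local cmd

    for cmd in "$@"; do
        if ! command -v "$cmd" > /dev/null 2>&1; then
            echo "Error: $cmd not found" >&2
            missing=$((missing + 1))
        fi
    done

    # Succeed only when nothing was missing
    [ "$missing" -eq 0 ]
}
```

Near the top of a script this could be used as `require_commands docker ssh s3cmd || exit 1`.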

---

## Troubleshooting

### Script Won't Execute

```bash
# Check permissions
ls -la /usr/local/bin/script.sh
# Should be: -rwxr-xr-x (executable)

# Fix permissions
sudo chmod +x /usr/local/bin/script.sh

# Check shebang
head -1 /usr/local/bin/script.sh
# Should be: #!/bin/bash
```

### Cron Job Not Running

```bash
# Check cron service
sudo systemctl status cron

# Check cron logs
sudo grep CRON /var/log/syslog | tail -20

# Test cron environment: add this line temporarily with `sudo crontab -e`
# * * * * * /usr/bin/env > /tmp/cron-env.txt
# Wait 1 minute, then inspect the result
cat /tmp/cron-env.txt
```

### SSH Issues in Scripts

```bash
# Add SSH keys to ssh-agent
eval "$(ssh-agent)"
ssh-add ~/.ssh/id_rsa

# Disable strict host checking (only for internal network)
ssh -o StrictHostKeyChecking=no user@host "command"

# Use SSH config
cat >> ~/.ssh/config << EOF
Host worker-*
    StrictHostKeyChecking no
    UserKnownHostsFile=/dev/null
EOF
```

---

## Contributing

**When adding new automation:**

1. Place scripts in `automation/scripts/`
2. Document usage in header comments
3. Follow naming convention: `verb-noun.sh`
4. Test thoroughly on staging
5. Update this README with script description
6. Add to appropriate cron schedule if applicable

---

## Future Automation Ideas

**Not yet implemented, but good candidates:**

- [ ] Automatic SSL certificate monitoring (separate from Caddy)
- [ ] Database performance metrics collection
- [ ] Automated capacity planning reports
- [ ] Self-healing scripts (restart failed services)
- [ ] Traffic spike detection and auto-scaling
- [ ] Automated security vulnerability scanning
- [ ] Log aggregation and analysis
- [ ] Cost optimization recommendations

---

**Last Updated**: January 2025
**Maintained By**: Infrastructure Team

**Note**: Scripts in this directory are templates. Customize IP addresses, domains, and credentials for your specific environment before use.