echo6-docs/projects/cc-deploy-watchtower-v2.md
Matt Johnson e9231ac24a Migration: consolidate Echo6 docs to cortex with full infrastructure cleanup sync
2026-04-13 06:02:16 +00:00

Deploy WATCHTOWER v2 — Modular Ops Dashboard

Context: CC runs on cortex. WATCHTOWER deploys to Contabo (100.64.0.1). The tarball is at /home/zvx/projects/contabo/watchtower/watchtower-v2.tar.gz on cortex. This runbook is at /home/zvx/.ref/projects/ on cortex.

WATCHTOWER v2 is a modular FastAPI monitoring dashboard. Collectors are auto-discovered from app/collectors/ and enabled via {NAME}_ENABLED=true in .env. Adding a new monitoring target requires no edits to existing code: only a new collector file and its .env entries.
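To see at a glance which collectors a given .env would enable, a small helper along these lines may be useful. It is only a sketch of the {NAME}_ENABLED=true naming convention described above, not WATCHTOWER's own registry code; the function name `enabled_collectors` is made up for illustration.

```shell
# List collectors enabled in a .env-style file, one lowercase name per line.
# Assumption: flags are written as NAME_ENABLED=true, one per line.
enabled_collectors() {
    env_file="$1"
    # Keep only NAME_ENABLED=true lines, strip the suffix, lowercase the name.
    sed -n 's/^\([A-Z0-9_]*\)_ENABLED=true[[:space:]]*$/\1/p' "$env_file" | tr 'A-Z' 'a-z'
}
```

Usage on Contabo would be `enabled_collectors /opt/watchtower/.env`.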

Pre-flight: Transfer tarball and SSH check

# SCP tarball from cortex (this machine) to Contabo
scp /home/zvx/projects/contabo/watchtower/watchtower-v2.tar.gz zvx@100.64.0.1:/tmp/

# Install sshpass on Contabo if it is missing
ssh zvx@100.64.0.1 "which sshpass || sudo apt-get install -y sshpass"

# Test SSH from Contabo to each monitored node
ssh zvx@100.64.0.1 << 'SSHEOF'
echo "=== PeerTube (100.64.0.23) ==="
sshpass -p '7redditGold' ssh -o StrictHostKeyChecking=no zvx@100.64.0.23 "hostname && echo OK" 2>&1

echo "=== cortex/GPU (100.64.0.14) ==="
sshpass -p '7redditGold' ssh -o StrictHostKeyChecking=no zvx@100.64.0.14 "hostname && echo OK" 2>&1
SSHEOF

If an SSH test fails, report the error. Do not proceed unless SSH works to at least one target.
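The "at least one target" gate can be sketched as a helper that runs a probe against each host and fails only when none succeed. The names `preflight` and the probe argument are hypothetical; the probe would be a wrapper around the sshpass test shown above.

```shell
# Run a probe command against each host; succeed only if at least one probe passes.
# $1 is the probe command (hypothetical wrapper around the ssh test),
# the remaining arguments are host addresses.
preflight() {
    probe="$1"; shift
    ok=0
    for host in "$@"; do
        if "$probe" "$host" >/dev/null 2>&1; then
            echo "OK   $host"
            ok=$((ok + 1))
        else
            echo "FAIL $host"
        fi
    done
    # Exit status reflects whether any target was reachable.
    [ "$ok" -gt 0 ]
}
```

Every FAIL line should still be reported, even when the overall gate passes.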


Phase 1: Deploy codebase

All remaining commands run on Contabo. SSH in:

ssh zvx@100.64.0.1

Then:

# Clean any old install
sudo rm -rf /opt/watchtower

# Extract v2 tarball
sudo tar xzf /tmp/watchtower-v2.tar.gz -C /opt/
sudo mv /opt/watchtower-v2 /opt/watchtower
sudo chown -R $USER:$USER /opt/watchtower

cd /opt/watchtower

Create .env from example

cp .env.example .env

The defaults in .env.example are already set to the correct current values:

| Target | IP | User | Notes |
|---|---|---|---|
| GPU (cortex) | 100.64.0.14 | zvx | nvidia-smi |
| PeerTube | 100.64.0.23 | zvx | Native PostgreSQL (peertube_prod), pipeline at /opt/bulk-import/ |
| RECON | disabled | | Flip RECON_ENABLED=true when rebuilt |

Verify PeerTube PostgreSQL access

PostgreSQL runs natively on the PeerTube CT (not in Docker). Verify:

sshpass -p '7redditGold' ssh zvx@100.64.0.23 "sudo -u postgres psql -d peertube_prod -t -A -c 'SELECT COUNT(*) FROM video;'"

This should return the video count (e.g., 207). If it errors, the database name may differ; list the databases with:

sshpass -p '7redditGold' ssh zvx@100.64.0.23 "sudo -u postgres psql -l" 

Update PT_DB_NAME in .env if needed.
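If the name does need changing, the edit can be scripted rather than done by hand. This is a minimal sketch; `set_env_var` is a hypothetical helper, and it assumes the value contains no `|`, `&`, or backslash characters (true for typical database names).

```shell
# Set KEY=VALUE in a .env-style file: replace the existing line or append a new one.
# Assumption: VALUE contains no '|', '&', or backslash characters.
set_env_var() {
    file="$1"; key="$2"; value="$3"
    if grep -q "^${key}=" "$file"; then
        # Write to a temp file so the edit works with any sed implementation.
        sed "s|^${key}=.*|${key}=${value}|" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}
```

For example, `set_env_var /opt/watchtower/.env PT_DB_NAME <actual_db_name>`, using whatever name `psql -l` reported.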

Verify bulk-import pipeline paths

sshpass -p '7redditGold' ssh zvx@100.64.0.23 "ls -la /opt/bulk-import/ 2>/dev/null && wc -l /opt/bulk-import/downloaded.txt 2>/dev/null || echo 'PATH NOT FOUND'"

Phase 2: Build and start

cd /opt/watchtower

docker compose up -d --build

# Wait for startup then check logs
sleep 5
docker logs watchtower 2>&1 | tail -30
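A fixed sleep can be too short on a slow build. One alternative is to poll until the service answers; the `retry` helper below is a hypothetical sketch, not part of the WATCHTOWER tarball.

```shell
# Run a command until it succeeds, up to N attempts, sleeping D seconds between tries.
# Usage: retry ATTEMPTS DELAY command [args...]
retry() {
    attempts="$1"; delay="$2"; shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        "$@" && return 0
        i=$((i + 1))
        # Sleep only if another attempt follows.
        if [ "$i" -le "$attempts" ]; then sleep "$delay"; fi
    done
    return 1
}
```

For example, `retry 12 5 curl -sf -o /dev/null http://localhost:8084/api/health` would wait up to about a minute for the health endpoint before giving up.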

Expected log output

WATCHTOWER starting up...
Database connected: /data/watchtower.db
[registry] Loaded collector: gpu (GPU (cortex))
[registry] Loaded collector: peertube (PeerTube Ingest)
[registry] Skipped collector: recon (RECON_ENABLED=false)
[registry] 2 collector(s) active: ['gpu', 'peertube']
[gpu] collector starting (interval: 60s)
[peertube] collector starting (interval: 60s)
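To compare the real logs against the expected output without reading them line by line, the loaded collector names can be extracted. This sketch assumes the log lines keep the "[registry] Loaded collector: name (...)" shape shown above; `loaded_collectors` is a made-up name.

```shell
# Read container logs on stdin and print the name from each "Loaded collector:" line.
# Assumption: log lines match the format in the expected output above.
loaded_collectors() {
    sed -n 's/.*\[registry\] Loaded collector: \([^ ]*\).*/\1/p'
}
```

Usage: `docker logs watchtower 2>&1 | loaded_collectors` should print gpu and peertube, one per line, with the default .env.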

Verify collectors

# Wait for first poll cycle
sleep 65

echo "=== Health ==="
curl -s http://localhost:8084/api/health | python3 -m json.tool

echo "=== Collector Manifest ==="
curl -s http://localhost:8084/api/collectors | python3 -m json.tool

echo "=== GPU Data ==="
curl -s http://localhost:8084/api/c/gpu | python3 -m json.tool

echo "=== PeerTube Data ==="
curl -s http://localhost:8084/api/c/peertube | python3 -m json.tool
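The per-collector checks can be condensed into a one-line status per collector. This is a rough sketch: it assumes each /api/c/<name> response contains an "online" field somewhere in the payload, and it matches on text rather than parsing the JSON, so treat the result as a quick indicator only. Both function names are hypothetical.

```shell
# Succeed if the JSON on stdin contains "online": true anywhere.
# Assumption: the collector payload exposes an "online" boolean.
is_online() {
    grep -Eq '"online"[[:space:]]*:[[:space:]]*true'
}

# Print one status line per collector name given as an argument.
# Assumption: endpoints follow the /api/c/<name> pattern used above.
report_collectors() {
    for name in "$@"; do
        if curl -s "http://localhost:8084/api/c/$name" | is_online; then
            echo "$name online"
        else
            echo "$name OFFLINE"
        fi
    done
}
```

Usage: `report_collectors gpu peertube`.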

STOP — Report collector status

Tell me:

  1. Which collectors show "online": true
  2. Any errors from the logs or API responses
  3. The PeerTube DB name if it wasn't peertube_prod

Do not proceed to Phase 3 until collectors are confirmed.


Phase 3: Public access (Caddy + Authentik)

Check DNS

dig +short wt.echo6.co

If it doesn't resolve, report that; the DNS record must be added manually.

Check/deploy Caddy config

Caddy is at 100.64.0.8 on the mesh.

echo "=== Check existing config ==="
sshpass -p '7redditGold' ssh zvx@100.64.0.8 "cat ~/docker/caddy/sites/wt.echo6.co* 2>/dev/null || echo 'NO CONFIG FOUND'"

echo "=== Check Caddy is running ==="
sshpass -p '7redditGold' ssh zvx@100.64.0.8 "docker ps --format '{{.Names}}' | grep -i caddy"

If no config exists, create it:

sshpass -p '7redditGold' ssh zvx@100.64.0.8 "cat > ~/docker/caddy/sites/wt.echo6.co.caddy << 'CADDYEOF'
wt.echo6.co {
    forward_auth localhost:9000 {
        uri /outpost.goauthentik.io/auth/caddy
        copy_headers X-Authentik-Username X-Authentik-Groups X-Authentik-Email X-Authentik-Name X-Authentik-Uid
        trusted_proxies private_ranges
    }
    reverse_proxy 100.64.0.1:8084
}
CADDYEOF"

If config already exists, verify the reverse_proxy line points to 100.64.0.1:8084 (Contabo's current Tailscale IP). If it still says 100.64.0.6, fix it:

sshpass -p '7redditGold' ssh zvx@100.64.0.8 "sed -i 's/100\.64\.0\.6:8084/100.64.0.1:8084/' ~/docker/caddy/sites/wt.echo6.co.caddy"
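The check-then-fix logic can also be wrapped so that it reports what it found and never touches a config that points somewhere unexpected. The sketch below operates on a local file path, so it would have to run on the Caddy host or against a copy of the config; `fix_upstream` and `escape_dots` are hypothetical names.

```shell
# Escape dots so an address is matched literally, not as a regex wildcard.
escape_dots() { printf '%s' "$1" | sed 's/\./\\./g'; }

# Point reverse_proxy at the new upstream only if the old one is present.
# Usage: fix_upstream FILE OLD_ADDR NEW_ADDR
fix_upstream() {
    file="$1"
    old_re=$(escape_dots "$2")
    new_re=$(escape_dots "$3")
    if grep -q "reverse_proxy ${old_re}\$" "$file"; then
        sed "s/reverse_proxy ${old_re}\$/reverse_proxy $3/" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
        echo "updated"
    elif grep -q "reverse_proxy ${new_re}\$" "$file"; then
        echo "already correct"
    else
        # Neither address found: leave the file alone and report.
        echo "unexpected upstream"
        return 1
    fi
}
```

Usage: `fix_upstream ~/docker/caddy/sites/wt.echo6.co.caddy 100.64.0.6:8084 100.64.0.1:8084`.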

Reload Caddy

sshpass -p '7redditGold' ssh zvx@100.64.0.8 "docker exec caddy caddy reload --config /etc/caddy/Caddyfile"

Test

curl -sI https://wt.echo6.co 2>&1 | head -10

Expect a 302 redirect to Authentik, or a 200 if the request is already authenticated.
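The pass/fail decision on the response can be made explicit with a small classifier over the header output. This is an illustrative sketch; `classify_response` is a made-up name, and any status other than 302 or 200 is treated as a failure to report.

```shell
# Read `curl -sI` output on stdin and classify the first status line.
classify_response() {
    # The status code is the second field of the first line, e.g. "HTTP/2 302".
    code=$(head -n 1 | tr -d '\r' | awk '{print $2}')
    case "$code" in
        302) echo "redirect (expected: login required)" ;;
        200) echo "ok (already authenticated)" ;;
        *)   echo "unexpected status: ${code:-none}"; return 1 ;;
    esac
}
```

Usage: `curl -sI https://wt.echo6.co | classify_response`.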


Post-deploy: How updates work

Code is volume-mounted from /opt/watchtower/app/ into the container on Contabo. To update:

ssh zvx@100.64.0.1
cd /opt/watchtower
# Edit files or git pull
docker restart watchtower

No rebuild needed for code changes. Only rebuild (docker compose up -d --build) if requirements.txt or Dockerfile changes.
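The restart-versus-rebuild rule can be expressed as a check over the list of changed files. A sketch, assuming the changed paths are available one per line (for instance from git); `update_action` is a hypothetical name.

```shell
# Read changed file paths on stdin (one per line) and print the required action.
update_action() {
    if grep -Eq '(^|/)(requirements\.txt|Dockerfile)$'; then
        echo "rebuild"   # docker compose up -d --build
    else
        echo "restart"   # docker restart watchtower
    fi
}
```

If /opt/watchtower is a git checkout, `git diff --name-only ORIG_HEAD HEAD | update_action` after a pull would print which action applies.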

Post-deploy: Adding a new collector

  1. Copy app/collectors/_example.py to app/collectors/myservice.py
  2. Edit the class: set name, display_name, implement fetch()
  3. Add to .env: MYSERVICE_ENABLED=true plus any config vars
  4. docker restart watchtower

The frontend auto-discovers the new panel. No HTML/JS/route edits needed.
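Steps 1 and 3 above are mechanical and can be scripted; editing the class (step 2) and any extra config vars remain manual. The helper below is a sketch with a hypothetical name, `scaffold_collector`, and it refuses to overwrite an existing collector file.

```shell
# Scaffold a new collector: copy the example module and enable it in .env.
# $1 is the WATCHTOWER root (e.g. /opt/watchtower), $2 the collector name in lowercase.
scaffold_collector() {
    root="$1"; name="$2"
    flag="$(printf '%s' "$name" | tr 'a-z' 'A-Z')_ENABLED"
    target="$root/app/collectors/$name.py"
    if [ -e "$target" ]; then
        echo "refusing to overwrite $target" >&2
        return 1
    fi
    cp "$root/app/collectors/_example.py" "$target" || return 1
    # Append the flag only if it is not already present in .env.
    grep -q "^${flag}=" "$root/.env" 2>/dev/null || printf '%s=true\n' "$flag" >> "$root/.env"
}
```

Usage: `scaffold_collector /opt/watchtower myservice`, then edit app/collectors/myservice.py and run `docker restart watchtower`.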