Deploy WATCHTOWER v2 — Modular Ops Dashboard
Context: CC runs on cortex. WATCHTOWER deploys to Contabo (100.64.0.1). The tarball is at /home/zvx/projects/contabo/watchtower/watchtower-v2.tar.gz on cortex. This runbook is at /home/zvx/.ref/projects/ on cortex.
WATCHTOWER v2 is a modular FastAPI monitoring dashboard. Collectors are auto-discovered from app/collectors/ and enabled via {NAME}_ENABLED=true in .env. Adding new monitoring targets requires zero edits to existing files.
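The enable-by-flag rule can be illustrated with a minimal Python sketch. This is not WATCHTOWER's registry code: the helper name `active_collectors` is hypothetical, and skipping files that begin with an underscore (such as `_example.py`) is an assumption made for illustration.

```python
from typing import Iterable, Mapping


def active_collectors(filenames: Iterable[str], env: Mapping[str, str]) -> list[str]:
    """Return collector names whose {NAME}_ENABLED flag is 'true' in env.

    Hypothetical helper: files starting with '_' are treated as templates
    and ignored, which is an assumption, not confirmed behaviour.
    """
    active = []
    for filename in sorted(filenames):
        if not filename.endswith(".py") or filename.startswith("_"):
            continue
        name = filename[:-3]  # strip the ".py" suffix
        flag = env.get(f"{name.upper()}_ENABLED", "false")
        if flag.strip().lower() == "true":
            active.append(name)
    return active


if __name__ == "__main__":
    files = ["_example.py", "gpu.py", "peertube.py", "recon.py"]
    env = {"GPU_ENABLED": "true", "PEERTUBE_ENABLED": "true", "RECON_ENABLED": "false"}
    print(active_collectors(files, env))
```

Under these assumptions, a new file in app/collectors/ only becomes active once its matching flag is set, which is why no existing file needs editing.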
Pre-flight: Transfer tarball and SSH check
# SCP tarball from cortex (this machine) to Contabo
scp /home/zvx/projects/contabo/watchtower/watchtower-v2.tar.gz zvx@100.64.0.1:/tmp/
# Verify sshpass is installed on Contabo
ssh zvx@100.64.0.1 "which sshpass || sudo apt-get install -y sshpass"
# Test SSH from Contabo to each monitored node
ssh zvx@100.64.0.1 << 'SSHEOF'
echo "=== PeerTube (100.64.0.23) ==="
sshpass -p '7redditGold' ssh -o StrictHostKeyChecking=no zvx@100.64.0.23 "hostname && echo OK" 2>&1
echo "=== cortex/GPU (100.64.0.14) ==="
sshpass -p '7redditGold' ssh -o StrictHostKeyChecking=no zvx@100.64.0.14 "hostname && echo OK" 2>&1
SSHEOF
If either SSH check fails, report the error. Do not proceed unless SSH works to at least one target.
Phase 1: Deploy codebase
All remaining commands run on Contabo. SSH in:
ssh zvx@100.64.0.1
Then:
# Clean any old install
sudo rm -rf /opt/watchtower
# Extract v2 tarball
sudo tar xzf /tmp/watchtower-v2.tar.gz -C /opt/
sudo mv /opt/watchtower-v2 /opt/watchtower
sudo chown -R $USER:$USER /opt/watchtower
cd /opt/watchtower
Create .env from example
cp .env.example .env
The defaults in .env.example are already set to the correct current values:
| Target | IP | User | Notes |
|---|---|---|---|
| GPU (cortex) | 100.64.0.14 | zvx | nvidia-smi |
| PeerTube | 100.64.0.23 | zvx | Native PostgreSQL (peertube_prod), pipeline at /opt/bulk-import/ |
| RECON | disabled | — | Flip RECON_ENABLED=true when rebuilt |
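Before building, it can help to confirm what .env actually contains. The sketch below is a minimal KEY=VALUE parser, not the loader WATCHTOWER uses. `parse_env` is a hypothetical helper, and `GPU_ENABLED` and `PEERTUBE_ENABLED` are inferred from the {NAME}_ENABLED pattern rather than read from .env.example.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blank lines and '#' comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes from the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


if __name__ == "__main__":
    # Assumed variable names, for illustration only.
    sample = "\n".join([
        "GPU_ENABLED=true",
        "PEERTUBE_ENABLED=true",
        "RECON_ENABLED=false",
        "PT_DB_NAME=peertube_prod",
    ])
    env = parse_env(sample)
    print(env["RECON_ENABLED"], env["PT_DB_NAME"])
```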
Verify PeerTube PostgreSQL access
PostgreSQL runs natively on the PeerTube CT (not in Docker). Verify:
sshpass -p '7redditGold' ssh zvx@100.64.0.23 "sudo -u postgres psql -d peertube_prod -t -A -c 'SELECT COUNT(*) FROM video;'"
This should return the video count (e.g., 207). If it errors, the database name may differ; list the available databases with:
sshpass -p '7redditGold' ssh zvx@100.64.0.23 "sudo -u postgres psql -l"
Update PT_DB_NAME in .env if needed.
Verify bulk-import pipeline paths
sshpass -p '7redditGold' ssh zvx@100.64.0.23 "ls -la /opt/bulk-import/ 2>/dev/null && wc -l /opt/bulk-import/downloaded.txt 2>/dev/null || echo 'PATH NOT FOUND'"
Phase 2: Build and start
cd /opt/watchtower
docker compose up -d --build
# Wait for startup then check logs
sleep 5
docker logs watchtower 2>&1 | tail -30
Expected log output
WATCHTOWER starting up...
Database connected: /data/watchtower.db
[registry] Loaded collector: gpu (GPU (cortex))
[registry] Loaded collector: peertube (PeerTube Ingest)
[registry] Skipped collector: recon (RECON_ENABLED=false)
[registry] 2 collector(s) active: ['gpu', 'peertube']
[gpu] collector starting (interval: 60s)
[peertube] collector starting (interval: 60s)
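The registry summary line can also be checked programmatically. The following Python sketch extracts the active collector list from log text in the format shown above; `active_from_logs` is a hypothetical helper, and it assumes the summary line keeps exactly this shape.

```python
import ast
import re

# Matches e.g.: [registry] 2 collector(s) active: ['gpu', 'peertube']
ACTIVE_LINE = re.compile(r"\[registry\] (\d+) collector\(s\) active: (\[.*\])")


def active_from_logs(log_text: str) -> list[str]:
    """Return collector names from the registry summary line, or [] if absent."""
    for line in log_text.splitlines():
        match = ACTIVE_LINE.search(line)
        if match:
            # The list is printed as a Python literal, so literal_eval can read it.
            names = ast.literal_eval(match.group(2))
            return [str(n) for n in names]
    return []


if __name__ == "__main__":
    sample = "[registry] 2 collector(s) active: ['gpu', 'peertube']"
    print(active_from_logs(sample))
```

An empty result means the summary line was not found, which is worth treating as a failed startup.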
Verify collectors
# Wait for first poll cycle
sleep 65
echo "=== Health ==="
curl -s http://localhost:8084/api/health | python3 -m json.tool
echo "=== Collector Manifest ==="
curl -s http://localhost:8084/api/collectors | python3 -m json.tool
echo "=== GPU Data ==="
curl -s http://localhost:8084/api/c/gpu | python3 -m json.tool
echo "=== PeerTube Data ==="
curl -s http://localhost:8084/api/c/peertube | python3 -m json.tool
⛔ STOP — Report collector status
Tell me:
- Which collectors show "online": true
- Any errors from the logs or API responses
- The PeerTube DB name if it wasn't peertube_prod
Do not proceed to Phase 3 until collectors are confirmed.
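The status report can be gathered with a short Python sketch instead of reading the JSON by eye. It assumes each /api/c/<name> response is a JSON object with a top-level "online" boolean; the runbook mentions the "online" field but not the exact response shape, so adjust `is_online` if the real payload differs. The function names are hypothetical.

```python
import json
from urllib.request import urlopen

BASE_URL = "http://localhost:8084"  # port taken from the curl commands above


def is_online(payload) -> bool:
    """True only for a JSON object with a top-level "online": true (assumed shape)."""
    return isinstance(payload, dict) and payload.get("online") is True


def fetch_collector(name: str, base_url: str = BASE_URL):
    """GET /api/c/<name> and decode the JSON body."""
    with urlopen(f"{base_url}/api/c/{name}", timeout=10) as response:
        return json.load(response)


if __name__ == "__main__":
    for name in ("gpu", "peertube"):
        try:
            status = "online" if is_online(fetch_collector(name)) else "NOT online"
        except (OSError, ValueError) as exc:
            # OSError covers connection/HTTP errors; ValueError covers bad JSON.
            status = f"error: {exc}"
        print(f"{name}: {status}")
```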
Phase 3: Public access (Caddy + Authentik)
Check DNS
dig +short wt.echo6.co
If it doesn't resolve, report that; the DNS record needs to be added manually.
Check/deploy Caddy config
Caddy is at 100.64.0.8 on the mesh.
echo "=== Check existing config ==="
sshpass -p '7redditGold' ssh zvx@100.64.0.8 "cat ~/docker/caddy/sites/wt.echo6.co* 2>/dev/null || echo 'NO CONFIG FOUND'"
echo "=== Check Caddy is running ==="
sshpass -p '7redditGold' ssh zvx@100.64.0.8 "docker ps --format '{{.Names}}' | grep -i caddy"
If no config exists, create it:
sshpass -p '7redditGold' ssh zvx@100.64.0.8 "cat > ~/docker/caddy/sites/wt.echo6.co.caddy << 'CADDYEOF'
wt.echo6.co {
    forward_auth localhost:9000 {
        uri /outpost.goauthentik.io/auth/caddy
        copy_headers X-Authentik-Username X-Authentik-Groups X-Authentik-Email X-Authentik-Name X-Authentik-Uid
        trusted_proxies private_ranges
    }
    reverse_proxy 100.64.0.1:8084
}
CADDYEOF"
If the config already exists, verify that the reverse_proxy line points to 100.64.0.1:8084 (Contabo's current Tailscale IP). If it still says 100.64.0.6, fix it:
sshpass -p '7redditGold' ssh zvx@100.64.0.8 "sed -i 's/100.64.0.6:8084/100.64.0.1:8084/' ~/docker/caddy/sites/wt.echo6.co.caddy"
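The upstream check can be sketched in Python by scanning the site config text for reverse_proxy directives. This is a simple line-based match, not a Caddyfile parser, and both function names are hypothetical. It only looks at the first upstream on each reverse_proxy line.

```python
import re

EXPECTED_UPSTREAM = "100.64.0.1:8084"  # Contabo address and port from this runbook


def reverse_proxy_upstreams(caddy_config: str) -> list[str]:
    """Return the first upstream of every reverse_proxy directive found."""
    return re.findall(r"^\s*reverse_proxy\s+(\S+)", caddy_config, flags=re.MULTILINE)


def points_at_contabo(caddy_config: str) -> bool:
    """True when at least one reverse_proxy exists and all target the expected upstream."""
    upstreams = reverse_proxy_upstreams(caddy_config)
    return bool(upstreams) and all(u == EXPECTED_UPSTREAM for u in upstreams)


if __name__ == "__main__":
    stale = "wt.echo6.co {\n    reverse_proxy 100.64.0.6:8084\n}\n"
    print(reverse_proxy_upstreams(stale), points_at_contabo(stale))
```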
Reload Caddy
sshpass -p '7redditGold' ssh zvx@100.64.0.8 "docker exec caddy caddy reload --config /etc/caddy/Caddyfile"
Test
curl -sI https://wt.echo6.co 2>&1 | head -10
Expect a 302 redirect to Authentik, or a 200 if already authenticated.
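The same check can be scripted. The Python sketch below sends one HEAD request without following redirects and maps the status code onto the two outcomes described here; the function names are hypothetical, and any other status is simply reported as unexpected.

```python
import http.client


def describe_status(status: int) -> str:
    """Map an HTTP status onto the outcomes this runbook expects."""
    if status == 302:
        return "redirect to Authentik (not logged in)"
    if status == 200:
        return "OK (already authenticated)"
    return f"unexpected status {status}"


def head_status(host: str, path: str = "/") -> int:
    """Send one HEAD request over HTTPS; http.client does not follow redirects."""
    conn = http.client.HTTPSConnection(host, timeout=10)
    try:
        conn.request("HEAD", path)
        return conn.getresponse().status
    finally:
        conn.close()


if __name__ == "__main__":
    try:
        print(describe_status(head_status("wt.echo6.co")))
    except (OSError, http.client.HTTPException) as exc:
        print(f"request failed: {exc}")
```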
Post-deploy: How updates work
Code is volume-mounted from /opt/watchtower/app/ into the container on Contabo. To update:
ssh zvx@100.64.0.1
cd /opt/watchtower
# Edit files or git pull
docker restart watchtower
No rebuild needed for code changes. Only rebuild (docker compose up -d --build) if requirements.txt or Dockerfile changes.
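The restart-versus-rebuild rule can be written down as a small decision function. This Python sketch is illustrative only; `update_command` is a hypothetical helper that takes a list of changed file paths and returns the command this runbook prescribes.

```python
from pathlib import PurePosixPath

# Files whose changes require an image rebuild, per this runbook.
REBUILD_TRIGGERS = {"requirements.txt", "Dockerfile"}


def update_command(changed_files: list[str]) -> str:
    """Pick the update command based on which files changed."""
    names = {PurePosixPath(path).name for path in changed_files}
    if names & REBUILD_TRIGGERS:
        return "docker compose up -d --build"
    return "docker restart watchtower"


if __name__ == "__main__":
    print(update_command(["app/collectors/gpu.py"]))
    print(update_command(["requirements.txt", "app/main.py"]))
```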
Post-deploy: Adding a new collector
- Copy app/collectors/_example.py to app/collectors/myservice.py
- Edit the class: set name, display_name, implement fetch()
- Add to .env: MYSERVICE_ENABLED=true plus any config vars
- docker restart watchtower
The frontend auto-discovers the new panel. No HTML/JS/route edits needed.
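As a rough picture of what the edited file might look like, here is a minimal Python sketch. It is not the contents of _example.py: the real template may require a base class, an async fetch(), or a specific return shape, all of which are unknown here. Only the attribute names name and display_name and the method name fetch() come from the steps above; everything else is hypothetical.

```python
class MyServiceCollector:
    """Hypothetical collector; the real template in _example.py may differ."""

    name = "myservice"           # would pair with MYSERVICE_ENABLED in .env
    display_name = "My Service"  # label for the dashboard panel

    def fetch(self) -> dict:
        # Placeholder data; a real collector would query its target here.
        # The "online" key mirrors the field checked during verification.
        return {"online": True, "items": 0}


if __name__ == "__main__":
    collector = MyServiceCollector()
    print(collector.name, collector.display_name, collector.fetch())
```

Start from the actual _example.py and treat this only as a reminder of which three pieces need to be filled in.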