Migration: consolidate Echo6 docs to cortex with full infrastructure cleanup sync
- Documents recent infrastructure cleanup (8 CTs destroyed, 35 DNS records removed, Headscale cleanup)
- Adds 24 new runbooks covering Authentik, PeerTube, Meshtastic, RECON, Proxmox, Mailcow, Internet Archive, GPU routing
- Adds project documentation for headscale, vaultwarden, peertube, matrix, mmud, advbbs, arr stack
- Updates services.md, environment.md, caddy.md, authentik.md to match live infrastructure
- Removes 4 deprecated runbook duplicates (canonical versions live in projects/)
- Adds .gitignore for binary archives and editor temp files

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
parent 89834796ff
commit e9231ac24a
93 changed files with 51223 additions and 254 deletions
223
projects/cc-deploy-watchtower-v2.md
Normal file
@@ -0,0 +1,223 @@
# Deploy WATCHTOWER v2 — Modular Ops Dashboard

**Context:** CC runs on cortex. WATCHTOWER deploys to Contabo (100.64.0.1). The tarball is at `/home/zvx/projects/contabo/watchtower/watchtower-v2.tar.gz` on cortex. This runbook is at `/home/zvx/.ref/projects/` on cortex.

WATCHTOWER v2 is a modular FastAPI monitoring dashboard. Collectors are auto-discovered from `app/collectors/` and enabled via `{NAME}_ENABLED=true` in `.env`. Adding new monitoring targets requires zero edits to existing files.
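
To make the discovery-plus-flag idea concrete, here is a minimal sketch of how such a registry could work. It is not the WATCHTOWER implementation: the function names, the rule that underscore-prefixed modules (such as `_example`) are skipped, and the throwaway directory standing in for `app/collectors/` are all assumptions made for illustration.

```python
import os
import pkgutil
import tempfile
from pathlib import Path


def discover_collector_names(collectors_dir):
    """Return module names found in collectors_dir, skipping _private ones."""
    return sorted(
        info.name
        for info in pkgutil.iter_modules([str(collectors_dir)])
        if not info.name.startswith("_")
    )


def enabled_collectors(names, env=None):
    """Keep only names whose {NAME}_ENABLED variable is set to 'true'."""
    env = os.environ if env is None else env
    return [
        name for name in names
        if env.get(f"{name.upper()}_ENABLED", "false").strip().lower() == "true"
    ]


# Demo against a throwaway directory standing in for app/collectors/
with tempfile.TemporaryDirectory() as tmp:
    for filename in ("gpu.py", "peertube.py", "recon.py", "_example.py"):
        Path(tmp, filename).write_text("")
    found = discover_collector_names(tmp)

# Hypothetical .env contents, passed as a dict instead of the real environment
demo_env = {"GPU_ENABLED": "true", "PEERTUBE_ENABLED": "true", "RECON_ENABLED": "false"}
active = enabled_collectors(found, demo_env)
print(f"{len(active)} collector(s) active: {active}")
```

Because the list of collectors comes from the directory listing and the on/off switch comes from the environment, dropping in a new module and adding one `.env` line is enough; nothing else has to be registered by hand.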

## Pre-flight: Transfer tarball and SSH check

```bash
# SCP tarball from cortex (this machine) to Contabo
scp /home/zvx/projects/contabo/watchtower/watchtower-v2.tar.gz zvx@100.64.0.1:/tmp/

# Verify sshpass is installed on Contabo (install it if missing)
ssh zvx@100.64.0.1 "which sshpass || sudo apt-get install -y sshpass"

# Test SSH from Contabo to each monitored node
ssh zvx@100.64.0.1 << 'SSHEOF'
echo "=== PeerTube (100.64.0.23) ==="
sshpass -p '7redditGold' ssh -o StrictHostKeyChecking=no zvx@100.64.0.23 "hostname && echo OK" 2>&1

echo "=== cortex/GPU (100.64.0.14) ==="
sshpass -p '7redditGold' ssh -o StrictHostKeyChecking=no zvx@100.64.0.14 "hostname && echo OK" 2>&1
SSHEOF
```

If either SSH check fails, stop and report the error. Do not proceed without working SSH to at least one target.

---

## Phase 1: Deploy codebase

All remaining commands run on Contabo. SSH in:

```bash
ssh zvx@100.64.0.1
```

Then:

```bash
# Clean any old install
sudo rm -rf /opt/watchtower

# Extract v2 tarball
sudo tar xzf /tmp/watchtower-v2.tar.gz -C /opt/
sudo mv /opt/watchtower-v2 /opt/watchtower
sudo chown -R "$USER:$USER" /opt/watchtower

cd /opt/watchtower
```

### Create .env from example

```bash
cp .env.example .env
```

The defaults in `.env.example` are already set to the correct current values:

| Target | IP | User | Notes |
|--------|-----|------|-------|
| GPU (cortex) | 100.64.0.14 | zvx | nvidia-smi |
| PeerTube | 100.64.0.23 | zvx | Native PostgreSQL (`peertube_prod`), pipeline at `/opt/bulk-import/` |
| RECON | disabled | n/a | Flip `RECON_ENABLED=true` when rebuilt |

### Verify PeerTube PostgreSQL access

PostgreSQL runs natively on the PeerTube CT (not in Docker). Verify:

```bash
sshpass -p '7redditGold' ssh zvx@100.64.0.23 "sudo -u postgres psql -d peertube_prod -t -A -c 'SELECT COUNT(*) FROM video;'"
```

This should return the video count (e.g., 207). If it errors, the database name may be different; list the databases with:

```bash
sshpass -p '7redditGold' ssh zvx@100.64.0.23 "sudo -u postgres psql -l"
```

Update `PT_DB_NAME` in `.env` if needed.

### Verify bulk-import pipeline paths

```bash
sshpass -p '7redditGold' ssh zvx@100.64.0.23 "ls -la /opt/bulk-import/ 2>/dev/null && wc -l /opt/bulk-import/downloaded.txt 2>/dev/null || echo 'PATH NOT FOUND'"
```

---

## Phase 2: Build and start

```bash
cd /opt/watchtower

docker compose up -d --build

# Wait for startup then check logs
sleep 5
docker logs watchtower 2>&1 | tail -30
```

### Expected log output

```
WATCHTOWER starting up...
Database connected: /data/watchtower.db
[registry] Loaded collector: gpu (GPU (cortex))
[registry] Loaded collector: peertube (PeerTube Ingest)
[registry] Skipped collector: recon (RECON_ENABLED=false)
[registry] 2 collector(s) active: ['gpu', 'peertube']
[gpu] collector starting (interval: 60s)
[peertube] collector starting (interval: 60s)
```

### Verify collectors

```bash
# Wait for first poll cycle
sleep 65

echo "=== Health ==="
curl -s http://localhost:8084/api/health | python3 -m json.tool

echo "=== Collector Manifest ==="
curl -s http://localhost:8084/api/collectors | python3 -m json.tool

echo "=== GPU Data ==="
curl -s http://localhost:8084/api/c/gpu | python3 -m json.tool

echo "=== PeerTube Data ==="
curl -s http://localhost:8084/api/c/peertube | python3 -m json.tool
```
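
The JSON returned by these endpoints can also be checked in a script rather than by eye. The sketch below assumes each `/api/c/<name>` response is a JSON object with a top-level `"online"` boolean, which is what the status report in this runbook suggests. The `error` field, the sample payloads, and the helper name `online_report` are illustrative assumptions, not taken from the WATCHTOWER code.

```python
import json


def online_report(payloads):
    """Map collector name -> True/False from parsed /api/c/<name> responses."""
    return {name: bool(data.get("online", False)) for name, data in payloads.items()}


# Hypothetical response bodies; only the "online" key comes from the runbook.
sample = {
    "gpu": json.loads('{"online": true}'),
    "peertube": json.loads('{"online": false, "error": "ssh timeout"}'),
}

report = online_report(sample)
for name, is_online in report.items():
    print(f"{name}: {'online' if is_online else 'OFFLINE'}")
```

In practice the dictionary would be filled from the live API, for example by saving each `curl` response to a file and loading it with `json.load`. A collector whose response lacks the key is treated as offline, which errs on the side of reporting a problem.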
### ⛔ STOP — Report collector status

Tell me:

1. Which collectors show `"online": true`
2. Any errors from the logs or API responses
3. The PeerTube DB name if it wasn't `peertube_prod`

Do not proceed to Phase 3 until collectors are confirmed.

---

## Phase 3: Public access (Caddy + Authentik)

### Check DNS

```bash
dig +short wt.echo6.co
```

If it doesn't resolve, report that; the DNS record needs to be added manually.

### Check/deploy Caddy config

Caddy is at 100.64.0.8 on the mesh.

```bash
echo "=== Check existing config ==="
sshpass -p '7redditGold' ssh zvx@100.64.0.8 "cat ~/docker/caddy/sites/wt.echo6.co* 2>/dev/null || echo 'NO CONFIG FOUND'"

echo "=== Check Caddy is running ==="
sshpass -p '7redditGold' ssh zvx@100.64.0.8 "docker ps --format '{{.Names}}' | grep -i caddy"
```

If no config exists, create it:

```bash
sshpass -p '7redditGold' ssh zvx@100.64.0.8 "cat > ~/docker/caddy/sites/wt.echo6.co.caddy << 'CADDYEOF'
wt.echo6.co {
    forward_auth localhost:9000 {
        uri /outpost.goauthentik.io/auth/caddy
        copy_headers X-Authentik-Username X-Authentik-Groups X-Authentik-Email X-Authentik-Name X-Authentik-Uid
        trusted_proxies private_ranges
    }
    reverse_proxy 100.64.0.1:8084
}
CADDYEOF"
```

If the config already exists, verify that the `reverse_proxy` line points to `100.64.0.1:8084` (Contabo's current Tailscale IP). If it still says `100.64.0.6`, fix it:

```bash
# Dots are escaped so the pattern matches only the literal old address
sshpass -p '7redditGold' ssh zvx@100.64.0.8 "sed -i 's/100\.64\.0\.6:8084/100.64.0.1:8084/' ~/docker/caddy/sites/wt.echo6.co.caddy"
```

### Reload Caddy

```bash
sshpass -p '7redditGold' ssh zvx@100.64.0.8 "docker exec caddy caddy reload --config /etc/caddy/Caddyfile"
```

### Test

```bash
curl -sI https://wt.echo6.co 2>&1 | head -10
```

This should return a 302 redirect to Authentik, or 200 if already authenticated.

---

## Post-deploy: How updates work

Code is volume-mounted from `/opt/watchtower/app/` into the container on Contabo. To update:

```bash
ssh zvx@100.64.0.1
cd /opt/watchtower
# Edit files or git pull
docker restart watchtower
```

No rebuild needed for code changes. Only rebuild (`docker compose up -d --build`) if `requirements.txt` or `Dockerfile` changes.
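
The restart-versus-rebuild rule can be written down as a small helper, for instance inside a deploy script. This is a sketch only: `update_command` and the example file paths are hypothetical, and the two commands are the ones quoted in this section.

```python
from pathlib import PurePosixPath

# Files whose changes alter the image itself rather than the mounted code
REBUILD_TRIGGERS = {"requirements.txt", "Dockerfile"}


def update_command(changed_files):
    """Pick the update command for a list of changed repo-relative paths."""
    names = {PurePosixPath(path).name for path in changed_files}
    if names & REBUILD_TRIGGERS:
        return "docker compose up -d --build"
    return "docker restart watchtower"


print(update_command(["app/collectors/gpu.py"]))
print(update_command(["requirements.txt", "app/main.py"]))
```

The helper compares file names only, so a `Dockerfile` in a subdirectory would also trigger a rebuild; that is a deliberate simplification.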

## Post-deploy: Adding a new collector

1. Copy `app/collectors/_example.py` to `app/collectors/myservice.py`
2. Edit the class: set `name`, `display_name`, implement `fetch()`
3. Add to `.env`: `MYSERVICE_ENABLED=true` plus any config vars
4. `docker restart watchtower`

The frontend auto-discovers the new panel. No HTML/JS/route edits needed.
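
As a rough picture of what step 2 might produce, the sketch below defines a collector with the three pieces named above: `name`, `display_name`, and `fetch()`. Everything else is assumed for illustration: the `BaseCollector` stand-in, the `MYSERVICE_PATH` variable, the returned fields, and whether the real `fetch()` is synchronous. Check `_example.py` for the actual interface before copying this.

```python
import os
import shutil


class BaseCollector:
    """Stand-in for the project's real base class (hypothetical)."""

    name = ""
    display_name = ""

    def fetch(self):
        raise NotImplementedError


class MyServiceCollector(BaseCollector):
    name = "myservice"
    display_name = "My Service"

    def fetch(self):
        # MYSERVICE_PATH is a made-up config var that would live in .env
        path = os.environ.get("MYSERVICE_PATH", "/")
        usage = shutil.disk_usage(path)
        return {
            "online": True,
            "path": path,
            "free_gb": round(usage.free / 1e9, 1),
        }


print(MyServiceCollector().fetch())
```

The collector here simply reports free disk space for a configured path, chosen because it needs no network access or credentials.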