Migration: consolidate Echo6 docs to cortex with full infrastructure cleanup sync

- Documents recent infrastructure cleanup (8 CTs destroyed, 35 DNS records removed, Headscale cleanup)
- Adds 24 new runbooks covering Authentik, PeerTube, Meshtastic, RECON, Proxmox, Mailcow, Internet Archive, GPU routing
- Adds project documentation for headscale, vaultwarden, peertube, matrix, mmud, advbbs, arr stack
- Updates services.md, environment.md, caddy.md, authentik.md to match live infrastructure
- Removes 4 deprecated runbook duplicates (canonical versions live in projects/)
- Adds .gitignore for binary archives and editor temp files

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Matt Johnson 2026-04-13 06:02:16 +00:00
commit e9231ac24a
93 changed files with 51223 additions and 254 deletions

@ -0,0 +1,215 @@
# Runbook: Onboard a Proxmox Node
You install Proxmox. You give CC an IP and a root password. CC does the rest.

---
## Current Cluster
| Alias | Local IP | Tailscale IP |
|----------|-----------------|-----------------|
| data | 192.168.1.240 | 100.64.0.20 |
| utility | 192.168.1.241 | 100.64.0.19 |
| cloud | 192.168.1.242 | 100.64.0.22 |
| media | 192.168.1.243 | 100.64.0.21 |

Management host: **cortex**

---
## Inputs
```
NODE_IP= # e.g. 192.168.1.244
NODE_ALIAS= # e.g. storage (lowercase, no dots)
ROOT_PASS= # root password for initial key copy
```
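Every later phase interpolates these values into ssh commands and `~/.ssh/config`, so it can help to validate them before Phase 1. The following is a minimal sketch and not part of the original runbook: `validate_node_ip` and `validate_node_alias` are hypothetical helper names, and allowing digits and hyphens in the alias is an assumption that goes beyond the "lowercase, no dots" rule above.

```bash
# Hypothetical helpers; names and exact rules are assumptions, not part of the runbook.

# Succeeds only for a dotted-quad IPv4 address with every octet in 0-255.
validate_node_ip() {
    local ip="$1" old_ifs="$IFS" octet
    # Reject empty input, anything other than digits and dots, and a trailing dot.
    case $ip in
        ''|*[!0-9.]*|*.) return 1 ;;
    esac
    IFS=.
    set -- $ip          # split on dots into $1..$n
    IFS=$old_ifs
    [ $# -eq 4 ] || return 1
    for octet in "$@"; do
        [ -n "$octet" ] && [ ${#octet} -le 3 ] && [ "$octet" -le 255 ] || return 1
    done
}

# Succeeds only for a non-empty alias made of lowercase letters, digits, and hyphens.
validate_node_alias() {
    local LC_ALL=C      # keep [a-z] from matching uppercase in some locales
    case $1 in
        ''|*[!a-z0-9-]*) return 1 ;;
    esac
}

# Example:
#   validate_node_ip "$NODE_IP" && validate_node_alias "$NODE_ALIAS" || echo "bad input"
```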
---
## Phase 1: SSH Access
Nothing works without this.
```bash
# Ensure sshpass is installed
which sshpass || sudo apt install -y sshpass
# Test access immediately
sshpass -p "$ROOT_PASS" ssh \
-o StrictHostKeyChecking=accept-new \
-o IdentitiesOnly=yes \
-o PreferredAuthentications=password \
root@$NODE_IP 'hostname'
```
### Gate
Must return the hostname. **Stop if this fails.**
### Add host alias
```bash
# Ensure ~/.ssh/config has global defaults (idempotent)
grep -q "IdentitiesOnly yes" ~/.ssh/config 2>/dev/null || cat >> ~/.ssh/config << 'EOF'
Host *
IdentitiesOnly yes
StrictHostKeyChecking accept-new
ConnectTimeout 10
ServerAliveInterval 30
ServerAliveCountMax 3
EOF
# Add alias (idempotent)
grep -q "Host $NODE_ALIAS$" ~/.ssh/config 2>/dev/null || cat >> ~/.ssh/config << EOF
Host $NODE_ALIAS
HostName $NODE_IP
User root
EOF
```
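The same append-if-missing pattern is reused for the Tailscale alias in Phase 3, so it could be factored into one function. This is a sketch under assumptions: `add_ssh_host` is a hypothetical name, and the optional third argument (config path) exists only so the helper can be tried against a scratch file instead of the real `~/.ssh/config`.

```bash
# Hypothetical helper: append a Host block unless one with that alias already exists.
# Arguments: alias, hostname or IP, optional config path (defaults to ~/.ssh/config).
add_ssh_host() {
    local alias_name="$1" host="$2" config="${3:-$HOME/.ssh/config}"
    # -x matches the whole line and -F treats the alias as a literal string.
    if grep -qxF "Host $alias_name" "$config" 2>/dev/null; then
        return 0
    fi
    printf 'Host %s\n    HostName %s\n    User root\n' "$alias_name" "$host" >> "$config"
}

# Example:
#   add_ssh_host "$NODE_ALIAS" "$NODE_IP"
```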
### Optional: Set up key auth
Eliminates the need for sshpass on every command to this node.
```bash
# Generate a key only if one does not already exist
[ -f ~/.ssh/id_ed25519 ] || ssh-keygen -t ed25519 -C "cortex" -N "" -f ~/.ssh/id_ed25519
sshpass -p "$ROOT_PASS" ssh-copy-id \
-o StrictHostKeyChecking=accept-new \
-o IdentitiesOnly=yes \
-o PreferredAuthentications=password \
root@$NODE_IP
# Verify key auth works (no password)
ssh $NODE_ALIAS 'hostname'
```
### How CC connects for the rest of this runbook
If key auth is set up:
```bash
ssh $NODE_ALIAS '<command>'
```
If not:
```bash
sshpass -p "$ROOT_PASS" ssh -o PreferredAuthentications=password $NODE_ALIAS '<command>'
```
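The two connection modes can be hidden behind one wrapper so later phases do not need to care which is active. This is a sketch, not the runbook's own method: `node_ssh` is a hypothetical name, it assumes `NODE_ALIAS` and `ROOT_PASS` are set, and it pays for one extra probe connection per call. It uses `sshpass -e`, which reads the password from the `SSHPASS` environment variable rather than the command line.

```bash
# Hypothetical wrapper: use key auth when it works, otherwise fall back to sshpass.
node_ssh() {
    # BatchMode=yes makes ssh fail instead of prompting, so the probe
    # only succeeds when key auth is already set up.
    if ssh -o BatchMode=yes "$NODE_ALIAS" true 2>/dev/null; then
        ssh "$NODE_ALIAS" "$@"
    else
        # sshpass -e takes the password from SSHPASS instead of an argument.
        SSHPASS=$ROOT_PASS sshpass -e ssh \
            -o PreferredAuthentications=password \
            "$NODE_ALIAS" "$@"
    fi
}

# Example:
#   node_ssh 'hostname'
```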
---
## Phase 2: Base Configuration
```bash
# Switch repos first: without a subscription the enterprise repo returns 401,
# which can make apt update exit non-zero and skip the upgrade
ssh $NODE_ALIAS 'sed -i "s/^deb/# deb/" /etc/apt/sources.list.d/pve-enterprise.list 2>/dev/null; true'
# Add no-subscription repo (bookworm = PVE 8; adjust the codename for other releases)
ssh $NODE_ALIAS 'grep -q "pve-no-subscription" /etc/apt/sources.list.d/pve-no-subscription.list 2>/dev/null || \
echo "deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription" > /etc/apt/sources.list.d/pve-no-subscription.list'
# Upgrade, then set the timezone and confirm time sync
ssh $NODE_ALIAS 'apt update && apt dist-upgrade -y'
ssh $NODE_ALIAS 'timedatectl set-timezone America/Boise'
ssh $NODE_ALIAS 'timedatectl status | grep -i sync'
```
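The repo line above hardcodes the Debian codename. One way to avoid that, sketched here under assumptions, is to read `VERSION_CODENAME` from the node's `/etc/os-release`. `pve_repo_line` is a hypothetical helper meant to run on the node itself; the optional path argument exists only to make it testable, and it assumes the node's os-release file carries the standard Debian `VERSION_CODENAME` field.

```bash
# Hypothetical helper: print the no-subscription repo line for this node's Debian release.
pve_repo_line() {
    local os_release="${1:-/etc/os-release}"
    # Source in a subshell so os-release variables do not leak into the caller.
    (
        unset VERSION_CODENAME
        . "$os_release" || exit 1
        [ -n "$VERSION_CODENAME" ] || exit 1
        echo "deb http://download.proxmox.com/debian/pve $VERSION_CODENAME pve-no-subscription"
    )
}

# Example (on the node):
#   pve_repo_line > /etc/apt/sources.list.d/pve-no-subscription.list
```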
---
## Phase 3: Tailscale
```bash
ssh $NODE_ALIAS 'curl -fsSL https://tailscale.com/install.sh | sh'
ssh $NODE_ALIAS 'tailscale up --login-server=https://<HEADSCALE_URL> --auth-key=<PREAUTH_KEY>'
# Get Tailscale IP and add alias
TSIP=$(ssh $NODE_ALIAS 'tailscale ip -4')
echo "Tailscale IP: $TSIP"
grep -q "Host ts-$NODE_ALIAS$" ~/.ssh/config 2>/dev/null || cat >> ~/.ssh/config << EOF
Host ts-$NODE_ALIAS
HostName $TSIP
User root
EOF
ssh ts-$NODE_ALIAS 'hostname'
```
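If `tailscale up` did not complete, `TSIP` may be empty or unexpected, and the alias written above would be useless. A quick sanity check, sketched here, is to confirm the address falls inside 100.64.0.0/10, the range the cluster table's Tailscale IPs come from. `is_tailscale_ip` is a hypothetical name, it checks only the prefix rather than validating the full address, and it assumes Headscale has not been configured with a different IPv4 prefix.

```bash
# Hypothetical helper: succeed if the address starts inside 100.64.0.0/10
# (second octet 64-127). Prefix check only, not full IPv4 validation.
is_tailscale_ip() {
    case $1 in
        100.6[4-9].*|100.[7-9][0-9].*|100.1[01][0-9].*|100.12[0-7].*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
#   is_tailscale_ip "$TSIP" || echo "Unexpected Tailscale IP: '$TSIP'"
```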
---
## Phase 4: Verify Cluster Membership
You join the node to the cluster. CC verifies it's there.
```bash
# On a clustered node, pvecm status reports "Quorate: Yes" under Quorum information
ssh $NODE_ALIAS 'pvecm status 2>/dev/null | grep -E "^Quorate:[[:space:]]+Yes"'
ssh data 'pvecm nodes'
```
If not in the cluster yet, **stop and tell the user**. Do not run `pvecm add`.

---
## Phase 5: Verify
```bash
# Authentik SSO (syncs via cluster)
ssh $NODE_ALIAS 'pveum realm list | grep authentik'
# Storage
ssh $NODE_ALIAS 'pvesm status'
ssh $NODE_ALIAS 'lsblk && echo "---" && vgs && lvs'
```
---
## Phase 6: Update Inventory
Add to CLAUDE.md cluster table:
```
| <NODE_ALIAS> | <NODE_IP> | <TSIP> |
```
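The row can be generated with the same column padding as the existing cluster table. This is a small sketch: `inventory_row` is a hypothetical helper, and the column widths (8, 15, 15) are taken from the table in this runbook on the assumption that the CLAUDE.md table uses the same layout.

```bash
# Hypothetical helper: print one cluster-table row, padded like the existing rows.
inventory_row() {
    printf '| %-8s | %-15s | %-15s |\n' "$1" "$2" "$3"
}

# Example:
#   inventory_row "$NODE_ALIAS" "$NODE_IP" "$TSIP"
```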
Update any hardcoded node lists:
- proxmox-audit.sh (NODES array)
- Monitoring/backup targets
---
## Final Verification
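Every check must print `OK`. The last two lines are informational: PVE version prints the version string, and time sync must print `yes`.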
```bash
echo "=== $NODE_ALIAS ==="
echo -n "SSH (local): "; ssh $NODE_ALIAS 'echo OK' 2>&1 || echo FAIL
echo -n "SSH (tailscale): "; ssh ts-$NODE_ALIAS 'echo OK' 2>&1 || echo FAIL
echo -n "Cluster: "; ssh $NODE_ALIAS 'pvecm status 2>/dev/null | grep -Eq "^Quorate:[[:space:]]+Yes" && echo OK || echo FAIL'
echo -n "Tailscale: "; ssh $NODE_ALIAS 'tailscale status --self >/dev/null 2>&1 && echo OK || echo FAIL'
echo -n "OIDC realm: "; ssh $NODE_ALIAS 'pveum realm list 2>/dev/null | grep -q authentik && echo OK || echo FAIL'
echo -n "Storage: "; ssh $NODE_ALIAS 'pvesm status >/dev/null 2>&1 && echo OK || echo FAIL'
echo -n "PVE version: "; ssh $NODE_ALIAS 'pveversion'
echo -n "Time sync: "; ssh $NODE_ALIAS 'timedatectl show -p NTPSynchronized --value'
```
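The repeated `&& echo OK || echo FAIL` pattern could be folded into one helper that also counts failures, which gives a single pass/fail result at the end. This is a sketch, not the runbook's own script: `check` and `FAILURES` are hypothetical names, and the helper hides each command's output so only OK or FAIL is shown.

```bash
# Hypothetical helper: run a command, print OK or FAIL after the label, count failures.
FAILURES=0
check() {
    local label="$1"
    shift
    printf '%-18s' "$label:"
    if "$@" >/dev/null 2>&1; then
        echo OK
    else
        echo FAIL
        FAILURES=$((FAILURES + 1))
    fi
}

# Example:
#   check "SSH (local)" ssh "$NODE_ALIAS" true
#   check "Storage"     ssh "$NODE_ALIAS" 'pvesm status'
#   [ "$FAILURES" -eq 0 ] || echo "$FAILURES check(s) failed"
```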
---
## Troubleshooting
**"Too many authentication failures"**
`IdentitiesOnly yes` is missing from the `Host *` block in `~/.ssh/config`, so ssh offers every available key and exhausts the server's authentication attempts.

**sshpass "Permission denied"**
Add `-o PreferredAuthentications=password -o IdentitiesOnly=yes` to the ssh command.

**Corosync errors during cluster join**
Check that `/etc/hosts` on all nodes includes the new node's hostname and IP.

**Authentik realm missing**
Check `systemctl status pve-cluster`. The realm is defined in `/etc/pve/domains.cfg`, which pmxcfs syncs to every cluster member.

**Can't migrate VMs to node**
Storage mismatch. Compare `pvesm status` on both nodes.