Runbook: Onboard a Proxmox Node
You install Proxmox. You give CC an IP and a root password. CC does the rest.
Current Cluster
| Alias | Local IP | Tailscale IP |
|---|---|---|
| data | 192.168.1.240 | 100.64.0.20 |
| utility | 192.168.1.241 | 100.64.0.19 |
| cloud | 192.168.1.242 | 100.64.0.22 |
| media | 192.168.1.243 | 100.64.0.21 |
Management host: cortex
Inputs
NODE_IP= # e.g. 192.168.1.244
NODE_ALIAS= # e.g. storage (lowercase, no dots)
ROOT_PASS= # root password for initial key copy
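Before starting Phase 1, it may help to confirm the inputs are well-formed. The following is a minimal sketch in POSIX shell. `valid_ipv4` and `valid_alias` are hypothetical helper names, and allowing digits and hyphens in the alias is an assumption beyond the "lowercase, no dots" rule above.

```shell
# Hypothetical helpers; not part of the original runbook.

# valid_ipv4 IP: succeed if IP is a dotted quad with every octet in 0-255.
valid_ipv4() {
    case "$1" in
        *[!0-9.]*|*..*|.*|*.) return 1 ;;   # only digits and single dots allowed
    esac
    old_ifs=$IFS
    IFS=.
    set -- $1            # split on dots into positional parameters
    IFS=$old_ifs
    [ "$#" -eq 4 ] || return 1
    for octet in "$@"; do
        [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || return 1
    done
    return 0
}

# valid_alias NAME: succeed if NAME is non-empty and uses only a-z, 0-9 and "-".
# Runs in a subshell so the LC_ALL override does not leak to the caller.
valid_alias() (
    LC_ALL=C             # force ASCII ranges so a-z cannot match uppercase
    case "$1" in
        ""|*[!a-z0-9-]*) return 1 ;;
    esac
    return 0
)

# Example gate (commented out so the block has no side effects):
# valid_ipv4 "$NODE_IP" && valid_alias "$NODE_ALIAS" || { echo "bad inputs"; exit 1; }
```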
Phase 1: SSH Access
Nothing works without this.
# Ensure sshpass is installed
which sshpass || sudo apt install -y sshpass
# Test access immediately
sshpass -p "$ROOT_PASS" ssh \
-o StrictHostKeyChecking=accept-new \
-o IdentitiesOnly=yes \
-o PreferredAuthentications=password \
root@$NODE_IP 'hostname'
Gate
Must return the hostname. Stop if this fails.
Add host alias
# Ensure ~/.ssh/config has global defaults (idempotent)
grep -q "IdentitiesOnly yes" ~/.ssh/config 2>/dev/null || cat >> ~/.ssh/config << 'EOF'
Host *
IdentitiesOnly yes
StrictHostKeyChecking accept-new
ConnectTimeout 10
ServerAliveInterval 30
ServerAliveCountMax 3
EOF
# Add alias (idempotent)
grep -q "Host $NODE_ALIAS$" ~/.ssh/config 2>/dev/null || cat >> ~/.ssh/config << EOF
Host $NODE_ALIAS
HostName $NODE_IP
User root
EOF
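The grep-then-append pattern above is repeated later for the Tailscale alias, so it could be factored into one function. This is a sketch only: `add_ssh_alias` is a hypothetical name, and it takes the config path as an argument so it can be tried against a scratch file first.

```shell
# add_ssh_alias CONFIG ALIAS HOSTNAME
# Append a Host block for ALIAS to CONFIG unless an exact "Host ALIAS" line exists.
add_ssh_alias() {
    config=$1 alias_name=$2 host=$3
    # -x matches the whole line and -F treats the alias literally,
    # so "storage" never matches "ts-storage" or a commented-out entry.
    if grep -qxF "Host $alias_name" "$config" 2>/dev/null; then
        return 0
    fi
    printf 'Host %s\n    HostName %s\n    User root\n' "$alias_name" "$host" >> "$config"
}

# Example (commented out):
# add_ssh_alias ~/.ssh/config "$NODE_ALIAS" "$NODE_IP"
```

Whole-line matching is stricter than the inline grep: an existing entry written with extra patterns on the Host line or with leading indentation would not be detected and a second block would be appended.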
Optional: Set up key auth
Eliminates the need for sshpass on every command to this node.
[ -f ~/.ssh/id_ed25519 ] || ssh-keygen -t ed25519 -C "cortex" -N "" -f ~/.ssh/id_ed25519
sshpass -p "$ROOT_PASS" ssh-copy-id \
-o StrictHostKeyChecking=accept-new \
-o IdentitiesOnly=yes \
-o PreferredAuthentications=password \
root@$NODE_IP
# Verify key auth works (no password)
ssh $NODE_ALIAS 'hostname'
How CC connects for the rest of this runbook
If key auth is set up:
ssh $NODE_ALIAS '<command>'
If not:
sshpass -p "$ROOT_PASS" ssh $NODE_ALIAS '<command>'
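The two connection modes can be hidden behind a single wrapper so later phases do not need to care which one applies. A rough sketch, assuming `ROOT_PASS` is set as in Inputs; `node_ssh` is a hypothetical name. It uses `sshpass -e`, which reads the password from the `SSHPASS` environment variable rather than the command line.

```shell
# node_ssh ALIAS COMMAND...
# Try key auth first; fall back to password auth through sshpass.
node_ssh() {
    node=$1
    shift
    # BatchMode=yes makes ssh fail instead of prompting when no key is accepted
    if ssh -o BatchMode=yes "$node" true 2>/dev/null; then
        ssh "$node" "$@"
    else
        # -e takes the password from $SSHPASS, keeping it out of the process list
        SSHPASS=$ROOT_PASS sshpass -e ssh \
            -o PreferredAuthentications=password \
            "$node" "$@"
    fi
}

# Example (commented out):
# node_ssh "$NODE_ALIAS" 'hostname'
```

The probe costs one extra SSH connection per call, which is acceptable for a runbook but worth caching in a variable if the wrapper is used in a loop.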
Phase 2: Base Configuration
# Disable enterprise repo first (without a subscription it returns 401 and breaks apt update)
ssh $NODE_ALIAS 'sed -i "s/^deb/# deb/" /etc/apt/sources.list.d/pve-enterprise.list 2>/dev/null; true'
# Add no-subscription repo
ssh $NODE_ALIAS 'grep -q "pve-no-subscription" /etc/apt/sources.list.d/pve-no-subscription.list 2>/dev/null || \
echo "deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription" > /etc/apt/sources.list.d/pve-no-subscription.list'
# Upgrade only after the repos are fixed
ssh $NODE_ALIAS 'apt update && apt dist-upgrade -y'
ssh $NODE_ALIAS 'timedatectl set-timezone America/Boise'
ssh $NODE_ALIAS 'timedatectl status | grep -i sync'
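To confirm the repository changes took effect, the active apt source lines can be summarized. A small sketch: `repo_summary` is a hypothetical helper that reads source-list text on stdin, and it assumes the one-line `.list` format used in this phase.

```shell
# repo_summary: read apt source lines on stdin and report which Proxmox repos are active.
repo_summary() {
    # Keep only uncommented "deb" lines; "|| true" tolerates no matches
    active=$(grep -E '^[[:space:]]*deb[[:space:]]' || true)
    case "$active" in
        *pve-enterprise*) echo "enterprise: active" ;;
        *)                echo "enterprise: disabled" ;;
    esac
    case "$active" in
        *pve-no-subscription*) echo "no-subscription: active" ;;
        *)                     echo "no-subscription: missing" ;;
    esac
}

# Example (commented out):
# ssh "$NODE_ALIAS" 'cat /etc/apt/sources.list.d/*.list 2>/dev/null' | repo_summary
```

The desired result is `enterprise: disabled` followed by `no-subscription: active`.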
Phase 3: Tailscale
ssh $NODE_ALIAS 'curl -fsSL https://tailscale.com/install.sh | sh'
ssh $NODE_ALIAS 'tailscale up --login-server=https://<HEADSCALE_URL> --auth-key=<PREAUTH_KEY>'
# Get Tailscale IP and add alias
TSIP=$(ssh $NODE_ALIAS 'tailscale ip -4')
echo "Tailscale IP: $TSIP"
grep -q "Host ts-$NODE_ALIAS$" ~/.ssh/config 2>/dev/null || cat >> ~/.ssh/config << EOF
Host ts-$NODE_ALIAS
HostName $TSIP
User root
EOF
ssh ts-$NODE_ALIAS 'hostname'
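If `tailscale ip -4` returns nothing (for example, the node did not register), the alias block above would be written with an empty HostName. A guard like the following could run before the config is touched. It is a sketch that assumes Headscale hands out addresses from the default 100.64.0.0/10 range, which matches the cluster table; `in_cgnat_range` is a hypothetical name.

```shell
# in_cgnat_range IP: succeed if IP is inside 100.64.0.0/10
# (100.64.0.0 through 100.127.255.255).
in_cgnat_range() {
    case "$1" in
        100.*.*.*) ;;
        *) return 1 ;;
    esac
    second=${1#100.}         # strip the leading "100."
    second=${second%%.*}     # keep only the second octet
    case "$second" in
        ""|*[!0-9]*) return 1 ;;
    esac
    [ "${#second}" -le 3 ] && [ "$second" -ge 64 ] && [ "$second" -le 127 ]
}

# Example gate (commented out):
# in_cgnat_range "$TSIP" || { echo "unexpected Tailscale IP: '$TSIP'"; exit 1; }
```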
Phase 4: Verify Cluster Membership
You join the node to the cluster. CC verifies it's there.
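# Current pvecm reports membership health as a "Quorate:" line; no output means the node is not in a cluster
ssh $NODE_ALIAS 'pvecm status 2>/dev/null | grep -E "^Quorate:"'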
ssh data 'pvecm nodes'
If not in the cluster yet, stop and tell the user. Do not run pvecm add.
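The "stop if not in the cluster" rule can be turned into an explicit gate by parsing `pvecm nodes` from an existing member. A sketch under two assumptions: the Name column holds the node's hostname, and that hostname equals the alias. `node_in_cluster` is a hypothetical helper.

```shell
# node_in_cluster NAME: read `pvecm nodes` output on stdin,
# succeed if NAME appears in the Name column.
node_in_cluster() {
    awk -v name="$1" '
        # Data rows have a numeric Votes column; header and ruler lines do not
        $2 ~ /^[0-9]+$/ && $3 == name { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Example gate (commented out):
# ssh data 'pvecm nodes' | node_in_cluster "$NODE_ALIAS" \
#     || { echo "$NODE_ALIAS is not a cluster member yet; ask the user to join it"; exit 1; }
```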
Phase 5: Verify SSO and Storage
# Authentik SSO (syncs via cluster)
ssh $NODE_ALIAS 'pveum realm list | grep authentik'
# Storage
ssh $NODE_ALIAS 'pvesm status'
ssh $NODE_ALIAS 'lsblk && echo "---" && vgs && lvs'
Phase 6: Update Inventory
Add to CLAUDE.md cluster table:
| <NODE_ALIAS> | <NODE_IP> | <TSIP> |
Update any hardcoded node lists:
- proxmox-audit.sh (NODES array)
- Monitoring/backup targets
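To avoid typos when editing the cluster table by hand, the new row can be generated from the variables already set in earlier phases. A trivial sketch; `inventory_row` is a hypothetical name and the row is only printed, not written to CLAUDE.md.

```shell
# inventory_row ALIAS LOCAL_IP TAILSCALE_IP: print one cluster-table row.
inventory_row() {
    printf '| %s | %s | %s |\n' "$1" "$2" "$3"
}

# Example (commented out):
# inventory_row "$NODE_ALIAS" "$NODE_IP" "$TSIP"
```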
Final Verification
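Every check must print OK. The last two lines are informational: PVE version prints the version string, and Time sync must print yes.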
echo "=== $NODE_ALIAS ==="
echo -n "SSH (local): "; ssh $NODE_ALIAS 'echo OK' 2>&1
echo -n "SSH (tailscale): "; ssh ts-$NODE_ALIAS 'echo OK' 2>&1
echo -n "Cluster: "; ssh $NODE_ALIAS 'pvecm status 2>/dev/null | grep -qE "^Quorate:[[:space:]]+Yes" && echo OK || echo FAIL'
echo -n "Tailscale: "; ssh $NODE_ALIAS 'tailscale status --self >/dev/null 2>&1 && echo OK || echo FAIL'
echo -n "OIDC realm: "; ssh $NODE_ALIAS 'pveum realm list 2>/dev/null | grep -q authentik && echo OK || echo FAIL'
echo -n "Storage: "; ssh $NODE_ALIAS 'pvesm status >/dev/null 2>&1 && echo OK || echo FAIL'
echo -n "PVE version: "; ssh $NODE_ALIAS 'pveversion'
echo -n "Time sync: "; ssh $NODE_ALIAS 'timedatectl show -p NTPSynchronized --value'
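Reading the output by eye works, but a script or agent may prefer an exit status. One possible shape is a small runner that prints the same OK/FAIL lines and counts failures. This is a sketch; `check` and `FAILED` are hypothetical names.

```shell
FAILED=0

# check LABEL COMMAND...: run COMMAND quietly, print "LABEL: OK" or "LABEL: FAIL",
# and increment FAILED on failure.
check() {
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        printf '%s: OK\n' "$label"
    else
        printf '%s: FAIL\n' "$label"
        FAILED=$((FAILED + 1))
    fi
}

# Example (commented out):
# check "SSH (local)"     ssh "$NODE_ALIAS" true
# check "SSH (tailscale)" ssh "ts-$NODE_ALIAS" true
# check "Storage"         ssh "$NODE_ALIAS" pvesm status
# [ "$FAILED" -eq 0 ] || { echo "$FAILED check(s) failed"; exit 1; }
```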
Troubleshooting
"Too many authentication failures"
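IdentitiesOnly yes is missing from the Host * block in ~/.ssh/config, so ssh offers every agent key and exhausts the server's attempt limit before it reaches password auth.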
sshpass "Permission denied"
Add -o PreferredAuthentications=password -o IdentitiesOnly=yes.
Cluster join corosync errors
Check /etc/hosts on all nodes includes the new hostname and IP.
Authentik realm missing
Check systemctl status pve-cluster. Realm syncs via pmxcfs in /etc/pve/domains.cfg.
Can't migrate VMs to node
Storage mismatch. Compare pvesm status on both nodes.
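The comparison can be scripted by reducing each node's `pvesm status` output to name, type and status, then showing only the lines that differ. A sketch that assumes the usual column order (Name, Type, Status first); `storage_names` and `storage_diff` are hypothetical helpers that work on saved output files.

```shell
# storage_names: read `pvesm status` output on stdin,
# print "name type status" per storage, sorted.
storage_names() {
    awk 'NR > 1 && NF >= 3 { print $1, $2, $3 }' | sort
}

# storage_diff FILE_A FILE_B: print entries that differ between two saved outputs.
# Lines only in FILE_A are unindented; lines only in FILE_B are tab-indented.
storage_diff() {
    a=$(mktemp)
    b=$(mktemp)
    storage_names < "$1" > "$a"
    storage_names < "$2" > "$b"
    comm -3 "$a" "$b"
    rm -f "$a" "$b"
}

# Example (commented out):
# ssh data 'pvesm status' > /tmp/data.txt
# ssh "$NODE_ALIAS" 'pvesm status' > /tmp/new.txt
# storage_diff /tmp/data.txt /tmp/new.txt
```

Empty output means both nodes report the same storages in the same state.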