# Proxmox — Create Ubuntu VM (Cloud-Init)

Automated VM creation using Ubuntu cloud images. No interactive installer needed.

## Prerequisites

- SSH access to the target Proxmox host (directly or via jump box)
- Headscale running on Contabo with a valid preauth key
- Target Proxmox host has sufficient resources (check with `pvesm status`, `free -h`, `nproc`)

## Variables — Prompt the User

**Before executing any steps, prompt the user for ALL of the following values.** Present them one group at a time. Suggest defaults in parentheses where noted. Do not proceed until all values are confirmed.

### Group 1 — Identity
Prompt for these first:
- **VM name** — hostname for the VM (e.g., `cortex`)
- **Proxmox host** — which node to create it on? (e.g., `toc`, `data`, `utility`)

### Group 2 — Networking
Once identity is set, prompt:
- **VMID** — 150-199 range per convention. Suggest next available by checking `qm list` on the host.
- **Static IP** — suggest matching VMID (e.g., VMID 150 → 192.168.1.150). Verify it's not already in use.
- **Gateway** — (default: `192.168.1.1`)

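
The VMID and static IP suggestions above can be sketched as two small helpers. This is a minimal sketch, not part of the runbook itself: the function names `next_free_vmid` and `ip_for_vmid` are hypothetical, and it assumes `qm list` prints the VMID in the first column and that the LAN is 192.168.1.0/24 as in the example.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (names are illustrative, not from the runbook).

# Read `qm list` output on stdin and print the lowest unused VMID in 150-199.
# Assumes the VMID is the first whitespace-separated column of each row.
next_free_vmid() {
  awk '
    $1 ~ /^[0-9]+$/ { used[$1] = 1 }      # header row is skipped (not numeric)
    END {
      for (id = 150; id <= 199; id++) {
        if (!(id in used)) { print id; exit 0 }
      }
      exit 1                               # the whole range is taken
    }
  '
}

# Print the suggested static IP for a VMID; reject IDs outside 150-199.
ip_for_vmid() {
  local vmid=$1
  if ! [[ $vmid =~ ^1[5-9][0-9]$ ]]; then
    echo "VMID must be in 150-199, got: $vmid" >&2
    return 1
  fi
  echo "192.168.1.${vmid}"
}
```

One possible use is `VMID=$(ssh root@$PVE_HOST 'qm list' | next_free_vmid)` followed by `VM_IP=$(ip_for_vmid "$VMID")`. The suggested address still needs a check that nothing else on the LAN already answers on it.
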
### Group 3 — Resources
Prompt for hardware allocation. Check host resources first (`nproc`, `free -h`, `pvesm status`) and suggest reasonable values:
- **CPU cores/threads** — how many to allocate?
- **RAM (MB)** — how much?
- **Disk (GB)** — how large?

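
As one way to turn the host check into a yes/no answer, the sketch below compares a requested RAM size against the host's available memory. The helper name `ram_fits` and the 2048 MB default reserve are assumptions made for illustration. It reads `free -m` rather than `free -h` so the numbers are plain megabytes, and it assumes the `available` value is the seventh column of the `Mem:` row.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read `free -m` output on stdin and succeed only if the
# requested RAM (MB) plus a reserve fits within the "available" column.
# Usage: free -m | ram_fits <want_mb> [reserve_mb]
ram_fits() {
  local want_mb=$1 reserve_mb=${2:-2048}   # reserve default is an assumption
  awk -v want="$want_mb" -v reserve="$reserve_mb" '
    /^Mem:/ { avail = $7; found = 1 }
    END {
      if (!found) exit 2                   # input was not `free` output
      exit (want + reserve <= avail) ? 0 : 1
    }
  '
}
```

For example, `ssh root@$PVE_HOST 'free -m' | ram_fits "$RAM_MB"` returns success when the allocation leaves the reserve free on the host.
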
### Group 4 — Features
Prompt yes/no for each:
- **GPU passthrough?** — if yes, detect PCI address automatically via `lspci -nn | grep -i nvidia` on the host. Requires IOMMU+VFIO already configured.
- **Install Docker?**
- **Install NVIDIA drivers?** — only relevant if GPU passthrough is yes.
- **Install Node.js?** — for Claude Code.
- **Register with Tailscale?** — if yes, Tailscale hostname defaults to VM name.

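
The automatic PCI address detection mentioned for GPU passthrough could look like the sketch below. The helper name `first_nvidia_gpu_addr` is hypothetical. It assumes the GPU appears in `lspci -nn` as a "VGA compatible controller" or "3D controller" line, and it skips the card's companion audio function, which `grep -i nvidia` alone would also match.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read `lspci -nn` output on stdin and print the PCI
# address of the first NVIDIA display function (VGA or 3D controller).
first_nvidia_gpu_addr() {
  awk '
    tolower($0) ~ /nvidia/ && /VGA compatible controller|3D controller/ {
      print $1          # first column is the bus address, e.g. 01:00.0
      found = 1
      exit              # jumps to END
    }
    END { exit found ? 0 : 1 }
  '
}
```

A possible use is `GPU_PCI_ADDR=$(ssh root@$PVE_HOST 'lspci -nn' | first_nvidia_gpu_addr)`. On a host with more than one NVIDIA card this picks the first, so the result should be confirmed with the user.
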
### Summary
After collecting all values, present a summary table and ask for confirmation before executing.

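
A minimal sketch of the summary table follows, assuming the collected values are stored in the shell variables that the later steps reference (`VM_NAME`, `PVE_HOST`, `VMID`, and so on). The function name `print_summary` and the variable name `REGISTER_TAILSCALE` are hypothetical; the runbook does not name a variable for the Tailscale choice.

```bash
#!/usr/bin/env bash
# Print the collected values as a two-column table for user confirmation.
# REGISTER_TAILSCALE is a hypothetical name; the other names are the ones
# used by the commands in the later steps.
print_summary() {
  local name
  printf '%-20s %s\n' "Setting" "Value"
  printf '%-20s %s\n' "-------" "-----"
  for name in VM_NAME PVE_HOST VMID VM_IP GATEWAY CORES RAM_MB DISK_GB \
              GPU_PASSTHROUGH GPU_PCI_ADDR INSTALL_DOCKER INSTALL_NVIDIA \
              INSTALL_NODEJS REGISTER_TAILSCALE TAILSCALE_HOSTNAME; do
    # ${!name} is bash indirect expansion: the value of the variable whose
    # name is stored in $name. Unset or empty values print as <unset>.
    printf '%-20s %s\n' "$name" "${!name:-<unset>}"
  done
}
```

Any row that shows `<unset>` for a value the chosen features require is a sign that a prompt was skipped.
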
## Step 1 — Download Ubuntu Cloud Image

Check if image already exists on the host. Only download if missing.

```bash
ssh root@$PVE_HOST 'ls /var/lib/vz/template/iso/noble-server-cloudimg-amd64.img 2>/dev/null \
  || wget -P /var/lib/vz/template/iso/ \
  https://cloud-images.ubuntu.com/noble/current/noble-server-cloudimg-amd64.img'
```

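
Because the download is skipped whenever a file with that name exists, a partial or stale image would be reused silently. The sketch below is an optional integrity check, not something the runbook requires. The helper name `verify_image` is hypothetical, and it assumes a `SHA256SUMS` file published alongside the image has been fetched to the same host.

```bash
#!/usr/bin/env bash
# Hypothetical helper: compare an image's SHA-256 with the entry for its
# file name in a SHA256SUMS file. Usage: verify_image <image> <sums_file>
verify_image() {
  local image=$1 sums=$2 want have
  # Entries look like "<hash> *<name>" or "<hash>  <name>"; strip the "*".
  want=$(awk -v f="$(basename "$image")" \
    '{ sub(/^\*/, "", $2) } $2 == f { print $1; exit }' "$sums")
  if [ -z "$want" ]; then
    echo "no checksum listed for $image" >&2
    return 2
  fi
  have=$(sha256sum "$image" | awk '{ print $1 }')
  [ "$want" = "$have" ]
}
```

If the check fails, deleting the image and repeating the download is the simplest recovery.
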
## Step 2 — Create VM

```bash
ssh root@$PVE_HOST "qm create $VMID \
  --name $VM_NAME \
  --memory $RAM_MB \
  --cores $CORES \
  --cpu cputype=host \
  --scsihw virtio-scsi-single \
  --net0 virtio,bridge=vmbr0 \
  --ostype l26 \
  --bios ovmf \
  --efidisk0 local-lvm:1 \
  --machine q35 \
  --agent enabled=1 \
  --onboot 1"
```

## Step 3 — Import Cloud Image as Disk

```bash
ssh root@$PVE_HOST "qm importdisk $VMID /var/lib/vz/template/iso/noble-server-cloudimg-amd64.img local-lvm"

# Attach the imported disk. The EFI disk from Step 2 took disk-0,
# so the imported disk is named vm-$VMID-disk-1.
ssh root@$PVE_HOST "qm set $VMID --scsi0 local-lvm:vm-${VMID}-disk-1,iothread=1,discard=on"

# Resize to the requested size
ssh root@$PVE_HOST "qm resize $VMID scsi0 ${DISK_GB}G"
```

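
The attach command above hard-codes the imported disk's name. A less brittle option is to read the name back from the VM config, where an imported but unattached disk shows up as an `unusedN` entry. The helper below is a sketch with a hypothetical name, `unused_volume`, and assumes the `key: value` layout that `qm config` prints.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read `qm config <vmid>` output on stdin and print the
# volume ID of the first unused disk, e.g. local-lvm:vm-150-disk-1.
unused_volume() {
  awk -F': ' '
    /^unused[0-9]+: / { print $2; found = 1; exit }
    END { exit found ? 0 : 1 }
  '
}
```

For instance, `VOL=$(ssh root@$PVE_HOST "qm config $VMID" | unused_volume)` yields a value that can replace `local-lvm:vm-${VMID}-disk-1` in the `qm set` call. A non-zero exit means the import did not leave an unused disk behind.
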
## Step 4 — Configure Cloud-Init

```bash
# Add cloud-init drive
ssh root@$PVE_HOST "qm set $VMID --ide2 local-lvm:cloudinit"

# Check that SSH keys exist on the Proxmox host.
# If this prints an error, copy authorized_keys from another node before continuing.
ssh root@$PVE_HOST 'test -f /root/.ssh/authorized_keys || echo "ERROR: No SSH keys on host"'

# Configure cloud-init
ssh root@$PVE_HOST "qm set $VMID \
  --ciuser zvx \
  --cipassword temp-change-me \
  --ipconfig0 ip=${VM_IP}/24,gw=${GATEWAY} \
  --nameserver 1.1.1.1 \
  --searchdomain echo6.co \
  --sshkeys /root/.ssh/authorized_keys \
  --boot order=scsi0"
```

### Alternative: Use standard Echo6 cloud-init snippet

If the snippet is already on the node (deployed by `echo6-onboard-node.sh`), you can use it instead of the manual `--ciuser`/`--cipassword` config above. This pre-installs sshpass, curl, git, htop, and other standard packages via cloud-init:

```bash
ssh root@$PVE_HOST "qm set $VMID --cicustom \"user=local:snippets/echo6-base-userdata.yml\""
```

**Note:** You still need `--ipconfig0`, `--nameserver`, `--searchdomain`, `--sshkeys`, and `--boot` from the block above. The snippet only covers packages and `manage_etc_hosts`.

## Step 5 — GPU Passthrough (if enabled)

Skip if `GPU_PASSTHROUGH=no`.

Requires IOMMU and VFIO already configured on the Proxmox host. Verify first:

```bash
ssh root@$PVE_HOST 'dmesg | grep -i "IOMMU enabled"'
ssh root@$PVE_HOST "lspci -nnk -s $GPU_PCI_ADDR | grep 'Kernel driver in use: vfio-pci'"
```

If both check out:

```bash
ssh root@$PVE_HOST "qm set $VMID --hostpci0 ${GPU_PCI_ADDR},pcie=1,x-vga=0"
```

If VFIO is NOT configured, stop and follow the IOMMU/VFIO setup procedure before continuing.

## Step 6 — Start VM and Wait for Boot

```bash
ssh root@$PVE_HOST "qm config $VMID"
ssh root@$PVE_HOST "qm start $VMID"

echo "Waiting for VM to boot and run cloud-init..."
sleep 60
until ssh -o ConnectTimeout=5 -o StrictHostKeyChecking=no zvx@$VM_IP 'hostname' 2>/dev/null; do
  echo "Still waiting..."
  sleep 15
done
```

If the VM doesn't come up after 3 minutes, check the console via Proxmox web UI or:

```bash
ssh root@$PVE_HOST "qm terminal $VMID"
```

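
The `until` loop in this step retries forever, so the 3-minute limit has to be watched by hand. A bounded variant is sketched below. The helper name `wait_until` is hypothetical, and it relies on bash's built-in `SECONDS` counter.

```bash
#!/usr/bin/env bash
# Hypothetical helper: retry a command until it succeeds or a deadline passes.
# Usage: wait_until <timeout_seconds> <interval_seconds> <command> [args...]
wait_until() {
  local timeout=$1 interval=$2
  shift 2
  local deadline=$(( SECONDS + timeout ))   # SECONDS counts shell uptime
  until "$@"; do
    if (( SECONDS >= deadline )); then
      return 1                              # gave up: deadline reached
    fi
    sleep "$interval"
  done
  return 0
}
```

With this in place, `wait_until 180 15 ssh -o ConnectTimeout=5 zvx@"$VM_IP" hostname` returns non-zero after roughly three minutes, which is the point at which the console check applies.
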
## Step 7 — Base System Setup

```bash
ssh zvx@$VM_IP 'sudo apt-get update && sudo apt-get install -y \
  sshpass curl wget git htop iotop tmux vim \
  rsync tree jq unzip \
  net-tools dnsutils \
  python3 python3-pip python3-venv \
  sudo'
```

## Step 8 — Enable Password Authentication

Ubuntu cloud images default to key-only SSH via a drop-in config. Enable password auth since all machines are behind VPN/local network.

```bash
# Overwrite the cloud image drop-in that disables password auth
ssh zvx@$VM_IP 'echo "PasswordAuthentication yes" | sudo tee /etc/ssh/sshd_config.d/60-cloudimg-settings.conf'

# Also set in main config for completeness
ssh zvx@$VM_IP 'sudo sed -i "s/^#*PasswordAuthentication.*/PasswordAuthentication yes/" /etc/ssh/sshd_config'
ssh zvx@$VM_IP 'sudo sed -i "s/^#*KbdInteractiveAuthentication.*/KbdInteractiveAuthentication yes/" /etc/ssh/sshd_config'
ssh zvx@$VM_IP 'sudo systemctl restart ssh'

# Change the default password immediately (-t allocates a TTY for the prompt)
ssh -t zvx@$VM_IP 'passwd'
```

**Important:** Password authentication is the default for Echo6 infrastructure. All machines are protected by VPN (Headscale/Tailscale) and local network — key-only auth creates unnecessary friction for multi-machine access.

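
Since sshd keeps the first value it reads for each option, another drop-in that sorts earlier could still override the change. One way to confirm the effective setting is to inspect the output of `sshd -T`, which prints the resolved configuration with lowercase keywords. The helper name `password_auth_enabled` below is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read `sshd -T` output on stdin and succeed only if
# password authentication is enabled in the effective configuration.
password_auth_enabled() {
  # -x matches the whole line, -i ignores case, -q suppresses output
  grep -qix 'passwordauthentication yes'
}
```

A possible check after the restart is `ssh zvx@$VM_IP 'sudo sshd -T' | password_auth_enabled && echo "password auth is on"`.
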
## Step 9 — NVIDIA Drivers (if GPU passthrough)

Skip if `INSTALL_NVIDIA=no`.

```bash
ssh zvx@$VM_IP 'lspci | grep -i nvidia'

# Install driver
ssh zvx@$VM_IP 'sudo apt-get update && sudo apt-get install -y nvidia-driver-550'
ssh zvx@$VM_IP 'sudo reboot'

sleep 60
until ssh -o ConnectTimeout=5 zvx@$VM_IP 'hostname' 2>/dev/null; do
  echo "Waiting for reboot..."
  sleep 15
done

ssh zvx@$VM_IP 'nvidia-smi'
```

Verify: `nvidia-smi` should show the GPU name, driver version, and VRAM.

## Step 10 — Docker (if enabled)

Skip if `INSTALL_DOCKER=no`.

```bash
ssh zvx@$VM_IP 'curl -fsSL https://get.docker.com | sh'
ssh zvx@$VM_IP 'sudo usermod -aG docker zvx'
```

### NVIDIA Container Toolkit (only if GPU + Docker)

```bash
ssh zvx@$VM_IP 'curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg && \
  curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed "s#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g" | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list'

ssh zvx@$VM_IP 'sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit'
ssh zvx@$VM_IP 'sudo nvidia-ctk runtime configure --runtime=docker'
ssh zvx@$VM_IP 'sudo systemctl restart docker'

# Test
ssh zvx@$VM_IP 'docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu24.04 nvidia-smi'
```

## Step 11 — Node.js (if enabled)

Skip if `INSTALL_NODEJS=no`.

```bash
ssh zvx@$VM_IP 'curl -fsSL https://deb.nodesource.com/setup_22.x | sudo bash - && \
  sudo apt-get install -y nodejs'
```

## Step 12 — Tailscale Registration

Generate a preauth key on Contabo first:

```bash
docker exec headscale-standby headscale preauthkeys create --user echo6 --reusable --expiration 72h
```

Then register the VM:

```bash
ssh zvx@$VM_IP 'curl -fsSL https://tailscale.com/install.sh | sh'
ssh zvx@$VM_IP 'sudo systemctl enable tailscaled && sudo systemctl start tailscaled'
ssh zvx@$VM_IP "sudo tailscale up --login-server https://vpn.echo6.co --auth-key <KEY> --hostname $TAILSCALE_HOSTNAME"

# Verify
ssh zvx@$VM_IP 'tailscale status'
docker exec headscale-standby headscale nodes list
```

## Step 13 — Final Verification

```bash
ssh zvx@$VM_IP "
  echo '=== Hostname ===' && hostname
  echo '=== IP ===' && ip -4 addr show | grep 'inet 192'
  echo '=== Kernel ===' && uname -r
  echo '=== GPU ===' && (nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv,noheader 2>/dev/null || echo 'No GPU')
  echo '=== Docker ===' && (docker --version 2>/dev/null || echo 'Not installed')
  echo '=== Docker GPU ===' && (docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu24.04 nvidia-smi --query-gpu=name --format=csv,noheader 2>/dev/null || echo 'N/A')
  echo '=== Tailscale ===' && tailscale status
  echo '=== Node.js ===' && (node --version 2>/dev/null || echo 'Not installed')
  echo '=== Python ===' && python3 --version
  echo '=== Disk ===' && df -h /
"

docker exec headscale-standby headscale nodes list
```

## Post-Creation

1. Update `/home/zvx/projects/.ref/docs/hardware/environment.md` with the new VM's IP and Tailscale IP
2. Update `/home/zvx/projects/.ref/docs/services/services.md` once services are deployed
3. Remove the downloaded cloud image if disk space is tight: `ssh root@$PVE_HOST 'rm /var/lib/vz/template/iso/noble-server-cloudimg-amd64.img'`