echo6-docs/runbooks/proxmox-create-ubuntu-vm.md
Matt Johnson e9231ac24a Migration: consolidate Echo6 docs to cortex with full infrastructure cleanup sync
2026-04-13 06:02:16 +00:00

Proxmox — Create Ubuntu VM (Cloud-Init)

Automated VM creation using Ubuntu cloud images. No interactive installer needed.

Prerequisites

  • SSH access to the target Proxmox host (directly or via jump box)
  • Headscale running on Contabo with a valid preauth key
  • Target Proxmox host has sufficient resources (check with pvesm status, free -h, nproc)

Variables — Prompt the User

Before executing any steps, prompt the user for ALL of the following values. Present them one group at a time. Suggest defaults in parentheses where noted. Do not proceed until all values are confirmed.

Group 1 — Identity

Prompt for these first:

  • VM name — hostname for the VM (e.g., cortex)
  • Proxmox host — which node to create it on? (e.g., toc, data, utility)

Group 2 — Networking

Once identity is set, prompt:

  • VMID — 150-199 range per convention. Suggest next available by checking qm list on the host.
  • Static IP — suggest matching VMID (e.g., VMID 150 → 192.168.1.150). Verify it's not already in use.
  • Gateway — (default: 192.168.1.1)
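
The VMID and IP conventions above can be sketched as two small helpers. The function names `next_free_vmid` and `default_ip_for_vmid` are hypothetical, and the parsing assumes the usual `qm list` layout with the VMID in the first column.

```shell
# Hypothetical helpers; the names are illustrative, not existing Echo6 tooling.

# Print the lowest unused VMID in the 150-199 range.
# Reads `qm list` output on stdin (assumes the VMID is the first column).
next_free_vmid() {
  local used id
  used=$(awk '$1 ~ /^[0-9]+$/ { print $1 }')
  id=150
  while [ "$id" -le 199 ]; do
    if ! printf '%s\n' "$used" | grep -qx "$id"; then
      echo "$id"
      return 0
    fi
    id=$((id + 1))
  done
  echo "no free VMID in 150-199" >&2
  return 1
}

# Derive the suggested static IP from a VMID (VMID 150 -> 192.168.1.150).
default_ip_for_vmid() {
  case $1 in
    '' | *[!0-9]*) echo "VMID must be numeric: $1" >&2; return 1 ;;
  esac
  if [ "$1" -lt 150 ] || [ "$1" -gt 199 ]; then
    echo "VMID outside 150-199: $1" >&2
    return 1
  fi
  echo "192.168.1.$1"
}
```

One possible use is `ssh root@$PVE_HOST 'qm list' | next_free_vmid`, followed by `default_ip_for_vmid` on the result. Neither helper checks whether the suggested IP is already in use, so that check still has to be done separately.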

Group 3 — Resources

Prompt for hardware allocation. Check host resources first (nproc, free -h, pvesm status) and suggest reasonable values:

  • CPU cores/threads — how many to allocate?
  • RAM (MB) — how much?
  • Disk (GB) — how large?

Group 4 — Features

Prompt yes/no for each:

  • GPU passthrough? — if yes, detect PCI address automatically via lspci -nn | grep -i nvidia on the host. Requires IOMMU+VFIO already configured.
  • Install Docker?
  • Install NVIDIA drivers? — only relevant if GPU passthrough is yes.
  • Install Node.js? — for Claude Code.
  • Register with Tailscale? — if yes, Tailscale hostname defaults to VM name.

Summary

After collecting all values, present a summary table and ask for confirmation before executing.
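
As a rough sketch, the collected answers could be held in the shell variables that the later steps reference and printed as a table before confirmation. The values below are placeholders, `print_summary` is a hypothetical helper, and `INSTALL_TAILSCALE` is an assumed name because the steps below do not define a flag for Tailscale.

```shell
# Placeholder values only; replace with the answers collected from the user.
VM_NAME="cortex"
PVE_HOST="toc"
VMID=150
VM_IP="192.168.1.150"
GATEWAY="192.168.1.1"
CORES=4
RAM_MB=8192
DISK_GB=64
GPU_PASSTHROUGH=no
INSTALL_DOCKER=yes
INSTALL_NVIDIA=no
INSTALL_NODEJS=yes
INSTALL_TAILSCALE=yes   # assumed name; not defined elsewhere in this runbook
TAILSCALE_HOSTNAME="$VM_NAME"

# Print every collected value as an aligned two-column table.
# printf reuses the format string for each label/value pair.
print_summary() {
  printf '%-20s %s\n' \
    "VM_NAME" "$VM_NAME" \
    "PVE_HOST" "$PVE_HOST" \
    "VMID" "$VMID" \
    "VM_IP" "$VM_IP" \
    "GATEWAY" "$GATEWAY" \
    "CORES" "$CORES" \
    "RAM_MB" "$RAM_MB" \
    "DISK_GB" "$DISK_GB" \
    "GPU_PASSTHROUGH" "$GPU_PASSTHROUGH" \
    "INSTALL_DOCKER" "$INSTALL_DOCKER" \
    "INSTALL_NVIDIA" "$INSTALL_NVIDIA" \
    "INSTALL_NODEJS" "$INSTALL_NODEJS" \
    "INSTALL_TAILSCALE" "$INSTALL_TAILSCALE" \
    "TAILSCALE_HOSTNAME" "$TAILSCALE_HOSTNAME"
}

print_summary
```

A confirmation prompt such as `read -r -p "Proceed? [y/N] " answer` can follow the table when the runbook is executed interactively.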

Step 1 — Download Ubuntu Cloud Image

Check if image already exists on the host. Only download if missing.

ssh root@$PVE_HOST 'ls /var/lib/vz/template/iso/noble-server-cloudimg-amd64.img 2>/dev/null \
  || wget -P /var/lib/vz/template/iso/ \
  https://cloud-images.ubuntu.com/noble/current/noble-server-cloudimg-amd64.img'
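
The download above does not verify the image. Ubuntu publishes a SHA256SUMS file next to its cloud images, so a check along the following lines is possible. This is a minimal sketch: `verify_image` is a hypothetical helper, and it assumes the SHA256SUMS file has already been fetched to a local path.

```shell
# Check a downloaded image against a SHA256SUMS file.
# Usage: verify_image /path/to/image.img /path/to/SHA256SUMS
# Returns 0 on match, 1 on mismatch, 2 if the file is not listed.
verify_image() {
  local image="$1" sums="$2" name expected actual
  name=$(basename "$image")
  # SHA256SUMS lines look like "<hash> *<filename>" or "<hash>  <filename>".
  expected=$(awk -v f="$name" \
    '{ n = $2; sub(/^\*/, "", n) } n == f { print $1; exit }' "$sums")
  if [ -z "$expected" ]; then
    echo "no checksum listed for $name" >&2
    return 2
  fi
  actual=$(sha256sum "$image" | awk '{ print $1 }')
  [ "$expected" = "$actual" ]
}
```

On the Proxmox host this might be used as `verify_image /var/lib/vz/template/iso/noble-server-cloudimg-amd64.img /tmp/SHA256SUMS`. An image cached from an earlier download may not match the checksums of the current release, in which case re-downloading is the simplest fix.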

Step 2 — Create VM

ssh root@$PVE_HOST "qm create $VMID \
  --name $VM_NAME \
  --memory $RAM_MB \
  --cores $CORES \
  --cpu cputype=host \
  --scsihw virtio-scsi-single \
  --net0 virtio,bridge=vmbr0 \
  --ostype l26 \
  --bios ovmf \
  --efidisk0 local-lvm:1 \
  --machine q35 \
  --agent enabled=1 \
  --onboot 1"

Step 3 — Import Cloud Image as Disk

ssh root@$PVE_HOST "qm importdisk $VMID /var/lib/vz/template/iso/noble-server-cloudimg-amd64.img local-lvm"

# Attach the imported disk (named vm-$VMID-disk-1 because the EFI disk from Step 2 already took disk-0)
ssh root@$PVE_HOST "qm set $VMID --scsi0 local-lvm:vm-${VMID}-disk-1,iothread=1,discard=on"

# Resize
ssh root@$PVE_HOST "qm resize $VMID scsi0 ${DISK_GB}G"
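
The attach command assumes the imported volume is named vm-$VMID-disk-1. If that assumption is in doubt, the name can be read back from `qm config`, where a freshly imported disk appears as an `unusedN` entry. The helper name `first_unused_volume` is hypothetical.

```shell
# Print the volume ID of the first unused disk found in `qm config` output.
# Reads stdin; example input line: "unused0: local-lvm:vm-150-disk-1"
# Exits non-zero if no unused disk is listed.
first_unused_volume() {
  awk -F': ' '/^unused[0-9]+: / { print $2; found = 1; exit }
              END { exit !found }'
}
```

A possible use, run before the attach step: `VOLUME=$(ssh root@$PVE_HOST "qm config $VMID" | first_unused_volume)`, then pass `${VOLUME},iothread=1,discard=on` to `--scsi0` in place of the hard-coded name.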

Step 4 — Configure Cloud-Init

# Add cloud-init drive
ssh root@$PVE_HOST "qm set $VMID --ide2 local-lvm:cloudinit"

# Confirm the Proxmox host has SSH public keys to inject into the VM.
# If this prints an error, copy authorized_keys from another node before continuing.
ssh root@$PVE_HOST 'test -s /root/.ssh/authorized_keys || echo "ERROR: No SSH keys on host"'

# Configure cloud-init
ssh root@$PVE_HOST "qm set $VMID \
  --ciuser zvx \
  --cipassword temp-change-me \
  --ipconfig0 ip=${VM_IP}/24,gw=${GATEWAY} \
  --nameserver 1.1.1.1 \
  --searchdomain echo6.co \
  --sshkeys /root/.ssh/authorized_keys \
  --boot order=scsi0"

Alternative: Use standard Echo6 cloud-init snippet

If the snippet is already on the node (deployed by echo6-onboard-node.sh), you can use it instead of the manual --ciuser/--cipassword config above. This pre-installs sshpass, curl, git, htop, and other standard packages via cloud-init:

ssh root@$PVE_HOST "qm set $VMID --cicustom \"user=local:snippets/echo6-base-userdata.yml\""

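Note: You still need --ipconfig0, --nameserver, --searchdomain, --sshkeys, and --boot from the block above. The snippet only covers packages and manage_etc_hosts. Be aware that a custom user= file replaces the user-data Proxmox generates, which normally carries the --ciuser, --cipassword, and --sshkeys values, so confirm that the zvx account and its SSH keys actually exist on first boot before relying on this path.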

Step 5 — GPU Passthrough (if enabled)

Skip if GPU_PASSTHROUGH=no.

Requires IOMMU and VFIO already configured on the Proxmox host. Verify first:

ssh root@$PVE_HOST 'dmesg | grep -i -e DMAR -e IOMMU -e AMD-Vi'
ssh root@$PVE_HOST "lspci -nnk -s $GPU_PCI_ADDR | grep 'Kernel driver in use: vfio-pci'"

If both check out:

ssh root@$PVE_HOST "qm set $VMID --hostpci0 ${GPU_PCI_ADDR},pcie=1,x-vga=0"

If VFIO is NOT configured, stop and follow the IOMMU/VFIO setup procedure before continuing.
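
The PCI address detection and the VFIO check can be scripted roughly as follows. Both helper names are hypothetical, and the parsing assumes the standard `lspci` text layout with the PCI address in the first field.

```shell
# Print the PCI address of the first NVIDIA display device.
# Reads `lspci -nn` output on stdin; the GPU's audio function is skipped
# because only VGA and 3D controller lines are matched.
first_nvidia_pci_addr() {
  awk 'tolower($0) ~ /nvidia/ && /VGA compatible controller|3D controller/ {
         print $1; found = 1; exit
       }
       END { exit !found }'
}

# Succeed only if `lspci -nnk -s <addr>` output (stdin) shows vfio-pci bound.
bound_to_vfio() {
  grep -q 'Kernel driver in use: vfio-pci'
}
```

For example, `GPU_PCI_ADDR=$(ssh root@$PVE_HOST 'lspci -nn' | first_nvidia_pci_addr)` sets the address, and `ssh root@$PVE_HOST "lspci -nnk -s $GPU_PCI_ADDR" | bound_to_vfio` gates the `qm set --hostpci0` command.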

Step 6 — Start VM and Wait for Boot

ssh root@$PVE_HOST "qm config $VMID"
ssh root@$PVE_HOST "qm start $VMID"

echo "Waiting for VM to boot and run cloud-init..."
sleep 60
until ssh -o ConnectTimeout=5 -o StrictHostKeyChecking=no zvx@$VM_IP 'hostname' 2>/dev/null; do
  echo "Still waiting..."
  sleep 15
done

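If the VM doesn't come up after 3 minutes, check the console in the Proxmox web UI. A serial terminal is also an option, but only if the VM has a serial port (for example, one added with qm set $VMID --serial0 socket before boot); the qm create command in Step 2 does not add one: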

ssh root@$PVE_HOST "qm terminal $VMID"
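
The wait loop above retries forever, so the 3-minute limit has to be watched by hand. A bounded variant might look like this sketch, where `wait_until` is a hypothetical helper that retries any command until it succeeds or a deadline passes.

```shell
# Retry a command until it succeeds or the deadline passes.
# Usage: wait_until <timeout_seconds> <interval_seconds> <command> [args...]
# The command's stderr is discarded; its stdout is passed through.
wait_until() {
  local timeout="$1" interval="$2" deadline
  shift 2
  deadline=$(( $(date +%s) + timeout ))
  until "$@" 2>/dev/null; do
    if [ "$(date +%s)" -ge "$deadline" ]; then
      echo "timed out after ${timeout}s: $*" >&2
      return 1
    fi
    sleep "$interval"
  done
}
```

With the values from this step, `wait_until 180 15 ssh -o ConnectTimeout=5 -o StrictHostKeyChecking=no zvx@$VM_IP hostname` gives up after roughly 3 minutes and returns non-zero, which is the cue to check the console.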

Step 7 — Base System Setup

# qemu-guest-agent backs the --agent enabled=1 setting from Step 2
ssh zvx@$VM_IP 'sudo apt-get update && sudo apt-get install -y \
  qemu-guest-agent \
  sshpass curl wget git htop iotop tmux vim \
  rsync tree jq unzip \
  net-tools dnsutils \
  python3 python3-pip python3-venv \
  sudo'

Step 8 — Enable Password Authentication

Ubuntu cloud images default to key-only SSH via a drop-in config. Enable password auth since all machines are behind VPN/local network.

# Fix the cloud-init drop-in that disables password auth
ssh zvx@$VM_IP 'echo "PasswordAuthentication yes" | sudo tee /etc/ssh/sshd_config.d/60-cloudimg-settings.conf'

# Also set in main config for completeness
ssh zvx@$VM_IP 'sudo sed -i "s/^#*PasswordAuthentication.*/PasswordAuthentication yes/" /etc/ssh/sshd_config'
ssh zvx@$VM_IP 'sudo sed -i "s/^#*KbdInteractiveAuthentication.*/KbdInteractiveAuthentication yes/" /etc/ssh/sshd_config'
ssh zvx@$VM_IP 'sudo systemctl restart ssh'

# Change the default password immediately (-t allocates a TTY so passwd can prompt without echoing input)
ssh -t zvx@$VM_IP 'passwd'

Important: Password authentication is the default for Echo6 infrastructure. All machines are protected by VPN (Headscale/Tailscale) and local network — key-only auth creates unnecessary friction for multi-machine access.
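
Because sshd keeps the first value it reads for each keyword, and Ubuntu's default sshd_config includes the sshd_config.d drop-ins near the top, another drop-in can still override these edits. One way to confirm the effective setting is to inspect `sshd -T`, which prints the resolved configuration with lowercase keywords. The helper name below is hypothetical.

```shell
# Succeed if `sshd -T` output (stdin) reports password authentication enabled.
password_auth_enabled() {
  grep -qix 'passwordauthentication yes'
}
```

For example: `ssh zvx@$VM_IP 'sudo sshd -T' | password_auth_enabled && echo "password auth is on"`.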

Step 9 — NVIDIA Drivers (if GPU passthrough)

Skip if INSTALL_NVIDIA=no.

ssh zvx@$VM_IP 'lspci | grep -i nvidia'

# Install driver
ssh zvx@$VM_IP 'sudo apt-get update && sudo apt-get install -y nvidia-driver-550'
ssh zvx@$VM_IP 'sudo reboot'

sleep 60
until ssh -o ConnectTimeout=5 zvx@$VM_IP 'hostname' 2>/dev/null; do
  echo "Waiting for reboot..."
  sleep 15
done

ssh zvx@$VM_IP 'nvidia-smi'

Verify: nvidia-smi should show the GPU name, driver version, and VRAM.

Step 10 — Docker (if enabled)

Skip if INSTALL_DOCKER=no.

ssh zvx@$VM_IP 'curl -fsSL https://get.docker.com | sh'
ssh zvx@$VM_IP 'sudo usermod -aG docker zvx'

NVIDIA Container Toolkit (only if GPU + Docker)

ssh zvx@$VM_IP 'curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg && \
  curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed "s#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g" | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list'

ssh zvx@$VM_IP 'sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit'
ssh zvx@$VM_IP 'sudo nvidia-ctk runtime configure --runtime=docker'
ssh zvx@$VM_IP 'sudo systemctl restart docker'

# Test
ssh zvx@$VM_IP 'docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu24.04 nvidia-smi'

Step 11 — Node.js (if enabled)

Skip if INSTALL_NODEJS=no.

ssh zvx@$VM_IP 'curl -fsSL https://deb.nodesource.com/setup_22.x | sudo bash - && \
  sudo apt-get install -y nodejs'

Step 12 — Tailscale Registration

Skip if Tailscale registration was declined in Group 4.

Generate a preauth key on Contabo first:

docker exec headscale-standby headscale preauthkeys create --user echo6 --reusable --expiration 72h

Then register the VM:

ssh zvx@$VM_IP 'curl -fsSL https://tailscale.com/install.sh | sh'
ssh zvx@$VM_IP 'sudo systemctl enable tailscaled && sudo systemctl start tailscaled'
ssh zvx@$VM_IP "sudo tailscale up --login-server https://vpn.echo6.co --auth-key <KEY> --hostname $TAILSCALE_HOSTNAME"

# Verify
ssh zvx@$VM_IP 'tailscale status'
docker exec headscale-standby headscale nodes list

Step 13 — Final Verification

ssh zvx@$VM_IP "
  echo '=== Hostname ===' && hostname
  echo '=== IP ===' && ip -4 addr show | grep 'inet 192'
  echo '=== Kernel ===' && uname -r
  echo '=== GPU ===' && (nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv,noheader 2>/dev/null || echo 'No GPU')
  echo '=== Docker ===' && (docker --version 2>/dev/null || echo 'Not installed')
  echo '=== Docker GPU ===' && (docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu24.04 nvidia-smi --query-gpu=name --format=csv,noheader 2>/dev/null || echo 'N/A')
  echo '=== Tailscale ===' && tailscale status
  echo '=== Node.js ===' && (node --version 2>/dev/null || echo 'Not installed')
  echo '=== Python ===' && python3 --version
  echo '=== Disk ===' && df -h /
"

docker exec headscale-standby headscale nodes list

Post-Creation

  1. Update /home/zvx/projects/.ref/docs/hardware/environment.md with the new VM's IP and Tailscale IP
  2. Update /home/zvx/projects/.ref/docs/services/services.md once services are deployed
  3. Remove the downloaded cloud image if disk space is tight (Step 1 downloads it again when needed): ssh root@$PVE_HOST 'rm /var/lib/vz/template/iso/noble-server-cloudimg-amd64.img'