# Project: PeerTube Phase 2 — Import Pipeline Build

**Goal:** Build a complete YouTube download → GPU transcode → local import pipeline for 99 channels (~70K+ videos, ~15.3TB) on a fresh PeerTube v8 instance. Clean slate — no legacy code, no old pipeline files. Build it right from scratch.

**CC Host:** cortex (SSH to all nodes via aliases in ~/.ssh/config; Proxmox nodes use sshpass auth)

---

## SSH Prerequisites — RUN FIRST

**Every CC session must verify SSH connectivity before executing any remote commands. Never assume SSH works.**

### Verify cortex → CT 110 (PeerTube)

```bash
# CT 110 uses sshpass auth (same as all LXCs). Check ~/.ssh/config for alias.
# Try alias first, fall back to IP:
ssh -o ConnectTimeout=5 peertube 'hostname' 2>/dev/null \
  || sshpass -p '7redditGold' ssh -o StrictHostKeyChecking=accept-new -o ConnectTimeout=5 zvx@192.168.1.170 'hostname'
```

### Verify cortex → media node (Proxmox host, for pct commands if needed)

```bash
sshpass -p '7redditGold' ssh -o StrictHostKeyChecking=accept-new -o ConnectTimeout=5 root@192.168.1.243 'hostname'
```

### Gate

Both must return hostnames. **Stop and fix SSH before proceeding with ANY step.**

If aliases don't exist in `~/.ssh/config`, add them:

```bash
grep -q "Host peertube$" ~/.ssh/config 2>/dev/null || cat >> ~/.ssh/config << 'EOF'

Host peertube
    HostName 192.168.1.170
    User zvx
EOF
```

Note: Most pipeline work runs as the `peertube` user inside CT 110. SSH in as zvx, then `sudo -u peertube` or `sudo su - peertube` as needed.

---

## Runbook References

These runbooks live in `~/runbooks/` on cortex.
Call them by name when their scope applies:

| Runbook | When to Use in Phase 2 |
|---------|----------------------|
| **`nordvpn-lxc.md`** | **Step 3 — RUN THIS RUNBOOK.** VPN setup on CT 110 with TUN device, NordVPN/WireGuard, split tunneling, rotation script |
| **`peertube-remote-runner.md`** | **ACTIVE — used for video-transcription (Whisper captioning).** Runner on cortex handles auto-captioning with smart GPU/CPU routing. Not used for H.265 video transcoding (pipeline handles that). See runbook for Whisper setup details. |
| `ct-runbook.md` | If CT 110 needs additional packages or baseline changes (provisioned in Phase 1 — reference only) |
| `expose-service-home.md` | stream.echo6.co is already exposed (Phase 1). Reference only if Caddy/DNS/cert issues arise |
| `authentik-oidc-application.md` | PeerTube OIDC already configured (Phase 1). Reference only if SSO breaks |
| `pi-nas-omv-runbook.md` | If NFS storage issues arise (mount problems, permissions, OMV config) |
| `proxmox-onboard-node.md` | SSH access patterns — the Phase 1 prereq pattern above follows this runbook's conventions |
| `proxmox-create-ubuntu-vm.md` | If cortex needs modifications (GPU passthrough, NVIDIA drivers, Docker). Reference only |

**Not applicable to Phase 2:** idahomesh-*, meshmonitor-*, meshtasticd-* runbooks.

---

## Infrastructure (Read-Only Context — Do Not Modify)

### PeerTube Instance

- **CT 110** on **media** node (Proxmox)
- Local IP: 192.168.1.170
- Tailscale IP: 100.64.0.23
- OS: Debian 12, privileged LXC
- PeerTube v8 — **native install** (NOT Docker). No `docker exec` for anything.
- Runs as user: `peertube`
- PostgreSQL: local, accessible via `sudo -u postgres psql peertube_prod` or `sudo -u peertube psql peertube_prod`
- Redis: local
- Nginx: local (port 80), proxied through Caddy on utility node
- Domain: stream.echo6.co
- NFS storage: 18TB from pi-nas (192.168.1.245) mounted at `/var/www/peertube/storage/`
- NFS export path: `/srv/dev-disk-by-uuid-822575b9-1549-4aab-823e-8160d2aa7c68/peertube/`
- PeerTube config: `/var/www/peertube/config/local-production.json` (v8 uses JSON, not YAML)
- PeerTube base dir: `/var/www/peertube/`
- Built-in channel sync: DISABLED (bulk pipeline handles imports)
- Signup: disabled (Authentik SSO only)

### GPU Pre-Transcoding (H.265 via NVENC)

- **cortex** — VM on TOC node, RTX A4000 GPU passthrough
- cortex is also the CC host and runs Ollama/Aurora
- NVENC is separate silicon from CUDA — transcoding won't conflict with LLM inference
- **PeerTube's built-in transcoding is DISABLED** — remote runners ignore transcoding plugins, so there's no way to get H.265 through the runner pipeline
- Instead: a `transcoder.py` service on cortex pulls downloaded videos from CT 110, re-encodes to H.265 with `hevc_nvenc`, pushes back. The importer then uploads already-transcoded files to PeerTube with `waitTranscoding=false`
- Target: H.265, 1080p only, single file per video (no HLS adaptive — LAN/Tailscale viewers don't need it)
- ffmpeg command: `ffmpeg -i input.mp4 -c:v hevc_nvenc -preset medium -cq 28 -c:a aac -b:a 128k output.mp4`
- File transfer: cortex pulls from CT 110 via rsync/SSH, transcodes locally to avoid NFS latency on GPU work, pushes result back

### Runner Service (ACTIVE — video-transcription/captioning)

Runner on cortex handles Whisper auto-captioning. Also registered for VOD transcoding jobs but H.265 video transcoding goes through the pipeline transcoder instead.
```ini
[Unit]
Description=PeerTube Remote Runner (NVENC)
After=network-online.target nvidia-persistenced.service
Wants=network-online.target
Requires=nvidia-persistenced.service

[Service]
Type=simple
User=zvx
Group=zvx
Environment=NODE_ENV=production
Environment=PATH=/opt/peertube-runner/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
ExecStart=/usr/bin/peertube-runner server --enable-job vod-hls-transcoding --enable-job vod-audio-merge-transcoding --enable-job live-rtmp-hls-transcoding --enable-job video-studio-transcoding --enable-job video-transcription
WorkingDirectory=/home/zvx
Restart=always
RestartSec=30
StandardOutput=journal
StandardError=journal
SyslogIdentifier=peertube-runner
MemoryMax=20G

[Install]
WantedBy=multi-user.target
```

**Whisper config:** Smart wrapper at `/usr/local/bin/whisper-smart` routes <1hr to GPU (CUDA float16), >=1hr to CPU (int8). CPU jobs serialized via flock. Runner concurrency=2 (1 GPU + 1 CPU in parallel). Model: medium. See `peertube-remote-runner.md` for full details.

### Recovered Runner Health Script

```bash
#!/bin/bash
LOG_TAG="peertube-runner-health"

if ! systemctl is-active --quiet peertube-runner; then
    logger -t $LOG_TAG "Runner not active, restarting..."
    systemctl restart peertube-runner
    sleep 10
fi

if ! pgrep -f "peertube-runner server" > /dev/null; then
    logger -t $LOG_TAG "Runner process not found, restarting service..."
    systemctl restart peertube-runner
fi

if ! nvidia-smi > /dev/null 2>&1; then
    logger -t $LOG_TAG "GPU not accessible, restarting nvidia-persistenced and runner..."
    systemctl restart nvidia-persistenced
    sleep 5
    systemctl restart peertube-runner
fi
```

### SSH / Access

- cortex → CT 110: `ssh peertube` or `ssh root@192.168.1.170` (check ~/.ssh/config)
- cortex → Proxmox nodes: uses sshpass (aliases in ~/.ssh/config)
- CT 110 user for pipeline: `peertube` (same user that runs the PeerTube process)

### VPN

- NordVPN account exists, needs fresh setup on CT 110
- LXC may not support NordVPN CLI (systemd issues) — WireGuard configs as fallback
- Rotation countries: US, CA, UK, DE, NL, SE
- Split tunnel / killswitch off so PeerTube stays accessible locally

---

## Channel Map — The 99 Channels

### Recovered Schema (from old WATCHTOWER add_channel.py)

```json
{
  "category": "Tactical/SUT",
  "channel_name": "(YT)Garand Thumb",
  "actor_name": "garand-thumb",
  "youtube_url": "https://www.youtube.com/@GarandThumb",
  "youtube_channel_id": null,
  "peertube_channel_id": null,
  "video_count": 0,
  "priority": "H",
  "est_videos": 500,
  "est_gb": 98
}
```

### Recovered Slug Function

```python
import re

def slugify_channel(name):
    """Convert channel name to PeerTube-safe actor_name."""
    name = re.sub(r'^\(YT\)\s*', '', name)
    slug = re.sub(r'[^a-z0-9]+', '-', name.lower()).strip('-')
    return slug[:50] or 'channel'
```

### Known YouTube URLs (from old PeerTube sync records — 24 channels)

These 24 channels had active sync records with confirmed YouTube URLs:

```
Essential Craftsman      → @essentialcraftsman
CommsPrepper             → @CommsPrepper
Steven Lavimoniere       → @StevenLavimoniere
Andreas Spiess           → @AndreasSpiess
Mustie1                  → @mustie1
Donyboy73                → @Donyboy73
Turn a Wood Bowl         → @TurnaWoodBowl
RoseRed Homestead        → @RoseRedHomestead
Homesteading Family      → @HomesteadingFamily
My Self Reliance         → @MySelfReliance
RegisteredNurseRN        → @RegisteredNurseRN
Skinny Medic             → @SkinnyMedic
Marine X                 → @MarineX
Plumberparts             → @plumberparts
MedCram                  → @Medcram
City Prepping            → @CityPrepping
Paul Kirtley             → @PaulKirtley
Armando Hasudungan       → playlist?list=UUesNt4_Z-Pm41RzpAClfVcg
Self Sufficient Me       → @Selfsufficientme
Taryl Fixes All          → @TarylFixesAll
Engineer775              → @engineer775
WeberAuto                → @WeberAuto
Sun Knudsen              → @sunknudsen
Master Your Medics       → @MasterYourMedics
MCQBushcraft             → @MCQBushcraft
ChrisFix                 → @ChrisFix
```

### The 99 Channels (Finalized Feb 2026)

#### OPSEC / Privacy (6)

| Channel | Priority | Notes |
|---------|----------|-------|
| Michael Bazzell / IntelTechniques | H | OSINT + digital privacy, ex-FBI |
| The Hated One | H | Privacy advocacy, surveillance deep-dives |
| Mental Outlaw | H | Linux + privacy + infosec news |
| Naomi Brockwell TV | M | Privacy-focused tech |
| Techlore | M | Privacy tools and comparisons |
| Sun Knudsen | M | Step-by-step privacy hardening |

#### Physical Security (2)

| Channel | Priority | Notes |
|---------|----------|-------|
| Deviant Ollam | H | Physical penetration testing, lock bypass |
| BosnianBill | M | Lock picking, physical security analysis |

#### Intelligence / OSINT (4)

| Channel | Priority | Notes |
|---------|----------|-------|
| OSINT Dojo | H | OSINT methodology training |
| Benjamin Strick | H | Professional OSINT investigations |
| OSINT Curious | M | OSINT tools and techniques |
| S2 Underground | H | Threat intel, analysis tradecraft |

#### Cybersecurity (7)

| Channel | Priority | Notes |
|---------|----------|-------|
| John Hammond | H | CTF walkthroughs, malware analysis |
| IppSec | H | HackTheBox walkthroughs |
| LiveOverflow | H | Binary exploitation, web security |
| Professor Messer | M | CompTIA certification training |
| The Cyber Mentor | M | Ethical hacking courses |
| Hak5 | M | Hacking tools and techniques |
| David Bombal | M | Networking + cybersecurity |

#### Tactical / SUT (6)

| Channel | Priority | Notes |
|---------|----------|-------|
| Garand Thumb | H | Tactics, gear testing, NV |
| Dirty Civilian | H | SUT for civilians |
| One Shepherd | H | Former SOF, tactical training |
| Brent0331 | H | USMC veteran, tactical analysis |
| Brass Facts | M | Firearms philosophy, gear testing |
| Sage Dynamics | H | Research-based torture tests |

#### Firearms (8)

| Channel | Priority | Notes |
|---------|----------|-------|
| Forgotten Weapons | H | Historical + technical firearms (largest channel, ~3K videos) |
| Paul Harrell | H | Terminal ballistics, practical shooting |
| 9-Hole Reviews | M | Precision rifle, historical accuracy |
| Lucky Gunner | M | Ammo testing, concealed carry |
| C&Rsenal | M | WWI/WWII firearms deep-dives |
| Jerry Miculek | M | Speed shooting, competition |
| InRangeTV | M | Firearms + mud tests |
| Hickok45 | M | Reviews + shooting demonstrations |

#### Comms / Signals (7)

| Channel | Priority | Notes |
|---------|----------|-------|
| OH8STN | H | Off-grid digital comms, Winlink |
| Andreas Spiess | H | Electronics + LoRa + radio |
| Ham Radio Crash Course | H | Amateur radio training |
| Tech Minds | M | SDR, radio tech |
| The Comms Channel | M | Comms gear and planning |
| KM4ACK | H | Build-a-Pi, ham radio software |
| Signals Everywhere | M | SDR + spectrum analysis |

#### Medical (5)

| Channel | Priority | Notes |
|---------|----------|-------|
| PrepMedic | H | Flight paramedic, trauma care |
| Skinny Medic | H | IFAK, trauma kits |
| MedWild | H | Wilderness medicine |
| Crisis Medicine | H | Former 18D SF Medic, TCCC |
| Ninja Nerd | H | Comprehensive physiology/pathology |

#### Linux / Infrastructure (6)

| Channel | Priority | Notes |
|---------|----------|-------|
| Lawrence Systems | H | Enterprise networking + Linux |
| Learn Linux TV | H | Linux tutorials and homelab |
| Jeff Geerling | H | Raspberry Pi, Ansible, self-hosting |
| Techno Tim | M | Homelab, Docker, Kubernetes |
| Level1Techs | M | Hardware + Linux deep-dives |
| Wolfgang's Channel | M | Self-hosting, privacy infra |

#### Hardware / Electronics (4)

| Channel | Priority | Notes |
|---------|----------|-------|
| Ben Eater | H | Computer architecture from scratch |
| EEVblog | H | Electronics engineering |
| GreatScott! | M | Electronics projects |
| Big Clive | M | Electronics teardowns |

#### Auto / Mechanical (7)

| Channel | Priority | Notes |
|---------|----------|-------|
| ChrisFix | H | DIY auto repair fundamentals |
| Mustie1 | H | Dead machinery resurrection |
| South Main Auto | H | Diagnostic logic |
| 1A Auto | H | Make/model/year repair encyclopedia (~4,500 videos) |
| Pine Hollow Auto Diagnostics | M | Advanced diagnostics |
| ScannerDanner | M | Master electrical diagnostics |
| Diesel Creek | M | Heavy equipment repair |

#### Construction / Trades (7)

| Channel | Priority | Notes |
|---------|----------|-------|
| Essential Craftsman | H | Construction + life skills |
| Matt Risinger | H | Building science |
| Mike Haduck Masonry | M | Foundations, concrete, stone |
| Awesome Framers | M | Structural framing |
| This Old House | M | Home renovation |
| Electrician U | M | Electrical trade training |
| Got2Learn | M | Plumbing/electrical tutorials |

#### Welding / Fabrication (3)

| Channel | Priority | Notes |
|---------|----------|-------|
| Welding Tips and Tricks | H | Welding instruction |
| ChuckE2009 | M | Welding + fabrication |
| Paul Sellers | H | Hand tool woodworking master |

#### Sustainment / Fieldcraft (2)

| Channel | Priority | Notes |
|---------|----------|-------|
| Corporals Corner | H | Field skills, shelter, fire |
| Gray Bearded Green Beret | H | SF wilderness medicine + fieldcraft |

#### Homesteading / Production (8)

| Channel | Priority | Notes |
|---------|----------|-------|
| City Prepping | H | Urban/suburban preparedness |
| My Self Reliance | H | Off-grid building |
| Engineer775 | H | Off-grid power systems |
| Project Farm | H | Tool and product testing |
| Will Prowse / DIY Solar Power | H | Solar power systems |
| Townsends | M | 18th century skills + cooking |
| RoseRed Homestead | M | Homesteading skills |
| The Urban Prepper | M | Urban preparedness, modular bags |

#### Preparedness (1)

| Channel | Priority | Notes |
|---------|----------|-------|
| The Provident Prepper | M | Preparedness planning methodology |

#### Energy / Alt-Fuel (1)

| Channel | Priority | Notes |
|---------|----------|-------|
| Adeptus Beta | M | Wood gasification (~7GB, tiny) |

#### Education / STEM (6)

| Channel | Priority | Notes |
|---------|----------|-------|
| Practical Engineering | H | Civil engineering with demos |
| Real Engineering | M | Aerospace, energy, transport |
| The Efficient Engineer | M | Core engineering fundamentals |
| NurdRage | M | Chemistry experiments |
| NileRed | M | Chemistry deep-dives |
| Veritasium | M | Science + engineering |

#### Education / Math (2)

| Channel | Priority | Notes |
|---------|----------|-------|
| Professor Leonard | H | Full calculus + stats lectures |
| Organic Chemistry Tutor | M | Math + science tutorials |

#### Education / CS (2)

| Channel | Priority | Notes |
|---------|----------|-------|
| Computerphile | H | Crypto, networking theory, security concepts |
| MIT Missing Semester | M | Shell, git, dev tools (tiny, ~50 videos) |

#### Small Engine (1)

| Channel | Priority | Notes |
|---------|----------|-------|
| Donyboy73 | M | Small engine repair |

#### Woodworking (1)

| Channel | Priority | Notes |
|---------|----------|-------|
| Steve Ramsey | M | Beginner woodworking |

#### Home Repair (2)

| Channel | Priority | Notes |
|---------|----------|-------|
| Home RenoVision DIY | M | Home repair tutorials |
| Roger Wakefield | M | Plumbing |

#### Bushcraft (1)

| Channel | Priority | Notes |
|---------|----------|-------|
| Joe Robinet | M | Bushcraft and camping |

**Total: 99 channels across 24 categories**

---

## Execution Steps

### Step 1: Channel Map Generation

**Where:** CT 110
**What:** Build `/opt/bulk-import/config/channel-map.json`

**SSH Gate:** `ssh peertube 'hostname'` must succeed before proceeding.

1. Create directory structure:

   ```bash
   # Scripts and config on local disk
   mkdir -p /opt/bulk-import/{config,logs}
   chown -R peertube:peertube /opt/bulk-import

   # Video data on NFS (18TB pi-nas mount)
   mkdir -p /var/www/peertube/storage/pipeline/{staging,completed,transcoded,failed}
   chown -R peertube:peertube /var/www/peertube/storage/pipeline

   # Symlink data dirs so scripts use /opt/bulk-import/ paths
   ln -sfn /var/www/peertube/storage/pipeline/staging /opt/bulk-import/staging
   ln -sfn /var/www/peertube/storage/pipeline/completed /opt/bulk-import/completed
   ln -sfn /var/www/peertube/storage/pipeline/transcoded /opt/bulk-import/transcoded
   ln -sfn /var/www/peertube/storage/pipeline/failed /opt/bulk-import/failed
   ```

2. For each of the 99 channels:
   - Look up the actual YouTube channel URL (use `yt-dlp --print channel_url --playlist-items 1 --skip-download "https://www.youtube.com/@ChannelHandle"` for any that need verification)
   - Generate `actor_name` via slugify
   - Write to channel-map.json

3. Use the 24 known URLs from old sync records as a head start. The remaining 75 need URL resolution.

**⚠️ This step requires yt-dlp installed and working on CT 110. If yt-dlp isn't installed yet, install it first:**

```bash
curl -L https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp -o /usr/local/bin/yt-dlp
chmod +x /usr/local/bin/yt-dlp
```

**⚠️ YouTube may rate-limit channel lookups. Space requests 2-3 seconds apart. If rate-limited, use cookies or VPN.**

### Step 2: PeerTube Channel Creation

**Where:** CT 110
**What:** Batch-create all 99 channels via PeerTube API

**SSH Gate:** `ssh peertube 'curl -s http://localhost:9000/api/v1/config | head -c 50'` — must return JSON. Confirms both SSH and PeerTube are up.

1. Get OAuth token from PeerTube API (local, port 9000):

   ```bash
   # Get client credentials
   curl -s http://localhost:9000/api/v1/oauth-clients/local -H "Host: stream.echo6.co"

   # Get user token
   curl -s http://localhost:9000/api/v1/users/token \
     -H "Host: stream.echo6.co" \
     --data "client_id=&client_secret=&grant_type=password&username=root&password="
   ```

2. For each channel in channel-map.json:

   ```bash
   curl -s -X POST http://localhost:9000/api/v1/video-channels \
     -H "Host: stream.echo6.co" \
     -H "Authorization: Bearer " \
     -H "Content-Type: application/json" \
     -d '{"name": "", "displayName": "(YT)", "description": "Imported from YouTube: "}'
   ```

3. Capture the returned channel ID and update `peertube_channel_id` in channel-map.json

4. Verify: `curl -s http://localhost:9000/api/v1/video-channels -H "Host: stream.echo6.co" | python3 -c 'import json,sys; print(json.load(sys.stdin)["total"])'` should return 100 on a fresh instance (the 99 imported channels plus the default channel). Read the `total` field rather than counting `"name"` matches: the list endpoint is paginated, and `"name"` also appears in nested account objects.

### Step 3: NordVPN Setup

**Where:** CT 110
**What:** Install VPN for IP rotation during YouTube downloads

**SSH Gate:** `ssh peertube 'hostname'` must succeed.

**➡️ RUN RUNBOOK: `~/runbooks/nordvpn-lxc.md`**

Use these inputs:

```
CTID=110
CT_HOST=peertube
PVE_HOST=media                    # or root@192.168.1.243
NORDVPN_TOKEN=                    # ⚠️ Get from Matt
VPN_COUNTRIES="United_States,Canada,United_Kingdom,Germany,Netherlands,Sweden"
VPN_CONFIG_DIR=/opt/bulk-import/config/vpn
```

**Additional context for this deployment:**

- CT 110 runs PeerTube on port 9000 — split tunneling is MANDATORY so PeerTube stays reachable on 192.168.1.170 and 100.64.0.23 while VPN is active
- The rotation script at `/opt/bulk-import/config/vpn/vpn-rotate.sh` will be called by `downloader.py` (Step 5) on rate-limit detection
- After runbook completes, verify PeerTube still accessible: `curl -s http://192.168.1.170:9000/api/v1/config | head -c 50` (from another machine, while VPN is up on CT 110)

**⚠️ NordVPN token required from Matt.
Cannot proceed without it.**

### Step 4: YouTube Cookies

**Where:** CT 110
**What:** Export browser cookies for yt-dlp bot detection bypass

1. Matt exports cookies from browser (Netscape format) using "Get cookies.txt LOCALLY" extension
2. SCP to CT 110: `scp cookies.txt root@192.168.1.170:/opt/bulk-import/config/cookies.txt`
3. Fix perms: `chown peertube:peertube /opt/bulk-import/config/cookies.txt && chmod 600 /opt/bulk-import/config/cookies.txt`
4. Test: `sudo -u peertube yt-dlp --cookies /opt/bulk-import/config/cookies.txt --simulate "https://www.youtube.com/watch?v=dQw4w9WgXcQ"`

**⚠️ Cookies expire every 2-4 weeks. Needs manual refresh.**

### Step 5: Build downloader.py

**Where:** CT 110 at `/opt/bulk-import/downloader.py`
**What:** Round-robin YouTube channel downloader with VPN rotation

**Deploy:** Write file locally on cortex, then `scp` to CT 110. Or write directly via `ssh peertube 'cat > /opt/bulk-import/downloader.py << "PYEOF" ... PYEOF'`

**SSH Gate:** `ssh peertube 'ls /opt/bulk-import/config/channel-map.json'` — channel map must exist (Step 1 complete).
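
The round-robin scheduling and country rotation named in this step can be sketched as below. This is a minimal illustration, not the final `downloader.py`: the function names `round_robin` and `next_country`, the `start_index` parameter, and the assumption that the channel map loads as a list of dicts are all hypothetical. The country list is copied from the Step 3 runbook inputs.

```python
from typing import Iterator, Sequence, Tuple

# Country list copied from the Step 3 runbook inputs.
VPN_COUNTRIES = ["United_States", "Canada", "United_Kingdom",
                 "Germany", "Netherlands", "Sweden"]


def round_robin(channels: Sequence[dict], start_index: int = 0) -> Iterator[Tuple[int, dict]]:
    """Yield (index, channel) pairs forever, one channel per turn.

    start_index is a hypothetical resume point, e.g. read from a state file.
    """
    if not channels:
        return
    i = start_index % len(channels)
    while True:
        yield i, channels[i]
        i = (i + 1) % len(channels)


def next_country(current: str, countries: Sequence[str] = VPN_COUNTRIES) -> str:
    """Return the country after `current`, wrapping around the list.

    An unknown value restarts the rotation at the first country.
    """
    try:
        return countries[(countries.index(current) + 1) % len(countries)]
    except ValueError:
        return countries[0]
```

Persisting the yielded index lets a restarted downloader resume at the same channel instead of starting over at the first one.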
Requirements:

- Round-robin across all 99 channels (don't hammer one channel)
- yt-dlp with: `--cookies`, `--download-archive downloaded.txt` (dedup), `--write-info-json`, `--write-thumbnail`, `--format "bestvideo[height<=1080]+bestaudio/best[height<=1080]"`, `--merge-output-format mp4`
- Downloads land in `/opt/bulk-import/staging///` with .mp4 + .info.json + .jpg
- On successful download, move to `/opt/bulk-import/completed///`
- **Note:** transcoder.py (Step 6) picks up from completed/ — downloader does NOT feed importer directly
- VPN rotation: detect rate-limit (HTTP 429, sign-in required, bot detection), disconnect current VPN, connect to next country in rotation list, retry
- State file: `/opt/bulk-import/config/downloader-state.json` — tracks current channel index, current VPN country, last activity timestamp
- Logging to `/opt/bulk-import/logs/downloader.log` — include `=== Channel: ===` markers (WATCHTOWER parses these)
- Target throughput: ~30 videos/hr
- Graceful shutdown on SIGTERM/SIGINT

### Step 6: Build transcoder.py

**Where:** cortex (local — this IS the CC host) at `/opt/bulk-import/transcoder.py`
**What:** Pulls H.264 videos from CT 110, re-encodes to H.265 via NVENC, pushes back

**Connectivity Gate:**

```bash
nvidia-smi > /dev/null 2>&1 && echo "GPU OK" || echo "GPU MISSING"
ffmpeg -encoders 2>/dev/null | grep -q hevc_nvenc && echo "HEVC NVENC OK" || echo "HEVC NVENC MISSING"
ssh peertube 'ls /opt/bulk-import/completed/' > /dev/null 2>&1 && echo "SSH OK" || echo "SSH FAIL"
```

Requirements:

- Watch CT 110's `/opt/bulk-import/completed/` for new video directories (via SSH/rsync polling, not inotify — it's remote)
- For each video dir found:
  1. `rsync` the dir from CT 110 to cortex local temp: `/opt/bulk-import/transcode-work///`
  2. Run ffmpeg: `ffmpeg -hwaccel cuda -i input.mp4 -c:v hevc_nvenc -preset medium -cq 28 -tag:v hvc1 -c:a aac -b:a 128k output.mp4`
     - `-cq 28` = constant quality mode (NVENC equivalent of CRF)
     - `-tag:v hvc1` = Apple/browser compatible HEVC tag
     - `-preset medium` = balance speed/quality (can tune later)
     - Preserve .info.json and .jpg (just copy, don't re-encode)
  3. `rsync` the transcoded dir back to CT 110: `/opt/bulk-import/transcoded///`
  4. Remove the source from CT 110's `completed/` dir (it's been transcoded)
  5. Clean up local temp
- Skip videos that already exist in `transcoded/`
- Logging to `/opt/bulk-import/logs/transcoder.log` on cortex (and/or stream to CT 110)
- State file: `/opt/bulk-import/config/transcoder-state.json` on cortex
- Graceful shutdown on SIGTERM/SIGINT — finish current transcode, don't start new ones
- Target throughput: depends on video length, but NVENC should handle ~2-5 videos/hr for typical 10-20min content at 1080p
- One video at a time (NVENC session limit on A4000)

**Directory structure on cortex:**

```
/opt/bulk-import/              ← transcoder home on cortex
├── transcoder.py
├── config/
│   └── transcoder-state.json
├── logs/
│   └── transcoder.log
└── transcode-work/            ← temp working dir, cleaned after each video
```

**ffmpeg must be installed on cortex with NVENC support:**

```bash
sudo apt install -y ffmpeg
ffmpeg -encoders 2>/dev/null | grep hevc_nvenc   # must show hevc_nvenc
# If missing: sudo apt install -y libnvidia-encode-550 (match driver version)
```

### Step 7: Build importer.py

**Where:** CT 110 at `/opt/bulk-import/importer.py`
**What:** Watches transcoded/ dir, uploads to PeerTube via API

**Deploy:** Same as Step 5 — write locally, scp to CT 110.

**SSH Gate:** `ssh peertube 'ls /opt/bulk-import/config/channel-map.json && curl -s http://localhost:9000/api/v1/config | head -c 50'` — channel map AND PeerTube API must be reachable.
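
The upload this step performs takes its form fields from the `.info.json` that yt-dlp writes next to each video. A minimal sketch of that mapping follows, under stated assumptions: `upload_fields` is a hypothetical helper, the 120-character name cap and the 2 to 30 character tag bounds are assumed PeerTube validation limits that should be checked against the running v8 instance, and `upload_date` is assumed to be yt-dlp's `YYYYMMDD` string.

```python
from datetime import datetime, timezone


def upload_fields(info: dict, channel_id: int) -> dict:
    """Map a yt-dlp info dict to PeerTube upload form fields (hypothetical helper)."""
    # Assumed limits: video name 3..120 chars, at most 5 tags of 2..30 chars each.
    name = (info.get("title") or "").strip()[:120]
    if len(name) < 3:
        name = (name + "___")[:3]  # pad very short titles to the assumed minimum
    tags = [t for t in (info.get("tags") or []) if 2 <= len(t) <= 30][:5]

    fields = {
        "name": name,
        "description": info.get("description") or "",
        "channelId": channel_id,
        "privacy": 1,              # 1 = public
        "waitTranscoding": False,  # file is already H.265
        "tags": tags,
    }

    # yt-dlp reports upload_date as YYYYMMDD; emit an ISO 8601 UTC timestamp.
    upload_date = info.get("upload_date")
    if upload_date:
        published = datetime.strptime(upload_date, "%Y%m%d").replace(tzinfo=timezone.utc)
        fields["originallyPublishedAt"] = published.isoformat()
    return fields
```

Keeping the mapping in a pure function makes it easy to test against saved `.info.json` files before any upload is attempted.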
Requirements:

- Watch `/opt/bulk-import/transcoded/` for new video directories (NOT completed/ — transcoder feeds this)
- For each video dir: read .info.json, extract title, description, upload_date (→ originallyPublishedAt), tags, thumbnail
- Map `` from dir path → `peertube_channel_id` from channel-map.json
- Upload via PeerTube API: `POST /api/v1/videos/upload` with multipart form data
- Set: name, description, channelId, originallyPublishedAt, tags (first 5), thumbnailfile, privacy (1=public), **waitTranscoding=false** (video is already H.265, no PeerTube transcoding needed)
- On success: **DELETE the video dir from `transcoded/`** — PeerTube's storage is the authoritative copy. No `imported/` directory.
- On failure: move to `/opt/bulk-import/failed/` with error log
- Rate: process one video at a time, ~50/hr max (don't overwhelm PeerTube)
- Dedup: check if video title + channel already exists before uploading
- Logging to `/opt/bulk-import/logs/importer.log`
- OAuth token management: cache token, refresh on 401

### Step 8: Systemd Services

**Where:** CT 110 (downloader + importer) AND cortex (transcoder)
**What:** Service files for all three pipeline components

**SSH Gate:** `ssh peertube 'ls /opt/bulk-import/downloader.py /opt/bulk-import/importer.py'` — both CT 110 scripts must exist (Steps 5 and 7 complete). `/opt/bulk-import/transcoder.py` must exist on cortex (Step 6 complete).
**On CT 110:**

```ini
# /etc/systemd/system/pt-downloader.service
[Unit]
Description=PeerTube Bulk Downloader
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=peertube
Group=peertube
ExecStart=/usr/bin/python3 /opt/bulk-import/downloader.py
WorkingDirectory=/opt/bulk-import
Restart=always
RestartSec=60
StandardOutput=journal
StandardError=journal
SyslogIdentifier=pt-downloader

[Install]
WantedBy=multi-user.target

# /etc/systemd/system/pt-importer.service — same pattern, ExecStart points to importer.py
```

**On cortex:**

```ini
# /etc/systemd/system/pt-transcoder.service
[Unit]
Description=PeerTube H.265 NVENC Transcoder
After=network-online.target nvidia-persistenced.service
Wants=network-online.target

[Service]
Type=simple
User=zvx
Group=zvx
ExecStart=/usr/bin/python3 /opt/bulk-import/transcoder.py
WorkingDirectory=/opt/bulk-import
Restart=always
RestartSec=60
StandardOutput=journal
StandardError=journal
SyslogIdentifier=pt-transcoder
MemoryMax=12G

[Install]
WantedBy=multi-user.target
```

Enable but **do not start** until testing is complete.

### Step 9: PeerTube Transcoding Config — DISABLED

**Where:** CT 110
**What:** Disable PeerTube's built-in transcoding — videos arrive pre-transcoded as H.265

**SSH Gate:** `ssh peertube 'hostname'` must succeed.

Edit `/var/www/peertube/config/local-production.json`:

```json
{
  "transcoding": {
    "enabled": false
  },
  "import": {
    "videos": {
      "concurrency": 4,
      "http": { "enabled": true },
      "torrent": { "enabled": false }
    }
  },
  "video_channel_synchronization": {
    "enabled": false
  }
}
```

Restart PeerTube after config changes: `sudo systemctl restart peertube`

**Why disabled:** Videos are pre-transcoded to H.265 by cortex (Step 6) before import. The importer uploads with `waitTranscoding=false`. PeerTube serves the file as-is. No runner needed, no re-encode, no wasted cycles.
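
Hand-editing a live JSON config risks dropping unrelated keys or leaving invalid syntax behind. As a hedged sketch, the override block above could be merged into the existing file instead of replacing it. `deep_merge` and `apply_overrides` are hypothetical helper names; the override values are copied from this step, and the write-to-temp-then-rename pattern is an assumption about how the update should be applied.

```python
import json
import os

# Values copied from the Step 9 config block.
OVERRIDES = {
    "transcoding": {"enabled": False},
    "import": {
        "videos": {
            "concurrency": 4,
            "http": {"enabled": True},
            "torrent": {"enabled": False},
        }
    },
    "video_channel_synchronization": {"enabled": False},
}


def deep_merge(base: dict, overrides: dict) -> dict:
    """Return a copy of base with overrides applied recursively."""
    merged = dict(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def apply_overrides(path: str, overrides: dict = OVERRIDES) -> dict:
    """Merge overrides into the JSON file at path, replacing it atomically."""
    with open(path) as f:
        current = json.load(f)
    merged = deep_merge(current, overrides)
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(merged, f, indent=2)
        f.write("\n")
    os.replace(tmp, path)  # atomic rename on the same filesystem
    return merged
```

Run it as the `peertube` user against a copy of the config first, then restart PeerTube once the merged file looks right.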
### Step 10: Integration Test

**Full connectivity gate — ALL must pass:**

```bash
ssh peertube 'hostname'                                                       # SSH to CT 110
ssh peertube 'curl -s http://localhost:9000/api/v1/config | head -c 50'       # PeerTube API
ssh peertube 'systemctl is-active peertube'                                   # PeerTube service
nvidia-smi > /dev/null 2>&1 && echo "GPU OK"                                  # cortex GPU
ffmpeg -encoders 2>/dev/null | grep -q hevc_nvenc && echo "NVENC OK"          # HEVC encoder
ssh peertube 'ls /opt/bulk-import/completed/' > /dev/null && echo "Dirs OK"   # Pipeline dirs
```

1. Start downloader — let it grab 5-10 videos from 2-3 different channels
2. Verify videos land in `/opt/bulk-import/completed/` with .mp4 + .info.json + .jpg
3. Start transcoder on cortex — verify it pulls videos, encodes H.265 via NVENC (`nvidia-smi` shows encoder utilization)
4. Verify transcoded files land in `/opt/bulk-import/transcoded/` on CT 110, and originals cleared from `completed/`
5. Verify transcoded file is H.265: `ffprobe -v error -select_streams v:0 -show_entries stream=codec_name -of csv=p=0 ` should return `hevc`
6. Start importer — verify videos appear in PeerTube UI with correct metadata, channel assignment, thumbnails
7. Verify playback works at stream.echo6.co in each browser viewers will actually use. HEVC support depends on the browser, OS, and hardware decoder, so test it rather than assume it; the file is served as a single web-video, not adaptive HLS
8. Check dedup — restart downloader, verify it skips already-downloaded videos
9. Check VPN rotation — trigger a rate limit (or simulate), verify country switches

### Step 11: Go-Live

**On CT 110:**

```bash
systemctl start pt-downloader && systemctl start pt-importer
systemctl enable pt-downloader && systemctl enable pt-importer
```

**On cortex:**

```bash
systemctl start pt-transcoder
systemctl enable pt-transcoder
```

Monitor for 24 hours.
Expected steady-state:

- Downloader: ~30 videos/hr
- Transcoder: ~2-5 videos/hr (bottleneck — NVENC is fast but 1080p H.265 takes time per video)
- Importer: keeps up with transcoder output, ~50/hr capacity but paced by transcoder
- GPU utilization: 80-100% encoder, minimal CUDA (no conflict with Ollama)

**⚠️ The transcoder is the bottleneck.** At ~3 videos/hr average, 70K videos = ~970 days. Strategies to accelerate:

- Lower quality preset: `-preset fast` or `-preset hp` (speed over quality)
- Accept lower CQ: `-cq 32` instead of 28 (smaller files, slightly lower quality)
- Run 2 NVENC sessions in parallel (A4000 supports ~3 concurrent)
- Add a second GPU node
- Accept H.264 for bulk and only H.265 for new imports going forward

---

## Manual Inputs Required (Before CC Can Execute)

| Item | Who | When Needed |
|------|-----|-------------|
| NordVPN token | Matt | Step 3 |
| YouTube cookies.txt | Matt | Step 4 |
| PeerTube admin password | Matt | Step 2 (OAuth) |

---

## Dependencies Between Steps

```
Step 1 (channel map) ──→ Step 2 (create channels) ──→ Step 7 (importer needs channel IDs)
                                                   ↗
Step 3 (VPN) + Step 4 (cookies) ──→ Step 5 (downloader) ──→ Step 6 (transcoder reads completed/)
                                                                      ↓
                                                             Step 7 (importer reads transcoded/)

Step 9 (disable PT transcoding) ←── independent, do anytime before Step 10
Step 10 (integration test) ←── requires ALL of 1-9
Step 11 (go-live) ←── requires Step 10 pass
```

Steps 1-2 and Step 9 are independent workstreams. Steps 3-4 require Matt's manual input. Steps 5, 6, 7 are the three core scripts. Step 6 runs on cortex; everything else runs on CT 110.

---

## What NOT to Build (Phase 3 — WATCHTOWER)

WATCHTOWER (the monitoring dashboard) is Phase 3. Don't build it now. The pipeline scripts should have enough logging that we can monitor via `journalctl` and log files during Phase 2.
WATCHTOWER will eventually:

- SSH into CT 110 to read pipeline metrics (but CT 110 is native now, not Docker — queries change)
- Point to cortex instead of old TOC for GPU stats
- Read channel-map.json from `/opt/bulk-import/config/` instead of old `/mnt/data/bulk-import/`
- Need new .env config for all changed IPs

But that's later. Pipeline first.

---

## Channel Management (via RECON Dashboard)

**Added 2026-02-18.** Channel management UI is now in the RECON dashboard Upload tab at `http://192.168.1.130:8420/upload`. No more SSH + manual JSON editing to add channels.

- **Sudoers:** `/etc/sudoers.d/recon-mgmt` on CT 110 — allows zvx to run yt-dlp, psql, and tee as peertube
- **API endpoints** in `/opt/recon/lib/api.py`:
  - `GET /api/peertube/channels` — list all channels with video counts from PeerTube DB
  - `GET /api/peertube/channels/stats` — total channels, total videos, downloader status
  - `POST /api/peertube/channels/add` — resolve YT URL via yt-dlp, create PeerTube channel, update channel-map.json
  - `DELETE /api/peertube/channels/` — remove from JSON and PeerTube
- **UI features:** stats bar, add form (URL + category + priority), sortable channel table, remove button
- **All operations go through SSH from CT 130 → CT 110** using the existing `_ssh_peertube()` helper
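
When a channel is added, the entry written to channel-map.json has to match the recovered schema. A minimal sketch of building one entry, reusing the recovered `slugify_channel` function: `make_channel_entry` and `add_entry` are hypothetical names, and treating channel-map.json as a JSON list of entries is an assumption, since the file layout beyond a single entry is not shown here.

```python
import re


def slugify_channel(name):
    """Recovered slug function: channel name -> PeerTube-safe actor_name."""
    name = re.sub(r'^\(YT\)\s*', '', name)
    slug = re.sub(r'[^a-z0-9]+', '-', name.lower()).strip('-')
    return slug[:50] or 'channel'


def make_channel_entry(name, youtube_url, category, priority="M"):
    """Build one channel-map entry in the recovered schema (hypothetical helper)."""
    display = name if name.startswith("(YT)") else f"(YT){name}"
    return {
        "category": category,
        "channel_name": display,
        "actor_name": slugify_channel(display),
        "youtube_url": youtube_url,
        "youtube_channel_id": None,   # filled in after yt-dlp resolution
        "peertube_channel_id": None,  # filled in after PeerTube channel creation
        "video_count": 0,
        "priority": priority,
        "est_videos": 0,
        "est_gb": 0,
    }


def add_entry(channel_map, entry):
    """Append entry unless its actor_name is already taken (assumes a list layout)."""
    if any(c["actor_name"] == entry["actor_name"] for c in channel_map):
        raise ValueError(f"duplicate actor_name: {entry['actor_name']}")
    channel_map.append(entry)
    return channel_map
```

Rejecting duplicate `actor_name` values before the PeerTube call avoids creating a channel that then cannot be recorded in the map.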