From b1c05c4d02951ea53ba0a70abfee3ba76123b8f2 Mon Sep 17 00:00:00 2001
From: Matt
Date: Thu, 16 Apr 2026 06:50:36 +0000
Subject: [PATCH] =?UTF-8?q?PROJECT-BIBLE:=20fix=20storage=20topology=20?=
 =?UTF-8?q?=E2=80=94=20library=20is=20LXC=20bind-mount,=20not=20NFS?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Section 2 topology diagram: 'Library (LXC bind) / data /mnt/data/library → /mnt/library/ (read/write, local SSD)'
- Section 10 Config table: library_root described as bind-mount root
- Section 13 Filesystem layout: /mnt/library annotated as LXC bind-mount
- Section 14 Refactor history: storage migration note added (NFS history preserved as historical context)
- Section 15 Operational runbook: replaced recon-backup.timer reference with planned/TBD note
- Section 16 Known Gotchas: new bullet on bind-mount file ownership and the absence of NFS / root_squash in the path
- Section 17 Credentials & Hosts: added data host row; rewrote pi-nas role to backup target (planned, not yet configured) reflecting the 2026-04-15 wipe of /export/library
- Section 18 Open Follow-ups: added backup architecture entry capturing the missing rsync job and the now-available ~300G pi-nas headroom
---
 PROJECT-BIBLE.md | 35 ++++++++++++++++++++++++++++-------
 1 file changed, 28 insertions(+), 7 deletions(-)

diff --git a/PROJECT-BIBLE.md b/PROJECT-BIBLE.md
index 666dc12..7e8402e 100644
--- a/PROJECT-BIBLE.md
+++ b/PROJECT-BIBLE.md
@@ -37,10 +37,10 @@ Search returns page-grounded citations back to the file or stream URL.
 ```
                         ┌─────────────────────────┐
                         │  CT 130 (recon)         │
-  Library (NFS)         │  /opt/recon/            │      ┌──────────────┐
-  pi-nas:/export/library│  ├─ data/               │      │  Qdrant      │
+  Library (LXC bind)    │  /opt/recon/            │      ┌──────────────┐
+  data /mnt/data/library│  ├─ data/               │      │  Qdrant      │
    → /mnt/library/      │  │   ├─ acquired/       │      │  cortex:6333 │
-  (read/write)          │  │   ├─ processing/     │ ←→   │  recon_knowledge_hybrid
+ (read/write, local SSD)│  │   ├─ processing/     │ ←→   │  recon_knowledge_hybrid
                         │  │   ├─ concepts/       │      │  (1024-d dense + sparse)
                         │  │   └─ recon.db        │      └──────────────┘
                         │  ├─ lib/                │
@@ -358,7 +358,7 @@ Lives at `/opt/recon/config.yaml`. Secrets (`GEMINI_KEYS`,
 ### Top-level keys
 | Key | Meaning |
 |---|---|
-| `library_root` | `/mnt/library` — NFS mount root |
+| `library_root` | `/mnt/library` — LXC bind-mount root (data host `/mnt/data/library`, local SSD) |
 | `processing` | Worker counts, window sizes, timeouts, retry policy |
 | `embedding` | TEI host/port, model (`bge-m3`), 1024-d dense |
 | `sparse_embedding` | Separate service on cortex:8091 |
@@ -481,7 +481,7 @@ and knowledge-stats panels so the dashboard loads instantly.
 │   └── organizer.py            # determine_dominant_domain, level 1-4 naming
 └── logs/

-/mnt/library/                    # NFS from pi-nas, read-write
+/mnt/library/                    # LXC bind-mount from data host /mnt/data/library (local SSD), read-write
 ├── //.
 └── _acquired/ _review/ _staging/ signal-archive/  # not touched by pipeline
 ```
@@ -521,6 +521,13 @@ implementations are in the RECON repo; design lives here.
 - 18,855 transcripts in `/mnt/library/_sources/streamecho6/`.
 - Old stream-B `new_pipeline` ran off `/mnt/library/_acquired/`.
 - `scan_library()` polled the NFS mount for new PDFs — now deprecated.
+- *Storage migration note:* `/mnt/library` was historically an NFS
+  mount from `pi-nas:/export/library`, which is what `current-state.md`
+  and `scan_library()` were written against. The library has since
+  been migrated to local SSD on the data Proxmox host
+  (`/mnt/data/library`) and surfaced into CT 130 via an LXC
+  bind-mount. The pi-nas copy was wiped on 2026-04-15. Path strings
+  inside the codebase didn't change; only the underlying storage did.

 ---

@@ -537,7 +544,8 @@ tail -f /opt/recon/logs/recon.log
 ```bash
 # Local DB backup before risky operations
 cp /opt/recon/data/recon.db /tmp/recon.db.bak.$(date +%s)
-# Contabo offsite (automatic): rsync every 6 hours, see recon-backup.timer
+# Offsite backup: planned, not yet configured (TBD — likely rsync to
+# pi-nas:/export/recon-backup once a backup target is provisioned).
 ```

 ### Inspect pipeline state at a glance
@@ -590,6 +598,11 @@ curl -s http://100.64.0.14:6333/collections/recon_knowledge_hybrid \
   filter excludes them.
 - **PeerTube 429.** Respect `peertube.rate_limit_delay` between caption API
   calls or you'll get throttled.
+- **Library is an LXC bind-mount, not NFS.** `/mnt/library` on CT 130 is
+  bound from the data Proxmox host's `/mnt/data/library` (local ext4 on
+  /dev/sda1). File ownership/UID-GID is shared with the host — writes
+  from inside the container appear with the container UID on the host.
+  No NFS, no `root_squash`, no network in the path.
 - **SSH heredocs with Python code break.** When editing remote files,
   write to a temp file via `scp` or `cat > file` rather than bash heredocs
   with parens/quotes.
@@ -603,9 +616,10 @@ curl -s http://100.64.0.14:6333/collections/recon_knowledge_hybrid \
 | Host | Role | Access |
 |---|---|---|
 | CT 130 (192.168.1.130 / 100.64.0.24) | RECON service | `ssh zvx@192.168.1.130` (key auth) |
+| data host (192.168.1.240) | Proxmox node hosting CT 130; `/mnt/data/library` source for the CT 130 bind-mount | `ssh root@192.168.1.240` |
 | cortex VM (192.168.1.150) | Qdrant, TEI, sparse svc, Ollama | `ssh zvx@cortex` |
 | CT 110 (192.168.1.170) | PeerTube `stream.echo6.co` | `ssh zvx@192.168.1.170` |
-| pi-nas (192.168.1.245) | NFS server for `/mnt/library` | `ssh zvx@pi-nas` |
+| pi-nas (192.168.1.245) | Backup target (planned; not yet configured). ~22T pool with ~300G free after library wipe. | `ssh zvx@pi-nas` |
 | CT 101 (192.168.1.101) | Caddy reverse proxy (home) | `ssh root@192.168.1.241 'pct exec 101'` |

 Secrets: `/home/zvx/projects/.ref/credentials` on TOC (this machine).
@@ -644,6 +658,13 @@ RECON Gemini/PeerTube keys: `/opt/recon/.env` on CT 130.
   Qdrant snapshots are not in any backup rotation. If CT 130 or cortex lose
   their disks, these are the hardest to regenerate (Gemini calls + embedding
   compute).
+- **Backup architecture** — no offsite backup is currently configured.
+  Section 15 references a planned rsync-to-pi-nas job, but neither the
+  script nor the systemd timer (`recon-backup.timer`) exist. Decide
+  what gets backed up (`recon.db`, `concepts/`, `text/`, Qdrant
+  snapshots, `/mnt/library`?), where, and on what cadence; pi-nas has
+  ~300G free in `/export/` after the 2026-04-15 library wipe and could
+  be the target for a first pass.
 - **`signal-archive/` in `/mnt/library/`** — 44 Signal/Matrix chat log
   files, not library content. Matt intends these to "eventually contribute"
   to the knowledge base but no ingestion path exists yet.