mirror of https://github.com/zvx-echo6/refactored-recon.git (synced 2026-05-20 06:34:34 +02:00)
PROJECT-BIBLE: fix storage topology — library is LXC bind-mount, not NFS
- Section 2 topology diagram: 'Library (LXC bind) / data /mnt/data/library → /mnt/library/ (read/write, local SSD)'
- Section 10 Config table: library_root described as bind-mount root
- Section 13 Filesystem layout: /mnt/library annotated as LXC bind-mount
- Section 14 Refactor history: storage migration note added (NFS history preserved as historical context)
- Section 15 Operational runbook: replaced recon-backup.timer reference with planned/TBD note
- Section 16 Known Gotchas: new bullet on bind-mount file ownership and the absence of NFS / root_squash in the path
- Section 17 Credentials & Hosts: added data host row; rewrote pi-nas role to backup target (planned, not yet configured) reflecting the 2026-04-15 wipe of /export/library
- Section 18 Open Follow-ups: added backup architecture entry capturing the missing rsync job and the now-available ~300G pi-nas headroom
parent d1cde5a56d
commit b1c05c4d02
1 changed file with 28 additions and 7 deletions
@@ -37,10 +37,10 @@ Search returns page-grounded citations back to the file or stream URL.
 ```
                         ┌─────────────────────────┐
                         │     CT 130 (recon)      │
-Library (NFS)           │ /opt/recon/             │    ┌──────────────┐
-pi-nas:/export/library  │ ├─ data/                │    │ Qdrant       │
+Library (LXC bind)      │ /opt/recon/             │    ┌──────────────┐
+data /mnt/data/library  │ ├─ data/                │    │ Qdrant       │
 → /mnt/library/         │ │  ├─ acquired/         │    │ cortex:6333  │
-(read/write)            │ │  ├─ processing/       │ ←→ │ recon_knowledge_hybrid
+(read/write, local SSD) │ │  ├─ processing/       │ ←→ │ recon_knowledge_hybrid
                         │ │  ├─ concepts/         │    │ (1024-d dense + sparse)
                         │ │  └─ recon.db          │    └──────────────┘
                         │ ├─ lib/                 │
@@ -358,7 +358,7 @@ Lives at `/opt/recon/config.yaml`. Secrets (`GEMINI_KEYS`,
 ### Top-level keys
 | Key | Meaning |
 |---|---|
-| `library_root` | `/mnt/library` — NFS mount root |
+| `library_root` | `/mnt/library` — LXC bind-mount root (data host `/mnt/data/library`, local SSD) |
 | `processing` | Worker counts, window sizes, timeouts, retry policy |
 | `embedding` | TEI host/port, model (`bge-m3`), 1024-d dense |
 | `sparse_embedding` | Separate service on cortex:8091 |
@@ -481,7 +481,7 @@ and knowledge-stats panels so the dashboard loads instantly.
 │   └── organizer.py        # determine_dominant_domain, level 1-4 naming
 └── logs/
 
-/mnt/library/               # NFS from pi-nas, read-write
+/mnt/library/               # LXC bind-mount from data host /mnt/data/library (local SSD), read-write
 ├── <Domain>/<Subdomain>/<canonical_name>.<ext>
 └── _acquired/ _review/ _staging/ signal-archive/   # not touched by pipeline
 ```
@@ -521,6 +521,13 @@ implementations are in the RECON repo; design lives here.
 - 18,855 transcripts in `/mnt/library/_sources/streamecho6/`.
 - Old stream-B `new_pipeline` ran off `/mnt/library/_acquired/`.
 - `scan_library()` polled the NFS mount for new PDFs — now deprecated.
+- *Storage migration note:* `/mnt/library` was historically an NFS
+  mount from `pi-nas:/export/library`, which is what `current-state.md`
+  and `scan_library()` were written against. The library has since
+  been migrated to local SSD on the data Proxmox host
+  (`/mnt/data/library`) and surfaced into CT 130 via an LXC
+  bind-mount. The pi-nas copy was wiped on 2026-04-15. Path strings
+  inside the codebase didn't change; only the underlying storage did.
 
 ---
 
@@ -537,7 +544,8 @@ tail -f /opt/recon/logs/recon.log
 ```bash
 # Local DB backup before risky operations
 cp /opt/recon/data/recon.db /tmp/recon.db.bak.$(date +%s)
-# Contabo offsite (automatic): rsync every 6 hours, see recon-backup.timer
+# Offsite backup: planned, not yet configured (TBD — likely rsync to
+# pi-nas:/export/recon-backup once a backup target is provisioned).
 ```
 
 ### Inspect pipeline state at a glance
@@ -590,6 +598,11 @@ curl -s http://100.64.0.14:6333/collections/recon_knowledge_hybrid \
   filter excludes them.
 - **PeerTube 429.** Respect `peertube.rate_limit_delay` between caption
   API calls or you'll get throttled.
+- **Library is an LXC bind-mount, not NFS.** `/mnt/library` on CT 130 is
+  bound from the data Proxmox host's `/mnt/data/library` (local ext4 on
+  /dev/sda1). File ownership/UID-GID is shared with the host — writes
+  from inside the container appear with the container UID on the host.
+  No NFS, no `root_squash`, no network in the path.
 - **SSH heredocs with Python code break.** When editing remote files,
   write to a temp file via `scp` or `cat > file` rather than bash
   heredocs with parens/quotes.
@@ -603,9 +616,10 @@ curl -s http://100.64.0.14:6333/collections/recon_knowledge_hybrid \
 | Host | Role | Access |
 |---|---|---|
 | CT 130 (192.168.1.130 / 100.64.0.24) | RECON service | `ssh zvx@192.168.1.130` (key auth) |
+| data host (192.168.1.240) | Proxmox node hosting CT 130; `/mnt/data/library` source for the CT 130 bind-mount | `ssh root@192.168.1.240` |
 | cortex VM (192.168.1.150) | Qdrant, TEI, sparse svc, Ollama | `ssh zvx@cortex` |
 | CT 110 (192.168.1.170) | PeerTube `stream.echo6.co` | `ssh zvx@192.168.1.170` |
-| pi-nas (192.168.1.245) | NFS server for `/mnt/library` | `ssh zvx@pi-nas` |
+| pi-nas (192.168.1.245) | Backup target (planned; not yet configured). ~22T pool with ~300G free after library wipe. | `ssh zvx@pi-nas` |
 | CT 101 (192.168.1.101) | Caddy reverse proxy (home) | `ssh root@192.168.1.241 'pct exec 101'` |
 
 Secrets: `/home/zvx/projects/.ref/credentials` on TOC (this machine).
@@ -644,6 +658,13 @@ RECON Gemini/PeerTube keys: `/opt/recon/.env` on CT 130.
   Qdrant snapshots are not in any backup rotation. If CT 130 or cortex
   lose their disks, these are the hardest to regenerate (Gemini calls
   + embedding compute).
+- **Backup architecture** — no offsite backup is currently configured.
+  Section 15 references a planned rsync-to-pi-nas job, but neither the
+  script nor the systemd timer (`recon-backup.timer`) exist. Decide
+  what gets backed up (`recon.db`, `concepts/`, `text/`, Qdrant
+  snapshots, `/mnt/library`?), where, and on what cadence; pi-nas has
+  ~300G free in `/export/` after the 2026-04-15 library wipe and could
+  be the target for a first pass.
 - **`signal-archive/` in `/mnt/library/`** — 44 Signal/Matrix chat log
   files, not library content. Matt intends these to "eventually
   contribute" to the knowledge base but no ingestion path exists yet.
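
Review note: the topology change recorded in this commit (library served from a local bind-mount rather than NFS) could be spot-checked from inside CT 130 with something like the sketch below. It assumes `findmnt` from util-linux is installed in the container. The function names `is_network_fs` and `check_library_mount` are hypothetical helpers written for this note and are not part of the RECON repo, and the list of network filesystem types is illustrative rather than exhaustive.

```bash
#!/usr/bin/env bash
# Hypothetical helpers, not part of RECON: sanity-check how a path is mounted.

# Return 0 if the given filesystem type string names a network filesystem.
# The list is an illustrative assumption, not an exhaustive catalogue.
is_network_fs() {
  case "$1" in
    nfs|nfs4|cifs|smb3|fuse.sshfs) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the filesystem type backing a path (default: /mnt/library) and
# flag it if it still looks like a network mount.
# Exit codes: 0 = local filesystem, 1 = network filesystem, 2 = lookup failed.
check_library_mount() {
  local path="${1:-/mnt/library}"
  local fstype
  fstype="$(findmnt -n -o FSTYPE --target "$path")" || {
    echo "cannot resolve mount for $path" >&2
    return 2
  }
  if is_network_fs "$fstype"; then
    echo "$path: $fstype (network mount, unexpected after the migration)"
    return 1
  fi
  echo "$path: $fstype (local filesystem, consistent with a bind-mount)"
}
```

After sourcing the file, `check_library_mount` (or `check_library_mount /some/other/path`) reports the filesystem type that `findmnt` resolves for the path. Given the Known Gotchas bullet added in this commit, a local type such as ext4 would be the expected result on CT 130, though that has not been verified here.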