mirror of
https://github.com/zvx-echo6/refactored-recon.git
synced 2026-05-20 14:44:39 +02:00
PROJECT-BIBLE: fix storage topology — library is LXC bind-mount, not NFS
- Section 2 topology diagram: 'Library (LXC bind) / data /mnt/data/library → /mnt/library/ (read/write, local SSD)'
- Section 10 Config table: library_root described as bind-mount root
- Section 13 Filesystem layout: /mnt/library annotated as LXC bind-mount
- Section 14 Refactor history: storage migration note added (NFS history preserved as historical context)
- Section 15 Operational runbook: replaced recon-backup.timer reference with planned/TBD note
- Section 16 Known Gotchas: new bullet on bind-mount file ownership and the absence of NFS / root_squash in the path
- Section 17 Credentials & Hosts: added data host row; rewrote pi-nas role to backup target (planned, not yet configured) reflecting the 2026-04-15 wipe of /export/library
- Section 18 Open Follow-ups: added backup architecture entry capturing the missing rsync job and the now-available ~300G pi-nas headroom
parent d1cde5a56d
commit b1c05c4d02
1 changed file with 28 additions and 7 deletions
@@ -37,10 +37,10 @@ Search returns page-grounded citations back to the file or stream URL.
 ```
 ┌─────────────────────────┐
 │ CT 130 (recon) │
-Library (NFS) │ /opt/recon/ │ ┌──────────────┐
-pi-nas:/export/library│ ├─ data/ │ │ Qdrant │
+Library (LXC bind) │ /opt/recon/ │ ┌──────────────┐
+data /mnt/data/library│ ├─ data/ │ │ Qdrant │
 → /mnt/library/ │ │ ├─ acquired/ │ │ cortex:6333 │
-(read/write) │ │ ├─ processing/ │ ←→ │ recon_knowledge_hybrid
+(read/write, local SSD)│ │ ├─ processing/ │ ←→ │ recon_knowledge_hybrid
 │ │ ├─ concepts/ │ │ (1024-d dense + sparse)
 │ │ └─ recon.db │ └──────────────┘
 │ ├─ lib/ │
@@ -358,7 +358,7 @@ Lives at `/opt/recon/config.yaml`. Secrets (`GEMINI_KEYS`,
 ### Top-level keys
 | Key | Meaning |
 |---|---|
-| `library_root` | `/mnt/library` — NFS mount root |
+| `library_root` | `/mnt/library` — LXC bind-mount root (data host `/mnt/data/library`, local SSD) |
 | `processing` | Worker counts, window sizes, timeouts, retry policy |
 | `embedding` | TEI host/port, model (`bge-m3`), 1024-d dense |
 | `sparse_embedding` | Separate service on cortex:8091 |
@@ -481,7 +481,7 @@ and knowledge-stats panels so the dashboard loads instantly.
 │ └── organizer.py # determine_dominant_domain, level 1-4 naming
 └── logs/
 
-/mnt/library/ # NFS from pi-nas, read-write
+/mnt/library/ # LXC bind-mount from data host /mnt/data/library (local SSD), read-write
 ├── <Domain>/<Subdomain>/<canonical_name>.<ext>
 └── _acquired/ _review/ _staging/ signal-archive/ # not touched by pipeline
 ```
@@ -521,6 +521,13 @@ implementations are in the RECON repo; design lives here.
 - 18,855 transcripts in `/mnt/library/_sources/streamecho6/`.
 - Old stream-B `new_pipeline` ran off `/mnt/library/_acquired/`.
 - `scan_library()` polled the NFS mount for new PDFs — now deprecated.
+- *Storage migration note:* `/mnt/library` was historically an NFS
+mount from `pi-nas:/export/library`, which is what `current-state.md`
+and `scan_library()` were written against. The library has since
+been migrated to local SSD on the data Proxmox host
+(`/mnt/data/library`) and surfaced into CT 130 via an LXC
+bind-mount. The pi-nas copy was wiped on 2026-04-15. Path strings
+inside the codebase didn't change; only the underlying storage did.
 
 ---
 
@@ -537,7 +544,8 @@ tail -f /opt/recon/logs/recon.log
 ```bash
 # Local DB backup before risky operations
 cp /opt/recon/data/recon.db /tmp/recon.db.bak.$(date +%s)
-# Contabo offsite (automatic): rsync every 6 hours, see recon-backup.timer
+# Offsite backup: planned, not yet configured (TBD — likely rsync to
+# pi-nas:/export/recon-backup once a backup target is provisioned).
 ```
 
 ### Inspect pipeline state at a glance
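Reviewer note: the runbook hunk above keeps the timestamped `cp` of `recon.db` as the only backup step that exists today. A minimal Python sketch of the same step, for use from a script rather than a shell, might look like the following. The function name `timestamped_backup` and its parameters are hypothetical and not part of the RECON codebase; only the source path and the `.bak.<epoch>` naming come from the runbook.

```python
import shutil
import time
from pathlib import Path


def timestamped_backup(src, dest_dir="/tmp", now=None):
    """Copy src into dest_dir as <name>.bak.<epoch seconds>.

    Hypothetical helper mirroring the runbook's
    `cp recon.db /tmp/recon.db.bak.$(date +%s)` line.
    `now` can be passed explicitly to make the name predictable.
    """
    src = Path(src)
    stamp = int(time.time() if now is None else now)
    dest = Path(dest_dir) / f"{src.name}.bak.{stamp}"
    shutil.copy2(src, dest)  # copy2 also preserves mtime and permission bits
    return dest
```

Calling `timestamped_backup("/opt/recon/data/recon.db")` would correspond to the runbook command. Like the `cp` it mirrors, this is a plain file copy and nothing more.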
@@ -590,6 +598,11 @@ curl -s http://100.64.0.14:6333/collections/recon_knowledge_hybrid \
 filter excludes them.
 - **PeerTube 429.** Respect `peertube.rate_limit_delay` between caption
 API calls or you'll get throttled.
+- **Library is an LXC bind-mount, not NFS.** `/mnt/library` on CT 130 is
+bound from the data Proxmox host's `/mnt/data/library` (local ext4 on
+/dev/sda1). File ownership/UID-GID is shared with the host — writes
+from inside the container appear with the container UID on the host.
+No NFS, no `root_squash`, no network in the path.
 - **SSH heredocs with Python code break.** When editing remote files,
 write to a temp file via `scp` or `cat > file` rather than bash
 heredocs with parens/quotes.
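Reviewer note: the new gotcha above states that `/mnt/library` is local ext4 surfaced through a bind-mount, with no NFS in the path. One way to check that claim from inside CT 130 is to look up the filesystem type in the mount table. The sketch below assumes the standard Linux `/proc/self/mounts` layout (device, mount point, type, options, ...); the helper names and the `NETWORK_FS` set are illustrative and do not come from the RECON repo.

```python
from typing import Optional

# Filesystem types treated as "network in the path" (illustrative, not exhaustive).
NETWORK_FS = {"nfs", "nfs4", "cifs", "smb3"}


def fstype_for(mount_table: str, mount_point: str) -> Optional[str]:
    """Return the filesystem type mounted at mount_point, or None if absent.

    mount_table is the text of a mounts file such as /proc/self/mounts.
    If several entries share the mount point, the last one wins, since
    later mounts shadow earlier ones.
    """
    found = None
    for line in mount_table.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] == mount_point:
            found = fields[2]
    return found


def is_network_mount(mount_table: str, mount_point: str) -> bool:
    """True if mount_point is backed by one of the NETWORK_FS types."""
    return fstype_for(mount_table, mount_point) in NETWORK_FS
```

On the container, passing the contents of `/proc/self/mounts` and `"/mnt/library"` should report `ext4` if the topology matches this commit; an `nfs` or `nfs4` result would mean the old pi-nas mount is still in place.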
@@ -603,9 +616,10 @@ curl -s http://100.64.0.14:6333/collections/recon_knowledge_hybrid \
 | Host | Role | Access |
 |---|---|---|
 | CT 130 (192.168.1.130 / 100.64.0.24) | RECON service | `ssh zvx@192.168.1.130` (key auth) |
+| data host (192.168.1.240) | Proxmox node hosting CT 130; `/mnt/data/library` source for the CT 130 bind-mount | `ssh root@192.168.1.240` |
 | cortex VM (192.168.1.150) | Qdrant, TEI, sparse svc, Ollama | `ssh zvx@cortex` |
 | CT 110 (192.168.1.170) | PeerTube `stream.echo6.co` | `ssh zvx@192.168.1.170` |
-| pi-nas (192.168.1.245) | NFS server for `/mnt/library` | `ssh zvx@pi-nas` |
+| pi-nas (192.168.1.245) | Backup target (planned; not yet configured). ~22T pool with ~300G free after library wipe. | `ssh zvx@pi-nas` |
 | CT 101 (192.168.1.101) | Caddy reverse proxy (home) | `ssh root@192.168.1.241 'pct exec 101'` |
 
 Secrets: `/home/zvx/projects/.ref/credentials` on TOC (this machine).
@@ -644,6 +658,13 @@ RECON Gemini/PeerTube keys: `/opt/recon/.env` on CT 130.
 Qdrant snapshots are not in any backup rotation. If CT 130 or cortex
 lose their disks, these are the hardest to regenerate (Gemini calls
 + embedding compute).
+- **Backup architecture** — no offsite backup is currently configured.
+Section 15 references a planned rsync-to-pi-nas job, but neither the
+script nor the systemd timer (`recon-backup.timer`) exist. Decide
+what gets backed up (`recon.db`, `concepts/`, `text/`, Qdrant
+snapshots, `/mnt/library`?), where, and on what cadence; pi-nas has
+~300G free in `/export/` after the 2026-04-15 library wipe and could
+be the target for a first pass.
 - **`signal-archive/` in `/mnt/library/`** — 44 Signal/Matrix chat log
 files, not library content. Matt intends these to "eventually
 contribute" to the knowledge base but no ingestion path exists yet.
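Reviewer note: the backup follow-up above records that neither the rsync script nor `recon-backup.timer` exists yet. As a starting point for that discussion only, here is a sketch of how the rsync invocation could be assembled and previewed before anything is wired into a timer. The source list and destination are assumptions drawn from the candidates named in this diff (`recon.db`, `concepts/`, and the "likely" `pi-nas:/export/recon-backup` target); `build_rsync_cmd` is a hypothetical helper. It uses only the standard rsync options `--archive`, `--relative`, and `--dry-run`.

```python
from typing import List, Sequence


def build_rsync_cmd(sources: Sequence[str], dest: str, dry_run: bool = True) -> List[str]:
    """Build an rsync argument list for a first-pass backup (hypothetical).

    --archive preserves permissions, times and ownership where possible;
    --relative recreates each source's full path under dest;
    --dry-run (the default here) reports what would be copied without copying.
    """
    if not sources:
        raise ValueError("nothing to back up")
    cmd = ["rsync", "--archive", "--relative"]
    if dry_run:
        cmd.append("--dry-run")
    cmd.extend(sources)
    cmd.append(dest)
    return cmd


# Candidate inputs, taken from the follow-up entry; still undecided.
CANDIDATE_SOURCES = ["/opt/recon/data/recon.db", "/opt/recon/data/concepts/"]
CANDIDATE_DEST = "pi-nas:/export/recon-backup"
```

The returned list can be handed to `subprocess.run(cmd, check=True)` once the scope and cadence questions are settled. Defaulting to a dry run keeps an accidental invocation from writing into the ~300G of pi-nas headroom.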