Mirror of https://github.com/zvx-echo6/recon.git (synced 2026-06-10 00:44:37 +02:00)
- Python 83.3%
- JavaScript 7.8%
- HTML 6.4%
- CSS 1.4%
- Shell 1.1%
* cleanup: remove /api/address_book handlers (extraction #3 shadow)

  Removes address_book_bp (lib/address_book_api.py: /api/address_book/lookup + /api/address_book/list) + its registration in lib/api.py. Edge-shadowed since extraction #3 -- navi-contacts (:8423) serves /api/address_book/* on navi.echo6.co; no recon-side consumer (no template/JS reference).

  lib/address_book.py is KEPT -- geocode.py (nickname short-circuit + annotation) and netsyms_api.py import it.

  NOT removed this PR: contacts_bp. The recon dashboard at /deleted-contacts (recon-product, stays) calls /api/contacts/<id>/{restore,restore-as,purge} via XHR, and recon.echo6.co proxies straight to recon:8420 (verified the Caddy block -- no navi-contacts routing there). Removing contacts_bp would break those dashboard actions. Flagged for a decision; lib/contacts.py also stays (dashboard ContactsDB reads). See PR body.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* cleanup: deprecate /nav-i + /deleted-contacts; remove contacts_bp + lib/contacts.py

  Probe found recon's /deleted-contacts dashboard reads /opt/recon/data/contacts.db -- frozen since extraction #3 moved write ownership to navi-contacts (/var/lib/navi-backend/contacts.db). The page has been silently rendering ~25-day stale data, and its restore/restore-as/purge XHRs hit recon's contacts_bp (the recon.echo6.co Caddy block proxies straight to recon:8420 -- no navi-contacts routing there). Per Matt's decision, deprecate the pages entirely; they'll be re-surfaced later as a proper admin page consuming navi-contacts via API.

  Removed:
  - contacts_bp (lib/contacts_api.py, all 10 /api/contacts* routes) + its registration in lib/api.py -- edge-shadowed by navi-contacts :8423 since #3, and now free of recon-product consumers once the dashboard goes.
  - /nav-i (navi_landing_page) + /deleted-contacts (deleted_contacts_page) route handlers; templates/navi/landing.html + templates/navi/deleted_contacts.html.
  - lib/contacts.py (ContactsDB) -- the dashboard was its only non-contacts_bp consumer; both gone.
  - The two dead NAVI_SUBNAV entries (Overview→/nav-i, Deleted Contacts→/deleted-contacts).

  Kept / adapted:
  - /nav-i/api-keys page (recon-product key management) stays. NAVI_SUBNAV reduced to just its API Keys entry; the base.html top-nav "Nav-I" link repointed /nav-i -> /nav-i/api-keys so the surviving section page stays reachable (minimal href change, not a nav restructure -- flagged in PR).
  - lib/address_book.py -- geocode.py + netsyms_api.py still consume it (untouched).

  Out-of-band follow-up after merge: delete the stale /opt/recon/data/contacts.db (frozen 2026-04-28; data, not code).

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* cleanup: pull the entire /nav-i/* subtree (api-keys page is a weaker dup of /settings/keys)

  Completes the contacts cleanup by removing the rest of /nav-i/. The /nav-i/api-keys page was (a) a weaker duplicate of /settings/keys for Gemini (it lacked remove + reload-from-.env), and (b) a write-only-to-dead-files surface for TomTom + Google Places: it wrote /opt/recon/.env, but the live navi-traffic (:8421) and navi-places (:8425) services read their own /etc/navi-backend/<svc>.env and have ignored recon's copy since extractions #1 + #5. End state: no /nav-i/* URLs in recon.

  Removed:
  - /nav-i/api-keys route + template (templates/navi/api_keys.html)
  - all /api/nav-i/api-keys/* endpoints (list/update/test/restart-recon)
  - lib/api_keys_admin.py (its only importers were those 4 endpoints; _KEY_DEFS/_read_env/_write_env were private to it)
  - the now-orphaned NAVI_SUBNAV
  - the "Nav-I" top-nav entry in base.html (reverses the /nav-i->/nav-i/api-keys repoint from the previous commit, now that the page itself is gone)

  Kept (Gemini's real home, recon-product):
  - /settings/keys + /api/keys/* + lib/key_manager.py (KeyManager) -- they import key_manager directly, never api_keys_admin, so untouched.

  Note: TOMTOM_API_KEY now has zero recon .py references. GOOGLE_PLACES_API_KEY still has one (lib/google_places.py), kept in the prior /api/place cleanup as place_detail's dep; its only caller (_enrich_with_google) is unreachable since the /api/place handlers were removed -- left in place pending /api/wiki-enrich retirement (out of scope here).

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: zvx-echo6 <mj@k7zvx.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

| Name |
|---|
| config |
| lib |
| scripts |
| static |
| templates |
| .gitignore |
| api.py |
| config.yaml |
| enricher.py |
| migrate_paths.py |
| PROJECT-BIBLE.md |
| README.md |
| recon.py |
| requirements.txt |
| run-pipeline-now.sh |
| sweep_gated.sh |
RECON -- Knowledge Extraction Pipeline
Extracts structured knowledge from PDFs and web content into a Qdrant vector database for RAG retrieval by Aurora.
Quick Start
# Activate
cd /opt/recon && source venv/bin/activate
# Scan library for new PDFs
recon scan
# Queue and process
recon queue
recon extract
recon enrich
recon embed
# Or run full pipeline
recon run
# Ingest a web page
recon ingest-url "https://example.com/article" --category "Category" --process
# Crawl an entire docs site
recon crawl "https://docs.example.com" --include /docs/ --category "Category" --process
# Upload a PDF
recon upload --file /path/to/document.pdf --category "Category"
# Search
recon search "water purification methods"
# Check status
recon status
recon failures
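The individual stages above (scan, queue, extract, enrich, embed) can also be driven from a script. The following is a minimal sketch, not part of RECON itself: `run_stages` and its `runner` parameter are hypothetical names, and it assumes that the `recon` CLI is on PATH (venv activated) and that each subcommand signals failure with a non-zero exit status. Whether this matches what `recon run` does internally is not confirmed by this README.

```python
import subprocess
from typing import Callable, List, Optional, Sequence

# Stage order taken from the Quick Start commands.
STAGES = ("scan", "queue", "extract", "enrich", "embed")


def run_stages(
    stages: Sequence[str] = STAGES,
    runner: Optional[Callable[[List[str]], int]] = None,
) -> List[str]:
    """Run each stage in order and stop at the first non-zero exit code.

    Returns the list of stages that completed successfully.
    """
    if runner is None:
        # Assumption: `recon` is on PATH and returns non-zero on failure.
        def runner(cmd: List[str]) -> int:
            return subprocess.run(cmd).returncode

    completed: List[str] = []
    for stage in stages:
        if runner(["recon", stage]) != 0:
            break  # do not start later stages after a failure
        completed.append(stage)
    return completed
```

Calling `run_stages()` with no arguments would invoke the real CLI; passing a custom `runner` makes it possible to dry-run or log the commands instead.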
Dashboard
Services
| Service | Location | Purpose |
|---|---|---|
| RECON Dashboard | recon:8420 | Pipeline management + API |
| Qdrant | cortex:6333 | Vector database |
| TEI | cortex:8090 | Embeddings (1,711/sec) |
| Ollama | cortex:11434 | Chat + fallback embeddings |
| OpenWebUI | cortex:8080 (ai.echo6.co) | Aurora chat with RAG |
| File Server | recon:8888 (files.echo6.co) | PDF downloads |
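A quick reachability check for the services in the table might look like the sketch below. It only tests whether a TCP connection can be opened to each host:port pair listed above and says nothing about whether the service behind the port is healthy. The `probe` function and its `connect` parameter are illustrative names, not part of RECON, and the hostnames are assumed to resolve from wherever the script runs.

```python
import socket
from typing import Callable, Dict, Tuple

# Host/port pairs copied from the Services table.
SERVICES: Dict[str, Tuple[str, int]] = {
    "RECON Dashboard": ("recon", 8420),
    "Qdrant": ("cortex", 6333),
    "TEI": ("cortex", 8090),
    "Ollama": ("cortex", 11434),
    "OpenWebUI": ("cortex", 8080),
    "File Server": ("recon", 8888),
}


def probe(
    services: Dict[str, Tuple[str, int]] = SERVICES,
    connect: Callable = socket.create_connection,
    timeout: float = 2.0,
) -> Dict[str, bool]:
    """Return {service name: True if a TCP connection could be opened}."""
    results: Dict[str, bool] = {}
    for name, (host, port) in services.items():
        try:
            conn = connect((host, port), timeout=timeout)
        except OSError:
            # Covers refused connections, timeouts and DNS failures.
            results[name] = False
        else:
            conn.close()
            results[name] = True
    return results
```

For example, `for name, ok in probe().items(): print(name, "up" if ok else "DOWN")` prints one line per service.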
Key Paths
| Path | Contents |
|---|---|
| /opt/recon/ | Application code |
| /opt/recon/data/concepts/ | Gemini extractions (CRITICAL -- back these up) |
| /opt/recon/data/text/ | Extracted text |
| /opt/recon/data/recon.db | SQLite status DB |
| /mnt/library/ | PDF library (NFS from pi-nas) |
Backups
Backups run automatically every 6 hours to a Contabo VPS via /opt/recon/scripts/backup.sh.
Concept JSONs are the most valuable data ($130+ of Gemini API work).
Qdrant is NOT backed up; it is rebuilt from the concept JSONs in ~10 minutes via recon rebuild.
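Since the concept JSONs are the data that cannot be regenerated cheaply, an ad-hoc local snapshot before risky maintenance can be useful. The sketch below is not the contents of backup.sh (which this README does not show); it simply archives a directory into a timestamped .tar.gz. `snapshot_concepts` and the destination directory are hypothetical.

```python
import tarfile
import time
from pathlib import Path

# Path from the Key Paths table.
CONCEPTS_DIR = Path("/opt/recon/data/concepts")


def snapshot_concepts(src: Path, dest_dir: Path) -> Path:
    """Write a timestamped .tar.gz of `src` into `dest_dir`; return its path."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest_dir / f"concepts-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # arcname keeps archive members relative, e.g. "concepts/<file>".
        tar.add(src, arcname=src.name)
    return archive
```

A call such as `snapshot_concepts(CONCEPTS_DIR, Path("/tmp/recon-snapshots"))` would produce one archive per invocation; the destination path here is only an example.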
Monitoring
# Pipeline status
recon status
# Tail logs
tail -f /opt/recon/logs/recon.log
# Pipeline run log
tail -f /opt/recon/pipeline.log
# Validate consistency
recon validate --deep
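Where a one-shot look at the end of a log is enough (for example from a cron job or a dashboard helper), a rough Python counterpart to `tail` could be written as follows. This is an illustrative sketch: `last_lines` is a hypothetical helper, and it makes no assumption about the log line format beyond the file being text.

```python
from collections import deque
from pathlib import Path
from typing import List

# Log path from the Monitoring commands.
LOG_FILE = Path("/opt/recon/logs/recon.log")


def last_lines(path: Path, n: int = 20) -> List[str]:
    """Return the final `n` lines of a text file.

    The file is streamed line by line; only `n` lines are held in memory.
    """
    with path.open("r", encoding="utf-8", errors="replace") as fh:
        return [line.rstrip("\n") for line in deque(fh, maxlen=n)]
```

For instance, `print("\n".join(last_lines(LOG_FILE, 50)))` would show the last 50 lines of the main log.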
Full Documentation
See PROJECT-BIBLE.md for complete system documentation.