matt/recon

mirror of https://github.com/zvx-echo6/recon.git synced 2026-05-20 14:44:54 +02:00

Python 87%
HTML 6%
JavaScript 5.3%
CSS 1%
Shell 0.7%

Matt f276b95753	2026-05-20 05:33:45 +00:00
Add /api/reverse/<lat>/<lon> localhost-sourced enrichment bundle

New geocode_bp sibling to the existing /api/reverse?lat=&lon= route (which is unchanged). Returns a flat 9-field bundle for the Central enrichment framework: name, city, county, state, country, postal_code (Photon), timezone (timezones.sqlite via R-tree + shapely), landclass (in-process lookup_landclass), elevation_m (Valhalla /height).

- Each component lookup is independent and wrapped in try/except: a failure logs a warning and yields null, never a 5xx. 400 only on unparseable / out-of-range coordinates.
- lat/lon parsed manually rather than via Flask <float:>, which rejects negative and integer coordinates and would 404 instead of 400.
- 10k-entry / 24h TTLCache keyed on coords rounded to 4 decimals.
- Tests mock Photon/Valhalla/landclass; one test exercises the real timezones.sqlite.

cachetools pinned in requirements.txt.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

config	offroute: PostGIS entry points with 100m densification and land_status tagging	2026-05-09 03:28:58 +00:00
lib	Add /api/reverse/<lat>/<lon> localhost-sourced enrichment bundle	2026-05-20 05:33:45 +00:00
scripts	Add Overture Maps POI enrichment layer for place details	2026-04-21 16:51:25 +00:00
static
templates	Add Nav-I API key management UI	2026-04-23 06:50:44 +00:00
.gitignore
api.py
config.yaml
enricher.py
migrate_paths.py
PROJECT-BIBLE.md
README.md
recon.py
requirements.txt	Add /api/reverse/<lat>/<lon> localhost-sourced enrichment bundle	2026-05-20 05:33:45 +00:00
run-pipeline-now.sh
sweep_gated.sh
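
The latest commit message in the listing above describes three behaviours of the /api/reverse/<lat>/<lon> route: manual coordinate parsing, per-field failure isolation, and a cache key rounded to 4 decimals. The following is a minimal standalone sketch of those ideas, not the code in lib. The names parse_coords, cache_key, safe_lookup and build_bundle are hypothetical, and the real route uses Flask and a cachetools TTLCache, which are left out here so the sketch runs on the standard library alone.

```python
import logging
from typing import Any, Callable, Dict, Optional, Tuple

log = logging.getLogger("geocode_sketch")  # hypothetical logger name


def parse_coords(lat_raw: str, lon_raw: str) -> Optional[Tuple[float, float]]:
    """Return (lat, lon), or None when the caller should answer HTTP 400."""
    try:
        lat, lon = float(lat_raw), float(lon_raw)
    except (TypeError, ValueError):
        return None  # unparseable
    # Range check; NaN also fails here because NaN comparisons are False.
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        return None  # out of range
    return lat, lon


def cache_key(lat: float, lon: float) -> Tuple[float, float]:
    """Round to 4 decimals so nearby requests share one cache entry."""
    return (round(lat, 4), round(lon, 4))


def safe_lookup(name: str, fn: Callable[[float, float], Any],
                lat: float, lon: float) -> Any:
    """Run one component lookup; log a warning and return None on failure."""
    try:
        return fn(lat, lon)
    except Exception as exc:  # deliberately broad: one field must not sink the rest
        log.warning("%s lookup failed for %s,%s: %s", name, lat, lon, exc)
        return None


def build_bundle(lat: float, lon: float,
                 lookups: Dict[str, Callable[[float, float], Any]]) -> Dict[str, Any]:
    """Assemble a flat bundle; each field is resolved independently."""
    return {field: safe_lookup(field, fn, lat, lon) for field, fn in lookups.items()}
```

Parsing the path segments by hand accepts inputs such as "-43" that Flask's default <float:> converter would not match, so the handler can return a 400 instead of falling through to a 404.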

README.md

RECON -- Knowledge Extraction Pipeline

Extracts structured knowledge from PDFs and web content into a Qdrant vector database for RAG retrieval by Aurora.

Quick Start

# Activate
cd /opt/recon && source venv/bin/activate

# Scan library for new PDFs
recon scan

# Queue and process
recon queue
recon extract
recon enrich
recon embed

# Or run full pipeline
recon run

# Ingest a web page
recon ingest-url "https://example.com/article" --category "Category" --process

# Crawl an entire docs site
recon crawl "https://docs.example.com" --include /docs/ --category "Category" --process

# Upload a PDF
recon upload --file /path/to/document.pdf --category "Category"

# Search
recon search "water purification methods"

# Check status
recon status
recon failures
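
The individual stages listed above (scan, queue, extract, enrich, embed) can also be driven from Python. The sketch below assumes the recon CLI is on PATH and that each subcommand signals failure with a non-zero exit code; neither is confirmed by this README. run_stages and its runner parameter are hypothetical names, and this is not a description of what recon run does internally.

```python
import subprocess
from typing import Callable, List, Optional, Sequence

# Stage names taken from the Quick Start commands, in the order shown there.
STAGES = ("scan", "queue", "extract", "enrich", "embed")


def run_stages(stages: Sequence[str] = STAGES,
               runner: Optional[Callable[[List[str]], int]] = None) -> List[str]:
    """Run each stage in order, stopping at the first non-zero exit code.

    Returns the list of stages that completed successfully.
    """
    if runner is None:
        # Default runner shells out to the recon CLI (assumed to be on PATH).
        runner = lambda argv: subprocess.run(argv).returncode
    done: List[str] = []
    for stage in stages:
        if runner(["recon", stage]) != 0:
            break  # assumed convention: non-zero exit means the stage failed
        done.append(stage)
    return done
```

Passing a custom runner makes it possible to dry-run the sequence or to log each command without touching the real pipeline.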

Dashboard

http://100.64.0.24:8420

Services

Service	Location	Purpose
RECON Dashboard	recon:8420	Pipeline management + API
Qdrant	cortex:6333	Vector database
TEI	cortex:8090	Embeddings (1,711/sec)
Ollama	cortex:11434	Chat + fallback embeddings
OpenWebUI	cortex:8080 (ai.echo6.co)	Aurora chat with RAG
File Server	recon:8888 (files.echo6.co)	PDF downloads
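
A quick way to confirm that the services in the table are listening is a plain TCP connect to each host and port. The sketch below copies the host names and ports from the table and assumes they resolve from wherever the script runs. is_reachable and check_all are hypothetical helpers, and a successful connect only shows the port is open, not that the service is healthy.

```python
import socket
from typing import Callable, Dict, Tuple

# Host/port pairs copied from the Services table.
SERVICES: Dict[str, Tuple[str, int]] = {
    "RECON Dashboard": ("recon", 8420),
    "Qdrant": ("cortex", 6333),
    "TEI": ("cortex", 8090),
    "Ollama": ("cortex", 11434),
    "OpenWebUI": ("cortex", 8080),
    "File Server": ("recon", 8888),
}


def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers DNS failures, refusals and timeouts
        return False


def check_all(services: Dict[str, Tuple[str, int]] = SERVICES,
              probe: Callable[[str, int], bool] = is_reachable) -> Dict[str, bool]:
    """Probe every service and return a name -> reachable mapping."""
    return {name: probe(host, port) for name, (host, port) in services.items()}


if __name__ == "__main__":
    for name, ok in check_all().items():
        print(f"{name:16} {'up' if ok else 'DOWN'}")
```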

Key Paths

Path	Contents
/opt/recon/	Application code
/opt/recon/data/concepts/	Gemini extractions (CRITICAL -- back these up)
/opt/recon/data/text/	Extracted text
/opt/recon/data/recon.db	SQLite status DB
/mnt/library/	PDF library (NFS from pi-nas)

Backups

Backups run automatically every 6 hours to a Contabo VPS via /opt/recon/scripts/backup.sh. The concept JSONs are the most valuable data (they represent $130+ of Gemini API work). Qdrant is NOT backed up; it can be rebuilt from the JSONs in ~10 minutes via recon rebuild.
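
Because the concept JSONs are the only copy of the Gemini output, it can be worth checking that they still parse before relying on a backup. The sketch below assumes the concepts are stored as *.json files somewhere under /opt/recon/data/concepts/, which this README does not spell out. find_bad_json is a hypothetical helper and is not part of backup.sh or recon validate.

```python
import json
from pathlib import Path
from typing import List


def find_bad_json(root: str) -> List[Path]:
    """Return every *.json file under root that cannot be read or parsed."""
    bad: List[Path] = []
    for path in sorted(Path(root).rglob("*.json")):
        try:
            with path.open(encoding="utf-8") as fh:
                json.load(fh)
        except (OSError, ValueError):  # ValueError covers JSON and decoding errors
            bad.append(path)
    return bad


if __name__ == "__main__":
    # Path taken from the Key Paths table; the *.json layout is an assumption.
    broken = find_bad_json("/opt/recon/data/concepts")
    print(f"{len(broken)} unreadable concept file(s)")
    for p in broken:
        print(" ", p)
```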

Monitoring

# Pipeline status
recon status

# Tail logs
tail -f /opt/recon/logs/recon.log

# Pipeline run log
tail -f /opt/recon/pipeline.log

# Validate consistency
recon validate --deep

Full Documentation

See PROJECT-BIBLE.md for complete system documentation.