PR-B of decouple #4-REWRITE — the LAST recon→navi decoupling step. navi-places
now owns the Kiwix link-rewrite logic in-process (navi-backend PR-A 7103c27,
deployed + verified: Twin Falls live route returns wiki_rewrites local/public
from navi's own wiki_cache.db; zero outbound calls to recon /api/wiki-rewrite).
- DELETE lib/wiki_rewrite.py (the Kiwix rewrite logic — ported to navi-places).
- DELETE lib/wiki_rewrite_api.py (the /api/wiki-rewrite blueprint).
- DELETE lib/wiki_rewrite_api_test.py (tests the deleted endpoint).
- api.py: drop the wiki_rewrite_bp import + register_blueprint + section comment.
Verified zero recon consumers: nothing in recon imports wiki_rewrite — it was
purely an HTTP endpoint for navi-places. After this, recon services make and
receive zero navi-ecosystem runtime calls; recon is a fully separate product.
Out-of-band (post-deploy): DROP TABLE wiki_cache from /opt/recon/data/place_cache.db
(table only — place_cache + google_api_calls stay).
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR-B of decouple #4-READ. navi-places now reads its own wiki_index.db directly
(navi-backend a8f9520, deployed + verified: Horseshoe Falls enrichment served
from /var/lib/navi-backend/wiki_index.db; admin-info dropped the recon-wiki-enrich
dependency). recon's endpoint is edge-unreachable and unused, so it is safe to remove.
- DELETE lib/wiki_enrich_api.py (the /api/wiki-enrich blueprint).
- DELETE lib/place_detail.py (97-line survivor: lookup_wiki_index +
_get_wiki_index_db) — its only consumer was wiki_enrich_api.py (verified zero
non-test code consumers). Fully orphaned.
- DELETE lib/wiki_enrich_api_test.py (tests the deleted endpoint).
- api.py: drop the wiki_enrich_bp import + register_blueprint.
Untouched (separate decouple): /api/wiki-rewrite (wiki_rewrite_api.py +
wiki_rewrite.py), still navi-consumed. /opt/recon/data/wiki_index.db left in
place (data; now a harmless dead file). Internal localhost migration — no nginx.
Flag (doc follow-up, not fixed): deployment_config.py:10 + wiki_rewrite_api.py:6
both have stale in-prose references to the deleted place_detail.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR-B of the 2-PR whoami migration. The route is now served by navi-admin
:8427 via nginx (`^~ /api/auth/whoami` cutover verified live — edge responses
carry navi-admin's X-Cache-Status: BYPASS), so recon's handler is
edge-unreachable and safe to remove.
- lib/api.py: delete the @app.route('/api/auth/whoami') api_auth_whoami handler
+ its dedicated section comment. It was the file tail (post-cleanup-#6), so
api.py now ends on the metrics-history handler.
Sequenced after PR-A (navi-backend, merged + deployed) and the nginx edge
cutover, so the route never 404s. recon serves zero navi-facing auth-state
endpoints now.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
/api/offroute (POST) and /api/mvum (GET) have been edge-shadowed since extraction #8
— navi-offroute :8428 serves both via nginx. Cleanup #4 removed the last
in-process consumer of lib/offroute/dem.py (netsyms_api._reverse_elevation +
the module-level _DEM = DEMReader()), so the entire 9-file lib/offroute/
package is now orphaned and goes with this PR.
- api.py: drop both handlers (api_offroute, api_mvum) + their section comments.
Both used in-function lazy imports of offroute, so no top-of-file import
survives.
- DELETE lib/offroute/ wholesale (__init__, router, mvum, cost, barriers, dem,
friction, trails, prototype). prototype.py was already dead at runtime.
Closes the recon->navi shadow-route cleanup loop: recon now serves zero navi-*
shadow routes.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
/api/landclass has been edge-shadowed since extraction #4: navi-landclass :8424
serves the route via nginx. Cleanup #4 removed the last in-process consumer
(netsyms_api._reverse_landclass), so lib/landclass.py is now fully orphaned.
- api.py: drop the @app.route('/api/landclass') handler + the
`from .landclass import lookup_landclass, format_summary` import.
- DELETE lib/landclass.py (only consumer was the deleted handler).
- DELETE lib/landclass_test.py (SUT gone).
PADUS_DB_* vars in /opt/recon/.env are now dead in recon — flagged for an
out-of-band post-merge cleanup, not touched here (data, not code).
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
All three routes (/api/geocode, /api/reverse, /api/reverse/<lat>/<lon>) have been
edge-shadowed since extraction #6 — navi-geo :8426 serves them via nginx.
- netsyms_api.py: drop geocode_bp + its three handlers, the bundle-private
helpers, and module state (TTLCache/lock/_TZ_DB_PATH/_DEM). netsyms_bp
(/api/netsyms/lookup + /health) survives.
- api.py: drop the geocode_bp import + register_blueprint line.
- DELETE lib/geocode.py, lib/nav_tools.py (both orphaned once the handlers go).
- DELETE reverse_bundle_test.py, geocode_test.py, nav_tools_test.py.
Decouples netsyms_api.py from landclass.py and offroute/dem.py — prerequisite
for cleanups #5 and #6.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* cleanup: remove /api/address_book handlers (extraction #3 shadow)
Removes address_book_bp (lib/address_book_api.py: /api/address_book/lookup +
/api/address_book/list) + its registration in lib/api.py. Edge-shadowed since
extraction #3 — navi-contacts (:8423) serves /api/address_book/* on
navi.echo6.co; no recon-side consumer (no template/JS reference).
lib/address_book.py is KEPT — geocode.py (nickname short-circuit + annotation)
and netsyms_api.py import it.
NOT removed this PR: contacts_bp. The recon dashboard at /deleted-contacts
(recon-product, stays) calls /api/contacts/<id>/{restore,restore-as,purge} via
XHR, and recon.echo6.co proxies straight to recon:8420 (verified the Caddy
block — no navi-contacts routing there). Removing contacts_bp would break those
dashboard actions. Flagged for a decision; lib/contacts.py also stays (dashboard
ContactsDB reads). See PR body.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* cleanup: deprecate /nav-i + /deleted-contacts; remove contacts_bp + lib/contacts.py
Probe found recon's /deleted-contacts dashboard reads /opt/recon/data/contacts.db
— frozen since extraction #3 moved write ownership to navi-contacts
(/var/lib/navi-backend/contacts.db). The page has been silently rendering ~25-day
stale data, and its restore/restore-as/purge XHRs hit recon's contacts_bp (the
recon.echo6.co Caddy block proxies straight to recon:8420 — no navi-contacts
routing there). Per Matt's decision, deprecate the pages entirely; they'll be
re-surfaced later as a proper admin page consuming navi-contacts via API.
Removed:
- contacts_bp (lib/contacts_api.py, all 10 /api/contacts* routes) + its
registration in lib/api.py — edge-shadowed by navi-contacts :8423 since #3,
and now free of recon-product consumers once the dashboard goes.
- /nav-i (navi_landing_page) + /deleted-contacts (deleted_contacts_page) route
handlers; templates/navi/landing.html + templates/navi/deleted_contacts.html.
- lib/contacts.py (ContactsDB) — the dashboard was its only non-contacts_bp
consumer; both gone.
- The two dead NAVI_SUBNAV entries (Overview→/nav-i, Deleted Contacts→
/deleted-contacts).
Kept / adapted:
- /nav-i/api-keys page (recon-product key management) stays. NAVI_SUBNAV reduced
to just its API Keys entry; the base.html top-nav "Nav-I" link repointed
/nav-i -> /nav-i/api-keys so the surviving section page stays reachable
(minimal href change, not a nav restructure — flagged in PR).
- lib/address_book.py — geocode.py + netsyms_api.py still consume it (untouched).
Out-of-band follow-up after merge: delete the stale /opt/recon/data/contacts.db
(frozen 2026-04-28; data, not code).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* cleanup: pull the entire /nav-i/* subtree (api-keys page is a weaker dup of /settings/keys)
Completes the contacts cleanup by removing the rest of /nav-i/. The
/nav-i/api-keys page was (a) a weaker duplicate of /settings/keys for Gemini
(it lacked remove + reload-from-.env), and (b) for TomTom + Google Places, a
surface that wrote only to dead files: it wrote /opt/recon/.env, but the live
navi-traffic (:8421) and navi-places (:8425) services read their own
/etc/navi-backend/<svc>.env and have ignored recon's copy since extractions
#1 + #5. End state: no /nav-i/* URLs in recon.
Removed:
- /nav-i/api-keys route + template (templates/navi/api_keys.html)
- all /api/nav-i/api-keys/* endpoints (list/update/test/restart-recon)
- lib/api_keys_admin.py (its only importers were those 4 endpoints; _KEY_DEFS/
_read_env/_write_env were private to it)
- the now-orphaned NAVI_SUBNAV
- the "Nav-I" top-nav entry in base.html (reverses the /nav-i->/nav-i/api-keys
repoint from the previous commit, now that the page itself is gone)
Kept (Gemini's real home, recon-product):
- /settings/keys + /api/keys/* + lib/key_manager.py (KeyManager) — they import
key_manager directly, never api_keys_admin, so untouched.
Note: TOMTOM_API_KEY now has zero recon .py references. GOOGLE_PLACES_API_KEY
still has one (lib/google_places.py), kept in the prior /api/place cleanup as
place_detail's dep; its only caller (_enrich_with_google) is unreachable since
the /api/place handlers were removed — left in place pending /api/wiki-enrich
retirement (out of scope here).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: zvx-echo6 <mj@k7zvx.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
/api/place/<osm_type>/<int:osm_id> and /api/place/wikidata/<id> have been
edge-shadowed since extraction #5 — navi-places (:8425) serves both via
nginx. Removes the two recon-side handlers + the now-unused
`from .place_detail import get_place_detail, get_place_by_wikidata` import.
NO modules deleted. place_detail.py is KEPT — wiki_enrich_api.py (the
/api/wiki-enrich endpoint, which stays; navi-places HTTP-consumes it) imports
`lookup_wiki_index` from it. That transitively keeps its deps google_places.py,
overture.py, osm_categories.py (all imported only by place_detail). This
corrects Phase A #5 §3's "only lib/api.py imports place_detail" — the
wiki-enrich endpoint (added post-#5) is a second consumer.
Co-authored-by: zvx-echo6 <mj@k7zvx.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* cleanup: remove /api/config handler (extraction #2 shadow)
recon's /api/config Flask handler (lib/api.py) has been edge-shadowed since
extraction #2 — navi-config (:8422) serves the route via nginx on
navi.echo6.co. The recon-side handler is dead at the edge; remove it.
lib/deployment_config.py is KEPT: get_deployment_config() still has many
in-process consumers (lib/api.py:1237 /api/landclass has_landclass gate,
google_places.py, place_detail.py x4, offroute/router.py). Only the
/api/config HTTP handler is removed; the import at api.py:27 stays.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* cleanup: refresh deployment_config docstring (drop /api/config reference)
The module docstring still said get_deployment_config() was "for use by the
/api/config endpoint" — that handler was removed in the parent commit. Rewrite
to reflect the actual 5 in-process consumers (landclass gate, google_places,
place_detail ×4, offroute/router.py profile.offroute.*).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: zvx-echo6 <mj@k7zvx.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per-tag HTTP wrapper over wiki_rewrite.rewrite_wiki_link so the (future)
navi-places service can rewrite OSM wiki tags to local Kiwix URLs over HTTP
instead of importing recon's wiki_rewrite module (which talks to Kiwix on
localhost:8430 and the wiki_cache table in /opt/recon/data/place_cache.db).
Companion to PR #8 (/api/wiki-enrich) — Matt picked option B (HTTP-couple the
Kiwix offline-wiki rewriting too, since it matters in prod).
GET /api/wiki-rewrite?tag=<wikipedia|wikidata|wikivoyage|appropedia>&value=<raw>
-> 200 {url, status} where status is "local" | "public" | "original"
-> 400 on missing value or unknown tag
-> no 404 (unclassifiable value echoes back with status "original",
mirroring rewrite_wiki_link)
Public (no auth), like /api/place/* and /api/wiki-enrich.
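The request contract above can be sketched as a framework-free function. This is an illustration only, not the blueprint's code: `handle_wiki_rewrite`, the error body shape, and the assumption that the rewriter returns a `(url, status)` pair are all hypothetical.

```python
from typing import Callable, Dict, Tuple

# Tags accepted by the endpoint, as listed in the contract above.
ALLOWED_TAGS = {"wikipedia", "wikidata", "wikivoyage", "appropedia"}


def handle_wiki_rewrite(
    args: Dict[str, str],
    rewrite: Callable[[str, str], Tuple[str, str]],
) -> Tuple[int, dict]:
    """Return (http_status, json_body) for a GET /api/wiki-rewrite query.

    `rewrite` stands in for rewrite_wiki_link; its (url, status) return
    shape is an assumption made for this sketch.
    """
    tag = (args.get("tag") or "").strip().lower()
    value = (args.get("value") or "").strip()
    if tag not in ALLOWED_TAGS:
        return 400, {"error": "unknown tag"}      # hypothetical error body
    if not value:
        return 400, {"error": "missing value"}    # hypothetical error body
    url, status = rewrite(tag, value)
    # There is no 404 branch: an unclassifiable value comes back as "original".
    return 200, {"url": url, "status": status}


def echo_rewrite(tag: str, value: str) -> Tuple[str, str]:
    """Stand-in rewriter that always takes the 'original' path."""
    return value, "original"
```

Passing the rewriter in as a parameter mirrors how the tests stub out the Kiwix check, so the contract can be exercised without Kiwix or a database.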
Changes (additive only):
- lib/wiki_rewrite_api.py: new wiki_rewrite_bp blueprint. Thin route directly
over the existing rewrite_wiki_link(tag, value) — no extraction needed
(it's already a clean standalone function, unlike wiki-enrich's lookup).
- lib/api.py: register the blueprint (one block).
- lib/wiki_rewrite_api_test.py: 5 tests (local Kiwix hit, public fallback,
unclassifiable -> original, missing value -> 400, unknown tag -> 400),
stubbing check_kiwix_has_article (no Kiwix/DB), plain-assert + __main__
runner. Verified green against recon's venv (flask 3.1.2).
Does NOT touch place_detail's in-process _enrich_wiki_links — that gets removed
in a later PR once navi-places is live (same as PR #8). wiki_cache stays in
recon's own place_cache.db post-cutover (harmless positive-cache duplication).
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
HTTP wrapper over the wiki_index lookup so the (future) navi-places service can
fetch wiki enrichment over HTTP instead of reading recon's 2.1 GB
data/wiki_index.db directly (Phase A option B — HTTP coupling).
GET /api/wiki-enrich?wikidata=<Qid> (primary key)
GET /api/wiki-enrich?name=<name>&country=<cc> (fallback key)
-> 200 {wiki_summary?, wiki_population?, wiki_url?, wikivoyage_url?}
-> 400 if no usable key; 404 on no match. Public (no auth, like /api/place/*).
Route keys are wikidata_id / name+country — NOT osm_type/osm_id — because that
is how wiki_index is actually queried (the in-process _enrich_with_wiki_index
looks up by result['wikidata_id'] then name+country_code, never by OSM id; see
extraction-5-wiki-enrich-investigation.md). An osm-keyed route would have forced
a redundant in-recon place lookup.
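A minimal sketch of the two-key lookup order described above (wikidata id first, then name + country). The table and column names are hypothetical, since the real wiki_index schema is not shown here, and the function name `lookup_wiki` is illustrative.

```python
import sqlite3
from typing import Optional

# Hypothetical schema: wiki_index(wikidata_id, name, country_code,
#                                 summary, population, url)
_COLUMNS = "summary, population, url"


def lookup_wiki(
    conn: sqlite3.Connection,
    wikidata_id: Optional[str] = None,
    name: Optional[str] = None,
    country_code: Optional[str] = None,
) -> Optional[dict]:
    """Try the primary key first, then the name+country fallback key."""
    row = None
    if wikidata_id:
        row = conn.execute(
            f"SELECT {_COLUMNS} FROM wiki_index WHERE wikidata_id = ?",
            (wikidata_id,),
        ).fetchone()
    if row is None and name and country_code:
        row = conn.execute(
            f"SELECT {_COLUMNS} FROM wiki_index WHERE name = ? AND country_code = ?",
            (name, country_code),
        ).fetchone()
    if row is None:
        return None  # the HTTP layer would turn this into a 404
    summary, population, url = row
    # Only include fields that are present, matching the optional keys above.
    result = {}
    if summary:
        result["wiki_summary"] = summary
    if population is not None:
        result["wiki_population"] = population
    if url:
        result["wiki_url"] = url
    return result
```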
Changes (additive only):
- lib/place_detail.py: new standalone lookup_wiki_index(wikidata_id, name,
country_code) doing the same two SELECTs + field/URL mapping as the
in-process path, returning a dict or None. Pure DB read, never raises.
`_enrich_with_wiki_index` is LEFT UNTOUCHED — it can be DRY-refactored to
delegate to this in a later PR; the in-process enrichment path is unchanged.
- lib/wiki_enrich_api.py: new wiki_enrich_bp blueprint with the route.
- lib/api.py: register the blueprint (one block).
- lib/wiki_enrich_api_test.py: 4 tests (hit-by-wikidata + decoded fields,
no-match -> 404, name+country fallback, no-key -> 400) over an in-memory
fixture DB; plain-assert style + __main__ runner (recon venv has no pytest).
Verified green against recon's venv (flask 3.1.2).
Does NOT remove the in-process _enrich_with_wiki_index call from place_detail —
that happens in a later PR once navi-places is live and serving.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The /api/traffic/flow/<z>/<x>/<y>.png handler is dead code in recon. As of
extraction #1 of the recon<->Navi decoupling, this path is served by the
standalone navi-traffic service. Live request flow is now:
Caddy (CT 101, navi.echo6.co @authed_api, forward_auth)
-> nginx :8440 (location ^~ /api/traffic/ -> proxy_cache traffic_cache)
-> navi-traffic gunicorn :8421 (services/navi_traffic)
Cutover verified live: authenticated browser fetch to
https://navi.echo6.co/api/traffic/flow/... returns 200 image/png with
X-Cache-Status MISS then HIT (120s cache), Server: gunicorn.
navi-backend (github.com/zvx-echo6/navi-backend):
- dae54f3 Initial scaffold: navi-backend + navi-traffic
- 311cb8f nginx: use ^~ prefix on /api/traffic/ to beat .png regex catch-all
Caddy cutover (@authed_api upstream 8420 -> nginx 8440) applied on Utility
CT 101. Also drops the now-unused make_response flask import (no other uses
in lib/api.py). os and http_requests remain (used elsewhere).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
MVUM Data Import:
- Downloaded USFS MVUM Roads (150,636 features) and Trails (28,741 features)
- Imported to navi.db as mvum_roads and mvum_trails tables
- Idaho coverage: ~8,994 roads and ~4,504 trails across 7 national forests
- Preserved all vehicle-class fields (ATV, MOTORCYCLE, HIGHCLEARANCEVEHICLE, etc.)
- Preserved seasonal date ranges (*_DATESOPEN fields)
New mvum.py module:
- MVUMReader class for querying MVUM data by bbox and nearest point
- parse_date_range() for seasonal date string parsing (MM/DD-MM/DD format)
- check_access() for determining open/closed status with date checking
- symbol_to_access() fallback when per-vehicle fields are null
- get_mvum_access_grid() for rasterizing MVUM to pathfinder grid
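As an illustration of the seasonal parsing listed above, here is a small sketch of an MM/DD-MM/DD parser plus an open/closed check. It is not the mvum.py implementation: `is_open_on` is a hypothetical helper, the handling of ranges that wrap past the new year is an assumption, and month/day values are not range-checked.

```python
from datetime import date
from typing import Optional, Tuple

MonthDay = Tuple[int, int]


def parse_date_range(text: Optional[str]) -> Optional[Tuple[MonthDay, MonthDay]]:
    """Parse 'MM/DD-MM/DD' into ((month, day), (month, day)), or None."""
    if not text:
        return None
    try:
        start_s, end_s = text.strip().split("-")
        start_m, start_d = (int(part) for part in start_s.split("/"))
        end_m, end_d = (int(part) for part in end_s.split("/"))
    except ValueError:
        # Wrong number of pieces or a non-numeric piece.
        return None
    return (start_m, start_d), (end_m, end_d)


def is_open_on(text: Optional[str], day: date) -> Optional[bool]:
    """True/False if the range is parseable, None if it is not."""
    parsed = parse_date_range(text)
    if parsed is None:
        return None
    start, end = parsed
    today = (day.month, day.day)
    if start <= end:
        return start <= today <= end
    # Assumed behaviour: a range such as 12/01-03/31 wraps over the new year.
    return today >= start or today <= end
```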
Cost function integration:
- Added mvum parameter to compute_cost_grid()
- MVUM closures respond to boundary_mode:
* strict = impassable (np.inf)
* pragmatic = 5x friction penalty
* emergency = ignored entirely
- Foot mode skips MVUM (motor-vehicle specific)
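The boundary_mode rules above can be sketched per cell as follows. This is a simplified scalar version under stated assumptions: the real integration works on a grid and uses np.inf, while this sketch uses math.inf to stay dependency-free, and `apply_mvum_closure` is a hypothetical name.

```python
import math


def apply_mvum_closure(friction: float, closed: bool,
                       boundary_mode: str, travel_mode: str) -> float:
    """Return the friction for one cell after applying an MVUM closure."""
    # Foot travel skips MVUM entirely; emergency mode ignores closures.
    if travel_mode == "foot" or boundary_mode == "emergency":
        return friction
    if not closed:
        return friction
    if boundary_mode == "strict":
        return math.inf          # impassable
    if boundary_mode == "pragmatic":
        return friction * 5.0    # 5x friction penalty
    raise ValueError(f"unknown boundary_mode: {boundary_mode!r}")
```

Returning infinity for strict mode lets a pathfinder treat the cell as blocked without a separate mask, which is presumably why the cost grid uses it.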
Router integration:
- Loads MVUM access grid for the non-foot modes (mtb, atv, vehicle)
- Tracks mvum_closed_crossings in path summary
Places Panel API:
- GET /api/mvum?lat=XX&lon=XX&radius=50
- Returns MVUM feature with access status for all vehicle classes
- Includes seasonal date ranges, maintenance level, forest/district info
- GeoJSON geometry for map display
Validation:
- MVUM places endpoint tested with Sawtooth NF road
- All four modes validated with strict/pragmatic/emergency boundary modes
- Foot mode correctly ignores MVUM restrictions
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Adds /api/auth/whoami, which returns {authenticated: bool, username: string|null}
depending on whether the X-Authentik-Username header is present. Used by the Navi
frontend to detect auth state without triggering an SSO redirect.
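A minimal sketch of the auth-state response described above, written as a plain function over a header mapping. The function name is hypothetical, and treating an empty header value as unauthenticated is an assumption; a real handler would also rely on the framework's case-insensitive header lookup.

```python
from typing import Mapping


def whoami(headers: Mapping[str, str]) -> dict:
    """Build the {authenticated, username} body from request headers."""
    username = headers.get("X-Authentik-Username")
    # Assumption: an empty header value counts as "not authenticated".
    return {"authenticated": bool(username), "username": username or None}
```

Because the response is always a 200 with a boolean, the frontend can poll it without ever being bounced to the SSO login page.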
- Add get_place_by_wikidata() to place_detail.py
- Queries Wikidata API for entity details (name, description, coords)
- Extracts population, instance_of, OSM relation ID, Wikipedia link
- Add /api/place/wikidata/<id> route to api.py
Supports Navi basemap label enrichment when OSM details unavailable.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace /nav-i/api-keys stub with functional admin page for managing
third-party API keys (Gemini, TomTom, Google Places).
- New lib/api_keys_admin.py: list/update/test operations with masked
display, atomic .env writes (.env.bak backup), provider-specific
test calls (Gemini models.list, TomTom geocode, Google Places
searchText)
- 4 new endpoints: GET /api/nav-i/api-keys/list, POST .../update,
POST .../test, POST .../restart-recon
- Full UI: key table with masked values, per-key update modal with
show/hide toggle, inline test results with latency, Gemini detail
sub-table with per-key stats, RECON restart with confirmation
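The masked display and the atomic .env write with a .env.bak backup could look roughly like the sketch below. Both helpers are hypothetical stand-ins, not the contents of lib/api_keys_admin.py; the number of visible characters and the temp-file naming are assumptions.

```python
import os
import shutil
import tempfile


def mask_key(value: str, visible: int = 4) -> str:
    """Hide all but the last `visible` characters of a secret."""
    if not value:
        return ""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def write_env_atomic(path: str, text: str) -> None:
    """Back up `path` to `path + '.bak'`, then replace it atomically."""
    if os.path.exists(path):
        shutil.copy2(path, path + ".bak")
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so os.replace stays on one
    # filesystem, which is what makes the final rename atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".env.tmp.")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Readers of the file see either the old contents or the new contents, never a half-written file, and the previous version remains available in the backup.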
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Integrates USGS PAD-US 4.0 (651k features) into a local PostGIS database
for point-in-polygon land ownership queries. Adds /api/landclass endpoint
returning classifications, public/private status, and management hierarchy.
- lib/landclass.py: connection pool, lookup_landclass(), domain label maps
- lib/api.py: GET /api/landclass?lat=&lon= (feature-flag gated)
- home.yaml: enable has_landclass flag
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New /api/place/<osm_type>/<osm_id> endpoint returns cleaned OSM tag data
for PlaceDetail panel enrichment. Routes to local Nominatim (Idaho coverage)
first, falls back to Overpass public API for out-of-region queries. Responses
cached in SQLite (data/place_cache.db) with no expiry.
New modules: lib/place_detail.py (proxy + cache), lib/osm_categories.py
(~50 category humanization mappings). Profile YAMLs updated with
place_details config block and has_nominatim_details flag.
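The lookup order described above (cache, then local Nominatim, then public Overpass, with no cache expiry) can be sketched like this. The cache schema, the function names, and the idea of passing the two fetchers in as callables are all assumptions made for illustration.

```python
import json
import sqlite3
from typing import Callable, Optional

Fetcher = Callable[[str, int], Optional[dict]]


def init_cache(conn: sqlite3.Connection) -> None:
    """Create the hypothetical cache table used by this sketch."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS place_cache (key TEXT PRIMARY KEY, body TEXT)"
    )


def get_place(conn: sqlite3.Connection, osm_type: str, osm_id: int,
              primary: Fetcher, fallback: Fetcher) -> Optional[dict]:
    """Serve from cache, else try `primary`, else `fallback`; cache any hit."""
    key = f"{osm_type}/{osm_id}"
    row = conn.execute(
        "SELECT body FROM place_cache WHERE key = ?", (key,)
    ).fetchone()
    if row is not None:
        return json.loads(row[0])     # cached entries never expire
    data = primary(osm_type, osm_id)  # e.g. the local Nominatim instance
    if data is None:
        data = fallback(osm_type, osm_id)  # e.g. the public Overpass API
    if data is not None:
        conn.execute(
            "INSERT OR REPLACE INTO place_cache (key, body) VALUES (?, ?)",
            (key, json.dumps(data)),
        )
        conn.commit()
    return data
```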
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add /api/traffic/flow proxy route to hide TomTom API key from frontend
- Add tileset_hillshade and traffic config blocks to all three profiles
- Flip has_hillshade and has_traffic_overlay flags in home and regional profiles
- Minimal profile has config blocks but flags remain false (dormant)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add profile-driven config infrastructure:
- config/profiles/{home,regional_pi,minimal_pi}.yaml templates
- lib/deployment_config.py loader (reads RECON_PROFILE env var)
- GET /api/config returns active profile as JSON (5min cache)
Frontend reads this on startup to determine tile source, defaults,
and feature flags. No existing behavior changed.
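A rough sketch of the two mechanisms named above: picking the profile from RECON_PROFILE and holding the loaded value for five minutes. Whether the 5-minute cache is in-process is not stated here, so the `TTLValue` class is an assumption, as is the "home" default.

```python
import os
import time
from typing import Any, Callable


def active_profile_name(default: str = "home") -> str:
    """Read the profile name from RECON_PROFILE (default is an assumption)."""
    return os.environ.get("RECON_PROFILE", default)


class TTLValue:
    """Cache one loaded value and reload it once ttl_seconds have passed."""

    def __init__(self, loader: Callable[[], Any], ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Any = None
        self._expires_at = None

    def get(self) -> Any:
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._loader()
            self._expires_at = now + self._ttl
        return self._value
```

The injectable clock exists only so the expiry logic can be checked without sleeping.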
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- YAML-backed saved locations (config/address_book.yaml)
- Exact/partial alias matching with case-insensitive lookup
- Flask blueprint: /api/address_book/lookup, /api/address_book/list
- Geocoder short-circuits Photon when address book has exact match
- Test suite for lookup behavior
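The exact-then-partial alias matching above could be sketched as follows. The entry shape (a dict with an "aliases" list) and the choice of substring matching for "partial" are assumptions; the real address_book.py may differ.

```python
from typing import List, Optional


def lookup(entries: List[dict], query: str) -> Optional[dict]:
    """Case-insensitive alias lookup: exact matches win over partial ones."""
    needle = query.strip().lower()
    if not needle:
        return None
    # First pass: an alias equal to the query.
    for entry in entries:
        if any(alias.lower() == needle for alias in entry["aliases"]):
            return {"match": "exact", "entry": entry}
    # Second pass (assumed semantics): the query appears inside an alias.
    for entry in entries:
        if any(needle in alias.lower() for alias in entry["aliases"]):
            return {"match": "partial", "entry": entry}
    return None
```

Reporting the match kind separately is what would let a geocoder short-circuit only on exact matches, as the list above describes.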
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Extract shared _full_zim_cleanup(source_id) from api_kiwix_remove
- Add SIGHUP to kiwix-serve after kiwix-manage remove
- Delete linked scrape_jobs rows during ZIM removal
- Update api_scraper_delete to do full ZIM cleanup when applicable
- Set chromium_path for single-file browser crawl support
- Add status.db to .gitignore
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New API endpoints: DELETE single job, clear all failed/cancelled.
Dashboard now shows a Delete button on completed jobs, Retry + Delete on
failed jobs, and a Clear Failed bulk action.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New /kiwix/scraper page with submit form (URL, title, language,
crawl mode), stats cards, and auto-refreshing jobs table with
cancel/retry actions. Kiwix section now has Library/Scraper subnav.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Status was showing COMPLETE after ZIM extraction finished, even when
documents were still queued for enrichment/embedding. Now computes
effective_status by checking actual pipeline state per-source:
- DETECTED: ingest not enabled (gray)
- EXTRACTING: ZIM processor running (blue)
- PROCESSING: extracted but docs still in enricher/embedder queue (amber)
- COMPLETE: all docs fully enriched and embedded in Qdrant (green)
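The four states above amount to a small decision ladder. The sketch below is illustrative only: the input names are hypothetical, and how the real code derives them from the pipeline tables is not shown here.

```python
def effective_status(ingest_enabled: bool, extracting: bool,
                     extracted: int, in_qdrant: int) -> str:
    """Map per-source pipeline state to the displayed status.

    extracted: documents produced by ZIM extraction for this source.
    in_qdrant: documents from this source fully enriched and embedded.
    """
    if not ingest_enabled:
        return "DETECTED"
    if extracting:
        return "EXTRACTING"
    if in_qdrant < extracted:
        # Extraction finished, but some docs are still queued downstream.
        return "PROCESSING"
    return "COMPLETE"
```

The earlier bug corresponds to skipping the third check, so a source whose extraction had finished was reported COMPLETE while documents were still queued.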
Also fixed _build_kiwix_sources pipeline query to filter by category
per-source instead of returning global kiwix stats for every source.
Progress column now shows "X / Y in Qdrant" when processing, or
"X / Y extracted" otherwise.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Install langdetect package for content-level language detection
- Add _check_language() to enricher.py: reads first 1500 chars of first
page, detects language via langdetect, skips if not in allowed list
- Configurable via config.yaml pipeline.language_filter and
pipeline.allowed_languages (default: en only)
- Catches non-English content from ANY source (PDF, web, ZIM, PeerTube)
before burning Gemini API quota on enrichment
- Add scan_zims retry logic (3 attempts, 2s delay) for upload handler
- Purged 6,483 stale non-English zim_articles rows from DB
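A hedged sketch of the language gate described above. The detector is passed in as a callable so the example stays dependency-free; in practice `langdetect.detect` could be supplied. Letting content through when the sample is empty or detection fails is an assumption, not confirmed behaviour of enricher.py.

```python
from typing import Callable, Iterable


def check_language(first_page_text: str,
                   detect: Callable[[str], str],
                   allowed: Iterable[str] = ("en",),
                   sample_chars: int = 1500) -> bool:
    """Return True if the document should proceed to enrichment."""
    sample = first_page_text[:sample_chars]
    if not sample.strip():
        return True  # assumption: nothing to judge, so do not block
    try:
        language = detect(sample)
    except Exception:
        return True  # assumption: a detector failure does not block
    return language in set(allowed)
```

Running this check before enrichment is what saves API quota: rejected documents never reach the paid call.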
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- ZIM processor: extract articles from ZIM files, feed into existing enrichment pipeline
- Dashboard: Kiwix tab with library table, ingest toggle, upload, remove
- kiwix-serve on port 8430, wiki.echo6.co behind Authentik
- Citation URLs point to wiki.echo6.co/{zimname}/{article_path}
- Dashboard shows WIKI type badge for ZIM-sourced content
- Appropedia EN (19,445 articles) fully ingested as proof of concept
Upload handler now writes files to the appropriate hopper subfolder
instead of copying directly to /mnt/library/:
- .pdf -> acquired/pdf/
- .txt -> acquired/text/
- .epub, .doc, .docx, .mobi -> acquired/pdf/ (dispatcher format
normalizer converts to PDF before processing)
The dispatcher picks up files and routes through the appropriate
processor (pdf_processor or text_processor) for full metadata
voting, domain classification, and canonical filing.
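The extension-to-subfolder routing above reduces to a lookup table. This is a minimal sketch; the function and table names are hypothetical, and returning None for unsupported extensions is an assumption about how rejection is signalled.

```python
import os
from typing import Optional

# Mapping taken from the list above; non-PDF book formats go to acquired/pdf/
# because the dispatcher's format normalizer converts them to PDF.
HOPPER_SUBDIR = {
    ".pdf": "acquired/pdf",
    ".txt": "acquired/text",
    ".epub": "acquired/pdf",
    ".doc": "acquired/pdf",
    ".docx": "acquired/pdf",
    ".mobi": "acquired/pdf",
}


def hopper_subdir(filename: str) -> Optional[str]:
    """Return the hopper subfolder for a filename, or None if unsupported."""
    extension = os.path.splitext(filename)[1].lower()
    return HOPPER_SUBDIR.get(extension)
```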
Changes to api_upload() / _process_upload():
- Relaxed extension check: PDF, TXT, EPUB, DOC, DOCX, MOBI
- Routes to correct hopper subfolder by extension
- Writes meta.json sidecar with original filename and category hint
- Removed: direct library copy, add_to_catalogue, queue_document
- Added: hopper-level dedup check (catches rapid re-uploads)
- Kept: catalogue dedup check for immediate user feedback
Changes to api_upload_status():
- Added fallback: checks acquired/ and processing/ dirs if hash
not yet in documents table (covers gap between upload and
dispatcher pickup)
Template updated: accept attribute and help text now reflect
multi-format support.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace legacy ingest_channel/ingest_all imports with acquire_batch
from lib.acquisition.peertube. The endpoint now writes flat file pairs
to the hopper and lets the dispatcher handle processing, matching the
Phase 6d architecture. Removes channel/since/process parameters that
were tied to the old direct-ingest path.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two bugs in the Recently Completed table:
1. Title showed "Untitled" for all transcripts because the dashboard
read documents.book_title (populated by PDF metadata voting) which
is NULL for transcripts. Fixed by COALESCE(book_title, filename)
in the SQL query -- falls back to catalogue.filename which holds
the real video title.
2. Type showed "WEB" for all transcripts because the type CASE
expression only had web and pdf branches, with web matching any
http% path -- and transcript paths are PeerTube watch URLs.
Fixed by adding a transcript branch keyed on catalogue.source =
stream.echo6.co, evaluated before the web branch.
Also adds badge-transcript CSS (purple) and JS rendering case.
Applied consistently to both the Recently Completed and Sources
table queries.
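Both fixes can be illustrated with a small query run through sqlite3. The table layout, the join key, and the path column are hypothetical; only the COALESCE fallback and the ordering of the CASE branches come from the description above.

```python
import sqlite3

# Hypothetical schema: documents(hash, book_title),
#                      catalogue(hash, filename, path, source)
RECENT_QUERY = """
SELECT
    COALESCE(d.book_title, c.filename) AS title,
    CASE
        -- The transcript branch must come before the web branch, because
        -- transcript paths are also http URLs.
        WHEN c.source = 'stream.echo6.co' THEN 'transcript'
        WHEN c.path LIKE 'http%' THEN 'web'
        ELSE 'pdf'
    END AS type
FROM documents AS d
JOIN catalogue AS c ON c.hash = d.hash
ORDER BY d.hash
"""


def recently_completed(conn: sqlite3.Connection) -> list:
    """Return (title, type) rows using the corrected expressions."""
    return conn.execute(RECENT_QUERY).fetchall()
```

CASE evaluates its branches in order and stops at the first match, so placing the more specific transcript test first is enough to stop it being swallowed by the web branch.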
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Current state of the pipeline code as of 2026-04-14 (Phase 1 scaffolding complete).
Config has new_pipeline.enabled=false and crawler.sites=[] per refactor plan.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>