PR-B of the 2-PR whoami migration. The route is now served by navi-admin
:8427 via nginx (`^~ /api/auth/whoami` cutover verified live — edge responses
carry navi-admin's X-Cache-Status: BYPASS), so recon's handler is
edge-unreachable and safe to remove.
- lib/api.py: delete the @app.route('/api/auth/whoami') api_auth_whoami handler
+ its dedicated section comment. It was the file tail (post-cleanup-#6), so
api.py now ends on the metrics-history handler.
Sequenced after PR-A (navi-backend, merged + deployed) and the nginx edge
cutover, so the route never 404s. recon serves zero navi-facing auth-state
endpoints now.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Both modules were flagged in cleanup #27 (PR #16) as fully orphaned once the
place_detail orchestrator cluster was deleted; Matt confirmed scope in chat.
- lib/overture.py (170L): only consumer was place_detail._enrich_with_overture
(deleted in #27).
- lib/osm_categories.py (143L): humanize_category's only callers were
place_detail._parse_nominatim / _parse_overpass (both deleted in #27).
Re-probed against master 79d7b2b: zero import/usage references anywhere outside
the modules themselves, zero template/JS refs, no test files. compileall lib/
passes.
Note: scripts/overture_import.py (the Overture-Maps→PostGIS ETL script) is
independent — imports nothing from lib/ — and is left untouched. After this PR
the `overture` PostGIS DB it populates has no remaining recon reader; that's a
data-ops follow-up, not code touched here.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After cleanup #4 deleted lib/geocode.py, the only remaining address_book
references in recon were lib/address_book_test.py (test of the dying SUT) and
a dead `from . import address_book` import at the top of lib/netsyms_api.py
(never referenced in the body). This PR removes the module, its test, and that
dead import.
- DELETE lib/address_book.py + lib/address_book_test.py
- netsyms_api.py: drop the dead `from . import address_book` import
config/address_book.yaml stays — vendored data, navi-contacts (:8423) consumes
its own copy via NAVI_ADDRESS_BOOK_YAML.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR #11 (cleanup #2) deleted the /api/place* HTTP handlers but left their
orchestrator functions in lib/place_detail.py as dead code. Pre-flight for the
original Task #27 (delete google_places.py) surfaced that _enrich_with_google
is NOT a no-caller leaf — it's called by the unreachable get_place_detail. A
full caller-graph trace showed ~90% of place_detail.py is dead orchestration.
Scope expanded (Matt confirmed in chat) to remove the whole dead cluster:
- lib/google_places.py (entire file)
- place_detail.py: get_place_detail, get_place_by_wikidata, _enrich_with_google,
_apply_google_data, _enrich_with_overture, _enrich_with_wiki_index,
_enrich_wiki_links, _parse_nominatim, _parse_nominatim_address, _parse_overpass,
_build_overpass_query, cache_get, cache_put, _get_db + their now-unused
imports/constants (json, time, requests, osm_categories, NOMINATIM_URL, etc.)
KEEP only lookup_wiki_index + _get_wiki_index_db (the wiki_enrich_api survivor
path) — preserved byte-exact. Module docstring refreshed.
Flagged separately (not touched): overture.py + osm_categories.py are now
orphaned (only consumers were the deleted cluster); stale docstrings; the
deployment_config.py:9 catalog comment.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
/api/offroute (POST) and /api/mvum (GET) are edge-shadowed since extraction #8
— navi-offroute :8428 serves both via nginx. Cleanup #4 removed the last
in-process consumer of lib/offroute/dem.py (netsyms_api._reverse_elevation +
the module-level _DEM = DEMReader()), so the entire 9-file lib/offroute/
package is now orphaned and goes with this PR.
- api.py: drop both handlers (api_offroute, api_mvum) + their section comments.
Both used in-function lazy imports of offroute, so no top-of-file import
survives.
- DELETE lib/offroute/ wholesale (__init__, router, mvum, cost, barriers, dem,
friction, trails, prototype). prototype.py was already dead at runtime.
Closes out the recon->navi shadow-route cleanup: recon now serves zero navi-*
shadow routes.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
/api/landclass is edge-shadowed since extraction #4 — navi-landclass :8424
serves the route via nginx. Cleanup #4 removed the last in-process consumer
(netsyms_api._reverse_landclass), so lib/landclass.py is now fully orphaned.
- api.py: drop the @app.route('/api/landclass') handler + the
`from .landclass import lookup_landclass, format_summary` import.
- DELETE lib/landclass.py (only consumer was the deleted handler).
- DELETE lib/landclass_test.py (SUT gone).
PADUS_DB_* vars in /opt/recon/.env are now dead in recon — flagged for an
out-of-band post-merge cleanup, not touched here (data, not code).
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
All three routes (/api/geocode, /api/reverse, /api/reverse/<lat>/<lon>) are
edge-shadowed since extraction #6 — navi-geo :8426 serves them via nginx.
- netsyms_api.py: drop geocode_bp + its three handlers, the bundle-private
helpers, and module state (TTLCache/lock/_TZ_DB_PATH/_DEM). netsyms_bp
(/api/netsyms/lookup + /health) survives.
- api.py: drop the geocode_bp import + register_blueprint line.
- DELETE lib/geocode.py, lib/nav_tools.py (both orphaned once the handlers go).
- DELETE reverse_bundle_test.py, geocode_test.py, nav_tools_test.py.
Decouples netsyms_api.py from landclass.py and offroute/dem.py — prerequisite
for cleanups #5 and #6.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* cleanup: remove /api/address_book handlers (extraction #3 shadow)
Removes address_book_bp (lib/address_book_api.py: /api/address_book/lookup +
/api/address_book/list) + its registration in lib/api.py. Edge-shadowed since
extraction #3 — navi-contacts (:8423) serves /api/address_book/* on
navi.echo6.co; no recon-side consumer (no template/JS reference).
lib/address_book.py is KEPT — geocode.py (nickname short-circuit + annotation)
and netsyms_api.py import it.
NOT removed this PR: contacts_bp. The recon dashboard at /deleted-contacts
(recon-product, stays) calls /api/contacts/<id>/{restore,restore-as,purge} via
XHR, and recon.echo6.co proxies straight to recon:8420 (verified the Caddy
block — no navi-contacts routing there). Removing contacts_bp would break those
dashboard actions. Flagged for a decision; lib/contacts.py also stays (dashboard
ContactsDB reads). See PR body.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* cleanup: deprecate /nav-i + /deleted-contacts; remove contacts_bp + lib/contacts.py
Probe found recon's /deleted-contacts dashboard reads /opt/recon/data/contacts.db
— frozen since extraction #3 moved write ownership to navi-contacts
(/var/lib/navi-backend/contacts.db). The page has been silently rendering ~25-day
stale data, and its restore/restore-as/purge XHRs hit recon's contacts_bp (the
recon.echo6.co Caddy block proxies straight to recon:8420 — no navi-contacts
routing there). Per Matt's decision, deprecate the pages entirely; they'll be
re-surfaced later as a proper admin page consuming navi-contacts via API.
Removed:
- contacts_bp (lib/contacts_api.py, all 10 /api/contacts* routes) + its
registration in lib/api.py — edge-shadowed by navi-contacts :8423 since #3,
and now free of recon-product consumers once the dashboard goes.
- /nav-i (navi_landing_page) + /deleted-contacts (deleted_contacts_page) route
handlers; templates/navi/landing.html + templates/navi/deleted_contacts.html.
- lib/contacts.py (ContactsDB) — the dashboard was its only non-contacts_bp
consumer; both gone.
- The two dead NAVI_SUBNAV entries (Overview→/nav-i, Deleted Contacts→
/deleted-contacts).
Kept / adapted:
- /nav-i/api-keys page (recon-product key management) stays. NAVI_SUBNAV reduced
to just its API Keys entry; the base.html top-nav "Nav-I" link repointed
/nav-i -> /nav-i/api-keys so the surviving section page stays reachable
(minimal href change, not a nav restructure — flagged in PR).
- lib/address_book.py — geocode.py + netsyms_api.py still consume it (untouched).
Out-of-band follow-up after merge: delete the stale /opt/recon/data/contacts.db
(frozen 2026-04-28; data, not code).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* cleanup: pull the entire /nav-i/* subtree (api-keys page is a weaker dup of /settings/keys)
Completes the contacts cleanup by removing the rest of /nav-i/. The
/nav-i/api-keys page was (a) a weaker duplicate of /settings/keys for Gemini
(it lacked remove + reload-from-.env), and (b) a write-only-to-dead-files
surface for TomTom + Google Places: it wrote /opt/recon/.env, but the live
navi-traffic (:8421) and navi-places (:8425) services read their own
/etc/navi-backend/<svc>.env and have ignored recon's copy since extractions
#1 + #5. End state: no /nav-i/* URLs in recon.
Removed:
- /nav-i/api-keys route + template (templates/navi/api_keys.html)
- all /api/nav-i/api-keys/* endpoints (list/update/test/restart-recon)
- lib/api_keys_admin.py (its only importers were those 4 endpoints; _KEY_DEFS/
_read_env/_write_env were private to it)
- the now-orphaned NAVI_SUBNAV
- the "Nav-I" top-nav entry in base.html (reverses the /nav-i->/nav-i/api-keys
repoint from the previous commit, now that the page itself is gone)
Kept (Gemini's real home, recon-product):
- /settings/keys + /api/keys/* + lib/key_manager.py (KeyManager) — they import
key_manager directly, never api_keys_admin, so untouched.
Note: TOMTOM_API_KEY now has zero recon .py references. GOOGLE_PLACES_API_KEY
still has one (lib/google_places.py), kept in the prior /api/place cleanup as
place_detail's dep; its only caller (_enrich_with_google) is unreachable since
the /api/place handlers were removed — left in place pending /api/wiki-enrich
retirement (out of scope here).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: zvx-echo6 <mj@k7zvx.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
/api/place/<osm_type>/<int:osm_id> and /api/place/wikidata/<id> are
edge-shadowed since extraction #5 — navi-places (:8425) serves both via
nginx. Removes the two recon-side handlers + the now-unused
`from .place_detail import get_place_detail, get_place_by_wikidata` import.
NO modules deleted. place_detail.py is KEPT — wiki_enrich_api.py (the
/api/wiki-enrich endpoint, which stays; navi-places HTTP-consumes it) imports
`lookup_wiki_index` from it. That transitively keeps its deps google_places.py,
overture.py, osm_categories.py (all imported only by place_detail). This
corrects Phase A #5 §3's "only lib/api.py imports place_detail" — the
wiki-enrich endpoint (added post-#5) is a second consumer.
Co-authored-by: zvx-echo6 <mj@k7zvx.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* cleanup: remove /api/config handler (extraction #2 shadow)
recon's /api/config Flask handler (lib/api.py) is edge-shadowed since
extraction #2 — navi-config (:8422) serves the route via nginx on
navi.echo6.co. The recon-side handler is dead at the edge; remove it.
lib/deployment_config.py is KEPT: get_deployment_config() still has many
in-process consumers (lib/api.py:1237 /api/landclass has_landclass gate,
google_places.py, place_detail.py x4, offroute/router.py). Only the
/api/config HTTP handler is removed; the import at api.py:27 stays.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* cleanup: refresh deployment_config docstring (drop /api/config reference)
The module docstring still said get_deployment_config() was "for use by the
/api/config endpoint" — that handler was removed in the parent commit. Rewrite
to reflect the actual 5 in-process consumers (landclass gate, google_places,
place_detail ×4, offroute/router.py profile.offroute.*).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: zvx-echo6 <mj@k7zvx.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per-tag HTTP wrapper over wiki_rewrite.rewrite_wiki_link so the (future)
navi-places service can rewrite OSM wiki tags to local Kiwix URLs over HTTP
instead of importing recon's wiki_rewrite module (which talks to Kiwix on
localhost:8430 and the wiki_cache table in /opt/recon/data/place_cache.db).
Companion to PR #8 (/api/wiki-enrich) — Matt picked option B (HTTP-couple the
Kiwix offline-wiki rewriting too, since it matters in prod).
GET /api/wiki-rewrite?tag=<wikipedia|wikidata|wikivoyage|appropedia>&value=<raw>
-> 200 {url, status} where status is "local" | "public" | "original"
-> 400 on missing value or unknown tag
-> no 404 (unclassifiable value echoes back with status "original",
mirroring rewrite_wiki_link)
Public (no auth), like /api/place/* and /api/wiki-enrich.
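
A minimal sketch of the request contract above, written as a pure function so it runs without Flask or Kiwix. The `handle_wiki_rewrite` and `fake_rewrite` names, the error-body shape, and the assumption that the rewrite callable returns a `(url, status)` pair are all hypothetical; the real route wraps `rewrite_wiki_link(tag, value)` and may differ.

```python
from typing import Callable, Mapping, Tuple

# Tags accepted by the route, per the contract above.
KNOWN_TAGS = {"wikipedia", "wikidata", "wikivoyage", "appropedia"}


def handle_wiki_rewrite(
    args: Mapping[str, str],
    rewrite: Callable[[str, str], Tuple[str, str]],
) -> Tuple[int, dict]:
    """Return (http_status, json_body) for one /api/wiki-rewrite request."""
    tag = (args.get("tag") or "").strip().lower()
    value = (args.get("value") or "").strip()
    if tag not in KNOWN_TAGS:
        return 400, {"error": "unknown tag"}      # assumed error shape
    if not value:
        return 400, {"error": "missing value"}    # assumed error shape
    url, status = rewrite(tag, value)
    # There is no 404 branch: an unclassifiable value comes back as "original".
    return 200, {"url": url, "status": status}


def fake_rewrite(tag: str, value: str) -> Tuple[str, str]:
    """Hypothetical stand-in for rewrite_wiki_link; not the real classifier."""
    if ":" not in value:
        return value, "original"
    return f"https://example.invalid/{tag}/{value}", "public"
```

Passing the rewrite function in as an argument mirrors how the tests stub `check_kiwix_has_article`: the HTTP layer stays thin and can be exercised without a Kiwix instance.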
Changes (additive only):
- lib/wiki_rewrite_api.py: new wiki_rewrite_bp blueprint. Thin route directly
over the existing rewrite_wiki_link(tag, value) — no extraction needed
(it's already a clean standalone function, unlike wiki-enrich's lookup).
- lib/api.py: register the blueprint (one block).
- lib/wiki_rewrite_api_test.py: 5 tests (local Kiwix hit, public fallback,
unclassifiable -> original, missing value -> 400, unknown tag -> 400),
stubbing check_kiwix_has_article (no Kiwix/DB), plain-assert + __main__
runner. Verified green against recon's venv (flask 3.1.2).
Does NOT touch place_detail's in-process _enrich_wiki_links — that gets removed
in a later PR once navi-places is live (same as PR #8). wiki_cache stays in
recon's own place_cache.db post-cutover (harmless positive-cache duplication).
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
HTTP wrapper over the wiki_index lookup so the (future) navi-places service can
fetch wiki enrichment over HTTP instead of reading recon's 2.1 GB
data/wiki_index.db directly (Phase A option B — HTTP coupling).
GET /api/wiki-enrich?wikidata=<Qid> (primary key)
GET /api/wiki-enrich?name=<name>&country=<cc> (fallback key)
-> 200 {wiki_summary?, wiki_population?, wiki_url?, wikivoyage_url?}
-> 400 if no usable key; 404 on no match. Public (no auth, like /api/place/*).
Route keys are wikidata_id / name+country — NOT osm_type/osm_id — because that
is how wiki_index is actually queried (the in-process _enrich_with_wiki_index
looks up by result['wikidata_id'] then name+country_code, never by OSM id; see
extraction-5-wiki-enrich-investigation.md). An osm-keyed route would have forced
a redundant in-recon place lookup.
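
The two-step keying described above (wikidata id first, then name + country) can be sketched against an in-memory SQLite fixture. The table layout, column names, fixture row, and the `lookup_wiki_index_sketch` / `make_fixture` helpers are assumptions for illustration only; the real wiki_index.db schema is not shown in this commit.

```python
import sqlite3
from typing import Optional


def make_fixture() -> sqlite3.Connection:
    """Build a tiny in-memory table with an assumed (hypothetical) schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE wiki_index (wikidata_id TEXT, name TEXT, "
        "country_code TEXT, summary TEXT, url TEXT)"
    )
    conn.execute(
        "INSERT INTO wiki_index VALUES (?, ?, ?, ?, ?)",
        ("Q0000", "Example Town", "xx", "A made-up place.",
         "https://example.invalid/Example_Town"),
    )
    return conn


def lookup_wiki_index_sketch(conn, wikidata_id=None, name=None,
                             country_code=None) -> Optional[dict]:
    """Primary lookup by wikidata_id, fallback by name + country. Never raises."""
    try:
        row = None
        if wikidata_id:
            row = conn.execute(
                "SELECT summary, url FROM wiki_index WHERE wikidata_id = ?",
                (wikidata_id,),
            ).fetchone()
        if row is None and name and country_code:
            row = conn.execute(
                "SELECT summary, url FROM wiki_index "
                "WHERE name = ? AND country_code = ?",
                (name, country_code),
            ).fetchone()
    except sqlite3.Error:
        return None  # pure read path: database errors degrade to "no match"
    if row is None:
        return None
    return {"wiki_summary": row[0], "wiki_url": row[1]}
```

A `None` result is what the HTTP layer would turn into a 404, while a call with neither key would be rejected earlier with a 400.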
Changes (additive only):
- lib/place_detail.py: new standalone lookup_wiki_index(wikidata_id, name,
country_code) doing the same two SELECTs + field/URL mapping as the
in-process path, returning a dict or None. Pure DB read, never raises.
`_enrich_with_wiki_index` is LEFT UNTOUCHED — it can be DRY-refactored to
delegate to this in a later PR; the in-process enrichment path is unchanged.
- lib/wiki_enrich_api.py: new wiki_enrich_bp blueprint with the route.
- lib/api.py: register the blueprint (one block).
- lib/wiki_enrich_api_test.py: 4 tests (hit-by-wikidata + decoded fields,
no-match -> 404, name+country fallback, no-key -> 400) over an in-memory
fixture DB; plain-assert style + __main__ runner (recon venv has no pytest).
Verified green against recon's venv (flask 3.1.2).
Does NOT remove the in-process _enrich_with_wiki_index call from place_detail —
that happens in a later PR once navi-places is live and serving.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Additive prep for the Navi Panel.jsx login/logout cutover. Adds an `auth`
block (login_url, logout_url) to each deployment profile, placed after the
existing `services` block:
- home.yaml login=/outpost.goauthentik.io/start?rd=%2F
logout=auth.echo6.co invalidation flow, next=navi.echo6.co
- minimal_pi.yaml same, with TODO(matt) to confirm logout next= host
- regional_pi.yaml same, with TODO(matt) to confirm logout next= host
No Python change. /api/config returns the whole profile dict, so these keys
flow through automatically; existing consumers ignore unknown keys, making
this backward-safe (the frontend fallback path is simply never needed once
this is live).
Next steps (separate PRs): the navi-config service (:8422) mirroring this
handler, and the Panel.jsx fix to read cfg.auth.login_url/logout_url with the
current literals as fallback.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The /api/traffic/flow/<z>/<x>/<y>.png handler is dead code in recon. As of
extraction #1 of the recon<->Navi decoupling, this path is served by the
standalone navi-traffic service. Live request flow is now:
Caddy (CT 101, navi.echo6.co @authed_api, forward_auth)
-> nginx :8440 (location ^~ /api/traffic/ -> proxy_cache traffic_cache)
-> navi-traffic gunicorn :8421 (services/navi_traffic)
Cutover verified live: authenticated browser fetch to
https://navi.echo6.co/api/traffic/flow/... returns 200 image/png with
X-Cache-Status MISS then HIT (120s cache), Server: gunicorn.
navi-backend (github.com/zvx-echo6/navi-backend):
- dae54f3 Initial scaffold: navi-backend + navi-traffic
- 311cb8f nginx: use ^~ prefix on /api/traffic/ to beat .png regex catch-all
Caddy cutover (@authed_api upstream 8420 -> nginx 8440) applied on Utility
CT 101. Also drops the now-unused make_response flask import (no other uses
in lib/api.py). os and http_requests remain (used elsewhere).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Enriches place API responses with wiki_summary, wiki_url, wiki_population,
and wikivoyage_url from wiki_index.db. Looks up by wikidata_id first,
then falls back to name + country_code.
Called from Nominatim, Overpass, and Wikidata endpoints.
Gated by has_kiwix_wiki feature flag.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
47 PAD-US units (Aleutian/Bering-Sea BOEM marine features, all is_valid=False)
are stored as antimeridian-wrapping polygons whose bbox spans ~360 deg of
longitude. Their invalid planar geometry forms latitude bands that ST_Intersects
false-matches for non-US points (e.g. London/Germany at ~51N matched
"Rat Islands" ogc_fid 3974).
Fix: add `AND (ST_XMax(geom) - ST_XMin(geom)) < 60` to the lookup_landclass
SELECT. No DB writes; two cheap ST_XMax/XMin evals on the already
spatial-index-filtered result set. Verified live: total 651088 rows,
filtered 651041 (exactly 47 excluded); Yosemite/Grand Canyon retained,
London/Germany now empty.
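
The same predicate can be illustrated outside the database. This is only a sketch of the width test, not the SQL that shipped: `drop_wrapped_units` and the sample bounding boxes are hypothetical, and in production the check runs inside the lookup_landclass SELECT.

```python
from typing import Iterable, List, Tuple

MAX_BBOX_WIDTH_DEG = 60.0  # same threshold as the SQL predicate above

# (label, xmin, xmax) in degrees of longitude; values below are illustrative.
Unit = Tuple[str, float, float]


def drop_wrapped_units(units: Iterable[Unit]) -> List[Unit]:
    """Keep only units whose bbox is narrower than the threshold."""
    return [u for u in units if (u[2] - u[1]) < MAX_BBOX_WIDTH_DEG]


sample = [
    ("wrapped marine unit", -180.0, 180.0),  # bbox spans ~360 deg, excluded
    ("compact park unit", -119.9, -119.2),   # ordinary unit, retained
]
kept = drop_wrapped_units(sample)
```

A width cutoff is a blunt filter, but it needs no geometry repair and no writes, which matches the "no DB writes" constraint stated above.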
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per OFFROUTE-ARCHITECTURE.md §9 ("planet-dem.pmtiles as single elevation
source"). The bundle endpoint previously called Valhalla /height, which only
has 48 Idaho HGT tiles; it now reads the planet-scale Terrarium PMTiles that
already back the frontend hillshade and contours.
- dem.py: add DEMReader.sample_point(lat, lon) — one z12 tile (LRU-cached),
Web-Mercator pixel index, None outside the +/-85.05 pole cap or when untiled.
- netsyms_api.py: module-level DEMReader singleton (lazy mmap, None if init
fails); _reverse_elevation now calls _DEM.sample_point; drop the Valhalla
HTTP call and _VALHALLA_HEIGHT_URL.
- tests: DEM-mock and DEM-unavailable cases; EXPECTED_KEYS derives from
_BUNDLE_KEYS. All 9 tests pass.
Verified live: Boise 824m, London 8m, Tokyo 35m, Yosemite 2804m, pole -> None.
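
For reference, the point-sampling math can be sketched as follows: standard Web-Mercator tile indexing plus the Terrarium RGB decoding. The function names and the 256-pixel tile size are assumptions (the commit does not state the tile size), and reading the tile bytes out of the PMTiles archive is omitted entirely.

```python
import math
from typing import Optional, Tuple

MAX_LAT = 85.05   # pole cap from the commit message
TILE_SIZE = 256   # assumption: the actual tile size is not stated
ZOOM = 12


def tile_and_pixel(lat: float, lon: float,
                   zoom: int = ZOOM) -> Optional[Tuple[int, int, int, int]]:
    """Return (tile_x, tile_y, pixel_x, pixel_y), or None outside the pole cap."""
    if abs(lat) > MAX_LAT or not -180.0 <= lon <= 180.0:
        return None
    n = 2 ** zoom
    # Fractional tile coordinates in the Web-Mercator (slippy map) scheme.
    fx = (lon + 180.0) / 360.0 * n
    fy = (1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n
    tx = min(int(fx), n - 1)
    ty = min(int(fy), n - 1)
    px = min(int((fx - tx) * TILE_SIZE), TILE_SIZE - 1)
    py = min(int((fy - ty) * TILE_SIZE), TILE_SIZE - 1)
    return tx, ty, px, py


def terrarium_to_meters(r: int, g: int, b: int) -> float:
    """Decode one Terrarium-encoded RGB pixel to elevation in meters."""
    return (r * 256.0 + g + b / 256.0) - 32768.0
```

Returning `None` rather than raising keeps the caller's "null on failure" behaviour simple: the bundle field just stays empty.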
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New geocode_bp sibling to the existing /api/reverse?lat=&lon= route (which
is unchanged). Returns a flat 9-field bundle for the Central enrichment
framework: name, city, county, state, country, postal_code (Photon),
timezone (timezones.sqlite via R-tree + shapely), landclass (in-process
lookup_landclass), elevation_m (Valhalla /height).
- Each component lookup is independent and wrapped in try/except: a failure
logs a warning and yields null, never a 5xx. 400 only on unparseable /
out-of-range coordinates.
- lat/lon parsed manually rather than via Flask <float:>, which rejects
negative and integer coordinates and would 404 instead of 400.
- 10k-entry / 24h TTLCache keyed on coords rounded to 4 decimals.
- Tests mock Photon/Valhalla/landclass; one test exercises the real
timezones.sqlite. cachetools pinned in requirements.txt.
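
The three behaviours listed above (manual coordinate parsing, per-component isolation, rounded cache keys) can be sketched without Flask or cachetools. All names here are hypothetical, and the lookups are passed in as plain callables rather than the real Photon / timezone / landclass / elevation helpers.

```python
import logging
from typing import Callable, Dict, Tuple

log = logging.getLogger("reverse_bundle_sketch")


def parse_coords(lat_raw: str, lon_raw: str) -> Tuple[float, float]:
    """Parse path segments by hand; ValueError is what the route maps to 400."""
    lat, lon = float(lat_raw), float(lon_raw)  # accepts "43", "-116", "43.6"
    # NaN fails both comparisons, so it is rejected here as well.
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("coordinates out of range")
    return lat, lon


def cache_key(lat: float, lon: float) -> Tuple[float, float]:
    """Round to 4 decimals so nearby requests share one cache entry."""
    return (round(lat, 4), round(lon, 4))


def build_bundle(lat: float, lon: float,
                 lookups: Dict[str, Callable[[float, float], object]]) -> Dict[str, object]:
    """Run every lookup independently; a failure becomes None, never an error."""
    bundle: Dict[str, object] = {}
    for field, fn in lookups.items():
        try:
            bundle[field] = fn(lat, lon)
        except Exception as exc:  # deliberate catch-all: one bad source must not 5xx
            log.warning("%s lookup failed: %s", field, exc)
            bundle[field] = None
    return bundle
```

Four decimal places is roughly 11 m of latitude, so the rounding trades a little positional precision for a much higher cache hit rate.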
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Migrate EntryPointIndex from SQLite to PostGIS (padus database)
- Densify highway LineStrings at 100m intervals via Shapely interpolate
- 2.94M entry points from 476k lines (4x more coverage)
- Tag each entry point with land_status via ST_Intersects against padus_sub
- 1.64M public (56%), 1.30M unknown (44%)
- Add geography GIST index for fast radius queries (~25ms)
- Increase OFF_NETWORK_THRESHOLD_M from 10m to 50m for GPS accuracy
- PBF path and PostGIS DSN configurable via home.yaml
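
Densification at a fixed interval can be sketched in plain Python. This is a stand-in for Shapely's `interpolate`, not the code in this commit: it assumes planar coordinates already in meters, and the `densify` name is hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def densify(points: Sequence[Point], step: float) -> List[Point]:
    """Return points every `step` units along the polyline, plus both endpoints."""
    out: List[Point] = [points[0]]
    carried = 0.0  # distance travelled since the last emitted point
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        d = step - carried  # distance into this segment of the next point
        while d <= seg:
            t = d / seg
            out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            d += step
        carried = seg - (d - step)
    if out[-1] != points[-1]:
        out.append(points[-1])  # always keep the line's final vertex
    return out
```

With a 100 m step, a 250 m line yields points at 0, 100, 200 and 250 m, which is how a few hundred thousand lines can expand into millions of candidate entry points.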
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The wilderness segment now ALWAYS uses foot mode for MCP pathfinding.
The user's selected mode only affects:
1. Entry point selection (MODE_TO_VALID_HIGHWAYS filtering)
2. Valhalla costing for the network segment
This ensures vehicles can navigate through wilderness (on foot) to
reach roads, rather than failing when no vehicle-accessible path exists.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
MVUM Data Import:
- Downloaded USFS MVUM Roads (150,636 features) and Trails (28,741 features)
- Imported to navi.db as mvum_roads and mvum_trails tables
- Idaho coverage: ~8,994 roads and ~4,504 trails across 7 national forests
- Preserved all vehicle-class fields (ATV, MOTORCYCLE, HIGHCLEARANCEVEHICLE, etc.)
- Preserved seasonal date ranges (*_DATESOPEN fields)
New mvum.py module:
- MVUMReader class for querying MVUM data by bbox and nearest point
- parse_date_range() for seasonal date string parsing (MM/DD-MM/DD format)
- check_access() for determining open/closed status with date checking
- symbol_to_access() fallback when per-vehicle fields are null
- get_mvum_access_grid() for rasterizing MVUM to pathfinder grid
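
A minimal sketch of the seasonal parsing and open/closed check, assuming the MM/DD-MM/DD format named above. The body of `parse_date_range` here is a guess at the behaviour, and `is_open_on` is a hypothetical helper (the module's own entry point is `check_access()`); the year-wrapping rule is also an assumption.

```python
from datetime import date
from typing import Optional, Tuple

MonthDay = Tuple[int, int]


def parse_date_range(text: Optional[str]) -> Optional[Tuple[MonthDay, MonthDay]]:
    """Parse 'MM/DD-MM/DD' into ((month, day), (month, day)); None if unparseable."""
    if not text:
        return None
    try:
        start_raw, end_raw = text.strip().split("-")
        sm, sd = (int(p) for p in start_raw.split("/"))
        em, ed = (int(p) for p in end_raw.split("/"))
        # Validate against a leap year so 02/29 is accepted.
        date(2000, sm, sd)
        date(2000, em, ed)
    except ValueError:
        return None
    return (sm, sd), (em, ed)


def is_open_on(day: date, window: Tuple[MonthDay, MonthDay]) -> bool:
    """True if `day` falls inside the seasonal window, inclusive at both ends."""
    start, end = window
    today = (day.month, day.day)
    if start <= end:
        return start <= today <= end
    # Window wraps the year end, e.g. 12/01-03/31 (assumed semantics).
    return today >= start or today <= end
```

Comparing `(month, day)` tuples avoids inventing a year for the seasonal range, which is why the check works unchanged every season.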
Cost function integration:
- Added mvum parameter to compute_cost_grid()
- MVUM closures respond to boundary_mode:
* strict = impassable (np.inf)
* pragmatic = 5x friction penalty
* emergency = ignored entirely
- Foot mode skips MVUM (motor-vehicle specific)
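
The boundary_mode rules above reduce to a small per-cell decision. This sketch uses plain floats and `math.inf` instead of the NumPy grid that compute_cost_grid() operates on; `apply_mvum_closure` and its argument names are hypothetical.

```python
import math

PRAGMATIC_PENALTY = 5.0  # the "5x friction penalty" from the list above


def apply_mvum_closure(cell_cost: float, closed: bool,
                       boundary_mode: str, travel_mode: str) -> float:
    """Return the adjusted cost of one grid cell given its MVUM status."""
    if travel_mode == "foot" or not closed:
        return cell_cost  # MVUM is motor-vehicle specific; open cells are untouched
    if boundary_mode == "strict":
        return math.inf   # impassable
    if boundary_mode == "pragmatic":
        return cell_cost * PRAGMATIC_PENALTY
    if boundary_mode == "emergency":
        return cell_cost  # closure ignored entirely
    raise ValueError(f"unknown boundary_mode: {boundary_mode!r}")
```

Using infinity for strict mode lets the pathfinder treat closures like any other barrier without a separate mask.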
Router integration:
- Loads MVUM access grid for non-foot modes (mtb, atv, vehicle)
- Tracks mvum_closed_crossings in path summary
Places Panel API:
- GET /api/mvum?lat=XX&lon=XX&radius=50
- Returns MVUM feature with access status for all vehicle classes
- Includes seasonal date ranges, maintenance level, forest/district info
- GeoJSON geometry for map display
Validation:
- MVUM places endpoint tested with Sawtooth NF road
- All four modes validated with strict/pragmatic/emergency boundary modes
- Foot mode correctly ignores MVUM restrictions
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- New friction.py: reads WorldCover friction VRT, resamples to match
elevation grid, provides point sampling for validation
- Modified cost.py: accepts optional friction array, multiplies Tobler
time cost by friction multiplier, inf for water/nodata (255/0)
- Modified prototype.py: loads friction layer, passes to cost function,
validates path avoids water cells (friction=255)
Validated on Idaho test bbox:
- Path avoids Murtaugh Lake (no water cells on path)
- Friction along path: min=10, max=20, mean=10.2
- Effort increased 3.4% vs Phase O1 due to friction multipliers
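
How friction scales the Tobler cost can be sketched per cell. Tobler's hiking function is standard; everything else is an assumption: the helper names, and in particular the idea that a friction value of 10 means a 1.0x multiplier (suggested by the min=10 reading above but not confirmed by this commit).

```python
import math

WATER = 255    # friction codes treated as impassable, per the list above
NODATA = 0
BASELINE = 10  # assumption: friction 10 corresponds to a 1.0x multiplier


def tobler_speed_kmh(slope: float) -> float:
    """Tobler's hiking function; slope is rise over run (dh/dx)."""
    return 6.0 * math.exp(-3.5 * abs(slope + 0.05))


def cell_time_seconds(distance_m: float, slope: float, friction: int) -> float:
    """Travel time across one cell: Tobler time scaled by the friction multiplier."""
    if friction in (WATER, NODATA):
        return math.inf
    hours = (distance_m / 1000.0) / tobler_speed_kmh(slope)
    return hours * 3600.0 * (friction / BASELINE)
```

Under these assumptions, doubling the friction value doubles the time cost of a cell, while water and nodata cells can never appear on a finite-cost path.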
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Adds _text_quality_ok() gate that replaces the bare 50-char length
check at each stage of the extraction fallback chain. Checks:
- Word-boundary ratio (≥60% of tokens must be real words)
- Concatenation ratio (lc→UC transitions must be <10% of word count)
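
A rough sketch of the two checks, using the thresholds stated above. The definition of a "real word" is not given in this commit, so `looks_like_word` is an assumed heuristic, and `text_quality_ok` is a simplified stand-in for `_text_quality_ok()` rather than its actual body.

```python
import re

MIN_REAL_WORD_RATIO = 0.60   # thresholds from the list above
MAX_CONCAT_RATIO = 0.10
_CASE_JOIN = re.compile(r"[a-z][A-Z]")  # lowercase-to-uppercase transition


def looks_like_word(token: str) -> bool:
    """Assumed heuristic: letters only after trimming punctuation, at most 20 chars."""
    core = token.strip(".,;:!?()[]\"'")
    return core.isalpha() and len(core) <= 20


def text_quality_ok(text: str) -> bool:
    """Pass only if enough tokens look like words and few are glued together."""
    tokens = text.split()
    if not tokens:
        return False
    real_ratio = sum(looks_like_word(t) for t in tokens) / len(tokens)
    concat_ratio = len(_CASE_JOIN.findall(text)) / len(tokens)
    return real_ratio >= MIN_REAL_WORD_RATIO and concat_ratio < MAX_CONCAT_RATIO
```

The case-transition count is what catches kerning failures: glued words stay alphabetic, so the word-boundary ratio alone would let them through.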
When PyPDF2's default extraction fails the quality check, it retries with
space_width=100 for tighter word-boundary detection. This fixes
Haynes/workshop manuals where tight kerning produces concatenated
words like 'byMike' and 'oftheGuild'.
Also adds -layout flag to pdftotext subprocess calls for better
spatial awareness in the poppler fallback stage.
Note: PyPDF2 3.0.1 does not support the layout=True parameter;
the space_width parameter serves the same purpose.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pre-processes HTML tree before lxml .text_content() to prevent
element concatenation:
- <table> cells joined with ' | ' delimiter, rows with newlines
- <br> tags produce newlines
- <li> items get '- ' prefix and newline separation
- <dt>/<dd> definition list items get newline separation
Fixes ~868 mangled Qdrant points where table content was jammed
together (e.g. 'Freq51Primary1A==' instead of 'Freq51 | Primary | 1A==').
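
The delimiter rules above can be illustrated with the standard library's `html.parser`. This is a stand-in, not the lxml tree pre-processing in this commit; the `Flattener` class and `flatten_html` function are hypothetical names.

```python
from html.parser import HTMLParser


class Flattener(HTMLParser):
    """Collect text, inserting delimiters at table, list, and line-break boundaries."""

    def __init__(self):
        super().__init__()
        self.out = []
        self.first_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.first_cell = True
        elif tag in ("td", "th"):
            if not self.first_cell:
                self.out.append(" | ")   # cell delimiter, not before the first cell
            self.first_cell = False
        elif tag == "br":
            self.out.append("\n")
        elif tag == "li":
            self.out.append("\n- ")
        elif tag in ("dt", "dd"):
            self.out.append("\n")

    def handle_endtag(self, tag):
        if tag in ("tr", "li", "dt", "dd"):
            self.out.append("\n")        # rows and items end their line

    def handle_data(self, data):
        self.out.append(data)


def flatten_html(html: str) -> str:
    """Return text with delimiters applied and blank lines removed."""
    parser = Flattener()
    parser.feed(html)
    parser.close()
    lines = [ln.strip() for ln in "".join(parser.out).splitlines()]
    return "\n".join(ln for ln in lines if ln)
```

For the mangled example above, a single-row table with cells Freq51, Primary and 1A== flattens to `Freq51 | Primary | 1A==`.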
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Returns {authenticated: bool, username: string|null} based on
X-Authentik-Username header presence. Used by Navi frontend to
detect auth state without triggering SSO redirect.
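
A minimal sketch of that response logic as a pure function over request headers. The `whoami` function name is hypothetical, and treating a blank header as "not authenticated" is an assumption; the commit only says the result depends on header presence.

```python
from typing import Mapping, Optional

AUTH_HEADER = "X-Authentik-Username"


def whoami(headers: Mapping[str, str]) -> dict:
    """Build the {authenticated, username} payload from request headers."""
    username: Optional[str] = None
    for key, value in headers.items():
        # HTTP header names are case-insensitive.
        if key.lower() == AUTH_HEADER.lower() and value.strip():
            username = value.strip()
            break
    return {"authenticated": username is not None, "username": username}
```

Because the endpoint always answers 200 with a boolean, the frontend can poll it freely without being bounced into the SSO flow.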
The /api/geocode endpoint blended Photon and Netsyms results, but only
Photon respected viewport bias from prior work. Address queries to
Netsyms/AddressDB returned globally-sorted matches regardless of where
the user was looking — searching '214 North St' from Idaho returned
Illinois results.
Now fetches up to 200 Netsyms results when a viewport lat/lon is provided,
sorts them by squared distance from the viewport center, then returns the top N.
Falls back to default ordering when the viewport is absent. The Photon path is
unchanged.
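
The re-ranking step can be sketched as below. `rank_by_viewport`, the result dict shape, and the sample coordinates are hypothetical; squared distance in raw degrees is used only for ordering, as in the description above.

```python
from typing import List, Optional, Sequence, Tuple


def rank_by_viewport(results: Sequence[dict],
                     viewport: Optional[Tuple[float, float]],
                     limit: int) -> List[dict]:
    """Return the `limit` results nearest the viewport center, if one is given."""
    if viewport is None:
        return list(results[:limit])  # default upstream ordering
    vlat, vlon = viewport

    def squared_distance(r: dict) -> float:
        # No square root needed: ordering by squared distance is equivalent.
        return (r["lat"] - vlat) ** 2 + (r["lon"] - vlon) ** 2

    return sorted(results, key=squared_distance)[:limit]


candidates = [
    {"label": "illinois match", "lat": 40.0, "lon": -89.0},   # illustrative values
    {"label": "idaho match", "lat": 43.6, "lon": -116.2},
]
nearest = rank_by_viewport(candidates, (43.5, -116.0), limit=1)
```

Over-fetching before sorting matters here: with only N results requested upstream, the nearby match might never be in the candidate set at all.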
Request polygon_geojson=1 from Nominatim to include admin boundary
polygons in place detail responses. Also fetch boundary via OSM
relation ID for wikidata lookups.