Commit graph

86 commits

Author SHA1 Message Date
299be21f42 Replace mega-channel size rule with explicit skip list
The >500-video threshold was too aggressive — it skipped tiebreaking
for legitimate large channels (1a-auto, forgotten-weapons, etc.) where
channel context correctly resolves ties. Replace with an explicit
MEGA_CHANNEL_SKIP_LIST in recon_domains.py. Only known non-topical
catch-alls (currently just "Transcript") skip the tiebreaker.

Removed _channel_video_count() helper and MEGA_CHANNEL_THRESHOLD
constant (no longer used).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-28 04:24:39 +00:00
d8196e60c7 Track domain-assignment follow-up items
Three nice-to-haves from the Qdrant migration pre-commit review:
list-type domain handling, tiebreaker client threading, debug log
caller identification. All zero production impact.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-28 03:59:19 +00:00
043119fbaa Fix: review UI domain breakdown from Qdrant
The review items API endpoint still read concept domain counts from
on-disk JSON files, which would show empty breakdowns for items with
missing concept directories. Replace disk reads with the same
_count_domains_from_qdrant function used by domain assignment.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-28 03:59:13 +00:00
3b37d96c4d Switch domain assignment to Qdrant as source of truth
Replace on-disk concept file reads with Qdrant payload queries for
domain assignment. This unlocks assignment for ~10,120 items that had
missing or legacy-only concept files on disk while Qdrant held the
correct 18-domain taxonomy data.

Changes:
- domain_assigner.py: Replace _count_concept_domains (disk) with
  _count_domains_from_qdrant and _count_domains_from_qdrant_batch
  (Qdrant scroll queries). Add _get_qdrant_client helper. Remove
  pass 3 defensive re-run (Qdrant reads are consistent). Add
  no_concepts terminal status for zero-vector documents.
- embedder.py: Post-embed hook passes existing qdrant client to
  compute_assignment, avoiding a second connection.
- recon.py: Backfill creates one QdrantClient for the batch. SQL
  filter includes existing needs_reprocess items. Dry-run reports
  no_concepts as separate bucket. --reprocess-missing removes
  concept-dir deletion step (no longer reads from disk).
- docs/domain-assignment.md: Algorithm references Qdrant, documents
  no_concepts status, removes pass 3 description.

Dry-run results: 20,453 assigned, 1,392 tied, 298 no_concepts,
0 needs_reprocess, 0 errors (previously 10,416 needs_reprocess).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-28 03:59:06 +00:00
c04ccc5011 Fix plugin for PeerTube v8.0.2 validation
PeerTube v8.0.2 requires a "bugs" field in package.json and an
unregister() export in main.js. Add both to pass plugin validation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-28 01:55:47 +00:00
75b2e19e94 Fix: correct Contabo concept backup path in docs (/opt/backups/recon/concepts/)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-28 01:42:05 +00:00
a39ec56620 Docs: domain assignment guide, migration runbook, blast radius
- domain-assignment.md: algorithm walkthrough (pass 1/2/3), status values,
  CLI command reference, dashboard review guide
- migration-runbook.md: step-by-step deploy with pre-deploy backups,
  8 STOP pause points for operator verification, staged push rollout,
  quarantined --reprocess-missing procedure, 5 rollback procedures
- deploy-blast-radius.md: per-step risk reference with worst case,
  detection signals, rollback procedures, and risk tiers

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-28 00:06:49 +00:00
d1270be64d Phase 7: manual review dashboard for tied items
Adds /peertube/review page showing only tied_manual items for human
domain assignment. Each row displays video title, channel, concept
domain counts, and a dropdown to select the correct domain.

Routes: GET /peertube/review (page), GET /api/peertube/review/items
(JSON), GET /api/peertube/review/stats (counts),
POST /api/peertube/review/assign (assign + push to PeerTube).

Review subnav entry added to PEERTUBE_SUBNAV.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-28 00:06:25 +00:00
6a17df8078 Phase 6: post-embed domain assignment hook
After a stream.echo6.co video completes embedding, automatically runs
compute_assignment (pass 1 only). Clear winners get pushed to PeerTube
immediately; ties are marked tied_pass_1 for the batch tiebreaker.

Also tags stream docs that hit early-return paths (no concepts, no valid
concepts) with needs_reprocess status so they are visible to the
--reprocess-missing CLI command.

Error handling: domain assignment failure logs a warning but does not
block the embedding pipeline.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-28 00:06:07 +00:00
a273b52c7e Phase 5: assign-categories CLI commands
Adds assign-categories subcommand with flags:
  --backfill     Pass 1 domain assignment for all complete stream docs
  --tiebreaker-pass  Resolve ties via channel concept analysis
  --push-pending Push assigned categories to PeerTube API (staged via --limit)
  --reprocess-missing  Re-queue items with missing/legacy concepts
  --dry-run      Preview without writes (enhanced for reprocess: shows
                 concept dir existence and file counts)
  --limit N      Cap processing count

Includes pre-deletion audit logging for --reprocess-missing (logs path,
file count, and hash before each shutil.rmtree).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-28 00:05:19 +00:00
07d26a8ef0 Phase 4: authenticated PeerTube category writer
OAuth2 password-grant client for PeerTube API with token caching and
auto-refresh on 401. Pushes domain categories via PUT /api/v1/videos/{uuid}.

Includes limit parameter on push_pending for staged rollouts, and
systemic failure detection that aborts after 5 consecutive failures
(catches missing plugin or broken auth before wasting API calls).

Config section added to config.yaml for PeerTube API connection
parameters. Real credentials remain in .env (gitignored).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-28 00:04:51 +00:00
8ab1f8c82f Phase 3: compute_assignment and run_tiebreaker_pass
Domain assigner module with two functions:
- compute_assignment(): pass 1 concept domain count, inline per-document
- run_tiebreaker_pass(): batch channel-level tiebreaker (pass 2 + pass 3
  defensive re-run), mega-channel rule (>500 videos skip to tied_manual)

Filters legacy domains (Sustainment Systems, Off-Grid Systems, Defense &
Tactics) from concept counts automatically.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-28 00:04:37 +00:00
5f36c52fb1 Phase 2: documents table schema for domain assignment
Adds four columns to documents table via idempotent ALTER TABLE
migrations: recon_domain, recon_domain_status, recon_domain_assigned_at,
peertube_category_pushed_at. Adds index on recon_domain_status.

Includes StatusDB helper methods: get/set_domain_assignment,
set_peertube_pushed, get_unpushed_assignments, get_items_by_domain_status,
get_domain_status_counts, get_domain_distribution.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-28 00:04:29 +00:00
71e3dc12ed Phase 1: PeerTube plugin and recon_domains module
Single source of truth for the 18 RECON knowledge domains mapped to
PeerTube category IDs 100-117. Replaces duplicate VALID_DOMAINS sets
in enricher.py and embedder.py with imports from lib/recon_domains.py.

Includes PeerTube plugin (peertube-plugin-recon-domains) that registers
custom categories via videoCategoryManager.addConstant(), and a parity
test to verify constants match between RECON and the PeerTube API.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-28 00:04:13 +00:00
991826b4f1 config: disable 10ft contour test layer (causes green wall on flat terrain) 2026-04-27 02:51:14 +00:00
121eb45b44 feat: add /api/auth/whoami endpoint for frontend auth state
Returns {authenticated: bool, username: string|null} based on
X-Authentik-Username header presence. Used by Navi frontend to
detect auth state without triggering SSO redirect.
2026-04-27 01:26:44 +00:00
b5de9c6e39 fix(geocode): apply viewport bias to Netsyms address results
The /api/geocode endpoint blended Photon and Netsyms results, but only
Photon respected viewport bias from prior work. Address queries to
Netsyms/AddressDB returned globally-sorted matches regardless of where
the user was looking — searching '214 North St' from Idaho returned
Illinois results.

Now fetches up to 200 Netsyms results when viewport lat/lon provided,
sorts by squared distance from viewport center, then returns top N.
Falls back to default ordering when viewport absent. Photon path
unchanged.
2026-04-26 20:59:17 +00:00
2387a96a1e feat(place): add boundary polygon to place detail response
Request polygon_geojson=1 from Nominatim to include admin boundary
polygons in place detail responses. Also fetch boundary via OSM
relation ID for wikidata lookups.
2026-04-26 08:26:47 +00:00
e9c9cee4f3 feat: Add wikidata lookup endpoint for place enrichment
- Add get_place_by_wikidata() to place_detail.py
- Queries Wikidata API for entity details (name, description, coords)
- Extracts population, instance_of, OSM relation ID, Wikipedia link
- Add /api/place/wikidata/<id> route to api.py

Supports Navi basemap label enrichment when OSM details unavailable.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-04-26 08:15:16 +00:00
07e6d0460b docs: remove design doc (relocated to matt/refactored-recon)
NAVI-DIRECTIONS-REDESIGN.md belongs in the design-docs repo
(matt/refactored-recon) alongside PROJECT-BIBLE.md, AUTH-PUBLIC-FRONTEND.md,
and other design artifacts. Code repo holds code only.
2026-04-26 05:03:03 +00:00
4f96d8f6fe docs: Navi directions UX redesign with radial map menu
Design document covering:
- Current state analysis and failure modes
- New DirectionsPanel with visible From/To inputs
- RadialMenu component for map right-click/long-press
- Interaction flows for all directions scenarios
- Mobile considerations (bottom sheet, long-press timing)
- Implementation sequence (10 phases)
- Open questions for Matt

Implementation deferred to dedicated session.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-04-26 04:58:01 +00:00
2ed9335f4e feat(geocode): add viewport bias for location-aware search
- Add lat/lon/zoom params to geocode() and _retrieve_photon_freetext()
- Update nav_tools.py wrapper to pass through viewport params
- Add /api/geocode handler support for lat/lon/zoom query params
- Add _safe_float() helper for param validation
- Cast zoom to int for Photon compatibility

Allows the frontend to pass current map center/zoom to bias
search results toward the visible area.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-04-26 04:03:44 +00:00
f35af18320 feat(place): gate Google Places API calls behind auth
Guest users receive local and cached data only. New Google Places API
calls are only triggered for authenticated users, protecting against
cost exploitation on the public navi.echo6.co frontend.

The pattern: cached Google data flows freely (already paid for by an
authed lookup). New API calls require X-Authentik-Username via
get_user_id() check.
2026-04-26 03:36:21 +00:00
63b68bfea7 feat: add has_contours feature flags for home and regional_pi profiles
Adds has_contours, has_contours_test, and has_contours_test_10ft flags
to support contour layer toggle in Navi frontend. minimal_pi profile
intentionally excluded (no tile overlays in stripped-down deployment).
2026-04-26 03:36:16 +00:00
15c58a69ac Add Nav-I API key management UI
Replace /nav-i/api-keys stub with functional admin page for managing
third-party API keys (Gemini, TomTom, Google Places).

- New lib/api_keys_admin.py: list/update/test operations with masked
  display, atomic .env writes (.env.bak backup), provider-specific
  test calls (Gemini models.list, TomTom geocode, Google Places
  searchText)
- 4 new endpoints: GET /api/nav-i/api-keys/list, POST .../update,
  POST .../test, POST .../restart-recon
- Full UI: key table with masked values, per-key update modal with
  show/hide toggle, inline test results with latency, Gemini detail
  sub-table with per-key stats, RECON restart with confirmation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-23 06:50:44 +00:00
829bc87b7b Add wiki link rewriting to local Kiwix
Rewrites OSM wikipedia/wikidata/wikivoyage/appropedia extratag values
to local Kiwix URLs (wiki.echo6.co) when the article exists in a loaded
ZIM, falling back silently to public URLs otherwise.

- New lib/wiki_rewrite.py: URL classification, Kiwix OPDS catalog
  discovery (xml.etree.ElementTree), HEAD-based availability check,
  positive-only SQLite cache, disabled discovery stubs
- place_detail.py: _enrich_wiki_links() at both Nominatim and Overpass
  enrichment sites, before cache_put
- Profile flags: has_wiki_rewriting (home/regional: true, minimal: false),
  has_wiki_discovery (all: false, stubs for future activation)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-23 06:34:22 +00:00
9c5b0520f9 Add PAD-US public land classification lookup
Integrates USGS PAD-US 4.0 (651k features) into a local PostGIS database
for point-in-polygon land ownership queries. Adds /api/landclass endpoint
returning classifications, public/private status, and management hierarchy.

- lib/landclass.py: connection pool, lookup_landclass(), domain label maps
- lib/api.py: GET /api/landclass?lat=&lon= (feature-flag gated)
- home.yaml: enable has_landclass flag

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-22 15:36:37 +00:00
3280e34718 Add Nav-I dashboard section with restore-as conflict resolution
- Create Nav-I top-level section in dashboard navigation
- Move Deleted Contacts from Knowledge subnav to Nav-I
- Add Nav-I landing page with card grid (deleted count, API keys stub)
- Add /nav-i/api-keys placeholder page
- Add restore-as endpoint for Home/Work conflict resolution
- Conflict modal in deleted contacts template for label rename on restore

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-22 06:26:25 +00:00
a4288c0cd8 Add contacts/phone book system with per-user scoping
New files:
- lib/auth.py: Authentik forward-auth helpers (get_user_id, @require_auth)
- lib/contacts.py: ContactsDB with CRUD, soft delete, restore, purge, find_nearby
- lib/contacts_api.py: Flask Blueprint with 9 API endpoints at /api/contacts
- templates/knowledge/deleted_contacts.html: Dashboard recovery page

Modified:
- lib/api.py: Register contacts_bp, add KNOWLEDGE_SUBNAV entry, /deleted-contacts route
- config/profiles: has_contacts feature flag (true for home, false for pi profiles)

Separate SQLite DB at data/contacts.db. Per-user isolation via X-Authentik-Username.
Home/Work labels enforced unique per user. Haversine proximity queries (75m default).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-22 05:29:54 +00:00
095bf8c2af Add Google Places (New) tertiary enrichment for business POIs
Fills opening_hours, phone, and website gaps when OSM + Overture data
is incomplete. Only fires for business-class POIs (amenity, shop, tourism,
leisure, office, craft). Daily API call cap with SQLite tracking.
cache_put now preserves google columns across cache refreshes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-22 04:08:12 +00:00
620f99c762 Add business_intent_poi_boost reranker signal
When a query contains no road-type keywords (st, blvd, ave, etc.),
boost amenity/shop/tourism/leisure/office/craft results (+3.0) and
penalize highway/route results (-4.0). This fixes searches like
"starbucks twin falls" where a named service road outranked the
actual business POI due to Photon position tiebreaking.

Also fixes:
- Intent classifier now recognizes full state names ("idaho" not
  just "ID") for LOCALITY classification
- Locality-type Photon results now populate _city from name field
  so they participate in locality_fuzz scoring
- Trace logging expanded to all candidates with osm_key/value

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-21 19:39:37 +00:00
d460f0e202 Fix type classifier: POI check takes precedence over street_address
Businesses with housenumbers (e.g. M&W Markets at 130 US-30) were
classified as street_address because the housenumber check fired before
the osm_key check. Reorder so osm_key in amenity/shop/tourism/leisure/office
is evaluated first, ensuring businesses get type=poi regardless of
whether they have a street address. Also adds office to the POI key set.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-21 19:08:04 +00:00
65693d15aa Add Overture Maps POI enrichment layer for place details
Ingests 20.9M North America places from Overture Maps Foundation
(release 2026-04-15.0) into PostgreSQL. Enriches /api/place responses
with phone, website, and brand data via spatial + fuzzy name matching
when OSM extratags are sparse.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-21 16:51:25 +00:00
2121ee4936 Add place detail proxy with Nominatim-first routing and Overpass fallback
New /api/place/<osm_type>/<osm_id> endpoint returns cleaned OSM tag data
for PlaceDetail panel enrichment. Routes to local Nominatim (Idaho coverage)
first, falls back to Overpass public API for out-of-region queries. Responses
cached in SQLite (data/place_cache.db) with no expiry.

New modules: lib/place_detail.py (proxy + cache), lib/osm_categories.py
(~50 category humanization mappings). Profile YAMLs updated with
place_details config block and has_nominatim_details flag.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-21 03:06:51 +00:00
64605b38bb Add TomTom traffic proxy and update profiles for hillshade/traffic layers
- Add /api/traffic/flow proxy route to hide TomTom API key from frontend
- Add tileset_hillshade and traffic config blocks to all three profiles
- Flip has_hillshade and has_traffic_overlay flags in home and regional profiles
- Minimal profile has config blocks but flags remain false (dormant)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-21 00:52:04 +00:00
e6b81db520 feat(navi): deployment profiles + /api/config endpoint
Add profile-driven config infrastructure:
- config/profiles/{home,regional_pi,minimal_pi}.yaml templates
- lib/deployment_config.py loader (reads RECON_PROFILE env var)
- GET /api/config returns active profile as JSON (5min cache)

Frontend reads this on startup to determine tile source, defaults,
and feature flags. No existing behavior changed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-20 23:35:39 +00:00
d4c5c371ca Merge feature/navi-integration: Navi backend (address book, Netsyms, geocoding chain, reverse endpoint) 2026-04-20 22:40:03 +00:00
ac69e2761d feat(navi): add /api/reverse endpoint for map-click reverse geocoding
Accepts lat/lon query params, calls Photon /reverse, returns same
response shape as /api/geocode. Returns 200 with empty results on
no match (graceful degradation for ocean/unmapped areas).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-20 21:26:35 +00:00
87b230dcba feat(navi): structured geocode with usaddress parsing and reranker
Add lib/geocode.py — multi-source retrieval pipeline:
- usaddress CRF parsing with intent classification
- Netsyms structured lookup (uses raw street abbreviations)
- Photon /structured + /api freetext retrieval
- Weighted 10-signal reranker (housenumber, street fuzz, locality,
  source authority, etc.)
- match_code annotations + address book proximity labeling
- Trace log at /tmp/geocode_rerank_trace.log

nav_tools.py now delegates geocode() to the new module.
Tests updated: US address queries correctly return Netsyms results.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-20 16:29:47 +00:00
c76d63b785 refactor(navi): Photon-first geocoding with ranked results
Inverts the /api/geocode chain. Photon is now the primary search
engine; the hand-rolled Netsyms free-text parser is removed.
Address book short-circuits nicknames only ("home", "work") —
full-address queries flow through Photon and address book
entries within 75m annotate matching results with labeled_as.
Coordinate strings detected before search.

Response shape: /api/geocode now returns a ranked candidates
list (always 200 OK, empty list if no match). No more 404 for
unmatched queries. Users can type messy input — wrong case,
missing punctuation, abbreviations, typos — and get results
or close matches.

Netsyms preserved at /api/netsyms/lookup for direct access.
USPS plus4 enrichment of Photon street-address hits is a
planned follow-up.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-20 15:48:03 +00:00
a14501347b fix(navi): address book prefix+boundary match for longer queries
lookup() previously did exact-alias-only matching, so "214 north st
filer" missed the home entry with alias "214 north st". Extend to
match when the query begins with an alias followed by a word
boundary, and when an alias appears as a contiguous token sequence
inside the query. Short aliases ("home") keep matching exactly and
also match with trailing text.

Fixes the UX case where typing a known full address falls through
to Netsyms instead of short-circuiting to address_book.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-20 07:54:32 +00:00
dfab388769 feat(navi): add netsyms tier-2 geocoding + geocode API
Add Netsyms AddressDatabase2025 (159M US+CA addresses) as tier-2
in the geocode chain: address_book → netsyms → photon.

- lib/netsyms.py: SQLite lookup module (lazy, read-only, thread-safe)
- lib/netsyms_api.py: Flask blueprints for /api/netsyms/* and /api/geocode
- lib/netsyms_test.py: 7 test cases (street, free-text, zipcode, health)
- lib/nav_tools.py: new geocode() with consistent {name,lat,lon,source,raw}
- lib/api.py: register netsyms_bp and geocode_bp

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-20 07:24:09 +00:00
23483e8198 feat(navi): address book with geocoding integration
- YAML-backed saved locations (config/address_book.yaml)
- Exact/partial alias matching with case-insensitive lookup
- Flask blueprint: /api/address_book/lookup, /api/address_book/list
- Geocoder short-circuits Photon when address book has exact match
- Test suite for lookup behavior

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-20 04:02:11 +00:00
3243f2f252 feat(navi): semantic query router for intelligent tool selection - Phase H2b
Add centroid-based query classifier that routes Aurora queries to the
appropriate handler (nav_route, nav_reverse_geocode, direct_answer,
rag_search) before the RAG pipeline runs. Uses TEI embeddings against
pre-computed route centroids from 38 example queries.

- query_router.py: standalone module with lazy centroid init
- query_router_test.py: 7-query test suite (all passing)
- Corresponding recon_rag_tool.py v4.2.0 deployed to Open WebUI DB

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-19 23:50:35 +00:00
9841c38011 fix(navi): format tool output as human-readable directions 2026-04-19 22:42:17 +00:00
a9510b5ed9 feat(navi): add nav_tools with route() and reverse_geocode() - Phase H2
- nav_tools.py: route() geocodes via Photon, routes via Valhalla, returns
  summary/maneuvers/polyline. reverse_geocode() for coordinate lookups.
  Supports auto/pedestrian/bicycle/truck modes.
- nav_tools_test.py: 5 live tests against local Photon (2322) and Valhalla (8002)
- aurora_nav_tool.py: Open WebUI Tool exposing get_directions to Aurora LLM

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-19 22:14:26 +00:00
c5283ece3e Merge feature/scraper: Zimit-based web scraper
Replaces wget/SingleFile/Playwright crawl backends with Zimit (openZIM
Docker crawler). Produces ZIM files directly — no zimwriterfs step.
Validated with meshtastic.org (3400+ page Docusaurus site).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-19 19:37:04 +00:00
5f5bcedab9 Fix progress regex and SIGHUP/scan_zims race condition
- Parse Browsertrix "crawled":N JSON format instead of "N pages"
- Add 3s delay between SIGHUP to kiwix-serve and scan_zims() call
  so the OPDS catalog is reloaded before we query it for linking

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-19 19:35:42 +00:00
9692044790 Fix progress parsing for Browsertrix JSON log format
Parse "crawled":N from Browsertrix crawlStatus JSON logs instead of
looking for "N pages" pattern. Also check stdout (not just stderr).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-19 19:33:50 +00:00
b035ba3f20 Fix Zimit: add required --name flag for warc2zim
warc2zim (called internally by zimit) requires --name for ZIM metadata.
Without it, argument validation fails with exit code 2.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-19 14:30:42 +00:00