Commit graph

2 commits

Author SHA1 Message Date
7c1af0f063 Phase 1: Kiwix foundation — ZIM monitor and kiwix-serve setup
- Add lib/zim_monitor.py: polls kiwix-serve OPDS v2 catalog, detects
  new ZIMs, reads accurate article count from python-libzim Counter
  metadata (not inflated OPDS count), inserts into zim_sources table.
  Idempotent on re-run, marks removed ZIMs.
- DB schema: zim_sources, zim_samples, zim_articles tables (created
  via sqlite3, not in migrations — matches existing RECON pattern)
- kiwix-tools 3.7.0 installed from binary tarball at /opt/recon/bin/
  (Ubuntu 24.04 apt ships 3.5.0 which lacks OPDS v2)
- kiwix.service systemd unit on port 8430
- python-libzim 3.9.0 installed
- Test ZIM: Appropedia EN maxi (496 MB, 19,445 articles)
- Add bin/ to .gitignore (binary tarball, not source)
2026-04-16 23:39:34 +00:00
563c16bb71 Initial commit: RECON codebase baseline
Current state of the pipeline code as of 2026-04-14 (Phase 1 scaffolding complete).
Config has new_pipeline.enabled=false and crawler.sites=[] per refactor plan.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-14 14:57:23 +00:00