Processes places that have a wikidata tag but no wikipedia tag:
- Batch resolve Q-IDs via Wikidata API (50/request)
- Validate resolved titles against local ZIM
- Generate summaries with Gemini API (3-4 sentences)
- Circuit breaker: 50 consecutive 429s trigger a 5-minute pause
- Revalidate any remaining unvalidated entries
Filters for US+CA places, skips existing wave 1 entries.
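A minimal sketch of the batching and circuit-breaker steps listed above, not the code in this commit. The names `batched`, `wbgetentities_params`, and `CircuitBreaker` are hypothetical. The sketch assumes the Q-IDs are resolved to English Wikipedia titles through the Wikidata `wbgetentities` action, and it leaves the HTTP call itself to the caller. The 50-per-request, 50-consecutive-429, and 5-minute figures come from the list above.

```python
import time
from itertools import islice
from typing import Callable, Iterable, Iterator

BATCH_SIZE = 50          # Q-IDs per request, per the list above
BREAKER_THRESHOLD = 50   # consecutive 429s before pausing
BREAKER_PAUSE_S = 300    # 5 minutes


def batched(qids: Iterable[str], size: int = BATCH_SIZE) -> Iterator[list[str]]:
    """Yield successive lists of at most `size` Q-IDs."""
    it = iter(qids)
    while chunk := list(islice(it, size)):
        yield chunk


def wbgetentities_params(qids: list[str]) -> dict[str, str]:
    """Build query parameters for one batched sitelink lookup.

    Assumption: titles are taken from the enwiki sitelink.
    """
    return {
        "action": "wbgetentities",
        "ids": "|".join(qids),
        "props": "sitelinks",
        "sitefilter": "enwiki",
        "format": "json",
    }


class CircuitBreaker:
    """Pause after a run of consecutive HTTP 429 responses."""

    def __init__(
        self,
        threshold: int = BREAKER_THRESHOLD,
        pause_s: float = BREAKER_PAUSE_S,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.threshold = threshold
        self.pause_s = pause_s
        self.sleep = sleep  # injectable so tests do not actually wait
        self.consecutive_429s = 0

    def record(self, status_code: int) -> bool:
        """Record a response status; pause and return True if the breaker trips."""
        if status_code != 429:
            # Any other status breaks the run of consecutive 429s.
            self.consecutive_429s = 0
            return False
        self.consecutive_429s += 1
        if self.consecutive_429s < self.threshold:
            return False
        self.sleep(self.pause_s)
        self.consecutive_429s = 0
        return True
```

A caller would loop over `batched(qids)`, send one request per batch, and pass each response status to `record()`. Injecting `sleep` keeps the pause testable without waiting five minutes.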
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Current state of the pipeline code as of 2026-04-14 (Phase 1 scaffolding complete).
Config has new_pipeline.enabled=false and crawler.sites=[] per the refactor plan.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>