Mirror of https://github.com/zvx-echo6/recon.git (synced 2026-05-20 06:34:40 +02:00)
Adds _normalize_formats() to the dispatcher that converts non-standard document formats to PDF before dispatch. Supports:

- .epub, .mobi -> PDF via ebook-convert (Calibre)
- .doc, .docx -> PDF via LibreOffice headless

Called per-subfolder before _find_pairs() so _find_pairs() only ever sees standard content files. Conversion failures are logged and skipped -- the original file stays in acquired/ for manual review.

Also converts 3 staged epub files and cleans up _staging/.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

| | | |
|---|---|---|
| acquisition | ||
| processors | ||
| __init__.py | ||
| api.py | ||
| dispatcher.py | ||
| embedder.py | ||
| enricher.py | ||
| extractor.py | ||
| filing.py | ||
| ingester.py | ||
| key_manager.py | ||
| new_pipeline.py | ||
| organizer.py | ||
| peertube_collector.py | ||
| peertube_scraper.py | ||
| status.py | ||
| utils.py | ||
| web_scraper.py | ||
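
The normalization step described in the commit message could look roughly like the sketch below. This is not the code in dispatcher.py, only an illustration under stated assumptions: the names `build_command` and `normalize_formats`, the `run` parameter, the 600-second timeout, and the `soffice` binary name are all hypothetical. The commit message does not say what happens to the source file after a successful conversion, so the sketch simply leaves it in place.

```python
import logging
import subprocess
from pathlib import Path
from typing import Callable, List, Optional

log = logging.getLogger(__name__)

# Extension groups taken from the commit message.
EBOOK_EXTS = {".epub", ".mobi"}
OFFICE_EXTS = {".doc", ".docx"}


def build_command(src: Path) -> Optional[List[str]]:
    """Return the converter command for src, or None if no conversion applies."""
    ext = src.suffix.lower()
    if ext in EBOOK_EXTS:
        # Calibre CLI: ebook-convert <input> <output>
        return ["ebook-convert", str(src), str(src.with_suffix(".pdf"))]
    if ext in OFFICE_EXTS:
        # LibreOffice headless; the binary may be named "libreoffice" on some systems.
        return ["soffice", "--headless", "--convert-to", "pdf",
                "--outdir", str(src.parent), str(src)]
    return None


def normalize_formats(folder: Path, run: Callable = subprocess.run) -> List[Path]:
    """Convert non-standard documents in folder to PDF; return the PDFs created.

    Failures are logged and skipped, so the original file stays where it is.
    """
    converted: List[Path] = []
    for src in sorted(folder.iterdir()):
        if not src.is_file():
            continue
        cmd = build_command(src)
        if cmd is None:
            continue  # already a standard format, nothing to do
        pdf = src.with_suffix(".pdf")
        try:
            run(cmd, check=True, capture_output=True, timeout=600)
        except (OSError, subprocess.SubprocessError) as exc:
            # Covers a missing binary, a non-zero exit status, and a timeout.
            log.warning("conversion failed for %s: %s", src, exc)
            continue
        if pdf.exists():
            converted.append(pdf)
        else:
            log.warning("converter reported success but %s is missing", pdf)
    return converted
```

Passing the runner as a parameter keeps the sketch testable without Calibre or LibreOffice installed. In a dispatcher it would be called once per subfolder, before the pairing step.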