recon/lib
Matt e6224cb279 Migrate dashboard upload to pipeline with multi-format support
Upload handler now writes files to the appropriate hopper subfolder
instead of copying directly to /mnt/library/:
- .pdf -> acquired/pdf/
- .txt -> acquired/text/
- .epub, .doc, .docx, .mobi -> acquired/pdf/ (dispatcher format
  normalizer converts to PDF before processing)

The dispatcher picks up files and routes through the appropriate
processor (pdf_processor or text_processor) for full metadata
voting, domain classification, and canonical filing.

Changes to api_upload() / _process_upload():
- Relaxed extension check: PDF, TXT, EPUB, DOC, DOCX, MOBI
- Routes to correct hopper subfolder by extension
- Writes meta.json sidecar with original filename and category hint
- Removed: direct library copy, add_to_catalogue, queue_document
- Added: hopper-level dedup check (catches rapid re-uploads)
- Kept: catalogue dedup check for immediate user feedback

Changes to api_upload_status():
- Added fallback: checks acquired/ and processing/ dirs if hash
  not yet in documents table (covers gap between upload and
  dispatcher pickup)

Template updated: accept attribute and help text now reflect
multi-format support.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 02:18:45 +00:00
..
acquisition Phase 6d: PeerTube acquisition module + service thread 2026-04-15 03:08:51 +00:00
processors Fix: Gemini "null" string bug in pdf_processor metadata voting 2026-04-15 23:30:59 +00:00
__init__.py Initial commit: RECON codebase baseline 2026-04-14 14:57:23 +00:00
api.py Migrate dashboard upload to pipeline with multi-format support 2026-04-16 02:18:45 +00:00
dispatcher.py Phase 6f-2: format normalizer in dispatcher 2026-04-15 23:08:19 +00:00
embedder.py Phase 3: dispatcher, transcript processor, text_dir resolution 2026-04-14 15:39:42 +00:00
enricher.py Phase 3: dispatcher, transcript processor, text_dir resolution 2026-04-14 15:39:42 +00:00
extractor.py Initial commit: RECON codebase baseline 2026-04-14 14:57:23 +00:00
filing.py Phase 5c-1: dispatcher loop, filing worker loop, service rewire 2026-04-14 18:30:58 +00:00
ingester.py Initial commit: RECON codebase baseline 2026-04-14 14:57:23 +00:00
key_manager.py Initial commit: RECON codebase baseline 2026-04-14 14:57:23 +00:00
new_pipeline.py Initial commit: RECON codebase baseline 2026-04-14 14:57:23 +00:00
organizer.py Initial commit: RECON codebase baseline 2026-04-14 14:57:23 +00:00
peertube_collector.py Initial commit: RECON codebase baseline 2026-04-14 14:57:23 +00:00
peertube_scraper.py Initial commit: RECON codebase baseline 2026-04-14 14:57:23 +00:00
status.py Initial commit: RECON codebase baseline 2026-04-14 14:57:23 +00:00
utils.py Phase 3: dispatcher, transcript processor, text_dir resolution 2026-04-14 15:39:42 +00:00
web_scraper.py Initial commit: RECON codebase baseline 2026-04-14 14:57:23 +00:00