Migrate dashboard upload to pipeline with multi-format support

Upload handler now writes files to the appropriate hopper subfolder
instead of copying directly to /mnt/library/:
- .pdf -> acquired/pdf/
- .txt -> acquired/text/
- .epub, .doc, .docx, .mobi -> acquired/pdf/ (the dispatcher's format
  normalizer converts these to PDF before processing)
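
The extension routing above could be sketched roughly as follows. This is
an illustration only, not the code in this commit: the names
HOPPER_SUBFOLDER and hopper_subfolder() are hypothetical, and only the
extension-to-folder mapping is taken from the list above.

```python
from pathlib import Path
from typing import Optional

# Mapping copied from the commit message; the dict name is hypothetical.
HOPPER_SUBFOLDER = {
    ".pdf": "acquired/pdf",
    ".txt": "acquired/text",
    # Normalized to PDF by the dispatcher, so they share the PDF hopper.
    ".epub": "acquired/pdf",
    ".doc": "acquired/pdf",
    ".docx": "acquired/pdf",
    ".mobi": "acquired/pdf",
}


def hopper_subfolder(filename: str) -> Optional[str]:
    """Return the hopper subfolder for a filename, or None if unsupported."""
    ext = Path(filename).suffix.lower()  # case-insensitive extension match
    return HOPPER_SUBFOLDER.get(ext)
```

Returning None for an unknown extension leaves it to the caller to reject
the upload, which keeps the extension check and the routing in one table.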

The dispatcher picks up the files and routes each one through the
appropriate processor (pdf_processor or text_processor) for full
metadata voting, domain classification, and canonical filing.

Changes to api_upload() / _process_upload():
- Relaxed extension check: PDF, TXT, EPUB, DOC, DOCX, MOBI
- Routes to correct hopper subfolder by extension
- Writes meta.json sidecar with original filename and category hint
- Removed: direct library copy, add_to_catalogue, queue_document
- Added: hopper-level dedup check (catches rapid re-uploads)
- Kept: catalogue dedup check for immediate user feedback
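
A minimal sketch of the sidecar write and the hopper-level dedup check,
under stated assumptions: the function name stage_upload(), the SHA-256
hash, the "<hash><ext>" and "<hash>.meta.json" naming, and the JSON keys
are all hypothetical and may differ from what _process_upload() does.

```python
import hashlib
import json
from pathlib import Path
from typing import Optional


def stage_upload(hopper_dir: Path, data: bytes, original_name: str,
                 category: Optional[str] = None) -> Optional[Path]:
    """Stage an upload plus its meta.json sidecar in a hopper subfolder.

    Returns the staged path, or None when the same content is already
    waiting in the hopper (for example after a rapid re-upload).
    """
    digest = hashlib.sha256(data).hexdigest()  # assumed hash algorithm
    hopper_dir.mkdir(parents=True, exist_ok=True)

    # Hopper-level dedup: any file named after this hash means the
    # content is already staged and not yet picked up.
    if any(hopper_dir.glob(digest + ".*")):
        return None

    staged = hopper_dir / (digest + Path(original_name).suffix.lower())
    staged.write_bytes(data)

    # The sidecar keeps what the hash-based name loses.
    sidecar = hopper_dir / (digest + ".meta.json")
    sidecar.write_text(json.dumps({
        "original_filename": original_name,
        "category_hint": category,
    }))
    return staged
```

Naming the staged file after its content hash is what makes the dedup
check a plain directory lookup in this sketch.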

Changes to api_upload_status():
- Added fallback: checks acquired/ and processing/ dirs if hash
  not yet in documents table (covers gap between upload and
  dispatcher pickup)
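
The fallback might look something like the sketch below. It assumes that
staged files are named after their content hash and that acquired/ and
processing/ sit under a common root; pending_stage() is a hypothetical
helper, not a function from this commit.

```python
from pathlib import Path
from typing import Iterable, Optional


def pending_stage(root: Path, file_hash: str,
                  stages: Iterable[str] = ("acquired", "processing")
                  ) -> Optional[str]:
    """Return the first stage directory holding a file for file_hash.

    Intended as a fallback when the hash is not yet in the documents
    table. Returns None if no stage directory contains a match.
    """
    for stage in stages:
        stage_dir = root / stage
        if not stage_dir.is_dir():
            continue
        # Search subfolders too (acquired/pdf/, acquired/text/, ...).
        for path in stage_dir.rglob(file_hash + "*"):
            if path.is_file():
                return stage
    return None
```

A caller could report a "queued" or "processing" status from the returned
stage name and only report "not found" when this also returns None.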

Template updated: accept attribute and help text now reflect
multi-format support.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Matt 2026-04-16 02:18:45 +00:00
commit e6224cb279
2 changed files with 69 additions and 44 deletions

@@ -1,12 +1,13 @@
 {% extends "base.html" %}
 {% block content %}
-<h3 class="section-title mb-16">Upload PDF</h3>
+<h3 class="section-title mb-16">Upload Document</h3>
 <div class="panel">
 <form id="upload-form" enctype="multipart/form-data">
 <div class="mb-16">
-<label class="text-dim text-xs" style="text-transform:uppercase;display:block;margin-bottom:4px;">PDF File</label>
-<input type="file" name="file" accept=".pdf" id="upload-file"
+<label class="text-dim text-xs" style="text-transform:uppercase;display:block;margin-bottom:4px;">Document File</label>
+<input type="file" name="file" accept=".pdf,.txt,.epub,.doc,.docx,.mobi" id="upload-file"
 style="background:#0a0a0a;border:1px solid #333;color:#c0c0c0;padding:8px;width:100%;font-family:inherit;">
+<span class="text-dim" style="font-size:11px;display:block;margin-top:4px;">Supported: PDF, TXT, EPUB, DOC, DOCX, MOBI</span>
 </div>
 <div class="mb-16">
 <label class="text-dim text-xs" style="text-transform:uppercase;display:block;margin-bottom:4px;">Category</label>
@@ -67,7 +68,7 @@ document.getElementById('upload-form').addEventListener('submit', async function
 result.innerHTML = '<span style="color:#00ff41;">Queued for processing</span><br>' +
 '<span class="text-dim">Hash: ' + data.hash + '</span><br>' +
 '<span class="text-dim">File: ' + data.filename + '</span><br>' +
-'<span class="text-dim">Category: ' + data.source + '/' + data.category + '</span>';
+'<span class="text-dim">Type: ' + data.source_type + '</span>';
 fileInput.value = '';
 } else {
 status.style.color = '#ff4444';