recon/lib
Matt df29d598d3 Phase 6a: transcripts mark organized in-place, skip filing
Transcripts are derived text from PeerTube videos, not primary source
files. They do not belong in library/Domain/Subdomain/ like PDFs.

Change: transcript_processor.pre_flight() now sets organized_at =
CURRENT_TIMESTAMP at the end of successful processing, marking the
transcript as organized in place. The watch URL remains in
catalogue.path and Qdrant download_url so users clicking search
results go to the PeerTube video.
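
A minimal sketch of the "organized in place" step, not the actual code in transcript_processor.pre_flight(). It assumes a table named documents with id, path, and organized_at columns, and uses an in-memory SQLite database as a stand-in for the real store. The helper name mark_organized_in_place and the example watch URL are hypothetical.

```python
import sqlite3


def mark_organized_in_place(conn, doc_id):
    """Stamp organized_at for one row; the path (watch URL) is left untouched."""
    cur = conn.execute(
        "UPDATE documents SET organized_at = CURRENT_TIMESTAMP "
        "WHERE id = ? AND organized_at IS NULL",
        (doc_id,),
    )
    conn.commit()
    return cur.rowcount  # 1 if the row was newly marked, 0 otherwise


# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents (id INTEGER PRIMARY KEY, path TEXT, organized_at TEXT)"
)
conn.execute(
    "INSERT INTO documents (path) VALUES ('https://peertube.example/w/abc123')"
)

updated = mark_organized_in_place(conn, 1)
row = conn.execute(
    "SELECT path, organized_at FROM documents WHERE id = 1"
).fetchone()
```

The organized_at IS NULL guard makes the call idempotent, so running it again on an already-processed transcript leaves the original timestamp alone.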

The filing worker's path LIKE filter naturally excludes transcripts
since their documents.path is the watch URL, not a filesystem path.
No filing worker changes needed.
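
The exclusion can be illustrated with a small sketch. The real query in filing.py is not shown here, so the prefix /data/incoming/, the table layout, and the function name filing_candidates are all assumptions. SQLite stands in for the real database.

```python
import sqlite3

# Hypothetical schema and rows: one filesystem document, one transcript.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents (id INTEGER PRIMARY KEY, path TEXT, organized_at TEXT)"
)
conn.executemany(
    "INSERT INTO documents (path) VALUES (?)",
    [
        ("/data/incoming/manual.pdf",),
        ("https://peertube.example/w/abc123",),
    ],
)


def filing_candidates(conn, prefix="/data/incoming/"):
    """Return unorganized rows whose path starts with a filesystem prefix."""
    rows = conn.execute(
        "SELECT path FROM documents "
        "WHERE organized_at IS NULL AND path LIKE ? "
        "ORDER BY id",
        (prefix + "%",),
    )
    return [r[0] for r in rows]


candidates = filing_candidates(conn)
```

Because a watch URL never begins with the filesystem prefix, the transcript row drops out of the result without any transcript-specific condition in the query.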

Back-fills 2,260 drain items from Phase 5c-2 via one-time SQL.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-14 22:49:21 +00:00
acquisition            Phase 3: dispatcher, transcript processor, text_dir resolution   2026-04-14 15:39:42 +00:00
processors             Phase 6a: transcripts mark organized in-place, skip filing       2026-04-14 22:49:21 +00:00
__init__.py            Initial commit: RECON codebase baseline                          2026-04-14 14:57:23 +00:00
api.py                 Initial commit: RECON codebase baseline                          2026-04-14 14:57:23 +00:00
crawler.py             Initial commit: RECON codebase baseline                          2026-04-14 14:57:23 +00:00
dispatcher.py          Phase 5c-1: dispatcher loop, filing worker loop, service rewire  2026-04-14 18:30:58 +00:00
embedder.py            Phase 3: dispatcher, transcript processor, text_dir resolution   2026-04-14 15:39:42 +00:00
enricher.py            Phase 3: dispatcher, transcript processor, text_dir resolution   2026-04-14 15:39:42 +00:00
extractor.py           Initial commit: RECON codebase baseline                          2026-04-14 14:57:23 +00:00
filing.py              Phase 5c-1: dispatcher loop, filing worker loop, service rewire  2026-04-14 18:30:58 +00:00
ingester.py            Initial commit: RECON codebase baseline                          2026-04-14 14:57:23 +00:00
key_manager.py         Initial commit: RECON codebase baseline                          2026-04-14 14:57:23 +00:00
new_pipeline.py        Initial commit: RECON codebase baseline                          2026-04-14 14:57:23 +00:00
organizer.py           Initial commit: RECON codebase baseline                          2026-04-14 14:57:23 +00:00
peertube_collector.py  Initial commit: RECON codebase baseline                          2026-04-14 14:57:23 +00:00
peertube_scraper.py    Initial commit: RECON codebase baseline                          2026-04-14 14:57:23 +00:00
status.py              Initial commit: RECON codebase baseline                          2026-04-14 14:57:23 +00:00
utils.py               Phase 3: dispatcher, transcript processor, text_dir resolution   2026-04-14 15:39:42 +00:00
web_scraper.py         Initial commit: RECON codebase baseline                          2026-04-14 14:57:23 +00:00