Phase 5c-1: Service Loop Rewire
Executed: 2026-04-14T18:10–18:30Z
Backup
| Item | Location | MD5 Hash |
|---|---|---|
| recon.db (pre-Phase 5c-1) | CT 130: /tmp/recon.db.phase5c1.20260414.bak | db48369c9fd0937d1b4869196d3fc19b |
What This Phase Does
Rewires the RECON service loop (cmd_service() in recon.py) to use the new dispatcher + filing worker architecture built in Phases 2–4. Code changes only — the service has NOT been started. Phase 5c-2 will do the first live run.
Three files modified, zero data changes.
Files Changed
| File | Lines Added | Lines Removed | Changes |
|---|---|---|---|
| lib/dispatcher.py | 28 | 0 | Added dispatch_loop(), a service-thread wrapper around dispatch_once() |
| lib/filing.py | 57 | 0 | Added filing_worker_loop(), which watches for status=complete items in processing/ |
| recon.py | 98 | 242 | Rewired cmd_service(), removed 4 old loop definitions |
Service Thread List: Before vs After
| # | Before (old) | After (new) | Notes |
|---|---|---|---|
| 1 | extract (stage_loop) | dispatcher (dispatch_loop) | NEW — scans acquired/ subfolders |
| 2 | enrich (stage_loop) | extract (stage_loop) | KEPT — vestigial, will be removed in Phase 6 |
| 3 | embed (stage_loop) | enrich (stage_loop) | KEPT |
| 4 | scanner (scanner_loop) | embed (stage_loop) | KEPT |
| 5 | peertube (peertube_scanner_loop) | filing (filing_worker_loop) | NEW — files completed items to library |
| 6 | crawler (crawler_scheduler_loop) | progress (progress_loop) | KEPT |
| 7 | organizer (organizer_loop) | dashboard (start_dashboard) | KEPT |
| 8 | progress (progress_loop) | — | — |
| 9 | dashboard (start_dashboard) | — | — |
Removed threads:
- scanner_loop: scanned library tree for new PDFs; replaced by dispatcher scanning acquired/ dirs
- peertube_scanner_loop: ingested PeerTube transcripts; replaced by dispatcher + transcript processor
- crawler_scheduler_loop: crawled configured websites; disabled in config, superseded by acquisition module approach
- organizer_loop: filed completed docs using old organize_document(); replaced by filing_worker_loop
Added threads:
- dispatch_loop: runs dispatch_once() every 30s (configurable via service.dispatch_interval), dispatching content from acquired/{subfolder}/ to the appropriate processor
- filing_worker_loop: queries for status='complete' AND organized_at IS NULL AND path LIKE '/opt/recon/data/processing/%' every 30s (configurable via service.filing_interval), then files items to library/Domain/Subdomain/
Vestigial extract worker: The extract stage worker is kept as a no-op safety net. Processors do their own extraction inline, so the extract worker will find nothing to do. It will be removed in the Phase 6 cleanup.
Key Design Decisions
Filing worker safety rail
The filing worker query includes AND path LIKE '/opt/recon/data/processing/%' to ensure it only touches items that went through the new dispatcher → processor pipeline. Legacy items at other paths are left alone.
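The effect of the path filter can be illustrated with a minimal sketch against an in-memory SQLite database. This assumes recon.db is SQLite; the table name `documents`, the `id` column, the sample rows, and the `enriching` status value are hypothetical. Only the three WHERE conditions come from this report.

```python
import sqlite3

# Hypothetical schema: this report names only status, organized_at, and path.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents ("
    "id INTEGER PRIMARY KEY, status TEXT, organized_at TEXT, path TEXT)"
)
conn.executemany(
    "INSERT INTO documents (status, organized_at, path) VALUES (?, ?, ?)",
    [
        # Eligible: complete, not yet filed, under processing/
        ("complete", None, "/opt/recon/data/processing/a.pdf"),
        # Legacy item at another path: left alone by the safety rail
        ("complete", None, "/opt/recon/data/library/legacy.pdf"),
        # Already filed: organized_at is set
        ("complete", "2026-04-01", "/opt/recon/data/processing/b.pdf"),
        # Still in the pipeline (status value is illustrative)
        ("enriching", None, "/opt/recon/data/processing/c.pdf"),
    ],
)

FILING_QUERY = """
    SELECT id, path FROM documents
    WHERE status = 'complete'
      AND organized_at IS NULL
      AND path LIKE '/opt/recon/data/processing/%'
"""

pending = conn.execute(FILING_QUERY).fetchall()
print(pending)  # [(1, '/opt/recon/data/processing/a.pdf')]
```

Only the first row is returned; the legacy row is excluded purely by the path condition, even though it is complete and unfiled.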
Resilient loops
Both dispatch_loop() and filing_worker_loop() catch every exception raised by their inner logic, log it, and continue. Service threads never raise to the caller, so a single processing error cannot kill the thread.
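The log-and-continue pattern might look roughly like the sketch below. It is not the code in lib/dispatcher.py or lib/filing.py: the function name `resilient_loop`, its parameters, and the use of a `threading.Event` for shutdown are all assumptions made for illustration.

```python
import logging
import threading

log = logging.getLogger("recon.service")


def resilient_loop(name, work_once, interval, stop_event):
    """Call work_once() every `interval` seconds until stop_event is set.

    Exceptions from work_once() are logged and swallowed so the thread survives.
    """
    while not stop_event.is_set():
        try:
            work_once()
        except Exception:
            # Logs the traceback, then falls through to the next iteration
            log.exception("%s: iteration failed, continuing", name)
        # wait() returns early once stop_event is set, so shutdown is prompt
        stop_event.wait(interval)


# Demo: the first call fails, yet the loop keeps running until the third call.
calls = []
stop = threading.Event()


def flaky_work():
    calls.append(1)
    if len(calls) == 1:
        raise RuntimeError("simulated processing error")
    if len(calls) >= 3:
        stop.set()


t = threading.Thread(
    target=resilient_loop, args=("demo", flaky_work, 0.01, stop), daemon=True
)
t.start()
t.join(timeout=2)
print(len(calls), t.is_alive())  # 3 False
```

Catching `Exception` rather than using a bare `except` keeps `KeyboardInterrupt` and `SystemExit` out of the handler; whether the real loops make the same choice is not stated here.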
Config-driven intervals
New config keys service.dispatch_interval and service.filing_interval control polling frequency (both default to 30s if not specified in config.yaml).
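A minimal sketch of the default-to-30s lookup, assuming config.yaml has already been parsed into a nested dict with a top-level `service` mapping. The helper name `get_interval` and the sample config contents are hypothetical.

```python
DEFAULT_INTERVAL_SECONDS = 30


def get_interval(config, key):
    """Return service.<key> from a parsed config mapping, defaulting to 30."""
    # `or {}` also covers a `service:` key that is present but empty (None)
    service = config.get("service") or {}
    return service.get(key, DEFAULT_INTERVAL_SECONDS)


# Stand-ins for parsed config.yaml contents
config_without_keys = {}
config_with_override = {"service": {"dispatch_interval": 10}}

dispatch_default = get_interval(config_without_keys, "dispatch_interval")
dispatch_custom = get_interval(config_with_override, "dispatch_interval")
filing_default = get_interval(config_with_override, "filing_interval")
print(dispatch_default, dispatch_custom, filing_default)  # 30 10 30
```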
Verification Results
| Check | Result |
|---|---|
| py_compile recon.py | OK |
| py_compile lib/dispatcher.py | OK |
| py_compile lib/filing.py | OK |
| import dispatch_once, dispatch_loop | OK |
| import file_processed_item, filing_worker_loop | OK |
| import recon | OK |
| scanner_loop not in cmd_service | OK |
| peertube_scanner_loop not in cmd_service | OK |
| crawler_scheduler_loop not in cmd_service | OK |
| organizer_loop not in cmd_service | OK |
| dispatch_loop in cmd_service | OK |
| filing_worker_loop in cmd_service | OK |
| recon.service | inactive |
| recon-watchdog.service | inactive |
| Hopper (acquired/stream/) | 2,259 pairs (unchanged) |
| DB counts | 27,553/27,553 (unchanged) |
| Qdrant points | 2,309,260 (unchanged) |
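This report does not record how the "in cmd_service" / "not in cmd_service" checks were run. One possible approach, sketched below, is to inspect the names referenced by the function's bytecode. The stand-in `cmd_service()` and the helper `references()` are hypothetical and only mimic the shape of the real function.

```python
import threading


def dispatch_loop():
    """Stand-in for lib.dispatcher.dispatch_loop."""


def filing_worker_loop():
    """Stand-in for lib.filing.filing_worker_loop."""


def cmd_service():
    # Stand-in for the real cmd_service(); it is inspected here, never called
    return [
        threading.Thread(target=dispatch_loop, name="dispatcher"),
        threading.Thread(target=filing_worker_loop, name="filing"),
    ]


def references(func, name):
    """True if func's own bytecode refers to `name` as a global or attribute.

    Names used only inside nested functions are not covered by this check.
    """
    return name in func.__code__.co_names


REMOVED = [
    "scanner_loop",
    "peertube_scanner_loop",
    "crawler_scheduler_loop",
    "organizer_loop",
]
ADDED = ["dispatch_loop", "filing_worker_loop"]

# Each entry is True when the check passes
results = {name: not references(cmd_service, name) for name in REMOVED}
results.update({name: references(cmd_service, name) for name in ADDED})
print(all(results.values()))  # True
```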
Service NOT Started
This phase is code-only. The service has NOT been started. Phase 5c-2 will:
- Start the service
- Watch the dispatcher pick up the 2,259 hopper items
- Monitor the pipeline processing them through enrich → embed → filing
Commit
| Repo | Branch | Hash | Message |
|---|---|---|---|
| matt/recon | refactor | d9aed35 | Phase 5c-1: dispatcher loop, filing worker loop, service rewire |