mirror of
https://github.com/zvx-echo6/refactored-recon.git
synced 2026-05-20 14:44:39 +02:00
4.6 KiB
4.6 KiB
Phase 6c: Code Cleanup
Objective
Remove dead code paths left over from the refactor. Investigation first, deletion second — only remove what's confirmed dead.
Investigation Findings
Expected dead code vs reality
| Item | Expected status | Actual status |
|---|---|---|
scanner_loop |
Dead function in recon.py | Already removed in Phase 5c-1 |
peertube_scanner_loop |
Dead function in recon.py | Already removed in Phase 5c-1 |
crawler_scheduler_loop |
Dead function in recon.py | Already removed in Phase 5c-1 |
organizer_loop |
Dead function in recon.py | Already removed in Phase 5c-1 |
| Extract worker thread | Vestigial in cmd_service() | Confirmed dead — 0 items queued, silent 24h+ |
lib/crawler.py |
Legacy module | Confirmed dead — only used by CLI subcommand |
lib/web_scraper.py |
Legacy module | ALIVE — chunk_text() used by transcript_processor |
lib/new_pipeline.py |
Legacy module | ALIVE — active Stream B library management tool (1,637 lines, created Apr 13) |
lib/peertube_scraper.py |
Legacy module | ALIVE — only mechanism for transcript ingestion |
lib/extractor.py |
Dead module | ALIVE — used by cmd_run CLI for batch processing |
Additional findings
- 24
.bakfiles found across/opt/recon/(untracked, manual pre-edit safety backups from Feb-Apr 2026). All originals preserved in git history. - File ownership: All 21
.pyfiles +recon.pycorrectly owned by zvx. No corrections needed. - No TODO/DEPRECATED comments found in any lib/ file.
- All imports in recon.py confirmed used (no dead imports at module level).
- PeerTube transcript ingestion has no automatic mechanism since Phase 5c-1 removed
peertube_scanner_loop. Ingestion is manual only (CLI or dashboard API endpoint).
What Was Removed
recon.py edits (-89 lines, +3 lines)
-
Extract worker thread removed from
cmd_service():from lib.extractor import run_extractionimportextract_workersvariable'extract': 0from totals dict- Extract
threading.Thread(target=stage_loop, ...)from thread list - Extract workers from startup log message
-
cmd_crawlfunction deleted (65 lines) — CLI handler forrecon crawl -
Crawl argparse subparser deleted (15 lines) —
recon crawlsubcommand registration -
Docstring updated to remove
crawlfrom subcommand list
Files deleted
| File | Lines | Reason |
|---|---|---|
lib/crawler.py |
432 | Only referenced by deleted cmd_crawl CLI subcommand |
.bak files deleted (24 files, untracked)
| File | Size |
|---|---|
recon.py.bak-pre-streamb |
48K |
recon.py.bak-pre-ux |
35K |
recon.py.bak-pre-crawler |
35K |
recon.py.bak.202602171647 |
33K |
config.yaml.bak-pre-crawler |
4K |
config.yaml.bak-pre-streamb |
13K |
lib/api.py.bak + 5 more api.py backups |
498K total |
lib/embedder.py.bak |
15K |
lib/enricher.py.bak |
17K |
lib/extractor.py.bak |
18K |
lib/status.py.bak-pre-ux |
10K |
lib/status.py.bak-pre-streamb |
13K |
scripts/validate.py.bak |
6K |
scripts/rebuild_qdrant.py.bak |
6K |
static/js/dashboard.js.bak |
11K |
static/js/peertube.js.bak.20260223 |
5K |
templates/search.html.bak |
2K |
templates/knowledge/dashboard.html.bak |
3K |
What Was Kept (and why)
| Module | Lines | Why kept |
|---|---|---|
lib/web_scraper.py |
324 | transcript_processor.py imports chunk_text() |
lib/new_pipeline.py |
1,637 | Active Stream B library management CLI (created Apr 13) |
lib/peertube_scraper.py |
580 | Only way to ingest PeerTube transcripts |
lib/extractor.py |
601 | Used by cmd_run CLI for batch PDF processing |
Verification
| Check | Result |
|---|---|
| Compile (recon.py) | OK |
| Import (recon module) | OK |
| Import (dispatcher, filing, processors) | OK |
| cmd_service assertions | extract worker absent, dispatch_loop present, filing_worker_loop present |
| Zero crawler references in .py files | Confirmed |
| Service restart | Clean, active |
| Thread count | 13 tasks (was 14 — extract removed) |
| Threads started | enrich, embed, dispatcher, filing, progress, dashboard, metrics |
| Extract thread | Absent (confirmed by logs: no [extract] Stage started) |
| Errors (60s window) | 0 |
| DB rows | catalogue=29,812, documents=29,812 (unchanged) |
| Dashboard | Responsive |
| Hopper | Empty |
Commit
- Commit:
efae402onrefactorbranch - Diff: 2 files changed, 3 insertions(+), 521 deletions(-)
- Pushed to:
forge.echo6.co/matt/recon(origin/refactor)