refactored-recon/phases/phase-6e-skill-and-dashboard-cleanup.md
Matt 5b0d4eed90 Phase 6e doc: add 6e-2 revert note and 6e-3 destination fix
Document the api.py revert (6e-2) and the shadowlib download
destination fix (6e-3) that redirects all three sources from
/mnt/library/Acquired/[SUBDIR]/ to the new dispatcher hopper
at /opt/recon/data/acquired/pdf/.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-15 03:42:40 +00:00


Phase 6e: ShadowLib Skill Cleanup + Dashboard PeerTube Endpoint

Date: 2026-04-15
Status: Partial (6e-1 and 6e-3 completed, 6e-2 reverted)

What Changed

Section 1: ShadowLib Skill Scribd Removal (6e-1)

Removed all Scribd references from /home/zvx/.claude/skills/shadowlib.md:

  • Deleted Step 0 (Scribd session setup)
  • Deleted Step 4 (Scribd download procedure, ~75 lines)
  • Removed SCRIBD_USER, SCRIBD_PASS from credentials section
  • Removed scribd_session.json references
  • Renumbered steps: old Step 5 → Step 4 (Confirm Transfer), old Step 6 → Step 5 (Report)
  • Updated waterfall: Anna's Archive is now the final fallback (IA → IA CDL → AA)
  • Replaced every occurrence of "all three sources" with "both sources"

Line count: 338 → 245 (93 lines removed)
Backup: /tmp/shadowlib.md.bak.20260415
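The updated waterfall order (IA → IA CDL → AA) can be pictured as a simple first-hit fallback chain. The following is a minimal sketch only: the skill itself is an instruction file, not Python, and `run_waterfall`, the `Fetcher` type, and the three stub fetchers are hypothetical names introduced here for illustration.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical fetcher type: takes (title, author), returns a local path or None.
Fetcher = Callable[[str, str], Optional[str]]


def run_waterfall(
    title: str,
    author: str,
    sources: Sequence[Tuple[str, Fetcher]],
) -> Optional[Tuple[str, str]]:
    """Try each source in order; return (source_name, path) for the first hit."""
    for name, fetch in sources:
        path = fetch(title, author)
        if path is not None:
            return name, path
    return None  # every source missed, including the final fallback


# Stub fetchers standing in for the real download steps (assumptions only).
def ia_free(title: str, author: str) -> Optional[str]:
    return None  # pretend the IA free copy is unavailable


def ia_cdl(title: str, author: str) -> Optional[str]:
    return None  # pretend the IA CDL copy is unavailable


def annas_archive(title: str, author: str) -> Optional[str]:
    return f"/tmp/{title}.pdf"  # pretend the final fallback succeeds


# Order after the Scribd removal: Anna's Archive is the last entry.
WATERFALL = [("IA", ia_free), ("IA CDL", ia_cdl), ("AA", annas_archive)]
```

With these stubs, `run_waterfall("Example", "Author", WATERFALL)` returns `("AA", "/tmp/Example.pdf")`, and a chain whose sources all miss returns `None`.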

Section 2: Dashboard PeerTube Endpoint (6e-2, REVERTED)

Rewired POST /api/ingest-peertube in lib/api.py (line 630) to use the Phase 6d acquisition module instead of legacy ingest_channel/ingest_all.

Commit: 7e42528 (applied)
Revert: 7fe7d03 (reverted about 7 minutes later)

Reason for revert: The change rested on an unverified assumption about a dashboard endpoint that is not part of the user's actual workflow. The endpoint was live for ~7 minutes. In that time it triggered two acquisition scans that never completed, wrote zero files, and created zero DB records. The change was reverted as a precaution.

Note: Before any future change to the POST /api/ingest-peertube contract, first establish what the endpoint actually serves (dashboard UI? external caller?).

Section 3: ShadowLib Download Destination Fix (6e-3)

Problem: ShadowLib was writing downloaded PDFs to /mnt/library/Acquired/[SUBDIR]/, which is the legacy library path. The new dispatcher doesn't watch this location — it watches /opt/recon/data/acquired/pdf/. Downloads were silently going to a location the new pipeline ignored.
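The mismatch described above can be checked mechanically. The sketch below is illustrative only: it uses the two paths named in this document, while the helper names `lands_in_hopper` and `is_legacy_destination` are hypothetical, and it assumes the dispatcher only sees files placed directly in the hopper directory (not in subdirectories).

```python
from pathlib import PurePosixPath

# Paths taken from this document.
HOPPER = PurePosixPath("/opt/recon/data/acquired/pdf")  # watched by the dispatcher
LEGACY = PurePosixPath("/mnt/library/Acquired")         # legacy library, not watched


def lands_in_hopper(destination: str) -> bool:
    """True if a file written to `destination` sits directly in the hopper.

    Assumption: the dispatcher does not recurse into subdirectories.
    """
    return PurePosixPath(destination).parent == HOPPER


def is_legacy_destination(destination: str) -> bool:
    """True if `destination` is anywhere under the legacy library path."""
    return LEGACY in PurePosixPath(destination).parents
```

For example, `/mnt/library/Acquired/Medical/book.pdf` is flagged as a legacy destination, while `/opt/recon/data/acquired/pdf/book.pdf` lands in the hopper.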

Fix: Changed the destination in all three download steps to /opt/recon/data/acquired/pdf/:

  • Step 2 (IA free): scp destination updated
  • Step 2B (IA CDL): ia_capture.py --output updated
  • Step 3B (Anna's Archive): scp destination updated

Removed the [SUBDIR] subdirectory-picking step entirely. The pdf_processor handles domain classification automatically via Gemini concept enrichment, so users no longer need to pre-classify books into categories like "Ham-Radio" or "Medical" at acquisition time.

Other changes:

  • Step 1 simplified to just title + author (no folder selection)
  • "RECON picks up on its hourly scan" replaced with dispatcher language
  • Step 4 (Confirm Transfer) verify path updated
  • Step 5 (Report) path and language updated

Line count: 245 → 244 (1 line removed: the SUBDIR bullet)
Backup: /tmp/shadowlib.md.bak.6e3.20260415
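Both skill edits in this phase left a dated backup in /tmp before the file was changed. A minimal sketch of that naming convention follows. How the backups were actually created is not recorded here, so `backup_name` and `backup_file` are hypothetical helpers.

```python
import shutil
from datetime import date
from pathlib import Path
from typing import Optional


def backup_name(source: Path, day: date, tag: Optional[str] = None) -> str:
    """Build a name like shadowlib.md.bak.20260415 or shadowlib.md.bak.6e3.20260415."""
    parts = [source.name, "bak"]
    if tag:
        parts.append(tag)  # optional sub-phase tag, e.g. "6e3"
    parts.append(day.strftime("%Y%m%d"))
    return ".".join(parts)


def backup_file(
    source: Path, backup_dir: Path, day: date, tag: Optional[str] = None
) -> Path:
    """Copy `source` into `backup_dir` under the dated name and return the new path."""
    target = backup_dir / backup_name(source, day, tag)
    shutil.copy2(source, target)  # copy2 also preserves timestamps where possible
    return target
```

Called with the skill file, `date(2026, 4, 15)`, and the tag `"6e3"`, `backup_name` yields `shadowlib.md.bak.6e3.20260415`, matching the backup recorded above.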

Behavior change: Users no longer pick a destination subdirectory at the start of a shadowlib session. The skill is now simpler — identify the book, download via the waterfall, drop in the hopper, done.

This change is instruction-only: no Python code changed and no service restart is needed. It takes effect the next time the shadowlib skill is invoked.

Verification

6e-1 (Scribd removal)

  • grep -i scribd → 0 matches
  • Step numbering consistent (1→2→2B→3→4→5)
  • Line count: 338 → 245

6e-2 (api.py) — reverted, no verification needed

6e-3 (destination fix)

  • grep SUBDIR → 0 matches
  • grep /mnt/library/Acquired → 0 matches
  • grep "hourly scan" → 0 matches
  • grep acquired/pdf → 7 matches (preamble + 3 sources + 2 verify + 1 report)
  • Full read-through: flow intact, no orphaned references
  • Line count: 245 → 244
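The grep checks above could be re-run as a single script if the skill file is edited again. The sketch below is one possible approach, not the procedure that was used. It assumes each "match" means one matching line, as plain grep reports them, and `count_matching_lines` and `verify_skill_text` are hypothetical helpers.

```python
from typing import Dict

# Expected counts copied from the 6e-3 checklist above.
EXPECTED_COUNTS: Dict[str, int] = {
    "SUBDIR": 0,
    "/mnt/library/Acquired": 0,
    "hourly scan": 0,
    "acquired/pdf": 7,
}


def count_matching_lines(text: str, pattern: str, ignore_case: bool = False) -> int:
    """Count lines containing `pattern` as a fixed string (like `grep -c -F`)."""
    if ignore_case:
        pattern = pattern.lower()
        return sum(pattern in line.lower() for line in text.splitlines())
    return sum(pattern in line for line in text.splitlines())


def verify_skill_text(text: str) -> Dict[str, bool]:
    """Return a pass/fail flag per check for the given skill file contents."""
    results = {
        pattern: count_matching_lines(text, pattern) == expected
        for pattern, expected in EXPECTED_COUNTS.items()
    }
    # 6e-1 check: case-insensitive, mirroring `grep -i scribd`.
    results["scribd (case-insensitive)"] = (
        count_matching_lines(text, "scribd", ignore_case=True) == 0
    )
    return results
```

Passing the contents of the skill file to `verify_skill_text` returns one boolean per check, so `all(results.values())` gives a single pass/fail answer.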