# Phase 1: Scaffolding

**Executed:** 2026-04-14T14:45Z

---

## Backups (taken before changes)

| Item | Location | MD5 Hash |
|------|----------|----------|
| recon.db | CT 130: `/tmp/recon.db.phase1.20260414.bak` | `69d94a2c21686871c8c6863903710e3f` |
| config.yaml | CT 130: `/tmp/config.yaml.phase1.20260414.bak` | `6d70ed572dfb2e704abca3850ae33797` |

The DB hash matches the Phase 0 backup, so no changes occurred between phases.

---

## What Changed

### 1. Filesystem: New directory tree

Created under `/opt/recon/data/`:

```
acquired/
  README.md
  pdf/.keep
  stream/.keep
  html/.keep
processing/
  README.md
```

All owned by `zvx:zvx`, matching the existing data directory.

### 2. Config: Three edits to `/opt/recon/config.yaml`

**a) `new_pipeline.enabled` set to `false`**

The Stream B library pipeline (watchdog-driven file intake from `_acquired/` and `_ingest/`) is now disabled. This prevents the old pipeline from processing files while we build the replacement.

**b) `crawler.sites` set to `[]`**

All 44 crawl target site definitions are commented out and preserved as historical reference. The crawler scheduler will find zero sites and do nothing if started.

**c) New `pipeline:` section added at end of file**

```yaml
pipeline:
  acquired_root: /opt/recon/data/acquired
  processing_root: /opt/recon/data/processing
  dispatch:
    pdf: pdf_processor
    stream: transcript_processor
    html: html_processor
  mtime_stability_seconds: 10
```

Scaffolding only: no code reads this section yet, and the processors it names do not exist.

**Config diff stats:** 284 lines removed, 302 lines added (the bulk is the 44 site definitions being commented out).

### 3. Schema: `text_dir` column added to `documents` table

```sql
ALTER TABLE documents ADD COLUMN text_dir TEXT;
```

All 29,812 existing rows have `text_dir = NULL`. This column will hold the path to each document's extracted text directory, replacing the convention-based `data/text/{hash}/` lookup.
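
Until rows are backfilled, a reader of `text_dir` would presumably need to fall back to the old convention. The sketch below shows one way that lookup could work. It is not existing code: it assumes `recon.db` is SQLite, that `documents` has a column named `hash` (a hypothetical name), and that the legacy path is rooted at `/opt/recon/data`. The function name and the sample `text_dir` value are illustrative only.

```python
import sqlite3
from pathlib import Path

# Assumption: the legacy convention data/text/{hash}/ is rooted here.
DATA_ROOT = Path("/opt/recon/data")


def resolve_text_dir(conn: sqlite3.Connection, doc_hash: str) -> Path:
    """Prefer documents.text_dir; fall back to the legacy hash-based path."""
    # Assumption: the documents table identifies rows by a `hash` column.
    row = conn.execute(
        "SELECT text_dir FROM documents WHERE hash = ?", (doc_hash,)
    ).fetchone()
    if row is None:
        raise KeyError(f"no document with hash {doc_hash!r}")
    text_dir = row[0]
    if text_dir:  # populated row: use the explicit path
        return Path(text_dir)
    # NULL (or empty) text_dir: use the convention-based location
    return DATA_ROOT / "text" / doc_hash


def demo_resolve() -> list[str]:
    """Exercise both branches against a throwaway in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE documents (hash TEXT PRIMARY KEY)")
    conn.execute("ALTER TABLE documents ADD COLUMN text_dir TEXT")
    conn.execute("INSERT INTO documents (hash) VALUES ('abc123')")
    conn.execute(
        "INSERT INTO documents (hash, text_dir) VALUES (?, ?)",
        ("def456", "/opt/recon/data/processing/def456/text"),  # hypothetical path
    )
    return [resolve_text_dir(conn, h).as_posix() for h in ("abc123", "def456")]
```

With every row currently `NULL`, all lookups would take the fallback branch, so behavior would stay identical to the convention-based lookup until a later phase writes real values.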

---

## What Did Not Change

- **No code modified:** `recon.py`, `lib/`, `scripts/`, templates, and static assets are all untouched
- **No data modified:** catalogue and documents row counts remain 29,812 each
- **No service state changed:** Both `recon.service` and `recon-watchdog.service` remain inactive (both are still `enabled` and will auto-start on reboot)
- **No Qdrant changes:** Collection `recon_knowledge_hybrid` untouched (2,320,695 points)
- **No file moves or deletions:** Existing `data/text/`, `data/concepts/`, and NFS mounts are all untouched

---

## Verification (post-change)

| Check | Result |
|-------|--------|
| recon.service | inactive |
| recon-watchdog.service | inactive |
| catalogue rows | 29,812 |
| documents rows | 29,812 |
| text_dir NULL count | 29,812 (all rows) |
| new_pipeline.enabled | `false` |
| crawler.sites | `[]` |
| pipeline.acquired_root | `/opt/recon/data/acquired` |
| New directories exist | all 5 confirmed, zvx:zvx |
| YAML validates | yes |
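
The `documents` checks in the table above could be repeated with a small script. This is a minimal sketch, assuming `recon.db` is a SQLite database; the function names are hypothetical and the demo runs against an in-memory stand-in rather than the real file.

```python
import sqlite3


def check_documents(conn: sqlite3.Connection, expected_rows: int) -> dict:
    """Re-run the documents-table checks and report each result."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = [row[1] for row in conn.execute("PRAGMA table_info(documents)")]
    has_column = "text_dir" in columns
    total = conn.execute("SELECT COUNT(*) FROM documents").fetchone()[0]
    null_count = None
    if has_column:  # only query the column if the ALTER TABLE was applied
        null_count = conn.execute(
            "SELECT COUNT(*) FROM documents WHERE text_dir IS NULL"
        ).fetchone()[0]
    return {
        "has_text_dir": has_column,
        "row_count_ok": total == expected_rows,
        "all_text_dir_null": has_column and null_count == total,
    }


def demo_check() -> dict:
    """Build a three-row stand-in table, apply the migration, and check it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO documents (id) VALUES (?)", [(1,), (2,), (3,)])
    conn.execute("ALTER TABLE documents ADD COLUMN text_dir TEXT")
    return check_documents(conn, expected_rows=3)
```

Against a connection to the live database, the same call with `expected_rows=29812` should report all three checks as true for the state recorded here.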