# Archive Receiver Discovery
# Generated: 2026-04-09 (Phase 6.0, Question 6)
#
# NOTE: Hookshot is BLOCKED for E2BE rooms with MAS. This analysis
# covers the receiver requirements IF hookshot were used, AND the
# alternative approaches that avoid hookshot entirely.

## Hookshot Receiver Requirements (if hookshot were viable)

### Minimum Functionality

1. Listen on HTTP port (plain HTTP on internal Docker network is fine, no TLS needed)
2. Accept multipart/form-data PUT or POST requests
3. Verify X-Matrix-Hookshot-Token header against per-room shared secret
4. Parse the `event` part as JSON
5. Write to durable storage
6. Return 200 OK (hookshot retries on non-2xx)

### Authentication

Hookshot sends a per-webhook auth token in the X-Matrix-Hookshot-Token header.
The receiver validates this token against a known list.

## Storage Format Comparison

| Format | Pros | Cons | Recommended For |
|--------|------|------|-----------------|
| JSONL files | Greppable, simple, no DB, easy backup | No query capability, no indexes, scattered across files | "Never look at it" archival |
| SQLite per room | Self-contained, portable, SQL queries | Multiple files to manage, concurrent write limits | Small-scale per-room analysis |
| Single SQLite | One file, SQL queries, simple backup | Write contention at scale, max ~10K writes/sec | Small-to-medium single-server |
| Postgres | Full SQL, concurrent writes, indexes, JSONB | Needs a running server, more ops overhead | Query-heavy, large-scale |

Given Matt's "I'll never look at the DB" feedback:

- **Primary: JSONL files**: append-only, one per day per room, greppable, zero ops
- **Secondary: Single SQLite**: for when he does need to query (and he will eventually)

Both can coexist. The receiver writes JSONL immediately, and a nightly job imports into SQLite.
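The JSONL-first, SQLite-later flow from the storage comparison can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the project's implementation: the function names `jsonl_path`, `append_event`, and `import_jsonl`, the directory layout, and the `events` table schema are all hypothetical, and each event is assumed to carry `event_id`, `room_id`, and `origin_server_ts` fields.

```python
import json
import sqlite3
from datetime import datetime, timezone
from pathlib import Path


def jsonl_path(root: Path, room_id: str, ts_ms: int) -> Path:
    """Return <root>/<room>/<YYYY-MM-DD>.jsonl (layout is an assumption)."""
    day = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc).strftime("%Y-%m-%d")
    # Room IDs contain '!' and ':'; map them to a filesystem-safe name.
    safe_room = "".join(c if c.isalnum() or c in "._-" else "_" for c in room_id)
    return root / safe_room / f"{day}.jsonl"


def append_event(root: Path, event: dict) -> Path:
    """Append one event as a single JSON line to its per-room, per-day file."""
    path = jsonl_path(root, event["room_id"], event["origin_server_ts"])
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event, sort_keys=True) + "\n")
    return path


def import_jsonl(root: Path, db_path: Path) -> int:
    """Nightly job: load all JSONL lines into SQLite and return the number of new rows.

    Re-running is safe because event_id is the primary key and duplicates are ignored.
    """
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS events ("
                "event_id TEXT PRIMARY KEY, room_id TEXT, ts INTEGER, json TEXT)"
            )
            before = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
            for path in sorted(root.rglob("*.jsonl")):
                with path.open(encoding="utf-8") as fh:
                    for line in fh:
                        line = line.strip()
                        if not line:
                            continue
                        ev = json.loads(line)
                        conn.execute(
                            "INSERT OR IGNORE INTO events VALUES (?, ?, ?, ?)",
                            (ev["event_id"], ev["room_id"], ev["origin_server_ts"], line),
                        )
            after = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
    finally:
        conn.close()
    return after - before
```

Keeping the raw JSON line in the `json` column means the SQLite file can always be rebuilt from the JSONL files, so the JSONL tree remains the source of truth.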
## Receiver Location Options

| Location | Pros | Cons |
|----------|------|------|
| Same Contabo host | Simplest networking, no cross-host latency | Adds load to already-busy server |
| Separate CT on Proxmox | Isolated, near /mnt/library storage | Cross-network traffic, more infrastructure |
| pi-nas (library host) | Direct /mnt/library access, no NFS | Pi is slow, limited CPU/RAM |

**Recommendation:** Separate CT on Proxmox (data node preferred, since it has 1TB NVMe + 1TB SATA).

- /mnt/library is NFS-mounted on data node CTs
- Lightweight Python service, minimal resources (512MB RAM, 1 core)
- Keeps archive processing off Contabo

## Alternative Approaches (No Hookshot)

### Approach A: Synapse-Level Only (Simplest)

No bot, no receiver. Just Synapse config changes:

```yaml
redaction_retention_period: null
experimental_features:
  msc2815_enabled: true
```

Data stays in Synapse's Postgres forever. Query via:

- Synapse admin API: `GET /_synapse/admin/v1/rooms/{room_id}/messages`
- Direct Postgres: `SELECT json FROM event_json WHERE room_id = '...'`

Export scripts run on Contabo and dump to /mnt/library via NFS or rsync.

Pros: Zero new infrastructure, zero ops burden, data already exists in DB

Cons: No real-time alerting, export is batch-only, tied to Synapse DB format

### Approach B: Custom matrix-nio Bot (Original Phase 6 Plan)

Python bot using matrix-nio with E2EE + MSC4190 support.

- Handles MAS login correctly (unlike hookshot)
- Decrypts E2BE rooms natively
- Writes to its own DB (independent of Synapse retention)
- Real-time capture with custom schema

Pros: Full control, real-time, independent archive, custom schema

Cons: More code to write and maintain, another service to monitor

### Approach C: Hybrid (Recommended)

Combine Approach A + lightweight export:

1. Enable `redaction_retention_period: null` + `msc2815_enabled: true`
   → Synapse retains everything, MSC2815 provides moderator access
2. Build a simple export script (NOT a bot, NOT a service):
   - Runs nightly via cron
   - Queries Synapse admin API for room events
   - Writes JSONL + markdown exports to /mnt/library
   - No E2EE handling needed: queries the server-side decrypted content
3. No new services, no bot accounts, no device verification

This avoids the hookshot E2EE+MAS blocker entirely AND avoids the complexity of a custom matrix-nio bot. The Synapse admin API already has the data.

## CT Number for Receiver/Export Service (if needed)

- Current CT assignments on data node: CT 130 (RECON)
- Free CTs on data node: 131-149
- If a dedicated CT is needed: CT 131 (next available on data node)

But with Approach C (hybrid), no dedicated CT is needed: the export script runs on Contabo via cron alongside the existing backup job.
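The nightly export script from Approach C could look roughly like the sketch below. It is illustrative only: `make_admin_fetcher` and `export_room` are hypothetical names, and the query parameters (`dir`, `limit`, `from`) and response keys (`chunk`, `end`) are assumed to mirror the client-server `/messages` endpoint, which should be checked against the Synapse admin API documentation before use. Events are written exactly as the API returns them, and the markdown export is not covered here.

```python
import json
import urllib.parse
import urllib.request
from pathlib import Path
from typing import Callable, Optional

# A fetcher takes (room_id, from_token) and returns one decoded JSON page.
FetchPage = Callable[[str, Optional[str]], dict]


def make_admin_fetcher(base_url: str, access_token: str, limit: int = 100) -> FetchPage:
    """Build a fetcher for the admin messages endpoint (parameters are assumptions)."""

    def fetch_page(room_id: str, from_token: Optional[str]) -> dict:
        params = {"dir": "f", "limit": str(limit)}
        if from_token:
            params["from"] = from_token
        url = (
            f"{base_url}/_synapse/admin/v1/rooms/"
            f"{urllib.parse.quote(room_id, safe='')}/messages?"
            f"{urllib.parse.urlencode(params)}"
        )
        req = urllib.request.Request(
            url, headers={"Authorization": f"Bearer {access_token}"}
        )
        with urllib.request.urlopen(req, timeout=30) as resp:
            return json.load(resp)

    return fetch_page


def export_room(
    fetch_page: FetchPage,
    room_id: str,
    out_path: Path,
    from_token: Optional[str] = None,
    max_pages: int = 1000,
) -> Optional[str]:
    """Append a room's events to out_path as JSONL.

    Returns the last pagination token seen, so the next cron run can resume from it.
    """
    out_path.parent.mkdir(parents=True, exist_ok=True)
    token = from_token
    with out_path.open("a", encoding="utf-8") as fh:
        for _ in range(max_pages):
            page = fetch_page(room_id, token)
            chunk = page.get("chunk", [])
            for event in chunk:
                fh.write(json.dumps(event, sort_keys=True) + "\n")
            next_token = page.get("end")
            # Stop on an empty page, a missing token, or a token that did not advance.
            if not chunk or not next_token or next_token == token:
                break
            token = next_token
    return token
```

Passing the fetcher in as an argument keeps the pagination loop testable without a running Synapse. A cron wrapper would persist the returned token per room (for example in a small state file) and pass it back as `from_token` on the next run.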