echo6-docs/archivist.ref
Matt Johnson e9231ac24a Migration: consolidate Echo6 docs to cortex with full infrastructure cleanup sync
2026-04-13 06:02:16 +00:00

# Signal Archive Bot — Deployment Reference
# Created: 2026-04-12 (Phase 3)
# Status: Phase 5 COMPLETE — bot deployed, transcripts writing, sync token dedup verified
## CT 118 — archivist
| Setting | Value |
|---------|-------|
| VMID | 118 |
| Hostname | archivist |
| Host node | utility (192.168.1.241) |
| IP | 192.168.1.118/24 |
| Gateway | 192.168.1.1 |
| OS | Debian 12 (bookworm) |
| Disk | 8 GB (local-lvm:vm-118-disk-0) |
| RAM | 1024 MB |
| Swap | 512 MB |
| Cores | 1 |
| Unprivileged | Yes |
| Features | keyctl=1 |
| Onboot | Yes |
| MAC | BC:24:11:74:E9:DC |
## NFS Mount
### In-container NFS mount: FAILED
- Attempt 1 (no mount=nfs feature): `access denied by server`
- Attempt 2 (with mount=nfs feature): `Operation not permitted`
- Root cause: unprivileged LXC containers cannot mount NFS directly
### Final approach: host-side NFS + Proxmox bind mount
- **Utility host fstab:** `192.168.1.245:/export/library /mnt/library nfs defaults,soft,timeo=150 0 0`
- **CT 118 mp0:** `/mnt/library,mp=/mnt/library,ro=0`
- **Pattern source:** CT 130 (RECON) on data node uses identical approach
- Utility host did NOT have /mnt/library mounted before this deployment
### Write access
- `/mnt/library/` permissions: 2777 (drwxrwsrwx), owner nobody:nogroup
- `/mnt/library/signal-archive/` permissions: 777 (drwxrwxrwx), created by CT root
- Archivist user UID 999 (container) maps to UID 100999 (host/NFS)
- Write access verified: archivist can create dirs, write files, delete within signal-archive/
- No NFS export changes needed — world-writable parent dir permits all UIDs
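
The write check listed above (create a directory, write a file, delete both) could be repeated with a sketch like the following. `probe_write_access` and the probe file names are hypothetical, not part of the deployment; it would be run as the archivist user against `/mnt/library/signal-archive`.

```python
import os
import tempfile


def probe_write_access(base_dir):
    """Create a directory and a file under base_dir, then remove both.

    Returns True when every step succeeds, False on any OSError
    (missing directory, permission denied, stale NFS handle, ...).
    """
    try:
        probe_dir = tempfile.mkdtemp(prefix=".probe-", dir=base_dir)
        probe_file = os.path.join(probe_dir, "write-test")
        with open(probe_file, "w", encoding="utf-8") as fh:
            fh.write("ok\n")
        os.remove(probe_file)
        os.rmdir(probe_dir)
        return True
    except OSError:
        return False
```

Calling `probe_write_access("/mnt/library/signal-archive")` inside CT 118 should return `True` if the bind mount and permissions are as described.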
## Service User
| Setting | Value |
|---------|-------|
| Username | archivist |
| UID | 999 |
| GID | 996 |
| Shell | /usr/sbin/nologin |
| Home | /opt/archivist |
| Host-mapped UID | 100999 |
| Host-mapped GID | 100996 |
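
The host-mapped values in the table follow from a fixed offset. A minimal sketch of the arithmetic, assuming CT 118 uses the default Proxmox unprivileged idmap (container IDs 0..65535 shifted by 100000); `host_id` is a hypothetical helper name:

```python
# Assumption: default unprivileged idmap, i.e. container IDs 0..65535
# map to host IDs 100000..165535. A custom lxc.idmap would change this.
ID_OFFSET = 100_000
ID_RANGE = 65_536


def host_id(container_id, offset=ID_OFFSET, id_range=ID_RANGE):
    """Translate a container UID or GID to the ID seen on the host and NFS."""
    if not 0 <= container_id < id_range:
        raise ValueError(f"ID {container_id} is outside the mapped range")
    return offset + container_id
```

With these defaults, `host_id(999)` gives 100999 and `host_id(996)` gives 100996, matching the table.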
## Directory Layout
```
/opt/archivist/                  # Home dir (owned by archivist:archivist)
  archivist.py                   # Main bot script (Phase 5)
  .env                           # Environment variables (Phase 5)
  store/                         # E2EE key store + sync token (SqliteStore)
  logs/                          # Bot logs (archivist.log)
  venv/                          # Python virtual environment
/mnt/library/signal-archive/     # Transcript output (NFS bind mount)
  <room-slug>/
    transcript.log               # Append-only human-readable transcript
    media/                       # Downloaded media files (images, audio, video, etc.)
```
## Python Environment
| Component | Version |
|-----------|---------|
| Python | 3.11.2 |
| pip | 26.0.1 |
| matrix-nio | 0.25.2 (with e2e extras) |
| python-olm | 3.2.16 |
| libolm-dev | 3.2.13~dfsg-1 |
| aiohttp | 3.13.5 |
Venv path: `/opt/archivist/venv/`
### Verified imports
- `nio.AsyncClient` — Matrix client
- `nio.crypto.OlmDevice` — E2EE device management
- `olm.Account` — libolm C binding
- `nio.store.SqliteStore` — crypto key persistence
## System Packages
Installed via apt:
- python3, python3-venv, python3-pip, python3-dev
- libolm-dev
- gcc, g++, make
- nfs-common
- curl, ca-certificates
## Access
```bash
# From utility host
pct exec 118 -- bash
# Direct SSH (not configured yet — no SSH keys installed)
# ssh root@192.168.1.118
```
## Matrix User — @archivist:echo6.co
| Setting | Value |
|---------|-------|
| User ID | @archivist:echo6.co |
| Display name | Archivist Bot |
| Created via | mas-cli (MAS user existed from 2026-04-10, password set via `manage set-password`) |
| Password | `<REDACTED — see credentials file: MATRIX_ARCHIVIST_BOT_PASSWORD>` |
| Device ID | ARCHIVIST |
| Access token | Stable compat token via `mas-cli manage issue-compatibility-token archivist ARCHIVIST` |
| Admin | No (not needed — uses room invitation, not admin force-join) |
### E2EE / Cross-Signing
| Component | Status |
|-----------|--------|
| Device keys (curve25519 + ed25519) | Uploaded |
| One-time keys (signed_curve25519) | 50 uploaded |
| Master key | Published (nKFt5nA+TvUo0AvY1gsk1QL8OK1t9z/ChON30Kdvlek) |
| Self-signing key | Published (9gOB+AHgyBzLP5/xerYon04NLZuIh+o5OHAybmetK2A) |
| User-signing key | Published (1EYngPiwpjOy2aQWY02g5SZaFLM5kZgsHLhFXgwHh0Q) |
| Device self-signed | Yes (ARCHIVIST signed by self-signing key) |
| Cross-signing seeds | /opt/archivist/store/cross_signing_seeds.json (chmod 600) |
| nio store | /opt/archivist/store/@archivist:echo6.co_ARCHIVIST.db |
### Key Sharing — How It Works
- Bridge key sharing policy: `cross-signed-tofu`
- ARCHIVIST device is cross-signed → bridge shares Megolm session keys automatically
- **Interactive verification (SAS emoji) is NOT required** — cross-signing alone is sufficient
- Old messages (before archivist joined) remain undecryptable (Megolm keys not retroactively shared)
- New messages are decryptable immediately
### E2BE Decryption Test — PASSED
- Date: 2026-04-12 14:35 UTC
- Room: COMMS LP group (!XUeWZuPdWQQnUYLJBJ:echo6.co)
- Message: "You'll know that radio has come of age when the median cellphone incorporates a LoRa radio stack."
- Sender: @signal_cdf98bca-c4b7-4fda-8ceb-03db5eb4e7e2:echo6.co (Signal puppet via bridge)
- Result: Successfully decrypted by ARCHIVIST device
### User Creation Notes
- Shared-secret registration (`/_synapse/admin/v1/register`) returns 404 under MAS — endpoint disabled
- Use `mas-cli manage register-user` for new users, or `mas-cli manage set-password` for users that already exist in MAS
- MAS creates user in both MAS DB and Synapse DB
- Orphaned Synapse `profiles` row caused provisioning failure; fixed by deleting the row (SQL `DELETE`)
- Each `client.login()` creates a NEW MAS compat session with random device ID — use `restore_login()` with stable compat token instead
- matrix-nio v0.25.2 does NOT implement `bootstrap_cross_signing()` — manual implementation required via python-olm PkSigning + raw HTTP API
## Joined Rooms
| Room | Room ID | Type | Archive Slug |
|------|---------|------|-------------|
| COMMS LP group | !XUeWZuPdWQQnUYLJBJ:echo6.co | Bridged Signal group | comms-lp-group |
| DM with Matt | !wgbnqhnYKTHzzJMjDu:echo6.co | Direct message | — |
| Liberal_Preppers_OG | !RvWNPmcKtPImhKPYcA:echo6.co | Bridged Signal group | liberal-preppers-og |
| (3 additional rooms) | !vBXtbgfYcptEuimrmn, !SnGDZgBtYOQuTWeYXp, !aQWFQMrzbkwjyjCPte | Bridged Signal groups | (initialized on first message) |
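
The two slugs in the table are consistent with a simple lowercase-and-hyphenate rule. A minimal sketch, assuming that rule; the bot's actual slug function is not reproduced here, and `room_slug` is a hypothetical name:

```python
import re


def room_slug(display_name):
    """Lowercase the room name and collapse non-alphanumeric runs to '-'."""
    slug = re.sub(r"[^a-z0-9]+", "-", display_name.lower())
    return slug.strip("-")
```

Under this assumption, `room_slug("COMMS LP group")` yields `comms-lp-group` and `room_slug("Liberal_Preppers_OG")` yields `liberal-preppers-og`.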
## Scripts on CT 118
| Script | Purpose | Status |
|--------|---------|--------|
| /opt/archivist/archivist.py | Main bot — transcript writer | **Running in tmux** |
| /opt/archivist/login_once.py | One-shot login + key upload | Completed (superseded) |
| /opt/archivist/bootstrap_crosssigning.py | Cross-signing key bootstrap | Completed (one-time) |
| /opt/archivist/setup_and_verify.py | Device setup + verification listener | Completed |
| /opt/archivist/test_decrypt.py | E2BE decryption test listener | Completed (superseded by archivist.py) |
## Bot Architecture (Phase 5)
### archivist.py — Event-Driven Transcript Bot (~260 lines)
**Core design:** Single-file async Python bot using matrix-nio `sync_forever` with `ClientConfig(store_sync_tokens=True)` for restart deduplication.
**Callbacks:**
- `on_invite` (InviteMemberEvent) → auto-join
- `on_text` (RoomMessageText) → write transcript line, detect edits via `m.replace`
- `on_image/audio/video/file` (RoomMessage* + RoomEncrypted*) → download + decrypt + save to media/
- `on_sticker` (StickerEvent) → same as media
- `on_redaction` (RedactionEvent) → log deletion with original content if cached
- `on_megolm` (MegolmEvent) → log decryption failure (counter per room)
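
Edit detection via `m.replace` comes down to inspecting the event content. A minimal sketch, assuming the raw content dict is available (for example from `event.source["content"]` in matrix-nio); `edit_target` is a hypothetical helper, not a function taken from archivist.py:

```python
def edit_target(content):
    """Return (original_event_id, new_body) for an m.replace edit, else None.

    `content` is the `content` dict of an m.room.message event.
    """
    relates = content.get("m.relates_to") or {}
    if relates.get("rel_type") != "m.replace":
        return None
    new_content = content.get("m.new_content") or {}
    # Fall back to the top-level body if m.new_content is missing.
    new_body = new_content.get("body", content.get("body", ""))
    return relates.get("event_id"), new_body
```

The returned event ID is what a cache lookup would use to recover the original text for the `(was: ...)` line.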
**In-memory caches (NOT persisted):**
- `room_slugs: dict[str, str]` — room_id → slug (rebuilt from transcript headers on startup)
- `event_cache: dict[str, dict]` — event_id → {body, sender, ts} (for edit/redact tracking)
- `name_cache: dict[str, str]` — mxid → display name (Signal ghosts get "(Signal)" suffix)
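
Rebuilding `room_slugs` from transcript headers on startup might look like the sketch below. It assumes each `<room-slug>/transcript.log` begins with the `# Room ID:` header shown under Transcript Format; `rebuild_room_slugs` is a hypothetical name:

```python
from pathlib import Path

ROOM_ID_PREFIX = "# Room ID: "


def rebuild_room_slugs(archive_root):
    """Map room_id -> slug by reading the header block of each transcript."""
    room_slugs = {}
    for transcript in sorted(Path(archive_root).glob("*/transcript.log")):
        with transcript.open(encoding="utf-8") as fh:
            for line in fh:
                if not line.startswith("#"):
                    break  # header block is over, no Room ID found
                if line.startswith(ROOM_ID_PREFIX):
                    room_id = line[len(ROOM_ID_PREFIX):].strip()
                    room_slugs[room_id] = transcript.parent.name
                    break
    return room_slugs
```

Only the header lines are read, so the cost stays small even for long transcripts.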
**Sync token persistence:** matrix-nio SqliteStore saves and loads the sync token automatically when `store_sync_tokens=True`. On restart, the client resumes from `loaded_sync_token` (the last saved position), so already-processed events are not replayed.
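
A rough sketch of how a client with sync token persistence and a fixed session could be constructed. The environment variable names are hypothetical (the real `.env` keys are not documented here), `AsyncClientConfig` is used as the `AsyncClient` counterpart of `ClientConfig`, and the store class is left at the library default rather than set explicitly:

```python
def session_settings(env):
    """Collect the three values restore_login() takes from an env-style mapping.

    The key names below are assumptions made for illustration.
    """
    return {
        "user_id": env["ARCHIVIST_USER_ID"],
        "device_id": env["ARCHIVIST_DEVICE_ID"],
        "access_token": env["ARCHIVIST_ACCESS_TOKEN"],
    }


def build_client(env, store_path="/opt/archivist/store"):
    """Create an AsyncClient that stores sync tokens and reuses one session."""
    # Imported lazily so session_settings() works without matrix-nio installed.
    from nio import AsyncClient, AsyncClientConfig

    settings = session_settings(env)
    client = AsyncClient(
        env["ARCHIVIST_HOMESERVER"],  # hypothetical key, e.g. the homeserver URL
        settings["user_id"],
        device_id=settings["device_id"],
        store_path=store_path,
        config=AsyncClientConfig(store_sync_tokens=True),
    )
    # Reuse the stable compat token instead of calling login(),
    # which would create a new session with a random device ID.
    client.restore_login(**settings)
    return client
```

A subsequent `await client.sync_forever(timeout=30000)` would then be expected to continue from the stored token rather than from the start of the timeline.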
**Encrypted media handling:** RoomEncryptedImage/Audio/Video/File carry `key`, `hashes`, `iv` attributes. Bot downloads ciphertext via `client.download(mxc=url)`, then decrypts with `nio.crypto.decrypt_attachment()`.
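
The download-then-decrypt step could be sketched as follows. This assumes the event exposes `url`, `key`, `hashes`, and `iv` as described above, with the AES key under `key["k"]` and the digest under `hashes["sha256"]`; `attachment_params` and `fetch_and_decrypt` are hypothetical helper names:

```python
def attachment_params(event):
    """Pull the decryption inputs off an encrypted media event."""
    return event.key["k"], event.hashes["sha256"], event.iv


async def fetch_and_decrypt(client, event):
    """Download the ciphertext for an encrypted media event and decrypt it."""
    # Imported lazily so attachment_params() works without matrix-nio installed.
    from nio import DownloadError
    from nio.crypto import decrypt_attachment

    resp = await client.download(mxc=event.url)
    if isinstance(resp, DownloadError):
        raise RuntimeError(f"download failed: {resp}")
    key, sha256, iv = attachment_params(event)
    # decrypt_attachment checks the SHA-256 digest before decrypting.
    return decrypt_attachment(resp.body, key, sha256, iv)
```

The returned bytes are the plaintext file, ready to be written under the room's `media/` directory.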
### Transcript Format
```
# Transcript: Room Display Name
# Room ID: !xxxxx:echo6.co
# Archive started: 2026-04-12 20:47:01 UTC
# ---
[2026-04-12 18:55:34 UTC] Sender Name (Signal): Message text
[2026-04-12 18:56:00 UTC] Sender Name (Signal): [EDITED] New text
(was: Original text)
[2026-04-12 18:57:00 UTC] Sender Name (Signal): [DELETED] (was: Original text)
[2026-04-12 18:58:00 UTC] Sender Name (Signal): [image: media/1234567890_filename.jpg] caption
```
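
A minimal sketch of formatters that produce lines in this shape. It assumes timestamps arrive as milliseconds since the epoch (as Matrix `origin_server_ts` does); the function names, and the output used when the original text is not cached, are assumptions rather than the bot's actual behaviour:

```python
from datetime import datetime, timezone


def _stamp(ts_ms):
    """Format a millisecond epoch timestamp as 'YYYY-MM-DD HH:MM:SS UTC'."""
    moment = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")


def message_line(ts_ms, sender, body):
    return f"[{_stamp(ts_ms)}] {sender}: {body}"


def edited_lines(ts_ms, sender, new_body, original=None):
    """Edit entry; adds the '(was: ...)' line only if the original is known."""
    first = f"[{_stamp(ts_ms)}] {sender}: [EDITED] {new_body}"
    return first if original is None else f"{first}\n(was: {original})"


def deleted_line(ts_ms, sender, original=None):
    """Deletion entry; the uncached form (no 'was') is an assumption."""
    suffix = "" if original is None else f" (was: {original})"
    return f"[{_stamp(ts_ms)}] {sender}: [DELETED]{suffix}"
```

Each function returns text without a trailing newline, so the caller decides how lines are appended to `transcript.log`.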
### Running the Bot
```bash
# Start (tmux, as archivist user)
pct exec 118 -- su -s /bin/bash archivist -c "tmux new-session -d -s archivist /opt/archivist/venv/bin/python3 -u /opt/archivist/archivist.py"
# Check logs
pct exec 118 -- tail -f /opt/archivist/logs/archivist.log
# Stop
pct exec 118 -- su -s /bin/bash archivist -c "tmux send-keys -t archivist C-c"
```
## What's NOT done yet (Phase 6+)
- No systemd service (running in tmux)
- No Tailscale registration
- No SSH key auth configured
- Bot needs invitations to additional bridged rooms as they appear