# Signal Archive Bot — Deployment Reference
# Created: 2026-04-12 (Phase 3)
# Status: Phase 5 COMPLETE — bot deployed, transcripts writing, sync token dedup verified

## CT 118 — archivist

| Setting | Value |
|---------|-------|
| VMID | 118 |
| Hostname | archivist |
| Host node | utility (192.168.1.241) |
| IP | 192.168.1.118/24 |
| Gateway | 192.168.1.1 |
| OS | Debian 12 (bookworm) |
| Disk | 8GB (local-lvm:vm-118-disk-0) |
| RAM | 1024 MB |
| Swap | 512 MB |
| Cores | 1 |
| Unprivileged | Yes |
| Features | keyctl=1 |
| Onboot | Yes |
| MAC | BC:24:11:74:E9:DC |

## NFS Mount

### In-container NFS mount: FAILED
- Attempt 1 (no mount=nfs feature): `access denied by server`
- Attempt 2 (with mount=nfs feature): `Operation not permitted`
- Root cause: unprivileged LXC containers cannot mount NFS directly

### Final approach: host-side NFS + Proxmox bind mount
- **Utility host fstab:** `192.168.1.245:/export/library /mnt/library nfs defaults,soft,timeo=150 0 0`
- **CT 118 mp0:** `/mnt/library,mp=/mnt/library,ro=0`
- **Pattern source:** CT 130 (RECON) on data node uses identical approach
- Utility host did NOT have /mnt/library mounted before this deployment

### Write access
- `/mnt/library/` permissions: 2777 (drwxrwsrwx), owner nobody:nogroup
- `/mnt/library/signal-archive/` permissions: 777 (drwxrwxrwx), created by CT root
- Archivist user UID 999 (container) maps to UID 100999 (host/NFS)
- Write access verified: archivist can create dirs, write files, delete within signal-archive/
- No NFS export changes needed — world-writable parent dir permits all UIDs

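The write check described above is worth repeating after any change to the mount or its permissions. A minimal sketch, assuming it runs as the archivist user inside CT 118; `probe_write_access` is a hypothetical helper name and is not part of the deployed bot:

```python
import tempfile
from pathlib import Path


def probe_write_access(base: Path) -> bool:
    """Create a directory, write a file, then delete both under `base`.

    Returns False if any step fails (missing path, permission denied, ...).
    """
    try:
        probe_dir = Path(tempfile.mkdtemp(prefix=".write-probe-", dir=base))
        probe_file = probe_dir / "probe.txt"
        probe_file.write_text("ok\n", encoding="utf-8")
        probe_file.unlink()
        probe_dir.rmdir()
        return True
    except OSError:
        return False
```

Calling `probe_write_access(Path("/mnt/library/signal-archive"))` exercises the same three operations listed above (create dir, write file, delete) and leaves nothing behind on success.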

## Service User

| Setting | Value |
|---------|-------|
| Username | archivist |
| UID | 999 |
| GID | 996 |
| Shell | /usr/sbin/nologin |
| Home | /opt/archivist |
| Host-mapped UID | 100999 |
| Host-mapped GID | 100996 |

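The host-mapped values in the table follow from the ID shift applied to unprivileged containers. A small sketch of that arithmetic, assuming CT 118 uses the default Proxmox idmap (container IDs 0 to 65535 shifted up by 100000); `to_host_id` is an illustrative name:

```python
# Assumption: default unprivileged idmap, container IDs 0-65535 -> host 100000-165535.
ID_OFFSET = 100_000
ID_RANGE = 65_536


def to_host_id(container_id: int) -> int:
    """Translate a container UID/GID to the ID the host (and NFS server) sees."""
    if not 0 <= container_id < ID_RANGE:
        raise ValueError(f"ID {container_id} is outside the mapped range")
    return container_id + ID_OFFSET
```

With this mapping, `to_host_id(999)` gives the UID that owns archivist's files on the NFS export, which is the value to look for when inspecting ownership from the utility host.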

## Directory Layout

```
/opt/archivist/                  # Home dir (owned by archivist:archivist)
  archivist.py                   # Main bot script (Phase 5)
  .env                           # Environment variables (Phase 5)
  store/                         # E2EE key store + sync token (SqliteStore)
  logs/                          # Bot logs (archivist.log)
  venv/                          # Python virtual environment

/mnt/library/signal-archive/     # Transcript output (NFS bind mount)
  <room-slug>/
    transcript.log               # Append-only human-readable transcript
    media/                       # Downloaded media files (images, audio, video, etc.)
```

## Python Environment

| Component | Version |
|-----------|---------|
| Python | 3.11.2 |
| pip | 26.0.1 |
| matrix-nio | 0.25.2 (with e2e extras) |
| python-olm | 3.2.16 |
| libolm-dev | 3.2.13~dfsg-1 |
| aiohttp | 3.13.5 |

Venv path: `/opt/archivist/venv/`

### Verified imports
- `nio.AsyncClient` — Matrix client
- `nio.crypto.OlmDevice` — E2EE device management
- `olm.Account` — libolm C binding
- `nio.store.SqliteStore` — crypto key persistence

## System Packages

Installed via apt:
- python3, python3-venv, python3-pip, python3-dev
- libolm-dev
- gcc, g++, make
- nfs-common
- curl, ca-certificates

## Access

```bash
# From utility host
pct exec 118 -- bash

# Direct SSH (not configured yet — no SSH keys installed)
# ssh root@192.168.1.118
```

## Matrix User — @archivist:echo6.co

| Setting | Value |
|---------|-------|
| User ID | @archivist:echo6.co |
| Display name | Archivist Bot |
| Created via | mas-cli (MAS user existed from 2026-04-10, password set via `manage set-password`) |
| Password | `<REDACTED — see credentials file: MATRIX_ARCHIVIST_BOT_PASSWORD>` |
| Device ID | ARCHIVIST |
| Access token | Stable compat token via `mas-cli manage issue-compatibility-token archivist ARCHIVIST` |
| Admin | No (not needed — uses room invitation, not admin force-join) |

### E2EE / Cross-Signing

| Component | Status |
|-----------|--------|
| Device keys (curve25519 + ed25519) | Uploaded |
| One-time keys (signed_curve25519) | 50 uploaded |
| Master key | Published (nKFt5nA+TvUo0AvY1gsk1QL8OK1t9z/ChON30Kdvlek) |
| Self-signing key | Published (9gOB+AHgyBzLP5/xerYon04NLZuIh+o5OHAybmetK2A) |
| User-signing key | Published (1EYngPiwpjOy2aQWY02g5SZaFLM5kZgsHLhFXgwHh0Q) |
| Device self-signed | Yes (ARCHIVIST signed by self-signing key) |
| Cross-signing seeds | /opt/archivist/store/cross_signing_seeds.json (chmod 600) |
| nio store | /opt/archivist/store/@archivist:echo6.co_ARCHIVIST.db |

### Key Sharing — How It Works

- Bridge key sharing policy: `cross-signed-tofu`
- ARCHIVIST device is cross-signed → bridge shares Megolm session keys automatically
- **Interactive verification (SAS emoji) is NOT required** — cross-signing alone is sufficient
- Old messages (before archivist joined) remain undecryptable (Megolm keys not retroactively shared)
- New messages are decryptable immediately

### E2EE Decryption Test — PASSED

- Date: 2026-04-12 14:35 UTC
- Room: COMMS LP group (!XUeWZuPdWQQnUYLJBJ:echo6.co)
- Message: "You'll know that radio has come of age when the median cellphone incorporates a LoRa radio stack."
- Sender: @signal_cdf98bca-c4b7-4fda-8ceb-03db5eb4e7e2:echo6.co (Signal puppet via bridge)
- Result: Successfully decrypted by ARCHIVIST device

### User Creation Notes

- Shared-secret registration (`/_synapse/admin/v1/register`) returns 404 under MAS — endpoint disabled
- Must use `mas-cli manage register-user` or `manage set-password` for existing users
- MAS creates user in both MAS DB and Synapse DB
- Orphaned Synapse `profiles` row caused provisioning failure — fixed by DELETE
- Each `client.login()` creates a NEW MAS compat session with random device ID — use `restore_login()` with stable compat token instead
- matrix-nio v0.25.2 does NOT implement `bootstrap_cross_signing()` — manual implementation required via python-olm PkSigning + raw HTTP API

## Joined Rooms

| Room | Room ID | Type | Archive Slug |
|------|---------|------|-------------|
| COMMS LP group | !XUeWZuPdWQQnUYLJBJ:echo6.co | Bridged Signal group | comms-lp-group |
| DM with Matt | !wgbnqhnYKTHzzJMjDu:echo6.co | Direct message | — |
| Liberal_Preppers_OG | !RvWNPmcKtPImhKPYcA:echo6.co | Bridged Signal group | liberal-preppers-og |
| (3 additional rooms) | !vBXtbgfYcptEuimrmn, !SnGDZgBtYOQuTWeYXp, !aQWFQMrzbkwjyjCPte | Bridged Signal groups | (initialized on first message) |

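The archive slugs in the table look like lowercased room names with separators collapsed to hyphens. The exact rule used by archivist.py is not shown in this document, so the following is only a sketch that reproduces the two slugs listed above; `slugify` is a hypothetical name:

```python
import re


def slugify(room_name: str) -> str:
    """Lowercase the name and collapse every non-alphanumeric run into one hyphen."""
    slug = re.sub(r"[^a-z0-9]+", "-", room_name.lower())
    return slug.strip("-")
```

A room name with no ASCII letters or digits would produce an empty slug under this rule, so a real implementation would need a fallback (for example, one derived from the room ID).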

## Scripts on CT 118

| Script | Purpose | Status |
|--------|---------|--------|
| /opt/archivist/archivist.py | Main bot — transcript writer | **Running in tmux** |
| /opt/archivist/login_once.py | One-shot login + key upload | Completed (superseded) |
| /opt/archivist/bootstrap_crosssigning.py | Cross-signing key bootstrap | Completed (one-time) |
| /opt/archivist/setup_and_verify.py | Device setup + verification listener | Completed |
| /opt/archivist/test_decrypt.py | E2EE decryption test listener | Completed (superseded by archivist.py) |

## Bot Architecture (Phase 5)

### archivist.py — Event-Driven Transcript Bot (~260 lines)

**Core design:** Single-file async Python bot using matrix-nio `sync_forever` with `ClientConfig(store_sync_tokens=True)` for restart deduplication.

**Callbacks:**
- `on_invite` (InviteMemberEvent) → auto-join
- `on_text` (RoomMessageText) → write transcript line, detect edits via `m.replace`
- `on_image/audio/video/file` (RoomMessage* + RoomEncrypted*) → download + decrypt + save to media/
- `on_sticker` (StickerEvent) → same as media
- `on_redaction` (RedactionEvent) → log deletion with original content if cached
- `on_megolm` (MegolmEvent) → log decryption failure (counter per room)

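Edit detection in `on_text` relies on the `m.relates_to` block that Matrix attaches to replacement events. A minimal sketch of that check on the raw event dictionary (what matrix-nio exposes as `event.source`); `parse_edit` is an illustrative name, not the function used in archivist.py:

```python
from typing import Optional, Tuple


def parse_edit(source: dict) -> Optional[Tuple[str, str]]:
    """Return (original_event_id, new_body) if the event is an edit, else None."""
    content = source.get("content", {})
    relates_to = content.get("m.relates_to") or {}
    if relates_to.get("rel_type") != "m.replace":
        return None  # ordinary message, not an edit
    original_id = relates_to.get("event_id")
    new_body = (content.get("m.new_content") or {}).get("body")
    if not original_id or new_body is None:
        return None  # malformed edit; treat it as a normal message
    return original_id, new_body
```

The returned event ID can then be looked up in `event_cache` to recover the original body for the `(was: ...)` annotation. If the original is not cached, only the new text can be logged.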

**In-memory caches (NOT persisted):**
- `room_slugs: dict[str, str]` — room_id → slug (rebuilt from transcript headers on startup)
- `event_cache: dict[str, dict]` — event_id → {body, sender, ts} (for edit/redact tracking)
- `name_cache: dict[str, str]` — mxid → display name (Signal ghosts get "(Signal)" suffix)

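Rebuilding `room_slugs` on startup only requires reading the `# Room ID:` header line from each existing transcript. A sketch under that assumption, using the header layout shown under Transcript Format; `rebuild_room_slugs` is a hypothetical name:

```python
from pathlib import Path

ROOM_ID_PREFIX = "# Room ID: "


def rebuild_room_slugs(archive_root: Path) -> dict[str, str]:
    """Map room_id -> slug by reading the header of each <slug>/transcript.log."""
    room_slugs: dict[str, str] = {}
    for transcript in sorted(archive_root.glob("*/transcript.log")):
        with transcript.open(encoding="utf-8") as handle:
            for line in handle:
                if not line.startswith("#"):
                    break  # header block is over; no room ID found
                if line.startswith(ROOM_ID_PREFIX):
                    room_id = line[len(ROOM_ID_PREFIX):].strip()
                    room_slugs[room_id] = transcript.parent.name
                    break
    return room_slugs
```

Only the leading `#` lines are scanned, so the cost stays small even for long transcripts, and message text that happens to contain `# Room ID:` is never mistaken for a header.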

**Sync token persistence:** matrix-nio SqliteStore handles save/load automatically when `store_sync_tokens=True`. On restart, `loaded_sync_token` resumes from last position — no event replay.

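The startup sequence implied by the paragraph above and by the User Creation Notes can be sketched as follows. This is not the code in archivist.py: the environment variable names are hypothetical, and `AsyncClientConfig` is used here on the assumption that the bot passes its config to `AsyncClient`.

```python
import os
from pathlib import Path

STORE_DIR = Path("/opt/archivist/store")


def store_db_name(user_id: str, device_id: str) -> str:
    """Default file name nio's SqliteStore uses inside the store directory."""
    return f"{user_id}_{device_id}.db"


async def run_bot() -> None:
    # Imported here so the helper above works without matrix-nio installed.
    from nio import AsyncClient, AsyncClientConfig

    user_id = os.environ["MATRIX_USER_ID"]  # hypothetical variable names
    config = AsyncClientConfig(store_sync_tokens=True, encryption_enabled=True)
    client = AsyncClient(
        os.environ["MATRIX_HOMESERVER"],
        user_id,
        device_id="ARCHIVIST",
        store_path=str(STORE_DIR),
        config=config,
    )
    # Reuse the stable compat token instead of client.login(), which would
    # create a new MAS compat session with a random device ID.
    client.restore_login(
        user_id=user_id,
        device_id="ARCHIVIST",
        access_token=os.environ["MATRIX_ACCESS_TOKEN"],
    )
    try:
        # With store_sync_tokens=True, syncing resumes from the stored token.
        await client.sync_forever(timeout=30_000)
    finally:
        await client.close()
```

The bot would be started with `asyncio.run(run_bot())` after registering its event callbacks. `store_db_name("@archivist:echo6.co", "ARCHIVIST")` yields the database file name listed in the E2EE table.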

**Encrypted media handling:** RoomEncryptedImage/Audio/Video/File carry `key`, `hashes`, `iv` attributes. Bot downloads ciphertext via `client.download(mxc=url)`, then decrypts with `nio.crypto.decrypt_attachment()`.

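The download-then-decrypt step can be sketched as below. The helper names are illustrative. The `<timestamp>_<name>` file naming is inferred from the transcript example (`media/1234567890_filename.jpg`), and the sanitisation rule is an assumption rather than the bot's documented behaviour.

```python
import re


def media_filename(timestamp: int, original_name: str) -> str:
    """Build '<timestamp>_<name>' with path separators and odd characters removed."""
    safe = re.sub(r"[^A-Za-z0-9._-]+", "_", original_name).strip("._") or "file"
    return f"{timestamp}_{safe}"


async def fetch_encrypted_media(client, event) -> bytes:
    """Download the ciphertext for a RoomEncrypted* event and return the plaintext."""
    # Imported lazily so media_filename() works without matrix-nio installed.
    from nio import DownloadError
    from nio.crypto import decrypt_attachment

    response = await client.download(mxc=event.url)
    if isinstance(response, DownloadError):
        raise RuntimeError(f"download failed: {response.message}")
    # key["k"], hashes["sha256"] and iv come from the event's encrypted-file
    # metadata; decrypt_attachment also verifies the SHA-256 hash.
    return decrypt_attachment(
        response.body,
        event.key["k"],
        event.hashes["sha256"],
        event.iv,
    )
```

Sanitising the file name before writing matters because it comes from the sender: without it, a name containing `../` could place a file outside the room's `media/` directory.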

### Transcript Format

```
# Transcript: Room Display Name
# Room ID: !xxxxx:echo6.co
# Archive started: 2026-04-12 20:47:01 UTC
# ---

[2026-04-12 18:55:34 UTC] Sender Name (Signal): Message text
[2026-04-12 18:56:00 UTC] Sender Name (Signal): [EDITED] New text
(was: Original text)
[2026-04-12 18:57:00 UTC] Sender Name (Signal): [DELETED] (was: Original text)
[2026-04-12 18:58:00 UTC] Sender Name (Signal): [image: media/1234567890_filename.jpg] caption
```

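A line in this format can be produced from an event timestamp and a resolved sender name. A minimal sketch, assuming the timestamp is the event's `origin_server_ts` in milliseconds and that Signal ghosts are recognised by an `@signal_` prefix on the Matrix ID; both function names are hypothetical:

```python
from datetime import datetime, timezone


def display_name(mxid: str, name: str) -> str:
    """Append '(Signal)' for bridge ghosts, assumed to be '@signal_...' users."""
    return f"{name} (Signal)" if mxid.startswith("@signal_") else name


def format_line(timestamp_ms: int, sender: str, text: str) -> str:
    """Render one transcript line from a millisecond UTC timestamp."""
    when = datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)
    return f"[{when:%Y-%m-%d %H:%M:%S} UTC] {sender}: {text}"
```

Formatting in UTC regardless of the container's local timezone keeps transcripts comparable across restarts and hosts.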

### Running the Bot

```bash
# Start (tmux, as archivist user)
pct exec 118 -- su -s /bin/bash archivist -c "tmux new-session -d -s archivist /opt/archivist/venv/bin/python3 -u /opt/archivist/archivist.py"

# Check logs
pct exec 118 -- tail -f /opt/archivist/logs/archivist.log

# Stop
pct exec 118 -- su -s /bin/bash archivist -c "tmux send-keys -t archivist C-c"
```

## What's NOT done yet (Phase 6+)

- No systemd service (running in tmux)
- No Tailscale registration
- No SSH key auth configured
- Bot needs invitations to additional bridged rooms as they appear