# Signal Archive Bot — Phase 1 Discovery Findings
# Date: 2026-04-12
# Status: COMPLETE — awaiting Matt's review before Phase 2
## 1. Synapse Homeserver Configuration
| Setting | Value |
|---------|-------|
| server_name | `echo6.co` |
| public_baseurl | `https://matrix.echo6.co/` |
| database | PostgreSQL `synapse` on `matrix-postgres:5432` |
| enable_registration | `false` |
| registration_shared_secret | `<REDACTED — see credentials file: MATRIX_SYNAPSE_REGISTRATION_SHARED_SECRET>` |
| macaroon_secret_key | `<REDACTED — see credentials file: MATRIX_SYNAPSE_MACAROON_SECRET_KEY>` |
| MAS enabled | `true` (endpoint: `http://matrix-mas:8080/`) |
| MAS secret | `<REDACTED — see credentials file: MATRIX_MAS_SYNAPSE_SECRET>` |
### Experimental Features (already enabled)
- `msc3202_transaction_extensions: true` — appservice transaction extensions for E2BE
- `msc2409_to_device_messages_enabled: true` — to-device messages for appservices
### Appservice Registrations
1. `/data/registration.yaml` — mautrix-signal bridge
2. `/data/doublepuppet.yaml` — double puppeting for echo6.co
## 2. MAS (Matrix Authentication Service)
- Image: `ghcr.io/element-hq/matrix-authentication-service:latest`
- Container: `matrix-mas`
- Port: 127.0.0.1:8085 (host) → 8080 (container)
- Database: PostgreSQL `mas` on `matrix-postgres:5432`, user `mas`
- Upstream OAuth2: Echo6 SSO (Authentik) at `https://auth.echo6.co/application/o/matrix/`
- Client ID: `93kCoZkBlnJyD9EcAm7E4btKflecOcBm9DGONB5T`
- Passwords: enabled (bcrypt, argon2id)
- Email transport: `blackhole` (not functional)
## 3. Matrix Users (6 real + 966 signal puppets)
| User | Admin | Purpose |
|------|-------|---------|
| @matt:echo6.co | Yes | Primary admin |
| @matt1:echo6.co | Yes | Secondary admin |
| @cortex:echo6.co | Yes | Echo6 Cortex Agent bot |
| @contabo:echo6.co | No | Echo6 Contabo Agent bot |
| @agent:echo6.co | No | Bot account |
| @zerby1470:echo6.co | No | Regular user |
| @signalbot:echo6.co | No | mautrix-signal bridge bot |
## 4. mautrix-signal Bridge
| Setting | Value |
|---------|-------|
| Image | `dock.mau.dev/mautrix/signal:v0.2603.0` |
| Container | `mautrix-signal` |
| Status | Running (2+ days uptime, healthy, no errors) |
| Homeserver | `http://matrix-synapse:8008` (Docker network) |
| Appservice addr | `http://mautrix-signal:29328` |
| Bot user | `@signalbot:echo6.co` (device ID `UPX4KKLZVY`) |
| Username template | `signal_{{.}}` |
| Login ID | `58f99d83-f3a8-487f-a2b7-3d118e236d23` (matt's Signal) |
| Database | PostgreSQL `mautrix_signal` on `matrix-postgres:5432` |
### Encryption Config
- allow: `true`
- default: `true`
- require: `true` — ALL bridged rooms use E2BE
- msc4190: `true` (device masquerading for next-gen auth/MAS)
- self_sign: `true`
- allow_key_sharing: `true`
- verification_levels: receive=unverified, send=unverified, share=cross-signed-tofu
### Portal Rooms (10 active, 5 unlinked)
| Portal Name | Room ID | Type |
|-------------|---------|------|
| COMMS LP group | !XUeWZuPdWQQnUYLJBJ:echo6.co | group |
| Liberal_Preppers_OG | !RvWNPmcKtPImhKPYcA:echo6.co | group |
| The Weekly Topic | !vBXtbgfYcptEuimrmn:echo6.co | group |
| Glimmers LP grouo | !qlwFBjKkdpqyUtrvkD:echo6.co | group |
| admins | !YaNspRceyamcRdFmfG:echo6.co | group |
| Left Preppers | !JfxIRowNkLbBlNPjVX:echo6.co | group |
| Resource Media/Links LP groups | !SnGDZgBtYOQuTWeYXp:echo6.co | group |
| DM (e949ab79) | !tSvEWQcXxJItLGAXDr:echo6.co | dm |
| DM (0e206fa1) | !hiDxGpfsYESpVDQXKW:echo6.co | dm |
| DM (a7d7d253) | !EepVZgnoMGiRkIdTAh:echo6.co | dm |

5 additional unlinked portals exist in the DB (Signal conversations with no Matrix room yet).
### Ghost Users
20+ Signal contacts represented as `@signal_<uuid>:echo6.co` puppet accounts.
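
A minimal sketch of how the bot could tell Signal puppets apart from real Matrix users when labelling transcript lines. It assumes the `signal_{{.}}` username template from the bridge config above and that the identifier is a standard 8-4-4-4-12 hex UUID; the names `PUPPET_RE` and `signal_uuid` are illustrative only.

```python
import re
from typing import Optional

SERVER_NAME = "echo6.co"

# Assumption: puppet localparts are "signal_" followed by a lowercase UUID.
PUPPET_RE = re.compile(
    r"^@signal_([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}):"
    + re.escape(SERVER_NAME)
    + r"$"
)


def signal_uuid(user_id: str) -> Optional[str]:
    """Return the Signal UUID for a puppet MXID, or None for any other user."""
    match = PUPPET_RE.match(user_id)
    return match.group(1) if match else None
```

The bridge bot `@signalbot:echo6.co` does not match the pattern, so it is treated like any other non-puppet account.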
## 5. Docker Compose Stack (/opt/matrix/docker-compose.yml)
| Container | Image | Host Port |
|-----------|-------|-----------|
| matrix-postgres | postgres:16-alpine | None (internal) |
| matrix-synapse | matrixdotorg/synapse:latest | 127.0.0.1:8008 |
| matrix-mas | ghcr.io/element-hq/matrix-authentication-service:latest | 127.0.0.1:8085 |
| matrix-element | vectorim/element-web:latest | 127.0.0.1:8088 |
| mautrix-signal | dock.mau.dev/mautrix/signal:v0.2603.0 | None (internal) |

All on `matrix-net` Docker bridge network.
## 6. User Creation Path — @archivist:echo6.co
### Option A: Synapse shared-secret registration (RECOMMENDED)
- `registration_shared_secret` IS configured in homeserver.yaml
- Need to verify if the `/_synapse/admin/v1/register` endpoint still works with MAS enabled
- If it does: `register_new_matrix_user -u archivist -p <password> -a -c /data/homeserver.yaml http://localhost:8008`
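
A minimal sketch of the request body that `/_synapse/admin/v1/register` expects, assuming the endpoint still responds with MAS enabled (unverified, as noted above). Synapse documents the MAC as an HMAC-SHA1 keyed with `registration_shared_secret` over the nonce, username, password and admin flag, joined by NUL bytes. The function names are illustrative, and the nonce has to be fetched first with a GET on the same endpoint.

```python
import hashlib
import hmac


def registration_mac(shared_secret: str, nonce: str, username: str,
                     password: str, admin: bool = False) -> str:
    """HMAC-SHA1 hex digest over the NUL-separated registration fields."""
    fields = [nonce, username, password, "admin" if admin else "notadmin"]
    mac = hmac.new(shared_secret.encode("utf8"), digestmod=hashlib.sha1)
    mac.update("\x00".join(fields).encode("utf8"))
    return mac.hexdigest()


def registration_payload(shared_secret: str, nonce: str, username: str,
                         password: str, admin: bool = False) -> dict:
    """JSON body to POST to /_synapse/admin/v1/register."""
    return {
        "nonce": nonce,
        "username": username,
        "password": password,
        "admin": admin,
        "mac": registration_mac(shared_secret, nonce, username, password, admin),
    }
```

`register_new_matrix_user` performs the same nonce and MAC exchange internally, so this is only needed if the bot host cannot run that script against the homeserver.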
### Option B: MAS admin API
- Create user via MAS's admin API or `mas-cli`
- Would create the user in MAS DB + Synapse automatically
- More aligned with the auth architecture
### Option C: Direct DB insert
- Insert into Synapse `users` table + MAS `users` table
- Fragile, not recommended
### Critical consideration: is_synapse_admin
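- Per MEMORY.md, MAS compat tokens need `is_synapse_admin` set on `compat_sessions` rows
- The archivist bot needs admin privileges only to join rooms it has not been invited to (see Q2 in section 9); reading room state and messages requires room membership, not admin
- If admin is required, it must be granted after creation via DB or API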
## 7. Proxmox CT ID Allocation
### Used CTs (21 total):
100, 101, 102, 103, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 120, 121, 130, 200
### Used VMs:
105 (arr), 150 (cortex)
### Free CT IDs in range 100-149:
104(?), 118, 119, 122-129, 131-149
Note: CT 104 (meshing-around) is documented but NOT running in cluster. CT 117 (nf-mtp) is new/undocumented.
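
The free-ID list can be re-derived from the used lists above. This is a small sketch, assuming CT and VM IDs share one ID space on the cluster (as Proxmox VMIDs do) and using the used IDs exactly as listed in this section; the helper names are illustrative.

```python
USED_CTS = [100, 101, 102, 103, 106, 107, 108, 109, 110, 111,
            112, 113, 114, 115, 116, 117, 120, 121, 130, 200]
USED_VMS = [105, 150]


def free_ids(used, start=100, end=149):
    """IDs in [start, end] that appear in neither the CT nor the VM list."""
    taken = set(used)
    return [i for i in range(start, end + 1) if i not in taken]


def compress(ids):
    """Render a sorted ID list as comma-separated ranges, e.g. '122-129'."""
    ranges = []
    for i in ids:
        if ranges and i == ranges[-1][1] + 1:
            ranges[-1][1] = i  # extend the current run
        else:
            ranges.append([i, i])  # start a new run
    return ", ".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


print(compress(free_ids(USED_CTS + USED_VMS)))
# prints: 104, 118-119, 122-129, 131-149
```

CT 104 shows up as free only because it is absent from the used list; whether it can actually be reused depends on the note above.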
### Suggested CT: 118 on utility
- IP: 192.168.1.118
- Resources: 1-2 vCPU, 1 GB RAM, 8 GB disk
- Utility has 20.7 GB free RAM, 0.6% CPU usage
### Node Resource Summary
| Node | RAM Free | CPU Usage |
|------|----------|-----------|
| data | 27.4 GB | 8.8% |
| cloud | 23.0 GB | 0.3% |
| utility | 20.7 GB | 0.6% |
| media | 19.2 GB | 4.2% |
| toc | 26.4 GB | 1.7% |
## 8. Storage — /mnt/library
| Metric | Value |
|--------|-------|
| NFS source | pi-nas (192.168.1.245):/export/library |
| Total | 22 TB |
| Used | 4.4 TB (21%) |
| Free | 18 TB |
| Current mounts | CT 130 (RECON) on data node, CT 118 (archivist) on utility node |
| /mnt/library/signal-archive | Created 2026-04-12, 777 permissions |
### NFS export access:
- `100.64.0.0/10` — Tailscale clients (rw, insecure)
- `192.168.1.0/24` — Local network (rw, insecure, no_root_squash)
### NFS consumers:
- CT 130 (data): pi-nas:/export/library → /mnt/library (host-side mount + bind)
- CT 110 (media): pi-nas:/export/peertube → /var/www/peertube/storage
- CT 118 (utility): pi-nas:/export/library → /mnt/library (host-side mount + bind, added 2026-04-12)
### Note (2026-04-12):
The utility Proxmox host did NOT have /mnt/library mounted before Phase 3. An NFS entry was added to /etc/fstab on the utility host, and the mount is bind-mounted into CT 118 via mp0. In-container NFS mounts fail in unprivileged LXC (access denied / operation not permitted).
## 9. Architecture Decision Points
### Q1: Where should the bot run?
- **Option A: New LXC on utility (CT 118)** — project spec says Phase 3 is "Deploy Bot Host (LXC)". NFS mount via local network. Connect to Synapse via https://matrix.echo6.co or Tailscale.
- **Option B: On Contabo** — same Docker network as Synapse (easiest Matrix connectivity). But NFS would need to go over Tailscale (slow for writes). Could also write locally and rsync.
- **Recommendation: Option A (CT 118 on utility)** — matches project spec, fast NFS, bot connects to homeserver via Tailscale or public URL.
### Q2: Does the bot need Synapse admin?
- To join rooms it's not invited to: YES (admin API `POST /_synapse/admin/v1/join/<room_id_or_alias>`)
- To read encrypted messages: needs to be IN the room and have Megolm keys shared to it
- Alternative: matt invites @archivist to each bridged room manually
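
A sketch of the admin join call, built but not sent. It assumes a Synapse admin access token is available (how to obtain one under MAS is not settled here) and that the homeserver is reachable at the given base URL; `build_join_request` is an illustrative name. As far as Synapse's documentation describes it, the admin making the call must already be in the room with permission to invite, so this is not an unconditional force-join.

```python
import json
import urllib.request
from urllib.parse import quote


def build_join_request(base_url: str, admin_token: str,
                       room_id: str, user_id: str) -> urllib.request.Request:
    """Build POST /_synapse/admin/v1/join/<room_id> for a local user."""
    # Room IDs contain '!' and ':', which must be percent-encoded in the path.
    url = base_url.rstrip("/") + "/_synapse/admin/v1/join/" + quote(room_id, safe="")
    body = json.dumps({"user_id": user_id}).encode("utf8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {admin_token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the join; repeating it for each of the 10 portal rooms covers the bridged set.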
### Q3: E2EE key handling
- Bridge rooms have encryption REQUIRED
- Bot must implement full Megolm key management via matrix-nio
- Device verification (Phase 4) is critical — unverified devices won't receive keys if bridge or other clients have key-sharing restrictions
- Bridge's share level is `cross-signed-tofu` — bot needs valid cross-signing
### Q4: Transcript storage format
- Project spec: plain text, append-only, one file per room
- Path: `/mnt/library/signal-archive/<room-name>/<YYYY-MM-DD>.txt`
- No database, just files
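
A minimal sketch of the append-only layout described above. The path scheme comes from the spec; the directory-name sanitising and the `[HH:MM:SS] sender: body` line format are assumptions, since the spec only says "plain text". All function names are illustrative.

```python
import re
from datetime import datetime
from pathlib import Path

ARCHIVE_ROOT = Path("/mnt/library/signal-archive")


def safe_room_dir(room_name: str) -> str:
    """Replace runs of characters unsafe in a directory name (spaces, '/') with '_'."""
    cleaned = re.sub(r"[^A-Za-z0-9._-]+", "_", room_name).strip("._")
    return cleaned or "unnamed"


def transcript_path(root: Path, room_name: str, when: datetime) -> Path:
    """<root>/<room-name>/<YYYY-MM-DD>.txt for the day of the message."""
    return Path(root) / safe_room_dir(room_name) / f"{when:%Y-%m-%d}.txt"


def append_line(root: Path, room_name: str, sender: str,
                body: str, when: datetime) -> Path:
    """Append one message to the room's daily file, creating it if needed."""
    path = transcript_path(root, room_name, when)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:  # "a" never truncates
        fh.write(f"[{when:%H:%M:%S}] {sender}: {body}\n")
    return path
```

With this scheme the portal "Resource Media/Links LP groups" maps to the directory `Resource_Media_Links_LP_groups`. Multi-line message bodies are written as-is here and would need escaping if one line per message matters.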
## 10. Risks and Blockers
1. **MAS + shared secret registration compatibility** — untested, may need MAS admin API instead
2. **E2BE key sharing** — bridge requires cross-signed-tofu for key sharing. Bot must set up cross-signing and verify.
3. **Room join mechanism** — bot needs invitation or admin force-join to each of the 10 bridged rooms
4. **NFS permissions** — new CT will mount as nobody:nogroup by default. Transcript files need correct ownership.
5. **libolm dependency** — matrix-nio E2EE requires libolm C library. Must be available in the LXC container.