# Echo6 Landing Page — Data Export
## Echo6 Platform Reference — Infrastructure, Services & Brand Identity
---
## 1. Brand / Identity
- **Domain:** echo6.co
- **Admin email:** admin@echo6.co
- **Auth provider:** Authentik (auth.echo6.co) — OIDC SSO across all services
### Brand Colors (extracted from logo)
| Color | Hex | RGB | Usage |
|-------|-----|-----|-------|
| Echo6 Cyan | `#28C0E8` | rgb(40, 192, 232) | Primary accent, buttons, links, focus states |
| Echo6 Cyan Light | `#5DD4F5` | rgb(93, 212, 245) | Hover states |
| Echo6 Yellow | `#F0D848` | rgb(240, 216, 72) | Secondary accent, warnings |
| Background Primary | `#0a0e17` | rgb(10, 14, 23) | Page backgrounds |
| Background Secondary | `#111827` | rgb(17, 24, 39) | Cards, sidebars, inputs |
| Background Tertiary | `#1a2332` | rgb(26, 35, 50) | Hover states |
| Border | `#1e3a5f` | rgb(30, 58, 95) | All borders |
| Text Primary | `#e0e6ed` | rgb(224, 230, 237) | Main text |
| Text Muted | `#7a8ca0` | rgb(122, 140, 160) | Secondary text |
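The hex and RGB columns above can be cross-checked mechanically, and the same palette can seed the CSS variables section of the overlay. The following is a minimal sketch: the hex values come from the table, but the `--e6-*` variable names are hypothetical and not taken from `echo6-custom.css`.

```python
# Palette copied from the Brand Colors table.
# The CSS custom property names are hypothetical placeholders.
PALETTE = {
    "--e6-cyan": "#28C0E8",
    "--e6-cyan-light": "#5DD4F5",
    "--e6-yellow": "#F0D848",
    "--e6-bg-primary": "#0a0e17",
    "--e6-bg-secondary": "#111827",
    "--e6-bg-tertiary": "#1a2332",
    "--e6-border": "#1e3a5f",
    "--e6-text": "#e0e6ed",
    "--e6-text-muted": "#7a8ca0",
}


def hex_to_rgb(hex_color: str) -> tuple:
    """Convert '#RRGGBB' to an (r, g, b) tuple of ints."""
    digits = hex_color.lstrip("#")
    if len(digits) != 6:
        raise ValueError(f"expected #RRGGBB, got {hex_color!r}")
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def css_root_block(palette: dict) -> str:
    """Render the palette as a CSS :root block of custom properties."""
    lines = [f"  {name}: {value};" for name, value in palette.items()]
    return ":root {\n" + "\n".join(lines) + "\n}"


if __name__ == "__main__":
    print(css_root_block(PALETTE))
```

Keeping the palette in one mapping means the search theme and the Authentik theme could both be generated from the same source rather than edited by hand.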
### Design Language
- **Aesthetic:** Cyberpunk — dark, sharp, clean. No excessive glow or neon blur.
- **Font:** JetBrains Mono (monospace) throughout all Echo6-branded interfaces
- **Text:** All lowercase in navigation, menus, and footer elements
- **Effects:** Subtle — search bar focus shadow at 0.12 opacity max, no colored glows
---
## 2. Public-Facing Services
| Service | URL | Description | Auth |
|---------|-----|-------------|------|
| Echo6 Search (Homepage) | https://echo6.co | SearXNG search — branded cyberpunk homepage, Google-style layout | Public (SearXNG) |
| Aurora (AI Assistant) | https://ai.echo6.co | RAG-augmented LLM chat — locally-hosted, queries a 95K+ vector knowledge base | Authentik OIDC |
| PeerTube (Video) | https://stream.echo6.co | Self-hosted video platform — 99 curated YouTube channels mirrored, GPU-transcoded | Authentik OIDC |
| File Server | https://files.echo6.co | PDF/document library — ~13,239 documents (military doctrine, survival, comms, trades) | Public |
| Photos (Immich) | https://immich.echo6.co | Self-hosted photo management | Authentik OIDC |
| Mail (Mailcow) | https://mail.echo6.co | Email — Mailcow webmail | Authentik OIDC |
| Cloud (Nextcloud) | https://nextcloud.echo6.co | Self-hosted cloud storage | Authentik OIDC |
| Jellyfin | https://jellyfin.echo6.co | Media streaming | Authentik OIDC |
| Requests (Jellyseerr) | https://requests.echo6.co | Media request management | Authentik OIDC |
| WATCHTOWER | https://wt.echo6.co | Infrastructure monitoring dashboard | Authentik forward auth |
| Auth (Authentik) | https://auth.echo6.co | SSO identity provider — user portal and admin | Authentik native |
---
## 3. Echo6 Homepage — SearXNG Custom Theme
### Overview
The Echo6 homepage at `echo6.co` is a customized SearXNG instance (not a standalone static page). The default SearXNG UI is reskinned via CSS overlay, template overrides, and settings.yml changes to match the Echo6 cyberpunk brand.
### Implementation Approach
- **CSS overlay** (`echo6-custom.css`) — all color, font, layout, and component overrides
- **Logo replacement** — Echo6 logo replaces SearXNG default logo via volume mount or direct file copy
- **Template override** (`base.html`) — injects the top navigation bar, waffle menu JS, and custom footer
- **settings.yml** — instance name, theme, base URL configuration
### SearXNG Settings
```yaml
general:
  instance_name: "Echo6"
ui:
  default_theme: simple
  center_alignment: false # results stretch to fill viewport
  theme_args:
    simple_style: dark
server:
  base_url: "https://echo6.co/"
brand:
  issue_url: false
  docs_url: false
  public_instances: false
  wiki_url: false
```
### Homepage Layout (Google.com-inspired)
```
┌─────────────────────────────────────────────────────────┐
│                          .//photos  .//mail  [⠿]  [👤]  │  ← top nav (right-aligned)
│                                                         │
│                      [ECHO6 LOGO]                       │  ← centered ~35-40% from top
│                                                         │
│                 ┌─────────────────────┐                 │
│                 │ 🔍 search input     │                 │  ← pill search bar
│                 └─────────────────────┘                 │
│                                                         │
├─────────────────────────────────────────────────────────┤
│                        echo6.co                         │
│                   preferences · about                   │  ← footer (minimal)
└─────────────────────────────────────────────────────────┘
```
- Homepage locked to viewport (no scroll, `overflow: hidden` on body.index)
- Logo + search bar vertically centered at ~35-40% from top using flexbox
- Default SearXNG text/wordmark hidden via CSS
- Results pages scroll normally (only homepage is viewport-locked)
### Top Navigation
| Element | Style | Target |
|---------|-------|--------|
| .//photos | Cyan monospace link | `https://auth.echo6.co/application/launch/<immich-slug>/` |
| .//mail | Cyan monospace link | `https://auth.echo6.co/application/launch/<mailcow-slug>/` |
| ⠿ (waffle) | Circle button, 36px | Opens waffle menu dropdown |
| 👤 (login) | Circle button, 32px, border | `https://auth.echo6.co/` |
### Waffle Menu (11 items, 3-column grid)
| Label | Target |
|-------|--------|
| aurora | `https://auth.echo6.co/application/launch/<openwebui-slug>/` |
| stream | `https://auth.echo6.co/application/launch/<peertube-slug>/` |
| files | `https://files.echo6.co` (public — no launch URL) |
| watchtower | `https://auth.echo6.co/application/launch/<watchtower-slug>/` |
| photos | `https://auth.echo6.co/application/launch/<immich-slug>/` |
| mail | `https://auth.echo6.co/application/launch/<mailcow-slug>/` |
| cloud | `https://auth.echo6.co/application/launch/<nextcloud-slug>/` |
| jellyfin | `https://auth.echo6.co/application/launch/<jellyfin-slug>/` |
| requests | `https://auth.echo6.co/application/launch/<jellyseerr-slug>/` |
| admin | `https://auth.echo6.co/` |
| preferences | `https://echo6.co/preferences` |
All authenticated service links use Authentik launch URLs (`/application/launch/<slug>/`) for seamless SSO pass-through.
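A small helper can keep the launch URL pattern in one place when generating the navigation and waffle menu entries. This is a sketch only: the base URL and path pattern come from the tables above, while the slug `example-openwebui` is a made-up placeholder, since the real slugs are whatever each application is named in Authentik.

```python
from urllib.parse import quote

AUTH_BASE = "https://auth.echo6.co"  # from the tables above


def launch_url(slug: str) -> str:
    """Build the Authentik launch URL for an application slug."""
    if not slug:
        raise ValueError("slug must be non-empty")
    # quote() guards against slugs containing characters unsafe in a path.
    return f"{AUTH_BASE}/application/launch/{quote(slug, safe='')}/"


# Hypothetical slug; public services keep their direct URL.
WAFFLE_MENU = {
    "aurora": launch_url("example-openwebui"),
    "files": "https://files.echo6.co",  # public, so no launch URL
}
```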
### CSS Architecture (14 sections)
1. CSS variables (brand colors)
2. Homepage viewport lock + centering
3. SearXNG wordmark hiding
4. Logo sizing
5. Search bar (pill shape, cyan focus)
6. Search buttons
7. Top navigation bar (fixed, right-aligned)
8. Waffle menu dropdown (grid, fade-in animation)
9. Footer (minimal, lowercase)
10. Search results page (stretched full-width, themed)
11. Preferences page (dark forms, cyan accents)
12. Scrollbar styling
13. Selection highlight
14. Responsive breakpoints (768px, 480px)
### SearXNG File Locations
| Asset | Path |
|-------|------|
| Logo | `/usr/local/searxng/searx/static/themes/simple/img/searxng.png` |
| Favicon | `/usr/local/searxng/searx/static/themes/simple/img/favicon.png` |
| Custom CSS | Injected via template override or volume mount |
| Base template | `/usr/local/searxng/searx/templates/simple/base.html` |
| Settings | `/etc/searxng/settings.yml` |
---
## 4. Authentik Theme — Echo6 Branding
### Overview
The Authentik instance at `auth.echo6.co` is themed to match the Echo6 cyberpunk aesthetic. This covers the login flow, user dashboard, and admin interface.
### Brand Settings (Admin → System → Brands)
| Setting | Value |
|---------|-------|
| Branding title | `echo6` |
| Logo | Echo6 logo (uploaded via Customization → Files) |
| Favicon | Echo6 favicon |
| Theme | `dark` (forced — not automatic) |
| Custom CSS | Echo6 Authentik CSS (see below) |
### Flow Text Customization
| Flow | Title |
|------|-------|
| Authentication | `echo6 // login` |
| Invalidation | `echo6 // logout` |
| Recovery | `echo6 // recovery` |
| User settings | `echo6 // settings` |
### User Dashboard Layout
```yaml
settings:
  theme:
    base: dark
  cardLayout: 3-column
```
### CSS Customization Scope
The Authentik CSS overrides PatternFly (`pf-c-*`) components:
- Login flow: dark gradient background, cyan buttons, JetBrains Mono font
- Form inputs: dark backgrounds (`#111827`), cyan focus borders
- User dashboard: dark app cards with cyan hover borders
- Admin interface: dark sidebar, cyan active nav indicators
- Buttons: primary = Echo6 cyan with dark text, secondary = border-only
- Tables, tabs, modals, alerts, dropdowns — all themed
- Global font override to JetBrains Mono
### Application Icons
Each Authentik application has a matching service icon uploaded for the user dashboard tiles. Icons sourced from official project assets or cyberpunk-styled variants.
### Version Note
Authentik 2025.12.4 — custom CSS is applied via the Brand CSS field in System → Brands → Edit. The Shadow DOM CSS injection bug from 2025.12.0 is fixed in this version. Logo/favicon use the Files system (Customization → Files) rather than the old `/media/` mount path.
---
## 5. Infrastructure Overview
### Nodes (all connected via Tailscale / self-hosted Headscale)
| Node | Role | Key Services |
|------|------|-------------|
| data | Proxmox host | Hosts RECON LXC (CT 130) |
| utility | Proxmox host | Caddy reverse proxy (CT 101), TLS termination |
| cloud | Proxmox host | — |
| media | Proxmox host | PeerTube LXC (CT 110) |
| toc | Proxmox host | GPU passthrough host for cortex VM |
| cortex | VM on toc | RTX A4000 16GB, Qdrant, Ollama, TEI, OpenWebUI, PeerTube runner |
| pi-nas | NFS storage (OMV) | 18TB — PDF library + PeerTube video storage |
| Contabo VPS | Remote | WATCHTOWER, DNS, automated backups (919GB available) |
### GPU — NVIDIA RTX A4000 (16GB VRAM, on cortex)
| Workload | VRAM | Silicon |
|----------|------|---------|
| TEI embeddings (bge-m3) | 1.3 GB | CUDA |
| Aurora LLM (JOSIEFIED Qwen3 8B) | ~5 GB | CUDA/Tensor |
| Whisper transcription (medium) | ~5 GB | CUDA (short videos) or CPU (long videos) |
| NVENC video encoding | ~0.2-0.5 GB/stream | Dedicated ASIC (no CUDA contention) |
| **Typical total** | **~11.5 GB / 16 GB** | |
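The typical total can be reproduced by summing the rows of the table. The sketch below assumes a single NVENC stream at the low end of its 0.2-0.5 GB range, which is the combination that yields ~11.5 GB; more streams or the high end of the range would raise the total.

```python
GPU_VRAM_GB = 16.0

# Figures from the table above. NVENC is taken at 0.2 GB for one stream,
# which is an assumption about how the typical total was derived.
WORKLOADS_GB = {
    "tei_bge_m3": 1.3,
    "aurora_llm": 5.0,
    "whisper_medium": 5.0,
    "nvenc_stream": 0.2,
}


def vram_total(workloads: dict) -> float:
    """Sum the per-workload VRAM figures, rounded to 0.1 GB."""
    return round(sum(workloads.values()), 1)


def vram_headroom(workloads: dict, capacity: float = GPU_VRAM_GB) -> float:
    """Remaining VRAM after all listed workloads are resident."""
    return round(capacity - vram_total(workloads), 1)


if __name__ == "__main__":
    print(vram_total(WORKLOADS_GB), vram_headroom(WORKLOADS_GB))
```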
---
## 6. RECON — Knowledge Extraction Pipeline
### What It Does
Processes PDFs and web content → extracts structured knowledge concepts via Gemini AI → embeds into a vector database → provides RAG-augmented retrieval for Aurora.
### Pipeline Flow
```
PDF/Web Source → Extract Text → Gemini Enrichment → JSON (saved) → Embed (TEI) → Qdrant
```
### Key Stats
| Metric | Value |
|--------|-------|
| Total documents catalogued | ~13,239 PDFs + web sources |
| Documents complete | 1,200+ (and growing) |
| Vectors in Qdrant | 95,000+ |
| Embedding throughput | 1,711 embeddings/sec (TEI) |
| Enrichment rate | ~100 docs/hour |
| Search latency | <10ms (HNSW index) |
| Full catalogue reprocess time | ~50 hours |
| Full catalogue cost | ~$135 (Gemini API) |
### Content Sources
| Source | Count | Description |
|--------|-------|-------------|
| Army Field Manuals | ~160 | Official Army doctrine (FM series) |
| Survival Companion Library | ~6,600 | Comprehensive survival/preparedness collection |
| Other documents | ~6,300 | Assorted reference, technical, trades |
| meshtastic.org | ~247 pages | Mesh communications documentation |
| ready.gov | ~5 pages | Emergency preparedness |
### Knowledge Domains (Aurora Taxonomy)
| Tier | Domain | Examples |
|------|--------|----------|
| 1 | Foundational Skills | Water, food, fire, shelter, first aid, navigation, security |
| 2 | Sustainment Systems | Long-term food/water/energy, medical stockpiling, tool maintenance |
| 3 | Defense & Military Tactics | Perimeter defense, small unit tactics, weapons, OPSEC |
| 4 | Off-Grid Systems | Power generation, water independence, thermal, comms infrastructure |
| 5 | Communications & Intelligence | Radio ops, mesh networking, SIGINT, OSINT, backup comms |
| 6 | Scenario Playbooks | Urban collapse, rural homesteading, bug-out, governance |
### Scenario Timescales
| Tag | Scope |
|-----|-------|
| tuesday_prepper | Next-week disruption (power outage, storm) |
| month_prepper | Extended disruption (supply chain, regional disaster) |
| year_prepper | Long-term grid-down |
| multi_year | Sustained collapse |
| eotwawki | End of the world as we know it |
### Concept Schema (22 fields per concept)
Core content, classification (domain/subdomain/skill level/scenario), provenance (source type, credibility score, verification status), reference (chapter, page, key facts), and technical metadata (deterministic IDs, embeddings, download URLs).
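One common way to get deterministic IDs is a name-based UUID derived from where a concept came from, so that reprocessing the same source yields the same ID. The sketch below illustrates that idea and is not RECON's actual scheme: the namespace URL and the choice of inputs (`source_hash`, `window_index`, `concept_index`) are assumptions.

```python
import uuid

# Hypothetical namespace; any fixed UUID works as long as it never changes.
RECON_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://echo6.co/recon")


def concept_id(source_hash: str, window_index: int, concept_index: int) -> str:
    """Derive a stable UUID string from a concept's position in its source."""
    name = f"{source_hash}:{window_index}:{concept_index}"
    return str(uuid.uuid5(RECON_NAMESPACE, name))
```

Because Qdrant accepts UUIDs as point IDs, an ID built this way would let a re-embed overwrite the existing point instead of creating a duplicate.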
### Pipeline Architecture
| Stage | Workers | Bottleneck | Description |
|-------|---------|------------|-------------|
| Extract | 4 | CPU-bound | PyPDF2 → pdftotext → Tesseract → Gemini Vision (4-method fallback chain) |
| Enrich | 16 | I/O-bound (Gemini API) | 10-page windows → Gemini 2.0 Flash → structured JSON concepts |
| Embed | batch | I/O-bound (TEI) | bge-m3 1024-dim → Qdrant insert, 128/batch |
| Scanner | 1 | Hourly cron | Auto-discovers new PDFs from NFS mount |
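The 10-page enrichment windows and 128-item embedding batches from the table can be sketched with two small helpers. This assumes the windows do not overlap and that pages are numbered from 1; the document does not say either way.

```python
from typing import Iterator, List, Sequence, Tuple


def page_windows(total_pages: int, window: int = 10) -> List[Tuple[int, int]]:
    """Split a document into (first_page, last_page) windows, 1-indexed inclusive."""
    return [
        (start, min(start + window - 1, total_pages))
        for start in range(1, total_pages + 1, window)
    ]


def batches(items: Sequence, size: int = 128) -> Iterator[Sequence]:
    """Yield consecutive slices of at most `size` items for the embed stage."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

For example, a 25-page PDF would produce the windows (1, 10), (11, 20), and (21, 25), and 300 concepts would be embedded in batches of 128, 128, and 44.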
### Resilience Features
- JSON-first: all Gemini extractions saved to disk before any DB insert
- Idempotent: content hashing prevents duplicate processing
- Recoverable: full Qdrant rebuild from JSON files alone
- Exponential backoff with jitter for API rate limits
- Window-level failure tolerance (partial document processing)
- Error classification (transient vs permanent)
- Automated backup to Contabo VPS every 6 hours
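Three of the features above (content hashing, exponential backoff with jitter, and transient vs permanent error classification) fit together roughly as follows. This is an illustrative sketch, not RECON's code: the exception classes, the SHA-256 choice, the "full jitter" variant, and the base/cap values are all assumptions.

```python
import hashlib
import random
import time


class TransientError(Exception):
    """Retryable failure, e.g. an API rate limit."""


class PermanentError(Exception):
    """Non-retryable failure, e.g. an unreadable PDF."""


def content_hash(data: bytes) -> str:
    """Hash file contents so an already-processed document can be skipped."""
    return hashlib.sha256(data).hexdigest()


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0,
                  rng=random.random) -> float:
    """Full jitter: a random delay in [0, min(cap, base * 2**attempt))."""
    return rng() * min(cap, base * 2 ** attempt)


def call_with_retry(fn, max_attempts: int = 5, sleep=time.sleep):
    """Retry fn on TransientError; PermanentError propagates immediately."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the failure
            sleep(backoff_delay(attempt))
```

Passing `sleep` and `rng` as parameters keeps the retry logic testable without real waiting.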
---
## 7. Aurora — AI Assistant
| Property | Value |
|----------|-------|
| Interface | OpenWebUI at ai.echo6.co |
| Model | JOSIEFIED Qwen3 8B (~55 tok/s, ~5GB VRAM) |
| RAG source | RECON Qdrant (95K+ vectors, top-5 retrieval) |
| Embedding model | bge-m3 (1024-dim via TEI) |
| Score threshold | 0.3 cosine similarity |
| Citations | Clickable badges linking to files.echo6.co PDFs or source web pages |
| Dual-mode | Think Toggle filter for reasoning vs. fast response |
| Auth | Authentik OIDC |
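The retrieval settings above (top-5, 0.3 cosine threshold) amount to "score, filter, sort, truncate". The sketch below shows that logic with a brute-force scan over an in-memory dict; the real system queries Qdrant's HNSW index instead, and the function names here are illustrative.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: Sequence[float], vectors: Dict[str, Sequence[float]],
             top_k: int = 5, threshold: float = 0.3) -> List[Tuple[str, float]]:
    """Return up to top_k (key, score) pairs with score >= threshold, best first."""
    scored = [(key, cosine(query, vec)) for key, vec in vectors.items()]
    kept = [pair for pair in scored if pair[1] >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]
```

The threshold means a query can return fewer than five results, or none, when nothing in the knowledge base is similar enough.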
---
## 8. PeerTube — Video Platform
### Instance Details
| Property | Value |
|----------|-------|
| URL | https://stream.echo6.co |
| Version | PeerTube v8.0.2 |
| Installation | Native (not Docker), Debian 12 |
| Host | CT 110 on media node |
| Storage | NFS from pi-nas (~18TB capacity) |
| Database | PostgreSQL (peertube_prod) |
| Auth | Authentik OIDC (signup disabled) |
| Transcoding | 720p + 1080p HLS only, NVENC GPU via remote runner |
| Transcription | Whisper medium (GPU for <1hr, CPU for >1hr) |
### Channel Library
| Category | Examples |
|----------|----------|
| Tactical/Firearms | Forgotten Weapons, Active Self Protection, Garand Thumb |
| Preparedness/Homesteading | Canadian Prepper, City Prepping |
| Trades — Electrical | — |
| Trades — Plumbing | — |
| Trades — HVAC | — |
| Trades — Welding | — |
| Trades — Automotive | — |
| Trades — Woodworking | — |
| Ham Radio | Dave Casler, Ham Radio Crash Course, DX Commander |
| Comms/Meshtastic | Andreas Spiess, The Comms Channel, CommsPrepper |
| Education — Math | — |
| Education — Science | — |
| Education — Engineering | — |
| Education — History | — |
| Medical/Fieldcraft | PrepMedic, Skinny Medic |
| Lockpicking/Security | LockPickingLawyer, Bosnianbill |
### Key Stats
| Metric | Value |
|--------|-------|
| Total channels | 99 curated YouTube channels |
| Channels fully synced | 57 of 99 |
| Estimated total library size | ~15.3 TB |
| Transcoding output | 720p + 1080p HLS |
| Transcoder throughput | ~67 videos/hr (NVENC) |
| Transcode win rate | 83% (post-probe-gate optimization) |
| Import method | Resumable chunked upload (10MB chunks) |
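The chunking behind a resumable upload can be sketched as byte-range arithmetic. This assumes "10MB" means 10 MiB and that the upload endpoint takes `Content-Range` style byte ranges; the helper names are illustrative and no PeerTube API call is shown.

```python
from typing import Iterator, Tuple

CHUNK_SIZE = 10 * 1024 * 1024  # assumes "10MB" means 10 MiB


def chunk_ranges(file_size: int, chunk_size: int = CHUNK_SIZE,
                 resume_from: int = 0) -> Iterator[Tuple[int, int]]:
    """Yield (start, end_inclusive) byte offsets, starting at resume_from."""
    for start in range(resume_from, file_size, chunk_size):
        yield start, min(start + chunk_size, file_size) - 1


def content_range(start: int, end: int, total: int) -> str:
    """Format a Content-Range header value for one chunk."""
    return f"bytes {start}-{end}/{total}"
```

After an interruption, the importer would only need the last acknowledged offset to continue, passing it as `resume_from` rather than restarting the whole file.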
### Bulk Import Pipeline
```
Downloader (CT 110, yt-dlp + VPN rotation)
→ Transcoder (cortex, NVENC GPU, 4 workers)
→ Importer (CT 110, PeerTube API, resumable upload)
```
- Round-robin across 99 channels with sliding window (50 videos/batch)
- VPN rotation on rate limit (NordVPN, 6 countries: US, CA, UK, DE, NL, SE)
- Pre-encode probe gate skips already-efficient codecs (H.265/AV1/VP9) and low-bitrate H.264
- Automatic dedup via download archive
- All three stages run as systemd services
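The pre-encode probe gate described above reduces to a small decision function over the source codec and bitrate. In this sketch the codec strings follow ffprobe's `codec_name` values (`hevc`, `av1`, `vp9`, `h264`), while the low-bitrate cutoff is a hypothetical number because the document does not state the real one.

```python
# ffprobe codec_name values for H.265, AV1, and VP9.
EFFICIENT_CODECS = {"hevc", "av1", "vp9"}

# Hypothetical cutoff; the actual threshold is not given in this document.
LOW_BITRATE_H264_KBPS = 2500


def should_transcode(codec_name: str, bitrate_kbps: float) -> bool:
    """Return False when re-encoding is unlikely to save space."""
    codec = codec_name.lower()
    if codec in EFFICIENT_CODECS:
        return False  # already an efficient codec
    if codec == "h264" and bitrate_kbps <= LOW_BITRATE_H264_KBPS:
        return False  # H.264 that is already small enough
    return True
```

Videos that fail the gate would go straight to the importer, which keeps the NVENC workers free for sources where re-encoding actually helps.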
---
## 9. Networking & Security
| Layer | Technology |
|-------|-----------|
| Mesh VPN | Tailscale (self-hosted Headscale) |
| Reverse proxy | Caddy (CT 101 on utility) — auto TLS |
| DNS | GoDaddy (external), dnsmasq split DNS (internal) |
| Authentication | Authentik OIDC SSO across all services |
| SSO Launch URLs | `https://auth.echo6.co/application/launch/<slug>/` for seamless pass-through |
| Backup transport | rsync over SSH (ed25519 keys) |
---
## 10. System Relationships
```
RECON          → Knowledge extraction & vector storage
Aurora         → LLM assistant, queries RECON for knowledge (RAG)
ARGUS          → Intelligence/OSINT platform, feeds intel into RECON
PeerTube       → Video library (curated educational/preparedness content)
WATCHTOWER     → Unified infrastructure monitoring
echo6.co       → SearXNG search homepage (branded, cyberpunk theme)
auth.echo6.co  → Authentik SSO (themed to match Echo6 brand)
files.echo6.co → Document/PDF download server
```
---
## 11. Technology Stack Summary
| Layer | Technologies |
|-------|-------------|
| Virtualization | Proxmox (5 nodes) |
| Networking | Tailscale/Headscale, Caddy, nginx, dnsmasq |
| GPU compute | NVIDIA RTX A4000 (CUDA, NVENC, Tensor) |
| AI/ML | Gemini 2.0 Flash, Ollama, TEI (bge-m3), JOSIEFIED Qwen3 8B |
| Vector DB | Qdrant (HNSW index, cosine similarity) |
| Databases | SQLite (RECON), PostgreSQL (PeerTube) |
| Video | PeerTube v8, yt-dlp, ffmpeg/NVENC, Whisper |
| Search | SearXNG (custom Echo6 theme) |
| Auth | Authentik (OIDC, custom Echo6 theme) |
| Storage | NFS (pi-nas, 18TB), Contabo VPS (919GB backup) |
| Languages | Python (RECON pipeline), Node.js (PeerTube), Bash |
| OS | Ubuntu 24.04 (RECON), Debian 12 (PeerTube) |
---
## 12. Design / Copy Hooks for Landing Page
### Tagline Ideas (raw material)
- 13,000+ documents processed into searchable knowledge
- 99 curated educational channels, GPU-transcoded and self-hosted
- AI-powered knowledge retrieval across military doctrine, survival, trades, and communications
- Fully self-hosted, zero cloud dependency for core services
- From PDF to actionable knowledge in 50 hours
### Key Differentiators
- **Self-hosted core services**: LLM inference, video, auth, and monitoring run without SaaS dependencies (RECON enrichment is the exception: it calls the Gemini API)
- **Knowledge pipeline**: Not just storing PDFs — extracting, structuring, and making them searchable via AI
- **GPU-accelerated**: Single RTX A4000 powers video transcoding, AI inference, embeddings, and transcription simultaneously (different silicon paths)
- **Resilient by design**: JSON-first storage, automated backups, full rebuild capability, exponential backoff
- **Curated content**: 99 hand-picked channels across tactical, trades, comms, medical, and education
- **Cohesive brand**: Cyberpunk aesthetic applied consistently across search, auth, and all services
### Numbers That Pop
- 95,000+ knowledge vectors
- 13,239 documents
- 99 video channels
- 1,711 embeddings/second
- <10ms search latency
- 22-field concept schema
- 6 knowledge domains
- 5 Proxmox nodes
- 18TB storage
- ~15.3TB video library
- 11 items in the waffle menu