Initial commit: MeshAI - LLM-powered Meshtastic assistant

Features: - Multi-backend LLM support (OpenAI, Anthropic, Google) - Rolling summary memory for token optimization (~70-80% reduction) - Per-user conversation history with SQLite persistence - Bang commands (!help, !ping, !reset, !status, !weather) - Meshtastic integration via serial or TCP - Message chunking for mesh network constraints (150 char limit) - Rate limiting to prevent network congestion - Rich TUI configurator - Docker support 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-21 23:24:44 +02:00 · 2025-12-15 11:53:46 -07:00 · 2025-12-15 11:53:46 -07:00 · fd3f995ebb
commit fd3f995ebb
43 changed files with 7947 additions and 0 deletions
--- a/docs/memory_approaches_comparison.txt
+++ b/docs/memory_approaches_comparison.txt
@ -0,0 +1,254 @@
+╔════════════════════════════════════════════════════════════════════════════════╗
+║                    LLM MEMORY APPROACHES COMPARISON                            ║
+╚════════════════════════════════════════════════════════════════════════════════╝
+
+┌────────────────────────────────────────────────────────────────────────────────┐
+│ 1. FULL HISTORY (Current MeshAI Implementation)                               │
+├────────────────────────────────────────────────────────────────────────────────┤
+│                                                                                │
+│  Request 1:  [System] + [Msg1, Msg2]                    = 200 tokens          │
+│  Request 5:  [System] + [Msg1...Msg10]                  = 1000 tokens         │
+│  Request 10: [System] + [Msg1...Msg20]                  = 2000 tokens         │
+│  Request 20: [System] + [Msg1...Msg40]                  = 4000 tokens         │
+│                                                                                │
+│  ✓ Complete context                                                           │
+│  ✗ Linear growth in tokens                                                    │
+│  ✗ Expensive and slow for long conversations                                  │
+│  ✗ Redundant - most messages not relevant to current query                    │
+│                                                                                │
+└────────────────────────────────────────────────────────────────────────────────┘
+
+┌────────────────────────────────────────────────────────────────────────────────┐
+│ 2. WINDOW MEMORY (Keep Last N Only)                                           │
+├────────────────────────────────────────────────────────────────────────────────┤
+│                                                                                │
+│  Request 1:  [System] + [Msg1, Msg2]                    = 200 tokens          │
+│  Request 5:  [System] + [Msg7, Msg8, Msg9, Msg10]       = 500 tokens          │
+│  Request 10: [System] + [Msg17, Msg18, Msg19, Msg20]    = 500 tokens          │
+│  Request 20: [System] + [Msg37, Msg38, Msg39, Msg40]    = 500 tokens          │
+│                                                                                │
+│  ✓ Constant token usage                                                       │
+│  ✓ Very fast and cheap                                                        │
+│  ✗ Completely forgets old context                                             │
+│  ✗ Can't reference earlier conversation                                       │
+│                                                                                │
+└────────────────────────────────────────────────────────────────────────────────┘
+
+┌────────────────────────────────────────────────────────────────────────────────┐
+│ 3. ROLLING SUMMARY (RECOMMENDED)                                              │
+├────────────────────────────────────────────────────────────────────────────────┤
+│                                                                                │
+│  Request 1-5:  [System] + [Msg1...Msg10]                = 1000 tokens         │
+│                (Short conversation - no summary yet)                           │
+│                                                                                │
+│  Request 10+:  [System + Summary] + [Recent 8 msgs]     = 600 tokens          │
+│                                                                                │
+│                ┌─────────────────────────────────────┐                         │
+│                │ Summary: "User discussed weather    │                         │
+│                │ and hiking. Mt Si is 4hr moderate   │                         │
+│                │ hike, Rattlesnake is 2mi easier."   │  (100 tokens)          │
+│                └─────────────────────────────────────┘                         │
+│                           ↓                                                    │
+│                ┌─────────────────────────────────────┐                         │
+│                │ User: How crowded does it get?      │                         │
+│                │ Assistant: Very crowded weekends    │                         │
+│                │ User: Any other trails nearby?      │  (400 tokens)          │
+│                │ Assistant: Rattlesnake is closer    │                         │
+│                │ ... (last 4 exchanges)              │                         │
+│                └─────────────────────────────────────┘                         │
+│                                                                                │
+│  Request 20:   [System + Summary] + [Recent 8 msgs]     = 600 tokens          │
+│                (Summary updated every ~8 new messages)                         │
+│                                                                                │
+│  ✓ Balanced token usage (70-80% reduction)                                    │
+│  ✓ Preserves long-term context via summary                                    │
+│  ✓ Recent messages in full detail                                             │
+│  ✓ Scalable to very long conversations                                        │
+│  ✗ Small overhead for summary generation (1-2s every 8-10 msgs)               │
+│                                                                                │
+└────────────────────────────────────────────────────────────────────────────────┘
+
+┌────────────────────────────────────────────────────────────────────────────────┐
+│ 4. VECTOR STORE MEMORY (ChromaDB/Qdrant)                                      │
+├────────────────────────────────────────────────────────────────────────────────┤
+│                                                                                │
+│  Current query: "What trails are nearby?"                                     │
+│                     ↓ (embed and search)                                      │
+│  ┌──────────────────────────────────────────────────────────────────┐         │
+│  │ Vector DB: Find semantically similar past messages               │         │
+│  │  - "Mt Si is a moderate 4-hour hike" (score: 0.89)               │         │
+│  │  - "Rattlesnake Ledge has lake views" (score: 0.85)              │         │
+│  │  - "Bring water and snacks" (score: 0.62)                        │         │
+│  └──────────────────────────────────────────────────────────────────┘         │
+│                     ↓                                                          │
+│  [System + Top 3 relevant] + [Current query]             = 500 tokens         │
+│                                                                                │
+│  ✓ Semantic retrieval - finds relevant context                                │
+│  ✓ Works for sparse conversations                                             │
+│  ✓ Enables cross-conversation search                                          │
+│  ✗ Requires embeddings (API calls or local model)                             │
+│  ✗ Adds complexity (vector DB, indexing)                                      │
+│  ✗ May retrieve irrelevant "similar" messages                                 │
+│                                                                                │
+└────────────────────────────────────────────────────────────────────────────────┘
+
+┌────────────────────────────────────────────────────────────────────────────────┐
+│ 5. MEMGPT/LETTA (Self-Editing Memory)                                         │
+├────────────────────────────────────────────────────────────────────────────────┤
+│                                                                                │
+│  ┌───────────────────────────────────┐                                        │
+│  │ Core Memory (always in context):  │                                        │
+│  │  - User: Matt                     │  (50 tokens)                           │
+│  │  - Preferences: Metric units      │                                        │
+│  └───────────────────────────────────┘                                        │
+│                ↓                                                               │
+│  ┌───────────────────────────────────┐                                        │
+│  │ Recall Memory (vector search):    │                                        │
+│  │  - [Retrieved: 3 relevant msgs]   │  (300 tokens)                          │
+│  └───────────────────────────────────┘                                        │
+│                ↓                                                               │
+│  ┌───────────────────────────────────┐                                        │
+│  │ Archival Memory (long-term):      │                                        │
+│  │  - [Searchable but not loaded]    │                                        │
+│  └───────────────────────────────────┘                                        │
+│                                                                                │
+│  Agent decides what to remember/forget/search                                 │
+│                                                                                │
+│  ✓ Most sophisticated - agent manages own memory                              │
+│  ✓ Handles complex multi-day conversations                                    │
+│  ✗ Very heavy (200MB+ dependencies)                                           │
+│  ✗ Requires vector embeddings                                                 │
+│  ✗ Overkill for simple chat                                                   │
+│  ✗ Opinionated architecture (hard to integrate)                               │
+│                                                                                │
+└────────────────────────────────────────────────────────────────────────────────┘
+
+╔════════════════════════════════════════════════════════════════════════════════╗
+║                         RECOMMENDATION MATRIX                                  ║
+╚════════════════════════════════════════════════════════════════════════════════╝
+
+┌──────────────┬──────────────┬────────────┬──────────────┬──────────────────────┐
+│   Approach   │ Dependencies │   Tokens   │  Complexity  │    Use Case          │
+├──────────────┼──────────────┼────────────┼──────────────┼──────────────────────┤
+│ Full History │     None     │    High    │     Low      │ Don't use (baseline) │
+├──────────────┼──────────────┼────────────┼──────────────┼──────────────────────┤
+│ Window Only  │     None     │    Low     │     Low      │ Stateless chat bots  │
+├──────────────┼──────────────┼────────────┼──────────────┼──────────────────────┤
+│ Rolling      │              │            │              │ ✓ MESHAI             │
+│ Summary      │     None     │ Very Low   │     Low      │ ✓ Most projects      │
+│ (DIY)        │              │            │              │ ✓ Best balance       │
+├──────────────┼──────────────┼────────────┼──────────────┼──────────────────────┤
+│ LangChain    │   ~50 MB     │ Very Low   │    Medium    │ Want batteries-      │
+│ Summary      │              │            │              │ included solution    │
+├──────────────┼──────────────┼────────────┼──────────────┼──────────────────────┤
+│ Vector Store │   ~20 MB     │    Low     │    Medium    │ Semantic search,     │
+│ (ChromaDB)   │              │            │              │ long-term memory     │
+├──────────────┼──────────────┼────────────┼──────────────┼──────────────────────┤
+│ MemGPT/Letta │  ~200 MB     │    Low     │  Very High   │ Complex multi-day    │
+│              │              │            │              │ agent workflows      │
+└──────────────┴──────────────┴────────────┴──────────────┴──────────────────────┘
+
+╔════════════════════════════════════════════════════════════════════════════════╗
+║                     PERFORMANCE COMPARISON (20 messages)                       ║
+╚════════════════════════════════════════════════════════════════════════════════╝
+
+  Tokens Sent to LLM
+  ↑
+  │
+4000│  ████████████████████████████████  Full History
+  │
+3000│
+  │
+2000│
+  │
+1000│
+  │
+ 600│           ██████  Rolling Summary
+ 500│                   █████  Window Only
+  │                    █████  Vector Store
+  0└─────────────────────────────────────────────────────────→
+     1    5   10   15   20   25   30   35   40  (Conversation length)
+
+  Legend:
+  ████  Full History (linear growth)
+  ████  Rolling Summary (plateau after initial growth)
+  ████  Window/Vector (constant)
+
+
+╔════════════════════════════════════════════════════════════════════════════════╗
+║                    IMPLEMENTATION COMPLEXITY                                   ║
+╚════════════════════════════════════════════════════════════════════════════════╝
+
+┌─────────────────────────────────────────────────────────────────────────────┐
+│  Simple ←───────────────────────────────────────────────────→ Complex       │
+├─────────────────────────────────────────────────────────────────────────────┤
+│                                                                             │
+│  Window Only          Rolling Summary       LangChain        MemGPT        │
+│  (20 lines)           (100 lines)           (10 lines       (200+ lines    │
+│                                             + 50MB dep)      + 200MB dep)   │
+│                                                                             │
+│  ↑                    ↑                     ↑                ↑              │
+│  No deps              No deps               Heavy deps       Very heavy     │
+│  No persistence       SQLite persist        In-memory        Built-in DB    │
+│  Loses old context    Keeps summary         Keeps summary    Multi-tier     │
+│                                                                             │
+│                       ★ RECOMMENDED ★                                       │
+└─────────────────────────────────────────────────────────────────────────────┘
+
+╔════════════════════════════════════════════════════════════════════════════════╗
+║                      FOR MESHAI SPECIFICALLY                                   ║
+╚════════════════════════════════════════════════════════════════════════════════╝
+
+Current:
+  - Messages: 150 chars max (very small)
+  - Conversations: Per-user, linear
+  - Backend: OpenAI-compatible (LiteLLM, local models)
+  - Storage: SQLite + aiosqlite
+  - Problem: Full history sent every time
+
+Constraints:
+  - Lightweight (runs on mesh nodes potentially)
+  - No heavy dependencies
+  - Must work offline (local models)
+  - Persistence required (survive restarts)
+
+Solution: Rolling Summary
+  ✓ Zero dependencies (pure Python)
+  ✓ Works with existing AsyncOpenAI client
+  ✓ Persists in existing SQLite database
+  ✓ ~100 lines of code (easy to maintain)
+  ✓ 70-80% token reduction
+  ✓ Tunable (window_size, summarize_threshold)
+
+Configuration:
+  - window_size = 4 (keep last 4 exchanges = 8 messages)
+  - summarize_threshold = 8 (re-summarize after 8 new messages)
+
+Expected savings:
+  - 10 messages: 0% (no summary yet)
+  - 20 messages: 66% token reduction
+  - 30 messages: 75% token reduction
+  - 50 messages: 84% token reduction
+
+Cost impact (at $0.50/1M tokens):
+  - Before: $0.0012 per request (2400 tokens)
+  - After:  $0.0003 per request (600 tokens)
+  - Savings: $27/month for 1000 requests/day
+
+╔════════════════════════════════════════════════════════════════════════════════╗
+║                              NEXT STEPS                                        ║
+╚════════════════════════════════════════════════════════════════════════════════╝
+
+1. Read:   MEMORY_SUMMARY.md (quick overview)
+2. Study:  MEMORY_RESEARCH.md (detailed analysis)
+3. Test:   python examples/memory_comparison.py (see it in action)
+4. Build:  MEMORY_IMPLEMENTATION_GUIDE.md (step-by-step)
+5. Deploy: Monitor and tune based on real usage
+
+Files created:
+  - /home/zvx/projects/meshai/MEMORY_SUMMARY.md
+  - /home/zvx/projects/meshai/MEMORY_RESEARCH.md
+  - /home/zvx/projects/meshai/MEMORY_IMPLEMENTATION_GUIDE.md
+  - /home/zvx/projects/meshai/examples/memory_comparison.py
+
+Good luck! 🚀