mirror of
https://github.com/zvx-echo6/meshai.git
synced 2026-05-21 23:24:44 +02:00
Initial commit: MeshAI - LLM-powered Meshtastic assistant
Features: - Multi-backend LLM support (OpenAI, Anthropic, Google) - Rolling summary memory for token optimization (~70-80% reduction) - Per-user conversation history with SQLite persistence - Bang commands (!help, !ping, !reset, !status, !weather) - Meshtastic integration via serial or TCP - Message chunking for mesh network constraints (150 char limit) - Rate limiting to prevent network congestion - Rich TUI configurator - Docker support 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit is contained in:
commit
fd3f995ebb
43 changed files with 7947 additions and 0 deletions
254
docs/memory_approaches_comparison.txt
Normal file
254
docs/memory_approaches_comparison.txt
Normal file
|
|
@ -0,0 +1,254 @@
|
|||
╔════════════════════════════════════════════════════════════════════════════════╗
|
||||
║ LLM MEMORY APPROACHES COMPARISON ║
|
||||
╚════════════════════════════════════════════════════════════════════════════════╝
|
||||
|
||||
┌────────────────────────────────────────────────────────────────────────────────┐
|
||||
│ 1. FULL HISTORY (Current MeshAI Implementation) │
|
||||
├────────────────────────────────────────────────────────────────────────────────┤
|
||||
│ │
|
||||
│ Request 1: [System] + [Msg1, Msg2] = 200 tokens │
|
||||
│ Request 5: [System] + [Msg1...Msg10] = 1000 tokens │
|
||||
│ Request 10: [System] + [Msg1...Msg20] = 2000 tokens │
|
||||
│ Request 20: [System] + [Msg1...Msg40] = 4000 tokens │
|
||||
│ │
|
||||
│ ✓ Complete context │
|
||||
│ ✗ Linear growth in tokens │
|
||||
│ ✗ Expensive and slow for long conversations │
|
||||
│ ✗ Redundant - most messages not relevant to current query │
|
||||
│ │
|
||||
└────────────────────────────────────────────────────────────────────────────────┘
|
||||
|
||||
┌────────────────────────────────────────────────────────────────────────────────┐
|
||||
│ 2. WINDOW MEMORY (Keep Last N Only) │
|
||||
├────────────────────────────────────────────────────────────────────────────────┤
|
||||
│ │
|
||||
│ Request 1: [System] + [Msg1, Msg2] = 200 tokens │
|
||||
│ Request 5: [System] + [Msg7, Msg8, Msg9, Msg10] = 500 tokens │
|
||||
│ Request 10: [System] + [Msg17, Msg18, Msg19, Msg20] = 500 tokens │
|
||||
│ Request 20: [System] + [Msg37, Msg38, Msg39, Msg40] = 500 tokens │
|
||||
│ │
|
||||
│ ✓ Constant token usage │
|
||||
│ ✓ Very fast and cheap │
|
||||
│ ✗ Completely forgets old context │
|
||||
│ ✗ Can't reference earlier conversation │
|
||||
│ │
|
||||
└────────────────────────────────────────────────────────────────────────────────┘
|
||||
|
||||
┌────────────────────────────────────────────────────────────────────────────────┐
|
||||
│ 3. ROLLING SUMMARY (RECOMMENDED) │
|
||||
├────────────────────────────────────────────────────────────────────────────────┤
|
||||
│ │
|
||||
│ Request 1-5: [System] + [Msg1...Msg10] = 1000 tokens │
|
||||
│ (Short conversation - no summary yet) │
|
||||
│ │
|
||||
│ Request 10+: [System + Summary] + [Recent 8 msgs] = 600 tokens │
|
||||
│ │
|
||||
│ ┌─────────────────────────────────────┐ │
|
||||
│ │ Summary: "User discussed weather │ │
|
||||
│ │ and hiking. Mt Si is 4hr moderate │ │
|
||||
│ │ hike, Rattlesnake is 2mi easier." │ (100 tokens) │
|
||||
│ └─────────────────────────────────────┘ │
|
||||
│ ↓ │
|
||||
│ ┌─────────────────────────────────────┐ │
|
||||
│ │ User: How crowded does it get? │ │
|
||||
│ │ Assistant: Very crowded weekends │ │
|
||||
│ │ User: Any other trails nearby? │ (400 tokens) │
|
||||
│ │ Assistant: Rattlesnake is closer │ │
|
||||
│ │ ... (last 4 exchanges) │ │
|
||||
│ └─────────────────────────────────────┘ │
|
||||
│ │
|
||||
│ Request 20: [System + Summary] + [Recent 8 msgs] = 600 tokens │
|
||||
│ (Summary updated every ~8 new messages) │
|
||||
│ │
|
||||
│ ✓ Balanced token usage (70-80% reduction) │
|
||||
│ ✓ Preserves long-term context via summary │
|
||||
│ ✓ Recent messages in full detail │
|
||||
│ ✓ Scalable to very long conversations │
|
||||
│ ✗ Small overhead for summary generation (1-2s every 8-10 msgs) │
|
||||
│ │
|
||||
└────────────────────────────────────────────────────────────────────────────────┘
|
||||
|
||||
┌────────────────────────────────────────────────────────────────────────────────┐
|
||||
│ 4. VECTOR STORE MEMORY (ChromaDB/Qdrant) │
|
||||
├────────────────────────────────────────────────────────────────────────────────┤
|
||||
│ │
|
||||
│ Current query: "What trails are nearby?" │
|
||||
│ ↓ (embed and search) │
|
||||
│ ┌──────────────────────────────────────────────────────────────────┐ │
|
||||
│ │ Vector DB: Find semantically similar past messages │ │
|
||||
│ │ - "Mt Si is a moderate 4-hour hike" (score: 0.89) │ │
|
||||
│ │ - "Rattlesnake Ledge has lake views" (score: 0.85) │ │
|
||||
│ │ - "Bring water and snacks" (score: 0.62) │ │
|
||||
│ └──────────────────────────────────────────────────────────────────┘ │
|
||||
│ ↓ │
|
||||
│ [System + Top 3 relevant] + [Current query] = 500 tokens │
|
||||
│ │
|
||||
│ ✓ Semantic retrieval - finds relevant context │
|
||||
│ ✓ Works for sparse conversations │
|
||||
│ ✓ Enables cross-conversation search │
|
||||
│ ✗ Requires embeddings (API calls or local model) │
|
||||
│ ✗ Adds complexity (vector DB, indexing) │
|
||||
│ ✗ May retrieve irrelevant "similar" messages │
|
||||
│ │
|
||||
└────────────────────────────────────────────────────────────────────────────────┘
|
||||
|
||||
┌────────────────────────────────────────────────────────────────────────────────┐
|
||||
│ 5. MEMGPT/LETTA (Self-Editing Memory) │
|
||||
├────────────────────────────────────────────────────────────────────────────────┤
|
||||
│ │
|
||||
│ ┌───────────────────────────────────┐ │
|
||||
│ │ Core Memory (always in context): │ │
|
||||
│ │ - User: Matt │ (50 tokens) │
|
||||
│ │ - Preferences: Metric units │ │
|
||||
│ └───────────────────────────────────┘ │
|
||||
│ ↓ │
|
||||
│ ┌───────────────────────────────────┐ │
|
||||
│ │ Recall Memory (vector search): │ │
|
||||
│ │ - [Retrieved: 3 relevant msgs] │ (300 tokens) │
|
||||
│ └───────────────────────────────────┘ │
|
||||
│ ↓ │
|
||||
│ ┌───────────────────────────────────┐ │
|
||||
│ │ Archival Memory (long-term): │ │
|
||||
│ │ - [Searchable but not loaded] │ │
|
||||
│ └───────────────────────────────────┘ │
|
||||
│ │
|
||||
│ Agent decides what to remember/forget/search │
|
||||
│ │
|
||||
│ ✓ Most sophisticated - agent manages own memory │
|
||||
│ ✓ Handles complex multi-day conversations │
|
||||
│ ✗ Very heavy (200MB+ dependencies) │
|
||||
│ ✗ Requires vector embeddings │
|
||||
│ ✗ Overkill for simple chat │
|
||||
│ ✗ Opinionated architecture (hard to integrate) │
|
||||
│ │
|
||||
└────────────────────────────────────────────────────────────────────────────────┘
|
||||
|
||||
╔════════════════════════════════════════════════════════════════════════════════╗
|
||||
║ RECOMMENDATION MATRIX ║
|
||||
╚════════════════════════════════════════════════════════════════════════════════╝
|
||||
|
||||
┌──────────────┬──────────────┬────────────┬──────────────┬──────────────────────┐
|
||||
│ Approach │ Dependencies │ Tokens │ Complexity │ Use Case │
|
||||
├──────────────┼──────────────┼────────────┼──────────────┼──────────────────────┤
|
||||
│ Full History │ None │ High │ Low │ Don't use (baseline) │
|
||||
├──────────────┼──────────────┼────────────┼──────────────┼──────────────────────┤
|
||||
│ Window Only │ None │ Low │ Low │ Stateless chat bots │
|
||||
├──────────────┼──────────────┼────────────┼──────────────┼──────────────────────┤
|
||||
│ Rolling │ │ │ │ ✓ MESHAI │
|
||||
│ Summary │ None │ Very Low │ Low │ ✓ Most projects │
|
||||
│ (DIY) │ │ │ │ ✓ Best balance │
|
||||
├──────────────┼──────────────┼────────────┼──────────────┼──────────────────────┤
|
||||
│ LangChain │ ~50 MB │ Very Low │ Medium │ Want batteries- │
|
||||
│ Summary │ │ │ │ included solution │
|
||||
├──────────────┼──────────────┼────────────┼──────────────┼──────────────────────┤
|
||||
│ Vector Store │ ~20 MB │ Low │ Medium │ Semantic search, │
|
||||
│ (ChromaDB) │ │ │ │ long-term memory │
|
||||
├──────────────┼──────────────┼────────────┼──────────────┼──────────────────────┤
|
||||
│ MemGPT/Letta │ ~200 MB │ Low │ Very High │ Complex multi-day │
|
||||
│ │ │ │ │ agent workflows │
|
||||
└──────────────┴──────────────┴────────────┴──────────────┴──────────────────────┘
|
||||
|
||||
╔════════════════════════════════════════════════════════════════════════════════╗
|
||||
║ PERFORMANCE COMPARISON (20 messages) ║
|
||||
╚════════════════════════════════════════════════════════════════════════════════╝
|
||||
|
||||
Tokens Sent to LLM
|
||||
↑
|
||||
│
|
||||
4000│ ████████████████████████████████ Full History
|
||||
│
|
||||
3000│
|
||||
│
|
||||
2000│
|
||||
│
|
||||
1000│
|
||||
│
|
||||
600│ ██████ Rolling Summary
|
||||
500│ █████ Window Only
|
||||
│ █████ Vector Store
|
||||
0└─────────────────────────────────────────────────────────→
|
||||
1 5 10 15 20 25 30 35 40 (Conversation length)
|
||||
|
||||
Legend:
|
||||
████ Full History (linear growth)
|
||||
████ Rolling Summary (plateau after initial growth)
|
||||
████ Window/Vector (constant)
|
||||
|
||||
|
||||
╔════════════════════════════════════════════════════════════════════════════════╗
|
||||
║ IMPLEMENTATION COMPLEXITY ║
|
||||
╚════════════════════════════════════════════════════════════════════════════════╝
|
||||
|
||||
┌─────────────────────────────────────────────────────────────────────────────┐
|
||||
│ Simple ←───────────────────────────────────────────────────→ Complex │
|
||||
├─────────────────────────────────────────────────────────────────────────────┤
|
||||
│ │
|
||||
│ Window Only Rolling Summary LangChain MemGPT │
|
||||
│ (20 lines) (100 lines) (10 lines (200+ lines │
|
||||
│ + 50MB dep) + 200MB dep) │
|
||||
│ │
|
||||
│ ↑ ↑ ↑ ↑ │
|
||||
│ No deps No deps Heavy deps Very heavy │
|
||||
│ No persistence SQLite persist In-memory Built-in DB │
|
||||
│ Loses old context Keeps summary Keeps summary Multi-tier │
|
||||
│ │
|
||||
│ ★ RECOMMENDED ★ │
|
||||
└─────────────────────────────────────────────────────────────────────────────┘
|
||||
|
||||
╔════════════════════════════════════════════════════════════════════════════════╗
|
||||
║ FOR MESHAI SPECIFICALLY ║
|
||||
╚════════════════════════════════════════════════════════════════════════════════╝
|
||||
|
||||
Current:
|
||||
- Messages: 150 chars max (very small)
|
||||
- Conversations: Per-user, linear
|
||||
- Backend: OpenAI-compatible (LiteLLM, local models)
|
||||
- Storage: SQLite + aiosqlite
|
||||
- Problem: Full history sent every time
|
||||
|
||||
Constraints:
|
||||
- Lightweight (runs on mesh nodes potentially)
|
||||
- No heavy dependencies
|
||||
- Must work offline (local models)
|
||||
- Persistence required (survive restarts)
|
||||
|
||||
Solution: Rolling Summary
|
||||
✓ Zero dependencies (pure Python)
|
||||
✓ Works with existing AsyncOpenAI client
|
||||
✓ Persists in existing SQLite database
|
||||
✓ ~100 lines of code (easy to maintain)
|
||||
✓ 70-80% token reduction
|
||||
✓ Tunable (window_size, summarize_threshold)
|
||||
|
||||
Configuration:
|
||||
- window_size = 4 (keep last 4 exchanges = 8 messages)
|
||||
- summarize_threshold = 8 (re-summarize after 8 new messages)
|
||||
|
||||
Expected savings:
|
||||
- 10 messages: 0% (no summary yet)
|
||||
- 20 messages: 66% token reduction
|
||||
- 30 messages: 75% token reduction
|
||||
- 50 messages: 84% token reduction
|
||||
|
||||
Cost impact (at $0.50/1M tokens):
|
||||
- Before: $0.0012 per request (2400 tokens)
|
||||
- After: $0.0003 per request (600 tokens)
|
||||
- Savings: $27/month for 1000 requests/day
|
||||
|
||||
╔════════════════════════════════════════════════════════════════════════════════╗
|
||||
║ NEXT STEPS ║
|
||||
╚════════════════════════════════════════════════════════════════════════════════╝
|
||||
|
||||
1. Read: MEMORY_SUMMARY.md (quick overview)
|
||||
2. Study: MEMORY_RESEARCH.md (detailed analysis)
|
||||
3. Test: python examples/memory_comparison.py (see it in action)
|
||||
4. Build: MEMORY_IMPLEMENTATION_GUIDE.md (step-by-step)
|
||||
5. Deploy: Monitor and tune based on real usage
|
||||
|
||||
Files created:
|
||||
- /home/zvx/projects/meshai/MEMORY_SUMMARY.md
|
||||
- /home/zvx/projects/meshai/MEMORY_RESEARCH.md
|
||||
- /home/zvx/projects/meshai/MEMORY_IMPLEMENTATION_GUIDE.md
|
||||
- /home/zvx/projects/meshai/examples/memory_comparison.py
|
||||
|
||||
Good luck! 🚀
|
||||
Loading…
Add table
Add a link
Reference in a new issue