feat: Hybrid RAG knowledge base, sentence-aware chunking, MeshMonitor HTTP sync

Knowledge Base:
- Hybrid FTS5 + vector search using sqlite-vec and bge-small-en-v1.5
- Reciprocal Rank Fusion for result merging
- Domain-aware query construction handles typos
- Configurable weights for keyword vs semantic matching

Message Chunking:
- Sentence-aware splitting respects message boundaries
- Continuation prompts for long responses
- Natural follow-up detection (yes, ok, continue, more, etc.)
- Per-user continuation state management

MeshMonitor Integration:
- HTTP API trigger sync (replaces file-based triggers.json)
- Dynamic refresh interval
- Trigger injection into LLM prompt

Other:
- Updated system prompt for better response length control
- Simplified responder to handle message lists
- Updated README with new features and architecture diagram
- Cleaned up config.example.yaml

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
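
For readers unfamiliar with Reciprocal Rank Fusion, the merging step named in the commit message can be sketched roughly as follows. This is an illustration only, not the code from this commit: the function name `reciprocal_rank_fusion`, the constant `k=60` (a commonly used default), and the way `fts_weight` and `vector_weight` scale each list's contribution are all assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(
    fts_ids: list[str],
    vector_ids: list[str],
    fts_weight: float = 0.5,
    vector_weight: float = 0.5,
    k: int = 60,       # assumed smoothing constant, not taken from the commit
    top_k: int = 5,
) -> list[str]:
    """Merge two ranked ID lists; each hit adds weight / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for weight, ranking in ((fts_weight, fts_ids), (vector_weight, vector_ids)):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (k + rank)
    # Highest fused score first; ties are broken by ID so output is stable
    ordered = sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
    return ordered[:top_k]


if __name__ == "__main__":
    keyword_hits = ["a", "b", "c"]   # hypothetical FTS5 ranking
    semantic_hits = ["b", "d", "a"]  # hypothetical vector ranking
    print(reciprocal_rank_fusion(keyword_hits, semantic_hits))
    # prints ['b', 'a', 'd', 'c']
```

Chunks that rank well in both lists rise to the top, which is why `b` beats `a` here even though `a` is the best keyword hit. Setting one weight to 0 reduces the ordering to the other list's ranking.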
2026-05-21 15:14:45 +02:00 · 2026-05-04 07:44:12 +00:00 · 2026-05-04 07:44:12 +00:00 · 0e36869a5f
commit 0e36869a5f
parent e65a558c6f
14 changed files with 986 additions and 464 deletions
--- a/2
+++ b/2
@ -58,6 +58,8 @@ WORKDIR /app
 # Copy requirements first for layer caching
 COPY requirements.txt .
 RUN pip install --no-cache-dir -r requirements.txt
+# Pre-download embedding model for hybrid search
+RUN python3 -c "from fastembed import TextEmbedding; TextEmbedding('BAAI/bge-small-en-v1.5')"

 # Copy application code
 COPY --chown=meshai:meshai meshai/ ./meshai/
--- a/README.md
+++ b/README.md
@ -1,268 +1,329 @@
-# MeshAI
-
-LLM-powered assistant for Meshtastic mesh networks.
-
-## Features
-
-- **LLM Chat**: Responds to @mentions and DMs with AI-generated responses
-- **Multi-backend**: Supports OpenAI, Anthropic Claude, Google Gemini, and local LLMs via LiteLLM
-- **Bang Commands**: `!help`, `!ping`, `!reset`, `!status`, `!weather`
-- **Conversation History**: Per-user context maintained in SQLite
-- **Smart Chunking**: Automatically splits long responses for mesh transmission
-- **Rate Limiting**: Configurable delays to avoid flooding the mesh
-- **advBBS Compatible**: Runs alongside [advBBS](https://github.com/NovaNexusMesh/advBBS) on the same node — protocol sync messages and mail notifications are automatically filtered
-- **Rich Configurator**: Interactive TUI for easy setup
-- **MeshMonitor Integration**: Syncs with [MeshMonitor](https://github.com/Yeraze/meshmonitor) by Yeraze to avoid duplicate responses
-
-## Installation
-
-```bash
-# Clone the repository
-git clone https://github.com/zvx-echo6/meshai.git
-cd meshai
-
-# Install with pip
-pip install -e .
-
-# Or install dependencies manually
-pip install -r requirements.txt
-```
-
-## Quick Start
-
-```bash
-# Run the configurator
-meshai --config
-
-# Or copy and edit the example config
-cp config.example.yaml config.yaml
-# Edit config.yaml with your settings
-
-# Run the bot
-meshai
-```
-
-## Configuration
-
-Run `meshai --config` to launch the interactive configurator, or edit `config.yaml` directly.
-
-### Key Settings
-
-```yaml
-bot:
-  name: "ai"                    # @mention trigger
-  respond_to_mentions: true
-  respond_to_dms: true
-
-connection:
-  type: "serial"               # serial or tcp
-  serial_port: "/dev/ttyUSB0"
-
-llm:
-  backend: "openai"            # openai, anthropic, google
-  api_key: "your-api-key"
-  model: "gpt-4o-mini"
-```
-
-### Using Local LLMs
-
-MeshAI works with any OpenAI-compatible API, including:
-
-- **LiteLLM**: `base_url: "http://localhost:4000/v1"`
-- **Open WebUI**: `base_url: "http://localhost:3000/api"`
-- **Ollama**: `base_url: "http://localhost:11434/v1"`
-
-## Commands
-
-| Command | Description |
-|---------|-------------|
-| `!help` | Show available commands |
-| `!ping` | Test connectivity |
-| `!reset` | Clear your conversation history |
-| `!status` | Show bot status and stats |
-| `!weather [location]` | Get weather (uses GPS if no location given) |
-
-## Usage Examples
-
-**Chat via @mention:**
-```
-@ai What's the weather like today?
-> Seattle: 52F, Partly Cloudy, Wind 8mph
-```
-
-**Direct message:**
-```
-DM: Tell me a short joke
-> Why don't scientists trust atoms? They make up everything!
-```
-
-**Weather command:**
-```
-!weather Portland
-> Portland: 48F, Rain, Wind 12mph
-```
-
-## Architecture
-
-```
-┌─────────────────────────────────────────────────────────────┐
-│                        MeshAI                                │
-├─────────────────────────────────────────────────────────────┤
-│  ┌─────────────┐    ┌─────────────┐    ┌─────────────────┐ │
-│  │  Meshtastic │    │   Message   │    │   LLM Backend   │ │
-│  │  Connector  │───▶│   Router    │───▶│   (pluggable)   │ │
-│  │ Serial/TCP  │    │             │    │                 │ │
-│  └─────────────┘    └─────────────┘    └─────────────────┘ │
-│         │                 │                    │            │
-│         │           ┌─────▼─────┐              │            │
-│         │           │ Conversation│             │            │
-│         │           │  History   │◀────────────┘            │
-│         │           │  (SQLite)  │                          │
-│         │           └───────────┘                           │
-│         │                                                   │
-│         ▼                                                   │
-│  ┌─────────────┐                                           │
-│  │  Responder  │  - 2.2-3s delay                           │
-│  │             │  - Chunk to 150 chars                     │
-│  │             │  - Max 2 messages                         │
-│  └─────────────┘                                           │
-└─────────────────────────────────────────────────────────────┘
-```
-
-## Docker
-
-### Quick Start with Docker
-
-```bash
-# Create working directory
-mkdir -p meshai/data && cd meshai
-
-# Download docker-compose file
-curl -O https://raw.githubusercontent.com/zvx-echo6/meshai/main/docker-compose.yml
-
-# Copy and edit config
-curl -o data/config.yaml https://raw.githubusercontent.com/zvx-echo6/meshai/main/config.example.yaml
-# Edit data/config.yaml with your settings
-
-# Start
-docker compose up -d
-
-# View logs
-docker compose logs -f
-```
-
-### Docker Configuration
-
-**TCP Connection** (recommended for Docker):
-```yaml
-# data/config.yaml
-connection:
-  type: "tcp"
-  tcp_host: "192.168.1.100"  # Your Meshtastic node IP
-  tcp_port: 4403
-```
-
-**Serial Connection**:
-```yaml
-# data/config.yaml
-connection:
-  type: "serial"
-  serial_port: "/dev/ttyUSB0"
-```
-
-Then edit `docker-compose.serial.yml` to match your device path.
-
-### Environment Variables
-
-You can pass the API key via environment variable instead of config file:
-
-```bash
-LLM_API_KEY=your-key-here docker compose up -d
-```
-
-Or create a `.env` file:
-```bash
-LLM_API_KEY=your-key-here
-```
-
-### View Logs
-
-```bash
-docker compose logs -f meshai
-```
-
-## Running Alongside advBBS
-
-MeshAI is designed to coexist with [advBBS](https://github.com/NovaNexusMesh/advBBS) on the same Meshtastic node. Both connect via TCP to meshtasticd and share the radio, but MeshAI automatically ignores advBBS traffic:
-
-- **Sync protocol** — `MAILREQ|`, `MAILACK|`, `MAILDAT|`, `BOARDREQ|`, etc.
-- **RAP protocol** — `advBBS|` pings, pongs, and route advertisements
-- **Mail notifications** — `[MAIL]` new message alerts
-- **Bang commands in DMs** — `!mail`, `!board`, etc. are left for advBBS to handle
-
-No special configuration is needed. The filter is enabled by default and can be toggled in `config.yaml`:
-
-```yaml
-bot:
-  filter_bbs_protocols: true   # set false to disable
-```
-
-Plain-text BBS responses (e.g. "Welcome back, matt!") are indistinguishable from normal user messages and will be processed normally — this is a known and accepted limitation.
-
-## MeshMonitor Integration
-
-MeshAI integrates with [MeshMonitor](https://github.com/Yeraze/meshmonitor), a comprehensive Meshtastic monitoring platform by Yeraze. When enabled, MeshAI automatically fetches MeshMonitor's auto-responder trigger patterns and ignores messages that MeshMonitor handles, preventing duplicate responses on the mesh.
-
-**Features:**
-- Automatic trigger discovery via MeshMonitor's API
-- Dynamic ignore list — no manual sync needed
-- Trigger list injected into the LLM prompt so MeshAI can discuss MeshMonitor commands conversationally
-- Configurable via TUI (option 9) or config.yaml
-
-**Configuration:**
-
-```yaml
-meshmonitor:
-  enabled: true
-  url: "http://192.168.1.100:8080"
-  inject_into_prompt: true
-  refresh_interval: 300
-```
-
-MeshMonitor is a separate project — get it at https://github.com/Yeraze/meshmonitor
-
-## Running as a Service
-
-Create `/etc/systemd/system/meshai.service`:
-
-```ini
-[Unit]
-Description=MeshAI - Meshtastic LLM Assistant
-After=network.target
-
-[Service]
-Type=simple
-User=your-user
-WorkingDirectory=/path/to/meshai
-ExecStart=/usr/bin/python3 -m meshai
-Restart=always
-RestartSec=10
-
-[Install]
-WantedBy=multi-user.target
-```
-
-Then:
-```bash
-sudo systemctl daemon-reload
-sudo systemctl enable meshai
-sudo systemctl start meshai
-```
-
-## License
-
-MIT License
-
-## Author
-
-K7ZVX - matt@echo6.co
+# MeshAI
+
+LLM-powered assistant for Meshtastic mesh networks.
+
+## Features
+
+- **LLM Chat**: Responds to @mentions and DMs with AI-generated responses
+- **Multi-backend**: Supports OpenAI, Anthropic Claude, Google Gemini, and local LLMs via LiteLLM
+- **Knowledge Base (RAG)**: Hybrid FTS5 + vector search over Meshtastic documentation
+- **Message Chunking**: Sentence-aware splitting with continuation prompts for long responses
+- **Bang Commands**: `!help`, `!ping`, `!reset`, `!status`, `!weather`
+- **Conversation History**: Per-user context maintained in SQLite
+- **Rate Limiting**: Configurable delays to avoid flooding the mesh
+- **advBBS Compatible**: Runs alongside [advBBS](https://github.com/NovaNexusMesh/advBBS) on the same node — protocol sync messages and mail notifications are automatically filtered
+- **Rich Configurator**: Interactive TUI for easy setup
+- **MeshMonitor Integration**: Syncs with [MeshMonitor](https://github.com/Yeraze/meshmonitor) by Yeraze to avoid duplicate responses
+
+## Installation
+
+```bash
+# Clone the repository
+git clone https://github.com/zvx-echo6/meshai.git
+cd meshai
+
+# Install with pip
+pip install -e .
+
+# Or install dependencies manually
+pip install -r requirements.txt
+```
+
+## Quick Start
+
+```bash
+# Run the configurator
+meshai --config
+
+# Or copy and edit the example config
+cp config.example.yaml config.yaml
+# Edit config.yaml with your settings
+
+# Run the bot
+meshai
+```
+
+## Configuration
+
+Run `meshai --config` to launch the interactive configurator, or edit `config.yaml` directly.
+
+### Key Settings
+
+```yaml
+bot:
+  name: "ai"                    # @mention trigger
+  respond_to_mentions: true
+  respond_to_dms: true
+
+connection:
+  type: "serial"               # serial or tcp
+  serial_port: "/dev/ttyUSB0"
+
+llm:
+  backend: "openai"            # openai, anthropic, google
+  api_key: "your-api-key"
+  model: "gpt-4o-mini"
+```
+
+### Using Local LLMs
+
+MeshAI works with any OpenAI-compatible API, including:
+
+- **LiteLLM**: `base_url: "http://localhost:4000/v1"`
+- **Open WebUI**: `base_url: "http://localhost:3000/api"`
+- **Ollama**: `base_url: "http://localhost:11434/v1"`
+
+## Commands
+
+| Command | Description |
+|---------|-------------|
+| `!help` | Show available commands |
+| `!ping` | Test connectivity |
+| `!reset` | Clear your conversation history |
+| `!status` | Show bot status and stats |
+| `!weather [location]` | Get weather (uses GPS if no location given) |
+
+## Usage Examples
+
+**Chat via @mention:**
+```
+@ai What's the weather like today?
+> Seattle: 52F, Partly Cloudy, Wind 8mph
+```
+
+**Direct message:**
+```
+DM: Tell me a short joke
+> Why don't scientists trust atoms? They make up everything!
+```
+
+**Weather command:**
+```
+!weather Portland
+> Portland: 48F, Rain, Wind 12mph
+```
+
+## Architecture
+
+```
+┌──────────────────────────────────────────────────────────────────┐
+│                           MeshAI                                  │
+├──────────────────────────────────────────────────────────────────┤
+│  ┌─────────────┐    ┌─────────────┐    ┌──────────────────────┐ │
+│  │  Meshtastic │    │   Message   │    │    LLM Backend       │ │
+│  │  Connector  │───▶│   Router    │───▶│    (pluggable)       │ │
+│  │ Serial/TCP  │    │             │    │                      │ │
+│  └─────────────┘    └──────┬──────┘    └──────────────────────┘ │
+│         │                  │                     │               │
+│         │           ┌──────▼──────┐              │               │
+│         │           │ Conversation│              │               │
+│         │           │   History   │◀─────────────┘               │
+│         │           │  (SQLite)   │                              │
+│         │           └─────────────┘                              │
+│         │                  │                                     │
+│         │           ┌──────▼──────┐    ┌──────────────────────┐ │
+│         │           │  Knowledge  │───▶│  Hybrid FTS5+Vector  │ │
+│         │           │    Base     │    │  (sqlite-vec + BGE)  │ │
+│         │           └─────────────┘    └──────────────────────┘ │
+│         │                                                        │
+│         ▼                                                        │
+│  ┌─────────────┐    ┌─────────────┐                             │
+│  │  Responder  │───▶│   Chunker   │  Sentence-aware splitting   │
+│  │             │    │             │  + continuation prompts     │
+│  └─────────────┘    └─────────────┘                             │
+└──────────────────────────────────────────────────────────────────┘
+```
+
+## Knowledge Base (RAG)
+
+MeshAI can answer questions using a local knowledge base built from Meshtastic documentation. The system uses hybrid search combining:
+
+- **FTS5 keyword search** — fast exact term matching with domain-aware query construction
+- **Vector embeddings** — semantic similarity using `bge-small-en-v1.5` (384 dimensions)
+- **Reciprocal Rank Fusion** — merges results from both methods for best relevance
+
+**Building the knowledge base:**
+
+```bash
+# Extract from Meshtastic ZIM file
+python scripts/zim_to_knowledge.py meshtastic.zim --output knowledge.db
+
+# Or from markdown files
+python scripts/md_to_knowledge.py docs/ --output knowledge.db
+```
+
+**Configuration:**
+
+```yaml
+knowledge:
+  enabled: true
+  db_path: /data/meshai_knowledge.db
+  top_k: 5              # Number of chunks to retrieve
+  fts_weight: 0.5       # Weight for keyword matches (0-1)
+  vector_weight: 0.5    # Weight for semantic matches (0-1)
+```
+
+The knowledge base requires `sqlite-vec` and `fastembed` (installed automatically with requirements.txt).
+
+## Message Chunking
+
+Long LLM responses are automatically split into mesh-friendly chunks:
+
+- **Sentence-aware** — never splits a sentence across messages
+- **Configurable limits** — max characters per message, max messages per response
+- **Continuation prompts** — if content remains, asks "Want me to keep going?"
+- **Natural follow-ups** — responds to "yes", "ok", "continue", "more", etc.
+
+**Configuration:**
+
+```yaml
+response:
+  max_length: 200       # Max chars per message
+  max_messages: 3       # Messages before continuation prompt
+```
+
+## Docker
+
+### Quick Start with Docker
+
+```bash
+# Create working directory
+mkdir -p meshai/data && cd meshai
+
+# Download docker-compose file
+curl -O https://raw.githubusercontent.com/zvx-echo6/meshai/main/docker-compose.yml
+
+# Copy and edit config
+curl -o data/config.yaml https://raw.githubusercontent.com/zvx-echo6/meshai/main/config.example.yaml
+# Edit data/config.yaml with your settings
+
+# Start
+docker compose up -d
+
+# View logs
+docker compose logs -f
+```
+
+### Docker Configuration
+
+**TCP Connection** (recommended for Docker):
+```yaml
+# data/config.yaml
+connection:
+  type: "tcp"
+  tcp_host: "192.168.1.100"  # Your Meshtastic node IP
+  tcp_port: 4403
+```
+
+**Serial Connection**:
+```yaml
+# data/config.yaml
+connection:
+  type: "serial"
+  serial_port: "/dev/ttyUSB0"
+```
+
+Then edit `docker-compose.serial.yml` to match your device path.
+
+### Environment Variables
+
+You can pass the API key via environment variable instead of config file:
+
+```bash
+LLM_API_KEY=your-key-here docker compose up -d
+```
+
+Or create a `.env` file:
+```bash
+LLM_API_KEY=your-key-here
+```
+
+### View Logs
+
+```bash
+docker compose logs -f meshai
+```
+
+## Running Alongside advBBS
+
+MeshAI is designed to coexist with [advBBS](https://github.com/NovaNexusMesh/advBBS) on the same Meshtastic node. Both connect via TCP to meshtasticd and share the radio, but MeshAI automatically ignores advBBS traffic:
+
+- **Sync protocol** — `MAILREQ|`, `MAILACK|`, `MAILDAT|`, `BOARDREQ|`, etc.
+- **RAP protocol** — `advBBS|` pings, pongs, and route advertisements
+- **Mail notifications** — `[MAIL]` new message alerts
+- **Bang commands in DMs** — `!mail`, `!board`, etc. are left for advBBS to handle
+
+No special configuration is needed. The filter is enabled by default and can be toggled in `config.yaml`:
+
+```yaml
+bot:
+  filter_bbs_protocols: true   # set false to disable
+```
+
+Plain-text BBS responses (e.g. "Welcome back, matt!") are indistinguishable from normal user messages and will be processed normally — this is a known and accepted limitation.
+
+## MeshMonitor Integration
+
+MeshAI integrates with [MeshMonitor](https://github.com/Yeraze/meshmonitor), a comprehensive Meshtastic monitoring platform by Yeraze. When enabled, MeshAI automatically fetches MeshMonitor's auto-responder trigger patterns and ignores messages that MeshMonitor handles, preventing duplicate responses on the mesh.
+
+**Features:**
+- Automatic trigger discovery via MeshMonitor's HTTP API
+- Dynamic ignore list — no manual sync needed
+- Trigger list injected into the LLM prompt so MeshAI can discuss MeshMonitor commands conversationally
+- Configurable via TUI (option 9) or config.yaml
+
+**Configuration:**
+
+```yaml
+meshmonitor:
+  enabled: true
+  url: "http://192.168.1.100:8080"
+  inject_into_prompt: true
+  refresh_interval: 300
+```
+
+MeshMonitor is a separate project — get it at https://github.com/Yeraze/meshmonitor
+
+## Running as a Service
+
+Create `/etc/systemd/system/meshai.service`:
+
+```ini
+[Unit]
+Description=MeshAI - Meshtastic LLM Assistant
+After=network.target
+
+[Service]
+Type=simple
+User=your-user
+WorkingDirectory=/path/to/meshai
+ExecStart=/usr/bin/python3 -m meshai
+Restart=always
+RestartSec=10
+
+[Install]
+WantedBy=multi-user.target
+```
+
+Then:
+```bash
+sudo systemctl daemon-reload
+sudo systemctl enable meshai
+sudo systemctl start meshai
+```
+
+## Acknowledgments
+
+- [Meshtastic](https://meshtastic.org/) — the mesh networking platform
+- [MeshMonitor](https://github.com/Yeraze/meshmonitor) by Yeraze — monitoring integration
+- [advBBS](https://github.com/NovaNexusMesh/advBBS) by NovaNexusMesh — BBS coexistence design
+- [sqlite-vec](https://github.com/asg017/sqlite-vec) by Alex Garcia — vector search in SQLite
+- [fastembed](https://github.com/qdrant/fastembed) by Qdrant — fast local embeddings
+
+## License
+
+MIT License
+
+## Author
+
+K7ZVX - matt@echo6.co
--- a/config.example.yaml
+++ b/config.example.yaml
@ -1,74 +1,84 @@
-# MeshAI Configuration
-# LLM-powered Meshtastic assistant
-#
-# Copy this to config.yaml and customize as needed
-# For Docker: mount as /data/config.yaml
-
-# === BOT IDENTITY ===
-bot:
-  name: ai                       # Bot's display name
-  owner: ""                      # Owner's callsign (optional)
-  respond_to_dms: true           # Respond to direct messages
-  filter_bbs_protocols: true     # Ignore advBBS sync/notification messages
-
-# === MESHTASTIC CONNECTION ===
-connection:
-  type: tcp                      # serial | tcp
-  serial_port: /dev/ttyUSB0      # For serial connection
-  tcp_host: localhost            # For TCP connection (meshtasticd)
-  tcp_port: 4403
-
-# === RESPONSE BEHAVIOR ===
-response:
-  delay_min: 2.2                 # Min delay before responding (seconds)
-  delay_max: 3.0                 # Max delay before responding
-  max_length: 150                # Max chars per message chunk
-  max_messages: 2                # Max message chunks per response
-
-# === CONVERSATION HISTORY ===
-history:
-  database: /data/conversations.db
-  max_messages_per_user: 50      # Messages to keep per user
-  conversation_timeout: 86400    # Conversation expiry (seconds, 86400=24h)
-  auto_cleanup: true             # Auto-delete old conversations
-  cleanup_interval_hours: 24     # How often to run cleanup
-  max_age_days: 30               # Delete conversations older than this
-
-# === MEMORY OPTIMIZATION ===
-memory:
-  enabled: true                  # Enable rolling summary memory
-  window_size: 4                 # Recent message pairs to keep in full
-  summarize_threshold: 8         # Messages before re-summarizing
-
-# === MESH CONTEXT ===
-context:
-  enabled: true                  # Observe channel traffic for LLM context
-  observe_channels: []           # Channel indices to observe (empty = all)
-  ignore_nodes: []               # Node IDs to exclude from observation
-  max_age: 2592000               # Max age in seconds (default 30 days)
-  max_context_items: 20          # Max observations injected into LLM context
-
-# === LLM BACKEND ===
-llm:
-  backend: openai                # openai | anthropic | google
-  api_key: ""                    # API key (or use LLM_API_KEY env var)
-  base_url: https://api.openai.com/v1  # API base URL
-  model: gpt-4o-mini             # Model name
-  timeout: 30                    # Request timeout (seconds)
-  system_prompt: >-
-    You are a helpful assistant on a Meshtastic mesh network.
-    Keep responses VERY brief - under 250 characters total.
-    Be concise but friendly. No markdown formatting.
-  google_grounding: false          # Enable Google Search grounding (Gemini only, $35/1k queries)
-
-# === WEATHER ===
-weather:
-  primary: openmeteo             # openmeteo | wttr | llm
-  fallback: llm                  # openmeteo | wttr | llm | none
-  default_location: ""           # Default location for !weather (optional)
-
-# === MESHMONITOR INTEGRATION ===
-meshmonitor:
-  enabled: false
-  triggers_file: /data/triggers.json
-  inject_into_prompt: true
+# MeshAI Configuration
+# LLM-powered Meshtastic assistant
+#
+# Copy this to config.yaml and customize as needed
+# For Docker: mount as /data/config.yaml
+
+# === BOT IDENTITY ===
+bot:
+  name: ai                       # Bot's display name
+  owner: ""                      # Owner's callsign (optional)
+  respond_to_dms: true           # Respond to direct messages
+  filter_bbs_protocols: true     # Ignore advBBS sync/notification messages
+
+# === MESHTASTIC CONNECTION ===
+connection:
+  type: tcp                      # serial | tcp
+  serial_port: /dev/ttyUSB0      # For serial connection
+  tcp_host: localhost            # For TCP connection (meshtasticd)
+  tcp_port: 4403
+
+# === RESPONSE BEHAVIOR ===
+response:
+  delay_min: 2.2                 # Min delay before responding (seconds)
+  delay_max: 3.0                 # Max delay before responding
+  max_length: 200                # Max chars per message chunk
+  max_messages: 3                # Max message chunks per response
+
+# === CONVERSATION HISTORY ===
+history:
+  database: /data/conversations.db
+  max_messages_per_user: 50      # Messages to keep per user
+  conversation_timeout: 86400    # Conversation expiry (seconds, 86400=24h)
+  auto_cleanup: true             # Auto-delete old conversations
+  cleanup_interval_hours: 24     # How often to run cleanup
+  max_age_days: 30               # Delete conversations older than this
+
+# === MEMORY OPTIMIZATION ===
+memory:
+  enabled: true                  # Enable rolling summary memory
+  window_size: 4                 # Recent message pairs to keep in full
+  summarize_threshold: 8         # Messages before re-summarizing
+
+# === MESH CONTEXT ===
+context:
+  enabled: true                  # Observe channel traffic for LLM context
+  observe_channels: []           # Channel indices to observe (empty = all)
+  ignore_nodes: []               # Node IDs to exclude from observation
+  max_age: 2592000               # Max age in seconds (default 30 days)
+  max_context_items: 20          # Max observations injected into LLM context
+
+# === LLM BACKEND ===
+llm:
+  backend: openai                # openai | anthropic | google
+  api_key: ""                    # API key (or use LLM_API_KEY env var)
+  base_url: https://api.openai.com/v1  # API base URL
+  model: gpt-4o-mini             # Model name
+  timeout: 30                    # Request timeout (seconds)
+  system_prompt: >-
+    You are a helpful assistant on a Meshtastic mesh network.
+    Keep responses very brief - 1-2 short sentences, under 300 characters.
+    Only give longer answers if the user explicitly asks for detail or explanation.
+    Be concise but friendly. No markdown formatting.
+  google_grounding: false        # Enable Google Search grounding (Gemini only, $35/1k queries)
+
+# === WEATHER ===
+weather:
+  primary: openmeteo             # openmeteo | wttr | llm
+  fallback: llm                  # openmeteo | wttr | llm | none
+  default_location: ""           # Default location for !weather (optional)
+
+# === MESHMONITOR INTEGRATION ===
+meshmonitor:
+  enabled: false                 # Enable MeshMonitor trigger sync
+  url: ""                        # MeshMonitor web UI URL (e.g. http://192.168.1.100:8080)
+  inject_into_prompt: true       # Include trigger list in LLM prompt
+  refresh_interval: 300          # Seconds between trigger refreshes
+
+# === KNOWLEDGE BASE (RAG) ===
+knowledge:
+  enabled: false                 # Enable knowledge base search
+  db_path: ""                    # Path to knowledge SQLite database
+  top_k: 5                       # Number of chunks to retrieve per query
+  fts_weight: 0.5                # Weight for FTS5 keyword matches (0-1)
+  vector_weight: 0.5             # Weight for vector semantic matches (0-1)
--- a/docker-compose.yml
+++ b/docker-compose.yml
@ -14,12 +14,12 @@
 services:
  meshai:
    # Pull from GitHub Container Registry
-    image: ghcr.io/zvx-echo6/meshai:latest
+    # image: ghcr.io/zvx-echo6/meshai:latest

    # Uncomment to build locally instead of pulling
-    # build:
-    #   context: .
-    #   dockerfile: Dockerfile
+    build:
+      context: .
+      dockerfile: Dockerfile
    #   args:
    #     UID: ${UID:-1000}
    #     GID: ${GID:-1000}
@ -52,7 +52,7 @@ services:
    deploy:
      resources:
        limits:
-          memory: 256M
+          memory: 3G
        reservations:
          memory: 64M

@ -69,6 +69,7 @@ services:
        max-size: "10m"
        max-file: "3"

+
 volumes:
  meshai_data:
    name: meshai_data
--- a/docker-entrypoint.sh
+++ b/docker-entrypoint.sh
@ -64,16 +64,11 @@ llm:

 meshmonitor:
  enabled: false
-  triggers_file: /data/triggers.json
  inject_into_prompt: true
 EOF
    echo "Default config created. Configure via http://localhost:7682"
 fi

-# Create triggers.json if missing
-if [ ! -f "/data/triggers.json" ]; then
-    echo '{"triggers": []}' > /data/triggers.json
-fi

 # Start ttyd for web-based config access
 echo "Starting web config interface on port 7682..."
@ -111,7 +106,7 @@ kill_bot() {
 echo "Starting MeshAI..."
 rm -f /tmp/meshai_restart
 while true; do
-    python -m meshai --config-file "$MESHAI_CONFIG" &
+    python -m meshai -v --config-file "$MESHAI_CONFIG" &
    BOT_PID=$!
    echo "$BOT_PID" > /tmp/meshai.pid
    echo "Bot started (PID $BOT_PID)"
--- a/meshai/chunker.py
+++ b/meshai/chunker.py
@ -0,0 +1,182 @@
+"""Sentence-aware message chunker for Meshtastic's character limits.
+
+Splits LLM responses into messages that:
+- Never exceed max_chars per message (default 200)
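+- Never split a sentence across messages (a single sentence longer than max_chars is truncated)
+- Send at most max_messages per response (default 3)
+- If more content remains, end the last message with a continuation prompt, moving trailing sentences back into the remainder if needed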
+- Support up to max_continuations follow-ups (default 3)
+"""
+
+import logging
+import re
+
+logger = logging.getLogger(__name__)
+
+# Phrases that trigger continuation of a previous response
+CONTINUE_PHRASES = {
+    "yes", "yeah", "yep", "yea", "sure", "ok", "okay", "go on",
+    "keep going", "continue", "more", "go ahead", "tell me more",
+    "yes please", "y",
+}
+
+CONTINUATION_PROMPT = "Want me to keep going?"
+
+
+def split_sentences(text: str) -> list[str]:
+    """Split text into sentences at whitespace following '.', '!' or '?'."""
+    # Decimals (4.8) stay intact because no whitespace follows the dot.
+    # Abbreviations followed by a space ("Dr. Smith") are still split here.
+    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
+    # Filter empty strings
+    return [s.strip() for s in sentences if s.strip()]
+
+
+def chunk_response(
+    text: str,
+    max_chars: int = 200,
+    max_messages: int = 3,
+) -> tuple[list[str], str]:
+    """Split a response into sentence-aligned messages.
+
+    Args:
+        text: Full LLM response text
+        max_chars: Maximum characters per message
+        max_messages: Maximum messages to send before prompting
+
+    Returns:
+        Tuple of (messages_to_send, remaining_text)
+        If remaining_text is non-empty, the last message includes
+        a continuation prompt.
+    """
+    sentences = split_sentences(text)
+    if not sentences:
+        return [text[:max_chars]], ""
+
+    messages = []
+    current_msg = []
+    current_len = 0
+    sentence_idx = 0
+
+    while sentence_idx < len(sentences) and len(messages) < max_messages:
+        sentence = sentences[sentence_idx]
+
+        # Would this sentence fit in the current message?
+        added_len = len(sentence) + (1 if current_msg else 0)  # +1 for space
+
+        if current_len + added_len <= max_chars:
+            current_msg.append(sentence)
+            current_len += added_len
+            sentence_idx += 1
+        else:
+            # Sentence doesn't fit
+            if current_msg:
+                # Flush current message, start new one with this sentence
+                messages.append(" ".join(current_msg))
+                current_msg = []
+                current_len = 0
+                # Don't increment sentence_idx — retry this sentence in next message
+            else:
+                # Single sentence exceeds max_chars — truncate it
+                messages.append(sentence[:max_chars])
+                sentence_idx += 1
+
+    # Flush any remaining buffered message
+    if current_msg and len(messages) < max_messages:
+        messages.append(" ".join(current_msg))
+
+    # Determine remaining text
+    remaining_sentences = sentences[sentence_idx:]
+
+    # Also include any sentence that was in current_msg but didn't get flushed
+    # because we hit max_messages
+    if current_msg and len(messages) >= max_messages:
+        remaining_sentences = [" ".join(current_msg)] + remaining_sentences
+
+    remaining = " ".join(remaining_sentences)
+
+    # If there's remaining content, replace the end of the last message
+    # with a continuation prompt
+    if remaining and messages:  # messages is empty only when max_messages <= 0
+        prompt = CONTINUATION_PROMPT
+        last_msg = messages[-1] if messages else ""
+
+        # Check if we can append the prompt to the last message
+        if len(last_msg) + 1 + len(prompt) <= max_chars:
+            messages[-1] = last_msg + " " + prompt
+        else:
+            # Need to shorten the last message to fit the prompt:
+            # drop sentences from the end until it fits
+            last_sentences = split_sentences(last_msg)
+            while last_sentences:
+                test = " ".join(last_sentences) + " " + prompt
+                if len(test) <= max_chars:
+                    messages[-1] = test
+                    break
+                # Move the dropped sentence back to the front of remaining
+                removed = last_sentences.pop()
+                remaining = removed + " " + remaining
+            else:
+                # No sentence fits alongside the prompt, so send the prompt alone
+                messages[-1] = prompt
+
+    return messages, remaining
+
+
+class ContinuationState:
+    """Tracks continuation state per user."""
+
+    def __init__(self, max_continuations: int = 3):
+        self.max_continuations = max_continuations
+        # user_id -> {"remaining": str, "count": int}
+        self._state: dict[str, dict] = {}
+
+    def has_pending(self, user_id: str) -> bool:
+        """Check if user has pending continuation content."""
+        return user_id in self._state and bool(self._state[user_id]["remaining"])
+
+    def is_continuation_request(self, text: str) -> bool:
+        """Check if the message is a request to continue."""
+        return text.strip().lower().rstrip("!.,?") in CONTINUE_PHRASES
+
+    def store(self, user_id: str, remaining: str) -> None:
+        """Store remaining content for a user."""
+        if remaining:
+            existing = self._state.get(user_id, {"count": 0})
+            self._state[user_id] = {
+                "remaining": remaining,
+                "count": existing.get("count", 0),
+            }
+        elif user_id in self._state:
+            del self._state[user_id]
+
+    def get_continuation(self, user_id: str) -> tuple[list[str], str] | None:
+        """Get the next batch of messages for a continuation request.
+
+        Returns None if no pending content or max continuations reached.
+        """
+        if user_id not in self._state:
+            return None
+
+        state = self._state[user_id]
+        if state["count"] >= self.max_continuations:
+            del self._state[user_id]
+            return None
+
+        remaining = state["remaining"]
+        if not remaining:
+            del self._state[user_id]
+            return None
+
+        messages, new_remaining = chunk_response(remaining)
+        state["count"] += 1
+        state["remaining"] = new_remaining
+
+        if not new_remaining:
+            del self._state[user_id]
+
+        return messages, new_remaining
+
+    def clear(self, user_id: str) -> None:
+        """Clear continuation state for a user."""
+        self._state.pop(user_id, None)
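
The greedy sentence packing used by `chunk_response` can be sketched in isolation as follows. This is a minimal illustration, not the code from this commit: `pack_sentences` is a hypothetical name, the regex splitter stands in for `split_sentences` (whose implementation is not shown in this part of the diff), and the continuation prompt handling is omitted.

```python
import re


def pack_sentences(text, max_chars=150, max_messages=2):
    """Greedily pack whole sentences into at most max_messages messages.

    Returns (messages, remaining), where remaining is the unsent text.
    """
    # Assumption: a simple punctuation-based splitter, not the real split_sentences
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    messages, current, idx = [], "", 0

    while idx < len(sentences) and len(messages) < max_messages:
        sentence = sentences[idx]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            # Sentence fits in the current message
            current = candidate
            idx += 1
        elif current:
            # Flush the current message and retry this sentence in the next one
            messages.append(current)
            current = ""
        else:
            # A single sentence longer than max_chars is truncated
            messages.append(sentence[:max_chars])
            idx += 1

    # Flush the buffer if there is still room, and clear it so the
    # flushed text is not reported as remaining as well
    if current and len(messages) < max_messages:
        messages.append(current)
        current = ""

    leftover = ([current] if current else []) + sentences[idx:]
    return messages, " ".join(leftover)


if __name__ == "__main__":
    print(pack_sentences("One two. Three four. Five six.", max_chars=20, max_messages=1))
```

Whatever comes back as `remaining` is what a per-user store such as `ContinuationState` would hold until the user replies with a continuation phrase.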
--- a/meshai/cli/configurator.py
+++ b/meshai/cli/configurator.py
@ -81,7 +81,10 @@ class Configurator:
            mm_status = self._status_icon(self.config.meshmonitor.enabled)
            mm_url = self.config.meshmonitor.url or "[dim]not set[/dim]"
            table.add_row("9", "MeshMonitor Sync", f"{mm_status} {mm_url}")
-            table.add_row("10", "Setup Wizard", "[dim]First-time setup[/dim]")
+            kb_status = self._status_icon(self.config.knowledge.enabled)
+            kb_path = self.config.knowledge.db_path or "[dim]not set[/dim]"
+            table.add_row("10", "Knowledge Base", f"{kb_status} {kb_path}")
+            table.add_row("11", "Setup Wizard", "[dim]First-time setup[/dim]")

            console.print(table)
            console.print()
@ -90,13 +93,13 @@ class Configurator:
            if self.modified:
                console.print("[yellow]* Unsaved changes[/yellow]")
                console.print()
-            console.print("[white]11. Save[/white]                  [dim]Save config, stay in menu[/dim]")
-            console.print("[green]12. Save & Restart Bot[/green]   [dim]Apply changes now[/dim]")
-            console.print("[white]13. Save & Exit[/white]          [dim]Save, restart bot, exit[/dim]")
-            console.print("[white]14. Exit without Saving[/white]")
+            console.print("[white]12. Save[/white]                  [dim]Save config, stay in menu[/dim]")
+            console.print("[green]13. Save & Restart Bot[/green]   [dim]Apply changes now[/dim]")
+            console.print("[white]14. Save & Exit[/white]          [dim]Save, restart bot, exit[/dim]")
+            console.print("[white]15. Exit without Saving[/white]")
            console.print()

-            choice = IntPrompt.ask("Select option", default=12)
+            choice = IntPrompt.ask("Select option", default=13)

            if choice == 1:
                self._bot_settings()
@ -117,15 +120,17 @@ class Configurator:
            elif choice == 9:
                self._meshmonitor_settings()
            elif choice == 10:
-                self._setup_wizard()
+                self._knowledge_settings()
            elif choice == 11:
-                self._save_only()
+                self._setup_wizard()
            elif choice == 12:
-                self._save_and_restart()
+                self._save_only()
            elif choice == 13:
+                self._save_and_restart()
+            elif choice == 14:
                self._save_restart_exit()
                break
-            elif choice == 14:
+            elif choice == 15:
                break

    def _show_header(self) -> None:
@ -684,6 +689,45 @@ class Configurator:

        input("\nPress Enter to continue...")

+
+    def _knowledge_settings(self) -> None:
+        """Knowledge base settings submenu."""
+        while True:
+            self._clear()
+            console.print("[bold]Knowledge Base Settings[/bold]\n")
+            table = Table(box=box.ROUNDED)
+            table.add_column("Option", style="cyan", width=4)
+            table.add_column("Setting", style="white")
+            table.add_column("Value", style="green")
+
+            table.add_row("1", "Enabled", self._status_icon(self.config.knowledge.enabled))
+            table.add_row("2", "Database Path", self.config.knowledge.db_path or "[dim]not set[/dim]")
+            table.add_row("3", "Results Count", str(self.config.knowledge.top_k))
+            table.add_row("0", "Back", "")
+
+            console.print(table)
+            console.print()
+
+            choice = IntPrompt.ask("Select option", default=0)
+
+            if choice == 0:
+                return
+            elif choice == 1:
+                value = Confirm.ask("Enable knowledge base?", default=self.config.knowledge.enabled)
+                if value != self.config.knowledge.enabled:
+                    self.config.knowledge.enabled = value
+                    self.modified = True
+            elif choice == 2:
+                value = Prompt.ask("Database path", default=self.config.knowledge.db_path)
+                if value != self.config.knowledge.db_path:
+                    self.config.knowledge.db_path = value
+                    self.modified = True
+            elif choice == 3:
+                value = IntPrompt.ask("Results count (top_k)", default=self.config.knowledge.top_k)
+                if value != self.config.knowledge.top_k:
+                    self.config.knowledge.top_k = value
+                    self.modified = True
+
    def _setup_wizard(self) -> None:
        """First-time setup wizard."""
        self._clear()
--- a/meshai/config.py
+++ b/meshai/config.py
@ -108,7 +108,7 @@ class LLMConfig:
        "passive mesh context buffer (observes channel traffic), smart chunking for LoRa "
        "message limits, prompt injection defense, advBBS filtering.\n\n"
        "RESPONSE RULES:\n"
-        "- Keep responses VERY brief — under 200 characters total.\n"
+        "- Keep responses very brief — 1-2 short sentences, under 300 characters. Only give longer answers if the user explicitly asks for detail or explanation.\n"
        "- Be concise but friendly. No markdown formatting.\n"
        "- If asked about mesh activity and no recent traffic is shown, say you haven't "
        "observed any yet.\n"
@ -155,6 +155,14 @@ class MeshMonitorConfig:
    refresh_interval: int = 300  # Seconds between refreshes


+@dataclass
+class KnowledgeConfig:
+    """Knowledge base settings (hybrid FTS5 + vector search)."""
+
+    enabled: bool = False
+    db_path: str = ""
+    top_k: int = 5
+
@dataclass
 class Config:
    """Main configuration container."""
@ -169,6 +177,7 @@ class Config:
    llm: LLMConfig = field(default_factory=LLMConfig)
    weather: WeatherConfig = field(default_factory=WeatherConfig)
    meshmonitor: MeshMonitorConfig = field(default_factory=MeshMonitorConfig)
+    knowledge: KnowledgeConfig = field(default_factory=KnowledgeConfig)

    _config_path: Optional[Path] = field(default=None, repr=False)

--- a/meshai/knowledge.py
+++ b/meshai/knowledge.py
@ -0,0 +1,206 @@
+"""Hybrid FTS5 + vector knowledge search for MeshAI."""
+
+import logging
+import re
+import sqlite3
+from typing import Optional
+
+import numpy as np
+import sqlite_vec
+from fastembed import TextEmbedding
+
+logger = logging.getLogger(__name__)
+
+STOPWORDS = {
+    'what', 'is', 'the', 'a', 'an', 'and', 'or', 'for', 'on', 'in',
+    'to', 'of', 'how', 'do', 'does', 'can', 'will', 'would', 'could',
+    'should', 'are', 'was', 'were', 'be', 'been', 'being', 'have',
+    'has', 'had', 'not', 'but', 'if', 'then', 'than', 'that', 'this',
+    'it', 'its', 'my', 'me', 'i', 'you', 'your', 'we', 'they', 'them',
+    'about', 'with', 'from', 'at', 'by', 'up', 'out', 'so', 'no',
+    'yes', 'just', 'get', 'got', 'tell', 'know', 'like',
+}
+
+
+class KnowledgeSearch:
+    """Hybrid FTS5 + vector knowledge search."""
+
+    def __init__(self, db_path: str, top_k: int = 5):
+        self.top_k = top_k
+        self.available = False
+        self._model = None
+        self._conn: Optional[sqlite3.Connection] = None
+        self._has_vec = False
+
+        try:
+            self._conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
+            self._conn.enable_load_extension(True)
+            sqlite_vec.load(self._conn)
+            self._conn.enable_load_extension(False)
+
+            # Check if vec table exists
+            tables = [r[0] for r in self._conn.execute(
+                "SELECT name FROM sqlite_master WHERE type='table'"
+            ).fetchall()]
+            self._has_vec = "chunks_vec" in tables
+
+            if self._has_vec:
+                logger.info("Loading embedding model for hybrid search...")
+                self._model = TextEmbedding("BAAI/bge-small-en-v1.5")
+                logger.info("Knowledge base loaded with hybrid search (FTS5 + vector)")
+            else:
+                logger.info("Knowledge base loaded with FTS5 only (no vector table)")
+
+            count = self._conn.execute("SELECT count(*) FROM chunks").fetchone()[0]
+            logger.info(f"Knowledge base: {count} chunks from {db_path}")
+            self.available = True
+
+        except Exception as e:
+            logger.warning(f"Failed to load knowledge base: {e}")
+
+    def search(self, query: str) -> list[dict]:
+        """Search knowledge base using hybrid FTS5 + vector with RRF."""
+        if not self.available or not self._conn:
+            return []
+
+        try:
+            fts_results = self._fts_search(query)
+
+            if self._has_vec and self._model:
+                vec_results = self._vec_search(query)
+                merged = self._rrf_merge(fts_results, vec_results)
+            else:
+                merged = [(r[0], r[1]) for r in fts_results]
+
+            # Fetch full data for top results
+            top_ids = [r[0] for r in merged[:self.top_k]]
+            if not top_ids:
+                return []
+
+            results = []
+            for chunk_id in top_ids:
+                row = self._conn.execute(
+                    "SELECT title, content, source, book_title FROM chunks WHERE rowid = ?",
+                    [chunk_id]
+                ).fetchone()
+                if row:
+                    # Truncate content to 1000 chars before adding it to the prompt
+                    content = row[1][:1000] if row[1] else ""
+                    results.append({
+                        "title": row[0] or "",
+                        "excerpt": content,
+                        "source": row[2] or "",
+                        "book_title": row[3] or "",
+                    })
+
+            logger.debug(f"Knowledge search: query='{query[:50]}' -> {len(results)} results")
+            return results
+
+        except Exception as e:
+            logger.warning(f"Knowledge search error: {e}")
+            return []
+
+    def _fts_search(self, query: str, limit: int = 50) -> list[tuple]:
+        """FTS5 keyword search. Returns [(rowid, rank), ...]"""
+        # Domain terms - only use these for FTS, ignore likely typos
+        DOMAIN_TERMS = {
+            'short', 'fast', 'slow', 'long', 'mid', 'medium',
+            'meshtastic', 'lora', 'mesh', 'radio', 'preset', 'modem',
+            'sf', 'cr', 'bw', 'spreading', 'coding', 'bandwidth',
+            'factor', 'rate', 'channel', 'frequency', 'node',
+        }
+
+        cleaned = re.sub(r'[^a-zA-Z0-9\s]', '', query.lower())
+        words = cleaned.split()
+
+        # Keep domain terms and expand compound preset names, e.g.
+        # "shortfast" -> ["short", "fast"]. All other words, including
+        # likely typos such as "waht" or "teh", are ignored.
+        expanded = []
+        for w in words:
+            if w in DOMAIN_TERMS:
+                expanded.append(w)
+            elif 'short' in w and 'fast' in w:
+                expanded.extend(['short', 'fast'])
+            elif 'longfast' in w:
+                expanded.extend(['long', 'fast'])
+            elif 'medslow' in w or 'midslow' in w:
+                expanded.extend(['mid', 'slow'])
+
+        # Dedupe while preserving order
+        seen = set()
+        unique = []
+        for w in expanded:
+            if w not in seen:
+                seen.add(w)
+                unique.append(w)
+
+        if not unique:
+            return []
+
+        # Use AND for domain terms - they should all match
+        fts_query = " AND ".join(unique[:5])
+
+        try:
+            rows = self._conn.execute("""
+                SELECT rowid, rank
+                FROM chunks_fts
+                WHERE chunks_fts MATCH ?
+                ORDER BY rank
+                LIMIT ?
+            """, [fts_query, limit]).fetchall()
+            return rows
+        except Exception as e:
+            logger.warning(f"FTS search error: {e}")
+            return []
+
+
+    def _vec_search(self, query: str, limit: int = 50) -> list[tuple]:
+        """Vector similarity search. Returns [(chunk_rowid, distance), ...]"""
+        try:
+            query_vec = list(self._model.embed([f"query: {query}"]))[0]
+            rows = self._conn.execute("""
+                SELECT chunk_rowid, distance
+                FROM chunks_vec
+                WHERE embedding MATCH ?
+                AND k = ?
+            """, [query_vec.astype(np.float32).tobytes(), limit]).fetchall()
+            return rows
+        except Exception as e:
+            logger.warning(f"Vector search error: {e}")
+            return []
+
+    def _rrf_merge(self, fts_results: list, vec_results: list, k: int = 60) -> list:
+        """Reciprocal Rank Fusion merge of FTS5 and vector results."""
+        scores = {}
+
+        # FTS weight 0.5
+        for rank, (rowid, _) in enumerate(fts_results):
+            scores[rowid] = scores.get(rowid, 0) + 0.5 / (k + rank + 1)
+
+        # Vector weight 0.5
+        for rank, (chunk_rowid, _) in enumerate(vec_results):
+            scores[chunk_rowid] = scores.get(chunk_rowid, 0) + 0.5 / (k + rank + 1)
+
+        return sorted(scores.items(), key=lambda x: x[1], reverse=True)
+
+    def close(self):
+        """Close the database connection."""
+        if self._conn:
+            try:
+                self._conn.close()
+            except Exception:
+                pass
+            self._conn = None
+            self.available = False
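
As a standalone illustration of the fusion step in `_rrf_merge`, the sketch below applies weighted Reciprocal Rank Fusion to plain lists of IDs. The function name `rrf_merge`, the `weights` parameter, and the sample IDs are hypothetical and chosen for the example; only the scoring formula `weight / (k + rank + 1)` with `k = 60` and equal 0.5 weights is taken from the diff above.

```python
def rrf_merge(rankings, weights=None, k=60):
    """Merge ranked ID lists using weighted Reciprocal Rank Fusion.

    rankings: list of lists, each ordered best-first.
    Returns [(doc_id, score), ...] sorted by descending score.
    """
    if not rankings:
        return []
    if weights is None:
        # Equal weighting; two lists give 0.5 each, as in the diff
        weights = [1.0 / len(rankings)] * len(rankings)

    scores = {}
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking):
            # rank is 0-based, so the best hit contributes weight / (k + 1)
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank + 1)

    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


fts_ids = [10, 20, 30]   # hypothetical keyword-search order
vec_ids = [20, 40, 10]   # hypothetical vector-search order
merged = rrf_merge([fts_ids, vec_ids])

if __name__ == "__main__":
    print([doc_id for doc_id, _ in merged])
```

IDs that appear in both lists accumulate two contributions, so they tend to outrank IDs found by only one search, regardless of the raw FTS5 rank or vector distance values.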
--- a/meshai/main.py
+++ b/meshai/main.py
@ -38,6 +38,7 @@ class MeshAI:
        self.llm: Optional[LLMBackend] = None
        self.context: Optional[MeshContext] = None
        self.meshmonitor_sync = None
+        self.knowledge = None
        self.router: Optional[MessageRouter] = None
        self.responder: Optional[Responder] = None
        self._running = False
@ -97,6 +98,8 @@ class MeshAI:

        if self.llm:
            await self.llm.close()
+        if self.knowledge:
+            self.knowledge.close()

        self._remove_pid()
        logger.info("MeshAI stopped")
@ -175,11 +178,23 @@ class MeshAI:
        else:
            self.meshmonitor_sync = None

+        # Knowledge base
+        kb_cfg = self.config.knowledge
+        if kb_cfg.enabled and kb_cfg.db_path:
+            from .knowledge import KnowledgeSearch
+            self.knowledge = KnowledgeSearch(
+                db_path=kb_cfg.db_path,
+                top_k=kb_cfg.top_k,
+            )
+        else:
+            self.knowledge = None
+
        # Message router
        self.router = MessageRouter(
            self.config, self.connector, self.history, self.dispatcher, self.llm,
            context=self.context,
            meshmonitor_sync=self.meshmonitor_sync,
+            knowledge=self.knowledge,
        )

        # Responder
@ -208,6 +223,16 @@ class MeshAI:
            )

            # Route the message
+            # Check for continuation request first
+            continuation_messages = self.router.check_continuation(message)
+            if continuation_messages:
+                await self.responder.send_response(
+                    continuation_messages,
+                    destination=message.sender_id,
+                    channel=message.channel,
+                )
+                return
+
            result = await self.router.route(message)

            if result.route_type == RouteType.IGNORE:
@ -215,18 +240,18 @@ class MeshAI:

            # Determine response
            if result.route_type == RouteType.COMMAND:
-                response = result.response
+                messages = result.response  # Commands return single string
            elif result.route_type == RouteType.LLM:
-                response = await self.router.generate_llm_response(message, result.query)
+                messages = await self.router.generate_llm_response(message, result.query)
            else:
                return

-            if not response:
+            if not messages:
                return

            # Send DM response
            await self.responder.send_response(
-                text=response,
+                messages,
                destination=message.sender_id,
                channel=message.channel,
            )
--- a/meshai/responder.py
+++ b/meshai/responder.py
@ -1,4 +1,4 @@
-"""Response handling - delays and message chunking."""
+"""Response handling - delays and message delivery."""

 import asyncio
 import logging
@ -12,7 +12,7 @@ logger = logging.getLogger(__name__)


 class Responder:
-    """Handles response formatting, chunking, and delivery."""
+    """Handles response delivery with pacing."""

    def __init__(self, config: ResponseConfig, connector: MeshConnector):
        self.config = config
@ -20,120 +20,46 @@ class Responder:

    async def send_response(
        self,
-        text: str,
+        messages: list[str] | str,
        destination: Optional[str] = None,
        channel: int = 0,
    ) -> bool:
-        """Send a response with delay and chunking.
+        """Send response messages with human-pacing delays.

        Args:
-            text: Response text (will be chunked if too long)
+            messages: Pre-chunked messages list, or single string (legacy)
            destination: Node ID for DM, or None for channel broadcast
            channel: Channel to send on

        Returns:
-            True if all chunks sent successfully
+            True if all messages sent successfully
        """
-        # Chunk the message
-        chunks = self._chunk_message(text)
+        # Handle legacy single string
+        if isinstance(messages, str):
+            messages = [messages]

-        # Limit to max messages
-        if len(chunks) > self.config.max_messages:
-            chunks = chunks[: self.config.max_messages]
-            # Truncate last chunk to indicate more was cut
-            if chunks:
-                last = chunks[-1]
-                if len(last) > self.config.max_length - 3:
-                    chunks[-1] = last[: self.config.max_length - 3] + "..."
+        if not messages:
+            return True

        success = True
-        for i, chunk in enumerate(chunks):
-            # Apply delay before sending
-            delay = random.uniform(self.config.delay_min, self.config.delay_max)
-            await asyncio.sleep(delay)
+        for i, msg in enumerate(messages):
+            # Apply delay before sending (except first message)
+            if i > 0:
+                delay = random.uniform(self.config.delay_min, self.config.delay_max)
+                await asyncio.sleep(delay)

-            # Send chunk
+            # Send message
            sent = self.connector.send_message(
-                text=chunk,
+                text=msg,
                destination=destination,
                channel=channel,
            )

            if not sent:
-                logger.error(f"Failed to send chunk {i + 1}/{len(chunks)}")
+                logger.error(f"Failed to send message {i + 1}/{len(messages)}")
                success = False
                break

-            logger.debug(f"Sent chunk {i + 1}/{len(chunks)}: {chunk[:50]}...")
+            logger.debug(f"Sent message {i + 1}/{len(messages)}: {msg[:50]}...")

        return success
-
-    def _chunk_message(self, text: str) -> list[str]:
-        """Split message into chunks respecting max_length.
-
-        Tries to break at word boundaries when possible.
-
-        Args:
-            text: Text to chunk
-
-        Returns:
-            List of chunks
-        """
-        max_len = self.config.max_length
-
-        if len(text) <= max_len:
-            return [text]
-
-        chunks = []
-        remaining = text
-
-        while remaining:
-            if len(remaining) <= max_len:
-                chunks.append(remaining)
-                break
-
-            # Find a good break point
-            chunk = remaining[:max_len]
-
-            # Try to break at word boundary
-            break_point = self._find_break_point(chunk)
-
-            if break_point > 0:
-                chunks.append(remaining[:break_point].rstrip())
-                remaining = remaining[break_point:].lstrip()
-            else:
-                # No good break point, hard cut
-                chunks.append(chunk)
-                remaining = remaining[max_len:]
-
-        return chunks
-
-    def _find_break_point(self, text: str) -> int:
-        """Find best break point in text.
-
-        Prefers: sentence end > comma/semicolon > space
-
-        Args:
-            text: Text to find break in
-
-        Returns:
-            Index to break at, or 0 if no good break found
-        """
-        # Look for sentence endings
-        for char in ".!?":
-            pos = text.rfind(char)
-            if pos > len(text) // 2:  # Only if in second half
-                return pos + 1
-
-        # Look for clause breaks
-        for char in ",;:":
-            pos = text.rfind(char)
-            if pos > len(text) // 2:
-                return pos + 1
-
-        # Look for word boundary
-        pos = text.rfind(" ")
-        if pos > len(text) // 3:  # Only if past first third
-            return pos
-
-        return 0
--- a/meshai/router.py
+++ b/meshai/router.py
@ -13,6 +13,7 @@ from .config import Config
 from .connector import MeshConnector, MeshMessage
 from .context import MeshContext
 from .history import ConversationHistory
+from .chunker import chunk_response, ContinuationState

 logger = logging.getLogger(__name__)

@ -65,6 +66,7 @@ class MessageRouter:
        llm_backend: LLMBackend,
        context: MeshContext = None,
        meshmonitor_sync=None,
+        knowledge=None,
    ):
        self.config = config
        self.connector = connector
@ -73,6 +75,8 @@ class MessageRouter:
        self.llm = llm_backend
        self.context = context
        self.meshmonitor_sync = meshmonitor_sync
+        self.knowledge = knowledge
+        self.continuations = ContinuationState(max_continuations=3)


    def should_respond(self, message: MeshMessage) -> bool:
@ -111,6 +115,30 @@ class MessageRouter:

        return True

+    def check_continuation(self, message) -> list[str] | None:
+        """Check if this is a continuation request and return messages if so.
+
+        Returns:
+            List of messages to send, or None if not a continuation
+        """
+        user_id = message.sender_id
+        text = message.text.strip()
+
+        logger.info(f"check_continuation: user={user_id}, text='{text[:30]}', has_pending={self.continuations.has_pending(user_id)}")
+
+        if self.continuations.has_pending(user_id):
+            if self.continuations.is_continuation_request(text):
+                result = self.continuations.get_continuation(user_id)
+                if result:
+                    messages, _ = result
+                    return messages
+                # Max continuations reached, return None to fall through
+            else:
+                # User asked something new, clear pending continuation
+                self.continuations.clear(user_id)
+
+        return None
+
    async def route(self, message: MeshMessage) -> RouteResult:
        """Route a message and generate response.

@ -208,6 +236,23 @@ class MessageRouter:
                    "\n\n[No recent mesh traffic observed yet.]"
                )

+
+
+        # 5. Knowledge base retrieval
+        if self.knowledge and query:
+            results = self.knowledge.search(query)
+            if results:
+                chunks = "\n\n".join(
+                    f"[{r['title']}]: {r['excerpt']}" for r in results
+                )
+                system_prompt += (
+                    "\n\nREFERENCE KNOWLEDGE - Answer using this information:\n"
+                    + chunks
+                )
+
+        # Log system prompt status for debugging
+        logger.debug(f"System prompt length: {len(system_prompt)} chars")
+        logger.debug(f"Has reference knowledge: {'REFERENCE KNOWLEDGE' in system_prompt}")
        try:
            response = await self.llm.generate(
                messages=history,
@ -227,7 +272,21 @@ class MessageRouter:
        # Persist summary if one was created/updated
        await self._persist_summary(message.sender_id)

-        return response
+        # Chunk the response with sentence awareness
+        messages, remaining = chunk_response(
+            response,
+            max_chars=self.config.response.max_length,
+            max_messages=self.config.response.max_messages,
+        )
+
+        # Store remaining content for continuation
+        if remaining:
+            logger.info(f"Storing continuation for {message.sender_id}: {len(remaining)} chars remaining")
+            self.continuations.store(message.sender_id, remaining)
+        else:
+            logger.info(f"No remaining content for {message.sender_id}")
+
+        return messages

    async def _persist_summary(self, user_id: str) -> None:
        """Persist any cached summary to the database.
--- a/requirements.txt
+++ b/requirements.txt
@ -6,3 +6,6 @@ anthropic>=0.18.0
 google-genai>=1.0.0
 rich>=13.0.0
 httpx>=0.25.0
+fastembed>=0.3.0
+sqlite-vec>=0.1.0
+numpy
--- a/triggers.json
+++ b/triggers.json
@ -1 +0,0 @@
-{}