Commit graph

55 commits

Author SHA1 Message Date
matt+claude
c2d5bcfbd1 feat(central): v0.5.4 -- region-aware subscriptions using Central v0.9.20 regional subjects
Pre-v0.5.4 every Central subscription used a bare wildcard (central.wx.>,
central.fire.>, central.traffic.>, central.quake.>, central.hydro.>,
central.space.>), so a Magic Valley operator flipping nws -> central was
in fact subscribing to the all-US firehose and discarding 95% of events
locally. Central v0.9.20 (2026-05-28) added per-region subject suffixes
so the firehose can be filtered server-side. This wires meshai to use them.

Backend (meshai/central/consumer.py):
- New _subjects_for(adapter, region) replaces the static ADAPTER_SUBJECTS
  dict. ADAPTER_SUBJECTS is retained as an alias to _SUBJECTS_BARE for any
  legacy importers; the dispatcher path is unchanged.
- Per-adapter subject patterns (region='us.id' default):
    nws        -> central.wx.alert.us.id.>     (region BEFORE wildcard)
    usgs_quake -> central.quake.event.>.us.id  (region AFTER wildcard)
    firms      -> central.fire.hotspot.>.us.id
    fires      -> central.fire.incident.id.>   (state token at fixed depth)
                  central.fire.perimeter.id.>
    traffic    -> central.traffic.>.id          (bare state, no us. prefix)
    roads511   -> central.traffic.>.id          (shared with traffic, sub-adapter routing)
    usgs       -> central.hydro.>.us.id
                  central.hydro.>.unknown      (workaround until v0.9.20.1)
    swpc       -> central.space.>               (planetary; region ignored)
- Empty/None region falls back to bare wildcards (pre-v0.9.20 behaviour).
- _subject_owned() pulls region from env.central.region and routes through
  _subjects_for; v0.5.3 sub-adapter routing (owned-sources set) still
  applies on shared subjects like central.traffic.>.id.
- start() logs the active region at connect-time for ops visibility.

Config (meshai/config.py):
- CentralConsumerConfig.region: str = "us.id". One region per consumer
  applies to every central-flipped adapter; per-adapter overrides can
  land in v0.6 when there is a real use case.

Frontend (dashboard-frontend/src/pages/Environment.tsx):
- Central Connection panel gets a Region text input next to URL/Durable.
- EnvConfig.central type extended with region: string.
- Static bundle rebuilt; index-DCFmSeOM.js -> index-B24tHcYj.js.

Tests:
- tests/test_central_region_routing.py (new, 9 cases): asserts the exact
  v0.9.20 subject string for each adapter at region='us.id', the SWPC
  global-stays-global rule, the USGS .unknown workaround, the empty-region
  backward-compat fallback for all 8 adapters, and integration through
  CentralConsumer._subject_owned() with the default region.
- tests/test_central_consumer.py + tests/test_central_sub_adapter_routing.py:
  the two tests that asserted bare-wildcard subjects now set
  env.central.region = "" explicitly to preserve their original concern
  (no region semantics — backward-compat path only).

Why swpc stays global: space weather is planetary -- a CME is detected on
the sun, the geomagnetic response is hemispheric. There is no Idaho-only
solar event; subscribing per-region would only drop events we want.

Why hydro has the .unknown workaround: Central v0.9.20 leaves gauges
whose USGS state can't be inferred on central.hydro.>.unknown. Until
v0.9.20.1 backfills the state tag we subscribe to both filters to
avoid silently losing those rows. Idaho downstream-filtering on
data['_enriched']['usgs_site']['state'] is future v0.6 work.

Orthogonal to v0.5.2 dispatcher guards (staleness / cooldown / dedup)
and v0.5.3 sub-adapter routing: the region filter operates at the NATS
subscription layer (server-side), upstream of everything else.

Verified: pytest 327 passed (318 prior + 9 new region-routing tests);
py_compile clean; frontend build clean. Safe-mode preserved -- no toggle
enabled, no master enabled, no central enabled.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-04 02:30:33 +00:00
matt+claude
ad6e24d123 fix(notifications): v0.5.2 -- staleness filter, cooldown, dedup, renderer wiring, hydro family
Spam fix from v0.5.0 oversight:
- Staleness filter (default 600s, configurable per-toggle) drops backlog at dispatcher
  entrance -- solves the "restart wave fires days of old events" problem definitively.
- Per-toggle cooldown_seconds (default 300s) throttles same (category, region) bursts.
- Per-(source, event_id) LRU dedup (10k entries) catches Central re-delivery.
- Renderer wired into _dispatch_toggles; toggle path now produces friendly mesh strings
  with 150-byte UTF-8 hard cap and priority-order segment composition (no mid-char trunc).
- categories.py: stream_flood_warning / stream_high_water moved from weather -> geohazards
  family (canonical toggle name = seismic in VALID_TOGGLES) to match the GUI family tab.

Verified end-to-end: 7200s-old events all dropped (100/0), fresh burst throttles to one
mesh broadcast per cooldown window (1/99), dedup catches duplicate event_ids (1/99).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-04 00:40:28 +00:00
b90afc3a74 feat(notifications): v0.5.0 -- Master Toggles UX redesign + Central Connection GUI + grouped categories + region scoping
Per-family notification policy (PagerDuty/Grafana-style): each family gets a
severity threshold + region scope + a severity->channel routing matrix, so an
operator opts in per family rather than hand-writing rules.

SECTION 1 -- BACKEND
- config.py: new NotificationToggle dataclass (enabled, min_severity, regions,
  severity_channels{severity->[channel types]}, quiet_hours_override, + per-channel
  delivery config: broadcast_channel/node_ids/smtp_*/recipients/webhook_*).
  notifications.toggles is now a dict[family]->NotificationToggle with 8 family
  defaults (mesh_health, weather, fire, rf_propagation, roads, avalanche, seismic,
  tracking), all enabled=false (opt-in), min_severity=priority,
  severity_channels={priority:[mesh_broadcast], immediate:[mesh_broadcast, mesh_dm]},
  quiet_hours_override=true. (Old TogglesConfig.enabled was only read by
  build_pipeline via getattr -> degrades to ToggleFilter no-op, so the pipeline
  filter is unchanged; toggles now drive the Dispatcher instead.)
- region_scope:list added to NotificationRuleConfig; _matching_rules filters by
  event.region/regions ([] = all).
- Dispatcher: _dispatch_toggles runs IN PARALLEL to rule matching -- looks up
  get_toggle(event.category), checks enabled + region scope + severity threshold,
  then for each channel in severity_channels[event.severity] builds a synthetic
  rule (override_quiet set only for immediate when quiet_hours_override) and
  delivers. 'digest' channel is skipped in live dispatch (handled by accumulator).
- categories.py: get_toggle() prefix fallback maps the live phases-2.7-2.14
  categories (weather_warning, wildfire_incident, earthquake_event,
  traffic_congestion, geomagnetic/rf_*, stream_*, ...) to their family, fixing the
  v0.4 "category -> other" gap.
- config_loader.py: SECRET_FIELDS += notifications.toggles.*.smtp_password.
- _dataclass_to_dict now recurses dict-of-dataclasses, and the loader coerces the
  toggles dict -> NotificationToggle on both the full-load and section-PUT paths
  (so GUI save round-trips correctly).
- tests/test_notification_toggles.py (11): enabled/disabled, region filter
  (empty+populated+regions-list), severity threshold, per-severity channel routing,
  digest-skipped-live, quiet-hours-override immediate-only, category->family,
  rules+toggles both fire. Full suite: 294 passed (283 + 11).

SECTION 2 -- FRONTEND
- Notifications.tsx: MasterToggles component above the rules section -- 8 family
  cards (icon + enable toggle; collapsed summary 'OFF' or 'N regions, M channels at
  <sev>+'; expanded: severity threshold, severity x channel checkbox matrix,
  region list, quiet-hours-override toggle, per-channel config:
  broadcast_channel/DM node IDs/recipients/SMTP host+port/webhook URL).
- Environment.tsx: CentralConnectionPanel above the family tabs (url, durable,
  enabled) wired to environmental.central.
- npm run build clean (tsc strict); rebuilt static committed (index-CfYlhn4e.js).

SECTION 3 -- VERIFICATION
- py_compile + tsc strict clean; pytest 294 passed.
- Rebuilt prod: /notifications serves Master Toggles, /environment serves Central
  Connection (strings confirmed in the served bundle); 8 adapters, pipeline
  started, no tracebacks, healthy.
- GUI round-trip: enable weather toggle (min_severity=priority,
  regions=[Magic Valley], severity_channels.priority=[mesh_broadcast]) -> PUT
  {saved:true} -> notifications.yaml reflects it; env_feeds traffic.api_key stayed
  ${TOMTOM_API_KEY} (C.3.1 secret preservation holds). Restored to clean opt-in
  baseline.
- Synthetic NWS weather_warning/priority/Magic Valley -> routes through the weather
  toggle to mesh_broadcast; out-of-region and below-threshold events correctly
  dropped.

DEFERRED (noted for a follow-up, not blocking Matt's morning config): Section 2B
rules-editor polish -- grouped-by-family category checkboxes, region_scope
multi-select in the rule editor (backend field + filtering ARE in), tooltips, and
the fire-count Active/No-activity badge -- were not built tonight to keep the build
shippable and verified; the Advanced Rules section is otherwise unchanged and
still functional.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 07:00:10 +00:00
73c007d227 feat(central): v0.4 C.1 Central connector backend (no-op until adapter source flipped)
Adds the backend for sourcing environmental feeds from Central's NATS
JetStream firehose instead of (or alongside) meshai's native adapters.
Architecture is Matt-approved Option 3' (dedicated package + per-adapter
source switch surfaced on the existing Environmental config).

NO-OP POSTURE (intentional): every adapter defaults to feed_source="native"
and environmental.central.enabled defaults false, so on a stock config the
CentralConsumer starts and subscribes to nothing -- behavior is byte-for-byte
v0.3. Live env_feeds.yaml is unchanged on disk; an operator who touches
nothing sees no change. Flipping an adapter to central is Phase C.3; the
dashboard UI for it is Phase C.2.

What landed:
- meshai/central/ package (CentralConsumer): async start()/stop(), JetStream
  durable subscribe to subjects derived from adapters with feed_source=central,
  and _on_message -> normalize -> bus.emit. nats-py is lazy-imported only on
  the connect path, so no-op boot has zero NATS dependency.
- Normalization (CloudEvents envelope -> Central Event -> upstream data):
    source   = inner Event.adapter
    category = Central hierarchical string -> meshai flat, via a small
               table-driven prefix map (map_category)
    severity = 0|1->routine, 2->priority, 3|4->immediate, null->routine
    lat/lon  = geo.centroid, swapped from GeoJSON [lon,lat] -> (lat,lon)
    group_key/inhibit = outer envelope id (dedup parity with native adapters)
    expires/timestamp parsed from ISO-8601
    Event.data = upstream payload verbatim (generic _enriched merge, preserved
                 as-is incl. hydro's extra usgs_site/usgs_stats bundles)
- Tombstone (`.removed.` subject or `:removed` id suffix) -> a "clear" Event
  carrying the ORIGINAL group_key (`:removed` stripped) + data._central_tombstone
  so the grouper/inhibitor lets the prior event lapse naturally.
- config.py: a `_SourcedFeed` mixin adds `feed_source: native|central`
  (validated in __post_init__) to all 10 adapter configs; new
  CentralConsumerConfig as environmental.central { enabled, url, durable,
  connect_timeout }. Both ride the generic _dict_to_dataclass coercion, so
  they are GUI-editable via PUT /config/environmental (Rule 17) -- frontend
  fields come in C.2.
- env/store.py: each adapter is instantiated only when
  enabled AND feed_source=="native"; a feed_source=central adapter is skipped
  natively (debug-logged) so Central can own it without a duplicate.
- main.py: CentralConsumer constructed + started after start_pipeline(),
  stopped in stop().

DEVIATION FROM SPEC (documented): the spec named the new field `source`, but
FIRMSConfig already has a `source` field (the satellite product,
"VIIRS_SNPP_NRT"). To avoid the collision the field is named **feed_source**
across all adapters. Everything else follows the spec.

NETWORKING: zero infra change required. The meshai container already reaches
the Central NATS server directly (TCP to 100.64.0.12:4222 OK) and resolves
central.echo6.mesh via the Phase 2.6.6 MagicDNS fix. No docker-compose edit;
default bridge works (LXC host masquerades to the Tailscale CGNAT range). The
lighter bridge-route / host-net / sidecar fallbacks were not needed.

Tests: tests/test_central_consumer.py (11) + tests/test_config_source_field.py
(6): no-op-when-native, subjects-when-central, source-gate skips native
instantiation, normalize+emit, _enriched preserved verbatim, tombstone->clear,
severity map (0-4/null), category map (>=4 strings), async _on_message
emits+acks, start() no-op without NATS, feed_source default/validate/reject/
dict-coercion. Full suite: 269 passed (was 253 + 16 new).

Verification: (A) no bare self._x() in consumer.py. (B) py_compile clean.
(C) 269 passed. (D) rebuilt prod -- 8 native adapters, pipeline started,
native nifc/traffic emissions still flowing, healthy, no errors, log
"CentralConsumer started; 0 subjects subscribed -- no adapters set to central".
(E) in-container synthetic _on_message injection normalized correctly
(usgs_quake/earthquake_event/immediate, centroid swapped, _enriched preserved)
and reached the bus; ephemeral, no config change to roll back.

C.2 (dashboard frontend for the feed_source switch + central connection) is next.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 02:28:19 +00:00
20e0dec28a fix(notifications): Phase 2.16.1 unblock pipeline -- grouper flush + rules coercion + toggle warning
Phase 2.16 found the live notification pipeline never delivered any
environmental event. Two independent blocking bugs, both fixed here.

BUG A -- grouper held events forever (nothing drove tick()).
Every adapter event sets a group_key, so all were buffered in the Grouper
and never flushed (start_pipeline only started the DigestScheduler; no
tick driver existed). Fixes (per Matt's decisions):
  - Grouper.handle(): immediate-severity events now BYPASS the window
    entirely (delivered straight to next_handler), no buffering latency.
    routine/priority still coalesce.
  - start_pipeline(): schedules an asyncio flush task that calls
    grouper.tick() every `grouper_flush_seconds` (default 5s) so
    coalesced events drain within the window even when poll cadence is
    sparse. stop_pipeline() signals + cancels it.
  before/after (grouper held_count): an immediate+group_key event used to
  sit held (count 1) forever; now held_count==0 on arrival (bypassed). A
  routine event is held (count 1) then drained to 0 by tick()/flush.

BUG B -- notification rules loaded as dicts, crashing the dispatcher.
Root cause (more precise than 2.16's guess): the rules coercion is NOT
missing from the multi-file loader -- it lives in _dict_to_dataclass's
explicit `elif key == "notifications"` branch, but that branch was DEAD
CODE, shadowed by the generic `if hasattr(field_type,
"__dataclass_fields__")` handler that runs first for every dataclass
field (including notifications). So Config.notifications.rules stayed a
list of dicts on ALL load paths, and Dispatcher._matching_rules threw
`AttributeError: 'dict' object has no attribute 'enabled'`. Fix: hoist
the notifications special-handling ahead of the generic handler (and drop
the now-truly-dead duplicate elif).
  before/after (cfg.notifications.rules[0] type): dict -> NotificationRuleConfig.

OBS C -- empty enabled_toggles. Left as 'pass all' for v0.3 (per Matt);
added a startup WARNING in build_pipeline so operators see gating is off:
"enabled_toggles is empty -- ToggleFilter passing all events. Configure
toggles to enable gating." (confirmed firing live).

Tests:
  - tests/test_pipeline_grouper.py (new): test_immediate_severity_bypasses_grouper,
    test_periodic_flush_drains_routine, test_priority_is_also_coalesced_not_bypassed.
  - tests/test_config_loader.py (new): test_multifile_load_coerces_notification_rules,
    test_rules_attribute_access_does_not_raise (regression guards for Bug B).
  - tests/test_pipeline_inhibitor_grouper.py (updated): 5 existing grouper
    hold/coalesce/flush tests primed the grouper with immediate+group_key
    events expecting them to be held; switched those to 'priority' (still
    buffered; still outranks the routine event in the inhibitor-chain test)
    to match the intended immediate-bypass behavior.
  Full suite: 253 passed (was 248 + 5 new; 5 existing updated, none lost).

VERIFICATION (rebuilt prod, traced end-to-end via in-process build_pipeline
probe with a recording channel + live config):
  - rules[0] type: NotificationRuleConfig (Bug B fixed).
  - IMMEDIATE event: held_count==0 on emit (bypassed) -> reached
    channel.deliver(): delivered=[('PROBE_RULE','E2E IMMEDIATE')].
  - ROUTINE event: held_count==1 -> after flush 0 -> reached
    channel.deliver(): delivered+=[('PROBE_RULE','E2E ROUTINE')].
  - Natural Summit-Creek-shaped nifc wildfire_incident (routine, no
    matching dispatch rule): held 1 -> after flush -> landed in the digest
    accumulator (1 event). End-to-end channel.deliver evidence = the
    RecChannel.deliver() calls above.
  - Live container: 8 adapters, healthy, "Grouper flush task started
    (every 5s)", the enabled_toggles warning fired, and NO dispatcher
    AttributeError/traceback.

Follow-up (non-blocking): several Phase 2.7-2.14 categories (e.g.
wildfire_incident, earthquake_event) aren't in the category->toggle map,
so they fall to toggle 'other'. Harmless while enabled_toggles is empty
(pass-all), but should be mapped before toggle gating is turned on.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 00:36:13 +00:00
8b2cdeee0b feat(notifications): Phase 2.14 USGS earthquake adapter (new) -- closes Rule 16 Seismic standalone path
First net-new environmental adapter (prior phases wired existing ones).
Adds meshai/env/usgs_quake.py with USGSQuakeAdapter + USGSQuakeConfig,
polling a keyless USGS earthquake GeoJSON feed and emitting one Event per
qualifying quake. Establishes the standalone Seismic path (Rule 16);
Central becomes the dual-source in v0.4.

Adapter (mirrors the fires/usgs-water per-event pattern):
- Feed: https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/2.5_day.geojson
  (M2.5+ past day -- M1.0 too noisy, M4.5+ too sparse for the region).
  Tick 300s.
- Filters each feature by min_magnitude AND a geographic bbox.
- Per quake: source=usgs_quake, category=earthquake_event, stable
  event_id = the USGS feature id (e.g. "us6000abcd"), lat/lon from
  geometry.coordinates[1],[0], region tag from config (default
  "magic_valley").
- to_event(): category earthquake_event, magnitude-binned severity passed
  through, group_key = inhibit_key = the USGS id. Defensive None for
  missing id / coords / magnitude. get_events()/health_status mirror the
  other adapters.

MAGNITUDE -> SEVERITY BINS (as proposed):
  M < 3.5        -> routine
  3.5 <= M < 5.0 -> priority
  M >= 5.0       -> immediate
('sig' is captured in the event dict as metadata but severity is
magnitude-binned -- clearer and matches the spec's primary suggestion.)

GEOGRAPHIC BBOX (as proposed) -- [west, south, east, north]:
  [-115.5, 42.0, -110.0, 45.2]
Covers Magic Valley / Twin Falls (SW), the Lost River Range / Borah Peak
and Sawtooths (central Idaho, seismically active -- 1983 M6.9), the eastern
Snake River Plain / INL, and the Yellowstone caldera (NW Wyoming). An empty
bbox disables the geographic filter (accepts all).

Wiring:
- config.py: new USGSQuakeConfig dataclass; usgs_quake field on
  EnvironmentalConfig; loader branch in _dict_to_dataclass.
- store.py __init__: registers self._adapters["usgs_quake"] when enabled --
  this is what grows the live adapter count 7 -> 8.
- store._ingest: NO dedicated branch added. usgs_quake is a standard
  per-event adapter, so the existing generic "else" loop (dedup on
  (source, event_id) + _emit_event) already routes it. (The swpc/ducting
  branches are special only because they also maintain status blobs.)
- env_feeds.yaml (live /data/config): added usgs_quake block, enabled:true,
  default bbox/min_mag/region.

Rule 17: GUI-editable config (env_feeds.yaml). Rule 18 N/A -- USGS
earthquake feed is keyless (no .env entry; .ref credentials has no
USGS/ArcGIS/quake key). Rule 16: standalone path established + validated
in-container.

Tests: tests/test_adapter_usgs_quake.py (15 tests) mirrors the 2.12/2.13
shape -- severity bins, _fetch severity assignment, magnitude filter,
geographic filter (in-bbox vs California/out), empty-bbox-accepts-all,
dedup id stable across ticks for the same quake id, category, severity
pass-through, group_key/inhibit_keys, field population, defensive cases
(missing id/coords/magnitude/corrupted -> None), and malformed-feature
skipping. _fetch tests patch urlopen with synthetic FeatureCollections.
Full suite: 248 passed.

Live smoke test (prod container, rebuilt): clean startup, adapter count
grew 7 -> 8 ("EnvironmentalStore initialized with 8 adapters"), healthy,
no traceback, no usgs_quake errors. In-container standalone tick over the
real feed succeeded (is_loaded=true, last_error=null,
consecutive_errors=0); the feed returned 54 global M2.5+ quakes, 0 inside
the Magic Valley->Yellowstone bbox right now (quiet) -- so no Event is
emitted, acceptable, and it exercises the fetch + magnitude + geographic
filter + no-emit path on live data. The emission path (in-region quake ->
earthquake_event) is unit-validated and uses the same store->bus path
emitting live for NWS, traffic, and NIFC fires.

Note (.gitignore): line 36 `env/` (a virtualenv pattern under "Virtual
environments") collaterally matches meshai/env/, so this NEW file required
`git add -f` (untracked files there are otherwise ignored and hidden from
status). Existing tracked env files are unaffected. Recommended follow-up:
anchor the rule to `/env/` so future net-new env adapters don't need -f.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 00:10:39 +00:00
9674e94efb Phase 2.4: LLM-summarized digest with master toggle filter
- Remove severity-based fork; tee pattern sends all events to both dispatcher and accumulator
- Add ToggleFilter before tee; drops events for disabled toggles
- Rework DigestAccumulator: event log instead of active/resolved tracking
- render_digest now async, calls LLM once per toggle with severity-ordered events
- Fallback to count-based summary when LLM unavailable
- Add TogglesConfig to config.py for master toggle settings
- Update scheduler to await async render_digest
- 75 tests passing

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-15 02:37:12 +00:00
d6bc6b2b89 build: normalize all line endings to LF
One-time renormalization pass under the .gitattributes added in the
previous commit. Every tracked text file now uses LF. No semantic
changes — verified via git diff --cached --ignore-all-space showing
zero real differences. Future diffs will only show real content
changes.

This commit will appear huge in git log --stat but represents zero
behavior change. Use git log --follow --ignore-all-space or
git blame -w when archaeologically tracing through this commit.
2026-05-14 22:43:06 +00:00
493b43f7cf feat(notifications): Phase 2.3b digest scheduler
Adds DigestScheduler class that fires digest at configured time (default 07:00)
and routes to rules with trigger_type=schedule and schedule_match=digest.

- DigestScheduler: asyncio task with start/stop lifecycle
- Config: DigestConfig dataclass with schedule and include fields
- Config: schedule_match field on NotificationRuleConfig
- Pipeline: start_pipeline/stop_pipeline async lifecycle functions
- Mesh channels get per-chunk delivery, email/webhook get full text
- 26 new tests covering schedule computation, fire behavior, lifecycle

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-14 22:32:51 +00:00
e6897b3f33 feat(notifications): add region tagger with coordinate and NWS zone matching
Adds meshai/notifications/region_tagger.py with:
- haversine_distance() for great-circle distance calculation
- tag_by_coordinates() maps lat/lon to nearest region within radius
- tag_by_nws_zone() maps NWS zone codes to matching regions

Also adds nws_zones field to RegionAnchor in config.py to support
zone-based matching. Default is empty list for backward compatibility.

This is scaffolding for Phase 2 - not yet wired into any adapters.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-14 16:26:53 +00:00
344ca0677d fix(notifications): complete severity cleanup to 3-level system
- Replace 11 info fallbacks with routine in router.py + channels.py
- Replace 2 warning min_severity defaults with priority
- Update config.example.yaml rules to use routine/priority/immediate
- Annotate config.example.yaml notifications section as transitional pending v0.3 8-toggle rewrite Phase 1.2

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-14 07:00:58 +00:00
95ec7d5351 fix: notification system improvements and threshold corrections
- Fix leftover severity references (info→routine in filter dropdown)
- Fix node_id int handling in connector and channels (handle both int and string)
- Add LLM-generated reports for notifications (replace raw data dumps)
- Fix health.score.composite attribute path for RF reports
- Add deterministic HF band conditions from SFI/Kp values
- Remove max_tokens from LLM calls (character limits at delivery)
- Weather feed improvements: show event_type + area, local events first
- Fix is_online to use configured offline_threshold_hours in data store
- Update stale defaults: offline 24→2h, battery_warning 20→30%
- Add TODO comments for packet_threshold scale bug

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-14 06:03:51 +00:00
49f2838048 refactor: simplify severity to 3 levels (routine/priority/immediate)
- Replace 6-level system (info/advisory/watch/warning/critical/emergency)
  with 3-level military precedence (routine/priority/immediate)
- Every adapter remapped: NWS, NIFC, FIRMS, USGS, SWPC, avalanche,
  traffic, 511, mesh alerts
- is_critical flag removed — severity covers it
- Quiet hours: suppress routine only, priority+immediate always deliver
- Dashboard: blue/amber/red for routine/priority/immediate
- Fix hex node ID parsing in Mesh DM channel (!23261b70 format)
2026-05-13 19:05:50 -06:00
c6b4a64163 fix(health): use real telemetry, fix hardcoded thresholds
- Utilization pillar reads firmware channel_utilization (max of infra
  nodes) instead of estimating from packet counts × 200ms
- is_online uses configured threshold, not hardcoded 24 hours
- Updated defaults: offline 2h (was 24h), battery warning 30% (was 20%)
- Utilization thresholds: 20/25/35/45% matching real Meshtastic behavior
- Behavior pillar threshold aligned with notification config (7200/day)
- has_solar marked as dead code pending Solar Quality Engine

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-13 23:06:31 +00:00
64faf33e3b feat: researched defaults + USGS auto-lookup + category documentation
- Battery thresholds: 30%/15%/5% with voltage equivalents (3.60V/3.50V/3.40V)
- Channel utilization threshold: 40% (firmware throttles GPS at 25%)
- Packet flood threshold: 10 packets/min per node (was 500/day)
- Mesh health threshold: 65 (was 70)
- USGS adapter with NWS NWPS flood stage auto-lookup
- API endpoint: GET /api/env/usgs/lookup/{site_id}
- Alert categories with detailed descriptions and example messages
- Packet flood vs stream flood terminology fully disambiguated

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-13 15:14:16 +00:00
d90b787c12 refactor(notifications): complete UX redesign
- Self-contained rules replace abstract channels
- Inline delivery config (broadcast/DM/email/webhook or none)
- quiet_hours_enabled master toggle separate from start/end times
- delivery_type="" valid: rule matches but does not deliver
- Severity dropdown with plain-English descriptions
- Example messages per alert category
- Default baseline rules: Emergency Broadcast, Infrastructure Down, Fire Alert, Severe Weather
- Condition vs Schedule trigger types
- Test and preview buttons per rule
- stream_flood_warning renamed from flood_warning (distinct from packet_flood)
- Categories display with descriptions

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-13 14:25:57 +00:00
b4f7e24c26 refactor(notifications): self-contained rules, remove abstract channels
- Each notification rule contains its own delivery config inline
- No more separate channels with abstract IDs to cross-reference
- Delivery type selector (Mesh Broadcast/DM/Email/Webhook) with
  inline config fields per type
- Follows MeshMonitor trigger-action UX pattern
- Channel picker from radio for mesh broadcast
- Node picker for mesh DM
- Collapsed rule cards show readable one-line summary
- Trigger type: condition (alerts) or schedule (daily reports)
- Schedule triggers support daily, weekly, custom cron
- Message types: mesh health, RF propagation, alerts digest, custom
- Migrates old channels+rules config to new flat format on load

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-13 07:31:59 +00:00
947cce514e feat(dashboard): comprehensive config UI with help and descriptions
- Add InfoButton component with click-to-toggle popover for field help
- Add SectionDescription component for section intro paragraphs
- Add AlertRuleToggle component with grouped threshold controls
- Add detailed info and helper text for every field in all sections
- Convert Commands section to toggleable command list with descriptions
- Add dropdowns for severity_min, fire state, connection type, LLM backend
- Add region management: Add/Delete buttons with confirmation
- Group alert rules by category: Infrastructure, Power, Utilization, Health
- Remove hardcoded placeholders and Idaho-specific text
- Fix config.py DashboardConfig dataclass decorator
- Fix main.py MessageRouter initialization

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-13 04:26:17 +00:00
3bf5e3dfbc feat(notifications): alert routing with channels, rules, and delivery
- Notification pipeline: categories -> rules -> channels
- Channels: mesh broadcast, mesh DM, email (SMTP), webhook (generic)
- Per-rule severity threshold and category filtering
- Quiet hours with emergency override
- LLM summarization for mesh delivery over 200 chars only
- !subscribe shows available categories, easy mesh subscription
- Dashboard notification rules API endpoints
- Extensible channel system for future transports (Winlink, JS8Call)
- config.yaml notification section with examples

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-13 03:51:37 +00:00
3d74eb92b0 feat(env): add NASA FIRMS satellite fire hotspot detection
- Implement FIRMSAdapter polling NASA FIRMS area API for satellite hotspots
- Cross-reference hotspots against NIFC perimeters to identify new ignitions
- Add !hotspots command with --new flag for filtering new ignitions only
- Add FIRMSConfig dataclass with map_key, source, bbox, day_range options
- Add /api/env/hotspots endpoint for dashboard integration
- Add Satellite Hotspots section to Environment.tsx with NEW badges
- Add FIRMS configuration section to Config.tsx with source/confidence options
- Update config.example.yaml with FIRMS configuration template

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-12 23:06:55 +00:00
f8bf7e5057 feat(env): USGS stream gauges, TomTom traffic, 511 road conditions 2026-05-12 22:22:57 +00:00
ab7392c518 feat: Add MQTT source adapter 2026-05-12 21:57:11 +00:00
c5f4dac8b6 fix: remove hardcoded timezone and region names for portability
- Timezone now configurable (default America/Boise)
- Router prompt generates region name instructions from config
- Any operator can run MeshAI for their region without code changes

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-12 15:39:54 -06:00
549ae4bdfb feat(env): NWS weather alerts, NOAA space weather, tropospheric ducting
- Environmental feed system with tick-based adapters
- NWS Active Alerts: polls api.weather.gov, zone-based filtering
- NOAA SWPC: Kp, SFI, R/S/G scales, band assessment, alert detection
- Tropospheric ducting: Open-Meteo GFS refractivity profile, duct classification
- !alerts command for active weather warnings
- !solar / !hf commands for RF propagation (HF + UHF ducting)
- Alert engine integration: severe weather, R3+ blackout, ducting events
- LLM context injection for weather/propagation queries
- Dashboard RF Propagation card with HF + UHF ducting display
- EnvironmentalConfig with per-feed toggles in config.yaml
2026-05-12 17:21:43 +00:00
3ec09ad158 feat(dashboard): embedded FastAPI backend with REST API + WebSocket
- FastAPI runs in MeshAI asyncio loop (no separate process)
- REST API: /api/status, /api/health, /api/nodes, /api/edges,
  /api/regions, /api/sources, /api/config, /api/alerts
- WebSocket at /ws/live pushes health updates and alerts
- Config CRUD: GET/PUT per section with validation and save
- DashboardConfig with port/host in config.yaml
2026-05-12 15:47:58 +00:00
b11874f016 feat(knowledge): add Qdrant backend with SQLite fallback
- Add QdrantKnowledgeSearch class for hybrid dense+sparse vector search
- Query RECON's 2.8M vector database via TEI embeddings + Qdrant
- Uses RRF (Reciprocal Rank Fusion) for hybrid search merging
- Extended KnowledgeConfig with Qdrant/TEI settings
- Auto backend tries Qdrant first, falls back to SQLite FTS5
- Graceful degradation if RECON infrastructure unreachable

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-06 15:54:43 +00:00
34f894ea79 fix: ACK-accelerated delivery — immediate on ACK, retry once, abort on double fail
Delays 1.5-2.5s (was 3-5s, only for broadcasts now).
DMs: send → ACK → next immediately. No ACK → retry once → abort.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-06 14:10:16 +00:00
736d6a313a feat: Full alert engine — 17 conditions, scaling cooldown, per-condition TUI toggles
Alert conditions across all 5 pillars:
  Infrastructure: offline, recovery, new router
  Power: battery 50/25/10%, 7-day trend, USB→battery, solar not charging
  Utilization: sustained >20% for 6h, packet flood >500/24h
  Coverage: infra single gateway, feeder offline, region blackout
  Scores: mesh <70, region <60

Scaling cooldown: immediate → 12h → 24h → 48h → stop
Recovery notifications when conditions resolve
Per-condition on/off toggles in TUI
Battery trend queries SQLite node_snapshots for 7-day history

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-06 05:39:11 +00:00
c8400233bd feat: alert detection and dispatch system for real-time mesh alerts
- Add AlertEngine to detect infra down/recovery, battery critical, critical node offline
- Add alert_channel, critical_nodes, alert_cooldown_minutes config options
- Wire alerts to channel broadcast and DM subscribers
- Add TUI options for critical nodes, alert channel, cooldown

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-06 04:48:41 +00:00
0da697855e fix: delay_max 3.0 → 5.0 for randomized message pacing
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-06 04:24:12 +00:00
b7237469e4 feat: ACK-based message delivery, markdown stripping, prompt fixes
- Connector: send_and_wait_ack() waits for ACK before returning
- Responder: ACK waiting for DMs with retry, delay-based for broadcasts
- Chunker: strip_markdown() removes bold/italic/headers/lists from LLM output
- Router: applies strip_markdown before chunking
- Prompt: stronger no-markdown rules, no manual continuation prompt, gateway explanation
- Config: delays 3-5s (was 2.2-3), max_length 200 (was 150), max_messages 3 (was 2)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-06 04:17:44 +00:00
79ff756a38 fix: Remove hard-coded token limits on LLM responses
- Add max_response_tokens config (8192) to LLMConfig
- Use config value in router.py instead of hardcoded 500
- Update base.py default from 300 to 8192
- Lets LLM generate full responses; chunker handles size limits

Fixes truncated responses like Here are three nodes in the freq

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-05 22:06:50 +00:00
b3c79f12da feat: Tick-based staggered polling for all sources
Both MeshviewSource and MeshMonitorDataSource now use tick-based
staggered polling instead of batch-every-5-minutes:

MeshviewSource (30s ticks):
- Packets: every tick (30s)
- Nodes: every 4 ticks (2 min)
- Stats/Edges: every 6 ticks (3 min)
- Traceroutes: every 10 ticks (5 min)

MeshMonitorDataSource (30s ticks):
- Packets: every 2 ticks (60s)
- Nodes/Telemetry: every 4 ticks (2 min)
- Traceroutes/Channels/Network/Topology: every 10 ticks (5 min)
- Solar: every 20 ticks (10 min)

Features:
- Source health status (avg_response_ms, tick_count, backed_off)
- Source coverage analysis (unique vs shared nodes)
- Tier 1 DATA SOURCES section shows all source health
- Node detail shows source visibility
- Incremental packets and telemetry with dedup
- Rate limit detection (429) with backoff
- Consecutive error exponential backoff
- polite_mode config option for shared instances

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-05 19:16:00 +00:00
aef78996a9 feat: Named single-gw nodes, infra-only monitoring, commands overhaul
Coverage:
- Single-gateway infra nodes named as critical risks per region
- Client single-gw nodes counted but not individually named
- Mesh-wide single-gw infra summary

Monitoring rules by node type:
- Infrastructure: full detail - battery, offline, coverage, neighbors, hardware
- Clients causing problems: named - high util, top senders
- Clients otherwise: counted per region, not individually tracked
- POWER breakdown now infra-only

Commands:
- Removed hardcoded command list from config.py system_prompt
- Dynamic command list in router.py from dispatcher (only enabled commands)
- MeshMonitor commands no longer listed as MeshAI commands
- !help overhaul: grouped by category, per-command detailed help
- LLM explicitly told to only mention listed commands

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-05 15:33:14 +00:00
de400068dd feat(mesh): config-driven regions with stale purge and coverage fix
- Extended RegionAnchor with local_name, description, aliases, cities
- Moved region geographic context from hardcoded Python to config.yaml
- Added 7-day stale node purge in _do_refresh (556 → 267 nodes)
- Fixed coverage lookup: str(node_num) → node_num (int key)
- Added bidirectional neighbor lookup for better region assignment
- Dynamic geography building in router from config
- Reporter reads region context from config instead of hardcoded dict

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-05 06:33:38 +00:00
45630f2cc6 fix: Override LLM brevity for mesh questions — give detailed data responses
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-05 05:29:09 +00:00
c3f1347b4b refactor: Replace auto-clustering with fixed region anchors
- Regions are now user-defined anchor points (name + lat/lon)
- Nodes assigned to nearest region, no distance limits
- Removed auto-naming and region_labels/infra_overrides
- Added Idaho region defaults in TUI
- Simpler, deterministic, user-controlled

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-04 17:35:28 +00:00
a7c409e406 feat: Add Phase 2 - Geographic Hierarchy and Health Scoring
Implements mesh intelligence with geo clustering, four-pillar health scoring,
and auto-naming regions from GPS data.

New: geo.py, mesh_health.py
Modified: config.py, main.py, router.py, configurator.py, config.example.yaml

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-04 16:43:12 +00:00
b945558ba3 feat: Phase 1 — multi-source data aggregation from Meshview and MeshMonitor APIs
- Add MeshviewSource class for fetching nodes, edges, stats from Meshview API
- Add MeshMonitorDataSource class for fetching nodes, channels, telemetry,
  traceroutes, network stats, topology, packets, solar from MeshMonitor API
- Add MeshSourceManager for managing multiple sources with aggregation
- Add MeshSourceConfig dataclass and mesh_sources list to config
- Integrate source_manager into main.py with periodic refresh
- Add source_manager parameter to MessageRouter (for future Phase 3)
- Add Mesh Sources TUI menu with add/edit/remove/test functionality
- Update config.example.yaml with mesh_sources section

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-04 16:26:58 +00:00
root
0e36869a5f feat: Hybrid RAG knowledge base, sentence-aware chunking, MeshMonitor HTTP sync
Knowledge Base:
- Hybrid FTS5 + vector search using sqlite-vec and bge-small-en-v1.5
- Reciprocal Rank Fusion for result merging
- Domain-aware query construction handles typos
- Configurable weights for keyword vs semantic matching

Message Chunking:
- Sentence-aware splitting respects message boundaries
- Continuation prompts for long responses
- Natural follow-up detection (yes, ok, continue, more, etc.)
- Per-user continuation state management

MeshMonitor Integration:
- HTTP API trigger sync (replaces file-based triggers.json)
- Dynamic refresh interval
- Trigger injection into LLM prompt

Other:
- Updated system prompt for better response length control
- Simplified responder to handle message lists
- Updated README with new features and architecture diagram
- Cleaned up config.example.yaml

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-04 07:44:12 +00:00
root
5f66b69c9c feat: Dynamic identity system prompt from bot config
- Build system prompt dynamically using bot.name and bot.owner from config
- Reorder prompt: identity -> static prompt -> MeshMonitor (conditional) -> mesh context
- MeshMonitor description only injected when meshmonitor.enabled is true
- Update default system_prompt to static parts only (commands, architecture, rules)
- Fix meshmonitor.py to handle trigger arrays (not just strings)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-03 05:45:58 +00:00
root
f6540e893d feat: Dynamic MeshMonitor trigger sync — auto-ignore MeshMonitor commands
Add MeshMonitorSync class that reads trigger patterns from a JSON file
and compiles them to regex. The router checks incoming messages against
these patterns and ignores messages that MeshMonitor will handle.

- New meshai/meshmonitor.py: Pattern compilation and file watching
- MeshMonitorConfig dataclass with enabled, triggers_file, inject_into_prompt
- Router integration: ignore matching messages, inject commands into prompt
- Main loop refresh: watch triggers file for changes without restart

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-03 03:20:23 +00:00
Ubuntu
1172b9b67f Add API timeout to all backends + mesh-aware system prompt
All three LLM backends (Google, OpenAI, Anthropic) now wrap API calls
in asyncio.wait_for() using config.timeout (default 30s). Previously
Gemini could hang indefinitely with grounding+AFC enabled.

Router catches TimeoutError with user-friendly "request timed out" message.
Empty context buffer now injects "[No recent mesh traffic observed yet.]"
so the LLM knows the capability exists even when buffer is empty.
Default system prompt updated to mention mesh awareness.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 07:27:39 +00:00
Ubuntu
63a2caad37 Add passive mesh context awareness — observe channel traffic, inject into LLM prompts
New context.py module: ring buffer (50K hard cap, ~25MB ceiling) passively
records all channel broadcasts. Observations are formatted with relative
timestamps and injected into the system prompt when generating LLM responses.
Only public channel traffic is observed; DMs to the bot are excluded (already
in per-user history). Bot's own node ID is auto-added to ignore list.

Config: context.enabled, observe_channels, ignore_nodes, max_age, max_context_items
TUI: new Context settings submenu (menu item 7)
Hourly prune removes expired observations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 22:02:42 +00:00
Ubuntu
1e033316fb Remove dead channel/mention code — DM-only bot cleanup
MeshAI is now DM-only. Removed all unreachable channel response
paths, @mention detection, ChannelsConfig, and channel TUI menu.
Fixed restart mechanism with integrated watcher and SIGKILL fallback.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 08:43:25 +00:00
Ubuntu
32147ccaec Add Commands section to TUI configurator with toggle for each built-in command
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 07:33:36 +00:00
Ubuntu
6e2d956be6 Migrate Google backend from deprecated google-generativeai to google-genai SDK with grounding support
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 04:44:44 +00:00
Ubuntu
5ad8da47bb Strip config to working features only
Remove ~15 unused dataclasses (RateLimitsConfig, LoggingConfig,
SafetyConfig, UsersConfig, CommandsConfig, PersonalityConfig,
WebStatusConfig, AnnouncementsConfig, WebhookConfig, IntegrationsConfig,
etc). Strip config.example.yaml and docker-entrypoint.sh defaults.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 00:25:34 +00:00
Ubuntu
247074afd1 Add advBBS protocol message filter to router
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 21:28:20 +00:00
Ubuntu
2a857bcb70 Deprecate Config.get_system_prompt() and fix callback type annotation
- 7a: Config.get_system_prompt() now logs a deprecation warning.
  PersonalityManager.get_system_prompt() is the canonical source (wired
  in commit 4). LLMConfig.system_prompt kept for backwards compat as
  fallback when personality is not configured.
- 7b: Fix AnnouncementScheduler callback type from asyncio.coroutine
  (a decorator, not a type) to Awaitable[None] from typing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 20:19:09 +00:00