Commit graph

4 commits

Author SHA1 Message Date
b948ed775f feat(v0.6-phase2): rip out quiet hours entirely -- dashboard toggle, config schema, pipeline checks. Per Matt's repeated feedback (saved as feedback-quiet-hours-trash.md): silent is better than ugly, mesh users who need a fire alert at 3 AM need it at 3 AM. No replacement.
Backend removals:
  meshai/config.py
    - NotificationRuleConfig.override_quiet field
    - NotificationToggle.quiet_hours_override field
    - NotificationsConfig.quiet_hours_enabled / quiet_hours_start /
      quiet_hours_end fields
    - _default_toggles() no longer sets quiet_hours_override=True
    - rule migration helper no longer copies override_quiet
  meshai/notifications/router.py
    - self._quiet_enabled / _quiet_start / _quiet_end instance vars
    - _in_quiet_hours() method (deleted entirely)
    - The dispatch-time check that suppressed non-overriding rules
      during quiet hours
    - 'override_quiet': False dropped from subscription rule dicts
  meshai/notifications/pipeline/dispatcher.py
    - _toggle_to_rule() no longer passes override_quiet=... to the
      NotificationRuleConfig constructor

Test changes:
  tests/test_notification_toggles.py
    - RecChannel.deliver() no longer records override_quiet
    - test_quiet_hours_override_immediate_only deleted (only tested the
      removed feature)

Frontend removals (dashboard-frontend/src/pages/Notifications.tsx):
  - The 'Enable Quiet Hours' card with its time-range inputs deleted
  - 'Override Quiet Hours' per-rule toggle deleted
  - 'Quiet-hours override (immediate only)' per-toggle field deleted
  - quiet_hours_* fields removed from TS interfaces
  - quietHoursEnabled prop + state plumbing removed from the RuleEditor
  - All override_quiet: false defaults dropped from rule scaffolds
  - Unused Moon icon import dropped

Verification (post-strip):
  grep -rn 'quiet_hours\|override_quiet' meshai/*.py meshai/**/*.py
    -> 0 hits
  grep -rn 'quiet_hours\|override_quiet\|quietHours' dashboard-frontend/src
    -> 0 hits
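
A note on the check above: in bash, `**` only recurses when the `globstar` option is enabled, so whether `meshai/**/*.py` reaches files three levels deep (such as meshai/notifications/pipeline/dispatcher.py) depends on the shell. A shell-independent sketch of the same scan follows. The function name `find_leftovers` and the demo tree are hypothetical, not part of the repo.

```python
import re
import tempfile
from pathlib import Path

# Same identifiers the grep commands above look for.
LEFTOVER = re.compile(r"quiet_hours|override_quiet|quietHours")


def find_leftovers(root, suffixes=(".py", ".ts", ".tsx")):
    """Return (path, line_number, line) for every leftover reference under root."""
    hits = []
    for path in sorted(Path(root).rglob("*")):  # rglob always recurses
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if LEFTOVER.search(line):
                hits.append((str(path), number, line.strip()))
    return hits


# Demo on a throwaway tree (hypothetical layout) with one deep leftover.
with tempfile.TemporaryDirectory() as tmp:
    deep = Path(tmp) / "meshai" / "notifications" / "pipeline"
    deep.mkdir(parents=True)
    (deep / "dispatcher.py").write_text("rule = {'override_quiet': False}\n")
    (Path(tmp) / "meshai" / "config.py").write_text("enabled = True\n")
    demo_hits = find_leftovers(tmp)
```

An empty result from `find_leftovers("meshai")` would correspond to the "0 hits" outcome reported above.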

Test count: 830 -> 829 (-1: test_quiet_hours_override_immediate_only
deleted; no other regressions).

No replacement. Mesh users who need a fire alert at 3 AM need it at 3 AM.
2026-06-05 20:39:36 +00:00
c333a97344 feat(v0.6-2): dispatcher state persistence -- cold-start, cooldowns, dedup LRU to SQLite
Closes Rule-20 dispatcher gap from audit doc v0.6-phase1-audit.md finding #1.
Before this commit, the cold-start anchor, 4 drop counters, per-toggle cooldown
map, and dedup OrderedDict all lived in Dispatcher instance memory and were
lost on every container restart.

v5.sql adds three tables:
  - dispatcher_state (singleton id=1): cold_start_anchor + 4 drop counters
  - dispatcher_cooldowns ((toggle,category,region) keyed): last_fired_at
  - dispatcher_dedup ((source,event_id) keyed): seen_at
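
For orientation, the three tables could look roughly like the sketch below. Table names, key columns, and the listed value columns come from this message; the column types, the `IF NOT EXISTS` guards, and the four counter column names (`drop_counter_1` to `drop_counter_4`) are assumptions, not the contents of v5.sql.

```python
import sqlite3

# Assumed DDL. Only table names, keys, cold_start_anchor, last_fired_at and
# seen_at are taken from the commit message; everything else is a guess.
V5_SKETCH = """
CREATE TABLE IF NOT EXISTS dispatcher_state (
    id INTEGER PRIMARY KEY CHECK (id = 1),      -- singleton row
    cold_start_anchor REAL,
    drop_counter_1 INTEGER NOT NULL DEFAULT 0,  -- placeholder counter names
    drop_counter_2 INTEGER NOT NULL DEFAULT 0,
    drop_counter_3 INTEGER NOT NULL DEFAULT 0,
    drop_counter_4 INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE IF NOT EXISTS dispatcher_cooldowns (
    toggle TEXT NOT NULL,
    category TEXT NOT NULL,
    region TEXT NOT NULL,
    last_fired_at REAL NOT NULL,
    PRIMARY KEY (toggle, category, region)
);
CREATE TABLE IF NOT EXISTS dispatcher_dedup (
    source TEXT NOT NULL,
    event_id TEXT NOT NULL,
    seen_at REAL NOT NULL,
    PRIMARY KEY (source, event_id)
);
"""


def apply_v5_sketch(conn):
    """Create the tables and seed the singleton row; safe to run repeatedly."""
    conn.executescript(V5_SKETCH)
    conn.execute("INSERT OR IGNORE INTO dispatcher_state (id) VALUES (1)")
    conn.commit()


conn = sqlite3.connect(":memory:")
apply_v5_sketch(conn)
apply_v5_sketch(conn)  # second run is a no-op, mirroring the idempotency test
tables = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
state_rows = conn.execute("SELECT COUNT(*) FROM dispatcher_state").fetchone()[0]
```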

Dispatcher refactor:
  - __init__ calls _restore_from_db -- counters, cold-start anchor, cooldown
    map, and dedup LRU (most-recent 10k by seen_at) all rehydrated from the
    three new tables
  - write-through on every mutation: _persist_state for counter/anchor,
    _persist_cooldown for cooldown UPSERT + 2*cooldown_s prune,
    _persist_dedup for dedup INSERT OR REPLACE + 7-day cleanup
  - in-memory caches stay authoritative on the fast read path
  - cumulative-since-install counters (NOT since-boot); LLM will be able
    to answer "we have dropped 47 stale events this week" after commit #5
    (env_reporter) lands
  - graceful degrade: missing v5 tables / persistence outage falls back to
    fresh in-memory state without crashing the constructor
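
The restore-then-write-through pattern described above can be sketched for the dedup path alone. This is a minimal illustration under assumptions: the class name `PersistentDedup` and the method `check_and_record` are hypothetical, and only the 10k cap, the 7-day cleanup, the INSERT OR REPLACE write, and the fall-back-to-memory behavior are taken from this message.

```python
import sqlite3
import time
from collections import OrderedDict

DEDUP_CAP = 10_000           # "most-recent 10k by seen_at"
DEDUP_TTL_S = 7 * 24 * 3600  # "7-day cleanup"


class PersistentDedup:
    """Hypothetical stand-in for the dispatcher's dedup LRU."""

    def __init__(self, conn):
        self.conn = conn
        self.seen = OrderedDict()  # in-memory cache stays the fast read path
        self._restore_from_db()

    def _restore_from_db(self):
        try:
            rows = self.conn.execute(
                "SELECT source, event_id, seen_at FROM dispatcher_dedup "
                "ORDER BY seen_at DESC LIMIT ?",
                (DEDUP_CAP,),
            ).fetchall()
        except sqlite3.Error:
            return  # graceful degrade: start with empty in-memory state
        for source, event_id, seen_at in reversed(rows):  # oldest first
            self.seen[(source, event_id)] = seen_at

    def check_and_record(self, source, event_id, now=None):
        """Return True for a new event, False for a duplicate."""
        now = time.time() if now is None else now
        key = (source, event_id)
        if key in self.seen:
            return False
        self.seen[key] = now
        while len(self.seen) > DEDUP_CAP:
            self.seen.popitem(last=False)  # evict the oldest entry
        self._persist_dedup(source, event_id, now)
        return True

    def _persist_dedup(self, source, event_id, now):
        try:
            with self.conn:  # one transaction: write-through plus cleanup
                self.conn.execute(
                    "INSERT OR REPLACE INTO dispatcher_dedup VALUES (?, ?, ?)",
                    (source, event_id, now),
                )
                self.conn.execute(
                    "DELETE FROM dispatcher_dedup WHERE seen_at < ?",
                    (now - DEDUP_TTL_S,),
                )
        except sqlite3.Error:
            pass  # persistence outage: keep serving from memory


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dispatcher_dedup (source TEXT, event_id TEXT, seen_at REAL, "
    "PRIMARY KEY (source, event_id))"
)
before = PersistentDedup(conn)
first_seen = before.check_and_record("wfigs", "irwin-1", now=1000.0)
after = PersistentDedup(conn)  # simulated restart on the same database
replayed = after.check_and_record("wfigs", "irwin-1", now=1001.0)
fresh = after.check_and_record("wfigs", "irwin-2", now=1002.0)
```

After the simulated restart the replayed event is rejected because the second instance rehydrates its cache from the table rather than starting empty.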

Tests:
  - tests/test_dispatcher_persistence.py (17 tests): state restore on init,
    counter+cooldown+dedup survival across simulated restart, cooldown rearm
    within 2x window, dedup LRU rebuild caps at 10k, 7-day cleanup on insert,
    INSERT OR REPLACE on duplicate source+event_id, v5 migration idempotent,
    synthetic storm (50 events) -> restart -> replay (5 incl 1 duplicate)
    with the duplicate dedup-rejected and counters NOT resetting
  - tests/conftest.py (new): autouse MESHAI_DB_PATH redirection to per-test
    tmp file, so the dispatcher_* tables on production /data don't get
    polluted by tests that construct Dispatcher() without an explicit fixture
  - tests/test_notification_toggles.py: _dispatch helper wipes dedup/cooldown/
    state tables between calls (per-call independence preserved; pre-v0.6-2
    in-memory-only Dispatcher reset naturally per instance)

Test count: 680 -> 697 (+17 new, 0 regressions).

Refs audit doc v0.6-phase1-audit.md finding #1.
2026-06-05 16:35:40 +00:00
053d67db6e feat(v0.5.8b): persistence foundation + WFIGS handler + universal cold-start grace
Three integrated pieces that ship together because they were designed as one
safety story:

(1) PERSISTENCE FOUNDATION
  - New meshai/persistence/ module with SQLite db.py and a schema migration
    framework (v1).
  - 13 tables covering all adapter event shapes (traffic_events, fires,
    firms_pixels, quake_events, nws_alerts, gauge_readings, swpc_events) +
    mesh state (mesh_nodes, mesh_telemetry, mesh_positions, mesh_messages_in,
    mesh_broadcasts_out, mesh_health_events) + cross-cutting event_log +
    schema_meta.
  - WAL mode for reader concurrency, single-writer pattern, MESHAI_DB_PATH env
    var, mounted at /data/meshai.sqlite via existing docker-compose
    meshai_data volume. .gitignore updated.

(2) WFIGS HANDLER
  - meshai/central/wfigs_handler.py implements the first per-adapter handler
    that uses the persistence layer.
  - Format: MEDIUM style with town/landclass/county fallback chain, lat/lon at
    3-decimal precision, New:/Update: prefix.
  - 8h-rate-limited change-detection per IRWIN via fires.last_broadcast_at.
  - Skips tombstones and perimeters silently (logged to event_log with
    handled=0).
  - Acres fallback chain DailyAcres -> IncidentSize -> raw.DiscoveryAcres ->
    raw.FinalAcres -> N/A.
  - Pass-through Initial Attack auto-numbered names (IA 1, IA 2).

(3) UNIVERSAL COLD-START GRACE
  - meshai/notifications/pipeline/dispatcher.py grows a configurable grace
    window (cold_start_grace_seconds, default 60s, GUI-editable per Rule 17).
  - Anchored to first-event-seen (not container boot), so the grace activates
    the moment broadcasts could fire.
  - Suppresses mesh delivery during the window; handler-side persistence
    (fires UPSERT, event_log) still happens normally.
  - New _cold_start_dropped counter exposed in dispatch_stats().
  - Designed to protect against JetStream backlog spam at toggle-flip time,
    applies universally to ALL adapters.

(4) WFIGS HANDLER CALLBACK REFACTOR
  - New:/Update: prefix now keys on fires.last_broadcast_at IS NULL (not
    row-missing), and last_broadcast_* field updates moved to a post-broadcast
    commit callback that the dispatcher invokes ONLY on successful delivery.
  - This means: cold-start-suppressed events leave fires.last_broadcast_at
    NULL, so when they eventually broadcast post-grace, they correctly render
    as New: (first ACTUAL delivery for that IRWIN), not Update:.
  - event_log.handled and mesh_broadcasts_out audit row also gated on the same
    callback -- decoupling persistence rows from broadcast rows for an honest
    audit trail.

Tests:
  - 15 in test_wfigs_handler.py, 15 in test_persistence.py, additional
    cold-start grace tests in test_dispatcher.py (+4 WFIGS callback scenarios).
  - Synthetic probes wfigs-cleaned-samples.md (initial) and
    wfigs-cleaned-samples-v2.md (cold-start verification) generated against
    isolated temp SQLite databases.
  - CT108 /data/meshai.sqlite untouched during build. Master stays off. No
    live toggle flips.

Test count: was 535 (v0.5.7 baseline) -> 566 (persistence) -> 581 (wfigs
handler) -> 589 expected (cold-start grace).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-05 03:54:04 +00:00
b90afc3a74 feat(notifications): v0.5.0 -- Master Toggles UX redesign + Central Connection GUI + grouped categories + region scoping
Per-family notification policy (PagerDuty/Grafana-style): each family gets a
severity threshold + region scope + a severity->channel routing matrix, so an
operator opts in per family rather than hand-writing rules.

SECTION 1 -- BACKEND
- config.py: new NotificationToggle dataclass (enabled, min_severity, regions,
  severity_channels{severity->[channel types]}, quiet_hours_override, + per-channel
  delivery config: broadcast_channel/node_ids/smtp_*/recipients/webhook_*).
  notifications.toggles is now a dict[family]->NotificationToggle with 8 family
  defaults (mesh_health, weather, fire, rf_propagation, roads, avalanche, seismic,
  tracking), all enabled=false (opt-in), min_severity=priority,
  severity_channels={priority:[mesh_broadcast], immediate:[mesh_broadcast, mesh_dm]},
  quiet_hours_override=true. (Old TogglesConfig.enabled was only read by
  build_pipeline via getattr -> degrades to ToggleFilter no-op, so the pipeline
  filter is unchanged; toggles now drive the Dispatcher instead.)
- region_scope:list added to NotificationRuleConfig; _matching_rules filters by
  event.region/regions ([] = all).
- Dispatcher: _dispatch_toggles runs IN PARALLEL to rule matching -- looks up
  get_toggle(event.category), checks enabled + region scope + severity threshold,
  then for each channel in severity_channels[event.severity] builds a synthetic
  rule (override_quiet set only for immediate when quiet_hours_override) and
  delivers. 'digest' channel is skipped in live dispatch (handled by accumulator).
- categories.py: get_toggle() prefix fallback maps the live phases-2.7-2.14
  categories (weather_warning, wildfire_incident, earthquake_event,
  traffic_congestion, geomagnetic/rf_*, stream_*, ...) to their family, fixing the
  v0.4 "category -> other" gap.
- config_loader.py: SECRET_FIELDS += notifications.toggles.*.smtp_password.
- _dataclass_to_dict now recurses dict-of-dataclasses, and the loader coerces the
  toggles dict -> NotificationToggle on both the full-load and section-PUT paths
  (so GUI save round-trips correctly).
- tests/test_notification_toggles.py (11): enabled/disabled, region filter
  (empty+populated+regions-list), severity threshold, per-severity channel routing,
  digest-skipped-live, quiet-hours-override immediate-only, category->family,
  rules+toggles both fire. Full suite: 294 passed (283 + 11).
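
The toggle path described in this section (family lookup with prefix fallback, then enabled, region scope, severity threshold, and per-severity channel routing) can be sketched as below. It is an illustration under assumptions: the severity ordering (including the "routine" level), the prefix table, and the function names `family_for` and `channels_for` are hypothetical, the dataclass is trimmed to the fields used here, and quiet-hours handling is left out.

```python
from dataclasses import dataclass, field

# Assumed ordering; only "priority" and "immediate" appear in this message.
SEVERITY_ORDER = ["routine", "priority", "immediate"]

# Assumed prefix -> family table; the real mapping lives in categories.py.
FAMILY_PREFIXES = {
    "weather_": "weather",
    "wildfire_": "fire",
    "earthquake_": "seismic",
    "traffic_": "roads",
}


@dataclass
class NotificationToggle:
    enabled: bool = False  # opt-in by default
    min_severity: str = "priority"
    regions: list = field(default_factory=list)  # [] = all regions
    severity_channels: dict = field(
        default_factory=lambda: {
            "priority": ["mesh_broadcast"],
            "immediate": ["mesh_broadcast", "mesh_dm"],
        }
    )


def family_for(category):
    """Map a live category such as weather_warning to its toggle family."""
    for prefix, family in FAMILY_PREFIXES.items():
        if category.startswith(prefix):
            return family
    return None


def channels_for(toggles, category, severity, region):
    """Return the channels a toggle would deliver to, or [] if it drops the event."""
    toggle = toggles.get(category) or toggles.get(family_for(category))
    if toggle is None or not toggle.enabled:
        return []
    if toggle.regions and region not in toggle.regions:
        return []
    if SEVERITY_ORDER.index(severity) < SEVERITY_ORDER.index(toggle.min_severity):
        return []
    # 'digest' is skipped in live dispatch (handled by the accumulator).
    return [c for c in toggle.severity_channels.get(severity, []) if c != "digest"]


toggles = {"weather": NotificationToggle(enabled=True, regions=["Magic Valley"])}
in_region = channels_for(toggles, "weather_warning", "priority", "Magic Valley")
out_of_region = channels_for(toggles, "weather_warning", "priority", "Elsewhere")
below_threshold = channels_for(toggles, "weather_warning", "routine", "Magic Valley")
```

The demo mirrors the synthetic NWS check in Section 3: an in-region priority warning routes to mesh_broadcast, while out-of-region and below-threshold events yield no channels.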

SECTION 2 -- FRONTEND
- Notifications.tsx: MasterToggles component above the rules section -- 8 family
  cards (icon + enable toggle; collapsed summary 'OFF' or 'N regions, M channels at
  <sev>+'; expanded: severity threshold, severity x channel checkbox matrix,
  region list, quiet-hours-override toggle, per-channel config:
  broadcast_channel/DM node IDs/recipients/SMTP host+port/webhook URL).
- Environment.tsx: CentralConnectionPanel above the family tabs (url, durable,
  enabled) wired to environmental.central.
- npm run build clean (tsc strict); rebuilt static committed (index-CfYlhn4e.js).

SECTION 3 -- VERIFICATION
- py_compile + tsc strict clean; pytest 294 passed.
- Rebuilt prod: /notifications serves Master Toggles, /environment serves Central
  Connection (strings confirmed in the served bundle); 8 adapters, pipeline
  started, no tracebacks, healthy.
- GUI round-trip: enable weather toggle (min_severity=priority,
  regions=[Magic Valley], severity_channels.priority=[mesh_broadcast]) -> PUT
  {saved:true} -> notifications.yaml reflects it; env_feeds traffic.api_key stayed
  ${TOMTOM_API_KEY} (C.3.1 secret preservation holds). Restored to clean opt-in
  baseline.
- Synthetic NWS weather_warning/priority/Magic Valley -> routes through the weather
  toggle to mesh_broadcast; out-of-region and below-threshold events correctly
  dropped.

DEFERRED (noted for a follow-up, not blocking Matt's morning config): the Section 2B
rules-editor polish items -- grouped-by-family category checkboxes, region_scope
multi-select in the rule editor (backend field + filtering ARE in), tooltips, and
the fire-count Active/No-activity badge -- were not built tonight, to keep the build
shippable and verified. The Advanced Rules section is otherwise unchanged and
still functional.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 07:00:10 +00:00