Commit graph

5 commits

Author SHA1 Message Date
c333a97344 feat(v0.6-2): dispatcher state persistence -- cold-start, cooldowns, dedup LRU to SQLite
Closes Rule-20 dispatcher gap from audit doc v0.6-phase1-audit.md finding #1.
Pre-this-commit the cold-start anchor, 4 drop counters, per-toggle cooldown
map, and dedup OrderedDict all lived in Dispatcher instance memory and were
lost on every container restart.

v5.sql adds three tables:
  - dispatcher_state (singleton id=1): cold_start_anchor + 4 drop counters
  - dispatcher_cooldowns ((toggle,category,region) keyed): last_fired_at
  - dispatcher_dedup ((source,event_id) keyed): seen_at

Dispatcher refactor:
  - __init__ calls _restore_from_db -- counters, cold-start anchor, cooldown
    map, and dedup LRU (most-recent 10k by seen_at) all rehydrated from the
    three new tables
  - write-through on every mutation: _persist_state for counter/anchor,
    _persist_cooldown for cooldown UPSERT + 2*cooldown_s prune,
    _persist_dedup for dedup INSERT OR REPLACE + 7-day cleanup
  - in-memory caches stay authoritative on the fast read path
  - cumulative-since-install counters (NOT since-boot); LLM will be able
    to answer "we have dropped 47 stale events this week" after commit #5
    (env_reporter) lands
  - graceful degrade: missing v5 tables / persistence outage falls back to
    fresh in-memory state without crashing the constructor

Tests:
  - tests/test_dispatcher_persistence.py (17 tests): state restore on init,
    counter+cooldown+dedup survival across simulated restart, cooldown rearm
    within 2x window, dedup LRU rebuild caps at 10k, 7-day cleanup on insert,
    INSERT OR REPLACE on duplicate source+event_id, v5 migration idempotent,
    synthetic storm (50 events) -> restart -> replay (5 incl 1 duplicate)
    with the duplicate dedup-rejected and counters NOT resetting
  - tests/conftest.py (new): autouse MESHAI_DB_PATH redirection to per-test
    tmp file, so the dispatcher_*  tables on production /data dont get
    polluted by tests that construct Dispatcher() without an explicit fixture
  - tests/test_notification_toggles.py: _dispatch helper wipes dedup/cooldown/
    state tables between calls (per-call independence preserved; pre-v0.6-2
    in-memory-only Dispatcher reset naturally per instance)

Test count: 680 -> 697 (+17 new, 0 regressions).

Refs audit doc v0.6-phase1-audit.md finding #1.
2026-06-05 16:35:40 +00:00
b2c4d53b14 feat(v0.6-1): FIRMS handler -- storage-only, closes silent-drop on central.fire.hotspot.>
v0.5.13 default-deny was silently dropping every FIRMS hotspot because no
per-adapter handler existed. firms_pixels table has been empty since v0.5.8b.

This commit adds central/firms_handler.py which stores every passing pixel
that clears the (currently hardcoded, future GUI-driven) confidence + FRP
floors, with dedup on (round(lat,5), round(lon,5), acq_time, satellite) via
a unique partial index added in v4.sql. NO mesh broadcasts emitted by this
handler -- FIRMS data is for LLM context only and will become queryable
when commit #5 (env_reporter) lands.

Defaults baked in:
  FIRMS_CONFIDENCE_FLOOR = "low"   -- store every confidence level
  FIRMS_FRP_FLOOR        = 0.0     -- store every FRP value
  FIRMS_BBOX_OPTIONAL    = None    -- no spatial filter

These become adapter_config GUI rows in commit #3 (per Matt's v0.6 Phase 1
refinement: hardcoded values become GUI default values so first-deploy
behavior is unchanged).

Wiring:
  - meshai/central/firms_handler.py (new, 270 lines)
  - meshai/persistence/migrations/v4.sql (new, unique dedup index)
  - meshai/persistence/db.py (SCHEMA_VERSION 3 -> 4)
  - meshai/central/consumer.py (dispatch ladder gets firms branch before
    default-deny clause; pattern matches handle_swpc / handle_nwis)
  - tests/test_firms_handler.py (new, 22 tests covering confidence floor,
    FRP floor, bbox, dedup, missing fields, end-to-end through consumer)
  - tests/test_consumer_default_deny.py (swap firms -> avalanche for the
    "no handler" examples since FIRMS now has one)

Test count: 658 -> 680 (+22 firms_handler tests, 0 regressions).

Refs audit doc v0.6-phase1-audit.md finding #2.
2026-06-05 15:50:33 +00:00
0da83e0d3d feat(v0.5.11): band conditions scheduled broadcaster (3x/day HF propagation)
First clock-driven broadcaster in meshai, distinct from the v0.5.8b/v0.5.9/v0.5.10 event-driven adapters. The same persistence + dispatcher + cold-start patterns apply, but the trigger is the wall clock at 06:00 / 14:00 / 22:00 Mountain Time (default; GUI-configurable per Rule 17).

Components: (1) meshai/notifications/scheduled/band_conditions.py with BandConditionsScheduler (asyncio loop, mirrors the existing DigestScheduler shape), compute_band_ratings() with two-tier data sourcing -- (a) latest swpc_kindex + swpc_alerts F10.7 rows from persistence within the last 6h, (b) HamQSL.com solarxml.php fallback when SWPC is stale or incomplete, (c) silent skip when both fail, format_band_conditions_wire() multi-line MEDIUM output (~115-120B). (2) v3 schema migration adding band_conditions_broadcasts(broadcast_id PK AUTO, sent_at, scheduled_for UNIQUE, ratings_json, source). UNIQUE(scheduled_for) enforces per-slot dedup so a retry storm cannot double-broadcast. (3) Dispatcher.dispatch_scheduled_broadcast() bypasses the toggle / rules / freshness-gate pipeline but DOES honour the v0.5.8b cold-start grace -- first scheduled broadcast within the grace window after meshai starts is suppressed, mesh_broadcasts_out audit row only inserted on actual delivery. Channel selection routes through the rf_propagation toggle\'s broadcast_channel since band conditions IS RF-propagation info. (4) NotificationsConfig gains band_conditions_enabled (default true), band_conditions_schedule (list of HH:MM strings, default ["06:00","14:00","22:00"]), band_conditions_tz (default "America/Boise" so DST handles automatically). (5) Notifications.tsx grows a Band Conditions card between Cold-Start Grace and Master Toggles with the enable toggle + 3 TimeInput slots + a one-liner explaining the source priority. (6) build_pipeline + start_pipeline spawn the BandConditionsScheduler alongside the existing DigestScheduler -- best-effort, scheduler failures must NOT break notifications startup.

Wire format examples (multi-line, all under 130B target):

  ☀️ Day Propagation
  📡 Band Conditions:
  80-40m: 🟡 Fair
  30-20m: 🟢 Good
  17-15m: 🟢 Good
  12-10m: 🟡 Fair

  🌞 Day Propagation  (14:00 slot when storm onset, Kp=6 SFI=110)
  📡 Band Conditions:
  80-40m: 🔴 Poor
  30-20m: 🔴 Poor
  17-15m: 🔴 Poor
  12-10m: 🟡 Fair

  🌙 Night Propagation  (22:00 slot, recovery, Kp=4 SFI=120)
  📡 Band Conditions:
  80-40m: 🟡 Fair
  30-20m: 🟡 Fair
  17-15m: 🔴 Poor
  12-10m: 🔴 Poor

Tests: was 686 (v0.5.10 baseline), now 704 (+18 net new -- quiet/storm condition ratings, HamQSL XML parse fallback, both-fail silent-skip path, is_day_slot per HH:MM, wire format for all 3 slot variants, byte-size guard, 6-line shape, fire_slot record row, dedup via UNIQUE constraint, silent-skip path, slot_epoch DST alignment summer + winter). Synthetic 24h probe verified the 3 expected slots fire correctly with quiet/storm/recovery scenarios + the 4th no-data scenario lands as source=\'skipped_no_data\' with no broadcast.

usgs_nwis deferred to v0.5.12 (threshold-curation work). Master OFF in prod.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-05 07:38:51 +00:00
0099d0fd94 feat(v0.5.9): unified incident pipeline + state_511_atis Idaho cutover + two-sided freshness gate
Coordinated change across the consumer dispatch layer + central_normalizer + new incident_handler + new itd_511 work_zone parser. The integrated story: Central PM filed a heads-up that ITD 511 publishes four EventTypes (work_zone, closure, incident, special_event) under us.id Convention A, and meshais v0.5.8 work focused on work_zone shape only -- meaning incidents (the higher-priority operational signal) were silently rendering as the v0.5.7-regression-style fallback. This v0.5.9 closes that gap by treating incidents + closures + special_events from all three traffic adapters (state_511_atis, itd_511, tomtom_incidents) as a single unified pipeline, while also migrating Idaho coverage from state_511_atis to itd_511 (the direct ITD feed) at the consumer level. Components: (1) new meshai/central/incident_handler.py routes incident/closure/special_event events through per-adapter parsers (tomtom, itd_511, state_511_atis-non-ID) to a canonical incident shape, then a single rendering pipeline with sub_type-aware emoji selection (jam/crash/road_closed/disabled_vehicle/parade/special_event/vehicle_fire/road_works). (2) Universal two-sided freshness gate in the consumer dispatch layer: only events with 0 <= age <= 1800s (default-allow on missing start_time) make it past the gate. Rejects both stale events (more than 30 min old) AND future-scheduled events (negative age -- a real itd_511 case for scheduled work projects). The gate sits ABOVE both incident_handler and the v0.5.8 work_zone formatter so all adapters get gated uniformly. (3) state_511_atis Idaho cutover -- both incident_handler and the v0.5.8 work_zone parser skip state_511_atis events where the state token is ID, deferring to itd_511 as the authoritative source. state_511_atis remains fully active for non-Idaho neighbor coverage (WA/OR/MT/UT/WY/NV) -- verified by Phase 2 WA broadcasts in the synthetic probe. (4) new itd_511 work_zone parser (extension to central_normalizer.py) consumes the itd_511 work_zone EventType and produces the same MEDIUM-style wire format as the existing state_511_atis work_zone parser (road + mile range + town + direction + sub_type + ends-at). (5) No Update: broadcasts in the incident pipeline -- per Matts call, real-time traffic Updates (jam getting worse, delay growing) are not actionable for mesh users. State tracking continues via traffic_events UPSERT but only the first sighting of an external_id ever fires a New: broadcast. WFIGS handler unchanged -- fires keep their 8h-rate-limited Update: behavior since acres growth IS operationally meaningful (evacuation decisions). Forecast: 3-10 mesh broadcasts/day in Idaho, all New:. Cross-check: original raw broadcast count was 623 against a fixed-clock 49-min synthetic window; after v0.5.9 REVISED (no Updates) it dropped to 18; after v0.5.9 GAMMA (two-sided gate + Idaho cutover) it dropped to 9. Test count: was 589 baseline, +45 net new -- 634 passing. Synthetic probe verified all four phases: Phase 1 (replay 3032 captured envelopes) = 0 broadcasts (correctly suppressed); Phase 2 (synthesized fresh non-ID + ID) = 7 broadcasts; Phase 3 (synthesized fresh itd_511 work_zone) = 2 broadcasts; Phase 4 (synthesized fresh ID for explicit ID-skip exercise) = caught by ID-skip 1/1. Master stays off in prod; no toggle flips.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-05 06:41:21 +00:00
053d67db6e feat(v0.5.8b): persistence foundation + WFIGS handler + universal cold-start grace
Three integrated pieces that ship together because they were designed as one safety story: (1) PERSISTENCE FOUNDATION -- new meshai/persistence/ module with SQLite db.py, schema migration framework (v1), 13 tables covering all adapter event shapes (traffic_events, fires, firms_pixels, quake_events, nws_alerts, gauge_readings, swpc_events) + mesh state (mesh_nodes, mesh_telemetry, mesh_positions, mesh_messages_in, mesh_broadcasts_out, mesh_health_events) + cross-cutting event_log + schema_meta. WAL mode for reader concurrency, single-writer pattern, MESHAI_DB_PATH env var, mounted at /data/meshai.sqlite via existing docker-compose meshai_data volume. .gitignore updated. (2) WFIGS HANDLER -- meshai/central/wfigs_handler.py implements the first per-adapter handler that uses the persistence layer. Format: MEDIUM style with town/landclass/county fallback chain, lat/lon at 3-decimal precision, New:/Update: prefix. 8h-rate-limited change-detection per IRWIN via fires.last_broadcast_at. Skips tombstones and perimeters silently (logged to event_log with handled=0). Acres fallback chain DailyAcres -> IncidentSize -> raw.DiscoveryAcres -> raw.FinalAcres -> N/A. Pass-through Initial Attack auto-numbered names (IA 1, IA 2). (3) UNIVERSAL COLD-START GRACE -- meshai/notifications/pipeline/dispatcher.py grows a configurable grace window (cold_start_grace_seconds, default 60s, GUI-editable per Rule 17). Anchored to first-event-seen (not container boot), so the grace activates the moment broadcasts could fire. Suppresses mesh delivery during the window; handler-side persistence (fires UPSERT, event_log) still happens normally. New _cold_start_dropped counter exposed in dispatch_stats(). Designed to protect against JetStream backlog spam at toggle-flip time, applies universally to ALL adapters. (4) WFIGS HANDLER CALLBACK REFACTOR -- New:/Update: prefix now keys on fires.last_broadcast_at IS NULL (not row-missing), and last_broadcast_* field updates moved to a post-broadcast commit callback that the dispatcher invokes ONLY on successful delivery. This means: cold-start-suppressed events leave fires.last_broadcast_at NULL, so when they eventually broadcast post-grace, they correctly render as New: (first ACTUAL delivery for that IRWIN), not Update:. event_log.handled and mesh_broadcasts_out audit row also gated on the same callback -- decoupling persistence rows from broadcast rows for an honest audit trail. New tests: 15 in test_wfigs_handler.py, 15 in test_persistence.py, additional cold-start grace tests in test_dispatcher.py (+4 WFIGS callback scenarios). Synthetic probes wfigs-cleaned-samples.md (initial) and wfigs-cleaned-samples-v2.md (cold-start verification) generated against isolated temp SQLite databases. CT108 /data/meshai.sqlite untouched during build. Master stays off. No live toggle flips. Test count: was 535 (v0.5.7 baseline) -> 566 (persistence) -> 581 (wfigs handler) -> 589 expected (cold-start grace).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-05 03:54:04 +00:00