Commit graph

27 commits

11e37c4f48 fix(central): v0.4 D.2 -- remap Central adapter names to meshai source for consistent dashboard attribution
Phase D catalogued a source-name divergence: central-sourced events carried
Central's adapter name (wfigs_incidents, nwis, swpc_alerts, wzdx) rather than
meshai's native source (fires, usgs, swpc, traffic), so the C.2 family-tab
per-adapter event filtering (which keys on the native source name) wouldn't
group central events under the right adapter.

Fix: CENTRAL_ADAPTER_TO_SOURCE table in consumer.py; normalize() now remaps
inner Event.adapter -> meshai source, falling back to the literal adapter name
for anything not in the table (logged at DEBUG when a translation happens).

before -> after (Event.source):
  wfigs_incidents / wfigs_perimeters -> fires
  nwis                               -> usgs
  swpc_alerts / swpc_kindex / swpc_protons -> swpc
  wzdx                               -> traffic
  nws, usgs_quake, firms             -> unchanged (1:1, omitted from table)
  unknown (e.g. experimental_foo)    -> passthrough as-is
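
A minimal sketch of the remap, assuming the table is a plain dict keyed on the
Central adapter name. The helper name remap_source and the logger name are
hypothetical; only CENTRAL_ADAPTER_TO_SOURCE and the mapping itself come from
this message.

```python
import logging

logger = logging.getLogger("central.consumer")  # hypothetical logger name

# Entries copied from the before -> after table above. The 1:1 adapters
# (nws, usgs_quake, firms) are deliberately left out of the table.
CENTRAL_ADAPTER_TO_SOURCE = {
    "wfigs_incidents": "fires",
    "wfigs_perimeters": "fires",
    "nwis": "usgs",
    "swpc_alerts": "swpc",
    "swpc_kindex": "swpc",
    "swpc_protons": "swpc",
    "wzdx": "traffic",
}


def remap_source(adapter: str) -> str:
    """Hypothetical helper: map a Central adapter name to a meshai source."""
    source = CENTRAL_ADAPTER_TO_SOURCE.get(adapter, adapter)
    if source != adapter:
        # Only log when a translation actually happened.
        logger.debug("remapped Central adapter %r -> source %r", adapter, source)
    return source
```

Anything missing from the table, including unknown adapters, falls through
unchanged, which is what keeps nws and experimental adapters working.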

Tests: tests/test_central_consumer.py parametrized test_central_adapter_source_remap
(6 cases: 4 remaps + nws passthrough + unknown passthrough). Full suite: 283 passed.

In-prod verify (rebuilt, ephemeral probe over real Central data): the four
observed adapters now normalize to source=fires/usgs/swpc/traffic; nws passes
through. No live flip needed; container stays native baseline + healthy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 06:12:47 +00:00
ea0c68097a fix(central): v0.4 D.1 -- subject-domain category fallback (traffic 'work_zone.wzdx' was mapping to 'other')
Surfaced during the Phase D rollout flipping all five remaining domains to
central. Central's traffic categories are NOT domain-prefixed -- the inner
Event.category for a work zone is "work_zone.wzdx", not "traffic.work_zone".
The prefix table in map_category therefore missed and returned "other", which
would break category-based routing/digest grouping for central-sourced traffic.

before: map_category("work_zone.wzdx") -> "other"
after:  when the category table misses, fall back to the stable subject domain
        token (central.<domain>.<...>): central.traffic.* -> traffic_congestion.
        Added category_from_subject() + a domain->category map (wx, fire, quake,
        hydro, space, disaster, traffic, traffic_flow, traffic_cameras). The
        well-prefixed domains (wx.alert, fire.incident, hydro., space.alert)
        still match the primary table; the fallback only fires on a miss, so a
        known domain never yields "other" again.
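
A rough sketch of the two-stage lookup, under stated assumptions: only
traffic -> traffic_congestion is taken from this message. The other table
entry, the resolve_category wrapper, and the example subjects are
placeholders for illustration.

```python
from typing import Optional

# Primary prefix table. The "weather_alert" target is an assumed placeholder;
# this message does not state what wx.alert maps to.
CATEGORY_PREFIXES = {"wx.alert": "weather_alert"}

# Domain fallback. Only the traffic entry is stated in the message.
DOMAIN_TO_CATEGORY = {"traffic": "traffic_congestion"}


def category_from_subject(subject: str) -> Optional[str]:
    """Map central.<domain>.<...> to a category using the domain token."""
    parts = subject.split(".")
    if len(parts) < 2 or parts[0] != "central":
        return None
    return DOMAIN_TO_CATEGORY.get(parts[1])


def resolve_category(category: str, subject: str) -> str:
    """Hypothetical wrapper: primary table first, subject domain on a miss."""
    for prefix, mapped in CATEGORY_PREFIXES.items():
        if category.startswith(prefix):
            return mapped
    return category_from_subject(subject) or "other"
```

With these tables, resolve_category("work_zone.wzdx", "central.traffic.wzdx")
resolves through the fallback, and "other" is returned only when both the
category prefix and the subject domain are unknown.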

Test: tests/test_central_consumer.py gains test_subject_domain_fallback_for_unmapped_category
(category_from_subject + a 'work_zone.wzdx' message -> traffic_congestion).
Full suite: 277 passed.

Verified in prod (rebuilt, all 5 flipped to central): the per-domain
LAST_PER_SUBJECT normalize probe now shows traffic -> category=traffic_congestion
(was 'other'); the other four domains unchanged and clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 05:05:12 +00:00
a491684861 fix(central): v0.4 C.3.1 -- preserve secret refs in save_section + deliver_policy=NEW (no backlog flood)
Fixes the two real bugs C.3 surfaced when flipping usgs_quake to central.

BUG #1 -- GUI save dropped ${VAR} secret refs (config_loader.save_section).
  before: A GUI PUT round-trips the *interpolated* secret value (GET returns the
          resolved key string, e.g. the real TomTom key). save_section's
          check_secrets saw a literal string at a SECRET_FIELDS path, didn't
          recognize it as a ref, and DROPPED it -- losing the on-disk
          ${TOMTOM_API_KEY} placeholder. C.3's flip PUT stripped TomTom's key.
  after:  check_secrets now reads the raw on-disk value (pre-interpolation) for
          each secret field and decides three ways:
            on-disk ${VAR} and new == resolved(VAR)  -> keep the ${VAR} ref
            on-disk ${VAR} and new != resolved(VAR)  -> intentional change, store it
            no on-disk ${VAR} ref                    -> reject (never write a raw
                                                        secret to a domain file)
          ${VAR} resolution mirrors load: os.environ first, then /data/secrets/.env.
          The common case (GUI re-saves unchanged config) now preserves the
          placeholder instead of dropping it.
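
The three-way decision can be sketched as a pure function. This is an
illustration, not the actual check_secrets: the function name and the env
mapping argument are hypothetical, and the real resolution order (os.environ,
then /data/secrets/.env) is collapsed into a single mapping here.

```python
import re
from typing import Mapping

# Matches a whole-value placeholder such as ${TOMTOM_API_KEY}.
_REF = re.compile(r"^\$\{([A-Za-z_][A-Za-z0-9_]*)\}$")


def decide_secret_value(on_disk: object, new_value: str, env: Mapping[str, str]) -> str:
    """Return the value to write for one secret field, or raise ValueError."""
    match = _REF.match(on_disk) if isinstance(on_disk, str) else None
    if match is None:
        # No ${VAR} ref on disk: never write a raw secret to a domain file.
        raise ValueError("no ${VAR} reference on disk; rejecting raw secret")
    resolved = env.get(match.group(1))
    if resolved is not None and new_value == resolved:
        return on_disk  # GUI echoed the resolved value: keep the placeholder
    return new_value  # value differs from the resolved ref: intentional change
```

For example, with env = {"TOMTOM_API_KEY": "abc123"}, re-saving "abc123" over
an on-disk "${TOMTOM_API_KEY}" returns the placeholder untouched.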

BUG #2 -- CentralConsumer replayed the entire retained backlog on first flip.
  before: js.subscribe(...) with no config -> default deliver_policy=all. Fine
          for quake (682 msgs) but would flood the bus with ~330k traffic_flow
          messages on first flip.
  after:  consumer_config() -> ConsumerConfig(deliver_policy=DeliverPolicy.NEW):
          only messages published AFTER consumer creation. meshai won't see the
          backlog on first flip -- acceptable, Central is a live firehose for
          current events. (NOT geo-filtering -- that's a Central-side issue filed
          separately for the Central project.)

Files: meshai/config_loader.py (save_section secret preservation),
meshai/central/consumer.py (consumer_config() + deliver_policy=NEW),
tests/test_save_section_secret_preserve.py (new),
tests/test_central_consumer.py (deliver_policy assertion).

Verification:
- (A) py_compile clean on config_loader.py + consumer.py.
- (C) pytest -q: 276 passed (272 + 4 new -- preserve-unchanged-ref,
  changed-value-written, no-placeholder-still-rejects, deliver_policy=NEW).
  The C.2.1 strip test still passes (no placeholder -> reject).
- (D) In-prod (rebuilt): GET+PUT /api/config/environmental round-trip ->
  {"saved":true}; on-disk traffic.api_key stayed '${TOMTOM_API_KEY}'
  (SECRET_REF_PRESERVED: True), not the literal key; disk restored to baseline.
  consumer_config().deliver_policy == DeliverPolicy.NEW in the built image.

Follow-up for D rollout: the durable 'meshai-v04-central_quake_' created during
C.3 was made with deliver_policy=all; re-flipping a domain may need that stale
durable deleted on the Central NATS server first (config mismatch on re-subscribe).

D rollout (remaining domains) is now safe: GUI flips preserve secret refs and
new subscriptions don't replay huge backlogs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 04:55:20 +00:00
a4f23c226e fix(dashboard): v0.4 C.2.1 -- route PUT /config to multi-file save_section (Rule 17 persistence unblocked)
C.2 surfaced that GUI config saves were broken in the prod multi-file
layout. This fixes it. Pre-existing v0.3-era bug (predates C.2; affected
EVERY config section, not just environmental).

Save flow (before -> after):
  before: PUT /api/config/{section} -> config.py::load_config(config.yaml)
          [monolithic, vanilla YAML] -> blows up on the !include orchestrator
          ("could not determine a constructor for the tag '!include'"),
          then config.py::save_config (same !include-blind path). Every save
          500'd; nothing persisted.
  after:  PUT validates the body by coercing to the section dataclass (runs
          __post_init__ validators, e.g. feed_source), then persists via
          config_loader.py::save_section(section, dict, config_dir) -- the
          multi-file / !include-aware writer. It writes ONLY the section's
          target file (env_feeds.yaml for environmental, notifications.yaml,
          llm.yaml, ...), strips SECRET_FIELDS (traffic.api_key, firms.map_key)
          and extracts LOCAL_FIELDS (ducting lat/lon -> local.yaml). The
          orchestrator config.yaml and its !include directives are never
          re-parsed. Live app.state.config is kept in sync via setattr when
          the section isn't restart-required (no disk reload needed).

Also: save_section now tolerates a top-level LIST section (mesh_sources) --
it cleans each item for secrets and writes the list directly instead of
assuming a dict (which would have crashed). Other callers of save_config are
untouched (it remains valid for the monolithic single-file path).

Files: meshai/dashboard/api/config_routes.py (PUT handler + import),
meshai/config_loader.py (save_section list guard),
tests/test_dashboard_config_save.py (new).

Verification:
- (A) py_compile clean on config_routes.py + config_loader.py.
- (C) pytest -q: 272 passed (269 + 3 new -- save_section writes env_feeds,
  strips secret fields, handles the mesh_sources list section).
- (D) Rebuilt prod; ran the C.2 round-trip again, now SUCCESS: backup
  env_feeds.yaml (md5 dde5d634...), GET then PUT /api/config/environmental ->
  {"saved":true,"restart_required":false} (NO !include error); disk reflected
  it (feed_source on all 10 adapters + central block written); restored from
  backup -> md5 matches original -> DISK_PRISTINE_RESTORED.
- (E) Rule 17 round-trip confirmed: the GUI can now SAVE config that
  round-trips to disk in the multi-file !include layout, secrets staying in
  .env and local fields in local.yaml.

C.3 (quake -> central flip) is now unblocked: feed_source can be flipped and
saved from the GUI.

Follow-up (non-blocking): mesh_sources per-item secret stripping
(mesh_sources.*.api_token) isn't matched by the section-relative check in the
new list path; mesh_sources files are volume-only (not git) and this was no
worse before, but worth tightening when mesh_sources GUI save is exercised.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 03:17:30 +00:00
73c007d227 feat(central): v0.4 C.1 Central connector backend (no-op until adapter source flipped)
Adds the backend for sourcing environmental feeds from Central's NATS
JetStream firehose instead of (or alongside) meshai's native adapters.
Architecture is Matt-approved Option 3' (dedicated package + per-adapter
source switch surfaced on the existing Environmental config).

NO-OP POSTURE (intentional): every adapter defaults to feed_source="native"
and environmental.central.enabled defaults false, so on a stock config the
CentralConsumer starts and subscribes to nothing -- behavior is byte-for-byte
v0.3. Live env_feeds.yaml is unchanged on disk; an operator who touches
nothing sees no change. Flipping an adapter to central is Phase C.3; the
dashboard UI for it is Phase C.2.

What landed:
- meshai/central/ package (CentralConsumer): async start()/stop(), JetStream
  durable subscribe to subjects derived from adapters with feed_source=central,
  and _on_message -> normalize -> bus.emit. nats-py is lazy-imported only on
  the connect path, so no-op boot has zero NATS dependency.
- Normalization (CloudEvents envelope -> Central Event -> upstream data):
    source   = inner Event.adapter
    category = Central hierarchical string -> meshai flat, via a small
               table-driven prefix map (map_category)
    severity = 0|1->routine, 2->priority, 3|4->immediate, null->routine
    lat/lon  = geo.centroid, swapped from GeoJSON [lon,lat] -> (lat,lon)
    group_key/inhibit = outer envelope id (dedup parity with native adapters)
    expires/timestamp parsed from ISO-8601
    Event.data = upstream payload verbatim (generic _enriched merge, preserved
                 as-is incl. hydro's extra usgs_site/usgs_stats bundles)
- Tombstone (`.removed.` subject or `:removed` id suffix) -> a "clear" Event
  carrying the ORIGINAL group_key (`:removed` stripped) + data._central_tombstone
  so the grouper/inhibitor lets the prior event lapse naturally.
- config.py: a `_SourcedFeed` mixin adds `feed_source: native|central`
  (validated in __post_init__) to all 10 adapter configs; new
  CentralConsumerConfig as environmental.central { enabled, url, durable,
  connect_timeout }. Both ride the generic _dict_to_dataclass coercion, so
  they are GUI-editable via PUT /config/environmental (Rule 17) -- frontend
  fields come in C.2.
- env/store.py: each adapter is instantiated only when
  enabled AND feed_source=="native"; a feed_source=central adapter is skipped
  natively (debug-logged) so Central can own it without a duplicate.
- main.py: CentralConsumer constructed + started after start_pipeline(),
  stopped in stop().
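
For orientation, the field-level normalization rules above can be sketched as
three small helpers. The function names are hypothetical, and the handling of
severity values outside 0-4 is an assumption of this sketch.

```python
from typing import Optional, Sequence, Tuple

# Central severity level -> meshai severity, as listed above.
_SEVERITY = {0: "routine", 1: "routine", 2: "priority", 3: "immediate", 4: "immediate"}


def map_severity(level: Optional[int]) -> str:
    # null maps to routine per the list above; treating any other unlisted
    # value as routine is an assumption.
    return _SEVERITY.get(level, "routine")


def centroid_to_latlon(centroid: Sequence[float]) -> Tuple[float, float]:
    """Swap a GeoJSON [lon, lat] pair into (lat, lon)."""
    lon, lat = centroid[0], centroid[1]
    return lat, lon


def tombstone_group_key(subject: str, envelope_id: str) -> Optional[str]:
    """Return the ORIGINAL group_key for a tombstone, or None otherwise."""
    suffix = ":removed"
    has_suffix = envelope_id.endswith(suffix)
    if ".removed." not in subject and not has_suffix:
        return None
    return envelope_id[: -len(suffix)] if has_suffix else envelope_id
```

Stripping `:removed` is what lets the "clear" Event share a group_key with the
event it retires.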

DEVIATION FROM SPEC (documented): the spec named the new field `source`, but
FIRMSConfig already has a `source` field (the satellite product,
"VIIRS_SNPP_NRT"). To avoid the collision the field is named **feed_source**
across all adapters. Everything else follows the spec.

NETWORKING: zero infra change required. The meshai container already reaches
the Central NATS server directly (TCP to 100.64.0.12:4222 OK) and resolves
central.echo6.mesh via the Phase 2.6.6 MagicDNS fix. No docker-compose edit;
default bridge works (LXC host masquerades to the Tailscale CGNAT range). The
lighter bridge-route / host-net / sidecar fallbacks were not needed.

Tests: tests/test_central_consumer.py (11) + tests/test_config_source_field.py
(6): no-op-when-native, subjects-when-central, source-gate skips native
instantiation, normalize+emit, _enriched preserved verbatim, tombstone->clear,
severity map (0-4/null), category map (>=4 strings), async _on_message
emits+acks, start() no-op without NATS, feed_source default/validate/reject/
dict-coercion. Full suite: 269 passed (was 253 + 16 new).

Verification: (A) no bare self._x() in consumer.py. (B) py_compile clean.
(C) 269 passed. (D) rebuilt prod -- 8 native adapters, pipeline started,
native nifc/traffic emissions still flowing, healthy, no errors, log
"CentralConsumer started; 0 subjects subscribed -- no adapters set to central".
(E) in-container synthetic _on_message injection normalized correctly
(usgs_quake/earthquake_event/immediate, centroid swapped, _enriched preserved)
and reached the bus; ephemeral, no config change to roll back.

C.2 (dashboard frontend for the feed_source switch + central connection) is next.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 02:28:19 +00:00
20e0dec28a fix(notifications): Phase 2.16.1 unblock pipeline -- grouper flush + rules coercion + toggle warning
Phase 2.16 found the live notification pipeline never delivered any
environmental event. Two independent blocking bugs, both fixed here.

BUG A -- grouper held events forever (nothing drove tick()).
Every adapter event sets a group_key, so all were buffered in the Grouper
and never flushed (start_pipeline only started the DigestScheduler; no
tick driver existed). Fixes (per Matt's decisions):
  - Grouper.handle(): immediate-severity events now BYPASS the window
    entirely (delivered straight to next_handler), no buffering latency.
    routine/priority still coalesce.
  - start_pipeline(): schedules an asyncio flush task that calls
    grouper.tick() every `grouper_flush_seconds` (default 5s) so
    coalesced events drain within the window even when poll cadence is
    sparse. stop_pipeline() signals + cancels it.
  before/after (grouper held_count): an immediate+group_key event used to
  sit held (count 1) forever; now held_count==0 on arrival (bypassed). A
  routine event is held (count 1) then drained to 0 by tick()/flush.
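
A toy model of the two fixes, for illustration only. MiniGrouper and
flush_loop are hypothetical names; the real Grouper flushes per window, while
this sketch drains everything on each tick and delivers the latest event per
group_key.

```python
import asyncio
from typing import Callable, Dict, List


class MiniGrouper:
    """Simplified stand-in for the Grouper: hold by group_key, drain on tick."""

    def __init__(self, next_handler: Callable[[dict], None]) -> None:
        self._next = next_handler
        self._held: Dict[str, List[dict]] = {}

    @property
    def held_count(self) -> int:
        return sum(len(events) for events in self._held.values())

    def handle(self, event: dict) -> None:
        if event["severity"] == "immediate":
            self._next(event)  # bypass the window entirely
            return
        self._held.setdefault(event["group_key"], []).append(event)

    def tick(self) -> None:
        held, self._held = self._held, {}
        for events in held.values():
            self._next(events[-1])  # coalesce: deliver the latest per group


async def flush_loop(grouper: MiniGrouper, interval: float, stop: asyncio.Event) -> None:
    """Call tick() every `interval` seconds until `stop` is set."""
    while not stop.is_set():
        try:
            await asyncio.wait_for(stop.wait(), timeout=interval)
        except asyncio.TimeoutError:
            grouper.tick()
```

The stop Event plays the role of the signal sent by stop_pipeline(): setting
it ends the loop without waiting out the full interval.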

BUG B -- notification rules loaded as dicts, crashing the dispatcher.
Root cause (more precise than 2.16's guess): the rules coercion is NOT
missing from the multi-file loader -- it lives in _dict_to_dataclass's
explicit `elif key == "notifications"` branch, but that branch was DEAD
CODE, shadowed by the generic `if hasattr(field_type,
"__dataclass_fields__")` handler that runs first for every dataclass
field (including notifications). So Config.notifications.rules stayed a
list of dicts on ALL load paths, and Dispatcher._matching_rules threw
`AttributeError: 'dict' object has no attribute 'enabled'`. Fix: hoist
the notifications special-handling ahead of the generic handler (and drop
the now-truly-dead duplicate elif).
  before/after (cfg.notifications.rules[0] type): dict -> NotificationRuleConfig.

OBS C -- empty enabled_toggles. Left as 'pass all' for v0.3 (per Matt);
added a startup WARNING in build_pipeline so operators see gating is off:
"enabled_toggles is empty -- ToggleFilter passing all events. Configure
toggles to enable gating." (confirmed firing live).

Tests:
  - tests/test_pipeline_grouper.py (new): test_immediate_severity_bypasses_grouper,
    test_periodic_flush_drains_routine, test_priority_is_also_coalesced_not_bypassed.
  - tests/test_config_loader.py (new): test_multifile_load_coerces_notification_rules,
    test_rules_attribute_access_does_not_raise (regression guards for Bug B).
  - tests/test_pipeline_inhibitor_grouper.py (updated): 5 existing grouper
    hold/coalesce/flush tests primed the grouper with immediate+group_key
    events expecting them to be held; switched those to 'priority' (still
    buffered; still outranks the routine event in the inhibitor-chain test)
    to match the intended immediate-bypass behavior.
  Full suite: 253 passed (was 248 + 5 new; 5 existing updated, none lost).

VERIFICATION (rebuilt prod, traced end-to-end via in-process build_pipeline
probe with a recording channel + live config):
  - rules[0] type: NotificationRuleConfig (Bug B fixed).
  - IMMEDIATE event: held_count==0 on emit (bypassed) -> reached
    channel.deliver(): delivered=[('PROBE_RULE','E2E IMMEDIATE')].
  - ROUTINE event: held_count==1 -> after flush 0 -> reached
    channel.deliver(): delivered+=[('PROBE_RULE','E2E ROUTINE')].
  - Natural Summit-Creek-shaped nifc wildfire_incident (routine, no
    matching dispatch rule): held 1 -> after flush -> landed in the digest
    accumulator (1 event). End-to-end channel.deliver evidence = the
    RecChannel.deliver() calls above.
  - Live container: 8 adapters, healthy, "Grouper flush task started
    (every 5s)", the enabled_toggles warning fired, and NO dispatcher
    AttributeError/traceback.

Follow-up (non-blocking): several Phase 2.7-2.14 categories (e.g.
wildfire_incident, earthquake_event) aren't in the category->toggle map,
so they fall to toggle 'other'. Harmless while enabled_toggles is empty
(pass-all), but should be mapped before toggle gating is turned on.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 00:36:13 +00:00
8b2cdeee0b feat(notifications): Phase 2.14 USGS earthquake adapter (new) -- closes Rule 16 Seismic standalone path
First net-new environmental adapter (prior phases wired existing ones).
Adds meshai/env/usgs_quake.py with USGSQuakeAdapter + USGSQuakeConfig,
polling a keyless USGS earthquake GeoJSON feed and emitting one Event per
qualifying quake. Establishes the standalone Seismic path (Rule 16);
Central becomes the dual-source in v0.4.

Adapter (mirrors the fires/usgs-water per-event pattern):
- Feed: https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/2.5_day.geojson
  (M2.5+ past day -- M1.0 too noisy, M4.5+ too sparse for the region).
  Tick 300s.
- Filters each feature by min_magnitude AND a geographic bbox.
- Per quake: source=usgs_quake, category=earthquake_event, stable
  event_id = the USGS feature id (e.g. "us6000abcd"), lat/lon from
  geometry.coordinates[1],[0], region tag from config (default
  "magic_valley").
- to_event(): category earthquake_event, magnitude-binned severity passed
  through, group_key = inhibit_key = the USGS id. Defensive None for
  missing id / coords / magnitude. get_events()/health_status mirror the
  other adapters.

MAGNITUDE -> SEVERITY BINS (as proposed):
  M < 3.5        -> routine
  3.5 <= M < 5.0 -> priority
  M >= 5.0       -> immediate
('sig' is captured in the event dict as metadata but severity is
magnitude-binned -- clearer and matches the spec's primary suggestion.)
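
The bins above, written out as a sketch (the function name is hypothetical):

```python
def severity_for_magnitude(magnitude: float) -> str:
    """Bin a quake magnitude into a severity, per the table above."""
    if magnitude >= 5.0:
        return "immediate"
    if magnitude >= 3.5:
        return "priority"
    return "routine"
```

Both boundaries are inclusive on the upper bin, so M3.5 is priority and M5.0
is immediate.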

GEOGRAPHIC BBOX (as proposed) -- [west, south, east, north]:
  [-115.5, 42.0, -110.0, 45.2]
Covers Magic Valley / Twin Falls (SW), the Lost River Range / Borah Peak
and Sawtooths (central Idaho, seismically active -- 1983 M6.9), the eastern
Snake River Plain / INL, and the Yellowstone caldera (NW Wyoming). An empty
bbox disables the geographic filter (accepts all).
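
A minimal sketch of the geographic filter, assuming inclusive edges (the
message does not say whether the edges are inclusive). The function name and
the sample coordinates are illustrative.

```python
from typing import Sequence

# [west, south, east, north], as proposed above.
DEFAULT_BBOX = [-115.5, 42.0, -110.0, 45.2]


def in_bbox(lat: float, lon: float, bbox: Sequence[float]) -> bool:
    """True when (lat, lon) falls inside bbox; an empty bbox accepts all."""
    if not bbox:
        return True
    west, south, east, north = bbox
    return west <= lon <= east and south <= lat <= north
```

Note the argument order: GeoJSON stores coordinates as [lon, lat], so the
adapter reads lat from geometry.coordinates[1] and lon from [0] before
calling a check like this.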

Wiring:
- config.py: new USGSQuakeConfig dataclass; usgs_quake field on
  EnvironmentalConfig; loader branch in _dict_to_dataclass.
- store.py __init__: registers self._adapters["usgs_quake"] when enabled --
  this is what grows the live adapter count 7 -> 8.
- store._ingest: NO dedicated branch added. usgs_quake is a standard
  per-event adapter, so the existing generic "else" loop (dedup on
  (source, event_id) + _emit_event) already routes it. (The swpc/ducting
  branches are special only because they also maintain status blobs.)
- env_feeds.yaml (live /data/config): added usgs_quake block, enabled:true,
  default bbox/min_mag/region.

Rule 17: GUI-editable config (env_feeds.yaml). Rule 18 N/A -- USGS
earthquake feed is keyless (no .env entry; .ref credentials has no
USGS/ArcGIS/quake key). Rule 16: standalone path established + validated
in-container.

Tests: tests/test_adapter_usgs_quake.py (15 tests) mirrors the 2.12/2.13
shape -- severity bins, _fetch severity assignment, magnitude filter,
geographic filter (in-bbox vs California/out), empty-bbox-accepts-all,
dedup id stable across ticks for the same quake id, category, severity
pass-through, group_key/inhibit_keys, field population, defensive cases
(missing id/coords/magnitude/corrupted -> None), and malformed-feature
skipping. _fetch tests patch urlopen with synthetic FeatureCollections.
Full suite: 248 passed.

Live smoke test (prod container, rebuilt): clean startup, adapter count
grew 7 -> 8 ("EnvironmentalStore initialized with 8 adapters"), healthy,
no traceback, no usgs_quake errors. In-container standalone tick over the
real feed succeeded (is_loaded=true, last_error=null,
consecutive_errors=0); the feed returned 54 global M2.5+ quakes, 0 of them
inside the Magic Valley->Yellowstone bbox at the time (quiet), so no Event
was emitted. That is acceptable: the run still exercises the fetch +
magnitude + geographic filter + no-emit path on live data. The emission
path (in-region quake ->
earthquake_event) is unit-validated and uses the same store->bus path
emitting live for NWS, traffic, and NIFC fires.

Note (.gitignore): line 36 `env/` (a virtualenv pattern under "Virtual
environments") collaterally matches meshai/env/, so this NEW file required
`git add -f` (untracked files there are otherwise ignored and hidden from
status). Existing tracked env files are unaffected. Recommended follow-up:
anchor the rule to `/env/` so future net-new env adapters don't need -f.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 00:10:39 +00:00
d3b62ad3c5 feat(notifications): Phase 2.13 ducting adapter threshold-crossing emission (severity-tiered, Option C)
Adds a tier-based threshold-crossing emission path to the tropospheric
ducting adapter, which was status-only until now.

EMISSION PATH (before -> after):
  before: DuctingAdapter had only get_status(); store._ingest's ducting
          branch did `self._ducting_status = adapter.get_status()` and
          emitted NOTHING -- no get_events(), no to_event(), event_count
          hardcoded 0.
  after:  the adapter derives a propagation TIER each tick (with
          hysteresis) and stages an event on tier change; get_events() +
          to_event() added; store._ingest's ducting branch now mirrors the
          swpc branch (dedup on (source, event_id) + _emit_event), so a
          tier change emits to the pipeline bus.

Option C design (severity-tiered by enhancement strength):
- Driving quantity: min M-gradient (modified refractivity gradient,
  M-units/km) the adapter already computes.
- Tiers (ascending strength): normal < super_refraction < duct <
  surface_duct.
    g >= 79                  -> normal              -> no event
    0 <= g < 79              -> super_refraction    -> category rf_anomalous_propagation,
                                                       severity routine
    g < 0                    -> duct (elevated)     -> category rf_ducting_enhancement,
                                                       severity priority
    surface_duct OR g < -100 -> strong/surface duct -> category rf_ducting_enhancement,
                                                       severity immediate, surface
                                                       flag set in the summary
- Hysteresis / anti-flap: a DEADBAND of 5 M-units (TIER_DEADBAND) on the
  two gradient boundaries (79 and 0). A tier change commits only once the
  gradient is past the boundary by the deadband, so a wiggle right at a
  threshold does not flap between tiers. The anti-flap has to live in the
  adapter because the 30-min Inhibitor TTL is shorter than the 3h poll
  interval, so the Inhibitor cannot suppress a flap between polls. The
  most-severe surface/strong-duct tier is categorical (duct reaches the
  ground) and is intentionally exempt from the deadband in both
  directions: it is neither delayed on entry nor held on exit, so it fires
  and clears promptly. (Deadband = 5 M-units chosen per the 5-10 guidance.)
- Stable event_id (SWPC idiom): "ducting_{tier_code}_{lat}_{lon}", e.g.
  "ducting_duct_42.56_-114.47". A sustained tier coalesces on this
  group_key (the store dedups it); an escalation to a stronger tier yields
  a new key and re-notifies. group_key = sole inhibit_key; severity tiering
  delegated to the Inhibitor.
- Prior-state tracking: self._last_tier persists across ticks (the
  deadband needs the last committed tier); _parse_response rebuilds
  _status wholesale, so _update_events runs at the end of each parse.
- Ducting is geographic: events carry the assessment location's lat/lon
  (config.latitude/longitude). Defensive: missing/normal tier, missing
  location, or missing gradient -> None; try/except-guarded.
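
One way to read the deadband rule, as a sketch rather than the adapter's
actual code: each boundary's effective threshold shifts by the deadband away
from the side the last committed tier is on. The names next_tier and
raw_tier and the first-tick behaviour (no deadband when there is no prior
tier) are assumptions.

```python
from typing import Optional

TIER_DEADBAND = 5.0
STRONG_DUCT_GRADIENT = -100.0
BOUNDARIES = (79.0, 0.0)  # normal|super_refraction, super_refraction|duct
ORDER = ("normal", "super_refraction", "duct")


def raw_tier(g: float) -> str:
    """Classify the gradient with no hysteresis (surface tier excluded)."""
    return ORDER[sum(1 for b in BOUNDARIES if g < b)]


def next_tier(last: Optional[str], g: float, surface_duct: bool) -> str:
    # The categorical tier ignores the deadband on entry...
    if surface_duct or g < STRONG_DUCT_GRADIENT:
        return "surface_duct"
    # ...and on exit; the first tick has no prior tier to hold.
    if last is None or last == "surface_duct":
        return raw_tier(g)
    last_rank = ORDER.index(last)
    rank = 0
    for i, boundary in enumerate(BOUNDARIES):
        # On the upper side of this boundary: must drop below b - deadband.
        # On the lower side: stays below until the gradient reaches b + deadband.
        shift = -TIER_DEADBAND if last_rank <= i else TIER_DEADBAND
        if g < boundary + shift:
            rank = i + 1
    return ORDER[rank]
```

Under this reading, a prior tier of normal holds at g = 76 and commits to
super_refraction only once g drops below 74; the reverse transition needs
g >= 84.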

Rule 17: no new tunable (latitude/longitude/tick_seconds already in
env_feeds.yaml; TIER_DEADBAND is an internal constant). Rule 18 N/A --
Open-Meteo GFS (api.open-meteo.com) is keyless. Rule 16: standalone fetch
path validated in-container.

Tests: tests/test_adapter_ducting.py (19 tests) mirrors the 2.12 SWPC
shape -- tier classification (normal/super_refraction/duct/surface_duct),
severity tiering, scale->category mapping, group_key/inhibit_keys, field
population, defensive cases (normal/missing location/missing gradient/
corrupted -> None), plus regression guards: dedup id stable across
same-tier ticks, tier escalation yields a new id, and TWO deadband guards
(a sub-deadband wiggle at the 0 boundary and at the 79 boundary holds the
prior tier; surface duct is not held by the deadband). Full suite: 233
passed.

Live smoke test (prod container, Phase 2.13 code rebuilt in): clean
startup, 7 env adapters loaded (ducting already counted), healthy, no
traceback. An in-container standalone _fetch of the Open-Meteo GFS
endpoint succeeded (fetch_ok=true, is_loaded=true, last_error=null,
consecutive_errors=0) -- 3/3 repeat probes clean. The current atmosphere
is normal (min M-gradient 122.5 >= 79) so tier=normal and no Event is
emitted -- acceptable, and it exercises the no-emit path and the tier
classifier on live data. NOTE: the running container's first ducting tick
logged a transient "[SSL: UNEXPECTED_EOF_WHILE_READING]" connection error;
the immediate and repeated standalone probes all succeeded, so this was a
transient upstream TLS drop (not DNS/auth/config) and the adapter degrades
gracefully (logs, increments consecutive_errors, returns False, no crash).
The emission path (tier change -> rf_anomalous_propagation /
rf_ducting_enhancement) is unit-validated and uses the same store->bus
path that emitted live for NWS, traffic, and NIFC fires.

Follow-up (not in this change): DuctingAdapter.health_status still returns
event_count hardcoded 0; now that the adapter emits, it could report
len(self._events). Cosmetic (health endpoint only); left out to keep the
diff scoped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 00:01:40 +00:00
dda8b8f96f feat(notifications): Phase 2.12 SWPC space weather adapter + dedup fix
Wires the NOAA SWPC adapter into the notification EventBus and fixes a
dedup bug in its event id, following the Phase 2.7-2.11 pattern.

(A) DEDUP FIX (the regression this phase guards):
  before: event_id = f"swpc_r{r_scale}_{int(time.time())}"
  after:  event_id = f"swpc_{code}{level}"   # e.g. "swpc_g3"
The old id embedded int(time.time()), so every poll produced a unique id.
The store dedups env events on (source, event_id), so each tick during a
blackout was treated as new: it was re-emitted to the bus on every scales
poll (300s) and accumulated phantom entries in the store. The new id is stable
per condition: a sustained storm coalesces across ticks; only an
escalation to a new level (e.g. G3 -> G4) yields a new id and re-notifies.
Re-emit suppression is the Inhibitor's job (TTL ~1800s), not the id's.
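
To make the failure mode concrete, here is a small sketch of a store-style
dedup on (source, event_id). The two id formats are copied from above; the
function names, the explicit `now` parameter, and the set-based dedup are
stand-ins for illustration.

```python
from typing import Set, Tuple


def old_event_id(r_scale: int, now: float) -> str:
    # Old format: embeds the poll time, so it changes on every tick.
    return f"swpc_r{r_scale}_{int(now)}"


def new_event_id(code: str, level: int) -> str:
    # New format: stable for as long as the condition holds, e.g. "swpc_g3".
    return f"swpc_{code}{level}"


def is_new(seen: Set[Tuple[str, str]], source: str, event_id: str) -> bool:
    """Record (source, event_id) and report whether it was unseen."""
    key = (source, event_id)
    if key in seen:
        return False
    seen.add(key)
    return True
```

Feeding two polls 300s apart through is_new shows the difference: the old id
is new both times, while the new id is new once and then coalesces.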

(B) _update_events expanded R-scale-only -> all three NOAA scales:
  - R (Radio Blackout)        -> category rf_propagation_alert
  - S (Solar Radiation Storm) -> category solar_radiation_storm
  - G (Geomagnetic Storm)     -> category geomagnetic_storm
Emit threshold: level >= 1 (level 0 / quiet emits nothing). Severity is
tiered in _update_events and passed through by to_event:
  level 1-2 -> routine, 3-4 -> priority, 5 -> immediate.
(Scope/threshold approved by Matt before applying: "R/S/G at level >= 1".)
Each event carries scale/level discriminator fields for to_event.
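
The mapping and tiering above, sketched as data plus one function. The names
are hypothetical, and returning None for level 0 is this sketch's way of
expressing "emits nothing".

```python
from typing import Optional

# NOAA scale letter -> meshai category, as listed above.
SCALE_TO_CATEGORY = {
    "R": "rf_propagation_alert",
    "S": "solar_radiation_storm",
    "G": "geomagnetic_storm",
}


def severity_for_level(level: int) -> Optional[str]:
    """Tier a NOAA scale level; None means below the emit threshold."""
    if level < 1:
        return None  # level 0 / quiet emits nothing
    if level <= 2:
        return "routine"
    if level <= 4:
        return "priority"
    return "immediate"
```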

(C) to_event(): category from scale, severity pass-through, group_key /
inhibit_keys = the stable event_id (single key; tiering -> Inhibitor).
SWPC conditions are global, so the Event carries lat=None, lon=None and
region="global" (Event.lat/lon are Optional and Event has a region field).
Defensive: missing scale, level<1, or missing event_id -> None;
try/except-guarded.

No store.py change: store already routes swpc through to_event in _ingest
(the swpc special-case) and the Phase 2.9 None-guard handles None returns.

Rule 17: no new tunable. Rule 18 N/A -- SWPC services.swpc.noaa.gov is
keyless (no .env entry; .ref credentials has no SWPC/NOAA key, confirming
none needed). Rule 16: standalone fetch path validated in-container.

Tests: tests/test_adapter_swpc.py (14 tests) mirrors the 2.11 shape --
scale->category mapping, severity pass-through, _update_events severity
tiering (1-2/3-4/5), group_key/inhibit_keys, all-three-scales-emit,
quiet-emits-nothing, field population (lat/lon None + region global), and
defensive cases (missing scale / level 0 / missing id / corrupted -> None).
Plus two dedup regression guards: test_dedup_id_stable_across_ticks
(SAME id across two ticks of the same condition -- fails on the old code)
and test_event_id_changes_with_level (escalation yields a new id). Full
suite: 214 passed.

Live smoke test (prod container, Phase 2.12 code rebuilt in): clean
startup, 7 env adapters loaded, healthy, no traceback, no SWPC errors. An
in-container standalone fetch of the noaa-scales endpoint succeeded
(scales_fetch_ok=true, is_loaded=true, last_error=null,
consecutive_errors=0) over the open API with no DNS/auth errors (Phase
2.6.6 DNS fix). Current conditions are quiet (R0/S0/G0), so no Event is
emitted -- acceptable, and it exercises the level<1 -> no-emit path live.
The emission path (active scale -> rf_propagation_alert / geomagnetic_storm
/ solar_radiation_storm) is unit-validated and uses the same store->bus
path that emitted live for NWS, traffic, and NIFC fires.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 23:41:30 +00:00
c111211850 feat(notifications): Phase 2.11 NIFC fires adapter pipeline integration
Adds NICFFiresAdapter.to_event(), wiring the NIFC/WFIGS wildfire perimeter
adapter into the notification EventBus, following the Phase 2.7 traffic /
2.9 USGS / 2.10 avalanche pattern.

to_event() design:
- Category: every active perimeter with a reported size maps to a single
  wildfire_incident category (the adapter's WFIGS query already filters to
  active WF incidents in the configured state).
- Severity: PASSED THROUGH unchanged. The adapter computes severity by
  proximity to region anchors (< 25 km -> priority, else routine), which
  is a richer, more actionable signal for a mesh-notification use case
  than raw acreage. I deliberately did NOT invent acreage breakpoints --
  pass-through matches the 2.9/2.10 pattern and defers tiering to the
  pipeline Inhibitor. (Flagged for review: if acreage-based or
  containment-based severity is preferred, it belongs in the adapter's
  _fetch severity logic, not to_event.)
- Summary: incident name + acreage + % contained + distance to nearest
  anchor.
- group_key/inhibit_keys: the adapter's stable "nifc_{name}_{state}"
  event_id as both. Re-polls of the same incident coalesce; single
  inhibit key lets the Inhibitor suppress lower-severity re-emissions.
- Defensive: missing centroid (lat/lon), missing event_id, or missing/zero
  acreage returns None; try/except-guarded.
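
A rough sketch of the mapping described above, for illustration only (not
the committed code). The Event dataclass, the stored-dict field names
(acres, percent_contained, distance_km, nearest_anchor), and the function
name are assumptions; the category, the pass-through severity, the shared
group/inhibit key, and the None-returning cases are taken from this message.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Event:
    """Hypothetical stand-in for the pipeline Event (fields assumed)."""
    source: str
    category: str
    severity: str
    summary: str
    lat: float
    lon: float
    group_key: str
    inhibit_keys: List[str] = field(default_factory=list)


def fire_to_event(stored: dict) -> Optional[Event]:
    """Map one stored perimeter dict to an Event, or None if unusable."""
    try:
        event_id = stored.get("event_id")
        lat, lon = stored.get("lat"), stored.get("lon")
        props = stored.get("properties") or {}
        acres = props.get("acres")
        # Defensive: no centroid, no id, or missing/zero acreage -> no Event.
        if not event_id or lat is None or lon is None or not acres:
            return None
        summary = f"{props.get('name', 'Unknown')}: {acres:,.0f} ac"
        if props.get("percent_contained") is not None:
            summary += f", {props['percent_contained']:.0f}% contained"
        if props.get("distance_km") is not None:
            anchor = props.get("nearest_anchor", "anchor")
            summary += f", {props['distance_km']:.0f} km from {anchor}"
        return Event(
            source="nifc",
            category="wildfire_incident",  # single category for all perimeters
            severity=stored["severity"],   # passed through unchanged
            summary=summary,
            lat=lat,
            lon=lon,
            group_key=event_id,            # same stable id used for both keys
            inhibit_keys=[event_id],
        )
    except Exception:
        # Corrupted input yields None instead of raising into the store.
        return None
```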

No store.py change: the Phase 2.9 _emit_event None-guard already handles
to_event() returning None, and store gates emission on
hasattr(adapter, "to_event").

Rule 17: no new tunable. fires enabled / state / tick_seconds already
exist in env_feeds.yaml (GUI-editable). Rule 18 N/A -- the WFIGS
Interagency Perimeters ArcGIS FeatureServer is keyless (no .env entry;
the .ref credentials store has no NIFC/ArcGIS/wildfire key, confirming
none is needed). Rule 16: standalone fetch path validated in-container.

FIRMS side-investigation (flagged in the 2.10 report): firms is disabled
because it needs a NASA FIRMS map key that is not provisioned --
env_feeds.yaml has firms.enabled=false with map_key='' (not even a
${FIRMS_MAP_KEY} reference), and /data/secrets/.env has no FIRMS key.
Intentional/blocked-on-key, not a bug. No action this phase.

Config note: fires was already enabled (state US-ID) and already one of
the 7 live adapters (store key "nifc"), so this phase keeps the count at 7
(no 7->8 change) and required no env_feeds.yaml edit. No seasonal
short-circuit, so no temp config wiggling was needed (unlike 2.10).

Tests: tests/test_adapter_fires.py (12 tests) mirrors test_adapter_usgs /
test_adapter_avalanche -- category (always wildfire_incident, independent
of severity), severity pass-through, group_key/inhibit_keys,
distinct-incident keys, field population, summary content, and the
defensive cases (zero acreage -> None, missing centroid/event_id -> None,
corrupted -> None). Full suite: 200 passed.

Live smoke test (prod container, Phase 2.11 code rebuilt in): clean
startup, 7 env adapters loaded, no traceback. There IS an active Idaho
incident today, so this produced a real end-to-end emission rather than
the empty-result cases of 2.9/2.10: the running store logged "NIFC fires
updated: 1 active in US-ID" and "Emitted nifc event cc4bd340be7fd57e
(wildfire_incident) to pipeline bus". An in-container standalone fetch
confirmed health is_loaded=true, last_error=null, consecutive_errors=0,
event_count=1 -- the WFIGS ArcGIS endpoint was reached with no DNS/auth
errors (Phase 2.6.6 DNS fix). The Summit Creek incident (1,500 ac, 0%
contained, ~72 km from the Twin Falls anchor) mapped to
wildfire_incident / routine as designed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 23:33:48 +00:00
1d35188b98 feat(notifications): Phase 2.10 avalanche adapter pipeline integration
Adds AvalancheAdapter.to_event(), wiring the avalanche.org map-layer
adapter into the notification EventBus, following the Phase 2.7 traffic /
2.9 USGS pattern.

to_event() design (emit only elevated danger):
- Category from danger_level: High/Extreme (4-5) -> avalanche_warning;
  Considerable (3) -> avalanche_watch.
- Low/Moderate (1-2) and No-Rating (-1/0) have no distinct trend trigger
  in this adapter and are intentionally NOT emitted (return None) -- the
  two categories are warning/watch only, matching the spec.
- Severity: passed through unchanged from the adapter's danger mapping
  (danger >= 4 -> priority, else routine; the adapter never emits
  "immediate"). Severity tiering is delegated to the pipeline Inhibitor.
- Summary: headline + danger name + travel advice.
- group_key/inhibit_keys: the adapter's stable "avy_{center}_{zone}"
  event_id as both. Re-polls of the same zone coalesce; single inhibit
  key lets the Inhibitor suppress lower-severity re-emissions.
- Defensive: missing centroid (lat/lon), missing event_id, or missing
  danger_level returns None; try/except-guarded.
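
The danger-level split above reduces to a small lookup. A minimal sketch,
assuming danger_level arrives as an int (or None); the function name is
hypothetical and the surrounding to_event() plumbing is omitted.

```python
from typing import Optional


def avalanche_category(danger_level: Optional[int]) -> Optional[str]:
    """Return the notification category, or None when nothing is emitted."""
    if danger_level is None:
        return None                    # defensive: missing rating
    if danger_level >= 4:
        return "avalanche_warning"     # High (4) / Extreme (5)
    if danger_level == 3:
        return "avalanche_watch"       # Considerable (3)
    return None                        # Low/Moderate (1-2), No-Rating (-1/0)
```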

No store.py change: the Phase 2.9 _emit_event None-guard already handles
to_event() returning None, and store gates emission on
hasattr(adapter, "to_event").

Rule 17: no new tunable. avalanche enabled / center_ids / season_months
already exist in env_feeds.yaml (GUI-editable). Rule 18 N/A -- the
avalanche.org v2 public map-layer API is keyless (no .env entry; the
.ref credentials store has no avalanche provider key, confirming none is
needed). Rule 16: standalone fetch path validated in-container below.

Config note: avalanche was already enabled (center_ids: [SNFAC], the
Sawtooth Avalanche Center -- the correct South Central Idaho / Magic
Valley center). It was already one of the 7 live adapters, so this phase
keeps the count at 7 (no 7->8 change) and required no env_feeds.yaml
edit. There is no per-zone config knob; the adapter fetches all zones for
the configured center.

Tests: tests/test_adapter_avalanche.py (14 tests) mirrors
test_adapter_usgs -- category split (warning vs watch), severity
pass-through, group_key/inhibit_keys, distinct-zone keys, field
population, and the non-emit/defensive cases (low/moderate -> None,
no-rating -> None, missing danger_level/centroid/event_id -> None,
corrupted -> None). Full suite: 188 passed.

Live smoke test (prod container, Phase 2.10 code rebuilt in): clean
startup, 7 env adapters loaded, no traceback. Late May is off-season
(season_months [12,1,2,3,4]) so tick() short-circuits in normal
operation. To exercise the open-API path, a one-shot standalone fetch was
run in-container with an all-months config against center SNFAC: health
is_loaded=true, last_error=null, consecutive_errors=0, last_fetch set,
off_season=false -- the fetch reached api.avalanche.org with no DNS/auth
errors (Phase 2.6.6 DNS fix). event_count=0 because all SNFAC zones are
server-side off_season in late May, so no Event is emitted -- acceptable
per the seasonal caveat. The temporary season_months edit was reverted
and the container restarted on the real config (7 adapters, healthy). The
emission path (elevated -> avalanche_warning / avalanche_watch) is
unit-validated and is the same store->bus path emitting live for NWS and
traffic.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 23:08:24 +00:00
4feb6a1895 feat(notifications): Phase 2.9 usgs water adapter pipeline integration
Adds USGSStreamsAdapter.to_event(), wiring the USGS Water Services stream
gauge adapter into the notification EventBus, following the Phase 2.7
traffic pattern.

to_event() design (emit only actionable/elevated readings):
- Category from flood_status: an exceeded stage (Minor/Moderate/Major
  Flood) -> stream_flood_warning; "Action Stage" (approaching) ->
  stream_high_water.
- A routine reading has no flood_status and is intentionally NOT emitted
  (returns None) -- the two categories are both flood-specific and routine
  gauge chatter is not actionable. This matches the spec ("category ...
  based on flood_status").
- Severity: passed through unchanged from the adapter's NWPS-stage logic
  (action->routine, minor/moderate->priority, major->immediate).
- Summary: reading value/unit + flood status.
- group_key/inhibit_keys: a single stable {site_id}_{param} key (the
  adapter's own event_id) as both. Re-polls coalesce; severity tiering is
  delegated to the pipeline Inhibitor (no severity encoded in the key).
- Defensive: missing lat/lon or event_id returns None; try/except-guarded.
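
As an illustration of the category split and key composition above (a
sketch, not the committed code): the exact flood_status strings and both
function names are assumptions based on the wording of this message, and
the parameter code in the usage note is only an example.

```python
from typing import Optional

# Assumed spellings of the "exceeded stage" statuses.
_FLOOD_STAGES = {"Minor Flood", "Moderate Flood", "Major Flood"}


def stream_category(flood_status: Optional[str]) -> Optional[str]:
    """Map flood_status to a category; routine readings map to None."""
    if flood_status in _FLOOD_STAGES:
        return "stream_flood_warning"
    if flood_status == "Action Stage":
        return "stream_high_water"
    return None  # no flood_status: routine gauge chatter, not emitted


def stream_key(site_id: str, param: str) -> str:
    """Single stable key used as both group_key and the sole inhibit key."""
    return f"{site_id}_{param}"
```

For example, stream_key("13090500", "00065") gives "13090500_00065"; no
severity is encoded in the key, so escalations share it.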

store fix (meshai/env/store.py): _emit_event now skips a None return from
to_event() instead of passing it to bus.emit(). Required because usgs
returns None for the common (routine) reading; also retroactively protects
the defensive None returns of the FIRMS/traffic/roads511 adapters, which
previously would have logged a spurious "Failed to emit" warning.
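
The guard amounts to the following shape. This is a simplified sketch
written as a free function; the real _emit_event is a method on the store
and its signature, logging text, and return value are not shown in this
message, so those details are assumptions.

```python
import logging

logger = logging.getLogger("env.store.sketch")


def emit_event(adapter, stored: dict, bus) -> bool:
    """Emit the adapter's Event for `stored`; return True if one was emitted."""
    if not hasattr(adapter, "to_event"):
        return False                  # adapter not wired to the pipeline
    event = adapter.to_event(stored)
    if event is None:
        return False                  # adapter chose not to emit: skip quietly
    try:
        bus.emit(event)
    except Exception:
        logger.warning("Failed to emit event", exc_info=True)
        return False
    return True
```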

Rule 17: no new tunable. usgs sites / tick_seconds / flood_thresholds
already exist in env_feeds.yaml (GUI-editable). Open API, no key, no .env
entry. Rule 16: standalone path validated end-to-end below.

Tests: tests/test_adapter_usgs.py (13 tests) mirrors test_adapter_traffic
-- category split (flood vs action), severity pass-through,
group_key/inhibit_keys, field population, and the non-emit/defensive cases
(routine -> None, missing lat/lon -> None, missing event_id -> None,
missing properties -> None, corrupted -> None). Full suite: 174 passed.

Live smoke test (prod, sites 13090500 Snake R nr Twin Falls, 13092747 Rock
Creek at Twin Falls, 13108150 Salmon Falls Creek nr Hagerman): clean
startup, 7 env adapters loaded, no traceback. "USGS streams updated: 6
readings from 3 sites" with NWPS flood stages resolved for all 3 -- fetch
succeeds over the open API with no DNS/auth errors (Phase 2.6.6 DNS fix).
All gauges currently below action stage, so flood_status is None and
to_event correctly emits nothing; the new None-guard skipped all 6 with no
error log. The emission path (elevated -> stream_flood_warning /
stream_high_water) is unit-validated and is the same store->bus path
emitting live for NWS (weather_warning/statement) and traffic
(traffic_congestion).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 21:58:13 +00:00
f273a8d5b0 feat(notifications): Phase 2.8 roads511 adapter pipeline integration
Adds Roads511Adapter.to_event(), wiring the state 511 road-events adapter
into the notification EventBus following the Phase 2.7 traffic pattern.

to_event() design:
- Category: fixed "road_closure".
- Severity: passed through unchanged from the adapter's existing
  _parse_event logic (priority on closure, else routine).
- Summary enriched with closure status, roadway, and description.
- group_key: the stored event_id (already the stable "511_{id}" key), so
  re-polls of the same incident coalesce.
- inhibit_keys: a single key equal to group_key. Severity tiering is
  delegated to the pipeline Inhibitor (ranks routine<priority<immediate
  per shared key, suppressing lower-severity re-emissions of the same
  incident within the Inhibitor TTL). No severity encoded into the key.
- Defensive: missing lat/lon or missing event_id returns None; whole body
  is try/except-guarded (returns None on corruption).

Store wiring: no change. EnvironmentalStore._ingest()'s generic "else"
branch already emits any adapter exposing to_event() (live since 2.6.5).

Rule 17: to_event introduces no new tunable. (The state base_url / bbox /
api_key already exist in Roads511Config and env_feeds.yaml; secrets go in
/data/secrets/.env via ${VAR}, never git.)

Tests: tests/test_adapter_roads511.py (14 tests) mirrors
test_adapter_traffic.py -- category, severity pass-through,
group_key/inhibit_keys, field population, defensive cases. Full suite:
161 passed.

Live smoke test SKIPPED: Idaho 511 v2 (511.idaho.gov/api/v2) requires an
API key ("Invalid Key" response) and none is available in .ref/credentials
(cannot self-register). Per the standing key-less-adapter policy, the code
+ unit tests are committed and Gate D is skipped; roads511 is left disabled
in prod (enabling it keyless would only emit HTTP 400 errors). The
to_event() path is fully unit-validated and structurally identical to the
live traffic/FIRMS wiring (same EnvironmentalStore->EventBus path); live
validation will run if/when an Idaho 511 key is provided.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 21:18:21 +00:00
d9cc80daf8 feat(notifications): Phase 2.7 traffic adapter pipeline integration
Adds TomTomTrafficAdapter.to_event(), wiring the traffic adapter into
the notification EventBus following the FIRMS pattern (Phase 2.6).

to_event() design:
- Category: fixed "traffic_congestion" (a road closure raises severity,
  not category).
- Severity: passed through unchanged from the adapter's existing
  _fetch_point logic (priority on closure / heavy congestion, else
  routine). No threshold is re-derived or introduced in to_event.
- Summary enriched with current/free-flow speed, % free flow, closure,
  and confidence.
- Defensive: missing lat/lon or missing corridor identity returns None;
  the whole body is try/except-guarded (returns None on corruption).

Inhibit-key composition:
- A single stable per-corridor key, "traffic_{corridor}" (lowercased,
  spaces->_), is used as BOTH group_key and the sole inhibit_key. This
  matches the adapter's own event_id, so re-polls of a corridor coalesce.
- Severity tiering is delegated to the pipeline Inhibitor, which ranks
  routine<priority<immediate per shared inhibit_key: a higher-severity
  emission for a corridor suppresses lower-severity re-emissions of the
  same corridor within the Inhibitor TTL window. No severity is encoded
  into the key (mirrors FIRMS's spatial-key approach).
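
The key composition above is small enough to show directly. A sketch with
a hypothetical helper name; only the "traffic_{corridor}" shape and the
lowercase / spaces->_ normalization come from this message.

```python
def corridor_key(corridor: str) -> str:
    """Stable per-corridor key used as group_key and the sole inhibit key."""
    return "traffic_" + corridor.lower().replace(" ", "_")
```

For example, corridor_key("US-93 Perrine Bridge") returns
"traffic_us-93_perrine_bridge" regardless of the reading's severity.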

Store wiring: no change. EnvironmentalStore._ingest()'s generic "else"
branch already emits any adapter exposing to_event() (live since 2.6.5).

Rule 17: to_event introduces no new tunable. The api_key is injected via
the secrets channel ($TOMTOM_API_KEY in /data/secrets/.env, referenced
as ${TOMTOM_API_KEY} in env_feeds.yaml) -- the GUI-editable reference
stays in config while the secret never enters git. The only other knob
in play is the pipeline-level Inhibitor TTL (1800s, set in
build_pipeline), which is pipeline infrastructure, not traffic-owned;
left out of scope.

Tests: tests/test_adapter_traffic.py (15 tests) mirrors
test_adapter_firms.py -- category, severity pass-through,
group_key/inhibit_keys, field population, defensive cases. Full suite:
147 passed.

Smoke test (prod, Magic Valley corridors I-84 @ Jerome, US-93 Perrine
Bridge, US-30 Twin Falls): clean startup, 6 env adapters loaded, no
traceback. "TomTom traffic updated: 3 corridors" (no auth/DNS error),
then 3 Events emitted to the pipeline bus with traffic_congestion
category -- the full store->bus->pipeline path observed live. Emission
count stable at 3 (one per corridor, is_new-gated).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 19:17:27 +00:00
9c5a106c9f feat(env): Phase 2.6 FIRMS adapter emits Events to pipeline bus
Second adapter wired to the new pipeline (after NWS). Reuses the
store-side emission logic added in the NWS commit.

- FIRMSAdapter.to_event() maps stored dict to pipeline Event.
- Category decision: new_ignition vs wildfire_proximity based on
  properties.new_ignition (computed by FIRMS during ingest from
  proximity to known fires).
- Severity passes through (FIRMS already pre-maps to our 3-level
  system during _parse_csv).
- group_key and inhibit_keys use a spatial grid key
  (firms:LAT:LON rounded to 0.01 degrees, ~1km) so repeated
  satellite detections of the same hotspot are coalesced and
  lower-severity re-detections are inhibited.
- Summary text enriched with FRP, confidence, and distance from
  the nearest region anchor when present.
- 13 tests covering category decision, severity pass-through,
  spatial grouping, and defensive handling of incomplete dicts.
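
A sketch of the spatial grid key, assuming the key is built by formatting
each coordinate to two decimal places; the helper name and the exact string
layout beyond "firms:LAT:LON" are assumptions.

```python
def firms_grid_key(lat: float, lon: float) -> str:
    """Snap a detection to a 0.01-degree (~1 km) cell and return its key."""
    return f"firms:{lat:.2f}:{lon:.2f}"
```

Two detections at (42.5631, -114.4609) and (42.5612, -114.4588) both yield
"firms:42.56:-114.46", so they share a group_key and inhibit key.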

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-15 05:23:00 +00:00
95dc938c2a feat(notifications): Phase 2.6 NWS adapter pipeline integration
Wires the NWS adapter to the new notification pipeline via EventBus:

- Added fine-grained weather categories: weather_watch, weather_advisory,
  weather_statement (all routine severity) alongside existing weather_warning
- NWSAlertsAdapter._derive_category() maps NWS event type suffix to category:
  "Warning" -> weather_warning, "Watch" -> weather_watch, etc.
- NWSAlertsAdapter.to_event() converts internal event dict to pipeline Event
  with proper group_key (event_id) and inhibit_keys (Warning suppresses Watch)
- EnvironmentalStore accepts optional event_bus parameter
- EnvironmentalStore._ingest() emits new events to bus via _emit_event()
- 22 new tests in test_adapter_nws.py covering category derivation,
  severity mapping, and Event field population
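
The suffix mapping can be pictured as below. This is an illustrative
sketch: the function is shown standalone rather than as the
_derive_category method, and returning None for an unrecognized suffix is
an assumption (the message only says "etc.").

```python
from typing import Optional

_SUFFIX_TO_CATEGORY = {
    "Warning": "weather_warning",
    "Watch": "weather_watch",
    "Advisory": "weather_advisory",
    "Statement": "weather_statement",
}


def derive_category(event_type: str) -> Optional[str]:
    """Map an NWS event type such as 'Winter Storm Warning' to a category."""
    cleaned = event_type.strip()
    for suffix, category in _SUFFIX_TO_CATEGORY.items():
        if cleaned.endswith(suffix):
            return category
    return None  # assumed fallback for suffixes not listed above
```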

All 119 tests pass.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-15 04:47:31 +00:00
b2bb7f7a95 feat(notifications): Phase 2.5b per-channel-type renderers
Adds dedicated renderer classes per channel type:

- MeshRenderer produces 1+ chunks <=200 chars with (k/N) counters
  when the payload overflows. Reuses the toggle-label vocabulary
  from the digest. Mesh channels skip re-chunking when the payload
  already carries chunk_index metadata (digest path).
- EmailRenderer produces {subject, body} with structured context
  lines. Plain text only; HTML body is a future polish.
- WebhookRenderer produces a JSON-serializable dict with stable
  schema_version 1.0. Optional fields omitted (not nulled) for
  compactness. Designed for reuse by Phase 2.6.5's MQTT event
  publisher.
- All four channel implementations (MeshBroadcast, MeshDM, Email,
  Webhook) now call their renderer in deliver() before transport.
- New renderer tests cover each renderer in isolation; new channel
  integration tests confirm channels actually call their renderer.

Renderers are pure functions of the payload - no network, no
state, fully testable without mocking I/O. The future MQTT
publisher will instantiate WebhookRenderer directly.
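
To make the mesh chunking concrete, here is one way the <=200-char rule
with (k/N) counters could work. It is a sketch under stated assumptions,
not MeshRenderer itself: the function name is hypothetical, the counter is
appended as a suffix, and text is cut at fixed widths rather than at word
boundaries.

```python
from typing import List


def chunk_for_mesh(text: str, limit: int = 200) -> List[str]:
    """Split text into chunks of at most `limit` chars, counter included."""
    if len(text) <= limit:
        return [text]                      # fits: no counter needed
    n = 2                                  # current guess for the chunk count
    while True:
        # Reserve room for the widest counter this guess could need.
        width = limit - len(f" ({n}/{n})")
        if width <= 0:
            raise ValueError("limit too small to hold a counter")
        parts = [text[i:i + width] for i in range(0, len(text), width)]
        if len(parts) <= n:
            break                          # reserved space was sufficient
        n = len(parts)                     # retry with a larger reservation
    total = len(parts)
    return [f"{part} ({k}/{total})" for k, part in enumerate(parts, 1)]
```

Because the function depends only on its input, it can be tested without
mocking any transport, which is the property the paragraph above relies on.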

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-15 04:25:44 +00:00
c9d9a9925c feat(notifications): Phase 2.5a channel interface unification
- Switch channels.py from dict-based to dataclass-based interfaces
- Add NotificationPayload dataclass and make_payload_from_event helper
- Update channel.deliver() to be async with (payload, rule) signature
- Add connector parameter to Dispatcher, DigestScheduler, and pipeline builders
- Update pipeline tee to use asyncio.create_task for async dispatch
- Add create_channel_from_dict for legacy router.py compatibility
- Update tests for new async interfaces

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-15 03:45:27 +00:00
a4cb29002d fix(notifications): inject llm_backend into build_pipeline
build_pipeline previously constructed its own LLMBackend from
config.llm, which:
  - duplicated main.py's already-running backend instance
  - failed to inherit env-loaded LLM_API_KEY when called from
    short-lived scripts (eyeball checks, tests), forcing fallback
  - prevented pipeline components from sharing the live backend

build_pipeline and build_pipeline_components now require an
llm_backend parameter. main.py passes the same instance it
constructed for its primary responder. Tests pass mocks. The
digest accumulator now uses the live, authenticated backend.

Added test_build_pipeline_uses_provided_backend to lock in the
injection contract.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-15 03:08:31 +00:00
9674e94efb Phase 2.4: LLM-summarized digest with master toggle filter
- Remove severity-based fork; tee pattern sends all events to both dispatcher and accumulator
- Add ToggleFilter before tee; drops events for disabled toggles
- Rework DigestAccumulator: event log instead of active/resolved tracking
- render_digest now async, calls LLM once per toggle with severity-ordered events
- Fallback to count-based summary when LLM unavailable
- Add TogglesConfig to config.py for master toggle settings
- Update scheduler to await async render_digest
- 75 tests passing
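
The filter-then-tee flow can be sketched as follows. This is a simplified,
synchronous illustration: events are plain dicts with a "toggle" field and
the dispatcher and accumulator are bare callables, all of which are
assumptions rather than the real interfaces.

```python
from typing import Callable, Iterable


def build_tail(
    enabled_toggles: Iterable[str],
    dispatcher: Callable[[dict], None],
    accumulator: Callable[[dict], None],
) -> Callable[[dict], bool]:
    """Return a handler that filters by toggle, then tees to both sinks."""
    enabled = set(enabled_toggles)

    def handle(event: dict) -> bool:
        if event.get("toggle") not in enabled:
            return False          # ToggleFilter: drop disabled toggles
        dispatcher(event)         # tee: every surviving event is dispatched
        accumulator(event)        # ...and also logged for the digest
        return True

    return handle
```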

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-15 02:37:12 +00:00
d6bc6b2b89 build: normalize all line endings to LF
One-time renormalization pass under the .gitattributes added in the
previous commit. Every tracked text file now uses LF. No semantic
changes -- verified via git diff --cached --ignore-all-space showing
zero real differences. Future diffs will only show real content
changes.

This commit will appear huge in git log --stat but represents zero
behavior change. Use git log --follow --ignore-all-space or
git blame -w when archaeologically tracing through this commit.
2026-05-14 22:43:06 +00:00
493b43f7cf feat(notifications): Phase 2.3b digest scheduler
Adds DigestScheduler class that fires digest at configured time (default 07:00)
and routes to rules with trigger_type=schedule and schedule_match=digest.

- DigestScheduler: asyncio task with start/stop lifecycle
- Config: DigestConfig dataclass with schedule and include fields
- Config: schedule_match field on NotificationRuleConfig
- Pipeline: start_pipeline/stop_pipeline async lifecycle functions
- Mesh channels get per-chunk delivery, email/webhook get full text
- 26 new tests covering schedule computation, fire behavior, lifecycle
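
The schedule computation presumably reduces to "next occurrence of HH:MM".
A minimal sketch, assuming naive local datetimes and an "HH:MM" schedule
string; the function name is hypothetical, and treating an exact match as
already fired (rolling to the next day) is an assumption.

```python
from datetime import datetime, timedelta


def next_fire(now: datetime, schedule: str = "07:00") -> datetime:
    """Return the next datetime strictly after `now` matching `schedule`."""
    hour, minute = (int(part) for part in schedule.split(":"))
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)   # today's slot has passed
    return candidate
```

An asyncio task could then sleep for (next_fire(now) - now).total_seconds()
before rendering and routing the digest.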

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-05-14 22:32:51 +00:00
8326fc56b2 refactor(notifications): mesh chunk list and include_toggles
2026-05-14 21:39:35 +00:00
57e2f516c5 refactor(notifications): per-toggle digest lines, exclude rf_propagation, explicit empty digest
2026-05-14 20:48:40 +00:00
96de22c6c0 feat(notifications): Phase 2.3a digest accumulator and renderer
Adds DigestAccumulator tracking ACTIVE NOW and SINCE LAST DIGEST
state per toggle. Replaces StubDigestQueue in build_pipeline; the
stub class is kept for Phase 2.1 backward-compat tests.

- enqueue(): adds new events, updates in place by id, detects
  resolutions (expires past, or title contains cleared/reopened/
  ended/resolved/back online/recovered/lifted)
- tick(now): rolls expired actives into since_last
- render_digest(now): produces a Digest with mesh_compact (<=200
  chars) and full multi-line forms; clears since_last after
- Toggle ordering and labels match the v0.3 design
- Phase 2.3b will add real scheduling on top of this
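
The resolution check in enqueue() can be illustrated like this. It is a
sketch: the helper name, the numeric timestamps, and the case-insensitive
match are assumptions; the marker words are the ones listed above.

```python
from typing import Optional

_RESOLUTION_MARKERS = (
    "cleared", "reopened", "ended", "resolved",
    "back online", "recovered", "lifted",
)


def is_resolution(title: str, expires: Optional[float], now: float) -> bool:
    """True if the event has expired or its title signals a resolution."""
    if expires is not None and expires <= now:
        return True
    lowered = title.lower()
    return any(marker in lowered for marker in _RESOLUTION_MARKERS)
```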
2026-05-14 19:21:40 +00:00
e67e2cd6a0 feat(notifications): Phase 2.2 inhibitor and grouper
Adds inline pipeline stages between the bus and the severity router:

- Inhibitor: suppresses lower-or-equal severity events when a key
  in event.inhibit_keys is already active. TTL configurable, default
  30 minutes.
- Grouper: coalesces events sharing group_key within a time window
  (default 60s). Most recent event wins. tick() and flush_all()
  drive emission; no background timers in Phase 2.2.
- build_pipeline now wires: bus -> inhibitor -> grouper -> severity_router
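
A compact sketch of the Inhibitor rule described above, for illustration
only. The class shape, the allow() method, and the choice not to refresh
the TTL on suppressed events are assumptions; the routine<priority<immediate
ordering is the one the adapters in this project use.

```python
from typing import Dict, Iterable, Tuple

_RANK = {"routine": 0, "priority": 1, "immediate": 2}


class Inhibitor:
    """Suppress lower-or-equal severity events while a shared key is active."""

    def __init__(self, ttl_seconds: float = 1800.0) -> None:
        self.ttl = ttl_seconds
        # key -> (severity rank, expiry timestamp)
        self._active: Dict[str, Tuple[int, float]] = {}

    def allow(self, severity: str, inhibit_keys: Iterable[str], now: float) -> bool:
        keys = list(inhibit_keys)
        rank = _RANK[severity]
        for key in keys:
            entry = self._active.get(key)
            if entry and entry[1] > now and rank <= entry[0]:
                return False                      # an active key outranks us
        for key in keys:
            self._active[key] = (rank, now + self.ttl)
        return True
```

With the default TTL, a routine event at t=0 passes, a second routine event
at t=10 is suppressed, and a priority event at t=20 passes and resets the
window for that key.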

Phase 2.1 dispatcher tests continue to pass unchanged.
2026-05-14 18:53:03 +00:00
31fe4d5978 test(notifications): six test cases for Phase 2.1 pipeline
2026-05-14 18:21:24 +00:00