meshai/tests/test_pipeline_grouper.py


fix(notifications): Phase 2.16.1 unblock pipeline -- grouper flush + rules coercion + toggle warning

Phase 2.16 found the live notification pipeline never delivered any environmental event. Two independent blocking bugs, both fixed here.

BUG A -- grouper held events forever (nothing drove tick()).
Every adapter event sets a group_key, so all were buffered in the Grouper and never flushed (start_pipeline only started the DigestScheduler; no tick driver existed).

Fixes (per Matt's decisions):
- Grouper.handle(): immediate-severity events now BYPASS the window entirely (delivered straight to next_handler), no buffering latency. routine/priority still coalesce.
- start_pipeline(): schedules an asyncio flush task that calls grouper.tick() every `grouper_flush_seconds` (default 5s) so coalesced events drain within the window even when poll cadence is sparse. stop_pipeline() signals + cancels it.

before/after (grouper held_count): an immediate+group_key event used to sit held (count 1) forever; now held_count==0 on arrival (bypassed). A routine event is held (count 1) then drained to 0 by tick()/flush.

BUG B -- notification rules loaded as dicts, crashing the dispatcher.
Root cause (more precise than 2.16's guess): the rules coercion is NOT missing from the multi-file loader -- it lives in _dict_to_dataclass's explicit `elif key == "notifications"` branch, but that branch was DEAD CODE, shadowed by the generic `if hasattr(field_type, "__dataclass_fields__")` handler that runs first for every dataclass field (including notifications). So Config.notifications.rules stayed a list of dicts on ALL load paths, and Dispatcher._matching_rules threw `AttributeError: 'dict' object has no attribute 'enabled'`.

Fix: hoist the notifications special-handling ahead of the generic handler (and drop the now-truly-dead duplicate elif).

before/after (cfg.notifications.rules[0] type): dict -> NotificationRuleConfig.

OBS C -- empty enabled_toggles.
Left as 'pass all' for v0.3 (per Matt); added a startup WARNING in build_pipeline so operators see gating is off: "enabled_toggles is empty -- ToggleFilter passing all events. Configure toggles to enable gating." (confirmed firing live).

Tests:
- tests/test_pipeline_grouper.py (new): test_immediate_severity_bypasses_grouper, test_periodic_flush_drains_routine, test_priority_is_also_coalesced_not_bypassed.
- tests/test_config_loader.py (new): test_multifile_load_coerces_notification_rules, test_rules_attribute_access_does_not_raise (regression guards for Bug B).
- tests/test_pipeline_inhibitor_grouper.py (updated): 5 existing grouper hold/coalesce/flush tests primed the grouper with immediate+group_key events expecting them to be held; switched those to 'priority' (still buffered; still outranks the routine event in the inhibitor-chain test) to match the intended immediate-bypass behavior.

Full suite: 253 passed (was 248 + 5 new; 5 existing updated, none lost).

VERIFICATION (rebuilt prod, traced end-to-end via in-process build_pipeline probe with a recording channel + live config):
- rules[0] type: NotificationRuleConfig (Bug B fixed).
- IMMEDIATE event: held_count==0 on emit (bypassed) -> reached channel.deliver(): delivered=[('PROBE_RULE','E2E IMMEDIATE')].
- ROUTINE event: held_count==1 -> after flush 0 -> reached channel.deliver(): delivered+=[('PROBE_RULE','E2E ROUTINE')].
- Natural Summit-Creek-shaped nifc wildfire_incident (routine, no matching dispatch rule): held 1 -> after flush -> landed in the digest accumulator (1 event). End-to-end channel.deliver evidence = the RecChannel.deliver() calls above.
- Live container: 8 adapters, healthy, "Grouper flush task started (every 5s)", the enabled_toggles warning fired, and NO dispatcher AttributeError/traceback.

Follow-up (non-blocking): several Phase 2.7-2.14 categories (e.g. wildfire_incident, earthquake_event) aren't in the category->toggle map, so they fall to toggle 'other'. Harmless while enabled_toggles is empty (pass-all), but should be mapped before toggle gating is turned on.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 00:36:13 +00:00
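
The Bug B root cause in the commit message (a special-case `elif` shadowed by a generic dataclass check that runs first) can be reproduced in isolation. The sketch below is not the project's `_dict_to_dataclass`; `RuleSketch`, `NotificationsSketch`, `ConfigSketch`, `load_shadowed`, and `load_fixed` are hypothetical names used only to show why branch order decides whether the rules end up as dicts or as dataclass instances.

```python
from dataclasses import dataclass, field, fields


# Hypothetical stand-ins for the real config dataclasses.
@dataclass
class RuleSketch:
    name: str = ""
    enabled: bool = True


@dataclass
class NotificationsSketch:
    rules: list = field(default_factory=list)


@dataclass
class ConfigSketch:
    notifications: NotificationsSketch = field(default_factory=NotificationsSketch)


def _coerce_notifications(value):
    """Turn each rule dict into a RuleSketch instance."""
    rules = [RuleSketch(**r) if isinstance(r, dict) else r
             for r in value.get("rules", [])]
    return NotificationsSketch(rules=rules)


def load_shadowed(cls, data):
    """Generic dataclass branch runs first, so the special case never fires."""
    kwargs = {}
    for f in fields(cls):
        if f.name not in data:
            continue
        value = data[f.name]
        if hasattr(f.type, "__dataclass_fields__"):
            kwargs[f.name] = load_shadowed(f.type, value)
        elif f.name == "notifications":  # unreachable: the field is a dataclass
            kwargs[f.name] = _coerce_notifications(value)
        else:
            kwargs[f.name] = value
    return cls(**kwargs)


def load_fixed(cls, data):
    """Same loader with the special case hoisted ahead of the generic branch."""
    kwargs = {}
    for f in fields(cls):
        if f.name not in data:
            continue
        value = data[f.name]
        if f.name == "notifications":
            kwargs[f.name] = _coerce_notifications(value)
        elif hasattr(f.type, "__dataclass_fields__"):
            kwargs[f.name] = load_fixed(f.type, value)
        else:
            kwargs[f.name] = value
    return cls(**kwargs)


raw = {"notifications": {"rules": [{"name": "PROBE_RULE", "enabled": True}]}}
shadowed = load_shadowed(ConfigSketch, raw)
fixed = load_fixed(ConfigSketch, raw)
```

With the shadowed ordering, `shadowed.notifications.rules[0]` stays a plain dict, so attribute access such as `.enabled` raises `AttributeError`; with the hoisted ordering it is a `RuleSketch`.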
"""Phase 2.16.1 grouper tests: immediate bypass + periodic flush of routine."""

from meshai.notifications.pipeline.grouper import Grouper
from meshai.notifications.events import make_event


class Recorder:
    """Stand-in next_handler that records every event passed to it."""

    def __init__(self):
        self.received = []

    def handle(self, event):
        self.received.append(event)


def _ev(severity, group_key="gk1"):
    return make_event(
        source="usgs_quake",
        category="earthquake_event",
        severity=severity,
        title=f"test {severity}",
        lat=42.6,
        lon=-114.5,
        group_key=group_key,
        inhibit_keys=[group_key],
    )


def test_immediate_severity_bypasses_grouper():
    """An immediate event with a group_key is delivered at once, not buffered."""
    rec = Recorder()
    g = Grouper(next_handler=rec.handle, window_seconds=60.0)
    g.handle(_ev("immediate"))
    # Delivered immediately, nothing held.
    assert len(rec.received) == 1
    assert rec.received[0].severity == "immediate"
    assert g.held_count() == 0


def test_periodic_flush_drains_routine():
    """A routine event is held, then released by tick() once its window passes."""
    rec = Recorder()
    g = Grouper(next_handler=rec.handle, window_seconds=0.0)  # 0s window -> tick drains now
    g.handle(_ev("routine"))
    # Held on arrival, not yet delivered.
    assert g.held_count() == 1
    assert rec.received == []
    # The periodic flush task calls tick(); simulate one tick.
    drained = g.tick()
    assert drained == 1
    assert len(rec.received) == 1
    assert rec.received[0].severity == "routine"
    assert g.held_count() == 0


def test_priority_is_also_coalesced_not_bypassed():
    """Priority events still buffer (only immediate bypasses)."""
    rec = Recorder()
    g = Grouper(next_handler=rec.handle, window_seconds=60.0)
    g.handle(_ev("priority"))
    assert rec.received == []
    assert g.held_count() == 1
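

The three tests above pin down a small contract: immediate events skip the buffer, routine and priority events are held, and `tick()` drains whatever has aged past the window. A minimal stand-in consistent with that contract, plus a periodic flush loop of the kind the commit message describes, might look like the sketch below. `GrouperSketch` and `flush_loop` are hypothetical names, not the real `Grouper` or `start_pipeline()` code. The sketch also assumes that events without a `group_key` pass straight through and that held events are forwarded one by one, whereas the real grouper may merge them into a single coalesced event.

```python
import asyncio
import time
from types import SimpleNamespace


class GrouperSketch:
    """Hypothetical stand-in: immediate bypasses, everything else is held."""

    def __init__(self, next_handler, window_seconds=60.0, clock=time.monotonic):
        self._next = next_handler
        self._window = window_seconds
        self._clock = clock
        self._held = {}  # group_key -> (first_seen, [events])

    def handle(self, event):
        # Assumption: events with no group_key are passed through unbuffered.
        if event.severity == "immediate" or not event.group_key:
            self._next(event)
            return
        _, events = self._held.setdefault(event.group_key, (self._clock(), []))
        events.append(event)

    def tick(self):
        """Forward every group whose window has elapsed; return events drained."""
        now = self._clock()
        due = [k for k, (t0, _) in self._held.items() if now - t0 >= self._window]
        drained = 0
        for key in due:
            _, events = self._held.pop(key)
            for ev in events:
                self._next(ev)
                drained += 1
        return drained

    def held_count(self):
        return sum(len(evs) for _, evs in self._held.values())


async def flush_loop(grouper, interval_seconds, stop_event):
    """Call tick() every interval_seconds until stop_event is set."""
    while not stop_event.is_set():
        grouper.tick()
        try:
            # Wake early if a stop is signalled, otherwise time out and tick again.
            await asyncio.wait_for(stop_event.wait(), timeout=interval_seconds)
        except asyncio.TimeoutError:
            pass


async def _demo():
    received = []
    g = GrouperSketch(received.append, window_seconds=0.0)
    g.handle(SimpleNamespace(severity="immediate", group_key="gk1"))
    g.handle(SimpleNamespace(severity="routine", group_key="gk1"))
    held_before = g.held_count()
    stop = asyncio.Event()
    task = asyncio.create_task(flush_loop(g, 0.01, stop))
    await asyncio.sleep(0.05)
    stop.set()
    await task
    return [e.severity for e in received], held_before, g.held_count()


severities, held_before, held_after = asyncio.run(_demo())
```

Waiting on the stop event with a timeout, instead of a bare `asyncio.sleep`, lets a shutdown signal end the loop without waiting out a full interval.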