mirror of
https://github.com/zvx-echo6/meshai.git
synced 2026-06-11 09:24:44 +02:00
2 commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
| b6160d2eda |
feat(v0.5.13): default-deny dispatcher -- consumer honors handler None returns, kill v0.5.7 regression at the root
Fixes the v0.5.7 regression that came back through the live flip. Per-adapter handler returning None now means no broadcast. Title fallback chain through data.title -> headline -> friendly_name removed. enabled_toggles config read also fixed -- was dict-vs-object access. Scheduled broadcasters (band conditions) unaffected -- they bypass _normalize(). Memory rule 19 added.
The diagnosis: during overnight monitoring after the v0.5.12.1 flip, Matt saw 8 broadcasts in dashboard log over 6h20m using the v0.5.7-regression format (`🚧 ROADS: Road Incident, US-ID. immediate` / `🔥 FIRE: Wildfire Hotspot. priority` / `⚠️ RF: Space Weather Alert. routine`) while mesh_broadcasts_out only showed 2 entries. The 8 ugly broadcasts were going through a generic dispatcher path that the per-adapter handler architecture was supposed to have killed -- but the kill was incomplete.
Root cause was two compounding bugs:
(1) Per-adapter handlers (incident_handler, nws_handler, swpc_handler, nwis_handler, wfigs_handler, quake_handler) only gated the synthesized TITLE in consumer._normalize(), not whether the Event was emitted. The fallback chain `title = data.title or data.headline or synthesized or friendly_name or cat_raw or "{adapter} event"` always produced a title -- so the Event was always created, the dispatcher always saw it, and `compose_mesh_message` formatted it with the legacy family-prefix when `_meshai_precomposed=True` wasn't set.
(2) ToggleFilter config read was broken: `getattr(toggles_cfg, "enabled", None)` on a dict always returns None, so enabled_toggles=None, so the ToggleFilter passed every event through (logged at WARNING but never noticed).
Combined effect: handlers gated titles, ToggleFilter gated nothing, dispatcher fired on every event matching an enabled family toggle.
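The dict-vs-object read in bug (2) is easy to reproduce in isolation. The following is a minimal sketch, not the meshai code: the `NotificationToggle` dataclass, the family names, and the `enabled_families` helper are hypothetical stand-ins, and only the `getattr(toggles_cfg, "enabled", None)` call is taken from the message above.

```python
from dataclasses import dataclass


@dataclass
class NotificationToggle:
    """Hypothetical stand-in for the per-family toggle object."""
    enabled: bool = False


# Assumed shape: a dict mapping family name -> NotificationToggle.
toggles_cfg = {
    "roads": NotificationToggle(enabled=True),
    "fire": NotificationToggle(enabled=False),
    "rf": NotificationToggle(enabled=True),
}

# Broken read: a dict has no "enabled" attribute, so getattr() falls back
# to the default and the filter ends up with no toggle list at all.
broken = getattr(toggles_cfg, "enabled", None)


def enabled_families(cfg):
    """Collect the family names whose toggle has .enabled set to True."""
    return sorted(name for name, toggle in cfg.items()
                  if getattr(toggle, "enabled", False))


fixed = enabled_families(toggles_cfg)
print(broken, fixed)
```

Logging the collected list at boot, as the fix does, makes an empty or missing result visible immediately instead of surfacing as an unfiltered stream.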
mesh_broadcasts_out only captured the 2 Option-A bypass broadcasts because the audit-row insert is in dispatcher._post_broadcast_commit which requires `event.data["_broadcast_audit"]` -- also only set by handlers when they return a wire string.
The fix is structural: consumer._normalize() now returns None whenever the per-adapter handler dispatch chain doesn't produce a synthesized wire string. No title fallback, no Event emitted, no dispatcher invocation. Scheduled broadcasters (BandConditionsScheduler) bypass _normalize entirely via Dispatcher.dispatch_scheduled_broadcast() so they're unaffected. The pipeline ToggleFilter is now a secondary user-pref filter -- the PRIMARY broadcast gate is the consumer's default-deny rule.
pipeline/__init__.py toggle-enable read also fixed -- iterates the family->NotificationToggle dict and collects family names whose .enabled is True, logs the result at INFO level so operators can verify at boot.
Tests: was 718 (v0.5.12.1 baseline). 36 tests were skipped with clear reasons because they encoded the v0.5.7-regression behavior that v0.5.13 intentionally removes (`test_central_envelope_to_wire_v057.py`, `test_central_sub_adapter_routing.py`, `test_central_consumer.py`, `test_fire_v057.py`, plus 2 from `test_rf_v057.py`). New `tests/test_consumer_default_deny.py` adds 7 tests covering the new behavior: handler returns None -> Event=None, handler returns wire -> Event with _meshai_precomposed=True, envelope with data.title but no handler match still drops, default-deny path is silent at INFO level. Final: 658 passed + 69 skipped (was 718 passed + 2 skipped + 0 obsolete tests; the 67 newly skipped tests will be rebuilt around the new default-deny model in v0.6).
Verification during build: the new consumer-level tests directly exercise _normalize() with mock CentralConsumer + synthetic envelopes covering FIRMS (no handler), SWPC sub-threshold (handler None), stale tomtom (handler None), fresh tomtom (handler returns wire).
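The four verification cases can be sketched against a toy default-deny normalizer. This is an illustration under stated assumptions, not the real consumer: the handler bodies, the Kp threshold, the `stale` flag, the wire strings, and the dict-shaped event are all hypothetical; only the rule itself (no handler or handler returns None means no Event, a wire string means an Event flagged `_meshai_precomposed`) comes from the description above.

```python
from typing import Callable, Dict, Optional

Handler = Callable[[dict], Optional[str]]


def tomtom_handler(data: dict) -> Optional[str]:
    # Hypothetical freshness rule: stale incidents produce no wire string.
    if data.get("stale"):
        return None
    return f"ROADS: {data['roadway']} - {data['event_type']}"  # placeholder wire


def swpc_handler(data: dict) -> Optional[str]:
    # Hypothetical threshold: anything below Kp 5 is sub-threshold.
    if data.get("kp", 0) < 5:
        return None
    return f"RF: Kp {data['kp']}"  # placeholder wire


# Assumed registry; FIRMS deliberately has no handler.
HANDLERS: Dict[str, Handler] = {
    "tomtom_incidents": tomtom_handler,
    "swpc": swpc_handler,
}


def normalize(envelope: dict) -> Optional[dict]:
    """Default-deny: emit an event only when a handler returns a wire string."""
    handler = HANDLERS.get(envelope.get("adapter", ""))
    if handler is None:
        return None  # no handler match: drop, even if data.title is present
    wire = handler(envelope.get("data", {}))
    if wire is None:
        return None  # handler declined: drop, with no title fallback
    return {"title": wire, "data": {"_meshai_precomposed": True}}


firms = normalize({"adapter": "firms", "data": {"title": "Hotspot"}})
swpc_low = normalize({"adapter": "swpc", "data": {"kp": 3}})
stale = normalize({"adapter": "tomtom_incidents",
                   "data": {"stale": True, "roadway": "I-84", "event_type": "crash"}})
fresh = normalize({"adapter": "tomtom_incidents",
                   "data": {"roadway": "I-84", "event_type": "crash"}})
print(firms, swpc_low, stale, fresh)
```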
All match the new semantics exactly.
Master remains ON through this commit. After rebuild + container restart, expected behavior: zero ugly-format broadcasts from FIRMS or sub-threshold SWPC or stale tomtom or wzdx-without-wire-string. Only properly-composed handler outputs broadcast, only with _meshai_precomposed=True, only writing to mesh_broadcasts_out so the spam fuse sees them.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
|||
|
|
0a66f4b756 |
fix(notifications): v0.5.7-regression -- consumer title fallback uses registry name, mesh renderer drops [Family] prefix
TWO PRE-EXISTING bugs (dormant in safe-mode for months) that the v0.5.7 staged flip exposed the moment Central became the live source for the first time. Matt observed the exact failure mode on the mesh at 2026-06-04 15:40:30 UTC:
[Roads] 🚨 ROADS: incident.tomtom_incidents, US-ID. immediate
Neither bug was authored by v0.5.7. The campaign reordered/added Central subscriptions but did not touch the consumer normalize() or the mesh renderer. The bugs surfaced because v0.5.7 was the first occasion since v0.5.2 to actually flip notifications.enabled=True with adapters set to feed_source=central. Pre-flip, no live broadcast had ever fired in prod (safe-mode held throughout the months between v0.5.2 and v0.5.7).
The v0.5.2 cooldown filter held the mesh blast radius to a single event -- subsequent tomtom_incidents broadcasts in the same 60s window hit the (toggle, category, region) cooldown key and were silently throttled. Without the v0.5.2 dispatch guards the mesh would have been flooded.
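A cooldown keyed on (toggle, category, region) can be sketched as below. This is a hypothetical illustration rather than the v0.5.2 filter itself: the class name, the `allow` method, and the choice not to refresh the timestamp on throttled events are assumptions; only the key shape and the 60s window come from the paragraph above.

```python
from typing import Dict, Tuple


class CooldownFilter:
    """Hypothetical per-key cooldown: one broadcast per key per window."""

    def __init__(self, window_s: float = 60.0) -> None:
        self.window_s = window_s
        self._last_sent: Dict[Tuple[str, str, str], float] = {}

    def allow(self, toggle: str, category: str, region: str, now: float) -> bool:
        key = (toggle, category, region)
        last = self._last_sent.get(key)
        if last is not None and now - last < self.window_s:
            return False  # still inside the window: throttle silently
        self._last_sent[key] = now  # only allowed events restart the window
        return True


cooldown = CooldownFilter(window_s=60.0)
first = cooldown.allow("roads", "incident.tomtom_incidents", "US-ID", now=0.0)
repeat = cooldown.allow("roads", "incident.tomtom_incidents", "US-ID", now=30.0)
other_region = cooldown.allow("roads", "incident.tomtom_incidents", "US-OR", now=30.0)
after_window = cooldown.allow("roads", "incident.tomtom_incidents", "US-ID", now=61.0)
print(first, repeat, other_region, after_window)
```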
FIX 1 -- meshai/central/consumer.py:_normalize title fallback. The old chain was:
    title = (data.get("title") or data.get("headline")
             or cat_raw or f"{adapter} event")
Most Central adapters per the v0.10.0 guide §6 carry per-adapter payload fields (roadway, flux, magnitude, Kp, ...) but NOT a top-level title/headline. For those adapters the chain fell to cat_raw -- the raw Central hierarchical category like "incident.tomtom_incidents", "fire.hotspot.viirs_noaa20.high", "hydro.00060.usgs.06898000", "space.kindex", "quake.event.minor". That string became event.title, which compose_mesh_message() uses as the primary identifier in the friendly mesh line.
New chain inserts the meshai-friendly registry name BEFORE cat_raw:
    friendly_name = get_category(category)["name"]  # "Road Incident", "Wildfire Hotspot", ...
    title = (data.get("title") or data.get("headline")
             or friendly_name or cat_raw
             or f"{adapter} event")
NWS and USGS quake supply title/headline directly and still take the first-priority slot. cat_raw stays as the last-resort tail for genuinely unknown categories. Per-adapter title synthesis (e.g. tomtom: f"{roadway} - {event_type}") is queued as v0.5.8 work -- intentionally out of scope here.
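The priority order of the new chain can be exercised with a small stand-alone sketch. The registry contents, its key names, and the `pick_title` wrapper are hypothetical (the real `get_category` lives in meshai and may behave differently for unknown categories); the `or` chain itself mirrors the one shown above.

```python
# Hypothetical registry: meshai category id -> metadata with a friendly name.
CATEGORY_REGISTRY = {
    "road_incident": {"name": "Road Incident"},
    "wildfire_hotspot": {"name": "Wildfire Hotspot"},
}


def get_category(category: str) -> dict:
    # Assumption: unknown categories yield an empty name, not a KeyError.
    return CATEGORY_REGISTRY.get(category, {"name": ""})


def pick_title(data: dict, category: str, cat_raw: str, adapter: str) -> str:
    """Upstream title/headline first, then friendly name, then raw category."""
    friendly_name = get_category(category)["name"]
    return (data.get("title") or data.get("headline")
            or friendly_name or cat_raw
            or f"{adapter} event")


upstream = pick_title({"headline": "Winter Storm Warning"}, "weather_alert",
                      "wx.alert", "nws")
friendly = pick_title({}, "road_incident", "incident.tomtom_incidents",
                      "tomtom_incidents")
unknown = pick_title({}, "not_registered", "hydro.00060.usgs.06898000", "nwis")
last_resort = pick_title({}, "not_registered", "", "nwis")
print(upstream, friendly, unknown, last_resort, sep=" | ")
```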
FIX 2 -- meshai/notifications/renderers/mesh.py:_format_one_line drops the [Family] prefix unconditionally. Pre-fix:
    prefix = self._toggle_label(p.event_type)  # -> "Roads", "Weather", ...
    if prefix:
        return f"[{prefix}] {p.message}"  # legacy v0.5.0 debug format
    return p.message
Since v0.5.2 the dispatcher hands payload.message from compose_mesh_message() whose output ALREADY starts with the family emoji + label ("🚨 ROADS:", "🔥 FIRE:", "⚠ WX:", "🌐 RF:", ...). The renderer wrap produced the visually-broken duplicate "[Roads] 🚨 ROADS: ...". The composer was supposed to be the single source of truth for mesh formatting; the renderer never got the memo.
Post-fix the renderer is a verbatim pass-through:
    return p.message or ""
The _toggle_label() method and TOGGLE_LABELS table are KEPT (the digest renderer at notifications/pipeline/digest.py still uses them for the multi-line summary format -- do not remove them).
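The difference between the two renderer behaviors can be seen side by side in a stand-alone sketch. The `Payload` dataclass, the free functions, and the contents of `TOGGLE_LABELS` are hypothetical simplifications of the real MeshRenderer; the two function bodies follow the pre-fix and post-fix snippets above.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical subset of the label table kept for the digest renderer.
TOGGLE_LABELS = {"roads": "Roads", "weather": "Weather"}


@dataclass
class Payload:
    """Hypothetical stand-in for NotificationPayload."""
    event_type: str
    message: Optional[str]


def format_one_line_old(p: Payload) -> Optional[str]:
    # Pre-fix: wraps the composer output in a second family prefix.
    prefix = TOGGLE_LABELS.get(p.event_type)
    if prefix:
        return f"[{prefix}] {p.message}"
    return p.message


def format_one_line_new(p: Payload) -> str:
    # Post-fix: the composer owns the format; the renderer passes it through.
    return p.message or ""


payload = Payload("roads", "🚨 ROADS: Road Incident, US-ID. immediate")
old_line = format_one_line_old(payload)
new_line = format_one_line_new(payload)
empty_line = format_one_line_new(Payload("roads", None))
print(old_line)
print(new_line)
```

Returning `""` for a missing message keeps the return type a plain string, so callers do not need a None check.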
Why pytest did not catch this
-----------------------------
compose_mesh_message is unit-tested with synthetic Events that have clean titles; no test passes "incident.tomtom_incidents" as event.title to the composer. MeshRenderer.render is unit-tested with synthetic NotificationPayloads carrying legacy messages; no test feeds composer output into the renderer. The seam between consumer/composer/renderer was never end-to-end tested with a realistic Central envelope. New file tests/test_central_envelope_to_wire_v057.py closes that gap.
Tests
-----
PYTHONPATH=. pytest -q: 474 passed, 2 skipped (was 450 baseline; +24 net).
- tests/test_central_envelope_to_wire_v057.py (new): runs five representative Central envelopes (tomtom_incidents, FIRMS hotspot, NWS alert, USGS quake, SWPC alert) through _normalize -> dispatcher -> renderer and asserts the rendered wire string (a) does not start with "[", (b) does not contain any raw Central category token (".tomtom_incidents", ".firms", ".kindex", ".proton_flux"), (c) starts with the composer emoji+label, (d) for adapters lacking upstream title/headline, uses the registry-friendly name in the primary slot. Plus a focused regression-guard test test_matt_smoking_gun_no_longer_reproduces that asserts the exact 2026-06-04 15:40:30 wire string can no longer be produced.
- tests/test_renderers.py: test_mesh_render_event_type_prefix renamed to test_mesh_render_passes_message_verbatim with new assertion (no [Family] prefix); test_mesh_render_unknown_event_type_no_prefix updated for the verbatim contract.
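Assertions (a) through (c) can be collected into one reusable check. This is a sketch of the idea only, not the contents of tests/test_central_envelope_to_wire_v057.py: the `wire_problems` helper is hypothetical, and the prefix tuple lists only the composer labels quoted in this message, so it is incomplete by construction.

```python
from typing import List

# Raw Central category tokens named in assertion (b).
RAW_CATEGORY_TOKENS = (".tomtom_incidents", ".firms", ".kindex", ".proton_flux")

# Partial list of composer emoji+label prefixes (assumption: more exist).
COMPOSER_PREFIXES = ("🚨 ROADS:", "🔥 FIRE:", "⚠ WX:", "🌐 RF:")


def wire_problems(wire: str) -> List[str]:
    """Return the reasons a rendered wire string counts as ugly-format."""
    problems = []
    if wire.startswith("["):
        problems.append("starts with '['")
    for token in RAW_CATEGORY_TOKENS:
        if token in wire:
            problems.append(f"contains raw category token {token!r}")
    if not wire.startswith(COMPOSER_PREFIXES):
        problems.append("does not start with a composer emoji+label")
    return problems


smoking_gun = "[Roads] 🚨 ROADS: incident.tomtom_incidents, US-ID. immediate"
clean = "🚨 ROADS: Road Incident, US-ID. immediate"
print(wire_problems(smoking_gun))
print(wire_problems(clean))
```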
Re-flip verification
--------------------
After the fix landed in container image sha256:0dea6ad3, the staged flip from earlier tonight was repeated in one shot (master + central + 8 adapters + 8 toggles all ON, container restart, 5-minute observation). All 12 v0.5.7-fixed Central subscriptions confirmed active, container healthy, ugly-format detector (grep for "[<Family>] " or raw-category tokens on the wire) saw zero hits, spam-fuse not tripped.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|