feat(v0.6-tail-3): enforce OR-not-AND continuously -- close USGS direct-lookup leak + flag environmental config changes as restart-required

Gap 1 -- env_routes.lookup_usgs_site no longer creates a temporary
USGSStreamsAdapter to hit USGS.gov directly. When the env_store has no
native usgs adapter (usgs is disabled, or usgs.feed_source != native),
the endpoint returns HTTP 404 with the detail "site lookup unavailable
in central-feed mode; values must be entered manually or sourced from
Central". A missing env_store also returns 404, and a native-adapter
exception now returns 502 instead of a 200 with an error body. This
closes the AND-mode anti-pattern that Central's v0.10.2 report flagged:
meshai was in central-feed mode for usgs, but the lookup helper would
still call USGS.gov directly the first time the dashboard opened the
Add-Gauge form.
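
A minimal sketch of the Gap 1 decision order, written as a plain
function so it runs without FastAPI. The names lookup_status, FakeStore
and FakeAdapter are hypothetical stand-ins, and the real handler raises
HTTPException rather than returning a (status, body) tuple.

```python
from typing import Any, Optional, Tuple

CENTRAL_MODE_DETAIL = (
    "site lookup unavailable in central-feed mode; values "
    "must be entered manually or sourced from Central"
)


def lookup_status(env_store: Optional[Any], site_id: str) -> Tuple[int, Any]:
    """Return (http_status, body) in the same order the endpoint checks."""
    if not env_store:
        return 404, {"detail": "Environmental feeds not enabled"}
    adapters = getattr(env_store, "_adapters", {})
    usgs_adapter = adapters.get("usgs")
    if not usgs_adapter:
        # No native adapter: never fall back to a direct upstream call.
        return 404, {"detail": CENTRAL_MODE_DETAIL}
    try:
        return 200, usgs_adapter.lookup_site(site_id)
    except Exception as exc:
        # The native adapter failed upstream; report it as a gateway error.
        return 502, {"detail": str(exc)}


class FakeAdapter:
    """Hypothetical stand-in for the native usgs adapter."""

    def __init__(self, fail: bool = False):
        self.fail = fail

    def lookup_site(self, site_id: str) -> dict:
        if self.fail:
            raise RuntimeError("upstream timeout")
        return {"site_id": site_id, "name": "Example gauge"}


class FakeStore:
    """Hypothetical stand-in for env_store; only _adapters is modelled."""

    def __init__(self, adapters: dict):
        self._adapters = adapters


central_mode = lookup_status(FakeStore({}), "00000000")
native_ok = lookup_status(FakeStore({"usgs": FakeAdapter()}), "00000000")
native_err = lookup_status(FakeStore({"usgs": FakeAdapter(fail=True)}), "00000000")
```

The point of the ordering is that the "no adapter" branch returns before
any adapter object exists, so there is nothing left that could reach
USGS.gov in central-feed mode.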

Gap 2 -- config_routes.RESTART_REQUIRED_SECTIONS gains "environmental"
and the PUT handler now diffs the section before/after, returning
{saved, restart_required, changed_keys}. restart_required is true only
when there are actual changes AND the section is in the restart-required
set, so a no-op PUT to environmental never raises a false alarm.
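
The rule above can be sketched as a small pure function. The section
set is copied from this commit; build_put_response is a hypothetical
name used only for illustration, since the real logic lives inline in
the PUT handler.

```python
RESTART_REQUIRED_SECTIONS = {
    "connection",
    "llm",
    "mesh_sources",
    "meshmonitor",
    "dashboard",
    "environmental",
}


def build_put_response(section: str, changed_keys: list[str]) -> dict:
    """Flag a restart only when a restart-required section actually changed."""
    restart_required = (section in RESTART_REQUIRED_SECTIONS
                        and len(changed_keys) > 0)
    return {
        "saved": True,
        "restart_required": restart_required,
        "changed_keys": changed_keys,
    }


# A real change to a restart-required section raises the flag.
changed = build_put_response("environmental", ["environmental.usgs.feed_source"])
# A no-op PUT to the same section does not.
noop = build_put_response("environmental", [])
# A change to a section outside the set does not either.
bot = build_put_response("bot", ["bot.name"])
```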

Frontend wiring:
- New RestartBanner component (yellow top-of-main banner) listens to a
  meshai:restart-required CustomEvent + cross-tab storage event,
  persists across navigations via localStorage, shows changed_keys
  preview + Restart-now button (POSTs /api/system/restart) + dismiss.
- Layout.tsx mounts <RestartBanner /> above {children} so it surfaces
  on every page.
- Config.tsx saveSection() now calls notifyRestartRequired(changed_keys)
  alongside its existing setRestartRequired(true) when the API flags
  the section.
- GaugeSites.tsx probes /api/config/environmental at mount and shows a
  "USGS lookup" button next to the site_id input. The button is
  disabled with an explanatory tooltip when usgs.feed_source != native,
  and gracefully renders the 404 detail when the API returns 404 in
  central-feed mode -- enter-manually UX, no silent fallback.

Tests -- tests/test_or_arch_continuous.py (11 cases, all passing):
- USGS lookup 404 with no env_store / no native usgs adapter
- 502 on native-adapter exception
- 200 + payload on native-adapter happy path
- environmental in RESTART_REQUIRED_SECTIONS
- PUT environmental with changed feed_source -> restart_required:true
  + changed_keys list including foo.feed_source dotted path
- PUT bot (non-restart section) -> restart_required:false
- No-op PUT to bot / environmental -> restart_required:false, empty
  changed_keys
- _diff_keys helper unit tests (nested dicts, list-element changes)
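
To illustrate the _diff_keys cases in the list above, here is a
standalone copy of the helper from this commit, reproduced so the
snippet runs on its own. The sample config dicts are made up and are
not taken from the test file.

```python
def _diff_keys(before, after, *, prefix: str) -> list[str]:
    """Collect dotted-path keys where `before` and `after` differ."""
    out: list[str] = []

    def walk(b, a, p: str):
        if b == a:
            return
        if isinstance(b, dict) and isinstance(a, dict):
            # Union of keys so added and removed keys are both reported.
            for k in set(b.keys()) | set(a.keys()):
                walk(b.get(k), a.get(k), f"{p}.{k}" if p else k)
            return
        if isinstance(b, list) and isinstance(a, list):
            if len(b) != len(a):
                # Length mismatch: report the list itself, without an index.
                out.append(p)
                return
            for i, (bi, ai) in enumerate(zip(b, a)):
                walk(bi, ai, f"{p}[{i}]")
            return
        out.append(p)

    walk(before, after, prefix)
    return sorted(out)


# Made-up sample data for illustration only.
before = {"usgs": {"feed_source": "native", "sites": ["a", "b"]}}
after = {"usgs": {"feed_source": "central", "sites": ["a", "c"]}}

nested = _diff_keys(before, after, prefix="environmental")
# nested == ["environmental.usgs.feed_source", "environmental.usgs.sites[1]"]

grown = _diff_keys({"sites": ["a"]}, {"sites": ["a", "b"]}, prefix="environmental")
# grown == ["environmental.sites"]

noop = _diff_keys(before, before, prefix="environmental")
# noop == []

added = _diff_keys(None, {"enabled": True}, prefix="bot")
# added == ["bot"]
```

The last call shows the None tolerance: when the previous value cannot
be read, the whole section is reported as a single changed key.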

Why this matters: per the Spokane post-mortem and Central's v0.10.2
response, both sides need belt-and-suspenders against transient
AND-modes. meshai's static OR enforcement at env_store boot is the
runtime guard; this commit makes the GUI honor it continuously --
the lookup helper can't sneak past it any more, and the user is told
explicitly that an environmental config change does not take effect
until the container restarts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Matt Johnson (via Claude) 2026-06-06 03:51:10 +00:00
commit f89e9c11fb
10 changed files with 641 additions and 151 deletions


@@ -17,13 +17,22 @@ logger = logging.getLogger(__name__)
 
 router = APIRouter(tags=["config"])
 
-# Sections that require restart when changed
+# Sections that require restart when changed.
+# v0.6-tail-3: environmental added. Per Central v0.10.2 OR-not-AND
+# verification (Spokane fix), env_store rebuild and CentralConsumer
+# subscribe both happen only at boot. A live PUT to
+# environmental.<adapter>.feed_source / enabled writes to disk but the
+# running process keeps polling the existing native adapters AND newly
+# subscribing to Central until the container restarts -- a transient
+# AND-mode that violates the architecture for as long as the user
+# delays the restart.
 RESTART_REQUIRED_SECTIONS = {
     "connection",
     "llm",
     "mesh_sources",
     "meshmonitor",
     "dashboard",
+    "environmental",
 }
 
 # Valid config section names
@@ -134,19 +143,44 @@ async def update_config_section(section: str, request: Request):
         config_dir = get_config_dir_from_path(config_path)
         save_section(section, data_to_save, config_dir)
 
-        # Determine if restart is required
-        restart_required = section in RESTART_REQUIRED_SECTIONS
+        # v0.6-tail-3: compute the dotted-key diff so the UI banner can
+        # show *which* fields require a restart, not just "something
+        # restart-y changed". This is purely advisory -- the static OR
+        # enforcement at boot remains the runtime guard.
+        try:
+            before_section = _section_to_plain(getattr(
+                request.app.state.config, section, None))
+        except Exception:
+            before_section = None
+        after_section = data_to_save
+        changed_keys = _diff_keys(before_section, after_section,
+                                  prefix=section)
 
-        # Keep the live config in sync (no disk reload needed) when no restart is required
+        restart_required = (section in RESTART_REQUIRED_SECTIONS
+                            and len(changed_keys) > 0)
+
+        # Keep the live config in sync (no disk reload needed) when no
+        # restart is required. When a restart IS required, the live
+        # config object intentionally diverges from disk until the user
+        # actually restarts -- otherwise the runtime would silently
+        # switch into the transient AND-mode this commit exists to
+        # prevent.
         if not restart_required and getattr(request.app.state, "config", None) is not None:
             try:
                 setattr(request.app.state.config, section, new_value)
             except Exception:
                 pass
 
-        logger.info(f"Config section '{section}' updated, restart_required={restart_required}")
+        logger.info(
+            "Config section %r updated, restart_required=%s changed_keys=%s",
+            section, restart_required, changed_keys,
+        )
 
-        return {"saved": True, "restart_required": restart_required}
+        return {
+            "saved": True,
+            "restart_required": restart_required,
+            "changed_keys": changed_keys,
+        }
     except ValueError as e:
         raise HTTPException(status_code=422, detail=str(e))
 
@@ -250,3 +284,51 @@ def register_config_routes_hooks(app):
         except Exception:
             logger.exception("auto-refresh middleware failed")
         return response
+
+
+# ---- v0.6-tail-3 diff helpers ------------------------------------------
+
+
+def _section_to_plain(section_value):
+    """Dataclass / list / scalar -> JSON-serializable shape."""
+    if section_value is None:
+        return None
+    if isinstance(section_value, list):
+        return [
+            _dataclass_to_dict(item) if hasattr(item, "__dataclass_fields__") else item
+            for item in section_value
+        ]
+    if hasattr(section_value, "__dataclass_fields__"):
+        return _dataclass_to_dict(section_value)
+    return section_value
+
+
+def _diff_keys(before, after, *, prefix: str) -> list[str]:
+    """Recursively collect dotted-path keys where `before` and `after` differ.
+
+    Lists are compared element-wise -- structural mismatch yields a single
+    bracketless path. The function is deliberately tolerant of None /
+    missing keys so a section being added or removed produces a meaningful
+    diff instead of crashing.
+    """
+    out: list[str] = []
+
+    def walk(b, a, p: str):
+        if b == a:
+            return
+
+        if isinstance(b, dict) and isinstance(a, dict):
+            for k in set(b.keys()) | set(a.keys()):
+                walk(b.get(k), a.get(k), f"{p}.{k}" if p else k)
+            return
+        if isinstance(b, list) and isinstance(a, list):
+            if len(b) != len(a):
+                out.append(p)
+                return
+            for i, (bi, ai) in enumerate(zip(b, a)):
+                walk(bi, ai, f"{p}[{i}]")
+            return
+        out.append(p)
+
+    walk(before, after, prefix)
+    return sorted(out)


@@ -1,6 +1,6 @@
 """Environmental data API routes."""
 
-from fastapi import APIRouter, Request
+from fastapi import APIRouter, HTTPException, Request
 
 router = APIRouter(tags=["environment"])
 
@@ -143,26 +143,43 @@ async def lookup_usgs_site(request: Request, site_id: str):
 
     Returns site name, location, and flood stage thresholds from NWS NWPS.
     Used by the config UI to auto-populate fields when adding a new gauge.
-    """
+
+    v0.6-tail-3: when usgs.feed_source != native, this endpoint returns 404
+    instead of creating a temporary USGSStreamsAdapter. The pre-tail-3
+    behavior was an AND-mode anti-pattern -- meshai was in central-feed
+    mode for usgs but the lookup helper hit USGS.gov directly anyway.
+    With this change, the lookup is only available when meshai itself
+    is the polling source. In central-feed mode the GUI must source
+    values manually or via Central."""
 
     env_store = getattr(request.app.state, "env_store", None)
     if not env_store:
-        return {"error": "Environmental feeds not enabled"}
+        raise HTTPException(
+            status_code=404,
+            detail="Environmental feeds not enabled",
+        )
 
     adapters = getattr(env_store, "_adapters", {})
     usgs_adapter = adapters.get("usgs")
 
     if not usgs_adapter:
-        # Create a temporary adapter for lookup
-        from meshai.env.usgs import USGSStreamsAdapter
-        from meshai.config import USGSConfig
-        usgs_adapter = USGSStreamsAdapter(USGSConfig())
+        # No native usgs adapter on the env_store means usgs is either
+        # disabled or running on a non-native feed_source (central). In
+        # central-feed mode meshai must NOT make direct upstream API calls;
+        # that's the AND-model anti-pattern Central's v0.10.2 report
+        # called out explicitly. Surface this to the UI as a 404 so the
+        # frontend can switch the form to manual-entry mode.
+        raise HTTPException(
+            status_code=404,
+            detail=("site lookup unavailable in central-feed mode; values "
+                    "must be entered manually or sourced from Central"),
+        )
 
     try:
         result = usgs_adapter.lookup_site(site_id)
         return result
     except Exception as e:
-        return {"error": str(e), "site_id": site_id}
+        raise HTTPException(status_code=502, detail=str(e))
 
 
 @router.get("/env/traffic")


@@ -8,8 +8,8 @@
     <link rel="preconnect" href="https://fonts.googleapis.com">
     <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
     <link href="https://fonts.googleapis.com/css2?family=JetBrains+Mono:wght@400;500;600;700&display=swap" rel="stylesheet">
-    <script type="module" crossorigin src="/assets/index-WV9oBF1j.js"></script>
-    <link rel="stylesheet" crossorigin href="/assets/index-j88L17ja.css">
+    <script type="module" crossorigin src="/assets/index-D0oznGRE.js"></script>
+    <link rel="stylesheet" crossorigin href="/assets/index-BNx9Ej8o.css">
   </head>
   <body>
     <div id="root"></div>