feat(nwis): site + stats enrichment — named location + WaterWatch normalcy band (v0.8.0)

Opens the v0.8.x data-quality cleanup arc. Production code; central-gui AND
central-supervisor restart (adapter contract + enrichment behavior change).

NWIS events rendered as a bare "Water reading: 111 ft3/s" with an empty Location
column -- an operator couldn't tell where the gauge is or whether 111 ft3/s is
drought-low, normal, or near-flood. Coordinates were present but the reverse
geocoder returns null city/state/county for rural gauge points, and USGS site +
percentile data was never fetched. v0.8.0 fetches both.

Approach B (adapter-owned, per the proposal decision): the NWIS adapter -- which
already owns the USGS APIs -- fetches site metadata and daily stats itself and
writes two provenance bundles under event.data["_enriched"]:
- usgs_site {name, lat, lon, state, county} from the OGC monitoring-locations
  item-by-id (the API family the adapter already speaks; JSON, no RDB parser).
- usgs_stats {value, percentile, class_label, severity_band, p10..p90, record_max,
  count, period} from the legacy RDB daily-statistics service (the OGC API has no
  stats endpoint). USGS percentiles are % of days at-or-below, so higher = higher
  flow; classified to the WaterWatch bands -> severity 0-4 (record=4, much
  above/below=3, above/below=2, normal=1; None reserved for "no stats", distinct
  from a normal-flow gauge). Severity is set on the event, so it drives the v0.7.1
  severity chip-picker filter + v0.7.2 map-marker opacity.
- new nwis_enrich.py: pure parse/classify/percentile/band helpers + a sqlite
  SiteStatsCache (site TTL 365d, stats TTL 90d -- one fetch per site+param serves
  every reading for the window, so a warm cache makes zero USGS calls). USGS down
  -> cached-if-present else all-null bundle; the event still publishes.

Framework: the single agreed generic change -- supervisor apply_enrichment now
MERGES into _enriched instead of overwriting, so the still-global geocoder phase
doesn't clobber the adapter's bundles. No other adapter writes _enriched, so this
is inert for them.

GUI: _event_summaries/nwis.html -> "<site> -- <value> <units> (<band>, <Nth>
percentile)", with graceful fallback to "<site> -- <value>" then the bare
"Water reading:". _event_rows/nwis.html detail gains site/normalcy/typical/location
rows. _events_rows.html Location column falls back generically to any
_enriched.<source> carrying state/county when the geocoder is null (works for
future enrichers). events.json contract unchanged (additions under _enriched only).

conftest isolate_enrichment_cache also redirects NWIS_CACHE_DB_PATH off the prod
path (unprivileged-user test isolation). Adds tests/test_nwis_enrichment.py (28
tests: parse, band edges incl P0/P9/P10/P75/P90/record, percentile interpolation,
cache hit/miss/expire, adapter enrich + graceful-null + cache-hit-no-refetch,
summary rendering per band).

Full suite: 710 passed, 1 skipped (central and unprivileged zvx, 3x each).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Matt Johnson 2026-05-25 15:30:19 +00:00
commit 8612f0b75d
9 changed files with 729 additions and 19 deletions

View file

@ -22,14 +22,24 @@ def isolate_enrichment_cache(tmp_path, monkeypatch):
so without this fixture the suite writes to (or, for any user without write
access to /var/lib/central, fails on) the live cache. Point it at a
per-test temp dir so no test ever touches the production path.
Also redirects the NWIS adapter's site/stats cache (v0.8.0,
`central.adapters.nwis.NWIS_CACHE_DB_PATH`, same /var/lib/central prod
default) for the same reason NWISAdapter.startup() opens it.
"""
import central.supervisor as supervisor_mod
import central.adapters.nwis as nwis_mod
monkeypatch.setattr(
supervisor_mod,
"ENRICHMENT_CACHE_DB_PATH",
tmp_path / "enrichment_cache.db",
)
monkeypatch.setattr(
nwis_mod,
"NWIS_CACHE_DB_PATH",
tmp_path / "nwis_cache.db",
)
@pytest.fixture(scope="session")