Central - data hub spine. Adapters -> NATS/JetStream -> archive.
  • Python 90.5%
  • HTML 9.1%
  • PLpgSQL 0.4%
Matt Johnson d960d1f5e0 feat(3-J): enrichment framework + GeocoderEnricher + NoOpBackend + FIRMS pilot
First of three PRs for v0.5.0 (J: framework; K: real geocoder backends +
doc revisions; L: operator events tab + per-adapter render + events-map fix).

Design pivot: the Phase 2 "no enrichment, upstream verbatim" reading of
Matt's principle is reframed — consumers can't do follow-up lookups, they
only see what's on the wire, so whatever Central doesn't enrich is
effectively missing downstream. Enrichment is now expected. The producer-doc
§2/§10.1 rewrite lands in PR K; this PR builds the framework PR K documents.

New package src/central/enrichment/:
- base.py        Enricher Protocol (name + async enrich(location) -> dict).
- geocoder.py    GeocoderEnricher + GeocoderBackend Protocol + the locked
                 GEOCODER_FIELDS set (name, city, county, state, country,
                 postal_code, timezone, landclass, elevation_m) + all_null_bundle().
- cache.py       EnrichmentCache — stdlib sqlite3 off the event loop via
                 asyncio.to_thread (no async-sqlite dep). Keyed on
                 (enricher_name, lat_4dp, lon_4dp); per-enricher TTL (24h
                 default); fresh connection per op (sqlite3 connections are
                 bound to their creating thread by default). All-null results
                 are cached; backend failures never are.
- backends/no_op.py  NoOpBackend — all-null bundle, the PR J default.

Provenance: enrichment results land under event.data["_enriched"][<name>];
everything else in data stays upstream verbatim.
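
The provenance rule can be illustrated with a small sketch. Only the
`_enriched` key and the per-enricher sub-key come from the text above; the
`apply_enrichment` signature shown here, the `StubEnricher` class, and the
`brightness` field are hypothetical, and a single location pair is assumed.

```python
import asyncio
from typing import Any


class StubEnricher:
    """Hypothetical stand-in for a real enricher; returns a fixed bundle."""

    name = "geocoder"

    async def enrich(self, location: tuple[Any, Any]) -> dict[str, Any]:
        return {"city": None, "country": None}


async def apply_enrichment(
    data: dict[str, Any], enrichers: list[Any], lat_key: str, lon_key: str
) -> dict[str, Any]:
    """Return a copy of `data` with results added under data["_enriched"][name]."""
    location = (data.get(lat_key), data.get(lon_key))
    enriched: dict[str, Any] = {}
    for enricher in enrichers:
        enriched[enricher.name] = await enricher.enrich(location)
    # Upstream keys are copied through untouched; only "_enriched" is new.
    return {**data, "_enriched": enriched}


upstream = {"latitude": 34.05, "longitude": -118.25, "brightness": 330.1}
event_data = asyncio.run(
    apply_enrichment(upstream, [StubEnricher()], "latitude", "longitude")
)
```

Keeping enrichment in its own namespace means a consumer can always tell
upstream values from values Central added.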

Wiring:
- adapter.py        enrichment_locations: list[tuple[str,str]] = [] class attr.
                    Empty (default) = publish as-is, no enrichment.
- config_models.py  EnrichmentConfig (enricher_class, backend_class,
                    backend_settings, cache_ttl_s). Read once at startup.
- supervisor.py     build_enrichers() + apply_enrichment(); enrichment runs
                    after dedup, before wrap_event, in the poll loop. Class-name
                    registries for enricher/backend resolution (PR K extends).
- firms.py          enrichment_locations = [("latitude","longitude")] — pilot.

Enrichment config is read once at supervisor startup; hot-reload is out of
scope for PR J (noted in EnrichmentConfig + build_enrichers docstrings).
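
One way the class-name registries and startup-only config could look, as a
sketch under stated assumptions. The four EnrichmentConfig field names, the
24h default TTL, and NoOpBackend as default come from this message; the use of
a dataclass, the registry dict names, and the constructor signatures are
illustrative and may not match the real config_models.py or supervisor.py.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class EnrichmentConfig:
    # Field names from the commit message; defaults are assumptions
    # (NoOpBackend is described as the PR J default, TTL default is 24h).
    enricher_class: str = "GeocoderEnricher"
    backend_class: str = "NoOpBackend"
    backend_settings: dict[str, Any] = field(default_factory=dict)
    cache_ttl_s: int = 86_400


class NoOpBackend:
    def __init__(self, **settings: Any) -> None:
        self.settings = settings


class GeocoderEnricher:
    name = "geocoder"

    def __init__(self, backend: Any, cache_ttl_s: int) -> None:
        self.backend = backend
        self.cache_ttl_s = cache_ttl_s


# Hypothetical registries: class name (as written in config) -> class object.
BACKENDS: dict[str, type] = {"NoOpBackend": NoOpBackend}
ENRICHERS: dict[str, type] = {"GeocoderEnricher": GeocoderEnricher}


def build_enrichers(configs: list[EnrichmentConfig]) -> list[Any]:
    """Resolve class names once at startup; unknown names fail fast."""
    built = []
    for cfg in configs:
        try:
            backend_cls = BACKENDS[cfg.backend_class]
            enricher_cls = ENRICHERS[cfg.enricher_class]
        except KeyError as exc:
            raise ValueError(f"unknown enrichment class: {exc.args[0]}") from exc
        backend = backend_cls(**cfg.backend_settings)
        built.append(enricher_cls(backend, cfg.cache_ttl_s))
    return built


enrichers = build_enrichers([EnrichmentConfig()])
```

Resolving names through a dict rather than importing by dotted path keeps the
set of loadable classes explicit, at the cost of one registry entry per new
backend.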

Tests (16 new):
- test_enrichment_framework.py (9): parent-dir/table init, cache miss->hit,
  TTL expiry, 4dp rounding, nearby-coord collapse, concurrent-set single-row,
  backend-failure all-null-not-cached (retries), success cached (one backend
  call), all-null cached.
- test_geocoder_enricher.py (5): NoOp all-null, field-set == GEOCODER_FIELDS,
  null-coords short-circuit (no backend call), name=="geocoder", sequential
  same-coords single backend call.
- test_firms.py (+2): enrichment_locations declared + paths resolve to floats
  in a real event (structural, not literal); event through supervisor
  apply_enrichment emerges with data._enriched.geocoder == all-null bundle.

Verification: full pytest 495 passed (was 479; +16). grep for
subject_for_event/_ADAPTER_REGISTRY clean. Module imports cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 04:39:49 +00:00
docs             docs(2-I): producer integration spec — docs/PRODUCER-INTEGRATION.md  2026-05-19 21:17:48 +00:00
etc-templates    scaffold: initial repository structure  2026-05-15 19:16:24 +00:00
scripts          scaffold: initial repository structure  2026-05-15 19:16:24 +00:00
sql              feat(2-G): USGS NWIS adapter (OGC API) + CENTRAL_HYDRO stream  2026-05-19 16:50:21 +00:00
src/central      feat(3-J): enrichment framework + GeocoderEnricher + NoOpBackend + FIRMS pilot  2026-05-20 04:39:49 +00:00
systemd          feat(gui): add auth core, setup gate, and first-run operator creation  2026-05-17 05:30:49 +00:00
tests            feat(3-J): enrichment framework + GeocoderEnricher + NoOpBackend + FIRMS pilot  2026-05-20 04:39:49 +00:00
.gitattributes   chore: normalize line endings to LF  2026-05-16 22:26:12 +00:00
.gitignore       feat(gui): add auth core, setup gate, and first-run operator creation  2026-05-17 05:30:49 +00:00
.python-version  foundation: models, adapter ABC, config, CE wire, schema  2026-05-15 21:08:56 +00:00
CHANGELOG.md     docs: add v0.3.0 changelog entry and network bindings reference (#29)  2026-05-18 14:26:09 -06:00
LICENSE          scaffold: initial repository structure  2026-05-15 19:16:24 +00:00
pyproject.toml   release: bump version to 0.3.0 (#30)  2026-05-18 14:29:28 -06:00
README.md        docs: add test database setup, restore geom to test fixture  2026-05-17 18:26:48 +00:00
uv.lock          feat(gui): add auth core, setup gate, and first-run operator creation  2026-05-17 05:30:49 +00:00

Central

Central is the data hub spine for the infrastructure. Adapters normalize upstream sources into a canonical event shape, publish CloudEvents to NATS/JetStream, and archive to TimescaleDB for historical query. Single-LXC deployment.
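
As a rough illustration of the publish step, the sketch below wraps adapter
output in a CloudEvents 1.0 JSON envelope using only the standard library.
The required attribute names (specversion, id, source, type) are from the
CloudEvents spec; the `wrap_event` signature, the source path, and the event
type string are hypothetical and may not match Central's actual wire format.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Any


def wrap_event(adapter: str, event_type: str, data: dict[str, Any]) -> dict[str, Any]:
    """Build a CloudEvents 1.0 envelope around a normalized event payload."""
    return {
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": f"/central/adapters/{adapter}",  # hypothetical URI-reference
        "type": event_type,                        # hypothetical type naming
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        "data": data,
    }


envelope = wrap_event(
    "firms", "central.fire.detection", {"latitude": 34.05, "longitude": -118.25}
)
wire_bytes = json.dumps(envelope).encode("utf-8")  # payload handed to the bus
```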

Status

Phase 0 — scaffold. Not yet operational.

Architecture

  • Python 3.12 (uv-managed)
  • NATS + JetStream for live event bus
  • TimescaleDB + PostGIS for archive and geospatial query
  • One supervisor process managing adapter lifecycle
  • One archive consumer process persisting events to TimescaleDB
  • Both processes systemd-managed

Testing

See docs/test-database.md for test database setup.

License

MIT. See LICENSE.