central

matt/central

Fork 0

mirror of https://github.com/zvx-echo6/central.git synced 2026-05-21 18:14:44 +02:00

Commit graph

Author	SHA1	Message	Date
Matt Johnson	98b050b2af	feat(3-K): real geocoder backends + producer-doc reframe + consumer-doc enrichment Second of three PRs for v0.5.0 (J shipped the framework; this fills in real backends + documents the reframed design principle in-tree; L is the events tab + map fix, then tag). Backends (all satisfy GeocoderBackend; never raise, all-null on any failure): - NaviBackend — composed Navi /api/reverse/<lat>/<lon> (name/address + timezone + landclass + elevation in one call). Near-passthrough: response already matches the canonical 9-field shape. Best-effort warmup ping (Boise) on construction when a loop is running; config `headers` slot for a future Authorization: Bearer (config-only, no code change). Default base_url http://192.168.1.130:8440. - PhotonBackend — raw Photon /reverse?lat&lon&limit=1 (name/address only). Maps features[0].properties; postal_code <- postcode; timezone/landclass/ elevation_m null (Navi-composed-endpoint extras). - NominatimBackend — OSM Nominatim /reverse?format=jsonv2 (name/address only). Configurable rate limit (default 1/sec; 0 disables for self-hosted) + required User-Agent. Maps the address block; landclass/elevation_m/timezone null. Registered all three in supervisor _BACKEND_REGISTRY (resolved by EnrichmentConfig backend_class name). Docs — design pivot now in-tree: - PRODUCER §2 reframed: the verbatim Matt quote stays; the translation inverts. Central is the consumer's only data plane (consumers can't do follow-up lookups), so enrich deliberately and centrally, namespaced under _enriched, failing to null. "No enrichment" is gone. - PRODUCER §10.1 inverted: enrichment is expected; the anti-pattern is doing it OUTSIDE the framework (inline in poll(), bypassing cache + _enriched namespacing + the never-raise safety net). - PRODUCER new §13 Enrichment contract: Enricher / GeocoderEnricher / GeocoderBackend Protocols, NoOpBackend default, sqlite cache + TTL + cache-all-null + don't-cache-on-raise semantics, _enriched.<name> provenance, per-field coverage matrix (cross-checked against GEOCODER_FIELDS), and the landclass antimeridian known wrinkle. - CONSUMER FIRMS section: documents the data._enriched.geocoder bundle (9 fields), per-region coverage (US-full, non-US timezone+elevation), and the antimeridian landclass caveat. Tests: - test_navi/photon/nominatim_backend.py — happy-path field mapping, null handling, extra-key drop, network/timeout/non-200/malformed -> all-null (never raises), Nominatim rate-limit (disabled + spacing) + User-Agent. Env-gated live Navi smoke (NAVI_INTEGRATION_TEST=1; skipped by default — the 192.168.1.130 endpoint isn't reachable from CT104's segment). - test_producer_doc.py — +4: §2 verbatim quote present, §10.1 subsection exists, §13 names all four protocol types, §13 coverage matrix == GEOCODER_FIELDS (derived from code, not hardcoded). Verification: full pytest 525 passed, 1 skipped (was 495; +30 backend + 4 doc tests, -1 the env-gated skip). grep subject_for_event/_ADAPTER_REGISTRY clean. All three backends import + resolve via the registry. Flagged for later (NOT done here): adapters besides FIRMS that should declare enrichment_locations (nwis, eonet, gdacs, usgs_quake, wfigs_*) — that's PR L scope alongside the events tab. See PR description. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 16:10:44 +00:00
malice	d92074b134	docs(2-H): consumer integration spec — docs/CONSUMER-INTEGRATION.md (#38 ) Adds the consumer contract for Central's NATS event streams. Primary reader: a Claude Code instance building MeshAI's ingestion layer. The doc IS the spec -- no "see source for details". Opens with Matt's framing: "Central takes it all and gives it all. It's up to the pipe to do with it what it will." Central is a faithful firehose -- adapters preserve every upstream field with no enrichment / formatting / opinionated translation. The CloudEvents envelope adds routing + dedup support; everything else is upstream-shaped. Where the doc lists upstream lookup endpoints for ID-only fields, that is consumer-side convenience -- explicitly NOT a recommendation that Central enrich. Sections (11 total): 1. Quick start (5-line nats-py subscribe-and-print) 2. Connection details (URL / auth / JetStream context / stream discovery) 3. Stream layout (7 streams, derived from streams.py registry) 4. Subject namespace registry (Mermaid tree + full pattern table) 5. Wire format (5a CloudEvents envelope; 5b inner Event payload) -- explicit callout that geo.centroid is [lon, lat] GeoJSON, NOT [lat, lon] 6. Per-adapter reference (12 subsections, locked template) 7. Fall-off / removal semantics (explicit subjects vs absence-as-signal) 8. Consumer patterns (durable vs ephemeral, ack/nack/term, worked example) 9. Dedup implementation guide (single-token vs composite-key adapters) 10. Writing a new consumer checklist 11. Troubleshooting Doc length: 1878 lines (target was 600-1000 originally; revised to 1200-1800 once full-fidelity JSON examples + inciweb 3x narratives + wfigs_perimeters polygon were folded in). Completeness wins per the design principle. Every JSON example is verbatim from CT104. 11 examples sourced from /tmp/nwis-build/evidence.txt (dumped via psql jsonb_pretty); the wfigs_perimeters example is a freshly pulled smallest-active-polygon record so the doc captures the live polygon shape without flooding the page with thousands of coordinate pairs. The doc is assembled by /tmp/nwis-build/build_doc.py which splices live JSON blocks into a markdown template. The build script is local-only (not committed) because the doc itself is the artifact; future updates regenerate by re-pulling live evidence and re-running the assembler. New test: tests/test_consumer_doc.py (5 tests). Parses the doc and asserts: - The "Stream layout" table matches central.streams.STREAMS exactly (stream names + subject filters). - The (name, subject_filter) pairs match the registry as pairs (catches swapped subject filters on existing streams). - Every adapter discovered via central.adapter_discovery.discover_adapters() has a per-adapter subsection -- and vice versa. - The subsection count equals the registry size (catches duplicates). Verification: - 463/463 full suite green (was 458; +5 new consumer_doc tests). - Doc structure: 1 H1, 12 H2, 33 H3, 12 per-adapter sections, 1 mermaid block, 12 JSON blocks (all parse). - All 12 adapters covered. - No regressions elsewhere. Acceptance bars (a)-(e) verbatim: (a) grep "subject_for_event\|_ADAPTER_REGISTRY" -> empty (b) all 12 adapters have per-adapter subsections (c) 5/5 consumer-doc tests pass (d) 463/463 full suite (e) doc length 1878 lines markdownlint was not available on CT104; substituted an inline Python sanity check confirming code-fence balance, JSON-block validity, and structural integrity (12 H2 / 33 H3 / 1 mermaid). Co-authored-by: zvx <zvx@central>	2026-05-19 14:33:51 -06:00

Author

SHA1

Message

Date

Matt Johnson

98b050b2af

feat(3-K): real geocoder backends + producer-doc reframe + consumer-doc enrichment

Second of three PRs for v0.5.0 (J shipped the framework; this fills in real
backends + documents the reframed design principle in-tree; L is the events
tab + map fix, then tag).

Backends (all satisfy GeocoderBackend; never raise, all-null on any failure):
- NaviBackend — composed Navi /api/reverse/<lat>/<lon> (name/address + timezone
  + landclass + elevation in one call). Near-passthrough: response already
  matches the canonical 9-field shape. Best-effort warmup ping (Boise) on
  construction when a loop is running; config `headers` slot for a future
  Authorization: Bearer (config-only, no code change). Default base_url
  http://192.168.1.130:8440.
- PhotonBackend — raw Photon /reverse?lat&lon&limit=1 (name/address only).
  Maps features[0].properties; postal_code <- postcode; timezone/landclass/
  elevation_m null (Navi-composed-endpoint extras).
- NominatimBackend — OSM Nominatim /reverse?format=jsonv2 (name/address only).
  Configurable rate limit (default 1/sec; 0 disables for self-hosted) +
  required User-Agent. Maps the address block; landclass/elevation_m/timezone
  null.

Registered all three in supervisor _BACKEND_REGISTRY (resolved by EnrichmentConfig
backend_class name).

Docs — design pivot now in-tree:
- PRODUCER §2 reframed: the verbatim Matt quote stays; the translation inverts.
  Central is the consumer's only data plane (consumers can't do follow-up
  lookups), so enrich deliberately and centrally, namespaced under _enriched,
  failing to null. "No enrichment" is gone.
- PRODUCER §10.1 inverted: enrichment is expected; the anti-pattern is doing it
  OUTSIDE the framework (inline in poll(), bypassing cache + _enriched
  namespacing + the never-raise safety net).
- PRODUCER new §13 Enrichment contract: Enricher / GeocoderEnricher /
  GeocoderBackend Protocols, NoOpBackend default, sqlite cache + TTL +
  cache-all-null + don't-cache-on-raise semantics, _enriched.<name> provenance,
  per-field coverage matrix (cross-checked against GEOCODER_FIELDS), and the
  landclass antimeridian known wrinkle.
- CONSUMER FIRMS section: documents the data._enriched.geocoder bundle (9
  fields), per-region coverage (US-full, non-US timezone+elevation), and the
  antimeridian landclass caveat.

Tests:
- test_navi/photon/nominatim_backend.py — happy-path field mapping, null
  handling, extra-key drop, network/timeout/non-200/malformed -> all-null
  (never raises), Nominatim rate-limit (disabled + spacing) + User-Agent.
  Env-gated live Navi smoke (NAVI_INTEGRATION_TEST=1; skipped by default — the
  192.168.1.130 endpoint isn't reachable from CT104's segment).
- test_producer_doc.py — +4: §2 verbatim quote present, §10.1 subsection exists,
  §13 names all four protocol types, §13 coverage matrix == GEOCODER_FIELDS
  (derived from code, not hardcoded).

Verification: full pytest 525 passed, 1 skipped (was 495; +30 backend +
4 doc tests, -1 the env-gated skip). grep subject_for_event/_ADAPTER_REGISTRY
clean. All three backends import + resolve via the registry.

Flagged for later (NOT done here): adapters besides FIRMS that should declare
enrichment_locations (nwis, eonet, gdacs, usgs_quake, wfigs_*) — that's PR L
scope alongside the events tab. See PR description.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-20 16:10:44 +00:00

malice

d92074b134

docs(2-H): consumer integration spec — docs/CONSUMER-INTEGRATION.md (#38 )

Adds the consumer contract for Central's NATS event streams. Primary reader:
a Claude Code instance building MeshAI's ingestion layer. The doc IS the spec --
no "see source for details".

Opens with Matt's framing: "Central takes it all and gives it all. It's up to
the pipe to do with it what it will." Central is a faithful firehose --
adapters preserve every upstream field with no enrichment / formatting /
opinionated translation. The CloudEvents envelope adds routing + dedup support;
everything else is upstream-shaped. Where the doc lists upstream lookup
endpoints for ID-only fields, that is consumer-side convenience -- explicitly
NOT a recommendation that Central enrich.

Sections (11 total):
  1. Quick start (5-line nats-py subscribe-and-print)
  2. Connection details (URL / auth / JetStream context / stream discovery)
  3. Stream layout (7 streams, derived from streams.py registry)
  4. Subject namespace registry (Mermaid tree + full pattern table)
  5. Wire format (5a CloudEvents envelope; 5b inner Event payload)
     -- explicit callout that geo.centroid is [lon, lat] GeoJSON, NOT [lat, lon]
  6. Per-adapter reference (12 subsections, locked template)
  7. Fall-off / removal semantics (explicit subjects vs absence-as-signal)
  8. Consumer patterns (durable vs ephemeral, ack/nack/term, worked example)
  9. Dedup implementation guide (single-token vs composite-key adapters)
  10. Writing a new consumer checklist
  11. Troubleshooting

Doc length: 1878 lines (target was 600-1000 originally; revised to 1200-1800
once full-fidelity JSON examples + inciweb 3x narratives + wfigs_perimeters
polygon were folded in). Completeness wins per the design principle.

Every JSON example is verbatim from CT104. 11 examples sourced from
/tmp/nwis-build/evidence.txt (dumped via psql jsonb_pretty); the wfigs_perimeters
example is a freshly pulled smallest-active-polygon record so the doc captures
the live polygon shape without flooding the page with thousands of coordinate
pairs.

The doc is assembled by /tmp/nwis-build/build_doc.py which splices live JSON
blocks into a markdown template. The build script is local-only (not committed)
because the doc itself is the artifact; future updates regenerate by re-pulling
live evidence and re-running the assembler.

New test: tests/test_consumer_doc.py (5 tests). Parses the doc and asserts:
  - The "Stream layout" table matches central.streams.STREAMS exactly
    (stream names + subject filters).
  - The (name, subject_filter) pairs match the registry as pairs (catches
    swapped subject filters on existing streams).
  - Every adapter discovered via central.adapter_discovery.discover_adapters()
    has a per-adapter subsection -- and vice versa.
  - The subsection count equals the registry size (catches duplicates).

Verification:
  - 463/463 full suite green (was 458; +5 new consumer_doc tests).
  - Doc structure: 1 H1, 12 H2, 33 H3, 12 per-adapter sections, 1 mermaid block,
    12 JSON blocks (all parse).
  - All 12 adapters covered.
  - No regressions elsewhere.

Acceptance bars (a)-(e) verbatim:
  (a) grep "subject_for_event|_ADAPTER_REGISTRY" -> empty
  (b) all 12 adapters have per-adapter subsections
  (c) 5/5 consumer-doc tests pass
  (d) 463/463 full suite
  (e) doc length 1878 lines

markdownlint was not available on CT104; substituted an inline Python sanity
check confirming code-fence balance, JSON-block validity, and structural
integrity (12 H2 / 33 H3 / 1 mermaid).

Co-authored-by: zvx <zvx@central>

2026-05-19 14:33:51 -06:00

2 commits