Mirror of https://github.com/zvx-echo6/central.git, synced 2026-06-10 11:54:37 +02:00
Central - data hub spine. Adapters -> NATS/JetStream -> archive.
- Python 89.1%
- HTML 9.3%
- CSS 1.3%
- PLpgSQL 0.3%
itd_511's free-text Comment field carries a milepost in roughly a third of
the live samples ('milepost 32.5', 'MP 80 to MP 81', etc.). meshai's roads
integration needs that as a structured field. wzdx and tomtom_incidents
already provide structured milepost / from-to fields, so itd_511 is the only
adapter that needs the regex extraction layer.
Design (per Step-0 review):
- Shared module src/central/enrichment/mile_marker.py exporting
extract(text) -> {value, source, confidence} | None. Pure regex, no I/O,
re-usable by future per-state-DOT adapters (Wyoming, Montana, ...).
- itd_511 calls extract on the Comment in _build_event_record; result lands
under the established _enriched namespace (NOT a new _enrichment one),
keyed 'mile_marker'. Same convention the supervisor's geocoder uses, same
merge semantics apply_enrichment already supports. Absent when no match
(no null placeholder) so subscribers can tell 'not mentioned' from
'extraction found nothing'.
- Confidence tiers: 'high' (single unambiguous MP/milepost/MM match),
'medium' (multiple matches like 'MP 80 to MP 81' -- first wins), 'low'
(bare 'mile N' only; consumers can ignore).
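A minimal sketch of how the extract contract and the three confidence tiers above might fit together. This is not the code in src/central/enrichment/mile_marker.py: the regex patterns are illustrative guesses, and `source` is assumed here to hold the matched substring, which the commit message does not specify.

```python
import re
from typing import Optional

# Illustrative patterns only; the real module's regexes are not shown in the commit.
# Keyword forms: milepost / mile post / mile marker / M.P. / MP / MM, then a number.
_STRONG = re.compile(
    r"\b(?:mile\s*post|mile\s*marker|m\.p\.|mp|mm)\s*(\d+(?:\.\d+)?)",
    re.IGNORECASE,
)
# Bare form: "mile 14" (does not match "milepost 14" or "miles 14").
_BARE = re.compile(r"\bmile\s+(\d+(?:\.\d+)?)", re.IGNORECASE)


def extract(text: Optional[str]) -> Optional[dict]:
    """Return {value, source, confidence} for the first milepost found, else None."""
    if not text:
        return None
    strong = list(_STRONG.finditer(text))
    if strong:
        first = strong[0]  # with several matches, the first one wins
        return {
            "value": float(first.group(1)),
            "source": first.group(0),  # assumption: the matched substring
            "confidence": "high" if len(strong) == 1 else "medium",
        }
    bare = _BARE.search(text)
    if bare:
        return {
            "value": float(bare.group(1)),
            "source": bare.group(0),
            "confidence": "low",
        }
    return None
```

Numbers without a milepost keyword (phone numbers, highway numbers) fall through both patterns, so extract returns None for them.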
Tests:
- tests/test_enrichment_mile_marker.py: 36 cases parametrized over the 15
real ITD comments I pulled from CENTRAL_TRAFFIC, including the critical
red-herring classes the regex must reject (phone numbers, project key
numbers, state-highway numbers, date/time numbers). Crafted samples
cover M.P. / MM / milemarker / bare-mile patterns not in live ITD data
but required by spec for future DOT adapters.
- tests/test_itd_511.py: 2 integration tests confirming the bundle is
attached on a milepost-bearing Comment and absent otherwise.
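The behaviour those two integration tests check could be sketched as follows. `build_event_record` below is a hypothetical stand-in for the adapter's _build_event_record, the record fields are invented for illustration, and the extractor is passed in as a parameter so the example stays self-contained. Whether the real adapter omits only the 'mile_marker' key or the whole _enriched mapping when nothing matches is an assumption here.

```python
from typing import Callable, Optional


def build_event_record(
    raw: dict, extractor: Callable[[str], Optional[dict]]
) -> dict:
    """Hypothetical record builder: attach the bundle only when extraction matches."""
    record = {"id": raw.get("id"), "comment": raw.get("Comment", "")}
    bundle = extractor(record["comment"])
    if bundle is not None:
        # Established namespace is _enriched, keyed 'mile_marker'.
        record.setdefault("_enriched", {})["mile_marker"] = bundle
    # No match: nothing is written, so no null placeholder appears.
    return record


def fake_extract(text: str) -> Optional[dict]:
    """Stand-in extractor for the example; the real one is regex-based."""
    if "milepost 32.5" in text:
        return {"value": 32.5, "source": "milepost 32.5", "confidence": "high"}
    return None


with_mp = build_event_record(
    {"id": "a1", "Comment": "Paving at milepost 32.5"}, fake_extract
)
without_mp = build_event_record(
    {"id": "a2", "Comment": "Shoulder work, expect delays"}, fake_extract
)
```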
Pure enrichment, no schema-breaking changes; meshai's renderer picks it up
additively.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Repository contents:
- docs
- etc-templates
- scripts
- sql
- src/central
- systemd
- tests
- .gitattributes
- .gitignore
- .python-version
- CHANGELOG.md
- LICENSE
- pyproject.toml
- README.md
- uv.lock
Central
Central is the data hub spine for the infrastructure. Adapters normalize upstream sources into a canonical event shape, publish CloudEvents to NATS/JetStream, and archive to TimescaleDB for historical query. Single-LXC deployment.
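As a rough illustration of the "canonical event to CloudEvent" step, the sketch below wraps an adapter's event in a CloudEvents 1.0 envelope using only the standard library. The `to_cloudevent` helper, the `source` and `type` naming scheme, and the payload fields are assumptions for the example, not Central's actual conventions. Publishing to NATS/JetStream is left out.

```python
import json
import uuid
from datetime import datetime, timezone


def to_cloudevent(adapter: str, event: dict) -> dict:
    """Wrap a normalized event in a CloudEvents 1.0 structured-mode envelope."""
    return {
        # Required CloudEvents attributes: specversion, id, source, type.
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": f"central/adapters/{adapter}",  # hypothetical naming scheme
        "type": f"central.{adapter}.event",  # hypothetical naming scheme
        # Optional attributes.
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        "data": event,
    }


envelope = to_cloudevent("itd_511", {"comment": "Paving at milepost 32.5"})
payload = json.dumps(envelope).encode("utf-8")  # bytes ready for a message bus
```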
Status
Phase 0 — scaffold. Not yet operational.
Architecture
- Python 3.12 (uv-managed)
- NATS + JetStream for live event bus
- TimescaleDB + PostGIS for archive and geospatial query
- One supervisor process managing adapter lifecycle
- One archive consumer process persisting events to TimescaleDB
- Both processes systemd-managed
Testing
See docs/test-database.md for test database setup.
License
MIT. See LICENSE.