itd_511's free-text Comment field carries a milepost in roughly a third of
the live samples ('milepost 32.5', 'MP 80 to MP 81', etc.). meshai's roads
integration needs that as a structured field; wzdx and tomtom_incidents
already expose structured milepost / from-to fields, so itd_511 is the only
adapter that needs the regex extraction layer.
Design (per Step-0 review):
- Shared module src/central/enrichment/mile_marker.py exporting
extract(text) -> {value, source, confidence} | None. Pure regex, no I/O,
re-usable by future per-state-DOT adapters (Wyoming, Montana, ...).
- itd_511 calls extract on the Comment in _build_event_record; result lands
under the established _enriched namespace (NOT a new _enrichment one),
keyed 'mile_marker'. Same convention the supervisor's geocoder uses, same
merge semantics apply_enrichment already supports. Absent when no match
(no null placeholder) so subscribers can tell 'not mentioned' from
'extraction found nothing'.
- Confidence tiers: 'high' (single unambiguous MP/milepost/MM match),
'medium' (multiple matches like 'MP 80 to MP 81' -- first wins), 'low'
(bare 'mile N' only; consumers can ignore).
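A minimal sketch of the extract contract and the three confidence tiers,
for illustration only. The patterns, the float value type, and the 'regex'
source string are assumptions; the regexes in mile_marker.py are not
reproduced in this message and may differ.

```python
import re
from typing import Optional

# Hypothetical patterns, not the ones shipped in mile_marker.py.
# Keyword forms: milepost / mile post / milemarker / mile marker / M.P. / MP / MM.
_KEYWORD = re.compile(
    r"\b(?:mile\s?post|mile\s?marker|M\.P\.|MP|MM)\s*(\d+(?:\.\d+)?)",
    re.IGNORECASE,
)
# Weak form: bare 'mile N' with no post/marker keyword.
_BARE_MILE = re.compile(r"\bmile\s+(\d+(?:\.\d+)?)", re.IGNORECASE)


def extract(text: str) -> Optional[dict]:
    """Return {value, source, confidence}, or None when nothing matches."""
    keyword_hits = _KEYWORD.findall(text)
    if keyword_hits:
        # One hit is 'high'; several (e.g. a from-to range) is 'medium',
        # and the first value wins.
        confidence = "high" if len(keyword_hits) == 1 else "medium"
        return {
            "value": float(keyword_hits[0]),
            "source": "regex",  # assumed label
            "confidence": confidence,
        }
    bare = _BARE_MILE.search(text)
    if bare:
        return {"value": float(bare.group(1)), "source": "regex", "confidence": "low"}
    return None
```

Numbers with no milepost keyword in front of them (phone numbers, highway
numbers, dates) never reach either pattern, which is how this sketch rejects
the red-herring classes.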
Tests:
- tests/test_enrichment_mile_marker.py: 36 cases parametrized over the 15
real ITD comments I pulled from CENTRAL_TRAFFIC, including the critical
red-herring classes the regex must reject (phone numbers, project key
numbers, state-highway numbers, date/time numbers). Crafted samples
cover M.P. / MM / milemarker / bare-mile patterns not in live ITD data
but required by spec for future DOT adapters.
- tests/test_itd_511.py: 2 integration tests confirming the bundle is
attached on a milepost-bearing Comment and absent otherwise.
Pure enrichment, no schema-breaking changes; meshai's renderer picks it up
additively.
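The additive attachment could look roughly like the sketch below. The helper
name attach_mile_marker and its signature are hypothetical; the real call
site is _build_event_record and the real merge goes through
apply_enrichment, neither of which is shown here.

```python
from typing import Callable, Optional


def attach_mile_marker(
    record: dict,
    comment: Optional[str],
    extract: Callable[[str], Optional[dict]],
) -> dict:
    """Set record['_enriched']['mile_marker'] only when extraction matches."""
    result = extract(comment) if comment else None
    if result is not None:
        # setdefault preserves keys already under _enriched (e.g. geocoder output).
        record.setdefault("_enriched", {})["mile_marker"] = result
    # On no match the key is simply absent; no null placeholder is written.
    return record
```

Because the key is only ever added, existing consumers that do not know
about 'mile_marker' see the same record shape as before.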
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>