chore: normalize line endings to LF

This commit is contained in:
Matt Johnson 2026-05-16 21:27:30 +00:00
commit 374a8c067f
26 changed files with 5357 additions and 5346 deletions

View file

@ -1,211 +1,211 @@
# Bug Investigation: Cadence Decrease Hot-Reload
**Date:** 2026-05-16
**Component:** central-supervisor
**File:** `supervisor.py`
---
## 1. Reproduction
### Test Case: Decrease 60s → 30s
```
Tlast (poll completed): 04:18:24Z
Config change applied: 04:18:30Z (approx)
Expected next poll: 04:18:54Z (Tlast + 30s)
Actual next poll: 04:19:24Z (Tlast + 60s - OLD cadence)
Subsequent polls: Also at 60s intervals
```
### Log Evidence
```json
{"ts": "...", "msg": "Rescheduled adapter", "adapter": "nws", "old_cadence_s": 60, "new_cadence_s": 30, "next_poll": "2026-05-16T04:18:54+00:00"}
```
- "Rescheduled adapter" log fires with **correct** calculated next_poll
- Actual poll occurs at OLD cadence time
- Subsequent polls continue at OLD cadence
### Contrast: Increase 60s → 90s (WORKS)
```
Tlast: 03:16:34Z
Config change: 03:16:36Z
Expected next poll: 03:18:04Z (Tlast + 90s)
Actual next poll: 03:18:04Z ✅
```
---
## 2. Root Cause
### Location
`supervisor.py` lines 395-450 (`_reschedule_adapter`) and lines 144-181 (`_run_adapter_loop`)
### The Bug
The `cancel_event.set()` call in `_reschedule_adapter` does not reliably wake the `asyncio.wait_for()` in the adapter loop when the cadence is **decreased**.
### Why It Happens
1. **Event handler holds lock during signal:**
```python
# _on_config_change (line 466)
async with self._lock:
new_config = await self._config_source.get_adapter(adapter_name)
# ...
await self._reschedule_adapter(adapter_name, new_config) # sets cancel_event here
```
2. **Reschedule updates config then signals:**
```python
# _reschedule_adapter
state.config = new_config # Line 420
state.adapter.cadence_s = new_cadence # Line 423
# ... logging ...
state.cancel_event.set() # Line 450 - inside lock context
```
3. **Asyncio event delivery delay:**
The `asyncio.Event.set()` queues a wakeup for waiting tasks, but the signal delivery is subject to asyncio's task scheduler. When called from within an `async with` block, the event may not be processed until the current task yields or the lock context exits.
4. **Timing difference between increase and decrease:**
- **Increase (60→90):** Loop has ~30-50s remaining sleep. Event signal arrives well before timeout.
- **Decrease (90→60):** Loop may be ~10s from timeout. By the time event signal is processed, timeout has already fired.
5. **Why subsequent polls use old cadence:**
When the loop times out naturally (rather than being woken by event), it proceeds to poll. After poll completes, `state.last_completed_poll` is updated. The loop then reads `state.config.cadence_s` for the NEXT iteration — but if `state.config` was somehow not durably updated (or there's a stale reference), it uses the old value.
**Alternative theory:** The `state.config = new_config` assignment creates a new config object, but the loop may be reading from a captured reference to the old object if there's any closure behavior we're not seeing.
---
## 3. Proposed Fix
### Option A: Force immediate reschedule (Recommended)
Move the cancel logic OUTSIDE the lock, and use a more aggressive wake pattern:
```python
async def _reschedule_adapter(self, name: str, new_config: AdapterConfig) -> None:
state = self._adapter_states.get(name)
if state is None or not state.is_running:
await self._start_adapter(new_config)
return
old_cadence = state.config.cadence_s
new_cadence = new_config.cadence_s
# Update config atomically
state.config = new_config
state.adapter.cadence_s = new_cadence
# ... (NWS-specific updates, logging) ...
# Cancel and wait for acknowledgment
state.cancel_event.set()
await asyncio.sleep(0) # Force task switch to process event
```
### Option B: Stop and restart the loop task
For cadence changes, stop the current loop task and create a new one:
```python
async def _reschedule_adapter(self, name: str, new_config: AdapterConfig) -> None:
state = self._adapter_states.get(name)
if state is None:
await self._start_adapter(new_config)
return
# Preserve last_completed_poll
preserved_poll = state.last_completed_poll
# Stop current loop
await self._stop_adapter(name)
# Update config
state.config = new_config
state.last_completed_poll = preserved_poll
# Restart loop
await self._start_adapter(new_config)
```
### Option C: Double-signal pattern
Set the event, yield, then set again to ensure delivery:
```python
state.cancel_event.set()
await asyncio.sleep(0)
state.cancel_event.set() # Redundant but ensures visibility
```
---
## 4. Test Gap
### Missing Tests
The test file `test_config_source_new.py` only tests ConfigSource behavior (list, get, protocol compliance). There are **no tests** for:
1. `_reschedule_adapter` interrupting a sleeping loop
2. Cadence decrease being applied mid-sleep
3. Cadence increase being applied mid-sleep
4. Rate-limit guarantee after reschedule
5. `cancel_event` mechanism in isolation
### Recommended Tests
```python
@pytest.mark.asyncio
async def test_cadence_decrease_applies_immediately():
"""Cadence decrease should wake sleeping loop and reschedule."""
# Setup: Adapter polling at 60s cadence
# Action: Change cadence to 30s while sleeping
# Assert: Next poll at last_poll + 30s, not last_poll + 60s
@pytest.mark.asyncio
async def test_cadence_increase_applies_on_next_cycle():
"""Cadence increase should wake sleeping loop and extend wait."""
# Setup: Adapter polling at 60s cadence
# Action: Change cadence to 90s while sleeping
# Assert: Next poll at last_poll + 90s
@pytest.mark.asyncio
async def test_cancel_event_wakes_sleeping_loop():
"""cancel_event.set() should interrupt asyncio.wait_for()."""
# Unit test for the event mechanism in isolation
```
---
## 5. State at End
### LXC State (Reverted)
- **Cadence in DB:** 60s ✅
- **Actual poll interval:** 60s ✅
- **Supervisor restarted:** 2026-05-16T04:43:40Z
- **Verified polls:**
```
04:43:40.964 - First poll after restart
04:44:41.171 - Second poll (61s later) ✅
```
### Mitigation Until Fix
After any cadence change (especially decrease), verify actual poll intervals. If incorrect, restart supervisor:
```bash
systemctl restart central-supervisor
```
---
## Summary
| Item | Details |
|------|---------|
| **Bug** | Cadence decrease hot-reload doesn't apply without restart |
| **Root cause** | `cancel_event.set()` inside lock context has delayed delivery |
| **Affects** | Cadence decreases only; increases work correctly |
| **Workaround** | Restart supervisor after cadence decrease |
| **Fix effort** | Low - add `await asyncio.sleep(0)` after event.set() |
| **Test coverage** | None for hot-reload mechanism |
# Bug Investigation: Cadence Decrease Hot-Reload
**Date:** 2026-05-16
**Component:** central-supervisor
**File:** `supervisor.py`
---
## 1. Reproduction
### Test Case: Decrease 60s → 30s
```
Tlast (poll completed): 04:18:24Z
Config change applied: 04:18:30Z (approx)
Expected next poll: 04:18:54Z (Tlast + 30s)
Actual next poll: 04:19:24Z (Tlast + 60s - OLD cadence)
Subsequent polls: Also at 60s intervals
```
### Log Evidence
```json
{"ts": "...", "msg": "Rescheduled adapter", "adapter": "nws", "old_cadence_s": 60, "new_cadence_s": 30, "next_poll": "2026-05-16T04:18:54+00:00"}
```
- "Rescheduled adapter" log fires with **correct** calculated next_poll
- Actual poll occurs at OLD cadence time
- Subsequent polls continue at OLD cadence
### Contrast: Increase 60s → 90s (WORKS)
```
Tlast: 03:16:34Z
Config change: 03:16:36Z
Expected next poll: 03:18:04Z (Tlast + 90s)
Actual next poll: 03:18:04Z ✅
```
---
## 2. Root Cause
### Location
`supervisor.py` lines 395-450 (`_reschedule_adapter`) and lines 144-181 (`_run_adapter_loop`)
### The Bug
The `cancel_event.set()` call in `_reschedule_adapter` does not reliably wake the `asyncio.wait_for()` in the adapter loop when the cadence is **decreased**.
### Why It Happens
1. **Event handler holds lock during signal:**
```python
# _on_config_change (line 466)
async with self._lock:
new_config = await self._config_source.get_adapter(adapter_name)
# ...
await self._reschedule_adapter(adapter_name, new_config) # sets cancel_event here
```
2. **Reschedule updates config then signals:**
```python
# _reschedule_adapter
state.config = new_config # Line 420
state.adapter.cadence_s = new_cadence # Line 423
# ... logging ...
state.cancel_event.set() # Line 450 - inside lock context
```
3. **Asyncio event delivery delay:**
The `asyncio.Event.set()` queues a wakeup for waiting tasks, but the signal delivery is subject to asyncio's task scheduler. When called from within an `async with` block, the event may not be processed until the current task yields or the lock context exits.
4. **Timing difference between increase and decrease:**
- **Increase (60→90):** Loop has ~30-50s remaining sleep. Event signal arrives well before timeout.
- **Decrease (90→60):** Loop may be ~10s from timeout. By the time event signal is processed, timeout has already fired.
5. **Why subsequent polls use old cadence:**
When the loop times out naturally (rather than being woken by event), it proceeds to poll. After poll completes, `state.last_completed_poll` is updated. The loop then reads `state.config.cadence_s` for the NEXT iteration — but if `state.config` was somehow not durably updated (or there's a stale reference), it uses the old value.
**Alternative theory:** The `state.config = new_config` assignment creates a new config object, but the loop may be reading from a captured reference to the old object if there's any closure behavior we're not seeing.
---
## 3. Proposed Fix
### Option A: Force immediate reschedule (Recommended)
Move the cancel logic OUTSIDE the lock, and use a more aggressive wake pattern:
```python
async def _reschedule_adapter(self, name: str, new_config: AdapterConfig) -> None:
state = self._adapter_states.get(name)
if state is None or not state.is_running:
await self._start_adapter(new_config)
return
old_cadence = state.config.cadence_s
new_cadence = new_config.cadence_s
# Update config atomically
state.config = new_config
state.adapter.cadence_s = new_cadence
# ... (NWS-specific updates, logging) ...
# Cancel and wait for acknowledgment
state.cancel_event.set()
await asyncio.sleep(0) # Force task switch to process event
```
### Option B: Stop and restart the loop task
For cadence changes, stop the current loop task and create a new one:
```python
async def _reschedule_adapter(self, name: str, new_config: AdapterConfig) -> None:
state = self._adapter_states.get(name)
if state is None:
await self._start_adapter(new_config)
return
# Preserve last_completed_poll
preserved_poll = state.last_completed_poll
# Stop current loop
await self._stop_adapter(name)
# Update config
state.config = new_config
state.last_completed_poll = preserved_poll
# Restart loop
await self._start_adapter(new_config)
```
### Option C: Double-signal pattern
Set the event, yield, then set again to ensure delivery:
```python
state.cancel_event.set()
await asyncio.sleep(0)
state.cancel_event.set() # Redundant but ensures visibility
```
---
## 4. Test Gap
### Missing Tests
The test file `test_config_source_new.py` only tests ConfigSource behavior (list, get, protocol compliance). There are **no tests** for:
1. `_reschedule_adapter` interrupting a sleeping loop
2. Cadence decrease being applied mid-sleep
3. Cadence increase being applied mid-sleep
4. Rate-limit guarantee after reschedule
5. `cancel_event` mechanism in isolation
### Recommended Tests
```python
@pytest.mark.asyncio
async def test_cadence_decrease_applies_immediately():
"""Cadence decrease should wake sleeping loop and reschedule."""
# Setup: Adapter polling at 60s cadence
# Action: Change cadence to 30s while sleeping
# Assert: Next poll at last_poll + 30s, not last_poll + 60s
@pytest.mark.asyncio
async def test_cadence_increase_applies_on_next_cycle():
"""Cadence increase should wake sleeping loop and extend wait."""
# Setup: Adapter polling at 60s cadence
# Action: Change cadence to 90s while sleeping
# Assert: Next poll at last_poll + 90s
@pytest.mark.asyncio
async def test_cancel_event_wakes_sleeping_loop():
"""cancel_event.set() should interrupt asyncio.wait_for()."""
# Unit test for the event mechanism in isolation
```
---
## 5. State at End
### LXC State (Reverted)
- **Cadence in DB:** 60s ✅
- **Actual poll interval:** 60s ✅
- **Supervisor restarted:** 2026-05-16T04:43:40Z
- **Verified polls:**
```
04:43:40.964 - First poll after restart
04:44:41.171 - Second poll (61s later) ✅
```
### Mitigation Until Fix
After any cadence change (especially decrease), verify actual poll intervals. If incorrect, restart supervisor:
```bash
systemctl restart central-supervisor
```
---
## Summary
| Item | Details |
|------|---------|
| **Bug** | Cadence decrease hot-reload doesn't apply without restart |
| **Root cause** | `cancel_event.set()` inside lock context has delayed delivery |
| **Affects** | Cadence decreases only; increases work correctly |
| **Workaround** | Restart supervisor after cadence decrease |
| **Fix effort** | Low - add `await asyncio.sleep(0)` after event.set() |
| **Test coverage** | None for hot-reload mechanism |

View file

@ -1,58 +1,58 @@
# Phase 1B Planning Notes
Design notes for Phase 1B GUI features. These are planning items, not
implementation specifications.
## Stream Retention GUI
### Per-Stream Configuration
- Show each stream from `config.streams` table
- Editable max_age_s with preset chips: 1d, 7d, 14d, 30d, 365d
- Custom numeric input allowed (operator can enter 90d, etc.)
- Changes trigger NATS stream update via supervisor hot-reload
### Storage Monitor
Per stream, display:
- **Current bytes**: Live from `nats stream info`
- **Projected bytes**: Calculated from current rate × max_age
- **Days remaining**: Current_bytes / rate_per_day estimate
- Refresh: Real-time polling, not cached
### Global Server Cap
- Show `max_file_store` value as read-only reference
- Editing requires NATS server restart (out of scope for GUI)
- Display per-stream ceiling (30% of server cap) as context
## Region Picker
### Interactive Map
- Bbox selection via click-drag rectangle
- Same UI component for all adapters (NWS, FIRMS, USGS)
- Stores `{north, south, east, west}` floats
- Preview of coverage area with state/country boundaries
### Preset Regions
- Common presets: CONUS, Pacific Northwest, Mountain West
- Quick-select buttons alongside custom draw
## API Key Management
### Key Storage
- View configured API keys (alias only, not values)
- Add new keys with alias and value
- Values encrypted at rest in `config.api_keys`
- Rotation: update value, track `rotated_at`
### Required Keys by Adapter
- **FIRMS** (Phase 1a-6): `MAP_KEY` for NASA FIRMS API
- Future adapters may require additional keys
## Technical Notes
- All GUI changes write to `config.*` tables
- Supervisor receives NOTIFY and hot-reloads
- No service restarts required for config changes
- Stream retention changes apply within 5 seconds
# Phase 1B Planning Notes
Design notes for Phase 1B GUI features. These are planning items, not
implementation specifications.
## Stream Retention GUI
### Per-Stream Configuration
- Show each stream from `config.streams` table
- Editable max_age_s with preset chips: 1d, 7d, 14d, 30d, 365d
- Custom numeric input allowed (operator can enter 90d, etc.)
- Changes trigger NATS stream update via supervisor hot-reload
### Storage Monitor
Per stream, display:
- **Current bytes**: Live from `nats stream info`
- **Projected bytes**: Calculated from current rate × max_age
- **Days remaining**: Current_bytes / rate_per_day estimate
- Refresh: Real-time polling, not cached
### Global Server Cap
- Show `max_file_store` value as read-only reference
- Editing requires NATS server restart (out of scope for GUI)
- Display per-stream ceiling (30% of server cap) as context
## Region Picker
### Interactive Map
- Bbox selection via click-drag rectangle
- Same UI component for all adapters (NWS, FIRMS, USGS)
- Stores `{north, south, east, west}` floats
- Preview of coverage area with state/country boundaries
### Preset Regions
- Common presets: CONUS, Pacific Northwest, Mountain West
- Quick-select buttons alongside custom draw
## API Key Management
### Key Storage
- View configured API keys (alias only, not values)
- Add new keys with alias and value
- Values encrypted at rest in `config.api_keys`
- Rotation: update value, track `rotated_at`
### Required Keys by Adapter
- **FIRMS** (Phase 1a-6): `MAP_KEY` for NASA FIRMS API
- Future adapters may require additional keys
## Technical Notes
- All GUI changes write to `config.*` tables
- Supervisor receives NOTIFY and hot-reloads
- No service restarts required for config changes
- Stream retention changes apply within 5 seconds
## FIRMS Adapter Configuration

View file

@ -1,434 +1,434 @@
# Phase 1a-3 Verification Evidence
## T0 Baseline (TOML config mode, post-merge deploy)
**Timestamp:** 2026-05-16T03:10:51Z
### Upstream Alert IDs
```json
[
"urn:oid:2.49.0.1.840.0.e22a439ed29ed11e4b3686d9fac419ce7ad40059.001.1",
"urn:oid:2.49.0.1.840.0.b7acbf4f0381fb83c1b3f732a4ac9ca16a6204d1.002.1",
"urn:oid:2.49.0.1.840.0.e420a03d4bb13559e9bd61c714d8753fa6a4f66d.001.1",
"urn:oid:2.49.0.1.840.0.82fc471559645fcc3fefe49b4855bde43a7dde2b.001.1",
"urn:oid:2.49.0.1.840.0.add970d087c8d383436ee5958fc56100408aaf2e.001.1",
"urn:oid:2.49.0.1.840.0.f620e3599001fc9937324d55df89b55e475c5568.001.1",
"urn:oid:2.49.0.1.840.0.f620e3599001fc9937324d55df89b55e475c5568.002.1",
"urn:oid:2.49.0.1.840.0.dbde432f293a71618bf9908e5adcf9e5dd27e27c.006.1",
"urn:oid:2.49.0.1.840.0.dbde432f293a71618bf9908e5adcf9e5dd27e27c.001.1",
"urn:oid:2.49.0.1.840.0.dbde432f293a71618bf9908e5adcf9e5dd27e27c.003.1",
"urn:oid:2.49.0.1.840.0.dbde432f293a71618bf9908e5adcf9e5dd27e27c.001.2",
"urn:oid:2.49.0.1.840.0.dbde432f293a71618bf9908e5adcf9e5dd27e27c.005.1",
"urn:oid:2.49.0.1.840.0.b5173bc4f407f3889ea8e9284af261796d04972b.002.1",
"urn:oid:2.49.0.1.840.0.18277c28967847fb1b9e61f5afc236e42659e27b.001.1",
"urn:oid:2.49.0.1.840.0.b5173bc4f407f3889ea8e9284af261796d04972b.001.1",
"urn:oid:2.49.0.1.840.0.86299b43bf001e6c38df077a9b2d8d8e1e7b9116.002.2"
]
```
### Database State
```
count | max
-------+------------------------
30 | 2026-05-16 02:45:00+00
```
### Fresh Envelope Sample (post-restart)
```json
{
"id": "https://api.weather.gov/alerts/urn:oid:2.49.0.1.840.0.35f852d42f3149d3e1722c14e6ffc2e977e48d1b.001.1",
"source": "central/adapters/nws",
"type": "central.wx.alert.lake_wind_advisory.v1",
"time": "2026-05-16T02:45:00+00:00",
"datacontenttype": "application/json",
"centralschemaversion": "1.0.0",
"centralcategory": "wx.alert.lake_wind_advisory",
"centralseverity": 2,
"specversion": "1.0",
"data": { ... }
}
```
**CloudEvents verification:**
- `specversion: "1.0"`
- `type` starts with `central.` (NOT `hub.`) ✅
- Extension attributes use `central*` prefix ✅
- `centralschemaversion` (NOT `hubschemaversion`)
- `centralcategory` (NOT `hubcategory`)
- `centralseverity` (NOT `hubseverity`)
---
## Phase B Step 2: Config Source Cutover (TOML → DB)
**Timestamp:** 2026-05-16T03:13:33Z
### Environment Change
```
# /etc/central/central.env - added:
CENTRAL_CONFIG_SOURCE=db
```
### Supervisor Journal Evidence
```json
{"ts": "2026-05-16T03:13:33.430635+00:00", "level": "INFO", "logger": "central.supervisor", "msg": "Config source: db", "config_source": "db"}
{"ts": "2026-05-16T03:13:33.460162+00:00", "level": "INFO", "logger": "central.config_store", "msg": "Config listener connected to database"}
```
### Archive Journal Evidence
```json
{"ts": "2026-05-16T03:14:03.413008+00:00", "level": "INFO", "logger": "central.archive", "msg": "Archive starting", "nats_url": "nats://localhost:4222", "config_source": "db"}
```
**Result:** Both services running with DB-backed config ✅
---
## Phase B Step 3: Hot-Reload Cadence Test
**Test:** Change cadence from 60s → 90s while adapter is running.
**Goal:** Verify next poll is at Tlast + new_cadence (not old cadence, not immediate).
### Timeline
```
Tlast (last poll): 03:16:34.317219Z
Config change: 03:16:36Z
Expected next poll: 03:18:04.317Z (Tlast + 90s)
Actual next poll: 03:18:04.502Z ✅
```
### Journal Evidence
```json
{"ts": "2026-05-16T03:16:34.317219+00:00", "level": "INFO", "logger": "central.adapters.nws", "msg": "NWS yielded events", "count": 16}
{"ts": "2026-05-16T03:16:37.488781+00:00", "level": "INFO", "logger": "central.supervisor", "msg": "Config change received", "table": "adapters", "key": "nws"}
{"ts": "2026-05-16T03:16:37.511029+00:00", "level": "INFO", "logger": "central.supervisor", "msg": "Rescheduled adapter", "adapter": "nws", "old_cadence_s": 60, "new_cadence_s": 90, "next_poll": "2026-05-16T03:18:04.317651+00:00"}
{"ts": "2026-05-16T03:18:04.502991+00:00", "level": "INFO", "logger": "central.adapters.nws", "msg": "NWS poll completed", "status": 200, "feature_count": 355}
```
**Result:** Rate-limit guarantee upheld. Poll occurred at Tlast + 90s (NOT Tlast + 60s). ✅
---
## Phase B Step 4: Hot-Reload Enable/Disable Test
**Test:** Disable adapter, wait, re-enable.
**Goal:** Verify next poll is at Tlast + cadence (not immediate on re-enable).
### Timeline
```
Tlast (last poll): 03:19:34.758524Z
Disabled at: 03:20:37Z
Re-enabled at: 03:20:48Z
Expected next poll: 03:21:04.758Z (Tlast + 90s)
Actual next poll: 03:21:04.940Z ✅
```
### Journal Evidence
```json
{"ts": "2026-05-16T03:19:34.757999+00:00", "level": "INFO", "logger": "central.adapters.nws", "msg": "NWS yielded events", "count": 16}
{"ts": "2026-05-16T03:20:37.616723+00:00", "level": "INFO", "logger": "central.supervisor", "msg": "Adapter stopped", "adapter": "nws", "preserved_last_poll": "2026-05-16T03:19:34.758524+00:00"}
{"ts": "2026-05-16T03:20:48.947358+00:00", "level": "INFO", "logger": "central.supervisor", "msg": "Adapter restarted", "adapter": "nws", "cadence_s": 90, "preserved_last_poll": "2026-05-16T03:19:34.758524+00:00", "next_poll": "2026-05-16T03:21:04.758524+00:00"}
{"ts": "2026-05-16T03:21:04.940891+00:00", "level": "INFO", "logger": "central.adapters.nws", "msg": "NWS poll completed", "status": 200, "feature_count": 354}
```
**Key observations:**
- `preserved_last_poll` appears in BOTH stop and restart logs (proves state retained)
- `next_poll` calculated from `preserved_last_poll + cadence_s` (not from current time)
- Poll did NOT happen immediately on re-enable
**Result:** Rate-limit guarantee upheld through enable/disable cycle. ✅
---
## Phase B Step 5: T1 Capture and Soak
**T1 Timestamp:** 2026-05-16T03:23:19Z
**T2 Timestamp:** 2026-05-16T03:33:48Z
### T1 State
- Upstream alerts: 16
- DB events: 30
### T2 State (after 10-minute soak)
- Upstream alerts: 16
- DB events: 30
### Poll Activity During Soak
```
03:24:05 - NWS poll completed, status: 200, feature_count: 355
03:25:35 - NWS poll completed, status: 200, feature_count: 357
03:27:05 - NWS poll completed, status: 200, feature_count: 358
03:28:35 - NWS poll completed, status: 200, feature_count: 360
03:30:05 - NWS poll completed, status: 200, feature_count: 357
03:31:35 - NWS poll completed, status: 200, feature_count: 356
03:33:05 - NWS poll completed, status: 200, feature_count: 355
```
**Errors during soak:** None ✅
---
## Phase B Step 6: Data Integrity Check
### Verification
```
Upstream alerts: 16
DB events (total): 30
Missing from DB: 0
All upstream alerts found in DB ✓
```
**Result:** Zero missed alerts. Data integrity confirmed. ✅
---
## Phase B Verification Summary
| Step | Test | Result |
|------|------|--------|
| 2 | Config source cutover | ✅ "Config source: db" in logs |
| 3 | Cadence hot-reload | ✅ Poll at Tlast + new_cadence |
| 4 | Enable/disable cycle | ✅ Rate-limit preserved |
| 5 | 10-minute soak | ✅ No errors |
| 6 | Data integrity | ✅ All alerts in DB |
**Phase B Complete.** System running stable on DB-backed config.
---
## Cadence Revert (Close-out)
**Timestamp:** 2026-05-16T03:54:14Z
### Issue Discovered
During close-out verification, polls were observed at 90s intervals despite
DB showing `cadence_s = 60`. Investigation revealed the live reschedule
from 90→60 (done at 03:23:08 during Phase B) didn't properly update the
in-flight scheduling.
### Resolution
Supervisor restart was required to clear stale state:
```bash
systemctl restart central-supervisor
```
### Post-Restart Verification
**DB State:**
```sql
SELECT name, cadence_s, updated_at FROM config.adapters WHERE name='nws';
```
```
name | cadence_s | updated_at
------+-----------+-------------------------------
nws | 60 | 2026-05-16 03:50:53.210963+00
```
**Poll Intervals After Restart:**
```
03:54:14.621376 - NWS poll completed (first poll after restart)
03:55:15.028963 - NWS poll completed (61s later) ✅
03:56:15.429013 - NWS poll completed (60s later) ✅
```
**Startup Log:**
```json
{"ts": "2026-05-16T03:54:14.318479+00:00", "msg": "Adapter started", "adapter": "nws", "cadence_s": 60}
```
### Bug Note
The cadence DECREASE (90→60) rate-limit test from Phase B showed correct
log output ("Rescheduled adapter" with new_cadence_s=60) but the actual
scheduling didn't update properly. The increase test (60→90) worked
correctly.
**Root cause:** Unknown - requires investigation. The `_reschedule_adapter`
method updates `state.config` and `state.adapter.cadence_s`, and signals
via `cancel_event`, but the scheduling loop may not be re-evaluating
correctly for decreases.
**Mitigation:** After any cadence change, verify actual poll intervals match
expected cadence. If not, restart supervisor.
**Result:** Cadence confirmed at 60s after restart. ✅
---
## Phase 1a-3 Close-out
**Timestamp:** 2026-05-16T04:03:17Z
### PR #3 Merge
- **Merge commit:** 0b23cc4
- **Strategy:** Merge commit (fast-forward)
- **Branch deleted:** feature/1a-3-phase-c-toml-retirement
### LXC Cleanup
**Remove obsolete env var:**
```bash
sed -i '/CENTRAL_CONFIG_SOURCE/d' /etc/central/central.env
```
**Resulting central.env:**
```
CENTRAL_DB_DSN=postgresql://central:***@localhost/central
CENTRAL_NATS_URL=nats://localhost:4222
CENTRAL_MASTER_KEY_PATH=/etc/central/master.key
CENTRAL_LOG_LEVEL=INFO
```
**Retire TOML file:**
```bash
mv /etc/central/central.toml /etc/central/central.toml.retired
```
**Directory listing:**
```
-rw-r----- central central 193 central.env
-rw-r----- central central 1074 central.toml.retired
-rw------- central central 45 master.key
```
### Post-Restart Verification
**Supervisor startup:**
```json
{"ts": "2026-05-16T04:01:18.430800+00:00", "msg": "Config source: db", "config_source": "db"}
{"ts": "2026-05-16T04:01:18.459241+00:00", "msg": "Adapter started", "adapter": "nws", "cadence_s": 60}
{"ts": "2026-05-16T04:01:18.459641+00:00", "msg": "Config listener connected to database"}
{"ts": "2026-05-16T04:01:18.595928+00:00", "msg": "NWS poll completed", "status": 200}
```
**Archive startup:**
```json
{"ts": "2026-05-16T04:01:48.442842+00:00", "msg": "Archive starting", "nats_url": "nats://localhost:4222"}
{"ts": "2026-05-16T04:01:48.468110+00:00", "msg": "Archive consumer ready"}
```
### CloudEvents Envelope Verification (seq 32)
```json
{
"type": "central.wx.alert.winter_weather_advisory.v1",
"source": "central.echo6.co",
"specversion": "1.0",
"centralschemaversion": "1.0",
"centralcategory": "wx.alert.winter_weather_advisory",
"centralseverity": 2
}
```
- Extension attributes use `central*` prefix ✅
### T3 Data Integrity Check
| Metric | T0 | T3 |
|--------|----|----|
| Upstream alerts | 16 | 17 |
| DB events | 30 | 32 |
| Missing | 0 | 0 |
**Result:** Zero alerts missed across T0 → T3. ✅
---
## Phase 1a-3 Final Summary
| Gate | Status |
|------|--------|
| Part 1: Cadence reverted to 60s | ✅ (required restart) |
| Part 2: PR #3 review - no blockers | ✅ |
| Part 3: PR #3 merged | ✅ (0b23cc4) |
| CENTRAL_CONFIG_SOURCE removed | ✅ |
| central.toml retired | ✅ |
| Services healthy | ✅ |
| CloudEvents central* prefix | ✅ |
| Data integrity T0→T3 | ✅ |
**Phase 1a-3 Complete.**
## Final Cadence-Decrease Fix Verification
**Date:** 2026-05-16T17:19-17:25 UTC
**Branch:** feature/remove-adapter-limiter
**Fix:** Removed internal AsyncLimiter from NWSAdapter
### Root Cause
The NWSAdapter had an internal AsyncLimiter(1, cadence_s) that duplicated
the supervisor rate-limit guarantee. When cadence changed via hot-reload,
state.adapter.cadence_s was updated but the internal _limiter retained
the old rate, causing the async with self._limiter context to block for
the remaining time of the old cadence window.
### Fix Applied
1. Removed self._limiter from NWSAdapter
2. Removed self.cadence_s attribute (no longer needed)
3. Removed state.adapter.cadence_s = new_cadence from supervisor
4. Removed aiolimiter dependency
### Verification Results
#### Test 1: Decrease 60 to 30s
```
Tlast: 17:20:38.282
Change: 17:20:39.649 (60 to 30)
Expected: 17:21:08.323 (Tlast + 30s)
Actual: 17:21:08.531 PASS
Subsequent: 17:21:38.751 (30s later) PASS
```
#### Test 2: Increase 30 to 60s
```
Tlast: 17:22:09.242
Change: 17:22:18.515 (30 to 60)
Expected: 17:23:09.284 (Tlast + 60s)
Actual: 17:23:09.634 PASS
```
#### Test 3: Decrease 60 to 15s
```
Tlast: 17:23:09.634
Change: 17:23:28.343 (60 to 15)
Expected: 17:23:24.677 (Tlast + 15s, already passed)
Actual: 17:23:28.736 (immediate, deadline passed) PASS
Subsequent: 17:23:44.129 (15s later) PASS
17:23:59.579 (15s later) PASS
```
#### Test 4: Restore 15 to 60s
```
Change: 17:24:21.355 (15 to 60)
Expected: 17:25:15.072 (Tlast + 60s)
```
### Journal Evidence
```
17:20:38 poll completed (baseline)
17:20:39 Rescheduled 60 to 30, next_poll=17:21:08
17:21:08 poll completed PASS (30s, not 60s)
17:21:38 poll completed PASS (30s interval)
17:22:09 poll completed
17:22:18 Rescheduled 30 to 60, next_poll=17:23:09
17:23:09 poll completed PASS (60s)
17:23:28 Rescheduled 60 to 15, next_poll=17:23:24 (past)
17:23:28 poll completed PASS (immediate)
17:23:44 poll completed PASS (15s)
17:23:59 poll completed PASS (15s)
17:24:21 Rescheduled 15 to 60, next_poll=17:25:15
```
### Conclusion
All cadence transitions work correctly:
- Decrease (60 to 30, 60 to 15): Next poll at Tlast + new_cadence PASS
- Increase (30 to 60, 15 to 60): Next poll at Tlast + new_cadence PASS
- Immediate poll when deadline already passed PASS
- Subsequent intervals use new cadence PASS
The internal AsyncLimiter was the root cause. Removing it allows the
supervisor rate-limit scheduling to work correctly without interference.
# Phase 1a-3 Verification Evidence
## T0 Baseline (TOML config mode, post-merge deploy)
**Timestamp:** 2026-05-16T03:10:51Z
### Upstream Alert IDs
```json
[
"urn:oid:2.49.0.1.840.0.e22a439ed29ed11e4b3686d9fac419ce7ad40059.001.1",
"urn:oid:2.49.0.1.840.0.b7acbf4f0381fb83c1b3f732a4ac9ca16a6204d1.002.1",
"urn:oid:2.49.0.1.840.0.e420a03d4bb13559e9bd61c714d8753fa6a4f66d.001.1",
"urn:oid:2.49.0.1.840.0.82fc471559645fcc3fefe49b4855bde43a7dde2b.001.1",
"urn:oid:2.49.0.1.840.0.add970d087c8d383436ee5958fc56100408aaf2e.001.1",
"urn:oid:2.49.0.1.840.0.f620e3599001fc9937324d55df89b55e475c5568.001.1",
"urn:oid:2.49.0.1.840.0.f620e3599001fc9937324d55df89b55e475c5568.002.1",
"urn:oid:2.49.0.1.840.0.dbde432f293a71618bf9908e5adcf9e5dd27e27c.006.1",
"urn:oid:2.49.0.1.840.0.dbde432f293a71618bf9908e5adcf9e5dd27e27c.001.1",
"urn:oid:2.49.0.1.840.0.dbde432f293a71618bf9908e5adcf9e5dd27e27c.003.1",
"urn:oid:2.49.0.1.840.0.dbde432f293a71618bf9908e5adcf9e5dd27e27c.001.2",
"urn:oid:2.49.0.1.840.0.dbde432f293a71618bf9908e5adcf9e5dd27e27c.005.1",
"urn:oid:2.49.0.1.840.0.b5173bc4f407f3889ea8e9284af261796d04972b.002.1",
"urn:oid:2.49.0.1.840.0.18277c28967847fb1b9e61f5afc236e42659e27b.001.1",
"urn:oid:2.49.0.1.840.0.b5173bc4f407f3889ea8e9284af261796d04972b.001.1",
"urn:oid:2.49.0.1.840.0.86299b43bf001e6c38df077a9b2d8d8e1e7b9116.002.2"
]
```
### Database State
```
count | max
-------+------------------------
30 | 2026-05-16 02:45:00+00
```
### Fresh Envelope Sample (post-restart)
```json
{
"id": "https://api.weather.gov/alerts/urn:oid:2.49.0.1.840.0.35f852d42f3149d3e1722c14e6ffc2e977e48d1b.001.1",
"source": "central/adapters/nws",
"type": "central.wx.alert.lake_wind_advisory.v1",
"time": "2026-05-16T02:45:00+00:00",
"datacontenttype": "application/json",
"centralschemaversion": "1.0.0",
"centralcategory": "wx.alert.lake_wind_advisory",
"centralseverity": 2,
"specversion": "1.0",
"data": { ... }
}
```
**CloudEvents verification:**
- `specversion: "1.0"`
- `type` starts with `central.` (NOT `hub.`) ✅
- Extension attributes use `central*` prefix ✅
- `centralschemaversion` (NOT `hubschemaversion`)
- `centralcategory` (NOT `hubcategory`)
- `centralseverity` (NOT `hubseverity`)
---
## Phase B Step 2: Config Source Cutover (TOML → DB)
**Timestamp:** 2026-05-16T03:13:33Z
### Environment Change
```
# /etc/central/central.env - added:
CENTRAL_CONFIG_SOURCE=db
```
### Supervisor Journal Evidence
```json
{"ts": "2026-05-16T03:13:33.430635+00:00", "level": "INFO", "logger": "central.supervisor", "msg": "Config source: db", "config_source": "db"}
{"ts": "2026-05-16T03:13:33.460162+00:00", "level": "INFO", "logger": "central.config_store", "msg": "Config listener connected to database"}
```
### Archive Journal Evidence
```json
{"ts": "2026-05-16T03:14:03.413008+00:00", "level": "INFO", "logger": "central.archive", "msg": "Archive starting", "nats_url": "nats://localhost:4222", "config_source": "db"}
```
**Result:** Both services running with DB-backed config ✅
---
## Phase B Step 3: Hot-Reload Cadence Test
**Test:** Change cadence from 60s → 90s while adapter is running.
**Goal:** Verify next poll is at Tlast + new_cadence (not old cadence, not immediate).
### Timeline
```
Tlast (last poll): 03:16:34.317219Z
Config change: 03:16:36Z
Expected next poll: 03:18:04.317Z (Tlast + 90s)
Actual next poll: 03:18:04.502Z ✅
```
### Journal Evidence
```json
{"ts": "2026-05-16T03:16:34.317219+00:00", "level": "INFO", "logger": "central.adapters.nws", "msg": "NWS yielded events", "count": 16}
{"ts": "2026-05-16T03:16:37.488781+00:00", "level": "INFO", "logger": "central.supervisor", "msg": "Config change received", "table": "adapters", "key": "nws"}
{"ts": "2026-05-16T03:16:37.511029+00:00", "level": "INFO", "logger": "central.supervisor", "msg": "Rescheduled adapter", "adapter": "nws", "old_cadence_s": 60, "new_cadence_s": 90, "next_poll": "2026-05-16T03:18:04.317651+00:00"}
{"ts": "2026-05-16T03:18:04.502991+00:00", "level": "INFO", "logger": "central.adapters.nws", "msg": "NWS poll completed", "status": 200, "feature_count": 355}
```
**Result:** Rate-limit guarantee upheld. Poll occurred at Tlast + 90s (NOT Tlast + 60s). ✅
---
## Phase B Step 4: Hot-Reload Enable/Disable Test
**Test:** Disable adapter, wait, re-enable.
**Goal:** Verify next poll is at Tlast + cadence (not immediate on re-enable).
### Timeline
```
Tlast (last poll): 03:19:34.758524Z
Disabled at: 03:20:37Z
Re-enabled at: 03:20:48Z
Expected next poll: 03:21:04.758Z (Tlast + 90s)
Actual next poll: 03:21:04.940Z ✅
```
### Journal Evidence
```json
{"ts": "2026-05-16T03:19:34.757999+00:00", "level": "INFO", "logger": "central.adapters.nws", "msg": "NWS yielded events", "count": 16}
{"ts": "2026-05-16T03:20:37.616723+00:00", "level": "INFO", "logger": "central.supervisor", "msg": "Adapter stopped", "adapter": "nws", "preserved_last_poll": "2026-05-16T03:19:34.758524+00:00"}
{"ts": "2026-05-16T03:20:48.947358+00:00", "level": "INFO", "logger": "central.supervisor", "msg": "Adapter restarted", "adapter": "nws", "cadence_s": 90, "preserved_last_poll": "2026-05-16T03:19:34.758524+00:00", "next_poll": "2026-05-16T03:21:04.758524+00:00"}
{"ts": "2026-05-16T03:21:04.940891+00:00", "level": "INFO", "logger": "central.adapters.nws", "msg": "NWS poll completed", "status": 200, "feature_count": 354}
```
**Key observations:**
- `preserved_last_poll` appears in BOTH stop and restart logs (proves state retained)
- `next_poll` calculated from `preserved_last_poll + cadence_s` (not from current time)
- Poll did NOT happen immediately on re-enable
**Result:** Rate-limit guarantee upheld through enable/disable cycle. ✅
---
## Phase B Step 5: T1 Capture and Soak
**T1 Timestamp:** 2026-05-16T03:23:19Z
**T2 Timestamp:** 2026-05-16T03:33:48Z
### T1 State
- Upstream alerts: 16
- DB events: 30
### T2 State (after 10-minute soak)
- Upstream alerts: 16
- DB events: 30
### Poll Activity During Soak
```
03:24:05 - NWS poll completed, status: 200, feature_count: 355
03:25:35 - NWS poll completed, status: 200, feature_count: 357
03:27:05 - NWS poll completed, status: 200, feature_count: 358
03:28:35 - NWS poll completed, status: 200, feature_count: 360
03:30:05 - NWS poll completed, status: 200, feature_count: 357
03:31:35 - NWS poll completed, status: 200, feature_count: 356
03:33:05 - NWS poll completed, status: 200, feature_count: 355
```
**Errors during soak:** None ✅
---
## Phase B Step 6: Data Integrity Check
### Verification
```
Upstream alerts: 16
DB events (total): 30
Missing from DB: 0
All upstream alerts found in DB ✓
```
**Result:** Zero missed alerts. Data integrity confirmed. ✅
---
## Phase B Verification Summary
| Step | Test | Result |
|------|------|--------|
| 2 | Config source cutover | ✅ "Config source: db" in logs |
| 3 | Cadence hot-reload | ✅ Poll at Tlast + new_cadence |
| 4 | Enable/disable cycle | ✅ Rate-limit preserved |
| 5 | 10-minute soak | ✅ No errors |
| 6 | Data integrity | ✅ All alerts in DB |
**Phase B Complete.** System running stable on DB-backed config.
---
## Cadence Revert (Close-out)
**Timestamp:** 2026-05-16T03:54:14Z
### Issue Discovered
During close-out verification, polls were observed at 90s intervals despite
DB showing `cadence_s = 60`. Investigation revealed the live reschedule
from 90→60 (done at 03:23:08 during Phase B) didn't properly update the
in-flight scheduling.
### Resolution
Supervisor restart was required to clear stale state:
```bash
systemctl restart central-supervisor
```
### Post-Restart Verification
**DB State:**
```sql
SELECT name, cadence_s, updated_at FROM config.adapters WHERE name='nws';
```
```
name | cadence_s | updated_at
------+-----------+-------------------------------
nws | 60 | 2026-05-16 03:50:53.210963+00
```
**Poll Intervals After Restart:**
```
03:54:14.621376 - NWS poll completed (first poll after restart)
03:55:15.028963 - NWS poll completed (61s later) ✅
03:56:15.429013 - NWS poll completed (60s later) ✅
```
**Startup Log:**
```json
{"ts": "2026-05-16T03:54:14.318479+00:00", "msg": "Adapter started", "adapter": "nws", "cadence_s": 60}
```
### Bug Note
The cadence DECREASE (90→60) rate-limit test from Phase B showed correct
log output ("Rescheduled adapter" with new_cadence_s=60) but the actual
scheduling didn't update properly. The increase test (60→90) worked
correctly.
**Root cause:** Unknown - requires investigation. The `_reschedule_adapter`
method updates `state.config` and `state.adapter.cadence_s`, and signals
via `cancel_event`, but the scheduling loop may not be re-evaluating
correctly for decreases.
**Mitigation:** After any cadence change, verify actual poll intervals match
expected cadence. If not, restart supervisor.
**Result:** Cadence confirmed at 60s after restart. ✅
---
## Phase 1a-3 Close-out
**Timestamp:** 2026-05-16T04:03:17Z
### PR #3 Merge
- **Merge commit:** 0b23cc4
- **Strategy:** Merge commit (fast-forward)
- **Branch deleted:** feature/1a-3-phase-c-toml-retirement
### LXC Cleanup
**Remove obsolete env var:**
```bash
sed -i '/CENTRAL_CONFIG_SOURCE/d' /etc/central/central.env
```
**Resulting central.env:**
```
CENTRAL_DB_DSN=postgresql://central:***@localhost/central
CENTRAL_NATS_URL=nats://localhost:4222
CENTRAL_MASTER_KEY_PATH=/etc/central/master.key
CENTRAL_LOG_LEVEL=INFO
```
**Retire TOML file:**
```bash
mv /etc/central/central.toml /etc/central/central.toml.retired
```
**Directory listing:**
```
-rw-r----- central central 193 central.env
-rw-r----- central central 1074 central.toml.retired
-rw------- central central 45 master.key
```
### Post-Restart Verification
**Supervisor startup:**
```json
{"ts": "2026-05-16T04:01:18.430800+00:00", "msg": "Config source: db", "config_source": "db"}
{"ts": "2026-05-16T04:01:18.459241+00:00", "msg": "Adapter started", "adapter": "nws", "cadence_s": 60}
{"ts": "2026-05-16T04:01:18.459641+00:00", "msg": "Config listener connected to database"}
{"ts": "2026-05-16T04:01:18.595928+00:00", "msg": "NWS poll completed", "status": 200}
```
**Archive startup:**
```json
{"ts": "2026-05-16T04:01:48.442842+00:00", "msg": "Archive starting", "nats_url": "nats://localhost:4222"}
{"ts": "2026-05-16T04:01:48.468110+00:00", "msg": "Archive consumer ready"}
```
### CloudEvents Envelope Verification (seq 32)
```json
{
"type": "central.wx.alert.winter_weather_advisory.v1",
"source": "central.echo6.co",
"specversion": "1.0",
"centralschemaversion": "1.0",
"centralcategory": "wx.alert.winter_weather_advisory",
"centralseverity": 2
}
```
- Extension attributes use `central*` prefix ✅
### T3 Data Integrity Check
| Metric | T0 | T3 |
|--------|----|----|
| Upstream alerts | 16 | 17 |
| DB events | 30 | 32 |
| Missing | 0 | 0 |
**Result:** Zero alerts missed across T0 → T3. ✅
---
## Phase 1a-3 Final Summary
| Gate | Status |
|------|--------|
| Part 1: Cadence reverted to 60s | ✅ (required restart) |
| Part 2: PR #3 review - no blockers | ✅ |
| Part 3: PR #3 merged | ✅ (0b23cc4) |
| CENTRAL_CONFIG_SOURCE removed | ✅ |
| central.toml retired | ✅ |
| Services healthy | ✅ |
| CloudEvents central* prefix | ✅ |
| Data integrity T0→T3 | ✅ |
**Phase 1a-3 Complete.**
## Final Cadence-Decrease Fix Verification
**Date:** 2026-05-16T17:19-17:25 UTC
**Branch:** feature/remove-adapter-limiter
**Fix:** Removed internal AsyncLimiter from NWSAdapter
### Root Cause
The NWSAdapter had an internal AsyncLimiter(1, cadence_s) that duplicated
the supervisor rate-limit guarantee. When cadence changed via hot-reload,
state.adapter.cadence_s was updated but the internal _limiter retained
the old rate, causing the async with self._limiter context to block for
the remaining time of the old cadence window.
### Fix Applied
1. Removed self._limiter from NWSAdapter
2. Removed self.cadence_s attribute (no longer needed)
3. Removed state.adapter.cadence_s = new_cadence from supervisor
4. Removed aiolimiter dependency
### Verification Results
#### Test 1: Decrease 60 to 30s
```
Tlast: 17:20:38.282
Change: 17:20:39.649 (60 to 30)
Expected: 17:21:08.323 (Tlast + 30s)
Actual: 17:21:08.531 PASS
Subsequent: 17:21:38.751 (30s later) PASS
```
#### Test 2: Increase 30 to 60s
```
Tlast: 17:22:09.242
Change: 17:22:18.515 (30 to 60)
Expected: 17:23:09.284 (Tlast + 60s)
Actual: 17:23:09.634 PASS
```
#### Test 3: Decrease 60 to 15s
```
Tlast: 17:23:09.634
Change: 17:23:28.343 (60 to 15)
Expected: 17:23:24.677 (Tlast + 15s, already passed)
Actual: 17:23:28.736 (immediate, deadline passed) PASS
Subsequent: 17:23:44.129 (15s later) PASS
17:23:59.579 (15s later) PASS
```
#### Test 4: Restore 15 to 60s
```
Change: 17:24:21.355 (15 to 60)
Expected: 17:25:15.072 (Tlast + 60s)
```
### Journal Evidence
```
17:20:38 poll completed (baseline)
17:20:39 Rescheduled 60 to 30, next_poll=17:21:08
17:21:08 poll completed PASS (30s, not 60s)
17:21:38 poll completed PASS (30s interval)
17:22:09 poll completed
17:22:18 Rescheduled 30 to 60, next_poll=17:23:09
17:23:09 poll completed PASS (60s)
17:23:28 Rescheduled 60 to 15, next_poll=17:23:24 (past)
17:23:28 poll completed PASS (immediate)
17:23:44 poll completed PASS (15s)
17:23:59 poll completed PASS (15s)
17:24:21 Rescheduled 15 to 60, next_poll=17:25:15
```
### Conclusion
All cadence transitions work correctly:
- Decrease (60 to 30, 60 to 15): Next poll at Tlast + new_cadence PASS
- Increase (30 to 60, 15 to 60): Next poll at Tlast + new_cadence PASS
- Immediate poll when deadline already passed PASS
- Subsequent intervals use new cadence PASS
The internal AsyncLimiter was the root cause. Removing it allows the
supervisor rate-limit scheduling to work correctly without interference.

View file

@ -1,96 +1,96 @@
# Central Data Hub - Environment Reference
## Development Locations
### Active Development: CT104 (Central LXC)
All development work happens on the Central LXC container:
| Property | Value |
|----------|-------|
| **Hostname** | `central` |
| **Tailscale IP** | `100.64.0.12` |
| **LAN IP** | `192.168.1.104` |
| **SSH access** | `zvx@central` or `zvx@100.64.0.12` |
| **Repository path** | `/opt/central` |
| **Python venv** | `/opt/central/.venv` |
| **Services** | `central-supervisor`, `central-archive` |
### Parked Clone: Cortex
The cortex VM at `/home/zvx/projects/central` contains a clone that is
**not actively used for development**. It may be retired in the future.
Do not make changes there.
### Local Workstation: matt-desktop
The Windows workstation (matt-desktop) has no Central repository clones.
The directory `C:\Users\mtthw\central_work\` is scratch space only and
should not be used for commits.
## Repository
| Property | Value |
|----------|-------|
| **Origin** | `git@github.com:zvx-echo6/central.git` |
| **Main branch** | `main` |
| **Default user** | `central` (on CT104) |
## Services
### central-supervisor
The main adapter scheduler and event publisher. Polls upstream APIs,
normalizes events, and publishes to NATS JetStream.
```bash
# Status
systemctl status central-supervisor
# Logs
journalctl -u central-supervisor -f
# Restart (requires sudo)
sudo systemctl restart central-supervisor
```
### central-archive
Consumes events from NATS JetStream and archives to PostgreSQL/TimescaleDB.
```bash
# Status
systemctl status central-archive
# Logs
journalctl -u central-archive -f
```
## Database
PostgreSQL 16 with TimescaleDB runs on CT104:
```bash
# Connect as central user
psql -h localhost -U central -d central
# Check adapter config
SELECT name, cadence_s, enabled FROM config.adapters;
# Check recent events
SELECT id, time, category FROM events ORDER BY time DESC LIMIT 10;
```
## SSH Access from Windows
From matt-desktop, connect via Tailscale:
```bash
# Direct connection
ssh zvx@100.64.0.12
# Using hostname (if Tailscale DNS configured)
ssh zvx@central
```
Note: The `zvx` user requires password for sudo operations.
# Central Data Hub - Environment Reference
## Development Locations
### Active Development: CT104 (Central LXC)
All development work happens on the Central LXC container:
| Property | Value |
|----------|-------|
| **Hostname** | `central` |
| **Tailscale IP** | `100.64.0.12` |
| **LAN IP** | `192.168.1.104` |
| **SSH access** | `zvx@central` or `zvx@100.64.0.12` |
| **Repository path** | `/opt/central` |
| **Python venv** | `/opt/central/.venv` |
| **Services** | `central-supervisor`, `central-archive` |
### Parked Clone: Cortex
The cortex VM at `/home/zvx/projects/central` contains a clone that is
**not actively used for development**. It may be retired in the future.
Do not make changes there.
### Local Workstation: matt-desktop
The Windows workstation (matt-desktop) has no Central repository clones.
The directory `C:\Users\mtthw\central_work\` is scratch space only and
should not be used for commits.
## Repository
| Property | Value |
|----------|-------|
| **Origin** | `git@github.com:zvx-echo6/central.git` |
| **Main branch** | `main` |
| **Default user** | `central` (on CT104) |
## Services
### central-supervisor
The main adapter scheduler and event publisher. Polls upstream APIs,
normalizes events, and publishes to NATS JetStream.
```bash
# Status
systemctl status central-supervisor
# Logs
journalctl -u central-supervisor -f
# Restart (requires sudo)
sudo systemctl restart central-supervisor
```
### central-archive
Consumes events from NATS JetStream and archives to PostgreSQL/TimescaleDB.
```bash
# Status
systemctl status central-archive
# Logs
journalctl -u central-archive -f
```
## Database
PostgreSQL 16 with TimescaleDB runs on CT104:
```bash
# Connect as central user
psql -h localhost -U central -d central
# Check adapter config
SELECT name, cadence_s, enabled FROM config.adapters;
# Check recent events
SELECT id, time, category FROM events ORDER BY time DESC LIMIT 10;
```
## SSH Access from Windows
From matt-desktop, connect via Tailscale:
```bash
# Direct connection
ssh zvx@100.64.0.12
# Using hostname (if Tailscale DNS configured)
ssh zvx@central
```
Note: The `zvx` user requires password for sudo operations.