These tests pass on both fixed and unfixed code, meaning they do
not actually exercise the cadence-decrease bug. The tests were
added as part of PR #4 but direct verification showed they
do not catch the issue they claim to test.
A follow-up issue should be filed for proper regression tests
that reproduce the actual bug (AsyncLimiter blocking).
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The NWSAdapter no longer has a cadence_s attribute since the
internal limiter was removed. The supervisor's rate limiting
via state.config.cadence_s and last_completed_poll is sufficient.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The NWSAdapter had an internal AsyncLimiter that duplicated the
supervisor's rate-limit guarantee. When cadence changed, only
state.adapter.cadence_s was updated, not the internal limiter,
causing the cadence-decrease bug.
Since the supervisor already enforces rate limiting via
last_completed_poll + cadence_s scheduling, the adapter-level
limiter was redundant and caused the 30-second blocking observed
in diagnostic logs.
Removes:
- aiolimiter import
- self.cadence_s attribute (unused elsewhere)
- self._limiter creation
- async with self._limiter context in _fetch_alerts
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
environment.md:
- Documents CT104 as the active development location
- Lists SSH access, repository paths, and service commands
- Notes that the cortex clone is parked and matt-desktop has no clones
BUG-CADENCE-DECREASE.md:
- Full investigation of the cadence-decrease hot-reload bug
- Root cause analysis: cancel_event.set() inside lock context
- Proposed fix (Option A - structural)
- Test gap identification
- Production verification steps
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The cancel_event.set() call was inside the async lock context in
_on_config_change, causing delayed signal delivery to the sleeping
loop. This manifested as cadence decreases not applying without a
restart - the loop would sleep its full original timeout before
seeing the new cadence.
Fix: _reschedule_adapter now returns the AdapterState to signal,
and _on_config_change signals AFTER releasing the lock. This ensures
immediate event delivery per asyncio semantics.
The lock protects state consistency during config fetches and updates.
The cancel_event is a one-way notification that does not need lock
protection - it simply wakes the sleeping coroutine.
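The "update under the lock, signal after release" pattern described above can be sketched as follows. This is a minimal illustration, not the supervisor's actual code: MiniSupervisor, sleep_one_cycle, and the wait_for-based sleep are assumptions made for the example; only the names AdapterState, cancel_event, _reschedule_adapter, and the signal-after-release ordering come from this message.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class AdapterState:
    # Trimmed-down, hypothetical version of the real AdapterState.
    cadence_s: float
    cancel_event: asyncio.Event = field(default_factory=asyncio.Event)


class MiniSupervisor:
    """Hypothetical stand-in for the supervisor, for illustration only."""

    def __init__(self, cadence_s: float) -> None:
        self._lock = asyncio.Lock()
        self.state = AdapterState(cadence_s)

    def _reschedule_adapter(self, new_cadence_s: float) -> AdapterState:
        # Called with the lock held: mutate state, return what to signal.
        self.state.cadence_s = new_cadence_s
        return self.state

    async def on_config_change(self, new_cadence_s: float) -> None:
        async with self._lock:
            to_signal = self._reschedule_adapter(new_cadence_s)
        # The lock is released here; only now wake the sleeping loop.
        to_signal.cancel_event.set()

    async def sleep_one_cycle(self) -> float:
        """Sleep for one cadence, or wake early when signalled."""
        state = self.state
        try:
            await asyncio.wait_for(state.cancel_event.wait(),
                                   timeout=state.cadence_s)
        except asyncio.TimeoutError:
            pass  # full cadence elapsed with no config change
        state.cancel_event.clear()
        return state.cadence_s


async def demo() -> float:
    sup = MiniSupervisor(cadence_s=60.0)
    sleeper = asyncio.create_task(sup.sleep_one_cycle())
    await asyncio.sleep(0)            # let the loop start sleeping
    await sup.on_config_change(30.0)  # cadence decrease 60 -> 30
    # The sleeper should wake well before its original 60 s timeout.
    return await asyncio.wait_for(sleeper, timeout=1.0)
```

Running asyncio.run(demo()) returns the new cadence once the sleeper has been woken by the event rather than by its timeout.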
See docs/BUG-CADENCE-DECREASE.md for full investigation.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add tests that exercise the ACTUAL running loop with cancel_event
signaling, not just AdapterState math in isolation.
Test cases:
- Test 1: Cadence decrease (60->30) wakes loop immediately
- Test 2: Cadence increase (10->20) extends wait correctly
- Test 3: Enable/disable/enable with gap > cadence polls immediately
- Test 4: Enable/disable/enable with gap < cadence waits
These tests verify the cancel_event mechanism properly interrupts
the sleeping loop when config changes occur via _on_config_change.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Phase B operational cutover verification:
- Config source cutover from TOML to DB confirmed
- Hot-reload cadence test passed (rate-limit guarantee)
- Enable/disable cycle test passed (preserved_last_poll)
- 10-minute soak with zero errors
- Data integrity verified (all alerts in DB)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Unit files load env vars from /etc/central/central.env using
EnvironmentFile directive. Includes README with installation
and configuration instructions.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
CloudEvents envelope format is protocol-level (not operator config).
When using DB config source without TOML, wrap_event() now uses
DEFAULT_CLOUDEVENTS_CONFIG from cloudevents_constants.py.
Changes:
- Add cloudevents_constants.py with DEFAULT_CLOUDEVENTS_CONFIG
- Update wrap_event() to accept Config, CloudEventsConfig, or None
- Simplify supervisor: always use wrap_event (has defaults)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Previously, _stop_adapter() used pop() to remove adapter state,
which lost last_completed_poll. On re-enable, a fresh state was
created, causing an immediate poll and violating the rate-limit
guarantee.
Changes:
- Add is_running property to AdapterState
- _stop_adapter: preserve state, just cancel task
- _start_adapter: reuse existing stopped state if present
- Add _remove_adapter for full cleanup when adapter is deleted
- _on_config_change: call _remove_adapter for deleted adapters
Integration tests verify:
- Test A: gap > cadence -> immediate poll (correct)
- Test B: gap < cadence -> wait until last_poll + cadence (was broken)
- Test C: delete + re-add -> fresh state (correct)
Tests-fail-before-fix verified: Tests A and B failed on unfixed code
with "State was removed on stop!" and pass with the fix.
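The stop/start/remove split listed above might look roughly like the sketch below. It is an assumption-laden illustration: the registry dict, the public method names, and the optional task argument are invented for the example; only is_running, last_completed_poll, and the preserve-on-stop versus delete-on-remove behaviour come from this message.

```python
import asyncio
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class AdapterState:
    name: str
    last_completed_poll: Optional[float] = None
    task: Optional[asyncio.Task] = None

    @property
    def is_running(self) -> bool:
        # Running means a poll task exists and has not finished.
        return self.task is not None and not self.task.done()


class MiniSupervisor:
    """Hypothetical registry mirroring the behaviour described above."""

    def __init__(self) -> None:
        self._adapters: Dict[str, AdapterState] = {}

    def start_adapter(self, name: str,
                      task: Optional[asyncio.Task] = None) -> AdapterState:
        # Reuse a stopped state if present so last_completed_poll survives.
        state = self._adapters.get(name)
        if state is None:
            state = AdapterState(name)
            self._adapters[name] = state
        state.task = task
        return state

    def stop_adapter(self, name: str) -> None:
        # Preserve the state; only cancel and drop the task.
        state = self._adapters.get(name)
        if state is not None and state.task is not None:
            state.task.cancel()
            state.task = None

    def remove_adapter(self, name: str) -> None:
        # Full cleanup, used when the adapter is deleted from config.
        self.stop_adapter(name)
        self._adapters.pop(name, None)
```

With this shape, disable followed by enable keeps the previous poll timestamp, while delete followed by re-add starts from a fresh state.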
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Archive now reads NATS URL and Postgres DSN from bootstrap_config
instead of TOML file. This is sufficient for archive since it only
needs connection strings, not adapter configuration.
No ConfigSource wiring needed - archive just consumes from JetStream.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Refactors supervisor to use ConfigSource abstraction:
- AdapterState tracks last_completed_poll for rate limiting
- Hot-reload via NOTIFY: cadence/enable/disable changes take effect without a restart
- Rate-limit guarantee: next poll at last_poll + new_cadence, not now
- Logs config source at startup (toml or db)
- Logs reschedule decisions with next poll timestamp
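The rate-limit guarantee above (next poll at last_poll + new_cadence, not now) reduces to a small calculation. The helper below is a hypothetical sketch: the function name, the handling of a never-polled adapter, and the clamping to the current time are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def next_poll_at(last_completed_poll: Optional[datetime],
                 cadence_s: float,
                 now: datetime) -> datetime:
    """Return the earliest time the next poll may run."""
    if last_completed_poll is None:
        # Assumed behaviour: an adapter that never polled may poll at once.
        return now
    due = last_completed_poll + timedelta(seconds=cadence_s)
    # If the due time has already passed, poll now rather than in the past.
    return max(due, now)


now = datetime(2025, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
last = now - timedelta(seconds=20)
print(next_poll_at(last, 30, now) - now)  # 10 seconds still to wait
print(next_poll_at(last, 10, now) - now)  # already due, so no wait
```

A cadence change therefore never resets the clock: the wait is always measured from the last completed poll.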
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- ConfigSource protocol with list_enabled_adapters, get_adapter, watch_for_changes
- TomlConfigSource: loads from TOML file, watch_for_changes is no-op
- DbConfigSource: wraps ConfigStore with LISTEN/NOTIFY support
- CENTRAL_CONFIG_SOURCE bootstrap flag: toml (default) or db
- CENTRAL_CONFIG_TOML_PATH for specifying TOML file location
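For orientation, the protocol shape listed above could be expressed along these lines. The method signatures, the dict-based adapter records, and StaticConfigSource are assumptions for illustration; the message only names the three methods and states that the TOML source's watch_for_changes is a no-op.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Optional, Protocol


class ConfigSource(Protocol):
    # Signatures are guesses; only the method names come from this commit.
    async def list_enabled_adapters(self) -> List[dict]: ...

    async def get_adapter(self, name: str) -> Optional[dict]: ...

    async def watch_for_changes(
        self, callback: Callable[[], Awaitable[None]]
    ) -> None: ...


class StaticConfigSource:
    """Hypothetical file-like source: data is fixed at load time."""

    def __init__(self, adapters: Dict[str, dict]) -> None:
        self._adapters = adapters

    async def list_enabled_adapters(self) -> List[dict]:
        return [a for a in self._adapters.values() if a.get("enabled", True)]

    async def get_adapter(self, name: str) -> Optional[dict]:
        return self._adapters.get(name)

    async def watch_for_changes(
        self, callback: Callable[[], Awaitable[None]]
    ) -> None:
        # Static data never changes, so there is nothing to watch.
        return None
```

Because Protocol uses structural typing, a source like this satisfies ConfigSource without inheriting from it, which keeps the supervisor independent of the backing store.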
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Listener now automatically reconnects on connection loss with
exponential backoff (1s-30s). Cancellation propagates cleanly.
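A reconnect loop with capped exponential backoff can be sketched as below. The doubling factor, the function names, and the injectable sleep are assumptions for illustration; the message only states the 1s-30s range and that cancellation propagates.

```python
import asyncio
from typing import Awaitable, Callable, Iterator


def backoff_delays(initial_s: float = 1.0,
                   max_s: float = 30.0) -> Iterator[float]:
    """Yield delays that double each time, capped at max_s."""
    delay = initial_s
    while True:
        yield delay
        delay = min(delay * 2, max_s)


async def run_with_reconnect(
    listen_once: Callable[[], Awaitable[None]],
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> None:
    """Call listen_once repeatedly, backing off after connection errors.

    listen_once is a hypothetical coroutine that blocks while connected
    and raises OSError (or a subclass such as ConnectionError) on loss.
    """
    delays = backoff_delays()
    while True:
        try:
            await listen_once()
            delays = backoff_delays()  # clean return: start over at 1 s
        except OSError:
            # asyncio.CancelledError is not an OSError, so cancellation
            # is never swallowed here and propagates to the caller.
            await sleep(next(delays))
```

Passing sleep as a parameter is only a convenience for testing the delay sequence without real waiting.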
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Adds an auto-update trigger for the updated_at column on the adapters
table and a partial index for efficient enabled-adapter queries.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add central-cli with config-store-check command that:
- Connects via bootstrap config
- Lists adapters from config store
- Verifies crypto round-trip
Updates pyproject.toml with new dependencies:
- pydantic-settings>=2.7.0
- cryptography>=44.0.0
New entry points:
- central-migrate
- central-cli
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add ConfigStore class providing async access to config schema:
- get_adapter/list_adapters/upsert_adapter for adapter config
- pause_adapter/unpause_adapter for runtime control
- set_api_key/get_api_key with encryption via crypto.py
- listen_for_changes using Postgres LISTEN/NOTIFY
Includes Pydantic models (AdapterConfig, ApiKeyInfo) and tests
using real Postgres test database.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add encrypt/decrypt functions using AES-256-GCM for secret storage.
Master key loaded from file path specified in bootstrap config.
Features:
- 32-byte key from base64-encoded file
- 12-byte random nonce per encryption
- AEAD authentication (detects tampering)
- Key caching with clear_key_cache() for rotation
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add pydantic-settings based Settings class for loading configuration
from environment variables or .env file. Provides early-stage config
before database-backed config store is available.
Includes:
- CENTRAL_DB_DSN, CENTRAL_NATS_URL, CENTRAL_MASTER_KEY_PATH, CENTRAL_LOG_LEVEL
- Cached loader with get_settings()
- Tests for env vars, .env file, validation, caching
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Rename extension attributes for consistency with project naming:
- hubschemaversion → centralschemaversion
- hubcategory → centralcategory
- hubseverity → centralseverity
Non-breaking change - no consumers depend on these names yet.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>