# RECON Dashboard Service Integration
Add a management UI for a remote service to a Flask/FastAPI dashboard. The pattern: SSH key trust between the dashboard host and the target, scoped sudoers for specific commands, a REST API layer (`GET /api/services/{slug}/status` + `POST /api/services/{slug}/{action}`), and a frontend panel with status indicator, action buttons, and live feedback.

Use this when a service running on a remote LXC/VM needs a web management interface (start/stop/restart, status checks, log tailing, or config hot-reload) so that nobody has to SSH into the box manually.

---
## Prerequisites
- A running Flask or FastAPI dashboard (e.g., RECON on VM 131, WATCHTOWER on Contabo)
- The target service running on a reachable host (LXC, VM, or bare metal)
- SSH access from the dashboard host to the target host
- The dashboard runs as a known user (e.g., `zvx`, `recon`, `watchtower`)
---
## Inputs
Prompt the user for all of these before executing:
```
DASHBOARD_HOST= # Host running the dashboard (e.g., "192.168.1.130", "VM 131")
DASHBOARD_USER= # User the dashboard runs as (e.g., "zvx")
DASHBOARD_APP_PATH= # Path to the dashboard app (e.g., "/opt/recon/lib/api.py")
DASHBOARD_STATIC_PATH= # Path to frontend files (e.g., "/opt/recon/lib/static/")
TARGET_HOST= # Host running the service to manage (e.g., "192.168.1.170")
TARGET_USER= # User to SSH as on the target (e.g., "zvx")
SERVICE_NAME= # systemd service name (e.g., "peertube", "pt-downloader")
SERVICE_DISPLAY_NAME= # Human-readable name for the UI (e.g., "PeerTube", "Downloader")
SERVICE_SLUG= # URL-safe slug (e.g., "peertube", "downloader")
ALLOWED_ACTIONS= # Comma-separated actions (e.g., "start,stop,restart,status,logs")
```
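The inputs can be sanity-checked before any step runs. The following is a minimal sketch, not part of RECON: `build_entry` is a hypothetical helper, and the slug rule (lowercase letters, digits, hyphens) is an assumption about what "URL-safe" should mean here. It turns the collected inputs into an entry shaped like the `SERVICE_INTEGRATIONS` dict used in Step 3.

```python
import re

# Assumption: a URL-safe slug is lowercase letters, digits, and single hyphens.
SLUG_RE = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")
# Actions the runbook's API layer knows how to run.
KNOWN_ACTIONS = {"start", "stop", "restart", "status", "logs"}


def build_entry(inputs: dict) -> dict:
    """Validate the runbook inputs and return a SERVICE_INTEGRATIONS-style entry."""
    slug = inputs["SERVICE_SLUG"]
    if not SLUG_RE.match(slug):
        raise ValueError(f"SERVICE_SLUG is not URL-safe: {slug!r}")

    # Split the comma-separated list and drop empty items.
    actions = [a.strip() for a in inputs["ALLOWED_ACTIONS"].split(",") if a.strip()]
    unknown = sorted(set(actions) - KNOWN_ACTIONS)
    if unknown:
        raise ValueError(f"Unsupported actions: {unknown}")

    return {
        slug: {
            "display_name": inputs["SERVICE_DISPLAY_NAME"],
            "target_host": f"{inputs['TARGET_USER']}@{inputs['TARGET_HOST']}",
            "service_name": inputs["SERVICE_NAME"],
            "allowed_actions": actions,
        }
    }
```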
---
## Step 1: Set Up SSH Key Trust
The dashboard host must be able to SSH to the target without a password prompt.
### Generate key (if not already present)
```bash
ssh $DASHBOARD_HOST "test -f /home/$DASHBOARD_USER/.ssh/id_ed25519 || \
ssh-keygen -t ed25519 -N '' -f /home/$DASHBOARD_USER/.ssh/id_ed25519"
```
### Copy public key to target
```bash
# Get the public key
PUBKEY=$(ssh $DASHBOARD_HOST "cat /home/$DASHBOARD_USER/.ssh/id_ed25519.pub")
# Add to target's authorized_keys
ssh $TARGET_HOST "mkdir -p /home/$TARGET_USER/.ssh && \
echo '$PUBKEY' >> /home/$TARGET_USER/.ssh/authorized_keys && \
chmod 700 /home/$TARGET_USER/.ssh && \
chmod 600 /home/$TARGET_USER/.ssh/authorized_keys && \
chown -R $TARGET_USER:$TARGET_USER /home/$TARGET_USER/.ssh"
```
### Gate
```bash
ssh $DASHBOARD_HOST "ssh -o BatchMode=yes -o ConnectTimeout=5 $TARGET_USER@$TARGET_HOST 'hostname'"
```
Must return the target hostname without prompting for a password. If it fails:
- "Permission denied (publickey)" → key not in authorized_keys, or wrong user
- "Host key verification failed" → add `-o StrictHostKeyChecking=accept-new` for first connection
- Timeout → network issue, firewall, or wrong IP
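
If the gate is scripted, the three failure cases above can be told apart from the `ssh` error text. This is a rough sketch under the assumption that stderr contains OpenSSH's usual English messages; `classify_ssh_failure` is a hypothetical name, not an existing helper.

```python
def classify_ssh_failure(stderr: str, timed_out: bool = False) -> str:
    """Map ssh stderr text to the likely cause listed in the gate above."""
    text = stderr.lower()
    if "permission denied" in text:
        return "key not in authorized_keys, or wrong user"
    if "host key verification failed" in text:
        return "unknown host key: retry with -o StrictHostKeyChecking=accept-new"
    if timed_out or "timed out" in text:
        return "network issue, firewall, or wrong IP"
    return "unrecognized failure: inspect stderr"
```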
---
## Step 2: Configure Scoped Sudoers on Target
Grant the target user passwordless sudo for **only** the specific commands the dashboard needs. Never use `NOPASSWD: ALL` for service integrations.
```bash
ssh root@$TARGET_HOST "cat > /etc/sudoers.d/${SERVICE_SLUG}-mgmt << 'SUDOERS'
# Allow $TARGET_USER to manage $SERVICE_NAME via dashboard
$TARGET_USER ALL=(ALL) NOPASSWD: /usr/bin/systemctl start $SERVICE_NAME
$TARGET_USER ALL=(ALL) NOPASSWD: /usr/bin/systemctl stop $SERVICE_NAME
$TARGET_USER ALL=(ALL) NOPASSWD: /usr/bin/systemctl restart $SERVICE_NAME
$TARGET_USER ALL=(ALL) NOPASSWD: /usr/bin/systemctl status $SERVICE_NAME
$TARGET_USER ALL=(ALL) NOPASSWD: /usr/bin/journalctl -u $SERVICE_NAME *
SUDOERS
chmod 440 /etc/sudoers.d/${SERVICE_SLUG}-mgmt"
```
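When several services need sudoers files, the rules above could be generated rather than typed. A minimal sketch, assuming a hypothetical `render_sudoers` helper and assuming service names contain only letters, digits, and `_ . @ -`; the output should still be checked with `visudo -cf` before it is installed.

```python
import re
from typing import Iterable

# Assumption: unit names are limited to these characters, which keeps
# whitespace and sudoers metacharacters out of the generated rules.
UNIT_RE = re.compile(r"^[A-Za-z0-9_.@-]+$")
SYSTEMCTL_ACTIONS = ("start", "stop", "restart", "status")


def render_sudoers(user: str, service: str, actions: Iterable[str]) -> str:
    """Return sudoers rules for the given actions, one rule per line."""
    if not UNIT_RE.match(service):
        raise ValueError(f"Unexpected characters in service name: {service!r}")

    lines = [f"# Allow {user} to manage {service} via dashboard"]
    for action in actions:
        if action in SYSTEMCTL_ACTIONS:
            lines.append(f"{user} ALL=(ALL) NOPASSWD: /usr/bin/systemctl {action} {service}")
        elif action == "logs":
            lines.append(f"{user} ALL=(ALL) NOPASSWD: /usr/bin/journalctl -u {service} *")
        else:
            raise ValueError(f"No sudoers rule defined for action: {action!r}")
    # sudoers files should end with a newline.
    return "\n".join(lines) + "\n"
```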
### Gate
```bash
ssh $DASHBOARD_HOST "ssh $TARGET_USER@$TARGET_HOST 'sudo systemctl status $SERVICE_NAME'"
```
Must return service status without a password prompt. If you see "sudo: a password is required", the sudoers file has a syntax error or isn't being loaded; check it as root on the target with `visudo -cf /etc/sudoers.d/${SERVICE_SLUG}-mgmt`.

---
## Step 3: Add API Endpoints
Add REST endpoints to the dashboard app for status checks and actions.
### Flask pattern
```python
import shlex
import subprocess

from flask import jsonify

# `app` is the dashboard's existing Flask instance.

SERVICE_INTEGRATIONS = {
    '$SERVICE_SLUG': {
        'display_name': '$SERVICE_DISPLAY_NAME',
        'target_host': '$TARGET_USER@$TARGET_HOST',
        'service_name': '$SERVICE_NAME',
        'allowed_actions': ['start', 'stop', 'restart', 'status', 'logs'],
    },
}


def ssh_cmd(host: str, cmd: str, timeout: int = 10) -> dict:
    """Execute a command on a remote host via SSH."""
    # Quote host and command so the local shell passes each through as one argument.
    full_cmd = (
        "ssh -o BatchMode=yes -o ConnectTimeout=5 "
        f"{shlex.quote(host)} {shlex.quote(cmd)}"
    )
    try:
        result = subprocess.run(
            full_cmd, shell=True, capture_output=True, text=True, timeout=timeout
        )
        return {
            'success': result.returncode == 0,
            'stdout': result.stdout.strip(),
            'stderr': result.stderr.strip(),
            'exit_code': result.returncode,
        }
    except subprocess.TimeoutExpired:
        return {'success': False, 'stdout': '', 'stderr': 'SSH command timed out', 'exit_code': -1}


@app.route('/api/services/<slug>/status')
def api_service_status(slug):
    svc = SERVICE_INTEGRATIONS.get(slug)
    if not svc:
        return jsonify({'error': 'Unknown service'}), 404
    result = ssh_cmd(svc['target_host'], f"sudo systemctl status {svc['service_name']}")
    # ssh itself exits 255 on connection or auth failure; ssh_cmd reports -1 on timeout.
    if result['exit_code'] in (255, -1):
        return jsonify({'error': 'Target unreachable', 'active': False, 'raw': result['stderr']}), 502
    # Parse systemctl status output
    active = 'active (running)' in result['stdout']
    return jsonify({
        'service': svc['display_name'],
        'active': active,
        'raw': result['stdout'],
    })


@app.route('/api/services/<slug>/<action>', methods=['POST'])
def api_service_action(slug, action):
    svc = SERVICE_INTEGRATIONS.get(slug)
    if not svc:
        return jsonify({'error': 'Unknown service'}), 404
    # Reject anything outside the per-service allowlist before building a command.
    if action not in svc['allowed_actions']:
        return jsonify({'error': f'Action {action} not allowed'}), 403
    if action == 'logs':
        result = ssh_cmd(
            svc['target_host'],
            f"sudo journalctl -u {svc['service_name']} -n 50 --no-pager",
            timeout=15,
        )
    elif action in ('start', 'stop', 'restart'):
        result = ssh_cmd(svc['target_host'], f"sudo systemctl {action} {svc['service_name']}")
    elif action == 'status':
        result = ssh_cmd(svc['target_host'], f"sudo systemctl status {svc['service_name']}")
    else:
        return jsonify({'error': 'Unknown action'}), 400
    return jsonify(result)
```
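The substring check in the status endpoint only distinguishes "running" from everything else. If the panel should also tell a failed unit from a stopped one, the `Active:` line of the `systemctl status` output can be parsed. This is an illustrative sketch, not RECON's parser; `parse_active_line` is a hypothetical name and the expected line layout (`Active: <state> (<sub-state>) ...`) is an assumption about systemd's human-readable output.

```python
import re

# Matches e.g. "     Active: active (running) since ..." or "Active: inactive (dead)".
ACTIVE_RE = re.compile(r"^\s*Active:\s+(\S+)(?:\s+\(([^)]*)\))?", re.MULTILINE)


def parse_active_line(status_output: str) -> dict:
    """Extract state and sub-state from `systemctl status` text."""
    match = ACTIVE_RE.search(status_output)
    if not match:
        return {"state": "unknown", "sub_state": None, "active": False}
    state, sub_state = match.group(1), match.group(2)
    return {
        "state": state,
        "sub_state": sub_state,
        # Same meaning as the 'active (running)' substring check.
        "active": state == "active" and sub_state == "running",
    }
```

A status handler could return `state` alongside `active` so the frontend can show "Failed" separately from "Stopped".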
### FastAPI pattern
Same logic, different decorators:
```python
# `app` is the dashboard's FastAPI instance.
@app.get('/api/services/{slug}/status')
async def api_service_status(slug: str):
    # Same implementation, wrapped in run_in_executor for async
    ...


@app.post('/api/services/{slug}/{action}')
async def api_service_action(slug: str, action: str):
    # Same implementation
    ...
```
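The "wrapped in run_in_executor" part can be sketched with the standard library alone. The example below is an assumption-laden illustration: `run_blocking` is a hypothetical helper, and `blocking_probe` merely stands in for the blocking `ssh_cmd` call so the snippet runs without a remote host.

```python
import asyncio
import functools
import time


def blocking_probe(delay: float) -> dict:
    """Stand-in for ssh_cmd: blocks the calling thread, then returns a result dict."""
    time.sleep(delay)
    return {"success": True, "stdout": "ok", "stderr": "", "exit_code": 0}


async def run_blocking(func, *args, **kwargs):
    """Run a blocking callable in the default thread pool so the event loop stays free."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, functools.partial(func, *args, **kwargs))


async def main() -> list:
    # The two blocking calls run in worker threads and overlap.
    return await asyncio.gather(
        run_blocking(blocking_probe, 0.1),
        run_blocking(blocking_probe, 0.1),
    )


results = asyncio.run(main())
```

Inside an async endpoint the call would look like `result = await run_blocking(ssh_cmd, host, cmd)`, which keeps one slow SSH session from stalling other requests.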
### Gate
Restart the dashboard and test:
```bash
curl -s http://$DASHBOARD_HOST:8420/api/services/$SERVICE_SLUG/status | python3 -m json.tool
```
Must return JSON with `active: true/false` and service details.
```bash
curl -s -X POST http://$DASHBOARD_HOST:8420/api/services/$SERVICE_SLUG/restart | python3 -m json.tool
```
Must return `success: true`.

---
## Step 4: Add Frontend Panel
Add a service management panel to the dashboard UI. This goes in the appropriate tab (e.g., Upload, Dashboard, or a new Services tab).
```html
<!-- Service Management Panel: $SERVICE_DISPLAY_NAME -->
<div class="service-panel" id="panel-$SERVICE_SLUG">
  <h3>$SERVICE_DISPLAY_NAME</h3>
  <!-- Status indicator -->
  <div class="status-row">
    <span class="status-dot" id="status-$SERVICE_SLUG"></span>
    <span id="status-text-$SERVICE_SLUG">Checking...</span>
    <button onclick="refreshStatus('$SERVICE_SLUG')" class="btn-sm">Refresh</button>
  </div>
  <!-- Action buttons -->
  <div class="action-buttons">
    <button onclick="serviceAction('$SERVICE_SLUG', 'restart')" class="btn btn-warning">Restart</button>
    <button onclick="serviceAction('$SERVICE_SLUG', 'stop')" class="btn btn-danger">Stop</button>
    <button onclick="serviceAction('$SERVICE_SLUG', 'start')" class="btn btn-success">Start</button>
    <button onclick="serviceAction('$SERVICE_SLUG', 'logs')" class="btn btn-info">View Logs</button>
  </div>
  <!-- Feedback area -->
  <pre id="feedback-$SERVICE_SLUG" class="feedback-box" style="display:none;"></pre>
</div>
```
```javascript
// Service management JS
async function refreshStatus(slug) {
  const dot = document.getElementById(`status-${slug}`);
  const text = document.getElementById(`status-text-${slug}`);
  try {
    const resp = await fetch(`/api/services/${slug}/status`);
    // Treat HTTP errors (404, 5xx) the same as a failed fetch
    if (!resp.ok) throw new Error(`HTTP ${resp.status}`);
    const data = await resp.json();
    dot.className = data.active ? 'status-dot active' : 'status-dot inactive';
    text.textContent = data.active ? 'Running' : 'Stopped';
  } catch (e) {
    dot.className = 'status-dot error';
    text.textContent = 'Unreachable';
  }
}

async function serviceAction(slug, action) {
  const feedback = document.getElementById(`feedback-${slug}`);
  feedback.style.display = 'block';
  feedback.textContent = `Executing ${action}...`;
  try {
    const resp = await fetch(`/api/services/${slug}/${action}`, { method: 'POST' });
    const data = await resp.json();
    // Error responses (403/404) carry an `error` field instead of stdout/stderr
    feedback.textContent =
      data.stdout || data.stderr || data.error || (data.success ? 'Done' : 'Failed');
    // Refresh status after action
    if (['start', 'stop', 'restart'].includes(action)) {
      setTimeout(() => refreshStatus(slug), 2000);
    }
  } catch (e) {
    feedback.textContent = `Error: ${e.message}`;
  }
}

// Refresh every panel on the page
function refreshAll() {
  document.querySelectorAll('.service-panel').forEach(panel => {
    const slug = panel.id.replace('panel-', '');
    refreshStatus(slug);
  });
}

// Auto-refresh status every 30 seconds
setInterval(refreshAll, 30000);

// Initial load
document.addEventListener('DOMContentLoaded', refreshAll);
```
```css
.status-dot {
  display: inline-block;
  width: 12px;
  height: 12px;
  border-radius: 50%;
  margin-right: 8px;
}
.status-dot.active { background: #22c55e; }
.status-dot.inactive { background: #ef4444; }
.status-dot.error { background: #f59e0b; }
.feedback-box {
  background: #1e1e2e;
  color: #cdd6f4;
  padding: 12px;
  border-radius: 4px;
  max-height: 300px;
  overflow-y: auto;
  font-size: 12px;
  margin-top: 8px;
}
```
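The `$SERVICE_SLUG` and `$SERVICE_DISPLAY_NAME` placeholders in the panel markup happen to match Python's `string.Template` syntax, so one panel per service could be stamped out on the server instead of pasted by hand. A minimal sketch under that assumption: `render_panels` is a hypothetical helper, and the template below is an abridged copy of the panel (status row only), not the full markup.

```python
from html import escape
from string import Template

# Abridged panel: the real template would include the buttons and feedback box.
PANEL_TEMPLATE = Template("""\
<div class="service-panel" id="panel-$SERVICE_SLUG">
  <h3>$SERVICE_DISPLAY_NAME</h3>
  <span class="status-dot" id="status-$SERVICE_SLUG"></span>
  <span id="status-text-$SERVICE_SLUG">Checking...</span>
</div>
""")


def render_panels(integrations: dict) -> str:
    """Render one panel per registry entry, HTML-escaping the substituted values."""
    return "".join(
        PANEL_TEMPLATE.substitute(
            SERVICE_SLUG=escape(slug, quote=True),
            SERVICE_DISPLAY_NAME=escape(svc["display_name"]),
        )
        for slug, svc in integrations.items()
    )
```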
---
## Step 5: Verify End-to-End
1. **Load the dashboard** in a browser: `http://$DASHBOARD_HOST:8420/`
2. **Check status indicator**: should show green dot + "Running" (or red + "Stopped")
3. **Click Restart**: feedback box should show systemctl output, status should flip briefly then return to Running
4. **Click View Logs**: should show last 50 journal lines
5. **Click Stop**: status should change to Stopped (red)
6. **Click Start**: status should change to Running (green)
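
Steps 3, 5, and 6 each wait for a state change, which a scripted check can express as polling. The sketch below is illustrative only: `wait_for_state` is a hypothetical helper, and `fetch_active` is any callable that returns the `active` flag, for example a small wrapper around the status endpoint.

```python
import time
from typing import Callable


def wait_for_state(
    fetch_active: Callable[[], bool],
    want_active: bool,
    timeout: float = 30.0,
    interval: float = 1.0,
) -> bool:
    """Poll fetch_active() until it returns want_active or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if fetch_active() == want_active:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```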
---
## Adding More Services
To integrate a second service, repeat Steps 1-4 with new inputs. The `SERVICE_INTEGRATIONS` dict supports multiple entries:
```python
SERVICE_INTEGRATIONS = {
    'peertube': { ... },
    'downloader': {
        'display_name': 'Bulk Downloader',
        'target_host': 'zvx@192.168.1.170',
        'service_name': 'pt-downloader',
        'allowed_actions': ['start', 'stop', 'restart', 'status', 'logs'],
    },
    'transcoder': {
        'display_name': 'H.265 Transcoder',
        'target_host': 'zvx@192.168.1.150',
        'service_name': 'pt-transcoder',
        'allowed_actions': ['start', 'stop', 'restart', 'status', 'logs'],
    },
}
```
Each service gets its own panel in the UI, its own sudoers file on the target, and its own API paths, all served by the generic `/api/services/<slug>/<action>` routes.

---
## Security Considerations
- **Scoped sudoers**: Only allow the specific `systemctl` and `journalctl` commands needed. Never `NOPASSWD: ALL`.
- **SSH BatchMode**: `BatchMode=yes` ensures SSH never falls back to an interactive password prompt. If key auth fails, the command fails immediately.
- **Action allowlist**: The `allowed_actions` list prevents the API from executing arbitrary commands. Only listed actions are accepted.
- **No shell injection**: Use `shlex.quote()` on any user-provided or variable input before passing to `subprocess.run(shell=True)`. Or use `subprocess.run(cmd_list)` with a list to avoid shell entirely.
- **Timeout on SSH**: Always set `-o ConnectTimeout` and `subprocess.run(timeout=)` to prevent the dashboard from hanging on network issues.
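
The allowlist and the list-based `subprocess.run` variant mentioned above can be combined. This is a sketch under stated assumptions, not the dashboard's implementation: `ACTION_COMMANDS`, `build_ssh_argv`, and `run_action` are hypothetical names, and the command strings mirror the ones used in Step 3.

```python
import shlex
import subprocess

# Fixed remote command per action; the service name is the only substituted field.
ACTION_COMMANDS = {
    "start": "sudo systemctl start {service}",
    "stop": "sudo systemctl stop {service}",
    "restart": "sudo systemctl restart {service}",
    "status": "sudo systemctl status {service}",
    "logs": "sudo journalctl -u {service} -n 50 --no-pager",
}


def build_ssh_argv(host: str, action: str, service: str, allowed: list) -> list:
    """Return the ssh argument list for an allowed action, or raise PermissionError."""
    if action not in allowed or action not in ACTION_COMMANDS:
        raise PermissionError(f"Action {action!r} not allowed")
    # ssh hands the command string to the remote shell, so quote the
    # service name for that shell even though no local shell is involved.
    remote_cmd = ACTION_COMMANDS[action].format(service=shlex.quote(service))
    return ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5", host, remote_cmd]


def run_action(host: str, action: str, service: str, allowed: list, timeout: int = 10):
    """Run the action without a local shell; raises subprocess.TimeoutExpired on timeout."""
    argv = build_ssh_argv(host, action, service, allowed)
    return subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
```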
---
## Troubleshooting
### Status shows "Unreachable"
SSH from the dashboard host to the target is failing. Test manually:
```bash
ssh -o BatchMode=yes -o ConnectTimeout=5 $TARGET_USER@$TARGET_HOST 'hostname'
```
Common causes: SSH key not deployed, wrong user, firewall, target host down.
### "sudo: a password is required"
The sudoers file isn't working. Check:
```bash
ssh root@$TARGET_HOST "visudo -cf /etc/sudoers.d/${SERVICE_SLUG}-mgmt"
```
Must say "parsed OK". Also verify the username in the sudoers file matches `$TARGET_USER`.
### Actions work via curl but not from the browser
This usually points to CORS, which only applies when the page is served from a different origin than the API. Add CORS headers to the API:
```python
# Flask
from flask_cors import CORS
CORS(app)

# Or manually:
@app.after_request
def add_cors(response):
    response.headers['Access-Control-Allow-Origin'] = '*'
    response.headers['Access-Control-Allow-Methods'] = 'GET, POST'
    return response
```
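Because these endpoints can stop and start services, a wildcard origin may be broader than intended. One alternative, sketched here with a hypothetical `cors_headers` helper and a placeholder origin list, is to echo the request's `Origin` only when it is on an allowlist.

```python
# Assumption: the dashboard UI is served from these origins (placeholder values).
ALLOWED_ORIGINS = {"http://192.168.1.130:8420"}


def cors_headers(request_origin, allowed_origins=ALLOWED_ORIGINS) -> dict:
    """Return CORS headers for an allowlisted origin, or no headers at all."""
    if request_origin in allowed_origins:
        return {
            "Access-Control-Allow-Origin": request_origin,
            "Access-Control-Allow-Methods": "GET, POST",
            # Tell caches that the response depends on the Origin header.
            "Vary": "Origin",
        }
    return {}
```

In the manual `after_request` hook, the returned dict would be copied into `response.headers`.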
### Dashboard hangs when target host is down
The SSH timeout isn't working, or it's set too high. Ensure both `-o ConnectTimeout=5` (SSH) and `timeout=10` (subprocess) are set. The subprocess timeout is the hard limit.
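The hard-limit behaviour can be observed without a remote host. In this sketch a local child process that sleeps for five seconds stands in for an SSH session that never returns; `run_with_hard_limit` is a hypothetical wrapper that mirrors the `exit_code: -1` convention of `ssh_cmd`.

```python
import subprocess
import sys
import time


def run_with_hard_limit(argv: list, timeout: float) -> dict:
    """Run argv; report exit_code -1 if it outlives the subprocess timeout."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
        return {"success": proc.returncode == 0, "exit_code": proc.returncode}
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired.
        return {"success": False, "exit_code": -1}


# A child that sleeps for 5 s stands in for a hung SSH session.
started = time.monotonic()
outcome = run_with_hard_limit(
    [sys.executable, "-c", "import time; time.sleep(5)"], timeout=0.5
)
elapsed = time.monotonic() - started
```

The call returns shortly after the 0.5 s limit rather than after the child's full sleep, which is the behaviour the dashboard relies on when a target is down.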
### Log output is truncated
The `-n 50` flag limits journalctl output. Increase it, or add a `lines` query parameter:
```python
lines = request.args.get('lines', 50, type=int)  # requires `from flask import request`
lines = max(1, min(lines, 500))  # Clamp to 1..500 to prevent abuse
```
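The clamped value then has to reach the `journalctl` command in place of the fixed `-n 50`. A framework-free sketch of that step, assuming a hypothetical `build_logs_command` helper and the same default and cap as above:

```python
def clamp_lines(raw, default: int = 50, cap: int = 500) -> int:
    """Convert a raw query value to an int in the range 1..cap, or the default."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        return default
    return max(1, min(value, cap))


def build_logs_command(service_name: str, raw_lines=None) -> str:
    """Build the remote journalctl command with a bounded line count."""
    return f"sudo journalctl -u {service_name} -n {clamp_lines(raw_lines)} --no-pager"
```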
---
## Usage Examples
### RECON managing pipeline services (VM 131 dashboard → CT 110 PeerTube)
```
DASHBOARD_HOST=192.168.1.130 (VM 131, data node)
DASHBOARD_USER=zvx
TARGET_HOST=192.168.1.170 (CT 110, media node)
SERVICE_NAME=peertube
SERVICE_SLUG=peertube
Sudoers on CT 110:
  zvx ALL=(ALL) NOPASSWD: /usr/bin/systemctl start peertube
  zvx ALL=(ALL) NOPASSWD: /usr/bin/systemctl stop peertube
  zvx ALL=(ALL) NOPASSWD: /usr/bin/systemctl restart peertube
  zvx ALL=(ALL) NOPASSWD: /usr/bin/systemctl status peertube
  zvx ALL=(ALL) NOPASSWD: /usr/bin/journalctl -u peertube *
API endpoints:
  GET /api/services/peertube/status → returns active/inactive + raw systemctl output
  POST /api/services/peertube/restart → restarts PeerTube, returns success/failure
  POST /api/services/peertube/logs → returns last 50 journal lines
Dashboard panel: green/red dot + Restart/Stop/Start/Logs buttons + feedback box
```
### WATCHTOWER monitoring remote services (Contabo → multiple hosts)
```
DASHBOARD_HOST=5.189.158.149 (Contabo)
DASHBOARD_USER=root
Services managed:
- peertube (CT 110): start/stop/restart/status/logs
- pt-downloader (CT 110): start/stop/restart/status/logs
- pt-importer (CT 110): start/stop/restart/status/logs
- pt-transcoder (cortex): start/stop/restart/status/logs
- recon (VM 131): start/stop/restart/status/logs
Each service has its own sudoers file on its target host,
its own entry in SERVICE_INTEGRATIONS, and its own UI panel.
```
---
*Last updated: 2026-04-19 — Updated CT 130 references to VM 131*