# Phase 6b: Dashboard "Untitled / WEB" Bug Fix
## Bug Description
The RECON dashboard's "Recently Completed" table showed every transcript as:
- **Title:** "Untitled" (instead of the real video title)
- **Type:** WEB badge (instead of a transcript indicator)
The Sources table also tagged `stream.echo6.co` as WEB type.
## Root Causes
### Bug 1: Title = "Untitled"
The Recently Completed SQL query read `documents.book_title`:
```sql
SELECT book_title, ... FROM documents WHERE status = 'complete' ...
```
`book_title` is populated by the PDF metadata voting pipeline (Gemini
extracts title/author from PDF content). Transcripts never go through
metadata voting, so `book_title` is always NULL for them. The Python-side
fallback `r['book_title'] or 'Untitled'` then kicked in.
The real title was available in `catalogue.filename` all along.
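A minimal sketch of this failure, using an in-memory SQLite database. The reduced schema and the sample rows (including the "Field Manual" title and the PDF path) are assumptions made for illustration; only `book_title`, `status`, and the `or 'Untitled'` fallback come from the description above.

```python
import sqlite3

# Hypothetical, reduced schema: only the columns needed to show the bug.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute(
    "CREATE TABLE documents (hash TEXT, path TEXT, book_title TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO documents VALUES (?, ?, ?, ?)",
    [
        # A PDF that went through metadata voting has a book_title.
        ("h1", "/library/manual.pdf", "Field Manual", "complete"),
        # A transcript never goes through voting, so book_title stays NULL.
        ("h2", "https://stream.echo6.co/w/<uuid>", None, "complete"),
    ],
)

rows = conn.execute(
    "SELECT book_title FROM documents WHERE status = 'complete' ORDER BY hash"
).fetchall()

# The Python-side fallback turns every NULL title into 'Untitled'.
titles = [r["book_title"] or "Untitled" for r in rows]
print(titles)
```

The PDF row keeps its voted title, while the transcript row has nothing to show except the fallback string.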
### Bug 2: Type = "WEB"
The type was computed by a SQL CASE expression:
```sql
CASE WHEN path LIKE 'http%' THEN 'web' ELSE 'pdf' END as type
```
Since transcript `documents.path` holds the PeerTube watch URL
(`https://stream.echo6.co/w/<uuid>`), it matched `http%` and was
classified as `web`. The JavaScript then rendered `<span class="badge-web">WEB</span>`.
There was no `transcript` type — only `web` and `pdf` existed.
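The misclassification can be reproduced with the original CASE expression alone. This is a sketch against an in-memory SQLite table; the single-column table and the PDF path are assumptions for illustration, while the CASE expression and the watch-URL shape are taken from the text above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical one-column table: the old CASE only ever looked at path.
conn.execute("CREATE TABLE documents (path TEXT)")
conn.executemany(
    "INSERT INTO documents VALUES (?)",
    [
        ("/library/manual.pdf",),                # local PDF
        ("https://stream.echo6.co/w/<uuid>",),   # transcript watch URL
    ],
)

OLD_TYPE_QUERY = """
    SELECT CASE WHEN path LIKE 'http%' THEN 'web' ELSE 'pdf' END AS type
    FROM documents
    ORDER BY rowid
"""

# Any path starting with "http" becomes 'web', transcripts included.
old_types = [row[0] for row in conn.execute(OLD_TYPE_QUERY)]
print(old_types)
```

Because the expression has only two outcomes, a URL-backed transcript has no way to be labelled anything other than `web`.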
## Fix Applied
### api.py — Recently Completed query
```sql
-- Before:
SELECT book_title, ...
       CASE WHEN path LIKE 'http%' THEN 'web' ELSE 'pdf' END as type
FROM documents WHERE status = 'complete' ...

-- After:
SELECT COALESCE(d.book_title, c.filename) as title, ...
       CASE
           WHEN c.source = 'stream.echo6.co' THEN 'transcript'
           WHEN d.path LIKE 'http%' THEN 'web'
           ELSE 'pdf'
       END as type
FROM documents d
JOIN catalogue c ON d.hash = c.hash
WHERE d.status = 'complete' ...
```
The Python mapping now reads `r['title'] or 'Untitled'` (previously `r['book_title'] or 'Untitled'`).
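The fixed query can be exercised end to end with a small in-memory SQLite sketch. The reduced two-table schema, the `local-library` source name, and the PDF row are assumptions for illustration; the `COALESCE`, the `CASE` branches, the join on `hash`, and the `or 'Untitled'` mapping follow the fix described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
# Hypothetical, reduced schema covering only the columns the query touches.
conn.executescript("""
    CREATE TABLE documents (hash TEXT, path TEXT, book_title TEXT, status TEXT);
    CREATE TABLE catalogue (hash TEXT, filename TEXT, source TEXT);
""")
conn.executemany(
    "INSERT INTO documents VALUES (?, ?, ?, ?)",
    [
        ("h1", "/library/manual.pdf", "Field Manual", "complete"),
        ("h2", "https://stream.echo6.co/w/<uuid>", None, "complete"),
    ],
)
conn.executemany(
    "INSERT INTO catalogue VALUES (?, ?, ?)",
    [
        ("h1", "manual.pdf", "local-library"),      # assumed source name
        ("h2", "Pond Power", "stream.echo6.co"),
    ],
)

FIXED_QUERY = """
    SELECT COALESCE(d.book_title, c.filename) AS title,
           CASE
               WHEN c.source = 'stream.echo6.co' THEN 'transcript'
               WHEN d.path LIKE 'http%' THEN 'web'
               ELSE 'pdf'
           END AS type
    FROM documents d
    JOIN catalogue c ON d.hash = c.hash
    WHERE d.status = 'complete'
    ORDER BY d.hash
"""

# Same Python-side mapping as the fix: fall back only if title is still empty.
results = [(r["title"] or "Untitled", r["type"]) for r in conn.execute(FIXED_QUERY)]
print(results)
```

Two properties of this query are worth keeping in mind. `COALESCE` falls back only on NULL, so an empty-string `book_title` would pass through and be caught by the Python `or 'Untitled'` instead. The inner `JOIN` also returns only documents that have a matching `catalogue` row.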
### api.py — Sources query
Same CASE expression fix applied to the sources aggregation query (line 885).
### dashboard.js — Badge rendering
Added transcript case to both badge-rendering locations (sources table
and recent completions table):
```javascript
var badge = r.type === 'transcript'
    ? '<span class="badge-transcript">TRANSCRIPT</span>'
    : r.type === 'web'
        ? '<span class="badge-web">WEB</span>'
        : '<span class="badge-pdf">PDF</span>';
```
### recon.css — Badge style
```css
.badge-transcript { background: #3b1f5e; color: #c084fc; padding: 2px 8px;
                    border-radius: var(--radius); font-size: 11px; }
```
The badge is purple to distinguish it from the blue (web) and green (pdf) badges.
## Files Changed
| File | Change |
|------|--------|
| `lib/api.py` | COALESCE title fallback, transcript type branch (2 queries) |
| `static/js/dashboard.js` | Transcript badge rendering (2 locations) |
| `static/css/recon.css` | `.badge-transcript` class |
## Verification
### Before (all transcripts)
- Title: "Untitled"
- Type: WEB (blue badge)
### After (sample from API response)
| Title | Type | Concepts | Vectors |
|-------|------|----------|---------|
| Pond Power | transcript | 3 | 3 |
| Prolonged Field Care Podcast 205: Sufentanil | transcript | 6 | 6 |
| Rate and Rhythm: Sinus Bradycardia and Sinus Tachycardia | transcript | 6 | 6 |
| American Persimmon - Identifying Male and Female Flowers | transcript | 2 | 2 |
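A check like the one below could flag rows that still show the old symptoms. It is a sketch only: `find_suspect_rows` is a hypothetical helper, and the rows are assumed to be dicts with `title` and `type` keys matching the fields used by the Python mapping and the badge code.

```python
KNOWN_TYPES = {"transcript", "web", "pdf"}

def find_suspect_rows(rows):
    """Return rows whose title is missing/'Untitled' or whose type is unknown."""
    suspects = []
    for row in rows:
        title = row.get("title")
        if not title or title == "Untitled" or row.get("type") not in KNOWN_TYPES:
            suspects.append(row)
    return suspects

# Two rows from the sample above, plus one row showing the pre-fix symptom.
sample = [
    {"title": "Pond Power", "type": "transcript"},
    {"title": "Prolonged Field Care Podcast 205: Sufentanil", "type": "transcript"},
    {"title": "Untitled", "type": "web"},
]

suspects = find_suspect_rows(sample)
print(suspects)
```

Only the last row is reported, since it still carries the placeholder title.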
### Sources table
- `stream.echo6.co`: type=transcript, catalogued=19133, complete=19132
- PDF sources (35 entries): all correctly typed as `pdf`
- Web sources: 0 (no web-scraped content in current data)
### PDFs unaffected
PDF sources continue to display with correct `book_title` and PDF type badge.
## Commit
- **Commit:** `70b80cb` on `refactor` branch
- **Pushed to:** `forge.echo6.co/matt/recon` (origin/refactor)