Commit graph

2 commits

Author SHA1 Message Date
125602fa13 Fix SingleFile CLI: remove invalid --crawl-delay flag
SingleFile CLI has no --crawl-delay option. The invalid flag caused the
process to print help and exit with no output. Added --crawl-no-parent
and --crawl-replace-URLs instead. Removed unused crawl_delay config key.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-18 19:28:03 +00:00
da50e5f0b8 Add scraper Phase 2: smart crawl mode detection + browser fallback
- Pre-flight detection: wget + Playwright probe to auto-detect if site
  needs browser rendering (JS apps, parking page redirects)
- SingleFile CLI crawl backend for JS-rendered sites
- crawl_mode column in scrape_jobs (static/browser/redirect/auto)
- API: optional crawl_mode param on submit, cleared on retry
- Config: rate_limit_delay 2.0→0.5, /api/ reject pattern, preflight
  + singlefile config sections
- Prerequisites: Node.js 22, single-file-cli, Playwright + Chromium

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-18 18:26:43 +00:00