2026-04-21 03:06:51 +00:00
"""
cleanup: remove dead place_detail orchestrator cluster + lib/google_places.py (post-PR-11 dead code)

PR #11 (cleanup #2) deleted the /api/place* HTTP handlers but left their
orchestrator functions in lib/place_detail.py as dead code. Pre-flight for the
original Task #27 (delete google_places.py) surfaced that _enrich_with_google
is NOT a no-caller leaf: it is called by the unreachable get_place_detail. A
full caller-graph trace showed ~90% of place_detail.py is dead orchestration.

Scope expanded (Matt confirmed in chat) to remove the whole dead cluster:
- lib/google_places.py (entire file)
- place_detail.py: get_place_detail, get_place_by_wikidata, _enrich_with_google,
  _apply_google_data, _enrich_with_overture, _enrich_with_wiki_index,
  _enrich_wiki_links, _parse_nominatim, _parse_nominatim_address, _parse_overpass,
  _build_overpass_query, cache_get, cache_put, _get_db + their now-unused
  imports/constants (json, time, requests, osm_categories, NOMINATIM_URL, etc.)

KEEP only lookup_wiki_index + _get_wiki_index_db (the wiki_enrich_api survivor
path), preserved byte-exact. Module docstring refreshed.

Flagged separately (not touched): overture.py + osm_categories.py are now
orphaned (only consumers were the deleted cluster); stale docstrings; the
deployment_config.py:9 catalog comment.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 10:21:31 -06:00
Wiki-index lookup for place enrichment.

Provides lookup_wiki_index(wikidata_id, name, country_code), a pure read of the
local wiki_index.db, used by the /api/wiki-enrich endpoint (navi-places
HTTP-fetches wiki enrichment instead of reading the 2.1 GB DB directly).
"""
import os
import sqlite3

from .utils import setup_logging

logger = setup_logging('recon.place_detail')

2026-05-21 21:47:52 +00:00
# ── Wiki Index enrichment ───────────────────────────────────────────────

_wiki_index_conn = None

def _get_wiki_index_db():
    global _wiki_index_conn
    if _wiki_index_conn is not None:
        return _wiki_index_conn

    db_path = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "data", "wiki_index.db")
    if not os.path.exists(db_path):
        logger.debug(f"wiki_index.db not found at {db_path}")
        return None

    _wiki_index_conn = sqlite3.connect(db_path, check_same_thread=False)
    _wiki_index_conn.row_factory = sqlite3.Row
    logger.info(f"Wiki index DB ready at {db_path}")
    return _wiki_index_conn

recon: add /api/wiki-enrich endpoint (extraction #5 prep, additive) (#8)

HTTP wrapper over the wiki_index lookup so the (future) navi-places service can
fetch wiki enrichment over HTTP instead of reading recon's 2.1 GB
data/wiki_index.db directly (Phase A option B: HTTP coupling).

  GET /api/wiki-enrich?wikidata=<Qid>              (primary key)
  GET /api/wiki-enrich?name=<name>&country=<cc>    (fallback key)
    -> 200 {wiki_summary?, wiki_population?, wiki_url?, wikivoyage_url?}
    -> 400 if no usable key; 404 on no match. Public (no auth, like /api/place/*).

Route keys are wikidata_id / name+country, NOT osm_type/osm_id, because that
is how wiki_index is actually queried (the in-process _enrich_with_wiki_index
looks up by result['wikidata_id'] then name+country_code, never by OSM id; see
extraction-5-wiki-enrich-investigation.md). An osm-keyed route would have forced
a redundant in-recon place lookup.

Changes (additive only):
- lib/place_detail.py: new standalone lookup_wiki_index(wikidata_id, name,
  country_code) doing the same two SELECTs + field/URL mapping as the
  in-process path, returning a dict or None. Pure DB read, never raises.
  `_enrich_with_wiki_index` is LEFT UNTOUCHED; it can be DRY-refactored to
  delegate to this in a later PR; the in-process enrichment path is unchanged.
- lib/wiki_enrich_api.py: new wiki_enrich_bp blueprint with the route.
- lib/api.py: register the blueprint (one block).
- lib/wiki_enrich_api_test.py: 4 tests (hit-by-wikidata + decoded fields,
  no-match -> 404, name+country fallback, no-key -> 400) over an in-memory
  fixture DB; plain-assert style + __main__ runner (recon venv has no pytest).
  Verified green against recon's venv (flask 3.1.2).

Does NOT remove the in-process _enrich_with_wiki_index call from place_detail;
that happens in a later PR once navi-places is live and serving.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 13:23:08 -06:00
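
The request/response contract described in the commit message above can be sketched as a plain function. This is a minimal, framework-free illustration and not the code in lib/wiki_enrich_api.py: `wiki_enrich_response`, the `lookup` parameter, `fake_lookup`, and the error bodies are all hypothetical. Only the query keys (`wikidata`, `name`, `country`) and the 200/400/404 status codes come from the commit message.

```python
def wiki_enrich_response(args, lookup):
    """Map query args to (status, body) following the /api/wiki-enrich contract.

    `args` is a dict of query parameters; `lookup` is any callable shaped like
    lookup(wikidata_id=..., name=..., country_code=...) -> dict or None.
    Both names are illustrative assumptions.
    """
    wikidata = (args.get("wikidata") or "").strip()
    name = (args.get("name") or "").strip()
    country = (args.get("country") or "").strip()

    # A usable key is either a Wikidata id, or a name together with a country.
    if not wikidata and not (name and country):
        return 400, {"error": "wikidata or name+country required"}  # assumed body

    result = lookup(
        wikidata_id=wikidata or None,
        name=name or None,
        country_code=country or None,
    )
    if result is None:
        return 404, {"error": "no match"}  # assumed body
    return 200, result


def fake_lookup(wikidata_id=None, name=None, country_code=None):
    # Stand-in for lookup_wiki_index: knows a single made-up entry.
    if wikidata_id == "Q1":
        return {"wiki_summary": "example"}
    return None


print(wiki_enrich_response({"wikidata": "Q1"}, fake_lookup))
print(wiki_enrich_response({"wikidata": "Q2"}, fake_lookup))
print(wiki_enrich_response({"name": "Somewhere"}, fake_lookup))
```

Keeping the status mapping separate from the lookup lets the three branches be exercised without a database, which fits the plain-assert test style the commit message mentions.
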
def lookup_wiki_index(wikidata_id=None, name=None, country_code=None):
    """Standalone wiki_index lookup, extracted for the /api/wiki-enrich endpoint
    (extraction #5: navi-places HTTP-fetches wiki enrichment instead of reading
    the 2.1 GB wiki_index.db directly).

    Looks up by wikidata_id first (a full entity URL is reduced to its trailing
    id), then falls back to name + country_code. Returns a dict of
    wiki enrichment fields (only those present), or None if there is no match or
    the wiki_index DB is unavailable. Pure DB read: no feature-flag gating
    (callers decide whether to call) and never raises.

    NOTE: the in-process `_enrich_with_wiki_index` path that this lookup
    originally mirrored was removed with the dead place_detail orchestrator
    cluster; this function is now the only wiki_index lookup in the module.
    """
    db = _get_wiki_index_db()
    if not db:
        return None

    try:
        cur = db.cursor()
        row = None

        if wikidata_id:
            wid = wikidata_id
            if isinstance(wid, str) and wid.startswith("http"):
                wid = wid.split("/")[-1]
            cur.execute(
                "SELECT summary, wiki_population, wikipedia_title, wikivoyage_title FROM wiki_places WHERE wikidata_id = ?",
                (wid,)
            )
            row = cur.fetchone()

        if not row and name and country_code:
            cur.execute(
                "SELECT summary, wiki_population, wikipedia_title, wikivoyage_title FROM wiki_places WHERE place_name = ? AND country_code = ? LIMIT 1",
                (name, country_code.lower())
            )
            row = cur.fetchone()

        if not row:
            return None

        out = {}
        if row["summary"]:
            out["wiki_summary"] = row["summary"]
        if row["wiki_population"]:
            try:
                out["wiki_population"] = int(row["wiki_population"])
            except (ValueError, TypeError):
                out["wiki_population"] = row["wiki_population"]
        if row["wikipedia_title"]:
            title = row["wikipedia_title"].replace(" ", "_")
            out["wiki_url"] = f"https://en.wikipedia.org/wiki/{title}"
        if row["wikivoyage_title"]:
            title = row["wikivoyage_title"].replace(" ", "_")
            out["wikivoyage_url"] = f"https://en.wikivoyage.org/wiki/{title}"

        return out or None

    except Exception as e:
        logger.debug(f"wiki_index lookup error: {e}")
        return None
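
The two-step lookup order above (Wikidata id first, then name + country code) can be tried against an in-memory SQLite fixture. This is a sketch under stated assumptions: only the table and column names come from the SELECT statements in `lookup_wiki_index`; the column types, the `find_row` helper, and the sample row are made up for illustration and are not the real wiki_index.db schema or data.

```python
import sqlite3

# Assumed schema: column names are taken from the queries, types are guesses.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute(
    """CREATE TABLE wiki_places (
        wikidata_id TEXT, place_name TEXT, country_code TEXT,
        summary TEXT, wiki_population TEXT,
        wikipedia_title TEXT, wikivoyage_title TEXT)"""
)
# Made-up fixture row; country codes are stored lowercase in this sketch.
conn.execute(
    "INSERT INTO wiki_places VALUES (?, ?, ?, ?, ?, ?, ?)",
    ("Q0000", "Example Town", "xx", "A made-up place.", "1234", "Example Town", None),
)

COLUMNS = "summary, wiki_population, wikipedia_title, wikivoyage_title"


def find_row(conn, wikidata_id=None, name=None, country_code=None):
    """Hypothetical helper that mirrors the lookup order: id first, then name."""
    row = None
    if wikidata_id:
        # Accept a full entity URL and keep only the trailing id.
        wid = wikidata_id.split("/")[-1] if wikidata_id.startswith("http") else wikidata_id
        row = conn.execute(
            f"SELECT {COLUMNS} FROM wiki_places WHERE wikidata_id = ?", (wid,)
        ).fetchone()
    if row is None and name and country_code:
        row = conn.execute(
            f"SELECT {COLUMNS} FROM wiki_places "
            "WHERE place_name = ? AND country_code = ? LIMIT 1",
            (name, country_code.lower()),
        ).fetchone()
    return row


by_id = find_row(conn, wikidata_id="http://www.wikidata.org/entity/Q0000")
by_name = find_row(conn, name="Example Town", country_code="XX")
missing = find_row(conn, wikidata_id="Q9999")

print(by_id["summary"])
print(int(by_name["wiki_population"]))
print(missing)
```

The uppercase `"XX"` still matches because the fallback query lowercases the country code before comparing, which is the same normalization `lookup_wiki_index` applies.
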