col: simulator auto-adjusts spending to local prices via Numbeo+Expatistan

The Monte Carlo simulator used to compare jurisdictions at a flat
London-equivalent spend, which silently overstated the cost of living for
any move to a cheaper region. Now every cross-jurisdiction simulation
auto-scales spending_gbp by the Numbeo/Expatistan cost ratio between the
user's baseline city and the target city.
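The scaling described above can be sketched as follows. This is a minimal illustration, not the code in col/service.py: the city totals are made-up placeholders rather than Numbeo or Expatistan figures, and the helper names `col_ratio` and `adjust_spending` are hypothetical.

```python
from decimal import Decimal

# Hypothetical monthly totals in GBP, invented for illustration only.
# They are NOT real Numbeo or Expatistan figures.
MONTHLY_TOTAL_GBP = {
    "london": Decimal("3000"),
    "sofia": Decimal("1500"),
}


def col_ratio(baseline_city: str, target_city: str) -> Decimal:
    """Return the target city's cost divided by the baseline city's cost."""
    return MONTHLY_TOTAL_GBP[target_city] / MONTHLY_TOTAL_GBP[baseline_city]


def adjust_spending(spending_gbp: Decimal, baseline_city: str, target_city: str) -> Decimal:
    """Scale a baseline-city spend to its target-city equivalent, to the penny."""
    scaled = spending_gbp * col_ratio(baseline_city, target_city)
    return scaled.quantize(Decimal("0.01"))


if __name__ == "__main__":
    print(adjust_spending(Decimal("40000"), "london", "sofia"))
```

With these placeholder totals the ratio is 0.5, so a 40,000 GBP London spend becomes 20,000 GBP in the target city instead of being carried over unchanged.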

Architecture:
- fire_planner/col/baseline.py — 22 cities with headline Numbeo data
  (source URLs + snapshot dates embedded) — fallback when scraper fails
- col/numbeo.py + col/expatistan.py — httpx async scrapers, regex-parsed,
  polite 1.1s rate-limit, EUR/USD anchored
- col/cache.py — PG-backed cache (col_snapshot table, 1-year TTL)
- col/service.py — sync compute_col_ratio() for the simulator; async
  lookup_city_cached() with source reconciliation for the refresh CronJob
- alembic 0005 — col_snapshot table, UNIQUE(city_slug, source_name)
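The cache, scraper, and baseline layers listed above suggest a cache-then-scrape-then-fallback lookup. The sketch below illustrates that order under stated assumptions: an in-memory dict stands in for the col_snapshot table, the scraper is an injected callable, and `Snapshot` and `lookup_city_sketch` are hypothetical names that do not mirror the real signatures in col/service.py.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, Dict, Tuple

TTL = timedelta(days=365)  # the 1-year TTL named in the commit message


@dataclass
class Snapshot:
    """Hypothetical stand-in for one cached row."""
    city_slug: str
    source_name: str
    total_with_rent_gbp: float
    expires_at: datetime


CacheKey = Tuple[str, str]  # (city_slug, source_name)


def lookup_city_sketch(
    city_slug: str,
    cache: Dict[CacheKey, Snapshot],
    scrape: Callable[[str], float],
    baseline: Dict[str, Snapshot],
    now: datetime,
) -> Snapshot:
    """Return a fresh cached row, else scrape and upsert, else the baseline."""
    key = (city_slug, "numbeo")
    cached = cache.get(key)
    if cached is not None and cached.expires_at > now:
        return cached  # cache hit and not yet expired
    try:
        total = scrape(city_slug)
    except Exception:
        # Any scraper failure falls back to the embedded baseline data.
        return baseline[city_slug]
    fresh = Snapshot(city_slug, "numbeo", total, now + TTL)
    cache[key] = fresh  # upsert keyed on (city_slug, source_name)
    return fresh


if __name__ == "__main__":
    now = datetime(2026, 5, 22, tzinfo=timezone.utc)
    baseline = {"sofia": Snapshot("sofia", "baseline", 1500.0, now + TTL)}
    print(lookup_city_sketch("sofia", {}, lambda slug: 1400.0, baseline, now))
```

Keying the cache on the (city_slug, source_name) pair matches the UNIQUE constraint above and leaves room for one row per source per city.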

Simulator wiring:
- SimulateRequest gains col_auto_adjust=True (default), col_baseline_city,
  col_target_city. Defaults pick the jurisdiction's representative city.
- _resolve_col_adjustment scales spending_gbp before path-building.
- SimulateResult surfaces col_multiplier_applied + col_adjusted_spending_gbp.
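A rough sketch of the wiring above, assuming a dataclass-style request. Only col_auto_adjust, col_baseline_city, col_target_city, and spending_gbp come from this commit message; the jurisdiction fields, the `REPRESENTATIVE_CITY` mapping, and the injected `ratio_fn` are hypothetical, and the real _resolve_col_adjustment may look quite different.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Hypothetical jurisdiction -> representative city mapping.
REPRESENTATIVE_CITY = {"uk": "london", "bg": "sofia"}


@dataclass
class SimulateRequest:
    spending_gbp: float
    baseline_jurisdiction: str  # hypothetical field name
    target_jurisdiction: str    # hypothetical field name
    col_auto_adjust: bool = True
    col_baseline_city: Optional[str] = None
    col_target_city: Optional[str] = None


def resolve_col_adjustment(
    req: SimulateRequest,
    ratio_fn: Callable[[str, str], float],
) -> Tuple[float, float]:
    """Return (multiplier applied, adjusted spending in GBP)."""
    if not req.col_auto_adjust:
        return 1.0, req.spending_gbp  # opt-out leaves spending untouched
    # Explicit cities win; otherwise use the jurisdiction's representative city.
    baseline = req.col_baseline_city or REPRESENTATIVE_CITY[req.baseline_jurisdiction]
    target = req.col_target_city or REPRESENTATIVE_CITY[req.target_jurisdiction]
    multiplier = ratio_fn(baseline, target)
    return multiplier, req.spending_gbp * multiplier


if __name__ == "__main__":
    req = SimulateRequest(40000.0, "uk", "bg")
    print(resolve_col_adjustment(req, lambda baseline, target: 0.5))
```

Returning both values mirrors how the result surfaces col_multiplier_applied and col_adjusted_spending_gbp, so a caller can see what scaling was applied.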

CLIs:
- python -m fire_planner col-seed — loads BASELINES into col_snapshot
  (post-migration seed step)
- python -m fire_planner col-refresh-stale --within-days 7 — used by the
  weekly fire-planner-col-refresh CronJob
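One plausible reading of --within-days is "refresh every row that has expired or will expire within N days". The sketch below shows that selection only; this interpretation, the function name `select_stale`, and the dict input are assumptions, not the CLI's confirmed behaviour.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def select_stale(
    expiry_by_city: Dict[str, datetime],
    now: datetime,
    within_days: int,
) -> List[str]:
    """Return slugs whose expires_at is at or before now + within_days."""
    cutoff = now + timedelta(days=within_days)
    return sorted(
        slug for slug, expires_at in expiry_by_city.items() if expires_at <= cutoff
    )


if __name__ == "__main__":
    now = datetime(2026, 5, 22, tzinfo=timezone.utc)
    rows = {
        "london": now + timedelta(days=3),    # expires inside the window
        "sofia": now + timedelta(days=40),    # still comfortably fresh
        "lisbon": now - timedelta(days=20),   # already expired
    }
    print(select_stale(rows, now, within_days=7))
```

Under this reading, a weekly job with a 7-day window would pick up each row before it expires, so user requests would rarely need to trigger a scrape.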

268 tests pass. Mypy strict + ruff clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Viktor Barzin 2026-05-22 14:14:57 +00:00
parent 70101c836c
commit e72fd22a17
14 changed files with 1641 additions and 6 deletions

@@ -244,6 +244,50 @@ class IncomeStream(Base):
server_default=func.now())
class ColSnapshot(Base):
    """Cached cost-of-living snapshot per (city_slug, source).

    Phase 2 of the COL subsystem. Replaces the previous "baseline-only"
    lookup with cache-then-scrape semantics:

        service.lookup_city(slug) -> check ColSnapshot, return if fresh
                                  -> else scrape Numbeo, upsert, return
                                  -> if scrape fails, fall back to baseline.py

    TTL default = 365 days (`expires_at = fetched_at + interval '365 day'`).
    The user explicitly asked for 1y on 2026-05-21: Numbeo data doesn't
    move fast enough to need monthly refresh, and the API/scraper has
    rate-limit risk we prefer to amortise. Phase-3 CronJob will run a nightly
    refresh of stale rows so individual user requests never have to scrape.

    `(city_slug, source_name)` is unique; we can store multiple sources
    per city (Numbeo + Expatistan) and reconcile in service.py.
    """
    __tablename__ = "col_snapshot"
    __table_args__ = {"schema": SCHEMA_NAME}  # noqa: RUF012

    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
    city_slug: Mapped[str] = mapped_column(String(64), nullable=False, index=True)
    city_display: Mapped[str] = mapped_column(String(128), nullable=False)
    country: Mapped[str] = mapped_column(String(64), nullable=False)
    source_name: Mapped[str] = mapped_column(String(32), nullable=False)
    source_url: Mapped[str | None] = mapped_column(String, nullable=True)
    snapshot_date: Mapped[date] = mapped_column(Date, nullable=False)
    fetched_at: Mapped[datetime] = mapped_column(TIMESTAMP(timezone=True),
                                                 nullable=False,
                                                 server_default=func.now())
    expires_at: Mapped[datetime] = mapped_column(TIMESTAMP(timezone=True), nullable=False)
    total_no_rent_gbp: Mapped[Decimal] = mapped_column(Numeric(12, 2), nullable=False)
    total_with_rent_gbp: Mapped[Decimal] = mapped_column(Numeric(12, 2), nullable=False)
    rent_1bed_center_gbp: Mapped[Decimal] = mapped_column(Numeric(12, 2), nullable=False)
    rent_1bed_outside_gbp: Mapped[Decimal | None] = mapped_column(Numeric(12, 2), nullable=True)
    raw_currency: Mapped[str] = mapped_column(String(3), nullable=False, server_default="GBP")
    gbp_per_unit: Mapped[Decimal] = mapped_column(Numeric(12, 8),
                                                  nullable=False,
                                                  server_default=text("1"))
    by_category_json: Mapped[dict[str, Any] | None] = mapped_column(JSON_TYPE, nullable=True)


class RetirementGoal(Base):
    """A user-defined success criterion for a scenario.