Commit graph

355 commits

Author SHA1 Message Date
Viktor Barzin
a3ac9cc060
Fix Drone CI variable interpolation in API verify-deploy step
Drone expands ${VAR} as its own variables before the shell runs, so
${BASE_API} and ${DEPLOY} were replaced with empty strings. Use $VAR
(no braces) so the shell handles them instead. Also add fallback for
empty jq output to prevent "sh: out of range" errors.
2026-02-10 21:32:11 +00:00
Viktor Barzin
902f1b0852
Switch task progress to throttled event-driven updates
Replace timer-based _monitor_progress (1s sleep loop) with a
ProgressReporter class that publishes on actual state changes,
throttled to at most 1 publish per 250ms. A background flush
every 2s keeps ETA/elapsed current during quiet periods.

Switch WebSocket forwarder from get_message() polling (1s timeout)
to async pubsub.listen() for instant Redis-to-WebSocket delivery.

Combined latency improvement: ~1.5s average → ~250ms.
2026-02-10 21:24:33 +00:00
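The throttling described in commit 902f1b0852 can be sketched roughly as follows. This is a minimal illustration, not the project's ProgressReporter: the class name ThrottledReporter, the injectable clock, and the publish callback are all assumptions made for the example. Only the 250ms interval comes from the commit message.

```python
import time
from typing import Callable, Optional


class ThrottledReporter:
    """Publish state changes, at most once per `min_interval` seconds."""

    def __init__(
        self,
        publish: Callable[[dict], None],
        min_interval: float = 0.25,  # 250ms, as stated in the commit message
        clock: Callable[[], float] = time.monotonic,  # injectable for testing (assumption)
    ) -> None:
        self._publish = publish
        self._min_interval = min_interval
        self._clock = clock
        self._last_sent: Optional[float] = None
        self._pending: Optional[dict] = None

    def update(self, state: dict) -> None:
        """Record a state change; publish immediately if the throttle window has passed."""
        self._pending = state
        now = self._clock()
        if self._last_sent is None or now - self._last_sent >= self._min_interval:
            self.flush()

    def flush(self) -> None:
        """Publish the latest pending state, if any. A periodic timer could call this too."""
        if self._pending is not None:
            self._publish(self._pending)
            self._pending = None
            self._last_sent = self._clock()
```

A background task calling flush() every 2s would play the role of the periodic flush mentioned in the commit; it is left out here to keep the sketch synchronous.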
Viktor Barzin
b816f695f0
Replace deployment-level rollout check with pod-level verify-deploy
Check for a pod younger than 60s that is ready and running the exact
expected image tag, instead of polling deployment status fields.
2026-02-09 23:21:07 +00:00
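The pod-level check from commit b816f695f0 might look something like the sketch below, written against the JSON shape of a Kubernetes Pod object (metadata.creationTimestamp, status.phase, status.containerStatuses, spec.containers). The function name and the way the pod is passed in as a plain dict are assumptions; the actual CI step is a shell script and is not shown in the commit log.

```python
from datetime import datetime, timezone


def pod_is_fresh_deploy(
    pod: dict,
    expected_image: str,
    now: datetime,
    max_age_s: float = 60.0,  # "younger than 60s", per the commit message
) -> bool:
    """Return True if the pod is recent, Running, ready, and runs the expected image."""
    created = datetime.strptime(
        pod["metadata"]["creationTimestamp"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    age_s = (now - created).total_seconds()

    status = pod.get("status", {})
    container_statuses = status.get("containerStatuses", [])
    images = [c["image"] for c in pod.get("spec", {}).get("containers", [])]

    return (
        0 <= age_s < max_age_s
        and status.get("phase") == "Running"
        and bool(container_statuses)
        and all(s.get("ready") for s in container_statuses)
        and expected_image in images
    )
```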
Viktor Barzin
dbcd30679a
Add POI task phases to TaskPhase type union
Add 'starting' and 'computing' to TaskPhase so TypeScript accepts
the POI phase comparisons in TaskProgressDrawer.
2026-02-09 23:06:47 +00:00
Viktor Barzin
749e99de93
Add Claude agent definitions [ci skip]
2026-02-09 23:02:30 +00:00
Viktor Barzin
2d86213db5
Refactor task progress to unified useTaskProgress hook
Replace WebSocket-only useTaskWebSocket with useTaskProgress that
provides a unified task state interface. TaskIndicator no longer
manages its own polling or auth — it receives task state from the
parent via props. Rename wsTasks prop to tasks throughout.
2026-02-09 23:02:24 +00:00
Viktor Barzin
3616e678ac
Reduce task polling frequency and raise rate limits to prevent 429s
With 8+ active tasks, polling every 5s generates ~96 task_status
requests/min, exceeding the 60/60s rate limit. Two fixes:

- Adaptive polling: 30s when WebSocket is connected (safety net),
  5s only when WebSocket is down (primary source)
- Raise task_status rate limit to 200/60s and tasks_for_user to
  60/60s to handle burst scenarios (page reloads, WS reconnects)
2026-02-09 22:59:39 +00:00
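The request-rate arithmetic behind commit 3616e678ac can be checked with a few lines. The helper below is purely illustrative and assumes one task_status request per active task per poll, which is what the ~96 requests/min figure implies.

```python
def requests_per_minute(active_tasks: int, poll_interval_s: float) -> float:
    """Status requests per minute if every active task is polled once per interval."""
    return active_tasks * 60 / poll_interval_s


# 8 tasks polled every 5s: 8 * 12 = 96 requests/min, above the old 60/60s limit.
fast = requests_per_minute(8, 5)
# The same 8 tasks polled every 30s: 8 * 2 = 16 requests/min.
slow = requests_per_minute(8, 30)
```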
Viktor Barzin
791b5a9d55
Fix real-time task progress by closing WS on pubsub exit and keeping polling active
Three interconnected bugs prevented progress updates from reaching the frontend:

1. _forward_pubsub could exit silently while _handle_client_messages kept
   the WebSocket alive (responding to pings), so the client never detected
   the broken forwarding path. Replace asyncio.gather with asyncio.wait
   (FIRST_COMPLETED) so both coroutines are cancelled together.

2. Polling was stopped on WS connect with no fallback if forwarding broke.
   Polling now always runs alongside the WebSocket as a safety net.

3. Redis publish failures in task_progress_publisher were logged at DEBUG
   and the broken client was reused forever. Log at WARNING and reset the
   client so the next call reconnects.
2026-02-09 22:48:57 +00:00
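Point 1 of commit 791b5a9d55 (swap asyncio.gather for asyncio.wait with FIRST_COMPLETED) can be illustrated with the sketch below. The helper name run_until_first_exit is an assumption; the real endpoint runs _forward_pubsub and _handle_client_messages, whose bodies are not shown in the commit log.

```python
import asyncio


async def run_until_first_exit(*coros) -> None:
    """Run coroutines concurrently; as soon as one finishes, cancel the others."""
    tasks = [asyncio.ensure_future(c) for c in coros]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)

    for task in pending:
        task.cancel()
    # Let the cancelled tasks finish unwinding before returning.
    await asyncio.gather(*pending, return_exceptions=True)

    for task in done:
        task.result()  # re-raise an exception from the task that ended first, if any
```

With plain asyncio.gather, a forwarder that returns quietly leaves the other coroutine running indefinitely, which matches the symptom described in the commit.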
Viktor Barzin
8d52bdf99d
Fix stale task polling by keeping it always active alongside WebSocket
Polling was disabled when wsConnected was true, but if the WS connected
while workers hadn't been redeployed (no pub/sub messages flowing), the
UI received no updates at all. Polling now always runs at 5s as the
baseline. WebSocket provides faster real-time updates on top when
available — the two coexist, last writer wins.
2026-02-09 21:37:07 +00:00
Viktor Barzin
f3cbeb3f5e
Cache Docker builder stages in Drone CI for faster builds
Push intermediate builder stages as :builder tags and use cache_from
to reuse dependency layers (pip install, npm ci) across builds.
2026-02-09 21:35:42 +00:00
Viktor Barzin
8559c4b461
Add real-time WebSocket task progress with multi-job drawer
Replace 5s HTTP polling with WebSocket-based real-time updates for task
progress. Celery workers publish progress to Redis pub/sub channels;
a FastAPI WebSocket endpoint subscribes and forwards to the browser.
Polling is kept as a 30s fallback when WebSocket is unavailable.

The task progress drawer now supports multiple concurrent jobs with a
tab bar for switching between scrape and POI distance tasks.

Backend:
- Add services/task_progress_publisher.py (Redis pub/sub bridge)
- Add api/ws_routes.py (WebSocket endpoint with JWT auth)
- Publish progress from listing_tasks and poi_tasks
- Publish REVOKED via pub/sub on cancel/clear to fix stuck UI

Frontend:
- Add useTaskWebSocket hook with reconnection and keepalive
- Add TaskState and WS message types
- TaskIndicator: WS-driven updates with polling fallback
- TaskProgressDrawer: multi-job tabs, POI phase timeline
- Guard against WS overwriting local cancel state
2026-02-09 21:31:45 +00:00
Viktor Barzin
73d19e29d5
Fix duplicate listings via staged Redis cache and frontend stream cancellation
Three-pronged fix for duplicate listings appearing in the UI:

1. Backend: Replace direct rpush cache writes with staged population
   (write to temp key, then atomic RENAME to live key). Skip cache
   writes entirely for POI-enriched requests. Clean staging keys on
   invalidation.

2. Frontend: Add AbortController to cancel in-flight streaming requests
   when loadListings is called again, preventing data mixing.

3. Frontend: Deduplicate features by URL during stream accumulation as
   a safety net against any remaining server-side duplicates.
2026-02-09 21:17:30 +00:00
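The staged cache population from point 1 of commit 73d19e29d5 might be sketched as below. The function name, the ":staging" key suffix, and the empty-result handling are assumptions; the client is any object exposing delete, rpush, and rename with redis-py-style signatures.

```python
from typing import Iterable, Protocol


class ListCache(Protocol):
    """The subset of a Redis client that this sketch relies on."""

    def delete(self, *names: str) -> int: ...
    def rpush(self, name: str, *values: str) -> int: ...
    def rename(self, src: str, dst: str) -> object: ...


def populate_staged(client: ListCache, live_key: str, items: Iterable[str]) -> None:
    """Write items to a staging key, then swap it into place with one RENAME."""
    staging_key = f"{live_key}:staging"  # hypothetical naming scheme
    client.delete(staging_key)  # clear leftovers from an interrupted run

    wrote_any = False
    for item in items:
        client.rpush(staging_key, item)
        wrote_any = True

    if wrote_any:
        # RENAME replaces the live key atomically, so readers see either the
        # old list or the complete new one, never a half-written list.
        client.rename(staging_key, live_key)
    else:
        # RENAME fails on a missing source key, so an empty result just clears the cache.
        client.delete(live_key)
```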
Viktor Barzin
5b8aa98446
Reformat docker-compose command arrays and update tsconfig buildinfo
2026-02-09 20:52:44 +00:00
Viktor Barzin
f465ccb8b8
Fix alembic migration to rename buylisting.longtitude to longitude
The migration was incorrectly changed to a no-op, leaving the typo
in the buylisting table which caused OperationalError during scraping.
2026-02-09 20:47:30 +00:00
Viktor Barzin
0fb4504646
Fix stale browser cache after redeployment with proper nginx cache headers
index.html is served with Cache-Control: no-cache so the browser always
fetches the latest version with updated asset hashes. Hashed assets under
/assets/ are cached indefinitely since their filenames change on rebuild.

This prevents browsers from serving old cached JS bundles (including the
broken obfuscated build) after a new deployment.
2026-02-08 22:51:11 +00:00
Viktor Barzin
7319f77f1d
Remove JS obfuscator that broke Mapbox GL map rendering
vite-plugin-obfuscator processes ALL output chunks including vendor
libraries, corrupting Mapbox GL's WebGL shader string literals via
base64 encoding and string splitting. This caused the map to render
as a blank screen in production.

Vite's built-in esbuild minification already mangles identifiers and
removes whitespace, providing sufficient code protection.

Adds regression tests to prevent re-introducing obfuscation plugins.
2026-02-08 22:47:01 +00:00
Viktor Barzin
6a1c35946e
Add rollout wait step to Drone CI pipelines
Both frontend and API pipelines now wait for K8s deployments to fully
roll out before marking the build as successful. Polls the K8s API
every 5s for up to 300s, checking observedGeneration, updatedReplicas,
and readyReplicas to confirm the new image is live in production.
2026-02-08 20:28:02 +00:00
Viktor Barzin
5b566bab4c
Fix POI distance calculation reliability for remote/Celery execution
- Fix silent log loss: replace hardcoded "uvicorn.error" logger with __name__
  in osrm_client, otp_client, poi_distance_calculator, and poi_tasks (uvicorn
  logger has no handlers in Celery worker, so all errors were silently dropped)
- Add Celery retry: autoretry_for=(Exception,), max_retries=3, retry_backoff
- Add top-level exception handling in task with full traceback logging
- Fix upsert_distances: replace session.merge() (PK-based) with proper
  dialect-aware INSERT ON DUPLICATE KEY UPDATE / ON CONFLICT DO UPDATE
- Filter out listings with null/zero coordinates before routing
- Raise OSError when all routing engines fail with 0 results computed,
  distinguishing "nothing to compute" from "all engines unreachable"
2026-02-08 20:11:12 +00:00
Viktor Barzin
46995cb9da
Restart celery and celery-beat deployments on API image build
Prevents stale celery containers when code changes are deployed.
2026-02-08 20:09:10 +00:00
Viktor Barzin
1ace45353a
Add API anti-abuse hardening: disable docs in prod, origin validator, exception handler
- Disable OpenAPI docs/redoc/openapi.json when APP_ENV=production
- Strip uvicorn Server header with --no-server-header in Dockerfile and docker-compose.yml
- Add OriginValidatorMiddleware to reject state-changing requests from disallowed origins
- Add global exception handler to prevent stack trace leakage on unhandled errors
- Add tests for all new security features (OpenAPI, origin validation, exception handler, server header)
2026-02-08 20:06:46 +00:00
Viktor Barzin
162d9a886d
Harden frontend assets: disable source maps, add JS obfuscation, env var config
- Disable source maps in production builds (vite.config.ts: sourcemap: false)
- Add vite-plugin-obfuscator for JS obfuscation (hex identifiers, base64 string encoding)
- Move OIDC config behind VITE_* env vars with dev fallbacks (auth/config.ts)
- Add server_tokens off to nginx.conf to stop advertising nginx version
- Add type declaration for vite-plugin-obfuscator
2026-02-08 20:06:33 +00:00
Viktor Barzin
492921424e
Add security regression tests for all hardening fixes
- New: test_security_headers.py — verify all headers present, HSTS conditional on HTTPS
- New: test_passkey_error_handling.py — generic vs user-facing error messages
- New: test_poi_validation.py — field length and coordinate range constraints
- Extend test_rate_limiter.py — client IP depth selection, in-memory fallback enforcement
- Extend test_models.py — sqm range validation
- Extend test_task_service.py — IDOR 404, ownership 200, traceback suppression in production
2026-02-08 19:42:53 +00:00
Viktor Barzin
727dd537ef
Fix XSS in map popups by replacing setHTML with setDOMContent
- POI popup: use DOM API with textContent (auto-escapes) instead of template literal in setHTML
- Listing popup: replace renderToString + setHTML with createRoot + setDOMContent for proper React lifecycle
2026-02-08 19:42:44 +00:00
Viktor Barzin
0a9a83507e
Harden backend security: IDOR fix, error sanitization, rate limiter fallback, security headers
- Fix task status IDOR by adding ownership check; suppress traceback/error in production
- Passkey routes: return generic error messages for internal exceptions, keep ValueError for user-facing
- JWT_SECRET and OIDC_CLIENT_ID: raise RuntimeError in production when using defaults
- Rate limiter: add in-memory fallback counter when Redis is unavailable
- Fix X-Forwarded-For IP spoofing with trusted_proxy_depth (rightmost-N selection)
- Add SecurityHeadersMiddleware (X-Content-Type-Options, X-Frame-Options, CSP, conditional HSTS)
- CORS: add PUT/DELETE methods for POI routes
- POI input validation: field length and coordinate range constraints
- QueryParameters: add min_sqm <= max_sqm validation
2026-02-08 19:42:30 +00:00
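The rightmost-N selection mentioned in commit 0a9a83507e for X-Forwarded-For can be sketched as follows. The function name and the fallback to the socket peer address are assumptions; the commit only names the trusted_proxy_depth setting. The idea is that each trusted proxy appends the address it saw, so with N trusted proxies the N-th entry from the right is the last one written by infrastructure the operator controls. Anything further left is client-supplied and may be forged.

```python
def client_ip(xff_header: str, peer_ip: str, trusted_proxy_depth: int) -> str:
    """Pick the client address from X-Forwarded-For, counting from the right."""
    hops = [h.strip() for h in xff_header.split(",") if h.strip()]
    if trusted_proxy_depth <= 0 or len(hops) < trusted_proxy_depth:
        # No trusted proxies configured, or a header shorter than expected:
        # fall back to the address of the direct connection (assumption).
        return peer_ip
    return hops[-trusted_proxy_depth]
```

For example, with one trusted proxy and a header of "1.2.3.4, 203.0.113.7", the forged leftmost entry is ignored and 203.0.113.7 is used.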
Viktor Barzin
e431eaf2aa
Fix POI distance calculation for buy listings
The distance calculator always queried the rentlisting table regardless of
listing type because get_listings() defaulted to RentListing when called
without query_parameters. Added a listing_type parameter to get_listings()
and _get_model_for_query() so callers can select the correct table directly.
2026-02-08 19:10:32 +00:00
Viktor Barzin
54bdcac14a
Run alembic migrations on startup, fix User model, add POI travel sorting and streaming options
2026-02-08 18:50:13 +00:00
Viktor Barzin
743e018668
Redesign filter panel with range sliders, separated visualization card, and backend filter support
Simplify the filter UI to show only essential filters (type toggle, price/bedroom
range sliders, min size) by default, with advanced filters collapsed. Extract
visualization controls (color-by metric, POI travel mode) into a separate
VisualizationCard component. Wire up previously ignored backend filters: max_sqm,
min/max_price_per_sqm, and district_names now work end-to-end.
2026-02-08 18:50:06 +00:00
Viktor Barzin
1f4a3f858c
Fix heatmap crash on small datasets by clamping percentile indices
Math.round(values.length * 0.95) produces an out-of-bounds index when
the dataset has fewer than ~20 features (e.g. after tight travel time
filtering). values[outOfBounds] returns undefined, cascading to NaN
color stops which crash Mapbox's expression evaluator. Clamp both
min and max indices to values.length - 1.
2026-02-08 17:46:33 +00:00
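The index clamp from commit 1f4a3f858c is a frontend (TypeScript) fix; the sketch below restates the same arithmetic in Python. The function name and the 5% lower percentile are assumptions, since the commit only mentions the 0.95 upper bound. math.floor(x + 0.5) stands in for JavaScript's Math.round, because Python's round() rounds halves to even.

```python
import math
from typing import Sequence, Tuple


def percentile_bounds(
    values: Sequence[float], low: float = 0.05, high: float = 0.95
) -> Tuple[float, float]:
    """Return (low, high) percentile values with indices clamped to the valid range."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    last = len(ordered) - 1

    def index(q: float) -> int:
        raw = math.floor(len(ordered) * q + 0.5)  # mirrors Math.round(length * q)
        return min(max(raw, 0), last)  # the clamp: never past the final element

    return ordered[index(low)], ordered[index(high)]
```

With five values, the unclamped upper index would be round(4.75) = 5, one past the last valid index of 4. The clamp brings it back to 4.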
Viktor Barzin
07d4fa5f84
Add per-POI travel time filtering and fix heatmap color stops
Replace the single global max travel time filter with per-POI filters.
Each POI gets its own travel mode selector and max minutes input in the
filter panel. Listings must satisfy ALL active filters (AND logic).

Fix Mapbox "Input is not a number" error by ensuring color stops are
always strictly monotonic (guard min === max) and always set (even when
no valid metric values exist). Also filter Infinity values from the
color scale computation. Widen the filter panel from w-64 to w-80.
2026-02-08 16:02:46 +00:00
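The color-stop guards from commit 07d4fa5f84 can be illustrated in the same way. This is a Python restatement of a frontend fix, and the fallback domain of (0.0, 1.0) and the +1.0 widening are placeholder choices for the example, not values taken from the project.

```python
import math
from typing import Iterable, Tuple


def color_scale_domain(values: Iterable[float]) -> Tuple[float, float]:
    """Return a (min, max) domain that is always set and strictly increasing."""
    finite = [v for v in values if math.isfinite(v)]  # drops Infinity and NaN
    if not finite:
        return (0.0, 1.0)  # no valid metric values: still return usable stops
    lo, hi = min(finite), max(finite)
    if lo == hi:
        hi = lo + 1.0  # guard min == max so the stops stay strictly monotonic
    return (lo, hi)
```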
Viktor Barzin
81d31eaecf
Auto-reload listings on task completion and show all POIs in detail view
Thread onTaskCompleted callback from TaskIndicator through Header to App.tsx
so listings auto-refresh when a background task (e.g. POI distance calculation)
completes. Add AllPOIDistances component to PropertyCard that shows all user
POIs with travel times or — placeholder for missing modes.
2026-02-08 15:11:21 +00:00
Viktor Barzin
01dae5dfbd
Fix OSRM setup: update Geofabrik URL and use bind mount for data
- Update Geofabrik download URL from great-britain to united-kingdom
  (old path returns 302 redirect to homepage).
- Switch OSRM Docker volumes from named volume to bind mount
  (./osrm-data:/data) so osrm-setup.sh output is used directly.
- Add osrm-data/ to .gitignore (large binaries, regenerated by script).
2026-02-08 14:51:52 +00:00
Viktor Barzin
2fdafdcb64
Auto-trigger WALK + BICYCLE distance calculation on POI creation
After creating a POI, automatically trigger WALK and BICYCLE distance
calculations (cheap OSRM batch API). TRANSIT is excluded since it uses
the expensive OTP backend — users trigger it manually via the calculator
button. Failure is non-fatal: the POI is still created and calculation
can be retried manually.
2026-02-08 14:51:34 +00:00
Viktor Barzin
8a5d1b3787
Fix POI distance calculation: OSRM index separator and error handling
- Fix OSRM client to use semicolons (not commas) for source/destination
  indices in /table API requests. Commas caused "Query string malformed"
  errors for any batch with more than one origin.
- Add error handling in poi_distance_calculator for unreachable routing
  engines (OSRM/OTP). Connection failures now log an error and skip the
  mode instead of crashing the entire Celery task.
2026-02-08 14:50:09 +00:00
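The separator fix from commit 8a5d1b3787 concerns how index lists are written in an OSRM /table request. In that API, coordinates are given as lon,lat pairs joined by semicolons, and the sources and destinations parameters are also semicolon-separated lists of indices into the coordinate list. The helper below is a sketch with an assumed name and signature, not the project's osrm_client.

```python
from typing import Sequence, Tuple


def osrm_table_url(
    base_url: str,
    profile: str,
    coords: Sequence[Tuple[float, float]],  # (lon, lat) pairs
    sources: Sequence[int],
    destinations: Sequence[int],
) -> str:
    """Build an OSRM /table URL with semicolon-separated index lists."""
    coord_part = ";".join(f"{lon},{lat}" for lon, lat in coords)
    src = ";".join(str(i) for i in sources)  # semicolons, not commas
    dst = ";".join(str(i) for i in destinations)
    return f"{base_url}/table/v1/{profile}/{coord_part}?sources={src}&destinations={dst}"
```

A batch with two origins and one destination would use sources=0;1 and destinations=2. Writing 0,1 instead is the malformed form described in the commit.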
Viktor Barzin
7084c46f89
Add Kubernetes manifests for routing engines
Deployments and Services for osrm-foot (256-512MB), osrm-bicycle
(256-512MB), and OTP (1-2GB). Includes PVCs for data storage and an
init Job to download and pre-process Greater London OSM data.
2026-02-08 13:16:38 +00:00
Viktor Barzin
8509a0326f
Add frontend POI management and travel time display
POIManager component in FilterPanel for creating/deleting POIs and
triggering distance calculations. PropertyCard shows travel time badges
(walk/cycle/transit) per POI. Map renders POI locations as red markers.
API client extended with POST body support for POI endpoints.
2026-02-08 13:16:32 +00:00
Viktor Barzin
bb489c2032
Add OSRM and OTP Docker services with setup scripts
Adds osrm-foot, osrm-bicycle, and otp services to Docker Compose under
a 'routing' profile (opt-in). Setup scripts download Greater London OSM
data and pre-process for OSRM foot/bicycle profiles, plus TfL GTFS for
OTP transit. Routing engine env vars added to .env.sample.
2026-02-08 13:16:10 +00:00
Viktor Barzin
149311508e
Add POI CLI commands: add-poi, list-pois, calculate-poi
Enables POI management and distance calculation from the command line,
following the existing Click command patterns with asyncio.run().
2026-02-08 13:15:24 +00:00
Viktor Barzin
bd788df9aa
Add POI API routes and Celery task
FastAPI router with CRUD endpoints for POIs, distance calculation
trigger, and distance queries. Streaming GeoJSON endpoint now accepts
include_poi_distances=true to inject travel times into features.
Celery task wraps the distance calculator with progress reporting.
2026-02-08 13:14:47 +00:00
Viktor Barzin
da0a56895d
Add self-hosted routing clients and distance calculator
RoutingConfig loads OSRM/OTP URLs from env vars. OSRM client uses the
/table endpoint for batch NxM distance matrices (walk/cycle). OTP client
uses GraphQL API for transit routes. POI distance calculator orchestrates
both, skipping already-computed distances and reporting progress.
2026-02-08 13:14:37 +00:00
Viktor Barzin
8a31e5449c
Add POI repository and service layer
POIRepository handles all database operations for POIs and distances
including upsert, cascading delete, and skip-on-recompute via
get_existing_distance_keys(). POI service provides unified high-level
functions shared by both CLI and API.
2026-02-08 13:13:17 +00:00
Viktor Barzin
5783d8fae9
Add POI and POIDistance data models with migration
Introduces PointOfInterest (per-user named locations with lat/lng) and
POIDistance (travel time/distance per listing+POI+mode triple) SQLModel
entities, plus an Alembic migration to create both tables with indexes
and a composite unique constraint.
2026-02-08 13:13:05 +00:00
Viktor Barzin
9ff381106b
Add Claude Code internet mode marker file
2026-02-08 00:45:49 +00:00
Viktor Barzin
87b5bd8676
Add API rate limiting, metrics guard, and audit middleware
Per-user rate limits via Redis sliding window, IP-restricted /metrics
endpoint, audit logging of all requests, CORS tightening, and export
caps on listing/geojson endpoints.
2026-02-08 00:45:43 +00:00
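The sliding-window idea behind the rate limits in commit 87b5bd8676 can be sketched in memory as below. The project uses Redis for this; the in-process deque version here is a deliberate simplification, and the class name and the explicit now argument are assumptions made for the example.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `limit` requests per key within any `window_s`-second window."""

    def __init__(self, limit: int, window_s: float) -> None:
        self.limit = limit
        self.window_s = window_s
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str, now: float) -> bool:
        """Record a request at time `now` and report whether it is within the limit."""
        hits = self._hits[key]
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

A limit written as 60/60s in these commit messages would correspond to SlidingWindowLimiter(limit=60, window_s=60) keyed by user.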
Viktor Barzin
08ac72bbfc
Remove leftover crawler/ files after repo flattening
2026-02-08 00:45:33 +00:00
Viktor Barzin
a7fa9b81cf
Fix OIDC logout not redirecting back to app
Pass id_token_hint explicitly to signoutRedirect() so Authentik
honors the post_logout_redirect_uri and sends users back to the app.
2026-02-08 00:29:02 +00:00
Viktor Barzin
e5ce8c1201
Fix buy listing support: thread ListingType through processing pipeline
The listing processor was hardcoded to create RentListing objects and
query only the rentlisting table. Buy listings fetched from Rightmove
were stored in the wrong table with missing fields. This threads
ListingType through ListingProcessor and all Step subclasses so the
correct model (RentListing/BuyListing) is created, the correct table
is queried, and buy-specific fields (service_charge, lease_left) are
parsed from the API response and included in GeoJSON streaming output.
2026-02-07 23:34:08 +00:00
Viktor Barzin
5e48a26958
Pin deployment image to build number tag instead of restarting latest
Images are now tagged with both :latest and :${DRONE_BUILD_NUMBER}.
The deploy step uses JSON Patch to set the container image to the
specific build number tag, making deployments deterministic and
compatible with Terraform (which should ignore_changes on the image).
2026-02-07 23:07:16 +00:00
Viktor Barzin
eafbc1ac52
Flatten repo structure: move crawler/ to root, remove vqa/ and immoweb/
The crawler subdirectory was the only active project. Moving it to the
repo root simplifies paths and removes the unnecessary nesting. The
vqa/ and immoweb/ directories were legacy/unused and have been removed.

Updated .drone.yml, .gitignore, .claude/ docs, and skills to reflect
the new flat structure.
2026-02-07 23:01:20 +00:00
Viktor Barzin
e2247be700
Apply Color By metric change instantly without data re-fetch
The metric dropdown only updated form state, requiring "Apply Filters"
to take effect — which re-fetched identical data and remounted the Map,
making the result look unchanged. Now the Select's onValueChange
immediately propagates the metric to queryParameters, so the Map
re-renders in-place with the new color scheme, legend, and hex values.
2026-02-07 22:47:13 +00:00
Viktor Barzin
5e2c5923f6
Cross-fade hex grid on zoom with dual-layer swap
Replace single-layer fade-out/swap/fade-in with a dual-layer (A/B)
cross-fade so old hexagons dissolve while new ones appear simultaneously.
Pans still do a direct data swap on the active layer since hex positions
are origin-anchored and stable.

Fix click/hover handlers to query both layers via queryRenderedFeatures,
and use moveLayer to keep the active layer on top so stale features from
the faded-out layer don't intercept clicks.
2026-02-07 22:25:50 +00:00