Commit graph

9 commits

Author SHA1 Message Date
Viktor Barzin
578b97b0c5
Add configurable request timeout and retry on TimeoutError
Requests to Rightmove API previously had no explicit timeout, causing
hung connections to block workers indefinitely. Add a configurable
request_timeout (default 30s) to ScraperConfig and apply it to all
aiohttp sessions. Also retry on TimeoutError in addition to
ThrottlingError for all API query functions.
2026-02-21 17:50:36 +00:00
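The timeout-and-retry behaviour described in 578b97b0c5 can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the repository's code: `ScraperConfig.request_timeout` and `ThrottlingError` are named in the commit message, but the `max_retries` field, the `fetch_with_retry` helper, and the stand-in `ThrottlingError` definition are hypothetical. In the real scraper the timeout is applied to aiohttp sessions (presumably via something like `aiohttp.ClientTimeout(total=...)`); here `asyncio.wait_for` plays that role so the example stays self-contained.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")


class ThrottlingError(Exception):
    """Stand-in for the scraper's throttling exception (hypothetical definition)."""


@dataclass
class ScraperConfig:
    request_timeout: float = 30.0  # seconds; default taken from the commit message
    max_retries: int = 3           # hypothetical field, not mentioned in the commit


async def fetch_with_retry(
    make_request: Callable[[], Awaitable[T]],
    config: ScraperConfig,
) -> T:
    """Await make_request() under a timeout, retrying on timeout or throttling."""
    if config.max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    last_error: Optional[Exception] = None
    for _ in range(config.max_retries):
        try:
            # A hung request is cancelled once request_timeout elapses.
            return await asyncio.wait_for(
                make_request(), timeout=config.request_timeout
            )
        except (asyncio.TimeoutError, ThrottlingError) as exc:
            last_error = exc  # remember the failure and try again
    assert last_error is not None
    raise last_error
```

A caller would pass a zero-argument coroutine function, for example `await fetch_with_retry(lambda: query_api(params), config)`, where `query_api` is again a placeholder name. A production version would likely add a backoff delay between attempts, which this sketch omits.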
Viktor Barzin
f833309297
Refactor backend for cleaner error handling, DRY, and type safety
- DRY up the rate limiter: consolidate 3 duplicated check/respond paths
  into _check_counter and _enforce_limit helpers, add proper type annotations
- Replace bare Exception raises with FloorplanDownloadError and
  RightmoveApiError; narrow catch clauses to specific exception types;
  fix Step base class to inherit from ABC
- Consolidate MAX_OCR_WORKERS into config/scraper_config.py; extract
  _find_tenure_value helper to deduplicate tenure parsing
- Extract _build_poi_distances_lookup from stream endpoint to reduce nesting
- Fix csv_exporter: optional decisions.json, NaN instead of -1 sentinels,
  guard against division by zero on missing square meters
- Fix notifications.py broken list[Surface]() constructor, database.py
  stale comments and missing type annotation, auth.py type:ignore,
  ui_exporter.py stale TODO
- Fix 3 pre-existing test failures: mock cache layer in streaming tests,
  bypass rate limiter for test isolation, fix cache invalidation test to
  account for two-pattern scan loop
2026-02-10 22:19:24 +00:00
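The csv_exporter fix in f833309297 (NaN instead of -1 sentinels, plus a guard against division by zero on missing square meters) might look something like the sketch below. The function name `price_per_square_meter` and its signature are assumptions made for illustration; the commit message does not show the actual code.

```python
import math
from typing import Optional


def price_per_square_meter(
    price: Optional[float], square_meters: Optional[float]
) -> float:
    """Return price / area, or NaN when a value is missing or the area is not positive."""
    if price is None or square_meters is None or square_meters <= 0:
        # NaN marks "unknown" without posing as a real value the way -1 would.
        return math.nan
    return price / square_meters
```

One reason to prefer NaN over -1 is that a -1 sentinel silently skews averages and sorts, while NaN propagates through arithmetic and is typically treated as a missing value by CSV consumers such as pandas.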
Viktor Barzin
eafbc1ac52
Flatten repo structure: move crawler/ to root, remove vqa/ and immoweb/
The crawler subdirectory was the only active project. Moving it to the
repo root simplifies paths and removes the unnecessary nesting. The
vqa/ and immoweb/ directories were legacy/unused and have been removed.

Updated .drone.yml, .gitignore, .claude/ docs, and skills to reflect
the new flat structure.
2026-02-07 23:01:20 +00:00
Kadir
e2f7998ee9
merging the visual query answering with the crawler. Monorepo go!
2024-03-01 16:42:48 +01:00
Kadir Tugan
8698b41e0e
adding lat/lon to the db
2023-11-18 13:09:03 +02:00
Kadir Tugan
72ff6a7202
adding query to fetch a single detailpage
2023-11-18 12:56:15 +02:00
Kadir Tugan
31d760cf30
adding more parameters to the function
2023-11-18 12:38:54 +02:00
Kadir Tugan
d49b11e28b
ruff format
2023-11-18 12:30:04 +02:00
Kadir Tugan
65fc4cdda4
updating poetry
2023-11-18 12:20:22 +02:00
Renamed from query.py