2026-02-07 20:19:57 +00:00
"""Unit tests for tasks/listing_tasks.py."""
wrongmove: round-3 fix sweep — scrape pipeline, BUY tab, filter URL state, render hygiene, map polish
Coordinated fix across 31 bugs found in a parallel QA pass. Findings docs at /tmp/wrongmove-bugs/qa-round-3/qa{1,2,3,4}-*.md.
## Backend / scrape (Fix-1) — 8 bugs
- B1 [P0] Scrape totally broken on prod: pod UID 100 vs NFS dir 1000:1000 mode 775 → PermissionError on every never-seen listing. Switched Dockerfile to explicit `useradd --uid 1000 --gid 1000`; added securityContext + chown initContainer to k8s/{api,celery-beat}-deployment.yaml. Celery worker manifest lives outside this repo — Dockerfile UID change is the load-bearing fix.
- B4 [P1] Celery broker reaped every ~30s by Redis HAProxy idle timeout. Added `broker_transport_options` / `result_backend_transport_options` with `socket_keepalive=True, health_check_interval=25` in celery_app.py + same kwargs on every redis.from_url/Redis call across services/, utils/redis_lock.py, redis_repository.py.
- B5 [P1] dump_listings_task never published terminal FAILURE to the task_progress pub/sub channel — UI polled forever. Wrap body in try/except that publishes FAILURE before re-raising.
- B6 [P1] _process_worker had no per-listing exception handler — one bad listing killed the whole scrape via asyncio.gather. Wrap loop body in try/except Exception (re-raises CancelledError).
- B20 [P2] dump_listings_task gained time_limit=3600, soft_time_limit=3500, acks_late=True.
- B21 [P2] RedisRepository moved off shared db0 (was alongside paperless-ngx) to db3 via REDIS_USER_DB env var; keys prefixed `wrongmove:user:`.
- B32 [P3] redis_lock now uses uuid4() owner token + Lua compare-and-delete.
- B33 [P3] Slack notify in refresh_listings → asyncio.create_task (fire-and-forget).
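
The B6 pattern can be sketched as follows. This is a minimal illustration only: `process_one`, `worker`, and `run_demo` are hypothetical names, and the `None` sentinel is assumed from the regression test comments. The real `_process_worker` signature is not shown in this commit message and may differ.

```python
import asyncio


async def process_one(listing_id: int) -> str:
    # Hypothetical per-listing step; listing 2 simulates the PermissionError from B1.
    if listing_id == 2:
        raise PermissionError(f"cannot write listing {listing_id}")
    return f"listing-{listing_id}"


async def worker(queue: asyncio.Queue, processed: list, failed: list) -> None:
    # Drain the queue until the None sentinel arrives.
    while True:
        listing_id = await queue.get()
        if listing_id is None:
            return
        try:
            processed.append(await process_one(listing_id))
        except asyncio.CancelledError:
            # Cancellation must propagate so the task can shut down cleanly.
            raise
        except Exception:
            # One bad listing is recorded and skipped; siblings keep going.
            failed.append(listing_id)


async def run_demo() -> tuple[list, list]:
    queue: asyncio.Queue = asyncio.Queue()
    for item in [1, 2, 3, None]:
        queue.put_nowait(item)
    processed: list = []
    failed: list = []
    await asyncio.gather(worker(queue, processed, failed))
    return processed, failed


processed, failed = asyncio.run(run_demo())
```

Without the `except Exception` branch, the error from listing 2 would propagate through `asyncio.gather` and listing 3 would never be processed.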
## Frontend filter system (Fix-2) — 7 bugs
- B2 [P0] BUY tab click triggered "Maximum update depth exceeded" → ErrorBoundary. Replaced the three mutually-triggering useEffects in FilterBar with a single one-way controlled-value flow (URL → parent state → form), guarded by previousListingTypeRef so price-defaults fires once per real transition.
- B3 [P0] Filter values never reached the URL. Wired useFilterParams.setFilterValues into FilterBar/FilterPanel onSubmit + handleRemoveChip + new handleResetAllFilters; fed parsed filterValues into both forms' defaultValues; added URL→form sync via form.reset on browser back/forward.
- B8 [P1] Chip removal now resets form state via new FilterBar onFormReady callback — More badge no longer sticks.
- B12 [P2] Desktop swipe-review FAB added next to header (mobile FAB unchanged).
- B17 [P2] "Reset all" affordance on chip strip.
- B22 [P2] formatPrice precision: 1500 → £1.5k, 2500 → £2.5k (no longer collides with £2k/£3k defaults).
- B30 [P3] last_seen_days input gained min={0}.
## Frontend render hygiene + data integrity (Fix-3) — 8 bugs
- B7 [P1] streamingService bails on first non-NDJSON chunk (HTML response = backend down) and throws StreamParseError so the existing AlertError dialog surfaces a single user-visible error instead of 18× console.error spam.
- B9 [P1] formatDuration widened to (null|undefined|number): returns "—" for non-finite or negative, caps implausibly large values.
- B10 [P1] PropertyCard / PropertyCardCompact / SwipeCard JSX leaves render "—" for null total_price/qm/qmprice (was "£0/0 m²/£0/m²" — looked like free listings).
- B13 [P2] hexgrid worker reduceAverage uses Number.isFinite filter instead of !isNaN (which incorrectly accepted null → 0, biasing per-hex averages low).
- B14 [P2] ListingDetail Overview wraps agency in "Listed by" labelled block so it can't collapse to a bare agency name.
- B15 [P2] Compact POIDistanceBadges iterates all three travel modes with "—" for missing, matching the detail-sheet Travel table.
- B24 [P3] Drawer.Description (sr-only) added to ListingDetailSheet + MobileBottomSheet to silence Radix a11y warning.
- B25 [P3] lastSeenDays clamped to ≥0 so future timestamps don't render as "-7d ago".
## Frontend map / carousel / tasks polish (Fix-4) — 8 bugs
- B11 [P2] HexgridHeatmapClient destroy race: Map.tsx adds .catch() + ref guard so post-destroy promise rejections are silent no-ops. Verified by browser smoke (24 rapid Map↔List toggles → 0 pageErrors).
- B16 [P2] PhotoCarousel + inner CardCarousel gained keyboard nav (Arrow keys).
- B18 [P2] Default map center moved from Czech Republic to London (zoom 10).
- B19+B29 [P2/P3] Mapbox token: no longer hard-coded fallback; reads env-only and shows a clear "Map unavailable — set VITE_MAPBOX_TOKEN" banner when missing.
- B23 [P3] PhotoCarousel suppresses "1/1" counter for single-photo listings; added onError fallback for broken URLs.
- B26 [P3] PhotoCarousel only enables loop when photos.length > 1.
- B27 [P3] TaskIndicator cancel/clear-all buttons gained aria-label + data-testid.
- B28 [P3] useTaskProgress strips terminal-local task IDs from the polling union — no more forever-poll on completed tasks.
## Tests
74 new vitest tests + 18 new pytest tests. Local: tsc clean, 201 vitest tests pass, 633 pytest tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-10 22:27:29 +00:00
import asyncio
import json
import os
from collections import deque
from unittest.mock import MagicMock, patch, AsyncMock, call

import pytest

import tasks.listing_tasks as module
from tasks.listing_tasks import (
    _update_task_state,
    _PipelineState,
    _process_worker,
    TaskLogHandler,
    SCRAPE_LOCK_NAME,
    LOG_BUFFER_MAX_LINES,
    NUM_WORKERS,
    PHASE_SPLITTING,
    PHASE_FETCHING,
    PHASE_PROCESSING,
    PHASE_COMPLETED,
    dump_listings_task,
)


class TestUpdateTaskState:
    """Tests for _update_task_state."""

    def test_injects_logs_from_active_buffer(self):
        task = MagicMock()
        original = module._active_log_buffer
        try:
            module._active_log_buffer = deque(["log line 1", "log line 2"])
            _update_task_state(task, "test_state", {"key": "value"})
            task.update_state.assert_called_once()
            call_meta = task.update_state.call_args[1]["meta"]
            assert call_meta["logs"] == ["log line 1", "log line 2"]
            assert call_meta["key"] == "value"
        finally:
            module._active_log_buffer = original

    def test_works_when_buffer_is_none(self):
        task = MagicMock()
        original = module._active_log_buffer
        try:
            module._active_log_buffer = None
            _update_task_state(task, "some_state", {"phase": "testing"})
            task.update_state.assert_called_once_with(
                state="some_state", meta={"phase": "testing"}
            )
            # No "logs" key should be injected
            call_meta = task.update_state.call_args[1]["meta"]
            assert "logs" not in call_meta
        finally:
            module._active_log_buffer = original

    def test_state_string_is_passed_through(self):
        task = MagicMock()
        original = module._active_log_buffer
        try:
            module._active_log_buffer = None
            _update_task_state(task, "PROGRESS", {})
            task.update_state.assert_called_once_with(state="PROGRESS", meta={})
        finally:
            module._active_log_buffer = original

    def test_empty_buffer_injects_empty_list(self):
        task = MagicMock()
        original = module._active_log_buffer
        try:
            module._active_log_buffer = deque()
            _update_task_state(task, "state", {"a": 1})
            call_meta = task.update_state.call_args[1]["meta"]
            assert call_meta["logs"] == []
        finally:
            module._active_log_buffer = original


class TestTaskLogHandler:
    """Tests for the TaskLogHandler."""

    def test_emit_appends_to_buffer(self):
        buf = deque(maxlen=10)
        handler = TaskLogHandler(buf)
        handler.setFormatter(
            __import__("logging").Formatter("%(message)s")
        )
        record = __import__("logging").LogRecord(
            name="test", level=20, pathname="", lineno=0,
            msg="hello", args=(), exc_info=None,
        )
        handler.emit(record)
        assert "hello" in buf

    def test_buffer_respects_maxlen(self):
        buf = deque(maxlen=2)
        handler = TaskLogHandler(buf)
        handler.setFormatter(
            __import__("logging").Formatter("%(message)s")
        )
        for i in range(5):
            record = __import__("logging").LogRecord(
                name="test", level=20, pathname="", lineno=0,
                msg=f"msg{i}", args=(), exc_info=None,
            )
            handler.emit(record)
        assert len(buf) == 2
        assert list(buf) == ["msg3", "msg4"]


class TestDumpListingsTask:
    """Tests for dump_listings_task Celery task."""

    @patch("tasks.listing_tasks.redis_lock")
    def test_skips_when_lock_not_acquired(self, mock_redis_lock):
        """Task should skip when another scrape is running."""
        mock_cm = MagicMock()
        mock_cm.__enter__ = MagicMock(return_value=False)
        mock_cm.__exit__ = MagicMock(return_value=False)
        mock_redis_lock.return_value = mock_cm

        from tasks.listing_tasks import dump_listings_task

        # Use run() which handles bind=True properly
        task_instance = dump_listings_task
        task_instance.update_state = MagicMock()

        result = dump_listings_task.run('{"listing_type": "RENT"}')

        assert result["status"] == "skipped"
        assert result["reason"] == "another_job_running"
        mock_redis_lock.assert_called_once_with(SCRAPE_LOCK_NAME)

    @patch("tasks.listing_tasks.asyncio.run")
    @patch("tasks.listing_tasks.redis_lock")
    def test_calls_dump_listings_full_when_lock_acquired(
        self, mock_redis_lock, mock_asyncio_run
    ):
        """Task should call dump_listings_full when lock is acquired."""
        mock_cm = MagicMock()
        mock_cm.__enter__ = MagicMock(return_value=True)
        mock_cm.__exit__ = MagicMock(return_value=False)
        mock_redis_lock.return_value = mock_cm

        mock_asyncio_run.return_value = []

        from tasks.listing_tasks import dump_listings_task

        task_instance = dump_listings_task
        task_instance.update_state = MagicMock()

        params_json = '{"listing_type": "RENT", "min_price": 1000, "max_price": 5000}'
        result = dump_listings_task.run(params_json)

        assert result["phase"] == "completed"
        assert result["progress"] == 1
        mock_asyncio_run.assert_called_once()
        mock_redis_lock.assert_called_once_with(SCRAPE_LOCK_NAME)


class TestSetupPeriodicTasks:
    """Tests for setup_periodic_tasks."""

    # NOTE: every call to setup_periodic_tasks also registers the unconditional
    # `daily-market-aggregator` task (one extra call per invocation), so
    # call_count assertions below account for that +1.

    @patch("tasks.listing_tasks.SchedulesConfig.from_env")
    def test_registers_enabled_schedules(self, mock_from_env):
        from config.schedule_config import ScheduleConfig
        from models.listing import ListingType

        schedule = ScheduleConfig(
            name="Test Schedule",
            listing_type=ListingType.RENT,
            hour="3",
            minute="30",
        )
        mock_config = MagicMock()
        mock_config.get_enabled_schedules.return_value = [schedule]
        mock_from_env.return_value = mock_config

        sender = MagicMock()
        module.setup_periodic_tasks(sender)

        # 1 schedule + 1 market aggregator.
        assert sender.add_periodic_task.call_count == 2
        names = [c.kwargs["name"] for c in sender.add_periodic_task.call_args_list]
        assert "Test Schedule" in names
        assert "daily-market-aggregator" in names

    @patch("tasks.listing_tasks.SchedulesConfig.from_env")
    def test_handles_config_error_gracefully(self, mock_from_env):
        """A malformed SCRAPE_SCHEDULES must not block the market aggregator."""
        mock_from_env.side_effect = ValueError("bad config")

        sender = MagicMock()
        module.setup_periodic_tasks(sender)

        # Aggregator still registers (the two systems are independent).
        assert sender.add_periodic_task.call_count == 1
        assert sender.add_periodic_task.call_args.kwargs["name"] == "daily-market-aggregator"

    @patch("tasks.listing_tasks.SchedulesConfig.from_env")
    def test_registers_nothing_when_no_schedules(self, mock_from_env):
        mock_config = MagicMock()
        mock_config.get_enabled_schedules.return_value = []
        mock_from_env.return_value = mock_config

        sender = MagicMock()
        module.setup_periodic_tasks(sender)

        # Only the market aggregator registered; no user schedules.
        assert sender.add_periodic_task.call_count == 1
        assert sender.add_periodic_task.call_args.kwargs["name"] == "daily-market-aggregator"

    @patch("tasks.listing_tasks.SchedulesConfig.from_env")
    def test_registers_multiple_schedules(self, mock_from_env):
        from config.schedule_config import ScheduleConfig
        from models.listing import ListingType

        schedules = [
            ScheduleConfig(name="Rent", listing_type=ListingType.RENT, hour="2"),
            ScheduleConfig(name="Buy", listing_type=ListingType.BUY, hour="4"),
        ]
        mock_config = MagicMock()
        mock_config.get_enabled_schedules.return_value = schedules
        mock_from_env.return_value = mock_config

        sender = MagicMock()
        module.setup_periodic_tasks(sender)

        # 2 schedules + 1 market aggregator.
        assert sender.add_periodic_task.call_count == 3


class TestPipelineState:
    """Tests for _PipelineState dataclass."""

    def test_default_initialization(self):
        state = _PipelineState()
        assert state.ids_collected == 0
        assert state.completed_subqueries == 0
        assert state.total_pages_fetched == 0
        assert state.fetching_done is False
        assert state.processed_count == 0
        assert state.failed_count == 0
        assert state.details_fetched == 0
        assert state.images_downloaded == 0
        assert state.ocr_completed == 0
        assert state.processed_listings == []

    def test_incrementing_counters(self):
        state = _PipelineState()
        state.ids_collected += 5
        state.completed_subqueries += 3
        state.total_pages_fetched += 10
        state.processed_count += 4
        state.failed_count += 1
        state.details_fetched += 4
        state.images_downloaded += 3
        state.ocr_completed += 2

        assert state.ids_collected == 5
        assert state.completed_subqueries == 3
        assert state.total_pages_fetched == 10
        assert state.processed_count == 4
        assert state.failed_count == 1
        assert state.details_fetched == 4
        assert state.images_downloaded == 3
        assert state.ocr_completed == 2

    def test_appending_to_processed_listings(self):
        state = _PipelineState()
        state.processed_listings.append("listing_a")
        state.processed_listings.append("listing_b")
        assert len(state.processed_listings) == 2
        assert state.processed_listings == ["listing_a", "listing_b"]

    def test_separate_instances_have_independent_lists(self):
        state_a = _PipelineState()
        state_b = _PipelineState()
        state_a.processed_listings.append("only_a")
        assert state_b.processed_listings == []

    def test_fetching_done_toggle(self):
        state = _PipelineState()
        assert state.fetching_done is False
        state.fetching_done = True
        assert state.fetching_done is True


class TestPhaseConstants:
    """Tests for phase constant values."""

    def test_phase_splitting(self):
        assert PHASE_SPLITTING == "splitting"

    def test_phase_fetching(self):
        assert PHASE_FETCHING == "fetching"

    def test_phase_processing(self):
        assert PHASE_PROCESSING == "processing"

    def test_phase_completed(self):
        assert PHASE_COMPLETED == "completed"

    def test_num_workers(self):
        assert NUM_WORKERS == 20
|
wrongmove: round-3 fix sweep — scrape pipeline, BUY tab, filter URL state, render hygiene, map polish
Coordinated fix across 31 bugs found in a parallel QA pass. Findings docs at /tmp/wrongmove-bugs/qa-round-3/qa{1,2,3,4}-*.md.
## Backend / scrape (Fix-1) — 8 bugs
- B1 [P0] Scrape totally broken on prod: pod UID 100 vs NFS dir 1000:1000 mode 775 → PermissionError on every never-seen listing. Switched Dockerfile to explicit `useradd --uid 1000 --gid 1000`; added securityContext + chown initContainer to k8s/{api,celery-beat}-deployment.yaml. Celery worker manifest lives outside this repo — Dockerfile UID change is the load-bearing fix.
- B4 [P1] Celery broker reaped every ~30s by Redis HAProxy idle timeout. Added `broker_transport_options` / `result_backend_transport_options` with `socket_keepalive=True, health_check_interval=25` in celery_app.py + same kwargs on every redis.from_url/Redis call across services/, utils/redis_lock.py, redis_repository.py.
- B5 [P1] dump_listings_task never published terminal FAILURE to the task_progress pub/sub channel — UI polled forever. Wrap body in try/except that publishes FAILURE before re-raising.
- B6 [P1] _process_worker had no per-listing exception handler — one bad listing killed the whole scrape via asyncio.gather. Wrap loop body in try/except Exception (re-raises CancelledError).
- B20 [P2] dump_listings_task gained time_limit=3600, soft_time_limit=3500, acks_late=True.
- B21 [P2] RedisRepository moved off shared db0 (was alongside paperless-ngx) to db3 via REDIS_USER_DB env var; keys prefixed `wrongmove:user:`.
- B32 [P3] redis_lock now uses uuid4() owner token + Lua compare-and-delete.
- B33 [P3] Slack notify in refresh_listings → asyncio.create_task (fire-and-forget).
## Frontend filter system (Fix-2) — 7 bugs
- B2 [P0] BUY tab click triggered "Maximum update depth exceeded" → ErrorBoundary. Replaced the three mutually-triggering useEffects in FilterBar with a single one-way controlled-value flow (URL → parent state → form), guarded by previousListingTypeRef so price-defaults fires once per real transition.
- B3 [P0] Filter values never reached the URL. Wired useFilterParams.setFilterValues into FilterBar/FilterPanel onSubmit + handleRemoveChip + new handleResetAllFilters; fed parsed filterValues into both forms' defaultValues; added URL→form sync via form.reset on browser back/forward.
- B8 [P1] Chip removal now resets form state via new FilterBar onFormReady callback — More badge no longer sticks.
- B12 [P2] Desktop swipe-review FAB added next to header (mobile FAB unchanged).
- B17 [P2] "Reset all" affordance on chip strip.
- B22 [P2] formatPrice precision: 1500 → £1.5k, 2500 → £2.5k (no longer collides with £2k/£3k defaults).
- B30 [P3] last_seen_days input gained min={0}.
## Frontend render hygiene + data integrity (Fix-3) — 8 bugs
- B7 [P1] streamingService bails on first non-NDJSON chunk (HTML response = backend down) and throws StreamParseError so the existing AlertError dialog surfaces a single user-visible error instead of 18× console.error spam.
- B9 [P1] formatDuration widened to (null|undefined|number): returns "—" for non-finite or negative, caps implausibly large values.
- B10 [P1] PropertyCard / PropertyCardCompact / SwipeCard JSX leaves render "—" for null total_price/qm/qmprice (was "£0/0 m²/£0/m²" — looked like free listings).
- B13 [P2] hexgrid worker reduceAverage uses Number.isFinite filter instead of !isNaN (which incorrectly accepted null → 0, biasing per-hex averages low).
- B14 [P2] ListingDetail Overview wraps agency in "Listed by" labelled block so it can't collapse to a bare agency name.
- B15 [P2] Compact POIDistanceBadges iterates all three travel modes with "—" for missing, matching the detail-sheet Travel table.
- B24 [P3] Drawer.Description (sr-only) added to ListingDetailSheet + MobileBottomSheet to silence Radix a11y warning.
- B25 [P3] lastSeenDays clamped to ≥0 so future timestamps don't render as "-7d ago".
## Frontend map / carousel / tasks polish (Fix-4) — 8 bugs
- B11 [P2] HexgridHeatmapClient destroy race: Map.tsx adds .catch() + ref guard so post-destroy promise rejections are silent no-ops. Verified by browser smoke (24 rapid Map↔List toggles → 0 pageErrors).
- B16 [P2] PhotoCarousel + inner CardCarousel gained keyboard nav (Arrow keys).
- B18 [P2] Default map center moved from Czech Republic to London (zoom 10).
- B19+B29 [P2/P3] Mapbox token: no longer hard-coded fallback; reads env-only and shows a clear "Map unavailable — set VITE_MAPBOX_TOKEN" banner when missing.
- B23 [P3] PhotoCarousel suppresses "1/1" counter for single-photo listings; added onError fallback for broken URLs.
- B26 [P3] PhotoCarousel only enables loop when photos.length > 1.
- B27 [P3] TaskIndicator cancel/clear-all buttons gained aria-label + data-testid.
- B28 [P3] useTaskProgress strips terminal-local task IDs from the polling union — no more forever-poll on completed tasks.
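
One reading of the B28 change, sketched in Python: build the union of task IDs known to the server and to the client, then drop any ID the client already knows to be in a terminal state. The function name, the argument shapes, and the set of terminal states are all assumptions; the real hook is `useTaskProgress` in the TypeScript frontend.

```python
from typing import Iterable, Mapping, Set

# Assumed terminal states, following Celery's naming.
TERMINAL_STATES = frozenset({"SUCCESS", "FAILURE", "REVOKED"})


def ids_to_poll(
    server_ids: Iterable[str],
    local_ids: Iterable[str],
    local_states: Mapping[str, str],
) -> Set[str]:
    """Return the task IDs that still need polling (hypothetical helper)."""
    union = set(server_ids) | set(local_ids)
    # A task that is locally known to be finished never needs another poll.
    return {tid for tid in union if local_states.get(tid) not in TERMINAL_STATES}
```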
## Tests
74 new vitest tests + 18 new pytest tests. Local: tsc clean, 201 vitest tests pass, 633 pytest tests pass.
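
The B6 pytest regression tests check a worker loop of roughly the following shape. This is a minimal sketch, not the project's `_process_worker`: `drain_queue`, `Stats`, and the `handle` callback are hypothetical names, and the `None` sentinel convention is taken from the tests.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional


@dataclass
class Stats:
    processed: int = 0
    failed: int = 0


async def drain_queue(
    queue: "asyncio.Queue[Optional[int]]",
    handle: Callable[[int], Awaitable[None]],
    stats: Stats,
) -> None:
    """Process items until the None sentinel, surviving per-item failures."""
    while True:
        item = await queue.get()
        if item is None:
            return
        try:
            await handle(item)
        except asyncio.CancelledError:
            # Cancellation must propagate. On Python 3.8+ CancelledError is a
            # BaseException, so this clause is explicit rather than required.
            raise
        except Exception:
            # One bad item is counted and skipped; the loop keeps draining.
            stats.failed += 1
        else:
            stats.processed += 1
```

Catching `Exception` inside the loop body, rather than around the whole loop, is what keeps one failing item from aborting its siblings.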
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-10 22:27:29 +00:00

# ---------------------------------------------------------------------------
# Regression tests for QA-round-3 backend bugs (B4, B5, B6, B20)
# ---------------------------------------------------------------------------


class TestProcessWorkerExceptionHandling:
    """B6 regression: _process_worker must keep draining the queue when a
    single listing raises an unhandled exception (e.g. PermissionError).
    Previously one bad listing aborted the entire scrape."""

    async def test_continues_after_per_listing_exception(self):
        """A PermissionError from one listing should not stop sibling listings."""
        # Three listings in the queue followed by a None sentinel.
        queue: asyncio.Queue[int | None] = asyncio.Queue()
        for listing_id in [1, 2, 3]:
            await queue.put(listing_id)
        await queue.put(None)

        # Processor: listing 1 succeeds, listing 2 raises, listing 3 succeeds.
        good_listing = MagicMock()

        async def fake_process_listing(listing_id, on_step_complete=None):
            if listing_id == 2:
                raise PermissionError("Permission denied: data/rs/2")
            return good_listing

        processor = MagicMock()
        processor.process_listing = AsyncMock(side_effect=fake_process_listing)

        state = _PipelineState()
        reporter = MagicMock()

        await _process_worker(queue, processor, state, reporter)

        # All three IDs were attempted (queue drained before exit).
        assert processor.process_listing.call_count == 3
        # Two succeeded, one failed.
        assert state.processed_count == 2
        assert state.failed_count == 1
        assert len(state.processed_listings) == 2

    async def test_cancelled_error_propagates(self):
        """CancelledError must NOT be swallowed — cooperative cancellation
        relies on it propagating up through asyncio.gather()."""
        queue: asyncio.Queue[int | None] = asyncio.Queue()
        await queue.put(99)
        # No sentinel — the worker should bubble the CancelledError before
        # ever getting a chance to drain further.

        processor = MagicMock()
        processor.process_listing = AsyncMock(side_effect=asyncio.CancelledError())

        state = _PipelineState()
        reporter = MagicMock()

        with pytest.raises(asyncio.CancelledError):
            await _process_worker(queue, processor, state, reporter)


class TestDumpListingsTaskFailurePublish:
    """B5 regression: dump_listings_task must publish a terminal FAILURE
    event to the task_progress:<id> pub/sub channel when the underlying
    scrape raises an exception. Previously only the happy-path SUCCESS
    was published, leaving WebSocket subscribers stuck on the last
    progress packet."""

    @patch("tasks.listing_tasks.publish_task_progress")
    @patch("tasks.listing_tasks.asyncio.run")
    @patch("tasks.listing_tasks.redis_lock")
    def test_publishes_failure_event_on_exception(
        self, mock_redis_lock, mock_asyncio_run, mock_publish
    ):
        """When dump_listings_full raises, a FAILURE event is published."""
        mock_cm = MagicMock()
        mock_cm.__enter__ = MagicMock(return_value=True)
        mock_cm.__exit__ = MagicMock(return_value=False)
        mock_redis_lock.return_value = mock_cm

        mock_asyncio_run.side_effect = PermissionError(
            "[Errno 13] Permission denied: 'data/rs/12345'"
        )

        dump_listings_task.update_state = MagicMock()

        # Force a deterministic task_id so we can assert on it.
        with patch.object(
            type(dump_listings_task),
            "request",
            new=MagicMock(id="fake-task-id"),
            create=True,
        ):
            with pytest.raises(PermissionError):
                dump_listings_task.run(
                    '{"listing_type": "RENT", "min_price": 1000, "max_price": 5000}'
                )

        # Inspect publish_task_progress calls for a FAILURE event.
        failure_calls = [
            c for c in mock_publish.call_args_list
            if len(c.args) >= 2 and c.args[1] == "FAILURE"
        ]
        assert failure_calls, (
            f"Expected a FAILURE publish, got: "
            f"{[c.args[1] for c in mock_publish.call_args_list if len(c.args) >= 2]}"
        )
        # The meta payload must include an error message.
        meta = failure_calls[0].args[2]
        assert "error" in meta
        assert "Permission denied" in meta["error"]
        assert meta["exc_type"] == "PermissionError"


class TestDumpListingsTaskDecoratorConfig:
    """B20 regression: dump_listings_task must have time_limit /
    soft_time_limit / acks_late configured so dead tasks reap themselves
    even after pickup."""

    def test_task_has_time_limits(self):
        # Celery exposes these via the task attributes once decorated.
        assert dump_listings_task.time_limit == 3600
        assert dump_listings_task.soft_time_limit == 3500

    def test_task_acks_late(self):
        assert dump_listings_task.acks_late is True


class TestCeleryAppKeepaliveOptions:
    """B4 regression: broker / result-backend transport options must
    enable TCP keepalive and a Celery-level health check so the Redis
    HAProxy in front of the in-cluster Sentinel doesn't reap idle
    connections every 30s."""

    def test_broker_transport_options_present(self):
        from celery_app import app as celery_app
        opts = celery_app.conf.get("broker_transport_options") or {}
        assert opts.get("socket_keepalive") is True
        assert opts.get("health_check_interval") == 25

    def test_result_backend_transport_options_present(self):
        from celery_app import app as celery_app
        opts = celery_app.conf.get("result_backend_transport_options") or {}
        assert opts.get("socket_keepalive") is True
        assert opts.get("health_check_interval") == 25


class TestRedisClientKeepalive:
    """B4 regression: every helper that creates a Redis client must
    pass socket_keepalive=True and health_check_interval=25."""

    @patch("services.task_progress_publisher.redis")
    def test_task_progress_publisher_uses_keepalive(self, mock_redis):
        # Reset the cached client so the patch takes effect.
        import services.task_progress_publisher as m
        m._redis_client = None
        m._get_redis_client()
        mock_redis.Redis.from_url.assert_called_once()
        kwargs = mock_redis.Redis.from_url.call_args.kwargs
        assert kwargs["socket_keepalive"] is True
        assert kwargs["health_check_interval"] == 25
        m._redis_client = None  # leave the module clean for other tests

    @patch("utils.redis_lock.redis")
    def test_redis_lock_uses_keepalive(self, mock_redis):
        from utils.redis_lock import get_redis_client
        get_redis_client()
        mock_redis.from_url.assert_called_once()
        kwargs = mock_redis.from_url.call_args.kwargs
        assert kwargs["socket_keepalive"] is True
        assert kwargs["health_check_interval"] == 25