wrongmove: round-3 fix sweep — scrape pipeline, BUY tab, filter URL state, render hygiene, map polish

Coordinated fix across 31 bugs found in a parallel QA pass. Findings docs at /tmp/wrongmove-bugs/qa-round-3/qa{1,2,3,4}-*.md.

## Backend / scrape (Fix-1) — 8 bugs

- B1 [P0] Scrape totally broken on prod: pod UID 100 vs NFS dir 1000:1000 mode 775 → PermissionError on every never-seen listing. Switched Dockerfile to explicit `useradd --uid 1000 --gid 1000`; added securityContext + chown initContainer to k8s/{api,celery-beat}-deployment.yaml. Celery worker manifest lives outside this repo — Dockerfile UID change is the load-bearing fix.
- B4 [P1] Celery broker reaped every ~30s by Redis HAProxy idle timeout. Added `broker_transport_options` / `result_backend_transport_options` with `socket_keepalive=True, health_check_interval=25` in celery_app.py + same kwargs on every redis.from_url/Redis call across services/, utils/redis_lock.py, redis_repository.py.
- B5 [P1] dump_listings_task never published terminal FAILURE to the task_progress pub/sub channel — UI polled forever. Wrap body in try/except that publishes FAILURE before re-raising.
- B6 [P1] _process_worker had no per-listing exception handler — one bad listing killed the whole scrape via asyncio.gather. Wrap loop body in try/except Exception (re-raises CancelledError).
- B20 [P2] dump_listings_task gained time_limit=3600, soft_time_limit=3500, acks_late=True.
- B21 [P2] RedisRepository moved off shared db0 (was alongside paperless-ngx) to db3 via REDIS_USER_DB env var; keys prefixed `wrongmove:user:`.
- B32 [P3] redis_lock now uses uuid4() owner token + Lua compare-and-delete.
- B33 [P3] Slack notify in refresh_listings → asyncio.create_task (fire-and-forget).

## Frontend filter system (Fix-2) — 7 bugs

- B2 [P0] BUY tab click triggered "Maximum update depth exceeded" → ErrorBoundary. Replaced the three mutually-triggering useEffects in FilterBar with a single one-way controlled-value flow (URL → parent state → form), guarded by previousListingTypeRef so price-defaults fires once per real transition.
- B3 [P0] Filter values never reached the URL. Wired useFilterParams.setFilterValues into FilterBar/FilterPanel onSubmit + handleRemoveChip + new handleResetAllFilters; fed parsed filterValues into both forms' defaultValues; added URL→form sync via form.reset on browser back/forward.
- B8 [P1] Chip removal now resets form state via new FilterBar onFormReady callback — More badge no longer sticks.
- B12 [P2] Desktop swipe-review FAB added next to header (mobile FAB unchanged).
- B17 [P2] "Reset all" affordance on chip strip.
- B22 [P2] formatPrice precision: 1500 → £1.5k, 2500 → £2.5k (no longer collides with £2k/£3k defaults).
- B30 [P3] last_seen_days input gained min={0}.

## Frontend render hygiene + data integrity (Fix-3) — 8 bugs

- B7 [P1] streamingService bails on first non-NDJSON chunk (HTML response = backend down) and throws StreamParseError so the existing AlertError dialog surfaces a single user-visible error instead of 18× console.error spam.
- B9 [P1] formatDuration widened to (null|undefined|number): returns "—" for non-finite or negative, caps implausibly large values.
- B10 [P1] PropertyCard / PropertyCardCompact / SwipeCard JSX leaves render "—" for null total_price/qm/qmprice (was "£0/0 m²/£0/m²" — looked like free listings).
- B13 [P2] hexgrid worker reduceAverage uses Number.isFinite filter instead of !isNaN (which incorrectly accepted null → 0, biasing per-hex averages low).
- B14 [P2] ListingDetail Overview wraps agency in "Listed by" labelled block so it can't collapse to a bare agency name.
- B15 [P2] Compact POIDistanceBadges iterates all three travel modes with "—" for missing, matching the detail-sheet Travel table.
- B24 [P3] Drawer.Description (sr-only) added to ListingDetailSheet + MobileBottomSheet to silence Radix a11y warning.
- B25 [P3] lastSeenDays clamped to ≥0 so future timestamps don't render as "-7d ago".

## Frontend map / carousel / tasks polish (Fix-4) — 8 bugs

- B11 [P2] HexgridHeatmapClient destroy race: Map.tsx adds .catch() + ref guard so post-destroy promise rejections are silent no-ops. Verified by browser smoke (24 rapid Map↔List toggles → 0 pageErrors).
- B16 [P2] PhotoCarousel + inner CardCarousel gained keyboard nav (Arrow keys).
- B18 [P2] Default map center moved from Czech Republic to London (zoom 10).
- B19+B29 [P2/P3] Mapbox token: no longer hard-coded fallback; reads env-only and shows a clear "Map unavailable — set VITE_MAPBOX_TOKEN" banner when missing.
- B23 [P3] PhotoCarousel suppresses "1/1" counter for single-photo listings; added onError fallback for broken URLs.
- B26 [P3] PhotoCarousel only enables loop when photos.length > 1.
- B27 [P3] TaskIndicator cancel/clear-all buttons gained aria-label + data-testid.
- B28 [P3] useTaskProgress strips terminal-local task IDs from the polling union — no more forever-poll on completed tasks.

## Tests

74 new vitest tests + 18 new pytest tests. Local: tsc clean, 201 vitest tests pass, 633 pytest tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
Viktor Barzin 2026-05-10 22:27:29 +00:00
parent 0b5308200e
commit a42944a756
46 changed files with 2260 additions and 238 deletions

View file

@ -0,0 +1,37 @@
"""Regression test for Dockerfile UID/GID.
QA-round-3 B1: the production stage of the Dockerfile must create the
``appuser`` account with UID 1000 / GID 1000. Previously this used
``adduser --system`` which on Debian-slim assigns UID 100 / GID 65534
(nogroup), causing PermissionError when the scraper tried to create new
listing directories on the NFS-backed PVC (owned 1000:1000 mode 775).
"""
from pathlib import Path
REPO_ROOT = Path(__file__).resolve().parents[2]
DOCKERFILE = REPO_ROOT / "Dockerfile"
class TestDockerfileAppUser:
"""The Dockerfile production stage must run as uid 1000 / gid 1000."""
def test_production_stage_creates_user_with_uid_1000(self) -> None:
contents = DOCKERFILE.read_text()
# The fix uses `useradd --uid 1000 --gid 1000` (and a matching
# groupadd) instead of `adduser --system` which would assign uid 100.
assert "--uid 1000" in contents, (
"Dockerfile must create appuser with explicit --uid 1000 to "
"match NFS-backed data PVC ownership"
)
assert "--gid 1000" in contents, (
"Dockerfile must create appuser with explicit --gid 1000"
)
def test_production_stage_does_not_use_adduser_system(self) -> None:
"""`adduser --system` assigns uid 100 — must not be used."""
contents = DOCKERFILE.read_text()
assert "adduser --system" not in contents, (
"Dockerfile must not use `adduser --system` for appuser — "
"it assigns uid 100 which can't write to the 1000:1000 NFS mount"
)

View file

@ -52,6 +52,15 @@ class TestMakeCacheKey:
class TestGetRedisClient:
"""Tests for _get_redis_client() URL parsing."""
# Keepalive kwargs added to keep connection alive across the in-cluster
# Redis HAProxy 30s idle timeout. Same options are passed to every
# redis.from_url call in the codebase.
_KEEPALIVE_KWARGS = {
"decode_responses": True,
"socket_keepalive": True,
"health_check_interval": 25,
}
@mock.patch("services.listing_cache.redis")
def test_default_broker_url(self, mock_redis):
"""Uses default localhost URL when env var is not set."""
@ -59,7 +68,7 @@ class TestGetRedisClient:
_get_redis_client()
mock_redis.from_url.assert_called_once_with(
"redis://localhost:6379/2", decode_responses=True
"redis://localhost:6379/2", **self._KEEPALIVE_KWARGS
)
@mock.patch("services.listing_cache.redis")
@ -71,7 +80,7 @@ class TestGetRedisClient:
_get_redis_client()
mock_redis.from_url.assert_called_once_with(
"redis://myhost:1234/2", decode_responses=True
"redis://myhost:1234/2", **self._KEEPALIVE_KWARGS
)
@mock.patch("services.listing_cache.redis")
@ -84,7 +93,7 @@ class TestGetRedisClient:
_get_redis_client()
mock_redis.from_url.assert_called_once_with(
"redis://:secret@myhost:6379/2", decode_responses=True
"redis://:secret@myhost:6379/2", **self._KEEPALIVE_KWARGS
)
@mock.patch("services.listing_cache.redis")
@ -97,7 +106,7 @@ class TestGetRedisClient:
_get_redis_client()
mock_redis.from_url.assert_called_once_with(
"redis://myhost:6379/2?timeout=5", decode_responses=True
"redis://myhost:6379/2?timeout=5", **self._KEEPALIVE_KWARGS
)

View file

@ -1,4 +1,5 @@
"""Unit tests for tasks/listing_tasks.py."""
import asyncio
import json
import os
from collections import deque
@ -10,6 +11,7 @@ import tasks.listing_tasks as module
from tasks.listing_tasks import (
_update_task_state,
_PipelineState,
_process_worker,
TaskLogHandler,
SCRAPE_LOCK_NAME,
LOG_BUFFER_MAX_LINES,
@ -18,6 +20,7 @@ from tasks.listing_tasks import (
PHASE_FETCHING,
PHASE_PROCESSING,
PHASE_COMPLETED,
dump_listings_task,
)
@ -293,3 +296,174 @@ class TestPhaseConstants:
def test_num_workers(self):
assert NUM_WORKERS == 20
# ---------------------------------------------------------------------------
# Regression tests for QA-round-3 backend bugs (B5, B6, B20)
# ---------------------------------------------------------------------------
class TestProcessWorkerExceptionHandling:
"""B6 regression: _process_worker must keep draining the queue when a
single listing raises an unhandled exception (e.g. PermissionError).
Previously one bad listing aborted the entire scrape."""
async def test_continues_after_per_listing_exception(self):
"""A PermissionError from one listing should not stop sibling listings."""
# Three listings in the queue followed by a None sentinel.
queue: asyncio.Queue[int | None] = asyncio.Queue()
for listing_id in [1, 2, 3]:
await queue.put(listing_id)
await queue.put(None)
# Processor: listing 1 succeeds, listing 2 raises, listing 3 succeeds.
good_listing = MagicMock()
async def fake_process_listing(listing_id, on_step_complete=None):
if listing_id == 2:
raise PermissionError("Permission denied: data/rs/2")
return good_listing
processor = MagicMock()
processor.process_listing = AsyncMock(side_effect=fake_process_listing)
state = _PipelineState()
reporter = MagicMock()
await _process_worker(queue, processor, state, reporter)
# All three IDs were attempted (queue drained before exit).
assert processor.process_listing.call_count == 3
# Two succeeded, one failed.
assert state.processed_count == 2
assert state.failed_count == 1
assert len(state.processed_listings) == 2
async def test_cancelled_error_propagates(self):
"""CancelledError must NOT be swallowed — cooperative cancellation
relies on it propagating up through asyncio.gather()."""
queue: asyncio.Queue[int | None] = asyncio.Queue()
await queue.put(99)
# No sentinel — the worker should bubble the CancelledError before
# ever getting a chance to drain further.
processor = MagicMock()
processor.process_listing = AsyncMock(side_effect=asyncio.CancelledError())
state = _PipelineState()
reporter = MagicMock()
with pytest.raises(asyncio.CancelledError):
await _process_worker(queue, processor, state, reporter)
class TestDumpListingsTaskFailurePublish:
"""B5 regression: dump_listings_task must publish a terminal FAILURE
event to the task_progress:<id> pub/sub channel when the underlying
scrape raises an exception. Previously only the happy-path SUCCESS
was published, leaving WebSocket subscribers stuck on the last
progress packet."""
@patch("tasks.listing_tasks.publish_task_progress")
@patch("tasks.listing_tasks.asyncio.run")
@patch("tasks.listing_tasks.redis_lock")
def test_publishes_failure_event_on_exception(
self, mock_redis_lock, mock_asyncio_run, mock_publish
):
"""When dump_listings_full raises, a FAILURE event is published."""
mock_cm = MagicMock()
mock_cm.__enter__ = MagicMock(return_value=True)
mock_cm.__exit__ = MagicMock(return_value=False)
mock_redis_lock.return_value = mock_cm
mock_asyncio_run.side_effect = PermissionError(
"[Errno 13] Permission denied: 'data/rs/12345'"
)
dump_listings_task.update_state = MagicMock()
# Force a deterministic task_id so we can assert on it.
with patch.object(
type(dump_listings_task),
"request",
new=MagicMock(id="fake-task-id"),
create=True,
):
with pytest.raises(PermissionError):
dump_listings_task.run(
'{"listing_type": "RENT", "min_price": 1000, "max_price": 5000}'
)
# Inspect publish_task_progress calls for a FAILURE event.
failure_calls = [
c for c in mock_publish.call_args_list
if len(c.args) >= 2 and c.args[1] == "FAILURE"
]
assert failure_calls, (
f"Expected a FAILURE publish, got: "
f"{[c.args[1] for c in mock_publish.call_args_list if len(c.args) >= 2]}"
)
# The meta payload must include an error message.
meta = failure_calls[0].args[2]
assert "error" in meta
assert "Permission denied" in meta["error"]
assert meta["exc_type"] == "PermissionError"
class TestDumpListingsTaskDecoratorConfig:
"""B20 regression: dump_listings_task must have time_limit /
soft_time_limit / acks_late configured so dead tasks reap themselves
even after pickup."""
def test_task_has_time_limits(self):
# Celery exposes these via the task attributes once decorated.
assert dump_listings_task.time_limit == 3600
assert dump_listings_task.soft_time_limit == 3500
def test_task_acks_late(self):
assert dump_listings_task.acks_late is True
class TestCeleryAppKeepaliveOptions:
"""B4 regression: broker / result-backend transport options must
enable TCP keepalive and a Celery-level health check so the Redis
HAProxy in front of the in-cluster Sentinel doesn't reap idle
connections every 30s."""
def test_broker_transport_options_present(self):
from celery_app import app as celery_app
opts = celery_app.conf.get("broker_transport_options") or {}
assert opts.get("socket_keepalive") is True
assert opts.get("health_check_interval") == 25
def test_result_backend_transport_options_present(self):
from celery_app import app as celery_app
opts = celery_app.conf.get("result_backend_transport_options") or {}
assert opts.get("socket_keepalive") is True
assert opts.get("health_check_interval") == 25
class TestRedisClientKeepalive:
"""B4 regression: every helper that creates a Redis client must
pass socket_keepalive=True and health_check_interval=25."""
@patch("services.task_progress_publisher.redis")
def test_task_progress_publisher_uses_keepalive(self, mock_redis):
# Reset the cached client so the patch takes effect.
import services.task_progress_publisher as m
m._redis_client = None
m._get_redis_client()
mock_redis.Redis.from_url.assert_called_once()
kwargs = mock_redis.Redis.from_url.call_args.kwargs
assert kwargs["socket_keepalive"] is True
assert kwargs["health_check_interval"] == 25
m._redis_client = None # leave the module clean for other tests
@patch("utils.redis_lock.redis")
def test_redis_lock_uses_keepalive(self, mock_redis):
from utils.redis_lock import get_redis_client
get_redis_client()
mock_redis.from_url.assert_called_once()
kwargs = mock_redis.from_url.call_args.kwargs
assert kwargs["socket_keepalive"] is True
assert kwargs["health_check_interval"] == 25

View file

@ -6,69 +6,103 @@ import pytest
from utils.redis_lock import redis_lock, get_redis_client
def _setup_client(mock_get_client: mock.MagicMock, set_return: object = True) -> mock.MagicMock:
"""Return a MagicMock redis client wired up for the lock helper."""
mock_client = mock.MagicMock()
mock_client.set.return_value = set_return
mock_get_client.return_value = mock_client
return mock_client
class TestRedisLock:
"""Tests for redis_lock context manager."""
@mock.patch("utils.redis_lock.get_redis_client")
def test_lock_acquired_successfully(self, mock_get_client):
"""Test lock acquisition when no other lock exists."""
mock_client = mock.MagicMock()
mock_client.set.return_value = True
mock_get_client.return_value = mock_client
mock_client = _setup_client(mock_get_client)
with redis_lock("test_lock") as acquired:
assert acquired is True
mock_client.set.assert_called_once_with("lock:test_lock", "1", nx=True, ex=3600 * 4)
mock_client.delete.assert_called_once_with("lock:test_lock")
# Lock is set with the owner UUID, nx=True, and the configured TTL.
assert mock_client.set.call_count == 1
args, kwargs = mock_client.set.call_args
assert args[0] == "lock:test_lock"
assert isinstance(args[1], str) and len(args[1]) == 32 # uuid4 hex
assert kwargs == {"nx": True, "ex": 3600 * 4}
# Release happens via register_script (Lua CAS), not raw DEL.
mock_client.register_script.assert_called_once()
# The script wrapper is called once with the lock key and owner token.
release_script = mock_client.register_script.return_value
release_script.assert_called_once()
call_args = release_script.call_args
assert call_args.kwargs["keys"] == ["lock:test_lock"]
assert call_args.kwargs["args"][0] == args[1] # same owner token
@mock.patch("utils.redis_lock.get_redis_client")
def test_lock_not_acquired(self, mock_get_client):
"""Test lock not acquired when another lock exists."""
mock_client = mock.MagicMock()
mock_client.set.return_value = None # Redis returns None when nx=True fails
mock_get_client.return_value = mock_client
# Redis returns None when nx=True fails
mock_client = _setup_client(mock_get_client, set_return=None)
with redis_lock("test_lock") as acquired:
assert acquired is False
mock_client.set.assert_called_once_with("lock:test_lock", "1", nx=True, ex=3600 * 4)
# Should NOT call delete since we didn't acquire the lock
mock_client.delete.assert_not_called()
# Should NOT register or invoke the release script since we didn't acquire.
mock_client.register_script.assert_not_called()
@mock.patch("utils.redis_lock.get_redis_client")
def test_lock_released_on_exception(self, mock_get_client):
"""Test lock is released even when exception occurs."""
mock_client = mock.MagicMock()
mock_client.set.return_value = True
mock_get_client.return_value = mock_client
mock_client = _setup_client(mock_get_client)
with pytest.raises(ValueError):
with redis_lock("test_lock") as acquired:
assert acquired is True
raise ValueError("Test error")
# Lock should still be released
mock_client.delete.assert_called_once_with("lock:test_lock")
# Lock should still be released via the Lua CAS script.
mock_client.register_script.assert_called_once()
mock_client.register_script.return_value.assert_called_once()
@mock.patch("utils.redis_lock.get_redis_client")
def test_custom_timeout(self, mock_get_client):
"""Test lock with custom timeout."""
mock_client = mock.MagicMock()
mock_client.set.return_value = True
mock_get_client.return_value = mock_client
mock_client = _setup_client(mock_get_client)
with redis_lock("test_lock", timeout=300) as acquired:
assert acquired is True
mock_client.set.assert_called_once_with("lock:test_lock", "1", nx=True, ex=300)
# Only one SET call with the configured TTL.
args, kwargs = mock_client.set.call_args
assert args[0] == "lock:test_lock"
assert kwargs == {"nx": True, "ex": 300}
@mock.patch("utils.redis_lock.get_redis_client")
def test_owner_token_is_unique_per_acquisition(self, mock_get_client):
"""Each acquisition gets a fresh UUID owner token (fencing token)."""
mock_client = _setup_client(mock_get_client)
with redis_lock("test_lock"):
pass
token_first = mock_client.set.call_args[0][1]
with redis_lock("test_lock"):
pass
token_second = mock_client.set.call_args[0][1]
assert token_first != token_second
@mock.patch("utils.redis_lock.redis")
def test_get_redis_client_uses_broker_url(self, mock_redis):
"""Test Redis client is created from CELERY_BROKER_URL."""
"""Test Redis client is created from CELERY_BROKER_URL with keepalive."""
with mock.patch.dict("os.environ", {"CELERY_BROKER_URL": "redis://testhost:1234/5"}):
get_redis_client()
mock_redis.from_url.assert_called_once_with(
"redis://testhost:1234/5", decode_responses=True
"redis://testhost:1234/5",
decode_responses=True,
socket_keepalive=True,
health_check_interval=25,
)

View file

@ -0,0 +1,96 @@
"""Unit tests for redis_repository.RedisRepository.
Regression coverage for QA-round-3 B21: user state must live on a dedicated
Redis logical DB (default db3) with all keys prefixed by ``wrongmove:user:``.
"""
from datetime import timedelta
from unittest import mock
import pytest
def _patched_app() -> mock.MagicMock:
"""Return a MagicMock standing in for celery_app.app.broker_connection()."""
fake_app = mock.MagicMock()
fake_app.broker_connection.return_value.info.return_value = {
"hostname": "redis.test",
"port": 6379,
}
return fake_app
class TestRedisRepositoryDbSelection:
"""B21 regression: RedisRepository writes to a dedicated DB (default 3)."""
def test_defaults_to_db_3(self):
import redis_repository
with mock.patch.dict("os.environ", {}, clear=False) as _env:
_env.pop("REDIS_USER_DB", None)
with mock.patch.object(redis_repository, "redis") as mock_redis, \
mock.patch.object(redis_repository, "app", _patched_app()):
redis_repository.RedisRepository()
kwargs = mock_redis.Redis.call_args.kwargs
assert kwargs["db"] == 3
assert kwargs["socket_keepalive"] is True
assert kwargs["health_check_interval"] == 25
def test_honours_REDIS_USER_DB_env_var(self):
import redis_repository
with mock.patch.dict("os.environ", {"REDIS_USER_DB": "7"}):
with mock.patch.object(redis_repository, "redis") as mock_redis, \
mock.patch.object(redis_repository, "app", _patched_app()):
redis_repository.RedisRepository()
kwargs = mock_redis.Redis.call_args.kwargs
assert kwargs["db"] == 7
class TestRedisRepositoryKeyPrefix:
"""B21 regression: all keys written via set_key/get_key are namespaced
with the ``wrongmove:user:`` prefix to avoid collisions on a shared
Redis instance."""
def _fresh_repo(self) -> tuple[object, mock.MagicMock]:
import redis_repository
with mock.patch.object(redis_repository, "redis"), \
mock.patch.object(redis_repository, "app", _patched_app()):
repo = redis_repository.RedisRepository()
fake_client = mock.MagicMock()
repo.redis_client = fake_client
return repo, fake_client
def test_set_key_prepends_prefix(self):
repo, fake_client = self._fresh_repo()
repo.set_key("webauthn:challenge:abc", {"x": 1})
# First positional arg to .set is the key.
set_call_key = fake_client.set.call_args.args[0]
assert set_call_key == "wrongmove:user:webauthn:challenge:abc"
# expire uses the same prefixed key.
expire_call_key = fake_client.expire.call_args.args[0]
assert expire_call_key == "wrongmove:user:webauthn:challenge:abc"
def test_get_key_uses_prefix(self):
repo, fake_client = self._fresh_repo()
fake_client.get.return_value = '{"hello": "world"}'
result = repo.get_key("webauthn:challenge:abc")
fake_client.get.assert_called_once_with(
"wrongmove:user:webauthn:challenge:abc"
)
assert result == {"hello": "world"}
def test_does_not_double_prefix_already_prefixed_key(self):
repo, fake_client = self._fresh_repo()
repo.set_key("wrongmove:user:already-prefixed", {"x": 1})
set_call_key = fake_client.set.call_args.args[0]
assert set_call_key == "wrongmove:user:already-prefixed"
def test_add_task_for_user_uses_prefixed_key(self):
repo, fake_client = self._fresh_repo()
fake_client.get.return_value = None # no prior tasks
from api.auth import User
user = User(sub="", email="test@example.com", name="")
repo.add_task_for_user(user, "task-xyz")
# The redis SET should target the namespaced key.
set_call_key = fake_client.set.call_args.args[0]
assert set_call_key == "wrongmove:user:test@example.com/tasks"