Add spec.paused: null to the strategic merge patch to clear any
paused state that might be blocking rollouts. Also add detailed
deployment status debug output (conditions, replica counts,
paused state) and list ReplicaSets to diagnose why new pods
aren't being created despite spec changes.
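A minimal sketch of the patch body described above, built as a Python dict and serialized to JSON. The helper name, the container name `api`, and the timestamp are assumptions for illustration; `kubectl.kubernetes.io/restartedAt` is the annotation that `kubectl rollout restart` writes.

```python
import json
from datetime import datetime, timezone


def build_restart_patch(container: str, image: str, now: datetime) -> dict:
    """Build a strategic merge patch that sets the image and forces a rollout.

    `container`, `image`, and `now` are hypothetical inputs; the real
    pipeline values are not shown in this log.
    """
    return {
        "spec": {
            # A null value deletes the field, clearing any paused state.
            "paused": None,
            "template": {
                "metadata": {
                    "annotations": {
                        # Changing the pod template forces a new ReplicaSet.
                        "kubectl.kubernetes.io/restartedAt": now.isoformat(),
                    }
                },
                "spec": {
                    # Containers are merged by name, so only the image changes.
                    "containers": [{"name": container, "image": image}]
                },
            },
        }
    }


patch = build_restart_patch(
    "api",
    "viktorbarzin/realestatecrawler:356",
    datetime(2024, 1, 1, tzinfo=timezone.utc),
)
patch_json = json.dumps(patch)
```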
The JSON Patch was updating the deployment spec image, but K8s wasn't
creating new pods because the rollout was stuck from previous failed
builds. Switch to a strategic merge patch that also sets the
restartedAt annotation, equivalent to 'kubectl rollout restart', which
forces K8s to create a new ReplicaSet regardless of stalled rollout
state.
- Add prefers-reduced-motion support (global CSS + SwipeCard spring config)
- Add dismissible map click hint overlay with localStorage persistence
- Replace generic image alt text with descriptive property info across 4 components
- Rename filter buttons to clarify intent (Show Matching Listings / Scrape New from Rightmove)
- Fix mobile FAB overlap with bottom sheet via dynamic snap-aware positioning
- Add swipe review onboarding overlay with gesture explanations and button labels
- Delete unused components: AppSidebar, ActiveQuery, SavedView
Show the deployment spec image to confirm the patch worked, and dump
all pod details (image, ready, phase, restarts, reason) on attempts
1/10/30 and on failure. This will reveal whether pods are
crashlooping, stuck in Pending, or whether the deployment patch
didn't take effect.
Init OTel metrics at module level in celery_app.py so prefork child
processes inherit the MeterProvider and PrometheusMetricReader from
the parent. Previously, worker_process_init created a separate
MeterProvider in each child, disconnected from the HTTP server in
the main process — so all scrape/celery/OCR metrics were silently
lost.
Update Grafana dashboard with API Performance and Frontend Performance
sections, synced from the live cluster dashboard.
Kubernetes may normalize image references to include the docker.io/
prefix (e.g., docker.io/viktorbarzin/realestatecrawler:356), causing
an exact string match to fail. Use endswith() instead. Also add debug
output on the first attempt to show actual pod images and readiness
state.
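A small sketch of the suffix comparison, assuming the pod images have already been extracted into a list of strings. The function names are hypothetical; the real check runs inside the verify-deploy step.

```python
from typing import List


def image_matches(pod_image: str, expected: str) -> bool:
    """True when the pod image ends with the expected reference.

    This tolerates a registry prefix such as docker.io/ that the
    cluster may add to the reference.
    """
    return pod_image.endswith(expected)


def all_pods_updated(pod_images: List[str], expected: str) -> bool:
    """True only when there is at least one pod and every pod matches."""
    return bool(pod_images) and all(
        image_matches(image, expected) for image in pod_images
    )
```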
Collect browser-side worker round-trips, computation times, main-thread
operations, and feature counts, batch them client-side, and expose
them as Prometheus histograms via a new POST /api/perf endpoint.
The jq filter required pods to be younger than 60 seconds AND ready,
but pods often take longer than 60s to become ready (alembic migrations,
image pull, etc.), causing the check to never match. The image tag check
is sufficient to confirm the new version is running. This aligns the API
verify-deploy with the frontend verify-deploy, which has no age filter.
R-tree building, hex grid generation, and percentile sorting now run
off the main thread, eliminating 20s+ UI freezes on large datasets.
The old bundled HexgridHeatmap.js is replaced by a typed worker +
main-thread client with dual R-trees (worker for grid gen, main
thread for synchronous click queries).
The listing stream now fires immediately on auth without waiting for
the POI fetch. POI distances are not needed for initial rendering and
are only computed when the user selects a POI metric or sets travel
filters. This saves ~200-500ms on initial load and keeps the stream on
the cached Redis path.
New GET /api/poi/distances/bulk returns all POI distances keyed by
listing ID, allowing the frontend to fetch distances separately
from the listing stream and keep the stream on the cached path.
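A rough sketch of the "keyed by listing ID" response shape. The row layout (listing ID, POI name, distance) and the function name are assumptions; the actual query and field names are not shown in this log.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def group_distances_by_listing(
    rows: Iterable[Tuple[int, str, float]],
) -> Dict[int, Dict[str, float]]:
    """Group (listing_id, poi_name, distance) rows into a nested mapping."""
    grouped: Dict[int, Dict[str, float]] = defaultdict(dict)
    for listing_id, poi_name, distance in rows:
        grouped[listing_id][poi_name] = distance
    # Return a plain dict so the result serializes as ordinary JSON.
    return dict(grouped)
```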
Decision IDs are now loaded inside the streaming generators after
the metadata message is yielded, eliminating a blocking DB query
from the pre-stream path (~100-200ms improvement to TTFB).
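The ordering can be sketched with a plain generator: the metadata message is yielded first, and the decision lookup runs only when the consumer asks for the next item. The message shapes and the `load_decision_ids` callable are hypothetical stand-ins for the real DB query.

```python
from typing import Callable, Iterator, List, Set


def stream_listings(
    listings: List[dict],
    load_decision_ids: Callable[[], Set[int]],
) -> Iterator[dict]:
    """Yield metadata first, then features not yet decided on."""
    # Sent before any decision lookup, so it does not wait on the DB.
    yield {"type": "metadata", "total": len(listings)}

    # Runs only after the metadata message has been handed to the consumer.
    decided = load_decision_ids()
    for listing in listings:
        if listing["id"] in decided:
            continue
        yield {"type": "feature", "id": listing["id"]}
```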
The first batch is capped at 5 features; subsequent batches use the
normal batch_size (default 50). This reduces server-side
time-to-first-property by ~10x since only 5 features need to be
serialized before the first yield.
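A minimal sketch of the adaptive batching, assuming the features arrive as any iterable. The function and parameter names are illustrative, not the ones used in the API code.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def adaptive_batches(
    items: Iterable[T],
    first_batch_size: int = 5,
    batch_size: int = 50,
) -> Iterator[List[T]]:
    """Yield a small first batch, then batches of the normal size."""
    it = iter(items)

    # Small first batch so the first yield happens quickly.
    first = list(islice(it, first_batch_size))
    if first:
        yield first

    # Remaining items use the regular batch size.
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch
```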
Set explicit CPU limits to override the LimitRange default (500m CPU),
which caused kernel CPU throttling during streaming requests. The API
gets a 2000m CPU limit, celery-beat gets 200m.
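One way to observe this kind of throttling is to diff the `nr_throttled` counter from the container's cgroup `cpu.stat` file before and after a request. This is a sketch under that assumption; how throttling was actually measured for this change is not stated here, and the function names are hypothetical.

```python
from typing import Dict


def parse_cpu_stat(text: str) -> Dict[str, int]:
    """Parse 'key value' lines from a cgroup cpu.stat file."""
    stats: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def throttle_events(before: str, after: str) -> int:
    """Number of throttled periods between two cpu.stat snapshots."""
    return parse_cpu_stat(after)["nr_throttled"] - parse_cpu_stat(before)["nr_throttled"]
```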
Measured baseline: 877ms TTFP (time-to-first-property) cold, 334ms
warm, with CPU throttling confirmed (500m limit, 10-12 throttle events
per streaming request).
Plan covers: CPU limit bump, frontend waterfall elimination, adaptive
first batch, deferred decision filtering, POI decoupling, and
Server-Timing headers.
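For the Server-Timing item, the header value is a comma-separated list of `name;dur=milliseconds` entries. A small sketch of a formatter follows; the function name and the metric names used with it are hypothetical, not taken from the plan.

```python
from typing import Dict


def server_timing_header(timings_ms: Dict[str, float]) -> str:
    """Format per-phase durations (in milliseconds) as a Server-Timing value."""
    return ", ".join(
        f"{name};dur={duration:.1f}" for name, duration in timings_ms.items()
    )
```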
The X button aborts the in-flight fetch via AbortController,
which was already wired up but had no UI trigger. Works for
both desktop and mobile views.
Move the drag handler from the outer card to the details section only.
Swiping on photos now navigates the carousel, while swiping on the
details area below triggers like/dislike/skip actions.
Rightmove API stores photos under the 'photos' key in the response,
but the GeoJSON export and detail API were only checking 'images'.
This key mismatch caused all listings to fall back to the single
photo_thumbnail. Now checks both keys with fallback.
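A sketch of the two-key lookup, assuming the stored response is a dict. The function name is illustrative, and the thumbnail fallback is left to the caller.

```python
def raw_photos(payload: dict) -> list:
    """Return photo entries from 'photos', falling back to 'images'."""
    # An empty or missing 'photos' value falls through to 'images'.
    return payload.get("photos") or payload.get("images") or []
```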
- Use maxSizeUrl instead of url for photo URLs (highest available
resolution from Rightmove)
- Remove 5-photo cap in GeoJSON export — return all available photos
- Apply same fix to both streaming and model-based export paths,
and to the listing detail API endpoint
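The URL selection can be sketched as below. Falling back to `url` when `maxSizeUrl` is absent is an assumption added for robustness, and the function name is hypothetical.

```python
from typing import List


def photo_urls(photos: List[dict]) -> List[str]:
    """Pick the highest-resolution URL for every photo, with no cap."""
    urls: List[str] = []
    for photo in photos:
        # Assumption: fall back to 'url' when 'maxSizeUrl' is missing.
        url = photo.get("maxSizeUrl") or photo.get("url")
        if url:
            urls.append(url)
    return urls
```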
- Card now fills available height with photo carousel in top half
and property details in bottom half
- Details section shows: price, beds/sqm/price-per-sqm, agency,
available date, all POI distances, and price history summary
- Fix DialogTitle accessibility warning in ListingDetailSheet and
MobileBottomSheet (add sr-only Drawer.Title)
SwipeCard now shows all available photos (up to 5) using
embla-carousel instead of just the thumbnail. Includes prev/next
arrow buttons and dot indicators. The photo area uses touch-action:
pan-x so carousel swipes don't trigger card swipes.
The age filter rejected pods older than 60s, but by the time the
verify step runs after build+publish+deploy, the new pods are often
already past that window. The image tag check is sufficient to
identify the correct deployment.
Same pattern as the frontend pipeline: build pushes to a staging
tag (build-N) in parallel with tests. Tests split into unit and
integration shards using a shared venv from the workspace. Publish
step uses skopeo to copy the manifest to final tags (:N, :latest,
:builder) only after all tests and the build succeed. The final
Dockerfile stage is named 'production' so that building with that
target skips the test stage.
Build step now pushes to a staging tag (build-N) in parallel with
test shards. A new publish step uses skopeo to copy the manifest to
the final tags (:N and :latest) only after all tests pass. This is
a server-side manifest copy with no layer re-upload.
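A sketch of the publish step as a list of `skopeo copy` invocations, one per final tag. The repository name and helper are hypothetical; only the `build-N` staging tag and the final tag names come from the description above.

```python
from typing import List


def publish_commands(repo: str, build_number: int, tags: List[str]) -> List[List[str]]:
    """Build skopeo commands that copy the staging tag to each final tag."""
    source = f"docker://{repo}:build-{build_number}"
    return [
        ["skopeo", "copy", source, f"docker://{repo}:{tag}"]
        for tag in tags
    ]


# Hypothetical repository and build number.
commands = publish_commands("registry.example.com/app", 42, ["42", "latest"])
```

Each inner list can be passed to `subprocess.run(..., check=True)` once all test shards have passed.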
- Add onTap callback to SwipeCard using useDrag's tap detection
- Wire through SwipeReviewMode to open ListingDetailSheet on tap
- Fix color overlay misalignment: add relative to card container so
the absolute overlay positions within the rounded card, not the
full-width outer wrapper
Split the monolithic "build and test" kaniko step into a DAG:
tests run in 4 parallel shards (vitest --shard) alongside the
Docker image build, gated by a shared npm ci step. The kaniko
build now targets the named 'production' stage to skip the
in-Dockerfile test stage.
- Replace pip with uv for 10-25x faster dependency installation
- Add BuildKit cache mounts for apt and uv caches across builds
- Add non-root appuser for improved container security
- Add HEALTHCHECK directive for container orchestration
- Add curl to runtime for healthcheck support
- Remove libgl1 (unused), add syntax directive for BuildKit