Add time-to-first-property implementation plan
7 tasks: K8s resource limits, adaptive first batch, deferred decision fetch, Server-Timing headers, bulk POI distances endpoint, frontend waterfall elimination, and end-to-end verification.
This commit is contained in:
parent
6d653dba63
commit
8db7b60493
1 changed files with 683 additions and 0 deletions
683
docs/plans/2026-02-22-time-to-first-property-plan.md
Normal file
683
docs/plans/2026-02-22-time-to-first-property-plan.md
Normal file
|
|
@ -0,0 +1,683 @@
|
||||||
|
# Time-to-First-Property Performance Optimization — Implementation Plan
|
||||||
|
|
||||||
|
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
|
||||||
|
|
||||||
|
**Goal:** Reduce time-to-first-property from 5s+ to under 1s by eliminating CPU throttling, frontend waterfalls, and pre-stream blocking work.
|
||||||
|
|
||||||
|
**Architecture:** Six independent changes: (1) K8s resource limits fix, (2) adaptive first-batch sizing on the backend, (3) deferred decision-ID fetch, (4) Server-Timing headers, (5) bulk POI distances endpoint, (6) frontend waterfall elimination. Tasks 1-4 are backend-only; task 5 spans backend+frontend; task 6 is frontend-only.
|
||||||
|
|
||||||
|
**Tech Stack:** Python/FastAPI, Redis, K8s YAML, React/TypeScript, NDJSON streaming
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Task 1: Set Explicit API Pod Resources in K8s
|
||||||
|
|
||||||
|
The `realestate-crawler-api` deployment inherits LimitRange defaults (500m CPU / 1Gi memory). The 500m CPU limit causes kernel throttling during streaming requests (~240-460ms lost per request).
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
- Create: `k8s/api-deployment.yaml`
|
||||||
|
|
||||||
|
**Step 1: Create the K8s deployment manifest**
|
||||||
|
|
||||||
|
Extract the current deployment, then set explicit resources. Run:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
KUBECONFIG=.config kubectl get deployment realestate-crawler-api -n realestate-crawler -o yaml > k8s/api-deployment.yaml
|
||||||
|
```
|
||||||
|
|
||||||
|
Then edit `k8s/api-deployment.yaml` to set these resource values on the container spec:
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
resources:
|
||||||
|
requests:
|
||||||
|
cpu: "50m"
|
||||||
|
memory: "128Mi"
|
||||||
|
limits:
|
||||||
|
cpu: "2000m"
|
||||||
|
memory: "1Gi"
|
||||||
|
```
|
||||||
|
|
||||||
|
Remove server-managed fields (`managedFields`, `resourceVersion`, `uid`, `creationTimestamp`, `generation`, `status`) to make it a clean declarative manifest.
|
||||||
|
|
||||||
|
**Step 2: Also create a manifest for celery-beat**
|
||||||
|
|
||||||
|
Extract and clean up `realestate-crawler-celery-beat` the same way. It only needs minimal resources:
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
resources:
|
||||||
|
requests:
|
||||||
|
cpu: "10m"
|
||||||
|
memory: "64Mi"
|
||||||
|
limits:
|
||||||
|
cpu: "200m"
|
||||||
|
memory: "256Mi"
|
||||||
|
```
|
||||||
|
|
||||||
|
**Step 3: Apply and verify**
|
||||||
|
|
||||||
|
```bash
|
||||||
|
KUBECONFIG=.config kubectl apply -f k8s/api-deployment.yaml
|
||||||
|
KUBECONFIG=.config kubectl rollout status deployment/realestate-crawler-api -n realestate-crawler
|
||||||
|
```
|
||||||
|
|
||||||
|
Verify throttling is eliminated by re-running the performance test from the design doc and checking `nr_throttled` delta is 0.
|
||||||
|
|
||||||
|
**Step 4: Commit**
|
||||||
|
|
||||||
|
```bash
|
||||||
|
git add k8s/api-deployment.yaml k8s/celery-beat-deployment.yaml
|
||||||
|
git commit -m "Set explicit resource limits for API and celery-beat pods
|
||||||
|
|
||||||
|
Overrides LimitRange defaults (500m CPU) which caused kernel CPU
|
||||||
|
throttling during streaming requests. API gets 2000m CPU limit,
|
||||||
|
celery-beat gets 200m."
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Task 2: Adaptive First Batch Size
|
||||||
|
|
||||||
|
Send a small "primer" batch (5 features) immediately, then switch to normal batch size. This reduces first-property latency by ~10x on the server side.
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
- Modify: `api/app.py:55` (add `FIRST_BATCH_SIZE` constant)
|
||||||
|
- Modify: `api/app.py:271-311` (`_stream_from_cache`)
|
||||||
|
- Modify: `api/app.py:314-383` (`_stream_from_db`)
|
||||||
|
- Test: `tests/test_listing_geojson.py`
|
||||||
|
|
||||||
|
**Step 1: Write the failing test**
|
||||||
|
|
||||||
|
Add to `tests/test_listing_geojson.py` in `TestStreamingEndpoint`:
|
||||||
|
|
||||||
|
```python
|
||||||
|
def test_first_batch_is_smaller(self, client, mock_repository):
|
||||||
|
"""Test that the first batch is smaller than subsequent batches for fast first paint."""
|
||||||
|
response = client.get("/api/listing_geojson/stream?listing_type=RENT&batch_size=50&limit=10")
|
||||||
|
lines = response.text.strip().split("\n")
|
||||||
|
messages = [json.loads(line) for line in lines]
|
||||||
|
|
||||||
|
batch_messages = [m for m in messages if m["type"] == "batch"]
|
||||||
|
assert len(batch_messages) >= 1
|
||||||
|
|
||||||
|
# First batch should contain FIRST_BATCH_SIZE (5) or fewer features
|
||||||
|
from api.app import FIRST_BATCH_SIZE
|
||||||
|
assert len(batch_messages[0]["features"]) <= FIRST_BATCH_SIZE
|
||||||
|
```
|
||||||
|
|
||||||
|
**Step 2: Run test to verify it fails**
|
||||||
|
|
||||||
|
```bash
|
||||||
|
pytest tests/test_listing_geojson.py::TestStreamingEndpoint::test_first_batch_is_smaller -v
|
||||||
|
```
|
||||||
|
|
||||||
|
Expected: FAIL — `FIRST_BATCH_SIZE` not defined, or first batch has 50 features.
|
||||||
|
|
||||||
|
**Step 3: Implement adaptive batching**
|
||||||
|
|
||||||
|
In `api/app.py`, add the constant near line 55:
|
||||||
|
|
||||||
|
```python
|
||||||
|
FIRST_BATCH_SIZE = 5
|
||||||
|
```
|
||||||
|
|
||||||
|
In `_stream_from_cache` (around line 291), replace the single-batch-size loop with adaptive logic:
|
||||||
|
|
||||||
|
```python
|
||||||
|
count = 0
|
||||||
|
is_first_batch = True
|
||||||
|
for feature_batch in get_cached_features(query_parameters, batch_size=batch_size):
|
||||||
|
# Apply decision filtering
|
||||||
|
if decision_filter != "everything":
|
||||||
|
feature_batch = [
|
||||||
|
f for f in feature_batch
|
||||||
|
if _should_include(
|
||||||
|
f.get("properties", {}).get("id", 0),
|
||||||
|
decision_filter,
|
||||||
|
disliked_ids,
|
||||||
|
liked_ids,
|
||||||
|
)
|
||||||
|
]
|
||||||
|
if limit and count + len(feature_batch) > limit:
|
||||||
|
feature_batch = feature_batch[:limit - count]
|
||||||
|
count += len(feature_batch)
|
||||||
|
if feature_batch:
|
||||||
|
if is_first_batch:
|
||||||
|
# Yield a small primer batch for fast first paint
|
||||||
|
yield json.dumps({"type": "batch", "features": feature_batch[:FIRST_BATCH_SIZE]}) + "\n"
|
||||||
|
remainder = feature_batch[FIRST_BATCH_SIZE:]
|
||||||
|
if remainder:
|
||||||
|
yield json.dumps({"type": "batch", "features": remainder}) + "\n"
|
||||||
|
is_first_batch = False
|
||||||
|
else:
|
||||||
|
yield json.dumps({"type": "batch", "features": feature_batch}) + "\n"
|
||||||
|
if limit and count >= limit:
|
||||||
|
break
|
||||||
|
```
|
||||||
|
|
||||||
|
In `_stream_from_db` (around line 358-366), replace the batch accumulation logic:
|
||||||
|
|
||||||
|
```python
|
||||||
|
count = 0
|
||||||
|
batch: list[dict] = []
|
||||||
|
is_first_batch = True
|
||||||
|
current_batch_target = FIRST_BATCH_SIZE
|
||||||
|
for row in repository.stream_listings_optimized(
|
||||||
|
query_parameters, limit=limit, page_size=batch_size
|
||||||
|
):
|
||||||
|
feature = convert_row_to_geojson(row, query_parameters.listing_type.value)
|
||||||
|
if poi_distances_lookup and row['id'] in poi_distances_lookup:
|
||||||
|
feature['properties']['poi_distances'] = poi_distances_lookup[row['id']]
|
||||||
|
|
||||||
|
if not _should_include(row['id'], decision_filter, disliked_ids, liked_ids):
|
||||||
|
if staging_key:
|
||||||
|
cache_features_batch_staged(staging_key, [feature])
|
||||||
|
continue
|
||||||
|
|
||||||
|
batch.append(feature)
|
||||||
|
count += 1
|
||||||
|
|
||||||
|
if len(batch) >= current_batch_target:
|
||||||
|
if staging_key:
|
||||||
|
cache_features_batch_staged(staging_key, batch)
|
||||||
|
yield json.dumps({"type": "batch", "features": batch}) + "\n"
|
||||||
|
batch = []
|
||||||
|
if is_first_batch:
|
||||||
|
is_first_batch = False
|
||||||
|
current_batch_target = batch_size
|
||||||
|
```
|
||||||
|
|
||||||
|
**Step 4: Run tests**
|
||||||
|
|
||||||
|
```bash
|
||||||
|
pytest tests/test_listing_geojson.py -v
|
||||||
|
```
|
||||||
|
|
||||||
|
Expected: All pass, including `test_first_batch_is_smaller`.
|
||||||
|
|
||||||
|
**Step 5: Commit**
|
||||||
|
|
||||||
|
```bash
|
||||||
|
git add api/app.py tests/test_listing_geojson.py
|
||||||
|
git commit -m "Send smaller first batch (5 features) for faster first paint
|
||||||
|
|
||||||
|
Subsequent batches use the normal batch_size (default 50). This
|
||||||
|
reduces server-side time-to-first-property by ~10x since only 5
|
||||||
|
features need to be serialized before the first yield."
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Task 3: Defer Decision ID Fetch
|
||||||
|
|
||||||
|
Move the decision ID lookup out of the blocking pre-stream path. Start streaming immediately, fetch decision IDs concurrently.
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
- Modify: `api/app.py:385-459` (`stream_listing_geojson`)
|
||||||
|
- Modify: `api/app.py:271-311` (`_stream_from_cache`)
|
||||||
|
- Modify: `api/app.py:314-383` (`_stream_from_db`)
|
||||||
|
- Test: `tests/test_listing_geojson.py`
|
||||||
|
|
||||||
|
**Step 1: Write the failing test**
|
||||||
|
|
||||||
|
Add to `tests/test_listing_geojson.py` in `TestStreamingEndpoint`:
|
||||||
|
|
||||||
|
```python
|
||||||
|
def test_streaming_still_filters_disliked(self, client):
|
||||||
|
"""Test that disliked listings are filtered even with deferred decision loading."""
|
||||||
|
with patch("api.app.get_cached_count", return_value=None), \
|
||||||
|
patch("api.app._get_user_id_safe", return_value=42), \
|
||||||
|
patch("api.app.ListingRepository") as MockRepo, \
|
||||||
|
patch("api.app.DecisionRepository"), \
|
||||||
|
patch("api.app.decision_service") as mock_ds:
|
||||||
|
|
||||||
|
mock_ds.get_disliked_ids.return_value = {2}
|
||||||
|
|
||||||
|
mock_instance = MagicMock()
|
||||||
|
mock_instance.count_listings.return_value = 3
|
||||||
|
mock_instance.stream_listings_optimized.return_value = iter([
|
||||||
|
{
|
||||||
|
'id': 1, 'price': 2000.0, 'number_of_bedrooms': 2,
|
||||||
|
'square_meters': 50.0, 'longitude': -0.1, 'latitude': 51.5,
|
||||||
|
'photo_thumbnail': None, 'last_seen': datetime.now(),
|
||||||
|
'agency': None, 'price_history_json': '[]', 'available_from': None,
|
||||||
|
},
|
||||||
|
{
|
||||||
|
'id': 2, 'price': 2500.0, 'number_of_bedrooms': 2,
|
||||||
|
'square_meters': 60.0, 'longitude': -0.12, 'latitude': 51.51,
|
||||||
|
'photo_thumbnail': None, 'last_seen': datetime.now(),
|
||||||
|
'agency': None, 'price_history_json': '[]', 'available_from': None,
|
||||||
|
},
|
||||||
|
])
|
||||||
|
MockRepo.return_value = mock_instance
|
||||||
|
|
||||||
|
response = client.get("/api/listing_geojson/stream?listing_type=RENT&limit=10")
|
||||||
|
messages = [json.loads(line) for line in response.text.strip().split("\n")]
|
||||||
|
|
||||||
|
all_features = []
|
||||||
|
for m in messages:
|
||||||
|
if m["type"] == "batch":
|
||||||
|
all_features.extend(m["features"])
|
||||||
|
|
||||||
|
# Listing 2 should be filtered out
|
||||||
|
feature_ids = [f["properties"]["id"] for f in all_features]
|
||||||
|
assert 2 not in feature_ids
|
||||||
|
assert 1 in feature_ids
|
||||||
|
```
|
||||||
|
|
||||||
|
**Step 2: Run test to verify it passes (existing behavior)**
|
||||||
|
|
||||||
|
```bash
|
||||||
|
pytest tests/test_listing_geojson.py::TestStreamingEndpoint::test_streaming_still_filters_disliked -v
|
||||||
|
```
|
||||||
|
|
||||||
|
Expected: PASS — this is a behavior-preservation test.
|
||||||
|
|
||||||
|
**Step 3: Refactor to defer decision loading**
|
||||||
|
|
||||||
|
In `stream_listing_geojson` (line ~415-431), replace the synchronous decision loading with a lazy loader pattern. Pass the user email and decision_filter to the generators, and let them fetch IDs on demand:
|
||||||
|
|
||||||
|
Change the `stream_listing_geojson` endpoint to pass `user_email` and `decision_filter` to the generators instead of pre-fetched ID sets. The generators call `_get_user_id_safe` and `decision_service` internally after yielding the metadata message.
|
||||||
|
|
||||||
|
In both `_stream_from_cache` and `_stream_from_db`, add parameters `user_email: str | None = None` and `decision_filter: str = "all"`. After yielding the metadata message, resolve the decision IDs:
|
||||||
|
|
||||||
|
```python
|
||||||
|
# Resolve decision IDs (deferred to after metadata is sent)
|
||||||
|
disliked_ids: set[int] | None = None
|
||||||
|
liked_ids: set[int] | None = None
|
||||||
|
if decision_filter != "everything" and user_email:
|
||||||
|
user_id = _get_user_id_safe(user_email)
|
||||||
|
if user_id is not None:
|
||||||
|
decision_repo = DecisionRepository(engine)
|
||||||
|
listing_type_str = query_parameters.listing_type.value
|
||||||
|
if decision_filter == "liked":
|
||||||
|
liked_ids = decision_service.get_liked_ids(
|
||||||
|
decision_repo, user_id, listing_type_str
|
||||||
|
)
|
||||||
|
else:
|
||||||
|
disliked_ids = decision_service.get_disliked_ids(
|
||||||
|
decision_repo, user_id, listing_type_str
|
||||||
|
)
|
||||||
|
```
|
||||||
|
|
||||||
|
Remove the `disliked_ids` and `liked_ids` parameters from the generator signatures and the pre-fetch from `stream_listing_geojson`.
|
||||||
|
|
||||||
|
**Step 4: Run all tests**
|
||||||
|
|
||||||
|
```bash
|
||||||
|
pytest tests/test_listing_geojson.py -v
|
||||||
|
```
|
||||||
|
|
||||||
|
Expected: All pass — decision filtering still works, but now happens after the metadata message.
|
||||||
|
|
||||||
|
**Step 5: Commit**
|
||||||
|
|
||||||
|
```bash
|
||||||
|
git add api/app.py tests/test_listing_geojson.py
|
||||||
|
git commit -m "Defer decision ID fetch to after metadata message
|
||||||
|
|
||||||
|
Decision IDs are now loaded inside the streaming generators after
|
||||||
|
the metadata message is yielded, eliminating a blocking DB query
|
||||||
|
from the pre-stream path (~100-200ms improvement to TTFB)."
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Task 4: Add Server-Timing Headers
|
||||||
|
|
||||||
|
Add `Server-Timing` headers to the streaming response so latency breakdown is visible in browser DevTools.
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
- Modify: `api/app.py:385-459` (`stream_listing_geojson`)
|
||||||
|
- Test: `tests/test_listing_geojson.py`
|
||||||
|
|
||||||
|
**Step 1: Write the failing test**
|
||||||
|
|
||||||
|
Add to `TestStreamingEndpoint`:
|
||||||
|
|
||||||
|
```python
|
||||||
|
def test_streaming_includes_server_timing_header(self, client, mock_repository):
|
||||||
|
"""Test that streaming response includes Server-Timing header."""
|
||||||
|
response = client.get("/api/listing_geojson/stream?listing_type=RENT&limit=10")
|
||||||
|
assert response.status_code == 200
|
||||||
|
assert "server-timing" in response.headers
|
||||||
|
# Should contain at least the cache_check timing
|
||||||
|
assert "cache_check" in response.headers["server-timing"]
|
||||||
|
```
|
||||||
|
|
||||||
|
**Step 2: Run test to verify it fails**
|
||||||
|
|
||||||
|
```bash
|
||||||
|
pytest tests/test_listing_geojson.py::TestStreamingEndpoint::test_streaming_includes_server_timing_header -v
|
||||||
|
```
|
||||||
|
|
||||||
|
Expected: FAIL — no `server-timing` header.
|
||||||
|
|
||||||
|
**Step 3: Implement Server-Timing**
|
||||||
|
|
||||||
|
In `stream_listing_geojson`, add timing around the cache check and build a `Server-Timing` header:
|
||||||
|
|
||||||
|
```python
|
||||||
|
import time
|
||||||
|
|
||||||
|
# ... inside stream_listing_geojson, after rate limit logic:
|
||||||
|
timings: list[str] = []
|
||||||
|
|
||||||
|
t0 = time.monotonic()
|
||||||
|
cached_count = get_cached_count(query_parameters)
|
||||||
|
timings.append(f"cache_check;dur={(time.monotonic() - t0) * 1000:.1f}")
|
||||||
|
|
||||||
|
# ... rest of the logic to choose generator ...
|
||||||
|
|
||||||
|
return StreamingResponse(
|
||||||
|
generator,
|
||||||
|
media_type="application/x-ndjson",
|
||||||
|
headers={
|
||||||
|
"Cache-Control": "no-cache",
|
||||||
|
"X-Accel-Buffering": "no",
|
||||||
|
"Server-Timing": ", ".join(timings),
|
||||||
|
}
|
||||||
|
)
|
||||||
|
```
|
||||||
|
|
||||||
|
**Step 4: Run tests**
|
||||||
|
|
||||||
|
```bash
|
||||||
|
pytest tests/test_listing_geojson.py -v
|
||||||
|
```
|
||||||
|
|
||||||
|
Expected: All pass.
|
||||||
|
|
||||||
|
**Step 5: Commit**
|
||||||
|
|
||||||
|
```bash
|
||||||
|
git add api/app.py tests/test_listing_geojson.py
|
||||||
|
git commit -m "Add Server-Timing header to streaming endpoint
|
||||||
|
|
||||||
|
Reports cache_check latency in the response header, visible in
|
||||||
|
browser DevTools Network tab for ongoing performance monitoring."
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Task 5: Bulk POI Distances Endpoint
|
||||||
|
|
||||||
|
Add a new endpoint that returns all POI distances for a user in one call, keyed by listing ID. This decouples POI distances from the listing stream, allowing the stream to always hit the Redis cache.
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
- Modify: `api/poi_routes.py` (add new endpoint)
|
||||||
|
- Modify: `frontend/src/services/poiService.ts` (add bulk fetch function)
|
||||||
|
- Modify: `frontend/src/services/index.ts` (re-export)
|
||||||
|
- Modify: `frontend/src/constants/index.ts` (add endpoint constant)
|
||||||
|
- Test: `tests/unit/test_poi_validation.py` or new test file
|
||||||
|
|
||||||
|
**Step 1: Write the failing test**
|
||||||
|
|
||||||
|
Create or add to a test file for the POI routes:
|
||||||
|
|
||||||
|
```python
|
||||||
|
# In tests/test_listing_geojson.py or a new tests/test_poi_distances_bulk.py
|
||||||
|
|
||||||
|
def test_bulk_poi_distances_returns_dict(client):
|
||||||
|
"""Test bulk POI distances endpoint returns listing_id -> distances mapping."""
|
||||||
|
with patch("api.poi_routes._get_user_id", return_value=42), \
|
||||||
|
patch("api.poi_routes.POIRepository") as MockRepo, \
|
||||||
|
patch("api.poi_routes.poi_service") as mock_svc:
|
||||||
|
|
||||||
|
mock_pois = [MagicMock(id=1, name="Office")]
|
||||||
|
mock_svc.get_user_pois.return_value = mock_pois
|
||||||
|
mock_svc.get_distances_for_listings.return_value = [
|
||||||
|
MagicMock(listing_id=100, poi_id=1, travel_mode="WALK",
|
||||||
|
duration_seconds=600, distance_meters=800),
|
||||||
|
]
|
||||||
|
|
||||||
|
mock_repo_instance = MagicMock()
|
||||||
|
MockRepo.return_value = mock_repo_instance
|
||||||
|
mock_repo_instance.get_listing_ids.return_value = [100, 200]
|
||||||
|
|
||||||
|
response = client.get("/api/poi/distances/bulk?listing_type=RENT")
|
||||||
|
assert response.status_code == 200
|
||||||
|
data = response.json()
|
||||||
|
assert "100" in data # JSON keys are strings
|
||||||
|
assert data["100"][0]["poi_name"] == "Office"
|
||||||
|
```
|
||||||
|
|
||||||
|
**Step 2: Run test to verify it fails**
|
||||||
|
|
||||||
|
```bash
|
||||||
|
pytest tests/test_poi_distances_bulk.py -v # or wherever you placed it
|
||||||
|
```
|
||||||
|
|
||||||
|
Expected: FAIL — endpoint doesn't exist.
|
||||||
|
|
||||||
|
**Step 3: Implement the bulk endpoint**
|
||||||
|
|
||||||
|
In `api/poi_routes.py`, add:
|
||||||
|
|
||||||
|
```python
|
||||||
|
@poi_router.get("/distances/bulk")
|
||||||
|
async def get_bulk_distances(
|
||||||
|
user: Annotated[User, Depends(get_current_user)],
|
||||||
|
listing_type: ListingType = ListingType.RENT,
|
||||||
|
) -> dict[int, list[POIDistanceResponse]]:
|
||||||
|
"""Get all POI distances for the current user, keyed by listing ID.
|
||||||
|
|
||||||
|
Used by the frontend to decouple POI distance loading from the main
|
||||||
|
listing stream, allowing the stream to always hit the Redis cache.
|
||||||
|
"""
|
||||||
|
user_id = _get_user_id(user)
|
||||||
|
repo = POIRepository(engine)
|
||||||
|
pois = {p.id: p for p in poi_service.get_user_pois(repo, user_id)}
|
||||||
|
if not pois:
|
||||||
|
return {}
|
||||||
|
|
||||||
|
from repositories.listing_repository import ListingRepository
|
||||||
|
from database import engine as db_engine
|
||||||
|
listing_repo = ListingRepository(db_engine)
|
||||||
|
all_ids = list(listing_repo.get_listing_ids(listing_type))
|
||||||
|
if not all_ids:
|
||||||
|
return {}
|
||||||
|
|
||||||
|
distances = poi_service.get_distances_for_listings(
|
||||||
|
repo, all_ids, listing_type, user_id
|
||||||
|
)
|
||||||
|
|
||||||
|
result: dict[int, list[POIDistanceResponse]] = {}
|
||||||
|
for d in distances:
|
||||||
|
poi_name = pois[d.poi_id].name if d.poi_id in pois else "Unknown"
|
||||||
|
result.setdefault(d.listing_id, []).append(
|
||||||
|
POIDistanceResponse(
|
||||||
|
poi_id=d.poi_id,
|
||||||
|
poi_name=poi_name,
|
||||||
|
travel_mode=d.travel_mode,
|
||||||
|
duration_seconds=d.duration_seconds,
|
||||||
|
distance_meters=d.distance_meters,
|
||||||
|
)
|
||||||
|
)
|
||||||
|
return result
|
||||||
|
```
|
||||||
|
|
||||||
|
Note: The `/distances/bulk` route must be registered before `/distances` (which takes query params) to avoid route conflicts. Verify the order in the file.
|
||||||
|
|
||||||
|
**Step 4: Add frontend fetch function**
|
||||||
|
|
||||||
|
In `frontend/src/services/poiService.ts`, add:
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
export async function fetchBulkPOIDistances(
|
||||||
|
user: AuthUser,
|
||||||
|
listingType: 'RENT' | 'BUY' = 'RENT'
|
||||||
|
): Promise<Record<number, POIDistanceInfo[]>> {
|
||||||
|
return apiRequest<Record<number, POIDistanceInfo[]>>(user, '/api/poi/distances/bulk', {
|
||||||
|
params: { listing_type: listingType },
|
||||||
|
});
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
In `frontend/src/services/index.ts`, add `fetchBulkPOIDistances` to the `poiService` re-export line.
|
||||||
|
|
||||||
|
**Step 5: Run tests**
|
||||||
|
|
||||||
|
```bash
|
||||||
|
pytest tests/ -v --tb=short -k "poi_distances_bulk or streaming"
|
||||||
|
```
|
||||||
|
|
||||||
|
Expected: All pass.
|
||||||
|
|
||||||
|
**Step 6: Commit**
|
||||||
|
|
||||||
|
```bash
|
||||||
|
git add api/poi_routes.py frontend/src/services/poiService.ts frontend/src/services/index.ts
|
||||||
|
git commit -m "Add bulk POI distances endpoint for decoupled loading
|
||||||
|
|
||||||
|
New GET /api/poi/distances/bulk returns all POI distances keyed by
|
||||||
|
listing ID, allowing the frontend to fetch distances separately
|
||||||
|
from the listing stream and keep the stream on the cached path."
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Task 6: Eliminate Frontend Waterfall
|
||||||
|
|
||||||
|
Remove the sequential dependency: POI fetch → loadListings. Fire both in parallel. Merge POI distances into features after both resolve.
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
- Modify: `frontend/src/App.tsx:120-203` (auto-load, loadListings, POI merge)
|
||||||
|
- Modify: `frontend/src/services/streamingService.ts:68-73` (remove includePoiDistances)
|
||||||
|
|
||||||
|
**Step 1: Modify `loadListings` to never request POI distances inline**
|
||||||
|
|
||||||
|
In `frontend/src/App.tsx`, change the `loadListings` callback. Remove `includePoiDistances` from the streaming call — it should always be `false`:
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
for await (const batch of streamListingGeoJSON(user, parameters, (progress) => {
|
||||||
|
setStreamingProgress(progress);
|
||||||
|
}, { signal: controller.signal })) {
|
||||||
|
```
|
||||||
|
|
||||||
|
Remove `includePoiDistances` from the options entirely.
|
||||||
|
|
||||||
|
**Step 2: Fire loadListings immediately on user auth, don't wait for POIs**
|
||||||
|
|
||||||
|
Change the auto-load `useEffect` (around line 259-272) to fire `loadListings` immediately. Remove `userPOIs` from the `loadListings` dependency array:
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
// Auto-load data with default filters when user is authenticated
|
||||||
|
useEffect(() => {
|
||||||
|
if (!user || initialLoadTriggeredRef.current) {
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
initialLoadTriggeredRef.current = true;
|
||||||
|
|
||||||
|
const defaultParams: ParameterValues = {
|
||||||
|
...DEFAULT_FILTER_VALUES,
|
||||||
|
available_from: new Date(),
|
||||||
|
};
|
||||||
|
|
||||||
|
loadListings(defaultParams);
|
||||||
|
}, [user, loadListings]);
|
||||||
|
```
|
||||||
|
|
||||||
|
The `loadListings` `useCallback` dependency array should no longer include `userPOIs`.
|
||||||
|
|
||||||
|
**Step 3: Add POI distance merging after both resolve**
|
||||||
|
|
||||||
|
Add a new `useEffect` that triggers when both `listingData` and `userPOIs` are available. It fetches bulk POI distances and merges them into the existing features:
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
// Merge POI distances into listing data after both are available
|
||||||
|
useEffect(() => {
|
||||||
|
if (!user || !listingData || userPOIs.length === 0) return;
|
||||||
|
|
||||||
|
let cancelled = false;
|
||||||
|
fetchBulkPOIDistances(user, queryParameters?.listing_type as 'RENT' | 'BUY' ?? 'RENT')
|
||||||
|
.then((distanceLookup) => {
|
||||||
|
if (cancelled) return;
|
||||||
|
// Merge distances into existing features
|
||||||
|
const updatedFeatures = accumulatedFeaturesRef.current.map(feature => {
|
||||||
|
const id = feature.properties?.id;
|
||||||
|
if (id && distanceLookup[id]) {
|
||||||
|
return {
|
||||||
|
...feature,
|
||||||
|
properties: {
|
||||||
|
...feature.properties,
|
||||||
|
poi_distances: distanceLookup[id],
|
||||||
|
},
|
||||||
|
};
|
||||||
|
}
|
||||||
|
return feature;
|
||||||
|
});
|
||||||
|
accumulatedFeaturesRef.current = updatedFeatures;
|
||||||
|
setListingData({
|
||||||
|
type: 'FeatureCollection',
|
||||||
|
features: [...updatedFeatures],
|
||||||
|
});
|
||||||
|
})
|
||||||
|
.catch(() => {}); // POI distances are best-effort
|
||||||
|
|
||||||
|
return () => { cancelled = true; };
|
||||||
|
}, [user, listingData?.features.length, userPOIs.length]);
|
||||||
|
```
|
||||||
|
|
||||||
|
Import `fetchBulkPOIDistances` from `@/services`.
|
||||||
|
|
||||||
|
**Step 4: Clean up streamingService**
|
||||||
|
|
||||||
|
In `frontend/src/services/streamingService.ts`, remove the `includePoiDistances` option from the function signature and query string building (lines 75-77):
|
||||||
|
|
||||||
|
Remove:
|
||||||
|
```typescript
|
||||||
|
if (options?.includePoiDistances) {
|
||||||
|
params.include_poi_distances = 'true';
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Update the options type to remove `includePoiDistances`.
|
||||||
|
|
||||||
|
**Step 5: Run frontend tests**
|
||||||
|
|
||||||
|
```bash
|
||||||
|
cd frontend && npx vitest run --reporter=verbose
|
||||||
|
```
|
||||||
|
|
||||||
|
Expected: All pass (some tests may need the `includePoiDistances` option removed from their calls).
|
||||||
|
|
||||||
|
**Step 6: Commit**
|
||||||
|
|
||||||
|
```bash
|
||||||
|
git add frontend/src/App.tsx frontend/src/services/streamingService.ts
|
||||||
|
git commit -m "Eliminate frontend POI waterfall for faster initial load
|
||||||
|
|
||||||
|
Listing stream fires immediately on auth without waiting for POI
|
||||||
|
fetch. POI distances are fetched separately and merged into features
|
||||||
|
after both resolve. This saves ~200-500ms on initial load and keeps
|
||||||
|
the stream on the cached Redis path."
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Task 7: Verify End-to-End
|
||||||
|
|
||||||
|
**Step 1: Re-run the performance test from the design doc**
|
||||||
|
|
||||||
|
Port-forward to the API pod and run the same Python timing script from the design diagnostics, comparing before/after:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
KUBECONFIG=.config kubectl port-forward deployment/realestate-crawler-api -n realestate-crawler 15001:5001 &
|
||||||
|
# Run timing script...
|
||||||
|
```
|
||||||
|
|
||||||
|
Verify:
|
||||||
|
- `nr_throttled` delta is 0 during the request (Task 1)
|
||||||
|
- First batch arrives within ~100ms of connection (Tasks 2+3)
|
||||||
|
- `Server-Timing` header is present (Task 4)
|
||||||
|
|
||||||
|
**Step 2: Test in browser**
|
||||||
|
|
||||||
|
Open the app, clear cache, log in, and observe:
|
||||||
|
- First property markers appear within ~1s
|
||||||
|
- POI distances populate shortly after
|
||||||
|
- All filters work correctly
|
||||||
|
|
||||||
|
**Step 3: Commit the design doc update with results**
|
||||||
|
|
||||||
|
Update `docs/plans/2026-02-22-time-to-first-property-design.md` with measured results and commit.
|
||||||
Loading…
Add table
Add a link
Reference in a new issue