- Remove hard-coded limit=1000 default from listing_geojson and streaming
endpoints, allowing all matching results to be returned
- Add Redis caching service (db=2, 30min TTL) that caches query results
as Redis Lists for fast re-queries with reduced DB load
- Integrate cache into streaming endpoint: serve from cache on hit,
populate cache on miss during DB streaming
- Invalidate cache after scrape completes (both success and no-new-listings)
- Replace ScrollArea with react-virtuoso in ListView for virtual scrolling,
keeping only ~20-30 DOM nodes regardless of list size
- Handle the metadata streaming message so progress shows "0 / N" from the start
- Throttle frontend state updates with requestAnimationFrame to prevent
UI jank from rapid re-renders during cached response streaming
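
A minimal sketch of the cache flow described above (serve from a Redis List on hit, populate on miss, invalidate after a scrape). The key scheme, `KEY_PREFIX`, and the function names `cache_key`, `stream_listings`, and `invalidate_listings_cache` are assumptions for illustration, not the names used in this change. `client` is assumed to be a redis-py client such as `redis.Redis(db=2)`. Unlike the change itself, which populates the cache during DB streaming, this sketch buffers rows and writes them once the stream completes, so an interrupted stream does not leave a partial list behind.

```python
import hashlib
import json
from typing import Any, Callable, Iterable, Iterator

CACHE_TTL_SECONDS = 30 * 60      # 30-minute TTL, as stated in the description
KEY_PREFIX = "listings:query:"   # hypothetical key namespace


def cache_key(params: dict[str, Any]) -> str:
    """Derive a stable key from the query parameters (hypothetical scheme)."""
    canonical = json.dumps(params, sort_keys=True, default=str)
    return KEY_PREFIX + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def stream_listings(
    client: Any,  # assumed: a redis-py client, e.g. redis.Redis(db=2)
    params: dict[str, Any],
    fetch_from_db: Callable[[dict[str, Any]], Iterable[dict[str, Any]]],
) -> Iterator[dict[str, Any]]:
    """Yield listings from the cache on a hit, else from the DB while caching."""
    key = cache_key(params)
    cached = client.lrange(key, 0, -1)  # returns an empty list if the key is missing
    if cached:
        for raw in cached:
            yield json.loads(raw)
        return

    serialized: list[str] = []
    for row in fetch_from_db(params):
        serialized.append(json.dumps(row, default=str))
        yield row

    # Redis cannot store an empty list, so empty results are never cached.
    if serialized:
        client.rpush(key, *serialized)
        client.expire(key, CACHE_TTL_SECONDS)


def invalidate_listings_cache(client: Any) -> None:
    """Delete every cached query result, e.g. after a scrape completes."""
    for key in client.scan_iter(match=KEY_PREFIX + "*"):
        client.delete(key)
```

Hashing the sorted, JSON-encoded parameters gives identical queries the same key regardless of parameter order. Invalidation scans by prefix instead of flushing the database, so other keys in the same Redis DB are left untouched.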

The get_ids_to_process function used a set union where it needed a set
difference, so it returned every existing listing ID along with the
new ones. This meant:
1. When there were no new listings, the task still iterated over all
existing listings, found nothing to process for any of them, and
completed almost instantly
2. The task showed no visible progress because it finished so quickly
Fixed by:
- Changing `all_listing_ids.union(identifiers)` to `identifiers - all_listing_ids`
so that only IDs NOT already in the database are returned
- Adding an explicit check for an empty set that sets the informative task
state "No new listings found", so users understand why the task completed quickly