Viktor Barzin
f880664a98
Add throttling detection and circuit breaker for Rightmove scraper
2026-02-06 20:47:36 +00:00
Viktor Barzin
e8293c6042
Add intelligent query splitting to maximize Rightmove data extraction
2026-02-06 20:47:36 +00:00
Kadir
b1e0a414cf
Used ruff to clean up
I hope it just works, as I cannot test whether things work
2025-09-14 19:02:30 +01:00
Viktor Barzin
d4b22deda0
save user queries in redis so that user can refresh the page and still come back to their latest task
2025-07-06 12:02:25 +00:00
Viktor Barzin
20ff91d663
reduce concurrency when fetching images + add retries
2025-07-01 16:12:06 +00:00
Viktor Barzin
bf79d3c977
convert district at the last moment - when we send the query
2025-06-22 13:05:53 +00:00
Viktor Barzin
9b2653ce91
add tenacity to retry transient blockouts by rightmove
2025-06-08 20:59:04 +00:00
Viktor Barzin
289206afc0
some cleanups
2025-06-08 20:58:28 +00:00
Viktor Barzin
e317d2ec54
use query params to filter out models; also make csv exporter work with models
2025-06-08 17:01:33 +00:00
Viktor Barzin
4f5a934fa9
refactor dump listings to start using model instead of the data_access object
2025-06-07 12:46:53 +00:00
Viktor Barzin
842f7cefbe
merge dump listings and dump details commands - fetch both details and listings in the same command
2025-06-07 12:00:23 +00:00
Viktor Barzin
29213f3d26
refactor the semaphore when dumping listings
2025-06-06 20:09:16 +00:00
Viktor Barzin
f7fb891648
add models
2025-06-04 21:09:29 +00:00
Viktor Barzin
0acd417d34
add option to filter for min sqm per listing
2025-06-01 19:26:24 +00:00
Viktor Barzin
8b90ecde11
add filter for last seen days
2025-06-01 15:26:38 +00:00
Viktor Barzin
11315359d2
reuse query params when exporting to immoweb and allow filtering from available date
2025-06-01 15:17:14 +00:00
Viktor Barzin
9735db72a0
limit the number of concurrent requests when dumping listings as Rightmove blocks us
2025-06-01 00:27:12 +00:00
Viktor Barzin
24bf44caf9
trust env when fetching details and listings. This allows setting proxy env vars. I use this in conjunction with a Tor proxy to avoid IP bans from Rightmove
2025-06-01 00:12:31 +00:00
Viktor Barzin
1122f5a96f
use aiohttp for dumping listing query
2025-05-31 23:48:45 +00:00
Viktor Barzin
7e8c79d3d1
add command to export the data in a way that the ui (immoweb) can consume
2025-05-26 19:37:06 +00:00
Viktor Barzin
9f3e466b23
add filter for furnished/unfurnished type for rented listings
2025-05-18 17:22:48 +00:00
Viktor Barzin
b873eaf203
fix types and format
2025-05-18 12:30:49 +00:00
Viktor Barzin
01ac24b4b7
use aiohttp to fetch details concurrently
2025-05-17 22:34:27 +00:00
Viktor Barzin
3e7a144fb4
make dumping details async
2025-05-17 22:11:33 +00:00
Viktor Barzin
ad879f2d4f
convert listings dump to asyncio
2025-05-17 21:55:42 +00:00
Viktor Barzin
df24c2c1b7
add cli param for querying properties to rent
example:
python main.py --data-dir data/rs2 dump-listings --max-price 3500 --min-bedrooms 2 --max-bedrooms 4 --district islington -t rent
2025-05-17 21:22:39 +00:00
Kadir
bb2488a63b
Removing sqlalchemy and the db part as it was never used
2025-05-12 01:17:45 +01:00
Viktor Barzin
835494d29f
reformat most things
2025-05-07 21:25:40 +00:00
Kadir
f98cd02696
adding better exception messages and interrupt-free crawling
2024-05-06 18:54:55 +01:00
Kadir
b16fee0648
changing to 500 listings per query
2024-04-01 20:28:15 +02:00
Kadir
40285245d5
remove debug print statements
2024-03-30 18:35:32 +01:00
Kadir
ec86b572b3
adding districts and location identifier to search
2024-03-30 18:31:49 +01:00
Kadir
4c40462bb8
adding property type
2024-03-25 20:58:35 +00:00
Kadir
d777558b34
ruff format
2024-03-25 20:48:48 +00:00
Kadir
a9b8d4d630
adding musthave and lastXdays
2024-03-13 16:21:54 +00:00
Kadir
508aa02812
Real crawling scripts and floorplan detection
1. get all listings
2. get all detail jsons
3. get all images
4. get all floorplans
5. detecting floorplans
Also updating dependencies for huggingface etc.
2024-03-10 18:49:39 +00:00
Kadir
46bb641026
adding floorplans, detail json, refactored the folders
2024-03-07 22:02:09 +00:00
Kadir
e2f7998ee9
merging the visual query answering with the crawler. Monorepo go!
2024-03-01 16:42:48 +01:00