Viktor Barzin
d205d15c74
Add services layer, tests, streaming UI, and cleanup legacy code
2026-02-06 20:55:10 +00:00
Viktor Barzin
f880664a98
Add throttling detection and circuit breaker for Rightmove scraper
2026-02-06 20:47:36 +00:00
Viktor Barzin
e8293c6042
Add intelligent query splitting to maximize Rightmove data extraction
2026-02-06 20:47:36 +00:00
Kadir
0801aaf200
More ruff fixes (#2)
* adding ruff auto check for pull requests as well as fixing all ruff errors
* More ruff fixes: forgot half of the ruff checks
Forgot to do a git add all :D
---------
Co-authored-by: Kadir <git@k8n.dev>
2025-09-14 19:44:03 +01:00
Kadir
b1e0a414cf
Used ruff to cleanup
I hope it just works, as I cannot test whether things work
2025-09-14 19:02:30 +01:00
Viktor Barzin
d4b22deda0
save user queries in redis so that user can refresh the page and still come back to their latest task
2025-07-06 12:02:25 +00:00
Viktor Barzin
20ff91d663
reduce concurrency when fetching images + add retries
2025-07-01 16:12:06 +00:00
Viktor Barzin
bf79d3c977
convert district at the last moment - when we send the query
2025-06-22 13:05:53 +00:00
Viktor Barzin
9b2653ce91
add tenacity to retry transient blockouts by rightmove
2025-06-08 20:59:04 +00:00
Viktor Barzin
289206afc0
some cleanups
2025-06-08 20:58:28 +00:00
Viktor Barzin
e317d2ec54
use query params to filter out models; also make csv exporter work with models
2025-06-08 17:01:33 +00:00
Viktor Barzin
325823e631
refactor detect floorplan to use model listings
2025-06-07 14:30:32 +00:00
Viktor Barzin
4f5a934fa9
refactor dump listings to start using model instead of the data_access object
2025-06-07 12:46:53 +00:00
Viktor Barzin
842f7cefbe
merge dump listings and dump details commands - fetch both details and listings in the same command
2025-06-07 12:00:23 +00:00
Viktor Barzin
29213f3d26
refactor the semaphore when dumping listings
2025-06-06 20:09:16 +00:00
Viktor Barzin
f7fb891648
add models
2025-06-04 21:09:29 +00:00
Viktor Barzin
0acd417d34
add option to filter for min sqm per listing
2025-06-01 19:26:24 +00:00
Viktor Barzin
8b90ecde11
add filter for last seen days
2025-06-01 15:26:38 +00:00
Viktor Barzin
11315359d2
reuse query params when exporting to immoweb and allow filtering from available date
2025-06-01 15:17:14 +00:00
Viktor Barzin
9735db72a0
limit the number of concurrent requests when dumping listings as Rightmove blocks us
2025-06-01 00:27:12 +00:00
Viktor Barzin
24bf44caf9
trust env when fetching details and listings; this allows setting proxy env vars. I use this in conjunction with a Tor proxy to avoid IP bans from Rightmove
2025-06-01 00:12:31 +00:00
Viktor Barzin
0b9d50af47
reformat with black; looks better
2025-05-31 23:50:43 +00:00
Viktor Barzin
1122f5a96f
use aiohttp for dumping listing query
2025-05-31 23:48:45 +00:00
Viktor Barzin
7e8c79d3d1
add command to export the data in a way that the ui (immoweb) can consume
2025-05-26 19:37:06 +00:00
Viktor Barzin
1e0f302178
defer transformers and pytesseract imports to when used. this shortens startup time of all other commands quite a bit
2025-05-21 21:24:57 +00:00
Viktor Barzin
10ae25e0d3
allow caching routing by destination and travel mode; also export all travel methods to csv column
2025-05-20 22:12:56 +00:00
Viktor Barzin
482fff689b
parameterize routing logic - extract api key as env var; allow searching dest by address; limit the number of listings to process to prevent accidental api key usage
2025-05-18 21:13:50 +00:00
Viktor Barzin
ea24847cae
add stratford region
2025-05-18 18:02:32 +00:00
Viktor Barzin
9f3e466b23
add filter for furnished/unfurnished type for rented listings
2025-05-18 17:22:48 +00:00
Viktor Barzin
b873eaf203
fix types and format
2025-05-18 12:30:49 +00:00
Viktor Barzin
01ac24b4b7
use aiohttp to fetch details concurrently
2025-05-17 22:34:27 +00:00
Viktor Barzin
3e7a144fb4
make dumping details async
2025-05-17 22:11:33 +00:00
Viktor Barzin
ad879f2d4f
convert listings dump to asyncio
2025-05-17 21:55:42 +00:00
Viktor Barzin
df24c2c1b7
add cli param for querying properties to rent
example:
python main.py --data-dir data/rs2 dump-listings --max-price 3500 --min-bedrooms 2 --max-bedrooms 4 --district islington -t rent
2025-05-17 21:22:39 +00:00
Viktor Barzin
ea56555884
refactor main.py click to use click commands to allow passing parameters to commands and enable fetching districts by district name
2025-05-14 19:42:08 +00:00
Kadir
bb2488a63b
Removing sqlalchemy and the db part as it was never used
2025-05-12 01:17:45 +01:00
Viktor Barzin
835494d29f
reformat most things
2025-05-07 21:25:40 +00:00
Kadir
2c2adcfa7c
improving OCR
2025-03-30 23:41:52 +01:00
Kadir
e0e7853c8c
updating districts of relevance to me
2025-02-14 21:35:44 +00:00
Kadir
dbf72e42e3
removing districts which are far away from the default crawl
2024-09-23 01:43:20 +01:00
Kadir
4dfbcc64c1
floorplan rec: fix regex rule + filter out extreme values
2024-08-25 12:11:55 +02:00
Kadir
2e94bda7fa
refac
2024-08-25 09:50:41 +02:00
Kadir
f98cd02696
adding better exception messages and interrupt free crawling
2024-05-06 18:54:55 +01:00
Kadir
874fed3e8f
format and committing notebook
2024-04-05 11:38:55 +01:00
Kadir
b16fee0648
changing to 500 listings per query
2024-04-01 20:28:15 +02:00
Kadir
a7a4f88a39
sort districts
2024-03-30 19:24:35 +01:00
Kadir
40285245d5
remove debug print statements
2024-03-30 18:35:32 +01:00
Kadir
ec86b572b3
adding districts and location identifier to search
2024-03-30 18:31:49 +01:00
Kadir
4c40462bb8
adding property type
2024-03-25 20:58:35 +00:00
Kadir
d777558b34
ruff format
2024-03-25 20:48:48 +00:00