Commit graph

120 commits

Author SHA1 Message Date
Viktor Barzin
4f5a934fa9
refactor dump listings to start using model instead of the data_access object 2025-06-07 12:46:53 +00:00
Viktor Barzin
842f7cefbe
merge dump listings and dump details commands - fetch both details and listings in the same command 2025-06-07 12:00:23 +00:00
Viktor Barzin
29213f3d26
refactor the semaphore when dumping listings 2025-06-06 20:09:16 +00:00
Viktor Barzin
b7a2ea75aa
add json field to store any additional blob of data that may be missing; also populate db when dumping listings 2025-06-06 19:57:50 +00:00
Viktor Barzin
8b2025e700
add command to dump existing listing from fs to db 2025-06-04 22:03:55 +00:00
Viktor Barzin
f7fb891648
add models 2025-06-04 21:09:29 +00:00
Viktor Barzin
e50f33ed14
filter for furnished properties only 2025-06-03 20:02:29 +00:00
Viktor Barzin
0d3393ed94
add sqlmodel + alembic + setup models skeleton to slowly enable transition towards a db 2025-06-03 20:01:30 +00:00
Viktor Barzin
8c646a5322
touch geojson file to ensure it exists before opening to write 2025-06-01 22:13:22 +00:00
Viktor Barzin
16b88c4aac
remove unused torch pkg as that was consuming 12 GB of the container image 2025-06-01 20:31:50 +00:00
Viktor Barzin
0bdfeec195
make the csv exporter use the filtering params to allow exporting customizable reports 2025-06-01 20:11:00 +00:00
Viktor Barzin
0acd417d34
add option to filter for min sqm per listing 2025-06-01 19:26:24 +00:00
Viktor Barzin
ab5c4ce0a6
refactor runall script to share filter options when dumping listings and when exporting to immoweb 2025-06-01 18:10:09 +00:00
Viktor Barzin
2e3e900ec3
store price history when fetching details. this will be helpful later on as we start periodically scraping to store prices over time 2025-06-01 18:09:27 +00:00
Viktor Barzin
8b90ecde11
add filter for last seen days 2025-06-01 15:26:38 +00:00
Viktor Barzin
11315359d2
reuse query params when exporting to immoweb and allow filtering from available date 2025-06-01 15:17:14 +00:00
Viktor Barzin
a23a5ae192
extract common listing filtering options into a decorator to enable reuse between commands 2025-06-01 12:11:15 +00:00
Viktor Barzin
c543be7ff6
add let date available from parameter 2025-06-01 00:51:44 +00:00
Viktor Barzin
9735db72a0
limit the number of concurrent requests when dumping listings as rightmove blocks us 2025-06-01 00:27:12 +00:00
Viktor Barzin
24bf44caf9
trust env when fetching details and listings. this allows setting proxy env vars. I use this in conjunction with a Tor proxy to avoid IP bans from rightmove 2025-06-01 00:12:31 +00:00
Viktor Barzin
0b9d50af47
reformat with black; looks better 2025-05-31 23:50:43 +00:00
Viktor Barzin
1122f5a96f
use aiohttp for dumping listing query 2025-05-31 23:48:45 +00:00
Viktor Barzin
b312a53526
remove exporting to csv in the runall script - we now use immoweb as a frontend 2025-05-31 23:31:41 +00:00
Viktor Barzin
f9a7620bf0
recalculate floorplans when there's only an empty object 2025-05-31 23:31:06 +00:00
Viktor Barzin
20ab7fde95
add all of dict_nicely to the export for immoweb to allow using any of the available listing data 2025-05-31 21:57:44 +00:00
Viktor Barzin
18b0e39495
add photo thumbnail to dict_nicely to allow visualizing in the ui 2025-05-31 21:57:03 +00:00
Viktor Barzin
7e8c79d3d1
add command to export the data in a way that the ui (immoweb) can consume 2025-05-26 19:37:06 +00:00
Viktor Barzin
102c20ac42
[bugfix] if no district is passed, fetch data for all 2025-05-26 19:31:33 +00:00
Viktor Barzin
56c203189f
dockerize 2025-05-21 21:30:00 +00:00
Viktor Barzin
1e0f302178
defer transformers and pytesseract imports to when used. this shortens startup time of all other commands quite a bit 2025-05-21 21:24:57 +00:00
Viktor Barzin
10ae25e0d3
allow caching routing by destination and travel mode; also export all travel methods to csv column 2025-05-20 22:12:56 +00:00
Viktor Barzin
f2118c9bc4
update runall script to use new command for sourcing routes 2025-05-18 21:32:16 +00:00
Viktor Barzin
d9fe5421fc
fix type hint in paths vs strings 2025-05-18 21:31:51 +00:00
Viktor Barzin
482fff689b
parameterize routing logic - extract api key as env var; allow searching dest by address; limit the number of listings to process to prevent accidental api key usage 2025-05-18 21:13:50 +00:00
Viktor Barzin
57f477c54d
add .env.sample with the api key for the routing service exported 2025-05-18 19:52:49 +00:00
Viktor Barzin
ea24847cae
add stratford region 2025-05-18 18:02:32 +00:00
Viktor Barzin
9067d45327
add council tax band when exporting 2025-05-18 18:02:19 +00:00
Viktor Barzin
5192b1955c
export agency 2025-05-18 17:41:50 +00:00
Viktor Barzin
9f3e466b23
add filter for furnished/unfurnished type for rented listings 2025-05-18 17:22:48 +00:00
Viktor Barzin
b873eaf203
fix types and format 2025-05-18 12:30:49 +00:00
Viktor Barzin
91d3237516
remove some empty fields in exploration notebook 2025-05-17 23:24:40 +00:00
Viktor Barzin
0b3c644ea3
set the min price param in runall 2025-05-17 23:18:27 +00:00
Viktor Barzin
2741a77560
sort output dataframe by price per sqm 2025-05-17 23:14:18 +00:00
Viktor Barzin
47347543d2
update runall script to use the click entrypoint 2025-05-17 23:14:00 +00:00
Viktor Barzin
b1e0ed170b
detect floorplan using asyncio 2025-05-17 22:58:35 +00:00
Viktor Barzin
68cc70bd11
dump images using aiohttp and concurrently 2025-05-17 22:43:05 +00:00
Viktor Barzin
01ac24b4b7
use aiohttp to fetch details concurrently 2025-05-17 22:34:27 +00:00
Viktor Barzin
61b8c82592
add ipdb to the dev deps 2025-05-17 22:11:51 +00:00
Viktor Barzin
3e7a144fb4
make dumping details async 2025-05-17 22:11:33 +00:00
Viktor Barzin
ad879f2d4f
convert listings dump to asyncio 2025-05-17 21:55:42 +00:00