62 commits

Author SHA1 Message Date
Viktor Barzin
ea56555884 refactor main.py to use click commands, allowing parameters to be passed to commands and enabling fetching districts by district name 2025-05-14 19:42:08 +00:00
Kadir
bb2488a63b Removing sqlalchemy and the db part as it was never used 2025-05-12 01:17:45 +01:00
Kadir
3f4be8b7ff decisions + logger 2025-05-12 01:01:48 +01:00
Viktor Barzin
962c9a2f38 print all steps that we are running with 2025-05-11 19:13:19 +00:00
Viktor Barzin
9134145e02 [5/n] click-ify add routing command
run with:
poetry run python main.py --step routing
2025-05-11 19:11:23 +00:00
Viktor Barzin
48f694e002 [4/n] click-ify add detect floorplan command
run with
poetry run python main.py --step detect_floorplan
2025-05-11 19:06:08 +00:00
Viktor Barzin
70e8ef9f95 [3/n] click-ify add dump images command
run with
poetry run python main.py --step dump_images
2025-05-11 19:04:19 +00:00
Viktor Barzin
c2196c15c1 [2/n] click-ify - add 2_dump_detail command
run with
poetry run python main.py --step dump_detail
2025-05-11 19:02:23 +00:00
Viktor Barzin
90b531f5d9 [1/n] click-ify - add entrypoint for click script and add 1_dump_listings command

run via:
poetry run python main.py --step dump_listings
2025-05-11 18:59:41 +00:00
Viktor Barzin
0a66efa48a update pyproject.toml to use latest version of dev dependency section 2025-05-07 21:56:12 +00:00
Viktor Barzin
835494d29f reformat most things 2025-05-07 21:25:40 +00:00
Viktor Barzin
bd7c781adb remove requirements.txt as we are using poetry... I'm blind. 2025-05-07 21:06:43 +00:00
Viktor Barzin
028eef63a8 add readme and .style.yapf for formatting 2025-05-07 20:58:22 +00:00
Viktor Barzin
d24b667e73 add sqlalchemy to requirements.txt 2025-05-07 20:57:28 +00:00
Viktor Barzin
dd2de477bc Add requirements.txt 2025-05-07 17:31:29 +00:00
Kadir
8f8956818e refactor routing executor to:
- be nicer
- include checks on floorplan
- include checks on if listing disabled
2025-03-31 03:06:41 +01:00
Kadir
cd21bd0bb6 notebooks 2025-03-30 23:42:29 +01:00
Kadir
2c2adcfa7c improving OCR 2025-03-30 23:41:52 +01:00
Kadir
1b69fd4305 adding new dependencies 2025-03-02 17:11:48 +00:00
Kadir
244de72877 reverting isRemoved function to fix 2025-02-16 04:44:42 +00:00
Kadir
302ca95cfb fixing full detail dumping 2025-02-16 03:02:21 +00:00
Kadir
e0e7853c8c updating districts of relevance to me 2025-02-14 21:35:44 +00:00
Kadir
bb765a9312 making last_seen a relative date 2025-02-14 21:21:50 +00:00
Kadir
29c8d1960b adding last seen date into the listing 2025-01-26 21:41:18 +00:00
Kadir
4b6b8628c2 add runall script, update parameters to 4 bed etc and allow incremental updating 2024-11-23 22:57:22 +00:00
Kadir
dbf72e42e3 removing districts which are far away from the default crawl 2024-09-23 01:43:20 +01:00
Kadir
f66007bc85 adding status and other fields 2024-09-22 11:31:32 +01:00
Kadir
4dfbcc64c1 floorplan rec: fix regex rule + filter out extreme values 2024-08-25 12:11:55 +02:00
Kadir
2e94bda7fa refac 2024-08-25 09:50:41 +02:00
Kadir
6d343e52e7 adding days updated 2024-08-11 19:36:25 +01:00
Kadir
f98cd02696 adding better exception messages and interrupt-free crawling 2024-05-06 18:54:55 +01:00
Kadir
874fed3e8f format and committing notebook 2024-04-05 11:38:55 +01:00
Kadir
69a5bba65e adding service charge to excel 2024-04-05 11:38:31 +01:00
Kadir
b5e316d1df making sure the routing is only applied to listings which are not done already 2024-04-05 11:38:09 +01:00
Kadir
966b9007a0 adding service charge info to the sheet 2024-04-05 11:37:03 +01:00
Kadir
5305451fe8 changing detail downloads to prefiltering first. Making the progress bar more accurate and frontloading 2024-04-01 20:28:37 +02:00
Kadir
b16fee0648 changing to 500 listings per query 2024-04-01 20:28:15 +02:00
Kadir
7ce54a01bd removing download of the normal images; I don't use them 2024-04-01 15:26:34 +02:00
Kadir
c8c53b8696 Reduced need for routing by limiting to a radius of 7 miles 2024-04-01 15:22:08 +02:00
Kadir
6a43a7f485 adding tasks and updating exploration notebook 2024-04-01 15:10:15 +02:00
Kadir
a7a4f88a39 sort districts 2024-03-30 19:24:35 +01:00
Kadir
acc67192c9 add counter to 2_dump_details to understand how many new are crawled 2024-03-30 19:24:03 +01:00
Kadir
5720e68547 Rewriting 1_dump to enable crawling of more real estate 2024-03-30 19:23:19 +01:00
Kadir
40285245d5 remove debug print statements 2024-03-30 18:35:32 +01:00
Kadir
ec86b572b3 adding districts and location identifier to search 2024-03-30 18:31:49 +01:00
Kadir
4c40462bb8 adding property type 2024-03-25 20:58:35 +00:00
Kadir
d777558b34 ruff format 2024-03-25 20:48:48 +00:00
Kadir
37e3e8ad6f using ruff 2024-03-25 20:48:27 +00:00
Kadir
d5a0964b12 exploration notebook 2024-03-25 20:47:50 +00:00
Kadir
47f7b2b672 Adding initial walking time and identifier to the information 2024-03-25 20:47:31 +00:00