Viktor Barzin
bd7c781adb
remove requirements.txt as we are using poetry... I'm blind
2025-05-07 21:06:43 +00:00
Viktor Barzin
028eef63a8
add readme and .style.yapf for formatting
2025-05-07 20:58:22 +00:00
Viktor Barzin
d24b667e73
add sqlalchemy to requirements.txt
2025-05-07 20:57:28 +00:00
Viktor Barzin
dd2de477bc
Add requirements.txt
2025-05-07 17:31:29 +00:00
Kadir
8f8956818e
refactor routing executor to:
- be nicer
- include checks on floorplan
- include checks on whether the listing is disabled
2025-03-31 03:06:41 +01:00
Kadir
cd21bd0bb6
notebooks
2025-03-30 23:42:29 +01:00
Kadir
2c2adcfa7c
improving OCR
2025-03-30 23:41:52 +01:00
Kadir
1b69fd4305
adding new dependencies
2025-03-02 17:11:48 +00:00
Kadir
244de72877
reverting isRemoved function to fix
2025-02-16 04:44:42 +00:00
Kadir
302ca95cfb
fixing full detail dumping
2025-02-16 03:02:21 +00:00
Kadir
e0e7853c8c
updating districts of relevance to me
2025-02-14 21:35:44 +00:00
Kadir
bb765a9312
making last_seen a relative date
2025-02-14 21:21:50 +00:00
Kadir
29c8d1960b
adding last seen date into the listing
2025-01-26 21:41:18 +00:00
Kadir
4b6b8628c2
add runall script, update parameters to 4 bed etc and allow incremental updating
2024-11-23 22:57:22 +00:00
Kadir
dbf72e42e3
removing districts which are far away from the default crawl
2024-09-23 01:43:20 +01:00
Kadir
f66007bc85
adding status and other fields
2024-09-22 11:31:32 +01:00
Kadir
4dfbcc64c1
floorplan rec: fix regex rule + filter out extreme values
2024-08-25 12:11:55 +02:00
Kadir
2e94bda7fa
refac
2024-08-25 09:50:41 +02:00
Kadir
6d343e52e7
adding days updated
2024-08-11 19:36:25 +01:00
Kadir
f98cd02696
adding better exception messages and interrupt free crawling
2024-05-06 18:54:55 +01:00
Kadir
874fed3e8f
format and committing notebook
2024-04-05 11:38:55 +01:00
Kadir
69a5bba65e
adding service charge to excel
2024-04-05 11:38:31 +01:00
Kadir
b5e316d1df
making sure the routing is only applied to listings which are not done already
2024-04-05 11:38:09 +01:00
Kadir
966b9007a0
adding service charge info to the sheet
2024-04-05 11:37:03 +01:00
Kadir
5305451fe8
changing detail downloads to prefiltering first. Making the progress bar more accurate and frontloading
2024-04-01 20:28:37 +02:00
Kadir
b16fee0648
changing to 500 listings per query
2024-04-01 20:28:15 +02:00
Kadir
7ce54a01bd
removing download of the normal images; I don't use them
2024-04-01 15:26:34 +02:00
Kadir
c8c53b8696
Reduced need for routing by limiting to a radius of 7 miles
2024-04-01 15:22:08 +02:00
Kadir
6a43a7f485
adding tasks and updating exploration notebook
2024-04-01 15:10:15 +02:00
Kadir
a7a4f88a39
sort districts
2024-03-30 19:24:35 +01:00
Kadir
acc67192c9
add counter to 2_dump_details to understand how many new are crawled
2024-03-30 19:24:03 +01:00
Kadir
5720e68547
Rewriting 1_dump to enable crawling of more real estate
2024-03-30 19:23:19 +01:00
Kadir
40285245d5
remove debug print statements
2024-03-30 18:35:32 +01:00
Kadir
ec86b572b3
adding districts and location identifier to search
2024-03-30 18:31:49 +01:00
Kadir
4c40462bb8
adding property type
2024-03-25 20:58:35 +00:00
Kadir
d777558b34
ruff format
2024-03-25 20:48:48 +00:00
Kadir
37e3e8ad6f
using ruff
2024-03-25 20:48:27 +00:00
Kadir
d5a0964b12
exploration notebook
2024-03-25 20:47:50 +00:00
Kadir
47f7b2b672
Adding initial walking time and identifier to the information
2024-03-25 20:47:31 +00:00
Kadir
ce632c795d
fixing gitignores
2024-03-25 20:47:15 +00:00
Kadir
e79e24ff98
fix: remove debug statement
2024-03-18 01:02:06 +00:00
Kadir
4dea766a12
fixing floorplan detection and adding recalculation method
2024-03-18 00:56:39 +00:00
Kadir
335adc0856
add routing, incremental crawling, travel time, lease and development
2024-03-13 16:24:57 +00:00
Kadir
d4f87bed76
add routing and util to the right places
2024-03-13 16:22:53 +00:00
Kadir
a9b8d4d630
adding musthave and lastXdays
2024-03-13 16:21:54 +00:00
Kadir
36258d877f
crawling for 3 and refactoring to allow incremental crawls
2024-03-11 14:43:53 +00:00
Kadir
de2639f9c3
fixing bugs, adding properties for querying and analysis
2024-03-11 09:44:37 +00:00
Kadir
d108bf11ee
adding tesseract OCR for floorplan detection
2024-03-10 22:32:34 +00:00
Kadir
508aa02812
Real crawling scripts and floorplan detection
1. get all listings
2. get all detail jsons
3. get all images
4. get all floorplans
5. detect floorplans
Also updating dependencies for huggingface etc.
2024-03-10 18:49:39 +00:00
Kadir
46bb641026
adding floorplans, detail json, refactored the folders
2024-03-07 22:02:09 +00:00