wrongmove/crawler/1_dump_listings.py
Kadir 508aa02812 Real crawling scripts and floorplan detection
1. get all listings
2. get all detail jsons
3. get all images
4. get all floorplans
5. detect floorplans

Also updating dependencies for huggingface etc.
2024-03-10 18:49:39 +00:00

import json
import pathlib

from rec.query import listing_query

# Root folder; each listing gets its own subfolder named after its identifier.
folder = pathlib.Path("data/rs/")

for i in range(1, 10000):
    try:
        print(f"page {i}")
        d = listing_query(i, 1, 2, 15, 0, 800000)
    except Exception:
        # Stop paging when the query fails (presumably past the last page).
        break
    for listing in d['properties']:
        identifier = listing['identifier']
        listing_folder = folder / str(identifier)
        listing_folder.mkdir(exist_ok=True, parents=True)
        listing_path = listing_folder / "listing.json"
        # Write the raw listing dict as returned by the query.
        with open(listing_path, 'w') as f:
            json.dump(listing, f)
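
The dump above writes one `listing.json` per identifier under `data/rs/`. A later step that needs those identifiers could read them back roughly as follows. This is only a sketch: `iter_dumped_listings` is a hypothetical helper, not part of the repository, and it assumes nothing beyond the `<root>/<identifier>/listing.json` layout produced above.

```python
import json
import pathlib
from typing import Iterator, Tuple


def iter_dumped_listings(root: pathlib.Path) -> Iterator[Tuple[str, dict]]:
    """Yield (identifier, listing) for every <root>/<identifier>/listing.json.

    Hypothetical helper: it relies only on the folder layout written by the
    dump script, not on any particular field inside the listing.
    """
    # sorted() makes the iteration order deterministic across runs.
    for listing_path in sorted(root.glob("*/listing.json")):
        with open(listing_path) as f:
            # The parent folder name is the identifier used when dumping.
            yield listing_path.parent.name, json.load(f)
```

Calling `iter_dumped_listings(pathlib.Path("data/rs/"))` after a crawl would then give each identifier together with its stored listing, without querying the site again.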