wrongmove/crawler/2_dump_detail.py
Kadir 508aa02812 Real crawling scripts and floorplan detection
1. get all listings
2. get all detail jsons
3. get all images
4. get all floorplans
5. detecting floorplans

Also updating dependencies for huggingface etc.
2024-03-10 18:49:39 +00:00

import json
import pathlib

from rec.query import detail_query

# Root folder holding one sub-directory per listing.
folder = pathlib.Path('data/rs/')
listings = folder.glob('*/listing.json')

for listing_path in listings:
    with open(listing_path) as f:
        listing = json.load(f)
    identifier = listing['identifier']
    try:
        d = detail_query(identifier)
    except Exception:
        # Report which listing failed, then re-raise to stop the crawl.
        print('Failed at: ', identifier)
        raise
    print(identifier)
    # Store the detail JSON in the directory named after the identifier.
    detail_path = pathlib.Path(f'data/rs/{identifier}/detail.json')
    with open(detail_path, 'w') as f:
        json.dump(d, f)
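
The script above re-queries every listing on each run. A possible resumable variant, which is not part of this commit, is sketched below. The function name `dump_details` and its `query` parameter are hypothetical; `query` stands in for `detail_query` so the loop can be exercised without network access. The sketch assumes, as the script does, that each listing's directory is named after its identifier.

```python
import json
import pathlib
from typing import Any, Callable, List


def dump_details(folder: pathlib.Path, query: Callable[[Any], Any]) -> List[Any]:
    """Write detail.json for each listing that lacks one.

    Returns the identifiers that were queried on this run.
    `query` is a hypothetical stand-in for rec.query.detail_query.
    """
    written = []
    for listing_path in sorted(folder.glob('*/listing.json')):
        with open(listing_path) as f:
            identifier = json.load(f)['identifier']
        # Assumption: the listing directory is named after the identifier.
        detail_path = folder / str(identifier) / 'detail.json'
        if detail_path.exists():
            # Fetched on an earlier run, so skip the query.
            continue
        detail = query(identifier)
        with open(detail_path, 'w') as f:
            json.dump(detail, f)
        written.append(identifier)
    return written
```

With the real crawler this might be called as `dump_details(pathlib.Path('data/rs/'), detail_query)`. After an interrupted crawl, a second call would then only query the listings that still have no `detail.json`.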