wrongmove/crawler
2025-05-17 23:24:40 +00:00
data decisions + logger 2025-05-12 01:01:48 +01:00
proof_of_concept ruff format 2024-03-25 20:48:48 +00:00
rec use aiohttp to fetch details concurrently 2025-05-17 22:34:27 +00:00
.style.yapf add readme and .style.yapf for formatting 2025-05-07 20:58:22 +00:00
1_dump_listings.py convert listings dump to asyncio 2025-05-17 21:55:42 +00:00
2_dump_detail.py dump images using aiohttp and concurrently 2025-05-17 22:43:05 +00:00
3_dump_images.py update runall script to use the click entrypoint 2025-05-17 23:14:00 +00:00
4_detect_floorplan.py detect floorplan using asyncio 2025-05-17 22:58:35 +00:00
5_routing.py parameterize routing step to work with custom data paths 2025-05-14 21:08:03 +00:00
9_recalculate_regex_squaremeter.py ruff format 2024-03-25 20:48:48 +00:00
91_recalculate_floorplan.py reformat most things 2025-05-07 21:25:40 +00:00
csv_exporter.py sort output dataframe by price per sqm 2025-05-17 23:14:18 +00:00
data_access.py detect floorplan using asyncio 2025-05-17 22:58:35 +00:00
exploration.ipynb remove some empty fields in exploration notebook 2025-05-17 23:24:40 +00:00
GUIDE merging the visual query answering with the crawler. Monorepo go! 2024-03-01 16:42:48 +01:00
logger.py decisions + logger 2025-05-12 01:01:48 +01:00
main.py detect floorplan using asyncio 2025-05-17 22:58:35 +00:00
main_tmp.py ruff format 2024-03-25 20:48:48 +00:00
ocr.ipynb notebooks 2025-03-30 23:42:29 +01:00
poetry.lock use aiohttp to fetch details concurrently 2025-05-17 22:34:27 +00:00
pyproject.toml use aiohttp to fetch details concurrently 2025-05-17 22:34:27 +00:00
README.md add readme and .style.yapf for formatting 2025-05-07 20:58:22 +00:00
runall.sh set the min price param in runall 2025-05-17 23:18:27 +00:00
TASKS.md adding tasks and updating exploration notebook 2024-04-01 15:10:15 +02:00

Setup

pip install -r requirements.txt

Formatting

yapf --style .style.yapf --recursive .
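Without a mode flag, yapf prints the reformatted source to stdout and leaves the files alone. A minimal sketch of a wrapper that makes the choice explicit, assuming yapf is on the PATH and the script is run from this directory; the names `yapf_command` and `run_yapf` are made up for illustration and are not part of this repo:

```python
import subprocess


def yapf_command(target=".", style=".style.yapf", in_place=False):
    """Build a yapf invocation as an argv list.

    --diff only reports what would change; --in-place rewrites the files.
    """
    mode = "--in-place" if in_place else "--diff"
    return ["yapf", "--style", style, "--recursive", mode, target]


def run_yapf(in_place=False):
    """Run yapf and return its exit code (hypothetical helper).

    With --diff, yapf exits non-zero when any file would be changed,
    which makes the default mode usable as a formatting check.
    """
    return subprocess.run(yapf_command(in_place=in_place)).returncode
```

Calling `run_yapf()` shows the pending changes, and `run_yapf(in_place=True)` applies them.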

For VSCode: install the yapf extension, then enable formatting with yapf and the style file in this repo (there may be an easier way; I put this in my user settings JSON):

{
    "[python]": {
        "editor.formatOnSaveMode": "file",
        "editor.formatOnSave": true,
        "editor.defaultFormatter": "eeyore.yapf",
        "editor.formatOnType": false
    },
    "yapf.args": ["--style", "/home/wizard/code/realestate-crawler/crawler/.style.yapf"]
}
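The path in `yapf.args` above is specific to one machine. As a rough sketch, the same settings can be generated for any checkout; the function name `vscode_yapf_settings` is hypothetical, and the assumption is that the script is run from the directory containing `.style.yapf`:

```python
import json
from pathlib import Path


def vscode_yapf_settings(style_file):
    """Return the VSCode settings from this README for a given style file."""
    return {
        "[python]": {
            "editor.formatOnSaveMode": "file",
            "editor.formatOnSave": True,
            "editor.defaultFormatter": "eeyore.yapf",
            "editor.formatOnType": False,
        },
        # Resolve to an absolute path, as in the example settings above.
        "yapf.args": ["--style", str(Path(style_file).resolve())],
    }


if __name__ == "__main__":
    # Print JSON that can be pasted into the user settings file.
    print(json.dumps(vscode_yapf_settings(".style.yapf"), indent=4))
```

The output still has to be merged into the existing settings file by hand; the sketch does not touch any VSCode files.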