wrongmove/crawler
2025-08-23 22:20:42 +00:00
alembic make a few columns in the listing model indices to help with search 2025-06-30 22:57:41 +00:00
api add opentelemetry 2025-08-02 17:25:56 +00:00
data removing decisions from the repo 2025-06-18 00:59:16 +01:00
frontend make property image clickable 2025-07-27 11:11:04 +00:00
models make a few columns in the listing model indices to help with search 2025-06-30 22:57:41 +00:00
proof_of_concept ruff format 2024-03-25 20:48:48 +00:00
rec save user queries in redis so that user can refresh the page and still come back to their latest task 2025-07-06 12:02:25 +00:00
repositories add apprise and send notification when refreshing listings 2025-07-25 21:32:06 +00:00
tasks make task processing a bit better. still doing 1 query to check if needs processing; will fix later 2025-07-27 20:09:41 +00:00
.dockerignore add dockerignore to ignore frontend as that is built separately 2025-06-16 22:40:36 +00:00
.env.sample add celery env variables to sample env 2025-06-22 21:15:15 +00:00
.style.yapf add readme and .style.yapf for formatting 2025-05-07 20:58:22 +00:00
1_dump_listings.py add timeout when fetching details and use new entrypoint for task processing 2025-08-23 22:20:42 +00:00
3_dump_images.py migrate processing to a pipeline approach where each listing is processed in a pipeline in parallel and status reported back to track progress 2025-07-27 18:33:39 +00:00
4_detect_floorplan.py do 1 attempt to detect sqm from floorplan. if that fails then set to 0 instead of returning none 2025-06-21 12:06:57 +00:00
5_routing.py migrate routing command to use the models and store data there 2025-06-08 11:45:05 +00:00
9_recalculate_regex_squaremeter.py ruff format 2024-03-25 20:48:48 +00:00
91_recalculate_floorplan.py reformat most things 2025-05-07 21:25:40 +00:00
alembic.ini migrate to using db connection string from env 2025-06-22 21:16:55 +00:00
celery_app.py migrate background tasks to celery 2025-06-22 21:18:52 +00:00
csv_exporter.py use query params to filter out models; also make csv exporter work with models 2025-06-08 17:01:33 +00:00
data_access.py add apprise and send notification when refreshing listings 2025-07-25 21:32:06 +00:00
database.py bugfix - print sql statemnts only in dev 2025-06-30 23:22:12 +00:00
Dockerfile change api port to 5001 2025-06-24 19:12:20 +00:00
exploration.ipynb remove some empty fields in exploration notebook 2025-05-17 23:24:40 +00:00
GUIDE merging the visual query answering with the crawler. Monorepo go! 2024-03-01 16:42:48 +01:00
listing_processor.py add timeout when fetching details and use new entrypoint for task processing 2025-08-23 22:20:42 +00:00
main.py migrate processing to a pipeline approach where each listing is processed in a pipeline in parallel and status reported back to track progress 2025-07-27 18:33:39 +00:00
main_tmp.py ruff format 2024-03-25 20:48:48 +00:00
notifications.py migrate processing to a pipeline approach where each listing is processed in a pipeline in parallel and status reported back to track progress 2025-07-27 18:33:39 +00:00
ocr.ipynb notebooks 2025-03-30 23:42:29 +01:00
poetry.lock add opentelemetry 2025-08-02 17:25:56 +00:00
pyproject.toml add opentelemetry 2025-08-02 17:25:56 +00:00
README.md add helpful adb commands to the readme 2025-06-22 13:06:37 +00:00
redis_repository.py save user queries in redis so that user can refresh the page and still come back to their latest task 2025-07-06 12:02:25 +00:00
runall.sh some cleanups 2025-06-08 20:58:28 +00:00
start.sh add daily scrape of interesting rent listings 2025-07-25 22:14:45 +00:00
TASKS.md adding tasks and updating exploration notebook 2024-04-01 15:10:15 +02:00
ui_exporter.py migrate background tasks to celery 2025-06-22 21:18:52 +00:00

Setup

poetry install

Formatting

yapf --in-place --style .style.yapf --recursive .

For VSCode, install the yapf extension and enable formatting with yapf and the style file in this repo (there may be an easier way; I put this in my user settings JSON):

{
    "[python]": {
        "editor.formatOnSaveMode": "file",
        "editor.formatOnSave": true,
        "editor.defaultFormatter": "eeyore.yapf",
        "editor.formatOnType": false
    },
    "yapf.args": ["--style", "/home/wizard/code/realestate-crawler/crawler/.style.yapf"]
}

ADB commands (from /Applications/BlueStacks.app/Contents/MacOS):

Set proxy:

./hd-adb shell settings put global http_proxy 192.168.9.110:8080

Disable proxy:

./hd-adb shell settings put global http_proxy :0

Connect adb:

./hd-adb connect 127.0.0.1:5555

Disconnect adb:

./hd-adb disconnect
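
If these commands get typed often, they could be wrapped in a small Python helper. The following is only a sketch, not part of the repo: the function names (proxy_command, connect_command, disconnect_command, run) are made up for illustration, and it assumes the script is started from the BlueStacks directory above so that ./hd-adb resolves.

```python
import subprocess
from typing import List, Optional

# Assumption: the working directory is /Applications/BlueStacks.app/Contents/MacOS,
# where the adb binary bundled with BlueStacks is called hd-adb.
ADB = "./hd-adb"


def proxy_command(host_port: Optional[str]) -> List[str]:
    """Build the command that sets the global HTTP proxy.

    Passing None builds the "disable" variant, which writes ":0".
    """
    value = host_port if host_port else ":0"
    return [ADB, "shell", "settings", "put", "global", "http_proxy", value]


def connect_command(address: str = "127.0.0.1:5555") -> List[str]:
    """Build the command that connects adb to the emulator."""
    return [ADB, "connect", address]


def disconnect_command() -> List[str]:
    """Build the command that disconnects adb."""
    return [ADB, "disconnect"]


def run(command: List[str]) -> str:
    """Run a command and return its stdout; raises CalledProcessError on failure."""
    result = subprocess.run(command, check=True, capture_output=True, text=True)
    return result.stdout.strip()
```

A typical session would then be run(connect_command()), followed by run(proxy_command("192.168.9.110:8080")), and later run(proxy_command(None)) and run(disconnect_command()). Keeping the command builders separate from run makes it possible to inspect the argument lists without having an emulator attached.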