wrongmove/crawler
2026-02-02 23:01:13 +00:00

Setup

  1. Install deps:
poetry install && cp .env.sample .env
  2. Check .env if you want to customize the broker and database settings.
  3. Run ./start.sh

This starts the backend.

To start the frontend:

cd frontend && cp .env.sample .env

Set DEV_HOST in that .env to the hostname you want to use to access the web interface.

Next, set up a DNS record for that name (e.g. in your /etc/hosts file). This is important because auth is done via an external [authentik] service that needs to redirect to a name.
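A minimal sketch of the hosts entry, assuming the dev stack runs on your own machine (hence 127.0.0.1) and using crawler.dev.example as a placeholder for your DEV_HOST. The hosts_entry and has_entry helpers are illustrative names, not part of this repo. The script only prints the line to add; it does not modify /etc/hosts.

```shell
#!/bin/sh
# Assumptions: the dev stack is local (127.0.0.1) and
# "crawler.dev.example" is a placeholder for your real DEV_HOST.
DEV_HOST="${DEV_HOST:-crawler.dev.example}"
HOSTS_FILE="${HOSTS_FILE:-/etc/hosts}"

# Build the hosts line for a given name.
hosts_entry() {
    printf '127.0.0.1 %s\n' "$1"
}

# Succeed if the name already appears in the hosts file.
has_entry() {
    grep -qF -- "$1" "$HOSTS_FILE" 2>/dev/null
}

if has_entry "$DEV_HOST"; then
    echo "$DEV_HOST is already present in $HOSTS_FILE"
else
    echo "Add this line to $HOSTS_FILE (needs root):"
    hosts_entry "$DEV_HOST"
fi
```

Run it with DEV_HOST set to the same value you put in frontend/.env, then add the printed line to /etc/hosts by hand.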

Run ./start.sh

This starts a Caddy proxy with correct certificates and an npm dev server. Requests for the frontend are forwarded to the npm server, and requests for the backend (those going to /api/*) are forwarded to the backend service.
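The routing rule can be pictured with a small shell function. This is only an illustration of the path split described here, not the actual Caddy configuration; upstream_for and the /api/listings path are hypothetical.

```shell
#!/bin/sh
# Illustrative only: mirrors the /api/* split, not the real Caddy config.
upstream_for() {
    case "$1" in
        /api/*) echo "backend" ;;        # API calls go to the backend service
        *)      echo "npm dev server" ;; # everything else goes to the frontend
    esac
}

upstream_for /api/listings   # hypothetical path, prints "backend"
upstream_for /index.html     # prints "npm dev server"
```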

Lastly, reach out to Viktor to allowlist your DEV_HOST so that authentik can authorize callbacks to your host.

Formatting

yapf --style .style.yapf --recursive .

For VSCode: install the yapf extension, then enable formatting with yapf and the style file in this repo (there may be an easier way; I put this in my user settings JSON):

{
    "[python]": {
        "editor.formatOnSaveMode": "file",
        "editor.formatOnSave": true,
        "editor.defaultFormatter": "eeyore.yapf",
        "editor.formatOnType": false
    },
    "yapf.args": ["--style", "/home/wizard/code/realestate-crawler/crawler/.style.yapf"]
}

ADB commands (from /Applications/BlueStacks.app/Contents/MacOS):

Set proxy

./hd-adb shell settings put global http_proxy 192.168.9.110:8080

Disable proxy

./hd-adb shell settings put global http_proxy :0

Connect adb

./hd-adb connect 127.0.0.1:5555

Disconnect adb

./hd-adb disconnect