Structured logging via JsonFormatter replaces uvicorn's default format so
Loki can parse timestamps and fields. 14 business metrics (scrape stats,
throttle events, circuit breaker state, cache hit rate, OCR success rate,
Celery task lifecycle) are defined in a shared metrics module and
instrumented across the scraper pipeline, API, and workers. Celery
workers expose a Prometheus HTTP endpoint on configurable ports.
- Disable OpenAPI docs/redoc/openapi.json when APP_ENV=production
- Strip uvicorn Server header with --no-server-header in Dockerfile and docker-compose.yml
- Add OriginValidatorMiddleware to reject state-changing requests from disallowed origins
- Add global exception handler to prevent stack trace leakage on unhandled errors
- Add tests for all new security features (OpenAPI, origin validation, exception handler, server header)
- Update Geofabrik download URL from great-britain to united-kingdom
(old path returns 302 redirect to homepage).
- Switch OSRM Docker volumes from named volume to bind mount
(./osrm-data:/data) so osrm-setup.sh output is used directly.
- Add osrm-data/ to .gitignore (large binaries, regenerated by script).
Adds osrm-foot, osrm-bicycle, and otp services to Docker Compose under
a 'routing' profile (opt-in). Setup scripts download Greater London OSM
data and pre-process for OSRM foot/bicycle profiles, plus TfL GTFS for
OTP transit. Routing engine env vars added to .env.sample.
The crawler subdirectory was the only active project. Moving it to the
repo root simplifies paths and removes the unnecessary nesting. The
vqa/ and immoweb/ directories were legacy/unused and have been removed.
Updated .drone.yml, .gitignore, .claude/ docs, and skills to reflect
the new flat structure.
2026-02-07 23:01:20 +00:00
Renamed from crawler/docker-compose.yml (Browse further)