Instrument every API endpoint with Server-Timing headers so sub-operation
durations are visible in browser DevTools Network tab. Also adds Grafana
dashboard panels for per-endpoint latency comparison (p50/p95 timeseries
and p95 ranking bar gauge).
Init OTel metrics at module level in celery_app.py so prefork child
processes inherit the MeterProvider and PrometheusMetricReader from
the parent. Previously, worker_process_init created a separate
MeterProvider in each child, disconnected from the HTTP server in
the main process — so all scrape/celery/OCR metrics were silently
lost.
Update Grafana dashboard with API Performance and Frontend Performance
sections, synced from the live cluster dashboard.
Structured logging via JsonFormatter replaces uvicorn's default format so
Loki can parse timestamps and fields. 14 business metrics (scrape stats,
throttle events, circuit breaker state, cache hit rate, OCR success rate,
Celery task lifecycle) are defined in a shared metrics module and
instrumented across the scraper pipeline, API, and workers. Celery
workers expose a Prometheus HTTP endpoint on configurable ports.