infra/stacks/chrome-service/files/snapshot_server.py

69 lines
2.3 KiB
Python

chrome-service: switch to CDP + persistent profile + hourly snapshot pipeline

The chrome-service stack ran `playwright launch-server`, which creates ephemeral browser contexts per `connect()`. Despite the encrypted PVC mounted at /profile, no chromium user-data ever persisted, only npm cache + fontconfig. Logging in via noVNC was effectively a no-op.

Refactor:
- Replace launch-server with direct chromium (TCP CDP on :9223 internal), fronted by a Python HTTP+WS bridge on :9222 that rewrites the Host header to bypass Chrome's hardcoded DNS-rebinding protection (no `--remote-allow-hosts` flag exists in stock Chrome 130; verified by binary string grep). Bridge also forces Connection: close on HTTP responses so Node ws opens a fresh TCP for the WS upgrade rather than trying to reuse the dead keep-alive socket.
- Add `--user-data-dir=/profile/chromium-data` so cookies/localStorage actually persist on the encrypted PVC.
- New snapshot-server sidecar (stdlib python HTTP) serves GET /api/snapshot at chrome.viktorbarzin.me/api/snapshot, bearer-token-gated by the existing api_bearer_token.
- New chrome-service-snapshot-harvester CronJob (hourly) connects via CDP, dumps storage_state() (cookies + localStorage), writes atomically to /profile/snapshots/storage-state.json.
- NetworkPolicy: TCP/9222 (was :3000), TCP/8088 added for traefik.

Caller migration:
- f1-stream: `chromium.connect(ws_url)` → `chromium.connect_over_cdp(cdp_url)`, env var CHROME_WS_URL → CHROME_CDP_URL. CHROME_WS_TOKEN dropped (no longer used by code; ExternalSecret kept for symmetry with the snapshot endpoint).

Dev-box side (out of scope for this commit; see ~/.config/systemd/user/):
- playwright-mcp.service flips to `--isolated --storage-state=...` so per-Claude-Code-session ephemeral contexts seed from the snapshot.
- playwright-snapshot-refresh.{service,timer} (hourly) pulls the snapshot via the bearer-gated HTTPS endpoint.

Docs updated:
- docs/architecture/chrome-service.md: new architecture diagram + wire protocol.
- docs/runbooks/chrome-service-snapshot.md: day-2 ops (refresh, rotation, failure modes, restore).
- stacks/chrome-service/README.md: connect_over_cdp recipe.

Design spec at docs/superpowers/specs/2026-06-04-playwright-per-session-browser-design.md.
2026-06-04 05:15:49 +00:00
#!/usr/bin/env python3
"""Tiny HTTP server that exposes /api/snapshot, gated by a bearer token.
Runs as a sidecar in the chrome-service pod. Reads the persisted storage
state written hourly by the snapshot-harvester CronJob and returns it to
authenticated callers (the dev-box `playwright-snapshot-refresh` timer).
Token is read from the PW_TOKEN env var, same secret the legacy WS path
used. The endpoint is mounted behind Traefik on `chrome.viktorbarzin.me`
at the `/api/snapshot` path (auth=none at the ingress; the bearer check
is here).
"""
import os
import sys
from http.server import HTTPServer, BaseHTTPRequestHandler

TOKEN = os.environ.get("PW_TOKEN")
SNAPSHOT_PATH = os.environ.get(
    "SNAPSHOT_PATH", "/profile/snapshots/storage-state.json"
)
PORT = int(os.environ.get("PORT", "8088"))


class Handler(BaseHTTPRequestHandler):
    server_version = "chrome-snapshot/1"

    def _short(self, status: int, body: bytes = b"") -> None:
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        if body:
            self.wfile.write(body)

    def do_GET(self):
        if self.path == "/healthz":
            self._short(200, b"ok\n")
            return
        if self.path != "/api/snapshot":
            self._short(404)
            return
        if TOKEN is None:
            self._short(503, b"{\"error\":\"token not configured\"}\n")
            return
        if self.headers.get("Authorization", "") != f"Bearer {TOKEN}":
            self._short(401, b"{\"error\":\"invalid bearer\"}\n")
            return
        try:
            with open(SNAPSHOT_PATH, "rb") as f:
                data = f.read()
        except FileNotFoundError:
            self._short(404, b"{\"error\":\"snapshot not yet available\"}\n")
            return
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Cache-Control", "no-cache")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, fmt, *args):
        sys.stderr.write(
            "[snapshot-server] %s - %s\n" % (self.address_string(), fmt % args)
        )


if __name__ == "__main__":
    HTTPServer(("0.0.0.0", PORT), Handler).serve_forever()
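
For reference, a caller of this endpoint might look roughly like the sketch below. The dev-box refresh script itself is not part of this file, so the function names `build_snapshot_request` and `fetch_snapshot`, the `SNAPSHOT_URL` constant, and the 10-second timeout are all hypothetical; only the host, the `/api/snapshot` path, and the `Authorization: Bearer <token>` header come from the server code above.

```python
import json
import urllib.request

# Host and path are taken from the module docstring; the constant name is hypothetical.
SNAPSHOT_URL = "https://chrome.viktorbarzin.me/api/snapshot"


def build_snapshot_request(token: str, url: str = SNAPSHOT_URL) -> urllib.request.Request:
    """Build a GET request with the exact header value the server compares against."""
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def fetch_snapshot(token: str, url: str = SNAPSHOT_URL, timeout: float = 10.0) -> dict:
    """Fetch and parse the storage-state JSON.

    urlopen raises urllib.error.HTTPError for the server's non-2xx replies
    (401 invalid bearer, 404 snapshot not yet available, 503 token not configured).
    """
    request = build_snapshot_request(token, url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read())
```

Because the server does a plain string comparison against `Bearer {TOKEN}`, the client has to send the scheme with that exact casing and a single space. A caller would presumably pass the same secret the server reads from `PW_TOKEN`, and catch `urllib.error.HTTPError` to distinguish a 404 (no snapshot harvested yet) from an authentication failure.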