chrome-service: switch to CDP + persistent profile + hourly snapshot pipeline

The chrome-service stack ran `playwright launch-server`, which creates
ephemeral browser contexts per `connect()`. Despite the encrypted PVC
mounted at /profile, no Chromium user data ever persisted, only the npm
cache + fontconfig. Logging in via noVNC was effectively a no-op.

Refactor:
- Replace launch-server with direct chromium (TCP CDP on :9223 internal),
  fronted by a Python HTTP+WS bridge on :9222 that rewrites the Host
  header to bypass Chrome's hardcoded DNS-rebinding protection (no
  `--remote-allow-hosts` flag exists in stock Chrome 130; verified by
  binary string grep). Bridge also forces Connection: close on HTTP
  responses so Node ws opens a fresh TCP connection for the WS upgrade
  rather than trying to reuse the dead keep-alive socket.
- Add `--user-data-dir=/profile/chromium-data` so cookies/localStorage
  actually persist on the encrypted PVC.
- New snapshot-server sidecar (stdlib python HTTP) serves
  GET /api/snapshot at chrome.viktorbarzin.me/api/snapshot,
  bearer-token-gated by the existing api_bearer_token.
- New chrome-service-snapshot-harvester CronJob (hourly) connects via
  CDP, dumps storage_state() (cookies + localStorage), writes atomically
  to /profile/snapshots/storage-state.json.
- NetworkPolicy: TCP/9222 (was :3000), TCP/8088 added for traefik.
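The Host rewrite and the forced `Connection: close` from the bridge bullet above can be sketched roughly as follows. This is not the bridge shipped in this commit: the helper names (`rewrite_request_head`, `force_connection_close`) and the upstream value `127.0.0.1:9223` are assumptions for illustration; the commit only states that Chromium listens on :9223 internally.

```python
# Hypothetical sketch of the header rewriting a CDP bridge might perform.
UPSTREAM_HOST = b"127.0.0.1:9223"  # assumed loopback address of the internal CDP port


def _split_head(head: bytes) -> tuple[bytes, list[bytes]]:
    """Split a raw HTTP head into its start line and its header lines."""
    lines = head.split(b"\r\n\r\n", 1)[0].split(b"\r\n")
    return lines[0], lines[1:]


def _replace_header(head: bytes, name: bytes, value: bytes) -> bytes:
    """Drop every `name` header and append a single `name: value` header."""
    start, headers = _split_head(head)
    kept = [
        h for h in headers
        if h.split(b":", 1)[0].strip().lower() != name.lower()
    ]
    kept.append(name + b": " + value)
    return b"\r\n".join([start, *kept]) + b"\r\n\r\n"


def rewrite_request_head(head: bytes) -> bytes:
    """Client -> Chromium: present a loopback Host so the request is accepted."""
    return _replace_header(head, b"Host", UPSTREAM_HOST)


def force_connection_close(head: bytes) -> bytes:
    """Chromium -> client, plain HTTP responses only (not the WS upgrade)."""
    return _replace_header(head, b"Connection", b"close")
```

A real bridge would apply `rewrite_request_head` to both the HTTP discovery requests and the WebSocket upgrade request, and `force_connection_close` only to non-upgrade responses.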

Caller migration:
- f1-stream: `chromium.connect(ws_url)` → `chromium.connect_over_cdp(cdp_url)`,
  env var CHROME_WS_URL → CHROME_CDP_URL. CHROME_WS_TOKEN dropped (no
  longer used by code; ExternalSecret kept for symmetry with the snapshot
  endpoint).
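For callers making the same migration, a minimal sketch with Playwright's Python sync API might look like this. It is not the f1-stream code: the helper names (`cdp_url_from_env`, `open_page_title`) are hypothetical, and it assumes the caller wants to reuse the browser's default context so the persistent profile's cookies apply.

```python
import os


def cdp_url_from_env(env=os.environ) -> str:
    """Return the CDP endpoint from CHROME_CDP_URL, failing loudly if unset."""
    url = env.get("CHROME_CDP_URL", "").strip()
    if not url:
        raise RuntimeError("CHROME_CDP_URL is not set")
    return url


def open_page_title(target_url: str) -> str:
    """Attach to the shared Chromium over CDP and return a page title."""
    # Imported lazily so cdp_url_from_env() works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.connect_over_cdp(cdp_url_from_env())
        # Prefer the existing default context (assumption: it carries the
        # persistent profile); fall back to a fresh context otherwise.
        context = browser.contexts[0] if browser.contexts else browser.new_context()
        page = context.new_page()
        try:
            page.goto(target_url)
            return page.title()
        finally:
            page.close()
```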

Dev-box side (out of scope for this commit; see ~/.config/systemd/user/):
- playwright-mcp.service flips to `--isolated --storage-state=...`
  so per-Claude-Code-session ephemeral contexts seed from the snapshot.
- playwright-snapshot-refresh.{service,timer} (hourly) pulls the
  snapshot via the bearer-gated HTTPS endpoint.
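The hourly pull could be sketched as below. The actual dev-box unit is out of scope for this commit and may well be a shell one-liner; the function names (`build_request`, `write_atomically`, `refresh`) are hypothetical, and only the URL comes from the commit message. The atomic-write helper uses the usual temp-file-plus-rename pattern, which is one plausible reading of "writes atomically" for the harvester as well.

```python
import json
import os
import tempfile
import urllib.request

SNAPSHOT_URL = "https://chrome.viktorbarzin.me/api/snapshot"  # from the commit message


def build_request(token: str, url: str = SNAPSHOT_URL) -> urllib.request.Request:
    """Build the GET request carrying the bearer token."""
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def write_atomically(path: str, data: bytes) -> None:
    """Write to a temp file in the same directory, then rename over `path`."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".storage-state.")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)  # atomic on POSIX within one filesystem
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def refresh(token: str, dest: str) -> None:
    """Fetch the snapshot and install it only if it parses as JSON."""
    # urlopen raises HTTPError on 401/404/503, leaving the old file untouched.
    with urllib.request.urlopen(build_request(token), timeout=30) as resp:
        body = resp.read()
    json.loads(body)  # refuse to install a snapshot that is not valid JSON
    write_atomically(dest, body)
```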

Docs updated:
- docs/architecture/chrome-service.md: new architecture diagram + wire protocol.
- docs/runbooks/chrome-service-snapshot.md: day-2 ops (refresh, rotation,
  failure modes, restore).
- stacks/chrome-service/README.md: connect_over_cdp recipe.

Design spec at docs/superpowers/specs/2026-06-04-playwright-per-session-browser-design.md.

parent b64d8d6168
commit deede6dd11
10 changed files with 1152 additions and 177 deletions

68
stacks/chrome-service/files/snapshot_server.py
Normal file
@@ -0,0 +1,68 @@
#!/usr/bin/env python3
"""Tiny HTTP server that exposes /api/snapshot, gated by a bearer token.

Runs as a sidecar in the chrome-service pod. Reads the persisted storage
state written hourly by the snapshot-harvester CronJob and returns it to
authenticated callers (the dev-box `playwright-snapshot-refresh` timer).

Token is read from the PW_TOKEN env var, same secret the legacy WS path
used. The endpoint is mounted behind Traefik on `chrome.viktorbarzin.me`
at the `/api/snapshot` path (auth=none at the ingress; the bearer check
is here).
"""

import os
import sys
from http.server import HTTPServer, BaseHTTPRequestHandler

TOKEN = os.environ.get("PW_TOKEN")
SNAPSHOT_PATH = os.environ.get(
    "SNAPSHOT_PATH", "/profile/snapshots/storage-state.json"
)
PORT = int(os.environ.get("PORT", "8088"))


class Handler(BaseHTTPRequestHandler):
    server_version = "chrome-snapshot/1"

    def _short(self, status: int, body: bytes = b"") -> None:
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        if body:
            self.wfile.write(body)

    def do_GET(self):
        if self.path == "/healthz":
            self._short(200, b"ok\n")
            return
        if self.path != "/api/snapshot":
            self._short(404)
            return
        if TOKEN is None:
            self._short(503, b"{\"error\":\"token not configured\"}\n")
            return
        if self.headers.get("Authorization", "") != f"Bearer {TOKEN}":
            self._short(401, b"{\"error\":\"invalid bearer\"}\n")
            return
        try:
            with open(SNAPSHOT_PATH, "rb") as f:
                data = f.read()
        except FileNotFoundError:
            self._short(404, b"{\"error\":\"snapshot not yet available\"}\n")
            return
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Cache-Control", "no-cache")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, fmt, *args):
        sys.stderr.write(
            "[snapshot-server] %s - %s\n" % (self.address_string(), fmt % args)
        )


if __name__ == "__main__":
    HTTPServer(("0.0.0.0", PORT), Handler).serve_forever()
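
The handler above compares the Authorization header with `!=`, and only treats an unset PW_TOKEN (not an empty one) as "not configured". One possible hardening, shown here as a sketch rather than as part of this commit, is a constant-time comparison via `hmac.compare_digest` that also rejects an empty token. The helper name `bearer_ok` is hypothetical.

```python
import hmac
from typing import Optional


def bearer_ok(header_value: Optional[str], token: Optional[str]) -> bool:
    """Return True only for an exact `Bearer <token>` match.

    An unset or empty token never authenticates, so a blank PW_TOKEN
    cannot be satisfied by sending a bare "Bearer " header.
    """
    if not token or header_value is None:
        return False
    expected = f"Bearer {token}".encode()
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(header_value.encode(), expected)
```

Inside `do_GET`, the 401 branch would then read `if not bearer_ok(self.headers.get("Authorization"), TOKEN):`.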