chrome-service: switch to CDP + persistent profile + hourly snapshot pipeline

The chrome-service stack ran `playwright launch-server`, which creates
ephemeral browser contexts per `connect()`. Despite the encrypted PVC
mounted at /profile, no chromium user-data ever persisted — only npm
cache + fontconfig. Logging in via noVNC was effectively a no-op.

Refactor:
- Replace launch-server with direct chromium (TCP CDP on :9223 internal),
  fronted by a Python HTTP+WS bridge on :9222 that rewrites the Host
  header to bypass Chrome's hardcoded DNS-rebinding protection (no
  `--remote-allow-hosts` flag exists in stock Chrome 130; verified by
  binary string grep). Bridge also forces Connection: close on HTTP
  responses so Node ws opens a fresh TCP connection for the WS upgrade
  rather than trying to reuse the dead keep-alive socket.
- Add `--user-data-dir=/profile/chromium-data` so cookies/localStorage
  actually persist on the encrypted PVC.
- New snapshot-server sidecar (stdlib python HTTP) serves
  GET /api/snapshot at chrome.viktorbarzin.me/api/snapshot,
  bearer-token-gated by the existing api_bearer_token.
- New chrome-service-snapshot-harvester CronJob (hourly) connects via
  CDP, dumps storage_state() (cookies + localStorage), writes atomically
  to /profile/snapshots/storage-state.json.
- NetworkPolicy: TCP/9222 (was :3000), TCP/8088 added for traefik.
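
The two header rewrites the bridge performs can be sketched as pure
functions over raw HTTP heads. This is a minimal illustration, not the
bridge shipped in this commit: the function names and the
`127.0.0.1:9223` upstream default are assumptions, and it presumes each
head arrives complete and terminated by a blank line.

```python
def rewrite_request_head(head: bytes, upstream_host: str = "127.0.0.1:9223") -> bytes:
    """Replace the Host header of a raw HTTP request head.

    Hypothetical helper: the idea is that chromium sees an IP-literal
    Host instead of the in-cluster service name the client used.
    """
    lines = head.split(b"\r\n")
    out = [lines[0]]  # request line is passed through untouched
    for line in lines[1:]:
        name, sep, _ = line.partition(b":")
        if sep and name.strip().lower() == b"host":
            out.append(b"Host: " + upstream_host.encode("ascii"))
        else:
            out.append(line)
    return b"\r\n".join(out)


def force_connection_close(head: bytes) -> bytes:
    """Force `Connection: close` on a plain HTTP response head.

    101 responses are returned unchanged so the WebSocket upgrade
    keeps its `Connection: Upgrade` header.
    """
    lines = head.split(b"\r\n")
    status_parts = lines[0].split(b" ", 2)
    if len(status_parts) >= 2 and status_parts[1] == b"101":
        return head
    kept = [
        line for line in lines[1:]
        if line and line.partition(b":")[0].strip().lower() != b"connection"
    ]
    # Re-terminate the head with the blank line that ends the header block.
    return b"\r\n".join([lines[0], *kept, b"Connection: close", b"", b""])
```

With the client closing the socket after each HTTP response, the
subsequent WS upgrade has to start on a new connection, which is the
behaviour the bullet above relies on.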

Caller migration:
- f1-stream: `chromium.connect(ws_url)` → `chromium.connect_over_cdp(cdp_url)`,
  env var CHROME_WS_URL → CHROME_CDP_URL. CHROME_WS_TOKEN dropped (no
  longer used by code; ExternalSecret kept for symmetry with the snapshot
  endpoint).

Dev-box side (out of scope for this commit — see ~/.config/systemd/user/):
- playwright-mcp.service flips to `--isolated --storage-state=...`
  so per-Claude-Code-session ephemeral contexts seed from the snapshot.
- playwright-snapshot-refresh.{service,timer} (hourly) pulls the
  snapshot via the bearer-gated HTTPS endpoint.
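
A refresh step along these lines could pull the snapshot and swap it
into place. This is a hedged sketch using only the standard library:
the function names, the `Authorization: Bearer <token>` header shape,
and the destination path passed by the caller are assumptions, not the
contents of the actual unit files.

```python
import json
import os
import tempfile
import urllib.request


def build_snapshot_request(url: str, token: str) -> urllib.request.Request:
    # Assumes the endpoint expects a standard "Bearer" Authorization header.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def write_atomically(path: str, payload: bytes) -> None:
    """Write to a temp file in the same directory, then rename over `path`."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".storage-state-", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(payload)
            fh.flush()
            os.fsync(fh.fileno())
        # os.replace is atomic when source and target share a filesystem.
        os.replace(tmp, path)
    except BaseException:
        # Do not leave a stray temp file behind on failure.
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise


def refresh_snapshot(url: str, token: str, dest: str, timeout: float = 30.0) -> None:
    with urllib.request.urlopen(build_snapshot_request(url, token), timeout=timeout) as resp:
        payload = resp.read()
    json.loads(payload)  # reject non-JSON bodies before replacing the old snapshot
    write_atomically(dest, payload)
```

Validating the body before the rename means a failed or truncated
download leaves the previous snapshot intact for the next session.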

Docs updated:
- docs/architecture/chrome-service.md — new architecture diagram + wire protocol.
- docs/runbooks/chrome-service-snapshot.md — day-2 ops (refresh, rotation,
  failure modes, restore).
- stacks/chrome-service/README.md — connect_over_cdp recipe.

Design spec at docs/superpowers/specs/2026-06-04-playwright-per-session-browser-design.md.
Viktor Barzin 2026-06-04 05:15:49 +00:00
parent b64d8d6168
commit deede6dd11
10 changed files with 1152 additions and 177 deletions


@@ -336,14 +336,32 @@ class PlaybackVerifier:
             logger.error("playwright not installed — playback verification disabled")
             return None
         self._playwright = await async_playwright().start()
-        ws_base = os.getenv("CHROME_WS_URL")
-        ws_token = os.getenv("CHROME_WS_TOKEN")
-        if ws_base and ws_token:
-            self._browser = await self._playwright.chromium.connect(
-                f"{ws_base.rstrip('/')}/{ws_token}", timeout=15_000,
-            )
-            logger.info("connected to remote chrome-service (concurrency=%d)", MAX_CONCURRENCY)
-        else:
+        # CHROME_CDP_URL points to chrome-service's CDP endpoint
+        # (http://chrome-service.chrome-service.svc:9222 by default).
+        # Migrated 2026-06-04 from `chromium.connect(ws_url)` because
+        # chrome-service now runs chromium directly with persistent
+        # user-data-dir for cookie warming — launch-server couldn't
+        # persist. The CDP `Browser` exposes the persistent default
+        # context via `browser.contexts[0]`; here we just call
+        # `new_context()` for incognito-style isolation per verify
+        # round, matching the previous behaviour.
+        cdp_url = os.getenv("CHROME_CDP_URL")
+        if cdp_url:
+            try:
+                self._browser = await self._playwright.chromium.connect_over_cdp(
+                    cdp_url, timeout=15_000,
+                )
+                logger.info("connected to remote chrome-service via CDP (concurrency=%d)", MAX_CONCURRENCY)
+            except Exception:
+                logger.exception(
+                    "CDP connect failed (%s) — falling back to in-process Chromium", cdp_url,
+                )
+                self._browser = None
+        if self._browser is None:
+            # Either CHROME_CDP_URL was unset, or CDP connect failed.
+            # Fall back to in-process headless so the verifier still
+            # returns playable/unplayable verdicts (degraded but
+            # functional — anti-bot pages may bypass).
             self._browser = await self._playwright.chromium.launch(
                 headless=True,
                 args=[
@@ -355,7 +373,10 @@ class PlaybackVerifier:
                     "--autoplay-policy=no-user-gesture-required",
                 ],
             )
-            logger.warning("CHROME_WS_URL not set — using in-process Chromium (concurrency=%d)", MAX_CONCURRENCY)
+            logger.warning(
+                "using in-process Chromium (CHROME_CDP_URL unset or CDP connect failed) (concurrency=%d)",
+                MAX_CONCURRENCY,
+            )
         return self._browser
 
     async def shutdown(self) -> None: