The chrome-service stack ran `playwright launch-server`, which creates
ephemeral browser contexts per `connect()`. Despite the encrypted PVC
mounted at /profile, no chromium user-data ever persisted — only npm
cache + fontconfig. Logging in via noVNC was effectively a no-op.
Refactor:
- Replace launch-server with direct chromium (TCP CDP on :9223 internal),
fronted by a Python HTTP+WS bridge on :9222 that rewrites the Host
header to bypass Chrome's hardcoded DNS-rebinding protection (no
`--remote-allow-hosts` flag exists in stock Chrome 130; verified by
binary string grep). The bridge also forces `Connection: close` on HTTP
responses so that Node's ws client opens a fresh TCP connection for the
WS upgrade rather than trying to reuse the dead keep-alive socket.
- Add `--user-data-dir=/profile/chromium-data` so cookies/localStorage
actually persist on the encrypted PVC.
- New snapshot-server sidecar (stdlib python HTTP) serves
GET /api/snapshot at chrome.viktorbarzin.me/api/snapshot,
bearer-token-gated by the existing api_bearer_token.
- New chrome-service-snapshot-harvester CronJob (hourly) connects via
CDP, dumps storage_state() (cookies + localStorage), writes atomically
to /profile/snapshots/storage-state.json.
- NetworkPolicy: TCP/9222 (was :3000), TCP/8088 added for traefik.
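
The two header rewrites the bridge performs can be sketched roughly as below. This is an illustration only, not the bridge code from this commit: the function names and the `127.0.0.1:9223` upstream value are assumptions, and it operates on a raw header block that ends in a blank line.

```python
def rewrite_host(request_head: bytes, upstream_host: str = "127.0.0.1:9223") -> bytes:
    """Replace the Host header so Chrome sees an IP/localhost-style host.

    upstream_host is an assumed value; the commit only says chromium
    listens for CDP on :9223 internally.
    """
    lines = request_head.split(b"\r\n")
    out = [lines[0]]  # request line passes through untouched
    for line in lines[1:]:
        name, sep, _ = line.partition(b":")
        if sep and name.strip().lower() == b"host":
            out.append(b"Host: " + upstream_host.encode("ascii"))
        else:
            out.append(line)
    return b"\r\n".join(out)


def force_connection_close(response_head: bytes) -> bytes:
    """Force `Connection: close` on plain HTTP responses.

    101 (WebSocket upgrade) responses are returned unchanged, because
    they must keep their `Connection: Upgrade` header.
    """
    lines = response_head.split(b"\r\n")
    status_line = lines[0]
    parts = status_line.split(b" ", 2)
    if len(parts) >= 2 and parts[1] == b"101":
        return response_head
    kept = [
        line
        for line in lines[1:]
        if line and line.partition(b":")[0].strip().lower() != b"connection"
    ]
    # Re-append the blank line that terminates the header block.
    return b"\r\n".join([status_line, *kept, b"Connection: close", b"", b""])
```

With the Host header rewritten, requests such as `GET /json/version` are accepted by Chrome even though the client addressed the service by its cluster DNS name.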
Caller migration:
- f1-stream: `chromium.connect(ws_url)` → `chromium.connect_over_cdp(cdp_url)`,
env var CHROME_WS_URL → CHROME_CDP_URL. CHROME_WS_TOKEN dropped (no
longer used by code; ExternalSecret kept for symmetry with the snapshot
endpoint).
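
For other callers, the migration might look roughly like the sketch below. It assumes Playwright's sync API and hypothetical helper names (`cdp_url_from_env`, `connect`); f1-stream's actual code may be async and structured differently.

```python
import os
from typing import Mapping


def cdp_url_from_env(env: Mapping[str, str] = os.environ) -> str:
    """Read the CDP endpoint from CHROME_CDP_URL (replaces CHROME_WS_URL)."""
    url = env.get("CHROME_CDP_URL")
    if not url:
        raise RuntimeError("CHROME_CDP_URL is not set")
    return url


def connect(playwright):
    """Attach to the shared chromium over CDP.

    `playwright` is the object yielded by `sync_playwright()`. No token is
    passed: access is restricted by the NetworkPolicy, not by a credential.
    """
    return playwright.chromium.connect_over_cdp(cdp_url_from_env())


# Usage (requires the playwright package and a reachable chrome-service):
#
#   from playwright.sync_api import sync_playwright
#   with sync_playwright() as p:
#       browser = connect(p)
#       context = browser.contexts[0]  # the persistent default context
```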
Dev-box side (out of scope for this commit — see ~/.config/systemd/user/):
- playwright-mcp.service flips to `--isolated --storage-state=...`
so per-Claude-Code-session ephemeral contexts seed from the snapshot.
- playwright-snapshot-refresh.{service,timer} (hourly) pulls the
snapshot via the bearer-gated HTTPS endpoint.
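
A minimal sketch of what the refresh step could do, assuming the endpoint expects a standard `Authorization: Bearer <token>` header. The helper names are hypothetical and the real unit may well use a different tool (curl, for instance).

```python
import contextlib
import os
import pathlib
import tempfile
import urllib.request

SNAPSHOT_URL = "https://chrome.viktorbarzin.me/api/snapshot"


def build_snapshot_request(token: str, url: str = SNAPSHOT_URL) -> urllib.request.Request:
    """Build the GET request carrying the bearer token (assumed header scheme)."""
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def write_atomically(data: bytes, dest: pathlib.Path) -> None:
    """Write to a temp file in the same directory, then rename over dest."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=dest.parent, prefix=dest.name + ".")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
        os.replace(tmp, dest)  # atomic on POSIX within one filesystem
    except BaseException:
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp)
        raise


def refresh(token: str, dest: pathlib.Path) -> None:
    """Download the snapshot and replace the local copy in one step."""
    with urllib.request.urlopen(build_snapshot_request(token), timeout=30) as resp:
        write_atomically(resp.read(), dest)
```

Writing through a temporary file means a session that starts mid-refresh sees either the old snapshot or the new one, never a truncated file.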
Docs updated:
- docs/architecture/chrome-service.md — new architecture diagram + wire protocol.
- docs/runbooks/chrome-service-snapshot.md — day-2 ops (refresh, rotation,
failure modes, restore).
- stacks/chrome-service/README.md — connect_over_cdp recipe.
Design spec at docs/superpowers/specs/2026-06-04-playwright-per-session-browser-design.md.
#!/usr/bin/env python3
"""Connect to chrome-service via CDP, dump storage state, write atomically.

Runs hourly as a Kubernetes CronJob. Mounts the chrome-service encrypted
PVC at /profile (same node via pod-affinity) and writes the snapshot to
/profile/snapshots/storage-state.json. The snapshot-server sidecar reads
from the same path and serves it bearer-gated.

CDP endpoint is plain HTTP — protection is the chrome-service
NetworkPolicy (allow only labelled client namespaces). Same security model
as the previous WS endpoint, just unauthenticated within the trust zone.
"""

import asyncio
import logging
import os
import pathlib
import sys

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("snapshot-harvester")

CDP_URL = os.environ.get(
    "CDP_URL", "http://chrome-service.chrome-service.svc.cluster.local:9222"
)
SNAPSHOT_DIR = pathlib.Path(os.environ.get("SNAPSHOT_DIR", "/profile/snapshots"))
SNAPSHOT_FILE = SNAPSHOT_DIR / "storage-state.json"
TMP_FILE = SNAPSHOT_DIR / "storage-state.json.tmp"


async def main() -> int:
    try:
        from playwright.async_api import async_playwright
    except ImportError:
        log.error("playwright not installed in image")
        return 2

    SNAPSHOT_DIR.mkdir(parents=True, exist_ok=True)

    async with async_playwright() as p:
        try:
            browser = await p.chromium.connect_over_cdp(CDP_URL, timeout=20_000)
        except Exception:
            log.exception("connect_over_cdp failed (%s)", CDP_URL)
            return 3

        try:
            contexts = browser.contexts
            if not contexts:
                log.error("no browser contexts found — chrome-service may not have launched a persistent context yet")
                return 4
            ctx = contexts[0]
            # storage_state writes cookies + localStorage to a JSON file.
            # IndexedDB and sessionStorage are NOT included (known Playwright limitation).
            await ctx.storage_state(path=str(TMP_FILE))
            os.replace(TMP_FILE, SNAPSHOT_FILE)
            size = SNAPSHOT_FILE.stat().st_size
            log.info("wrote snapshot (%d bytes) to %s", size, SNAPSHOT_FILE)
        finally:
            try:
                await browser.close()
            except Exception:
                pass

    return 0


if __name__ == "__main__":
    sys.exit(asyncio.run(main()))