The f1-stream verifier's in-process headless Chromium kept tripping hmembeds' disable-devtool.js Performance detector (CDP latency on console.log vs console.table) and getting redirected to google.com.

This adds a single-replica chrome-service stack running Playwright launch-server under Xvfb so callers can connect via WS+token to a shared headed browser. f1-stream's _ensure_browser now prefers chromium.connect(CHROME_WS_URL/CHROME_WS_TOKEN) and adds a vendored stealth init script (webdriver/plugins/languages/Permissions/WebGL spoofs + querySelector hijack to disarm disable-devtool-auto) on every new context. Falls back to in-process headless if the env vars aren't set.

Encrypted PVC for profile + npm cache, NetworkPolicy to TCP/3000 gated by client-namespace label, 6h tar.gz backup CronJob to NFS, Authentik-gated nginx sidecar at chrome.viktorbarzin.me for human liveness checks. Image pinned to playwright:v1.48.0-noble in lockstep with the Python client's playwright==1.48.0.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
chrome-service — In-cluster headed Chromium pool
Overview
chrome-service is a single-replica, persistent-profile, bearer-token-gated
Playwright launch-server that exposes a headed Chromium browser over a
WebSocket. Sibling services connect to it instead of running their own
in-process Chromium when the upstream's anti-bot tooling
(disable-devtool.js redirect-to-google trap, console-clear timing tricks,
navigator.webdriver checks) defeats a headless browser.
Initial caller: f1-stream's playback_verifier. Future callers attach
via the WS+token contract documented in stacks/chrome-service/README.md.
Why a separate stack
In-process Chromium inside f1-stream:
- Runs headless by default (no Xvfb/DISPLAY).
- Has the HeadlessChromium/... UA suffix and navigator.webdriver === true.
- Trips disable-devtool.js's Performance detector: Playwright's CDP adds latency to console.log(largeArray) vs console.table(largeArray), which the lib reads as "DevTools is open" and redirects to https://www.google.com/.
chrome-service solves this by:
- Running headed under Xvfb :99 (via playwright launch-server with a JSON config that pins headless: false).
- Living in a long-lived pod so JIT browser launch latency disappears.
- Allowing a per-context init script (stacks/chrome-service/files/stealth.js, ~40 lines, vendored from puppeteer-extra-plugin-stealth) to spoof webdriver, chrome.runtime, plugins, languages, Permissions.query, WebGL renderer strings, and to hide the disable-devtool-auto script-tag attribute so the lib's IIFE exits early.
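The per-context init script step could look roughly like the sketch below, assuming the Playwright async Python API. The helper name new_stealth_context is hypothetical and not taken from the f1-stream source; the point is only that the script is registered on the context before any page is created.

```python
async def new_stealth_context(browser, stealth_js: str, **context_options):
    """Create a context and register the stealth script before any page exists.

    `browser` is expected to be a playwright.async_api.Browser. Init scripts
    registered on a context run in every page of that context, ahead of the
    page's own scripts. (Helper name is illustrative, not from the source.)
    """
    context = await browser.new_context(**context_options)
    await context.add_init_script(script=stealth_js)
    return context
```

A caller would read its vendored copy once, for example with pathlib.Path("stealth.js").read_text(), and pass the resulting string to every call.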
Wire protocol
ws://chrome-service.chrome-service.svc.cluster.local:3000/<TOKEN>
│
┌───────────────────────────────┼───────────────────────────────┐
│ caller pod │ chrome-service pod
│ (e.g. f1-stream) │ (single replica)
│ │
│ CHROME_WS_URL ──────────────┘
│ CHROME_WS_TOKEN ─── from `secret/chrome-service.api_bearer_token` (ESO)
│
│ await chromium.connect(f"{ws}/{token}")
│ await ctx.add_init_script(STEALTH_JS)
│ page.goto("https://upstream.com/embed/...")
│
└─── ←── pages render under Xvfb, headed Chromium ──── ─────────┘
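On the caller side, the contract in the diagram might be sketched as follows. This is not the actual _ensure_browser from f1-stream; the function names are illustrative, and the only assumptions are the two env vars above and a Playwright object obtained from async_playwright().

```python
import os
from typing import Mapping, Optional


def chrome_ws_endpoint(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Build ws://.../<TOKEN>, or return None when either variable is unset."""
    url = env.get("CHROME_WS_URL", "").rstrip("/")
    token = env.get("CHROME_WS_TOKEN", "")
    if not url or not token:
        return None
    return f"{url}/{token}"


async def ensure_browser(playwright, env: Mapping[str, str] = os.environ):
    """Prefer the shared headed browser; fall back to in-process headless."""
    endpoint = chrome_ws_endpoint(env)
    if endpoint is not None:
        # Remote: attach to the launch-server running in the chrome-service pod.
        return await playwright.chromium.connect(endpoint)
    # Local fallback: same behaviour as before chrome-service existed.
    return await playwright.chromium.launch(headless=True)
```

Keeping the endpoint construction in a pure function makes the fallback decision easy to unit-test without a browser.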
Image pin
Both the server image (mcr.microsoft.com/playwright:v1.48.0-noble in
stacks/chrome-service/main.tf) and the Python client
(playwright==1.48.0 in callers' requirements.txt) must match
minor-versions. Bump in lockstep — Playwright protocol changes between
minors and the client cannot connect to a mismatched server.
The Microsoft image ships only the browser binaries, not the playwright
npm SDK; the start command runs npx -y playwright@1.48.0 launch-server
which downloads the SDK on first start (cached under $HOME/.npm via the
PVC) and reuses it on subsequent restarts.
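A lockstep check along these lines could run in CI to catch a one-sided bump. It is a sketch only: the function names are hypothetical, and it assumes both pins contain a plain X.Y.Z version string as in the examples above.

```python
import re
from typing import Tuple


def major_minor(text: str) -> Tuple[int, int]:
    """Return (major, minor) from the first X.Y.Z version found in text."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no X.Y.Z version in {text!r}")
    return int(match.group(1)), int(match.group(2))


def pins_in_lockstep(server_image: str, client_requirement: str) -> bool:
    """True when the image tag and the pip pin share the same major.minor."""
    return major_minor(server_image) == major_minor(client_requirement)
```

For example, pins_in_lockstep("mcr.microsoft.com/playwright:v1.48.0-noble", "playwright==1.48.0") holds, while bumping only the client to 1.49.0 would fail the check.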
Storage
- chrome-service-profile-encrypted (PVC, 2Gi → 10Gi autoresize, proxmox-lvm-encrypted): Chromium user-data dir + npm cache. Encrypted because cookies/localStorage may include third-party auth tokens for sites callers drive. HOME=/profile so npx caches there.
- chrome-service-backup-host (NFS, RWX): destination for a 6-hourly CronJob that runs tar -czf /backup/<YYYY_MM_DD_HH>.tar.gz -C /profile ., retention 30 days.
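The backup naming and 30-day retention can be modelled as below. How the real CronJob prunes old archives is not shown in this document, so this Python sketch only illustrates the scheme; the names NAME_FORMAT, backup_name and expired_backups are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Iterable, List

NAME_FORMAT = "%Y_%m_%d_%H.tar.gz"  # matches <YYYY_MM_DD_HH>.tar.gz
RETENTION = timedelta(days=30)


def backup_name(now: datetime) -> str:
    """Archive file name for a backup taken at `now`."""
    return now.strftime(NAME_FORMAT)


def expired_backups(names: Iterable[str], now: datetime) -> List[str]:
    """Return the archive names older than the retention window."""
    expired = []
    for name in names:
        try:
            taken = datetime.strptime(name, NAME_FORMAT)
        except ValueError:
            continue  # not a backup archive; leave it alone
        if now - taken > RETENTION:
            expired.append(name)
    return expired
```

Deriving the age from the file name rather than from mtime keeps the rule independent of how the NFS export reports timestamps, which is a design choice of this sketch rather than something the stack is known to do.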
Auth + secrets
- Vault KV secret/chrome-service.api_bearer_token holds a 32-byte URL-safe random token, rotated by hand: vault kv put secret/chrome-service api_bearer_token=$(python3 -c 'import secrets; print(secrets.token_urlsafe(32))').
- ESO syncs into namespace-local Secret chrome-service-secrets (server pod) and chrome-service-client-secrets (each caller pod).
- Reloader (reloader.stakater.com/auto = "true") cascades token rotation to both server and any annotated caller, so no manual rollout is needed.
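Because the token ends up as a path segment of the WebSocket URL, it has to survive there without escaping. The sketch below expands the one-liner from the rotation command and adds a hypothetical is_path_safe check; only secrets.token_urlsafe comes from the original.

```python
import secrets
import string

# Alphabet produced by URL-safe base64 once padding is stripped.
URL_SAFE_CHARS = set(string.ascii_letters + string.digits + "-_")


def new_bearer_token(nbytes: int = 32) -> str:
    """Generate a token from `nbytes` random bytes, URL-safe base64 encoded."""
    return secrets.token_urlsafe(nbytes)


def is_path_safe(token: str) -> bool:
    """True when the token can sit in a URL path without percent-encoding."""
    return bool(token) and set(token) <= URL_SAFE_CHARS
```

With the default of 32 bytes the resulting string is 43 characters long, since the base64 padding character is dropped.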
Network controls
- kubernetes_network_policy_v1.ws_ingress: only namespaces labelled chrome-service.viktorbarzin.me/client = "true" (plus an explicit fallback for f1-stream by kubernetes.io/metadata.name) can reach TCP/3000.
- WS port 3000 is internal-only (no ingress, no Cloudflare DNS).
- HTTP port 80 (sidecar nginxinc/nginx-unprivileged:alpine) serves a static health stub at chrome.viktorbarzin.me, Authentik-gated. Lets a human confirm pod liveness without spinning up a browser.
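The namespace rule for TCP/3000 amounts to an OR of two label selectors. The sketch below models just that decision in Python for reasoning and tests; the real rule lives in the Terraform resource, and this model deliberately ignores ports and pod selectors. The function name is hypothetical.

```python
from typing import Mapping

CLIENT_LABEL = "chrome-service.viktorbarzin.me/client"
NAME_LABEL = "kubernetes.io/metadata.name"


def namespace_may_connect(labels: Mapping[str, str]) -> bool:
    """Model of the ingress rule: opted-in namespaces, or f1-stream by name."""
    opted_in = labels.get(CLIENT_LABEL) == "true"
    is_f1_stream = labels.get(NAME_LABEL) == "f1-stream"
    return opted_in or is_f1_stream
```

A namespace labelled only with its own name, such as {"kubernetes.io/metadata.name": "default"}, is rejected until it gains the client label.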
Adding a new caller
See stacks/chrome-service/README.md for the four-step recipe:
- Label the caller's namespace.
- Add an ExternalSecret pulling secret/chrome-service.
- Inject CHROME_WS_URL + CHROME_WS_TOKEN env vars.
- Vendor stealth.js and apply via await context.add_init_script(...) after every new_context().
Limits + risks
- Anti-bot vs stealth arms race: when an upstream beats us (DRM license check, device-fingerprint mismatch, hotlink protection that whitelists specific parent domains), the verifier returns is_playable=False and the extractor moves on. No user-visible breakage, just empty stream lists for that source.
- JWPlayer DRM error 102630: observed with several hmembeds embeds even from the headed chrome-service. The license check bails because the request origin isn't on the embed's allowlist; this is upstream policy, not an infra defect.
- Single replica + RWO PVC: the deployment uses Recreate strategy. Brief outage on rollout, ~30s for browser warmup.
- No /metrics endpoint: the cluster's generic KubePodCrashLooping rule covers basic alerting. A Prometheus scrape exporter is day-2 work.