homelab v0.8.0: browser verbs for headful anti-bot web automation
Some checks are pending
Build infra CLI / build (push) Waiting to run
ci/woodpecker/push/default Pipeline was successful

Add `homelab browser run|open` so agents can drive the cluster's headful
Chrome (chrome-service) over CDP from the devvm. The headless playwright/mcp
browser can load anti-bot sites and fill their forms, but the gated submit
silently fails — e.g. the Stirling Ackroyd Fixflo tenant portal returned
net::ERR_FILE_NOT_FOUND on its pre-submit check and hung, creating nothing.
Driving the real headful Chrome submits first try. That capability already
existed but was undiscoverable, so it cost ~40 min + redundant form re-runs to
find; now it is one command, versioned, test-covered, and `browser --help`
carries the when-to-use signature + an error-code cheat-sheet so the right tool
is reached at the right moment (the failure was judgment, not setup).

- port-forward svc/chrome-service:9222 (tunnels API-server->pod, so it bypasses
  the :9222 NetworkPolicy), assert non-headless via /json/version,
  connect_over_cdp, inject the same vendored stealth.js the in-cluster callers
  use; the port-forward is always torn down, on success and on error.
- node CDP client pinned to playwright-core@1.48.2 to match the v1.48.0-noble
  image (Chromium 130); self-provisioned lazily into ~/.cache/homelab, no
  per-user setup.
- default is a fresh incognito context (safe for the shared browser + concurrent
  callers); --shared-context reuses the warmed persistent profile.
- TDD: cmd_browser_test.go covers arg parsing, headless detection, the version
  pin, the help cheat-sheet, and a stealth.js drift guard. Verified end-to-end
  against bot.sannysoft.com (real Chrome UA, webdriver hidden, plugins/WebGL
  spoofed) and `browser open`.
- docs: README v0.8 section, ADR-0013, and a chrome-service.md "driving from
  outside the cluster" section.

Closes: code-nepg

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This commit is contained in:
Viktor Barzin 2026-06-22 12:22:22 +00:00
parent de163aa6af
commit a6b52a5839
10 changed files with 966 additions and 2 deletions

View file

@ -180,6 +180,42 @@ minor, with Python-side bindings pre-installed.
See `stacks/chrome-service/README.md` for the recipe (label namespace,
inject `CHROME_CDP_URL`, vendor `stealth.js`).
## Driving from OUTSIDE the cluster (`homelab browser`)
Agents on the devvm reach this browser through the **`homelab browser`** CLI
(`cli/`, ADR-0013) — the packaged, discoverable form of the ad-hoc
`connect_over_cdp` recipe. Use it when a site loads but a gated action
(submit/login) silently fails or hangs — the signature of headless / anti-bot
detection.
```text
devvm: homelab browser run flow.js
│ kubectl port-forward svc/chrome-service :9222 (random local port)
http://127.0.0.1:<port> ──► chrome-service pod :9222 (CDP)
│ assert /json/version Browser is "Chrome/…", not "HeadlessChrome"
│ node + playwright-core@1.48.2 → connectOverCDP
│ context.addInitScript(stealth.js) ← same vendored file as in-cluster
│ run the user's Playwright script with page/context/browser in scope
└─ port-forward always torn down (success or error)
```
Key facts:
- **port-forward bypasses the `:9222` NetworkPolicy.** It tunnels
API-server→pod, so the devvm needs no `chrome-service.viktorbarzin.me/client`
label — unlike in-cluster callers.
- **Client pinned to the image minor.** The node client is
`playwright-core@1.48.2` (matches `v1.48.0-noble` / Chromium 130), installed
lazily into `~/.cache/homelab/browser-client/`. Bump it in lockstep when the
server image bumps (same rule as the in-cluster Python clients — see "Image
pin" above).
- **Default context is a fresh incognito one** (closed on exit), safe for the
shared browser; `--shared-context` reuses the warmed persistent profile.
- **`stealth.js` is vendored** into the CLI (`cli/browser_stealth.js`) as a
byte-identical copy of `files/stealth.js`, guarded by a drift test — so the
CLI's stealth never diverges from the in-cluster callers'.
## Limits + risks
- **Anti-bot vs stealth arms race** — when an upstream beats us (DRM