workstation: per-user playwright browser MCP for all users, reproducible from git

Viktor asked that the playwright browser MCP be available for every devvm user
in every directory, with each user running their own server and multiple
concurrent sessions per user.

Before this, playwright was hand-set-up per user (~/.config/systemd/user/
playwright-mcp.service on 8931/8932/8933) and only wizard was actually wired —
emo's and anca's servers ran but their ~/.claude.json had no playwright entry,
so their Claude never connected. None of it was reproducible from git (units,
refresh script, and the Vault snapshot token lived only in user homes), so a
devvm rebuild would silently lose it.

This makes it reproducible and fixes the unwired users:

- roster_engine.py: sticky per-user PLAYWRIGHT_PORT (PLAYWRIGHT_BASE_PORT=8931,
  allocated for every roster user incl. the admin), emitted in the derive JSON.
- scripts/workstation/playwright/: system-level TEMPLATE units
  (playwright-mcp@.service + playwright-snapshot-refresh@.{service,timer},
  User=%i — system manager, so no systemd --user / linger) + the refresh script.
  @playwright/mcp pinned to 0.0.76 (avoids the @latest silent-fleet-roll
  footgun, same rationale as T3_PIN).
- setup-devvm.sh: install the templates + script (9e); stage the chrome-service
  snapshot bearer token from Vault to a root file (8c) — the hourly root
  reconcile has no Vault token, mirrors the Claude OAuth staging in 8a.
- t3-provision-users.sh: install_playwright() (ALL tiers incl. admin) writes
  PLAYWRIGHT_PORT, seeds the token if-absent, wires the user-scope ~/.claude.json
  by running `claude mcp add` AS the user (clobber-proof + if-absent, so it fixes
  existing/new/admin without rewriting a populated config), and enable --now's the
  instances (idempotent, never restarts a running server). Also hardened the
  section-1 *.env scan to skip the new playwright-*.env files (no T3_PORT -> grep
  no-match would abort under set -e -o pipefail).
- Docs: chrome-service-snapshot runbook (new Provisioning section + system-unit
  commands), multi-tenancy.md, and the 2026-06-07 plan Task 2.3.

Supersedes the hand-made per-user --user units (one-time idle-gated migration to
follow on the live host).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This commit is contained in:
Viktor Barzin 2026-06-16 20:33:47 +00:00
parent eb47eb1d10
commit 0a6ed4b2fe
11 changed files with 373 additions and 29 deletions

View file

@ -307,18 +307,76 @@ install_user_claude_native() {
fi
}
# Per-user playwright-mcp browser MCP — ALL tiers incl. admin (every user's Claude
# sessions connect to their OWN isolated server; a user's concurrent sessions are
# kept apart by the unit's --isolated). Idempotent + if-absent, so a routine
# reconcile never disturbs a live user: (1) seed the chrome-service snapshot token
# if the user has none; (2) wire the user-scope `playwright` MCP entry by running
# `claude mcp add` AS the user (writes THEIR ~/.claude.json, never reads another's;
# the CLI merges one key and REFUSES to clobber an existing one, so it's safe on a
# populated config), guarded by `claude mcp get`; (3) `enable --now` the system
# template instances (idempotent — does NOT restart an already-running server).
# Needs PLAYWRIGHT_PORT already in the per-user playwright env (written by the
# section-5c loop) + the token staged by setup-devvm.sh (section 8c).
install_playwright() {
local user="$1" home port token_staged=/etc/t3-serve/chrome-service-token
home="$(getent passwd "$user" | cut -d: -f6)"
[[ -n "$home" && -d "$home" ]] || return 0
port="$(grep -oE 'PLAYWRIGHT_PORT=[0-9]+' "$ENVDIR/playwright-$user.env" 2>/dev/null | cut -d= -f2 || true)"
[[ -n "$port" ]] || { log "WARN: no PLAYWRIGHT_PORT for $user -> skip playwright"; return 0; }
# (1) chrome-service snapshot token, if-absent (0600, owned by the user)
if [[ ! -f "$home/.config/playwright/token" && -r "$token_staged" ]]; then
if [[ "$DRY_RUN" == 1 ]]; then echo "[dry-run] seed playwright token -> $user"; else
install -d -o "$user" -g "$user" -m 0700 "$home/.config/playwright"
install -o "$user" -g "$user" -m 0600 "$token_staged" "$home/.config/playwright/token"
log "seeded playwright snapshot token -> $user"
fi
fi
# (2) wire user-scope ~/.claude.json (AS the user, login shell so the native
# ~/.local/bin/claude is on PATH; clobber-proof + if-absent via `mcp get`)
if [[ "$DRY_RUN" == 1 ]]; then
echo "[dry-run] wire playwright MCP (:$port) if-absent -> $user"
elif runuser -u "$user" -- bash -lc 'command -v claude >/dev/null 2>&1'; then
if ! runuser -u "$user" -- bash -lc 'claude mcp get playwright >/dev/null 2>&1'; then
runuser -u "$user" -- bash -lc "claude mcp add --scope user --transport http playwright 'http://localhost:$port/mcp' >/dev/null 2>&1" \
&& log "wired playwright MCP (user scope, :$port) -> $user" \
|| log "WARN: claude mcp add playwright failed for $user (retries next run)"
fi
else
log "WARN: claude not found for $user -> playwright MCP not wired (retries next run)"
fi
# (3) enable the system template instances. `enable --now` is idempotent and
# does NOT restart a running unit, so a live user is undisturbed.
run systemctl enable --now "playwright-mcp@$user.service" >/dev/null 2>&1 || true
run systemctl enable --now "playwright-snapshot-refresh@$user.timer" >/dev/null 2>&1 || true
}
[[ $EUID -eq 0 ]] || { echo "t3-provision-users: must run as root" >&2; exit 1; }
for bin in python3 jq; do command -v "$bin" >/dev/null || { echo "missing $bin" >&2; exit 1; }; done
[[ -f "$ROSTER" && -f "$ENGINE" ]] || { echo "roster/engine not under $WORKSTATION_DIR" >&2; exit 1; }
install -d -m 0755 "$ENVDIR"
# 1) current sticky ports from existing .env files -> {os_user: port}
ports_file="$(mktemp)"; trap 'rm -f "$ports_file" "${desired_file:-}"' EXIT
ports_file="$(mktemp)"; pw_ports_file="$(mktemp)"
trap 'rm -f "$ports_file" "$pw_ports_file" "${desired_file:-}"' EXIT
{ echo "{}"; for f in "$ENVDIR"/*.env; do
[[ -e "$f" ]] || continue
u="$(basename "$f" .env)"; p="$(grep -oE 'T3_PORT=[0-9]+' "$f" | cut -d= -f2)"
case "$(basename "$f")" in playwright-*) continue;; esac # not a t3-serve env (handled below)
# `|| true`: grep returns non-zero on no-match, which would abort under `set -e -o pipefail`.
u="$(basename "$f" .env)"; p="$(grep -oE 'T3_PORT=[0-9]+' "$f" | cut -d= -f2 || true)"
[[ -n "$p" ]] && jq -n --arg u "$u" --argjson p "$p" '{($u): $p}'
done; } | jq -s 'add' > "$ports_file"
# sticky PLAYWRIGHT ports from playwright-<os_user>.env (skipped by the loop above).
# Seeds roster_engine so the live per-user assignments stick across reconciles.
{ echo "{}"; for f in "$ENVDIR"/playwright-*.env; do
[[ -e "$f" ]] || continue
u="$(basename "$f" .env)"; u="${u#playwright-}"
p="$(grep -oE 'PLAYWRIGHT_PORT=[0-9]+' "$f" | cut -d= -f2 || true)"
[[ -n "$p" ]] && jq -n --arg u "$u" --argjson p "$p" '{($u): $p}'
done; } | jq -s 'add' > "$pw_ports_file"
# 2) tier validation vs live k8s_users (best-effort; aborts only on a real conflict)
if command -v vault >/dev/null; then
@ -336,7 +394,7 @@ fi
# 3) derive desired state
desired_file="$(mktemp)"
python3 "$ENGINE" derive --roster "$ROSTER" --ports-json "$ports_file" > "$desired_file"
python3 "$ENGINE" derive --roster "$ROSTER" --ports-json "$ports_file" --playwright-ports-json "$pw_ports_file" > "$desired_file"
jq -e . "$desired_file" >/dev/null || { echo "[t3-provision] derive produced invalid JSON" >&2; exit 1; }
# 3b) machine-wide Claude managed config (repo -> /etc; per-user codex mirrors in the loop below)
@ -396,6 +454,16 @@ while IFS=$'\t' read -r os_user port; do
id "$os_user" >/dev/null 2>&1 && run systemctl enable --now "t3-serve@$os_user.service" >/dev/null 2>&1 || true
done < <(jq -r '.ports | to_entries[] | [.key, .value] | @tsv' "$desired_file")
# 5c) per-user playwright-mcp (ALL tiers incl. admin): write the sticky
# PLAYWRIGHT_PORT to the per-user playwright env, then seed token + wire
# ~/.claude.json + enable the system template instances. if-absent /
# idempotent — never disturbs a live user's running server or existing config.
while IFS=$'\t' read -r os_user pw_port; do
id "$os_user" >/dev/null 2>&1 || continue
env_set "$ENVDIR/playwright-$os_user.env" PLAYWRIGHT_PORT "$pw_port"
install_playwright "$os_user"
done < <(jq -r '.playwright_ports | to_entries[] | [.key, .value] | @tsv' "$desired_file")
# 5b) machine-wide (once, not per-user): keep the t3 pinned-version ENFORCER enabled (it
# re-asserts T3_PIN daily; a no-op when already correct). NOT --now: with Persistent=true
# a `--now` enable fires the missed daily job IMMEDIATELY, which on 2026-06-09 pulled a