A heavy user (emo) runs 8+ always-on `claude` agents plus their t3-serve instance, all sharing one ~/.claude/.credentials.json. When the shared access token expires, the processes refresh simultaneously; OAuth refresh-token rotation makes the losing writer persist an EMPTY refresh token, logging the user out roughly every access-token lifetime (~8h). Re-issuing the credential never sticks because the race recurs (this is why emo's "standalone token" fix kept regressing).

Fix: an opt-in, per-user, non-rotating setup-token (sk-ant-oat01, ~1y, scope user:inference) kept in the user's OWN Vault path (field `setup_token`). claude-auth-sync materializes it to a user-owned ~/.config/claude-auth-sync/claude-oauth.env and, while it is present, SKIPS the rotating-credential validate/backup/restore (so no false WorkstationClaudeAuthInvalid). start-claude.sh and t3-serve@.service load it as CLAUDE_CODE_OAUTH_TOKEN, so every session of that user uses the non-rotating token and there is nothing to race on.

Fail-safe + opt-in: with no `setup_token` in Vault, every path is a no-op, so users on the normal per-user Enterprise-SSO flow are unaffected. This is each user's OWN identity, never the forbidden shared CLAUDE_CODE_OAUTH_TOKEN. Runbook documents enable/disable/rotate.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
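The opt-in loading step described in the commit message can be sketched roughly as follows. This is a minimal illustration only, not the actual start-claude.sh, which is not shown here: the function name `load_claude_token` is hypothetical, and the sketch assumes the env file holds a single `CLAUDE_CODE_OAUTH_TOKEN=...` line in plain KEY=value form.

```shell
# Hypothetical helper (assumption, not taken from start-claude.sh).
# $1 = path to the env file; defaults to the path named in the commit message.
load_claude_token() {
    env_file="${1:-$HOME/.config/claude-auth-sync/claude-oauth.env}"

    # Fail-safe: users without a materialized token file are left untouched.
    [ -r "$env_file" ] || return 0

    # Export every variable assigned while sourcing the file, then
    # switch auto-export back off.
    set -a
    . "$env_file"
    set +a
}
```

Because the function returns early when the file is missing, calling it unconditionally should be harmless for users on the normal credential flow.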
37 lines
1.5 KiB
Desktop File
[Unit]
Description=T3 Code server for %i (t3 serve, per-user)
Documentation=https://github.com/pingdotgg/t3code
After=network.target

[Service]
Type=simple
User=%i
Group=%i
Environment=HOME=/home/%i
Environment=PATH=/usr/local/bin:/usr/bin:/bin:/home/%i/.local/bin
Environment=NODE_ENV=production
EnvironmentFile=/etc/t3-serve/%i.env
# Optional per-user long-lived CLAUDE_CODE_OAUTH_TOKEN, materialized by
# claude-auth-sync from the user's own Vault path. Non-rotating, so t3's
# concurrent agent sessions can't race on OAuth refresh-token rotation and wipe
# the shared ~/.claude/.credentials.json. Leading '-' = optional (absent for
# users on the normal per-user Enterprise-SSO credential flow).
EnvironmentFile=-/home/%i/.config/claude-auth-sync/claude-oauth.env
WorkingDirectory=/home/%i
ExecStart=/usr/bin/t3 serve --host 0.0.0.0 --port ${T3_PORT} --base-dir /home/%i/.t3
Restart=on-failure
RestartSec=5
# Memory containment (2026-06-10): agent children live in this cgroup; a
# runaway agent (10.8G anon on a 23G host) swap-thrashed the whole devvm
# (every >20s stall fires the t3 client watchdog, visible as "disconnects"),
# then global-OOMed. Cap the cgroup so a runaway OOMs early and locally,
# and forbid swap so stalls can't smear into minutes-long freezes.
MemoryHigh=12G
MemoryMax=16G
MemorySwapMax=0
# Default OOMPolicy=stop kills the WHOLE unit (8.5min outage 2026-06-10
# 19:56) when ANY child is OOM-killed; continue = runaway dies, server stays.
OOMPolicy=continue

[Install]
WantedBy=multi-user.target
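
The optional `EnvironmentFile=-/home/%i/.config/claude-auth-sync/claude-oauth.env` line in the unit expects plain KEY=value lines. A rough sketch of how the file might be materialized is shown below. This is not the real claude-auth-sync: the function name `materialize_setup_token`, the 0600-style permissions, and the write-then-rename step are all assumptions, and the Vault lookup is left out (the token is simply passed in as an argument).

```shell
# Hypothetical helper (assumption, not taken from claude-auth-sync).
# $1 = setup token read from the user's Vault path (may be empty)
# $2 = destination env file; defaults to the path used by the unit
materialize_setup_token() {
    token="$1"
    dest="${2:-$HOME/.config/claude-auth-sync/claude-oauth.env}"

    # Opt-in: with no setup_token, do nothing at all.
    [ -n "$token" ] || return 0

    mkdir -p "$(dirname "$dest")"

    # Run in a subshell so the restrictive umask does not leak to the caller.
    (
        umask 077
        tmp="$dest.tmp.$$"
        # Write to a temporary file, then rename, so readers never see a
        # half-written token (assumed design choice).
        printf 'CLAUDE_CODE_OAUTH_TOKEN=%s\n' "$token" > "$tmp"
        mv -f "$tmp" "$dest"
    )
}
```

With an empty token the function returns before creating anything, which mirrors the "every path is a no-op" behavior described in the commit message.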