Viktor hit "~/.local/bin is not part of the PATH". Root cause: the native claude
binary lives in ~/.local/bin, but the terminal launcher (start-claude.sh) runs in
tmux's NON-login bash env, which doesn't source the user's shell rc where the native
installer put ~/.local/bin on PATH. So `command -v claude` failed there → the
launcher's bootstrap re-ran the native installer → the installer printed the PATH
warning. (Interactive zsh already had ~/.local/bin via the per-user installer rc edit,
and t3-serve sets PATH in its unit — so only the terminal launcher was affected.)
- skel/start-claude.sh: prepend ~/.local/bin to PATH near the top (guarded/idempotent),
before the launch logic — so `claude` is found, no reinstall, no warning.
- setup-devvm.sh: install /etc/profile.d/10-local-bin.sh — adds ~/.local/bin to PATH for
all LOGIN shells machine-wide (SSH etc.), independent of the per-user installer rc edit
(fresh-user-safe). zsh login picks it up via /etc/zsh/zprofile -> /etc/profile.
- docs/architecture/multi-tenancy.md: documented the three PATH-injection points.
Verified: guard adds-when-missing / no-dup-when-present; all scripts pass bash -n.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Question from Viktor: should claude run via the binary or npx? Answer: the native
install is the recommended runtime (self-contained, self-updating ~/.local/bin/claude;
installMethod=native) — and every existing user had already auto-migrated to it, leaving
the npm-global copy empty and the npx fallback dead. "Leave only the recommended setup":
- setup-devvm.sh: node is now installed ONLY for the t3 CLI; dropped the machine-wide
`npm install -g @anthropic-ai/claude-code` (npm/npx is not the recommended runtime and
just shadowed the per-user native installs).
- t3-provision-users.sh: new per-user `install_user_claude_native` (runs the official
https://claude.ai/install.sh AS the user, idempotent/skip-if-present) — provisions native
claude for BOTH the terminal launcher and each t3-serve instance, replacing the npm bootstrap.
- skel/start-claude.sh: launcher runs the native `claude` only; if missing it bootstraps via
the native installer (was an `npx @anthropic-ai/claude-code` fallback).
- docs/architecture/multi-tenancy.md: documented the native-only runtime model.
node stays (the pinned t3 CLI is npm-global). Verified: native installer reachable +
produces ~/.local/bin/claude 2.1.177; all three scripts pass bash -n + shellcheck.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
emo reported being "logged out" on terminal.viktorbarzin.me: every new shell
dropped him at the first-run "Choose the text style" wizard, even though he'd
used many sessions and is in fact fully authenticated. Root cause is NOT a
logout — ~/.claude.json is a single file that all of a user's concurrent claude
processes (the ttyd terminal + their t3-serve instance + agent sessions)
read-modify-write, and a stale writer periodically drops top-level keys,
including hasCompletedOnboarding. That bounces the next interactive session back
to onboarding; credentials are safe in the separate ~/.claude/.credentials.json
(which is why T3 kept working). wizard's own ~/.claude.json showed the same key
loss, so this hits any heavy multi-session user.
Fix:
- skel/start-claude.sh: ensure_onboarding() idempotently re-asserts
hasCompletedOnboarding (+ lastOnboardingVersion) in ~/.claude.json right before
launching claude. Merge-only (never clobbers other keys), runs as the user, and
no-ops if jq is missing or the file is empty/corrupt. So even if the race drops
the flag, the next launch restores it before claude reads it.
- t3-provision-users.sh: deploy_user_launcher() re-copies skel/start-claude.sh
into every non-admin home (copy-if-changed) on the hourly reconcile. /etc/skel
only seeds the launcher at account creation, so without this the fix (and any
future launcher edit) would never reach existing users. .tmux.conf is
deliberately not re-copied — terminal-lobby appends a managed section to it.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Viktor asked to log Anca into her GitHub account so she can develop on
the devvm and deploy her apps through the existing CI/CD. Her GitHub
repos (Plotting-Your-Dream-Book, travel, My-Wardrobe — now cloned into
her ~/code workspace) default to main, and the launcher freshen only
fast-forwarded master, silently skipping them. ff the current branch's
upstream instead — same safety gates (on a branch, clean tree, upstream
configured, ff-only). Single-layout infra clones behave identically.
[ci skip]
Viktor asked to restructure Anca's setup: her ~/code WAS the infra clone
itself; he wants ~/code to be the directory where all her project repos
(tripit etc.) live side by side, with infra moved to a subdirectory.
- roster.yaml gains per-user 'code_layout: single|workspace' + 'repos',
validated + derived by roster_engine.py (12 new tests, 40 total).
- t3-provision-users reconcile: auto-migrates a single-layout ~/code to
~/code/infra (running processes follow the moved inode), hoists nested
project clones to the workspace root, clones roster repos from Forgejo
AS the user (their PAT makes private repos work), and wires the
documented forgejo remote + forgejo/master upstream into clones that
predate that contract.
- Fixed a latent TSV bug: empty jq @tsv fields collapse under tab-IFS
read, shifting later fields left (groups was only safe by being the
last field) — emit '-' sentinels instead.
- start-claude.sh session freshen is layout-aware (freshens each repo
under ~/code for workspace users).
- managed claudeMd + AGENTS.md non-admin recipe + multi-tenancy.md
updated in the same change.
Applied live: ancamilea = workspace (infra at ~/code/infra, her existing
tripit clone hoisted to ~/code/tripit, master upstream switched to
forgejo/master); emo stays single layout, untouched. [ci skip]
Non-admins (emo) need current master without manual pulls. Two layers:
- t3-provision-users reconcile gains refresh_locked_clone: fetch all
remotes + ff-only master, guarded (on master, clean tree, upstream
set); dirty/diverged clones are left alone with a WARN.
- start-claude.sh freshens ~/code at session launch, 15s-capped so an
offline remote never delays the session.
Verified live on emo's clone: stale clone ff'd to tip by the
reconciler; launcher snippet ff's when clean and refuses while a
dirty file exists. Deployed to /usr/local/bin/t3-provision-users,
/etc/skel/start-claude.sh, and emo's launcher.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The per-user launcher hardcoded --model claude-opus-4-8; an explicit --model flag overrides the managed default in /etc/claude-code/managed-settings.json (claude-fable-5). Dropping it lets emo and all new accounts inherit the org default (per-session /model still works). Deployed to /etc/skel and emo live copy in the same change.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
6d224861 came from a --no-checkout worktree whose empty index made the
commit drop every file except two. This restores 05b50d2b's full tree and
correctly adds stacks/stem95su/gdrive-sync.tf + the service-catalog stem95su
entry. Forward-only (parent=6d224861, no force-push); [ci skip] since the
live infra was never applied from the broken commit.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
CronJob stem95su-gdrive-sync (*/10) mounts the content PVC RW and
rclone-syncs the read-only Drive folder "claude" (stem claude/files) onto
it (rclone/rclone:1.74.3, scope=drive.readonly, empty-source guard +
--max-delete 25). ESO ExternalSecret stem95su-rclone <- Vault
secret/stem95su. Requires the GCP OAuth app published to Production or the
refresh token expires ~weekly.
Lands the gdrive-sync stack on master (it had landed on a feature branch
by accident on the shared devvm checkout).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>