homelab: v0.6.0 — usage telemetry (usage top), evidence-driven verb prioritization
Answers the question that drove the whole CLI (which verbs to add next) with
data instead of one maintainer's habits, and resolves the cross-user usage ask
within policy: no reading anyone's home directory.
- emit on dispatch: every verb fire-and-forgets one Loki line {job,user,verb} +
"exit=N ver=X". ONLY the verb path + exit code — never args, paths, flags, or
secrets (the emit never sees arguments). Best-effort: 800ms timeout, errors
swallowed, never affects the command; opt-out HOMELAB_TELEMETRY=0. Discovery
verbs (manifest/version/help) and usage itself don't self-record.
- usage top [--since 30d] [--user U] [--json]: ranks verbs via
sum by (verb)(count_over_time({job="homelab-usage"}[…])) against the shared
Loki. Cross-user analytics WITHOUT touching ~/.claude — the privacy-preserving
answer to "what does the team use".
- Loki sink (zero new infra, dogfoods the v0.5 logs path); push verified: HTTP
204, no auth. ADR: docs/adr/0011.
Live-verified: ran 4 verbs, usage top ranked them correctly (metrics query=2).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
parent 666fefd22b
commit 3e3fdb34f0
9 changed files with 215 additions and 4 deletions
docs/adr/0011-homelab-usage-telemetry.md (Normal file, 34 lines added)
@@ -0,0 +1,34 @@
# homelab usage telemetry: evidence-driven verb prioritization, privacy by construction

v0.6 adds `usage top` plus a fire-and-forget emit on every dispatched verb. It
exists to answer the question that drove the whole CLI, *which verbs are worth
adding next*, with data instead of one maintainer's habits (the earlier mining
covered a single user's ~51k commands, so the surface is shaped to that user).

## Decisions

- **Emit on dispatch, in `dispatch()`.** The longest-prefix match already knows
  the verb path; after `Run` returns we emit `{verb, exit}`. Discovery verbs
  don't go through `dispatch()` (`manifest`/`version`/`help` are handled in
  `dispatchTop`), so they don't self-record; `usage *` is skipped explicitly so
  the analytics reader doesn't pollute its own data.
- **Payload is deliberately minimal: verb path + exit code only.** Labels
  `{job=homelab-usage, user, verb}` (all low-cardinality) + line `exit=N ver=X`.
  **No args, paths, flags, hostnames, or secrets** ever leave the process: the
  emit sees only the matched verb name, not the arguments. This is what makes
  cross-user aggregation safe.
- **Shared Loki sink → cross-user analytics WITHOUT reading homes.** Each user's
  CLI writes its own invocations (attributed to its OS user) to the shared Loki
  push API via the Traefik LB (verified: HTTP 204, no auth). `usage top` reads
  back with a LogQL metric query. This is the privacy-preserving resolution to
  "what does everyone (e.g. another user) use": it never touches anyone's
  `~/.claude`, which the org per-user policy bars (see the per-user red-line in
  managed-settings: reading another user's home is off-limits even for an owner
  in-session; a fresh session under changed MDM policy is the only legitimate
  path, and even then this telemetry is the better answer).
- **Best-effort, never affects the command.** All errors swallowed; an 800ms
  client timeout bounds the cost; opt-out via `HOMELAB_TELEMETRY=0`. Telemetry
  must never slow or break the tool it measures.
- **Loki, not a new datastore.** Zero new infra, and it dogfoods the v0.5 `logs`
  path (same host, same LB dial). Presence MySQL was the alternative (queryable
  SQL) but would add a write dependency and creds; Loki needs neither.
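
The decisions above can be sketched roughly as follows. This is an illustrative Python sketch, not the CLI's actual code (the commit does not show the implementation or its language). The function names `should_record`, `build_push_body`, `usage_top_query`, and `emit` are hypothetical, the verb path is assumed to be a space-separated string, and the push URL is left to the caller because the real endpoint is not given here. The JSON body follows Loki's documented push format (`streams` / `stream` / `values`), and the query mirrors the one quoted in the commit message.

```python
import json
import os
import time
import urllib.request

JOB = "homelab-usage"  # job label named in the ADR


def should_record(verb_path, env=None):
    """Return True if this invocation should be emitted (hypothetical helper).

    Honors the HOMELAB_TELEMETRY=0 opt-out and skips `usage *` so the
    analytics reader does not record itself.
    """
    env = os.environ if env is None else env
    if env.get("HOMELAB_TELEMETRY") == "0":
        return False
    parts = verb_path.split()
    return bool(parts) and parts[0] != "usage"


def build_push_body(user, verb, exit_code, version, now_ns=None):
    """Build a Loki push body holding only verb path, exit code, and version."""
    ts = time.time_ns() if now_ns is None else now_ns
    return {
        "streams": [
            {
                "stream": {"job": JOB, "user": user, "verb": verb},
                # Loki expects [<unix nanoseconds as string>, <log line>].
                "values": [[str(ts), f"exit={exit_code} ver={version}"]],
            }
        ]
    }


def usage_top_query(since="30d", user=None):
    """Build the LogQL metric query that ranks verbs by invocation count."""
    selector = f'job="{JOB}"'
    if user is not None:
        # json.dumps yields a double-quoted, escaped string literal.
        selector += f", user={json.dumps(user)}"
    return f"sum by (verb)(count_over_time({{{selector}}}[{since}]))"


def emit(push_url, body, timeout_s=0.8):
    """Best-effort POST: bounded by a timeout, every error swallowed."""
    try:
        req = urllib.request.Request(
            push_url,
            data=json.dumps(body).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(req, timeout=timeout_s):
            pass
    except Exception:
        # Telemetry must never slow or break the command it measures.
        pass
```

Note that `emit` here is a synchronous call bounded by the 800ms timeout; whether the real CLI sends in the background is not stated in the source. Because `build_push_body` only ever receives the matched verb path, exit code, and version, arguments and flags have no way into the payload.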