Commit graph

43 commits

Author SHA1 Message Date
Viktor Barzin
3e3fdb34f0 homelab: v0.6.0 — usage telemetry (usage top), evidence-driven verb prioritization
Some checks are pending
Build infra CLI / build (push) Waiting to run
ci/woodpecker/push/default Pipeline was successful
Answers the question that drove the whole CLI — which verbs to add next — with
data instead of one maintainer's habits, and resolves the cross-user-usage ask
in-bounds (no reading anyone's home).

- emit on dispatch: every verb fire-and-forgets one Loki line {job,user,verb} +
  "exit=N ver=X". ONLY the verb path + exit code — never args, paths, flags, or
  secrets (the emit never sees arguments). Best-effort: 800ms timeout, errors
  swallowed, never affects the command; opt-out HOMELAB_TELEMETRY=0. Discovery
  verbs (manifest/version/help) and usage itself don't self-record.
- usage top [--since 30d] [--user U] [--json]: ranks verbs via
  sum by (verb)(count_over_time({job="homelab-usage"}[…])) against the shared
  Loki. Cross-user analytics WITHOUT touching ~/.claude — the privacy-preserving
  answer to "what does the team use".
- Loki sink (zero new infra, dogfoods v0.5 logs path); push verified HTTP 204 no
  auth. ADR docs/adr/0011.

Live-verified: ran 4 verbs, usage top ranked them correctly (metrics query=2).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 22:29:01 +00:00
Viktor Barzin
e91e1612dd homelab: v0.5.0 — net/dns/metrics/logs probes (endpoint resolution)
The remaining verbs that pass the "saves reasoning, not just typing" test the
user posed mid-session: each encodes the non-obvious which-endpoint-reached-how
resolution otherwise re-derived every time. (Same test deprioritized node-ssh
and secret-get aliasing — thin wrappers over commands already known.)

- net check <host> [path]: two-legged reachability — external (public DNS→CF)
  vs internal (Traefik LB) — so you see WHERE a break is, not just that one path
  works. (live: surfaced the LB at 6ms vs CF 77ms.)
- dns lookup <name> [type]: Technitium (10.0.20.201) vs public (1.1.1.1) diff.
- metrics query "<promql>" / metrics alerts: Prometheus via the LB
  (prometheus-query.viktorbarzin.lan); alerts uses the synthetic ALERTS series
  since the query frontend has no /api/v1/alerts and Alertmanager has no ingress.
- logs query "<logql>" [--since 1h] [--limit N]: Loki range query via the LB.

All reach auth-free internal ingresses through the LB (Go form of
curl --resolve host:443:10.0.20.203) — no port-forward, no kubectl. In-cluster-
only endpoints (Alertmanager v2) deliberately out of scope. Verified live before
building; all five smoke-tested green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 11:27:31 +00:00
Viktor Barzin
9189560ac3 homelab: v0.4.0 — ci/deploy verbs (watch what you trigger)
Some checks are pending
Build infra CLI / build (push) Waiting to run
ci/woodpecker/push/default Pipeline was successful
Adds the verb-group that kills the single biggest reasoning sink in agent
sessions — watching a build/deploy to completion (proven the session that built
it: hours hand-rolling Woodpecker polling + DB-schema spelunking for one CI
incident).

- ci status/watch: Woodpecker REST API (version-stable, not its DB schema),
  reached via the internal Traefik LB (dial 10.0.20.203, SNI=ci.viktorbarzin.me
  so the cert verifies — the Go form of the house `curl --resolve` pattern),
  token from WOODPECKER_TOKEN/Vault, repo id resolved from the cwd remote, with
  retries that ride Woodpecker's intermittent empty responses. watch matches the
  HEAD/given commit (avoids the post-push race) and exits non-zero on failure.
- deploy wait: image-sha match THEN rollout status (rollout status alone returns
  success on the old ReplicaSet); kubectl-based.
- work land now auto-watches CI to green on the landed commit (--no-ci-watch to
  skip), closing the v0.1 gap.
- ci logs deferred to v0.4.1 (Woodpecker detail/log endpoints were the least
  reliable; status/watch use the working list endpoint).

Live-verified ci status/watch against the live API.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 10:59:14 +00:00
Viktor Barzin
787ce4edfa homelab: v0.3.1 — fix k8s db PG target (resolve CNPG primary pod, not the Service)
Some checks are pending
Build infra CLI / build (push) Waiting to run
ci/woodpecker/push/default Pipeline was successful
`k8s db <app>` (Postgres path) execed `pg-cluster-rw`, which is the CNPG
read-write SERVICE, not a pod — so kubectl exec failed with
`pods "pg-cluster-rw" not found`. The unit test only checked the plan; the verb
was never fired at live state (the gap flagged in v0.2), so it shipped broken.

Fix: the PG plan now carries a label selector (cnpg.io/instanceRole=primary)
instead of a pod name, and k8s db resolves the actual primary POD via
`kubectl get pod -l <selector>` before exec. MySQL path (real pod
mysql-standalone-0) unchanged. Live-verified both paths (psql + mysql).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 09:09:34 +00:00
Viktor Barzin
48b63ffa6f homelab: add memory verb-group (v0.3.0) — direct claude-memory HTTP client
Some checks failed
Build infra CLI / build (push) Waiting to run
ci/woodpecker/push/default Pipeline failed
Lets agents search/navigate memory via the CLI, as the first step toward
deprecating the memory MCP. claude-memory is a FastAPI service (the MCP is just
one frontend); homelab memory is a thin Bearer-auth HTTP client over the same
API, using the env the hooks already set (CLAUDE_MEMORY_API_URL/KEY). It works
even when the MCP frontend is down — the recurring disconnect that took the MCP
offline for this whole session.
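
As a rough illustration of a "thin Bearer-auth HTTP client", the sketch below builds an authenticated request from the two environment variables the message names. The `/recall` path and the `q` query parameter are hypothetical; the commit does not list the service's routes.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"os"
	"strings"
)

// configFromEnv reads the variables the hooks already set.
func configFromEnv() (baseURL, apiKey string) {
	return os.Getenv("CLAUDE_MEMORY_API_URL"), os.Getenv("CLAUDE_MEMORY_API_KEY")
}

// newRecallRequest builds an authenticated GET. The "/recall" path and the
// "q" parameter are placeholders chosen for this example.
func newRecallRequest(baseURL, apiKey, query string) (*http.Request, error) {
	u, err := url.Parse(strings.TrimRight(baseURL, "/") + "/recall")
	if err != nil {
		return nil, err
	}
	q := u.Query()
	q.Set("q", query)
	u.RawQuery = q.Encode()

	req, err := http.NewRequest(http.MethodGet, u.String(), nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+apiKey)
	return req, nil
}

func main() {
	// Demo with fixed values; a real caller would use configFromEnv().
	req, err := newRecallRequest("http://memory.example.invalid/", "test-key", "pg primary")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
}
```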

Verbs: recall (server-side semantic search), list, categories, tags, stats,
secret (read); store, update, delete (write). Validated against the live API
including a store→recall→delete round-trip — full data-plane parity with the MCP.

The deprecation itself (rewiring the per-prompt auto-recall + auto-learn hooks to
the CLI, then uninstalling the MCP) is a deliberate follow-up, sequenced after
the CLI is proven in the hooks — see docs/adr/0008.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 05:56:25 +00:00
Viktor Barzin
3594485f77 homelab: v0.2.0 — docs + version for the k8s verb-group
Some checks are pending
Build infra CLI / build (push) Waiting to run
ci/woodpecker/push/default Pipeline was successful
Bump cli/VERSION to v0.2.0; document the k8s verbs (README table + resolver
note), add docs/adr/0007 (resolver, read/write split, config-mutation stays
raw, db dbaas pattern), and extend the AGENTS.md discovery pointer with the
Kubernetes surface.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-18 22:30:41 +00:00
Viktor Barzin
1f7438bb18 homelab: add k8s verb-group (v0.2) — the biggest remaining surface
Mining the post-v0.1 corpus showed kubectl is the dominant remaining domain by
far: 11,291 commands across 243 sessions (more than everything else combined).
This adds the full k8s verb-group built on an app→namespace→pod resolver (most
namespaces hold one app, so <app> defaults to the namespace and the target
defaults to deploy/<app>, letting kubectl resolve the pod; -n/--pod/-c/-l/--tty
override).

Read: status (pods + non-Normal events), get, logs, describe, debug (one-shot
triage), pf, rollout-status. Write/operational: db (the dbaas psql/mysql exec
pattern — PG via pg-cluster-rw -c postgres, MySQL via mysql-standalone-0 with the
env-password bash wrapper, never inline), exec, rm-pod (pods/jobs ONLY), restart.
Config-mutation verbs (apply/edit/patch/scale/create) are deliberately NOT
exposed — they stay raw per the Terraform-only policy.

Smoke-verified read verbs against the live cluster (get/logs/rollout-status);
write verbs are unit-tested (resolver, db-plan, shell-quoting) but not fired at
live state.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-18 22:29:51 +00:00
Viktor Barzin
66caa0bf7f homelab: v0.1 docs, distribution wiring, and version
Some checks are pending
Build infra CLI / build (push) Waiting to run
ci/woodpecker/push/default Pipeline was successful
Completes v0.1: documentation, build/install path, and version stamping.

- cli/VERSION (v0.1.0) stamped into the binary via ldflags.
- cli/README.md rewritten as the homelab overview (verbs + tiers, manifest,
  build, the preserved legacy webhook use-cases).
- docs/adr/0004-0006: why homelab exists (grown in place from infra/cli, not a
  separate repo), v0.1 scope + everything-allowed/tiers-recorded, and the
  work/tf behaviour (native worktree entry, verification-gated auto-land,
  presence-coupled apply).
- setup-devvm.sh builds cli/ -> /usr/local/bin/homelab each provisioning run
  (t3-dispatch pattern), so every devvm user gets the current binary.
- AGENTS.md: discovery pointer under Common Operations.
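
For readers unfamiliar with ldflags stamping, a minimal sketch follows. The variable name `version` and the exact build command are assumptions; only the use of `cli/VERSION` and ldflags comes from the commit.

```go
package main

import "fmt"

// version is a package-level string so the linker can overwrite it, e.g.:
//
//	go build -ldflags "-X main.version=$(cat cli/VERSION)" -o homelab ./cli
//
// Without the flag it keeps the fallback below.
var version = "dev"

// versionString formats the value the way a `version` verb might print it.
func versionString() string {
	return "homelab " + version
}

func main() {
	fmt.Println(versionString())
}
```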

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-18 19:25:51 +00:00
Viktor Barzin
087b415f73 homelab: add work verbs (start/land/clean) with a land verification gate
Completes the infra-loop verb surface. work start creates .worktrees/<topic>
on <user>/<topic> off <remote>/master (git-crypt-aware, ensures .worktrees is
ignored) and prints the path for native EnterWorktree entry. work land fetches,
merges master in, verifies, pushes HEAD:master with non-fast-forward retry, and
falls back to pushing the feature branch for a PR when the direct push is
rejected (branch protection). work clean removes the worktree + branch.

Safety: work land REFUSES to push when it cannot verify (no --verify-cmd and no
auto-detected suite) unless --no-verify is passed. This was added after an
accidental smoke-test invocation pushed unverified WIP to master (benign — the
infra CI applied 0 stacks since the diff was cli/-only — but the gate makes an
unverified land a deliberate choice, not the default).

Known v0.1 limitation: land does not yet block on CI to green; that arrives with
the ci/deploy watch verbs. It prints a reminder to follow the pipeline manually.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-18 19:24:08 +00:00
Viktor Barzin
36d562c15c homelab: add tf verbs + stack/git-crypt substrate
Some checks are pending
Build infra CLI / build (push) Waiting to run
ci/woodpecker/push/default Pipeline was successful
Adds the tf verb-group and the resolver substrate beneath it, continuing the
v0.1 infra-loop build.

- substrate: findInfraRoot (walk up to terragrunt.hcl + stacks/), stack→dir
  resolver, and repo/remote/git-crypt detection (preferRemote forgejo>origin,
  hasGitCryptAttr, gitCryptFlags) — the last is for `work` next.
- tf plan/validate/fmt/force-unlock/apply, resolving the stack from cwd and
  delegating to scripts/tg (which owns state decrypt/encrypt, the Vault lock,
  and the ingress auth-comment check) rather than calling terragrunt directly.
- tf apply is presence-coupled: claims stack:<name>, ALWAYS releases on exit
  (normal, error, or SIGINT/SIGTERM via sync.Once + signal handler) — fixing
  the documented ~200-claim leak — and prints an out-of-band reminder since CI
  applies canonically on push.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-18 19:16:33 +00:00
Viktor Barzin
ed6f22fd53 homelab: scaffold unified CLI (registry, manifest, claim/release) in infra/cli
Begin evolving the existing infra/cli into the agent-facing "homelab" CLI
decided in the design/grilling session: one composable, JSON-capable surface
for the operations agents run over and over (mined from 51k commands across
2,225 past sessions; the infra inner-loop is ~29% of them). v0.1 targets that
loop — work/tf/claim — and ships here, in place, in infra/cli.

This first slice:
- command registry + dispatcher (longest-prefix verb matching) and a
  `manifest`/`manifest --json` progressive-discovery entrypoint; every verb
  declares a read|write tier so write-gating can be added later (everything is
  allowed for now).
- claim/release verbs wrapping the existing presence script (not reimplemented),
  with label-taxonomy validation.
- main() front-dispatches the homelab verb surface but falls through to the
  legacy webhook -use-case path verbatim, so the in-cluster infra-cli image is
  unaffected.
- fix a pre-existing vet error (glog.Infof missing format directive) that
  blocked `go test`.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-18 19:12:57 +00:00
Viktor Barzin
fd0f4a0365 fix: restore tree dropped by 6d224861; land stem95su gdrive-sync (10m) [ci skip]
6d224861 came from a --no-checkout worktree whose empty index made the
commit drop every file except two. This restores 05b50d2b's full tree and
correctly adds stacks/stem95su/gdrive-sync.tf + the service-catalog stem95su
entry. Forward-only (parent=6d224861, no force-push); [ci skip] since the
live infra was never applied from the broken commit.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 08:45:33 +00:00
Viktor Barzin
6d224861c4 stem95su: scheduled Drive->site sync CronJob (every 10m)
CronJob stem95su-gdrive-sync (*/10) mounts the content PVC RW and
rclone-syncs the read-only Drive folder "claude" (stem claude/files) onto
it (rclone/rclone:1.74.3, scope=drive.readonly, empty-source guard +
--max-delete 25). ESO ExternalSecret stem95su-rclone <- Vault
secret/stem95su. Requires the GCP OAuth app published to Production or the
refresh token expires ~weekly.

Lands the gdrive-sync stack on master (it had landed on a feature branch
by accident on the shared devvm checkout).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 08:42:26 +00:00
Viktor Barzin
644562454c add IPv6 connectivity via Hurricane Electric 6in4 tunnel
- Add public_ipv6 variable and AAAA records for all 34 non-proxied services
- Fix stale DNS records (85.130.108.6 → 176.12.22.76, old IPv6 → HE tunnel)
- Update SPF record with current IPv4/IPv6 addresses
- Add AAAA update support to Technitium DNS updater CLI
- Pin mailserver MetalLB IP to 10.0.20.201 for stable pfSense NAT
- pfSense: HE_IPv6 interface, strict firewall (80,443,25,465,587,993 + ICMPv6),
  socat IPv6→IPv4 proxy, removed dangerous "Allow all DEBUG" rules
2026-03-23 02:22:00 +02:00
Viktor Barzin
a475de6923 update @ record as well 2024-12-02 21:51:05 +00:00
Viktor Barzin
5ae9422083 always update technitium for test 2024-12-02 21:33:11 +00:00
Viktor Barzin
bd061dfd55 fix typo for host selection when updating technitium dns 2024-12-02 21:23:11 +00:00
Viktor Barzin
543fca5dbb update dns update ip job to update technitium via api 2024-01-23 20:30:39 +00:00
Viktor Barzin
33b51bbb09 ensure updating ip job works only with ipv4 2023-09-18 14:14:30 +00:00
Viktor Barzin
47b4ba6a5d make notify message more detailed 2023-05-10 18:04:19 +00:00
Viktor Barzin
76a04674b0 use hardcoded old str as tfvars is not being changed 2023-05-10 17:47:59 +00:00
Viktor Barzin
3281458c4a add notification when ip changes 2023-05-10 17:22:07 +00:00
Viktor Barzin
fc3392a77c fix typo in flag name 2023-05-10 15:47:20 +00:00
Viktor Barzin
e9dcb3840f add cli option to update ip based on the dyndns 2023-05-10 15:42:57 +00:00
Viktor Barzin
b874047a74 add ability to cli to update public-facing ip based on dyndns [ci skip] 2023-05-10 13:43:16 +00:00
viktorbarzin
559c127876 fix bug where always overwrite... 2021-04-09 21:08:30 +01:00
viktorbarzin
a16685411c improve logging [ci skip] 2021-04-09 20:47:17 +01:00
viktorbarzin
368efca158 improve alias file overwriting 2021-04-09 20:26:33 +01:00
viktorbarzin
28b5c0f9b6 allow infra cli to add email aliases 2021-04-08 23:25:09 +01:00
viktorbarzin
b46a5722c6 allow infra cli to add email aliases 2021-04-08 23:00:31 +01:00
viktorbarzin
7d6358cf53 add setup open wrt dns use case 2021-03-31 23:54:15 +01:00
viktorbarzin
b7e422f5f8 update base64 regex for pub key 2021-03-25 23:02:20 +00:00
viktorbarzin
c4465b9425 update base64 regex for pub key 2021-03-25 22:57:49 +00:00
viktorbarzin
059283aca1 add flag to print only allocated ip for vpn use case 2021-03-25 19:55:20 +00:00
viktorbarzin
2ce19c3faa fix typo in regex 2021-03-24 23:03:48 +00:00
viktorbarzin
f52f85bf83 add vpn cli checks 2021-03-24 21:24:40 +00:00
viktorbarzin
bf78be47b8 print client name on success [CI SKIP] 2021-03-17 21:51:30 +00:00
viktorbarzin
fb9e534cff make pushes to origin trigger drone pipeline 2021-03-17 00:06:19 +00:00
viktorbarzin
34e0e17820 make pushes to origin trigger drone pipeline 2021-03-16 23:43:10 +00:00
viktorbarzin
6894e92eeb add author to commit name 2021-03-16 23:35:22 +00:00
viktorbarzin
be6ec8ce70 rename binary name 2021-03-16 19:30:40 +00:00
viktorbarzin
899805ddac add dockerfile for cli and add to drone 2021-03-16 00:11:00 +00:00
viktorbarzin
70d306d0d2 add cli client for repo 2021-03-15 23:32:56 +00:00