infra

Author	SHA1	Message	Date
Viktor Barzin	59a531b8e0	coredns: pods get internal split-horizon answers for viktorbarzin.me [ci skip] Forward the viktorbarzin.me:53 pod block to the Technitium ClusterIP (10.96.0.53, same as the .lan block) instead of 8.8.8.8/1.1.1.1. Pods become ordinary internal clients (CNAME -> apex -> live Traefik LB; mail -> 10.0.20.1), fixing the 27 non-proxied [External] uptime-kuma monitors that rode the TP-Link NAT loopback (hard-down since 06-09; loopback refuses flows whose source equals the reflection target, which all pfSense-SNAT'd cluster traffic does). Enabled by re-testing a stale premise: on k8s 1.34 pods DO reach the ETP=Local Traefik LB IP (kube-proxy short-circuits in-cluster traffic to LB IPs; verified from pods on three non-Traefik nodes) — re-verify after major k8s upgrades; canary = [External] fleet going red. The NAT-layer alternatives (pfSense rdr, SNAT-drop) were rejected: both fight return-path asymmetry and deepen TP-Link dependency. Verified in-pod: immich -> .203 + HTTPS 200, mail -> 10.0.20.1, forgejo -> Traefik ClusterIP (pin kept for Technitium-outage resilience). Proxied [External] monitors now test the internal path — true edge fidelity moves to the external vantage (ha-london, next fix). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 16:21:34 +00:00
Viktor Barzin	a1b7b0ca53	forgejo retention: revert to DRY_RUN — first live run orphaned OCI indexes [ci skip] The keep-set (newest 10 versions + latest + cache tags) treats multi-arch/attestation index CHILDREN — separate untagged sha256 versions — as deletable: for images not rebuilt recently they sort outside the newest-10 window and were pruned while their kept parent index survived. kms-website :latest and :dfc83fb children 404'd (RegistryManifestIntegrityFailure, caught by forgejo-integrity-probe within hours; deployed tag a794d1a unaffected). Healed: :latest re-pointed at the intact a794d1a index (also the newest commit), corrupt :dfc83fb version deleted, probe re-run clean (0 failures / 22 repos / 63 tags / 59 indexes). DRY_RUN=true applied live. Re-enable only with a container-aware keep-set — options in the post-mortem. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 09:22:47 +00:00
Viktor Barzin	2b8c0def30	dns: pfSense forward-zone for viktorbarzin.me, nodes fully stock [ci skip] Round 3 of the forgejo-pull hairpin fix (per Viktor: no per-node customization — split-brain lives in the DNS infra): - pfSense Unbound domain override viktorbarzin.me -> Technitium 10.0.20.201 (applied via php write_config, backup on-box). Every Unbound client on every VLAN now gets the internal split-horizon answers (live Traefik IP via apex CNAME) with zero per-host config. - CoreDNS carve-out (TF, applied): dedicated viktorbarzin.me:53 block — forgejo pinned to Traefik ClusterIP via data source (pods cannot reach the ETP=Local LB IP pfSense now returns), all other .me names kept on public resolvers (pods' pre-existing behavior). Replaces the .:53 forgejo rewrite. - Removed the same-day resolved routing-domain drop-ins from all 7 nodes; node5/6 link DNS repointed Technitium -> pfSense (netplan + qm 205/206) for fleet parity; cloud-init no longer writes any DNS drop-ins. - Docs: dns.md, pfsense-unbound runbook (override + rollback), registry bullet, post-mortem final-architecture addendum. Verified: nodes resolve forgejo -> .203 via pfSense, crictl pull OK, pods resolve forgejo -> ClusterIP / others -> public, mail record works, .lan zone unaffected. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 08:32:34 +00:00
Viktor Barzin	1ee1bf0817	forgejo pulls: route *.viktorbarzin.me to Technitium, drop /etc/hosts pins [ci skip] Supersedes this morning's per-node /etc/hosts pin (no hardcoded service IPs on nodes, per Viktor). Technitium's split-horizon zone already resolves forgejo.viktorbarzin.me -> CNAME apex -> live Traefik LB IP (ingress-dns-sync auto-CNAMEs every ingress host; apex drift probe alerts) -- the nodes just never queried it. Rolled the devvm's systemd-resolved routing-domain pattern (~viktorbarzin.me -> 10.0.20.201) to all 7 nodes, removed the pins, verified getent + crictl pull via pure DNS. Also demoted node5/6's cloud-init global-dns.conf (DNS=8.8.8.8 1.1.1.1) to FallbackDNS-only: public servers in the global set race the routing domain. Its justification ("Technitium NXDOMAINs forgejo") was obsolete -- exactly the stale comment that pointed new nodes at the hairpin. hosts.toml mirror kept but documented as vestigial (Traefik 404s bare-IP requests; registry auth realm is an absolute URL). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 07:56:31 +00:00
Viktor Barzin	b6976ce014	forgejo pulls: pin registry name to internal Traefik in node /etc/hosts [ci skip] tuya-bridge was down 7.5h (ImagePullBackOff on k8s-node3): fresh kubelet pulls of forgejo.viktorbarzin.me images depended on the intermittently broken public-IP hairpin. The containerd hosts.toml mirror cannot keep pulls internal on its own — Traefik 404s its bare-IP requests (no Host/SNI match) and the registry Bearer realm is an absolute public URL fetched outside the mirror. Third incident of this class (buildkit 06-04, tripit/devvm 06-09). Fix: /etc/hosts pin 10.0.20.203 forgejo.viktorbarzin.me on every node — covers resolve + token + blob legs with correct SNI and valid cert. Applied live to all 7 nodes; persisted in the cloud-init bootstrap and the existing-node rollout script. Docs updated (registry bullet, dns.md hairpin scope + stale .200 literals, runbook) + post-mortem. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 07:15:24 +00:00
Viktor Barzin	7330cb6a0b	backup: image-level vzdump of hand-managed VMs (devvm) — close no-VM-backup DR gap The hand-managed Linux VMs (not in Terraform) were never imaged: the PVC/NFS/pfSense/PVE-config scripts cover cluster data but no VM disk. A lost devvm disk = unrecoverable home dirs + local-only git repos (monorepo root has no remote). vzdump-vms.{sh,service,timer}: daily 01:00 live `vzdump --mode snapshot` of VZDUMP_VMIDS (default 102=devvm) -> /mnt/backup/vzdump (Copy 2), keep 3; the monthly offsite-sync full pass mirrors it to Synology (Copy 3). Guest agent enabled -> fs-consistent. Nice/idle-ionice so it never starves etcd. Pushgateway job vzdump-backup. Deployed live to PVE + timer enabled. Docs updated: backup-dr.md (new VM-image layer + protection matrix), infra CLAUDE.md, AGENTS.md. [ci skip] Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-09 21:41:54 +00:00
viktor	edaee13be3	docs(ci-cd): tripit auto-deploy (GHA->Woodpecker 167) + svu semver in GHA [ci skip]	2026-06-09 21:41:53 +00:00
viktor	93ec0c66fd	docs(ci-cd): add off-infra GHA->GHCR build pattern for private Forgejo repos (tripit pilot) [ci skip]	2026-06-09 21:41:53 +00:00
Viktor Barzin	e0452611b5	forgejo: survive CI-build registry-push storms (mem 3Gi + working retention) Heavy in-cluster builds (e.g. tripit buildkit) were taking Forgejo down via two vectors. Fixes both, without moving Forgejo off the sdc HDD (code-oflt deferred): - Memory 1Gi -> 3Gi (requests=limits). Forgejo was OOMKilled (exit 137) under registry-push load; VPA upperBound ~1.5Gi was suppressed by the 1Gi cap it kept OOMing against. Size for the push spike. - Activate registry retention (DRY_RUN false). Verified the delete list against all running viktor/* images first: 0 running images affected. Pruned 478 -> 161 package versions; PVC was at its 50Gi autoresize ceiling. - FIX broken retention auth: the cleanup PAT was ci-pusher's, but Forgejo scopes container packages per-user, so DELETE on viktor/* returned 403 (the dry-run only did GETs, hiding it). Repointed forgejo_cleanup_token to viktor's write:package PAT. Retention had never actually worked. - Protect buildkit cache tags from retention (cleanup.sh keep-set) so the gentler-builds layer cache survives daily pruning. [ci skip] — already applied via scripts/tg. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-09 21:41:53 +00:00
Viktor Barzin	fd0f4a0365	fix: restore tree dropped by `6d224861`; land stem95su gdrive-sync (10m) [ci skip] `6d224861` came from a --no-checkout worktree whose empty index made the commit drop every file except two. This restores 05b50d2b's full tree and correctly adds stacks/stem95su/gdrive-sync.tf + the service-catalog stem95su entry. Forward-only (parent=6d224861, no force-push); [ci skip] since the live infra was never applied from the broken commit. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-09 08:45:33 +00:00
Viktor Barzin	6d224861c4	stem95su: scheduled Drive->site sync CronJob (every 10m) CronJob stem95su-gdrive-sync (*/10) mounts the content PVC RW and rclone-syncs the read-only Drive folder "claude" (stem claude/files) onto it (rclone/rclone:1.74.3, scope=drive.readonly, empty-source guard + --max-delete 25). ESO ExternalSecret stem95su-rclone <- Vault secret/stem95su. Requires the GCP OAuth app published to Production or the refresh token expires ~weekly. Lands the gdrive-sync stack on master (it had landed on a feature branch by accident on the shared devvm checkout). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-09 08:42:26 +00:00
Viktor Barzin	23602f393e	matrix: migrate Synapse -> tuwunel (Rust homeserver, fresh start, federated) Replace the cramped Synapse deployment with tuwunel v1.7.1: embedded RocksDB drops the CNPG dependency (both init-containers, the db ESO, the Reloader annotation all gone), env-var config, fsGroup-owned encrypted PVC, federation on, tuwunel-served well-known delegation to :443. server_name unchanged (matrix.viktorbarzin.me); fresh start (no Synapse->RocksDB migration path). Registered @viktor admin then disabled registration (403). Cleanup: removed the orphaned pg-matrix Vault static role and dropped the matrix Postgres DB/role; updated service-catalog, upgrade-config, CLAUDE.md PG-rotation list, and the Matrix OIDC->orphaned auth notes. Design+plan in docs/plans/2026-06-08-matrix-synapse-to-tuwunel-*. Already applied via scripts/tg (matrix tier-1 + targeted vault tier-0), so [ci skip] to avoid CI reconciling an unrelated pre-existing vault OIDC tune-TTL drift. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-08 11:58:17 +00:00
Viktor Barzin	9529eedfe0	docs(security): bot-block-proxy is a no-op while poison-fountain is at 0 [ci skip] Reflect commit b6dd23b1: bot-block-proxy short-circuits /auth to return 200 instead of proxying to the scaled-to-0 poison-fountain. - security.md Layer 1 + tarpit description + troubleshooting (fix stale stacks/platform path -> traefik stack; drop misleading restart-poison-fountain step). - .claude/CLAUDE.md: add matrix to PG rotation list; document that startup-read secret consumers need a Reloader annotation (matrix root cause, found via Loki 2026-06-05). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-06 16:51:26 +00:00
Viktor Barzin	52f5de905d	docs(context): freshen infra glossary (modules, tiers, new concepts) [ci skip] Refresh CONTEXT.md against current repo + cluster reality (grill-with-docs): - Module taxonomy rewrite: drop fictional k8s_app/helm_app/postgres_app factory modules (never existed); name the real four (ingress_factory, nfs_volume, anubis_instance, setup_tls_secret) + the shared / Stack-local / flat distinction; flag vestigial modules/kubernetes/<app> dirs. - Rename "Ingress auth tier" -> "Ingress auth" (discrete modes, not tiers); reserve "tier" for State tier + Namespace tier only. - Add local-path entry (cluster default SC; node-local footgun warning). - Add concepts: Keel, Diun, CNPG/pg-cluster, MetalLB LB-IP split, Calico. - Add "policy" ambiguity flag (Kyverno vs Calico NetworkPolicy vs Vault/RBAC). - Fix node count 5 -> 7 (k8s-master + k8s-node1..6). Doc-sync (same commit per repo rules): - overview.md: replace fictional factory modules with the real shared modules + the flat/stack-local pattern. - .claude/CLAUDE.md: drop dead nfs-proxmox column from the storage decision table + stale cross-reference (vault migrated off it 2026-04-25). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-05 19:34:49 +00:00
Viktor Barzin	3796a84e04	docs: f1-stream is Woodpecker-native (Forgejo viktor/f1-stream), not GHA/repo-10 f1-stream was extracted to its own Forgejo repo + deployed from the Forgejo registry (2026-06-05). Correct the stale "Migrated to GHA / repo id 10" claims: - CLAUDE.md + ci-cd.md: move f1-stream from the GHA list to the Woodpecker-native owned-app group; note old github source archived + GHA Woodpecker repo 10 deactivated; f1-stream is now Woodpecker repo 166. - service-catalog: note the source repo + deploy model.	2026-06-05 09:19:12 +00:00
Viktor Barzin	147a8cff40	Restore f1-stream stack — undo accidental bundling into 63fe7d2b Commit 63fe7d2b (fan-control) was made with a bare `git commit` in the shared infra working tree and inadvertently swept in a parallel session's staged f1-stream-extraction work (main.tf repoint, ~48 files/ removals, ci-cd.md + .claude docs, two extraction plan docs). This returns every f1-stream-related path to its pre-63fe7d2b state (3493c347) so that extraction can be committed cleanly by its own session. The fan-control files added in 63fe7d2b are untouched. [ci skip] Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-05 09:19:12 +00:00
Viktor Barzin	90ad6b9125	fan-control: presence-aware IPMI fan curve for the R730 PVE host The iDRAC stock curve runs the CPU at ~72°C on the 7080 RPM floor even under load (optimises for quiet, not cool). Add a bash daemon + systemd unit that drives the chassis fans from CPU temp on two curves, picked by garage occupancy (the server is in the garage): COOL when empty (measured ~58-65°C under load), QUIET near the silent floor when the ha-sofia garage door shows someone is there (open, or <15min since last activity). Manual fan mode is backstopped: bash EXIT trap + systemd ExecStopPost hand fans back to Dell auto on stop/crash; CPU>=83°C or repeated IPMI failures do the same. Pushgateway metrics (job=fan_control). 36 unit tests cover the pure curve/hysteresis/presence/parse logic; DRY_RUN + RUN_ONCE for integration checks. Deployed and verified on 192.168.1.127 (CPU 70->58°C in cool mode, hysteresis stepping confirmed). Design: docs/plans/2026-06-04-pve-fan-control-design.md Runbook: docs/runbooks/fan-control.md [ci skip] Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-05 09:19:11 +00:00
Viktor Barzin	f201e4573e	immich: fix slow context search — prewarm clip_index + latency alert/healthcheck Context (smart) search latency was caused by the 665MB vchord clip_index decaying out of PG shared_buffers (~33% resident -> ~1.8s cold ANN reads vs ~4ms warm), NOT by yesterday's ML MODEL_TTL/clip-keepalive change (CLIP textual is warm ~15ms on GPU). The postStart prewarm runs once at pod start and pg_prewarm.autoprewarm only re-warms at startup, so the index decays under job buffer-pressure over days. - clip-index-prewarm CronJob (immich, /5): pg_prewarm('clip_index') keeps the whole index resident -> searches stay ~4ms. - immich-search-probe CronJob (immich, /5): times a random-vector ANN query + reads clip_index residency, pushes gauges to the Pushgateway. - Prometheus alerts ImmichSmartSearchSlow / ImmichClipIndexColdCache / ImmichSearchProbeStale (+ inhibition when the probe is stale). - cluster_healthcheck.sh check #46 check_immich_search (TOTAL_CHECKS 45->46). - Docs: infra CLAUDE.md immich note, monitoring.md, cluster-health skill. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-05 09:19:07 +00:00
Viktor Barzin	98f29edf34	technitium: CoreDNS rewrite forgejo.viktorbarzin.me -> Traefik ClusterIP In-cluster pods resolved forgejo.viktorbarzin.me to the public IP (176.12.22.76) and hairpinned out through the WAN gateway, intermittently timing out buildkit pushes from Woodpecker build pods (which, unlike kubelet, don't use the per-node containerd Forgejo mirror). This silently failed CI build-and-push for Forgejo-hosted repos (recruiter-responder pipelines #15-#18 at the push step). Add a CoreDNS `rewrite name exact forgejo.viktorbarzin.me traefik.traefik.svc.cluster.local` so pods resolve to the Traefik ClusterIP (reachable in-cluster, unlike the ETP=Local LB .203; the Service-name target auto-tracks the ClusterIP so it can't rot on a Traefik renumber). Traefik's *.viktorbarzin.me wildcard keeps SNI/TLS valid. Makes the per-pod woodpecker-server hostAlias belt-and-suspenders. Applied via targeted apply (coredns ConfigMap only, to avoid reconciling 7 unrelated pre-existing drifts in the stack) + verified: - pod resolves forgejo.viktorbarzin.me -> 10.111.111.95 (Traefik ClusterIP) - recruiter-responder pipeline #20 build-and-push succeeds via ClusterIP Docs: networking.md (K8s cluster DNS path) + .claude/CLAUDE.md (forgejo registry quick-ref). Advances beads code-yh33. [ci skip] Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-04 07:34:30 +00:00
Viktor Barzin	922d95af9c	Reapply "tripit: Gmail ingest (12-month) + vbarzin owner + plans@ forward-to-parse" This reverts commit a82ba46ad83e85a231d839564c2f009c700dc4d1.	2026-06-03 10:24:25 +00:00
Viktor Barzin	f0843e398b	Revert "tripit: Gmail ingest (12-month) + vbarzin owner + plans@ forward-to-parse" This reverts commit 4cc9229e716b6683418a148a0f896442d5ab07ad.	2026-06-03 10:24:25 +00:00
Viktor Barzin	0c7ec3d470	tripit: Gmail ingest (12-month) + vbarzin owner + plans@ forward-to-parse Reconciles the tripit stack source with live state and adds the forward flow. Ingest now polls vbarzin@gmail.com [Gmail]/All Mail read-only over a rolling 12-month X-GM-RAW travel-sender window (Croatia Jet2 refs excluded), filing trips under MAIL_DEFAULT_OWNER_EMAIL=vbarzin@gmail.com (Viktor's Authentik login identity). Adds an ingest-plans CronJob that polls spam@ filtered to To:plans@viktorbarzin.me (the @viktorbarzin.me catch-all target) so forwarded bookings are extracted and attached to the matching trip; IMAP_PASSWORD is overridden per-job to spam@'s creds (PLANS_IMAP_PASSWORD). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 10:24:25 +00:00
Viktor Barzin	01ea7d6fa1	immich: clip-keepalive CronJob to pin smart-search model warm MACHINE_LEARNING_MODEL_TTL=600 is a single global knob, so it unloads the CLIP textual (smart-search) encoder after idle exactly like OCR/face — immich has no per-model pin. This CronJob pings the textual encoder every 5 min (< the 600s TTL) via immich-ml /predict, so a search query never pays the ~1.5s cold-load, while idle OCR/face still free their VRAM on the shared T4. Textual-only (search = text->embedding->pgvector); the visual encoder is import-time and left to unload. curl baked into the image (no runtime install). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-03 10:24:24 +00:00
Viktor Barzin	fe8db19aaf	job-hunter: build-triggers-deploy model; CronJob :latest + docs CI now drives the Deployment rollout (kubectl set image to the build SHA in .woodpecker.yml), so the stack moves to image_tag = "latest": the Deployment runs whatever CI last set (image ignore_changes keeps TF from fighting it), and the CronJob uses :latest + imagePullPolicy=Always (fresh pod each weekly run). Keel stays enrolled in parallel as a redundant net. Docs: rewrite the runbook "Deploying" section for build-triggers-deploy; record the reversal of decision #12 in the auto-upgrade design doc (owned apps drive their own rollout, Keel parallel — upstream stays Keel-only); add the owned-app deploy model to infra/.claude/CLAUDE.md CI/CD section. [ci skip] — applied locally (stack-scoped); avoids a broad CI auto-apply. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-02 20:24:50 +00:00
Viktor Barzin	052c776eba	immich: set MACHINE_LEARNING_MODEL_TTL 0->600 to stop GPU VRAM hog immich-ml at TTL=0 never unloaded models; a heavy OCR library job inflated onnxruntime's CUDA arena to ~10.7GB and held it on the shared time-sliced T4, starving llama-swap (qwen3-8b) so recruiter-responder triage 502'd silently for hours (emails preserved unseen, no loss). TTL=600 lets idle ad-hoc models (OCR, face) free VRAM while preloaded CLIP/smart-search stays warm. Docs: correct stale llama-cpp GPU notes (T4 is time-sliced, no VRAM isolation; add qwen3-8b to model table), immich MODEL_TTL gotcha in .claude/CLAUDE.md, and a post-mortem. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-02 20:16:11 +00:00
Viktor Barzin	481585f6e6	immich: cap streaming transcode bitrate to fix 4K video stutter [ci skip] Transcodes were uncapped (ffmpeg maxBitrate=0 + preset=ultrafast + targetResolution=original) -> 77-264 Mbps 4K H.264 files. Mobile playback streams that copy off the shared 7200rpm sdc pool over inter-VLAN NFS; a single stream needs ~10-13.5 MB/s and stuttered for every client, local and remote. Fix (DB system-config, applied via API): maxBitrate=20000k, preset=medium, transcode=bitrate. 4K resolution preserved; originals never modified. Existing oversized transcodes regenerated by deleting their asset_file encoded_video rows + videoConversion force=false (concurrency 1). Document config + add runbook docs/runbooks/immich-transcode-bitrate.md. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-02 19:24:30 +00:00
Viktor Barzin	a382683c0e	infra: fix containerd forgejo-registry redirect .200->.203 (+skip_verify) Traefik moved off shared .200 to its dedicated .203 on 2026-05-30, but the containerd hosts.toml redirect for forgejo.viktorbarzin.me still pointed at the now-dead .200:443 -> every FRESH forgejo pull failed (cached images kept running, so it stayed hidden until a new image tag was pulled). Retarget to .203 and add skip_verify (node dials Traefik by IP; cert is for forgejo.viktorbarzin.me) in both the new-node cloud-init and existing-node deploy scripts. Already rolled to all 7 nodes (rewrite + restart containerd, no drain). Doc fix in .claude/CLAUDE.md.	2026-06-01 21:22:05 +00:00
Viktor Barzin	ddd582a28c	backup: stop offsite-copying regenerable data; shrink nextcloud backup; pin nextcloud image The offsite Synology hit 97% — the Backup share grew +670G in a week, traced to the 2026-05-26 change that began mirroring large regenerable services offsite, plus an unbounded nextcloud.log bloating its backups to 87G. - nfs-mirror: re-exclude ollama, prometheus-backup, audiblez, ebook2audiobook (regenerable; live-only on sdc). Keep *-backup DB dumps (real safety copies). - offsite-sync Step 2: nfs-ssd leg is now immich-only; ollama/llamacpp on the SSD no longer ship offsite (re-pullable models). - daily-backup: skip nextcloud/nextcloud-data-proxmox (orphaned pre-encryption PV, still backed up weekly). - nextcloud: cap+rotate the log (log_rotate_size=10MB); the dedicated backup now excludes html/ (app code, from image), logs, and preview cache and keeps only the latest copy (pvc-data holds version history) → <5G (was 87G). - nextcloud: pin image to 32.0.9 in chart_values. A 2026-05-26 Keel bump moved the live pod to 32.0.9 (data migrated to 32.0.9.2) but TF still defaulted to 32.0.3; reconciling that drift this session rolled a 32.0.3 pod that CrashLooped on the downgrade. Pinning eliminates the drift. Docs: backup-dr.md + infra CLAUDE.md updated (add nfs-mirror, new exclusions). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-01 15:15:26 +00:00
Viktor Barzin	0dd4a31eff	docs(immich): cap server-side job concurrency to protect sdc + log recurrence A library-wide Duplicate Detection run on 2026-06-01 fanned the ML/thumbnail backfill out at thumbnailGeneration concurrency 8, saturating the shared sdc HDD and starving etcd -> kube-apiserver down ~30 min (5th IO-pressure incident on sdc). Capped server-side thumbnailGeneration/metadataExtraction/library to 2 in the Immich DB system-config; documented in the Immich row and recorded the recurrence + still-TODO IO-isolation fixes in the 2026-05-25 post-mortem (this also commits that previously-untracked post-mortem). [ci skip] Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-01 15:15:26 +00:00
Viktor Barzin	5bcb4525a4	traefik: uncap download duration (writeTimeout 60s->0), upload window 3600s [ci skip] Large Immich video downloads and uploads failed at a hard ~60s wall. The websecure entrypoint set respondingTimeouts.{read,write}Timeout=60s; unlike nginx proxy_*_timeout (per-read idle), Traefik respondingTimeouts are hard caps on total request/response duration, so every transfer slower than 60s was cut mid-stream. Reproduced: a 6 MB/s throttled 650MB download died at 386MB / 62s with an HTTP/2 stream reset. - writeTimeout=0 (Traefik's default, which Immich's reverse-proxy guidance assumes): unlimited download size/duration. - readTimeout=3600s: passes multi-GB uploads while keeping a slow-loris backstop (Immich has no resumable upload, so the window must exceed real upload times). Verified: the same 650MB download now completes fully (650MB / 102s, exit 0). IPv6 path needs no change - the pfSense bridge HAProxy 1h timeouts are inactivity-based, not total caps. Applied via tg (Tier 1 / PG-authoritative state); this commit syncs source + docs only, hence [ci skip]. Docs: networking.md (Entrypoint Transport Timeouts + troubleshooting), .claude/CLAUDE.md networking note. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-30 17:46:59 +00:00
Viktor Barzin	e9046e5a26	traefik+pfsense: real IPv6 client IPs via HAProxy PROXY-v2 bridge Replace the pfSense socat IPv6 forwarder (which masked every IPv6 client as 10.0.20.1) with a standalone HAProxy bridge using send-proxy-v2, so real IPv6 client IPs reach Traefik/CrowdSec. Traefik now trusts PROXY-v2 only from 10.0.20.1 on the web/websecure entrypoints; real IPv4 clients (ETP=Local, own source IP) are unaffected. Mail-over-IPv6 routed through the mail NodePorts (send-proxy-v2) too. Bridge is TCP/h2 only (no QUIC over IPv6). Persistence on pfSense: rc.d/ipv6proxy + ipv6_proxy.sh (config.xml shellcmd), keeping the nginx-off-[::] patch. Also fixes stale networking.md: Traefik was still documented on the shared .200; it moved to dedicated .203/ETP=Local on 2026-05-30. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-30 09:51:23 +00:00
Viktor Barzin	16c9aafafa	docs: Traefik dedicated-IP + ETP=Local cutover SUCCEEDED (attempt 2) Records the successful cutover and the key fix that made it safe: decouple cloudflared from the LB IP first (point its tunnel ingress at the in-cluster Traefik Service), so moving Traefik 10.0.20.200 -> 10.0.20.203 no longer breaks proxied apps or Vault's ingress. Updates infra CLAUDE.md Networking notes with the new Traefik LB IP / ETP=Local / cloudflared->ClusterIP state. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-30 08:12:57 +00:00
Viktor Barzin	bc41fe572a	immich: GPU-accelerate video transcoding (NVENC + NVDEC) Pin immich-server to the GPU node with a time-sliced nvidia.com/gpu slice so ffmpeg uses hardware NVENC encode + NVDEC decode instead of software. This frees the ~3-4 CPU cores the software transcoder was burning inside the request-serving pod (which was slowing thumbnail/photo browsing), and makes incompatible (HEVC/iPhone) videos playable in seconds. Activation is ffmpeg.accel=nvenc + accelDecode=true in the DB system-config (Immich app config is DB-managed here, like oauth/smtp — not Terraform). Also give immich-frame the same Keel ignore_changes immich-server already has, so an untargeted apply no longer churns it (pre-existing drift). Docs: .claude/CLAUDE.md Immich row + compute.md GPU-workloads list. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-29 18:05:34 +00:00
Viktor Barzin	188bdd50a0	infra: decommission foolery agent UI User no longer actively using foolery. Removed: - TF stack stacks/foolery (Cloudflare DNS, Traefik IngressRoute, Authentik forward-auth integration, K8s Service+Endpoints) - Devvm systemd unit /etc/systemd/system/foolery.service - Runtime at ~/.local/share/foolery and launcher ~/.local/bin/foolery - Stale foolery reference in .claude/CLAUDE.md auth="required" examples Uptime Kuma [External] foolery monitor will auto-prune on next external-monitor-sync reconcile. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-28 16:08:41 +00:00
Viktor Barzin	8b4bcc0ca2	blog: Anubis carve-out for /net-diag.sh curl\|bash clients can't solve PoW, so /net-diag.sh has to bypass Anubis. Adds a second ingress_factory pointing /net-diag.sh at the bare blog service (port 80), keeping every other path on the existing Anubis chain. Path-prefix specificity wins in Traefik routing — / stays gated. dns_type = "none" because the apex viktorbarzin.me CF record already exists from the main ingress. Doc update: CLAUDE.md Anubis section notes blog now follows the wrongmove carve-out pattern.	2026-05-28 13:22:57 +00:00
Viktor Barzin	9277d71d81	nfs-mirror: append transferred files to offsite-sync manifest Step 1 of offsite-sync-backup is incremental on non-monthly days, driven by /mnt/backup/.changed-files which only daily-backup wrote to. nfs-mirror's writes were therefore invisible to Step 1 until the next monthly --delete pass — which would also wipe data pre-positioned on Synology pve-backup/ (e.g. the in-place btrfs rename we just did to relocate ~160G of NFS subtrees from /Backup/Viki/nfs/<svc>/ to /Backup/Viki/pve-backup/<svc>/). Fix: snapshot a timestamp before rsync, then after rsync use `find -newer $STAMP -type f -printf '%P\n'` to enumerate every file nfs-mirror created/modified and append to the manifest. Paths are relative to /mnt/backup/ (matches Step 1 --files-from expectation). State files are excluded. The current in-flight first run started before this patch was deployed, so its writes won't auto-populate the manifest — a one-off manual backfill will be done after it completes.	2026-05-24 15:32:22 +00:00
Viktor Barzin	6024cfb410	docs: update MySQL restore runbook + CLAUDE.md after 8.4.9 recovery Runbook rewritten for the standalone setup (InnoDB Cluster gone since 2026-04-16) and now covers the full disaster-recovery flow we just executed: stop pod, wipe PVC (incl. PV reclaim-policy flip from Retain → Delete), re-apply TF, restore via in-namespace Job, drop+create static users with fresh Vault passwords, restart dependents. CLAUDE.md MySQL row notes the 8.4.8 pin + links the runbook. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-18 22:51:52 +00:00
Viktor Barzin	01de3babd6	docs(security): wave 1 plan — Kyverno enforce, NetworkPolicy egress, audit logging, source-IP anomaly Locked design for wave 1 of cluster security hardening. Plan only — implementation lives in beads code-8ywc and follow-up commits. Captures: - security.md: Kyverno policy table updated (Audit → Enforce planned for the four security policies with the 31-namespace exclude list). New section "Audit Logging & Anomaly Detection" detailing the K8s API audit policy, Vault audit device + X-Forwarded-For trust, source-IP anomaly rules (K9, V7, S1), and the rejected-canary-tokens / rejected-K1 rationales. New section "NetworkPolicy Default-Deny Egress" describing the observe-then-enforce (γ) approach for tier 3+4. - monitoring.md: new "Security Alerts (Wave 1)" section listing the 16 rules (K2-K9, V1-V7, S1) and the Loki ruler → Alertmanager → #security routing path. - runbooks/security-incident.md (new): per-alert response playbook with LogQL queries, action steps, false-positive triage, and SEV1 escalation. - .claude/CLAUDE.md: new "Security Posture" section summarising the locked decisions: identity allowlist is me@viktorbarzin.me ONLY, source-IP allowlist CIDRs, no public-IP access policy, rationale for not adopting canary tokens. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-18 19:10:16 +00:00
Viktor Barzin	e030750507	openclaw: native MCP servers + daily claude-memory sync Wire ha-mcp, context7, and the in-pod playwright sidecar as native MCP servers on OpenClaw via `mcp set` in the container startup (ConfigMap-baked mcp.servers gets stripped by `doctor --fix`; CLI-set entries persist). HA URL pulled from new Vault key secret/openclaw.ha_sofia_mcp_url and passed via the HA_SOFIA_MCP_URL env var. Add a daily 03:00 UTC `memory-sync` CronJob in the openclaw namespace: pulls all non-sensitive memories from claude-memory.claude-memory.svc:80/api/memories, groups by category, writes 18 Markdown files into /workspace/memory/projects/claude-memory-sync/ (the path memory-core indexes), then triggers `openclaw memory index --force` via kubectl exec. Reuses the existing cluster-healthcheck SA (pods+pods/exec). Smoke test: 1488 memories synced, 25/25 files indexed, search returns hits. Also drops the legacy /app/extensions entry from plugins.load.paths (doctor warning), wires HA_SOFIA_MCP_URL env, and one-shot deletes the stale 2026-02-28 metaclaw-export.json from the openclaw home volume. claude_memory MCP intentionally NOT wired — its /mcp/mcp transport 404s on the deployed claude-memory-mcp:17 image (tracked as code-z1so). Shared knowledge is delivered via the CronJob's REST sync instead. Adding claude_memory to mcp.servers is a one-line follow-up once that's fixed.	2026-05-16 14:01:46 +00:00
Viktor Barzin	910167105e	Phase 0: install Keel + Kyverno auto-update annotation injector Foundation for opt-out-pure auto-update model per docs/plans/2026-05-16-auto-upgrade-apps-{design,plan}.md. - New stack `stacks/keel/` deploys Keel via Helm (charts.keel.sh, v1.0.6). Polls registries hourly per design decision #8. Default schedule overridable per-workload via keel.sh/pollSchedule annotation. - New Kyverno ClusterPolicy `inject-keel-annotations` mutates Deployments, StatefulSets, and DaemonSets in namespaces labeled `keel.sh/enrolled=true` with keel.sh/policy=force + trigger=poll + pollSchedule=@every 1h. - Phase 0 enrolls no namespaces. Phase 1 (next session) labels the self-hosted set. - Per-workload opt-out: label `keel.sh/policy: never` (used by rollback runbook and chrome-service-style deliberate pins). - Keel namespace excluded from the mutate — supervisor self-update has too-bad a failure mode (decision #11). - AGENTS.md: KYVERNO_LIFECYCLE_V2 marker convention added for the ignore_changes block enrolled workloads need. - .claude/CLAUDE.md: docker-images rule flagged as transitional. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-16 12:19:34 +00:00
Viktor Barzin	0712a1b659	infra/scripts/tg: enforce ingress_factory auth-comment convention Every `tg plan/apply/destroy/refresh` now runs `scripts/check-ingress-auth-comments.py` against the current stack before invoking terragrunt. The check fails closed if any `auth = "app"` or `auth = "none"` line in the stack's .tf files lacks an immediately-preceding `# auth = "<tier>": ...` comment documenting what gates the app (for "app") or why the endpoint is intentionally public (for "none"). Why tg-level (not git pre-commit): tg is the universal entry point for all infra changes. CI runs it, headless agents run it, humans run it. A pre-commit hook only catches the human path. Wiring the check into tg means the anti-exposure guard fires regardless of who or what is invoking terragrunt. Stack-scoped: each stack documents itself the next time it's edited. The 30+ existing `auth = "none"` stacks that predate this guard are not blocked from operating today; they'll need the comment added the next time someone runs `tg plan` on them — at which point the gate forces a conscious "yes, this is intentional" moment before any state change can land. Skipped on: init, fmt, validate, output, etc. — anything that doesn't read or write infra state.	2026-05-11 19:18:27 +00:00
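
The convention enforced in 0712a1b659 can be illustrated with a short Python sketch. It is not the contents of `scripts/check-ingress-auth-comments.py`: the regular expressions, the function name, and the requirement that the tier named in the comment matches the tier on the `auth` line are assumptions made for this example.

```python
import re

# A line that sets auth to one of the two tiers that skip Authentik.
AUTH_LINE = re.compile(r'^\s*auth\s*=\s*"(app|none)"')
# The documenting comment expected on the line directly above it:
#   # auth = "<tier>": <non-empty explanation>
AUTH_COMMENT = re.compile(r'^\s*#\s*auth\s*=\s*"(app|none)"\s*:\s*\S')


def find_undocumented_auth(tf_source: str) -> list:
    """Return 1-based line numbers of auth = "app"/"none" lines lacking the comment."""
    lines = tf_source.splitlines()
    problems = []
    for i, line in enumerate(lines):
        setting = AUTH_LINE.match(line)
        if setting is None:
            continue
        previous = lines[i - 1] if i > 0 else ""
        comment = AUTH_COMMENT.match(previous)
        # Assumption: the comment must name the same tier as the setting.
        if comment is None or comment.group(1) != setting.group(1):
            problems.append(i + 1)
    return problems
```

A wrapper that fails closed would read each `.tf` file in the stack, call `find_undocumented_auth`, and exit non-zero before invoking terragrunt if any list is non-empty.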
Viktor Barzin	459b00fa74	infra/ingress_factory: add auth = "app" mode for self-authed backends Adds a fourth auth tier alongside required/public/none. "app" is functionally identical to "none" — no Authentik middleware attached — but the distinct name records intent at the call site: this backend has its own user login (NextAuth, Django, OAuth, bearer-token API, etc.) and Authentik would only break it. Why the new tier: with only required/none, every "the app has its own auth so drop Authentik" decision looked identical at the call site to "this is an OAuth callback / webhook receiver / native-client API". Future readers couldn't tell whether a stack was intentionally unauthenticated or relying on backend auth. Now they can. Migrates the 8 stacks flipped earlier this session (novelapp, immich, linkwarden, tandoor, freshrss, affine, actualbudget, ebooks/audiobookshelf) from "none" to "app". Confirmed no-op: `tg plan` on novelapp showed "No changes" — same middleware chain, same live state. The variable description and the .claude/CLAUDE.md Auth section now spell out the anti-exposure rule: only pick "app" or "none" AFTER verifying the app has its own user auth ("app") or the endpoint is intentionally public ("none"). Default stays "required" so accidental omission fails closed. [ci skip]	2026-05-11 18:59:20 +00:00
Viktor Barzin	2db8bdac0d	state(dbaas): update encrypted state	2026-05-10 21:00:00 +00:00
Viktor Barzin	fecfa211fd	fix: pvc-autoresizer threshold should be 10%, not 80% topolvm/pvc-autoresizer's threshold annotation is the FREE-SPACE percentage below which expansion fires (per upstream README). Setting it to "80%" means "expand when free-space drops below 80%", i.e. as soon as the PVC crosses 20% utilization — which caused prometheus-data-proxmox to be repeatedly expanded from 200Gi to 433Gi in 70 minutes (six 10% bumps, all when the volume was only ~14% used). Once the SC opt-in fix landed (`1e4eac53`) and the inode metrics fix landed (`02a12f1a`), the autoresizer started actively misfiring across 75+ PVCs cluster-wide. Flip the value to "10%" everywhere — that's "expand when free-space drops below 10%", i.e. at 90% utilization, which is the conventional semantic and matches the alert thresholds in prometheus_chart_values.tpl (PVAutoExpanding fires at 80%, PVFillingUp at 95%). The CLAUDE.md PVC template was the source of the misconfig, so update it too. Live PVC annotations were patched in parallel via kubectl annotate; TF apply on each affected stack will be a no-op against those live values. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-10 19:56:16 +00:00
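
The free-space semantic described in fecfa211fd can be made concrete with a small Python sketch. It models only the percentage form of the threshold as the commit message describes it. The function name and the strict `<` comparison are assumptions, not pvc-autoresizer's actual code.

```python
def should_expand(capacity_bytes: int, available_bytes: int, threshold: str) -> bool:
    """Return True when free space is below `threshold` percent of capacity."""
    if not threshold.endswith("%"):
        raise ValueError("this sketch only handles percentage thresholds")
    percent = float(threshold[:-1])
    return available_bytes < capacity_bytes * percent / 100.0


GIB = 1024 ** 3

# A volume that is 25% used has 75% free.
# With "80%" it already counts as low on space; with "10%" it does not.
print(should_expand(100 * GIB, 75 * GIB, "80%"))
print(should_expand(100 * GIB, 75 * GIB, "10%"))
```

Read this way, "80%" triggers expansion at roughly 20% utilization, while "10%" waits until about 90% utilization, which is the behavior the commit restores.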
Viktor Barzin	efd28ccce5	anubis: fix 500 on multi-replica + roll out to 6 more public sites Browser visits to viktorbarzin.me started returning HTTP 500 with `store: key not found: "challenge:..."` in pod logs. Root cause: each Anubis pod stores in-flight challenges in process memory; with 2 replicas behind a ClusterIP, the PoW-solved request can be routed to a different pod than the one that issued the challenge. Anubis upstream documents the same caveat ("when running multiple instances on the same base domain, the key must be the same across all instances" — true for the ed25519 signing key, but the challenge store is still pod-local without a shared backend). Drop module default replicas: 2 → 1. Worst-case: ~1s cold-start on pod restart. Real fix (Redis-backed challenge store) noted as a follow-up in CLAUDE.md. Roll Anubis out to: f1-stream, cyberchef (cc), jsoncrack (json), privatebin (pb), homepage (home), real-estate-crawler (wrongmove UI only — `/api` ingress stays direct via path-based ingress carve-out so XHRs from the SPA bypass the challenge). End-state: 9 public hosts now Anubis-fronted (blog, www, kms, travel, f1, cc, json, pb, home, wrongmove). All return the challenge HTML to bare curl/browser; verified-IP search engines and /robots.txt + /.well-known still skip via the strict-policy allowlist.	2026-05-10 00:50:30 +00:00
Viktor Barzin	f48da84770	anubis: per-site PoW reverse proxy on blog + kms + travel-blog Adds modules/kubernetes/anubis_instance/ — a per-site reverse proxy instance pinned to ghcr.io/techarohq/anubis:v1.25.0. Each instance issues a 30-day JWT cookie scoped to viktorbarzin.me after a tiny proof-of-work (difficulty 2 ≈ 250 ms desktop / 700 ms mobile). The shared ed25519 signing key (Vault: secret/viktor → anubis_ed25519_key) makes a single solve good across every Anubis-fronted subdomain. Wired into blog (viktorbarzin.me + www), kms.viktorbarzin.me, and travel.viktorbarzin.me — each with anti_ai_scraping=false on the ingress so the redundant ai-bot-block forwardAuth is dropped from the chain. Skipped forgejo (Git/API clients can't solve PoW) and resume (replicas=0). Also tightens bot-block-proxy nginx timeouts (3s/5s → 100ms/200ms) so any ingress still using the ai-bot-block forwardAuth pays at most ~150 ms when poison-fountain is scaled down, instead of 3 s. End-to-end TTFB on viktorbarzin.me dropped from ~3.2 s to ~150-200 ms. Docs: .claude/reference/patterns.md "Anti-AI Scraping" updated to 4 layers; .claude/CLAUDE.md adds the Anubis usage paragraph and Forgejo/API caveat.	2026-05-10 00:06:21 +00:00
Viktor Barzin	d62a9dcda1	docs: PVC templates need lifecycle.ignore_changes for autoresizer The canonical proxmox-lvm and proxmox-lvm-encrypted PVC templates were missing `lifecycle { ignore_changes = [spec[0].resources[0].requests] }`. Without it, every PVC created from these templates becomes a drift bomb the moment pvc-autoresizer expands it: the next `tg apply` on that stack will try to shrink the PVC back to the TF-declared size, K8s rejects the shrink, and apply fails. This was latent because pvc-autoresizer was silently broken cluster-wide (commit `9d5da4d8` fixed it by allow-listing kubelet_volume_stats_available_bytes in Prometheus). Now that the autoresizer actually works, every existing proxmox-lvm/encrypted PVC without ignore_changes is at risk. Sweep needed (separate task): grep for kubernetes_persistent_volume_claim across stacks/ and add ignore_changes to any with resize.topolvm.io annotations.	2026-05-09 12:02:18 +00:00
Viktor Barzin	3148d15d5a	[forgejo] Phases 3+4+5: cutover, decommission, docs sweep End of forgejo-registry-consolidation. After Phase 0/1 already landed (Forgejo ready, dual-push CI, integrity probe, retention CronJob, images migrated via forgejo-migrate-orphan-images.sh), this commit flips everything off registry.viktorbarzin.me onto Forgejo and removes the legacy infrastructure. Phase 3 — image= flips: * infra/stacks/{payslip-ingest,job-hunter,claude-agent-service, fire-planner,freedify/factory,chrome-service,beads-server}/main.tf — image= now points to forgejo.viktorbarzin.me/viktor/<name>. * infra/stacks/claude-memory/main.tf — also moved off DockerHub (viktorbarzin/claude-memory-mcp:17 → forgejo.viktorbarzin.me/viktor/...). * infra/.woodpecker/{default,drift-detection}.yml — infra-ci pulled from Forgejo. build-ci-image.yml dual-pushes still until next build cycle confirms Forgejo as canonical. * /home/wizard/code/CLAUDE.md — claude-memory-mcp install URL updated. Phase 4 — decommission registry-private: * registry-credentials Secret: dropped registry.viktorbarzin.me / registry.viktorbarzin.me:5050 / 10.0.20.10:5050 auths entries. Forgejo entry is the only one left. * infra/stacks/infra/main.tf cloud-init: dropped containerd hosts.toml entries for registry.viktorbarzin.me + 10.0.20.10:5050. (Existing nodes already had the file removed manually by `setup-forgejo-containerd-mirror.sh` rollout — the cloud-init template only fires on new VM provision.) * infra/modules/docker-registry/docker-compose.yml: registry-private service block removed; nginx 5050 port mapping dropped. Pull-through caches for upstream registries (5000/5010/5020/5030/5040) stay on the VM permanently. * infra/modules/docker-registry/nginx_registry.conf: upstream `private` block + port 5050 server block removed. * infra/stacks/monitoring/modules/monitoring/main.tf: registry_integrity_probe + registry_probe_credentials resources stripped. forgejo_integrity_probe is the only manifest probe now.
Phase 5 — final docs sweep: * infra/docs/runbooks/registry-vm.md — VM scope reduced to pull-through caches; forgejo-registry-breakglass.md cross-ref added. * infra/docs/architecture/ci-cd.md — registry component table + diagram now reflect Forgejo. Pre-migration root-cause sentence preserved as historical context with a pointer to the design doc. * infra/docs/architecture/monitoring.md — Registry Integrity Probe row updated to point at the Forgejo probe. * infra/.claude/CLAUDE.md — Private registry section rewritten end-to-end (auth, retention, integrity, where the bake came from). * prometheus_chart_values.tpl — RegistryManifestIntegrityFailure alert annotation simplified now that only one registry is in scope. Operational follow-up (cannot be done from a TF apply): 1. ssh root@10.0.20.10 — edit /opt/registry/docker-compose.yml to match the new template AND `docker compose up -d --remove-orphans` to actually stop the registry-private container. Memory id=1078 confirms cloud-init won't redeploy on TF apply alone. 2. After 1 week of no incidents, `rm -rf /opt/registry/data/private/` on the VM (~2.6GB freed). 3. Open the dual-push step in build-ci-image.yml and drop registry.viktorbarzin.me:5050 from the `repo:` list — at that point the post-push integrity check at line 33-107 also needs to be repointed at Forgejo or removed (the per-build verify is redundant with the every-15min Forgejo probe). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-07 18:30:02 +00:00
Viktor Barzin	cd96fb64a8	phpipam-pfsense-import: every 5min → hourly Reduces 5-min disk-write spikes on PVE sdc. The cronjob was the heaviest single contributor in our hourly fan-out investigation (11.2 MB/s burst when it fired). Kea DDNS still handles real-time DNS auto-registration; phpIPAM inventory just lags by up to 1h, which we don't need fresher. Docs (dns.md, networking.md, .claude/CLAUDE.md) updated to match.	2026-04-26 22:48:43 +00:00
Viktor Barzin	7e34b67f24	[docs] Architecture docs: registry integrity probe, pin, new CI pipelines Bring the architecture set in line with what's actually deployed after today's registry reliability work (commits `7cb44d72` → `42961a5f`): - docs/architecture/ci-cd.md: expand Infra Pipelines table with build-ci-image (+ verify-integrity step), registry-config-sync, pve-nfs-exports-sync, postmortem-todos, drift-detection, issue-automation, provision-user. Note registry:2.8.3 pin + integrity probe in the image-registry flow section. - docs/architecture/monitoring.md: add Registry Integrity Probe to components table; add 3-alert section (Manifest Integrity Failure / Probe Stale / Catalog Inaccessible). - .claude/CLAUDE.md: one-line on the pin, auto-sync pipeline, and the revision-link-not-blob rule so the next agent knows the right check. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-19 17:51:26 +00:00

159 commits