The kured sentinel gate DaemonSet requires /sentinel to exist on
all nodes. Without it, kured pods get stuck in ContainerCreating
with hostPath mount failure. Previously created manually; now
provisioned automatically for new nodes.
Upstream v2.4.1 accidentally dropped idrac_power_supply_input_voltage from
the legacy RefreshPowerOld code path during a Huawei OEM support refactor.
Built a patched image that restores the single missing line:
mc.NewPowerSupplyInputVoltage(ch, psu.LineInputVoltage, id)
Image: viktorbarzin/idrac-redfish-exporter:2.4.1-voltage-fix
[ci skip]
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The 2Gi WAL tmpfs (medium: Memory) counts against the container's
memory limit. At 3Gi, Prometheus OOM-kills during WAL replay on
startup (heap + tmpfs > 3Gi). Reverting to 4Gi restores headroom.
- Add input validation: username regex + email format check in pipeline
- Quote variables in .provision-env to prevent shell injection
- Remove dead source command (each Woodpecker command is separate shell)
- Use jq to build JSON payloads (prevents injection via group names)
- Clean up git-crypt key on failure (use ; instead of &&)
- Add Kyverno ndots lifecycle ignore to webhook-handler deployment
Vault stack can't be applied in CI (git-crypt TLS certs + sensitive
for_each on k8s_users). Pipeline now automates Vault KV update +
Authentik group creation, then notifies admin to apply stacks manually.
This matches the existing pattern — vault is not in default.yml either.
Woodpecker performs compile-time substitution on ${...} patterns,
replacing pipeline variables with empty strings. Using $VAR without
braces lets the shell evaluate them at runtime.
- Add Transit mount + per-stack Transit keys to vault stack TF
- Auto-create sops-user-<name> policy scoping decrypt to owned stacks
- Auto-create sops-<name> external group + alias for Authentik mapping
- Add sops-admin policy to authentik-admins group
- Attach sops-user policy to namespace-owner identity entities
- Update add-user skill with SOPS onboarding steps and Authentik group
- Adding a user to k8s_users + applying vault stack = full SOPS access
[ci skip]
- Each stack gets its own Vault Transit key (transit/keys/sops-state-<stack>)
- state-sync passes per-stack Transit URI + age keys on encrypt
- Vault policies scope namespace-owners to their stacks only:
- sops-admin: wildcard access to all transit keys
- sops-user-<name>: access only to owned stack keys
- Anca (plotting-book) can only decrypt plotting-book state
- Admin can decrypt everything (via admin Transit policy or age fallback)
- External group sops-plotting-book maps Authentik group to Vault policy
- Updated CLAUDE.md with state sync documentation
- .sops.yaml: add hc_vault_transit_uri for transit/keys/sops-state
- state-sync: try Vault Transit first, fall back to age key on disk
- Re-encrypted all 101 state files with both Vault Transit + age
- Normal workflow: vault login → decrypt via Transit (no key files)
- Bootstrap/DR: age key at ~/.config/sops/age/keys.txt
Phase 3: all 27 platform modules now run as independent stacks.
Platform reduced to empty shell (outputs only) for backward compat
with 72 app stacks that declare dependency "platform".
Fixed technitium cross-module dashboard reference by copying file.
Woodpecker pipeline applies all 27+1 stacks in parallel via loop.
All applied with zero destroys.
Phase 2 of platform stack split. 5 more modules extracted into
independent stacks. All applied successfully with zero destroys.
Cloudflared now reads k8s_users from Vault directly to compute
user_domains. Woodpecker pipeline runs all 8 extracted stacks
in parallel. Memory bumped to 6Gi for 9 concurrent TF processes.
Platform reduced from 27 to 19 modules.
Phase 1 of platform stack split for parallel CI applies.
All 3 modules were fully independent (no cross-module refs).
State migrated via terraform state mv. All 3 stacks applied
with zero changes (dbaas had pre-existing ResourceQuota drift).
Woodpecker pipeline updated to run extracted stacks in parallel.
Vault DB engine rotates passwords weekly but 5 stacks baked passwords
at Terraform plan time, causing stale credentials until next apply.
- real-estate-crawler: add vault-database ESO, use secret_key_ref in 3 deployments
- nextcloud: switch Helm chart to existingSecret for DB password
- grafana: add vault-database ESO, use envFromSecrets in Helm values
- woodpecker: use extraSecretNamesForEnvFrom, remove plan-time data source chain
- affine: add vault-database ESO, use secret_key_ref in deployment + init container
Split monolithic orchestrator into triage (haiku), historian (sonnet),
and report-writer (opus) stages. Each stage gets its own tool budget.
Added sev-context.sh for structured cluster context gathering.