k8s-version-upgrade: auto-restore apiserver OIDC after control-plane bumps

kubeadm upgrade apply regenerates the apiserver static-pod manifest and drops
the --authentication-config flag, silently breaking SSO (kubectl/kubelogin + the
k8s dashboard) until someone manually re-applies the rbac stack. That manual step
was needed after every control-plane upgrade and was the one thing keeping
autonomous patch upgrades from being truly hands-off (it bit us this cycle: an
earlier master bump left SSO broken until we noticed).

Automate it: the rbac stack now publishes its existing OIDC restore script (the
same one its null_resource runs) to a kube-system/apiserver-oidc-restore
ConfigMap, and the upgrade chain's phase_master re-runs it on master right after
the kubeadm upgrade, while tigera-operator is still quiesced, so the apiserver
restart caused by re-adding the flag can't crashloop it. The script is
idempotent and health-gates on /livez with auto-rollback; the step is non-fatal
(a failure only leaves SSO broken until the next rbac apply; it won't abort the
upgrade). phase_master already self-skips when master is at target, so this only
fires when master was actually upgraded.

The chain SA gets a get permission scoped by name to that one ConfigMap. Runbook
updated: the manual restore is now a documented fallback (command corrected: it
needs -replace, since the null_resource trigger hash does not change when kubeadm
regenerates the manifest, so a plain apply would not re-run the script).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Viktor Barzin 2026-06-19 06:04:30 +00:00
parent 48b63ffa6f
commit 077ac97df5
4 changed files with 80 additions and 10 deletions

@@ -158,3 +158,22 @@ resource "null_resource" "apiserver_oidc_config" {
    auth_config = sha256(local.apiserver_auth_config_yaml)
  }
}
# Publish the restore script to a ConfigMap so the k8s-version-upgrade chain can
# re-apply apiserver OIDC on master immediately after a `kubeadm upgrade` (which
# regenerates the apiserver manifest and drops --authentication-config, breaking
# SSO). This is the SAME script the null_resource above runs over SSH, so the
# rbac stack stays the single source of truth; the chain just re-runs it
# post-upgrade (phase_master in
# stacks/k8s-version-upgrade/scripts/upgrade-step.sh) instead of waiting for a
# manual `tg apply`. Content is config (issuer URLs + claim mappings), not
# secrets, so a ConfigMap is appropriate.
resource "kubernetes_config_map_v1" "apiserver_oidc_restore" {
  metadata {
    name      = "apiserver-oidc-restore"
    namespace = "kube-system"
  }
  data = {
    "restore.sh" = local.apiserver_auth_remote_script
  }
}
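
The commit message mentions that the chain's service account gets a name-scoped get on this ConfigMap, but that grant is not part of this hunk. A minimal sketch of what such a grant could look like with the Terraform kubernetes provider follows. The Role and RoleBinding names and the ServiceAccount name and namespace are placeholders assumed for illustration, not taken from the commit.

```hcl
# Sketch only: resource names and the ServiceAccount identity below are
# hypothetical. Only the ConfigMap name and namespace come from the diff above.
resource "kubernetes_role_v1" "oidc_restore_reader" {
  metadata {
    name      = "apiserver-oidc-restore-reader" # assumed name
    namespace = "kube-system"
  }

  rule {
    api_groups     = [""] # core API group, where ConfigMaps live
    resources      = ["configmaps"]
    resource_names = ["apiserver-oidc-restore"] # restrict to this one object
    verbs          = ["get"]
  }
}

resource "kubernetes_role_binding_v1" "oidc_restore_reader" {
  metadata {
    name      = "apiserver-oidc-restore-reader" # assumed name
    namespace = "kube-system"
  }

  role_ref {
    api_group = "rbac.authorization.k8s.io"
    kind      = "Role"
    name      = kubernetes_role_v1.oidc_restore_reader.metadata[0].name
  }

  subject {
    kind      = "ServiceAccount"
    name      = "k8s-version-upgrade" # assumed ServiceAccount name
    namespace = "k8s-version-upgrade" # assumed namespace
  }
}
```

Because only `get` is granted and the rule lists a single resource name, the chain would have to fetch the ConfigMap by its exact name; it could not list or read other ConfigMaps in kube-system through this Role.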