fix(authentik): long-lived social-login sessions + shield auth from CrowdSec lockout

Viktor's passkeys all vanished and he was suddenly being asked to log in
multiple times a day instead of ~monthly. Root cause: on 2026-06-18 an ad-hoc
tripit passkey E2E test (run from the devvm as akadmin via python-httpx) cleaned
up "the demo user's" passkeys with GET /core/users/?search={demo} then DELETE
each device of users[0] — but the fuzzy search returned the REAL account, so it
wiped all 6 real passkeys. Losing passkeys forced fallback to Google login, and
the social-login stage (default-source-authentication-login) had the provider
default session_duration=seconds=0, which falls back to UNAUTHENTICATED_AGE=2h —
hence the constant re-logins. (Password + passkey logins were already weeks=4.)
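
A minimal sketch of a safer cleanup guard, assuming the search results are dicts with a `username` field and that the demo account's exact username is known. The helper name `pick_exact_user` and the sample usernames are hypothetical. The idea is to act only on a single exact match instead of trusting `users[0]` from a fuzzy `?search=`:

```python
from typing import Optional


def pick_exact_user(results: list[dict], username: str) -> Optional[dict]:
    """Return the single user whose username matches exactly, else None.

    `results` is the list returned by a fuzzy search; it may contain
    unrelated accounts whose name or email merely contains the search term.
    """
    exact = [u for u in results if u.get("username") == username]
    # Refuse to act on zero or multiple matches: deleting devices is destructive.
    return exact[0] if len(exact) == 1 else None


# Hypothetical search results: the real account sorts first, as in the incident.
results = [
    {"pk": 1, "username": "viktor"},
    {"pk": 42, "username": "viktor-demo"},
]

target = pick_exact_user(results, "viktor-demo")
print(target)  # the demo user, not results[0]
```

A cleanup script would skip the DELETE calls entirely when the helper returns `None`.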

Changes:
- authentik: adopt default-source-authentication-login into Terraform (import)
  and pin session_duration=weeks=4, so Google/GitHub/Facebook logins last as long
  as password/passkey. Immediate relief without re-enrolling.
- authentik: document the provider-schema gotcha — authentik_stage_identification
  exposes no webauthn_stage / enable_remember_me attribute, so they must NOT be in
  ignore_changes (commit 4e882989 removed them for this reason; re-adding breaks
  every apply). The passkey break was purely the missing device records, not drift.
- edge (rybbit): shield auth so a CrowdSec hit can never wall a user out of login —
  carve authentik.viktorbarzin.me + public-auth out of the zone WAF block rule,
  make the LAPI->edge sync ban-only (stop downgrading captcha to a hard block),
  and set exclude_crowdsec on the Authentik UI ingress (auth keeps rate-limiting).
- docs: record the session-duration change, the edge enforcement + auth carve-out
  (previously undocumented), and the pre-existing broken crowdsec-cf-sync CronJob
  (CF cursor pagination 400 + ~31k IPs vs list capacity -> edge list inert).
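
The ban-only sync can be pictured as a filter over LAPI decisions. This is a sketch, not the CronJob's actual code: it assumes each decision is a dict carrying `type`, `scope` and `value` fields, and `ban_only_ips` is a hypothetical helper name.

```python
def ban_only_ips(decisions: list[dict]) -> list[str]:
    """Collect IPs that have a 'ban' decision.

    Captcha decisions are skipped, so they are never escalated to a
    hard block at the edge.
    """
    ips = {
        d["value"]
        for d in decisions
        if d.get("type") == "ban" and d.get("scope", "").lower() == "ip"
    }
    return sorted(ips)


# Hypothetical decisions using documentation-range addresses.
decisions = [
    {"type": "ban", "scope": "Ip", "value": "203.0.113.7"},
    {"type": "captcha", "scope": "Ip", "value": "198.51.100.9"},
    {"type": "ban", "scope": "Ip", "value": "203.0.113.7"},  # duplicate
]
print(ban_only_ips(decisions))  # ['203.0.113.7']
```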

Passkey re-enrollment is a manual user action (devices are gone from the DB);
nothing auto-re-deletes them.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Viktor Barzin 2026-06-20 23:40:22 +00:00
parent 600f1f933c
commit 46166c63b2
6 changed files with 119 additions and 20 deletions

@@ -206,6 +206,36 @@ resource "authentik_stage_user_login" "default_login" {
}
}
# -----------------------------------------------------------------------------
# Source (social-login) User Login stage bound to default-source-authentication-flow.
# Adopted into Terraform 2026-06-20: its session_duration was the provider default
# "seconds=0", which falls back to AUTHENTIK_SESSIONS__UNAUTHENTICATED_AGE (hours=2).
# So Google/GitHub/Facebook logins expired every 2h while password and passkey
# logins (default-authentication-login) lasted weeks=4. After the 2026-06-18 passkey
# wipe forced fallback to Google login, this 2h cap became the "re-login multiple
# times daily" symptom. Pinning weeks=4 makes every login path consistent.
# See docs/architecture/authentication.md.
# -----------------------------------------------------------------------------
import {
  to = authentik_stage_user_login.default_source_login
  id = "4c6977d2-eaae-4033-b1db-21b48c6b47f0"
}

resource "authentik_stage_user_login" "default_source_login" {
  name             = "default-source-authentication-login"
  session_duration = "weeks=4"

  lifecycle {
    # Pin only session_duration; everything else stays UI-managed (same pattern
    # as authentik_stage_user_login.default_login above).
    ignore_changes = [
      remember_me_offset,
      terminate_other_sessions,
      geoip_binding,
      network_binding,
    ]
  }
}
# -----------------------------------------------------------------------------
# Default Identification stage adopted 2026-06-10 to embed the password
# field on the identification screen (single-screen login: one round trip and
@@ -227,6 +257,13 @@ resource "authentik_stage_identification" "default_identification" {
lifecycle {
# Pin only password_stage; everything else stays UI-managed (same pattern
# as authentik_stage_user_login.default_login above).
# NOTE: do NOT add webauthn_stage / enable_remember_me here: the pinned
# authentik TF provider's authentik_stage_identification resource exposes no
# such attributes (verified 2026-06-20: `tg plan` => "Unsupported attribute").
# They exist on Authentik's IdentificationStage *model* but not in the
# provider schema, so Terraform never manages or nulls them; they are purely
# UI/app-managed and need no ignore_changes entry. Commit 4e882989 removed
# them for exactly this reason; re-adding them breaks every apply.
ignore_changes = [
user_fields,
case_insensitive_matching,

@@ -82,6 +82,13 @@ module "ingress" {
service_name = "goauthentik-server"
tls_secret_name = var.tls_secret_name
anti_ai_scraping = false
# Never let the in-cluster CrowdSec bouncer serve a Turnstile/captcha
# interstitial or 403 on Authentik's own login + WebAuthn XHR endpoints: that
# would wall users out of the very gate they authenticate through (a CrowdSec
# hit would break the passkey ceremony / session refresh mid-flow). Auth keeps
# Traefik rate-limiting; the Cloudflare edge WAF also carves out this host
# (stacks/rybbit/crowdsec_edge.tf). 2026-06-20.
exclude_crowdsec = true
extra_annotations = {
"gethomepage.dev/enabled" = "true"
"gethomepage.dev/name" = "Authentik"