t3: session-auth detection for the gated nightly tracker (dispatch fallback logging + Loki alerts)
Some checks failed
ci/woodpecker/push/default Pipeline failed
Before auto-tracking t3 nightly builds (Viktor's call, risk accepted), stand up the detection that was missing on 2026-06-09, when an auto-pulled nightly broke pairing for ALL users and nothing alerted. Viktor's explicit requirement: make sure session auth keeps working and revert if the pairing fallback/failure rate climbs. This is phase 0 (detection) of that work.

- t3-dispatch: exchangeCredential now reports WHICH pairing endpoint answered, and autoPair logs every outcome (paired user=.. endpoint=.. fallback=..), so the real-user browser-session->bootstrap fallback rate is observable. A non-zero rate flags that a build moved the pairing API (the 2026-06-09 class).
- Loki ruler alerts (devvm journal -> Alertmanager -> Slack): T3PairingBroken (real users failing to pair), T3PairFallbackHigh (build moved the pairing API), T3AutoUpdateRolledBack / RollbackFailed / Frozen (enforcer outcomes).

Closes the post-mortem's open "nothing monitors end-to-end pairing" detection gap. The existing t3-probe only checks GET /api/auth/session==200, which stays 200 even when pairing is dead, so it never caught the outage class.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
parent e783cae2cb
commit 994d305d04
3 changed files with 132 additions and 8 deletions
@@ -262,6 +262,42 @@ func TestAutoPairAcrossVersions(t *testing.T) {
	}
}

// TestExchangeCredentialReportsEndpoint: exchangeCredential must report WHICH
// pairing endpoint accepted the credential, so the dispatch can log it and we
// can alert on the browser-session -> bootstrap fallback rate (a non-zero rate
// means the running t3 build moved/renamed the pairing API: contract drift, the
// 2026-06-09 failure class). fallback = endpoint is not the first-preference one.
func TestExchangeCredentialReportsEndpoint(t *testing.T) {
	for _, tc := range []struct {
		name, pairPath, wantEP string
		wantFallback           bool
	}{
		{"0.0.25 browser-session (primary)", "/api/auth/browser-session", "/api/auth/browser-session", false},
		{"0.0.24 bootstrap (fallback)", "/api/auth/bootstrap", "/api/auth/bootstrap", true},
	} {
		t.Run(tc.name, func(t *testing.T) {
			var hit string
			ts := pairInstance(tc.pairPath, &hit)
			defer ts.Close()

			resp, ep, err := exchangeCredential(portOf(t, ts), "tok")
			if err != nil {
				t.Fatalf("exchangeCredential: %v", err)
			}
			defer resp.Body.Close()
			if resp.StatusCode != http.StatusOK {
				t.Fatalf("status = %d, want 200", resp.StatusCode)
			}
			if ep != tc.wantEP {
				t.Fatalf("endpoint = %q, want %q", ep, tc.wantEP)
			}
			if gotFallback := ep != pairEndpoints[0]; gotFallback != tc.wantFallback {
				t.Fatalf("fallback = %v, want %v", gotFallback, tc.wantFallback)
			}
		})
	}
}

func TestProbeHealthz(t *testing.T) {
	mux := http.NewServeMux()
	registerProbe(mux)