tuya-bridge: liveness probe hits /health so k8s restarts silently-hung bridge

The bridge was down for 10h 40m on 2026-04-22 without being restarted:
the liveness probe hit `/` (a trivial Flask handler), which kept passing
while the actual Tuya-cloud call path was stuck. /health now reports
Tuya cloud reachability via a background probe in the app, so point
both probes at it. Liveness: 60s initial delay, then 6 failures x 30s =
3min of 503s before restart; readiness: 2 failures x 15s = 30s before
removal from service.
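
The app-side change is not part of this diff, so the following is only a
minimal sketch of the background-probe pattern described above, not the
bridge's actual code. `HealthState`, `run_probe`, `start_probe`, and the
`check_cloud` callable are hypothetical names, and the staleness window is
an assumed parameter. A /health handler would return the result of
`state.status()`.

```python
import threading
import time


class HealthState:
    """Thread-safe record of the last successful upstream check."""

    def __init__(self, max_age_seconds, clock=time.monotonic):
        self._max_age = max_age_seconds  # assumed staleness window
        self._clock = clock
        self._lock = threading.Lock()
        self._last_ok = None  # no successful check yet

    def mark_ok(self):
        with self._lock:
            self._last_ok = self._clock()

    def status(self):
        """Return (http_status, body) for a /health handler."""
        with self._lock:
            last_ok = self._last_ok
        if last_ok is not None and self._clock() - last_ok <= self._max_age:
            return 200, "ok"
        return 503, "upstream check stale or never succeeded"


def run_probe(state, check_cloud, interval_seconds, stop_event):
    """Call check_cloud() periodically; only successes refresh the state."""
    while not stop_event.is_set():
        try:
            if check_cloud():  # hypothetical stand-in for the Tuya cloud call
                state.mark_ok()
        except Exception:
            pass  # a failed check simply lets the last success go stale
        stop_event.wait(interval_seconds)


def start_probe(state, check_cloud, interval_seconds=10.0):
    """Run the probe loop in a daemon thread; returns the stop event."""
    stop_event = threading.Event()
    thread = threading.Thread(
        target=run_probe,
        args=(state, check_cloud, interval_seconds, stop_event),
        daemon=True,
    )
    thread.start()
    return stop_event
```

Reporting health from the age of the last success, rather than from the
last result, means a `check_cloud` call that hangs forever still turns
/health into a 503 once the window expires, which is the failure mode
this commit is about.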

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Viktor Barzin 2026-04-23 07:47:41 +00:00
parent 8e55c4357a
commit 5ebd3a81c3

@@ -118,6 +118,26 @@ resource "kubernetes_deployment" "tuya-bridge" {
}
}
}
liveness_probe {
  http_get {
    path = "/health"
    port = 8080
  }
  initial_delay_seconds = 60
  period_seconds        = 30
  timeout_seconds       = 5
  failure_threshold     = 6
}
readiness_probe {
  http_get {
    path = "/health"
    port = 8080
  }
  initial_delay_seconds = 10
  period_seconds        = 15
  timeout_seconds       = 5
  failure_threshold     = 2
}
resources {
  requests = {
    cpu = "10m"
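
The timing figures in the commit message follow directly from the probe
settings above. A small sketch of that arithmetic, using the values from
this hunk; `failure_window_seconds` is a hypothetical helper, and the
result is approximate because it ignores the 5s probe timeout and where
in the probe period the hang begins:

```python
def failure_window_seconds(period_seconds, failure_threshold):
    """Approximate time spent failing before Kubernetes acts on the probe."""
    return period_seconds * failure_threshold


# Values copied from the liveness_probe and readiness_probe blocks.
LIVENESS = {"initial_delay_seconds": 60, "period_seconds": 30, "failure_threshold": 6}
READINESS = {"initial_delay_seconds": 10, "period_seconds": 15, "failure_threshold": 2}

liveness_window = failure_window_seconds(
    LIVENESS["period_seconds"], LIVENESS["failure_threshold"]
)
readiness_window = failure_window_seconds(
    READINESS["period_seconds"], READINESS["failure_threshold"]
)

print(liveness_window)   # 180 seconds (3min) of failures before restart
print(readiness_window)  # 30 seconds of failures before removal from service
```

The initial delay applies only after a container starts, so a bridge that
hangs mid-run is restarted after roughly the 180s window alone.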