2129 commits

Author SHA1 Message Date
Viktor Barzin
252b65a574 fix: increase memory limits for OOMKilled pods (immich, clickhouse, speedtest)
- immich-server: limits 1700Mi → 2500Mi (70 restarts from media processing spikes)
- clickhouse: limits 1Gi → 1536Mi, max_server_memory_usage 800Mi → 1200Mi
- speedtest: limits 256Mi → 512Mi, requests 256Mi → 128Mi (daily OOM during test)
2026-03-27 13:57:16 +02:00
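Note: the arithmetic behind the limit changes in 252b65a574 can be checked with a small sketch. The helper names below (`to_mib`, `ratio`) are hypothetical and not part of the repo; only the quantities come from the commit message. Only the `Mi` and `Gi` suffixes are handled.

```python
# Minimal sketch: compare old and new memory settings from the commit message.
# to_mib() and ratio() are illustrative helpers, not code from the repository.

UNITS_IN_MIB = {"Mi": 1, "Gi": 1024}


def to_mib(quantity: str) -> int:
    """Convert a Kubernetes-style quantity such as '1536Mi' or '1Gi' to MiB."""
    for suffix, factor in UNITS_IN_MIB.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    raise ValueError(f"unsupported quantity: {quantity!r}")


def ratio(part: str, whole: str) -> float:
    """Return part/whole after converting both quantities to MiB."""
    return to_mib(part) / to_mib(whole)


# clickhouse: share of the container limit given to max_server_memory_usage
clickhouse_before = ratio("800Mi", "1Gi")
clickhouse_after = ratio("1200Mi", "1536Mi")

# Growth factor of each container limit (new limit / old limit)
immich_growth = ratio("2500Mi", "1700Mi")
speedtest_growth = ratio("512Mi", "256Mi")
```

With these numbers, clickhouse keeps the same share of its limit for `max_server_memory_usage` before and after the change, and the speedtest limit doubles.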
Viktor Barzin
399f0e2bd0 state(rybbit): update encrypted state 2026-03-27 13:56:54 +02:00
Viktor Barzin
44a1c3a155 state(immich): update encrypted state 2026-03-27 13:54:19 +02:00
Viktor Barzin
e23202399e state(speedtest): update encrypted state 2026-03-27 13:54:09 +02:00
Viktor Barzin
1ec480e5fa novelapp: grant vabbit81 (Gheorghe) admin RBAC on novelapp namespace 2026-03-26 17:34:48 +02:00
Viktor Barzin
2dc27ca128 state(novelapp): update encrypted state 2026-03-26 17:34:44 +02:00
Viktor Barzin
64d1a3bd24 state(woodpecker): update encrypted state 2026-03-26 17:34:18 +02:00
Viktor Barzin
e774d486fd state(rbac): add vabbit81 RBAC resources 2026-03-26 17:33:16 +02:00
Viktor Barzin
e65647edb4 state(vault): add vabbit81 user resources 2026-03-26 17:32:34 +02:00
Viktor Barzin
4e8d087b24 state(novelapp): update encrypted state 2026-03-26 17:23:34 +02:00
Viktor Barzin
5e6e71e727 novelapp: add NextAuth + Google OAuth env vars
Replace AUTH_SECRET with NEXTAUTH_URL, NEXTAUTH_SECRET, GOOGLE_CLIENT_ID,
and GOOGLE_CLIENT_SECRET for Google OAuth integration.
2026-03-26 17:14:16 +02:00
Viktor Barzin
70ea01fb6e vault: increase k8s auth token TTLs and add periodic renewal
Stagger token periods across roles (7d/8d/9d/10d) to prevent
bulk lease revocation storms that caused transient 504s.
Periodic tokens auto-renew indefinitely, eliminating mass expiry.
2026-03-26 12:21:47 +02:00
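Note: a rough illustration of why staggering the periods in 70ea01fb6e helps. This is a sketch under assumptions, not the Vault configuration itself: the role names are hypothetical (the commit only gives the periods 7d/8d/9d/10d), and it models the simplified worst case where tokens for every role are issued at the same moment and never renewed.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical role names; only the period values come from the commit message.
STAGGERED_PERIODS_DAYS = {"role-a": 7, "role-b": 8, "role-c": 9, "role-d": 10}
UNIFORM_PERIODS_DAYS = {role: 7 for role in STAGGERED_PERIODS_DAYS}


def expiries(issued_at, periods_days):
    """Expiry time per role if a token issued at issued_at is never renewed."""
    return {
        role: issued_at + timedelta(days=days)
        for role, days in periods_days.items()
    }


def max_simultaneous_expiries(expiry_by_role):
    """Largest number of roles whose tokens expire at the same instant."""
    counts = {}
    for expiry in expiry_by_role.values():
        counts[expiry] = counts.get(expiry, 0) + 1
    return max(counts.values())


issued = datetime(2026, 3, 26, tzinfo=timezone.utc)
uniform_burst = max_simultaneous_expiries(expiries(issued, UNIFORM_PERIODS_DAYS))
staggered_burst = max_simultaneous_expiries(expiries(issued, STAGGERED_PERIODS_DAYS))
```

In this model a uniform period makes all four roles expire together, while the staggered periods spread the expiries over separate days.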
Viktor Barzin
b6ac68d7f2 state(vault): update encrypted state 2026-03-26 12:21:23 +02:00
Viktor Barzin
b8a5740138 reduce alert noise: remove 4 memory alerts, raise latency threshold [ci skip]
- Remove ClusterMemoryRequestsHigh, ContainerNearOOM, NodeLowFreeMemory,
  NodeMemoryPressureTrending — all fire regularly due to intentional
  memory overcommit and are not actionable
- Keep ContainerOOMKilled (actionable — container actually died)
- Raise HighServiceLatency p99 threshold from 10s to 30s to ignore
  transient spikes
2026-03-26 01:15:18 +02:00
Viktor Barzin
2445edea8f state(freedify): update encrypted state 2026-03-26 01:13:29 +02:00
Viktor Barzin
30d58bc4c8 state(freedify): update encrypted state 2026-03-26 01:11:16 +02:00
Viktor Barzin
9e99c14a77 state(freedify): update encrypted state 2026-03-26 00:36:47 +02:00
Viktor Barzin
9bc37bf257 state(freedify): update encrypted state 2026-03-26 00:15:49 +02:00
Viktor Barzin
c732e92613 state(reverse-proxy): update encrypted state 2026-03-26 00:07:46 +02:00
Viktor Barzin
074a2fceec state(reverse-proxy): update encrypted state 2026-03-26 00:07:41 +02:00
Viktor Barzin
50a3f81261 state(freedify): update encrypted state 2026-03-25 23:58:01 +02:00
Viktor Barzin
4e74f816bc cleanup: remove calibre and audiobookshelf stacks after ebooks migration [ci skip]
Both services migrated to unified ebooks namespace. Remove:
- Old stack directories and Terraform state
- calibre references from monitoring namespace lists
- calibre/audiobookshelf from operational scripts
2026-03-25 23:56:07 +02:00
Viktor Barzin
809e2a7624 state(audiobookshelf): update encrypted state 2026-03-25 23:54:55 +02:00
Viktor Barzin
60e83526ec state(calibre): update encrypted state 2026-03-25 23:54:38 +02:00
Viktor Barzin
3eb0418595 state(freedify): update encrypted state 2026-03-25 23:53:31 +02:00
Viktor Barzin
57d31de5a5 state(freedify): update encrypted state 2026-03-25 23:50:51 +02:00
Viktor Barzin
f49776aec9 state(freedify): update encrypted state 2026-03-25 23:41:15 +02:00
Viktor Barzin
95e49134ae cleanup: remove old audiobook-search, superseded by book-search
- Delete servarr/audiobook-search TF module (moved to ebooks/book-search)
- Remove audiobook-search from cloudflare_proxied_names
- Remove commented-out module reference in servarr/main.tf
- Clean up "renamed from" comment in ebooks/main.tf
- K8s resources (deploy/svc/ingress) deleted from servarr namespace
- Cloudflare DNS record already absent
- Import book-search and insta2spotify DNS records into cloudflared state
2026-03-25 23:16:01 +02:00
Viktor Barzin
97c789510e state(freedify): update encrypted state 2026-03-25 23:14:44 +02:00
Viktor Barzin
b731af1b91 state(platform): update encrypted state 2026-03-25 23:10:10 +02:00
Viktor Barzin
111201796b state(freedify): update encrypted state 2026-03-25 23:05:32 +02:00
Viktor Barzin
fe27709fd4 fix email monitor: use internal URL for Uptime Kuma push
Pods can't reach uptime.viktorbarzin.me externally. Switch to
http://uptime-kuma.uptime-kuma.svc.cluster.local for the push endpoint.
2026-03-25 22:59:26 +02:00
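Note: the internal URL in fe27709fd4 follows the standard Kubernetes Service DNS form `<service>.<namespace>.svc.<cluster-domain>`. A minimal sketch of building it; the helper `service_url` is hypothetical, `cluster.local` is assumed as the cluster domain, and the push path and token are deliberately left out because the commit does not state them.

```python
def service_url(service: str, namespace: str, scheme: str = "http",
                cluster_domain: str = "cluster.local") -> str:
    """Build the in-cluster base URL for a Kubernetes Service."""
    return f"{scheme}://{service}.{namespace}.svc.{cluster_domain}"


# Service and namespace are both named uptime-kuma in the commit's URL.
uptime_kuma_base = service_url("uptime-kuma", "uptime-kuma")
```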
Viktor Barzin
a48149ff0d state(mailserver): update encrypted state 2026-03-25 22:58:35 +02:00
Viktor Barzin
b877ff18fc state(freedify): update encrypted state 2026-03-25 22:52:08 +02:00
Viktor Barzin
78dec8f0ad add e2e email roundtrip monitoring
CronJob (every 30 min) sends test email via Mailgun API to
smoke-test@viktorbarzin.me, verifies IMAP delivery in spam@ catch-all,
deletes test email, pushes metrics to Pushgateway + Uptime Kuma.

Prometheus alerts: EmailRoundtripFailing, EmailRoundtripStale,
EmailRoundtripNeverRun. Uptime Kuma: SMTP/IMAP port checks + E2E push.
2026-03-25 22:50:22 +02:00
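Note: the control flow described in 78dec8f0ad (send, verify delivery, delete, push metrics) can be sketched as below. This is not the CronJob's actual script. The Mailgun, IMAP, Pushgateway and Uptime Kuma calls are replaced by caller-supplied functions (`send`, `find`, `delete`, `push`), and the metric names, timeout and poll interval are assumptions made for illustration.

```python
import time
import uuid


def roundtrip_check(send, find, delete, push, timeout_s=120.0, poll_s=5.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Run one email roundtrip probe and report the result.

    send(token)   -> sends a test email carrying the unique token
    find(token)   -> returns a message id once the email is delivered, else None
    delete(msgid) -> removes the delivered test email
    push(metrics) -> publishes the metrics dict to the monitoring backends
    """
    token = uuid.uuid4().hex  # unique marker so stale test emails are never matched
    start = clock()
    send(token)

    # Poll the mailbox until the message shows up or the timeout elapses.
    message_id = None
    while clock() - start < timeout_s:
        message_id = find(token)
        if message_id is not None:
            break
        sleep(poll_s)

    success = message_id is not None
    if success:
        delete(message_id)  # clean up so the catch-all mailbox does not fill up

    # Metric names are hypothetical, chosen only for this sketch.
    metrics = {
        "email_roundtrip_success": 1 if success else 0,
        "email_roundtrip_duration_seconds": clock() - start,
    }
    push(metrics)
    return metrics
```

Metrics are pushed on failure as well as on success, which lets alerting distinguish a failing roundtrip from a probe that has gone stale or never run.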
Viktor Barzin
b9c2d7c1f6 state(freedify): update encrypted state 2026-03-25 22:24:39 +02:00
Viktor Barzin
49de96a0c1 state(mailserver): update encrypted state 2026-03-25 22:20:02 +02:00
Viktor Barzin
d1036de313 state(mailserver): update encrypted state 2026-03-25 22:16:06 +02:00
Viktor Barzin
a08b1e8384 state(freedify): update encrypted state 2026-03-25 22:15:24 +02:00
Viktor Barzin
f33940cbce state(mailserver): update encrypted state 2026-03-25 22:10:26 +02:00
Viktor Barzin
26ab7acbda state(mailserver): update encrypted state 2026-03-25 22:08:50 +02:00
Viktor Barzin
3adaf88f62 add MAM_ID env var to book-search deployment [ci skip] 2026-03-25 15:52:24 +02:00
Viktor Barzin
946ea9e1f3 fix ebooks stack: prefix PV names, add book-search DNS, add secrets symlink [ci skip] 2026-03-25 15:14:08 +02:00
Viktor Barzin
8be9e765dc state(platform): update encrypted state 2026-03-25 15:09:00 +02:00
Viktor Barzin
6e1d8c0c8b add ebooks stack: consolidate book services into single namespace [ci skip]
- New ebooks namespace with CWA, Stacks, Audiobookshelf, book-search
- book-search (renamed from audiobook-search) with CWA ingest volume
- Comment out audiobook_search module from servarr
- All NFS volumes and secrets consolidated
2026-03-25 15:04:27 +02:00
Viktor Barzin
14bbab3041 state(servarr): update encrypted state 2026-03-25 14:23:00 +02:00
Viktor Barzin
5d23e68f9d state(servarr): update encrypted state 2026-03-25 14:19:44 +02:00
Viktor Barzin
1ce8b3d899 remove setup_tls_secret from insta2spotify (Kyverno auto-syncs) 2026-03-25 13:44:34 +02:00
Viktor Barzin
fe109d9f96 add homepage auto-discovery documentation [ci skip] 2026-03-25 13:06:43 +02:00
Viktor Barzin
6dda15afa0 add insta2spotify stack: namespace, ESO, NFS, 2-container deploy, split ingress
- Namespace insta2spotify (tier 4-aux)
- ExternalSecret from Vault secret/insta2spotify
- NFS volume at /mnt/main/insta2spotify for SQLite + Spotify cache
- Frontend (128Mi) + backend (512Mi req / 2Gi limit) in one pod
- Split ingress: protected (Authentik) for frontend, unprotected for /api/*
- DNS via Cloudflare (proxied)
2026-03-25 13:03:35 +02:00