Compare commits

18 commits

Author SHA1 Message Date
Viktor Barzin
01351e4ce2 tripit: deploy stack + DB provisioning + ongoing mail-ingest [ci skip]
- stacks/tripit: namespace, ESO (vault-kv + vault-database), Deployment
  (alembic init + app), Service, NFS document PVC, ingress (Authentik
  forward-auth) + /api/calendar carve-out (auth=none, HMAC-token gated),
  and 3 worker CronJobs. ingest-mail is live: real IMAP (me@, read-only
  BODY.PEEK, recent-30) + local LLM (qwen3vl-4b on llama-swap), idempotent
  (skips seen message_ids), owner me@viktorbarzin.me.
- stacks/dbaas: create CNPG role+db `tripit`.
- stacks/vault: pg-tripit static role (7d rotation) + allowed_roles entry.

Deployed at tripit.viktorbarzin.me. [ci skip]: stacks were applied
out-of-band via scripts/tg this session; a CI re-apply would also apply
unrelated pre-existing dbaas/vault drift (MySQL StatefulSet, vault OIDC).

Refs: code-bb9g, code-muqi

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-30 10:23:11 +00:00
Viktor Barzin
e9046e5a26 traefik+pfsense: real IPv6 client IPs via HAProxy PROXY-v2 bridge
Replace the pfSense socat IPv6 forwarder (which masked every IPv6 client
as 10.0.20.1) with a standalone HAProxy bridge using send-proxy-v2, so
real IPv6 client IPs reach Traefik/CrowdSec. Traefik now trusts PROXY-v2
only from 10.0.20.1 on the web/websecure entrypoints; real IPv4 clients
(ETP=Local, own source IP) are unaffected. Mail-over-IPv6 routed through
the mail NodePorts (send-proxy-v2) too. Bridge is TCP/h2 only (no QUIC
over IPv6). Persistence on pfSense: rc.d/ipv6proxy + ipv6_proxy.sh
(config.xml shellcmd), keeping the nginx-off-[::] patch.

Also fixes stale networking.md: Traefik was still documented on the
shared .200; it moved to dedicated .203/ETP=Local on 2026-05-30.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-30 09:51:23 +00:00
Viktor Barzin
16c9aafafa docs: Traefik dedicated-IP + ETP=Local cutover SUCCEEDED (attempt 2)
Records the successful cutover and the key fix that made it safe: decouple
cloudflared from the LB IP first (point its tunnel ingress at the in-cluster
Traefik Service), so moving Traefik 10.0.20.200 -> 10.0.20.203 no longer
breaks proxied apps or Vault's ingress. Updates infra CLAUDE.md Networking
notes with the new Traefik LB IP / ETP=Local / cloudflared->ClusterIP state.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-30 08:12:57 +00:00
Viktor Barzin
0c01adac95 traefik: dedicate LB IP 10.0.20.203 + externalTrafficPolicy=Local
Gives direct (non-proxied) apps real client IPs for CrowdSec (were SNAT'd to
the node IP under ETP=Cluster) and working QUIC. Companion change (NOT in TF —
remote cloudflared tunnel config, done via CF API): tunnel ingress repointed
from https://10.0.20.200:443 to https://traefik.traefik.svc.cluster.local:443
so proxied apps are decoupled from the LB IP. pfSense 443 NAT -> traefik_lb
alias (.203). See docs/plans/2026-05-30-traefik-dedicated-ip-etp-local-*.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-30 08:09:37 +00:00
Viktor Barzin
d6a61f00ad state(vault): update encrypted state 2026-05-30 07:59:28 +00:00
Viktor Barzin
aceee34889 state(dbaas): update encrypted state 2026-05-30 07:55:42 +00:00
Viktor Barzin
1473a94f29 docs/plans: Traefik dedicated-IP cutover attempt 1 post-mortem (rolled back)
Attempt rolled back to .200 baseline. Root blocker: cloudflared is a
token/dashboard-managed tunnel whose ingress targets the Traefik LB IP
(10.0.20.200), so moving Traefik to .203 took down all proxied apps. Retry
must also repoint the tunnel ingress (Cloudflare API). Also documents the
vault-ingress circular dep, SIGPIPE->stuck PG state-lock gotcha, and the
ETP=Local hairpin caveat.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-30 01:27:29 +00:00
Viktor Barzin
09a0c1fad4 docs/plans: Traefik dedicated IP + ETP=Local migration (design + plan)
Move Traefik off shared MetalLB IP 10.0.20.200 to a dedicated 10.0.20.203
with externalTrafficPolicy=Local, to (1) restore real client IPs for CrowdSec
on the 24 non-proxied apps (currently SNAT'd to a node IP) and (2) enable QUIC.
Forced off the shared IP because MetalLB forbids mixed ETP on a shared IP
(10.0.20.200 also carries the Terraform state DB). In-place cutover selected.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-30 00:27:04 +00:00
Viktor Barzin
0f26bf030b kyverno: exclude postiz namespace from Keel auto-update injection
Postiz was generating hourly Slack spam and a wedged rollout, both
Keel-driven:
- Bundled redis StatefulSets run docker.io/bitnamilegacy/redis; Keel
  tried 7.4.0->7.4.1/7.4.2 every poll but require-trusted-registries
  denies bitnamilegacy/* (only bitnami/* allowlisted) -> endless
  deny/retry/Slack-ping loop.
- Keel bumped postiz-app v2.21.7->v2.21.8 on 2026-05-26; the surge pod
  couldn't schedule under the 3Gi tier-4-aux quota, wedging the rollout
  for 3 days.

postiz Terraform state is heavily drifted (~2/30 resources tracked), so
per-workload opt-out can't be applied from the postiz stack. Durable
guard is here (clean kyverno state). Operational steps applied live via
kubectl (postiz stack can't apply): removed keel.sh/enrolled=true from
the namespace, set keel.sh/policy=never (annotation+label) on all 4
workloads, rolled postiz back to the running v2.21.7. Keel restarted
(scale 0->1) to drop postiz-app from its in-memory tracker; confirmed it
no longer tracks postiz.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 19:16:58 +00:00
root
ae72ad51bb Woodpecker CI deploy [CI SKIP] 2026-05-29 18:07:00 +00:00
Viktor Barzin
bc41fe572a immich: GPU-accelerate video transcoding (NVENC + NVDEC)
Pin immich-server to the GPU node with a time-sliced nvidia.com/gpu slice
so ffmpeg uses hardware NVENC encode + NVDEC decode instead of software.
This frees the ~3-4 CPU cores the software transcoder was burning inside
the request-serving pod (which was slowing thumbnail/photo browsing), and
makes incompatible (HEVC/iPhone) videos playable in seconds. Activation is
ffmpeg.accel=nvenc + accelDecode=true in the DB system-config (Immich app
config is DB-managed here, like oauth/smtp — not Terraform).

Also give immich-frame the same Keel ignore_changes immich-server already
has, so an untargeted apply no longer churns it (pre-existing drift).

Docs: .claude/CLAUDE.md Immich row + compute.md GPU-workloads list.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 18:05:34 +00:00
Viktor Barzin
b10233975b llama-cpp: restore replicas to 1; fire-planner: fix llama-swap URL
llama-cpp was scaled to 0 during 2026-05-25 IO-storm recovery
(TEMP-SCALEDOWN). Cluster is now stable; only frigate competes for the
GPU on k8s-node1. Restoring to 1 to unblock fire-planner's Reddit
examples ingest, which needs qwen3-8b for structured extraction.

fire-planner's llama_cpp_base_url default pointed at a non-existent
service:port (llama-cpp:8000) — the real service is `llama-swap` on
port 8080. First 2026-05-28 bulk Job exited 0 with 0 rows because of
this. Correcting.
2026-05-29 06:20:03 +00:00
Viktor Barzin
478629c1ee keel+anubis: extend sweep to non-V2 raw deployments; fix anubis replicas validation
Second-tier keel drift: actualbudget, mailserver (docker-mailserver + roundcube),
servarr (8 deployments), and authentik pgbouncer are live-enrolled (Kyverno injects
keel.sh/policy=patch) and drifting, but never had the V2 block in Terraform. Added
the full block (KYVERNO_LIFECYCLE_V2 + keel.sh/match-tag + per-container
KEEL_IGNORE_IMAGE + KEEL_LIFECYCLE_V1) to all 13 deployments. The docker-mailserver
deployment had no resource-level lifecycle at all — added one.

Also fixes a pre-existing bug in modules/kubernetes/anubis_instance: the `replicas`
validation `var.replicas == null || (...)` doesn't null-short-circuit in the current
TF version, failing apply on every single-replica Anubis site (blog, cyberchef,
f1-stream, homepage, jsoncrack, kms, postiz, real-estate-crawler, travel_blog) with
"argument must not be null". Switched to a null-safe ternary.

Verified: actualbudget plan shows no image drift (http-api 26.5.2 downgrade prevented).
The anubis module change triggers a full platform apply.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 06:02:24 +00:00
root
fe1a16a5f5 Woodpecker CI deploy [CI SKIP] 2026-05-29 05:48:10 +00:00
Viktor Barzin
5bc7a76630 tuya-bridge: switch to Forgejo image + CI-driven deploy
Mirrors the kms-website pattern: deployment image now points to
forgejo.viktorbarzin.me/viktor/tuya_bridge:${var.image_tag} and the
new Woodpecker pipeline in tuya_bridge/.woodpecker.yml drives the
rollout via `kubectl set image` on every push.

Changes:
- Extract `tls_secret_name` and add `image_tag` (default "latest")
  to a new variables.tf, matching the kms / fire-planner /
  payslip-ingest convention.
- Add `image_pull_secrets { name = "registry-credentials" }` (Kyverno
  ClusterPolicy sync-registry-credentials already syncs the Secret
  into every namespace).
- Set explicit `image_pull_policy = "IfNotPresent"` — SHA-tagged
  images are immutable, no need to re-pull on every restart.

The image attribute remains in `lifecycle.ignore_changes` (line was
already there from the prior Keel-managed era), so future `tg apply`s
do not fight Woodpecker's `kubectl set image`. Keel is still enrolled
on the namespace but will skip SHA-tagged images under `policy: patch`
(non-semver), so the CI pipeline is the sole rollout mechanism.

Backstory: the 2026-05-26 cluster-health incident was tuya-bridge
crashlooping after Keel rewrote `:latest` to a stale broken `:0.1`
tag on Docker Hub (which predated the `prometheus_exporter.py`
addition). Manual rebuild + push was the immediate fix; this commit
plus tuya_bridge/.woodpecker.yml close the underlying gap so a
source change reliably produces a fresh registry image.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 05:45:16 +00:00
Viktor Barzin
7870e62a07 uptime-kuma: declare Proxmox UI monitor in TF
Yesterday's session SQL-patched monitor 313 to `https://192.168.1.127:8006/`
+ ignore_tls=1 because the prior URL `http://proxmox.reverse-proxy.svc.cluster.local:8006`
hit a CoreDNS pod-level cache returning stale `10.0.10.1` (pfSense GW)
intermittently, false-tripping ExternalAccessDivergence. A kuma DB
restore would have lost the SQL fix. Declare the monitor in
`internal_monitors` so the existing sync CronJob self-heals it.

Extends the schema with optional `url` / `accepted_statuscodes` /
`ignore_tls` fields (null on the existing DB/port entries) and
teaches the sync script the MonitorType.HTTP branch — url +
accepted_statuscodes + ignoreTls (camelCase on the API), matching
drift fields the same way PORT does for hostname/port.

Verified: manually triggered the sync after apply; it found monitor
313 by name and reported "already in desired state".

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 05:40:18 +00:00
Viktor Barzin
7c73c69f9b keel: add KEEL_LIFECYCLE_V1 + image-ignore to fire-planner
Completes the enrolled-workload sweep from cdb7d9a8. fire-planner was held
back because a parallel session was mid-apply on it (presence board); that
claim has since cleared.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-28 23:12:49 +00:00
Viktor Barzin
cdb7d9a81a keel: sweep KEEL_LIFECYCLE_V1 + per-container KEEL_IGNORE_IMAGE across enrolled workloads
Every Keel-enrolled workload (policy=patch, match-tag=true, injected by the
inject-keel-annotations Kyverno policy) was fighting Terraform: Keel rewrites
the image tag and restamps keel.sh/update-time, change-cause and the rollout
revision on each poll; without ignore_changes every `tg apply` reverted those
— downgrading the image and forcing a spurious rollout that Keel then re-did.

Only llama-cpp had the full block (added 2026-05-24); the other ~73 workloads
drifted. This sweep adds, to every enrolled deployment/daemonset lifecycle:
  - container[N].image (one per container index + init_container[N]) # KEEL_IGNORE_IMAGE
  - keel.sh/match-tag, keel.sh/update-time, kubernetes.io/change-cause,
    deployment.kubernetes.io/revision  # KEEL_LIFECYCLE_V1

Verified via `tg plan` on speedtest (single-container: image downgrade
0.24.3->0.24.1 + annotation strip now gone) and changedetection (multi-container:
both container images no longer drift). AGENTS.md drift-suppression section
updated with the canonical block + marker legend.

fire-planner deferred (parallel session mid-apply per presence board).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-28 23:09:30 +00:00
108 changed files with 5634 additions and 3806 deletions

@ -124,14 +124,16 @@ Repo IDs: infra=1, Website=2, finance=3, health=4, travel_blog=5, webhook-handle
- **CrowdSec bouncer**: graceful degradation mode (fail-open on error).
- **Rate limiting**: Return 429 (not 503). Per-service tuning: Immich/Nextcloud need higher limits.
- **Retry middleware**: 2 attempts, 100ms — in default ingress chain.
- **HTTP/3 (QUIC)**: Enabled cluster-wide via Traefik.
- **HTTP/3 (QUIC)**: Enabled on Traefik. Works for **direct (non-proxied) apps** via the dedicated LB IP below (ETP=Local). Proxied apps get QUIC at the Cloudflare edge.
- **Traefik LB IP = `10.0.20.203`, `externalTrafficPolicy: Local`** (dedicated, NOT the shared `.200`). Moved off the shared `.200` on 2026-05-30 so direct/non-proxied apps preserve the **real client IP for CrowdSec** (ETP=Cluster SNAT'd them to the node IP) and so QUIC works. **The shared `10.0.20.200` keeps the other 10 LB services** (PG state-backend `postgresql-lb`, headscale, wireguard, coturn, xray, etc. — all ETP=Cluster; MetalLB forbids mixed ETP on a shared IP, hence Traefik's own IP). **cloudflared targets the in-cluster Traefik Service** (`https://traefik.traefik.svc.cluster.local:443`, remote/dashboard tunnel config — edit via CF Global API Key in `secret/platform`), so proxied apps are decoupled from the LB IP. pfSense WAN 443 (tcp+udp) NAT → alias `traefik_lb` (`.203`). Internal split-horizon apex `viktorbarzin.me A` → `.203`. Full runbook + post-mortem: `docs/plans/2026-05-30-traefik-dedicated-ip-etp-local-*`.
- **IPv6 ingress** = HE 6in4 tunnel (`2001:470:6e:43d::2`) → **standalone HAProxy on pfSense** (`/usr/local/etc/ipv6-haproxy.cfg`, NOT the HAProxy package) using `send-proxy-v2` → Traefik `.203` (web 443/80) + mail NodePorts `30125-30128` (25/465/587/993) — so **real IPv6 client IPs reach CrowdSec**. Traefik trusts PROXY-v2 **only from `10.0.20.1`** (`entryPoints.web/websecure.proxyProtocol.trustedIPs`); real IPv4 clients (own source IP) unaffected. **No QUIC over IPv6** (bridge is TCP/h2). Replaced socat 2026-05-30 (socat masked every v6 client as `10.0.20.1`). Boot/persistence: config.xml `<shellcmd>` → `ipv6_proxy.sh` (patches nginx off `[::]:443/:80` to free the tunnel IPv6, then `service ipv6proxy onestart`); `rc.d/ipv6proxy` manages HAProxy. Backends use **no health `check`** (a plain TCP check false-DOWNs the PROXY-expecting listeners). As-built: `docs/architecture/networking.md` → "IPv6 Ingress".
- **IPAM & DNS auto-registration**: pfSense Kea DHCP serves all 3 subnets (VLAN 10, VLAN 20, 192.168.1.x). Kea DDNS auto-registers every DHCP client in Technitium (RFC 2136, A+PTR). CronJob `phpipam-pfsense-import` (hourly) pulls Kea leases + ARP into phpIPAM via SSH (passive, no scanning). CronJob `phpipam-dns-sync` (15min) bidirectional sync phpIPAM ↔ Technitium. 42 MAC reservations for 192.168.1.x.
## Service-Specific Notes
| Service | Key Operational Knowledge |
|---------|--------------------------|
| Nextcloud | MaxRequestWorkers=150, needs 8Gi limit (Apache transient memory spikes, see commit eb94144), very generous startup probe |
| Immich | ML on SSD, disable ModSecurity (breaks streaming), CUDA for ML, frequent upgrades |
| Immich | ML on SSD (CUDA), disable ModSecurity (breaks streaming), frequent upgrades. **Video transcoding is GPU-accelerated**: `immich-server` is pinned to GPU node1 (nodeSelector `nvidia.com/gpu.present` + NoSchedule toleration + `gpu-workload` priority) with a time-sliced `nvidia.com/gpu=1` slice — the stock immich-server image's ffmpeg already ships h264/hevc_nvenc + NVDEC. Activated via `ffmpeg.accel=nvenc` + `accelDecode=true` in the **DB** system-config (`system_metadata` table, key `system-config`, JSONB — NOT Terraform; app config is DB-managed here like oauth/smtp). Direct DB edits need a pod **recreate** to reload (config is cached at boot; only API-driven changes broadcast a reload). If Immich is ever reinstalled fresh (not restored), re-set these two keys. Thumbnails/previews live on SSD NFS (sdb) — do NOT move to block storage (HDD sdc = slower + the contended IO domain). |
| CrowdSec | Pin version, disable Metabase when not needed (CPU hog), LAPI scaled to 3, **DB on PostgreSQL** (migrated from MySQL), flush config: max_items=10000/max_age=7d/agents_autodelete=30d, DECISION_DURATION=168h in blocklist CronJob |
| Frigate | GPU stall detection in liveness probe (inference speed check), high CPU |
| Authentik | 3 replicas, PgBouncer in front of PostgreSQL, strip auth headers before forwarding |

@ -156,15 +156,16 @@ lifecycle {
### `# KYVERNO_LIFECYCLE_V2` — Keel auto-update annotations
When a namespace is labeled `keel.sh/enrolled=true`, the `inject-keel-annotations` ClusterPolicy (`stacks/kyverno/modules/kyverno/keel-annotations.tf`) injects three annotations on every Deployment / StatefulSet / DaemonSet:
When a namespace is labeled `keel.sh/enrolled=true`, the `inject-keel-annotations` ClusterPolicy (`stacks/kyverno/modules/kyverno/keel-annotations.tf`) injects these annotations on every Deployment / StatefulSet / DaemonSet:
```
keel.sh/policy: force
keel.sh/policy: patch
keel.sh/match-tag: "true"
keel.sh/trigger: poll
keel.sh/pollSchedule: "@every 1h"
```
To suppress the resulting Terraform drift, **enrolled workloads** must extend their `ignore_changes` block:
To suppress the resulting Terraform drift, **enrolled workloads** must carry the complete `ignore_changes` block below. This is the canonical form — it folds together every marker (see the legend after it):
```hcl
lifecycle {
@ -173,15 +174,31 @@ lifecycle {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE — Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
```
The V2 snippet is added **per workload** as namespaces are phase-enrolled — not as a mass sweep. Workloads in un-enrolled namespaces do not receive the annotation and don't need the V2 block.
**Marker legend** (the names are historical; grep each to audit coverage):
| Marker | Ignores | Why |
|---|---|---|
| `# KYVERNO_LIFECYCLE_V1` | `dns_config` | Kyverno injects pod DNS `ndots` config |
| `# KYVERNO_LIFECYCLE_V2` | `keel.sh/policy`, `/trigger`, `/pollSchedule` | Kyverno-injected Keel control annotations |
| `# KEEL_IGNORE_IMAGE` | `container[N].image` (one line **per container index**, incl. `init_container[N]`) | Keel rewrites the image tag on `policy=patch`; without this, `apply` reverts the bump (a **downgrade**) |
| `# KEEL_LIFECYCLE_V1` | `keel.sh/match-tag`, `keel.sh/update-time` (pod template), `kubernetes.io/change-cause`, `deployment.kubernetes.io/revision` | every Keel digest-update restamps these; without ignoring them `apply` strips them → forces a rollout → Keel re-stamps → fight loop |
**Multi-container caveat**: `container[0].image` only covers the first container. Add one `container[N].image` line for **every** container index, plus `init_container[N].image` for init containers — otherwise the un-ignored container's image still drifts/downgrades.
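A minimal sketch of the multi-container caveat, assuming a hypothetical Deployment named `example` with one init container and two regular containers. The resource name, namespace, and images are placeholders and do not come from this repo. Each image path gets its own `ignore_changes` entry:

```hcl
# Hypothetical workload: names and images are placeholders, not taken from this repo.
resource "kubernetes_deployment" "example" {
  metadata {
    name      = "example"
    namespace = "example"
  }

  spec {
    replicas = 1

    selector {
      match_labels = { app = "example" }
    }

    template {
      metadata {
        labels = { app = "example" }
      }

      spec {
        init_container {
          name  = "migrate"
          image = "registry.example.com/example/migrate:1.0.0"
        }

        container {
          name  = "app"
          image = "registry.example.com/example/app:1.0.0"
        }

        container {
          name  = "sidecar"
          image = "registry.example.com/example/sidecar:1.0.0"
        }
      }
    }
  }

  lifecycle {
    ignore_changes = [
      spec[0].template[0].spec[0].container[0].image,      # KEEL_IGNORE_IMAGE (app)
      spec[0].template[0].spec[0].container[1].image,      # KEEL_IGNORE_IMAGE (sidecar)
      spec[0].template[0].spec[0].init_container[0].image, # KEEL_IGNORE_IMAGE (migrate)
    ]
  }
}
```

Only the image lines are shown here; an enrolled workload would still carry the rest of the canonical block (the `KYVERNO_LIFECYCLE_V2` and `KEEL_LIFECYCLE_V1` entries) in the same list.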
The `KEEL_LIFECYCLE_V1` + per-container `KEEL_IGNORE_IMAGE` lines were swept across all enrolled workloads on **2026-05-28** (previously only `llama-cpp` had them; the rest fought on every apply). New enrolled workloads must include the full block. Workloads in un-enrolled namespaces don't receive the annotations and don't need the block.
Per-workload opt-out: add the label `keel.sh/policy: never` on the Deployment metadata (not pod template); the policy's `exclude` clause respects it, no annotation gets injected, no `ignore_changes` needed.
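The opt-out can be sketched as follows, assuming a hypothetical Deployment `pinned` (name, namespace, and image are illustrative only). The label sits on the Deployment's own `metadata`, not on the pod template, and no Keel-related `ignore_changes` entries are added:

```hcl
# Hypothetical opted-out workload: names and image are placeholders.
resource "kubernetes_deployment" "pinned" {
  metadata {
    name      = "pinned"
    namespace = "example"

    labels = {
      app              = "pinned"
      "keel.sh/policy" = "never" # opt-out label on the Deployment metadata
    }
  }

  spec {
    replicas = 1

    selector {
      match_labels = { app = "pinned" }
    }

    template {
      metadata {
        # The opt-out label is intentionally NOT repeated on the pod template.
        labels = { app = "pinned" }
      }

      spec {
        container {
          name  = "app"
          image = "registry.example.com/example/pinned:1.0.0"
        }
      }
    }
  }
}
```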
**Audit**: `rg "KYVERNO_LIFECYCLE_V2" stacks/` — count should equal the number of enrolled workloads.
**Audit**: `rg "KYVERNO_LIFECYCLE_V2" stacks/` — count should equal the number of enrolled workloads. `rg "KEEL_LIFECYCLE_V1" stacks/` should match it (every enrolled workload also carries the V1 lines).
**Design context**: `docs/plans/2026-05-16-auto-upgrade-apps-{design,plan}.md`.

@ -330,10 +330,14 @@ label with it, and `null_resource.gpu_node_config` re-applies the
next apply (discovery keyed on
`feature.node.kubernetes.io/pci-10de.present=true`).
**GPU Workloads**:
- Ollama (LLM inference)
- ComfyUI (Stable Diffusion workflows)
- Stable Diffusion WebUI
**GPU Workloads** (time-sliced — node advertises `Tesla-T4-SHARED`,
`sharing-strategy=time-slicing`, `nvidia.com/gpu.replicas=100`, so many pods
share the single T4; request `nvidia.com/gpu: 1` for a slice, not the whole card):
- immich-machine-learning (CLIP smart-search + facial recognition, CUDA)
- immich-server (NVENC/NVDEC video transcoding — `ffmpeg.accel=nvenc` + `accelDecode=true`)
- Frigate (object-detection inference)
- llama-cpp / llama-swap (LLM inference)
- nvidia-exporter + gpu-pod-exporter (DCGM metrics)
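As a rough illustration of requesting one time-sliced slice, the sketch below sets `nvidia.com/gpu = "1"` as a container limit on a hypothetical Deployment. The resource name, namespace, and image are assumptions, and the node pinning (nodeSelector, toleration, priority class) used by the real GPU workloads is deliberately left out:

```hcl
# Hypothetical GPU consumer: names and image are placeholders, not from this repo.
resource "kubernetes_deployment" "gpu_example" {
  metadata {
    name      = "gpu-example"
    namespace = "example"
  }

  spec {
    replicas = 1

    selector {
      match_labels = { app = "gpu-example" }
    }

    template {
      metadata {
        labels = { app = "gpu-example" }
      }

      spec {
        container {
          name  = "worker"
          image = "registry.example.com/example/gpu-worker:1.0.0"

          resources {
            limits = {
              # With time-slicing enabled this is one shared slice of the T4,
              # not exclusive use of the card.
              "nvidia.com/gpu" = "1"
            }
          }
        }
      }
    }
  }
}
```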
## Configuration

@ -254,11 +254,11 @@ Additional middleware:
### MetalLB & Load Balancing
MetalLB v0.15.3 allocates IPs from the range 10.0.20.200-10.0.20.220 in **Layer 2 mode**. Most LoadBalancer services share **10.0.20.200** using the `metallb.io/allow-shared-ip: shared` annotation. Technitium DNS has a **dedicated IP (10.0.20.201)** with `externalTrafficPolicy: Local` to preserve client source IPs for query logging.
MetalLB v0.15.3 allocates IPs from the range 10.0.20.200-10.0.20.220 in **Layer 2 mode**. Most LoadBalancer services share **10.0.20.200** using the `metallb.io/allow-shared-ip: shared` annotation. Two services have **dedicated IPs** with `externalTrafficPolicy: Local` to preserve real client source IPs: **Traefik (10.0.20.203)** — so CrowdSec sees real public IPs on the direct-ingress path and QUIC/HTTP3 works (a shared IP forbids the mixed ETP that QUIC's UDP listener needs) — and **Technitium DNS (10.0.20.201)** for query logging.
| Service | Namespace | IP | Ports |
|---------|-----------|-----|-------|
| traefik | traefik | 10.0.20.200 (shared) | 80, 443, 443/UDP (HTTP/3), 10200, 10300, 11434/TCP |
| traefik | traefik | **10.0.20.203 (dedicated, ETP=Local)** | 80, 443, 443/UDP (HTTP/3), 10200, 10300, 11434/TCP |
| coturn | coturn | 10.0.20.200 (shared) | 3478/UDP (STUN/TURN), 49152-49252/UDP (relay) |
| headscale | headscale | 10.0.20.200 (shared) | 41641/UDP, 3479/UDP |
| windows-kms¹ | kms | 10.0.20.200 (shared) | 1688/TCP |
@ -270,7 +270,7 @@ MetalLB v0.15.3 allocates IPs from the range 10.0.20.200-10.0.20.220 in **Layer
| xray-reality | xray | 10.0.20.200 (shared) | 7443/TCP |
| **technitium-dns** | **technitium** | **10.0.20.201 (dedicated)** | **53/UDP+TCP** |
pfSense aliases reference these IPs: `k8s_shared_lb` (10.0.20.200), `technitium_dns` (10.0.20.201). NAT rules use aliases for maintainability.
pfSense aliases reference these IPs: `k8s_shared_lb` (10.0.20.200), `traefik_lb` (10.0.20.203), `technitium_dns` (10.0.20.201). NAT rules use aliases for maintainability — the WAN 443 (TCP+UDP) forward targets `traefik_lb`.
¹ **windows-kms is publicly WAN-exposed.** pfSense forwards WAN TCP/1688 → `k8s_shared_lb:1688` so any internet host can activate. The matching filter rule applies a per-source rate limit (`max-src-conn 50`, `max-src-conn-rate 10/60`) with `overload <virusprot>` flush — offenders are auto-added to pfSense's stock `virusprot` pf table for follow-on blocks. Operations (rate-limit tuning, log locations, revocation) are documented in `docs/runbooks/kms-public-exposure.md`.
@ -283,6 +283,28 @@ Critical services are scaled to **3 replicas**:
PodDisruptionBudgets ensure at least 2 replicas remain during node maintenance or disruptions.
### IPv6 Ingress (HE Tunnel + HAProxy Bridge)
Public IPv6 reaches the cluster over a **Hurricane Electric 6in4 tunnel** terminated on pfSense (`gif0`; tunnel endpoint `2001:470:6e:43d::2`, LAN prefix `2001:470:6f:43d::/64`). The apex `viktorbarzin.me AAAA` → `2001:470:6e:43d::2`.
pfSense cannot NAT IPv6→IPv4, so ingress is bridged by a **standalone HAProxy** on pfSense (a separate config/service — *not* the pfSense HAProxy package) that listens on the tunnel IPv6 and forwards to the IPv4 cluster LBs with **PROXY protocol v2 (`send-proxy-v2`)**, so real client IPv6 addresses propagate to CrowdSec instead of being masked as `10.0.20.1`:
| Listen `[2001:470:6e:43d::2]:` | → Backend (`send-proxy-v2`) | Purpose |
|---|---|---|
| 443, 80 | Traefik `10.0.20.203:443` / `:80` | Web apps |
| 25, 465, 587, 993 | mail NodePorts `30125` / `30126` / `30127` / `30128` on .101-103 | SMTP / SMTPS / Submission / IMAPS |
The web path works because Traefik trusts PROXY-v2 **only from `10.0.20.1`** (`entryPoints.web/websecure.proxyProtocol.trustedIPs` in `stacks/traefik/.../main.tf`) — real IPv4 clients arrive via ETP=Local with their own source IP (never `10.0.20.1`), so they are unaffected. Mail backends hit the mailserver's PROXY-aware alt-listeners (same pattern as the IPv4 mail HAProxy — see `mailserver.md`).
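One way the trust rule could be expressed, as a sketch only: the locals below build Traefik static-configuration CLI flags for both entrypoints. The local names and the choice to pass the setting as CLI arguments are assumptions; the actual wiring in `stacks/traefik/.../main.tf` may differ (for example Helm values instead of arguments).

```hcl
locals {
  # Only the pfSense HAProxy bridge address is allowed to supply a PROXY-v2 header.
  proxy_protocol_trusted_ips = ["10.0.20.1"]

  # One flag per entrypoint; hypothetical local name, for illustration only.
  traefik_proxy_protocol_args = [
    for ep in ["web", "websecure"] :
    "--entryPoints.${ep}.proxyProtocol.trustedIPs=${join(",", local.proxy_protocol_trusted_ips)}"
  ]
}

output "traefik_proxy_protocol_args" {
  value = local.traefik_proxy_protocol_args
}
```

Keeping the trusted list to the single bridge address means a PROXY header sent from any other source is not trusted, which is why real IPv4 clients are unaffected.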
**No QUIC over IPv6** — the bridge is TCP/h2 only; IPv4 carries QUIC/HTTP3.
pfSense files (out-of-band, **not Terraform**):
- `/usr/local/etc/ipv6-haproxy.cfg` — the 6-frontend bridge config above.
- `/usr/local/etc/rc.d/ipv6proxy` — service wrapper (`service ipv6proxy {start,stop,status}`); `start` does a graceful `-sf` reload.
- `/usr/local/etc/ipv6_proxy.sh` — boot entrypoint (config.xml `<shellcmd>`): patches pfSense nginx off `[::]:443/:80` (rebinds to LAN IPv6) to free the tunnel IPv6, then `service ipv6proxy onestart`.
**Gotcha:** the backends use **no health `check`** — a plain TCP check hits the PROXY-expecting listeners without a PROXY header and would false-mark them DOWN. This path previously used `socat` (functional, but masked every IPv6 client as `10.0.20.1`); replaced by HAProxy on 2026-05-30 for real client IPs.
### Container Registry Pull-Through Cache
**Location**: Registry VM at 10.0.20.10

@ -0,0 +1,110 @@
# Design: Dedicated MetalLB IP for Traefik with externalTrafficPolicy=Local
**Date:** 2026-05-30
**Status:** Draft — for review (no changes applied yet)
**Author:** Viktor + Claude
## Problem
Two issues share one root cause on the Traefik ingress LoadBalancer:
1. **CrowdSec is blind to real client IPs on the 24 non-proxied/direct apps.**
Traefik logs `10.0.20.103` (k8s-node3's IP) as the client for the
overwhelming majority of direct-app requests (measured: 2522 hits vs 3
real external IPs). Cause: the Traefik LB is `externalTrafficPolicy:
Cluster`, so kube-proxy SNATs every external client to the MetalLB-elected
node's IP before Traefik sees it. CrowdSec therefore makes ban decisions
against an internal node IP it would never block → **no effective IP-based
protection on the direct apps** (immich, forgejo, send, ytdlp, servarr,
ebooks, novelapp, freedify, affine, health, f1-stream, kms, k8s-portal,
etc. — 24 total).
*Proxied apps are unaffected — they arrive via the cloudflared tunnel and
get real IPs through Cloudflare's `X-Forwarded-For`.*
2. **HTTP/3 / QUIC does not complete for the direct apps.** An external probe
(`http3check.net`) confirms "QUIC connection could not be established"
despite `Alt-Svc: h3` being advertised and UDP 443 reaching Traefik
(verified: pfSense NATs UDP 443 → Traefik LB; Traefik binds UDP 8443).
Same root cause: `ETP=Cluster` + 3 replicas means kube-proxy SNATs and can
spread the UDP flow across pods, which breaks the QUIC handshake.
Both are fixed by `externalTrafficPolicy: Local` on the Traefik LB (no SNAT →
real client IPs preserved → QUIC stays pinned to one pod).
## Why we can't just flip ETP on the current IP
Traefik currently shares MetalLB IP **`10.0.20.200`** with **9 other services**
via `metallb.io/allow-shared-ip`:
`dbaas/postgresql-lb` (**Terraform state backend**), `headscale/headscale-server`,
`wireguard/wireguard`, `coturn/coturn`, `xray/xray-reality`,
`shadowsocks/shadowsocks`, `beads-server/dolt`, `servarr/qbittorrent-torrenting`,
`tor-proxy/torrserver-bt`.
Per MetalLB docs, services sharing an IP **must all use `Cluster`** (or point
to identical pods). Mixing `Local` and `Cluster` on a shared IP is **not
allowed** and would break the IP allocation — taking down all ingress **and
the Terraform state DB** (locking out `terragrunt` itself), plus VPN/DNS path.
→ Traefik must move to its **own** IP.
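The sharing constraint can be sketched like this, assuming two hypothetical services `svc-a` and `svc-b` (names, namespace, and ports are placeholders). Both carry the same sharing key and both stay on `Cluster`; pinning the address through the `metallb.io/loadBalancerIPs` annotation is also an assumption about how the IP is requested:

```hcl
locals {
  # Hypothetical services that co-locate on the shared IP.
  shared_lb_services = {
    "svc-a" = { port = 5432, protocol = "TCP" }
    "svc-b" = { port = 3478, protocol = "UDP" }
  }
}

resource "kubernetes_service_v1" "shared" {
  for_each = local.shared_lb_services

  metadata {
    name      = each.key
    namespace = "example"

    annotations = {
      "metallb.io/allow-shared-ip" = "shared"      # same sharing key on every service
      "metallb.io/loadBalancerIPs" = "10.0.20.200" # assumed way of pinning the IP
    }
  }

  spec {
    type                    = "LoadBalancer"
    external_traffic_policy = "Cluster" # must match across all services sharing the IP
    selector                = { app = each.key }

    port {
      port        = each.value.port
      target_port = each.value.port
      protocol    = each.value.protocol
    }
  }
}
```

Switching just one of these to `Local` is the mixed-ETP case described above.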
## Target state
- New dedicated MetalLB IP **`10.0.20.203`** (free; pool is `10.0.20.200-220`),
**not** shared, `externalTrafficPolicy: Local`, for the Traefik LB.
- `10.0.20.200` keeps the other 9 services unchanged (still all `Cluster`).
- Internal split-horizon DNS apex `viktorbarzin.me A` → `10.0.20.203`
(currently `10.0.20.200`). All `*.viktorbarzin.me` CNAME → apex, so this one
record moves every internal ingress hostname.
- pfSense: the WAN 443 (TCP **and** UDP) port-forward target moves from the
`<nginx>` alias to a **new pfSense alias** for `10.0.20.203`
(per request: define a VIP/alias, do **not** hardcode the IP in rules —
matches the existing `<nginx>` / `<k8s_shared_lb>` alias pattern).
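For orientation, the target Service could look roughly like the standalone sketch below. The Traefik Service is managed through the Helm chart, so this resource is illustrative only; the selector label, the TCP target port, and the `metallb.io/loadBalancerIPs` annotation are assumptions rather than values taken from the chart configuration.

```hcl
# Illustrative only: the real Service is rendered by the Traefik Helm chart.
resource "kubernetes_service_v1" "traefik_dedicated" {
  metadata {
    name      = "traefik"
    namespace = "traefik"

    annotations = {
      # No allow-shared-ip annotation: the IP is not shared.
      "metallb.io/loadBalancerIPs" = "10.0.20.203"
    }
  }

  spec {
    type                    = "LoadBalancer"
    external_traffic_policy = "Local" # no SNAT, so the client source IP is preserved
    selector                = { "app.kubernetes.io/name" = "traefik" } # assumed label

    port {
      name        = "websecure"
      port        = 443
      target_port = 8443 # assumed container port
      protocol    = "TCP"
    }

    port {
      name        = "websecure-http3"
      port        = 443
      target_port = 8443
      protocol    = "UDP" # QUIC / HTTP/3
    }
  }
}
```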
## Key decisions
- **Dedicated IP, not shared** — forced by the MetalLB mixed-ETP rule above.
- **`10.0.20.203`** — first free IP after technitium (.201) and kms (.202).
- **pfSense reference by alias, not literal IP** (user requirement) — create
alias e.g. `traefik_lb` = `10.0.20.203`, reference it in the rdr + firewall
pass rule. One place to change later.
- **Cutover style** — two options, decided at review (see plan):
- *In-place* (recommended for maintainability): change the Helm Service to
the new IP + ETP=Local in one edit; brief cutover window (mitigated by
pre-lowering DNS TTL + staging the pfSense change).
- *Additive* (zero-downtime): stand up a second LB Service on `.203`
(ETP=Local) alongside the existing `.200` one, cut DNS/pfSense over, then
retire Traefik from `.200`. More moving parts to maintain.
## Risks & watch-items
- **Terraform state backend lives on `.200`** — every phase must verify
`dbaas/postgresql-lb:5432` stays reachable. We never touch `.200`'s config,
only remove Traefik from it at the end; low risk but explicitly checked.
- **Live-firewall edit** (pfSense rdr + alias) — done via the pfSense UI
(persisted in config.xml); CLI `pfctl` edits don't persist. Per the
network-device rule, this step is operator-driven/confirmed, not automated.
- **CrowdSec behavior change** — once it sees *real* public IPs on direct
apps, it will start making real ban decisions there. Confirm the security
allowlist (source-IP allowlist `10.0.20.0/22`, `192.168.1.0/24`, tailnet;
identity `me@viktorbarzin.me`) is correct so family/legit IPs aren't banned.
- **MetalLB ETP=Local node election**: `.203` is announced only from a node
running a ready Traefik pod. Traefik has 3 replicas (node4, node5, +1) and
PDB minAvailable=2, so ≥2 eligible nodes always exist; re-elects on failure.
- **Cloudflare-proxied apps** route via the cloudflared tunnel → Traefik
ClusterIP, **not** the LB IP, so they are unaffected — verified in plan.
- **Cutover window** for the in-place option — keep it short; have rollback
staged.
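The node-election item above leans on the PodDisruptionBudget. A minimal sketch of such a PDB is shown below; the resource name and selector label are assumptions, and only `minAvailable=2` comes from the text:

```hcl
# Sketch of a PDB that keeps at least two Traefik pods (and therefore at least
# two nodes eligible to announce the ETP=Local IP) during voluntary disruptions.
resource "kubernetes_pod_disruption_budget_v1" "traefik" {
  metadata {
    name      = "traefik"
    namespace = "traefik"
  }

  spec {
    min_available = "2"

    selector {
      match_labels = { "app.kubernetes.io/name" = "traefik" } # assumed label
    }
  }
}
```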
## Out of scope
- No change to the 9 services on `10.0.20.200`.
- No change to Cloudflare-proxied apps' path.
- No re-architecture of the pfSense↔K8s ingress beyond the 443 target move.
## Affected docs (update on apply)
- `.claude/CLAUDE.md` (Networking & Resilience / Service-Specific notes)
- `docs/architecture/networking.md` (or equivalent — Traefik LB IP, ETP)
- `docs/runbooks/` — add a short "Traefik LB IP / ETP" runbook entry
- `.claude/reference/service-catalog.md` if it records LB IPs
- memory: update the QUIC/ingress entries (ids 3241-3246)

@ -0,0 +1,230 @@
# Plan: Migrate Traefik to dedicated IP 10.0.20.203 + ETP=Local
**Date:** 2026-05-30 · **Pairs with:** `2026-05-30-traefik-dedicated-ip-etp-local-design.md`
**Status:** Draft — review required before executing. Nothing applied yet.
Goal: real client IPs to CrowdSec + working QUIC on the 24 direct apps, by
moving Traefik off the shared `10.0.20.200` onto its own `10.0.20.203` with
`externalTrafficPolicy: Local`. Shared IP `.200` (incl. the TF state DB) is
left untouched until the final cleanup step.
> Recommended cutover: **in-place** (simplest, most maintainable) inside a short
> planned window. Additive/zero-downtime variant noted at the end.
## Phase 0 — Pre-flight (read-only, ~10 min)
- [ ] Snapshot current state (already captured in chat; re-confirm at execution):
  - Traefik svc: IP `10.0.20.200`, `allow-shared-ip=shared`, ETP=Cluster.
  - `.200` shared by 10 services incl. `dbaas/postgresql-lb:5432` (TF state).
  - DNS apex `viktorbarzin.me A = 10.0.20.200` (Technitium primary, split-horizon).
  - pfSense rdr: WAN 443 tcp+udp → alias `<nginx>` (=10.0.20.200); `admin@10.0.20.1`.
  - Traefik 3 replicas (node4, node5, +1), PDB minAvailable=2.
- [ ] Confirm `10.0.20.203` still free in pool `10.0.20.200-220`.
- [ ] **Lower DNS TTL** on the apex record to 60s (Technitium) ~30 min ahead of
cutover to shrink the window. (Restore to normal afterward.)
- [ ] Baseline checks to compare against (run now, save output):
  - `curl -sI https://immich.viktorbarzin.me` (direct app) → 200/redirect
  - `curl -sI https://<a-proxied-app>` → 200 (proxied path)
  - PG state reachable: `nc -vz 10.0.20.200 5432` (or a `terragrunt plan` no-op)
  - Traefik access log shows `10.0.20.103` for a direct app (the bug we're fixing)
  - `http3check.net` for immich → QUIC FAILS (baseline)
## Phase 1 — Terraform: dedicated IP + ETP=Local (reversible)
Edit `stacks/traefik/modules/traefik/main.tf`, Helm `service` block (~L165-173):
```hcl
service = {
  type = "LoadBalancer"
  annotations = {
    "metallb.io/loadBalancerIPs" = "10.0.20.203" # was 10.0.20.200
    # allow-shared-ip REMOVED: Traefik no longer shares an IP
  }
  spec = {
    externalTrafficPolicy = "Local" # was Cluster
  }
}
```
- [ ] `scripts/tg plan` in `stacks/traefik` — review: only the Traefik Service
changes (new IP, ETP, annotation removed). No change to other stacks.
- [ ] `scripts/tg apply`.
- [ ] **Immediately verify** (ingress is briefly broken until DNS+pfSense move):
  - `kubectl get svc traefik -n traefik` → IP `10.0.20.203`, ETP=Local.
  - `kubectl get svc -A | grep 10.0.20.200` → the other 9 services still hold `.200`.
  - **`nc -vz 10.0.20.200 5432`** → TF state DB still reachable (critical).
  - `curl -sI --resolve <app>:443:10.0.20.203 https://<direct-app>` → 200
    (proves `.203` serves before DNS moves).
**Rollback (Phase 1):** revert the three lines → `scripts/tg apply`. Back to `.200`.
## Phase 2 — Internal DNS cutover (Technitium)
- [ ] Update split-horizon apex: `viktorbarzin.me A → 10.0.20.203` (primary;
AXFR replicates to secondary/tertiary, or kick `technitium-zone-sync`).
- [ ] Verify internal resolution: `dig +short immich.viktorbarzin.me` → `10.0.20.203`
from a cluster/LAN client; `curl -sI https://immich.viktorbarzin.me` → 200.
**Rollback (Phase 2):** apex A → `10.0.20.200`.
## Phase 3 — pfSense (live firewall — operator-driven, alias not literal)
Per the "create a VIP/alias, don't hardcode" requirement:
- [ ] **Create a pfSense Firewall Alias** (Firewall ▸ Aliases), type Host:
name `traefik_lb`, value `10.0.20.203`. *(This is the correct pfSense
object for a NAT-forward target — same kind as the existing `<nginx>`
alias. If a CARP/IP-Alias Virtual IP is intended instead, confirm at
review; a routed K8s LB IP normally uses an Alias, not a VIP.)*
- [ ] **Repoint the 443 forward** (Firewall ▸ NAT ▸ Port Forward): change the
existing WAN `https` (TCP **and** UDP) rule's target from `nginx` →
`traefik_lb`. Leave the auto firewall rule linked. Do **not** touch the
`http-alt`/`7443` rules (those are xray on `<k8s_shared_lb>`).
- [ ] Apply pfSense changes.
- [ ] Verify externally:
  - `http3check.net` for immich → **QUIC OK** (h3 established).
  - External `curl` to a few direct apps → 200.
  - Traefik access log now shows **real client IPs** for direct apps (not `10.0.20.103`).
**Rollback (Phase 3):** point the 443 rule's target back to `nginx`.
## Phase 4 — Verify CrowdSec + the fleet (the real prize)
- [ ] Traefik logs: real public IPs on direct apps (sample several).
- [ ] CrowdSec: confirm it now ingests real IPs (a test decision / metrics);
**confirm the source-IP allowlist** (`10.0.20.0/22`, `192.168.1.0/24`,
tailnet) is active so family/LAN aren't banned.
- [ ] Proxied apps unaffected (spot-check 2-3 — still real IPs via Cloudflare).
- [ ] All other `.200` services healthy (PG state, headscale, wireguard, coturn,
xray, etc.).
- [ ] Restore DNS TTL to normal.
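When sampling Traefik log IPs in this phase, a small helper can separate addresses that fall inside the allowlisted ranges from genuinely public ones. The following is a minimal POSIX `sh` sketch, not part of the repo: `ip_to_int`, `ip_in_cidr`, and `classify_ip` are hypothetical names, only the two CIDRs stated above are included (the tailnet range is not given here and would need adding), and it handles IPv4 only.

```sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
# Runs in a subshell (parenthesised body) so the IFS change stays local.
ip_to_int() (
  IFS=.
  set -- $1   # split "a.b.c.d" into $1..$4
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# ip_in_cidr IP NET/BITS: exit 0 if IP lies inside the CIDR, non-zero otherwise.
ip_in_cidr() (
  net=${2%/*}
  bits=${2#*/}
  mask=$(( (4294967295 << (32 - $bits)) & 4294967295 ))
  [ $(( $(ip_to_int "$1") & $mask )) -eq $(( $(ip_to_int "$net") & $mask )) ]
)

# Print "internal" for addresses inside the allowlisted LAN ranges,
# "public" for everything else. ASSUMPTION: tailnet CIDR omitted (not stated).
classify_ip() {
  for cidr in 10.0.20.0/22 192.168.1.0/24; do
    if ip_in_cidr "$1" "$cidr"; then
      echo internal
      return 0
    fi
  done
  echo public
}
```

For example, `classify_ip 10.0.20.103` prints `internal`; seeing that address as the client of a direct app would indicate the node-SNAT symptom this plan is meant to remove.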
## Phase 5 — Cleanup / docs
- [ ] Confirm Traefik no longer answers on `.200` (it shouldn't after Phase 1).
- [ ] Update docs (design doc "Affected docs" list): `.claude/CLAUDE.md`,
`docs/architecture/networking.md`, service-catalog, memory ids 3241-3246.
- [ ] Commit TF + docs.
## Rollback (full)
Reverse order: pfSense 443 target → `nginx`; apex A → `.200`; revert the
Traefik Service TF (IP `.200`, `allow-shared-ip=shared`, ETP=Cluster) → apply.
kubectl/Helm reach the API server directly (not via Traefik), so control is
retained even if ingress is down mid-cutover.
## Additive (zero-downtime) variant — if the window is unacceptable
Instead of editing the Helm Service in place: add a second raw
`kubernetes_service` (type LoadBalancer, IP `.203`, ETP=Local, ports
web/80→8000, websecure/443→8443 TCP, websecure-http3/443→8443 UDP, selector =
Traefik pod labels). Both `.200` (old) and `.203` (new) serve Traefik. Cut
DNS+pfSense to `.203`, verify, then convert the Helm Service to ClusterIP
(drops `.200`). More config to carry long-term (a hand-maintained Service
duplicating Helm) — weigh against the brief in-place window.
## Attempt 1 — 2026-05-30 — ROLLED BACK (post-mortem)
First execution was rolled back to the `.200` baseline; all services restored,
TF state reconciled (`No changes`). The cutover **achieved its primary goal
mid-flight** (real external client IPs reached CrowdSec — confirmed real IPs
like `34.107.119.124` in Traefik logs instead of node `10.0.20.103`), but a
**missed dependency took proxied apps down**, forcing rollback. Fix the plan
before retrying:
1. **BLOCKER — cloudflared targets the LB IP.** The `cloudflared` tunnel is
**token-based / Cloudflare-dashboard-managed** (`args: [tunnel]` +
`TUNNEL_TOKEN`; no local `config.yaml`). Its ingress sends `*.viktorbarzin.me`
to the **Traefik LB IP `10.0.20.200`**. Moving Traefik to `.203` left
cloudflared pointing at a dead IP → **every proxied app (vault, home, …)
went down**. **The retry MUST also repoint the tunnel ingress `.200 → .203`
in Cloudflare (API/dashboard)** as part of the same cutover — ideally point
cloudflared at the Traefik *ClusterIP/service* so it's IP-independent.
2. **Vault-ingress circular dependency.** Fetching the Technitium password from
Vault *during* the window failed (Vault's ingress was down). Fix used:
pre-fetch all creds before touching Traefik (worked). The DNS step then
restored Vault.
3. **SIGPIPE → stuck PG state locks.** Piping `scripts/tg` through `head`/`grep`
(early pipe close) SIGPIPE-killed terragrunt before it released the PG
advisory lock, leaving an idle `terraform_state` connection holding the lock
(`force-unlock` can't release another session's advisory lock). **Always run
`tg` to a file, never pipe through early-closing filters.** Clear a stuck
one by terminating the idle backend: `pg_terminate_backend(<pid>)` for the
idle conn holding `pg_locks.objid` of the workspace.
4. **ETP=Local + hairpin.** Internal hosts that resolve `*.viktorbarzin.me` via
*public* DNS and hairpin (e.g. the devvm) become flaky under ETP=Local.
True external clients and internal-direct (`.203`) clients work. Ensure such
hosts resolve internally (Technitium split-horizon).
5. **QUIC verification.** `http3check.net` was unreliable here (failed on TCP
while real clients got 200s) — don't rely on it; confirm from a real device
on cellular.
**Left in place for retry:** pfSense alias `traefik_lb` (=`10.0.20.203`, NAT
reverted to `nginx`); pfSense `config.xml` backups `config.xml.bak-traefik-*`.
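Lesson 3 can be sketched as a pair of shell helpers. This is an illustration under assumptions, not the repo's tooling: `run_logged` and `list_advisory_lock_holders` are hypothetical names, and the connection arguments handed to `psql` depend on how the state database is reached.

```sh
# run_logged LOGFILE CMD [ARGS...]
# Run CMD to completion with stdout+stderr written to LOGFILE and return
# CMD's own exit status. No pipe is involved, so a reader that exits early
# (head, grep -m1) can never SIGPIPE the command mid-run.
run_logged() {
  log=$1
  shift
  "$@" >"$log" 2>&1
}

# list_advisory_lock_holders [PSQL_CONNECTION_ARGS...]
# Show idle sessions that still hold an advisory lock, with the lock's objid.
# ASSUMPTION: the caller supplies -h/-U/-d as appropriate for the state DB.
list_advisory_lock_holders() {
  psql -X "$@" -c "
    SELECT a.pid, a.state, a.state_change, l.objid
    FROM pg_locks l
    JOIN pg_stat_activity a ON a.pid = l.pid
    WHERE l.locktype = 'advisory' AND a.state = 'idle';"
}
```

Typical use would be `run_logged /tmp/tg-plan.log scripts/tg plan` followed by reading the log file with any filter. If a lock is already stuck, the `pid` reported by the second helper is the one to pass to `pg_terminate_backend(<pid>)`.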
## Attempt 2 — 2026-05-30 — SUCCESS
Live and verified, **no proxied/Vault outage** this time. Key change vs attempt 1:
**decouple cloudflared from the LB IP FIRST**, so moving Traefik no longer
touches the proxied path or Vault's ingress.
Executed order (all lessons applied — `tg` always run to a file, creds
pre-fetched while Vault up):
1. **Cloudflare tunnel ingress repointed** `https://10.0.20.200:443` →
`https://traefik.traefik.svc.cluster.local:443` (both `*.viktorbarzin.me`
and apex rules; `noTLSVerify` kept; catch-all 404 kept). Done via the
**Cloudflare Global API Key** (`secret/platform` → `cloudflare_api_key`,
email `vbarzin@gmail.com`, `X-Auth-Email`+`X-Auth-Key` headers — NOT the
tunnel token, which is not an API credential). Tunnel: account
`02e035473cfc4834fb10c5d35470d8b4`, id `75182cd7-bb91-4310-b961-5d8967da8b41`.
→ proxied apps now IP-independent.
2. Traefik Service → `10.0.20.203` + `ETP=Local` (single service; `tg apply`).
Proxied apps + Vault stayed up (cloudflared → ClusterIP).
3. Technitium apex `viktorbarzin.me A` → `10.0.20.203` (ttl 60).
4. pfSense 443 (tcp+udp) NAT `nginx` → `traefik_lb` (`.203`); applied via `/etc/rc.filter_configure`.
**Verified:** proxied 307/200 throughout; direct apps 200; **real external
client IPs now reach Traefik/CrowdSec** (`216.73.217.51`, `54.x`, `52.x` — not
node `10.0.20.103`); PG state DB OK; TF state reconciled (`tg apply` exit 0).
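Step 1 of the executed order could look roughly like the sketch below. It is a hedged illustration, not a record of the exact call that was made: the endpoint path and payload shape follow the Cloudflare tunnel-configuration API as commonly documented, placing `noTLSVerify` under a per-rule `originRequest` is an assumption, and `tunnel_config_json`, `put_tunnel_config`, and the `CF_*` variables are hypothetical names. A `PUT` is assumed to replace the whole ingress list, so the existing configuration should be fetched and compared first.

```sh
# tunnel_config_json SERVICE_URL
# Emit the ingress config: wildcard + apex rules pointing at SERVICE_URL,
# followed by the catch-all 404 rule.
tunnel_config_json() {
  svc=$1
  printf '{"config":{"ingress":['
  printf '{"hostname":"*.viktorbarzin.me","service":"%s","originRequest":{"noTLSVerify":true}},' "$svc"
  printf '{"hostname":"viktorbarzin.me","service":"%s","originRequest":{"noTLSVerify":true}},' "$svc"
  printf '{"service":"http_status:404"}'
  printf ']}}\n'
}

# put_tunnel_config SERVICE_URL
# Send the config using Global API Key auth (X-Auth-Email + X-Auth-Key).
# ASSUMPTION: CF_ACCOUNT_ID, CF_TUNNEL_ID, CF_EMAIL, CF_API_KEY are exported
# by the caller, fetched from the secret store before the window opens.
put_tunnel_config() {
  curl -sS -X PUT \
    "https://api.cloudflare.com/client/v4/accounts/${CF_ACCOUNT_ID}/cfd_tunnel/${CF_TUNNEL_ID}/configurations" \
    -H "X-Auth-Email: ${CF_EMAIL}" \
    -H "X-Auth-Key: ${CF_API_KEY}" \
    -H "Content-Type: application/json" \
    --data "$(tunnel_config_json "$1")"
}
```

Printing the payload with `tunnel_config_json https://traefik.traefik.svc.cluster.local:443` and reviewing it before calling `put_tunnel_config` keeps the change inspectable.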
**Notes / follow-ups:**
- **Out-of-band (not in TF):** the cloudflared tunnel ingress (remote/dashboard
config) and the pfSense `traefik_lb` alias + NAT. Codify the tunnel config in
TF (`cloudflare_zero_trust_tunnel_cloudflared_config`) so `→ClusterIP` is
declarative — pre-existing gap (tunnel was already remote-managed).
- **QUIC:** infra correct (ETP=Local + UDP 443 → `.203` + Traefik h3 listener).
`http3check.net` is unreliable here — it hits the IPv6 AAAA
(`2001:470:6e:43d::2`, separate HE-tunnel path, unchanged) and fails before
reaching Traefik. Confirm QUIC from a real device (Chrome → Protocol `h3`).
- pfSense `nginx` alias (=`.200`) is now unused; `traefik_lb` (=`.203`) is live.
## IPv6 follow-up — 2026-05-30 — DONE (HAProxy bridge, real client IPs)
The ETP=Local cutover fixed real client IPs + QUIC on the **IPv4** direct path
only. The **IPv6** path (HE 6in4 tunnel `2001:470:6e:43d::2` → pfSense) still ran
`socat`, which (a) masked every IPv6 client as `10.0.20.1`, and (b) broke
outright once Traefik's `proxyProtocol.trustedIPs` started requiring PROXY-v2
from `10.0.20.1`. Replaced socat with a **standalone HAProxy bridge** on pfSense
using `send-proxy-v2` so real IPv6 client IPs reach Traefik/CrowdSec.
Executed:
1. **Traefik** (TF, `stacks/traefik/.../main.tf`): added
`proxyProtocol = { trustedIPs = ["10.0.20.1"] }` to the `web` + `websecure`
entrypoints. Bounded risk — only connections *from* `10.0.20.1` (the bridge)
are PROXY-parsed; real IPv4 clients (ETP=Local, own source IP) are untouched.
Applied; IPv4 + proxied verified 200 immediately after.
2. **pfSense HAProxy** (`/usr/local/etc/ipv6-haproxy.cfg`): 6 frontends on
`[::2]:{443,80,25,465,587,993}` → Traefik `.203:{443,80}` and mail NodePorts
`{30125,30126,30127,30128}` (.101-103), all `send-proxy-v2`, **no `check`**
(a plain check would false-DOWN the PROXY-expecting listeners).
3. **Persistence**: rewrote `rc.d/ipv6proxy` → manages HAProxy
(`service ipv6proxy {start,stop,status}`, graceful `-sf`); rewrote
`ipv6_proxy.sh` (config.xml `<shellcmd>` boot entrypoint) to keep the
nginx-off-`[::]` patch then `service ipv6proxy onestart`. socat backups kept
as `*.socat-bak-*`.
**Verified:** web over `::2` = 200; Traefik logs show real public IPv6 clients
(e.g. `2620:10d:c092:500::6:1eda`), **zero** `10.0.20.1` artifacts; mail-over-IPv6
`220` banners on `::2:25/587` + IMAPS connect on `::2:993`; IPv4 direct/proxied
+ QUIC (`alt-svc: h3=":443"`) unaffected. **No QUIC over IPv6** (bridge is TCP/h2).
Authoritative as-built: `docs/architecture/networking.md` → "IPv6 Ingress".
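The shape of one bridge entry from step 2 can be sketched with a small generator. This is an assumption-laden illustration rather than the as-built file: `render_bridge` and the `fe_`/`be_`/`s1` names are hypothetical, and a complete config would also need `global`/`defaults` sections with timeouts. The `server` line deliberately carries `send-proxy-v2` and no `check`, matching the reasoning above.

```sh
# render_bridge NAME BIND_ADDR PORT BACKEND_HOST BACKEND_PORT
# Print a TCP-mode frontend/backend pair that forwards BIND_ADDR:PORT to
# BACKEND_HOST:BACKEND_PORT and prepends a PROXY-v2 header.
render_bridge() {
  cat <<EOF
frontend fe_$1
    mode tcp
    bind [$2]:$3
    default_backend be_$1

backend be_$1
    mode tcp
    # no health check: a plain TCP check would not speak PROXY protocol
    server s1 $4:$5 send-proxy-v2
EOF
}
```

For instance, `render_bridge web443 2001:470:6e:43d::2 443 10.0.20.203 443` prints the stanza for the HTTPS path; the mail frontends would follow the same pattern with their own NodePort targets.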

@ -60,7 +60,7 @@ variable "replicas" {
description = "Optional replica count override. When null, defaults to 1 if shared_store_url is null and 2 otherwise. Capped at 2 — Redis can handle more but anti-affinity assumes ≤2 replicas per Anubis instance on a 5-node cluster."
validation {
condition = var.replicas == null || (var.replicas >= 1 && var.replicas <= 2)
condition = var.replicas == null ? true : (var.replicas >= 1 && var.replicas <= 2)
error_message = "replicas must be 1 or 2 (or null to auto-pick from shared_store_url presence)."
}
}

@ -135,7 +135,17 @@ resource "kubernetes_deployment" "actualbudget" {
}
lifecycle {
# KYVERNO_LIFECYCLE_V1: Kyverno admission webhook mutates dns_config with ndots=2
ignore_changes = [spec[0].template[0].spec[0].dns_config]
ignore_changes = [
spec[0].template[0].spec[0].dns_config, # KYVERNO_LIFECYCLE_V1
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}
@ -243,7 +253,17 @@ resource "kubernetes_deployment" "actualbudget-http-api" {
}
lifecycle {
# KYVERNO_LIFECYCLE_V1: Kyverno admission webhook mutates dns_config with ndots=2
ignore_changes = [spec[0].template[0].spec[0].dns_config]
ignore_changes = [
spec[0].template[0].spec[0].dns_config, # KYVERNO_LIFECYCLE_V1
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -337,6 +337,12 @@ resource "kubernetes_deployment" "affine" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
spec[0].template[0].spec[0].init_container[0].image,
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -151,7 +151,17 @@ resource "kubernetes_deployment" "pgbouncer" {
depends_on = [kubernetes_secret.pgbouncer_auth]
lifecycle {
# KYVERNO_LIFECYCLE_V1: Kyverno admission webhook mutates dns_config with ndots=2
ignore_changes = [spec[0].template[0].spec[0].dns_config]
ignore_changes = [
spec[0].template[0].spec[0].dns_config, # KYVERNO_LIFECYCLE_V1
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -478,6 +478,11 @@ resource "kubernetes_deployment" "workbench" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].init_container[0].image,
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE
]
}
@ -755,6 +760,11 @@ resource "kubernetes_deployment" "beadboard" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].init_container[0].image,
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE
]
}

@ -82,6 +82,12 @@ resource "kubernetes_deployment" "blog" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
spec[0].template[0].spec[0].container[1].image,
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -201,6 +201,11 @@ resource "kubernetes_deployment" "changedetection" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[1].image,
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -323,6 +323,13 @@ resource "kubernetes_deployment" "chrome_service" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
spec[0].template[0].spec[0].container[1].image,
spec[0].template[0].spec[0].init_container[0].image,
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -74,6 +74,10 @@ resource "kubernetes_deployment" "city-guesser" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -253,6 +253,10 @@ resource "kubernetes_deployment" "claude-memory" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -201,6 +201,10 @@ resource "kubernetes_deployment" "coturn" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -83,6 +83,11 @@ resource "kubernetes_deployment" "cyberchef" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -107,6 +107,10 @@ resource "kubernetes_deployment" "dashy" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -332,6 +332,11 @@ resource "kubernetes_deployment" "dawarich" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[1].image,
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -1284,6 +1284,33 @@ resource "null_resource" "pg_job_hunter_db" {
}
}
# Create tripit database for the TripIt travel app (FastAPI + SvelteKit SPA).
# Role password is managed by Vault Database Secrets Engine (static role
# `pg-tripit`, 7d rotation). Tables live in schema `tripit` (alembic creates
# them on the app's first migrate).
resource "null_resource" "pg_tripit_db" {
depends_on = [null_resource.pg_cluster]
triggers = {
db_name = "tripit"
username = "tripit"
}
provisioner "local-exec" {
command = <<-EOT
PRIMARY=$(kubectl --kubeconfig ${var.kube_config_path} get cluster -n dbaas pg-cluster -o jsonpath='{.status.currentPrimary}')
kubectl --kubeconfig ${var.kube_config_path} exec -n dbaas $PRIMARY -c postgres -- \
bash -c '
psql -U postgres -tc "SELECT 1 FROM pg_catalog.pg_roles WHERE rolname = '"'"'tripit'"'"'" | grep -q 1 || \
psql -U postgres -c "CREATE ROLE tripit WITH LOGIN PASSWORD '"'"'changeme-vault-will-rotate'"'"'"
psql -U postgres -tc "SELECT 1 FROM pg_catalog.pg_database WHERE datname = '"'"'tripit'"'"'" | grep -q 1 || \
psql -U postgres -c "CREATE DATABASE tripit OWNER tripit"
psql -U postgres -c "GRANT ALL PRIVILEGES ON DATABASE tripit TO tripit"
'
EOT
}
}
# Postiz: 3 databases (postiz, temporal, temporal_visibility) all owned by the
# `postiz` role. Bundled bitnami PostgreSQL was retired 2026-05-09 in favour of
# this CNPG cluster covered by postgresql-backup-per-db automatically.

@ -244,6 +244,10 @@ resource "kubernetes_deployment" "diun" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -127,6 +127,10 @@ resource "kubernetes_deployment" "ebook2audiobook" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}
@ -334,6 +338,10 @@ resource "kubernetes_deployment" "audiblez" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}
@ -429,6 +437,10 @@ resource "kubernetes_deployment" "audiblez-web" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -371,6 +371,10 @@ resource "kubernetes_deployment" "calibre-web-automated" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}
@ -498,6 +502,10 @@ resource "kubernetes_deployment" "annas-archive-stacks" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}
@ -653,6 +661,10 @@ resource "kubernetes_deployment" "audiobookshelf" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}
@ -934,6 +946,10 @@ resource "kubernetes_deployment" "book_search" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -81,6 +81,10 @@ resource "kubernetes_deployment" "echo" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -112,6 +112,10 @@ resource "kubernetes_deployment" "excalidraw" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -189,6 +189,11 @@ resource "kubernetes_deployment" "f1-stream" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -318,6 +318,12 @@ resource "kubernetes_deployment" "fire_planner" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
spec[0].template[0].spec[0].init_container[0].image,
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
@ -614,8 +620,11 @@ resource "kubernetes_config_map" "grafana_fire_planner_datasource" {
variable "llama_cpp_base_url" {
type = string
description = "llama-cpp /v1/chat/completions endpoint for primary LLM extraction"
default = "http://llama-cpp.llama-cpp.svc.cluster.local:8000/v1/chat/completions"
description = "llama-swap /v1/chat/completions endpoint for primary LLM extraction"
# Service is named `llama-swap`, NOT `llama-cpp`: it is the proxy in front of
# the actual llama-cpp pod. Port 8080. (Initial 2026-05-28 value pointed
# at a non-existent service:port and the bulk Job produced 0 rows.)
default = "http://llama-swap.llama-cpp.svc.cluster.local:8080/v1/chat/completions"
}
variable "claude_agent_service_url" {

@ -202,6 +202,7 @@ resource "kubernetes_deployment" "forgejo" {
metadata[0].annotations["keel.sh/match-tag"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"],

@ -209,6 +209,11 @@ resource "kubernetes_deployment" "freshrss" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -237,6 +237,11 @@ for name, det in stats.get('detectors', {}).items():
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -340,6 +340,12 @@ resource "kubernetes_deployment" "grampsweb" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
spec[0].template[0].spec[0].container[1].image,
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -178,6 +178,11 @@ resource "kubernetes_deployment" "hackmd" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -159,6 +159,11 @@ resource "kubernetes_deployment" "health" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -392,6 +392,12 @@ resource "kubernetes_deployment" "hermes_agent" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
spec[0].template[0].spec[0].init_container[0].image,
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -124,6 +124,11 @@ resource "kubernetes_deployment" "cache_proxy" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -29,6 +29,21 @@ provider "registry.terraform.io/gavinbunney/kubectl" {
constraints = "~> 1.14"
hashes = [
"h1:9QkxPjp0x5FZFfJbE+B7hBOoads9gmdfj9aYu5N4Sfc=",
"zh:1dec8766336ac5b00b3d8f62e3fff6390f5f60699c9299920fc9861a76f00c71",
"zh:43f101b56b58d7fead6a511728b4e09f7c41dc2e3963f59cf1c146c4767c6cb7",
"zh:4c4fbaa44f60e722f25cc05ee11dfaec282893c5c0ffa27bc88c382dbfbaa35c",
"zh:51dd23238b7b677b8a1abbfcc7deec53ffa5ec79e58e3b54d6be334d3d01bc0e",
"zh:5afc2ebc75b9d708730dbabdc8f94dd559d7f2fc5a31c5101358bd8d016916ba",
"zh:6be6e72d4663776390a82a37e34f7359f726d0120df622f4a2b46619338a168e",
"zh:72642d5fcf1e3febb6e5d4ae7b592bb9ff3cb220af041dbda893588e4bf30c0c",
"zh:9b12af85486a96aedd8d7984b0ff811a4b42e3d88dad1a3fb4c0b580d04fa425",
"zh:a1da03e3239867b35812ee031a1060fed6e8d8e458e2eaca48b5dd51b35f56f7",
"zh:b98b6a6728fe277fcd133bdfa7237bd733eae233f09653523f14460f608f8ba2",
"zh:bb8b071d0437f4767695c6158a3cb70df9f52e377c67019971d888b99147511f",
"zh:dc89ce4b63bfef708ec29c17e85ad0232a1794336dc54dd88c3ba0b77e764f71",
"zh:dd7dd18f1f8218c6cd19592288fde32dccc743cde05b9feeb2883f37c2ff4b4e",
"zh:ec4bd5ab3872dedb39fe528319b4bba609306e12ee90971495f109e142d66310",
"zh:f610ead42f724c82f5463e0e71fa735a11ffb6101880665d93f48b4a67b9ad82",
]
}
@ -37,6 +52,20 @@ provider "registry.terraform.io/goauthentik/authentik" {
constraints = "~> 2024.10"
hashes = [
"h1:roBMd+gi+TGgikH/bMzEI8JfvJiMAQWt+8FmokCrQIs=",
"zh:090260dc7889ea822ec1d899344e1ee23eba5290461989c0796149c9511f2316",
"zh:13c2655ff824b0dc4b9bb832b5ca6d41dba97cb280330258c5fef4115e236209",
"zh:166a73c3a810c9c895d68a8ff968158f339f8a2c1c03e20ec9fc5ed99cc64e20",
"zh:203777eae1cdc711233315499643180604cff2324411b186b7cf07fdbe16f655",
"zh:3b2f18c9a8d28dac74dc6bbf168c946855ab9c68f053578d4630c50d5eaf30a0",
"zh:4822275985f6b74b6196c47112316a4252db22cf4ceaef7c9ab4c66d488abf2f",
"zh:53ea97562666c8a5a2f6d63d418a302a7f8ee4b7bb7da35dedaa89aa5708b7f0",
"zh:56b8a230901e3550c92a1d3f58ee9dafe9853f30fe4315af3ab28ae63262e15d",
"zh:6293ab7b1fd8206a0c853591f50186aca4a1eff117b2a773e10760a23a2c83e9",
"zh:9433970f79fb92d8aae3ee436db5630ab312c78b6dc9df9c1db3273a18f8aaa1",
"zh:95df406214f79b3b98222d7c7fe8fc319a3d90b7a9d53e1d5abbda5dfb8b9436",
"zh:a85880da0552a42c8f449390fbd7d8b03541d1a13e04bba9f1404fa658754260",
"zh:a95f6e9bd62c67e70eba1b1a14728856b9a6a28cd1e5e3be54a7718882c87e7f",
"zh:dd599b51c5beb34a4c6feece244fde07d2558d69929449ab1fd39a5ebe738781",
]
}
@ -44,6 +73,18 @@ provider "registry.terraform.io/hashicorp/helm" {
version = "3.1.2"
hashes = [
"h1:lIuknMfM7+QTzPWs8VBocstZF0B3TpEMIj/bw+dLAOs=",
"zh:1086b24b20d94afc331eb38c52b70848899fd0efaed46d9f4646180b96e9dffd",
"zh:28bebd04f8d0c44291dc961597c89de5be1e62153191b8b466dbbfb254c696aa",
"zh:49a7dd287c2c80621ba0c25834b1afac88c45d47ad3a24cd0aed634d78b1bbd4",
"zh:574e146b128be51cd4d9ee66cb8352eac82c7e3be2dbf53a51516ca701bb8b7c",
"zh:68285c8987affaa635c9590a0cefe238ba277e12532b64cb2d7ffec570ade064",
"zh:6ce12b5eb8f1d9aa61c4d336905e0186f9ea82c8767169533be5b206e4bd33f4",
"zh:83b7743951c989732f191cb429549296bca6faecffed492094bef92bec5c9dcb",
"zh:84fe2d11907b4e9d0c536d8b50bb63ad4056f60a73c4b734d5de7435784e53a7",
"zh:c8a25498bfbde4916f178d6880d9ee56ed9ceb88bef4842cd47360faadbb3dfb",
"zh:dfad553c09b36a7df68c3622c78b835669e69aaf954735802e85375a8df01dff",
"zh:f569b65999264a9416862bca5cd2a6177d94ccb0424f3a4ef424428912b9cb3c",
"zh:fd6f36da732f442e421d2b90ed3925a1c9ad0992c380a61fe7681d90b34aa5f3",
]
}
@ -51,6 +92,18 @@ provider "registry.terraform.io/hashicorp/kubernetes" {
version = "3.1.0"
hashes = [
"h1:oodIAuFMikXNmEtil5MQgP4dfSctUBYQiGJfjbsF3NY=",
"zh:0215c5c60be62028c09a2f22458e89cda3ef5830a632299f1d401eb3538874b0",
"zh:09ebb9f442431e278a310a9423f32caf467cb4b3cad3fe59573ca71fa7b14e20",
"zh:0c4e5912f83bb35846ae0a9ae54fc320706ee61894cd21cc6b4181b1c5a2fa5c",
"zh:1678c982853ad461e65ccb5e79d585e13ed109dd47dab2a66d3a7a304faeef65",
"zh:1c050a5c15e330457a9c18caacf61a923c59d663e13f2962e4b32f04fef523a0",
"zh:2c55bcec83be58ec132c7cb0a1ac644758b800d794fdc636d53a0eada0358a3a",
"zh:a062bb0aa316c08d8460c66a5d68da71da40de5d3bc3b31abcf3a1a9a19650f1",
"zh:a26fdea0afaa9b247c73c0b42843ca51ba7db0ac2571f9d3d50dcabd20ca1b98",
"zh:c872c9385a78d502bf5823d61cd3bb0f9a0585030e025eb12585c83451beeaa1",
"zh:f180879af931182beee4c8c0d9dab62b81d86f17ddcbe3786ef4c7cec9163a4e",
"zh:f569b65999264a9416862bca5cd2a6177d94ccb0424f3a4ef424428912b9cb3c",
"zh:f70f5789264069e0eef06f9b5d5fde955ef7206f7d446d1ce51a4c37a3f3e02f",
]
}
@ -79,5 +132,19 @@ provider "registry.terraform.io/telmate/proxmox" {
constraints = "3.0.2-rc07"
hashes = [
"h1:zp5hpQJQ4t4zROSLqdltVpBO+Riy9VugtfFbpyTw1aM=",
"zh:2ee860cd0a368b3eaa53f4a9ea46f16dab8a97929e813ea6ef55183f8112c2ca",
"zh:415965fd915bae2040d7f79e45f64d6e3ae61149c10114efeac1b34687d7296c",
"zh:6584b2055df0e32062561c615e3b6b2c291ca8c959440adda09ef3ec1e1436bd",
"zh:65dcfad71928e0a8dd9befc22524ed686be5020b0024dc5cca5184c7420eeb6b",
"zh:7253dc29bd265d33f2791ac4f779c5413f16720bb717de8e6c5fcb2c858648ea",
"zh:7ec8993da10a47606670f9f67cfd10719a7580641d11c7aa761121c4a2bd66fb",
"zh:999a3f7a9dcf517967fc537e6ec930a8172203642fb01b8e1f78f908373db210",
"zh:a50e6df7280eb6584a5fd2456e3f5b6df13b2ec8a7fa4605511e438e1863be42",
"zh:b25b329a1e42681c509d027fee0365414f0cc5062b65690cfc3386aab16132ae",
"zh:c028877fdb438ece48f7bc02b65bbae9ca7b7befbd260e519ccab6c0cbb39f26",
"zh:cf0eaa3ea9fcc6d62793637947f1b8d7c885b6ad74695ab47e134e4ff132190f",
"zh:d5ade3fae031cc629b7c512a7b60e46570f4c41665e88a595d7efd943dde5ab2",
"zh:f388c15ad1ecfc09e7361e3b98bae9b627a3a85f7b908c9f40650969c949901c",
"zh:f415cc6f735a3971faae6ac24034afdb9ee83373ef8de19a9631c187d5adc7db",
]
}

@ -1,7 +1,7 @@
# Generated by Terragrunt. Sig: nIlQXj57tbuaRZEa
terraform {
backend "pg" {
conn_str = "postgres://terraform_state:LicuZK1nVl4ILE5HF-A9@10.0.20.200:5432/terraform_state?sslmode=disable"
conn_str = "postgres://terraform_state:WR2rnNyiLIb-gUcIxOeF@10.0.20.200:5432/terraform_state?sslmode=disable"
schema_name = "immich"
}
}

@ -96,8 +96,17 @@ resource "kubernetes_deployment" "immich-frame" {
}
}
lifecycle {
# KYVERNO_LIFECYCLE_V1: Kyverno admission webhook mutates dns_config with ndots=2
ignore_changes = [spec[0].template[0].spec[0].dns_config]
ignore_changes = [
spec[0].template[0].spec[0].dns_config, # KYVERNO_LIFECYCLE_V1
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE
]
}
}

@ -145,7 +145,7 @@ resource "kubernetes_namespace" "immich" {
# so this stack can own the tier-quota with a higher memory cap.
"resource-governance/custom-quota" = "true"
tier = local.tiers.gpu
"keel.sh/enrolled" = "true"
"keel.sh/enrolled" = "true"
}
}
lifecycle {
@ -221,7 +221,11 @@ resource "kubernetes_deployment" "immich_server" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE
metadata[0].annotations["keel.sh/match-tag"],
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE
]
}
@ -252,6 +256,19 @@ resource "kubernetes_deployment" "immich_server" {
}
spec {
# Pinned to the GPU node for NVENC hardware video transcoding (Tesla T4,
# time-sliced). The immich-server image's ffmpeg ships h264/hevc_nvenc;
# activation is via system-config ffmpeg.accel=nvenc.
priority_class_name = "gpu-workload"
node_selector = {
"nvidia.com/gpu.present" : "true"
}
toleration {
key = "nvidia.com/gpu"
operator = "Equal"
value = "true"
effect = "NoSchedule"
}
container {
name = "immich-server"
image = "ghcr.io/immich-app/immich-server:${var.immich_version}"
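The `toleration` block above only lets the pod land on the GPU node if it matches that node's taint. A rough Python sketch of the Kubernetes matching rule follows; the taint `nvidia.com/gpu=true:NoSchedule` on the node is an assumption (the diff shows only the toleration), and the `Taint`/`Toleration` classes are illustrative, not a real client API:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taint:
    key: str
    value: str
    effect: str


@dataclass(frozen=True)
class Toleration:
    key: str
    operator: str      # "Equal" or "Exists"
    value: str = ""
    effect: str = ""   # an empty effect tolerates every effect


def tolerates(tol: Toleration, taint: Taint) -> bool:
    """Return True if the toleration matches the taint."""
    if tol.effect and tol.effect != taint.effect:
        return False
    if tol.operator == "Exists":
        # An empty key combined with Exists tolerates every taint.
        return tol.key == "" or tol.key == taint.key
    # operator "Equal": key and value must both match.
    return tol.key == taint.key and tol.value == taint.value


gpu_taint = Taint("nvidia.com/gpu", "true", "NoSchedule")  # assumed node taint
immich_toleration = Toleration("nvidia.com/gpu", "Equal", "true", "NoSchedule")
print(tolerates(immich_toleration, gpu_taint))
```

The toleration only permits scheduling on the tainted node; the `node_selector` is what actually pins the pod there.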
@ -320,8 +337,8 @@ resource "kubernetes_deployment" "immich_server" {
path = "/api/server/ping"
port = "http"
}
period_seconds = 10
timeout_seconds = 1
period_seconds = 10
timeout_seconds = 1
# Bumped 30 -> 360 (5min -> 1h): after a PG restart, immich-server
# reindexes the clip_index + face_index vector tables before binding
# the API port. Hundreds of thousands of rows take longer than 5min
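The 5-minute and 1-hour figures in the comment are consistent with a probe budget of roughly failure threshold times period. A small Python check, assuming the bumped value is the probe's `failure_threshold` (the field name is cut off by the hunk boundary) and using the `period_seconds = 10` shown above:

```python
def probe_budget_seconds(failure_threshold: int, period_seconds: int) -> int:
    """Approximate time a probe may keep failing before the container is restarted."""
    return failure_threshold * period_seconds


old_budget = probe_budget_seconds(30, 10)    # value before the bump
new_budget = probe_budget_seconds(360, 10)   # value after the bump

print(old_budget // 60, "min ->", new_budget // 3600, "h")
```

This ignores any initial delay and the per-attempt timeout, so it is a lower-bound estimate rather than an exact figure.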
@ -367,7 +384,8 @@ resource "kubernetes_deployment" "immich_server" {
memory = "8Gi"
}
limits = {
memory = "8Gi"
memory = "8Gi"
"nvidia.com/gpu" = "1"
}
}
}
@ -458,7 +476,7 @@ resource "kubernetes_deployment" "immich-postgres" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE
]
}
@ -628,7 +646,11 @@ resource "kubernetes_deployment" "immich-machine-learning" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE
metadata[0].annotations["keel.sh/match-tag"],
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE
]
}

@ -215,6 +215,12 @@ resource "kubernetes_deployment" "insta2spotify" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
spec[0].template[0].spec[0].container[1].image,
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -367,6 +367,11 @@ resource "kubernetes_deployment" "instagram_poster" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}

@ -105,6 +105,11 @@ resource "kubernetes_deployment" "isponsorblocktv-vermont" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -271,6 +271,12 @@ resource "kubernetes_deployment" "job_hunter" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
spec[0].template[0].spec[0].init_container[0].image,
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}

@ -63,6 +63,11 @@ resource "kubernetes_deployment" "jsoncrack" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -311,6 +311,12 @@ resource "kubernetes_deployment" "windows_kms" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
spec[0].template[0].spec[0].container[1].image,
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
depends_on = [kubernetes_manifest.kms_slack_external_secret]

@ -313,6 +313,11 @@ resource "kubernetes_daemon_set_v1" "kured_sentinel_gate" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -110,6 +110,24 @@ resource "kubectl_manifest" "policy_inject_keel_annotations" {
# cnpg-system + dbaas (state-coupled), nvidia (pinned to
# 570.195.03 until NVIDIA ships ubuntu26.04 images per
# code-8vr0), kube-system (k8s built-ins).
#
# 2026-05-29: ADDED postiz. Two Keel failure modes, both
# unfixable while postiz stays enrolled:
# 1. Bundled redis StatefulSets run docker.io/bitnamilegacy/
# redis (the Broadcom archive repo). Keel hourly resolves
# newer patch tags (7.4.0 -> 7.4.1/7.4.2) and tries to roll,
# but require-trusted-registries (security-policies.tf)
# denies bitnamilegacy/* (only bitnami/* is allowlisted).
# Endless deny -> retry -> Slack-ping loop.
# 2. Keel bumped postiz-app v2.21.7 -> v2.21.8 (2026-05-26); the
# surge pod can't schedule under the 3Gi tier-4-aux quota,
# wedging the rollout for 3 days (rolled back to v2.21.7).
# postiz Terraform state is heavily drifted (~2/30 resources
# tracked; memory id=2798/2840), so per-workload opt-out can't
# be applied from the postiz stack. Namespace exclude here
# (clean kyverno state) is the reliable guard. Workloads also
# carry keel.sh/policy=never (annotation+label) set via kubectl
# since the postiz stack can't apply.
namespaces = [
"keel",
"calico-system",
@ -118,6 +136,7 @@ resource "kubectl_manifest" "policy_inject_keel_annotations" {
"nvidia",
"kube-system",
"tigera-operator",
"postiz",
]
}
},
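The first failure mode in the comment above comes down to a registry allowlist that matches `bitnami/*` but not `bitnamilegacy/*`. A minimal Python sketch of that mismatch using glob matching; the pattern list is hypothetical (the real rules live in `security-policies.tf`, which is not shown here) and the image tags are illustrative:

```python
from fnmatch import fnmatchcase

# Hypothetical allowlist; the source only states that bitnami/* is allowed.
ALLOWED_PATTERNS = ["docker.io/bitnami/*"]


def is_trusted(image: str) -> bool:
    """Return True if the image reference matches any allowlisted glob."""
    return any(fnmatchcase(image, pattern) for pattern in ALLOWED_PATTERNS)


# The archive repo fails the check because "bitnamilegacy/" != "bitnami/".
print(is_trusted("docker.io/bitnamilegacy/redis:7.4.1"))
print(is_trusted("docker.io/bitnami/redis:7.4.1"))
```

Because every roll Keel attempts is rejected at admission, excluding the namespace from enrollment stops the retries at their source.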

@ -208,6 +208,11 @@ resource "kubernetes_deployment" "linkwarden" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -99,8 +99,8 @@ resource "kubernetes_namespace" "llama_cpp" {
metadata {
name = local.namespace
labels = {
tier = local.tiers.gpu
"istio-injection" = "disabled"
tier = local.tiers.gpu
"istio-injection" = "disabled"
"keel.sh/enrolled" = "true"
}
}
@ -280,10 +280,12 @@ resource "kubernetes_deployment" "llama_swap" {
# for it to be reachable".
wait_for_rollout = false
spec {
# TEMP-SCALEDOWN-2026-05-25-IO-STORM: scaled to 0 during cluster recovery.
# Restore to 1 when cluster is fully stable. See post-mortem
# docs/post-mortems/2026-05-25-immich-anca-elements-io-storm.md.
replicas = 0
# Restored to 1 on 2026-05-29 (was 0 during 2026-05-25 IO-storm recovery;
# see docs/post-mortems/2026-05-25-immich-anca-elements-io-storm.md). The
# immediate trigger was fire-planner's examples ingest needing qwen3-8b for
# bulk Reddit-post extraction; only frigate is currently on the GPU on
# k8s-node1 so contention is minimal.
replicas = 1
strategy { type = "Recreate" }
selector {
@ -380,7 +382,7 @@ resource "kubernetes_deployment" "llama_swap" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE
# KEEL_LIFECYCLE_V1: stop the apply<->keel fight. Every keel digest
# update patches `keel.sh/update-time` on the pod template and
# `kubernetes.io/change-cause` + bumps the K8s rollout revision on

@ -191,6 +191,11 @@ resource "kubernetes_deployment" "local_path_provisioner" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -627,6 +627,19 @@ resource "kubernetes_deployment" "mailserver" {
}
}
}
lifecycle {
ignore_changes = [
spec[0].template[0].spec[0].dns_config, # KYVERNO_LIFECYCLE_V1
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}
resource "kubernetes_service" "mailserver" {

@ -247,7 +247,17 @@ resource "kubernetes_deployment" "roundcubemail" {
}
lifecycle {
# KYVERNO_LIFECYCLE_V1: Kyverno admission webhook mutates dns_config with ndots=2
ignore_changes = [spec[0].template[0].spec[0].dns_config]
ignore_changes = [
spec[0].template[0].spec[0].dns_config, # KYVERNO_LIFECYCLE_V1
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -210,6 +210,13 @@ resource "kubernetes_deployment" "matrix" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
spec[0].template[0].spec[0].init_container[0].image,
spec[0].template[0].spec[0].init_container[1].image,
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -256,6 +256,12 @@ EOT
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
spec[0].template[0].spec[0].init_container[0].image,
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -374,6 +374,11 @@ resource "kubernetes_deployment" "n8n" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -216,6 +216,11 @@ resource "kubernetes_deployment" "navidrome" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -207,6 +207,11 @@ resource "kubernetes_deployment" "netbox" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -77,6 +77,11 @@ resource "kubernetes_deployment" "networking-toolbox" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -169,6 +169,11 @@ resource "kubernetes_deployment" "ntfy" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -219,6 +219,11 @@ resource "kubernetes_deployment" "onlyoffice-document-server" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -2102,6 +2102,11 @@ resource "kubernetes_deployment" "openlobster" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -119,6 +119,11 @@ resource "kubernetes_deployment" "osrm-foot" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}
@ -208,6 +213,11 @@ resource "kubernetes_deployment" "osrm-bicycle" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}
@ -301,6 +311,11 @@ resource "kubernetes_deployment" "otp" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -200,6 +200,11 @@ resource "kubernetes_deployment" "owntracks" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -187,6 +187,10 @@ resource "kubernetes_deployment" "paperless-mcp" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"],
metadata[0].annotations["keel.sh/match-tag"],
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

@ -217,6 +217,11 @@ resource "kubernetes_deployment" "paperless-ngx" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}


@@ -303,6 +303,12 @@ resource "kubernetes_deployment" "payslip_ingest" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
spec[0].template[0].spec[0].init_container[0].image,
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}


@@ -207,6 +207,11 @@ resource "kubernetes_deployment" "phpipam_web" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}


@@ -185,6 +185,11 @@ resource "kubernetes_deployment" "poison_fountain" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}


@@ -423,6 +423,12 @@ resource "kubernetes_deployment" "temporal" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
spec[0].template[0].spec[0].init_container[0].image,
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
depends_on = [helm_release.postiz]


@@ -154,6 +154,12 @@ resource "kubernetes_deployment" "priority-pass" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
spec[0].template[0].spec[0].container[1].image,
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}


@@ -119,6 +119,11 @@ resource "kubernetes_deployment" "privatebin" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}


@@ -577,6 +577,11 @@ resource "kubernetes_deployment" "realestate-crawler-celery" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}
@@ -695,6 +700,11 @@ resource "kubernetes_deployment" "realestate-crawler-celery-beat" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}


@@ -297,6 +297,12 @@ resource "kubernetes_deployment" "recruiter_responder" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
spec[0].template[0].spec[0].init_container[0].image,
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}


@@ -150,6 +150,11 @@ resource "kubernetes_deployment" "printer" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}
@@ -332,6 +337,11 @@ resource "kubernetes_deployment" "resume" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}


@@ -236,6 +236,11 @@ resource "kubernetes_deployment" "clickhouse" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}
@@ -450,6 +455,11 @@ resource "kubernetes_deployment" "rybbit" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}
@@ -556,6 +566,11 @@ resource "kubernetes_deployment" "rybbit-client" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}


@@ -154,6 +154,11 @@ resource "kubernetes_deployment" "send" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}


@@ -149,7 +149,17 @@ resource "kubernetes_deployment" "aiostreams" {
}
lifecycle {
# KYVERNO_LIFECYCLE_V1: Kyverno admission webhook mutates dns_config with ndots=2
ignore_changes = [spec[0].template[0].spec[0].dns_config]
ignore_changes = [
spec[0].template[0].spec[0].dns_config, # KYVERNO_LIFECYCLE_V1
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}


@@ -49,7 +49,17 @@ resource "kubernetes_deployment" "flaresolverr" {
}
lifecycle {
# KYVERNO_LIFECYCLE_V1: Kyverno admission webhook mutates dns_config with ndots=2
ignore_changes = [spec[0].template[0].spec[0].dns_config]
ignore_changes = [
spec[0].template[0].spec[0].dns_config, # KYVERNO_LIFECYCLE_V1
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}


@@ -117,7 +117,17 @@ resource "kubernetes_deployment" "lidarr" {
}
lifecycle {
# KYVERNO_LIFECYCLE_V1: Kyverno admission webhook mutates dns_config with ndots=2
ignore_changes = [spec[0].template[0].spec[0].dns_config]
ignore_changes = [
spec[0].template[0].spec[0].dns_config, # KYVERNO_LIFECYCLE_V1
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}


@@ -107,7 +107,17 @@ resource "kubernetes_deployment" "listenarr" {
}
lifecycle {
# KYVERNO_LIFECYCLE_V1: Kyverno admission webhook mutates dns_config with ndots=2
ignore_changes = [spec[0].template[0].spec[0].dns_config]
ignore_changes = [
spec[0].template[0].spec[0].dns_config, # KYVERNO_LIFECYCLE_V1
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}


@@ -135,7 +135,17 @@ resource "kubernetes_deployment" "prowlarr" {
}
lifecycle {
# KYVERNO_LIFECYCLE_V1: Kyverno admission webhook mutates dns_config with ndots=2
ignore_changes = [spec[0].template[0].spec[0].dns_config]
ignore_changes = [
spec[0].template[0].spec[0].dns_config, # KYVERNO_LIFECYCLE_V1
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}


@@ -153,7 +153,17 @@ resource "kubernetes_deployment" "qbittorrent" {
}
lifecycle {
# KYVERNO_LIFECYCLE_V1: Kyverno admission webhook mutates dns_config with ndots=2
ignore_changes = [spec[0].template[0].spec[0].dns_config]
ignore_changes = [
spec[0].template[0].spec[0].dns_config, # KYVERNO_LIFECYCLE_V1
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}


@@ -117,7 +117,17 @@ resource "kubernetes_deployment" "readarr" {
}
lifecycle {
# KYVERNO_LIFECYCLE_V1: Kyverno admission webhook mutates dns_config with ndots=2
ignore_changes = [spec[0].template[0].spec[0].dns_config]
ignore_changes = [
spec[0].template[0].spec[0].dns_config, # KYVERNO_LIFECYCLE_V1
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}


@@ -81,7 +81,17 @@ resource "kubernetes_deployment" "soulseek" {
}
lifecycle {
# KYVERNO_LIFECYCLE_V1: Kyverno admission webhook mutates dns_config with ndots=2
ignore_changes = [spec[0].template[0].spec[0].dns_config]
ignore_changes = [
spec[0].template[0].spec[0].dns_config, # KYVERNO_LIFECYCLE_V1
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}


@@ -121,6 +121,11 @@ resource "kubernetes_deployment" "shadowsocks" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}


@@ -214,6 +214,11 @@ resource "kubernetes_deployment" "speedtest" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}


@@ -117,6 +117,11 @@ resource "kubernetes_deployment" "stirling-pdf" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}


@@ -238,6 +238,11 @@ resource "kubernetes_deployment" "tandoor" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}


@@ -112,6 +112,11 @@ resource "kubernetes_deployment" "tor-proxy" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}
@@ -250,6 +255,11 @@ resource "kubernetes_deployment" "torrserver" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}


@@ -130,6 +130,9 @@ resource "helm_release" "traefik" {
}
}
}
proxyProtocol = {
trustedIPs = ["10.0.20.1"]
}
}
websecure = {
port = 8443
@@ -147,6 +150,13 @@ resource "helm_release" "traefik" {
enabled = true
advertisedPort = 443
}
# Accept PROXY-v2 ONLY from the pfSense HAProxy IPv6 bridge (10.0.20.1)
# so IPv6 clients (forwarded [2001:470:6e:43d::2] -> here) get their real
# IP for CrowdSec. Real IPv4 clients arrive with their own source IP
# (ETP=Local, not 10.0.20.1) and are unaffected.
proxyProtocol = {
trustedIPs = ["10.0.20.1"]
}
}
whisper-tcp = {
port = 10300
@@ -165,11 +175,14 @@ resource "helm_release" "traefik" {
service = {
type = "LoadBalancer"
annotations = {
"metallb.io/loadBalancerIPs" = "10.0.20.200"
"metallb.io/allow-shared-ip" = "shared"
# Dedicated IP + ETP=Local so direct-app clients keep their real source
# IP (CrowdSec) and QUIC handshakes pin to one pod. Proxied apps are
# unaffected: cloudflared targets the in-cluster Traefik Service
# (traefik.traefik.svc), not this LB IP, so the LB IP can move freely.
"metallb.io/loadBalancerIPs" = "10.0.20.203"
}
spec = {
externalTrafficPolicy = "Cluster"
externalTrafficPolicy = "Local"
}
}
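
The `proxyProtocol.trustedIPs` setting above means a PROXY header is honoured only when the TCP peer is the pfSense bridge (10.0.20.1). A minimal Python sketch of that trust rule follows, assuming the PROXY protocol v2 binary layout. `client_ip` and `TRUSTED_PROXIES` are hypothetical names used for illustration, and this is not Traefik's implementation.

```python
import ipaddress
import struct

# 12-byte signature that opens every PROXY protocol v2 header.
PP2_SIGNATURE = b"\r\n\r\n\x00\r\nQUIT\n"
# Assumption for this sketch: the only trusted sender is the bridge address.
TRUSTED_PROXIES = {ipaddress.ip_address("10.0.20.1")}


def client_ip(peer: str, data: bytes) -> str:
    """Return the client IP for a connection from `peer` whose first bytes are `data`."""
    # Untrusted peers are never allowed to assert a different client address.
    if ipaddress.ip_address(peer) not in TRUSTED_PROXIES:
        return peer
    if len(data) < 16 or data[:12] != PP2_SIGNATURE:
        return peer
    ver_cmd, fam_proto, length = struct.unpack("!BBH", data[12:16])
    # 0x21 = version 2 with the PROXY command; LOCAL (0x20) carries no client address.
    if ver_cmd != 0x21:
        return peer
    body = data[16:16 + length]
    if fam_proto == 0x11 and len(body) >= 12:  # TCP over IPv4
        return str(ipaddress.IPv4Address(body[0:4]))
    if fam_proto == 0x21 and len(body) >= 36:  # TCP over IPv6
        return str(ipaddress.IPv6Address(body[0:16]))
    return peer
```

A direct IPv4 client arrives with its own source address, so it falls through the first check and its peer address is used unchanged, which matches the "unaffected" note in the config comment.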


@@ -82,6 +82,11 @@ resource "kubernetes_deployment" "blog" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}

stacks/tripit/main.tf

@@ -0,0 +1,485 @@
variable "image_tag" {
type = string
default = "latest"
description = "tripit image tag. Use 8-char git SHA in CI; :latest only for local trials."
}
variable "postgresql_host" { type = string }
variable "nfs_server" { type = string }
variable "tls_secret_name" {
type = string
sensitive = true
}
locals {
namespace = "tripit"
image = "forgejo.viktorbarzin.me/viktor/tripit:${var.image_tag}"
labels = {
app = "tripit"
}
# Env shared by the Deployment app container and the three worker CronJobs.
# Defaults until the real integrations are wired: FLIGHT_PROVIDER=fake,
# WEATHER_PROVIDER=openmeteo, PUSH_PROVIDER=webpush, LLM_MODE=fake,
# MAIL_INGEST_ENABLED=false. The ingest-mail CronJob overrides LLM_MODE and
# MAIL_INGEST_ENABLED through its extra_env.
# AUTH_MODE=forwardauth: the backend trusts the Authentik-injected
# X-authentik-email header (forward-auth at the ingress). STORAGE_DIR points
# at the RWX NFS PVC because the app's default ./var is not writable by the
# non-root user.
app_env = {
AUTH_MODE = "forwardauth"
SERVE_FRONTEND_DIR = "/app/frontend_build"
STORAGE_DIR = "/data/documents"
FLIGHT_PROVIDER = "fake"
WEATHER_PROVIDER = "openmeteo"
PUSH_PROVIDER = "webpush"
LLM_MODE = "fake"
MAIL_INGEST_ENABLED = "false"
}
}
resource "kubernetes_namespace" "tripit" {
metadata {
name = local.namespace
labels = {
tier = local.tiers.aux
"istio-injection" = "disabled"
# Opt into Keel auto-update (inject-keel-annotations ClusterPolicy).
"keel.sh/enrolled" = "true"
}
}
lifecycle {
# KYVERNO_LIFECYCLE_V1: goldilocks-vpa-auto-mode ClusterPolicy stamps this label.
ignore_changes = [metadata[0].labels["goldilocks.fairwinds.com/vpa-update-mode"]]
}
}
# App secrets: seed these in Vault before applying:
# secret/tripit
# VAPID_PUBLIC_KEY: Web Push (VAPID) public key for push subscriptions
# VAPID_PRIVATE_KEY: Web Push (VAPID) private key
# VAPID_SUBJECT: VAPID subject (mailto: or https: URL)
# CALENDAR_TOKEN_SECRET: HMAC secret used to mint/verify the per-user
# .ics calendar feed tokens (the /api/calendar
# carve-out is gated by these tokens, not Authentik)
# IMAP_PASSWORD: mailbox password for the ingest-mail worker
#
# Schema in CNPG: `tripit` (alembic creates tables on first migrate).
# DB user: created via the Vault database engine; see static-creds/pg-tripit.
resource "kubernetes_manifest" "external_secret" {
manifest = {
apiVersion = "external-secrets.io/v1beta1"
kind = "ExternalSecret"
metadata = {
name = "tripit-secrets"
namespace = local.namespace
}
spec = {
refreshInterval = "15m"
secretStoreRef = {
name = "vault-kv"
kind = "ClusterSecretStore"
}
target = {
name = "tripit-secrets"
template = {
metadata = {
annotations = {
"reloader.stakater.com/match" = "true"
}
}
}
}
data = [
{ secretKey = "VAPID_PUBLIC_KEY", remoteRef = { key = "tripit", property = "VAPID_PUBLIC_KEY" } },
{ secretKey = "VAPID_PRIVATE_KEY", remoteRef = { key = "tripit", property = "VAPID_PRIVATE_KEY" } },
{ secretKey = "VAPID_SUBJECT", remoteRef = { key = "tripit", property = "VAPID_SUBJECT" } },
{ secretKey = "CALENDAR_TOKEN_SECRET", remoteRef = { key = "tripit", property = "CALENDAR_TOKEN_SECRET" } },
{ secretKey = "IMAP_PASSWORD", remoteRef = { key = "tripit", property = "IMAP_PASSWORD" } },
]
}
}
depends_on = [kubernetes_namespace.tripit]
}
# DB credentials from Vault database engine (7-day rotation).
# Builds the asyncpg DSN consumed by the FastAPI app as DB_CONNECTION_STRING.
# Pre-req in dbaas: CNPG cluster has DB `tripit`, role `tripit`, and Vault
# role `static-creds/pg-tripit`.
resource "kubernetes_manifest" "db_external_secret" {
manifest = {
apiVersion = "external-secrets.io/v1beta1"
kind = "ExternalSecret"
metadata = {
name = "tripit-db-creds"
namespace = local.namespace
}
spec = {
refreshInterval = "15m"
secretStoreRef = {
name = "vault-database"
kind = "ClusterSecretStore"
}
target = {
name = "tripit-db-creds"
template = {
metadata = {
annotations = {
"reloader.stakater.com/match" = "true"
}
}
data = {
DB_CONNECTION_STRING = "postgresql+asyncpg://tripit:{{ .password }}@${var.postgresql_host}:5432/tripit"
DB_PASSWORD = "{{ .password }}"
}
}
}
data = [{
secretKey = "password"
remoteRef = {
key = "static-creds/pg-tripit"
property = "password"
}
}]
}
}
depends_on = [kubernetes_namespace.tripit]
}
# RWX NFS PVC for the documents vault. Mounted at /data/documents on the
# Deployment app container and on every worker CronJob (they all share the
# same document store, hence RWX). Lives under /srv/nfs on the Proxmox host,
# so the daily-backup pipeline auto-discovers and versions it.
module "documents_nfs" {
source = "../../modules/kubernetes/nfs_volume"
name = "tripit-documents-host"
namespace = kubernetes_namespace.tripit.metadata[0].name
nfs_server = var.nfs_server
nfs_path = "/srv/nfs/tripit-documents"
storage = "5Gi"
access_modes = ["ReadWriteMany"]
}
resource "kubernetes_deployment" "tripit" {
metadata {
name = "tripit"
namespace = kubernetes_namespace.tripit.metadata[0].name
labels = merge(local.labels, {
tier = local.tiers.aux
})
annotations = {
"reloader.stakater.com/search" = "true"
}
}
spec {
# Single leader: APScheduler-style reminders + the RWX document store want
# one writer. Recreate avoids two pods racing the same NFS volume.
replicas = 1
strategy {
type = "Recreate"
}
selector {
match_labels = local.labels
}
template {
metadata {
labels = local.labels
}
spec {
image_pull_secrets {
name = "registry-credentials"
}
init_container {
name = "alembic-migrate"
image = local.image
command = ["alembic", "upgrade", "head"]
env_from {
secret_ref { name = "tripit-secrets" }
}
env_from {
secret_ref { name = "tripit-db-creds" }
}
resources {
requests = { cpu = "50m", memory = "256Mi" }
limits = { memory = "512Mi" }
}
}
container {
name = "tripit"
image = local.image
port {
container_port = 8080
}
env_from {
secret_ref { name = "tripit-secrets" }
}
env_from {
secret_ref { name = "tripit-db-creds" }
}
dynamic "env" {
for_each = local.app_env
content {
name = env.key
value = env.value
}
}
volume_mount {
name = "documents"
mount_path = "/data/documents"
}
readiness_probe {
http_get {
path = "/healthz"
port = 8080
}
initial_delay_seconds = 5
period_seconds = 10
}
liveness_probe {
http_get {
path = "/healthz"
port = 8080
}
initial_delay_seconds = 30
period_seconds = 30
}
resources {
requests = { cpu = "100m", memory = "384Mi" }
limits = { memory = "768Mi" }
}
}
volume {
name = "documents"
persistent_volume_claim {
claim_name = module.documents_nfs.claim_name
}
}
}
}
}
lifecycle {
ignore_changes = [
spec[0].template[0].spec[0].dns_config, # KYVERNO_LIFECYCLE_V1
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE Keel manages tag updates
spec[0].template[0].spec[0].init_container[0].image,
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
depends_on = [
kubernetes_manifest.external_secret,
kubernetes_manifest.db_external_secret,
]
}
# Worker CronJobs share the app image + secret/env wiring. Defined via a map
# so the three jobs (poll-flights, run-reminders, ingest-mail) stay identical
# except for schedule, subcommand, the suspend flag, and extra_env overrides.
locals {
cronjobs = {
poll-flights = {
schedule = "*/30 * * * *"
command = ["python", "-m", "tripit_api", "poll-flights"]
suspend = false
extra_env = {}
}
run-reminders = {
schedule = "*/15 * * * *"
command = ["python", "-m", "tripit_api", "run-reminders"]
suspend = false
extra_env = {}
}
# Ongoing forward-to-parse ingest of me@viktorbarzin.me's mailbox. Uses the
# real local LLM (qwen3vl-4b on llama-swap; qwen3-8b OOMs the shared T4).
# Read-only IMAP (BODY.PEEK), bounded to the 30 most-recent messages/run;
# the pipeline is idempotent (skips message_ids already in inbound_email),
# so re-reading the recent window is a no-op for already-seen mail.
# IMAP_PASSWORD is injected from secret/tripit via the tripit-secrets ES.
ingest-mail = {
schedule = "*/30 * * * *"
command = ["python", "-m", "tripit_api", "ingest-mail"]
suspend = false
extra_env = {
LLM_MODE = "llamacpp"
LLM_ENDPOINT = "http://llama-swap.llama-cpp.svc.cluster.local:8080"
LLM_MODEL = "qwen3vl-4b"
MAIL_INGEST_ENABLED = "true"
MAIL_DEFAULT_OWNER_EMAIL = "me@viktorbarzin.me"
IMAP_HOST = "mailserver.mailserver.svc.cluster.local"
IMAP_PORT = "993"
IMAP_USER = "me@viktorbarzin.me"
IMAP_FOLDER = "INBOX"
IMAP_USE_SSL = "true"
IMAP_RECENT_N = "30"
}
}
}
}
resource "kubernetes_cron_job_v1" "tripit_worker" {
for_each = local.cronjobs
metadata {
name = "tripit-${each.key}"
namespace = kubernetes_namespace.tripit.metadata[0].name
labels = local.labels
}
spec {
schedule = each.value.schedule
suspend = each.value.suspend
concurrency_policy = "Forbid"
successful_jobs_history_limit = 3
failed_jobs_history_limit = 5
starting_deadline_seconds = 300
job_template {
metadata {
labels = local.labels
}
spec {
backoff_limit = 1
ttl_seconds_after_finished = 86400
template {
metadata {
labels = local.labels
}
spec {
restart_policy = "OnFailure"
image_pull_secrets {
name = "registry-credentials"
}
container {
name = "worker"
image = local.image
command = each.value.command
env_from {
secret_ref { name = "tripit-secrets" }
}
env_from {
secret_ref { name = "tripit-db-creds" }
}
dynamic "env" {
for_each = merge(local.app_env, each.value.extra_env)
content {
name = env.key
value = env.value
}
}
volume_mount {
name = "documents"
mount_path = "/data/documents"
}
resources {
requests = { cpu = "50m", memory = "256Mi" }
limits = { memory = "512Mi" }
}
}
volume {
name = "documents"
persistent_volume_claim {
claim_name = module.documents_nfs.claim_name
}
}
}
}
}
}
}
lifecycle {
# KYVERNO_LIFECYCLE_V1: Kyverno admission webhook mutates dns_config with ndots=2
ignore_changes = [spec[0].job_template[0].spec[0].template[0].spec[0].dns_config]
}
depends_on = [
kubernetes_manifest.external_secret,
kubernetes_manifest.db_external_secret,
]
}
resource "kubernetes_service" "tripit" {
metadata {
name = "tripit"
namespace = kubernetes_namespace.tripit.metadata[0].name
labels = local.labels
annotations = {
"prometheus.io/scrape" = "true"
"prometheus.io/path" = "/metrics"
"prometheus.io/port" = "8080"
}
}
spec {
type = "ClusterIP"
selector = local.labels
port {
name = "http"
port = 8080
target_port = 8080
}
}
}
# Kyverno ClusterPolicy `sync-tls-secret` auto-clones the wildcard TLS
# secret into every namespace, so we don't need a setup_tls_secret module.
# Main host: Authentik forward-auth gates every request. The backend reads
# the injected X-authentik-email header (AUTH_MODE=forwardauth) for multi-user
# SSO; it ships no login of its own, so Authentik is the gate.
module "ingress" {
source = "../../modules/kubernetes/ingress_factory"
auth = "required"
dns_type = "proxied"
namespace = kubernetes_namespace.tripit.metadata[0].name
name = "tripit"
port = 8080
tls_secret_name = var.tls_secret_name
}
# Calendar feed carve-out for the same host: path /api/calendar served by the
# bare tripit service, bypassing Authentik.
module "ingress_calendar" {
source = "../../modules/kubernetes/ingress_factory"
# auth = "none": GET /api/calendar/{token}.ics is token-gated by an HMAC
# secret (CALENDAR_TOKEN_SECRET), not Authentik; external calendar clients
# (Apple Calendar, Google, Thunderbird) can't complete the Authentik login
# dance, so forward-auth would break ICS subscriptions. The token is the gate.
auth = "none"
anti_ai_scraping = false
dns_type = "none" # main `module.ingress` owns the DNS record for this host
namespace = kubernetes_namespace.tripit.metadata[0].name
name = "tripit-calendar"
service_name = "tripit"
full_host = "tripit.viktorbarzin.me"
ingress_path = ["/api/calendar"]
port = 8080
tls_secret_name = var.tls_secret_name
}
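
A minimal sketch of how an HMAC-gated calendar token such as the one described for `/api/calendar/{token}.ics` could work. The backend code is not part of this diff, so everything here is an assumption for illustration: the function names `make_calendar_token` and `verify_calendar_token`, the `<payload>.<signature>` token layout, and the use of the user's email as the payload are all hypothetical. Only the idea that a secret (`CALENDAR_TOKEN_SECRET`) signs the token comes from the comment above.

```python
import base64
import hashlib
import hmac
from typing import Optional


def make_calendar_token(secret: bytes, user_email: str) -> str:
    """Build a URL-safe token of the (assumed) form <payload>.<signature>."""
    # URL-safe base64 never contains ".", so "." is a safe separator.
    payload = base64.urlsafe_b64encode(user_email.encode()).decode().rstrip("=")
    signature = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{signature}"


def verify_calendar_token(secret: bytes, token: str) -> Optional[str]:
    """Return the user email if the signature is valid, otherwise None."""
    payload, sep, signature = token.partition(".")
    if not sep:
        return None
    expected = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    padded = payload + "=" * (-len(payload) % 4)
    return base64.urlsafe_b64decode(padded).decode()
```

Because the signature is checked before the payload is decoded, a calendar client only needs the URL itself, which is why this path can sit behind `auth = "none"` while the rest of the host stays behind forward-auth.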


@ -0,0 +1,23 @@
include "root" {
path = find_in_parent_folders()
}
dependency "platform" {
config_path = "../platform"
skip_outputs = true
}
dependency "vault" {
config_path = "../vault"
skip_outputs = true
}
dependency "external-secrets" {
config_path = "../external-secrets"
skip_outputs = true
}
inputs = {
# Override per-deploy in CI / commit.
image_tag = "latest"
}


@ -24,11 +24,48 @@ provider "registry.terraform.io/cloudflare/cloudflare" {
]
}
provider "registry.terraform.io/gavinbunney/kubectl" {
version = "1.19.0"
constraints = "~> 1.14"
hashes = [
"h1:9QkxPjp0x5FZFfJbE+B7hBOoads9gmdfj9aYu5N4Sfc=",
"zh:1dec8766336ac5b00b3d8f62e3fff6390f5f60699c9299920fc9861a76f00c71",
"zh:43f101b56b58d7fead6a511728b4e09f7c41dc2e3963f59cf1c146c4767c6cb7",
"zh:4c4fbaa44f60e722f25cc05ee11dfaec282893c5c0ffa27bc88c382dbfbaa35c",
"zh:51dd23238b7b677b8a1abbfcc7deec53ffa5ec79e58e3b54d6be334d3d01bc0e",
"zh:5afc2ebc75b9d708730dbabdc8f94dd559d7f2fc5a31c5101358bd8d016916ba",
"zh:6be6e72d4663776390a82a37e34f7359f726d0120df622f4a2b46619338a168e",
"zh:72642d5fcf1e3febb6e5d4ae7b592bb9ff3cb220af041dbda893588e4bf30c0c",
"zh:9b12af85486a96aedd8d7984b0ff811a4b42e3d88dad1a3fb4c0b580d04fa425",
"zh:a1da03e3239867b35812ee031a1060fed6e8d8e458e2eaca48b5dd51b35f56f7",
"zh:b98b6a6728fe277fcd133bdfa7237bd733eae233f09653523f14460f608f8ba2",
"zh:bb8b071d0437f4767695c6158a3cb70df9f52e377c67019971d888b99147511f",
"zh:dc89ce4b63bfef708ec29c17e85ad0232a1794336dc54dd88c3ba0b77e764f71",
"zh:dd7dd18f1f8218c6cd19592288fde32dccc743cde05b9feeb2883f37c2ff4b4e",
"zh:ec4bd5ab3872dedb39fe528319b4bba609306e12ee90971495f109e142d66310",
"zh:f610ead42f724c82f5463e0e71fa735a11ffb6101880665d93f48b4a67b9ad82",
]
}
provider "registry.terraform.io/goauthentik/authentik" {
version = "2024.12.1"
constraints = "~> 2024.10"
hashes = [
"h1:roBMd+gi+TGgikH/bMzEI8JfvJiMAQWt+8FmokCrQIs=",
"zh:090260dc7889ea822ec1d899344e1ee23eba5290461989c0796149c9511f2316",
"zh:13c2655ff824b0dc4b9bb832b5ca6d41dba97cb280330258c5fef4115e236209",
"zh:166a73c3a810c9c895d68a8ff968158f339f8a2c1c03e20ec9fc5ed99cc64e20",
"zh:203777eae1cdc711233315499643180604cff2324411b186b7cf07fdbe16f655",
"zh:3b2f18c9a8d28dac74dc6bbf168c946855ab9c68f053578d4630c50d5eaf30a0",
"zh:4822275985f6b74b6196c47112316a4252db22cf4ceaef7c9ab4c66d488abf2f",
"zh:53ea97562666c8a5a2f6d63d418a302a7f8ee4b7bb7da35dedaa89aa5708b7f0",
"zh:56b8a230901e3550c92a1d3f58ee9dafe9853f30fe4315af3ab28ae63262e15d",
"zh:6293ab7b1fd8206a0c853591f50186aca4a1eff117b2a773e10760a23a2c83e9",
"zh:9433970f79fb92d8aae3ee436db5630ab312c78b6dc9df9c1db3273a18f8aaa1",
"zh:95df406214f79b3b98222d7c7fe8fc319a3d90b7a9d53e1d5abbda5dfb8b9436",
"zh:a85880da0552a42c8f449390fbd7d8b03541d1a13e04bba9f1404fa658754260",
"zh:a95f6e9bd62c67e70eba1b1a14728856b9a6a28cd1e5e3be54a7718882c87e7f",
"zh:dd599b51c5beb34a4c6feece244fde07d2558d69929449ab1fd39a5ebe738781",
]
}
@ -56,6 +93,18 @@ provider "registry.terraform.io/hashicorp/kubernetes" {
version = "3.1.0"
hashes = [
"h1:oodIAuFMikXNmEtil5MQgP4dfSctUBYQiGJfjbsF3NY=",
"zh:0215c5c60be62028c09a2f22458e89cda3ef5830a632299f1d401eb3538874b0",
"zh:09ebb9f442431e278a310a9423f32caf467cb4b3cad3fe59573ca71fa7b14e20",
"zh:0c4e5912f83bb35846ae0a9ae54fc320706ee61894cd21cc6b4181b1c5a2fa5c",
"zh:1678c982853ad461e65ccb5e79d585e13ed109dd47dab2a66d3a7a304faeef65",
"zh:1c050a5c15e330457a9c18caacf61a923c59d663e13f2962e4b32f04fef523a0",
"zh:2c55bcec83be58ec132c7cb0a1ac644758b800d794fdc636d53a0eada0358a3a",
"zh:a062bb0aa316c08d8460c66a5d68da71da40de5d3bc3b31abcf3a1a9a19650f1",
"zh:a26fdea0afaa9b247c73c0b42843ca51ba7db0ac2571f9d3d50dcabd20ca1b98",
"zh:c872c9385a78d502bf5823d61cd3bb0f9a0585030e025eb12585c83451beeaa1",
"zh:f180879af931182beee4c8c0d9dab62b81d86f17ddcbe3786ef4c7cec9163a4e",
"zh:f569b65999264a9416862bca5cd2a6177d94ccb0424f3a4ef424428912b9cb3c",
"zh:f70f5789264069e0eef06f9b5d5fde955ef7206f7d446d1ce51a4c37a3f3e02f",
]
}
@ -79,3 +128,25 @@ provider "registry.terraform.io/hashicorp/vault" {
"zh:ff35fb1ab6add288f0f368981e56f780b50405accd1937131cba1137999c8d83",
]
}
provider "registry.terraform.io/telmate/proxmox" {
version = "3.0.2-rc07"
constraints = "3.0.2-rc07"
hashes = [
"h1:zp5hpQJQ4t4zROSLqdltVpBO+Riy9VugtfFbpyTw1aM=",
"zh:2ee860cd0a368b3eaa53f4a9ea46f16dab8a97929e813ea6ef55183f8112c2ca",
"zh:415965fd915bae2040d7f79e45f64d6e3ae61149c10114efeac1b34687d7296c",
"zh:6584b2055df0e32062561c615e3b6b2c291ca8c959440adda09ef3ec1e1436bd",
"zh:65dcfad71928e0a8dd9befc22524ed686be5020b0024dc5cca5184c7420eeb6b",
"zh:7253dc29bd265d33f2791ac4f779c5413f16720bb717de8e6c5fcb2c858648ea",
"zh:7ec8993da10a47606670f9f67cfd10719a7580641d11c7aa761121c4a2bd66fb",
"zh:999a3f7a9dcf517967fc537e6ec930a8172203642fb01b8e1f78f908373db210",
"zh:a50e6df7280eb6584a5fd2456e3f5b6df13b2ec8a7fa4605511e438e1863be42",
"zh:b25b329a1e42681c509d027fee0365414f0cc5062b65690cfc3386aab16132ae",
"zh:c028877fdb438ece48f7bc02b65bbae9ca7b7befbd260e519ccab6c0cbb39f26",
"zh:cf0eaa3ea9fcc6d62793637947f1b8d7c885b6ad74695ab47e134e4ff132190f",
"zh:d5ade3fae031cc629b7c512a7b60e46570f4c41665e88a595d7efd943dde5ab2",
"zh:f388c15ad1ecfc09e7361e3b98bae9b627a3a85f7b908c9f40650969c949901c",
"zh:f415cc6f735a3971faae6ac24034afdb9ee83373ef8de19a9631c187d5adc7db",
]
}


@ -1,7 +1,7 @@
# Generated by Terragrunt. Sig: nIlQXj57tbuaRZEa
terraform {
backend "pg" {
conn_str = "postgres://terraform_state:SBlzGxotNUN6HH9d0S-m@10.0.20.200:5432/terraform_state?sslmode=disable"
conn_str = "postgres://terraform_state:WR2rnNyiLIb-gUcIxOeF@10.0.20.200:5432/terraform_state?sslmode=disable"
schema_name = "tuya-bridge"
}
}


@ -1,8 +1,3 @@
variable "tls_secret_name" {
type = string
sensitive = true
}
resource "kubernetes_namespace" "tuya-bridge" {
metadata {
name = "tuya-bridge"
@ -77,9 +72,13 @@ resource "kubernetes_deployment" "tuya-bridge" {
}
}
spec {
image_pull_secrets {
name = "registry-credentials"
}
container {
image = "viktorbarzin/tuya_bridge:latest"
name = "tuya-bridge"
image = "forgejo.viktorbarzin.me/viktor/tuya_bridge:${var.image_tag}"
image_pull_policy = "IfNotPresent"
name = "tuya-bridge"
port {
container_port = 8080
}
@ -158,6 +157,11 @@ resource "kubernetes_deployment" "tuya-bridge" {
metadata[0].annotations["keel.sh/policy"],
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
metadata[0].annotations["keel.sh/match-tag"],
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE: Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
]
}
}


@ -13,6 +13,17 @@ terraform {
source = "goauthentik/authentik"
version = "~> 2024.10"
}
# kubectl (gavinbunney) workaround for hashicorp/kubernetes
# `kubernetes_manifest` panics on Kyverno CRDs. See beads code-e2dp.
# Declared for all stacks but only used where opted-in.
kubectl = {
source = "gavinbunney/kubectl"
version = "~> 1.14"
}
proxmox = {
source = "telmate/proxmox"
version = "3.0.2-rc07"
}
}
}
@ -35,3 +46,8 @@ provider "vault" {
address = "https://vault.viktorbarzin.me"
skip_child_token = true
}
provider "kubectl" {
config_path = var.kube_config_path
load_config_file = true
}


@ -0,0 +1,10 @@
variable "tls_secret_name" {
type = string
sensitive = true
}
variable "image_tag" {
type = string
default = "latest"
description = "tuya_bridge image tag pushed to forgejo.viktorbarzin.me/viktor/tuya_bridge. Each Woodpecker run does `kubectl set image` to the 8-char git SHA; this variable is only used on initial create / TF recreate (image is in lifecycle.ignore_changes)."
}


@ -193,6 +193,10 @@ resource "kubernetes_deployment" "uptime-kuma" {
# back to `force` and re-enable auto-updates.
metadata[0].annotations["keel.sh/trigger"],
metadata[0].annotations["keel.sh/pollSchedule"], # KYVERNO_LIFECYCLE_V2
spec[0].template[0].spec[0].container[0].image, # KEEL_IGNORE_IMAGE: Keel manages tag updates
metadata[0].annotations["kubernetes.io/change-cause"],
metadata[0].annotations["deployment.kubernetes.io/revision"],
spec[0].template[0].metadata[0].annotations["keel.sh/update-time"], # KEEL_LIFECYCLE_V1
metadata[0].annotations["keel.sh/match-tag"], # injected by Kyverno
]
}
@ -591,6 +595,9 @@ locals {
database_password_vault_key = "uptimekuma_db_password"
hostname = null
port = null
url = null
accepted_statuscodes = null
ignore_tls = null
interval = 60
retry_interval = 60
max_retries = 2
@ -606,6 +613,9 @@ locals {
database_password_vault_key = null
hostname = null
port = null
url = null
accepted_statuscodes = null
ignore_tls = null
interval = 60
retry_interval = 30
max_retries = 3
@ -621,10 +631,36 @@ locals {
database_password_vault_key = null
hostname = "192.168.1.1"
port = 443
url = null
accepted_statuscodes = null
ignore_tls = null
interval = 60
retry_interval = 30
max_retries = 3
},
{
# Proxmox web UI on the PVE host. Probes the IP directly (NOT a
# `*.viktorbarzin.lan` name) because in-cluster lookups for those
# are vulnerable to CoreDNS pod-level cache skew. Pre-fix, this
# monitor would intermittently land on a stale `10.0.10.1`
# (pfSense gateway, nothing on :8006) and spuriously alert
# `ExternalAccessDivergence`. Direct-IP HTTPS eliminates that
# variable. Self-signed cert, so ignore_tls=true. The 301-to-HTTPS
# redirect from pveproxy lands in the 300-399 band, so we accept
# 200-499 to cover redirect + auth-prompt responses.
name = "Proxmox UI"
type = "http"
database_connection_string = null
database_password_vault_key = null
hostname = null
port = null
url = "https://192.168.1.127:8006/"
accepted_statuscodes = ["200-299", "300-399", "400-499"]
ignore_tls = true
interval = 300
retry_interval = 60
max_retries = 2
},
]
}
@ -657,6 +693,9 @@ resource "kubernetes_config_map_v1" "internal_monitor_targets" {
database_connection_string = m.database_connection_string
hostname = m.hostname
port = m.port
url = m.url
accepted_statuscodes = m.accepted_statuscodes
ignore_tls = m.ignore_tls
password_env = m.database_password_vault_key != null ? "DB_PASSWORD_${upper(replace(m.name, "/[^A-Za-z0-9]/", "_"))}" : null
interval = m.interval
retry_interval = m.retry_interval
@ -710,7 +749,8 @@ for t in targets:
# MYSQL uses `databaseConnectionString` + `radiusPassword` (UK v2 re-uses
# radiusPassword for mysql auth, for backwards compat). Redis has auth
# disabled on the cluster, so password_env is null. PORT monitors use
# hostname + port directly.
# hostname + port directly. HTTP monitors use url + accepted_statuscodes
# + ignoreTls (camelCase on the API; stored as `ignore_tls` in DB).
desired = {
"type": mtype,
"name": name,
@ -721,6 +761,10 @@ for t in targets:
if mtype == MonitorType.PORT:
desired["hostname"] = t["hostname"]
desired["port"] = t["port"]
elif mtype == MonitorType.HTTP:
desired["url"] = t["url"]
desired["accepted_statuscodes"] = t["accepted_statuscodes"]
desired["ignoreTls"] = bool(t["ignore_tls"])
else:
desired["databaseConnectionString"] = t["database_connection_string"]
if t.get("password_env"):
@ -733,6 +777,8 @@ for t in targets:
drift_fields = ["interval", "retryInterval", "maxretries"]
if mtype == MonitorType.PORT:
drift_fields += ["hostname", "port"]
elif mtype == MonitorType.HTTP:
drift_fields += ["url", "accepted_statuscodes", "ignoreTls"]
else:
drift_fields += ["databaseConnectionString"]
if "radiusPassword" in desired:
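
The `accepted_statuscodes` value on the Proxmox UI monitor (`["200-299", "300-399", "400-499"]`) is a list of inclusive ranges. The sketch below illustrates that range semantics only; it is not Uptime Kuma's own matching code, which this diff does not show, and the helper name `status_accepted` is hypothetical.

```python
from typing import Iterable


def status_accepted(status: int, accepted: Iterable[str]) -> bool:
    """Return True if `status` falls inside any inclusive "low-high" band.

    A bare code such as "200" is treated as the single-value band 200-200
    (an assumption made for this sketch).
    """
    for band in accepted:
        low, sep, high = band.partition("-")
        if not sep:
            high = low
        if int(low) <= status <= int(high):
            return True
    return False


# Bands taken from the Proxmox UI monitor definition in this diff.
PROXMOX_ACCEPTED = ["200-299", "300-399", "400-499"]
```

With these bands, the 301 redirect from pveproxy and a 401 auth prompt both count as "up", while any 5xx response still marks the monitor as down.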

Some files were not shown because too many files have changed in this diff.