kms: dedicate MetalLB IP 10.0.20.202 + filter probe noise

Two coupled fixes for the hourly Slack noise + missing client IPs:

1. Move windows-kms off shared 10.0.20.200 to a dedicated MetalLB IP
   10.0.20.202 with externalTrafficPolicy=Local, so vlmcsd sees real WAN
   client IPs (pfSense WAN forwards do DNAT-only; ETP=Local skips
   kube-proxy SNAT). Same pattern mailserver used pre-2026-04-19. Sharing
   10.0.20.200 is blocked because all 10 services there are ETP=Cluster
   and MetalLB requires consistent ETP per shared IP.

2. Slack notifier now suppresses Slack posts for bare TCP open/close
   pairs (no Application/Activation block). These come from Uptime Kuma's
   port monitor and the new kubelet readiness/liveness probes. Probe
   counts go to a new metric kms_connection_probes_total{source}, where
   source classifies the IP as internal_pod / cluster_node / external.
   Real activations are unaffected.

Pod fluidity: added TCP readiness/liveness probes on 1688 to gate Pod
Ready on the listener actually being up. This is required for ETP=Local
so MetalLB only advertises 10.0.20.202 from a node where vlmcsd is
serving.

pfSense side (applied separately, not codified):
- New alias k8s_kms_lb = 10.0.20.202 (KMS-only)
- WAN:1688 NAT + filter rule retargeted from k8s_shared_lb to k8s_kms_lb
- All other forwards on k8s_shared_lb (WireGuard, HTTPS, shadowsocks,
  smtps, etc.) untouched

Runbook updated. Tests added for classify_source / is_probe / process_line.
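
For orientation, the source classification and probe check described above might look roughly like the sketch below. It is not the code from this commit: the function names match the ones the tests mention, but the signatures are hypothetical, the CIDRs are the ones quoted in the runbook (`10.10.0.0/16` for Calico pods, `10.0.20.0/24` for nodes), and the `"Incoming KMS request"` marker is borrowed from the runbook's grep example as an assumed signal that a real KMS RPC occurred.

```python
import ipaddress

# Assumed ranges, taken from the runbook text rather than from the notifier code.
POD_CIDR = ipaddress.ip_network("10.10.0.0/16")   # Calico pod IPs (e.g. Uptime Kuma)
NODE_CIDR = ipaddress.ip_network("10.0.20.0/24")  # node IPs (kubelet probes)

# Assumed marker for a real KMS RPC, borrowed from the runbook's grep example.
KMS_REQUEST_MARKER = "Incoming KMS request"


def classify_source(ip: str) -> str:
    """Map a client IP to a label for kms_connection_probes_total{source}."""
    addr = ipaddress.ip_address(ip)
    if addr in POD_CIDR:
        return "internal_pod"
    if addr in NODE_CIDR:
        return "cluster_node"
    return "external"


def is_probe(connection_lines: list[str]) -> bool:
    """Treat a connection as a probe when no logged line shows a KMS request.

    `connection_lines` is a hypothetical input: the vlmcsd log lines seen
    between a connection's open and close events.
    """
    return not any(KMS_REQUEST_MARKER in line for line in connection_lines)
```

With these assumptions, a kubelet probe from `10.0.20.15` would be counted under `source="cluster_node"`, while a bare connect/disconnect from a public address would land under `source="external"`.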
2026-05-10 13:02:58 +00:00
commit 9d4e2fc4a0
parent 295cfd776c
5 changed files with 317 additions and 56 deletions
--- a/docs/runbooks/kms-public-exposure.md
+++ b/docs/runbooks/kms-public-exposure.md
@@ -9,15 +9,22 @@ how to tune the rate limit, how to revoke if abused.

 ## Architecture

- **K8s service**: `windows-kms` in namespace `kms`, MetalLB shared LB IP
-  `10.0.20.200:1688`. ETP=Cluster, so client IPs in vlmcsd logs are SNAT'd
-  k8s node IPs (not real-world client IPs). Trade-off accepted —
-  preserving real client IPs would require a dedicated MetalLB IP with
-  ETP=Local or a PROXY-protocol bounce; vlmcsd doesn't speak PROXY-v2.
- **pfSense WAN forward**: `WAN TCP/1688 → k8s_shared_lb:1688`
-  (alias = `10.0.20.200`). Description: `KMS public — kms.viktorbarzin.me`.
- **Filter rule** on the WAN interface, TCP/1688, with state-table
-  per-source caps:
+- **K8s service**: `windows-kms` in namespace `kms`, MetalLB **dedicated**
+  LB IP `10.0.20.202:1688`. ETP=Local, so vlmcsd sees real WAN client IPs
+  in its log (pfSense WAN forwards do DNAT-only, no SNAT; ETP=Local skips
+  the kube-proxy SNAT too). Same pattern mailserver used pre-2026-04-19.
+  Sharing `10.0.20.200` isn't an option — all 10 services there are
+  ETP=Cluster and MetalLB requires a single ETP per shared IP.
+- **Pod fluidity**: deployment has `replicas=1` (notifier dedup state is
+  per-pod) with no node affinity. TCP readiness/liveness probes on 1688
+  gate Pod Ready on the listener actually being up, so MetalLB only
+  advertises `10.0.20.202` from a node where vlmcsd is serving.
+- **pfSense WAN forward**: `WAN TCP/1688 → k8s_kms_lb:1688`
+  (alias = `10.0.20.202`, dedicated to KMS). Description: `KMS public —
+  kms.viktorbarzin.me`. Other forwards using `k8s_shared_lb` (WireGuard,
+  HTTPS, shadowsocks, smtps, etc.) are unaffected.
+- **Filter rule** on the WAN interface, TCP/1688 destination
+  `<k8s_kms_lb>`, with state-table per-source caps:
  - `max-src-conn 50` — concurrent connections per source IP
  - `max-src-conn-rate 10/60` — 10 new connections per 60 seconds per
    source
@@ -26,6 +33,13 @@ how to tune the rate limit, how to revoke if abused.
    flushed. (`virusprot` is the only table pfSense's filter generator
    targets for `overload`; see `/etc/inc/filter.inc`. Don't try to point
    it at a custom table — the schema doesn't expose that knob.)
+- **Probe filter in slack-notifier**: a bare TCP open/close (no
+  Application/Activation block from vlmcsd) is treated as a probe — Uptime
+  Kuma's port-type monitor on `windows-kms.kms.svc:1688` and the kubelet
+  readiness/liveness probes both hit this path. Probes increment
+  `kms_connection_probes_total{source}` (`source` ∈ `internal_pod`,
+  `cluster_node`, `external`) and log to stdout, but never post to Slack.
+  Real activations still post.

 ## Where the logs are

@@ -39,8 +53,11 @@ kubectl logs -n kms -l app=kms-service -c windows-kms --tail=50 -f
 kubectl logs -n kms -l app=kms-service -c windows-kms | grep "Incoming KMS request"
 ```

-Source IPs in this log are the SNAT'd node IPs because the LB Service uses
-ETP=Cluster on a shared MetalLB IP. Don't expect real WAN client IPs here.
+Source IPs from the WAN are real client IPs (pfSense DNAT-only + ETP=Local
+preserve them through the chain). LAN clients hitting the LB IP directly
+appear as their own IP. Pod-source probes (Uptime Kuma) appear as a Calico
+pod IP in `10.10.0.0/16`. Kubelet readiness/liveness probes appear as the
+hosting node IP in `10.0.20.0/24`.

 ### Slack notifier (kms namespace, k8s)

@@ -53,6 +70,17 @@ also increment the Prometheus counter `kms_activations_total{product,status}`
 exposed on the same pod at `:9101/metrics` (scraped by the cluster-wide
 `kubernetes-pods` job; query via Prometheus or Grafana directly).

+Probe-only TCP connections (open+close, no KMS RPC) are silently filtered
+out of Slack and counted in `kms_connection_probes_total{source}`. Useful
+queries:
+```promql
+# Probe rate by source
+rate(kms_connection_probes_total[5m])
+# Probes from the public WAN (a non-zero rate here means real port-scans
+# are reaching us, not just internal monitoring)
+rate(kms_connection_probes_total{source="external"}[5m])
+```
+
 ### pfSense — virusprot table and filter hits

 ```bash
@@ -93,18 +121,19 @@ The `overload` table entry survives pf reloads. Running
 If the activation surface needs to come down (abuse, legal, audit):

 1. **pfSense web UI** → `Firewall → NAT → Port Forward` → find
-   `WAN TCP/1688 → k8s_shared_lb` → **delete** (or disable). Apply.
+   `WAN TCP/1688 → k8s_kms_lb` → **delete** (or disable). Apply.
 2. **pfSense web UI** → `Firewall → Rules → WAN` → find
   `KMS public — kms.viktorbarzin.me` → **delete** (or disable). Apply.
 3. Verify externally: from a phone tether, `nc -zw3 kms.viktorbarzin.me 1688`
   should now fail.

 The k8s service stays reachable on the LAN
-(`10.0.20.200:1688` and the internal `kms.viktorbarzin.lan` ingress for
-the webpage) — only the WAN port-forward is removed.
+(`10.0.20.202:1688` directly, and the website at `kms.viktorbarzin.lan`
+via Traefik on `10.0.20.200:443`) — only the WAN port-forward is removed.

-To put it back, recreate the NAT rule (target alias `k8s_shared_lb`,
-port `1688`) and the filter rule with the same per-source caps.
+To put it back, recreate the NAT rule (target alias `k8s_kms_lb`,
+port `1688`) and the filter rule with the same per-source caps. The alias
+itself is independent of any forward and persists across delete/restore.

 ## Related