Network Traffic Visualization Design
Date: 2026-02-28
Goal: Real-time visualization of all network traffic, both pod-to-pod (K8s) and full network (up to 192.168.1.1), using Grafana as the single pane of glass.
Architecture
192.168.1.1 (ISP router)
  └── 10.0.20.1 (pfSense + softflowd) ──NetFlow UDP──► GoFlow2 (K8s)
      ├── Proxmox (192.168.1.127)                         │
      │     └── K8s nodes (10.0.20.100-104)               ▼
      │           └── Pods ◄──eBPF──► Caretta         Prometheus
      ├── TrueNAS (10.0.10.15)                            │
      └── Other devices                                   ▼
                                                       Grafana
                                                 (Node Graph panels)
Two complementary data paths:
- Caretta (eBPF DaemonSet) → tracks pod-to-pod TCP connections → Prometheus metrics → Grafana Node Graph
- GoFlow2 (NetFlow collector) ← pfSense softflowd → Prometheus metrics → Grafana dashboards
Component 1: Caretta
- Stack: stacks/caretta/
- Namespace: caretta
- Deployment: Helm release from https://helm.groundcover.com/, chart caretta
- Config:
  - Disable bundled Grafana (grafana.enabled: false)
  - Disable bundled VictoriaMetrics (victoria-metrics-single.enabled: false)
  - DaemonSet runs eBPF agent on each node
  - Exposes Prometheus metrics on port 7117
- Key metric: caretta_links_observed{client_name, client_namespace, server_name, server_namespace, server_port}
- Grafana: ConfigMap dashboard with Node Graph panel, label grafana_dashboard: "1"
- Resources: ~100Mi RAM, ~50m CPU per node
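The Config settings above map onto a Helm values file. A minimal sketch, assuming the chart exposes the two toggles exactly as named above; the file name values.yaml is illustrative, and every other chart setting is left at its default:

```yaml
# values.yaml for the caretta chart (file name is an assumption)

# Disable the bundled Grafana; dashboards are loaded into the existing Grafana instead.
grafana:
  enabled: false

# Disable the bundled VictoriaMetrics; Prometheus scrapes Caretta directly.
victoria-metrics-single:
  enabled: false
```

Keeping both subcharts off means the release should contain only the eBPF DaemonSet and its metrics endpoint, which is what keeps the per-node footprint small.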
Component 2: GoFlow2
- Stack: stacks/goflow2/
- Namespace: goflow2
- Deployment: Raw Terraform (Deployment + Service); single binary, no Helm chart needed
- Image: netsampler/goflow2
- Ports:
  - UDP 2055: NetFlow v9 receiver (from pfSense)
  - TCP 8080: Prometheus metrics endpoint
- Service: NodePort for UDP 2055 so pfSense (10.0.20.1) can reach it on any node IP
- Key metrics: flow_bytes, flow_packets with labels for src/dst IP, port, protocol
- Grafana: ConfigMap dashboard showing network flows (top talkers, protocol breakdown, inter-VLAN traffic)
- Resources: ~100Mi RAM, ~50m CPU (single pod, not DaemonSet)
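The Service described above could look roughly like the following manifest. This is a sketch of the object the Terraform resource would need to produce, not the stack's actual code: the selector label app: goflow2 and the fixed nodePort 32055 are assumptions made for illustration.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: goflow2
  namespace: goflow2
spec:
  type: NodePort
  selector:
    app: goflow2            # assumed pod label
  ports:
    - name: netflow
      protocol: UDP
      port: 2055
      targetPort: 2055
      nodePort: 32055       # assumed value; must fall in the cluster's NodePort range (default 30000-32767)
    - name: metrics
      protocol: TCP
      port: 8080
      targetPort: 8080      # reached in-cluster as goflow2.goflow2.svc.cluster.local:8080
```

Pinning nodePort keeps the export target on pfSense stable across re-applies; if it is omitted, Kubernetes allocates a port and the softflowd destination has to be read from the Service afterwards. Because the whole Service is of type NodePort, the metrics port also receives an automatically allocated node port.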
Component 3: pfSense softflowd
- Host: 10.0.20.1 (SSH as admin)
- Package: softflowd (install via pfSense package manager)
- Config:
  - Monitor LAN interface(s)
  - Export NetFlow v9 to <k8s-node-ip>:<goflow2-nodeport> (UDP)
  - Tracking level: full (track individual connections)
- Note: This is a manual SSH step; pfSense is not Terraform-managed
Component 4: Prometheus Integration
Two new scrape targets in stacks/platform/modules/monitoring/prometheus_chart_values.tpl (extraScrapeConfigs):
- job_name: 'caretta'
  static_configs:
    - targets: ["caretta.caretta.svc.cluster.local:7117"]
- job_name: 'goflow2'
  static_configs:
    - targets: ["goflow2.goflow2.svc.cluster.local:8080"]
Requires re-applying the platform stack.
Deployment Order
- Apply stacks/caretta/: deploys eBPF DaemonSet
- Apply stacks/goflow2/: deploys NetFlow collector
- Re-apply stacks/platform/: adds Prometheus scrape targets
- SSH to pfSense: install softflowd, configure NetFlow export to GoFlow2 NodePort
- Verify in Grafana: confirm both dashboards show data
Grafana Dashboards
Two dashboards, both auto-loaded via sidecar (ConfigMap label grafana_dashboard: "1"):
- K8s Pod Topology (Caretta): Node Graph panel showing pods as nodes, TCP connections as edges, byte counts as edge weights
- Network Flows (GoFlow2): Top talkers, protocol breakdown, inter-VLAN traffic, external destinations
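Both dashboards depend on the sidecar label. A minimal sketch of one such ConfigMap, assuming Grafana's sidecar watches a namespace called monitoring; the ConfigMap name, namespace, data key, and the skeletal dashboard JSON are all placeholders rather than the real dashboard:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: caretta-dashboard          # hypothetical name
  namespace: monitoring            # assumed namespace watched by the Grafana sidecar
  labels:
    grafana_dashboard: "1"         # label the sidecar matches on
data:
  caretta-topology.json: |
    {
      "title": "K8s Pod Topology (Caretta)",
      "panels": [
        { "type": "nodeGraph", "title": "Pod connections" }
      ]
    }
```

A working Node Graph panel additionally needs a data source and queries that return node and edge frames; the JSON here only shows where the dashboard definition sits and how the label is attached.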