infra/scripts/pve-promtail.yaml
Viktor Barzin aac807fb3a
All checks were successful
ci/woodpecker/push/default Pipeline was successful
ci/woodpecker/push/build-cli Pipeline was successful
pve-host: ship journal to Loki (snoopy command audit + sshd-pve) for emo's root SSH
Emo's Claude agent was given root SSH to the Proxmox host (`ssh pve`, dedicated
shared-root key emo-pve-agent@devvm) so he can manage the host — e.g. the R730
fan daemon — through his agent. To keep an audit trail of what that agent does,
and to feed the long-pending Wave-1 S1 security rule, the PVE host now ships its
systemd journal to cluster Loki:

- snoopy logs every execve() to journald (identifier=snoopy), enabled via
  /etc/ld.so.preload; config scripts/pve-snoopy.ini.
- promtail v3.5.1 (amd64) ships /var/log/journal to Loki as {job="pve-journal"}
  (full host journal; filter identifier="snoopy" for the command audit), and
  relabels sshd auth to {job="sshd-pve"} — which ACTIVATES S1 (it was PENDING
  only for lack of this shipper). Config/unit: scripts/pve-promtail.{yaml,service}.

S1 won't false-fire on legitimate access: the devvm SNATs through pfSense to
192.168.1.2, which is already in the S1 source-IP allowlist.

Loki is reached via an /etc/hosts pin (10.0.20.203 loki.viktorbarzin.lan);
follow-up noted to register a Technitium CNAME so it auto-tracks LB renumbers.

Host pieces are hand-managed (not Terraform), like fan-control and the rpi-sofia
promtail — these files are the source of truth. Docs updated: security.md
(S1 LIVE) and monitoring.md ("External host: pve").

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 19:31:45 +00:00

53 lines
2.4 KiB
YAML

# Promtail config for the PVE host (192.168.1.127) — ships the systemd journal to cluster Loki.
#
# NOT Terraform-managed (the PVE host is the hypervisor, outside k8s). Deployed by hand,
# same pattern as scripts/fan-control.* and the rpi-sofia promtail. This file is source-of-truth.
#
# Deploy:
# scp scripts/pve-promtail.yaml root@192.168.1.127:/etc/promtail/config.yml
# scp scripts/pve-promtail.service root@192.168.1.127:/etc/systemd/system/promtail.service
# ssh root@192.168.1.127 'mkdir -p /var/lib/promtail && systemctl daemon-reload && systemctl enable --now promtail'
# # Binary: grafana/loki v3.5.1 promtail-linux-amd64 -> /usr/local/bin/promtail (chmod 0755).
# # Loki reach: /etc/hosts pin "10.0.20.203 loki.viktorbarzin.lan" (Traefik LB, ETP-Local).
# # FOLLOW-UP: replace the pin with a Technitium CNAME loki.viktorbarzin.lan -> ingress.viktorbarzin.lan
# # so it auto-tracks Traefik LB renumbers (also fixes the rpi-sofia pin — see docs/architecture/monitoring.md).
#
# Streams produced:
# {job="pve-journal"} — full host journal (filter identifier="snoopy" for the command audit)
# {job="sshd-pve"} — sshd auth lines; feeds the Loki S1 security rule (docs/architecture/security.md)
# {job="pve-journal", identifier="snoopy"} — snoopy command audit (every execve on the host; see scripts/pve-snoopy.ini)
server:
http_listen_port: 9080
grpc_listen_port: 0
log_level: warn
positions:
filename: /var/lib/promtail/positions.yaml
clients:
- url: https://loki.viktorbarzin.lan/loki/api/v1/push
tls_config:
insecure_skip_verify: true
scrape_configs:
- job_name: journal
journal:
max_age: 12h
json: false
path: /var/log/journal
labels:
host: pve
job: pve-journal
relabel_configs:
- source_labels: ['__journal__systemd_unit']
target_label: unit
- source_labels: ['__journal_priority_keyword']
target_label: level
- source_labels: ['__journal_syslog_identifier']
target_label: identifier
# sshd auth lines (identifier sshd / sshd-session) -> job=sshd-pve so the Loki S1
# security rule ({job="sshd-pve"}) matches. snoopy command lines stay job=pve-journal.
- source_labels: ['__journal_syslog_identifier']
regex: 'sshd.*'
target_label: job
replacement: 'sshd-pve'