Viktor asked to add connection logs (Traefik/Cloudflare) to catch the
real-path t3 WS drops: a direct-to-t3-serve browser ran 40 min clean
while real tunnel sessions cycle every 15-35s, so the drop originates
above t3-serve and we need to see which layer cuts the socket.
Traefik (/ws duration) and cloudflared (WS close events) already ship to
Loki; the gap was the devvm side. This adds:
- t3-dispatch logs every /ws open/close with dur_ms + cause:
downstream_closed (client/CF/Traefik hung up = last-mile/network),
upstream_closed (t3-serve closed/reset), or graceful. Graceful closes
previously left no trace (default ReverseProxy only logs on error), so a
watchdog-driven reconnect was invisible. Helpers unit-tested.
- devvm-promtail.{yaml,service}: ships devvm journald (t3-dispatch +
t3-serve@<user>) to cluster Loki as job=devvm-journal, mirroring the
pve/rpi-sofia shippers. devvm was never in Loki (standalone VM).
Joined in Loki the three layers attribute any future drop to a segment
with no repro needed. Runbook + service-catalog updated.
17 lines
467 B
Desktop File
17 lines
467 B
Desktop File
# systemd unit for promtail on the devvm (10.0.10.10). Install to
|
|
# /etc/systemd/system/promtail.service. See scripts/devvm-promtail.yaml for the full deploy.
|
|
[Unit]
|
|
Description=Promtail (ships devvm journal -> cluster Loki)
|
|
After=network-online.target
|
|
Wants=network-online.target
|
|
|
|
[Service]
|
|
Type=simple
|
|
ExecStart=/usr/local/bin/promtail -config.file=/etc/promtail/config.yml
|
|
Restart=on-failure
|
|
RestartSec=5
|
|
User=root
|
|
Group=root
|
|
|
|
[Install]
|
|
WantedBy=multi-user.target
|