Commit graph

91 commits

Author SHA1 Message Date
Viktor Barzin
34df786fe4 add haos monitoring job in prometheus 2025-11-29 11:46:42 +00:00
Viktor Barzin
1b0d5e60d8 add ${__field.name:wrap} in the idrac dashboard to fix wrapping issue[ci skip] 2025-11-15 05:15:50 +00:00
Viktor Barzin
0b7b092c26 add api key to tiny tuya target in prometheus scrape [ci skip] 2025-11-09 22:03:25 +00:00
Viktor Barzin
5f2cc75a8e add prometheus targets for fuses [ci skip] 2025-10-29 21:59:06 +00:00
Viktor Barzin
76103f52e3 add alert if we use inverter power for 1d straight - probably an issue with switching [ci skip] 2025-10-29 20:09:21 +00:00
Viktor Barzin
18a2695e64 add breakdown in main power source from inverter in grafana [ci skip] 2025-10-28 22:41:44 +00:00
Viktor Barzin
77f92ae4ef update ups grafana dash to have inverter stats [ci skip] 2025-10-28 22:17:32 +00:00
Viktor Barzin
3161da29a4 add scrape config for tuya bridge and prohibit access to the metrics path via ingress [ci skip] 2025-10-28 21:38:40 +00:00
Viktor Barzin
d5dd81ba30 increase threshold for high power usage to 180 as we have bigger cpu now [ci skip] 2025-10-08 20:33:51 +00:00
Viktor Barzin
c5bb343ebe disable errors for matrix ingress [ci skip] 2025-08-23 20:38:53 +00:00
Viktor Barzin
87c33629b4 backup all grafana dashboards [ci skip] 2025-08-23 20:30:37 +00:00
Viktor Barzin
1530323477 update registry prometheus url to devvm as pi was too slow [ci skip] 2025-08-23 20:15:05 +00:00
Viktor Barzin
a87b9793ad disable loki and alloy as it is not used [ci skip] 2025-08-23 20:02:37 +00:00
Viktor Barzin
c49e4d0a86 add loki + alloy deployments for logs collection [ci skip] 2025-05-04 11:25:39 +00:00
Viktor Barzin
c5ae6873c9 add registry monitoring to prometheus [ci skip] 2025-03-30 11:15:54 +00:00
Viktor Barzin
6100372a9e adjust battery low alert to fire only when there is no power [ci skip] 2025-03-22 15:47:30 +00:00
Viktor Barzin
8eb18ec651 add power and ups battery over time widgets to grafana [ci skip] 2025-03-22 15:46:17 +00:00
Viktor Barzin
cd89d13ab2 disable alert for pods less than in spec [ci skip] 2025-03-16 18:27:13 +00:00
Viktor Barzin
59a651c2ad add 2 more oids for ups to monitor active and reactive power consumption [ci skip] 2025-03-15 17:54:04 +00:00
Viktor Barzin
dfe47657ee disable perms errors and server errors for grafana and nextcloud ingresses as they were too noisy [ci skip] 2025-03-15 17:53:24 +00:00
Viktor Barzin
1e674bff7e add alert for ups low battery remaining [ci skip] 2025-03-02 20:48:07 +00:00
Viktor Barzin
ab9b5b356a increase low voltage alert to 10 min [ci skip] 2025-03-01 14:28:56 +00:00
Viktor Barzin
6a6a6974e2 increase interval for 500 alerts to 20m [ci skip] 2025-01-10 20:47:25 +00:00
Viktor Barzin
f291d8545b move prometheus alerts to different channel and move high cpu period [ci skip] 2025-01-04 14:27:48 +00:00
Viktor Barzin
4643e22cc8 increase idle power threshold to 130w [ci skip] 2025-01-03 17:49:24 +00:00
Viktor Barzin
46736680a6 add alert status to message [ci skip] 2025-01-02 21:13:09 +00:00
Viktor Barzin
53d8b2d2c6 update prometheus alerts to be correctly grouped and sent to slack and deprecate some old ones [ci skip] 2025-01-02 20:33:55 +00:00
Viktor Barzin
48a0deb283 update prometheus chart values to get slack notifications to work and add alerts for 4xx and 5xx on ingress [ci skip] 2025-01-01 11:39:16 +00:00
Viktor Barzin
7336e7c033 fix monitoring stack [ci skip] 2024-12-31 17:15:06 +00:00
Viktor Barzin
55ac2f3707 add all grafana dashboards models [ci skip] 2024-12-24 13:48:21 +00:00
Viktor Barzin
5ee5e59e61 add low voltage alert to prometheus and update some dashboards [ci skip] 2024-12-23 18:21:01 +00:00
Viktor Barzin
9826f8546c fix typo in idrac voltage to be in volts not watts [ci skip] 2024-12-17 19:35:27 +00:00
Viktor Barzin
f587b2737c update ups grafana [ci skip] 2024-12-17 19:22:08 +00:00
Viktor Barzin
01f5f304d4 update idrac refresh rate to 1m [ci skip] 2024-12-17 19:05:57 +00:00
Viktor Barzin
67d72fb7c0 add idrac grafana dashboard to repo [ci skip] 2024-12-16 22:36:00 +00:00
Viktor Barzin
3899dca6e6 add grafana dashboard for ups [ci skip] 2024-12-15 20:58:01 +00:00
Viktor Barzin
c987301c48 add ups snmp exporter to prometheus [ci skip] 2024-12-15 18:13:33 +00:00
Viktor Barzin
2a9560d4de move grafana and k8s dashboard to use authentik instead of oauth proxy [ci skip] 2024-11-22 00:47:00 +00:00
Viktor Barzin
72d780c26f replace oauth proxy with authentik auth [ci skip] 2024-11-18 22:06:31 +00:00
Viktor Barzin
cf39034bdf add homepage module and some more integrations [ci skip] 2024-10-20 13:05:03 +00:00
Viktor Barzin
ead57fe29b add meshcentral and diun[ci skip] 2024-08-18 18:14:22 +00:00
Viktor Barzin
c05d088598 reduce prometheus storage retention from 12w -> 8w to save ~30gb [ci skip] 2024-08-07 20:18:13 +00:00
Viktor Barzin
84b707fcf8 update old prometheus alert detectors and upgrade immich to 101 [ci skip] 2024-04-12 21:15:31 +00:00
Viktor Barzin
ffd5d970e0 remove hack for london openwrt monitoring after having tailscale now [ci skip] 2024-03-30 18:28:11 +00:00
Viktor Barzin
09d07639c9 update openwrt london prometheus target address [ci skip] 2024-03-29 22:20:29 +00:00
Viktor Barzin
1f8ef28435 add monitoring jobs to p8s for istiod and the service mesh [ci skip] 2024-01-07 17:47:36 +00:00
Viktor Barzin
6e38cb420d upgrade prometheus helm chart [ci skip] 2023-12-25 21:40:19 +00:00
Viktor Barzin
2f61744528 add baseurl to prometheus helm chart so alertmanager sends correct links with prometheus public url instead of podname [ci skip] 2023-12-25 13:48:19 +00:00
Viktor Barzin
3e022b918d add prometheus monitoring to crowdsec [ci skip] 2023-11-25 13:34:16 +00:00
Viktor Barzin
33c1f786e5 move grafana to nfs [ci skip] 2023-11-11 00:16:58 +00:00