Viktor Barzin
|
34df786fe4
|
add haos monitoring job in prometheus
|
2025-11-29 11:46:42 +00:00 |
|
Viktor Barzin
|
0b7b092c26
|
add api key to tiny tuya target in prometheus scrape [ci skip]
|
2025-11-09 22:03:25 +00:00 |
|
Viktor Barzin
|
5f2cc75a8e
|
add prometheus targets for fuses [ci skip]
|
2025-10-29 21:59:06 +00:00 |
|
Viktor Barzin
|
76103f52e3
|
add alert if we use inverter power for 1d straight - probably an issue with switching [ci skip]
|
2025-10-29 20:09:21 +00:00 |
|
Viktor Barzin
|
3161da29a4
|
add scrape config for tuya bridge and prohibit access to the metrics path via ingress [ci skip]
|
2025-10-28 21:38:40 +00:00 |
|
Viktor Barzin
|
d5dd81ba30
|
increase threshold for high power usage to 180 as we have a bigger cpu now [ci skip]
|
2025-10-08 20:33:51 +00:00 |
|
Viktor Barzin
|
c5bb343ebe
|
disable errors for matrix ingress [ci skip]
|
2025-08-23 20:38:53 +00:00 |
|
Viktor Barzin
|
1530323477
|
update registry prometheus url to devvm as pi was too slow [ci skip]
|
2025-08-23 20:15:05 +00:00 |
|
Viktor Barzin
|
c5ae6873c9
|
add registry monitoring to prometheus [ci skip]
|
2025-03-30 11:15:54 +00:00 |
|
Viktor Barzin
|
6100372a9e
|
adjust battery low alert to fire only when there is no power [ci skip]
|
2025-03-22 15:47:30 +00:00 |
|
Viktor Barzin
|
cd89d13ab2
|
disable alert for pods less than in spec [ci skip]
|
2025-03-16 18:27:13 +00:00 |
|
Viktor Barzin
|
dfe47657ee
|
disable perms errors and server errors for grafana and nextcloud ingresses as they were too noisy [ci skip]
|
2025-03-15 17:53:24 +00:00 |
|
Viktor Barzin
|
1e674bff7e
|
add alert for ups low battery remaining [ci skip]
|
2025-03-02 20:48:07 +00:00 |
|
Viktor Barzin
|
ab9b5b356a
|
increase low voltage alert to 10 min [ci skip]
|
2025-03-01 14:28:56 +00:00 |
|
Viktor Barzin
|
6a6a6974e2
|
increase interval for 500 alerts to 20m [ci skip]
|
2025-01-10 20:47:25 +00:00 |
|
Viktor Barzin
|
f291d8545b
|
move prometheus alerts to different channel and move high cpu period [ci skip]
|
2025-01-04 14:27:48 +00:00 |
|
Viktor Barzin
|
4643e22cc8
|
increase idle power threshold to 130w [ci skip]
|
2025-01-03 17:49:24 +00:00 |
|
Viktor Barzin
|
46736680a6
|
add alert status to message [ci skip]
|
2025-01-02 21:13:09 +00:00 |
|
Viktor Barzin
|
53d8b2d2c6
|
update prometheus alerts to be correctly grouped and sent to slack and deprecate some old ones [ci skip]
|
2025-01-02 20:33:55 +00:00 |
|
Viktor Barzin
|
48a0deb283
|
update prometheus chart values to get slack notifications to work and add alerts for 4xx and 5xx on ingress [ci skip]
|
2025-01-01 11:39:16 +00:00 |
|
Viktor Barzin
|
7336e7c033
|
fix monitoring stack [ci skip]
|
2024-12-31 17:15:06 +00:00 |
|
Viktor Barzin
|
5ee5e59e61
|
add low voltage alert to prometheus and update some dashboards [ci skip]
|
2024-12-23 18:21:01 +00:00 |
|
Viktor Barzin
|
c987301c48
|
add ups snmp exporter to prometheus [ci skip]
|
2024-12-15 18:13:33 +00:00 |
|
Viktor Barzin
|
72d780c26f
|
replace oauth proxy with authentik auth [ci skip]
|
2024-11-18 22:06:31 +00:00 |
|
Viktor Barzin
|
cf39034bdf
|
add homepage module and some more integrations [ci skip]
|
2024-10-20 13:05:03 +00:00 |
|
Viktor Barzin
|
ead57fe29b
|
add meshcentral and diun [ci skip]
|
2024-08-18 18:14:22 +00:00 |
|
Viktor Barzin
|
c05d088598
|
reduce prometheus storage retention from 12w -> 8w to save ~30gb [ci skip]
|
2024-08-07 20:18:13 +00:00 |
|
Viktor Barzin
|
84b707fcf8
|
update old prometheus alert detectors and upgrade immich to 101 [ci skip]
|
2024-04-12 21:15:31 +00:00 |
|
Viktor Barzin
|
ffd5d970e0
|
remove hack for london openwrt monitoring after having tailscale now [ci skip]
|
2024-03-30 18:28:11 +00:00 |
|
Viktor Barzin
|
09d07639c9
|
update openwrt london prometheus target address [ci skip]
|
2024-03-29 22:20:29 +00:00 |
|
Viktor Barzin
|
1f8ef28435
|
add monitoring jobs to p8s for istiod and the service mesh [ci skip]
|
2024-01-07 17:47:36 +00:00 |
|
Viktor Barzin
|
6e38cb420d
|
upgrade prometheus helm chart [ci skip]
|
2023-12-25 21:40:19 +00:00 |
|
Viktor Barzin
|
2f61744528
|
add baseurl to prometheus helm chart so alertmanager sends correct links with prometheus public url instead of podname [ci skip]
|
2023-12-25 13:48:19 +00:00 |
|
Viktor Barzin
|
3e022b918d
|
add prometheus monitoring to crowdsec [ci skip]
|
2023-11-25 13:34:16 +00:00 |
|
Viktor Barzin
|
6b30a0e533
|
add alert if node memory exceeds 90% [ci skip]
|
2023-11-10 22:48:45 +00:00 |
|
Viktor Barzin
|
2fe0644c03
|
use .lan domain for idrac metrics scrape [ci skip]
|
2023-11-01 20:44:17 +00:00 |
|
Viktor Barzin
|
af53a5d42d
|
update redfish exporter to new implementation [ci skip]
|
2023-10-24 11:44:19 +00:00 |
|
Viktor Barzin
|
4efa47172c
|
replace tls client cert auth with oauth and add localai stub [ci skip]
|
2023-10-22 14:07:18 +00:00 |
|
Viktor Barzin
|
68a3580d20
|
add alert on new client registration and update dns to use pfsense [ci skip]
|
2023-09-18 08:03:50 +00:00 |
|
Viktor Barzin
|
3047ff1d9b
|
disable email notifications as they are spammy and using sendgrid quota [ci skip]
|
2023-06-20 14:04:02 +00:00 |
|
viktorbarzin
|
ace5fa27be
|
remove 1gb limit for tsdb to confirm it was the root cause for memory issues [ci skip]
|
2023-04-21 23:04:39 +01:00 |
|
viktorbarzin
|
e0ff80e217
|
attempt to reduce prometheus memory by setting --storage.tsdb.retention.size; also add metrics-api which is not working atm [ci skip]
|
2023-04-17 01:28:03 +01:00 |
|
viktorbarzin
|
38297c2809
|
add alert for unhandled exceptions [ci skip]
|
2023-04-05 00:15:10 +01:00 |
|
viktorbarzin
|
609435c8bd
|
reword finance app webhook exception alerting [ci skip]
|
2023-04-03 23:35:33 +01:00 |
|
viktorbarzin
|
a93aa03f72
|
add counter for overall webhook failures
|
2023-04-03 22:37:59 +01:00 |
|
viktorbarzin
|
86df04d96f
|
add alert to monitor free memory on nodes [ci skip]
|
2023-03-26 17:22:04 +01:00 |
|
viktorbarzin
|
44ec092ef0
|
update idrac dns record to fix resolving issue [ci skip]
|
2023-02-15 21:24:17 +00:00 |
|
viktorbarzin
|
41b61ac42b
|
add openwrt to monitored guests
|
2023-01-23 21:38:46 +00:00 |
|
viktorbarzin
|
fb767033fa
|
reduce high power usage alert sensitivity
|
2022-01-11 20:24:14 +00:00 |
|
viktorbarzin
|
2dfa48c2e1
|
add slack to notifications and update alert definitions after upgrade [ci skip]
|
2022-01-06 20:09:20 +00:00 |
|