viktorbarzin
|
c87376b670
|
remove 1gb limit for tsdb to confirm it was the root cause for memory issues [ci skip]
|
2023-04-21 23:04:39 +01:00 |
|
viktorbarzin
|
64491f9028
|
attempt to reduce prometheus memory by setting --storage.tsdb.retention.size; also add metrics-api which is not working atm [ci skip]
|
2023-04-17 01:28:03 +01:00 |
|
viktorbarzin
|
a75e647b48
|
add alert for unhandled exceptions [ci skip]
|
2023-04-05 00:15:10 +01:00 |
|
viktorbarzin
|
9fe57e2ec6
|
reword finance app webhook exception alerting [ci skip]
|
2023-04-03 23:35:33 +01:00 |
|
viktorbarzin
|
6838d319a2
|
add counter for overall webhook failures
|
2023-04-03 22:37:59 +01:00 |
|
viktorbarzin
|
e9bf46caf8
|
add alert to monitor free memory on nodes [ci skip]
|
2023-03-26 17:22:04 +01:00 |
|
viktorbarzin
|
cbdd5d57ce
|
update idrac dns record to fix resolving issue [ci skip]
|
2023-02-15 21:24:17 +00:00 |
|
viktorbarzin
|
f8900ccf92
|
add openwrt to monitored guests
|
2023-01-23 21:38:46 +00:00 |
|
viktorbarzin
|
6c425b5573
|
reduce high power usage alert sensitivity
|
2022-01-11 20:24:14 +00:00 |
|
viktorbarzin
|
0b20fc1e73
|
add slack to notifications and update alert definitions after upgrade [ci skip]
|
2022-01-06 20:09:20 +00:00 |
|
viktorbarzin
|
b6f8160183
|
update home prometheus path
|
2021-09-05 18:39:18 +01:00 |
|
viktorbarzin
|
755942ee73
|
update home address [CI SKIP]
|
2021-08-17 23:25:13 +01:00 |
|
viktorbarzin
|
ae5cb2349b
|
do not use pvcs for alertmanager and allow overlapping blocks in prometheus [ci skip]
|
2021-05-03 12:14:56 +01:00 |
|
viktorbarzin
|
28cb35da94
|
fix summary in alerts [ci skip]
|
2021-04-11 19:18:56 +01:00 |
|
viktorbarzin
|
b9c9d82a03
|
add more alerts for services being down
|
2021-04-10 18:28:26 +01:00 |
|
viktorbarzin
|
0c77660e93
|
add prometheus check for mailserver down
|
2021-04-10 18:16:04 +01:00 |
|
viktorbarzin
|
12e46fad2a
|
add redfish exporter [CI SKIP]
|
2021-04-05 15:07:29 +01:00 |
|
viktorbarzin
|
81e3f47f18
|
add config for chatbot and add alert for high mem openwrt [CI SKIP]
|
2021-03-07 23:14:43 +00:00 |
|
viktorbarzin
|
2b6de2113f
|
add value to alert summaries
|
2021-02-19 18:58:36 +00:00 |
|
viktorbarzin
|
8a9fff6799
|
revert debug value for node load alert
|
2021-02-17 18:33:52 +00:00 |
|
viktorbarzin
|
95750c6949
|
fix typo from templating which caused missing metrics and add alerts to prevent that from happening again
|
2021-02-10 23:14:09 +00:00 |
|
viktorbarzin
|
4caa987213
|
initial
|
2021-02-08 20:02:17 +00:00 |
|