viktorbarzin
|
c87376b670
|
remove 1gb limit for tsdb to confirm it was the root cause for memory issues [ci skip]
|
2023-04-21 23:04:39 +01:00 |
|
viktorbarzin
|
64491f9028
|
attempt to reduce prometheus memory by setting --storage.tsdb.retention.size; also add metrics-api which is not working atm [ci skip]
|
2023-04-17 01:28:03 +01:00 |
|
viktorbarzin
|
a75e647b48
|
add alert for unhandled exceptions [ci skip]
|
2023-04-05 00:15:10 +01:00 |
|
viktorbarzin
|
9fe57e2ec6
|
reword finance app webhook exception alerting [ci skip]
|
2023-04-03 23:35:33 +01:00 |
|
viktorbarzin
|
6e7de7e195
|
add prometheus pv and pvc [ci skip]
|
2023-04-03 23:21:48 +01:00 |
|
viktorbarzin
|
6838d319a2
|
add counter for overall webhook failures
|
2023-04-03 22:37:59 +01:00 |
|
viktorbarzin
|
e9bf46caf8
|
add alert to monitor free memory on nodes [ci skip]
|
2023-03-26 17:22:04 +01:00 |
|
viktorbarzin
|
cbdd5d57ce
|
update idrac dns record to fix resolving issue [ci skip]
|
2023-02-15 21:24:17 +00:00 |
|
viktorbarzin
|
f8900ccf92
|
add openwrt to monitored guests
|
2023-01-23 21:38:46 +00:00 |
|
viktorbarzin
|
695d002eef
|
update tf to work with k8s 1.25.0
|
2022-08-31 22:04:09 +01:00 |
|
viktorbarzin
|
6c425b5573
|
reduce high power usage alert sensitivity
|
2022-01-11 20:24:14 +00:00 |
|
viktorbarzin
|
0b20fc1e73
|
add slack to notifications and update alert definitions after upgrade [ci skip]
|
2022-01-06 20:09:20 +00:00 |
|
viktorbarzin
|
6870cee492
|
fix k8s upgrade issues [ci skip]
|
2022-01-06 00:07:48 +00:00 |
|
viktorbarzin
|
b6f8160183
|
update home prometheus path
|
2021-09-05 18:39:18 +01:00 |
|
viktorbarzin
|
755942ee73
|
update home address [CI SKIP]
|
2021-08-17 23:25:13 +01:00 |
|
viktorbarzin
|
c722825630
|
update home dns [ci skip]
|
2021-05-23 19:26:52 +01:00 |
|
viktorbarzin
|
9292b3285f
|
remove kubectl manifests bc drone is not happy running them :/
|
2021-05-08 14:03:34 +01:00 |
|
viktorbarzin
|
ead58dfc99
|
move grafana behind client tls until auth is setup [ci skip]
|
2021-05-08 01:19:24 +01:00 |
|
viktorbarzin
|
ae5cb2349b
|
do not use pvcs for alertmanager and allow overlapping blocks in prometheus [ci skip]
|
2021-05-03 12:14:56 +01:00 |
|
viktorbarzin
|
eca78feb51
|
add shlink and parts of dbaas [ci skip]
|
2021-04-17 19:19:04 +01:00 |
|
viktorbarzin
|
28cb35da94
|
fix summary in alerts [ci skip]
|
2021-04-11 19:18:56 +01:00 |
|
viktorbarzin
|
b9c9d82a03
|
add more alerts for services being down
|
2021-04-10 18:28:26 +01:00 |
|
viktorbarzin
|
0c77660e93
|
add prometheus check for mailserver down
|
2021-04-10 18:16:04 +01:00 |
|
viktorbarzin
|
12e46fad2a
|
add redfish exporter [CI SKIP]
|
2021-04-05 15:07:29 +01:00 |
|
viktorbarzin
|
cccc49378e
|
add cronjob to monitor prometheus and init correct config for wireguard ui [CI SKIP]
|
2021-04-02 23:14:47 +01:00 |
|
viktorbarzin
|
81e3f47f18
|
add config for chatbot and add alert for high mem openwrt [CI SKIP]
|
2021-03-07 23:14:43 +00:00 |
|
viktorbarzin
|
f95fcc26c3
|
website www host, dns ipv6
|
2021-02-25 21:55:00 +00:00 |
|
viktorbarzin
|
2b6de2113f
|
add value to alert summaries
|
2021-02-19 18:58:36 +00:00 |
|
viktorbarzin
|
2673c16d98
|
add missing mailserver terraform items
|
2021-02-18 22:26:36 +00:00 |
|
viktorbarzin
|
40faa5dc0e
|
make tls crt and keys optional params to the create_tls_secret module
|
2021-02-17 19:36:30 +00:00 |
|
viktorbarzin
|
8a9fff6799
|
revert debug value for node load alert
|
2021-02-17 18:33:52 +00:00 |
|
viktorbarzin
|
95750c6949
|
fix typo from templating which caused missing metrics and add alerts to prevent that from happening again
|
2021-02-10 23:14:09 +00:00 |
|
viktorbarzin
|
4caa987213
|
initial
|
2021-02-08 20:02:17 +00:00 |
|