You can find documentation on Prometheus Query Language on the web:
To create an alerting rule with PromQL, go to the notification page and create a new notification rule.
Select a scope,
then select the option "Use PromQL to define conditions to trigger alarm".
You have to choose a name. This name is used as the alert name in the status dashboard and in notifications.
You can add a warning and/or a critical PromQL expression and configure a delay for each. The delay is the time during which the threshold of the PromQL expression must be continuously exceeded before the status changes.
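The delay behaviour described above can be pictured with a small Python sketch. This is only an illustrative model, not the product's actual evaluation logic: the function name `threshold_held_for_delay`, the sample layout, and the rule that the window must be fully covered by data are all assumptions made for the example.

```python
def threshold_held_for_delay(samples, threshold, delay_seconds):
    """Return True if every sample in the trailing delay window exceeds threshold.

    samples: list of (timestamp_seconds, value) tuples sorted by timestamp.
    Hypothetical helper, used only to illustrate the notion of delay.
    """
    if not samples:
        return False
    latest_timestamp = samples[-1][0]
    window_start = latest_timestamp - delay_seconds
    # Assumption: without data covering the whole window, the status does not change.
    if samples[0][0] > window_start:
        return False
    window_values = [value for timestamp, value in samples if timestamp >= window_start]
    return all(value > threshold for value in window_values)


# One sample per minute over 5 minutes, always above 50.
steady = [(t, 60.0) for t in range(0, 301, 60)]
# Same series, but with a dip below the threshold at t=120.
dipped = [(t, 40.0 if t == 120 else 60.0) for t in range(0, 301, 60)]

print(threshold_held_for_delay(steady, 50, 300))  # held for the full delay
print(threshold_held_for_delay(dipped, 50, 300))  # interrupted by the dip
print(threshold_held_for_delay(steady, 50, 600))  # not enough history yet
```

In this model a single sample under the threshold resets the condition, which is why a longer delay makes an alert less sensitive to short spikes.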
We have several Cassandra servers running on our Kubernetes cluster. We want to trigger a warning alert if the sum of the cpu_used of their containers exceeds 50% for 300 seconds, and a critical alert if it stays above that threshold for 600 seconds.
PromQL (warning and critical):
sum(container_cpu_used{item=~"k8s_cassandra_cassandra.*"}) > 50
=~"k8s_cassandra_cassandra.*"
This regular-expression matcher selects all the containers whose item name begins with k8s_cassandra_cassandra.
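PromQL regular-expression matchers are anchored at both ends, which is why the trailing `.*` is needed to express "begins with". The sketch below mimics this with Python's `re.fullmatch`. The helper name `promql_regex_match` and the container names are invented for illustration. Prometheus uses RE2 syntax, which matches Python's `re` for a simple pattern like this one.

```python
import re


def promql_regex_match(pattern, label_value):
    """Mimic a PromQL =~ matcher: the pattern must match the whole label value."""
    return re.fullmatch(pattern, label_value) is not None


pattern = "k8s_cassandra_cassandra.*"

# Hypothetical item names, for illustration only.
print(promql_regex_match(pattern, "k8s_cassandra_cassandra-0_default"))  # begins with the prefix
print(promql_regex_match(pattern, "k8s_cassandra_cassandra"))            # the prefix alone
print(promql_regex_match(pattern, "old_k8s_cassandra_cassandra-0"))      # prefix is not at the start
print(promql_regex_match("k8s_cassandra_cassandra", "k8s_cassandra_cassandra-0"))  # no trailing .*
```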
We have a Kubernetes cluster with 3 workers. We want to trigger a warning alert if the sum of the system_load1 of the 3 workers exceeds 15 for 300 seconds, and a critical alert if it exceeds 30 for 300 seconds.
PromQL Warning:
sum(system_load1{instance=~"host-fqdn-1|host-fqdn-2|host-fqdn-3"}) > 15
PromQL Critical:
sum(system_load1{instance=~"host-fqdn-1|host-fqdn-2|host-fqdn-3"}) > 30
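To make the two expressions above concrete, here is a rough Python model of what the selector and `sum()` compute at one point in time. The load values and the extra host `host-fqdn-4` are made up, and `sum_matching` is a hypothetical helper, not part of any library.

```python
import re


def sum_matching(series, instance_pattern):
    """Sum the values of the series whose instance label fully matches the pattern."""
    return sum(
        value
        for instance, value in series.items()
        if re.fullmatch(instance_pattern, instance)
    )


# Made-up system_load1 values, one per instance.
load = {
    "host-fqdn-1": 6.0,
    "host-fqdn-2": 5.5,
    "host-fqdn-3": 4.5,
    "host-fqdn-4": 20.0,  # not listed in the pattern, so it is ignored
}

total = sum_matching(load, "host-fqdn-1|host-fqdn-2|host-fqdn-3")
print(total)       # 16.0
print(total > 15)  # warning condition holds
print(total > 30)  # critical condition does not hold
```

With these values only the warning expression returns a result. The alert would then change status once the condition has held for the configured delay.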
We want to create a notification rule that applies to any server. The alerting rule should trigger a warning alert if, on a given server, cpu_used > 50 and mem_used_perc > 80 for 120 seconds, and a critical alert if, on a given server, cpu_used > 80 and mem_used_perc > 90 for 300 seconds.
PromQL Warning:
cpu_used > 50 and on (instance) mem_used_perc > 80
PromQL Critical:
cpu_used > 80 and on (instance) mem_used_perc > 90
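The `and on (instance)` part keeps a cpu_used series only when a mem_used_perc series with the same instance label also passes its own threshold, so both conditions must hold on the same server. A simplified Python model, assuming exactly one series per instance for each metric (the helper names and the sample values are invented for illustration):

```python
def above(series, threshold):
    """Model of `metric > threshold`: keep only the series above the threshold."""
    return {instance: value for instance, value in series.items() if value > threshold}


def and_on_instance(left, right):
    """Model of `left and on (instance) right`: keep left series whose instance is also in right."""
    return {instance: value for instance, value in left.items() if instance in right}


# Made-up values, one series per server.
cpu_used = {"server-a": 60.0, "server-b": 90.0, "server-c": 40.0}
mem_used_perc = {"server-a": 85.0, "server-b": 70.0, "server-c": 95.0}

warning = and_on_instance(above(cpu_used, 50), above(mem_used_perc, 80))
critical = and_on_instance(above(cpu_used, 80), above(mem_used_perc, 90))

print(warning)   # {'server-a': 60.0}
print(critical)  # {}
```

Here server-b has high CPU but moderate memory, and server-c has high memory but low CPU, so neither matches. Only server-a satisfies both warning conditions at once.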