Hi folks,
We’d like to use the Rancher Alert system to monitor specific Rancher pods that we’ve had problems with, such as cattle-cluster-agent and cattle-node-agent.
We would like to monitor for frequent restarts, or perhaps for frequent occurrences of something like CrashLoopBackOff. However, I cannot figure out how to monitor individual pods using Rancher Alerts.
I have tried a few things, such as kube_pod_container_status_restarts_total{namespace="cattle-system", pod=~"cattle-cluster-agent*}, as shown below, but without success:
Can anyone point me in the right direction?
-= Stefan
