mirror of
https://github.com/openshift/openshift-docs.git
synced 2026-02-05 12:46:18 +01:00
105 lines
4.2 KiB
Plaintext
105 lines
4.2 KiB
Plaintext
// Module included in the following assemblies:
|
|
//
|
|
// observability/logging/logging-6.0/log6x-loki.adoc
|
|
// observability/logging/logging-6.2/log6x-loki-6.2.adoc
|
|
|
|
:_mod-docs-content-type: PROCEDURE
|
|
[id="logging-enabling-loki-alerts_{context}"]
|
|
= Creating a log-based alerting rule with Loki
|
|
|
|
The `AlertingRule` CR contains a set of specifications and webhook validation definitions to declare groups of alerting rules for a single `LokiStack` instance. In addition, the webhook validation definition provides support for rule validation conditions:
|
|
|
|
* If an `AlertingRule` CR includes an invalid `interval` period, it is an invalid alerting rule
|
|
* If an `AlertingRule` CR includes an invalid `for` period, it is an invalid alerting rule.
|
|
* If an `AlertingRule` CR includes an invalid LogQL `expr`, it is an invalid alerting rule.
|
|
* If an `AlertingRule` CR includes two groups with the same name, it is an invalid alerting rule.
|
|
* If none of the above applies, an alerting rule is considered valid.
|
|
|
|
.AlertingRule definitions
|
|
[options="header"]
|
|
|===
|
|
| Tenant type | Valid namespaces for `AlertingRule` CRs
|
|
| application a| `<your_application_namespace>`
|
|
| audit a| `openshift-logging`
|
|
| infrastructure a| `openshift-/\*`, `kube-/\*`, `default`
|
|
|===
|
|
|
|
.Procedure
|
|
|
|
. Create an `AlertingRule` custom resource (CR):
|
|
+
|
|
.Example infrastructure `AlertingRule` CR
|
|
[source,yaml]
|
|
----
|
|
apiVersion: loki.grafana.com/v1
|
|
kind: AlertingRule
|
|
metadata:
|
|
name: loki-operator-alerts
|
|
namespace: openshift-operators-redhat <1>
|
|
labels: <2>
|
|
openshift.io/<label_name>: "true"
|
|
spec:
|
|
tenantID: "infrastructure" <3>
|
|
groups:
|
|
- name: LokiOperatorHighReconciliationError
|
|
rules:
|
|
- alert: HighPercentageError
|
|
expr: | <4>
|
|
sum(rate({kubernetes_namespace_name="openshift-operators-redhat", kubernetes_pod_name=~"loki-operator-controller-manager.*"} |= "error" [1m])) by (job)
|
|
/
|
|
sum(rate({kubernetes_namespace_name="openshift-operators-redhat", kubernetes_pod_name=~"loki-operator-controller-manager.*"}[1m])) by (job)
|
|
> 0.01
|
|
for: 10s
|
|
labels:
|
|
severity: critical <5>
|
|
annotations:
|
|
summary: High Loki Operator Reconciliation Errors <6>
|
|
description: High Loki Operator Reconciliation Errors <7>
|
|
----
|
|
<1> The namespace where this `AlertingRule` CR is created must have a label matching the LokiStack `spec.rules.namespaceSelector` definition.
|
|
<2> The `labels` block must match the LokiStack `spec.rules.selector` definition.
|
|
<3> `AlertingRule` CRs for `infrastructure` tenants are only supported in the `openshift-\*`, `kube-\*`, or `default` namespaces.
|
|
<4> The value for `kubernetes_namespace_name:` must match the value for `metadata.namespace`.
|
|
<5> The value of this mandatory field must be `critical`, `warning`, or `info`.
|
|
<6> This field is mandatory.
|
|
<7> This field is mandatory.
|
|
+
|
|
.Example application `AlertingRule` CR
|
|
[source,yaml]
|
|
----
|
|
apiVersion: loki.grafana.com/v1
|
|
kind: AlertingRule
|
|
metadata:
|
|
name: app-user-workload
|
|
namespace: app-ns <1>
|
|
labels: <2>
|
|
openshift.io/<label_name>: "true"
|
|
spec:
|
|
tenantID: "application"
|
|
groups:
|
|
- name: AppUserWorkloadHighError
|
|
rules:
|
|
- alert:
|
|
expr: | <3>
|
|
sum(rate({kubernetes_namespace_name="app-ns", kubernetes_pod_name=~"podName.*"} |= "error" [1m])) by (job)
|
|
for: 10s
|
|
labels:
|
|
severity: critical <4>
|
|
annotations:
|
|
summary: <5>
|
|
description: <6>
|
|
----
|
|
<1> The namespace where this `AlertingRule` CR is created must have a label matching the LokiStack `spec.rules.namespaceSelector` definition.
|
|
<2> The `labels` block must match the LokiStack `spec.rules.selector` definition.
|
|
<3> Value for `kubernetes_namespace_name:` must match the value for `metadata.namespace`.
|
|
<4> The value of this mandatory field must be `critical`, `warning`, or `info`.
|
|
<5> The value of this mandatory field is a summary of the rule.
|
|
<6> The value of this mandatory field is a detailed description of the rule.
|
|
|
|
. Apply the `AlertingRule` CR:
|
|
+
|
|
[source,terminal]
|
|
----
|
|
$ oc apply -f <filename>.yaml
|
|
----
|