diff --git a/_topic_maps/_topic_map.yml b/_topic_maps/_topic_map.yml index 3f9cbc4efd..3458b3cbfa 100644 --- a/_topic_maps/_topic_map.yml +++ b/_topic_maps/_topic_map.yml @@ -2817,6 +2817,23 @@ Topics: File: logging-5-8-release-notes - Name: Logging 5.7 File: logging-5-7-release-notes + - Name: Logging 6.0 + Dir: logging-6.0 + Topics: + - Name: Release notes + File: log6x-release-notes + - Name: About logging 6.0 + File: log6x-about + - Name: Upgrading to Logging 6.0 + File: log6x-upgrading-to-6 + - Name: Configuring log forwarding + File: log6x-clf + - Name: Configuring LokiStack storage + File: log6x-loki + - Name: Visualization for logging + File: log6x-visual +# - Name: API reference 6.0 +# File: log6x-api-reference - Name: Support File: cluster-logging-support - Name: Troubleshooting logging diff --git a/modules/log6x-audit-log-filtering.adoc b/modules/log6x-audit-log-filtering.adoc new file mode 100644 index 0000000000..495c2307a5 --- /dev/null +++ b/modules/log6x-audit-log-filtering.adoc @@ -0,0 +1,118 @@ +// Module included in the following assemblies: +// +// * observability/logging/logging-6.0/log6x-clf.adoc + +:_mod-docs-content-type: CONCEPT +[id="log6x-audit-filtering_{context}"] += Overview of API audit filter +OpenShift API servers generate audit events for each API call, detailing the request, response, and the identity of the requester, leading to large volumes of data. The API Audit filter uses rules to enable the exclusion of non-essential events and the reduction of event size, facilitating a more manageable audit trail. Rules are checked in order, and checking stops at the first match. The amount of data that is included in an event is determined by the value of the `level` field: + +* `None`: The event is dropped. +* `Metadata`: Audit metadata is included, request and response bodies are removed. +* `Request`: Audit metadata and the request body are included, the response body is removed. 
+* `RequestResponse`: All data is included: metadata, request body, and response body. The response body can be very large. For example, `oc get pods -A` generates a response body containing the YAML description of every pod in the cluster. + +The `ClusterLogForwarder` custom resource (CR) uses the same format as the standard link:https://kubernetes.io/docs/tasks/debug/debug-cluster/audit/#audit-policy[Kubernetes audit policy], while providing the following additional functions: + +Wildcards:: Names of users, groups, namespaces, and resources can have a leading or trailing `\*` asterisk character. For example, the namespace `openshift-\*` matches `openshift-apiserver` or `openshift-authentication`. Resource `\*/status` matches `Pod/status` or `Deployment/status`. + +Default rules:: Events that do not match any rule in the policy are filtered as follows: +* Read-only system events such as `get`, `list`, and `watch` are dropped. +* Service account write events that occur within the same namespace as the service account are dropped. +* All other events are forwarded, subject to any configured rate limits. + +To disable these defaults, either end your rules list with a rule that has only a `level` field or add an empty rule. + +Omit response codes:: You can drop events based on the HTTP status code in the response by using the `omitResponseCodes` field, which lists the HTTP status codes for which no events are created. The default value is `[404, 409, 422, 429]`. If the value is an empty list, `[]`, then no status codes are omitted. + +The `ClusterLogForwarder` CR audit policy acts in addition to the {product-title} audit policy. The `ClusterLogForwarder` CR audit filter changes what the log collector forwards and provides the ability to filter by verb, user, group, namespace, or resource. You can create multiple filters to send different summaries of the same audit stream to different places.
For example, you can send a detailed stream to the local cluster log store and a less detailed stream to a remote site. + +[NOTE] +==== +You must have a cluster role `collect-audit-logs` to collect the audit logs. The following example provided is intended to illustrate the range of rules possible in an audit policy and is not a recommended configuration. +==== + +.Example audit policy +[source,yaml] +---- +apiVersion: observability.openshift.io/v1 +kind: ClusterLogForwarder +metadata: + name: + namespace: +spec: + serviceAccount: + name: + pipelines: + - name: my-pipeline + inputRefs: audit # <1> + filterRefs: my-policy # <2> + filters: + - name: my-policy + type: kubeAPIAudit + kubeAPIAudit: + # Don't generate audit events for all requests in RequestReceived stage. + omitStages: + - "RequestReceived" + + rules: + # Log pod changes at RequestResponse level + - level: RequestResponse + resources: + - group: "" + resources: ["pods"] + + # Log "pods/log", "pods/status" at Metadata level + - level: Metadata + resources: + - group: "" + resources: ["pods/log", "pods/status"] + + # Don't log requests to a configmap called "controller-leader" + - level: None + resources: + - group: "" + resources: ["configmaps"] + resourceNames: ["controller-leader"] + + # Don't log watch requests by the "system:kube-proxy" on endpoints or services + - level: None + users: ["system:kube-proxy"] + verbs: ["watch"] + resources: + - group: "" # core API group + resources: ["endpoints", "services"] + + # Don't log authenticated requests to certain non-resource URL paths. + - level: None + userGroups: ["system:authenticated"] + nonResourceURLs: + - "/api*" # Wildcard matching. + - "/version" + + # Log the request body of configmap changes in kube-system. + - level: Request + resources: + - group: "" # core API group + resources: ["configmaps"] + # This rule only applies to resources in the "kube-system" namespace. + # The empty string "" can be used to select non-namespaced resources. 
+ namespaces: ["kube-system"] + + # Log configmap and secret changes in all other namespaces at the Metadata level. + - level: Metadata + resources: + - group: "" # core API group + resources: ["secrets", "configmaps"] + + # Log all other resources in core and extensions at the Request level. + - level: Request + resources: + - group: "" # core API group + - group: "extensions" # Version of group should NOT be included. + + # A catch-all rule to log all other requests at the Metadata level. + - level: Metadata +---- +<1> The log types that are collected. The value for this field can be `audit` for audit logs, `application` for application logs, `infrastructure` for infrastructure logs, or a named input that has been defined for your application. +<2> The name of your audit policy. diff --git a/modules/log6x-code-ex.adoc b/modules/log6x-code-ex.adoc new file mode 100644 index 0000000000..f9465c5a72 --- /dev/null +++ b/modules/log6x-code-ex.adoc @@ -0,0 +1,241 @@ +:_mod-docs-content-type: REFERENCE +[id="log6x-code-ex_{context}"] += Logging 6.0 Code Examples + +Code examples used in the wider Logging 6.0 documentation are hosted here for single sourcing. + +//// +This file is not intended to be included in whole. Includes from this file should be using tagged regions only. +References: +* https://github.com/openshift/openshift-docs/blob/main/contributing_to_docs/doc_guidelines.adoc#including-by-tags +* https://docs.asciidoctor.org/asciidoc/latest/directives/include-tagged-regions/ +//// + +// Content template within commented out block. 
+//// +// tag::tagname[] +[source,yaml] +---- +Content +More Content +---- +// end::tagname[] +//// + + +// tag::filters-unchanged[] +[source,yaml] +---- +apiVersion: observability.openshift.io/v1 +kind: ClusterLogForwarder +metadata: + name: my-forwarder +spec: + serviceAccount: + name: my-account + filters: + - name: my-multiline + type: detectMultilineException + - name: my-parse + type: parse + - name: my-labels + type: openshiftLabels + openshiftLabels: + foo: bar + pipelines: + - name: my-pipeline + inputRefs: + - application + outputRefs: + - my-output + filterRefs: + - my-multiline + - my-parse + - my-labels + outputs: + - name: my-output + type: http + http: + url: http://my-log-output:80 +---- +// end::filters-unchanged[] + +// tag::filters-changed[] +[source,yaml] +---- +apiVersion: observability.openshift.io/v1 +kind: ClusterLogForwarder +metadata: + name: my-forwarder +spec: + serviceAccount: + name: my-account + filters: + - name: drop-filter + type: drop + drop: + - test: + - field: '.level' + matches: 'debug' + - name: prune-filter + type: prune + prune: + in: + - '.kubernetes.labels.foobar' + notIn: + - '.message' + - '.log_type' + - name: audit-filter + type: kubeAPIAudit + kubeAPIAudit: + omitResponseCodes: + - 404 + - 409 + pipelines: + - name: my-pipeline + inputRefs: + - application + - audit + outputRefs: + - my-output + filterRefs: + - drop-filter + - prune-filter + - audit-filter + outputs: + - name: my-output + type: http + http: + url: http://my-log-output:80 +---- +// end::filters-changed[] + +// tag::inputs-app-audit-infra[] +[source,yaml] +---- +apiVersion: observability.openshift.io/v1 +kind: ClusterLogForwarder +metadata: + name: my-forwarder +spec: + serviceAccount: + name: my-account + inputs: + - name: app-logs + type: application + application: + includes: + - namespace: my-ns1 + container: my-app1 + excludes: + - namespace: my-ns2 + container: my-app2 + - name: audit-logs + type: audit + audit: + .... 
+ - name: infra-logs + type: infrastructure + infrastructure: + .... + filters: + - name: my-parse + type: parse + - name: my-app-label + type: openshiftLabels + openshiftLabels: + my-log-index: app + - name: my-infra-label + type: openshiftLabels + openshiftLabels: + my-log-index: infra + outputs: + ...... + pipelines: + - name: my-app + inputRefs: + - application + filterRefs: + - my-parse + - my-app-label + outputRefs: + - es-output-by-label + - name: my-infra + inputRefs: + - infrastructure + filterRefs: + - my-parse + - my-infra-label + outputRefs: + - es-output-by-label +---- +// end::inputs-app-audit-infra[] + +// tag::output-cw-token[] +[source,yaml] +---- +apiVersion: observability.openshift.io/v1 +kind: ClusterLogForwarder +metadata: + name: my-forwarder +spec: + serviceAccount: + name: my-account + outputs: + - name: my-cw + type: cloudwatch + cloudwatch: + groupName: test-cluster_{.log_type||"unknown"} + region: us-east-1 + authentication: + type: iamRole + iamRole: + roleARN: + secretName: role-for-sts + key: credentials + token: + from: serviceAccount + pipelines: + - name: my-cw-logs + inputRefs: + - application + - infrastructure + outputRefs: + - my-cw +---- +// end::output-cw-token[] + +// tag::output-cw-static[] +[source,yaml] +---- +apiVersion: observability.openshift.io/v1 +kind: ClusterLogForwarder +metadata: + name: my-forwarder +spec: + serviceAccount: + name: my-account + outputs: + - name: my-cw + type: cloudwatch + cloudwatch: + groupName: test-cluster_{.log_type||"unknown"} + region: us-east-1 + authentication: + type: awsAccessKey + awsAccessKey: + keyId: + secretName: cw-secret + key: aws_access_key_id + keySecret: + secretName: cw-secret + key: aws_secret_access_key + pipelines: + - name: my-cw-logs + inputRefs: + - application + - infrastructure + outputRefs: + - my-cw +---- +// end::output-cw-static[] diff --git a/modules/log6x-collection-setup.adoc b/modules/log6x-collection-setup.adoc new file mode 100644 index 
0000000000..61d25e7ada --- /dev/null +++ b/modules/log6x-collection-setup.adoc @@ -0,0 +1,201 @@ +// Module included in the following assemblies: +// +// observability/logging/logging-6.0/log6x-clf.adoc + +:_mod-docs-content-type: PROCEDURE +[id="log6x-collection-setup_{context}"] += Setting up log collection + +This release of Cluster Logging requires administrators to explicitly grant log collection permissions to the service account associated with *ClusterLogForwarder*. This was not required in previous releases for the legacy logging scenario consisting of a *ClusterLogging* and, optionally, a *ClusterLogForwarder.logging.openshift.io* resource. + +The {clo} provides `collect-audit-logs`, `collect-application-logs`, and `collect-infrastructure-logs` cluster roles, which enable the collector to collect audit logs, application logs, and infrastructure logs, respectively. + +Set up log collection by binding the required cluster roles to your service account. + +== Legacy service accounts +To use the existing legacy service account `logcollector`, create the following *ClusterRoleBinding*: + +[source,terminal] +---- +$ oc adm policy add-cluster-role-to-user collect-application-logs system:serviceaccount:openshift-logging:logcollector +$ oc adm policy add-cluster-role-to-user collect-infrastructure-logs system:serviceaccount:openshift-logging:logcollector +---- + +Additionally, create the following *ClusterRoleBinding* if you are collecting audit logs: + +[source,terminal] +---- +$ oc adm policy add-cluster-role-to-user collect-audit-logs system:serviceaccount:openshift-logging:logcollector +---- + + +== Creating service accounts +.Prerequisites + +* The {clo} is installed in the `openshift-logging` namespace. +* You have administrator permissions. + +.Procedure + +. Create a service account for the collector. If you want to write logs to storage that requires a token for authentication, you must include a token in the service account. + +. 
Bind the appropriate cluster roles to the service account: ++ +.Example binding command +[source,terminal] +---- +$ oc adm policy add-cluster-role-to-user system:serviceaccount:: +---- + +=== Cluster role binding for your service account +The role_binding.yaml file binds the Cluster Logging Operator's ClusterRole to a specific ServiceAccount, allowing it to manage Kubernetes resources cluster-wide. + +[source,yaml] +---- +apiVersion: rbac.authorization.k8s.io/v1 +kind: ClusterRoleBinding +metadata: + name: manager-rolebinding +roleRef: <1> + apiGroup: rbac.authorization.k8s.io <2> + kind: ClusterRole <3> + name: cluster-logging-operator <4> +subjects: <5> + - kind: ServiceAccount <6> + name: cluster-logging-operator <7> + namespace: openshift-logging <8> +---- +<1> roleRef: References the ClusterRole to which the binding applies. +<2> apiGroup: Indicates the RBAC API group, specifying that the ClusterRole is part of Kubernetes' RBAC system. +<3> kind: Specifies that the referenced role is a ClusterRole, which applies cluster-wide. +<4> name: The name of the ClusterRole being bound to the ServiceAccount, here cluster-logging-operator. +<5> subjects: Defines the entities (users or service accounts) that are being granted the permissions from the ClusterRole. +<6> kind: Specifies that the subject is a ServiceAccount. +<7> name: The name of the ServiceAccount being granted the permissions. +<8> namespace: Indicates the namespace where the ServiceAccount is located. + +=== Writing application logs +The write-application-logs-clusterrole.yaml file defines a ClusterRole that grants permissions to write application logs to the Loki logging system.
+ +[source,yaml] +---- +apiVersion: rbac.authorization.k8s.io/v1 +kind: ClusterRole +metadata: + name: cluster-logging-write-application-logs +rules: <1> + - apiGroups: <2> + - loki.grafana.com <3> + resources: <4> + - application <5> + resourceNames: <6> + - logs <7> + verbs: <8> + - create <9> +---- +<1> rules: Specifies the permissions granted by this ClusterRole. +<2> apiGroups: Refers to the API group loki.grafana.com, which relates to the Loki logging system. +<3> loki.grafana.com: The API group for managing Loki-related resources. +<4> resources: The resource type that the ClusterRole grants permission to interact with. +<5> application: Refers to the application resources within the Loki logging system. +<6> resourceNames: Specifies the names of resources that this role can manage. +<7> logs: Refers to the log resources that can be created. +<8> verbs: The actions allowed on the resources. +<9> create: Grants permission to create new logs in the Loki system. + +=== Writing audit logs +The write-audit-logs-clusterrole.yaml file defines a ClusterRole that grants permissions to create audit logs in the Loki logging system. + +[source,yaml] +---- +apiVersion: rbac.authorization.k8s.io/v1 +kind: ClusterRole +metadata: + name: cluster-logging-write-audit-logs +rules: <1> + - apiGroups: <2> + - loki.grafana.com <3> + resources: <4> + - audit <5> + resourceNames: <6> + - logs <7> + verbs: <8> + - create <9> +---- +<1> rules: Defines the permissions granted by this ClusterRole. +<2> apiGroups: Specifies the API group loki.grafana.com. +<3> loki.grafana.com: The API group responsible for Loki logging resources. +<4> resources: Refers to the resource type this role manages, in this case, audit. +<5> audit: Specifies that the role manages audit logs within Loki. +<6> resourceNames: Defines the specific resources that the role can access. +<7> logs: Refers to the logs that can be managed under this role. +<8> verbs: The actions allowed on the resources. 
+<9> create: Grants permission to create new audit logs. + +=== Writing infrastructure logs +The write-infrastructure-logs-clusterrole.yaml file defines a ClusterRole that grants permission to create infrastructure logs in the Loki logging system. + +.Sample YAML +[source,yaml] +---- +apiVersion: rbac.authorization.k8s.io/v1 +kind: ClusterRole +metadata: + name: cluster-logging-write-infrastructure-logs +rules: <1> + - apiGroups: <2> + - loki.grafana.com <3> + resources: <4> + - infrastructure <5> + resourceNames: <6> + - logs <7> + verbs: <8> + - create <9> +---- +<1> rules: Specifies the permissions this ClusterRole grants. +<2> apiGroups: Specifies the API group for Loki-related resources. +<3> loki.grafana.com: The API group managing the Loki logging system. +<4> resources: Defines the resource type that this role can interact with. +<5> infrastructure: Refers to infrastructure-related resources that this role manages. +<6> resourceNames: Specifies the names of resources this role can manage. +<7> logs: Refers to the log resources related to infrastructure. +<8> verbs: The actions permitted by this role. +<9> create: Grants permission to create infrastructure logs in the Loki system. + +=== ClusterLogForwarder editor role +The clusterlogforwarder-editor-role.yaml file defines a ClusterRole that allows users to manage ClusterLogForwarders in OpenShift. + +[source,yaml] +---- +apiVersion: rbac.authorization.k8s.io/v1 +kind: ClusterRole +metadata: + name: clusterlogforwarder-editor-role +rules: <1> + - apiGroups: <2> + - observability.openshift.io <3> + resources: <4> + - clusterlogforwarders <5> + verbs: <6> + - create <7> + - delete <8> + - get <9> + - list <10> + - patch <11> + - update <12> + - watch <13> +---- +<1> rules: Specifies the permissions this ClusterRole grants. +<2> apiGroups: Refers to the OpenShift-specific API group. 
+<4> resources: Specifies the resources this role can manage. +<5> clusterlogforwarders: Refers to the log forwarding resources in OpenShift. +<6> verbs: Specifies the actions allowed on the ClusterLogForwarders. +<7> create: Grants permission to create new ClusterLogForwarders. +<8> delete: Grants permission to delete existing ClusterLogForwarders. +<9> get: Grants permission to retrieve information about specific ClusterLogForwarders. +<10> list: Allows listing all ClusterLogForwarders. +<11> patch: Grants permission to partially modify ClusterLogForwarders. +<12> update: Grants permission to update existing ClusterLogForwarders. +<13> watch: Grants permission to monitor changes to ClusterLogForwarders. diff --git a/modules/log6x-config-roles.adoc b/modules/log6x-config-roles.adoc new file mode 100644 index 0000000000..5eaa4cb134 --- /dev/null +++ b/modules/log6x-config-roles.adoc @@ -0,0 +1,113 @@ +// Module included in the following assemblies: +// +// observability/logging/logging-6.0/log6x-clf.adoc + + +:_mod-docs-content-type: CONCEPT +[id="log6x-config-roles_{context}"] += Configuring Roles for Logging + +Logging does not grant all users access to logs by default. As an administrator, you must configure your users' access unless the Operator was upgraded and prior configurations are in place. Depending on your configuration and need, you can configure fine grain access to logs using the following: + +* Cluster wide policies +* Namespace scoped policies +* Creation of custom admin groups + +As an administrator, you must create the role bindings and cluster role bindings appropriate for your deployment. The {clo} provides the following cluster roles: + +* `cluster-logging-application-view` grants permission to read application logs. +* `cluster-logging-infrastructure-view` grants permission to read infrastructure logs. +* `cluster-logging-audit-view` grants permission to read audit logs. 
+ +If you have upgraded from a prior version, an additional cluster role `logging-application-logs-reader` and associated cluster role binding `logging-all-authenticated-application-logs-reader` provide backward compatibility, allowing any authenticated user read access in their namespaces. + +[NOTE] +==== +Users with access by namespace must provide a namespace when querying application logs. +==== + +[id="cluster-wide-access_{context}"] +== Cluster-wide access +Cluster role binding resources reference cluster roles and set permissions cluster-wide. + +.Example ClusterRoleBinding +[source,yaml] +---- +kind: ClusterRoleBinding +apiVersion: rbac.authorization.k8s.io/v1 +metadata: + name: logging-all-application-logs-reader +roleRef: + apiGroup: rbac.authorization.k8s.io + kind: ClusterRole + name: cluster-logging-application-view # <1> +subjects: # <2> +- kind: Group + name: system:authenticated + apiGroup: rbac.authorization.k8s.io +---- +<1> Additional `ClusterRoles` are `cluster-logging-infrastructure-view` and `cluster-logging-audit-view`. +<2> Specifies the users or groups this object applies to. + +[id="namespaced-access_{context}"] +== Namespaced access + +You can use `RoleBinding` resources with `ClusterRole` objects to define the namespaces in which a user or group can access logs. + +.Example RoleBinding +[source,yaml] +---- +kind: RoleBinding +apiVersion: rbac.authorization.k8s.io/v1 +metadata: + name: allow-read-logs + namespace: log-test-0 # <1> +roleRef: + apiGroup: rbac.authorization.k8s.io + kind: ClusterRole + name: cluster-logging-application-view +subjects: +- kind: User + apiGroup: rbac.authorization.k8s.io + name: testuser-0 +---- +<1> Specifies the namespace this `RoleBinding` applies to. + + +[id="custom-admin-group-access_{context}"] +== Custom admin group access +If you have a large deployment with several users who require broader permissions, you can create a custom group using the `adminGroups` field. 
Users who are members of any group specified in the `adminGroups` field of the `LokiStack` CR are considered administrators. + +Administrator users have access to all application logs in all namespaces if they are also assigned the `cluster-logging-application-view` role. + +.Example `LokiStack` CR +[source,yaml] +---- +apiVersion: loki.grafana.com/v1 +kind: LokiStack +metadata: +# tag::LokiMode[] + name: logging-loki + namespace: openshift-logging +# end::LokiMode[] +# tag::NetObservMode[] + name: loki + namespace: netobserv +# end::NetObservMode[] +spec: + tenants: +# tag::LokiMode[] + mode: openshift-logging # <1> +# end::LokiMode[] +# tag::NetObservMode[] + mode: openshift-network # <1> +# end::NetObservMode[] + openshift: + adminGroups: # <2> + - cluster-admin + - custom-admin-group # <3> +---- +<1> Custom admin groups are only available in this mode. +<2> Entering an empty list `[]` value for this field disables admin groups. +<3> Overrides the default groups (`system:cluster-admins`, `cluster-admin`, `dedicated-admin`). +// end::CustomAdmin[] diff --git a/modules/log6x-content-filter-drop-records.adoc b/modules/log6x-content-filter-drop-records.adoc new file mode 100644 index 0000000000..affd4c242e --- /dev/null +++ b/modules/log6x-content-filter-drop-records.adoc @@ -0,0 +1,108 @@ +// Module included in the following assemblies: +// +// * observability/logging/logging-6.0/log6x-clf.adoc + +:_mod-docs-content-type: PROCEDURE +[id="log6x-content-filter-drop-records_{context}"] += Configuring content filters to drop unwanted log records + +When the `drop` filter is configured, the log collector evaluates log streams according to the filters before forwarding. The collector drops unwanted log records that match the specified configuration. + +.Procedure + +. Add a configuration for a filter to the `filters` spec in the `ClusterLogForwarder` CR. 
++ +The following example shows how to configure the `ClusterLogForwarder` CR to drop log records based on regular expressions: ++ +.Example `ClusterLogForwarder` CR +[source,yaml] +---- +apiVersion: observability.openshift.io/v1 +kind: ClusterLogForwarder +metadata: +# ... +spec: + serviceAccount: + name: + filters: + - name: + type: drop # <1> + drop: # <2> + - test: # <3> + - field: .kubernetes.labels."foo-bar/baz" # <4> + matches: .+ # <5> + - field: .kubernetes.pod_name + notMatches: "my-pod" # <6> + pipelines: + - name: # <7> + filterRefs: [""] +# ... +---- +<1> Specifies the type of filter. The `drop` filter drops log records that match the filter configuration. +<2> Specifies configuration options for applying the `drop` filter. +<3> Specifies the configuration for tests that are used to evaluate whether a log record is dropped. +** If all the conditions specified for a test are true, the test passes and the log record is dropped. +** When multiple tests are specified for the `drop` filter configuration, if any of the tests pass, the record is dropped. +** If there is an error evaluating a condition, for example, the field is missing from the log record being evaluated, that condition evaluates to false. +<4> Specifies a dot-delimited field path, which is a path to a field in the log record. The path can contain alpha-numeric characters and underscores (`a-zA-Z0-9_`), for example, `.kubernetes.namespace_name`. If segments contain characters outside of this range, the segment must be in quotes, for example, `.kubernetes.labels."foo.bar-bar/baz"`. You can include multiple field paths in a single `test` configuration, but they must all evaluate to true for the test to pass and the `drop` filter to be applied. +<5> Specifies a regular expression. If log records match this regular expression, they are dropped. You can set either the `matches` or `notMatches` condition for a single `field` path, but not both. +<6> Specifies a regular expression. 
If log records do not match this regular expression, they are dropped. You can set either the `matches` or `notMatches` condition for a single `field` path, but not both. +<7> Specifies the pipeline that the `drop` filter is applied to. + +. Apply the `ClusterLogForwarder` CR by running the following command: ++ +[source,terminal] +---- +$ oc apply -f .yaml +---- + +.Additional examples + +The following additional example shows how you can configure the `drop` filter to only keep higher priority log records: + +[source,yaml] +---- +apiVersion: observability.openshift.io/v1 +kind: ClusterLogForwarder +metadata: +# ... +spec: + serviceAccount: + name: + filters: + - name: important + type: drop + drop: + - test: + - field: .message + notMatches: "(?i)critical|error" + - field: .level + matches: "info|warning" +# ... +---- + +In addition to including multiple field paths in a single `test` configuration, you can also include additional tests that are treated as _OR_ checks. In the following example, records are dropped if either `test` configuration evaluates to true. However, for the second `test` configuration, both field specs must be true for it to be evaluated to true: + +[source,yaml] +---- +apiVersion: observability.openshift.io/v1 +kind: ClusterLogForwarder +metadata: +# ... +spec: + serviceAccount: + name: + filters: + - name: important + type: drop + drop: + - test: + - field: .kubernetes.namespace_name + matches: "^open" + - test: + - field: .log_type + matches: "application" + - field: .kubernetes.pod_name + notMatches: "my-pod" +# ... 
+---- diff --git a/modules/log6x-content-filter-prune-records.adoc b/modules/log6x-content-filter-prune-records.adoc new file mode 100644 index 0000000000..77abac0c05 --- /dev/null +++ b/modules/log6x-content-filter-prune-records.adoc @@ -0,0 +1,59 @@ +// Module included in the following assemblies: +// +// * observability/logging/logging-6.0/log6x-clf.adoc + +:_mod-docs-content-type: PROCEDURE +[id="log6x-content-filter-prune-records_{context}"] += Configuring content filters to prune log records + +When the `prune` filter is configured, the log collector evaluates log streams according to the filters before forwarding. The collector prunes log records by removing low-value fields such as pod annotations. + +.Procedure + +. Add a configuration for a filter to the `filters` spec in the `ClusterLogForwarder` CR. ++ +The following example shows how to configure the `ClusterLogForwarder` CR to prune log records based on field paths: ++ +[IMPORTANT] +==== +If both the `in` and `notIn` arrays are specified, records are pruned based on the `notIn` array first, which takes precedence over the `in` array. After records have been pruned by using the `notIn` array, they are then pruned by using the `in` array. +==== ++ +.Example `ClusterLogForwarder` CR +[source,yaml] +---- +apiVersion: observability.openshift.io/v1 +kind: ClusterLogForwarder +metadata: +# ... +spec: + serviceAccount: + name: + filters: + - name: + type: prune # <1> + prune: # <2> + in: [.kubernetes.annotations, .kubernetes.namespace_id] # <3> + notIn: [.kubernetes,.log_type,.message,."@timestamp"] # <4> + pipelines: + - name: # <5> + filterRefs: [""] +# ... +---- +<1> Specify the type of filter. The `prune` filter prunes log records by configured fields. +<2> Specify configuration options for applying the `prune` filter. The `in` and `notIn` fields are specified as arrays of dot-delimited field paths, which are paths to fields in log records. 
These paths can contain alpha-numeric characters and underscores (`a-zA-Z0-9_`), for example, `.kubernetes.namespace_name`. If segments contain characters outside of this range, the segment must be in quotes, for example, `.kubernetes.labels."foo.bar-bar/baz"`. +<3> Optional: Any fields that are specified in this array are removed from the log record. +<4> Optional: Any fields that are not specified in this array are removed from the log record. +<5> Specify the pipeline that the `prune` filter is applied to. ++ +[NOTE] +==== +The filter exempts the `.log_type`, `.log_source`, and `.message` fields. +==== + +. Apply the `ClusterLogForwarder` CR by running the following command: ++ +[source,terminal] +---- +$ oc apply -f .yaml +---- diff --git a/modules/log6x-enabling-loki-alerts.adoc b/modules/log6x-enabling-loki-alerts.adoc new file mode 100644 index 0000000000..a598de376e --- /dev/null +++ b/modules/log6x-enabling-loki-alerts.adoc @@ -0,0 +1,103 @@ +// Module included in the following assemblies: +// +// observability/logging/logging-6.0/log6x-loki.adoc + +:_mod-docs-content-type: PROCEDURE +[id="logging-enabling-loki-alerts_{context}"] += Creating a log-based alerting rule with Loki + +The `AlertingRule` CR contains a set of specifications and webhook validation definitions to declare groups of alerting rules for a single `LokiStack` instance. In addition, the webhook validation definition provides support for rule validation conditions: + +* If an `AlertingRule` CR includes an invalid `interval` period, it is an invalid alerting rule. +* If an `AlertingRule` CR includes an invalid `for` period, it is an invalid alerting rule. +* If an `AlertingRule` CR includes an invalid LogQL `expr`, it is an invalid alerting rule. +* If an `AlertingRule` CR includes two groups with the same name, it is an invalid alerting rule. +* If none of the above applies, an alerting rule is considered valid. 
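+ +For reference, the `interval` and `for` periods that these validation conditions check are set in the group and rule definitions of the `AlertingRule` CR. The following minimal sketch uses hypothetical names and values and is not a recommended configuration: + +[source,yaml] +---- +apiVersion: loki.grafana.com/v1 +kind: AlertingRule +metadata: + name: example-rules + namespace: openshift-logging +spec: + tenantID: "audit" + groups: + - name: example-group + interval: 1m # evaluation interval; must be a valid duration + rules: + - alert: ExampleAlert + expr: | # must be valid LogQL + sum(rate({log_type="audit"}[5m])) > 100 + for: 10m # must also be a valid duration + labels: + severity: warning + annotations: + summary: Example alert summary + description: Example alert description +----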
+ +.AlertingRule definitions +[options="header"] +|=== +| Tenant type | Valid namespaces for `AlertingRule` CRs +| application a| `` +| audit a| `openshift-logging` +| infrastructure a| `openshift-\*`, `kube-\*`, `default` +|=== + +.Procedure + +. Create an `AlertingRule` custom resource (CR): ++ +.Example infrastructure `AlertingRule` CR +[source,yaml] +---- + apiVersion: loki.grafana.com/v1 + kind: AlertingRule + metadata: + name: loki-operator-alerts + namespace: openshift-operators-redhat <1> + labels: <2> + openshift.io/: "true" + spec: + tenantID: "infrastructure" <3> + groups: + - name: LokiOperatorHighReconciliationError + rules: + - alert: HighPercentageError + expr: | <4> + sum(rate({kubernetes_namespace_name="openshift-operators-redhat", kubernetes_pod_name=~"loki-operator-controller-manager.*"} |= "error" [1m])) by (job) + / + sum(rate({kubernetes_namespace_name="openshift-operators-redhat", kubernetes_pod_name=~"loki-operator-controller-manager.*"}[1m])) by (job) + > 0.01 + for: 10s + labels: + severity: critical <5> + annotations: + summary: High Loki Operator Reconciliation Errors <6> + description: High Loki Operator Reconciliation Errors <7> +---- +<1> The namespace where this `AlertingRule` CR is created must have a label matching the LokiStack `spec.rules.namespaceSelector` definition. +<2> The `labels` block must match the LokiStack `spec.rules.selector` definition. +<3> `AlertingRule` CRs for `infrastructure` tenants are only supported in the `openshift-\*`, `kube-\*`, or `default` namespaces. +<4> The value for `kubernetes_namespace_name` must match the value for `metadata.namespace`. +<5> The value of this mandatory field must be `critical`, `warning`, or `info`. +<6> The value of this mandatory field is a summary of the rule. +<7> The value of this mandatory field is a detailed description of the rule. 
++ +.Example application `AlertingRule` CR +[source,yaml] +---- + apiVersion: loki.grafana.com/v1 + kind: AlertingRule + metadata: + name: app-user-workload + namespace: app-ns <1> + labels: <2> + openshift.io/: "true" + spec: + tenantID: "application" + groups: + - name: AppUserWorkloadHighError + rules: + - alert: + expr: | <3> + sum(rate({kubernetes_namespace_name="app-ns", kubernetes_pod_name=~"podName.*"} |= "error" [1m])) by (job) + for: 10s + labels: + severity: critical <4> + annotations: + summary: <5> + description: <6> +---- +<1> The namespace where this `AlertingRule` CR is created must have a label matching the LokiStack `spec.rules.namespaceSelector` definition. +<2> The `labels` block must match the LokiStack `spec.rules.selector` definition. +<3> Value for `kubernetes_namespace_name:` must match the value for `metadata.namespace`. +<4> The value of this mandatory field must be `critical`, `warning`, or `info`. +<5> The value of this mandatory field is a summary of the rule. +<6> The value of this mandatory field is a detailed description of the rule. + +. Apply the `AlertingRule` CR: ++ +[source,terminal] +---- +$ oc apply -f .yaml +---- diff --git a/modules/log6x-identity-federation.adoc b/modules/log6x-identity-federation.adoc new file mode 100644 index 0000000000..d716d02960 --- /dev/null +++ b/modules/log6x-identity-federation.adoc @@ -0,0 +1,63 @@ +// Module included in the following assemblies: +// +// observability/logging/logging-6.0/log6x-loki.adoc + +:_mod-docs-content-type: PROCEDURE +[id="logging-identity-federation_{context}"] += Enabling authentication to cloud-based log stores using short-lived tokens + +Workload identity federation enables authentication to cloud-based log stores using short-lived tokens. + +.Procedure + +* Use one of the following options to enable authentication: + +** If you use the {product-title} web console to install the {loki-op}, clusters that use short-lived tokens are automatically detected. 
You are prompted to create roles and supply the data required for the {loki-op} to create a `CredentialsRequest` object, which populates a secret. + +** If you use the {oc-first} to install the {loki-op}, you must manually create a `Subscription` object using the appropriate template for your storage provider, as shown in the following examples. This authentication strategy is only supported for the storage providers indicated. ++ +.Example Azure subscription +[source,yaml] +---- +apiVersion: operators.coreos.com/v1alpha1 +kind: Subscription +metadata: + name: loki-operator + namespace: openshift-operators-redhat +spec: + channel: "stable-6.0" + installPlanApproval: Manual + name: loki-operator + source: redhat-operators + sourceNamespace: openshift-marketplace + config: + env: + - name: CLIENTID + value: + - name: TENANTID + value: + - name: SUBSCRIPTIONID + value: + - name: REGION + value: +---- ++ +.Example AWS subscription +[source,yaml] +---- +apiVersion: operators.coreos.com/v1alpha1 +kind: Subscription +metadata: + name: loki-operator + namespace: openshift-operators-redhat +spec: + channel: "stable-6.0" + installPlanApproval: Manual + name: loki-operator + source: redhat-operators + sourceNamespace: openshift-marketplace + config: + env: + - name: ROLEARN + value: +---- diff --git a/modules/log6x-input-spec-filter-audit-infrastructure.adoc b/modules/log6x-input-spec-filter-audit-infrastructure.adoc new file mode 100644 index 0000000000..1f9592a035 --- /dev/null +++ b/modules/log6x-input-spec-filter-audit-infrastructure.adoc @@ -0,0 +1,58 @@ +// Module included in the following assemblies: +// +// * observability/logging/logging-6.0/log6x-clf.adoc + +:_mod-docs-content-type: PROCEDURE +[id="log6x-input-spec-filter-audit-infrastructure_{context}"] += Filtering the audit and infrastructure log inputs by source + +You can define the list of `audit` and `infrastructure` sources from which to collect logs by using the `input` selector. + +.Procedure + +. 
Add a configuration to define the `audit` and `infrastructure` sources in the `ClusterLogForwarder` CR. + ++ +The following example shows how to configure the `ClusterLogForwarder` CR to define `audit` and `infrastructure` sources: ++ +.Example `ClusterLogForwarder` CR ++ +[source,yaml] +---- +apiVersion: observability.openshift.io/v1 +kind: ClusterLogForwarder +# ... +spec: + serviceAccount: + name: + inputs: + - name: mylogs1 + type: infrastructure + infrastructure: + sources: # <1> + - node + - name: mylogs2 + type: audit + audit: + sources: # <2> + - kubeAPI + - openshiftAPI + - ovn +# ... +---- +<1> Specifies the list of infrastructure sources to collect. The valid sources include: +** `node`: Journal log from the node +** `container`: Logs from the workloads deployed in the namespaces +<2> Specifies the list of audit sources to collect. The valid sources include: +** `kubeAPI`: Logs from the Kubernetes API servers +** `openshiftAPI`: Logs from the OpenShift API servers +** `auditd`: Logs from a node auditd service +** `ovn`: Logs from an open virtual network service + +. Apply the `ClusterLogForwarder` CR by running the following command: + ++ +[source,terminal] +---- +$ oc apply -f .yaml +---- diff --git a/modules/log6x-input-spec-filter-labels-expressions.adoc b/modules/log6x-input-spec-filter-labels-expressions.adoc new file mode 100644 index 0000000000..c04c37736f --- /dev/null +++ b/modules/log6x-input-spec-filter-labels-expressions.adoc @@ -0,0 +1,54 @@ +// Module included in the following assemblies: +// +// * observability/logging/logging-6.0/log6x-clf.adoc + +:_mod-docs-content-type: PROCEDURE +[id="log6x-input-spec-filter-labels-expressions_{context}"] += Filtering application logs at input by including the label expressions or a matching label key and values + +You can include the application logs based on the label expressions or a matching label key and its values by using the `input` selector. + +.Procedure + +. 
Add a configuration for a filter to the `input` spec in the `ClusterLogForwarder` CR. ++ +The following example shows how to configure the `ClusterLogForwarder` CR to include logs based on label expressions or matched label key/values: ++ +.Example `ClusterLogForwarder` CR +[source,yaml] +---- +apiVersion: observability.openshift.io/v1 +kind: ClusterLogForwarder +# ... +spec: + serviceAccount: + name: + inputs: + - name: mylogs + application: + selector: + matchExpressions: + - key: env # <1> + operator: In # <2> + values: ["prod", "qa"] # <3> + - key: zone + operator: NotIn + values: ["east", "west"] + matchLabels: # <4> + app: one + name: app1 + type: application +# ... +---- +<1> Specifies the label key to match. +<2> Specifies the operator. Valid values include: `In`, `NotIn`, `Exists`, and `DoesNotExist`. +<3> Specifies an array of string values. If the `operator` value is either `Exists` or `DoesNotExist`, the value array must be empty. +<4> Specifies an exact key or value mapping. + +. Apply the `ClusterLogForwarder` CR by running the following command: + ++ +[source,terminal] +---- +$ oc apply -f .yaml +---- diff --git a/modules/log6x-input-spec-filter-namespace-container.adoc b/modules/log6x-input-spec-filter-namespace-container.adoc new file mode 100644 index 0000000000..70b3b64a62 --- /dev/null +++ b/modules/log6x-input-spec-filter-namespace-container.adoc @@ -0,0 +1,52 @@ +// Module included in the following assemblies: +// +// * observability/logging/logging-6.0/log6x-clf.adoc + +:_mod-docs-content-type: PROCEDURE +[id="log6x-input-spec-filter-namespace-container_{context}"] += Filtering application logs at input by including or excluding the namespace or container name + +You can include or exclude the application logs based on the namespace and container name by using the `input` selector. + +.Procedure + +. Add a configuration to include or exclude the namespace and container names in the `ClusterLogForwarder` CR. 
++ +The following example shows how to configure the `ClusterLogForwarder` CR to include or exclude namespaces and container names: ++ +.Example `ClusterLogForwarder` CR +[source,yaml] +---- +apiVersion: observability.openshift.io/v1 +kind: ClusterLogForwarder +# ... +spec: + serviceAccount: + name: + inputs: + - name: mylogs + application: + includes: + - namespace: "my-project" # <1> + container: "my-container" # <2> + excludes: + - container: "other-container*" # <3> + namespace: "other-namespace" # <4> +# ... +---- +<1> Specifies that the logs are only collected from these namespaces. +<2> Specifies that the logs are only collected from these containers. +<3> Specifies the pattern of container names to ignore when collecting the logs. +<4> Specifies the namespaces to ignore when collecting the logs. ++ +[NOTE] +==== +The `excludes` field takes precedence over the `includes` field. +==== ++ +. Apply the `ClusterLogForwarder` CR by running the following command: ++ +[source,terminal] +---- +$ oc apply -f .yaml +---- diff --git a/modules/log6x-loki-memberlist-ip.adoc b/modules/log6x-loki-memberlist-ip.adoc new file mode 100644 index 0000000000..ae4cf210d0 --- /dev/null +++ b/modules/log6x-loki-memberlist-ip.adoc @@ -0,0 +1,33 @@ +// Module included in the following assemblies: +// +// * logging/cluster-logging-loki.adoc + +:_mod-docs-content-type: CONCEPT +[id="logging-loki-memberlist-ip_{context}"] += Configuring Loki to tolerate memberlist creation failure + +In an {product-title} cluster, administrators generally use a non-private IP network range. As a result, the LokiStack memberlist configuration fails because, by default, it only uses private IP networks. + +As an administrator, you can select the pod network for the memberlist configuration. You can modify the `LokiStack` custom resource (CR) to use the `podIP` address in the `hashRing` spec. 
To configure the `LokiStack` CR, use the following command: + +[source,terminal] +---- +$ oc patch LokiStack logging-loki -n openshift-logging --type=merge -p '{"spec": {"hashRing":{"memberlist":{"instanceAddrType":"podIP"},"type":"memberlist"}}}' +---- + +.Example LokiStack to include `podIP` +[source,yaml] +---- +apiVersion: loki.grafana.com/v1 +kind: LokiStack +metadata: + name: logging-loki + namespace: openshift-logging +spec: +# ... + hashRing: + type: memberlist + memberlist: + instanceAddrType: podIP +# ... +---- diff --git a/modules/log6x-loki-pod-placement.adoc b/modules/log6x-loki-pod-placement.adoc new file mode 100644 index 0000000000..fc1b5f79da --- /dev/null +++ b/modules/log6x-loki-pod-placement.adoc @@ -0,0 +1,207 @@ +// Module included in the following assemblies: +// +// observability/logging/logging-6.0/log6x-loki.adoc + +:_mod-docs-content-type: CONCEPT +[id="logging-loki-pod-placement_{context}"] += Loki pod placement +You can control which nodes the Loki pods run on, and prevent other workloads from using those nodes, by using tolerations or node selectors on the pods. + +You can apply tolerations to the log store pods with the LokiStack custom resource (CR) and apply taints to a node with the node specification. A taint on a node is a `key:value` pair that instructs the node to repel all pods that do not allow the taint. Using a specific `key:value` pair that is not on other pods ensures that only the log store pods can run on that node. + +.Example LokiStack with node selectors +[source,yaml] +---- +apiVersion: loki.grafana.com/v1 +kind: LokiStack +metadata: + name: logging-loki + namespace: openshift-logging +spec: +# ... 
+ template: + compactor: # <1> + nodeSelector: + node-role.kubernetes.io/infra: "" # <2> + distributor: + nodeSelector: + node-role.kubernetes.io/infra: "" + gateway: + nodeSelector: + node-role.kubernetes.io/infra: "" + indexGateway: + nodeSelector: + node-role.kubernetes.io/infra: "" + ingester: + nodeSelector: + node-role.kubernetes.io/infra: "" + querier: + nodeSelector: + node-role.kubernetes.io/infra: "" + queryFrontend: + nodeSelector: + node-role.kubernetes.io/infra: "" + ruler: + nodeSelector: + node-role.kubernetes.io/infra: "" +# ... +---- +<1> Specifies the component pod type that applies to the node selector. +<2> Specifies the pods that are moved to nodes containing the defined label. + + +.Example LokiStack CR with node selectors and tolerations +[source,yaml] +---- +apiVersion: loki.grafana.com/v1 +kind: LokiStack +metadata: + name: logging-loki + namespace: openshift-logging +spec: +# ... + template: + compactor: + nodeSelector: + node-role.kubernetes.io/infra: "" + tolerations: + - effect: NoSchedule + key: node-role.kubernetes.io/infra + value: reserved + - effect: NoExecute + key: node-role.kubernetes.io/infra + value: reserved + distributor: + nodeSelector: + node-role.kubernetes.io/infra: "" + tolerations: + - effect: NoSchedule + key: node-role.kubernetes.io/infra + value: reserved + - effect: NoExecute + key: node-role.kubernetes.io/infra + value: reserved + indexGateway: + nodeSelector: + node-role.kubernetes.io/infra: "" + tolerations: + - effect: NoSchedule + key: node-role.kubernetes.io/infra + value: reserved + - effect: NoExecute + key: node-role.kubernetes.io/infra + value: reserved + ingester: + nodeSelector: + node-role.kubernetes.io/infra: "" + tolerations: + - effect: NoSchedule + key: 
node-role.kubernetes.io/infra + value: reserved + - effect: NoExecute + key: node-role.kubernetes.io/infra + value: reserved + querier: + nodeSelector: + node-role.kubernetes.io/infra: "" + tolerations: + - effect: NoSchedule + key: node-role.kubernetes.io/infra + value: reserved + - effect: NoExecute + key: node-role.kubernetes.io/infra + value: reserved + queryFrontend: + nodeSelector: + node-role.kubernetes.io/infra: "" + tolerations: + - effect: NoSchedule + key: node-role.kubernetes.io/infra + value: reserved + - effect: NoExecute + key: node-role.kubernetes.io/infra + value: reserved + ruler: + nodeSelector: + node-role.kubernetes.io/infra: "" + tolerations: + - effect: NoSchedule + key: node-role.kubernetes.io/infra + value: reserved + - effect: NoExecute + key: node-role.kubernetes.io/infra + value: reserved + gateway: + nodeSelector: + node-role.kubernetes.io/infra: "" + tolerations: + - effect: NoSchedule + key: node-role.kubernetes.io/infra + value: reserved + - effect: NoExecute + key: node-role.kubernetes.io/infra + value: reserved +# ... +---- + +To configure the `nodeSelector` and `tolerations` fields of the LokiStack (CR), you can use the [command]`oc explain` command to view the description and fields for a particular resource: + +[source,terminal] +---- +$ oc explain lokistack.spec.template +---- + +.Example output +[source,text] +---- +KIND: LokiStack +VERSION: loki.grafana.com/v1 + +RESOURCE: template + +DESCRIPTION: + Template defines the resource/limits/tolerations/nodeselectors per + component + +FIELDS: + compactor + Compactor defines the compaction component spec. + + distributor + Distributor defines the distributor component spec. +... 
+---- + +For more detailed information, you can add a specific field: + +[source,terminal] +---- +$ oc explain lokistack.spec.template.compactor +---- + +.Example output +[source,text] +---- +KIND: LokiStack +VERSION: loki.grafana.com/v1 + +RESOURCE: compactor + +DESCRIPTION: + Compactor defines the compaction component spec. + +FIELDS: + nodeSelector + NodeSelector defines the labels required by a node to schedule the + component onto it. +... +---- diff --git a/modules/log6x-loki-rate-limit-errors.adoc b/modules/log6x-loki-rate-limit-errors.adoc new file mode 100644 index 0000000000..cf23ff4289 --- /dev/null +++ b/modules/log6x-loki-rate-limit-errors.adoc @@ -0,0 +1,84 @@ +// Module is included in the following assemblies: +// * logging/cluster-logging-loki.adoc +// * observability/logging/log_collection_forwarding/log-forwarding.adoc +// * observability/logging/troubleshooting/log-forwarding-troubleshooting.adoc + +:_mod-docs-content-type: PROCEDURE +[id="loki-rate-limit-errors_{context}"] += Troubleshooting Loki rate limit errors + +If the Log Forwarder API forwards a large block of messages that exceeds the rate limit to Loki, Loki generates rate limit (`429`) errors. + +These errors can occur during normal operation. For example, when adding the {logging} to a cluster that already has some logs, rate limit errors might occur while the {logging} tries to ingest all of the existing log entries. In this case, if the rate of addition of new logs is less than the total rate limit, the historical data is eventually ingested, and the rate limit errors are resolved without requiring user intervention. + +In cases where the rate limit errors continue to occur, you can fix the issue by modifying the `LokiStack` custom resource (CR). + +[IMPORTANT] +==== +The `LokiStack` CR is not available on Grafana-hosted Loki. This topic does not apply to Grafana-hosted Loki servers. +==== + +.Conditions + +* The Log Forwarder API is configured to forward logs to Loki. 
+ +* Your system sends a block of messages that is larger than 2 MB to Loki. For example: ++ +[source,text] +---- +"values":[["1630410392689800468","{\"kind\":\"Event\",\"apiVersion\":\ +....... +...... +...... +...... +\"received_at\":\"2021-08-31T11:46:32.800278+00:00\",\"version\":\"1.7.4 1.6.0\"}},\"@timestamp\":\"2021-08-31T11:46:32.799692+00:00\",\"viaq_index_name\":\"audit-write\",\"viaq_msg_id\":\"MzFjYjJkZjItNjY0MC00YWU4LWIwMTEtNGNmM2E5ZmViMGU4\",\"log_type\":\"audit\"}"]]}]} +---- + +* After you enter `oc logs -n openshift-logging -l component=collector`, the collector logs in your cluster show a line containing one of the following error messages: ++ +[source,text] +---- +429 Too Many Requests Ingestion rate limit exceeded +---- ++ +.Example Vector error message +[source,text] +---- +2023-08-25T16:08:49.301780Z WARN sink{component_kind="sink" component_id=default_loki_infra component_type=loki component_name=default_loki_infra}: vector::sinks::util::retries: Retrying after error. error=Server responded with an error: 429 Too Many Requests internal_log_rate_limit=true +---- ++ +.Example Fluentd error message +[source,text] +---- +2023-08-30 14:52:15 +0000 [warn]: [default_loki_infra] failed to flush the buffer. retry_times=2 next_retry_time=2023-08-30 14:52:19 +0000 chunk="604251225bf5378ed1567231a1c03b8b" error_class=Fluent::Plugin::LokiOutput::LogPostError error="429 Too Many Requests Ingestion rate limit exceeded for user infrastructure (limit: 4194304 bytes/sec) while attempting to ingest '4082' lines totaling '7820025' bytes, reduce log volume or contact your Loki administrator to see if the limit can be increased\n" +---- ++ +The error is also visible on the receiving end. 
For example, in the LokiStack ingester pod: ++ +.Example Loki ingester error message +[source,text] +---- +level=warn ts=2023-08-30T14:57:34.155592243Z caller=grpc_logging.go:43 duration=1.434942ms method=/logproto.Pusher/Push err="rpc error: code = Code(429) desc = entry with timestamp 2023-08-30 14:57:32.012778399 +0000 UTC ignored, reason: 'Per stream rate limit exceeded (limit: 3MB/sec) while attempting to ingest for stream +---- + +.Procedure + +* Update the `ingestionBurstSize` and `ingestionRate` fields in the `LokiStack` CR: ++ +[source,yaml] +---- +apiVersion: loki.grafana.com/v1 +kind: LokiStack +metadata: + name: logging-loki + namespace: openshift-logging +spec: + limits: + global: + ingestion: + ingestionBurstSize: 16 # <1> + ingestionRate: 8 # <2> +# ... +---- +<1> The `ingestionBurstSize` field defines the maximum local rate-limited sample size per distributor replica in MB. This value is a hard limit. Set this value to at least the maximum logs size expected in a single push request. Single requests that are larger than the `ingestionBurstSize` value are not permitted. +<2> The `ingestionRate` field is a soft limit on the maximum amount of ingested samples per second in MB. Rate limit errors occur if the rate of logs exceeds the limit, but the collector retries sending the logs. As long as the total average is lower than the limit, the system recovers and errors are resolved without user intervention. diff --git a/modules/log6x-loki-rbac-rules-perms.adoc b/modules/log6x-loki-rbac-rules-perms.adoc new file mode 100644 index 0000000000..4df8445dfe --- /dev/null +++ b/modules/log6x-loki-rbac-rules-perms.adoc @@ -0,0 +1,67 @@ +// Module included in the following assemblies: +// + + +:_mod-docs-content-type: REFERENCE +[id="loki-rbac-rules-permissions_{context}"] += Authorizing LokiStack rules RBAC permissions + +Administrators can allow users to create and manage their own alerting and recording rules by binding cluster roles to usernames. 
+Cluster roles are defined as `ClusterRole` objects that contain necessary role-based access control (RBAC) permissions for users. + +The following cluster roles for alerting and recording rules are available for LokiStack: + +[options="header"] +|=== +|Rule name |Description + +|`alertingrules.loki.grafana.com-v1-admin` +|Users with this role have administrative-level access to manage alerting rules. This cluster role grants permissions to create, read, update, delete, list, and watch `AlertingRule` resources within the `loki.grafana.com/v1` API group. + +|`alertingrules.loki.grafana.com-v1-crdview` +|Users with this role can view the definitions of Custom Resource Definitions (CRDs) related to `AlertingRule` resources within the `loki.grafana.com/v1` API group, but do not have permissions for modifying or managing these resources. + +|`alertingrules.loki.grafana.com-v1-edit` +|Users with this role have permission to create, update, and delete `AlertingRule` resources. + +|`alertingrules.loki.grafana.com-v1-view` +|Users with this role can read `AlertingRule` resources within the `loki.grafana.com/v1` API group. They can inspect configurations, labels, and annotations for existing alerting rules but cannot make any modifications to them. + +|`recordingrules.loki.grafana.com-v1-admin` +|Users with this role have administrative-level access to manage recording rules. This cluster role grants permissions to create, read, update, delete, list, and watch `RecordingRule` resources within the `loki.grafana.com/v1` API group. + +|`recordingrules.loki.grafana.com-v1-crdview` +|Users with this role can view the definitions of Custom Resource Definitions (CRDs) related to `RecordingRule` resources within the `loki.grafana.com/v1` API group, but do not have permissions for modifying or managing these resources. + +|`recordingrules.loki.grafana.com-v1-edit` +|Users with this role have permission to create, update, and delete `RecordingRule` resources. 
+ +|`recordingrules.loki.grafana.com-v1-view` +|Users with this role can read `RecordingRule` resources within the `loki.grafana.com/v1` API group. They can inspect configurations, labels, and annotations for existing recording rules but cannot make any modifications to them. + +|=== + +[id="loki-rbac-rules-permissions-examples_{context}"] +== Examples + +To apply cluster roles for a user, you must bind an existing cluster role to a specific username. + +Cluster roles can be cluster or namespace scoped, depending on which type of role binding you use. +When a `RoleBinding` object is used, as when using the `oc adm policy add-role-to-user` command, the cluster role only applies to the specified namespace. +When a `ClusterRoleBinding` object is used, as when using the `oc adm policy add-cluster-role-to-user` command, the cluster role applies to all namespaces in the cluster. + +The following example command gives the specified user create, read, update, and delete (CRUD) permissions for alerting rules in a specific namespace in the cluster: + +.Example cluster role binding command for alerting rule CRUD permissions in a specific namespace +[source,terminal] +---- +$ oc adm policy add-role-to-user alertingrules.loki.grafana.com-v1-admin -n 
+---- + +The following command gives the specified user administrator permissions for alerting rules in all namespaces: + +.Example cluster role binding command for administrator permissions +[source,terminal] +---- +$ oc adm policy add-cluster-role-to-user alertingrules.loki.grafana.com-v1-admin 
+---- diff --git a/modules/log6x-loki-reliability-hardening.adoc b/modules/log6x-loki-reliability-hardening.adoc new file mode 100644 index 0000000000..05455be143 --- /dev/null +++ b/modules/log6x-loki-reliability-hardening.adoc @@ -0,0 +1,39 @@ +// Module included in the following assemblies: +// +// * logging/cluster-logging-loki.adoc + +:_mod-docs-content-type: CONCEPT +[id="logging-loki-reliability-hardening_{context}"] += Configuring 
Loki to tolerate node failure + +The {loki-op} supports setting pod anti-affinity rules to request that pods of the same component are scheduled on different available nodes in the cluster. + +include::snippets/about-pod-affinity.adoc[] + +The Operator sets default, preferred `podAntiAffinity` rules for all Loki components, which include the `compactor`, `distributor`, `gateway`, `indexGateway`, `ingester`, `querier`, `queryFrontend`, and `ruler` components. + +You can override the preferred `podAntiAffinity` settings for Loki components by configuring required settings in the `requiredDuringSchedulingIgnoredDuringExecution` field: + +.Example user settings for the ingester component +[source,yaml] +---- +apiVersion: loki.grafana.com/v1 +kind: LokiStack +metadata: + name: logging-loki + namespace: openshift-logging +spec: +# ... + template: + ingester: + podAntiAffinity: + # ... + requiredDuringSchedulingIgnoredDuringExecution: <1> + - labelSelector: + matchLabels: <2> + app.kubernetes.io/component: ingester + topologyKey: kubernetes.io/hostname +# ... +---- +<1> The stanza to define a required rule. +<2> The key-value pair (label) that must be matched to apply the rule. diff --git a/modules/log6x-loki-restart-hardening.adoc b/modules/log6x-loki-restart-hardening.adoc new file mode 100644 index 0000000000..4c033b6e69 --- /dev/null +++ b/modules/log6x-loki-restart-hardening.adoc @@ -0,0 +1,9 @@ +// Module included in the following assemblies: +// +// * logging/cluster-logging-loki.adoc + +:_mod-docs-content-type: CONCEPT +[id="logging-loki-restart-hardening_{context}"] += LokiStack behavior during cluster restarts + +When an {product-title} cluster is restarted, LokiStack ingestion and the query path continue to operate within the CPU and memory resources available to the node. This means that there is no downtime for the LokiStack during {product-title} cluster updates. This behavior is achieved by using `PodDisruptionBudget` resources. 
The {loki-op} provisions `PodDisruptionBudget` resources for Loki, which determine the minimum number of pods that must be available per component to ensure normal operations under certain conditions. diff --git a/modules/log6x-loki-retention.adoc b/modules/log6x-loki-retention.adoc new file mode 100644 index 0000000000..c4dadea70f --- /dev/null +++ b/modules/log6x-loki-retention.adoc @@ -0,0 +1,118 @@ +// Module included in the following assemblies: +// + + +:_mod-docs-content-type: PROCEDURE +[id="logging-loki-retention_{context}"] += Enabling stream-based retention with Loki + +You can configure retention policies based on log streams. Rules for these may be set globally, per-tenant, or both. If you configure both, tenant rules apply before global rules. + +include::snippets/logging-retention-period-snip.adoc[] + +[NOTE] +==== +Schema v13 is recommended. +==== + +.Procedure + +. Create a `LokiStack` CR: ++ +** Enable stream-based retention globally as shown in the following example: ++ +.Example global stream-based retention for AWS +[source,yaml] +---- +apiVersion: loki.grafana.com/v1 +kind: LokiStack +metadata: + name: logging-loki + namespace: openshift-logging +spec: + limits: + global: <1> + retention: <2> + days: 20 + streams: + - days: 4 + priority: 1 + selector: '{kubernetes_namespace_name=~"test.+"}' <3> + - days: 1 + priority: 1 + selector: '{log_type="infrastructure"}' + managementState: Managed + replicationFactor: 1 + size: 1x.small + storage: + schemas: + - effectiveDate: "2020-10-11" + version: v13 + secret: + name: logging-loki-s3 + type: aws + storageClassName: gp3-csi + tenants: + mode: openshift-logging +---- +<1> Sets retention policy for all log streams. *Note: This field does not impact the retention period for stored logs in object storage.* +<2> Retention is enabled in the cluster when this block is added to the CR. 
<3> Contains the link:https://grafana.com/docs/loki/latest/logql/query_examples/#query-examples[LogQL query] used to define the log stream. + +** Enable stream-based retention on a per-tenant basis as shown in the following example: ++ +.Example per-tenant stream-based retention for AWS +[source,yaml] +---- +apiVersion: loki.grafana.com/v1 +kind: LokiStack +metadata: + name: logging-loki + namespace: openshift-logging +spec: + limits: + global: + retention: + days: 20 + tenants: <1> + application: + retention: + days: 1 + streams: + - days: 4 + selector: '{kubernetes_namespace_name=~"test.+"}' <2> + infrastructure: + retention: + days: 5 + streams: + - days: 1 + selector: '{kubernetes_namespace_name=~"openshift-cluster.+"}' + managementState: Managed + replicationFactor: 1 + size: 1x.small + storage: + schemas: + - effectiveDate: "2020-10-11" + version: v13 + secret: + name: logging-loki-s3 + type: aws + storageClassName: gp3-csi + tenants: + mode: openshift-logging +---- +<1> Sets retention policy by tenant. Valid tenant types are `application`, `audit`, and `infrastructure`. +<2> Contains the link:https://grafana.com/docs/loki/latest/logql/query_examples/#query-examples[LogQL query] used to define the log stream. + +. Apply the `LokiStack` CR: ++ +[source,terminal] +---- +$ oc apply -f .yaml +---- ++ +[NOTE] +==== +This procedure does not manage the retention of stored logs. The global retention period for stored logs, up to a supported maximum of 30 days, is configured with your object storage. 
+==== diff --git a/modules/log6x-loki-zone-aware-rep.adoc b/modules/log6x-loki-zone-aware-rep.adoc new file mode 100644 index 0000000000..e8984fd427 --- /dev/null +++ b/modules/log6x-loki-zone-aware-rep.adoc @@ -0,0 +1,32 @@ +// Module included in the following assemblies: +// +// * logging/cluster-logging-loki.adoc + +:_mod-docs-content-type: CONCEPT +[id="logging-loki-zone-aware-rep_{context}"] += Zone aware data replication + +The {loki-op} offers support for zone-aware data replication through pod topology spread constraints. Enabling this feature enhances reliability and safeguards against log loss in the event of a single zone failure. When configuring the deployment size as `1x.extra-small`, `1x.small`, or `1x.medium`, the `replication.factor` field is automatically set to 2. + +To ensure proper replication, you need to have at least as many availability zones as the replication factor specifies. While it is possible to have more availability zones than the replication factor, having fewer zones can lead to write failures. Each zone should host an equal number of instances for optimal operation. + +.Example LokiStack CR with zone replication enabled +[source,yaml] +---- +apiVersion: loki.grafana.com/v1 +kind: LokiStack +metadata: + name: logging-loki + namespace: openshift-logging +spec: + replicationFactor: 2 # <1> + replication: + factor: 2 # <2> + zones: + - maxSkew: 1 # <3> + topologyKey: topology.kubernetes.io/zone # <4> +---- +<1> Deprecated field, values entered are overwritten by `replication.factor`. +<2> This value is automatically set when deployment size is selected at setup. +<3> The maximum difference in number of pods between any two topology domains. The default is 1, and you cannot specify a value of 0. +<4> Defines zones in the form of a topology key that corresponds to a node label. 
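+ +To confirm that the cluster can satisfy the replication factor, you can list the zone label on each node. The following check is a sketch: it assumes a cloud deployment where nodes carry the standard `topology.kubernetes.io/zone` label. + +[source,terminal] +---- +$ oc get nodes -L topology.kubernetes.io/zone +---- + +If the `ZONE` column shows fewer distinct zones than the replication factor, writes can fail until additional zones are available. 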
diff --git a/modules/log6x-loki-zone-fail-recovery.adoc b/modules/log6x-loki-zone-fail-recovery.adoc new file mode 100644 index 0000000000..e94fde1c7d --- /dev/null +++ b/modules/log6x-loki-zone-fail-recovery.adoc @@ -0,0 +1,86 @@ +// Module included in the following assemblies: +// +// * logging/cluster-logging-loki.adoc + +:_mod-docs-content-type: PROCEDURE +[id="logging-loki-zone-fail-recovery_{context}"] += Recovering Loki pods from failed zones + +In {product-title}, a zone failure happens when specific availability zone resources become inaccessible. Availability zones are isolated areas within a cloud provider's data center, aimed at enhancing redundancy and fault tolerance. If your {product-title} cluster is not configured to handle this, a zone failure can lead to service or data loss. + +Loki pods are part of a link:https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/[StatefulSet], and they come with Persistent Volume Claims (PVCs) provisioned by a `StorageClass` object. Each Loki pod and its PVCs reside in the same zone. When a zone failure occurs in a cluster, the StatefulSet controller automatically attempts to recover the affected pods in the failed zone. + +[WARNING] +==== +The following procedure deletes the PVCs in the failed zone, and all data contained therein. To avoid complete data loss, always set the replication factor field of the `LokiStack` CR to a value greater than 1 to ensure that Loki replicates data. +==== + +.Prerequisites +* Your `LokiStack` CR has a replication factor greater than 1. +* The control plane has detected a zone failure, and your cloud provider integration has marked the nodes in the failed zone. + +The StatefulSet controller automatically attempts to reschedule pods in a failed zone. Because the associated PVCs are also in the failed zone, automatic rescheduling to a different zone does not work.
You must manually delete the PVCs in the failed zone to allow successful re-creation of the stateful Loki pod and its provisioned PVC in the new zone. + + +.Procedure +. List the pods in `Pending` status by running the following command: ++ +[source,terminal] +---- +$ oc get pods --field-selector status.phase==Pending -n openshift-logging +---- ++ +.Example `oc get pods` output +[source,terminal] +---- +NAME READY STATUS RESTARTS AGE # <1> +logging-loki-index-gateway-1 0/1 Pending 0 17m +logging-loki-ingester-1 0/1 Pending 0 16m +logging-loki-ruler-1 0/1 Pending 0 16m +---- +<1> These pods are in `Pending` status because their corresponding PVCs are in the failed zone. + +. List the PVCs in `Pending` status by running the following command: ++ +[source,terminal] +---- +$ oc get pvc -o=json -n openshift-logging | jq '.items[] | select(.status.phase == "Pending") | .metadata.name' -r +---- ++ +.Example `oc get pvc` output +[source,terminal] +---- +storage-logging-loki-index-gateway-1 +storage-logging-loki-ingester-1 +wal-logging-loki-ingester-1 +storage-logging-loki-ruler-1 +wal-logging-loki-ruler-1 +---- + +. Delete the PVC(s) for a pod by running the following command: ++ +[source,terminal] +---- +$ oc delete pvc <pvc_name> -n openshift-logging +---- ++ +. Delete the pod(s) by running the following command: ++ +[source,terminal] +---- +$ oc delete pod <pod_name> -n openshift-logging +---- ++ +Once these objects have been successfully deleted, they are automatically rescheduled in an available zone. + +[id="logging-loki-zone-fail-term-state_{context}"] +== Troubleshooting PVC in a terminating state + +The PVCs might hang in the terminating state without being deleted if the PVC metadata finalizers are set to `kubernetes.io/pv-protection`. Removing the finalizers should allow the PVCs to delete successfully. + +* Remove the finalizer for each PVC by running the following command, and then retry deletion.
++ +[source,terminal] +---- +$ oc patch pvc <pvc_name> -p '{"metadata":{"finalizers":null}}' -n openshift-logging +---- diff --git a/modules/log6x-multiline-except.adoc b/modules/log6x-multiline-except.adoc new file mode 100644 index 0000000000..634d6e7e66 --- /dev/null +++ b/modules/log6x-multiline-except.adoc @@ -0,0 +1,64 @@ +// Module included in the following assemblies: +// +// * observability/logging/logging-6.0/log6x-clf.adoc + +:_mod-docs-content-type: PROCEDURE +[id="log6x-multiline-except_{context}"] += Enabling multi-line exception detection + +Enables multi-line error detection of container logs. + +[WARNING] +==== +Enabling this feature could have performance implications and may require additional computing resources or alternate logging solutions. +==== + +Log parsers often incorrectly identify separate lines of the same exception as separate exceptions. This leads to extra log entries and an incomplete or inaccurate view of the traced information. + +.Example Java exception +[source,java] +---- +java.lang.NullPointerException: Cannot invoke "String.toString()" because "<local1>" is null + at testjava.Main.handle(Main.java:47) + at testjava.Main.printMe(Main.java:19) + at testjava.Main.main(Main.java:10) +---- + +* To enable logging to detect multi-line exceptions and reassemble them into a single log entry, ensure that the `ClusterLogForwarder` custom resource (CR) contains a filter of type `detectMultilineException` under `.spec.filters`. + +.Example ClusterLogForwarder CR +[source,yaml] +---- +apiVersion: "observability.openshift.io/v1" +kind: ClusterLogForwarder +metadata: + name: <log_forwarder_name> + namespace: <log_forwarder_namespace> +spec: + serviceAccount: + name: <service_account_name> + filters: + - name: <filter_name> + type: detectMultilineException + pipelines: + - inputRefs: + - <input_name> + name: <pipeline_name> + filterRefs: + - <filter_name> + outputRefs: + - <output_name> +---- + +== Details +When log messages appear as a consecutive sequence forming an exception stack trace, they are combined into a single, unified log record.
The first log message's content is replaced with the concatenated content of all the message fields in the sequence. + +The collector supports the following languages: + +* Java +* JS +* Ruby +* Python +* Golang +* PHP +* Dart diff --git a/modules/log6x-oc-explain.adoc b/modules/log6x-oc-explain.adoc new file mode 100644 index 0000000000..963ae57eb9 --- /dev/null +++ b/modules/log6x-oc-explain.adoc @@ -0,0 +1,75 @@ +// Module included in the following assemblies: +// +:_mod-docs-content-type: CONCEPT +[id="log6x-oc-explain_{context}"] + += Using the `oc explain` command + +The `oc explain` command is an essential tool in the OpenShift CLI `oc` that provides detailed descriptions of the fields within Custom Resources (CRs). This command is invaluable for administrators and developers who are configuring or troubleshooting resources in an OpenShift cluster. + +== Resource Descriptions +`oc explain` offers in-depth explanations of all fields associated with a specific object. This includes standard resources like pods and services, as well as more complex entities like statefulsets and custom resources defined by Operators. + +To view the documentation for the `outputs` field of the `ClusterLogForwarder` custom resource, you can use: + +[source,terminal] +---- +$ oc explain clusterlogforwarders.observability.openshift.io.spec.outputs +---- + +[NOTE] +==== +In place of `clusterlogforwarder` the short form `obsclf` can be used. +==== + +This will display detailed information about these fields, including their types, default values, and any associated sub-fields. + +== Hierarchical Structure +The command displays the structure of resource fields in a hierarchical format, clarifying the relationships between different configuration options. 
+ +For instance, here's how you can drill down into the `storage` configuration for a `LokiStack` custom resource: + +[source,terminal] +---- +$ oc explain lokistacks.loki.grafana.com +$ oc explain lokistacks.loki.grafana.com.spec +$ oc explain lokistacks.loki.grafana.com.spec.storage +$ oc explain lokistacks.loki.grafana.com.spec.storage.schemas +---- + +Each command reveals a deeper level of the resource specification, making the structure clear. + +== Type Information +`oc explain` also indicates the type of each field (such as string, integer, or boolean), allowing you to verify that resource definitions use the correct data types. + +For example: + +[source,terminal] +---- +$ oc explain lokistacks.loki.grafana.com.spec.size +---- + +This will show that `size` should be defined using an integer value. + +== Default Values +When applicable, the command shows the default values for fields, providing insights into what values will be used if none are explicitly specified. + +Again using `lokistacks.loki.grafana.com` as an example: + +[source,terminal] +---- +$ oc explain lokistacks.spec.template.distributor.replicas +---- + +.Example output +[source,terminal] +---- +GROUP: loki.grafana.com +KIND: LokiStack +VERSION: v1 + +FIELD: replicas + +DESCRIPTION: + Replicas defines the number of replica pods of the component. 
+---- diff --git a/modules/log6x-release-notes-6-0-0.adoc b/modules/log6x-release-notes-6-0-0.adoc new file mode 100644 index 0000000000..c3463652d1 --- /dev/null +++ b/modules/log6x-release-notes-6-0-0.adoc @@ -0,0 +1,76 @@ +// module included in log6x-release-notes.adoc +:_mod-docs-content-type: REFERENCE +[id="log6x-release-notes-6-0-0_{context}"] += Logging 6.0.0 + +This release includes link:https://access.redhat.com/errata/RHBA-2024:6693[{logging-uc} {for} Bug Fix Release 6.0.0]. + +include::snippets/logging-compatibility-snip.adoc[] + +.Upstream component versions +[options="header"] +|=== + +| {logging} Version 6+| Component Version + +| Operator | `eventrouter` | `logfilemetricexporter` | `loki` | `lokistack-gateway` | `opa-openshift` | `vector` + +|6.0 | 0.4 | 1.1 | 3.1.0 | 0.1 | 0.1 | 0.37.1 + +|=== + +[id="log6x-release-notes-6-0-0-removal-notice"] +== Removal notice + +* With this release, {logging} no longer supports the `ClusterLogging.logging.openshift.io` and `ClusterLogForwarder.logging.openshift.io` custom resources. Refer to the product documentation for details on the replacement features. (link:https://issues.redhat.com/browse/LOG-5803[LOG-5803]) + +* With this release, {logging} no longer manages or deploys log storage (such as Elasticsearch), visualization (such as Kibana), or Fluentd-based log collectors. (link:https://issues.redhat.com/browse/LOG-5368[LOG-5368]) + +[NOTE] +==== +To continue using Elasticsearch and Kibana managed by the elasticsearch-operator, the administrator must modify those objects' `ownerRefs` before deleting the `ClusterLogging` resource. +==== + +[id="log6x-release-notes-6-0-0-enhancements"] +== New features and enhancements + +=== Log Collection + +* This feature introduces a new architecture for {logging} {for} by shifting component responsibilities to their relevant Operators, such as for storage, visualization, and collection.
It introduces the `ClusterLogForwarder.observability.openshift.io` API for log collection and forwarding. Support for the `ClusterLogging.logging.openshift.io` and `ClusterLogForwarder.logging.openshift.io` APIs, along with the Red Hat managed Elastic stack (Elasticsearch and Kibana), is removed. Users are encouraged to migrate to the Red Hat `LokiStack` for log storage. Existing managed Elasticsearch deployments can be used for a limited time. Automated migration for log collection is not provided, so administrators need to create a new ClusterLogForwarder.observability.openshift.io specification to replace their previous custom resources. Refer to the official product documentation for more details. (link:https://issues.redhat.com/browse/LOG-3493[LOG-3493]) + +* This enhancement sets default requests and limits for Vector collector deployments' memory and CPU usage based on Vector documentation recommendations. (link:https://issues.redhat.com/browse/LOG-4745[LOG-4745]) + +* This enhancement updates Vector to align with the upstream version v0.37.1. (link:https://issues.redhat.com/browse/LOG-5296[LOG-5296]) + +* This enhancement introduces an alert that triggers when log collectors buffer logs to a node's file system and use over 15% of the available space, indicating potential back pressure issues. (link:https://issues.redhat.com/browse/LOG-5381[LOG-5381]) + +* This enhancement updates the selectors for all components to use common Kubernetes labels. (link:https://issues.redhat.com/browse/LOG-5906[LOG-5906]) + +* This enhancement changes the collector configuration to deploy as a ConfigMap instead of a secret, allowing users to view and edit the configuration when the ClusterLogForwarder is set to Unmanaged. (link:https://issues.redhat.com/browse/LOG-5599[LOG-5599]) + +* This enhancement adds the ability to configure the Vector collector log level using an annotation on the ClusterLogForwarder, with options including trace, debug, info, warn, error, or off. 
(link:https://issues.redhat.com/browse/LOG-5372[LOG-5372]) + +* This enhancement adds validation to reject configurations where Amazon CloudWatch outputs use multiple AWS roles, preventing incorrect log routing. (link:https://issues.redhat.com/browse/LOG-5640[LOG-5640]) + +* This enhancement removes the Log Bytes Collected and Log Bytes Sent graphs from the metrics dashboard. (link:https://issues.redhat.com/browse/LOG-5964[LOG-5964]) + +* This enhancement updates the must-gather functionality to only capture information for inspecting Logging 6.0 components, including Vector deployments from ClusterLogForwarder.observability.openshift.io resources and the Red Hat managed LokiStack. (link:https://issues.redhat.com/browse/LOG-5949[LOG-5949]) + + +=== Log Storage + +* With this release, the responsibility for deploying the {logging} view plugin shifts from the {clo} to the {coo-first}. For new log storage installations that need visualization, the {coo-full} and the associated UIPlugin resource must be deployed. Refer to the xref:../../observability/cluster_observability_operator/cluster-observability-operator-overview.adoc[Cluster Observability Operator Overview] product documentation for more details. (link:https://issues.redhat.com/browse/LOG-5461[LOG-5461]) + +* This enhancement improves Azure storage secret validation by providing early warnings for specific error conditions. (link:https://issues.redhat.com/browse/LOG-4571[LOG-4571]) + +[id="log6x-release-notes-6-0-0-technology-preview-features"] +== Technology Preview features + +* This release introduces a Technology Preview feature for log forwarding using OpenTelemetry. A new output type, `OTLP`, allows sending JSON-encoded log records using the OpenTelemetry data model and resource semantic conventions.
(link:https://issues.redhat.com/browse/LOG-4225[LOG-4225]) + +[id="log6x-release-notes-6-0-0-bug-fixes"] +== Bug fixes + +* Before this update, the `CollectorHighErrorRate` and `CollectorVeryHighErrorRate` alerts were still present. With this update, both alerts are removed in the {logging} 6.0 release but might return in a future release. (link:https://issues.redhat.com/browse/LOG-3432[LOG-3432]) + +[id="log6x-release-notes-6-0-0-CVEs"] +== CVEs + +* link:https://access.redhat.com/security/cve/CVE-2024-34397[CVE-2024-34397] diff --git a/observability/logging/logging-6.0/_attributes b/observability/logging/logging-6.0/_attributes new file mode 120000 index 0000000000..bf7c2529fd --- /dev/null +++ b/observability/logging/logging-6.0/_attributes @@ -0,0 +1 @@ +../../../_attributes/ \ No newline at end of file diff --git a/observability/logging/logging-6.0/images b/observability/logging/logging-6.0/images new file mode 120000 index 0000000000..4399cbb3c0 --- /dev/null +++ b/observability/logging/logging-6.0/images @@ -0,0 +1 @@ +../../../images/ \ No newline at end of file diff --git a/observability/logging/logging-6.0/log6x-about.adoc b/observability/logging/logging-6.0/log6x-about.adoc new file mode 100644 index 0000000000..ff86ba966f --- /dev/null +++ b/observability/logging/logging-6.0/log6x-about.adoc @@ -0,0 +1,167 @@ +:_mod-docs-content-type: ASSEMBLY +include::_attributes/common-attributes.adoc[] +[id="log6x-about"] += Logging 6.0 +:context: logging-6x + +toc::[] + +The `ClusterLogForwarder` custom resource (CR) is the central configuration point for log collection and forwarding. + +== Inputs and Outputs + +Inputs specify the sources of logs to be forwarded. Logging provides built-in input types: `application`, `infrastructure`, and `audit`, which select logs from different parts of your cluster. You can also define custom inputs based on namespaces or pod labels to fine-tune log selection. + +Outputs define the destinations where logs are sent. 
Each output type has its own set of configuration options, allowing you to customize the behavior and authentication settings. + + +== Receiver Input Type +The receiver input type enables the Logging system to accept logs from external sources. It supports two formats for receiving logs: `http` and `syslog`. + +The `ReceiverSpec` defines the configuration for a receiver input. + +== Pipelines and Filters + +Pipelines determine the flow of logs from inputs to outputs. A pipeline consists of one or more input refs, output refs, and optional filter refs. Filters can be used to transform or drop log messages within a pipeline. The order of filters matters, as they are applied sequentially, and earlier filters can prevent log messages from reaching later stages. + +== Operator Behavior + +The Cluster Logging Operator manages the deployment and configuration of the collector based on the `managementState` field: + +- When set to `Managed` (default), the operator actively manages the logging resources to match the configuration defined in the spec. +- When set to `Unmanaged`, the operator does not take any action, allowing you to manually manage the logging components. + +== Validation +Logging includes extensive validation rules and default values to ensure a smooth and error-free configuration experience. The `ClusterLogForwarder` resource enforces validation checks on required fields, dependencies between fields, and the format of input values. Default values are provided for certain fields, reducing the need for explicit configuration in common scenarios. + +=== Quick Start + +.Prerequisites +* Cluster administrator permissions + +.Procedure + +. Install the `OpenShift Logging` and `Loki` Operators from OperatorHub. + +. 
Create a `LokiStack` custom resource (CR) in the `openshift-logging` namespace: ++ +[source,yaml] +---- +apiVersion: loki.grafana.com/v1 +kind: LokiStack +metadata: + name: logging-loki + namespace: openshift-logging +spec: + managementState: Managed + size: 1x.extra-small + storage: + schemas: + - effectiveDate: '2022-06-01' + version: v13 + secret: + name: logging-loki-s3 + type: s3 + storageClassName: gp3-csi + tenants: + mode: openshift-logging +---- + +. Create a service account for the collector: ++ +[source,shell] +---- +$ oc create sa collector -n openshift-logging +---- + +. Create a `ClusterRole` for the collector: ++ +[source,yaml] +---- +apiVersion: rbac.authorization.k8s.io/v1 +kind: ClusterRole +metadata: + name: logging-collector-logs-writer +rules: +- apiGroups: + - loki.grafana.com + resourceNames: + - logs + resources: + - application + - audit + - infrastructure + verbs: + - create +---- + +. Bind the `ClusterRole` to the service account: ++ +[source,shell] +---- +$ oc adm policy add-cluster-role-to-user logging-collector-logs-writer -z collector +---- + +. Install the Cluster Observability Operator. + +. Create a `UIPlugin` to enable the Log section in the Observe tab: ++ +[source,yaml] +---- +apiVersion: observability.openshift.io/v1alpha1 +kind: UIPlugin +metadata: + name: logging +spec: + type: Logging + logging: + lokiStack: + name: logging-loki +---- + +. Add additional roles to the collector service account: ++ +[source,shell] +---- +$ oc project openshift-logging +$ oc adm policy add-cluster-role-to-user collect-application-logs -z collector +$ oc adm policy add-cluster-role-to-user collect-audit-logs -z collector +$ oc adm policy add-cluster-role-to-user collect-infrastructure-logs -z collector +---- + +. 
Create a `ClusterLogForwarder` CR to configure log forwarding: ++ +[source,yaml] +---- +apiVersion: observability.openshift.io/v1 +kind: ClusterLogForwarder +metadata: + name: collector + namespace: openshift-logging +spec: + serviceAccount: + name: collector + outputs: + - name: default-lokistack + type: lokiStack + lokiStack: + target: + name: logging-loki + namespace: openshift-logging + authentication: + token: + from: serviceAccount + tls: + ca: + key: service-ca.crt + configMapName: openshift-service-ca.crt + pipelines: + - name: default-logstore + inputRefs: + - application + - infrastructure + outputRefs: + - default-lokistack +---- + +. Verify that logs are visible in the Log section of the Observe tab in the OpenShift web console. diff --git a/observability/logging/logging-6.0/log6x-clf.adoc b/observability/logging/logging-6.0/log6x-clf.adoc new file mode 100644 index 0000000000..5df89bc898 --- /dev/null +++ b/observability/logging/logging-6.0/log6x-clf.adoc @@ -0,0 +1,113 @@ +:_mod-docs-content-type: ASSEMBLY +include::_attributes/common-attributes.adoc[] +[id="log6x-clf"] += Configuring log forwarding +:context: logging-6x + +toc::[] + +The `ClusterLogForwarder` (CLF) allows users to configure forwarding of logs to various destinations. It provides a flexible way to select log messages from different sources, send them through a pipeline that can transform or filter them, and forward them to one or more outputs. + +.Key Functions of the ClusterLogForwarder +* Selects log messages using inputs +* Forwards logs to external destinations using outputs +* Filters, transforms, and drops log messages using filters +* Defines log forwarding pipelines connecting inputs, filters and outputs + +// need to verify if this is relevant still. 
+//include::modules/log6x-config-roles.adoc[leveloffset=+1] + +include::modules/log6x-collection-setup.adoc[leveloffset=+1] + +// OBSDOCS-1104 +== Modifying log level in collector + +To modify the log level in the collector, you can set the `observability.openshift.io/log-level` annotation to `trace`, `debug`, `info`, `warn`, `error`, or `off`. + +.Example log level annotation +[source,yaml] +---- +apiVersion: observability.openshift.io/v1 +kind: ClusterLogForwarder +metadata: + name: collector + annotations: + observability.openshift.io/log-level: debug +# ... +---- + +== Managing the Operator + +The `ClusterLogForwarder` resource has a `managementState` field that controls whether the operator actively manages its resources or leaves them unmanaged: + +Managed:: (default) The operator drives the logging resources to match the desired state in the CLF spec. + +Unmanaged:: The operator does not take any action related to the logging components. + +This allows administrators to temporarily pause log forwarding by setting `managementState` to `Unmanaged`. + +== Structure of the ClusterLogForwarder + +The CLF has a `spec` section that contains the following key components: + +Inputs:: Select log messages to be forwarded. Built-in input types `application`, `infrastructure`, and `audit` forward logs from different parts of the cluster. You can also define custom inputs. + +Outputs:: Define destinations to forward logs to. Each output has a unique name and type-specific configuration. + +Pipelines:: Define the path logs take from inputs, through filters, to outputs. Pipelines have a unique name and consist of a list of input, output, and filter names. + +Filters:: Transform or drop log messages in the pipeline. Users can define filters that match certain log fields and drop or modify the messages. Filters are applied in the order specified in the pipeline. + +=== Inputs + +Inputs are configured in an array under `spec.inputs`.
There are three built-in input types: + +application:: Selects logs from all application containers, excluding those in infrastructure namespaces such as `default`, `openshift`, or any namespace with the `kube-` or `openshift-` prefix. + +infrastructure:: Selects logs from infrastructure components running in the `default` and `openshift` namespaces, as well as node logs. + +audit:: Selects logs from the OpenShift API server audit logs, Kubernetes API server audit logs, OVN audit logs, and node audit logs from auditd. + +Users can define custom inputs of type `application` that select logs from specific namespaces or by using pod labels. + +=== Outputs + +Outputs are configured in an array under `spec.outputs`. Each output must have a unique name and a type. Supported types are: + +azureMonitor:: Forwards logs to Azure Monitor. +cloudwatch:: Forwards logs to AWS CloudWatch. +elasticsearch:: Forwards logs to an external Elasticsearch instance. +googleCloudLogging:: Forwards logs to Google Cloud Logging. +http:: Forwards logs to a generic HTTP endpoint. +kafka:: Forwards logs to a Kafka broker. +loki:: Forwards logs to a Loki logging backend. +lokistack:: Forwards logs to the logging-supported combination of Loki and web proxy with {product-title} authentication integration. The LokiStack proxy uses {product-title} authentication to enforce multi-tenancy. +otlp:: Forwards logs using the OpenTelemetry Protocol. +splunk:: Forwards logs to Splunk. +syslog:: Forwards logs to an external syslog server. + +Each output type has its own configuration fields. + +=== Pipelines + +Pipelines are configured in an array under `spec.pipelines`. Each pipeline must have a unique name and consists of: + +inputRefs:: Names of inputs whose logs should be forwarded to this pipeline. +outputRefs:: Names of outputs to send logs to. +filterRefs:: (optional) Names of filters to apply. + +The order of filterRefs matters, as they are applied sequentially.
Earlier filters can drop messages that will not be processed by later filters. + +=== Filters + +Filters are configured in an array under `spec.filters`. They can match incoming log messages based on the value of structured fields and modify or drop them. + +Administrators can configure the following types of filters: + +include::modules/log6x-multiline-except.adoc[leveloffset=+2] +include::modules/log6x-content-filter-drop-records.adoc[leveloffset=+2] +include::modules/log6x-audit-log-filtering.adoc[leveloffset=+2] +include::modules/log6x-input-spec-filter-labels-expressions.adoc[leveloffset=+2] +include::modules/log6x-content-filter-prune-records.adoc[leveloffset=+2] +include::modules/log6x-input-spec-filter-audit-infrastructure.adoc[leveloffset=+1] +include::modules/log6x-input-spec-filter-namespace-container.adoc[leveloffset=+1] diff --git a/observability/logging/logging-6.0/log6x-loki.adoc b/observability/logging/logging-6.0/log6x-loki.adoc new file mode 100644 index 0000000000..1c45c1ff02 --- /dev/null +++ b/observability/logging/logging-6.0/log6x-loki.adoc @@ -0,0 +1,39 @@ +:_mod-docs-content-type: ASSEMBLY +[id="log6x-loki"] += Storing logs with LokiStack +include::_attributes/common-attributes.adoc[] +:context: logging-6x + +toc::[] + +You can configure a `LokiStack` CR to store application, audit, and infrastructure-related logs. + +[id="prerequisites_{context}"] +== Prerequisites + +* You have installed the {loki-op} by using the CLI or web console. +* You have a `serviceAccount` in the same namespace in which you create the `ClusterLogForwarder`. +* The `serviceAccount` is assigned `collect-audit-logs`, `collect-application-logs`, and `collect-infrastructure-logs` cluster roles. 
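+
+For example, the cluster roles listed in the prerequisites can be assigned to a service account named `collector` (the name is illustrative) with the following commands:
+
+[source,terminal]
+----
+$ oc adm policy add-cluster-role-to-user collect-application-logs -z collector
+$ oc adm policy add-cluster-role-to-user collect-audit-logs -z collector
+$ oc adm policy add-cluster-role-to-user collect-infrastructure-logs -z collector
+----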
+ +=== Core Setup and Configuration +*Role-based access controls, basic monitoring, and pod placement to deploy Loki.* + +include::modules/log6x-loki-rbac-rules-perms.adoc[leveloffset=+1] +include::modules/log6x-enabling-loki-alerts.adoc[leveloffset=+1] +include::modules/log6x-loki-memberlist-ip.adoc[leveloffset=+1] +include::modules/log6x-loki-retention.adoc[leveloffset=+1] +include::modules/log6x-loki-pod-placement.adoc[leveloffset=+1] + +=== Enhanced Reliability and Performance +*Configurations to ensure Loki’s reliability and efficiency in production.* + +include::modules/log6x-identity-federation.adoc[leveloffset=+1] +include::modules/log6x-loki-reliability-hardening.adoc[leveloffset=+1] +include::modules/log6x-loki-restart-hardening.adoc[leveloffset=+1] + +=== Advanced Deployment and Scalability +*Specialized configurations for high availability, scalability, and error handling.* + +include::modules/log6x-loki-zone-aware-rep.adoc[leveloffset=+1] +include::modules/log6x-loki-zone-fail-recovery.adoc[leveloffset=+1] +include::modules/log6x-loki-rate-limit-errors.adoc[leveloffset=+1] diff --git a/observability/logging/logging-6.0/log6x-meta-contributing.adoc b/observability/logging/logging-6.0/log6x-meta-contributing.adoc new file mode 100644 index 0000000000..6fbc977809 --- /dev/null +++ b/observability/logging/logging-6.0/log6x-meta-contributing.adoc @@ -0,0 +1,46 @@ +:_mod-docs-content-type: ASSEMBLY +include::_attributes/common-attributes.adoc[] +[id="log6x-meta-contributing"] += Contributing to logging documentation +:context: logging-6x + +[IMPORTANT] +==== +Do not include this file in the topic map. This is a guide meant for contributors, and is not intended to be published. +==== + +Logging consists of the Red Hat OpenShift Logging Operator (also known as the Cluster Logging Operator) and an accompanying log store Operator: either the Loki Operator (current/future) or the OpenShift Elasticsearch Operator (deprecated).
Either vector (current/future) or fluentd (deprecated) handles log collection and aggregation. Operators use custom resources (CR) to manage applications and their components. High-level configuration and settings are provided by the user within a CR. The Operator translates high-level directives into low-level actions, based on best practices embedded within the Operator’s logic. A custom resource definition (CRD) defines a CR and lists all the configurations available to users of the Operator. Installing an Operator creates the CRDs, which are then used to generate CRs. + +== Operator CRs: +* `Red Hat OpenShift Logging Operator` +** (Deprecated) `ClusterLogging` (CL) - Deploys the collector and forwarder, which are currently both implemented by a daemonset running on each node. +** `ClusterLogForwarder` (CLF) - Generates collector configuration to forward logs per user configuration. +* `Loki Operator`: +** `LokiStack` - Controls the Loki cluster as log store and the web proxy with OpenShift Container Platform authentication integration to enforce multi-tenancy. +** `AlertingRule` - Alerting rules allow you to define alert conditions based on LogQL expressions. +** `RecordingRule` - Recording rules allow you to precompute frequently needed or computationally expensive expressions and save their result as a new set of time series. +** `RulerConfig` - The ruler API endpoints require a backend object storage to be configured to store the recording rules and alerts. +* (Deprecated) `OpenShift Elasticsearch Operator` [Note: These CRs are generated and managed by the `ClusterLogging` Operator, and manual changes cannot be made without being overwritten by the Operator.] +** `ElasticSearch` - Configures and deploys an Elasticsearch instance as the default log store. +** `Kibana` - Configures and deploys a Kibana instance to search, query, and view logs. + +== Underlying configuration(s): +* 5.0 - 5.4 +** Elasticsearch/Fluentd +* 5.5 - 5.9: [Note: features vary by version.]
+** Elasticsearch/Fluentd +** Elasticsearch/Vector +** Loki/Fluentd +** Loki/Vector +* 6.0 +** Loki/Vector + +== Naming Conventions: +[May not be inclusive of all relevant modules.] +* 5.0 - 5.4 +** cluster-logging- +* 5.5 - 5.9 +** logging- +** loki-logging- +* 6.0 +** log6x- diff --git a/observability/logging/logging-6.0/log6x-release-notes.adoc b/observability/logging/logging-6.0/log6x-release-notes.adoc new file mode 100644 index 0000000000..dc48a44d4f --- /dev/null +++ b/observability/logging/logging-6.0/log6x-release-notes.adoc @@ -0,0 +1,77 @@ +:_mod-docs-content-type: ASSEMBLY +include::_attributes/common-attributes.adoc[] +[id="log6x-release-notes"] += Logging 6.0.0 +:context: logging-6x + +toc::[] + +This release includes link:https://access.redhat.com/errata/RHBA-2024:6693[{logging-uc} {for} Bug Fix Release 6.0.0]. + +include::snippets/logging-compatibility-snip.adoc[] + +.Upstream component versions +[options="header"] +|=== + +| {logging} Version 6+| Component Version + +| Operator | `eventrouter` | `logfilemetricexporter` | `loki` | `lokistack-gateway` | `opa-openshift` | `vector` + +|6.0 | 0.4 | 1.1 | 3.1.0 | 0.1 | 0.1 | 0.37.1 + +|=== + +[id="log6x-release-notes-6-0-0-removal-notice"] +== Removal notice + +* With this release, {logging} no longer supports the `ClusterLogging.logging.openshift.io` and `ClusterLogForwarder.logging.openshift.io` custom resources. Refer to the product documentation for details on the replacement features. (link:https://issues.redhat.com/browse/LOG-5803[LOG-5803]) + +* With this release, {logging} no longer manages or deploys log storage (such as Elasticsearch), visualization (such as Kibana), or Fluentd-based log collectors. (link:https://issues.redhat.com/browse/LOG-5368[LOG-5368]) + +[NOTE] +==== +To continue using Elasticsearch and Kibana managed by the elasticsearch-operator, the administrator must modify those objects' `ownerRefs` before deleting the `ClusterLogging` resource.
+====
+
+[id="log6x-release-notes-6-0-0-enhancements"]
+== New features and enhancements
+
+* This feature introduces a new architecture for {logging} {for} by shifting responsibility for components such as storage, visualization, and collection to their relevant Operators. It introduces the `ClusterLogForwarder.observability.openshift.io` API for log collection and forwarding. Support for the `ClusterLogging.logging.openshift.io` and `ClusterLogForwarder.logging.openshift.io` APIs, along with the Red Hat managed Elastic stack (Elasticsearch and Kibana), is removed. Users are encouraged to migrate to the Red Hat `LokiStack` for log storage. Existing managed Elasticsearch deployments can be used for a limited time. Automated migration for log collection is not provided, so administrators must create a new `ClusterLogForwarder.observability.openshift.io` specification to replace their previous custom resources. Refer to the official product documentation for more details. (link:https://issues.redhat.com/browse/LOG-3493[LOG-3493])
+
+* With this release, the responsibility for deploying the {logging} view plugin shifts from the {clo} to the {coo-first}. For new log storage installations that need visualization, the {coo-full} and the associated `UIPlugin` resource must be deployed. Refer to the xref:../../../observability/cluster_observability_operator/cluster-observability-operator-overview.adoc#cluster-observability-operator-overview[Cluster Observability Operator Overview] product documentation for more details. (link:https://issues.redhat.com/browse/LOG-5461[LOG-5461])
+
+* This enhancement sets default requests and limits for Vector collector deployments' memory and CPU usage, based on Vector documentation recommendations. (link:https://issues.redhat.com/browse/LOG-4745[LOG-4745])
+
+* This enhancement updates Vector to align with the upstream version v0.37.1.
(link:https://issues.redhat.com/browse/LOG-5296[LOG-5296])
+
+* This enhancement introduces an alert that triggers when log collectors buffer logs to a node's file system and use over 15% of the available space, indicating potential back pressure issues. (link:https://issues.redhat.com/browse/LOG-5381[LOG-5381])
+
+* This enhancement updates the selectors for all components to use common Kubernetes labels. (link:https://issues.redhat.com/browse/LOG-5906[LOG-5906])
+
+* This enhancement changes the collector configuration to deploy as a config map instead of a secret, allowing users to view and edit the configuration when the `ClusterLogForwarder` is set to `Unmanaged`. (link:https://issues.redhat.com/browse/LOG-5599[LOG-5599])
+
+* This enhancement adds the ability to configure the Vector collector log level by using an annotation on the `ClusterLogForwarder`, with options including `trace`, `debug`, `info`, `warn`, `error`, or `off`. (link:https://issues.redhat.com/browse/LOG-5372[LOG-5372])
+
+* This enhancement adds validation to reject configurations where Amazon CloudWatch outputs use multiple AWS roles, preventing incorrect log routing. (link:https://issues.redhat.com/browse/LOG-5640[LOG-5640])
+
+* This enhancement removes the Log Bytes Collected and Log Bytes Sent graphs from the metrics dashboard. (link:https://issues.redhat.com/browse/LOG-5964[LOG-5964])
+
+* This enhancement updates the must-gather functionality to only capture information for inspecting Logging 6.0 components, including Vector deployments from `ClusterLogForwarder.observability.openshift.io` resources and the Red Hat managed LokiStack. (link:https://issues.redhat.com/browse/LOG-5949[LOG-5949])
+
+* This enhancement improves Azure storage secret validation by providing early warnings for specific error conditions.
(link:https://issues.redhat.com/browse/LOG-4571[LOG-4571])
+
+[id="log6x-release-notes-6-0-0-technology-preview-features"]
+== Technology Preview features
+
+* This release introduces a Technology Preview feature for log forwarding by using OpenTelemetry. A new output type, `OTLP`, allows sending JSON-encoded log records by using the OpenTelemetry data model and resource semantic conventions. (link:https://issues.redhat.com/browse/LOG-4225[LOG-4225])
+
+[id="log6x-release-notes-6-0-0-bug-fixes"]
+== Bug fixes
+
+* Before this update, the `CollectorHighErrorRate` and `CollectorVeryHighErrorRate` alerts were still present. With this update, both alerts are removed in the {logging} 6.0 release but might return in a future release. (link:https://issues.redhat.com/browse/LOG-3432[LOG-3432])
+
+[id="log6x-release-notes-6-0-0-CVEs"]
+== CVEs
+
+* link:https://access.redhat.com/security/cve/CVE-2024-34397[CVE-2024-34397]
diff --git a/observability/logging/logging-6.0/log6x-upgrading-to-6.adoc b/observability/logging/logging-6.0/log6x-upgrading-to-6.adoc
new file mode 100644
index 0000000000..8d9505d995
--- /dev/null
+++ b/observability/logging/logging-6.0/log6x-upgrading-to-6.adoc
@@ -0,0 +1,434 @@
+:_mod-docs-content-type: ASSEMBLY
+include::_attributes/common-attributes.adoc[]
+[id="log6x-upgrading-to-6"]
+= Upgrading to Logging 6.0
+:context: log6x
+
+toc::[]
+
+Logging v6.0 is a significant upgrade from previous releases, achieving several longstanding goals of Cluster Logging:
+
+* Introduction of distinct operators to manage logging components (e.g., collectors, storage, visualization).
+* Removal of support for managed log storage and visualization based on Elastic products (i.e., Elasticsearch, Kibana).
+* Deprecation of the Fluentd log collector implementation.
+* Removal of support for `ClusterLogging.logging.openshift.io` and `ClusterLogForwarder.logging.openshift.io` resources.
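+
+Before planning the conversion, it can help to inventory the old-API resources that exist on the cluster. The following command is a sketch, assuming `cluster-admin` access; resource names vary by cluster:
+
+[source,terminal]
+----
+$ oc get clusterloggings.logging.openshift.io,clusterlogforwarders.logging.openshift.io -A
+----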
+
+[NOTE]
+====
+The *cluster-logging-operator* does not provide an automated upgrade process.
+====
+
+Given the various configurations for log collection, forwarding, and storage, no automated upgrade is provided by the *cluster-logging-operator*. This documentation assists administrators in converting existing `ClusterLogging.logging.openshift.io` and `ClusterLogForwarder.logging.openshift.io` specifications to the new API. Examples of migrated `ClusterLogForwarder.observability.openshift.io` resources for common use cases are included.
+
+include::modules/log6x-oc-explain.adoc[leveloffset=+1]
+
+== Log Storage
+
+The only managed log storage solution available in this release is a LokiStack, managed by the *loki-operator*. This solution, previously available as the preferred alternative to the managed Elasticsearch offering, remains unchanged in its deployment process.
+
+[IMPORTANT]
+====
+To continue using an existing Red Hat managed Elasticsearch or Kibana deployment provided by the *elasticsearch-operator*, remove the owner references from the `Elasticsearch` resource named `elasticsearch`, and the `Kibana` resource named `kibana` in the `openshift-logging` namespace before removing the `ClusterLogging` resource named `instance` in the same namespace.
+====
+
+To remove the owner references, complete the following steps:
+
+. Temporarily set the *ClusterLogging* resource to the `Unmanaged` state:
++
+[source,terminal]
+----
+$ oc -n openshift-logging patch clusterlogging/instance -p '{"spec":{"managementState": "Unmanaged"}}' --type=merge
+----
+
+. Remove the *ClusterLogging* `ownerReferences` from the *Elasticsearch* resource.
++
+The following command ensures that *ClusterLogging* no longer owns the *Elasticsearch* resource. Updates to the *ClusterLogging* resource's `logStore` field will no longer affect the *Elasticsearch* resource.
++
+[source,terminal]
+----
+$ oc -n openshift-logging patch elasticsearch/elasticsearch -p '{"metadata":{"ownerReferences": []}}' --type=merge
+----
+
+. 
Remove the *ClusterLogging* `ownerReferences` from the *Kibana* resource.
++
+The following command ensures that *ClusterLogging* no longer owns the *Kibana* resource. Updates to the *ClusterLogging* resource's `visualization` field will no longer affect the *Kibana* resource.
++
+[source,terminal]
+----
+$ oc -n openshift-logging patch kibana/kibana -p '{"metadata":{"ownerReferences": []}}' --type=merge
+----
+
+. Set the *ClusterLogging* resource back to the `Managed` state:
++
+[source,terminal]
+----
+$ oc -n openshift-logging patch clusterlogging/instance -p '{"spec":{"managementState": "Managed"}}' --type=merge
+----
+
+== Log Visualization
+[subs="+quotes"]
+The OpenShift console UI plugin for log visualization has been moved from the *cluster-logging-operator* to the *cluster-observability-operator*.
+// Pending support statement.
+
+== Log Collection and Forwarding
+// Can't link to github, need to figure a workaround.
+
+Log collection and forwarding configurations are now specified under the new link:https://github.com/openshift/cluster-logging-operator/blob/master/docs/reference/operator/api_observability_v1.adoc[API], part of the `observability.openshift.io` API group. The following sections highlight the differences from the old API resources.
+
+[NOTE]
+====
+Vector is the only supported collector implementation.
+====
+
+== Management, Resource Allocation, and Workload Scheduling
+
+Configuration for the management state (e.g., `Managed`, `Unmanaged`), resource requests and limits, tolerations, and node selection is now part of the new *ClusterLogForwarder* API.
+
+.Previous Configuration
+[source,yaml]
+----
+apiVersion: "logging.openshift.io/v1"
+kind: "ClusterLogging"
+spec:
+  managementState: "Managed"
+  collection:
+    resources:
+      limits: {}
+      requests: {}
+    nodeSelector: {}
+    tolerations: []
+----
+
+.Current Configuration
+[source,yaml]
+----
+apiVersion: "observability.openshift.io/v1"
+kind: ClusterLogForwarder
+spec:
+  managementState: Managed
+  collector:
+    resources:
+      limits: {}
+      requests: {}
+    nodeSelector: {}
+    tolerations: []
+----
+
+== Input Specifications
+
+The input specification is an optional part of the *ClusterLogForwarder* specification. Administrators can continue to use the predefined values of *application*, *infrastructure*, and *audit* to collect these sources.
+
+=== Application Inputs
+
+Namespace and container inclusions and exclusions have been consolidated into a single field.
+
+.5.9 Application Input with Namespace and Container Includes and Excludes
+[source,yaml]
+----
+apiVersion: "logging.openshift.io/v1"
+kind: ClusterLogForwarder
+spec:
+  inputs:
+  - name: application-logs
+    type: application
+    application:
+      namespaces:
+      - foo
+      - bar
+      includes:
+      - namespace: my-important
+        container: main
+      excludes:
+      - container: too-verbose
+----
+
+.6.0 Application Input with Namespace and Container Includes and Excludes
+[source,yaml]
+----
+apiVersion: "observability.openshift.io/v1"
+kind: ClusterLogForwarder
+spec:
+  inputs:
+  - name: application-logs
+    type: application
+    application:
+      includes:
+      - namespace: foo
+      - namespace: bar
+      - namespace: my-important
+        container: main
+      excludes:
+      - container: too-verbose
+----
+
+[NOTE]
+====
+*application*, *infrastructure*, and *audit* are reserved words and cannot be used as names when defining an input.
+====
+
+=== Input Receivers
+
+Changes to input receivers include:
+
+* Explicit configuration of the type at the receiver level.
+* Port settings moved to the receiver level.
+ +.5.9 Input Receivers +[source,yaml] +---- +apiVersion: "logging.openshift.io/v1" +kind: ClusterLogForwarder +spec: + inputs: + - name: an-http + receiver: + http: + port: 8443 + format: kubeAPIAudit + - name: a-syslog + receiver: + type: syslog + syslog: + port: 9442 +---- + +.6.0 Input Receivers +[source,yaml] +---- +apiVersion: "observability.openshift.io/v1" +kind: ClusterLogForwarder +spec: + inputs: + - name: an-http + type: receiver + receiver: + type: http + port: 8443 + http: + format: kubeAPIAudit + - name: a-syslog + type: receiver + receiver: + type: syslog + port: 9442 +---- + +== Output Specifications + +High-level changes to output specifications include: + +* URL settings moved to each output type specification. +* Tuning parameters moved to each output type specification. +* Separation of TLS configuration from authentication. +* Explicit configuration of keys and secret/configmap for TLS and authentication. + +== Secrets and TLS Configuration + +Secrets and TLS configurations are now separated into authentication and TLS configuration for each output. They must be explicitly defined in the specification rather than relying on administrators to define secrets with recognized keys. Upgrading TLS and authorization configurations requires administrators to understand previously recognized keys to continue using existing secrets. Examples in the following sections provide details on how to configure *ClusterLogForwarder* secrets to forward to existing Red Hat managed log storage solutions. 
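+
+As an illustration of the separated `tls` and `authentication` blocks, the following sketch shows how a 5.9 output that relied on a secret with the recognized `username` and `password` keys might be expressed explicitly in 6.0. The output name, URL, and secret name `external-es-credentials` are illustrative assumptions; consult the API reference for the exact fields supported by each output type.
+
+[source,yaml]
+----
+apiVersion: observability.openshift.io/v1
+kind: ClusterLogForwarder
+spec:
+  outputs:
+  - name: external-elasticsearch
+    type: elasticsearch
+    elasticsearch:
+      url: https://external-elasticsearch.example.com:9200
+      version: 8
+      index: app-write
+      authentication:
+        username:
+          key: username
+          secretName: external-es-credentials
+        password:
+          key: password
+          secretName: external-es-credentials
+    tls:
+      ca:
+        key: ca-bundle.crt
+        secretName: external-es-credentials
+----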
+
+== Red Hat Managed Elasticsearch
+
+.v5.9 Forwarding to Red Hat Managed Elasticsearch
+[source,yaml]
+----
+apiVersion: logging.openshift.io/v1
+kind: ClusterLogging
+metadata:
+  name: instance
+  namespace: openshift-logging
+spec:
+  logStore:
+    type: elasticsearch
+----
+
+.v6.0 Forwarding to Red Hat Managed Elasticsearch
+[source,yaml]
+----
+apiVersion: observability.openshift.io/v1
+kind: ClusterLogForwarder
+metadata:
+  name: instance
+  namespace: openshift-logging
+spec:
+  outputs:
+  - name: default-elasticsearch
+    type: elasticsearch
+    elasticsearch:
+      url: https://elasticsearch:9200
+      version: 6
+      index: -write-{+yyyy.MM.dd}
+    tls:
+      ca:
+        key: ca-bundle.crt
+        secretName: collector
+      certificate:
+        key: tls.crt
+        secretName: collector
+      key:
+        key: tls.key
+        secretName: collector
+  pipelines:
+  - outputRefs:
+    - default-elasticsearch
+    inputRefs:
+    - application
+    - infrastructure
+----
+
+[NOTE]
+====
+In this example, application logs are written to the `application-write` alias/index instead of `app-write`.
+====
+
+== Red Hat Managed LokiStack
+
+.v5.9 Forwarding to Red Hat Managed LokiStack
+[source,yaml]
+----
+apiVersion: logging.openshift.io/v1
+kind: ClusterLogging
+metadata:
+  name: instance
+  namespace: openshift-logging
+spec:
+  logStore:
+    type: lokistack
+    lokistack:
+      name: lokistack-dev
+----
+
+.v6.0 Forwarding to Red Hat Managed LokiStack
+[source,yaml]
+----
+apiVersion: observability.openshift.io/v1
+kind: ClusterLogForwarder
+metadata:
+  name: instance
+  namespace: openshift-logging
+spec:
+  outputs:
+  - name: default-lokistack
+    type: lokiStack
+    lokiStack:
+      target:
+        name: lokistack-dev
+        namespace: openshift-logging
+      authentication:
+        token:
+          from: serviceAccount
+    tls:
+      ca:
+        key: service-ca.crt
+        configMapName: openshift-service-ca.crt
+  pipelines:
+  - outputRefs:
+    - default-lokistack
+    inputRefs:
+    - application
+    - infrastructure
+----
+
+== Filters and Pipeline Configuration
+
+Pipeline configurations now define only the routing of input sources to their output destinations, with any required transformations configured separately as filters. All attributes of pipelines from previous releases have been converted to filters in this release. Individual filters are defined in the `filters` specification and referenced by a pipeline.
+ +.5.9 Filters +[source,yaml] +---- +apiVersion: logging.openshift.io/v1 +kind: ClusterLogForwarder +spec: + pipelines: + - name: application-logs + parse: json + labels: + foo: bar + detectMultilineErrors: true +---- + +.6.0 Filter Configuration +[source,yaml] +---- +apiVersion: observability.openshift.io/v1 +kind: ClusterLogForwarder +spec: + filters: + - name: detectexception + type: detectMultilineException + - name: parse-json + type: parse + - name: labels + type: openShiftLabels + openShiftLabels: + foo: bar + pipelines: + - name: application-logs + filterRefs: + - detectexception + - labels + - parse-json +---- + +== Validation and Status + +Most validations are enforced when a resource is created or updated, providing immediate feedback. This is a departure from previous releases, where validation occurred post-creation and required inspecting the resource status. Some validation still occurs post-creation for cases where it is not possible to validate at creation or update time. + +Instances of the `ClusterLogForwarder.observability.openshift.io` must satisfy the following conditions before the operator will deploy the log collector: Authorized, Valid, Ready. 
An example of these conditions is: + +.6.0 Status Conditions +[source,yaml] +---- +apiVersion: observability.openshift.io/v1 +kind: ClusterLogForwarder +status: + conditions: + - lastTransitionTime: "2024-09-13T03:28:44Z" + message: 'permitted to collect log types: [application]' + reason: ClusterRolesExist + status: "True" + type: observability.openshift.io/Authorized + - lastTransitionTime: "2024-09-13T12:16:45Z" + message: "" + reason: ValidationSuccess + status: "True" + type: observability.openshift.io/Valid + - lastTransitionTime: "2024-09-13T12:16:45Z" + message: "" + reason: ReconciliationComplete + status: "True" + type: Ready + filterConditions: + - lastTransitionTime: "2024-09-13T13:02:59Z" + message: filter "detectexception" is valid + reason: ValidationSuccess + status: "True" + type: observability.openshift.io/ValidFilter-detectexception + - lastTransitionTime: "2024-09-13T13:02:59Z" + message: filter "parse-json" is valid + reason: ValidationSuccess + status: "True" + type: observability.openshift.io/ValidFilter-parse-json + inputConditions: + - lastTransitionTime: "2024-09-13T12:23:03Z" + message: input "application1" is valid + reason: ValidationSuccess + status: "True" + type: observability.openshift.io/ValidInput-application1 + outputConditions: + - lastTransitionTime: "2024-09-13T13:02:59Z" + message: output "default-lokistack-application1" is valid + reason: ValidationSuccess + status: "True" + type: observability.openshift.io/ValidOutput-default-lokistack-application1 + pipelineConditions: + - lastTransitionTime: "2024-09-13T03:28:44Z" + message: pipeline "default-before" is valid + reason: ValidationSuccess + status: "True" + type: observability.openshift.io/ValidPipeline-default-before +---- + +[NOTE] +==== +Conditions that are satisfied and applicable have a "status" value of "True". Conditions with a status other than "True" provide a reason and a message explaining the issue. 
+==== diff --git a/observability/logging/logging-6.0/log6x-visual.adoc b/observability/logging/logging-6.0/log6x-visual.adoc new file mode 100644 index 0000000000..cc930ae200 --- /dev/null +++ b/observability/logging/logging-6.0/log6x-visual.adoc @@ -0,0 +1,9 @@ +:_mod-docs-content-type: ASSEMBLY +[id="log6x-visual"] += Visualization for logging +include::_attributes/common-attributes.adoc[] +:context: logging-6x + +toc::[] + +Visualization for logging is provided by installing the xref:../../../observability/cluster_observability_operator/cluster-observability-operator-overview.adoc#cluster-observability-operator-overview[Cluster Observability Operator]. diff --git a/observability/logging/logging-6.0/modules b/observability/logging/logging-6.0/modules new file mode 120000 index 0000000000..7e8b50bee7 --- /dev/null +++ b/observability/logging/logging-6.0/modules @@ -0,0 +1 @@ +../../../modules/ \ No newline at end of file diff --git a/observability/logging/logging-6.0/snippets b/observability/logging/logging-6.0/snippets new file mode 120000 index 0000000000..ce62fd7c41 --- /dev/null +++ b/observability/logging/logging-6.0/snippets @@ -0,0 +1 @@ +../../../snippets/ \ No newline at end of file diff --git a/snippets/log6x-loki-statement-snip.adoc b/snippets/log6x-loki-statement-snip.adoc new file mode 100644 index 0000000000..9ca2958332 --- /dev/null +++ b/snippets/log6x-loki-statement-snip.adoc @@ -0,0 +1,9 @@ +// Text snippet included in the following assemblies: +// * observability/logging/log_storage/about-log-storage.adoc +// +// Text snippet included in the following modules: +// +// +:_mod-docs-content-type: SNIPPET + +Loki is a horizontally scalable, highly available, multi-tenant log aggregation system offered as a GA log store for {logging} {for} that can be visualized with the OpenShift {ObservabilityShortName} UI. 
The Loki configuration provided by OpenShift {logging-uc} is a short-term log store designed to enable users to perform fast troubleshooting with the collected logs. For that purpose, the {logging} {for} configuration of Loki has short-term storage, and is optimized for very recent queries. For long-term storage or queries over a long time period, users should look to log stores external to their cluster.
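+
+For reference, the short-term retention window can be sketched on the `LokiStack` resource as follows. The field names follow the *loki-operator* API; the seven-day value and the resource name `logging-loki` are illustrative assumptions, not defaults.
+
+[source,yaml]
+----
+apiVersion: loki.grafana.com/v1
+kind: LokiStack
+metadata:
+  name: logging-loki
+  namespace: openshift-logging
+spec:
+  limits:
+    global:
+      retention:
+        days: 7
+----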