mirror of https://github.com/openshift/openshift-docs.git synced 2026-02-05 12:46:18 +01:00

POWERMON-580 0.5 Configuring power metrics doc updates

Gwynne Monahan
2025-07-01 09:38:05 -05:00
committed by openshift-cherrypick-robot
parent 50c369c332
commit 825048c97f
4 changed files with 85 additions and 133 deletions

@@ -1,94 +0,0 @@
// Module included in the following assemblies:
// * power_monitoring/configuring-power-monitoring.adoc
:_mod-docs-content-type: PROCEDURE
[id="power-monitoring-configuring-kepler-redfish_{context}"]
= Configuring {PM-kepler} to use Redfish
You can configure {PM-kepler} to use Redfish as the source for running or hosting containers. {PM-kepler} can then monitor the power usage of these containers.
.Prerequisites
* You have access to the {product-title} web console.
* You are logged in as a user with the `cluster-admin` role.
* You have installed the {PM-operator}.
.Procedure
. In the *Administrator* perspective of the web console, click *Operators* -> *Installed Operators*.
. Click *{PM-title-c}* from the *Installed Operators* list and click the *{PM-kepler}* tab.
. Click *Create {PM-kepler}*. If you already have a {PM-kepler} instance created, click *Edit Kepler*.
. Configure `.spec.exporter.redfish` of the {PM-kepler} instance by specifying the mandatory `secretRef` field. You can also configure the optional `probeInterval` and `skipSSLVerify` fields to meet your needs.
+
.Example {PM-kepler} instance
[source,yaml]
----
apiVersion: kepler.system.sustainable.computing.io/v1alpha1
kind: Kepler
metadata:
name: kepler
spec:
exporter:
deployment:
# ...
redfish:
secretRef: <secret_name> required <1>
probeInterval: 60s <2>
skipSSLVerify: false <3>
# ...
----
<1> Required: Specifies the name of the secret that contains the credentials for accessing the Redfish server.
<2> Optional: Controls the frequency at which the power information is queried from Redfish. The default value is `60s`.
<3> Optional: Controls if {PM-kepler} skips verifying the Redfish server certificate. The default value is `false`.
+
[NOTE]
====
After {PM-kepler} is deployed, the `openshift-power-monitoring` namespace is created.
====
. Create the `redfish.csv` file with the following data format:
+
[source,csv]
----
<your_kubelet_node_name>,<redfish_username>,<redfish_password>,https://<redfish_ip_or_hostname>/
----
+
.Example `redfish.csv` file
[source,csv]
----
control-plane,exampleuser,examplepass,https://redfish.nodes.example.com
worker-1,exampleuser,examplepass,https://redfish.nodes.example.com
worker-2,exampleuser,examplepass,https://another.redfish.nodes.example.com
----
. Create the secret under the `openshift-power-monitoring` namespace. You must create the secret with the following conditions:
+
--
* The secret type is `Opaque`.
* The credentials are stored under the `redfish.csv` key in the `data` field of the secret.
--
+
[source,terminal]
----
$ oc -n openshift-power-monitoring \
create secret generic redfish-secret \
--from-file=redfish.csv
----
+
.Example output
[source,yaml]
----
apiVersion: v1
kind: Secret
metadata:
name: redfish-secret
data:
redfish.csv: YmFyCg==
# ...
----
+
[IMPORTANT]
====
The {PM-kepler} deployment will not continue until the Redfish secret is created. You can find this information in the `status` of a {PM-kepler} instance.
====
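The `redfish.csv` layout and its encoding in the secret can be sketched in Python. This is an illustrative sketch only: the helper names `build_redfish_csv` and `encode_secret_value` are hypothetical, and the use of the standard `csv` module assumes that no field needs quoting. It shows how rows in the documented format become the base64 string stored under the `redfish.csv` key.

```python
import base64
import csv
import io


def build_redfish_csv(rows):
    """Render (node_name, username, password, url) tuples as redfish.csv text."""
    buffer = io.StringIO()
    # One line per node; fields are comma-separated as in the documented format.
    writer = csv.writer(buffer, lineterminator="\n")
    for node_name, username, password, url in rows:
        writer.writerow([node_name, username, password, url])
    return buffer.getvalue()


def encode_secret_value(text):
    """Base64-encode text the way values appear under a Secret's `data` field."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


rows = [
    ("control-plane", "exampleuser", "examplepass", "https://redfish.nodes.example.com"),
    ("worker-1", "exampleuser", "examplepass", "https://redfish.nodes.example.com"),
]
csv_text = build_redfish_csv(rows)
print(csv_text, end="")
print(encode_secret_value(csv_text))
```

The placeholder value `YmFyCg==` in the example output is the base64 encoding of the text `bar` followed by a newline, so a real secret contains a much longer string.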

@@ -6,43 +6,92 @@
[id="power-monitoring-kepler-configuration_{context}"]
= The {PM-kepler} configuration
You can configure {PM-kepler} with the `spec` field of the `{PM-kepler}` resource.
You can configure {PM-kepler} with the `spec` field of the `PowerMonitor` resource.
[IMPORTANT]
====
Ensure that the name of your {PM-kepler} instance is `kepler`. All other instances are rejected by the {PM-operator} Webhook.
Ensure that the name of your `PowerMonitor` instance is `power-monitor`. All other instances are rejected by the {PM-operator} Webhook.
====
The following is the list of configuration options:
.{PM-kepler} configuration options
[options="header"]
.PowerMonitor configuration options
[cols="1,3,2", options="header"]
|===
|Name |Spec |Description |Default
|`port` |`exporter.deployment` |The port on the node where the Prometheus metrics are exposed. |`9103`
|`nodeSelector` |`exporter.deployment` |The nodes on which {PM-kepler} exporter pods are scheduled. |`kubernetes.io/os: linux`
|`tolerations` |`exporter.deployment` |The tolerations for {PM-kepler} exporter that allow the pods to be scheduled on nodes with specific characteristics. |`- operator: "Exists"`
| Name
| Description
| Default Behavior
| deployment.nodeSelector
| The nodes on which Kepler (created by PowerMonitor) pods are scheduled.
| kubernetes.io/os: linux
| deployment.tolerations
| The tolerations for Power Monitor that allow the pods to be scheduled on nodes with specific characteristics.
| - operator: "Exists"
| deployment.security.mode
| Security mode can be set to either `none`, allowing unrestricted access to Kepler's metrics by any entity, or `rbac`, securing the metrics endpoint with TLS encryption and restricting access to authorized service accounts listed in `allowedSANames`.
| `rbac`. Only the user workload Prometheus instance is allowed access.
| deployment.security.allowedSANames
| A list of service account names that can access Kepler's metrics endpoint when the security mode is `rbac`.
| In OpenShift, set to `openshift-user-workload-monitoring:prometheus-user-workload` to allow user workload monitoring to scrape Kepler.
| config.logLevel
| The level of logs to expose by Kepler.
| `info`.
| config.metricLevels
| A list of energy metric levels to expose. Possible values include `node`, `process`, `container`, `vm`, and `pod`.
| The default list includes `node`, `pod`, and `vm`.
| config.staleness
| Specifies how long to wait before considering calculated power values as stale.
| 500ms (500 milliseconds).
| config.sampleRate
| Specifies the interval for monitoring resources such as processes, containers, and VMs.
| 5s (5 seconds).
| config.maxTerminated
| Controls terminated workload tracking. A negative value tracks unlimited workloads, zero disables tracking, and a positive value tracks the top N terminated workloads by energy consumption.
| 500.
|===
.Example `{PM-kepler}` resource with default configuration
.Example `PowerMonitor` resource with default configuration
[source,yaml]
----
apiVersion: kepler.system.sustainable.computing.io/v1alpha1
kind: Kepler
apiVersion: kepler.system.sustainable.computing.io/v1alpha1
kind: PowerMonitor
metadata:
name: kepler
labels:
app.kubernetes.io/name: powermonitor
app.kubernetes.io/instance: powermonitor
app.kubernetes.io/part-of: kepler-operator
name: power-monitor
spec:
exporter:
kepler:
deployment:
port: 9103 # <1>
nodeSelector:
kubernetes.io/os: linux # <2>
Tolerations: # <3>
- key: ""
operator: "Exists"
value: ""
effect: ""
----
<1> The Prometheus metrics are exposed on port 9103.
<2> {PM-kepler} pods are scheduled on Linux nodes.
<3> The default tolerations allow {PM-kepler} to be scheduled on any node.
nodeSelector:
kubernetes.io/os: linux
tolerations:
- key: key1
operator: Equal
value: value1
effect: NoSchedule
security:
mode: rbac
allowedSANames:
- openshift-user-workload-monitoring:prometheus-user-workload
config:
logLevel: info
metricLevels: [node, pod, vm]
staleness: 1s
sampleRate: 10s
maxTerminated: 1000
----
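The constraints described in the table can be expressed as a small pre-flight check. The following Python sketch is not the {PM-operator} webhook logic; the function names are hypothetical, and it assumes that the options are nested as `spec.kepler.deployment` and `spec.kepler.config`, because the indentation is not visible in the example. It checks only the rules stated in this module: the required instance name, the two security modes, and the allowed metric levels.

```python
ALLOWED_SECURITY_MODES = {"none", "rbac"}
ALLOWED_METRIC_LEVELS = {"node", "process", "container", "vm", "pod"}
REQUIRED_NAME = "power-monitor"


def describe_max_terminated(value):
    """Explain what a config.maxTerminated value means, per the table above."""
    if value < 0:
        return "track unlimited terminated workloads"
    if value == 0:
        return "terminated workload tracking disabled"
    return f"track top {value} terminated workloads by energy consumption"


def check_power_monitor(resource):
    """Return a list of problems found in a PowerMonitor-shaped dict."""
    problems = []
    name = resource.get("metadata", {}).get("name")
    if name != REQUIRED_NAME:
        problems.append(f"name must be {REQUIRED_NAME!r}, got {name!r}")

    # Assumed nesting: spec.kepler.deployment and spec.kepler.config.
    kepler = resource.get("spec", {}).get("kepler", {})
    mode = kepler.get("deployment", {}).get("security", {}).get("mode", "rbac")
    if mode not in ALLOWED_SECURITY_MODES:
        problems.append(f"unknown security mode: {mode!r}")

    levels = kepler.get("config", {}).get("metricLevels", ["node", "pod", "vm"])
    unknown = sorted(set(levels) - ALLOWED_METRIC_LEVELS)
    if unknown:
        problems.append(f"unknown metricLevels: {unknown}")
    return problems


example = {
    "metadata": {"name": "power-monitor"},
    "spec": {
        "kepler": {
            "deployment": {"security": {"mode": "rbac"}},
            "config": {"metricLevels": ["node", "pod", "vm"], "maxTerminated": 500},
        }
    },
}
print(check_power_monitor(example))
print(describe_max_terminated(example["spec"]["kepler"]["config"]["maxTerminated"]))
```

A check like this can catch a misnamed instance or a mistyped metric level before the resource is applied, but the {PM-operator} webhook remains the authority on what is accepted.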

@@ -6,25 +6,24 @@
[id="power-monitoring-monitoring-kepler-status_{context}"]
= Monitoring the {PM-kepler} status
You can monitor the state of the {PM-kepler} exporter with the `status` field of the `{PM-kepler}` resource.
You can monitor the state of the {PM-kepler} exporter with the `status` field of the `PowerMonitor` resource.
The `status.exporter` field includes information, such as the following:
The `status` field includes information, such as the following:
* The number of nodes currently running the {PM-kepler} pods
* The number of nodes that should be running the {PM-kepler} pods
* Conditions representing the health of the {PM-kepler} resource
This provides you with valuable insights into the changes made through the `spec` field.
This provides you with valuable insights into the changes made through the `spec` field.
.Example state of the `{PM-kepler}` resource
.Example state of the `PowerMonitor` resource
[source,yaml]
----
apiVersion: kepler.system.sustainable.computing.io/v1alpha1
kind: Kepler
kind: PowerMonitor
metadata:
name: kepler
name: power-monitor
status:
exporter:
conditions: # <1>
- lastTransitionTime: '2024-01-11T11:07:39Z'
message: Reconcile succeeded
@@ -34,7 +33,7 @@ status:
type: Reconciled
- lastTransitionTime: '2024-01-11T11:07:39Z'
message: >-
Kepler daemonset "kepler-operator/kepler" is deployed to all nodes and
power-monitor daemonset "openshift-power-monitoring/power-monitor" is deployed to all nodes and
available; ready 2/2
observedGeneration: 1
reason: DaemonSetReady
@@ -43,6 +42,6 @@ status:
currentNumberScheduled: 2 # <2>
desiredNumberScheduled: 2 # <3>
----
<1> The health of the {PM-kepler} resource. In this example, {PM-kepler} is successfully reconciled and ready.
<1> The health of the `PowerMonitor` resource. In this example, the `PowerMonitor` resource is successfully reconciled and ready.
<2> The number of nodes currently running the {PM-kepler} pods is 2.
<3> The wanted number of nodes to run the {PM-kepler} pods is 2.
<3> The wanted number of nodes to run the {PM-kepler} pods is 2.
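The comparison between the current and desired node counts can be sketched in Python. This is a minimal sketch, assuming that the part of the `status` field holding the counters and conditions has already been loaded into a plain dictionary; the function name `summarize_status` and the flat dictionary layout are illustrative assumptions, not part of the `PowerMonitor` API.

```python
def summarize_status(status):
    """Summarize scheduling progress from a status-shaped dict."""
    current = status.get("currentNumberScheduled", 0)
    desired = status.get("desiredNumberScheduled", 0)
    condition_types = [c.get("type") for c in status.get("conditions", [])]
    return {
        # Pods run on every node that should run them.
        "fully_scheduled": desired > 0 and current == desired,
        # Nodes that should run a pod but do not yet.
        "missing_nodes": max(desired - current, 0),
        "condition_types": condition_types,
    }


status = {
    "conditions": [{"type": "Reconciled", "message": "Reconcile succeeded"}],
    "currentNumberScheduled": 2,
    "desiredNumberScheduled": 2,
}
print(summarize_status(status))
```

The sketch reads only the fields shown in the example state and does not interpret the individual conditions.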

@@ -9,10 +9,8 @@ toc::[]
:FeatureName: Power monitoring
include::snippets/technology-preview.adoc[leveloffset=+2]
The `{PM-kepler}` resource is a Kubernetes custom resource definition (CRD) that enables you to configure the deployment and monitor the status of the {PM-kepler} resource.
The `PowerMonitor` resource is defined by a Kubernetes custom resource definition (CRD) and enables you to configure the {PM-kepler} deployment and monitor its status.
include::modules/power-monitoring-kepler-configuration.adoc[leveloffset=+1]
include::modules/power-monitoring-monitoring-kepler-status.adoc[leveloffset=+1]
include::modules/power-monitoring-configuring-kepler-redfish.adoc[leveloffset=+1]
include::modules/power-monitoring-monitoring-kepler-status.adoc[leveloffset=+1]