mirror of https://github.com/openshift/openshift-docs.git synced 2026-02-05 21:46:22 +01:00

CNV#34781: removing TP from wasp-agent doc

Pan Ousley
2024-10-14 13:42:47 -04:00
parent aafe7e8105
commit 8cbafccd96
3 changed files with 260 additions and 191 deletions

View File

@@ -4,10 +4,9 @@
:_mod-docs-content-type: PROCEDURE
[id="virt-using-wasp-agent-to-configure-higher-vm-workload-density_{context}"]
= Using wasp-agent to increase VM workload density
The `wasp-agent` component facilitates memory overcommitment by assigning swap resources to worker nodes. It also manages pod evictions when nodes are at risk due to high swap I/O traffic or high utilization.
[IMPORTANT]
====
@@ -18,14 +17,99 @@ For descriptions of QoS classes, see link:https://kubernetes.io/docs/tasks/confi
.Prerequisites
* You have installed the OpenShift CLI (`oc`).
* You are logged into the cluster with the `cluster-admin` role.
* A memory overcommit ratio is defined.
* The node belongs to a worker pool.
[NOTE]
====
The `wasp-agent` component deploys an Open Container Initiative (OCI) hook to enable swap usage for containers on the node level. The low-level nature requires the `DaemonSet` object to be privileged.
====
.Procedure
. Configure the `kubelet` service to permit swap usage:
.. Create or edit a `KubeletConfig` file with the parameters shown in the following example:
+
.Example of a `KubeletConfig` file
[source,yaml]
----
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: custom-config
spec:
  machineConfigPoolSelector:
    matchLabels:
      pools.operator.machineconfiguration.openshift.io/worker: '' # MCP
      #machine.openshift.io/cluster-api-machine-role: worker # machine
      #node-role.kubernetes.io/worker: '' # node
  kubeletConfig:
    failSwapOn: false
----
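+
If you saved the configuration to a new file instead of editing an existing `KubeletConfig` object, apply it to the cluster. For example, assuming a hypothetical file name of `kubelet-config.yaml`:
+
[source,terminal]
----
$ oc apply -f kubelet-config.yaml
----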
.. Wait for the worker nodes to sync with the new configuration by running the following command:
+
[source,terminal]
----
$ oc wait mcp worker --for condition=Updated=True --timeout=-1s
----
. Provision swap by creating a `MachineConfig` object. For example:
+
[source,yaml]
----
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 90-worker-swap
spec:
  config:
    ignition:
      version: 3.4.0
    systemd:
      units:
      - contents: |
          [Unit]
          Description=Provision and enable swap
          ConditionFirstBoot=no

          [Service]
          Type=oneshot
          Environment=SWAP_SIZE_MB=5000
          ExecStart=/bin/sh -c "sudo dd if=/dev/zero of=/var/tmp/swapfile count=${SWAP_SIZE_MB} bs=1M && \
          sudo chmod 600 /var/tmp/swapfile && \
          sudo mkswap /var/tmp/swapfile && \
          sudo swapon /var/tmp/swapfile && \
          free -h && \
          sudo systemctl set-property --runtime system.slice MemorySwapMax=0 IODeviceLatencyTargetSec=\"/ 50ms\""

          [Install]
          RequiredBy=kubelet-dependencies.target
        enabled: true
        name: swap-provision.service
----
+
To have enough swap space for the worst-case scenario, provision at least as much swap space as the amount of overcommitted RAM. Calculate the amount of swap space to provision on a node by using the following formula:
+
[source,terminal]
----
NODE_SWAP_SPACE = NODE_RAM * (MEMORY_OVER_COMMIT_PERCENT / 100% - 1)
----
+
.Example
[source,terminal]
----
NODE_SWAP_SPACE = 16 GB * (150% / 100% - 1)
= 16 GB * (1.5 - 1)
= 16 GB * (0.5)
= 8 GB
----
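+
As a quick sanity check, the same calculation can be done in a shell, for example when choosing the `SWAP_SIZE_MB` value for the `MachineConfig` object. The following one-liner uses hypothetical values of 16384 MB of node RAM and a 150% overcommit ratio:
+
[source,terminal]
----
$ NODE_RAM_MB=16384 MEMORY_OVER_COMMIT_PERCENT=150; echo "$(( NODE_RAM_MB * (MEMORY_OVER_COMMIT_PERCENT - 100) / 100 )) MB"
8192 MB
----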
. Create a privileged service account by running the following commands:
+
[source,terminal]
----
@@ -46,13 +130,27 @@ $ oc create clusterrolebinding wasp --clusterrole=cluster-admin --serviceaccount
----
$ oc adm policy add-scc-to-user -n wasp privileged -z wasp
----
. Wait for the worker nodes to sync with the new configuration by running the following command:
+
[source,terminal]
----
$ oc wait mcp worker --for condition=Updated=True --timeout=-1s
----
. Determine the pull URL for the `wasp-agent` image by running the following commands:
+
[source,terminal]
----
$ OCP_VERSION=$(oc get clusterversion | awk 'NR==2' | cut -d' ' -f4 | cut -d'-' -f1)
----
+
[source,terminal]
----
$ oc get csv kubevirt-hyperconverged-operator.v${OCP_VERSION} -n openshift-cnv -o json | jq '.spec.relatedImages[] | select(.name|test(".*wasp-agent.*")) | .image'
----
. Deploy `wasp-agent` by creating a `DaemonSet` object as shown in the following example:
+
[source,yaml]
----
@@ -74,20 +172,30 @@ spec:
        description: >-
          Configures swap for workloads
      labels:
        name: wasp
    spec:
      containers:
      - env:
        - name: SWAP_UTILIZATION_THRESHOLD_FACTOR
          value: "0.8"
        - name: MAX_AVERAGE_SWAP_IN_PAGES_PER_SECOND
          value: "1000"
        - name: MAX_AVERAGE_SWAP_OUT_PAGES_PER_SECOND
          value: "1000"
        - name: AVERAGE_WINDOW_SIZE_SECONDS
          value: "30"
        - name: VERBOSITY
          value: "1"
        - name: FSROOT
          value: /host
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        image: >-
          quay.io/openshift-virtualization/wasp-agent:v4.17 <1>
        imagePullPolicy: Always
        name: wasp-agent
        resources:
          requests:
            cpu: 100m
@@ -95,175 +203,87 @@ spec:
        securityContext:
          privileged: true
        volumeMounts:
        - mountPath: /host
          name: host
        - mountPath: /rootfs
          name: rootfs
      hostPID: true
      hostUsers: true
      priorityClassName: system-node-critical
      serviceAccountName: wasp
      terminationGracePeriodSeconds: 5
      volumes:
      - hostPath:
          path: /
        name: host
      - hostPath:
          path: /
        name: rootfs
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 10%
      maxSurge: 0
status: {}
----
<1> Replace the `image` value with the image URL from the previous step.
. Deploy alerting rules by creating a `PrometheusRule` object. For example:
+
[source,yaml]
----
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    tier: node
    wasp.io: ""
  name: wasp-rules
  namespace: wasp
spec:
  groups:
  - name: alerts.rules
    rules:
    - alert: NodeHighSwapActivity
      annotations:
        description: High swap activity detected at {{ $labels.instance }}. The rate
          of swap out and swap in exceeds 200 in both operations in the last minute.
          This could indicate memory pressure and may affect system performance.
        runbook_url: https://github.com/openshift-virtualization/wasp-agent/tree/main/docs/runbooks/NodeHighSwapActivity.md
        summary: High swap activity detected at {{ $labels.instance }}.
      expr: rate(node_vmstat_pswpout[1m]) > 200 and rate(node_vmstat_pswpin[1m]) > 200
      for: 1m
      labels:
        kubernetes_operator_component: kubevirt
        kubernetes_operator_part_of: kubevirt
        operator_health_impact: warning
        severity: warning
----
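+
Optionally, you can confirm that the rule object was created; the `PrometheusRule` custom resource definition is provided by the cluster monitoring stack:
+
[source,terminal]
----
$ oc get prometheusrule wasp-rules -n wasp
----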
. Add the `cluster-monitoring` label to the `wasp` namespace by running the following command:
+
[source,terminal]
----
$ oc label namespace wasp openshift.io/cluster-monitoring="true"
----
. Enable memory overcommitment in {VirtProductName} by using the web console or the CLI.
+
--
.Web console
.. In the {product-title} web console, go to *Virtualization* -> *Overview* -> *Settings* -> *General settings* -> *Memory density*.
.. Set *Enable memory density* to on.
.CLI
* Run the following command:
+
[source,terminal]
----
$ oc patch --type=merge \
  -f <../manifests/openshift/hco-set-memory-overcommit.yaml> \
  --patch-file <../manifests/openshift/hco-set-memory-overcommit.yaml>
----
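+
The referenced patch file sets the memory overcommit percentage in the `HyperConverged` custom resource. For example, for a 150% overcommit ratio, the patch file might contain the following:
+
[source,yaml]
----
apiVersion: hco.kubevirt.io/v1beta1
kind: HyperConverged
metadata:
  name: kubevirt-hyperconverged
  namespace: openshift-cnv
spec:
  higherWorkloadDensity:
    memoryOvercommitPercentage: 150
----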
+
[NOTE]
====
After applying all configurations, the swap feature is fully available only after all `MachineConfigPool` rollouts are complete.
====
--
.Verification
@@ -271,32 +291,35 @@ After applying all configurations, the swap feature is fully available only afte
+
[source,terminal]
----
$ oc rollout status ds wasp-agent -n wasp
----
+
If the deployment is successful, the following message is displayed:
+
.Example output
[source,terminal]
----
daemon set "wasp-agent" successfully rolled out
----
. To verify that swap is correctly provisioned, complete the following steps:
.. View a list of worker nodes by running the following command:
+
[source,terminal]
----
$ oc get nodes -l node-role.kubernetes.io/worker
----
.. Select a node from the list and display its memory usage by running the following command:
+
[source,terminal]
----
$ oc debug node/<selected_node> -- free -m <1>
----
<1> Replace `<selected_node>` with the node name.
+
If swap is provisioned, an amount greater than zero is displayed in the `Swap:` row.
+
.Example output
[cols="1,1,1,1,1,1,1"]
|===
| |total |used |free |shared |buff/cache |available
@@ -309,10 +332,12 @@ If swap is provisioned correctly, an amount greater than zero is displayed, simi
[source,terminal]
----
$ oc get -n openshift-cnv HyperConverged kubevirt-hyperconverged -o jsonpath="{.spec.higherWorkloadDensity.memoryOvercommitPercentage}"
----
+
.Example output
[source,terminal]
----
150
----
+
The returned value must match the value you had previously configured.

View File

@@ -0,0 +1,51 @@
// Module included in the following assemblies:
//
// * virt/post_installation_configuration/virt-configuring-higher-vm-workload-density.adoc
:_mod-docs-content-type: CONCEPT
[id="virt-wasp-agent-pod-eviction_{context}"]
= Pod eviction conditions used by wasp-agent
The `wasp-agent` component manages pod eviction when the system is heavily loaded and nodes are at risk. Eviction is triggered if one of the following conditions is met:
High swap I/O traffic::
This condition is met when swap-related I/O traffic is excessively high.
+
.Condition
[source,text]
----
averageSwapInPerSecond > maxAverageSwapInPagesPerSecond
&&
averageSwapOutPerSecond > maxAverageSwapOutPagesPerSecond
----
+
By default, `maxAverageSwapInPagesPerSecond` and `maxAverageSwapOutPagesPerSecond` are set to 1000 pages per second. The default time interval for calculating the average is 30 seconds.
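+
For example, with the default settings, a node that averages more than 1000 pages per second swapped in and, at the same time, more than 1000 pages per second swapped out over the last 30 seconds meets this condition.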
High swap utilization::
This condition is met when swap utilization is excessively high, causing the current virtual memory usage to exceed the factored threshold. The `NODE_SWAP_SPACE` setting in your `MachineConfig` object can impact this condition.
+
.Condition
[source,text]
----
nodeWorkingSet + nodeSwapUsage > totalNodeMemory + totalSwapMemory × thresholdFactor
----
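+
For example, on a node with 16 GiB of RAM, 8 GiB of swap, and the default `thresholdFactor` of 0.8 (hypothetical values), the factored threshold is 16 GiB + 8 GiB × 0.8 = 22.4 GiB, so this condition is met when the combined working set and swap usage of the node exceeds 22.4 GiB.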
[id="environment-variables_{context}"]
== Environment variables
You can use the following environment variables to adjust the values used to calculate eviction conditions:
[cols="1,1"]
|===
|*Environment variable* |*Function*
|`MAX_AVERAGE_SWAP_IN_PAGES_PER_SECOND`
|Sets the value of `maxAverageSwapInPagesPerSecond`.
|`MAX_AVERAGE_SWAP_OUT_PAGES_PER_SECOND`
|Sets the value of `maxAverageSwapOutPagesPerSecond`.
|`SWAP_UTILIZATION_THRESHOLD_FACTOR`
|Sets the `thresholdFactor` value used to calculate high swap utilization.
|`AVERAGE_WINDOW_SIZE_SECONDS`
|Sets the time interval for calculating the average swap usage.
|===
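For example, to trigger eviction at lower swap traffic and utilization levels, you can override these variables in the `env` section of the `wasp-agent` `DaemonSet` object. The following values are hypothetical and shown only to illustrate the mechanism:

[source,yaml]
----
env:
- name: MAX_AVERAGE_SWAP_IN_PAGES_PER_SECOND
  value: "500"
- name: MAX_AVERAGE_SWAP_OUT_PAGES_PER_SECOND
  value: "500"
- name: SWAP_UTILIZATION_THRESHOLD_FACTOR
  value: "0.7"
- name: AVERAGE_WINDOW_SIZE_SECONDS
  value: "60"
----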

View File

@@ -6,23 +6,16 @@ include::_attributes/common-attributes.adoc[]
toc::[]
You can increase the number of virtual machines (VMs) on nodes by overcommitting memory (RAM). Increasing VM workload density can be useful in the following situations:
* You have many similar workloads.
* You have underused workloads.
[NOTE]
====
Memory overcommitment can lower workload performance on a highly utilized system.
====
include::modules/virt-using-wasp-agent-to-configure-higher-vm-workload-density.adoc[leveloffset=+1]
include::modules/virt-wasp-agent-pod-eviction.adoc[leveloffset=+1]