From 255866ce34a8d648ca5df49e080c08148569f01a Mon Sep 17 00:00:00 2001 From: Brendan Daly Date: Thu, 21 Aug 2025 10:02:22 +0100 Subject: [PATCH] OSDOCS-15411_a:CQA updates --- modules/nodes-pods-autoscaling-about.adoc | 35 +++++-------------- ...s-pods-autoscaling-best-practices-hpa.adoc | 15 ++++---- modules/nodes-pods-autoscaling-policies.adoc | 9 ++--- .../nodes-pods-vertical-autoscaler-about.adoc | 20 +++++------ ...-pods-vertical-autoscaler-configuring.adoc | 16 ++++----- ...s-vertical-autoscaler-custom-resource.adoc | 4 +-- ...nodes-pods-vertical-autoscaler-custom.adoc | 14 ++++---- ...odes-pods-vertical-autoscaler-install.adoc | 4 +-- ...s-pods-vertical-autoscaler-moving-vpa.adoc | 8 ++--- .../nodes-pods-vertical-autoscaler-oom.adoc | 8 ++--- ...nodes-pods-vertical-autoscaler-tuning.adoc | 18 +++++----- ...es-pods-vertical-autoscaler-uninstall.adoc | 6 ++-- ...-pods-vertical-autoscaler-using-about.adoc | 24 ++++++------- .../pods/nodes-pods-vertical-autoscaler.adoc | 6 ++-- 14 files changed, 85 insertions(+), 102 deletions(-) diff --git a/modules/nodes-pods-autoscaling-about.adoc b/modules/nodes-pods-autoscaling-about.adoc index 64e561830c..65b899e1b4 100644 --- a/modules/nodes-pods-autoscaling-about.adoc +++ b/modules/nodes-pods-autoscaling-about.adoc @@ -6,30 +6,16 @@ [id="nodes-pods-autoscaling-about_{context}"] = Understanding horizontal pod autoscalers -You can create a horizontal pod autoscaler to specify the minimum and maximum number of pods -you want to run, as well as the CPU utilization or memory utilization your pods should target. +You can create a horizontal pod autoscaler to specify the minimum and maximum number of pods you want to run, and the CPU usage or memory usage your pods should target. -After you create a horizontal pod autoscaler, {product-title} begins to query the CPU and/or memory resource metrics on the pods. 
-When these metrics are available, the horizontal pod autoscaler computes
-the ratio of the current metric utilization with the desired metric utilization,
-and scales up or down accordingly. The query and scaling occurs at a regular interval,
-but can take one to two minutes before metrics become available.
+After you create a horizontal pod autoscaler, {product-title} begins to query the CPU, memory, or both resource metrics on the pods. When these metrics are available, the horizontal pod autoscaler computes the ratio of the current metric use with the intended metric use, and scales up or down as needed. The query and scaling occur at a regular interval, but can take one to two minutes before metrics become available.

-For replication controllers, this scaling corresponds directly to the replicas
-of the replication controller. For deployment configurations, scaling corresponds
-directly to the replica count of the deployment configuration. Note that autoscaling
-applies only to the latest deployment in the `Complete` phase.
+For replication controllers, this scaling corresponds directly to the replicas of the replication controller. For deployments, scaling corresponds directly to the replica count of the deployment. Note that autoscaling applies only to the latest deployment in the `Complete` phase.

-{product-title} automatically accounts for resources and prevents unnecessary autoscaling
-during resource spikes, such as during start up. Pods in the `unready` state
-have `0 CPU` usage when scaling up and the autoscaler ignores the pods when scaling down.
-Pods without known metrics have `0% CPU` usage when scaling up and `100% CPU` when scaling down.
-This allows for more stability during the HPA decision. To use this feature, you must configure
-readiness checks to determine if a new pod is ready for use.
+{product-title} automatically accounts for resources and prevents unnecessary autoscaling during resource spikes, such as during start up.
Pods in the `unready` state have `0 CPU` usage when scaling up and the autoscaler ignores the pods when scaling down. Pods without known metrics have `0% CPU` usage when scaling up and `100% CPU` when scaling down. This allows for more stability during the HPA decision. To use this feature, you must configure readiness checks to determine if a new pod is ready for use. ifdef::openshift-origin,openshift-enterprise,openshift-webscale[] -To use horizontal pod autoscalers, your cluster administrator must have -properly configured cluster metrics. +To use horizontal pod autoscalers, your cluster administrator must have properly configured cluster metrics. endif::openshift-origin,openshift-enterprise,openshift-webscale[] == Supported metrics @@ -43,27 +29,24 @@ The following metrics are supported by horizontal pod autoscalers: |Metric |Description |API version |CPU utilization -|Number of CPU cores used. Can be used to calculate a percentage of the pod's requested CPU. +|Number of CPU cores used. You can use this to calculate a percentage of the pod's requested CPU. |`autoscaling/v1`, `autoscaling/v2` |Memory utilization -|Amount of memory used. Can be used to calculate a percentage of the pod's requested memory. +|Amount of memory used. You can use this to calculate a percentage of the pod's requested memory. |`autoscaling/v2` |=== [IMPORTANT] ==== -For memory-based autoscaling, memory usage must increase and decrease -proportionally to the replica count. On average: +For memory-based autoscaling, memory usage must increase and decrease proportionally to the replica count. On average: * An increase in replica count must lead to an overall decrease in memory (working set) usage per-pod. * A decrease in replica count must lead to an overall increase in per-pod memory usage. -Use the {product-title} web console to check the memory behavior of your application -and ensure that your application meets these requirements before using -memory-based autoscaling. 
+Use the {product-title} web console to check the memory behavior of your application and ensure that your application meets these requirements before using memory-based autoscaling. ==== The following example shows autoscaling for the `hello-node` `Deployment` object. The initial deployment requires 3 pods. The HPA object increases the minimum to 5. If CPU usage on the pods reaches 75%, the pods increase to 7: diff --git a/modules/nodes-pods-autoscaling-best-practices-hpa.adoc b/modules/nodes-pods-autoscaling-best-practices-hpa.adoc index 501ed4ff60..a2a7db577c 100644 --- a/modules/nodes-pods-autoscaling-best-practices-hpa.adoc +++ b/modules/nodes-pods-autoscaling-best-practices-hpa.adoc @@ -6,14 +6,13 @@ [id="nodes-pods-autoscaling-best-practices-hpa_{context}"] = Best practices -.All pods must have resource requests configured -The HPA makes a scaling decision based on the observed CPU or memory utilization values of pods in an {product-title} cluster. Utilization values are calculated as a percentage of the resource requests of each pod. -Missing resource request values can affect the optimal performance of the HPA. +For optimal performance, configure resource requests for all pods. To prevent frequent replica fluctuations, configure the cooldown period. -.Configure the cool down period -During horizontal pod autoscaling, there might be a rapid scaling of events without a time gap. Configure the cool down period to prevent frequent replica fluctuations. -You can specify a cool down period by configuring the `stabilizationWindowSeconds` field. The stabilization window is used to restrict the fluctuation of replicas count when the metrics used for scaling keep fluctuating. -The autoscaling algorithm uses this window to infer a previous desired state and avoid unwanted changes to workload scale. 
+All pods must have resource requests configured::
+The HPA makes a scaling decision based on the observed CPU or memory usage values of pods in an {product-title} cluster. Usage values are calculated as a percentage of the resource requests of each pod. Missing resource request values can affect the optimal performance of the HPA.
+
+Configure the cool down period::
+During horizontal pod autoscaling, there might be rapid scaling events without a time gap. Configure the cool down period to prevent frequent replica fluctuations. You can specify a cool down period by configuring the `stabilizationWindowSeconds` field. The stabilization window is used to restrict the fluctuation of the replica count when the metrics used for scaling keep fluctuating. The autoscaling algorithm uses this window to infer a previous required state and avoid unwanted changes to workload scale.

For example, a stabilization window is specified for the `scaleDown` field:

@@ -24,4 +23,4 @@ behavior:
     stabilizationWindowSeconds: 300
----

-In the above example, all desired states for the past 5 minutes are considered. This approximates a rolling maximum, and avoids having the scaling algorithm frequently remove pods only to trigger recreating an equivalent pod just moments later.
+In the previous example, all intended states for the past 5 minutes are considered. This approximates a rolling maximum, and avoids having the scaling algorithm often remove pods only to trigger recreating an identical pod just moments later.
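+
+The snippet above shows only the `stabilizationWindowSeconds` line. For orientation, a complete `autoscaling/v2` object with that field might look like the following sketch. The names `example-hpa` and `example-deployment` and the CPU target are placeholder values, not values taken from this module:
+
+[source,yaml]
+----
+apiVersion: autoscaling/v2
+kind: HorizontalPodAutoscaler
+metadata:
+  name: example-hpa
+spec:
+  scaleTargetRef:
+    apiVersion: apps/v1
+    kind: Deployment
+    name: example-deployment
+  minReplicas: 2
+  maxReplicas: 10
+  metrics:
+  - type: Resource
+    resource:
+      name: cpu
+      target:
+        type: Utilization
+        averageUtilization: 75
+  behavior:
+    scaleDown:
+      stabilizationWindowSeconds: 300
+----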
diff --git a/modules/nodes-pods-autoscaling-policies.adoc b/modules/nodes-pods-autoscaling-policies.adoc index 8fd12144e3..11b72b1825 100644 --- a/modules/nodes-pods-autoscaling-policies.adoc +++ b/modules/nodes-pods-autoscaling-policies.adoc @@ -2,10 +2,11 @@ // // * nodes/nodes-pods-autoscaling.adoc +:_mod-docs-content-type: CONCEPT [id="nodes-pods-autoscaling-policies_{context}"] = Scaling policies -The `autoscaling/v2` API allows you to add _scaling policies_ to a horizontal pod autoscaler. A scaling policy controls how the {product-title} horizontal pod autoscaler (HPA) scales pods. Scaling policies allow you to restrict the rate that HPAs scale pods up or down by setting a specific number or specific percentage to scale in a specified period of time. You can also define a _stabilization window_, which uses previously computed desired states to control scaling if the metrics are fluctuating. You can create multiple policies for the same scaling direction, and determine which policy is used, based on the amount of change. You can also restrict the scaling by timed iterations. The HPA scales pods during an iteration, then performs scaling, as needed, in further iterations. +Use the `autoscaling/v2` API to add _scaling policies_ to a horizontal pod autoscaler. A scaling policy controls how the {product-title} horizontal pod autoscaler (HPA) scales pods. Use scaling policies to restrict the rate that HPAs scale pods up or down by setting a specific number or specific percentage to scale in a specified period of time. You can also define a _stabilization window_, which uses previously computed required states to control scaling if the metrics are fluctuating. You can create multiple policies for the same scaling direction, and determine the policy to use, based on the amount of change. You can also restrict the scaling by timed iterations. The HPA scales pods during an iteration, then performs scaling, as needed, in further iterations. 
.Sample HPA object with a scaling policy [source, yaml] @@ -45,8 +46,8 @@ spec: <4> Limits the amount of scaling, either the number of pods or percentage of pods, during each iteration. There is no default value for scaling down by number of pods. <5> Determines the length of a scaling iteration. The default value is `15` seconds. <6> The default value for scaling down by percentage is 100%. -<7> Determines which policy to use first, if multiple policies are defined. Specify `Max` to use the policy that allows the highest amount of change, `Min` to use the policy that allows the lowest amount of change, or `Disabled` to prevent the HPA from scaling in that policy direction. The default value is `Max`. -<8> Determines the time period the HPA should look back at desired states. The default value is `0`. +<7> Determines the policy to use first, if multiple policies are defined. Specify `Max` to use the policy that allows the highest amount of change, `Min` to use the policy that allows the lowest amount of change, or `Disabled` to prevent the HPA from scaling in that policy direction. The default value is `Max`. +<8> Determines the time period the HPA reviews the required states. The default value is `0`. <9> This example creates a policy for scaling up. <10> Limits the amount of scaling up by the number of pods. The default value for scaling up the number of pods is 4%. <11> Limits the amount of scaling up by the percentage of pods. The default value for scaling up by percentage is 100%. @@ -80,7 +81,7 @@ spec: In this example, when the number of pods is greater than 40, the percent-based policy is used for scaling down, as that policy results in a larger change, as required by the `selectPolicy`. -If there are 80 pod replicas, in the first iteration the HPA reduces the pods by 8, which is 10% of the 80 pods (based on the `type: Percent` and `value: 10` parameters), over one minute (`periodSeconds: 60`). For the next iteration, the number of pods is 72. 
The HPA calculates that 10% of the remaining pods is 7.2, which it rounds up to 8 and scales down 8 pods. On each subsequent iteration, the number of pods to be scaled is re-calculated based on the number of remaining pods. When the number of pods falls below 40, the pods-based policy is applied, because the pod-based number is greater than the percent-based number. The HPA reduces 4 pods at a time (`type: Pods` and `value: 4`), over 30 seconds (`periodSeconds: 30`), until there are 20 replicas remaining (`minReplicas`). +If there are 80 pod replicas, in the first iteration the HPA reduces the pods by 8, which is 10% of the 80 pods (based on the `type: Percent` and `value: 10` parameters), over one minute (`periodSeconds: 60`). For the next iteration, the number of pods is 72. The HPA calculates that 10% of the remaining pods is 7.2, which it rounds up to 8 and scales down 8 pods. On each subsequent iteration, the number of pods to be scaled is re-calculated based on the number of remaining pods. When the number of pods falls to less than 40, the pods-based policy is applied, because the pod-based number is greater than the percent-based number. The HPA reduces 4 pods at a time (`type: Pods` and `value: 4`), over 30 seconds (`periodSeconds: 30`), until there are 20 replicas remaining (`minReplicas`). The `selectPolicy: Disabled` parameter prevents the HPA from scaling up the pods. You can manually scale up by adjusting the number of replicas in the replica set or deployment set, if needed. 
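+
+The scale-down behavior described in the previous paragraphs can be sketched as a `behavior` stanza. The following fragment only restates the values from that discussion and omits the rest of the HPA object, such as `scaleTargetRef` and `metrics`:
+
+[source,yaml]
+----
+  minReplicas: 20
+  behavior:
+    scaleDown:
+      policies:
+      - type: Pods
+        value: 4
+        periodSeconds: 30
+      - type: Percent
+        value: 10
+        periodSeconds: 60
+      selectPolicy: Max
+    scaleUp:
+      selectPolicy: Disabled
+----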
diff --git a/modules/nodes-pods-vertical-autoscaler-about.adoc b/modules/nodes-pods-vertical-autoscaler-about.adoc index edcf8cc966..700f6b3f6e 100644 --- a/modules/nodes-pods-vertical-autoscaler-about.adoc +++ b/modules/nodes-pods-vertical-autoscaler-about.adoc @@ -6,35 +6,35 @@ [id="nodes-pods-vertical-autoscaler-about_{context}"] = About the Vertical Pod Autoscaler Operator -The Vertical Pod Autoscaler Operator (VPA) is implemented as an API resource and a custom resource (CR). The CR determines the actions that the VPA Operator should take with the pods associated with a specific workload object, such as a daemon set, replication controller, and so forth, in a project. +The Vertical Pod Autoscaler Operator (VPA) is implemented as an API resource and a custom resource (CR). The CR determines the actions for the VPA to take with the pods associated with a specific workload object, such as a daemon set, replication controller, and so forth, in a project. -The VPA Operator consists of three components, each of which has its own pod in the VPA namespace: +The VPA consists of three components, each of which has its own pod in the VPA namespace: Recommender:: -The VPA recommender monitors the current and past resource consumption and, based on this data, determines the optimal CPU and memory resources for the pods in the associated workload object. +The VPA recommender monitors the current and past resource consumption. Based on this data, the VPA recommender determines the optimal CPU and memory resources for the pods in the associated workload object. Updater:: -The VPA updater checks if the pods in the associated workload object have the correct resources. If the resources are correct, the updater takes no action. If the resources are not correct, the updater kills the pod so that they can be recreated by their controllers with the updated requests. +The VPA updater checks if the pods in the associated workload object have the correct resources. 
If the resources are correct, the updater takes no action. If the resources are not correct, the updater kills the pod so that pods' controllers can re-create them with the updated requests. Admission controller:: -The VPA admission controller sets the correct resource requests on each new pod in the associated workload object, whether the pod is new or was recreated by its controller due to the VPA updater actions. +The VPA admission controller sets the correct resource requests on each new pod in the associated workload object. This applies whether the pod is new or the controller re-created the pod due to the VPA updater actions. You can use the default recommender or use your own alternative recommender to autoscale based on your own algorithms. -The default recommender automatically computes historic and current CPU and memory usage for the containers in those pods and uses this data to determine optimized resource limits and requests to ensure that these pods are operating efficiently at all times. For example, the default recommender suggests reduced resources for pods that are requesting more resources than they are using and increased resources for pods that are not requesting enough. +The default recommender automatically computes historic and current CPU and memory usage for the containers in those pods. The default recommender uses this data to determine optimized resource limits and requests to ensure that these pods are operating efficiently at all times. For example, the default recommender suggests reduced resources for pods that are requesting more resources than they are using and increased resources for pods that are not requesting enough. -The VPA then automatically deletes any pods that are out of alignment with these recommendations one at a time, so that your applications can continue to serve requests with no downtime. The workload objects then redeploy the pods with the original resource limits and requests. 
The VPA uses a mutating admission webhook to update the pods with optimized resource limits and requests before the pods are admitted to a node. If you do not want the VPA to delete pods, you can view the VPA resource limits and requests and manually update the pods as needed. +The VPA then automatically deletes any pods that are out of alignment with these recommendations one at a time, so that your applications can continue to serve requests with no downtime. The workload objects then redeploy the pods with the original resource limits and requests. The VPA uses a mutating admission webhook to update the pods with optimized resource limits and requests before admitting the pods to a node. If you do not want the VPA to delete pods, you can view the VPA resource limits and requests and manually update the pods as needed. [NOTE] ==== -By default, workload objects must specify a minimum of two replicas in order for the VPA to automatically delete their pods. Workload objects that specify fewer replicas than this minimum are not deleted. If you manually delete these pods, when the workload object redeploys the pods, the VPA does update the new pods with its recommendations. You can change this minimum by modifying the `VerticalPodAutoscalerController` object as shown in _Changing the VPA minimum value_. +By default, workload objects must specify a minimum of two replicas for the VPA to automatically delete their pods. Workload objects that specify fewer replicas than this minimum are not deleted. If you manually delete these pods, when the workload object redeploys the pods, the VPA updates the new pods with its recommendations. You can change this minimum by modifying the `VerticalPodAutoscalerController` object as shown in _Changing the VPA minimum value_. ==== For example, if you have a pod that uses 50% of the CPU but only requests 10%, the VPA determines that the pod is consuming more CPU than requested and deletes the pod. 
The workload object, such as replica set, restarts the pods and the VPA updates the new pod with its recommended resources.

-For developers, you can use the VPA to help ensure your pods stay up during periods of high demand by scheduling pods onto nodes that have appropriate resources for each pod.
+For developers, you can use the VPA to help ensure that your pods stay active during periods of high demand by scheduling pods onto nodes that have appropriate resources for each pod.

-Administrators can use the VPA to better utilize cluster resources, such as preventing pods from reserving more CPU resources than needed. The VPA monitors the resources that workloads are actually using and adjusts the resource requirements so capacity is available to other workloads. The VPA also maintains the ratios between limits and requests that are specified in initial container configuration.
+Administrators can use the VPA to better use cluster resources, such as preventing pods from reserving more CPU resources than needed. The VPA monitors the resources that workloads are actually using and adjusts the resource requirements so capacity is available to other workloads. The VPA also maintains the ratios between limits and requests specified in the initial container configuration.

[NOTE]
====
diff --git a/modules/nodes-pods-vertical-autoscaler-configuring.adoc b/modules/nodes-pods-vertical-autoscaler-configuring.adoc
index 5a03bd31ef..262ee02d4c 100644
--- a/modules/nodes-pods-vertical-autoscaler-configuring.adoc
+++ b/modules/nodes-pods-vertical-autoscaler-configuring.adoc
@@ -6,21 +6,21 @@
[id="nodes-pods-vertical-autoscaler-configuring_{context}"]
= Using the Vertical Pod Autoscaler Operator

-You can use the Vertical Pod Autoscaler Operator (VPA) by creating a VPA custom resource (CR). The CR indicates which pods it should analyze and determines the actions the VPA should take with those pods.
+You can use the Vertical Pod Autoscaler Operator (VPA) by creating a VPA custom resource (CR). The CR indicates the pods to analyze and determines the actions for the VPA to take with those pods.

-You can use the VPA to scale built-in resources such as deployments or stateful sets, and custom resources that manage pods. For more information on using the VPA with custom resources, see "Using the Vertical Pod Autoscaler Operator with Custom Resources."
+You can use the VPA to scale built-in resources such as deployments or stateful sets, and custom resources that manage pods. For more information, see "About using the Vertical Pod Autoscaler Operator".

.Prerequisites

-* The workload object that you want to autoscale must exist.
+* Ensure the workload object that you want to autoscale exists.

-* If you want to use an alternative recommender, a deployment including that recommender must exist.
+* If you want to use an alternative recommender, ensure that a deployment that includes the recommender exists.

.Procedure

To create a VPA CR for a specific workload object:

-. Change to the project where the workload object you want to scale is located.
+. Change to the project that contains the workload object that you want to scale.

.. Create a VPA CR YAML file:
+
@@ -48,8 +48,8 @@ spec:
<2> Specify the name of an existing workload object you want this VPA to manage.
<3> Specify the VPA mode:
* `Auto` to automatically apply the recommended resources on pods associated with the controller. The VPA terminates existing pods and creates new pods with the recommended resource limits and requests.
-* `Recreate` to automatically apply the recommended resources on pods associated with the workload object. The VPA terminates existing pods and creates new pods with the recommended resource limits and requests. The `Recreate` mode should be used rarely, only if you need to ensure that the pods are restarted whenever the resource request changes.
-* `Initial` to automatically apply the recommended resources when pods associated with the workload object are created. The VPA does not update the pods as it learns new resource recommendations. +* `Recreate` to automatically apply the recommended resources on pods associated with the workload object. The VPA terminates existing pods and creates new pods with the recommended resource limits and requests. Use the `Recreate` mode rarely, only if you need to ensure that the pods restart whenever the resource request changes. +* `Initial` to automatically apply the recommended resources to newly-created pods associated with the workload object. The VPA does not update the pods as it learns new resource recommendations. * `Off` to only generate resource recommendations for the pods associated with the workload object. The VPA does not update the pods as it learns new resource recommendations and does not apply the recommendations to new pods. <4> Optional. Specify the containers you want to opt-out and set the mode to `Off`. <5> Optional. Specify an alternative recommender. @@ -63,7 +63,7 @@ $ oc create -f .yaml + After a few moments, the VPA learns the resource usage of the containers in the pods associated with the workload object. + -You can view the VPA recommendations using the following command: +You can view the VPA recommendations by using the following command: + [source,terminal] ---- diff --git a/modules/nodes-pods-vertical-autoscaler-custom-resource.adoc b/modules/nodes-pods-vertical-autoscaler-custom-resource.adoc index c3a798f760..4ea0683677 100644 --- a/modules/nodes-pods-vertical-autoscaler-custom-resource.adoc +++ b/modules/nodes-pods-vertical-autoscaler-custom-resource.adoc @@ -8,7 +8,7 @@ The Vertical Pod Autoscaler Operator (VPA) can update not only built-in resources such as deployments or stateful sets, but also custom resources that manage pods. 
-In order to use the VPA with a custom resource, when you create the `CustomResourceDefinition` (CRD) object, you must configure the `labelSelectorPath` field in the `/scale` subresource. The `/scale` subresource creates a `Scale` object. The `labelSelectorPath` field defines the JSON path inside the custom resource that corresponds to `Status.Selector` in the `Scale` object and in the custom resource. The following is an example of a `CustomResourceDefinition` and a `CustomResource` that fulfills these requirements, along with a `VerticalPodAutoscaler` definition that targets the custom resource. The following example shows the `/scale` subresource contract.
+To use the VPA with a custom resource, you must configure the `labelSelectorPath` field in the `/scale` subresource when you create the `CustomResourceDefinition` (CRD) object. The `/scale` subresource creates a `Scale` object. The `labelSelectorPath` field defines the JSON path inside the custom resource that corresponds to `status.selector` in the `Scale` object and in the custom resource. The following is an example of a `CustomResourceDefinition` and a `CustomResource` that fulfills these requirements, along with a `VerticalPodAutoscaler` definition that targets the custom resource. The following example shows the `/scale` subresource contract.

[NOTE]
====
@@ -73,7 +73,7 @@ spec:
  selector: "app=scalable-cr" <1>
  replicas: 1
----
-<1> Specify the label type to apply to managed pods. This is the field referenced by the `labelSelectorPath` in the custom resource definition object.
.Example VPA object [source,yaml] diff --git a/modules/nodes-pods-vertical-autoscaler-custom.adoc b/modules/nodes-pods-vertical-autoscaler-custom.adoc index 9f62791cd5..e1661a0e09 100644 --- a/modules/nodes-pods-vertical-autoscaler-custom.adoc +++ b/modules/nodes-pods-vertical-autoscaler-custom.adoc @@ -8,7 +8,7 @@ You can use your own recommender to autoscale based on your own algorithms. If you do not specify an alternative recommender, {product-title} uses the default recommender, which suggests CPU and memory requests based on historical usage. Because there is no universal recommendation policy that applies to all types of workloads, you might want to create and deploy different recommenders for specific workloads. -For example, the default recommender might not accurately predict future resource usage when containers exhibit certain resource behaviors, such as cyclical patterns that alternate between usage spikes and idling as used by monitoring applications, or recurring and repeating patterns used with deep learning applications. Using the default recommender with these usage behaviors might result in significant over-provisioning and Out of Memory (OOM) kills for your applications. +For example, the default recommender might not accurately predict future resource usage when containers exhibit certain resource behaviors. Examples are cyclical patterns that alternate between usage spikes and idling as used by monitoring applications, or recurring and repeating patterns used with deep learning applications. Using the default recommender with these usage behaviors might result in significant over-provisioning and Out of Memory (OOM) kills for your applications. 
// intro paragraph based on https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler/enhancements/3919-customized-recommender-vpa

@@ -70,12 +70,12 @@ subjects:
name: alt-vpa-recommender-sa
namespace: 
----
-<1> Creates a service account for the recommender in the namespace where the recommender is deployed.
-<2> Binds the recommender service account to the `metrics-reader` role. Specify the namespace where the recommender is to be deployed.
-<3> Binds the recommender service account to the `vpa-actor` role. Specify the namespace where the recommender is to be deployed.
-<4> Binds the recommender service account to the `vpa-target-reader` role. Specify the namespace where the recommender is to be deployed.
+<1> Creates a service account for the recommender in the namespace where you deploy the recommender.
+<2> Binds the recommender service account to the `metrics-reader` role. Specify the namespace where you deploy the recommender.
+<3> Binds the recommender service account to the `vpa-actor` role. Specify the namespace where you deploy the recommender.
+<4> Binds the recommender service account to the `vpa-target-reader` role. Specify the namespace where you deploy the recommender.

-. To add the alternative recommender to the cluster, create a Deployment object similar to the following:
+. To add the alternative recommender to the cluster, create a `Deployment` object similar to the following:
+
[source,yaml]
----
@@ -143,7 +143,7 @@
frontend-845d5478d-b7l4j 1/1 Running 0 4m25s
vpa-alt-recommender-55878867f9-6tp5v 1/1 Running 0 9s
----

-. Configure a VPA CR that includes the name of the alternative recommender `Deployment` object.
+ .Example VPA CR to include the alternative recommender [source,yml] diff --git a/modules/nodes-pods-vertical-autoscaler-install.adoc b/modules/nodes-pods-vertical-autoscaler-install.adoc index 3615b691c8..7fd3e200c3 100644 --- a/modules/nodes-pods-vertical-autoscaler-install.adoc +++ b/modules/nodes-pods-vertical-autoscaler-install.adoc @@ -28,9 +28,9 @@ is automatically created if it does not exist. . Click *Install*. -.Verifiction +.Verification -. Verify the installation by listing the VPA Operator components: +. Verify the installation by listing the VPA components: .. Navigate to *Workloads* -> *Pods*. diff --git a/modules/nodes-pods-vertical-autoscaler-moving-vpa.adoc b/modules/nodes-pods-vertical-autoscaler-moving-vpa.adoc index a5c1255cc6..73c74a7674 100644 --- a/modules/nodes-pods-vertical-autoscaler-moving-vpa.adoc +++ b/modules/nodes-pods-vertical-autoscaler-moving-vpa.adoc @@ -20,7 +20,7 @@ endif::machinemgmt[] ifdef::vpa[] The Vertical Pod Autoscaler Operator (VPA) and each component has its own pod in the VPA namespace on the control plane nodes. You can move the VPA Operator and component pods to infrastructure or worker nodes by adding a node selector to the VPA subscription and the `VerticalPodAutoscalerController` CR. -You can create and use infrastructure nodes to host only infrastructure components, such as the default router, the integrated container image registry, and the components for cluster metrics and monitoring. These infrastructure nodes are not counted toward the total number of subscriptions that are required to run the environment. For more information, see _Creating infrastructure machine sets_. +You can create and use infrastructure nodes to host only infrastructure components. For example, the default router, the integrated container image registry, and the components for cluster metrics and monitoring. These infrastructure nodes are not counted toward the total number of subscriptions that are required to run the environment. 
For more information, see _Creating infrastructure machine sets_. You can move the components to the same node or separate nodes as appropriate for your organization. endif::vpa[] @@ -34,7 +34,7 @@ NAME READY STATUS RESTARTS vertical-pod-autoscaler-operator-6c75fcc9cd-5pb6z 1/1 Running 0 7m59s 10.128.2.24 c416-tfsbj-master-1 vpa-admission-plugin-default-6cb78d6f8b-rpcrj 1/1 Running 0 5m37s 10.129.2.22 c416-tfsbj-master-1 vpa-recommender-default-66846bd94c-dsmpp 1/1 Running 0 5m37s 10.129.2.20 c416-tfsbj-master-0 -vpa-updater-default-db8b58df-2nkvf 1/1 Running 0 5m37s 10.129.2.21 c416-tfsbj-master-1 +vpa-updater-default-db8b58df-2nkvf 1/1 Running 0 5m37s 10.129.2.21 c416-tfsbj-master-1 ---- .Procedure @@ -340,7 +340,7 @@ endif::vpa[] $ oc get pods -n openshift-vertical-pod-autoscaler -o wide ---- + -The pods are no longer deployed to the control plane nodes. In the example output below, the node is now an infra node, not a control plane node. +The pods are no longer deployed to the control plane nodes. In the following example output, the node is now an infra node, not a control plane node. 
+ .Example output [source,terminal] @@ -349,7 +349,7 @@ NAME READY STATUS RESTARTS vertical-pod-autoscaler-operator-6c75fcc9cd-5pb6z 1/1 Running 0 7m59s 10.128.2.24 c416-tfsbj-infra-eastus3-2bndt vpa-admission-plugin-default-6cb78d6f8b-rpcrj 1/1 Running 0 5m37s 10.129.2.22 c416-tfsbj-infra-eastus1-lrgj8 vpa-recommender-default-66846bd94c-dsmpp 1/1 Running 0 5m37s 10.129.2.20 c416-tfsbj-infra-eastus1-lrgj8 -vpa-updater-default-db8b58df-2nkvf 1/1 Running 0 5m37s 10.129.2.21 c416-tfsbj-infra-eastus1-lrgj8 +vpa-updater-default-db8b58df-2nkvf 1/1 Running 0 5m37s 10.129.2.21 c416-tfsbj-infra-eastus1-lrgj8 ---- ifeval::["{context}" == "nodes-pods-vertical-autoscaler"] diff --git a/modules/nodes-pods-vertical-autoscaler-oom.adoc b/modules/nodes-pods-vertical-autoscaler-oom.adoc index 9780d62f82..c85a813ea7 100644 --- a/modules/nodes-pods-vertical-autoscaler-oom.adoc +++ b/modules/nodes-pods-vertical-autoscaler-oom.adoc @@ -6,9 +6,9 @@ [id="nodes-pods-vertical-autoscaler-oom_{context}"] = Custom memory bump-up after OOM event -If your cluster experiences an OOM (out of memory) event, the Vertical Pod Autoscaler Operator (VPA) increases the memory recommendation based on the memory consumption observed during the OOM event and a specified multiplier value in order to prevent future crashes due to insufficient memory. +If your cluster experiences an OOM (out of memory) event, the Vertical Pod Autoscaler Operator (VPA) increases the memory recommendation. The basis for the recommendation is the memory consumption observed during the OOM event and a specified multiplier value to prevent future crashes due to insufficient memory. -The recommendation is the higher of two calculations: the memory in use by the pod when the OOM event happened multiplied by a specified number of bytes or a specified percentage. 
The calculation is represented by the following formula:
+The recommendation is the higher of two calculations: the memory in use by the pod when the OOM event happened, either increased by a specified number of bytes or multiplied by a specified ratio. The following formula represents the calculation:
 
 [source,text]
 ----
 recommendation = max(memory-usage-in-oom-event + oom-min-bump-up-bytes, memory-usage-in-oom-event * oom-bump-up-ratio)
@@ -18,9 +18,9 @@ recommendation = max(memory-usage-in-oom-event + oom-min-bump-up-bytes, memory-usage-in-oom-event * oom-bump-up-ratio)
 
 You can configure the memory increase by specifying the following values in the recommender pod:
 
 * `oom-min-bump-up-bytes`. This value, in bytes, is a specific increase in memory after an OOM event occurs. The default is `100MiB`.
-* `oom-bump-up-ratio`. This value is a percentage increase in memory when the OOM event occurred. The default value is `1.2`.
+* `oom-bump-up-ratio`. This value is a multiplier applied to the memory in use when the OOM event occurred. The default value is `1.2`.
 
-For example, if the pod memory usage during an OOM event is 100MB, and `oom-min-bump-up-bytes` is set to 150MB with a `oom-min-bump-ratio` of 1.2, after an OOM event, the VPA would recommend increasing the memory request for that pod to 150 MB, as it is higher than at 120MB (100MB * 1.2).
+For example, if the pod memory usage during an OOM event is 100 MB and `oom-min-bump-up-bytes` is set to 150 MB with an `oom-bump-up-ratio` of 1.2, after an OOM event the VPA recommends increasing the memory request for that pod to 250 MB (100 MB + 150 MB), because that value is higher than 120 MB (100 MB * 1.2).
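As a quick check of the formula, the bump-up calculation can be sketched in a few lines of Python. This is an illustrative sketch only, not the recommender's actual implementation; the constants mirror the documented defaults.

```python
# Illustrative sketch of the VPA OOM memory bump-up formula; not the
# recommender's actual code (the real recommender is written in Go).

OOM_MIN_BUMP_UP_BYTES = 100 * 1024 * 1024  # documented default: 100MiB
OOM_BUMP_UP_RATIO = 1.2                    # documented default: 1.2

def oom_memory_recommendation(memory_usage_in_oom_event: int,
                              min_bump_bytes: int = OOM_MIN_BUMP_UP_BYTES,
                              ratio: float = OOM_BUMP_UP_RATIO) -> int:
    """Return the higher of the two bump-up calculations, in bytes."""
    return int(max(memory_usage_in_oom_event + min_bump_bytes,
                   memory_usage_in_oom_event * ratio))

MB = 1000 * 1000
# For small pods, the fixed byte bump dominates: 100 MB + 100 MiB.
print(oom_memory_recommendation(100 * MB))   # 204857600
# For large pods, the ratio dominates: 1000 MB * 1.2.
print(oom_memory_recommendation(1000 * MB))  # 1200000000
```

The two calls show why both terms exist: the fixed bump guarantees a meaningful increase for small pods, while the ratio scales the increase for large ones.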
.Example recommender deployment object

diff --git a/modules/nodes-pods-vertical-autoscaler-tuning.adoc b/modules/nodes-pods-vertical-autoscaler-tuning.adoc
index 687930fc63..39d6c2c39b 100644
--- a/modules/nodes-pods-vertical-autoscaler-tuning.adoc
+++ b/modules/nodes-pods-vertical-autoscaler-tuning.adoc
@@ -8,29 +8,29 @@
 
 As a cluster administrator, you can tune the performance of your Vertical Pod Autoscaler Operator (VPA) to limit the rate at which the VPA makes requests of the Kubernetes API server and to specify the CPU and memory resources for the VPA recommender, updater, and admission controller component pods.
 
-Additionally, you can configure the VPA Operator to monitor only those workloads that are being managed by a VPA custom resource (CR). By default, the VPA Operator monitors every workload in the cluster. This allows the VPA Operator to accrue and store 8 days of historical data for all workloads, which the Operator can use if a new VPA CR is created for a workload. However, this causes the VPA Operator to use significant CPU and memory, which could cause the Operator to fail, particularly on larger clusters. By configuring the VPA Operator to monitor only workloads with a VPA CR, you can save on CPU and memory resources. One trade-off is that if you have a workload that has been running, and you create a VPA CR to manage that workload, the VPA Operator does not have any historical data for that workload. As a result, the initial recommendations are not as useful as those after the workload has been running for some time.
+You can also configure the VPA to monitor only those workloads that a VPA custom resource (CR) manages. By default, the VPA monitors every workload in the cluster. As a result, the VPA accrues and stores 8 days of historical data for all workloads. The VPA can use this data if a new VPA CR is created for a workload. However, this causes the VPA to use significant CPU and memory.
This can cause the VPA to fail, particularly on larger clusters. By configuring the VPA to monitor only workloads with a VPA CR, you can save on CPU and memory resources. One tradeoff is that if you have a running workload and you create a VPA CR to manage that workload, the VPA does not have any historical data for that workload. As a result, the initial recommendations are not as useful as those after the workload is running for some time.
 
-These tunings allow you to ensure the VPA has sufficient resources to operate at peak efficiency and to prevent throttling and a possible delay in pod admissions.
+Use these tunings to ensure the VPA has enough resources to operate at peak efficiency and to prevent throttling and possible delays in pod admissions.
 
 You can perform the following tunings on the VPA components by editing the `VerticalPodAutoscalerController` custom resource (CR):
 
 * To prevent throttling and pod admission delays, set the queries per second (QPS) and burst rates for VPA requests of the Kubernetes API server by using the `kube-api-qps` and `kube-api-burst` parameters.
 
-* To ensure sufficient CPU and memory, set the CPU and memory requests for VPA component pods by using the standard `cpu` and `memory` resource requests.
+* To ensure enough CPU and memory, set the CPU and memory requests for VPA component pods by using the standard `cpu` and `memory` resource requests.
 
-* To configure the VPA Operator to monitor only workloads that are being managed by a VPA CR, set the `memory-saver` parameter to `true` for the recommender component.
+* To configure the VPA to monitor only workloads that the VPA CR manages, set the `memory-saver` parameter to `true` for the recommender component.
 
 For guidelines on the resources and rate limits that you could set for each VPA component, the following tables provide recommended baseline values, depending on the size of your cluster and other factors.
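For orientation before the baseline tables, the following sketch shows where these three tunings live in the `VerticalPodAutoscalerController` CR. The values are illustrative only, and the `deploymentOverrides` structure and argument names are assumptions to verify against your cluster's CRD and the full example CR later in this module.

```yaml
# Illustrative sketch only: verify the field names against your cluster's
# VerticalPodAutoscalerController CRD before applying.
apiVersion: autoscaling.openshift.io/v1
kind: VerticalPodAutoscalerController
metadata:
  name: default
  namespace: openshift-vertical-pod-autoscaler
spec:
  deploymentOverrides:
    recommender:
      container:
        args:
        - '--kube-api-qps=20.0'    # QPS rate for requests to the API server
        - '--kube-api-burst=60.0'  # burst rate for requests to the API server
        - '--memory-saver=true'    # monitor only workloads managed by a VPA CR
        resources:
          requests:                # standard cpu and memory requests
            cpu: 60m
            memory: 500Mi
```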
[IMPORTANT] ==== -These recommended values were derived from internal Red{nbsp}Hat testing on clusters that are not necessarily representative of real-world clusters. You should test these values in a non-production cluster before configuring a production cluster. +These recommended values derive from internal Red{nbsp}Hat testing on clusters that are not necessarily representative of real-world clusters. Before you configure a production cluster, ensure you test these values in a non-production cluster. ==== .Requests by containers in the cluster [cols="1,1,1,1,1,1,1,1,1,5,5"] |=== -| Component 2+| 1-500 containers 2+| 500-1000 containers 2+| 1000-2000 containers 2+| 2000-4000 containers 2+| 4000+ containers +| Component 2+| 1-500 containers 2+| 500-1,000 containers 2+| 1,000-2,000 containers 2+| 2,000-4,000 containers 2+| 4,000+ containers | | *CPU* @@ -92,7 +92,7 @@ It is recommended that you set the memory limit on your containers to at least d .Rate limits by VPAs in the cluster [cols="1,3,2,3,2,3,2,3,2"] |=== -| Component 2+| 1 - 150 VPAs 2+| 151 - 500 VPAs 2+| 501-2000 VPAs 2+| 2001-4000 VPAs +| Component 2+| 1-150 VPAs 2+| 151-500 VPAs 2+| 501-2,000 VPAs 2+| 2,001-4,000 VPAs | | *QPS Limit* ^[1]^ @@ -131,7 +131,7 @@ s| Updater [NOTE] ==== -If you have more than 4000 VPAs in your cluster, it is recommended that you start performance tuning with the values in the table and slowly increase the values until you achieve the desired recommender and updater latency and performance. You should adjust these values slowly because increased QPS and Burst could affect the cluster health and slow down the Kubernetes API server if too many API requests are being sent to the API server from the VPA components. +If you have more than 4,000 VPAs in your cluster, it is recommended that you start performance tuning with the values in the table and slowly increase the values until you achieve the required recommender and updater latency and performance. 
Adjust these values slowly because increased QPS and Burst can affect cluster health and slow down the Kubernetes API server if too many API requests are sent to the API server from the VPA components. ==== //// @@ -175,7 +175,7 @@ The admission pod can get throttled if you are using the VPA on custom resources ==== //// -The following example VPA controller CR is for a cluster with 1000 to 2000 containers and a pod creation surge of 26 to 50. The CR sets the following values: +The following example VPA controller CR is for a cluster with 1,000 to 2,000 containers and a pod creation surge of 26 to 50. The CR sets the following values: * The container memory and CPU requests for all three VPA components * The container memory limit for all three VPA components diff --git a/modules/nodes-pods-vertical-autoscaler-uninstall.adoc b/modules/nodes-pods-vertical-autoscaler-uninstall.adoc index 0c633adcbd..c1d928d102 100644 --- a/modules/nodes-pods-vertical-autoscaler-uninstall.adoc +++ b/modules/nodes-pods-vertical-autoscaler-uninstall.adoc @@ -6,18 +6,18 @@ [id="nodes-pods-vertical-autoscaler-uninstall_{context}"] = Uninstalling the Vertical Pod Autoscaler Operator -You can remove the Vertical Pod Autoscaler Operator (VPA) from your {product-title} cluster. After uninstalling, the resource requests for the pods already modified by an existing VPA CR do not change. Any new pods get the resources defined in the workload object, not the previous recommendations made by the Vertical Pod Autoscaler Operator. +You can remove the Vertical Pod Autoscaler Operator (VPA) from your {product-title} cluster. After uninstalling, the resource requests for the pods that are already modified by an existing VPA custom resource (CR) do not change. The resources defined in the workload object, not the previous recommendations made by the VPA, are allocated to any new pods. [NOTE] ==== You can remove a specific VPA CR by using the `oc delete vpa ` command. 
The same actions apply for resource requests as uninstalling the vertical pod autoscaler. ==== -After removing the VPA Operator, it is recommended that you remove the other components associated with the Operator to avoid potential issues. +After removing the VPA, it is recommended that you remove the other components associated with the Operator to avoid potential issues. .Prerequisites -* The Vertical Pod Autoscaler Operator must be installed. +* You installed the VPA. .Procedure diff --git a/modules/nodes-pods-vertical-autoscaler-using-about.adoc b/modules/nodes-pods-vertical-autoscaler-using-about.adoc index 1dc4b70174..d88d6606f5 100644 --- a/modules/nodes-pods-vertical-autoscaler-using-about.adoc +++ b/modules/nodes-pods-vertical-autoscaler-using-about.adoc @@ -4,15 +4,15 @@ :_mod-docs-content-type: CONCEPT [id="nodes-pods-vertical-autoscaler-using-about_{context}"] -= About Using the Vertical Pod Autoscaler Operator += About using the Vertical Pod Autoscaler Operator -To use the Vertical Pod Autoscaler Operator (VPA), you create a VPA custom resource (CR) for a workload object in your cluster. The VPA learns and applies the optimal CPU and memory resources for the pods associated with that workload object. You can use a VPA with a deployment, stateful set, job, daemon set, replica set, or replication controller workload object. The VPA CR must be in the same project as the pods you want to monitor. +To use the Vertical Pod Autoscaler Operator (VPA), you create a VPA custom resource (CR) for a workload object in your cluster. The VPA learns and applies the optimal CPU and memory resources for the pods associated with that workload object. You can use a VPA with a deployment, stateful set, job, daemon set, replica set, or replication controller workload object. The VPA CR must be in the same project as the pods that you want to check. 
-You use the VPA CR to associate a workload object and specify which mode the VPA operates in: +You use the VPA CR to associate a workload object and specify the mode that the VPA operates in: * The `Auto` and `Recreate` modes automatically apply the VPA CPU and memory recommendations throughout the pod lifetime. The VPA deletes any pods in the project that are out of alignment with its recommendations. When redeployed by the workload object, the VPA updates the new pods with its recommendations. * The `Initial` mode automatically applies VPA recommendations only at pod creation. -* The `Off` mode only provides recommended resource limits and requests, allowing you to manually apply the recommendations. The `Off` mode does not update pods. +* The `Off` mode only provides recommended resource limits and requests. You can then manually apply the recommendations. The `Off` mode does not update pods. You can also use the CR to opt-out certain containers from VPA evaluation and updates. @@ -42,7 +42,7 @@ resources: memory: 262144k ---- -You can view the VPA recommendations using the following command: +You can view the VPA recommendations by using the following command: [source,terminal] ---- @@ -90,18 +90,18 @@ status: The output shows the recommended resources, `target`, the minimum recommended resources, `lowerBound`, the highest recommended resources, `upperBound`, and the most recent resource recommendations, `uncappedTarget`. -The VPA uses the `lowerBound` and `upperBound` values to determine if a pod needs to be updated. If a pod has resource requests below the `lowerBound` values or above the `upperBound` values, the VPA terminates and recreates the pod with the `target` values. +The VPA uses the `lowerBound` and `upperBound` values to determine if a pod needs updating. If a pod has resource requests less than the `lowerBound` values or more than the `upperBound` values, the VPA terminates and recreates the pod with the `target` values. 
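The bounds check described above amounts to a simple predicate. The following Python sketch is illustrative only; the actual updater is part of the VPA codebase, and the function name and numeric values here are invented for the example.

```python
# Illustrative sketch of the documented update decision; not the
# updater's actual code.

def needs_update(request: int, lower_bound: int, upper_bound: int) -> bool:
    """A pod is out of alignment if a resource request falls outside
    the [lowerBound, upperBound] recommendation range."""
    return request < lower_bound or request > upper_bound

# Example: a CPU request checked against a 25m-262m recommendation range
print(needs_update(50, 25, 262))   # False: within bounds, pod is left alone
print(needs_update(500, 25, 262))  # True: above upperBound, pod is recreated
```

When the predicate is true, the VPA recreates the pod with the `target` values rather than the bound it violated.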
[id="nodes-pods-vertical-autoscaler-using-one-pod_{context}"] == Changing the VPA minimum value -By default, workload objects must specify a minimum of two replicas in order for the VPA to automatically delete and update their pods. As a result, workload objects that specify fewer than two replicas are not automatically acted upon by the VPA. The VPA does update new pods from these workload objects if the pods are restarted by some process external to the VPA. You can change this cluster-wide minimum value by modifying the `minReplicas` parameter in the `VerticalPodAutoscalerController` custom resource (CR). +By default, workload objects must specify a minimum of two replicas in order for the VPA to automatically delete and update their pods. As a result, workload objects that specify fewer than two replicas are not automatically acted upon by the VPA. The VPA does update new pods from these workload objects if a process external to the VPA restarts the pods. You can change this cluster-wide minimum value by modifying the `minReplicas` parameter in the `VerticalPodAutoscalerController` custom resource (CR). For example, if you set `minReplicas` to `3`, the VPA does not delete and update pods for workload objects that specify fewer than three replicas. [NOTE] ==== -If you set `minReplicas` to `1`, the VPA can delete the only pod for a workload object that specifies only one replica. You should use this setting with one-replica objects only if your workload can tolerate downtime whenever the VPA deletes a pod to adjust its resources. To avoid unwanted downtime with one-replica objects, configure the VPA CRs with the `podUpdatePolicy` set to `Initial`, which automatically updates the pod only when it is restarted by some process external to the VPA, or `Off`, which allows you to update the pod manually at an appropriate time for your application. +If you set `minReplicas` to `1`, the VPA can delete the only pod for a workload object that specifies only one replica. 
Use this setting with one-replica objects only if your workload can tolerate downtime whenever the VPA deletes a pod to adjust its resources. To avoid unwanted downtime with one-replica objects, configure the VPA CRs with the `podUpdatePolicy` set to `Initial`, which automatically updates the pod only when a process external to the VPA restarts the pod, or `Off`, which you can use to update the pod manually at an appropriate time for your application.
 ====
 
 .Example `VerticalPodAutoscalerController` object
@@ -156,7 +156,7 @@ spec:
 <2> The name of the workload object you want this VPA CR to manage.
 <3> Set the mode to `Auto` or `Recreate`:
 * `Auto`. The VPA assigns resource requests on pod creation and updates the existing pods by terminating them when the requested resources differ significantly from the new recommendation.
-* `Recreate`. The VPA assigns resource requests on pod creation and updates the existing pods by terminating them when the requested resources differ significantly from the new recommendation. This mode should be used rarely, only if you need to ensure that the pods are restarted whenever the resource request changes.
+* `Recreate`. The VPA assigns resource requests on pod creation and updates the existing pods by terminating them when the requested resources differ significantly from the new recommendation. Use this mode rarely, only if you need to ensure that the pods restart whenever the resource request changes.
 
 [NOTE]
 ====
@@ -223,14 +223,14 @@ spec:
 <2> The name of the workload object you want this VPA CR to manage.
 <3> Set the mode to `Off`.
 
-You can view the recommendations using the following command.
+You can view the recommendations by using the following command.
 
 [source,terminal]
 ----
 $ oc get vpa --output yaml
 ----
 
-With the recommendations, you can edit the workload object to add CPU and memory requests, then delete and redeploy the pods using the recommended resources.
+With the recommendations, you can edit the workload object to add CPU and memory requests, then delete and redeploy the pods by using the recommended resources.
 
 [NOTE]
 ====
@@ -266,7 +266,7 @@ spec:
 ----
 <1> The type of workload object you want this VPA CR to manage.
 <2> The name of the workload object you want this VPA CR to manage.
-<3> Set the mode to `Auto`, `Recreate`, `Initial`, or `Off`. The `Recreate` mode should be used rarely, only if you need to ensure that the pods are restarted whenever the resource request changes.
+<3> Set the mode to `Auto`, `Recreate`, `Initial`, or `Off`. Use the `Recreate` mode rarely, only if you need to ensure that the pods restart whenever the resource request changes.
 <4> Specify the containers that you do not want updated by the VPA and set the `mode` to `Off`.
 
 For example, a pod has two containers, the same resource requests and limits:
 
diff --git a/nodes/pods/nodes-pods-vertical-autoscaler.adoc b/nodes/pods/nodes-pods-vertical-autoscaler.adoc
index 1db024dec8..8edac31e61 100644
--- a/nodes/pods/nodes-pods-vertical-autoscaler.adoc
+++ b/nodes/pods/nodes-pods-vertical-autoscaler.adoc
@@ -8,7 +8,7 @@
 
 toc::[]
 
-The {product-title} Vertical Pod Autoscaler Operator (VPA) automatically reviews the historic and current CPU and memory resources for containers in pods and can update the resource limits and requests based on the usage values it learns. The VPA uses individual custom resources (CR) to update all of the pods in a project that are associated with any built-in workload objects, including the following object types:
+The {product-title} Vertical Pod Autoscaler Operator (VPA) automatically reviews the historic and current CPU and memory resources for containers in pods. The VPA can update the resource limits and requests based on the usage values it learns. By using individual custom resources (CR), the VPA updates all the pods in a project associated with any built-in workload objects.
This includes the following object types:
 
 * `Deployment`
 * `DeploymentConfig`
@@ -18,7 +18,7 @@ The {product-title} Vertical Pod Autoscaler Operator (VPA) automatically reviews
 * `ReplicaSet`
 * `ReplicationController`
 
-The VPA can also update certain custom resource object that manage pods, as described in xref:../../nodes/pods/nodes-pods-vertical-autoscaler.adoc#nodes-pods-vertical-autoscaler-custom-resource_nodes-pods-vertical-autoscaler[Using the Vertical Pod Autoscaler Operator with Custom Resources].
+The VPA can also update certain custom resource objects that manage pods. For more information, see xref:../../nodes/pods/nodes-pods-vertical-autoscaler.adoc#nodes-pods-vertical-autoscaler-custom-resource_nodes-pods-vertical-autoscaler[Example custom resources for the Vertical Pod Autoscaler].
 
 The VPA helps you to understand the optimal CPU and memory usage for your pods and can automatically maintain pod resources through the pod lifecycle.
 
@@ -36,7 +36,7 @@ include::modules/nodes-pods-vertical-autoscaler-moving-vpa.adoc[leveloffset=+1]
 
 .Additional resources
 
-* xref:../../machine_management/creating-infrastructure-machinesets.adoc#creating-infrastructure-machinesets-production[Creating infrastructure machine sets]
+* xref:../../machine_management/creating-infrastructure-machinesets.adoc#creating-infrastructure-machinesets-production[Creating infrastructure machine sets]

include::modules/nodes-pods-vertical-autoscaler-using-about.adoc[leveloffset=+1]