openshift-docs/modules/cluster-autoscaler-about.adoc

// Module included in the following assemblies:
//
// * machine_management/applying-autoscaling.adoc

[id="cluster-autoscaler-about-{context}"]
= About the ClusterAutoscaler

The ClusterAutoscaler adjusts the size of an {product-title} cluster to meet
its current deployment needs. It uses declarative, Kubernetes-style arguments to
provide infrastructure management that does not rely on objects of a specific
cloud provider. The ClusterAutoscaler has a cluster scope, and is not associated
with a particular namespace.

The ClusterAutoscaler increases the size of the cluster when there are pods
that failed to schedule on any of the current nodes due to insufficient
resources or when another node is necessary to meet deployment needs. The
ClusterAutoscaler does not increase the cluster resources beyond the limits
that you specify.

The ClusterAutoscaler decreases the size of the cluster when some nodes are
consistently not needed for a significant period, such as when it has low
resource use and all of its important pods can fit on other nodes.

If the following types of pods are present on a node, the ClusterAutoscaler
will not remove the node:

* Pods with restrictive PodDisruptionBudgets (PDBs).
* Kube-system pods that do not run on the node by default.
* Kube-system pods that do not have a PDBB or have a PDB that is too restrictive.
* Pods that are not backed by a controller object such as a Deployment,
ReplicaSet, or StatefulSet.
* Pods with local storage.
* Pods that cannot be moved elsewhere because of a lack of resources,
incompatible node selectors or affinity, matching anti-affinity, and so on.
* Unless they also have a `"cluster-autoscaler.kubernetes.io/safe-to-evict": "true"`
annotation, pods that have a `"cluster-autoscaler.kubernetes.io/safe-to-evict": "false"`
annotation.

If you configure the ClusterAutoscaler, additional usage restrictions apply:

* Do not modify the nodes that are in autoscaled node groups directly. All nodes
within the same node group have the same capacity and labels and run the same
system pods.
* Specify requests for your pods.
* If you have to prevent pods from being deleted too quickly, configure
appropriate PDBs.
* Confirm that your cloud provider quota is large enough to support the
maximum node pools that you configure.
* Do not run additional node group autoscalers, especially the ones offered by
your cloud provider.


The Horizontal Pod Autoscaler (HPA) and the ClusterAutoscaler modify cluster
resources in different ways. The HPA changes the deployment's or ReplicaSet's
number of replicas based on the current CPU load.
If the load increases, the HPA creates new replicas, regardless of the amount
of resources available to the cluster.
If there are not enough resources, the ClusterAutoscaler adds resources so that
the HPA-created pods can run.
If the load decreases, the HPA stops some replicas. If this action causes some
nodes to be underutilized or completely empty, the ClusterAutoscaler deletes
the unnecessary nodes.


The ClusterAutoscaler takes pod priorities into account. The Pod Priority and
Preemption feature enables scheduling pods based on priorities if the cluster
does not have enough resources, but the ClusterAutoscaler ensures that the
cluster has resources to run all pods. To honor the intention of both features,
the ClusterAutoscaler inclues a priority cutoff. You can use this cutoff to
schedule "best-effort" pods, which do not cause the ClusterAutoscaler to
increase resources but instead run only when spare resources are available.

Pods with priority lower than the cutoff value do not cause the cluster to scale
up or prevent the cluster from scaling down. No new nodes are added to run the
pods, and nodes running these pods might be deleted to free resources.

////
Default priority cutoff is 0. It can be changed using `--expendable-pods-priority-cutoff` flag,
but we discourage it.
ClusterAutoscaler also doesn't trigger scale-up if an unschedulable pod is already waiting for a lower
priority pod preemption.
////