openshift-docs/modules/recommended-scale-practices.adoc

// Module included in the following assemblies:
//
// * scalability_and_performance/recommended-performance-scale-practices/recommended-control-plane-practices.adoc

[id="recommended-scale-practices_{context}"]
= Recommended practices for scaling the cluster

The guidance in this section is only relevant for installations with cloud provider integration.

Apply the following best practices to scale the number of worker machines in your {product-title} cluster. You scale the worker machines by increasing or decreasing the number of replicas that are defined in the worker machine set.

When scaling up the cluster to higher node counts:

* Spread nodes across all of the available zones for higher availability.
* Scale up by no more than 25 to 50 machines at once.
* Consider creating new compute machine sets in each available zone with alternative instance types of similar size to help mitigate any periodic provider capacity constraints. For example, on AWS, use m5.large and m5d.large.

[NOTE]
====
Cloud providers might implement a quota for API services. Therefore, gradually scale the cluster.
====

The controller might not be able to create the machines if the replicas in the compute machine sets are set to higher numbers all at one time. The number of requests the cloud platform, which {product-title} is deployed on top of, is able to handle impacts the process. The controller will start to query more while trying to create, check, and update the machines with the status. The cloud platform on which {product-title} is deployed has API request limits;  excessive queries might lead to machine creation failures due to cloud platform limitations.

Enable machine health checks when scaling to large node counts. In case of failures, the health checks monitor the condition and automatically repair unhealthy machines.

[NOTE]
====
When scaling large and dense clusters to lower node counts, it might take large amounts of time because the process involves draining or evicting the objects running on the nodes being terminated in parallel. Also, the client might start to throttle the requests if there are too many objects to evict. The default client queries per second (QPS) and burst rates are currently set to `50` and `100` respectively. These values cannot be modified in {product-title}.
====