// Module included in the following assemblies:
//
// * scalability_and_performance/recommended-performance-scale-practices/recommended-control-plane-practices.adoc
// * post_installation_configuration/node-tasks.adoc

[id="master-node-sizing_{context}"]
= Control plane node sizing

The control plane node resource requirements depend on the number and type of nodes and objects in the cluster. The following control plane node size recommendations are based on the results of a control plane density focused test, or _Cluster-density_. This test creates the following objects across a given number of namespaces, tallied in the sketch after this list:

- 1 image stream
- 1 build
- 5 deployments, with 2 pod replicas in a `sleep` state, mounting 4 secrets, 4 config maps, and 1 downward API volume each
- 5 services, each one pointing to the TCP/8080 and TCP/8443 ports of one of the previous deployments
- 1 route pointing to the first of the previous services
- 10 secrets containing 2048 random string characters
- 10 config maps containing 2048 random string characters
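
As a rough illustration of the scale involved, the following sketch (not part of the test itself; the per-namespace counts are taken from the list above) tallies the total object counts at a given cluster-density:

[source,python]
----
# Tally the objects created by one Cluster-density run, using the
# per-namespace counts from the list above. Pods are derived from
# 5 deployments x 2 replicas; every other count is listed directly.
PER_NAMESPACE = {
    "image streams": 1,
    "builds": 1,
    "deployments": 5,
    "pods": 5 * 2,
    "services": 5,
    "routes": 1,
    "secrets": 10,
    "config maps": 10,
}

def cluster_density_totals(namespaces: int) -> dict:
    """Total object counts at a given cluster-density (namespace count)."""
    return {kind: count * namespaces for kind, count in PER_NAMESPACE.items()}

# Example: cluster-density 1000 creates 10,000 pods, 5,000 services, and so on.
for kind, total in cluster_density_totals(1000).items():
    print(f"{kind}: {total}")
----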

[options="header",cols="4*"]
|===
| Number of worker nodes | Cluster-density (namespaces) | CPU cores | Memory (GB)

| 24
| 500
| 4
| 16

| 120
| 1000
| 8
| 32

| 252
| 4000
| 16, but 24 if using the OVN-Kubernetes network plugin
| 64, but 128 if using the OVN-Kubernetes network plugin

| 501, but untested with the OVN-Kubernetes network plugin
| 4000
| 16
| 96
|===

The data from the table above is based on a {product-title} cluster running on top of AWS, using r5.4xlarge instances as control plane nodes and m5.2xlarge instances as worker nodes.
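
For convenience, the recommendations in the table can be read as a simple threshold lookup. The following is a minimal sketch, not a supported API; the function name is illustrative, and the thresholds and values mirror the table above:

[source,python]
----
# Look up the recommended control plane node size for a worker node count,
# mirroring the table above. Sizes are (CPU cores, memory in GB).
def recommended_control_plane_size(workers: int, ovn_kubernetes: bool = False):
    if workers <= 24:
        return (4, 16)
    if workers <= 120:
        return (8, 32)
    if workers <= 252:
        # OVN-Kubernetes needs more headroom at this scale: 24 cores, 128 GB.
        return (24, 128) if ovn_kubernetes else (16, 64)
    if workers <= 501 and not ovn_kubernetes:
        # 501 workers is untested with the OVN-Kubernetes network plugin.
        return (16, 96)
    raise ValueError("worker count outside the tested range")

print(recommended_control_plane_size(120))                       # (8, 32)
print(recommended_control_plane_size(252, ovn_kubernetes=True))  # (24, 128)
----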

On a large and dense cluster with three control plane nodes, the CPU and memory usage spikes when one of the nodes is stopped, rebooted, or fails. The failures can be due to unexpected issues with power, network, or underlying infrastructure, or to intentional cases where the cluster is restarted after shutting it down to save costs. The remaining two control plane nodes must handle the load in order to be highly available, which leads to an increase in resource usage. This is also expected during upgrades because the control plane nodes are cordoned, drained, and rebooted serially to apply the operating system updates, as well as the control plane Operator updates. To avoid cascading failures, keep the overall CPU and memory resource usage on the control plane nodes to at most 60% of all available capacity so that they can absorb the resource usage spikes. Increase the CPU and memory on the control plane nodes accordingly to avoid potential downtime due to lack of resources.
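
One way to see why 60% leaves enough headroom is the failover arithmetic. A minimal sketch, assuming the load of a failed node redistributes evenly across the surviving nodes:

[source,python]
----
# Project per-node utilization after one control plane node is lost,
# assuming its load spreads evenly across the remaining nodes.
def projected_utilization(current_pct: float, nodes: int = 3) -> float:
    return current_pct * nodes / (nodes - 1)

# At the recommended 60% ceiling, a three-node control plane peaks at 90%
# when one node is lost, leaving some headroom before saturation.
print(projected_utilization(60.0))  # 90.0
# At 70%, the two survivors would be pushed past capacity.
print(projected_utilization(70.0))  # 105.0
----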

[IMPORTANT]
====
The node sizing varies depending on the number of nodes and object counts in the cluster. It also depends on whether the objects are actively being created on the cluster. During object creation, the control plane is more active in terms of resource usage than when the objects are in the `Running` phase.
====

Operator Lifecycle Manager (OLM) runs on the control plane nodes and its memory footprint depends on the number of namespaces and user-installed Operators that OLM needs to manage on the cluster. Control plane nodes need to be sized accordingly to avoid OOM kills. The following data points are based on the results from cluster maximums testing; a sketch after the table shows one way to estimate values that fall between the measured points.

[options="header",cols="3*"]
|===
| Number of namespaces | OLM memory at idle state (GB) | OLM memory with 5 user operators installed (GB)

| 500
| 0.823
| 1.7

| 1000
| 1.2
| 2.5

| 1500
| 1.7
| 3.2

| 2000
| 2
| 4.4

| 3000
| 2.7
| 5.6

| 4000
| 3.8
| 7.6

| 5000
| 4.2
| 9.02

| 6000
| 5.8
| 11.3

| 7000
| 6.6
| 12.9

| 8000
| 6.9
| 14.8

| 9000
| 8
| 17.7

| 10,000
| 9.9
| 21.6
|===
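
The measurements above cover specific namespace counts only. As a rough estimate for counts in between, a minimal linear interpolation sketch (an assumption for illustration, not part of the documented testing):

[source,python]
----
# Linearly interpolate OLM memory (GB) between the measured points in the
# table above. Each tuple is (namespaces, idle GB, GB with 5 user Operators).
POINTS = [
    (500, 0.823, 1.7), (1000, 1.2, 2.5), (1500, 1.7, 3.2), (2000, 2.0, 4.4),
    (3000, 2.7, 5.6), (4000, 3.8, 7.6), (5000, 4.2, 9.02), (6000, 5.8, 11.3),
    (7000, 6.6, 12.9), (8000, 6.9, 14.8), (9000, 8.0, 17.7), (10000, 9.9, 21.6),
]

def olm_memory_gb(namespaces: int, user_operators: bool = False) -> float:
    """Estimate OLM memory between measured points; outside the range is untested."""
    idx = 2 if user_operators else 1
    if not POINTS[0][0] <= namespaces <= POINTS[-1][0]:
        raise ValueError("outside the measured range of 500-10,000 namespaces")
    for lo, hi in zip(POINTS, POINTS[1:]):
        if lo[0] <= namespaces <= hi[0]:
            t = (namespaces - lo[0]) / (hi[0] - lo[0])
            return lo[idx] + t * (hi[idx] - lo[idx])

print(round(olm_memory_gb(2500), 2))                       # 2.35
print(round(olm_memory_gb(2500, user_operators=True), 2))  # 5.0
----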

[IMPORTANT]
====
You can modify the control plane node size in a running {product-title} {product-version} cluster for the following configurations only:

* Clusters installed with a user-provisioned installation method.
* AWS clusters installed with an installer-provisioned infrastructure installation method.
* Clusters that use a control plane machine set to manage control plane machines.

For all other configurations, you must estimate your total node count and use the suggested control plane node size during installation.
====

[NOTE]
====
In {product-title} {product-version}, half of a CPU core (500 millicores) is now reserved by the system by default, compared to {product-title} 3.11 and previous versions. The sizes are determined taking that reservation into consideration.
====
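
As a worked example of that reservation (a simplified sketch; real allocatable capacity also subtracts other reservations and eviction thresholds):

[source,python]
----
# Effect of the default system reservation on schedulable control plane CPU.
SYSTEM_RESERVED_MILLICORES = 500  # half a core, per the note above

def allocatable_millicores(cpu_cores: int) -> int:
    """CPU left for workloads after the default system reservation (simplified)."""
    return cpu_cores * 1000 - SYSTEM_RESERVED_MILLICORES

# A 4-core control plane node from the first table keeps 3500 millicores.
print(allocatable_millicores(4))  # 3500
----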