// Module included in the following assemblies:
//
// * updating/understanding_updates/understanding-openshift-update-duration.adoc
:_mod-docs-content-type: CONCEPT
[id="machine-config-operator-node-updates_{context}"]
= Machine Config Operator node updates

The Machine Config Operator (MCO) applies a new machine configuration to each control plane and compute node. During this process, the MCO performs the following sequential actions on each node of the cluster:

. Cordon and drain the node
. Update the operating system (OS)
. Reboot the node
. Uncordon the node and schedule workloads on it

[NOTE]
====
When a node is cordoned, workloads cannot be scheduled to it.
====
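
The MCO performs the cordon and drain steps for you; you do not run them manually during an update. For reference only, the following commands are roughly equivalent to what the MCO does for a single node, where `<node>` is a placeholder for the node name:

[source,terminal]
----
$ oc adm cordon <node>
$ oc adm drain <node> --ignore-daemonsets --delete-emptydir-data
----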

The time to complete this process depends on several factors, including the node and infrastructure configuration. The process might take 5 or more minutes to complete per node.
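
One way to gauge rollout progress during an update is to watch the machine config pools and compare the `UPDATEDMACHINECOUNT` and `MACHINECOUNT` columns in the output:

[source,terminal]
----
$ oc get machineconfigpools
----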

In addition to the MCO process itself, you should consider the impact of the following factors:

* The control plane node update duration is predictable and often shorter than that of compute nodes, because the control plane workloads are tuned for graceful updates and quick drains.
* You can update the compute nodes in parallel by setting the `maxUnavailable` field in the machine config pool (MCP) to a value greater than `1`. The MCO cordons the number of nodes specified in `maxUnavailable` and marks them unavailable for update.
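+
For example, one way to set this field is to patch the MCP directly. In the following command, the pool name `worker` and the value `2` are placeholders for your environment:
+
[source,terminal]
----
$ oc patch machineconfigpool/worker --type merge --patch '{"spec":{"maxUnavailable":2}}'
----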
* Increasing `maxUnavailable` on the MCP can help the pool update more quickly. However, if `maxUnavailable` is set too high and several nodes are cordoned simultaneously, workloads protected by a pod disruption budget (PDB) might fail to drain because no schedulable node can be found to run the replicas. If you increase `maxUnavailable` for the MCP, ensure that you still have sufficient schedulable nodes to allow PDB-protected workloads to drain.
* Before you begin the update, ensure that all the nodes are available. Any unavailable nodes can significantly impact the update duration because node unavailability affects `maxUnavailable` and pod disruption budgets.
+
To check the status of nodes from the terminal, run the following command:
+
[source,terminal]
----
$ oc get node
----
+
.Example output
[source,terminal]
----
NAME                                         STATUS                     ROLES    AGE   VERSION
ip-10-0-137-31.us-east-2.compute.internal    Ready,SchedulingDisabled   worker   12d   v1.23.5+3afdacb
ip-10-0-151-208.us-east-2.compute.internal   Ready                      master   12d   v1.23.5+3afdacb
ip-10-0-176-138.us-east-2.compute.internal   Ready                      master   12d   v1.23.5+3afdacb
ip-10-0-183-194.us-east-2.compute.internal   Ready                      worker   12d   v1.23.5+3afdacb
ip-10-0-204-102.us-east-2.compute.internal   Ready                      master   12d   v1.23.5+3afdacb
ip-10-0-207-224.us-east-2.compute.internal   Ready                      worker   12d   v1.23.5+3afdacb
----
+
If the status of a node is `NotReady` or `SchedulingDisabled`, the node is not available, and this impacts the update duration.
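+
For example, one way to list only the nodes that are not fully available is to filter the command output with standard shell tools:
+
[source,terminal]
----
$ oc get nodes | grep -E 'NotReady|SchedulingDisabled'
----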
+
You can check the status of nodes from the *Administrator* perspective in the web console by expanding *Compute* -> *Nodes*.