// Module included in the following assemblies:
//
// * machine_configuration/machine-config-index.adoc
:_mod-docs-content-type: CONCEPT
[id="checking-mco-node-status_{context}"]
= About node status during updates
[role="_abstract"]
If you make changes to a machine config pool (MCP) that result in a new machine config, for example by using a `MachineConfig` or `KubeletConfig` object, you can get detailed information about the progress of the node updates by using the `MachineConfigNode` custom resource. This information can be helpful if issues arise during the update and you need to troubleshoot a node.
The `MachineConfigNode` custom resource allows you to monitor the progress of individual node updates as they move through the update phases. The custom resource reports the phase that the node is currently in, the phases that have completed, and the phases that remain.
The node update process consists of the following phases and subphases, which are tracked by the `MachineConfigNode` custom resource and explained in more detail later in this section:
* *Update Prepared*. The MCO stops the configuration drift monitoring process and verifies that the newly created machine config can be applied to a node.
* *Update Executed*. The MCO cordons and drains the node and applies the new machine config to the node files and operating system, as needed. It contains the following subphases:
** *Cordoned*
** *Drained*
** *AppliedFilesAndOS*
* *PinnedImageSetsProgressing*. The MCO is performing the steps needed to pin and pre-load container images.
* *PinnedImageSetsDegraded*. The pinned image process failed. You can view the reason for the failure by using the `oc describe machineconfignode` command, as described later in this section.
* *NodeDegraded*. The node update failed. You can view the reason for the failure by using the `oc describe machineconfignode` command, as described later in this section.
* *Update Post update action*. The MCO is reloading CRI-O, as needed.
* *Rebooted Node*. The MCO is rebooting the node, as needed.
* *Update Complete*. The MCO is uncordoning the node, updating the node state to the cluster, and resuming production of node metrics. It contains the following subphase:
** *Uncordoned*
* *Updated*. The MCO completed a node update and the current config version of the node is equal to the desired updated version.
* *Resumed*. The MCO restarted the configuration drift monitoring process and the node returned to an operational state.
* *ImagePulledFromRegistry*. The MCO has pulled the desired custom layered image. This condition applies only to nodes on which {image-mode-os-on-lower} has been configured.
+
To see *ImagePulledFromRegistry* in the output, you must enable the `TechPreviewNoUpgrade` feature set on the cluster. For more information, see "Enabling features using feature gates".
+
[NOTE]
====
Enabling the `TechPreviewNoUpgrade` feature set cannot be undone and prevents minor version updates. This feature set is not recommended on production clusters.
====
+
--
:FeatureName: The `ImagePulledFromRegistry` condition
include::snippets/technology-preview.adoc[]
--
As the update moves through these phases, you can query the `MachineConfigNode` custom resource, which reports one of the following statuses for each phase:
* `True`. The phase is complete on that node.
* `False`. The phase has not yet started or will not be executed on that node.
* `Unknown`. The phase is either being executed on that node or has an error. If the phase has an error, you can use the `oc describe machineconfignode` command for more information, as described later in this section.
For example, consider a cluster with a newly created machine config:
[source,terminal]
----
$ oc get machineconfig
----
.Example output
[source,text]
----
NAME                                               GENERATEDBYCONTROLLER                      IGNITIONVERSION   AGE
# ...
rendered-master-23cf200e4ee97daa6e39fdce24c9fb67   c00e2c941bc6e236b50e0bf3988e6c790cf2bbb2   3.5.0             6d15h
rendered-master-a386c2d1550b927d274054124f58be68   c00e2c941bc6e236b50e0bf3988e6c790cf2bbb2   3.5.0             7m26s
# ...
rendered-worker-01f27f752eb84eba917450e43636b210   c00e2c941bc6e236b50e0bf3988e6c790cf2bbb2   3.5.0             6d15h <1>
rendered-worker-f351f6947f15cd0380514f4b1c89f8f2   c00e2c941bc6e236b50e0bf3988e6c790cf2bbb2   3.5.0             7m26s <2>
# ...
----
<1> The current machine config for the worker nodes.
<2> The newly created machine config that is being applied to the worker nodes.
You can watch as the nodes are updated with the new machine config:
[source,terminal]
----
$ oc get machineconfignodes
----
.Example output
[source,text]
----
NAME                                       POOLNAME     DESIREDCONFIG                                      CURRENTCONFIG                                      UPDATED   AGE
ci-ln-ds73n5t-72292-9xsm9-master-0         master       rendered-master-a386c2d1550b927d274054124f58be68   rendered-master-a386c2d1550b927d274054124f58be68   True      27M
ci-ln-ds73n5t-72292-9xsm9-master-1         master       rendered-master-a386c2d1550b927d274054124f58be68   rendered-master-23cf200e4ee97daa6e39fdce24c9fb67   False     27M
ci-ln-ds73n5t-72292-9xsm9-master-2         master       rendered-master-23cf200e4ee97daa6e39fdce24c9fb67   rendered-master-23cf200e4ee97daa6e39fdce24c9fb67   True      27M
ci-ln-ds73n5t-72292-9xsm9-worker-a-2d8tz   worker-cnf   rendered-worker-f351f6947f15cd0380514f4b1c89f8f2   rendered-worker-f351f6947f15cd0380514f4b1c89f8f2   True      20M <1>
ci-ln-ds73n5t-72292-9xsm9-worker-b-gw5sd   worker       rendered-worker-f351f6947f15cd0380514f4b1c89f8f2   rendered-worker-01f27f752eb84eba917450e43636b210   False     20M <2>
ci-ln-ds73n5t-72292-9xsm9-worker-c-t227w   worker       rendered-worker-01f27f752eb84eba917450e43636b210   rendered-worker-01f27f752eb84eba917450e43636b210   True      19M <3>
----
<1> This node has been updated. The new machine config, `rendered-worker-f351f6947f15cd0380514f4b1c89f8f2`, is shown as both the desired and current machine config.
<2> This node is currently being updated to the new machine config. The new and previous machine configs are shown as the desired and current machine configs, respectively.
<3> This node has not yet been updated to the new machine config. The previous machine config is shown as both the desired and current machine config.
.Basic machine config node fields
[cols="1,4",options="header"]
|===
|Field |Meaning
|`NAME` |The name of the node.
|`POOLNAME` |The name of the machine config pool associated with that node.
|`DESIREDCONFIG` |The name of the new machine config that the node updates to.
|`CURRENTCONFIG` |The name of the current machine configuration on that node.
|`UPDATED` a|Indicates whether the node has been updated, as follows:
* If `False`, the node is being updated to the new machine configuration shown in the `DESIREDCONFIG` field.
* If `True`, and the `CURRENTCONFIG` field matches the new machine configuration shown in the `DESIREDCONFIG` field, the node has been updated.
* If `True`, and the `CURRENTCONFIG` field matches the old machine configuration shown in the `DESIREDCONFIG` field, the node has not been updated yet.
|`AGE` |The time elapsed since the machine config node was created. The age does not change when the associated node is updated.
|===
// Field definitions based on https://github.com/openshift/api/pull/1596
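
As a rough sketch of how the `UPDATED` column can be used, the following shell helper filters the default `oc get machineconfignodes` table for nodes that are not yet reported as updated. The function name `mcn_pending` is hypothetical and is not part of `oc`. The helper assumes the six-column layout shown in the previous example output, with `UPDATED` as the fifth column.

```shell
# Hypothetical helper (not an oc subcommand): reads the default
# `oc get machineconfignodes` table on stdin and prints every node whose
# UPDATED column is not "True", followed by the config it is moving to.
# Assumes column 1 is NAME, column 3 is DESIREDCONFIG, column 5 is UPDATED.
mcn_pending() {
  awk 'NR > 1 && $5 != "True" { print $1, "->", $3 }'
}

# Usage against a live cluster (assumes you are logged in with oc):
#   oc get machineconfignodes | mcn_pending
```

An empty result suggests that every node reports `True`. As the table above notes, `True` alone does not distinguish between a node that has already been updated and a node that has not started yet.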
You can use the `-o wide` flag to display additional information about the updates:
[source,terminal]
----
$ oc get machineconfignodes -o wide
----
.Example output
[source,text]
----
NAME                                       POOLNAME     DESIREDCONFIG                                      CURRENTCONFIG                                      UPDATED   AGE   UPDATEPREPARED   UPDATEEXECUTED   UPDATEPOSTACTIONCOMPLETE   UPDATECOMPLETE   RESUMED   UPDATEDFILESANDOS   CORDONEDNODE   DRAINEDNODE   REBOOTEDNODE   UNCORDONEDNODE
ci-ln-ds73n5t-72292-9xsm9-master-0         master       rendered-master-23cf200e4ee97daa6e39fdce24c9fb67   rendered-master-23cf200e4ee97daa6e39fdce24c9fb67   True      27M   False            False            False                      False            False     False               False          False         False          False
ci-ln-ds73n5t-72292-9xsm9-master-1         master       rendered-master-23cf200e4ee97daa6e39fdce24c9fb67   rendered-master-23cf200e4ee97daa6e39fdce24c9fb67   True      27M   False            False            False                      False            False     False               False          False         False          False
ci-ln-ds73n5t-72292-9xsm9-master-2         master       rendered-master-23cf200e4ee97daa6e39fdce24c9fb67   rendered-master-23cf200e4ee97daa6e39fdce24c9fb67   True      27M   False            False            False                      False            False     False               False          False         False          False
ci-ln-ds73n5t-72292-9xsm9-worker-a-2d8tz   worker-cnf   rendered-worker-f351f6947f15cd0380514f4b1c89f8f2   rendered-worker-f351f6947f15cd0380514f4b1c89f8f2   True      20M   False            False            False                      False            False     False               False          False         False          False
ci-ln-ds73n5t-72292-9xsm9-worker-b-gw5sd   worker       rendered-worker-f351f6947f15cd0380514f4b1c89f8f2   rendered-worker-01f27f752eb84eba917450e43636b210   False     20M   True             True             Unknown                    False            False     True                True           True          Unknown        False
ci-ln-ds73n5t-72292-9xsm9-worker-c-t227w   worker       rendered-worker-01f27f752eb84eba917450e43636b210   rendered-worker-01f27f752eb84eba917450e43636b210   True      19M   False            False            False                      False            False     False               False          False         False          False
----
In addition to the fields defined in the previous table, the `-o wide` output displays the following fields:
.Machine config node fields in the `-o wide` output
[cols="1,4",options="header"]
|===
|Phase Name |Definition
|`UPDATEPREPARED` |Indicates if the MCO is preparing to update the node.
|`UPDATEEXECUTED` |Indicates if the MCO has completed the body of the update on the node.
|`UPDATEPOSTACTIONCOMPLETE` |Indicates if the MCO has executed the post-update actions on the node.
|`UPDATECOMPLETE` |Indicates if the MCO has completed the update on the node.
|`RESUMED` |Indicates if the node has resumed normal processes.
|`UPDATEDFILESANDOS` |Indicates if the MCO has updated the node files and operating system.
|`CORDONEDNODE` |Indicates if the MCO has marked the node as not schedulable.
|`DRAINEDNODE` |Indicates if the MCO has drained the node.
|`REBOOTEDNODE` |Indicates if the MCO has restarted the node.
|`UNCORDONEDNODE` |Indicates if the MCO has marked the node as schedulable.
|===
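
Because the `-o wide` output is wide, a small filter can make it easier to spot the phases that report `Unknown`, which means the phase is in progress or has an error. The following is a minimal sketch, not an official tool. The function name `mcn_unknown_phases` is hypothetical, and the helper assumes that the phase columns start at the seventh column, as in the previous example output.

```shell
# Hypothetical helper (not an oc subcommand): reads the table printed by
# `oc get machineconfignodes -o wide` on stdin and, for each node, lists
# the phase columns whose status is "Unknown".
# Assumes the first six columns are the basic fields (NAME through AGE).
mcn_unknown_phases() {
  awk '
    NR == 1 {
      # Remember the column names from the header row.
      for (i = 1; i <= NF; i++) header[i] = $i
      next
    }
    {
      phases = ""
      for (i = 7; i <= NF; i++) {
        if ($i == "Unknown") {
          sep = (phases == "") ? "" : ","
          phases = phases sep header[i]
        }
      }
      if (phases != "") print $1 ": " phases
    }'
}

# Usage against a live cluster (assumes you are logged in with oc):
#   oc get machineconfignodes -o wide | mcn_unknown_phases
```

The helper looks up column names in the header row rather than hard-coding them, so it should tolerate additional phase columns, but it still depends on the position of the first phase column.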
For more details on the update status, you can use the `oc describe machineconfignode` command:
[source,terminal]
----
$ oc describe machineconfignode/<machine_config_node_name>
----
.Example output
[source,text]
----
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigNode
metadata:
  creationTimestamp: "2025-04-28T18:40:29Z"
  generation: 3
  name: <machine_config_node_name> <1>
# ...
spec:
  configVersion:
    desired: rendered-master-34f96af2e41acb615410b97ce1c819e6 <2>
  node:
    name: ci-ln-921r7qk-72292-kxv95-master-0
  pool:
    name: master
status:
  conditions:
  - lastTransitionTime: "2025-04-28T18:41:09Z"
    message: All pinned image sets complete
    reason: AsExpected
    status: "False"
    type: PinnedImageSetsProgressing
  - lastTransitionTime: "2025-04-28T18:41:09Z"
    message: This node has not yet entered the UpdatePrepared phase
    reason: NotYetOccurred
    status: "False"
    type: UpdatePrepared
  - lastTransitionTime: "2025-04-28T18:41:09Z"
    message: This node has not yet entered the UpdateExecuted phase
    reason: NotYetOccurred
    status: "False"
    type: UpdateExecuted
  - lastTransitionTime: "2025-04-28T18:41:09Z"
    message: This node has not yet entered the UpdatePostActionComplete phase
    reason: NotYetOccurred
    status: "False"
    type: UpdatePostActionComplete
  - lastTransitionTime: "2025-04-28T18:42:08Z"
    message: 'Action during update to rendered-master-34f96af2e41acb615410b97ce1c819e6:
      Uncordoned Node as part of completing upgrade phase'
    reason: Uncordoned
    status: "False"
    type: UpdateComplete
  - lastTransitionTime: "2025-04-28T18:42:08Z"
    message: 'Action during update to rendered-master-34f96af2e41acb615410b97ce1c819e6:
      In desired config . Resumed normal operations.'
    reason: Resumed
    status: "False"
    type: Resumed
  - lastTransitionTime: "2025-04-28T18:41:09Z"
    message: This node has not yet entered the Drained phase
    reason: NotYetOccurred
    status: "False"
    type: Drained
  - lastTransitionTime: "2025-04-28T18:41:09Z"
    message: This node has not yet entered the AppliedFilesAndOS phase
    reason: NotYetOccurred
    status: "False"
    type: AppliedFilesAndOS
  - lastTransitionTime: "2025-04-28T18:41:09Z"
    message: This node has not yet entered the Cordoned phase
    reason: NotYetOccurred
    status: "False"
    type: Cordoned
  - lastTransitionTime: "2025-04-28T18:41:09Z"
    message: This node has not yet entered the RebootedNode phase
    reason: NotYetOccurred
    status: "False"
    type: RebootedNode
  - lastTransitionTime: "2025-04-28T18:42:08Z"
    message: Node ci-ln-921r7qk-72292-kxv95-master-0 Updated
    reason: Updated
    status: "True"
    type: Updated
  - lastTransitionTime: "2025-04-28T18:42:08Z"
    message: 'Action during update to rendered-master-34f96af2e41acb615410b97ce1c819e6:
      UnCordoned node. The node is reporting Unschedulable = false'
    reason: UpdateCompleteUncordoned
    status: "False"
    type: Uncordoned
  - lastTransitionTime: "2025-04-28T18:41:09Z"
    message: This node has not yet entered the NodeDegraded phase
    reason: NotYetOccurred
    status: "False"
    type: NodeDegraded
  - lastTransitionTime: "2025-04-28T18:41:09Z"
    message: All is good
    reason: AsExpected
    status: "False"
    type: PinnedImageSetsDegraded
  configVersion:
    current: rendered-master-34f96af2e41acb615410b97ce1c819e6 <3>
    desired: rendered-master-34f96af2e41acb615410b97ce1c819e6
  observedGeneration: 4
----
<1> The `MachineConfigNode` object name.
<2> The new machine config. This field is updated after the MCO validates the machine config in the `UPDATEPREPARED` phase. The new configuration is then added to the status.
<3> The current machine config on the node.
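
When only the condition summary is of interest, the conditions list can be reduced to one line per condition. The following is a sketch under stated assumptions: `mcn_conditions` and `mcn_active` are hypothetical helper names, and the JSONPath expression assumes that the conditions are stored under `.status.conditions` with `type`, `status`, and `reason` fields, as in the previous example output.

```shell
# Hypothetical helpers (not oc subcommands).

# Print one "type<TAB>status<TAB>reason" line for each condition of the
# MachineConfigNode object named by the first argument.
# Assumes the conditions live under .status.conditions.
mcn_conditions() {
  oc get machineconfignode "$1" \
    -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.status}{"\t"}{.reason}{"\n"}{end}'
}

# Keep only the conditions whose status is not "False",
# that is, conditions that report "True" or "Unknown".
mcn_active() {
  awk -F '\t' '$2 != "False"'
}

# Usage against a live cluster (assumes you are logged in with oc):
#   mcn_conditions <machine_config_node_name> | mcn_active
```

Splitting the query from the filter keeps the filter usable on saved output, for example when reviewing conditions collected earlier from a node.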
For clusters configured with {image-mode-os-on-lower}, the machine config node output also includes the name of the custom layered image that was applied to affected nodes.
include::snippets/mco-mcn-ocl-example.adoc[]
To see the custom layered image in the output, you must enable the `TechPreviewNoUpgrade` feature set on the cluster. For more information, see "Enabling features using feature gates".
[NOTE]
====
Enabling the `TechPreviewNoUpgrade` feature set cannot be undone and prevents minor version updates. This feature set is not recommended on production clusters.
====
:FeatureName: The custom layered image output
include::snippets/technology-preview.adoc[]