diff --git a/modules/eco-about-node-maintenance-standalone.adoc b/modules/eco-about-node-maintenance-standalone.adoc index 570ff17081..7482ac3101 100644 --- a/modules/eco-about-node-maintenance-standalone.adoc +++ b/modules/eco-about-node-maintenance-standalone.adoc @@ -5,8 +5,6 @@ [id="eco-about-node-maintenance-operator_{context}"] = About the Node Maintenance Operator -You can place nodes into maintenance mode using the `oc adm` utility, or using `NodeMaintenance` custom resources (CRs). - The Node Maintenance Operator watches for new or deleted `NodeMaintenance` CRs. When a new `NodeMaintenance` CR is detected, no new workloads are scheduled and the node is cordoned off from the rest of the cluster. All pods that can be evicted are evicted from the node. When a `NodeMaintenance` CR is deleted, the node that is referenced in the CR is made available for new workloads. [NOTE] diff --git a/modules/eco-checking_status_of_node_maintenance_cr_tasks.adoc b/modules/eco-checking_status_of_node_maintenance_cr_tasks.adoc index 32c025f72f..6a6df77203 100644 --- a/modules/eco-checking_status_of_node_maintenance_cr_tasks.adoc +++ b/modules/eco-checking_status_of_node_maintenance_cr_tasks.adoc @@ -36,12 +36,16 @@ items: nodeName: node-1.example.com reason: Node maintenance status: - evictionPods: 3 <1> - lastError: "Last failure message" <2> + drainProgress: 100 <1> + evictionPods: 3 <2> + lastError: "Last failure message" <3> + lastUpdate: "2022-06-23T11:43:18Z" <4> phase: Succeeded - totalpods: 5 <3> + totalpods: 5 <5> ... ---- -<1> The number of pods scheduled for eviction. -<2> The latest eviction error, if any. -<3> The total number of pods before the node entered maintenance mode. \ No newline at end of file +<1> The percentage completion of draining the node. +<2> The number of pods scheduled for eviction. +<3> The latest eviction error, if any. +<4> The last time the status was updated. +<5> The total number of pods before the node entered maintenance mode. 
diff --git a/modules/eco-node-maintenance-operator-installation-web-console.adoc b/modules/eco-node-maintenance-operator-installation-web-console.adoc index e50d612a33..319561f3bb 100644 --- a/modules/eco-node-maintenance-operator-installation-web-console.adoc +++ b/modules/eco-node-maintenance-operator-installation-web-console.adoc @@ -29,4 +29,5 @@ To confirm that the installation is successful: If the Operator is not installed successfully: . Navigate to the *Operators* -> *Installed Operators* page and inspect the `Status` column for any errors or failures. -. Navigate to the *Workloads* -> *Pods* page and check the logs in any pods in the `openshift-operators` project that are reporting issues. +. Navigate to the *Operators* -> *Installed Operators* -> *Node Maintenance Operator* -> *Details* page, and inspect the `Conditions` section for errors before pod creation. +. Navigate to the *Workloads* -> *Pods* page, search for the `Node Maintenance Operator` pod in the installed namespace, and check the logs in the `Logs` tab. diff --git a/modules/eco-resuming-node-maintenance-actions-web-console.adoc b/modules/eco-resuming-node-maintenance-actions-web-console.adoc new file mode 100644 index 0000000000..85004f7c98 --- /dev/null +++ b/modules/eco-resuming-node-maintenance-actions-web-console.adoc @@ -0,0 +1,24 @@ +// Module included in the following assemblies: +// +//nodes/nodes/eco-node-maintenance-operator.adoc + +:_content-type: PROCEDURE +[id="eco-resuming-node-maintenance-actions-web-console_{context}"] += Resuming a bare-metal node from maintenance mode +Resume a bare-metal node from maintenance mode using the Options menu {kebab} found on each node in the *Compute* -> *Nodes* list, or using the *Actions* control of the *Node Details* screen. + +.Procedure + +. From the *Administrator* perspective of the web console, click *Compute* -> *Nodes*. +. 
You can resume the node from this screen, which makes it easier to perform actions on multiple nodes, or from the *Node Details* screen, where you can view comprehensive details of the selected node: +** Click the Options menu {kebab} at the end of the node and select +*Stop Maintenance*. +** Click the node name to open the *Node Details* screen and click +*Actions* -> *Stop Maintenance*. +. Click *Stop Maintenance* in the confirmation window. + +The node becomes schedulable. If it had virtual machine instances that were running on the node prior to maintenance, then they will not automatically migrate back to this node. + +.Verification + +* Navigate to the *Compute* -> *Nodes* page and verify that the corresponding node has a status of `Ready`. diff --git a/modules/eco-resuming-node-maintenance-cr-cli.adoc b/modules/eco-resuming-node-maintenance-cr-cli.adoc index ae50931f12..0dbbed583f 100644 --- a/modules/eco-resuming-node-maintenance-cr-cli.adoc +++ b/modules/eco-resuming-node-maintenance-cr-cli.adoc @@ -28,3 +28,24 @@ $ oc delete -f nodemaintenance-cr.yaml ---- nodemaintenance.nodemaintenance.medik8s.io "maintenance-example" deleted ---- + +.Verification + +. Check the progress of the maintenance task by running the following command: ++ +[source,terminal] +---- +$ oc describe node +---- ++ +where `` is the name of your node; for example, `node-1.example.com` + +. 
Check the example output: ++ +[source,terminal] +---- +Events: + Type Reason Age From Message + ---- ------ ---- ---- ------- + Normal NodeSchedulable 2m kubelet Node node-1.example.com status is now: NodeSchedulable +---- diff --git a/modules/eco-resuming-node-maintenance-cr-web-console.adoc b/modules/eco-resuming-node-maintenance-cr-web-console.adoc index c57676a719..87703af9eb 100644 --- a/modules/eco-resuming-node-maintenance-cr-web-console.adoc +++ b/modules/eco-resuming-node-maintenance-cr-web-console.adoc @@ -7,7 +7,7 @@ = Resuming a node from maintenance mode by using the web console To resume a node from maintenance mode, you can delete a `NodeMaintenance` custom resource (CR) by using the web console. - + .Prerequisites * Log in as a user with `cluster-admin` privileges. @@ -27,4 +27,4 @@ To resume a node from maintenance mode, you can delete a `NodeMaintenance` custo . In the {product-title} console, click *Compute → Nodes*. -. Inspect the `Status` column of the node for which you deleted the `NodeMaintenance` CR and verify that its status is `Ready`. \ No newline at end of file +. Inspect the `Status` column of the node for which you deleted the `NodeMaintenance` CR and verify that its status is `Ready`. diff --git a/modules/eco-setting-node-maintenance-actions-web-console.adoc b/modules/eco-setting-node-maintenance-actions-web-console.adoc new file mode 100644 index 0000000000..40caf35088 --- /dev/null +++ b/modules/eco-setting-node-maintenance-actions-web-console.adoc @@ -0,0 +1,23 @@ +// Module included in the following assemblies: +// +//nodes/nodes/eco-node-maintenance-operator.adoc + +:_content-type: PROCEDURE +[id="eco-setting-node-maintenance-actions-web-console_{context}"] += Setting a bare-metal node to maintenance mode +Set a bare-metal node to maintenance mode using the Options menu {kebab} found on each node in the *Compute* -> *Nodes* list, or using the *Actions* control of the *Node Details* screen. + +.Procedure + +. 
From the *Administrator* perspective of the web console, click *Compute* -> *Nodes*. +. You can set the node to maintenance from this screen, which makes it easier to perform actions on multiple nodes, or from the *Node Details* screen, where you can view comprehensive details of the selected node: +** Click the Options menu {kebab} at the end of the node and select *Start Maintenance*. +** Click the node name to open the *Node Details* screen and click +*Actions* -> *Start Maintenance*. +. Click *Start Maintenance* in the confirmation window. + +The node is no longer schedulable. If it had virtual machines with the `LiveMigration` eviction strategy, then it will live migrate them. All other pods and virtual machines on the node are deleted and recreated on another node. + +.Verification + +* Navigate to the *Compute* -> *Nodes* page and verify that the corresponding node has a status of `Under maintenance`. diff --git a/modules/eco-setting-node-maintenance-cr-cli.adoc b/modules/eco-setting-node-maintenance-cr-cli.adoc index a252aa3704..31ef35f469 100644 --- a/modules/eco-setting-node-maintenance-cr-cli.adoc +++ b/modules/eco-setting-node-maintenance-cr-cli.adoc @@ -38,14 +38,18 @@ spec: $ oc apply -f nodemaintenance-cr.yaml ---- -. Check the progress of the maintenance task by running the following command, replacing `` with the name of your node; for example, `node-1.example.com`: +.Verification + +. Check the progress of the maintenance task by running the following command: + [source,terminal] ---- -$ oc describe node node-1.example.com +$ oc describe node ---- + -.Example output +where `` is the name of your node; for example, `node-1.example.com` + +. 
Check the example output: + [source,terminal] ---- diff --git a/modules/eco-setting-node-maintenance-cr-web-console.adoc b/modules/eco-setting-node-maintenance-cr-web-console.adoc index 4ccb2a1961..77b6371406 100644 --- a/modules/eco-setting-node-maintenance-cr-web-console.adoc +++ b/modules/eco-setting-node-maintenance-cr-web-console.adoc @@ -27,4 +27,4 @@ To set a node to maintenance mode, you can create a `NodeMaintenance` custom res .Verification -In the *Node Maintenance* tab, inspect the `Status` column and verify that its status is `Succeeded`. \ No newline at end of file +In the *Node Maintenance* tab, inspect the `Status` column and verify that its status is `Succeeded`. diff --git a/nodes/nodes/eco-node-maintenance-operator.adoc b/nodes/nodes/eco-node-maintenance-operator.adoc index 1758a8026d..39393e4cd2 100644 --- a/nodes/nodes/eco-node-maintenance-operator.adoc +++ b/nodes/nodes/eco-node-maintenance-operator.adoc @@ -6,21 +6,19 @@ include::_attributes/common-attributes.adoc[] toc::[] -You can use the Node Maintenance Operator to place nodes in maintenance mode. This is a standalone version of the Node Maintenance Operator that is independent of {VirtProductName} installation. - -[NOTE] -==== -If you have installed {VirtProductName}, you must use the Node Maintenance Operator that is bundled with it. -==== +You can use the Node Maintenance Operator to place nodes in maintenance mode by using the `oc adm` utility or `NodeMaintenance` custom resources (CRs). include::modules/eco-about-node-maintenance-standalone.adoc[leveloffset=+1] -include::modules/eco-maintaining-bare-metal-nodes.adoc[leveloffset=+1] - [id="installing-standalone-nmo"] == Installing the Node Maintenance Operator You can install the Node Maintenance Operator using the web console or the OpenShift CLI (`oc`). +[NOTE] +==== +If {VirtProductName} version 4.10 or earlier is installed in your cluster, it includes an outdated version of the Node Maintenance Operator.
+==== + include::modules/eco-node-maintenance-operator-installation-web-console.adoc[leveloffset=+2] include::modules/eco-node-maintenance-operator-installation-cli.adoc[leveloffset=+2] @@ -29,23 +27,37 @@ The Node Maintenance Operator is supported in a restricted network environment. [id="setting-node-in-maintenance-mode"] == Setting a node to maintenance mode -You can place a node into maintenance from the web console or in the CLI by using a `NodeMaintenance` CR. +You can place a node into maintenance mode from the web console or from the CLI by using a `NodeMaintenance` CR. include::modules/eco-setting-node-maintenance-cr-web-console.adoc[leveloffset=+2] include::modules/eco-setting-node-maintenance-cr-cli.adoc[leveloffset=+2] -include::modules/eco-checking_status_of_node_maintenance_cr_tasks.adoc[leveloffset=+3] +include::modules/eco-checking_status_of_node_maintenance_cr_tasks.adoc[leveloffset=+2] [id="resuming-node-from-maintenance-mode"] == Resuming a node from maintenance mode - -You can resume a node from maintenance mode from the CLI or by using a `NodeMaintenance` CR. Resuming a node brings it out of maintenance mode and makes it schedulable again. +You can resume a node from maintenance mode from the web console or from the CLI by using a `NodeMaintenance` CR. Resuming a node brings it out of maintenance mode and makes it schedulable again. include::modules/eco-resuming-node-maintenance-cr-web-console.adoc[leveloffset=+2] include::modules/eco-resuming-node-maintenance-cr-cli.adoc[leveloffset=+2] +[id="working-with-bare-metal-nodes"] +== Working with bare-metal nodes +For clusters with bare-metal nodes, you can place a node into maintenance mode, and resume a node from maintenance mode, by using the web console *Actions* control. + +[NOTE] +==== +Clusters with bare-metal nodes can also place a node into maintenance mode, and resume a node from maintenance mode, by using the web console and CLI, as outlined. 
The methods that use the web console *Actions* control apply to bare-metal clusters only. +==== + +include::modules/eco-maintaining-bare-metal-nodes.adoc[leveloffset=+2] + +include::modules/eco-setting-node-maintenance-actions-web-console.adoc[leveloffset=+2] + +include::modules/eco-resuming-node-maintenance-actions-web-console.adoc[leveloffset=+2] + [id="gather-data-nmo"] == Gathering data about the Node Maintenance Operator To collect debugging information about the Node Maintenance Operator, use the `must-gather` tool. For information about the `must-gather` image for the Node Maintenance Operator, see xref:../../support/gathering-cluster-data.adoc#gathering-data-specific-features_gathering-cluster-data[Gathering data about specific features]. @@ -55,4 +67,4 @@ To collect debugging information about the Node Maintenance Operator, use the `m == Additional resources * xref:../../support/gathering-cluster-data.adoc#gathering-cluster-data[Gathering data about your cluster] * xref:../../nodes/nodes/nodes-nodes-working.adoc#nodes-nodes-working-evacuating_nodes-nodes-working[Understanding how to evacuate pods on nodes] -* xref:../../nodes/nodes/nodes-nodes-working.adoc#nodes-nodes-working-marking_nodes-nodes-working[Understanding how to mark nodes as unschedulable or schedulable] \ No newline at end of file +* xref:../../nodes/nodes/nodes-nodes-working.adoc#nodes-nodes-working-marking_nodes-nodes-working[Understanding how to mark nodes as unschedulable or schedulable]