mirror of https://github.com/openshift/openshift-docs.git synced 2026-02-05 12:46:18 +01:00

TELCODOCS-28: New status items, drainProgress and elapsedTime, added

TELCODOCS-28: Status items updated / Setting a node to maintenance mode in the web console using the Actions control added

TELCODOCS-28: SME feedback applied

TELCODOCS-28: Added new module 'Resuming a node from maintenance mode in the web console using the Actions control'

TELCODOCS-28: Applied SME / QE feedback

TELCODOCS-28: Applied 2nd round of edits from SME / QE feedback

TELCODOCS-28: Applied 3rd round of edits from SME / QE feedback

TELCODOCS-28: Peer review feedback applied

TELCODOCS-28: Applied 4th round of edits from SME / QE feedback

TELCODOCS-28: Applied 5th round of edits from SME / QE feedback

TELCODOCS-28: Resolve merge conflict in NMO assembly

TELCODOCS-28: Applied 6th round of edits from SME feedback

TELCODOCS-28: Applied 2nd round of edits from peer review

TELCODOCS-28: Applied 3rd round of edits from peer review
This commit is contained in:
Padraig O'Grady
2022-06-22 10:53:34 +01:00
committed by openshift-cherrypick-robot
parent 6709a9ef99
commit 13412a9d1e
10 changed files with 115 additions and 28 deletions


@@ -5,8 +5,6 @@
[id="eco-about-node-maintenance-operator_{context}"]
= About the Node Maintenance Operator
The Node Maintenance Operator watches for new or deleted `NodeMaintenance` CRs. When a new `NodeMaintenance` CR is detected, no new workloads are scheduled and the node is cordoned off from the rest of the cluster. All pods that can be evicted are evicted from the node. When a `NodeMaintenance` CR is deleted, the node that is referenced in the CR is made available for new workloads.
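For illustration, a minimal `NodeMaintenance` CR that triggers this flow might look like the following sketch. The `apiVersion` is an assumption inferred from the `nodemaintenance.medik8s.io` API group that appears elsewhere in this commit; verify it against the CRD installed in your cluster, and replace the names with your own:

[source,yaml]
----
apiVersion: nodemaintenance.medik8s.io/v1beta1 # assumed version; check your installed CRD
kind: NodeMaintenance
metadata:
  name: maintenance-example
spec:
  nodeName: node-1.example.com
  reason: "Node maintenance"
----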
[NOTE]


@@ -36,12 +36,16 @@ items:
nodeName: node-1.example.com
reason: Node maintenance
status:
drainProgress: 100 <1>
evictionPods: 3 <2>
lastError: "Last failure message" <3>
lastUpdate: "2022-06-23T11:43:18Z" <4>
phase: Succeeded
totalpods: 5 <5>
...
----
<1> The percentage completion of draining the node.
<2> The number of pods scheduled for eviction.
<3> The latest eviction error, if any.
<4> The last time the status was updated.
<5> The total number of pods before the node entered maintenance mode.
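To watch a single status field, such as the drain progress, you can query the CR with a JSONPath expression. This is a hedged sketch: the CR name `maintenance-example` is borrowed from the deletion example elsewhere in this commit, and you must substitute your own CR name:

[source,terminal]
----
$ oc get nodemaintenance maintenance-example -o jsonpath='{.status.drainProgress}'
----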


@@ -29,4 +29,5 @@ To confirm that the installation is successful:
If the Operator is not installed successfully:
. Navigate to the *Operators* -> *Installed Operators* page and inspect the `Status` column for any errors or failures.
. Navigate to the *Operators* -> *Installed Operators* -> *Node Maintenance Operator* -> *Details* page, and inspect the `Conditions` section for errors before pod creation.
. Navigate to the *Workloads* -> *Pods* page, search for the `Node Maintenance Operator` pod in the installed namespace, and check the logs in the `Logs` tab.


@@ -0,0 +1,24 @@
// Module included in the following assemblies:
//
//nodes/nodes/eco-node-maintenance-operator.adoc
:_content-type: PROCEDURE
[id="eco-resuming-node-maintenance-actions-web-console_{context}"]
= Resuming a bare-metal node from maintenance mode
Resume a bare-metal node from maintenance mode using the Options menu {kebab} found on each node in the *Compute* -> *Nodes* list, or using the *Actions* control of the *Node Details* screen.
.Procedure
. From the *Administrator* perspective of the web console, click *Compute* -> *Nodes*.
. You can resume the node from this screen, which makes it easier to perform actions on multiple nodes, or from the *Node Details* screen, where you can view comprehensive details of the selected node:
** Click the Options menu {kebab} at the end of the node and select
*Stop Maintenance*.
** Click the node name to open the *Node Details* screen and click
*Actions* -> *Stop Maintenance*.
. Click *Stop Maintenance* in the confirmation window.
The node becomes schedulable. Virtual machine instances that were running on the node prior to maintenance do not automatically migrate back to this node.
.Verification
* Navigate to the *Compute* -> *Nodes* page and verify that the corresponding node has a status of `Ready`.


@@ -28,3 +28,24 @@ $ oc delete -f nodemaintenance-cr.yaml
----
nodemaintenance.nodemaintenance.medik8s.io "maintenance-example" deleted
----
.Verification
. Check the progress of the maintenance task by running the following command:
+
[source,terminal]
----
$ oc describe node <node-name>
----
+
where `<node-name>` is the name of your node; for example, `node-1.example.com`.
. Check the example output:
+
[source,terminal]
----
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal NodeSchedulable 2m kubelet Node node-1.example.com status is now: NodeSchedulable
----
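As a further check that the cordon was removed, you can inspect the `spec.unschedulable` field of the node object; an empty result indicates that the node is schedulable again. This is a generic `oc` query, not a command from the original module:

[source,terminal]
----
$ oc get node <node-name> -o jsonpath='{.spec.unschedulable}'
----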


@@ -7,7 +7,7 @@
= Resuming a node from maintenance mode by using the web console
To resume a node from maintenance mode, you can delete a `NodeMaintenance` custom resource (CR) by using the web console.
.Prerequisites
* Log in as a user with `cluster-admin` privileges.
@@ -27,4 +27,4 @@ To resume a node from maintenance mode, you can delete a `NodeMaintenance` custo
. In the {product-title} console, click *Compute → Nodes*.
. Inspect the `Status` column of the node for which you deleted the `NodeMaintenance` CR and verify that its status is `Ready`.


@@ -0,0 +1,23 @@
// Module included in the following assemblies:
//
//nodes/nodes/eco-node-maintenance-operator.adoc
:_content-type: PROCEDURE
[id="eco-setting-node-maintenance-actions-web-console_{context}"]
= Setting a bare-metal node to maintenance mode
Set a bare-metal node to maintenance mode using the Options menu {kebab} found on each node in the *Compute* -> *Nodes* list, or using the *Actions* control of the *Node Details* screen.
.Procedure
. From the *Administrator* perspective of the web console, click *Compute* -> *Nodes*.
. You can set the node to maintenance mode from this screen, which makes it easier to perform actions on multiple nodes, or from the *Node Details* screen, where you can view comprehensive details of the selected node:
** Click the Options menu {kebab} at the end of the node and select *Start Maintenance*.
** Click the node name to open the *Node Details* screen and click
*Actions* -> *Start Maintenance*.
. Click *Start Maintenance* in the confirmation window.
The node is no longer schedulable. Virtual machines with the `LiveMigration` eviction strategy are live migrated to another node. All other pods and virtual machines on the node are deleted and recreated on another node.
.Verification
* Navigate to the *Compute* -> *Nodes* page and verify that the corresponding node has a status of `Under maintenance`.


@@ -38,14 +38,18 @@ spec:
$ oc apply -f nodemaintenance-cr.yaml
----
.Verification
. Check the progress of the maintenance task by running the following command:
+
[source,terminal]
----
$ oc describe node <node-name>
----
+
where `<node-name>` is the name of your node; for example, `node-1.example.com`.
. Check the example output:
+
[source,terminal]
----


@@ -27,4 +27,4 @@ To set a node to maintenance mode, you can create a `NodeMaintenance` custom res
.Verification
In the *Node Maintenance* tab, inspect the `Status` column and verify that its status is `Succeeded`.


@@ -6,21 +6,19 @@ include::_attributes/common-attributes.adoc[]
toc::[]
You can use the Node Maintenance Operator to place nodes in maintenance mode. This is a standalone version of the Node Maintenance Operator that is independent of {VirtProductName} installation.
[NOTE]
====
If you have installed {VirtProductName}, you must use the Node Maintenance Operator that is bundled with it.
====
You can use the Node Maintenance Operator to place nodes in maintenance mode by using the `oc adm` utility or `NodeMaintenance` custom resources (CRs).
include::modules/eco-about-node-maintenance-standalone.adoc[leveloffset=+1]
[id="installing-standalone-nmo"]
== Installing the Node Maintenance Operator
You can install the Node Maintenance Operator using the web console or the OpenShift CLI (`oc`).
[NOTE]
====
If {VirtProductName} version 4.10 or earlier is installed in your cluster, it includes an outdated version of the Node Maintenance Operator.
====
include::modules/eco-node-maintenance-operator-installation-web-console.adoc[leveloffset=+2]
include::modules/eco-node-maintenance-operator-installation-cli.adoc[leveloffset=+2]
@@ -29,23 +27,37 @@ The Node Maintenance Operator is supported in a restricted network environment.
[id="setting-node-in-maintenance-mode"]
== Setting a node to maintenance mode
You can place a node into maintenance mode from the web console or from the CLI by using a `NodeMaintenance` CR.
include::modules/eco-setting-node-maintenance-cr-web-console.adoc[leveloffset=+2]
include::modules/eco-setting-node-maintenance-cr-cli.adoc[leveloffset=+2]
include::modules/eco-checking_status_of_node_maintenance_cr_tasks.adoc[leveloffset=+2]
[id="resuming-node-from-maintenance-mode"]
== Resuming a node from maintenance mode
You can resume a node from maintenance mode from the web console or from the CLI by using a `NodeMaintenance` CR. Resuming a node brings it out of maintenance mode and makes it schedulable again.
include::modules/eco-resuming-node-maintenance-cr-web-console.adoc[leveloffset=+2]
include::modules/eco-resuming-node-maintenance-cr-cli.adoc[leveloffset=+2]
[id="working-with-bare-metal-nodes"]
== Working with bare-metal nodes
For clusters with bare-metal nodes, you can place a node into maintenance mode, and resume a node from maintenance mode, by using the web console *Actions* control.
[NOTE]
====
You can also place a bare-metal node into maintenance mode, and resume it from maintenance mode, by using the web console and the CLI, as outlined in the previous sections. The methods that use the web console *Actions* control apply to bare-metal clusters only.
====
include::modules/eco-maintaining-bare-metal-nodes.adoc[leveloffset=+2]
include::modules/eco-setting-node-maintenance-actions-web-console.adoc[leveloffset=+2]
include::modules/eco-resuming-node-maintenance-actions-web-console.adoc[leveloffset=+2]
[id="gather-data-nmo"]
== Gathering data about the Node Maintenance Operator
To collect debugging information about the Node Maintenance Operator, use the `must-gather` tool. For information about the `must-gather` image for the Node Maintenance Operator, see xref:../../support/gathering-cluster-data.adoc#gathering-data-specific-features_gathering-cluster-data[Gathering data about specific features].
@@ -55,4 +67,4 @@ To collect debugging information about the Node Maintenance Operator, use the `m
== Additional resources
* xref:../../support/gathering-cluster-data.adoc#gathering-cluster-data[Gathering data about your cluster]
* xref:../../nodes/nodes/nodes-nodes-working.adoc#nodes-nodes-working-evacuating_nodes-nodes-working[Understanding how to evacuate pods on nodes]
* xref:../../nodes/nodes/nodes-nodes-working.adoc#nodes-nodes-working-marking_nodes-nodes-working[Understanding how to mark nodes as unschedulable or schedulable]