mirror of https://github.com/openshift/openshift-docs.git
synced 2026-02-05 12:46:18 +01:00

TELCODOCS-28: New status items, drainProgress and elapsedTime, added

TELCODOCS-28: Status items updated
TELCODOCS-28: Setting a node to maintenance mode in the web console using the Actions control added
TELCODOCS-28: SME feedback applied
TELCODOCS-28: Added new module 'Resuming a node from maintenance mode in the web console using the Actions control'
TELCODOCS-28: Applied SME / QE feedback
TELCODOCS-28: Applied 2nd round of edits from SME / QE feedback
TELCODOCS-28: Applied 3rd round of edits from SME / QE feedback
TELCODOCS-28: Peer review feedback applied
TELCODOCS-28: Applied 4th round of edits from SME / QE feedback
TELCODOCS-28: Applied 5th round of edits from SME / QE feedback
TELCODOCS-28: Resolve merge conflict in NMO assembly
TELCODOCS-28: Applied 6th round of edits from SME feedback
TELCODOCS-28: Applied 2nd round of edits from peer review
TELCODOCS-28: Applied 3rd round of edits from peer review

This commit is contained in:
committed by openshift-cherrypick-robot
parent 6709a9ef99
commit 13412a9d1e
@@ -5,8 +5,6 @@
[id="eco-about-node-maintenance-operator_{context}"]
= About the Node Maintenance Operator

You can place nodes into maintenance mode using the `oc adm` utility, or using `NodeMaintenance` custom resources (CRs).

The Node Maintenance Operator watches for new or deleted `NodeMaintenance` CRs. When a new `NodeMaintenance` CR is detected, no new workloads are scheduled and the node is cordoned off from the rest of the cluster. All pods that can be evicted are evicted from the node. When a `NodeMaintenance` CR is deleted, the node that is referenced in the CR is made available for new workloads.

[NOTE]
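The CR-based workflow described above can be sketched as a minimal manifest. This is a hedged example, not taken from this commit: the API group and version (`nodemaintenance.medik8s.io/v1beta1`) follow the standalone Operator's CRD as commonly published, and the metadata name and node name are placeholders.

[source,yaml]
----
apiVersion: nodemaintenance.medik8s.io/v1beta1
kind: NodeMaintenance
metadata:
  name: maintenance-example
spec:
  nodeName: node-1.example.com
  reason: "Node maintenance"
----

Applying this manifest with `oc apply -f` cordons and drains the referenced node; deleting it makes the node schedulable again.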
@@ -36,12 +36,16 @@ items:
    nodeName: node-1.example.com
    reason: Node maintenance
  status:
    evictionPods: 3 <1>
    lastError: "Last failure message" <2>
    drainProgress: 100 <1>
    evictionPods: 3 <2>
    lastError: "Last failure message" <3>
    lastUpdate: "2022-06-23T11:43:18Z" <4>
    phase: Succeeded
    totalpods: 5 <3>
    totalpods: 5 <5>
...
----
<1> The number of pods scheduled for eviction.
<2> The latest eviction error, if any.
<3> The total number of pods before the node entered maintenance mode.
<1> The percentage completion of draining the node.
<2> The number of pods scheduled for eviction.
<3> The latest eviction error, if any.
<4> The last time the status was updated.
<5> The total number of pods before the node entered maintenance mode.
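One way to watch the new `drainProgress` field from the CLI is a JSONPath query; this is a sketch under assumptions, with `maintenance-example` as a placeholder CR name.

[source,terminal]
----
$ oc get nodemaintenance maintenance-example -o jsonpath='{.status.drainProgress}{" "}{.status.phase}{"\n"}'
----

When the drain completes, the output should show `100` together with a `phase` of `Succeeded`.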
@@ -29,4 +29,5 @@ To confirm that the installation is successful:
If the Operator is not installed successfully:

. Navigate to the *Operators* -> *Installed Operators* page and inspect the `Status` column for any errors or failures.
. Navigate to the *Workloads* -> *Pods* page and check the logs in any pods in the `openshift-operators` project that are reporting issues.
. Navigate to the *Operators* -> *Installed Operators* -> *Node Maintenance Operator* -> *Details* page, and inspect the `Conditions` section for errors before pod creation.
. Navigate to the *Workloads* -> *Pods* page, search for the `Node Maintenance Operator` pod in the installed namespace, and check the logs in the `Logs` tab.
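The console troubleshooting steps above have rough CLI equivalents; a hedged sketch, assuming the Operator was installed in the `openshift-operators` namespace (the pod name is a placeholder):

[source,terminal]
----
$ oc get csv -n openshift-operators
$ oc get pods -n openshift-operators
$ oc logs -n openshift-operators <node-maintenance-operator-pod>
----

The `ClusterServiceVersion` (`csv`) status reports installation failures, and the pod logs surface runtime errors.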
@@ -0,0 +1,24 @@
|
||||
// Module included in the following assemblies:
|
||||
//
|
||||
//nodes/nodes/eco-node-maintenance-operator.adoc
|
||||
|
||||
:_content-type: PROCEDURE
|
||||
[id="eco-resuming-node-maintenance-actions-web-console_{context}"]
|
||||
= Resuming a bare-metal node from maintenance mode
|
||||
Resume a bare-metal node from maintenance mode using the Options menu {kebab} found on each node in the *Compute* -> *Nodes* list, or using the *Actions* control of the *Node Details* screen.
|
||||
|
||||
.Procedure
|
||||
|
||||
. From the *Administrator* perspective of the web console, click *Compute* -> *Nodes*.
|
||||
. You can resume the node from this screen, which makes it easier to perform actions on multiple nodes, or from the *Node Details* screen, where you can view comprehensive details of the selected node:
|
||||
** Click the Options menu {kebab} at the end of the node and select
|
||||
*Stop Maintenance*.
|
||||
** Click the node name to open the *Node Details* screen and click
|
||||
*Actions* -> *Stop Maintenance*.
|
||||
. Click *Stop Maintenance* in the confirmation window.
|
||||
|
||||
The node becomes schedulable. If it had virtual machine instances that were running on the node prior to maintenance, then they will not automatically migrate back to this node.
|
||||
|
||||
.Verification
|
||||
|
||||
* Navigate to the *Compute* -> *Nodes* page and verify that the corresponding node has a status of `Ready`.
|
||||
@@ -28,3 +28,24 @@ $ oc delete -f nodemaintenance-cr.yaml
----
nodemaintenance.nodemaintenance.medik8s.io "maintenance-example" deleted
----

.Verification

. Check the progress of the maintenance task by running the following command:
+
[source,terminal]
----
$ oc describe node <node-name>
----
+
where `<node-name>` is the name of your node; for example, `node-1.example.com`.

. Check the example output:
+
[source,terminal]
----
Events:
  Type    Reason           Age  From     Message
  ----    ------           ---- ----     -------
  Normal  NodeSchedulable  2m   kubelet  Node node-1.example.com status is now: NodeSchedulable
----
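A complementary check, sketched here as an assumption rather than part of this procedure, is to confirm that the CR itself is gone after the `oc delete` step (`maintenance-example` is the placeholder name used earlier):

[source,terminal]
----
$ oc get nodemaintenance maintenance-example
----

If the deletion succeeded, this command returns a `NotFound` error rather than a resource.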
@@ -7,7 +7,7 @@
= Resuming a node from maintenance mode by using the web console

To resume a node from maintenance mode, you can delete a `NodeMaintenance` custom resource (CR) by using the web console.

.Prerequisites

* Log in as a user with `cluster-admin` privileges.

@@ -27,4 +27,4 @@ To resume a node from maintenance mode, you can delete a `NodeMaintenance` custo

. In the {product-title} console, click *Compute* -> *Nodes*.

. Inspect the `Status` column of the node for which you deleted the `NodeMaintenance` CR and verify that its status is `Ready`.
. Inspect the `Status` column of the node for which you deleted the `NodeMaintenance` CR and verify that its status is `Ready`.
@@ -0,0 +1,23 @@
// Module included in the following assemblies:
//
//nodes/nodes/eco-node-maintenance-operator.adoc

:_content-type: PROCEDURE
[id="eco-setting-node-maintenance-actions-web-console_{context}"]
= Setting a bare-metal node to maintenance mode

Set a bare-metal node to maintenance mode using the Options menu {kebab} found on each node in the *Compute* -> *Nodes* list, or using the *Actions* control of the *Node Details* screen.

.Procedure

. From the *Administrator* perspective of the web console, click *Compute* -> *Nodes*.
. You can set the node to maintenance mode from this screen, which makes it easier to perform actions on multiple nodes, or from the *Node Details* screen, where you can view comprehensive details of the selected node:
** Click the Options menu {kebab} at the end of the node and select *Start Maintenance*.
** Click the node name to open the *Node Details* screen and click *Actions* -> *Start Maintenance*.
. Click *Start Maintenance* in the confirmation window.

The node is no longer schedulable. Virtual machines on the node with the `LiveMigration` eviction strategy are live migrated; all other pods and virtual machines on the node are deleted and recreated on another node.

.Verification

* Navigate to the *Compute* -> *Nodes* page and verify that the corresponding node has a status of `Under maintenance`.
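The same state is visible from the CLI; as a hedged sketch (`<node-name>` is a placeholder), a cordoned node reports `SchedulingDisabled` in its status:

[source,terminal]
----
$ oc get node <node-name>
----

While the node is in maintenance mode, the `STATUS` column typically shows `Ready,SchedulingDisabled`.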
@@ -38,14 +38,18 @@ spec:
$ oc apply -f nodemaintenance-cr.yaml
----

. Check the progress of the maintenance task by running the following command, replacing `<node-name>` with the name of your node; for example, `node-1.example.com`:
.Verification

. Check the progress of the maintenance task by running the following command:
+
[source,terminal]
----
$ oc describe node node-1.example.com
$ oc describe node <node-name>
----
+
.Example output
where `<node-name>` is the name of your node; for example, `node-1.example.com`.

. Check the example output:
+
[source,terminal]
----
@@ -27,4 +27,4 @@ To set a node to maintenance mode, you can create a `NodeMaintenance` custom res

.Verification

In the *Node Maintenance* tab, inspect the `Status` column and verify that its status is `Succeeded`.
In the *Node Maintenance* tab, inspect the `Status` column and verify that its status is `Succeeded`.
@@ -6,21 +6,19 @@ include::_attributes/common-attributes.adoc[]

toc::[]

You can use the Node Maintenance Operator to place nodes in maintenance mode. This is a standalone version of the Node Maintenance Operator that is independent of {VirtProductName} installation.

[NOTE]
====
If you have installed {VirtProductName}, you must use the Node Maintenance Operator that is bundled with it.
====
You can use the Node Maintenance Operator to place nodes in maintenance mode by using the `oc adm` utility or `NodeMaintenance` custom resources (CRs).

include::modules/eco-about-node-maintenance-standalone.adoc[leveloffset=+1]

include::modules/eco-maintaining-bare-metal-nodes.adoc[leveloffset=+1]

[id="installing-standalone-nmo"]
== Installing the Node Maintenance Operator
You can install the Node Maintenance Operator using the web console or the OpenShift CLI (`oc`).

[NOTE]
====
If {VirtProductName} version 4.10 or earlier is installed in your cluster, it includes an outdated version of the Node Maintenance Operator.
====

include::modules/eco-node-maintenance-operator-installation-web-console.adoc[leveloffset=+2]

include::modules/eco-node-maintenance-operator-installation-cli.adoc[leveloffset=+2]
@@ -29,23 +27,37 @@ The Node Maintenance Operator is supported in a restricted network environment.

[id="setting-node-in-maintenance-mode"]
== Setting a node to maintenance mode
You can place a node into maintenance from the web console or in the CLI by using a `NodeMaintenance` CR.
You can place a node into maintenance mode from the web console or from the CLI by using a `NodeMaintenance` CR.

include::modules/eco-setting-node-maintenance-cr-web-console.adoc[leveloffset=+2]

include::modules/eco-setting-node-maintenance-cr-cli.adoc[leveloffset=+2]

include::modules/eco-checking_status_of_node_maintenance_cr_tasks.adoc[leveloffset=+3]
include::modules/eco-checking_status_of_node_maintenance_cr_tasks.adoc[leveloffset=+2]

[id="resuming-node-from-maintenance-mode"]
== Resuming a node from maintenance mode

You can resume a node from maintenance mode from the CLI or by using a `NodeMaintenance` CR. Resuming a node brings it out of maintenance mode and makes it schedulable again.
You can resume a node from maintenance mode from the web console or from the CLI by using a `NodeMaintenance` CR. Resuming a node brings it out of maintenance mode and makes it schedulable again.

include::modules/eco-resuming-node-maintenance-cr-web-console.adoc[leveloffset=+2]

include::modules/eco-resuming-node-maintenance-cr-cli.adoc[leveloffset=+2]

[id="working-with-bare-metal-nodes"]
== Working with bare-metal nodes
For clusters with bare-metal nodes, you can place a node into maintenance mode, and resume a node from maintenance mode, by using the web console *Actions* control.

[NOTE]
====
Clusters with bare-metal nodes can also place a node into maintenance mode, and resume a node from maintenance mode, by using the web console and the CLI, as outlined in the previous sections. The methods that use the web console *Actions* control are applicable to bare-metal clusters only.
====

include::modules/eco-maintaining-bare-metal-nodes.adoc[leveloffset=+2]

include::modules/eco-setting-node-maintenance-actions-web-console.adoc[leveloffset=+2]

include::modules/eco-resuming-node-maintenance-actions-web-console.adoc[leveloffset=+2]

[id="gather-data-nmo"]
== Gathering data about the Node Maintenance Operator
To collect debugging information about the Node Maintenance Operator, use the `must-gather` tool. For information about the `must-gather` image for the Node Maintenance Operator, see xref:../../support/gathering-cluster-data.adoc#gathering-data-specific-features_gathering-cluster-data[Gathering data about specific features].
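The `must-gather` invocation takes this general shape; the image reference below is a placeholder, not the actual Node Maintenance Operator `must-gather` image, which is documented in the xref cited above:

[source,terminal]
----
$ oc adm must-gather --image=<node-maintenance-operator-must-gather-image>
----

Running `oc adm must-gather` without `--image` collects only the default cluster data, so the Operator-specific image is required for NMO debugging information.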
@@ -55,4 +67,4 @@ To collect debugging information about the Node Maintenance Operator, use the `m
== Additional resources
* xref:../../support/gathering-cluster-data.adoc#gathering-cluster-data[Gathering data about your cluster]
* xref:../../nodes/nodes/nodes-nodes-working.adoc#nodes-nodes-working-evacuating_nodes-nodes-working[Understanding how to evacuate pods on nodes]
* xref:../../nodes/nodes/nodes-nodes-working.adoc#nodes-nodes-working-marking_nodes-nodes-working[Understanding how to mark nodes as unschedulable or schedulable]
* xref:../../nodes/nodes/nodes-nodes-working.adoc#nodes-nodes-working-marking_nodes-nodes-working[Understanding how to mark nodes as unschedulable or schedulable]