mirror of
https://github.com/openshift/openshift-docs.git
synced 2026-02-05 12:46:18 +01:00
28 lines
2.1 KiB
Plaintext
28 lines
2.1 KiB
Plaintext
// Module included in the following assemblies:
|
|
//
|
|
// * nodes/nodes/eco-node-health-check-operator.adoc
|
|
|
|
:_mod-docs-content-type: CONCEPT
|
|
[id="control-plane-fencing-self-node-remediation-operator_{context}"]
|
|
= Control plane fencing
|
|
|
|
In earlier releases, you could enable Self Node Remediation and Node Health Check on worker nodes. In the event of node failure, you can now also follow remediation strategies on control plane nodes.
|
|
|
|
Self Node Remediation occurs in two primary scenarios.
|
|
|
|
* API Server Connectivity
|
|
** In this scenario, the control plane node to be remediated is not isolated. It can be directly connected to the API Server, or it can be indirectly connected to the API Server through worker nodes or control-plane nodes, that are directly connected to the API Server.
|
|
** When there is API Server Connectivity, the control plane node is remediated only if the Node Health Check Operator has created a `SelfNodeRemediation` custom resource (CR) for the node.
|
|
|
|
* No API Server Connectivity
|
|
** In this scenario, the control plane node to be remediated is isolated from the API Server. The node cannot connect directly or indirectly to the API Server.
|
|
** When there is no API Server Connectivity, the control plane node will be remediated as outlined with these steps:
|
|
|
|
|
|
*** Check the status of the control plane node with the majority of the peer worker nodes. If the majority of the peer worker nodes cannot be reached, the node will be analyzed further.
|
|
**** Self-diagnose the status of the control plane node
|
|
***** If self diagnostics passed, no action will be taken.
|
|
***** If self diagnostics failed, the node will be fenced and remediated.
|
|
***** The self diagnostics currently supported are checking the `kubelet` service status, and checking endpoint availability using `opt in` configuration.
|
|
*** If the node did not manage to communicate to most of its worker peers, check the connectivity of the control plane node with other control plane nodes. If the node can communicate with any other control plane peer, no action will be taken. Otherwise, the node will be fenced and remediated.
|