mirror of https://github.com/openshift/openshift-docs.git synced 2026-02-05 12:46:18 +01:00

OCPBUGS34817: Infrastructure node workloads 'taints' are ambiguous in example and description

This commit is contained in:
Michael Burke
2025-03-21 12:34:54 -04:00
committed by openshift-cherrypick-robot
parent 0f25c098bd
commit 12fd5e8bc3
6 changed files with 84 additions and 146 deletions

View File

@@ -100,6 +100,11 @@ Some of the infrastructure resources are deployed in your cluster by default. Yo
[source,yaml]
----
apiVersion: imageregistry.operator.openshift.io/v1
kind: Config
metadata:
  name: cluster
# ...
spec:
  nodePlacement: <1>
    nodeSelector:

View File

@@ -7,11 +7,11 @@
[id="binding-infra-node-workloads-using-taints-tolerations_{context}"]
= Binding infrastructure node workloads using taints and tolerations
If you have an infra node that has the `infra` and `worker` roles assigned, you must configure the node so that user workloads are not assigned to it.
If you have an infrastructure node that has the `infra` and `worker` roles assigned, you must configure the node so that user workloads are not assigned to it.
[IMPORTANT]
====
It is recommended that you preserve the dual `infra,worker` label that is created for infra nodes and use taints and tolerations to manage nodes that user workloads are scheduled on. If you remove the `worker` label from the node, you must create a custom pool to manage it. A node with a label other than `master` or `worker` is not recognized by the MCO without a custom pool. Maintaining the `worker` label allows the node to be managed by the default worker machine config pool, if no custom pools that select the custom label exists. The `infra` label communicates to the cluster that it does not count toward the total number of subscriptions.
It is recommended that you preserve the dual `infra,worker` label that is created for infrastructure nodes and use taints and tolerations to manage nodes that user workloads are scheduled on. If you remove the `worker` label from the node, you must create a custom pool to manage it. A node with a label other than `master` or `worker` is not recognized by the MCO without a custom pool. Maintaining the `worker` label allows the node to be managed by the default worker machine config pool, if no custom pool that selects the custom label exists. The `infra` label communicates to the cluster that the node does not count toward the total number of subscriptions.
====
.Prerequisites
@@ -20,7 +20,7 @@ It is recommended that you preserve the dual `infra,worker` label that is create
.Procedure
. Add a taint to the infra node to prevent scheduling user workloads on it:
. Add a taint to the infrastructure node to prevent scheduling user workloads on it:
.. Determine if the node has the taint:
+
@@ -36,7 +36,7 @@ oc describe node ci-ln-iyhx092-f76d1-nvdfm-worker-b-wln2l
Name: ci-ln-iyhx092-f76d1-nvdfm-worker-b-wln2l
Roles: worker
...
Taints: node-role.kubernetes.io/infra:NoSchedule
Taints: node-role.kubernetes.io/infra=reserved:NoSchedule
...
----
+
@@ -58,97 +58,61 @@ $ oc adm taint nodes node1 node-role.kubernetes.io/infra=reserved:NoSchedule
+
[TIP]
====
You can alternatively apply the following YAML to add the taint:
You can alternatively edit the node specification to add the taint:
[source,yaml]
----
kind: Node
apiVersion: v1
kind: Node
metadata:
name: <node_name>
labels:
...
name: node1
# ...
spec:
taints:
- key: node-role.kubernetes.io/infra
effect: NoSchedule
value: reserved
...
effect: NoSchedule
# ...
----
====
+
This example places a taint on `node1` that has key `node-role.kubernetes.io/infra` and taint effect `NoSchedule`. Nodes with the `NoSchedule` effect schedule only pods that tolerate the taint, but allow existing pods to remain scheduled on the node.
These examples place a taint on `node1` that has the `node-role.kubernetes.io/infra` key and the `NoSchedule` taint effect. Nodes with the `NoSchedule` effect schedule only pods that tolerate the taint, but allow existing pods to remain scheduled on the node.
+
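As a quick check (a sketch that is not part of the documented procedure), you can read the taint back from the node to confirm it was applied; `node1` is the example node name used above:

[source,terminal]
----
$ oc get node node1 -o jsonpath='{.spec.taints}'
----

The command prints the `taints` array from the node specification.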
[NOTE]
====
If a descheduler is used, pods violating node taints could be evicted from the cluster.
====
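For context on that note: when the Kube Descheduler Operator is installed, eviction behavior is configured through a `KubeDescheduler` custom resource. The following is only a rough sketch under that assumption; verify the profile and field names against the descheduler documentation before using it:

[source,yaml]
----
apiVersion: operator.openshift.io/v1
kind: KubeDescheduler
metadata:
  name: cluster
  namespace: openshift-kube-descheduler-operator
spec:
  deschedulingIntervalSeconds: 3600
  profiles:
  - AffinityAndTaints # assumed profile name; enables eviction of pods that violate node taints
----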
.. Add the taint with NoExecute Effect along with the above taint with NoSchedule Effect:
. Add tolerations to the pods that you want to schedule on the infrastructure node, such as the router, registry, and monitoring workloads. Referencing the previous examples, add the following tolerations to the `Pod` object specification:
+
[source,terminal]
----
$ oc adm taint nodes <node_name> <key>=<value>:<effect>
----
+
For example:
+
[source,terminal]
----
$ oc adm taint nodes node1 node-role.kubernetes.io/infra=reserved:NoExecute
----
+
[TIP]
====
You can alternatively apply the following YAML to add the taint:
[source,yaml]
----
kind: Node
apiVersion: v1
kind: Pod
metadata:
name: <node_name>
labels:
...
annotations:
# ...
spec:
taints:
- key: node-role.kubernetes.io/infra
effect: NoExecute
value: reserved
...
# ...
tolerations:
- key: node-role.kubernetes.io/infra <1>
value: reserved <2>
effect: NoSchedule <3>
operator: Equal <4>
----
====
<1> Specify the key that you added to the node.
<2> Specify the value of the key-value pair taint that you added to the node.
<3> Specify the effect that you added to the node.
<4> Specify the `Equal` operator to require a taint with the key `node-role.kubernetes.io/infra` to be present on the node.
+
This example places a taint on `node1` that has the key `node-role.kubernetes.io/infra` and taint effect `NoExecute`. Nodes with the `NoExecute` effect schedule only pods that tolerate the taint. The effect will remove any existing pods from the node that do not have a matching toleration.
+
. Add tolerations for the pod configurations you want to schedule on the infra node, like router, registry, and monitoring workloads. Add the following code to the `Pod` object specification:
+
[source,yaml]
----
tolerations:
  - effect: NoSchedule <1>
    key: node-role.kubernetes.io/infra <2>
    value: reserved <3>
  - effect: NoExecute <4>
    key: node-role.kubernetes.io/infra <5>
    operator: Equal <6>
    value: reserved <7>
----
<1> Specify the effect that you added to the node.
<2> Specify the key that you added to the node.
<3> Specify the value of the key-value pair taint that you added to the node.
<4> Specify the effect that you added to the node.
<5> Specify the key that you added to the node.
<6> Specify the `Equal` Operator to require a taint with the key `node-role.kubernetes.io/infra` to be present on the node.
<7> Specify the value of the key-value pair taint that you added to the node.
+
This toleration matches the taint created by the `oc adm taint` command. A pod with this toleration can be scheduled onto the infra node.
This toleration matches the taint created by the `oc adm taint` command. A pod with this toleration can be scheduled onto the infrastructure node.
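As an aside (a sketch, not part of this change), the toleration `operator` field also accepts `Exists`, which tolerates the taint for any value of the key; in that form the `value` field is omitted:

[source,yaml]
----
tolerations:
- key: node-role.kubernetes.io/infra
  operator: Exists
  effect: NoSchedule
----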
+
[NOTE]
====
Moving pods for an Operator installed via OLM to an infra node is not always possible. The capability to move Operator pods depends on the configuration of each Operator.
Moving pods for an Operator installed by using Operator Lifecycle Manager (OLM) to an infrastructure node is not always possible. The capability to move Operator pods depends on the configuration of each Operator.
====
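For Operators whose pods can be moved, placement is typically controlled through the `config` stanza of the Operator's `Subscription` object. The following is a minimal sketch; the subscription name, namespace, channel, and catalog source are placeholders rather than values taken from this change:

[source,yaml]
----
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: example-operator # placeholder
  namespace: example-namespace # placeholder
spec:
  channel: stable # placeholder
  name: example-operator # placeholder package name
  source: redhat-operators # placeholder catalog source
  sourceNamespace: openshift-marketplace
  config:
    nodeSelector:
      node-role.kubernetes.io/infra: ""
    tolerations:
    - key: node-role.kubernetes.io/infra
      value: reserved
      effect: NoSchedule
----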
. Schedule the pod to the infra node using a scheduler. See the documentation for _Controlling pod placement onto nodes_ for details.
. Schedule the pod to the infrastructure node by using a scheduler. See the documentation for "Controlling pod placement using the scheduler" for details.
. Remove any workloads that you do not want, or that do not belong, on the new infrastructure node. See the list of workloads supported for use on infrastructure nodes in "{product-title} infrastructure components".
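Pulling the labeling and toleration steps of this module together, a workload that is intended to run on an infrastructure node carries both a node selector for the `infra` label and a toleration for the taint. This is a hedged sketch; the pod name and image are placeholders:

[source,yaml]
----
apiVersion: v1
kind: Pod
metadata:
  name: example-infra-workload # placeholder
spec:
  nodeSelector:
    node-role.kubernetes.io/infra: "" # schedule onto nodes that carry the infra label
  tolerations:
  - key: node-role.kubernetes.io/infra # matches the taint added with oc adm taint
    value: reserved
    effect: NoSchedule
    operator: Equal
  containers:
  - name: example
    image: registry.example.com/example:latest # placeholder
----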

View File

@@ -13,33 +13,10 @@
See Creating infrastructure machine sets for installer-provisioned infrastructure environments or for any cluster where the control plane nodes are managed by the machine API.
====
Requirements of the cluster dictate that infrastructure, also called `infra` nodes, be provisioned. The installer only provides provisions for control plane and worker nodes. Worker nodes can be designated as infrastructure nodes or application, also called `app`, nodes through labeling.
Requirements of the cluster dictate that infrastructure (infra) nodes be provisioned. The installation program provisions only control plane and worker nodes. Worker nodes can be designated as infrastructure nodes through labeling. You can then use taints and tolerations to move appropriate workloads to the infrastructure nodes. For more information, see "Moving resources to infrastructure machine sets".
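If the infrastructure nodes are created by a compute machine set, the label and taint can instead be applied automatically to new machines through the machine set's node template, as covered in "Creating infrastructure machine sets". The following is a minimal sketch of the relevant fragment; the machine set name is a placeholder:

[source,yaml]
----
apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  name: example-infra-machineset # placeholder
  namespace: openshift-machine-api
spec:
  template:
    spec:
      metadata:
        labels:
          node-role.kubernetes.io/infra: "" # new nodes join with the infra role label
      taints:
      - key: node-role.kubernetes.io/infra
        value: reserved
        effect: NoSchedule
----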
.Procedure
You can optionally create a default cluster-wide node selector. The default node selector is applied to pods created in all namespaces and creates an intersection with any existing node selectors on a pod, which additionally constrains the pod's selector.
. Add a label to the worker node that you want to act as application node:
+
[source,terminal]
----
$ oc label node <node-name> node-role.kubernetes.io/app=""
----
. Add a label to the worker nodes that you want to act as infrastructure nodes:
+
[source,terminal]
----
$ oc label node <node-name> node-role.kubernetes.io/infra=""
----
. Check to see if applicable nodes now have the `infra` role and `app` roles:
+
[source,terminal]
----
$ oc get nodes
----
. Create a default cluster-wide node selector. The default node selector is applied to pods created in all namespaces. This creates an intersection with any existing node selectors on a pod, which additionally constrains the pod's selector.
+
[IMPORTANT]
====
If the default node selector key conflicts with the key of a pod's label, then the default node selector is not applied.
@@ -49,6 +26,24 @@ However, do not set a default node selector that might cause a pod to become uns
You can alternatively use a project node selector to avoid cluster-wide node selector key conflicts.
====
.Procedure
. Add a label to the worker nodes that you want to act as infrastructure nodes:
+
[source,terminal]
----
$ oc label node <node-name> node-role.kubernetes.io/infra=""
----
. Check to see if applicable nodes now have the `infra` role:
+
[source,terminal]
----
$ oc get nodes
----
. Optional: Create a default cluster-wide node selector:
.. Edit the `Scheduler` object:
+
[source,terminal]
@@ -72,4 +67,4 @@ spec:
.. Save the file to apply the changes.
You can now move infrastructure resources to the newly labeled `infra` nodes.
You can now move infrastructure resources to the new infrastructure nodes. Also, remove any workloads that you do not want, or that do not belong, on the new infrastructure node. See the list of workloads supported for use on infrastructure nodes in "{product-title} infrastructure components".
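The `Scheduler` object edited in the optional step above looks roughly like the following; this is a sketch, and the selector string is an assumption that you replace with a label that exists in your cluster:

[source,yaml]
----
apiVersion: config.openshift.io/v1
kind: Scheduler
metadata:
  name: cluster
# ...
spec:
  defaultNodeSelector: <key>=<value> # applied to pods created in all namespaces; intersects with any selector set on the pod
# ...
----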

View File

@@ -40,9 +40,6 @@ data:
- key: node-role.kubernetes.io/infra
value: reserved
effect: NoSchedule
- key: node-role.kubernetes.io/infra
value: reserved
effect: NoExecute
prometheusK8s:
nodeSelector:
node-role.kubernetes.io/infra: ""
@@ -50,9 +47,6 @@ data:
- key: node-role.kubernetes.io/infra
value: reserved
effect: NoSchedule
- key: node-role.kubernetes.io/infra
value: reserved
effect: NoExecute
prometheusOperator:
nodeSelector:
node-role.kubernetes.io/infra: ""
@@ -60,9 +54,6 @@ data:
- key: node-role.kubernetes.io/infra
value: reserved
effect: NoSchedule
- key: node-role.kubernetes.io/infra
value: reserved
effect: NoExecute
metricsServer:
nodeSelector:
node-role.kubernetes.io/infra: ""
@@ -70,9 +61,6 @@ data:
- key: node-role.kubernetes.io/infra
value: reserved
effect: NoSchedule
- key: node-role.kubernetes.io/infra
value: reserved
effect: NoExecute
kubeStateMetrics:
nodeSelector:
node-role.kubernetes.io/infra: ""
@@ -80,9 +68,6 @@ data:
- key: node-role.kubernetes.io/infra
value: reserved
effect: NoSchedule
- key: node-role.kubernetes.io/infra
value: reserved
effect: NoExecute
telemeterClient:
nodeSelector:
node-role.kubernetes.io/infra: ""
@@ -90,9 +75,6 @@ data:
- key: node-role.kubernetes.io/infra
value: reserved
effect: NoSchedule
- key: node-role.kubernetes.io/infra
value: reserved
effect: NoExecute
openshiftStateMetrics:
nodeSelector:
node-role.kubernetes.io/infra: ""
@@ -100,9 +82,6 @@ data:
- key: node-role.kubernetes.io/infra
value: reserved
effect: NoSchedule
- key: node-role.kubernetes.io/infra
value: reserved
effect: NoExecute
thanosQuerier:
nodeSelector:
node-role.kubernetes.io/infra: ""
@@ -110,9 +89,6 @@ data:
- key: node-role.kubernetes.io/infra
value: reserved
effect: NoSchedule
- key: node-role.kubernetes.io/infra
value: reserved
effect: NoExecute
monitoringPlugin:
nodeSelector:
node-role.kubernetes.io/infra: ""
@@ -120,11 +96,8 @@ data:
- key: node-role.kubernetes.io/infra
value: reserved
effect: NoSchedule
- key: node-role.kubernetes.io/infra
value: reserved
effect: NoExecute
----
<1> Add a `nodeSelector` parameter with the appropriate value to the component you want to move. You can use a `nodeSelector` in the format shown or use `<key>: <value>` pairs, based on the value specified for the node. If you added a taint to the infrastructure node, also add a matching toleration.
<1> Add a `nodeSelector` parameter with the appropriate value to the component you want to move. You can use a `nodeSelector` parameter in the format shown or use `<key>: <value>` pairs, based on the value specified for the node. If you added a taint to the infrastructure node, also add a matching toleration.
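For orientation, each of the component entries shown in these hunks sits under the `config.yaml` key of the cluster monitoring config map in the `openshift-monitoring` namespace. A minimal sketch for a single component, assuming that config map is the file being edited here:

[source,yaml]
----
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
      tolerations:
      - key: node-role.kubernetes.io/infra
        value: reserved
        effect: NoSchedule
----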
. Watch the monitoring pods move to the new machines:
+
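One way to follow the rollout (a sketch; the documented command might differ) is to list the monitoring pods together with the nodes they are scheduled on:

[source,terminal]
----
$ watch 'oc get pod -n openshift-monitoring -o wide'
----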

View File

@@ -61,15 +61,12 @@ $ oc edit configs.imageregistry.operator.openshift.io/cluster
+
[source,yaml]
----
apiVersion: imageregistry.operator.openshift.io/v1
kind: Config
metadata:
  name: cluster
# ...
spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - podAffinityTerm:
          namespaces:
          - openshift-image-registry
          topologyKey: kubernetes.io/hostname
        weight: 100
  logLevel: Normal
  managementState: Managed
  nodeSelector: <1>
@@ -78,11 +75,8 @@ spec:
  - effect: NoSchedule
    key: node-role.kubernetes.io/infra
    value: reserved
  - effect: NoExecute
    key: node-role.kubernetes.io/infra
    value: reserved
----
<1> Add a `nodeSelector` parameter with the appropriate value to the component you want to move. You can use a `nodeSelector` in the format shown or use `<key>: <value>` pairs, based on the value specified for the node. If you added a taint to the infrastructure node, also add a matching toleration.
<1> Add a `nodeSelector` parameter with the appropriate value to the component you want to move. You can use a `nodeSelector` parameter in the format shown or use `<key>: <value>` pairs, based on the value specified for the node. If you added a taint to the infrastructure node, also add a matching toleration.
. Verify the registry pod has been moved to the infrastructure node.
+
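One way to verify this (a sketch; the documented step might use a different command) is to list the registry pods with their node assignments:

[source,terminal]
----
$ oc get pods -n openshift-image-registry -o wide
----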

View File

@@ -59,20 +59,27 @@ $ oc edit ingresscontroller default -n openshift-ingress-operator
+
[source,yaml]
----
spec:
  nodePlacement:
    nodeSelector: <1>
      matchLabels:
        node-role.kubernetes.io/infra: ""
    tolerations:
    - effect: NoSchedule
      key: node-role.kubernetes.io/infra
      value: reserved
    - effect: NoExecute
      key: node-role.kubernetes.io/infra
      value: reserved
apiVersion: operator.openshift.io/v1
kind: IngressController
metadata:
  creationTimestamp: "2025-03-26T21:15:43Z"
  finalizers:
  - ingresscontroller.operator.openshift.io/finalizer-ingresscontroller
  generation: 1
  name: default
# ...
spec:
  nodePlacement:
    nodeSelector: <1>
      matchLabels:
        node-role.kubernetes.io/infra: ""
    tolerations:
    - effect: NoSchedule
      key: node-role.kubernetes.io/infra
      value: reserved
# ...
----
<1> Add a `nodeSelector` parameter with the appropriate value to the component you want to move. You can use a `nodeSelector` in the format shown or use `<key>: <value>` pairs, based on the value specified for the node. If you added a taint to the infrastructure node, also add a matching toleration.
<1> Add a `nodeSelector` parameter with the appropriate value to the component you want to move. You can use a `nodeSelector` parameter in the format shown or use `<key>: <value>` pairs, based on the value specified for the node. If you added a taint to the infrastructure node, also add a matching toleration.
. Confirm that the router pod is running on the `infra` node.
.. View the list of router pods and note the node name of the running pod:
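A sketch of one way to list them, showing the node that each router pod runs on:

[source,terminal]
----
$ oc get pod -n openshift-ingress -o wide
----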