mirror of https://github.com/openshift/openshift-docs.git
synced 2026-02-07 09:46:53 +01:00
OSDOCS-2875: global Azure availability sets in 4.10
@@ -19,7 +19,7 @@ For information about infrastructure nodes and which components can run on infra
 [id="creating-infrastructure-machinesets-production"]
 == Creating infrastructure machine sets for production environments
 
-In a production deployment, it is recommended that you deploy at least three machine sets to hold infrastructure components. Both OpenShift Logging and {ProductName} deploy Elasticsearch, which requires three instances to be installed on different nodes. Each of these nodes can be deployed to different availability zones for high availability. A configuration like this requires three different machine sets, one for each availability zone.
+In a production deployment, it is recommended that you deploy at least three machine sets to hold infrastructure components. Both OpenShift Logging and {ProductName} deploy Elasticsearch, which requires three instances to be installed on different nodes. Each of these nodes can be deployed to different availability zones for high availability. A configuration like this requires three different machine sets, one for each availability zone. In global Azure regions that do not have multiple availability zones, you can use availability sets to ensure high availability.
 
 [id="creating-infrastructure-machinesets-clouds"]
 === Creating machine sets for different clouds
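To make the three-machine-set layout concrete, the following abbreviated sketch shows only the fields that typically differ between the three infra machine sets; the names, region, zones, and replica count are assumptions, and the full template appears later in this change.

[source,yaml]
----
# Abbreviated sketch: three machine sets that differ only in name and zone, so that
# the three Elasticsearch instances land in separate availability zones.
# <infrastructure_id>, the region, and the zone values are placeholders/assumptions.
apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  name: <infrastructure_id>-infra-<region>-1   # -2 and -3 in the other two machine sets
  namespace: openshift-machine-api
spec:
  replicas: 1
  template:
    spec:
      providerSpec:
        value:
          location: <region>
          zone: "1"   # "2" and "3" in the other two machine sets; omitted in regions without zones
----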
@@ -31,4 +31,4 @@ Cluster autoscaler:: This resource is based on the upstream cluster autoscaler p
 
 Machine health check:: The `MachineHealthCheck` resource detects when a machine is unhealthy, deletes it, and, on supported platforms, makes a new machine.
 
-In {product-title} version 3.11, you could not roll out a multi-zone architecture easily because the cluster did not manage machine provisioning. Beginning with {product-title} version 4.1, this process is easier. Each machine set is scoped to a single zone, so the installation program sends out machine sets across availability zones on your behalf. And then because your compute is dynamic, and in the face of a zone failure, you always have a zone for when you must rebalance your machines. The autoscaler provides best-effort balancing over the life of a cluster.
+In {product-title} version 3.11, you could not roll out a multi-zone architecture easily because the cluster did not manage machine provisioning. Beginning with {product-title} version 4.1, this process is easier. Each machine set is scoped to a single zone, so the installation program sends out machine sets across availability zones on your behalf. And then because your compute is dynamic, and in the face of a zone failure, you always have a zone for when you must rebalance your machines. In global Azure regions that do not have multiple availability zones, you can use availability sets to ensure high availability. The autoscaler provides best-effort balancing over the life of a cluster.
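The best-effort balancing mentioned here comes from the upstream cluster autoscaler and is configured in OpenShift through the `ClusterAutoscaler` resource. A minimal sketch, assuming the `balanceSimilarNodeGroups` field of the `autoscaling.openshift.io/v1` API is available in your release; the resource is expected to be named `default`:

[source,yaml]
----
# Minimal ClusterAutoscaler sketch. balanceSimilarNodeGroups asks the autoscaler to keep
# similarly configured machine sets (for example, one per zone) at similar sizes on a
# best-effort basis. The field's availability is an assumption to verify for your release.
apiVersion: autoscaling.openshift.io/v1
kind: ClusterAutoscaler
metadata:
  name: default
spec:
  balanceSimilarNodeGroups: true
----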
@@ -50,22 +50,19 @@ The `matchLabels` are examples only; you must map your machine groups based on y
 Short circuiting ensures that machine health checks remediate machines only when the cluster is healthy.
 Short-circuiting is configured through the `maxUnhealthy` field in the `MachineHealthCheck` resource.
 
-If the user defines a value for the `maxUnhealthy` field,
-before remediating any machines, the `MachineHealthCheck` compares the value of `maxUnhealthy`
-with the number of machines within its target pool that it has determined to be unhealthy.
-Remediation is not performed if the number of unhealthy machines exceeds the `maxUnhealthy` limit.
+If the user defines a value for the `maxUnhealthy` field, before remediating any machines, the `MachineHealthCheck` compares the value of `maxUnhealthy` with the number of machines within its target pool that it has determined to be unhealthy. Remediation is not performed if the number of unhealthy machines exceeds the `maxUnhealthy` limit.
 
 [IMPORTANT]
 ====
 If `maxUnhealthy` is not set, the value defaults to `100%` and the machines are remediated regardless of the state of the cluster.
 ====
 
-The appropriate `maxUnhealthy` value depends on the scale of the cluster you deploy and how many machines the `MachineHealthCheck` covers. For example, you can use the `maxUnhealthy` value to cover multiple machine sets across multiple availability zones so that if you lose an entire zone, your `maxUnhealthy` setting prevents further remediation within the cluster.
+The appropriate `maxUnhealthy` value depends on the scale of the cluster you deploy and how many machines the `MachineHealthCheck` covers. For example, you can use the `maxUnhealthy` value to cover multiple machine sets across multiple availability zones so that if you lose an entire zone, your `maxUnhealthy` setting prevents further remediation within the cluster. In global Azure regions that do not have multiple availability zones, you can use availability sets to ensure high availability.
 
 The `maxUnhealthy` field can be set as either an integer or percentage.
 There are different remediation implementations depending on the `maxUnhealthy` value.
 
-=== Setting `maxUnhealthy` by using an absolute value
+=== Setting maxUnhealthy by using an absolute value
 
 If `maxUnhealthy` is set to `2`:
 
@@ -74,7 +71,7 @@ If `maxUnhealthy` is set to `2`:
 
 These values are independent of how many machines are being checked by the machine health check.
 
-=== Setting `maxUnhealthy` by using percentages
+=== Setting maxUnhealthy by using percentages
 
 If `maxUnhealthy` is set to `40%` and there are 25 machines being checked:
 
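Putting the short-circuiting description above into context, a minimal `MachineHealthCheck` sketch that sets `maxUnhealthy`; the name, selector labels, unhealthy conditions, and timeouts are assumptions:

[source,yaml]
----
# Sketch of a MachineHealthCheck with maxUnhealthy. If more machines than this threshold
# are unhealthy at once, the check short-circuits and performs no remediation.
apiVersion: machine.openshift.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: example-health-check
  namespace: openshift-machine-api
spec:
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-machine-role: worker
      machine.openshift.io/cluster-api-machine-type: worker
  unhealthyConditions:
  - type: Ready
    status: "False"
    timeout: 300s
  - type: Ready
    status: Unknown
    timeout: 300s
  maxUnhealthy: "40%"   # or an absolute value such as 2
  nodeStartupTimeout: 10m
----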
@@ -71,12 +71,13 @@ endif::infra[]
 metadata:
 creationTimestamp: null
 labels:
+machine.openshift.io/cluster-api-machineset: <machineset_name> <4>
 ifndef::infra[]
 node-role.kubernetes.io/<role>: "" <2>
 endif::infra[]
 ifdef::infra[]
 node-role.kubernetes.io/infra: "" <2>
-taints: <4>
+taints: <5>
 - key: node-role.kubernetes.io/infra
 effect: NoSchedule
 endif::infra[]
@@ -95,10 +96,10 @@ endif::infra[]
 internalLoadBalancer: ""
 kind: AzureMachineProviderSpec
 ifndef::infra[]
-location: <region> <4>
+location: <region> <5>
 endif::infra[]
 ifdef::infra[]
-location: <region> <5>
+location: <region> <6>
 endif::infra[]
 managedIdentity: <infrastructure_id>-identity <1>
 metadata:
@@ -121,10 +122,10 @@ endif::infra[]
 vmSize: Standard_D4s_v3
 vnet: <infrastructure_id>-vnet <1>
 ifndef::infra[]
-zone: "1" <5>
+zone: "1" <6>
 endif::infra[]
 ifdef::infra[]
-zone: "1" <6>
+zone: "1" <7>
 endif::infra[]
 ----
 <1> Specify the infrastructure ID that is based on the cluster ID that you set when you provisioned the cluster. If you have the OpenShift CLI installed, you can obtain the infrastructure ID by running the following command:
@@ -153,15 +154,17 @@ $ oc -n openshift-machine-api \
 ifndef::infra[]
 <2> Specify the node label to add.
 <3> Specify the infrastructure ID, node label, and region.
-<4> Specify the region to place machines on.
-<5> Specify the zone within your region to place machines on. Be sure that your region supports the zone that you specify.
+<4> Optional: Specify the machine set name to enable the use of availability sets. This setting only applies to new compute machines.
+<5> Specify the region to place machines on.
+<6> Specify the zone within your region to place machines on. Be sure that your region supports the zone that you specify.
 endif::infra[]
 ifdef::infra[]
 <2> Specify the `<infra>` node label.
 <3> Specify the infrastructure ID, `<infra>` node label, and region.
-<4> Specify a taint to prevent user workloads from being scheduled on infra nodes.
-<5> Specify the region to place machines on.
-<6> Specify the zone within your region to place machines on. Be sure that your region supports the zone that you specify.
+<4> Optional: Specify the machine set name to enable the use of availability sets. This setting only applies to new compute machines.
+<5> Specify a taint to prevent user workloads from being scheduled on infra nodes.
+<6> Specify the region to place machines on.
+<7> Specify the zone within your region to place machines on. Be sure that your region supports the zone that you specify.
 
 endif::infra[]
 
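Because the infra branch of this template applies the `node-role.kubernetes.io/infra` taint with the `NoSchedule` effect, infrastructure workloads need a matching toleration before they can be scheduled onto these machines. A minimal sketch of that toleration inside a pod spec (its placement within a specific component's configuration is an assumption):

[source,yaml]
----
# Toleration that matches the infra taint shown in the machine set above.
spec:
  tolerations:
  - key: node-role.kubernetes.io/infra
    operator: Exists
    effect: NoSchedule
----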
@@ -33,4 +33,4 @@ spec:
 - 6666
 ----
 
-* If you perform direct volume migration with nodes that are in different availability zones, the migration might fail because the migrated pods cannot access the PVC. (link:https://bugzilla.redhat.com/show_bug.cgi?id=1947487[*BZ#1947487*])
+* If you perform direct volume migration with nodes that are in different availability zones or availability sets, the migration might fail because the migrated pods cannot access the PVC. (link:https://bugzilla.redhat.com/show_bug.cgi?id=1947487[*BZ#1947487*])
@@ -5,12 +5,12 @@
 [id="nodes-scheduler-pod-affinity-about_{context}"]
 = Understanding pod affinity
 
-_Pod affinity_ and _pod anti-affinity_ allow you to constrain which nodes your pod is eligible to be scheduled on based on the key/value labels on other pods.
+_Pod affinity_ and _pod anti-affinity_ allow you to constrain which nodes your pod is eligible to be scheduled on based on the key/value labels on other pods.
 
 * Pod affinity can tell the scheduler to locate a new pod on the same node as other pods if the label selector on the new pod matches the label on the current pod.
 * Pod anti-affinity can prevent the scheduler from locating a new pod on the same node as pods with the same labels if the label selector on the new pod matches the label on the current pod.
 
-For example, using affinity rules, you could spread or pack pods within a service or relative to pods in other services. Anti-affinity rules allow you to prevent pods of a particular service from scheduling on the same nodes as pods of another service that are known to interfere with the performance of the pods of the first service. Or, you could spread the pods of a service across nodes or availability zones to reduce correlated failures.
+For example, using affinity rules, you could spread or pack pods within a service or relative to pods in other services. Anti-affinity rules allow you to prevent pods of a particular service from scheduling on the same nodes as pods of another service that are known to interfere with the performance of the pods of the first service. Or, you could spread the pods of a service across nodes, availability zones, or availability sets to reduce correlated failures.
 
 There are two types of pod affinity rules: _required_ and _preferred_.
 
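To complement the preferred anti-affinity example shown in the next hunk, a minimal required pod affinity sketch; the pod name, the `security: S1` label, the topology key, and the container image are assumptions. The rule schedules the pod only onto nodes in the same zone as pods labeled `security: S1`:

[source,yaml]
----
# Required pod affinity: co-locate this pod, by zone, with pods labeled security=S1.
apiVersion: v1
kind: Pod
metadata:
  name: with-pod-affinity
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: security
            operator: In
            values:
            - S1
        topologyKey: topology.kubernetes.io/zone
  containers:
  - name: hello
    image: registry.access.redhat.com/ubi8/ubi-minimal
    command: ["sleep", "infinity"]
----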
@@ -68,12 +68,12 @@ metadata:
 spec:
 affinity:
 podAntiAffinity: <1>
-preferredDuringSchedulingIgnoredDuringExecution: <2>
+preferredDuringSchedulingIgnoredDuringExecution: <2>
 - weight: 100 <3>
 podAffinityTerm:
 labelSelector:
 matchExpressions:
-- key: security <4>
+- key: security <4>
 operator: In <5>
 values:
 - S2
@@ -93,4 +93,3 @@ spec:
 ====
 If labels on a node change at runtime such that the affinity rules on a pod are no longer met, the pod continues to run on the node.
 ====
-
@@ -509,7 +509,7 @@ include::modules/nodes-scheduler-node-selectors-cluster.adoc[leveloffset=+2]
 
 You can create a machine set to create machines that host only infrastructure components, such as the default router, the integrated container image registry, and components for cluster metrics and monitoring. These infrastructure machines are not counted toward the total number of subscriptions that are required to run the environment.
 
-In a production deployment, it is recommended that you deploy at least three machine sets to hold infrastructure components. Both OpenShift Logging and {ProductName} deploy Elasticsearch, which requires three instances to be installed on different nodes. Each of these nodes can be deployed to different availability zones for high availability. A configuration like this requires three different machine sets, one for each availability zone.
+In a production deployment, it is recommended that you deploy at least three machine sets to hold infrastructure components. Both OpenShift Logging and {ProductName} deploy Elasticsearch, which requires three instances to be installed on different nodes. Each of these nodes can be deployed to different availability zones for high availability. A configuration like this requires three different machine sets, one for each availability zone. In global Azure regions that do not have multiple availability zones, you can use availability sets to ensure high availability.
 
 For information on infrastructure nodes and which components can run on infrastructure nodes, see xref:../machine_management/creating-infrastructure-machinesets.adoc#creating-infrastructure-machinesets[Creating infrastructure machine sets].