1
0
mirror of https://github.com/openshift/openshift-docs.git synced 2026-02-05 12:46:18 +01:00
Files
openshift-docs/modules/machine-feature-aws-add-nvidia-gpu-node.adoc
2025-06-16 15:53:53 +00:00

66 lines
2.3 KiB
Plaintext

// Module included in the following assemblies:
//
// * machine_management/cluster_api_machine_management/cluster_api_provider_configurations/cluster-api-config-options-aws.adoc
:_mod-docs-content-type: CONCEPT
[id="machine-feature-aws-add-nvidia-gpu-node_{context}"]
= GPU-enabled machine options
You can deploy GPU-enabled compute machines on {aws-first}.
The following sample configuration uses an link:https://aws.amazon.com/ec2/instance-types/#Accelerated_Computing[{aws-short} G4dn instance type], which includes an NVIDIA Tesla T4 Tensor Core GPU, as an example.
For more information about supported instance types, see the following pages in the NVIDIA documentation:
* link:https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/platform-support.html[NVIDIA GPU Operator Community support matrix]
* link:https://docs.nvidia.com/ai-enterprise/latest/product-support-matrix/index.html[NVIDIA AI Enterprise support matrix]
include::snippets/apply-machine-configuration-method.adoc[tag=method-machine-template-and-machine-set]
// Cluster API machine template spec
.Sample GPU-enabled machine template configuration
[source,yaml]
----
apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
kind: AWSMachineTemplate
# ...
spec:
template:
spec:
instanceType: g4dn.xlarge <1>
# ...
----
<1> Specifies a G4dn instance type.
// Cluster API machine set spec
.Sample GPU-enabled machine set configuration
[source,yaml]
----
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineSet
metadata:
name: <cluster_name>-gpu-<region> <1>
namespace: openshift-cluster-api
labels:
cluster.x-k8s.io/cluster-name: <cluster_name>
spec:
clusterName: <cluster_name>
replicas: 1
selector:
matchLabels:
test: example
cluster.x-k8s.io/cluster-name: <cluster_name>
cluster.x-k8s.io/set-name: <cluster_name>-gpu-<region> <2>
template:
metadata:
labels:
test: example
cluster.x-k8s.io/cluster-name: <cluster_name>
cluster.x-k8s.io/set-name: <cluster_name>-gpu-<region> <3>
node-role.kubernetes.io/<role>: ""
# ...
----
<1> Specifies a name that includes the `gpu` role. The name includes the cluster ID as a prefix and the region as a suffix.
<2> Specifies a selector label that matches the machine set name.
<3> Specifies a template label that matches the machine set name.