// Module included in the following assemblies:
//
//  * machine_management/creating-machinesets/creating-machineset-aws.adoc

:_mod-docs-content-type: PROCEDURE
[id="nvidia-gpu-aws-adding-a-gpu-node_{context}"]
= Adding a GPU node to an existing {product-title} cluster

You can copy and modify a default compute machine set configuration to create a GPU-enabled machine set and machines for the AWS EC2 cloud provider.

For more information about the supported instance types, see the following NVIDIA documentation:

* link:https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/platform-support.html[NVIDIA GPU Operator Community support matrix]

* link:https://docs.nvidia.com/ai-enterprise/latest/product-support-matrix/index.html[NVIDIA AI Enterprise support matrix]

.Procedure

. View the existing nodes, machines, and machine sets  by running the following command. Note that each node is an instance of a machine definition with a specific AWS region and {product-title} role.
+
[source,terminal]
----
$ oc get nodes
----
+
.Example output
+
[source,terminal]
----
NAME                                        STATUS   ROLES                  AGE     VERSION
ip-10-0-52-50.us-east-2.compute.internal    Ready    worker                 3d17h   v1.31.3
ip-10-0-58-24.us-east-2.compute.internal    Ready    control-plane,master   3d17h   v1.31.3
ip-10-0-68-148.us-east-2.compute.internal   Ready    worker                 3d17h   v1.31.3
ip-10-0-68-68.us-east-2.compute.internal    Ready    control-plane,master   3d17h   v1.31.3
ip-10-0-72-170.us-east-2.compute.internal   Ready    control-plane,master   3d17h   v1.31.3
ip-10-0-74-50.us-east-2.compute.internal    Ready    worker                 3d17h   v1.31.3
----

. View the machines and machine sets that exist in the `openshift-machine-api` namespace by running the following command. Each compute machine set is associated with a different availability zone within the AWS region. The installer automatically load balances compute machines across availability zones.
+
[source,terminal]
----
$ oc get machinesets -n openshift-machine-api
----
+
.Example output
+
[source,terminal]
----
NAME                                        DESIRED   CURRENT   READY   AVAILABLE   AGE
preserve-dsoc12r4-ktjfc-worker-us-east-2a   1         1         1       1           3d11h
preserve-dsoc12r4-ktjfc-worker-us-east-2b   2         2         2       2           3d11h
----

. View the machines that exist in the `openshift-machine-api` namespace by running the following command. At this time, there is only one compute machine per machine set, though a compute machine set could be scaled to add a node in a particular region and zone.
+
[source,terminal]
----
$ oc get machines -n openshift-machine-api | grep worker
----
+
.Example output
+
[source,terminal]
----
preserve-dsoc12r4-ktjfc-worker-us-east-2a-dts8r      Running   m5.xlarge   us-east-2   us-east-2a   3d11h
preserve-dsoc12r4-ktjfc-worker-us-east-2b-dkv7w      Running   m5.xlarge   us-east-2   us-east-2b   3d11h
preserve-dsoc12r4-ktjfc-worker-us-east-2b-k58cw      Running   m5.xlarge   us-east-2   us-east-2b   3d11h
----

. Make a copy of one of the existing compute `MachineSet` definitions and output the result to a JSON file by running the following command. This will be the basis for the GPU-enabled compute machine set definition.
+
[source,terminal]
----
$ oc get machineset preserve-dsoc12r4-ktjfc-worker-us-east-2a -n openshift-machine-api -o json > <output_file.json>
----

. Edit the JSON file and make the following changes to the new `MachineSet` definition:
+
* Replace `worker` with `gpu`. This will be the name of the new machine set.
* Change the instance type of the new `MachineSet` definition to `g4dn`, which includes an NVIDIA Tesla T4 GPU.
To learn more about AWS `g4dn` instance types, see link:https://aws.amazon.com/ec2/instance-types/#Accelerated_Computing[Accelerated Computing].
+
[source,terminal]
----
$ jq .spec.template.spec.providerSpec.value.instanceType preserve-dsoc12r4-ktjfc-worker-gpu-us-east-2a.json

"g4dn.xlarge"
----
+
The `<output_file.json>` file is saved as `preserve-dsoc12r4-ktjfc-worker-gpu-us-east-2a.json`.

 . Update the following fields in `preserve-dsoc12r4-ktjfc-worker-gpu-us-east-2a.json`:
+
* `.metadata.name` to a name containing `gpu`.

* `.spec.selector.matchLabels["machine.openshift.io/cluster-api-machineset"]` to
match the new `.metadata.name`.

* `.spec.template.metadata.labels["machine.openshift.io/cluster-api-machineset"]`
to match the new `.metadata.name`.

* `.spec.template.spec.providerSpec.value.instanceType` to `g4dn.xlarge`.

. To verify your changes, perform a `diff` of the original compute definition and the new GPU-enabled node definition by running the following command:
+
[source,terminal]
----
$ oc -n openshift-machine-api get preserve-dsoc12r4-ktjfc-worker-us-east-2a -o json | diff preserve-dsoc12r4-ktjfc-worker-gpu-us-east-2a.json -
----
+
.Example output
+
[source,terminal]
----
10c10

< "name": "preserve-dsoc12r4-ktjfc-worker-gpu-us-east-2a",
---
> "name": "preserve-dsoc12r4-ktjfc-worker-us-east-2a",

21c21

< "machine.openshift.io/cluster-api-machineset": "preserve-dsoc12r4-ktjfc-worker-gpu-us-east-2a"
---
> "machine.openshift.io/cluster-api-machineset": "preserve-dsoc12r4-ktjfc-worker-us-east-2a"

31c31

< "machine.openshift.io/cluster-api-machineset": "preserve-dsoc12r4-ktjfc-worker-gpu-us-east-2a"
---
> "machine.openshift.io/cluster-api-machineset": "preserve-dsoc12r4-ktjfc-worker-us-east-2a"

60c60

< "instanceType": "g4dn.xlarge",
---
> "instanceType": "m5.xlarge",
----

. Create the GPU-enabled compute machine set from the definition by running the following command:
+
[source,terminal]
----
$ oc create -f preserve-dsoc12r4-ktjfc-worker-gpu-us-east-2a.json
----
+
.Example output
+
[source,terminal]
----
machineset.machine.openshift.io/preserve-dsoc12r4-ktjfc-worker-gpu-us-east-2a created
----

.Verification

. View the machine set you created by running the following command:
+
[source,terminal]
----
$ oc -n openshift-machine-api get machinesets | grep gpu
----
+
The MachineSet replica count is set to `1` so a new `Machine` object is created automatically.

+
.Example output
+
[source,terminal]
----
preserve-dsoc12r4-ktjfc-worker-gpu-us-east-2a   1         1         1       1           4m21s
----

. View the `Machine` object that the machine set created by running the following command:
+
[source,terminal]
----
$ oc -n openshift-machine-api get machines | grep gpu
----
+
.Example output
+
[source,terminal]
----
preserve-dsoc12r4-ktjfc-worker-gpu-us-east-2a    running    g4dn.xlarge   us-east-2   us-east-2a  4m36s
----

Note that there is no need to specify a namespace for the node. The node definition is cluster scoped.