
Merge pull request #83599 from xenolinux/hcp-virt-nvidia-gpus

OSDOCS#12121: HCP KubeVirt NVIDIA GPU support
Servesha Dudhgaonkar
2024-11-08 12:38:07 +05:30
committed by GitHub
3 changed files with 148 additions and 0 deletions

hosted_control_planes/hcp-manage/hcp-manage-virt.adoc

@@ -50,3 +50,7 @@ include::modules/hcp-virt-image-caching.adoc[leveloffset=+2]
* xref:../../virt/virtual_machines/creating_vms_custom/virt-creating-vms-by-cloning-pvcs.adoc#smart-cloning_virt-creating-vms-by-cloning-pvcs[Cloning a data volume using smart-cloning]
include::modules/hcp-virt-etcd-storage.adoc[leveloffset=+2]
include::modules/hcp-virt-attach-nvidia-gpus.adoc[leveloffset=+1]
include::modules/hcp-virt-attach-nvidia-gpus-np-api.adoc[leveloffset=+1]

modules/hcp-virt-attach-nvidia-gpus-np-api.adoc

@@ -0,0 +1,100 @@
// Module included in the following assemblies:
//
// * hosted_control_planes/hcp-manage/hcp-manage-virt.adoc

:_mod-docs-content-type: PROCEDURE
[id="hcp-virt-attach-nvidia-gpus-np-api_{context}"]
= Attaching NVIDIA GPU devices by using the NodePool resource

You can attach one or more NVIDIA graphics processing unit (GPU) devices to node pools by configuring the `nodepool.spec.platform.kubevirt.hostDevices` field in the `NodePool` resource.

:FeatureName: Attaching NVIDIA GPU devices to node pools
include::snippets/technology-preview.adoc[]

.Procedure

* Attach one or more GPU devices to node pools:
** To attach a single GPU device, configure the `NodePool` resource by using the following example configuration:
+
[source,yaml]
----
apiVersion: hypershift.openshift.io/v1beta1
kind: NodePool
metadata:
name: <hosted_cluster_name> <1>
namespace: <hosted_cluster_namespace> <2>
spec:
arch: amd64
clusterName: <hosted_cluster_name>
management:
autoRepair: false
upgradeType: Replace
nodeDrainTimeout: 0s
nodeVolumeDetachTimeout: 0s
platform:
kubevirt:
attachDefaultNetwork: true
compute:
cores: <cpu> <3>
memory: <memory> <4>
hostDevices: <5>
- count: <count> <6>
deviceName: <gpu_device_name> <7>
networkInterfaceMultiqueue: Enable
rootVolume:
persistent:
size: 32Gi
type: Persistent
type: KubeVirt
replicas: <worker_node_count> <8>
----
<1> Specify the name of your hosted cluster, for instance, `example`.
<2> Specify the name of the hosted cluster namespace, for example, `clusters`.
<3> Specify a value for CPU, for example, `2`.
<4> Specify a value for memory, for example, `16Gi`.
<5> The `hostDevices` field defines a list of different types of GPU devices that you can attach to node pools.
<6> Specify the number of GPU devices that you want to attach to each virtual machine (VM) in the node pool. For example, if you attach 2 GPU devices to 3 node pool replicas, each of the 3 VMs in the node pool is attached to 2 GPU devices. The default count is `1`.
<7> Specify the GPU device name, for example, `nvidia-a100`.
<8> Specify the worker count, for example, `3`.
** To attach multiple GPU devices, configure the `NodePool` resource by using the following example configuration:
+
[source,yaml]
----
apiVersion: hypershift.openshift.io/v1beta1
kind: NodePool
metadata:
name: <hosted_cluster_name>
namespace: <hosted_cluster_namespace>
spec:
arch: amd64
clusterName: <hosted_cluster_name>
management:
autoRepair: false
upgradeType: Replace
nodeDrainTimeout: 0s
nodeVolumeDetachTimeout: 0s
platform:
kubevirt:
attachDefaultNetwork: true
compute:
cores: <cpu>
memory: <memory>
hostDevices:
- count: <count>
deviceName: <gpu_device_name>
- count: <count>
deviceName: <gpu_device_name>
- count: <count>
deviceName: <gpu_device_name>
- count: <count>
deviceName: <gpu_device_name>
networkInterfaceMultiqueue: Enable
rootVolume:
persistent:
size: 32Gi
type: Persistent
type: KubeVirt
replicas: <worker_node_count>
----
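
After you create or update the `NodePool` resource, apply the manifest on the management cluster and wait for the node pool to scale. The following commands are a minimal sketch that assumes the manifest is saved as `nodepool-gpu.yaml`; the file name is an example, and the placeholders match the ones used in the preceding configuration:

[source,terminal]
----
$ oc apply -f nodepool-gpu.yaml <1>
$ oc get nodepool <hosted_cluster_name> -n <hosted_cluster_namespace> -w <2>
----
<1> Applies the `NodePool` manifest that contains the `hostDevices` configuration. The file name is an example.
<2> Watches the node pool until the current node count reaches the requested replica count.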

modules/hcp-virt-attach-nvidia-gpus.adoc

@@ -0,0 +1,44 @@
// Module included in the following assemblies:
//
// * hosted_control_planes/hcp-manage/hcp-manage-virt.adoc

:_mod-docs-content-type: PROCEDURE
[id="hcp-virt-attach-nvidia-gpus_{context}"]
= Attaching NVIDIA GPU devices by using the hcp CLI

You can attach one or more NVIDIA graphics processing unit (GPU) devices to node pools by using the `hcp` command-line interface (CLI) in a hosted cluster on {VirtProductName}.

:FeatureName: Attaching NVIDIA GPU devices to node pools
include::snippets/technology-preview.adoc[]

.Prerequisites

* You have exposed the NVIDIA GPU device as a resource on the node where the GPU device resides. For more information, see link:https://docs.nvidia.com/datacenter/cloud-native/openshift/latest/openshift-virtualization.html[NVIDIA GPU Operator with {VirtProductName}].
* You have exposed the NVIDIA GPU device as an link:https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#extended-resources[extended resource] on the node to assign it to node pools.
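+
To confirm that the device is exposed, you can list the allocatable resources on the node where the GPU resides. This is a minimal sketch; `<gpu_node_name>` is a placeholder, and you look for an `nvidia.com/` entry in the output whose exact name depends on how the device is exposed:
+
[source,terminal]
----
$ oc get node <gpu_node_name> -o jsonpath='{.status.allocatable}'
----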

.Procedure

* Attach the GPU device to node pools during cluster creation by running the following command:
+
[source,terminal]
----
$ hcp create cluster kubevirt \
  --name <hosted_cluster_name> \// <1>
  --node-pool-replicas <worker_node_count> \// <2>
  --pull-secret <path_to_pull_secret> \// <3>
  --memory <memory> \// <4>
  --cores <cpu> \// <5>
  --host-device-name="<gpu_device_name>,count:<value>" <6>
----
<1> Specify the name of your hosted cluster, for instance, `example`.
<2> Specify the worker count, for example, `3`.
<3> Specify the path to your pull secret, for example, `/user/name/pullsecret`.
<4> Specify a value for memory, for example, `16Gi`.
<5> Specify a value for CPU, for example, `2`.
<6> Specify the GPU device name and the count, for example, `--host-device-name="nvidia-a100,count:2"`. The `--host-device-name` argument takes the name of the GPU device from the infrastructure node and an optional count that represents the number of GPU devices that you want to attach to each virtual machine (VM) in the node pool. The default count is `1`. For example, if you attach 2 GPU devices to 3 node pool replicas, each of the 3 VMs in the node pool is attached to 2 GPU devices.
+
[TIP]
====
You can use the `--host-device-name` argument multiple times to attach multiple devices of different types.
====
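+
For example, to attach two different device types at the same time, repeat the argument. This is a minimal sketch; the device names `nvidia-a100` and `nvidia-t4` are examples only and must match the device names that are exposed on your infrastructure nodes:
+
[source,terminal]
----
$ hcp create cluster kubevirt \
  --name <hosted_cluster_name> \
  --node-pool-replicas <worker_node_count> \
  --pull-secret <path_to_pull_secret> \
  --memory <memory> \
  --cores <cpu> \
  --host-device-name="nvidia-a100,count:2" \
  --host-device-name="nvidia-t4,count:1"
----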