// Module included in the following assemblies:
//
// * operators/user/das-dynamic-accelerator-slicer-operator.adoc
//
:_mod-docs-content-type: PROCEDURE
[id="das-operator-deploying-workloads_{context}"]
= Deploying GPU workloads with the Dynamic Accelerator Slicer Operator

You can deploy workloads that request GPU slices managed by the Dynamic Accelerator Slicer (DAS) Operator. The Operator dynamically partitions GPU accelerators and schedules workloads to available GPU slices.

.Prerequisites
* You have MIG-supported GPU hardware available in your cluster.
* The NVIDIA GPU Operator is installed and the `ClusterPolicy` custom resource reports a `Ready` state.
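+
You can check the `ClusterPolicy` state with a command similar to the following. This is a minimal sketch that assumes the default cluster-scoped `ClusterPolicy` resource created by the NVIDIA GPU Operator; the reported state is typically `ready`:
+
[source,terminal]
----
$ oc get clusterpolicy -o jsonpath='{.items[0].status.state}{"\n"}'
----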
* You have installed the DAS Operator.

.Procedure
. Create a namespace by running the following command:
+
[source,terminal]
----
$ oc new-project cuda-workloads
----
. Create a deployment that requests GPU resources by using the NVIDIA MIG resource. Save the following YAML to a file named `cuda-vectoradd-deployment.yaml`:
+
[source,yaml]
----
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cuda-vectoradd
spec:
  replicas: 2
  selector:
    matchLabels:
      app: cuda-vectoradd
  template:
    metadata:
      labels:
        app: cuda-vectoradd
    spec:
      restartPolicy: Always
      containers:
      - name: cuda-vectoradd
        image: nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda12.5.0-ubi8
        resources:
          limits:
            nvidia.com/mig-1g.5gb: "1"
        command:
        - sh
        - -c
        - |
          env && /cuda-samples/vectorAdd && sleep 3600
----
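+
The `nvidia.com/mig-1g.5gb` resource name requests one MIG slice of the `1g.5gb` profile. Other profiles, such as `2g.10gb`, might be available depending on your GPU model; the following `limits` snippet is an illustrative variation only, not part of this procedure:
+
[source,yaml]
----
resources:
  limits:
    nvidia.com/mig-2g.10gb: "1"
----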
. Apply the deployment configuration by running the following command:
+
[source,terminal]
----
$ oc apply -f cuda-vectoradd-deployment.yaml
----
. Verify that the deployment is created and pods are scheduled by running the following command:
+
[source,terminal]
----
$ oc get deployment cuda-vectoradd
----
+
.Example output
[source,terminal]
----
NAME             READY   UP-TO-DATE   AVAILABLE   AGE
cuda-vectoradd   2/2     2            2           2m
----
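+
Optionally, you can block until the deployment becomes available. The timeout in this sketch is arbitrary; adjust it for your environment:
+
[source,terminal]
----
$ oc wait deployment/cuda-vectoradd --for=condition=Available --timeout=300s
----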
. Check the status of the pods by running the following command:
+
[source,terminal]
----
$ oc get pods -l app=cuda-vectoradd
----
+
.Example output
[source,terminal]
----
NAME                              READY   STATUS    RESTARTS   AGE
cuda-vectoradd-6b8c7d4f9b-abc12   1/1     Running   0          2m
cuda-vectoradd-6b8c7d4f9b-def34   1/1     Running   0          2m
----
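+
If the pods remain in a `Pending` state, reviewing recent events can help identify scheduling issues. This is a general troubleshooting sketch, not specific to the DAS Operator:
+
[source,terminal]
----
$ oc get events -n cuda-workloads --sort-by=.lastTimestamp
----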

.Verification
. Check that `AllocationClaim` resources were created for your deployment pods by running the following command:
+
[source,terminal]
----
$ oc get allocationclaims -n das-operator
----
+
.Example output
[source,terminal]
----
NAME                                                                                           AGE
13950288-57df-4ab5-82bc-6138f646633e-harpatil000034jma-qh5fm-worker-f-57md9-cuda-vectoradd-0   2m
ce997b60-a0b8-4ea4-9107-cf59b425d049-harpatil000034jma-qh5fm-worker-f-fl4wg-cuda-vectoradd-0   2m
----
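+
To inspect the details of an individual claim, such as the device it reserves, you can print the full resources. The exact field layout depends on the DAS Operator version; this command is a generic sketch:
+
[source,terminal]
----
$ oc get allocationclaims -n das-operator -o yaml
----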
. Verify that the GPU slices are properly allocated by running the following command to check the resource allocation of the pods:
+
[source,terminal]
----
$ oc describe pod -l app=cuda-vectoradd
----
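+
To focus on the GPU resource assignments in the `describe` output, you can filter for the limits section. The `grep` pattern in this sketch matches the deployment shown in this procedure:
+
[source,terminal]
----
$ oc describe pod -l app=cuda-vectoradd | grep -A 2 "Limits"
----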
. Check the logs to verify that the CUDA sample application runs successfully by running the following command:
+
[source,terminal]
----
$ oc logs -l app=cuda-vectoradd
----
+
.Example output
[source,terminal]
----
[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
----
. Check the environment variables to verify that the GPU devices are properly exposed to the container by running the following command:
+
[source,terminal]
----
$ oc exec deployment/cuda-vectoradd -- env | grep -E "(NVIDIA_VISIBLE_DEVICES|CUDA_VISIBLE_DEVICES)"
----
+
.Example output
[source,terminal]
----
NVIDIA_VISIBLE_DEVICES=MIG-d8ac9850-d92d-5474-b238-0afeabac1652
CUDA_VISIBLE_DEVICES=MIG-d8ac9850-d92d-5474-b238-0afeabac1652
----
+
These environment variables indicate that the GPU MIG slice has been properly allocated and is visible to the CUDA runtime within the container.
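+
If the container image includes the `nvidia-smi` utility, you can also list the allocated MIG device from inside the container. This is an optional sketch; the CUDA sample image used in this procedure might not ship `nvidia-smi`:
+
[source,terminal]
----
$ oc exec deployment/cuda-vectoradd -- nvidia-smi -L
----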