mirror of
https://github.com/openshift/openshift-docs.git
synced 2026-02-05 12:46:18 +01:00
80 lines
4.6 KiB
Plaintext
:_mod-docs-content-type: ASSEMBLY
[id="das-about-dynamic-accelerator-slicer-operator"]
= Dynamic Accelerator Slicer (DAS) Operator
include::_attributes/common-attributes.adoc[]
:context: das-about-dynamic-accelerator-slicer-operator

toc::[]

:FeatureName: Dynamic Accelerator Slicer Operator

include::snippets/technology-preview.adoc[]

With the Dynamic Accelerator Slicer (DAS) Operator, you can slice GPU accelerators dynamically in {product-title} instead of relying on static slices that are defined when the node boots. The Operator creates slices based on the demands of specific workloads, which helps you use GPU resources efficiently.

Dynamic slicing is useful if you do not know in advance which accelerator partitions are needed on every node in the cluster.

The DAS Operator currently includes a reference implementation for NVIDIA Multi-Instance GPU (MIG). It is designed to support additional technologies in the future, such as NVIDIA Multi-Process Service (MPS) or GPUs from other vendors.

.Limitations

The following limitations apply when using the Dynamic Accelerator Slicer Operator:

* You must identify potential incompatibilities and verify that the system works with the GPU drivers and operating systems that you use.

* The Operator works only with specific MIG-compatible NVIDIA GPUs, such as the H100 and A100, and their drivers.

* The Operator cannot use only a subset of the GPUs on a node.

* You cannot use the NVIDIA device plugin together with the Dynamic Accelerator Slicer Operator to manage the GPU resources of a cluster.

[NOTE]
====
The DAS Operator is designed to work with MIG-enabled GPUs. It allocates MIG slices instead of whole GPUs. After you install the DAS Operator, you cannot use the standard NVIDIA device plugin resource request, such as `nvidia.com/gpu: "1"`, to allocate an entire GPU.
====
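
The difference between a whole-GPU request and a MIG slice request can be sketched with a pod specification such as the following. This is an illustrative fragment only, not a tested manifest: the pod name, container name, and image are hypothetical, and the `nvidia.com/mig-1g.5gb` resource name assumes that the GPU supports a `1g.5gb` MIG profile. Check which MIG profiles your GPU model supports before using a specific resource name.

[source,yaml]
----
apiVersion: v1
kind: Pod
metadata:
  name: das-example-pod # hypothetical name
spec:
  restartPolicy: Never
  containers:
  - name: cuda-workload # hypothetical name
    image: registry.example.com/cuda-sample:latest # hypothetical image
    resources:
      limits:
        # Requests one MIG slice. The profile name is an assumption.
        nvidia.com/mig-1g.5gb: "1"
        # A whole-GPU request like the following cannot be used with the DAS Operator:
        # nvidia.com/gpu: "1"
----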

//Installing the Dynamic Accelerator Slicer Operator
include::modules/das-operator-installing.adoc[leveloffset=+1]

//Installing the Dynamic Accelerator Slicer Operator using the web console
include::modules/das-operator-installing-web-console.adoc[leveloffset=+2]

[role="_additional-resources"]
.Additional resources
* xref:../security/cert_manager_operator/cert-manager-operator-install.adoc#cert-manager-operator-install[{cert-manager-operator}]
* xref:../hardware_enablement/psap-node-feature-discovery-operator.adoc#psap-node-feature-discovery-operator[Node Feature Discovery (NFD) Operator]
* link:https://docs.nvidia.com/datacenter/cloud-native/openshift/latest/index.html[NVIDIA GPU Operator]
* link:https://docs.redhat.com/en/documentation/openshift_container_platform/latest/html/specialized_hardware_and_driver_enablement/psap-node-feature-discovery-operator#creating-nfd-cr-web-console_psap-node-feature-discovery-operator[NodeFeatureDiscovery CR]

//Installing the Dynamic Accelerator Slicer Operator using the CLI
include::modules/das-operator-installing-cli.adoc[leveloffset=+2]

[role="_additional-resources"]
.Additional resources
* xref:../security/cert_manager_operator/cert-manager-operator-install.adoc#cert-manager-operator-install[{cert-manager-operator}]
* xref:../hardware_enablement/psap-node-feature-discovery-operator.adoc#psap-node-feature-discovery-operator[Node Feature Discovery (NFD) Operator]
* link:https://docs.nvidia.com/datacenter/cloud-native/openshift/latest/index.html[NVIDIA GPU Operator]
* link:https://docs.redhat.com/en/documentation/openshift_container_platform/latest/html/specialized_hardware_and_driver_enablement/psap-node-feature-discovery-operator#creating-nfd-cr-cli_psap-node-feature-discovery-operator[NodeFeatureDiscovery CR]

//Uninstalling the Dynamic Accelerator Slicer Operator
include::modules/das-operator-uninstalling.adoc[leveloffset=+1]

//Uninstalling the Dynamic Accelerator Slicer Operator using the web console
include::modules/das-operator-uninstalling-web-console.adoc[leveloffset=+2]

//Uninstalling the Dynamic Accelerator Slicer Operator using the CLI
include::modules/das-operator-uninstalling-cli.adoc[leveloffset=+2]

//Deploying GPU workloads with the Dynamic Accelerator Slicer Operator
include::modules/das-operator-deploying-workloads.adoc[leveloffset=+1]

//Troubleshooting DAS Operator
include::modules/das-operator-troubleshooting.adoc[leveloffset=+1]

[role="_additional-resources"]
.Additional resources
* link:https://github.com/kubernetes/kubernetes/issues/128043[Kubernetes issue #128043]
* xref:../hardware_enablement/psap-node-feature-discovery-operator.adoc#psap-node-feature-discovery-operator[Node Feature Discovery Operator]
* link:https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/troubleshooting.html[NVIDIA GPU Operator troubleshooting]