// Module included in the following assemblies:
//
// * architecture/nvidia-gpu-architecture-overview.adoc
:_mod-docs-content-type: CONCEPT
[id="nvidia-gpu-time-slicing_{context}"]
= Time-slicing

GPU time-slicing interleaves workloads that are scheduled on oversubscribed GPUs when you run multiple CUDA applications.

You can enable time-slicing of GPUs on Kubernetes by defining a set of replicas for a GPU. Each replica can be assigned independently to a pod to run workloads. Unlike multi-instance GPU (MIG), time-slicing provides no memory or fault isolation between replicas, but for some workloads this is better than not sharing at all. Internally, GPU time-slicing multiplexes workloads from replicas of the same underlying GPU.

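
The following is a minimal sketch of how a set of replicas might be declared for the NVIDIA device plugin. The config map name `time-slicing-config`, the key `tesla-t4`, the `nvidia-gpu-operator` namespace, and the replica count of `4` are illustrative assumptions, not values taken from this module.

[source,yaml]
----
apiVersion: v1
kind: ConfigMap
metadata:
  name: time-slicing-config        # hypothetical name
  namespace: nvidia-gpu-operator   # assumed Operator namespace
data:
  tesla-t4: |-                     # hypothetical configuration key
    version: v1
    sharing:
      timeSlicing:
        resources:
        - name: nvidia.com/gpu
          replicas: 4              # each physical GPU is advertised as 4 replicas
----

With a configuration like this applied, a node with one physical GPU would typically advertise four `nvidia.com/gpu` resources, and pods that request them share the same GPU without memory or fault isolation.
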
You can apply a cluster-wide default configuration for time-slicing. You can also apply node-specific configurations. For example, you can apply a time-slicing configuration only to nodes with Tesla T4 GPUs and leave nodes with other GPU models unmodified.

You can combine these two approaches by applying a cluster-wide default configuration and then labeling individual nodes to give them a node-specific configuration.
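
As a rough illustration of combining the two approaches, the fragments below assume that a config map named `time-slicing-config` already exists and contains two keys, `default` and `tesla-t4`. These names, the `gpu-cluster-policy` resource name, and the `<node_name>` placeholder are hypothetical. The first fragment sets the cluster-wide default in the `ClusterPolicy` resource. The second shows the label that would select the node-specific configuration on a single node.

[source,yaml]
----
# Cluster-wide default: fragment of the ClusterPolicy resource
apiVersion: nvidia.com/v1
kind: ClusterPolicy
metadata:
  name: gpu-cluster-policy         # assumed resource name
spec:
  devicePlugin:
    config:
      name: time-slicing-config    # hypothetical config map name
      default: default             # key applied to nodes without an override label
---
# Node-specific override: label on a node with Tesla T4 GPUs
apiVersion: v1
kind: Node
metadata:
  name: <node_name>
  labels:
    nvidia.com/device-plugin.config: tesla-t4   # selects the hypothetical tesla-t4 key
----

In this sketch, nodes without the label use the `default` key, and labeled nodes use the `tesla-t4` key instead.
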