// Module included in the following assemblies:
//
// * hardware_accelerators/about-hardware-accelerators.adoc

:_mod-docs-content-type: CONCEPT
[id="nvidia-gpu-time-slicing_{context}"]
= Time-slicing

GPU time-slicing interleaves workloads scheduled on overloaded GPUs when you are running multiple CUDA applications.

You can enable time-slicing of GPUs on Kubernetes by defining a set of replicas for a GPU, each of which can be independently distributed to a pod to run workloads on. Unlike multi-instance GPU (MIG), there is no memory or fault isolation between replicas, but for some workloads this is better than not sharing at all. Internally, GPU time-slicing is used to multiplex workloads from replicas of the same underlying GPU.
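
If you manage GPUs with the NVIDIA GPU Operator, the replicas are typically declared in a configuration entry that the device plugin reads from a `ConfigMap`. The following is a minimal sketch of such an entry; the `ConfigMap` name, namespace, entry key, and replica count are example values:

[source,yaml]
----
apiVersion: v1
kind: ConfigMap
metadata:
  name: time-slicing-config        # example name
  namespace: nvidia-gpu-operator   # example namespace
data:
  # "any" is an arbitrary key for this configuration entry
  any: |-
    version: v1
    sharing:
      timeSlicing:
        resources:
          # advertise 4 time-sliced replicas for each physical GPU
          - name: nvidia.com/gpu
            replicas: 4
----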
You can apply a cluster-wide default configuration for time-slicing. You can also apply node-specific configurations. For example, you can apply a time-slicing configuration only to nodes with Tesla T4 GPUs and not modify nodes with other GPU models.
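
As a sketch of a node-specific configuration, still assuming the NVIDIA GPU Operator format, you could add one entry per GPU model to the `data` section of the same `ConfigMap` and select an entry for particular nodes by labeling them, for example with `nvidia.com/device-plugin.config=tesla-t4`. The entry keys and replica counts are example values:

[source,yaml]
----
data:
  # default entry for nodes without a node-specific selection
  any: |-
    version: v1
    sharing:
      timeSlicing:
        resources:
          - name: nvidia.com/gpu
            replicas: 4
  # entry selected by labeling nodes with nvidia.com/device-plugin.config=tesla-t4
  tesla-t4: |-
    version: v1
    sharing:
      timeSlicing:
        resources:
          - name: nvidia.com/gpu
            replicas: 8
----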
You can combine these two approaches by applying a cluster-wide default configuration and then labeling nodes to give those nodes a node-specific configuration.
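
For example, still assuming the NVIDIA GPU Operator, a cluster-wide default entry could be selected in the ClusterPolicy device plugin configuration, while labeled nodes continue to pick up their own entry from the same `ConfigMap`. The following excerpt uses example names:

[source,yaml]
----
apiVersion: nvidia.com/v1
kind: ClusterPolicy
metadata:
  name: gpu-cluster-policy
spec:
  devicePlugin:
    config:
      name: time-slicing-config   # ConfigMap that holds the time-slicing entries
      default: any                # entry used by nodes without a node-specific label
----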