// Module included in the following assemblies:
//
// * architecture/nvidia-gpu-architecture-overview.adoc

:_mod-docs-content-type: CONCEPT
[id="nvidia-gpu-time-slicing_{context}"]
= Time-slicing

GPU time-slicing interleaves workloads that are scheduled on an oversubscribed GPU, so that multiple CUDA applications can share the same GPU.

You can enable time-slicing of GPUs on Kubernetes by defining a set of replicas for a GPU. Each replica can be allocated independently to a pod to run workloads. Unlike multi-instance GPU (MIG), time-slicing provides no memory or fault isolation between replicas, but for some workloads sharing without isolation is better than not sharing at all. Internally, GPU time-slicing is used to multiplex workloads from replicas of the same underlying GPU.

You can apply a cluster-wide default configuration for time-slicing. You can also apply node-specific configurations. For example, you can apply a time-slicing configuration only to nodes with Tesla T4 GPUs and leave nodes with other GPU models unmodified.

You can combine these two approaches by applying a cluster-wide default configuration and then labeling specific nodes so that those nodes receive a node-specific configuration.
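
The following config map is a minimal sketch of how a cluster-wide default and a node-specific configuration might be defined together. It assumes that the NVIDIA GPU Operator is installed in the `nvidia-gpu-operator` namespace. The config map name `time-slicing-config`, the profile keys `default` and `tesla-t4`, and the replica counts are illustrative assumptions, not required values.

[source,yaml]
----
apiVersion: v1
kind: ConfigMap
metadata:
  name: time-slicing-config        # hypothetical name
  namespace: nvidia-gpu-operator   # assumed Operator namespace
data:
  # Hypothetical cluster-wide default profile: each GPU is advertised as 2 replicas
  default: |-
    version: v1
    sharing:
      timeSlicing:
        resources:
        - name: nvidia.com/gpu
          replicas: 2
  # Hypothetical node-specific profile: each Tesla T4 GPU is advertised as 4 replicas
  tesla-t4: |-
    version: v1
    sharing:
      timeSlicing:
        resources:
        - name: nvidia.com/gpu
          replicas: 4
----

In this sketch, the cluster policy would reference the config map and name `default` as the cluster-wide profile, and a node labeled `nvidia.com/device-plugin.config=tesla-t4` would use the `tesla-t4` profile instead. Because replicas share memory and have no fault isolation, the replica count is a capacity choice, not a guarantee of per-pod GPU resources.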