mirror of https://github.com/openshift/openshift-docs.git synced 2026-02-05 12:46:18 +01:00
openshift-docs/modules/lws-about.adoc
2025-09-17 13:40:07 +00:00

// Module included in the following assemblies:
//
// * ai_workloads/leader_worker_set/index.adoc
:_mod-docs-content-type: CONCEPT
[id="lws-about_{context}"]
= About the {lws-operator}

The {lws-operator} is based on the link:https://lws.sigs.k8s.io/[LeaderWorkerSet] open source project. `LeaderWorkerSet` is a custom Kubernetes API that can be used to deploy a group of pods as a unit. This is useful for artificial intelligence (AI) and machine learning (ML) inference workloads, where large language models (LLMs) are sharded across multiple nodes.

With the `LeaderWorkerSet` API, pods are grouped into units consisting of one leader and multiple workers, all managed together as a single entity. Each pod in a group has a unique pod identity. Pods within a group are created in parallel and share identical lifecycle stages. Rollouts, rolling updates, and pod failure restarts are performed as a group.
In the `LeaderWorkerSet` configuration, you define the size of each group and the number of group replicas. If necessary, you can define separate templates for the leader and worker pods, allowing for role-specific customization. You can also configure topology-aware placement, so that pods in the same group are co-located in the same topology domain.
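
As a rough illustration of these settings, the following sketch shows what a `LeaderWorkerSet` custom resource might look like. The resource name, container names, and image references are hypothetical placeholders. The field names and the topology annotation are taken from the upstream project and are assumed to apply here, so verify them against the API version installed in your cluster.

[source,yaml]
----
apiVersion: leaderworkerset.x-k8s.io/v1
kind: LeaderWorkerSet
metadata:
  name: example-lws                      # hypothetical name
  annotations:
    # Assumption: upstream annotation for topology-aware placement.
    # The value is the node label key that defines the topology domain.
    leaderworkerset.sigs.k8s.io/exclusive-topology: topology.kubernetes.io/zone
spec:
  replicas: 2                            # number of group replicas
  leaderWorkerTemplate:
    size: 4                              # pods per group, including the leader
    restartPolicy: RecreateGroupOnPodRestart   # restart the whole group if a pod fails
    leaderTemplate:                      # optional; omit to reuse workerTemplate for the leader
      spec:
        containers:
        - name: leader                   # hypothetical container name
          image: registry.example.com/inference/leader:latest   # placeholder image
    workerTemplate:
      spec:
        containers:
        - name: worker                   # hypothetical container name
          image: registry.example.com/inference/worker:latest   # placeholder image
----

In this sketch, `replicas: 2` and `size: 4` would produce two groups, each with one leader and three workers, for eight pods in total.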
[IMPORTANT]
====
Before you install the {lws-operator}, you must install the {cert-manager-operator} because it is required to configure services and manage metrics collection.
====

Monitoring for the {lws-operator} is provided by default with {product-title} through Prometheus.