// Module included in the following assemblies:
//
// * ai_workloads/leader_worker_set/index.adoc

:_mod-docs-content-type: CONCEPT
[id="lws-about_{context}"]
= About the {lws-operator}

The {lws-operator} is based on the link:https://lws.sigs.k8s.io/[LeaderWorkerSet] open source project. `LeaderWorkerSet` is a custom Kubernetes API that you can use to deploy a group of pods as a single unit. Deploying pods as a unit is useful for artificial intelligence (AI) and machine learning (ML) inference workloads in which a large language model (LLM) is sharded across multiple nodes.

With the `LeaderWorkerSet` API, pods are grouped into units consisting of one leader and multiple workers, all managed together as a single entity. Each pod in a group has a unique pod identity. Pods within a group are created in parallel and share identical lifecycle stages. Rollouts, rolling updates, and pod failure restarts are performed as a group.

In the `LeaderWorkerSet` configuration, you define the size of each group and the number of group replicas. If necessary, you can define separate templates for leader and worker pods, allowing for role-specific customization. You can also configure topology-aware placement, so that pods in the same group are co-located in the same topology domain.
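
The following manifest is a minimal sketch of how these settings might fit together, not a definitive configuration. The field names follow the upstream `LeaderWorkerSet` API at `leaderworkerset.x-k8s.io/v1`, and the topology annotation is the upstream exclusive-placement annotation; verify both against the version installed in your cluster. The resource name, container names, image references, and topology key are hypothetical placeholders.

[source,yaml]
----
apiVersion: leaderworkerset.x-k8s.io/v1
kind: LeaderWorkerSet
metadata:
  name: example-lws # hypothetical name
  annotations:
    # Assumption: upstream annotation that places each group in its own
    # topology domain; the zone label is only an example topology key.
    leaderworkerset.sigs.k8s.io/exclusive-topology: topology.kubernetes.io/zone
spec:
  replicas: 2 # number of group replicas
  leaderWorkerTemplate:
    size: 3 # pods per group: one leader and two workers
    restartPolicy: RecreateGroupOnPodRestart # restart the whole group if a pod fails
    leaderTemplate: # optional; if omitted, the worker template is used for the leader
      spec:
        containers:
        - name: leader # hypothetical container
          image: registry.example.com/inference/leader:latest # hypothetical image
    workerTemplate:
      spec:
        containers:
        - name: worker # hypothetical container
          image: registry.example.com/inference/worker:latest # hypothetical image
----

With these values, the configuration describes two groups of three pods each, for six pods in total.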

[IMPORTANT]
====
Before you install the {lws-operator}, you must install the {cert-manager-operator} because it is required to configure services and manage metrics collection.
====

Monitoring for the {lws-operator} is provided by default with {product-title} through Prometheus.