// Module included in the following assemblies:
//
// * scalability_and_performance/cnf-low-latency-tuning.adoc
// * scalability_and_performance/low_latency_tuning/cnf-tuning-low-latency-nodes-with-perf-profile.adoc

:_mod-docs-content-type: PROCEDURE
[id="cnf-cpu-infra-container_{context}"]
= Restricting CPUs for infra and application containers

Generic housekeeping and workload tasks use CPUs in a way that may impact latency-sensitive processes. By default, the container runtime uses all online CPUs to run all containers together, which can result in context switches and spikes in latency. Partitioning the CPUs prevents noisy processes from interfering with latency-sensitive processes by separating them from each other. The following table describes how processes run on a CPU after you have tuned the node using the Node Tuning Operator:

.Process' CPU assignments
[%header,cols=2*]
|===
|Process type
|Details

|`Burstable` and `BestEffort` pods
|Runs on any CPU except where the low latency workload is running

|Infrastructure pods
|Runs on any CPU except where the low latency workload is running

|Interrupts
|Redirects to reserved CPUs (optional in {product-title} 4.7 and later)

|Kernel processes
|Pins to reserved CPUs

|Latency-sensitive workload pods
|Pins to a specific set of exclusive CPUs from the isolated pool

|OS processes/systemd services
|Pins to reserved CPUs
|===

The allocatable capacity of cores on a node for pods of all QoS process types, `Burstable`, `BestEffort`, or `Guaranteed`, is equal to the capacity of the isolated pool. The capacity of the reserved pool is removed from the node's total core capacity for use by the cluster and operating system housekeeping duties.

.Example 1
A node features a capacity of 100 cores. Using a performance profile, the cluster administrator allocates 50 cores to the isolated pool and 50 cores to the reserved pool. The cluster administrator assigns 25 cores to QoS `Guaranteed` pods and 25 cores for `BestEffort` or `Burstable` pods. This matches the capacity of the isolated pool.

.Example 2
A node features a capacity of 100 cores. Using a performance profile, the cluster administrator allocates 50 cores to the isolated pool and 50 cores to the reserved pool. The cluster administrator assigns 50 cores to QoS `Guaranteed` pods and one core for `BestEffort` or `Burstable` pods. This exceeds the capacity of the isolated pool by one core. Pod scheduling fails because of insufficient CPU capacity.
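
For illustration only, the following is a minimal sketch of a pod that counts against the isolated pool as a `Guaranteed` workload; the pod name and image are hypothetical and are not part of this procedure. Because the CPU request is a whole number and the requests equal the limits, the pod receives the `Guaranteed` QoS class and is pinned to exclusive CPUs from the isolated pool.

[source,yaml]
----
apiVersion: v1
kind: Pod
metadata:
  name: latency-sensitive-app # hypothetical name
spec:
  containers:
  - name: app
    image: registry.example.com/app:latest # hypothetical image
    resources:
      requests:
        cpu: "4" # whole CPUs, equal to the limits, so the pod is Guaranteed QoS
        memory: "2Gi"
      limits:
        cpu: "4"
        memory: "2Gi"
----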
The exact partitioning pattern to use depends on many factors, such as the hardware, workload characteristics, and the expected system load. Some sample use cases are as follows:
* If the latency-sensitive workload uses specific hardware, such as a network interface controller (NIC), ensure that the CPUs in the isolated pool are as close as possible to this hardware. At a minimum, you should place the workload in the same Non-Uniform Memory Access (NUMA) node.
* The reserved pool is used for handling all interrupts. If your system depends on networking, allocate a reserved pool that is large enough to handle all of the incoming packet interrupts. In {product-version} and later versions, workloads can optionally be labeled as sensitive, as shown in the sketch after this list.
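
The following sketch illustrates that labeling and is not a required part of this procedure. The pod name and image are hypothetical, and it assumes a `Guaranteed` pod with exclusive CPUs and a performance profile named `infra-cpus`, as in the procedure below. The `irq-load-balancing.crio.io` annotation asks the runtime to keep device interrupts off the CPUs assigned to the pod.

[source,yaml]
----
apiVersion: v1
kind: Pod
metadata:
  name: sensitive-workload # hypothetical name
  annotations:
    irq-load-balancing.crio.io: "disable" # keep device interrupts off this pod's CPUs
spec:
  runtimeClassName: performance-infra-cpus # assumes the runtime class created for the infra-cpus profile
  containers:
  - name: app
    image: registry.example.com/app:latest # hypothetical image
    resources:
      requests:
        cpu: "2" # whole CPUs, equal to the limits, so the pod is Guaranteed QoS
        memory: "1Gi"
      limits:
        cpu: "2"
        memory: "1Gi"
----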
The decision regarding which specific CPUs should be used for reserved and isolated partitions requires detailed analysis and measurements. Factors like NUMA affinity of devices and memory play a role. The selection also depends on the workload architecture and the specific use case.
[IMPORTANT]
====
The reserved and isolated CPU pools must not overlap and together must span all available cores in the worker node.
====

To ensure that housekeeping tasks and workloads do not interfere with each other, specify two groups of CPUs in the `spec` section of the performance profile.
* `isolated` - Specifies the CPUs for the application container workloads. These CPUs have the lowest latency. Processes in this group have no interruptions and can, for example, reach much higher DPDK zero packet loss bandwidth.
* `reserved` - Specifies the CPUs for the cluster and operating system housekeeping duties. Threads in the `reserved` group are often busy. Do not run latency-sensitive applications in the `reserved` group. Latency-sensitive applications run in the `isolated` group.
.Procedure
. Create a performance profile appropriate for the environment's hardware and topology.
. Add the `reserved` and `isolated` parameters with the CPUs you want reserved and isolated for the infra and application containers:
+
[source,yaml]
----
apiVersion: performance.openshift.io/v2
kind: PerformanceProfile
metadata:
  name: infra-cpus
spec:
  cpu:
    reserved: "0-4,9" <1>
    isolated: "5-8" <2>
  nodeSelector: <3>
    node-role.kubernetes.io/worker: ""
----
<1> Specify which CPUs are for infra containers to perform cluster and operating system housekeeping duties.
<2> Specify which CPUs are for application containers to run workloads.
<3> Optional: Specify a node selector to apply the performance profile to specific nodes.
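
Note that in this example, assuming a node with 10 CPUs (0-9), the reserved set (`0-4,9`) and the isolated set (`5-8`) do not overlap and together cover all of the CPUs, which satisfies the requirement that the two pools span all available cores on the node.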