// Module included in the following assemblies:
//
// * scalability_and_performance/using-cpu-manager.adoc
// * post_installation_configuration/node-tasks.adoc
:_mod-docs-content-type: PROCEDURE
[id="setting_up_cpu_manager_{context}"]
= Setting up CPU Manager
To configure CPU Manager, create a `KubeletConfig` custom resource (CR) and apply it to the desired set of nodes.
.Procedure
. Label a node by running the following command:
+
[source,terminal]
----
# oc label node perf-node.example.com cpumanager=true
----
. To enable CPU Manager for all compute nodes, edit the CR by running the following command:
+
[source,terminal]
----
# oc edit machineconfigpool worker
----
. Add the `custom-kubelet: cpumanager-enabled` label to the `metadata.labels` section:
+
[source,yaml]
----
metadata:
  creationTimestamp: 2020-xx-xxx
  generation: 3
  labels:
    custom-kubelet: cpumanager-enabled
----
. Create a `KubeletConfig` custom resource (CR) in a file named `cpumanager-kubeletconfig.yaml`. In the `machineConfigPoolSelector` section, reference the label that you added in the previous step so that the new kubelet config is applied to the correct nodes:
+
[source,yaml]
----
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: cpumanager-enabled
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: cpumanager-enabled
  kubeletConfig:
     cpuManagerPolicy: static <1>
     cpuManagerReconcilePeriod: 5s <2>
----
<1> Specify a policy:
* `none`. This policy explicitly enables the existing default CPU affinity scheme, providing no affinity beyond what the scheduler does automatically. This is the default policy.
* `static`. This policy grants containers in guaranteed pods with integer CPU requests access to exclusive CPUs on the node. If you specify `static`, you must use a lowercase `s`.
<2> Optional. Specify the CPU Manager reconcile frequency. The default is `5s`.
. Create the dynamic kubelet config by running the following command:
+
[source,terminal]
----
# oc create -f cpumanager-kubeletconfig.yaml
----
+
This adds the CPU Manager feature to the kubelet config and, if needed, the Machine Config Operator (MCO) reboots the node. Enabling CPU Manager does not itself require a reboot.
. Check for the merged kubelet config by running the following command:
+
[source,terminal]
----
# oc get machineconfig 99-worker-XXXXXX-XXXXX-XXXX-XXXXX-kubelet -o json | grep ownerReference -A7
----
+
.Example output
[source,json]
----
       "ownerReferences": [
            {
                "apiVersion": "machineconfiguration.openshift.io/v1",
                "kind": "KubeletConfig",
                "name": "cpumanager-enabled",
                "uid": "7ed5616d-6b72-11e9-aae1-021e1ce18878"
            }
        ]
----
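+
The ownership check above can also be scripted so that it returns an exit status instead of output to read by eye. The following is a minimal sketch, not part of `oc`: the helper name `owned_by_kubeletconfig` is hypothetical, and it assumes that the JSON is pretty-printed with one field per line and that the `name` field directly follows the `kind` field, as in the example output.
+
```shell
# Hypothetical helper (not part of oc): reads MachineConfig JSON on stdin and
# succeeds when an owner reference of kind KubeletConfig has the given name.
# Assumes pretty-printed JSON with "name" on the line directly after "kind".
owned_by_kubeletconfig() {
  name="$1"
  grep -A7 '"ownerReferences"' \
    | grep -A1 '"kind": "KubeletConfig"' \
    | grep -q "\"name\": \"${name}\""
}

# Example usage against a live cluster (requires oc and a logged-in session):
#   oc get machineconfig <machine_config_name> -o json \
#     | owned_by_kubeletconfig cpumanager-enabled && echo "owned by KubeletConfig"
```
+
The function exits with status `0` only when the named `KubeletConfig` owns the machine config, which makes it usable in a script or a wait loop.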
. Check the compute node for the updated `kubelet.conf` file by running the following command:
+
[source,terminal]
----
# oc debug node/perf-node.example.com
sh-4.2# cat /host/etc/kubernetes/kubelet.conf | grep cpuManager
----
+
.Example output
[source,terminal]
----
cpuManagerPolicy: static <1>
cpuManagerReconcilePeriod: 5s <2>
----
<1> `cpuManagerPolicy` is defined when you create the `KubeletConfig` CR.
<2> `cpuManagerReconcilePeriod` is defined when you create the `KubeletConfig` CR.
. Create a project by running the following command:
+
[source,terminal]
----
$ oc new-project <project_name>
----
. Create a pod that requests one or more cores. Set the CPU value in both `limits` and `requests` to the same whole integer. This value is the number of cores that are dedicated to the pod:
+
[source,terminal]
----
# cat cpumanager-pod.yaml
----
+
.Example output
[source,yaml]
----
apiVersion: v1
kind: Pod
metadata:
  generateName: cpumanager-
spec:
  securityContext:
    runAsNonRoot: true
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: cpumanager
    image: gcr.io/google_containers/pause:3.2
    resources:
      requests:
        cpu: 1
        memory: "1G"
      limits:
        cpu: 1
        memory: "1G"
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop: [ALL]
  nodeSelector:
    cpumanager: "true"
----
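+
The whole-integer rule for CPU values can be checked before you create the pod. The following is a rough sketch only: the helper name `qualifies_for_exclusive_cpus` is hypothetical, and it deliberately accepts only plain integers such as `1` or `2`. It rejects every millicore or fractional notation, including values such as `1000m` that equal a whole core, and it does not check the memory values.
+
```shell
# Hypothetical helper: succeeds only when the CPU request and limit are the
# same whole number of cores, written as a plain integer such as "1" or "2".
# Simplification: any other notation ("500m", "1.5", "1000m") is rejected.
qualifies_for_exclusive_cpus() {
  request="$1"
  limit="$2"
  case "$request" in
    '' | *[!0-9]*) return 1 ;;  # empty, fractional, or millicore value
  esac
  [ "$request" -ge 1 ] && [ "$request" = "$limit" ]
}

qualifies_for_exclusive_cpus 1 1 && echo "cpu: 1 can receive an exclusive core"
qualifies_for_exclusive_cpus 500m 500m || echo "cpu: 500m cannot receive an exclusive core"
```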
. Create the pod:
+
[source,terminal]
----
# oc create -f cpumanager-pod.yaml
----
.Verification
. Verify that the pod is scheduled to the node that you labeled by running the following command:
+
[source,terminal]
----
# oc describe pod cpumanager
----
+
.Example output
[source,terminal]
----
Name: cpumanager-6cqz7
Namespace: default
Priority: 0
PriorityClassName: <none>
Node: perf-node.example.com/xxx.xx.xx.xxx
...
Limits:
cpu: 1
memory: 1G
Requests:
cpu: 1
memory: 1G
...
QoS Class: Guaranteed
Node-Selectors: cpumanager=true
----
. Verify that a CPU has been exclusively assigned to the pod by running the following command:
+
[source,terminal]
----
# oc describe node --selector='cpumanager=true' | grep -i cpumanager- -B2
----
+
.Example output
[source,terminal]
----
NAMESPACE                  NAME                             CPU Requests            CPU Limits             Memory Requests          Memory Limits      Age
cpuman                     cpumanager-mlrrz                 1 (28%)                 1 (28%)                1G (13%)                 1G (13%)           27m
----
. Verify that the `cgroups` are set up correctly. Get the process ID (PID) of the `pause` process by running the following commands:
+
[source,terminal]
----
# oc debug node/perf-node.example.com
----
+
[source,terminal]
----
sh-4.2# systemctl status | grep -B5 pause
----
+
[NOTE]
====
If the output returns multiple pause process entries, you must identify the correct pause process.
====
+
.Example output
[source,terminal]
----
# ├─init.scope
│ └─1 /usr/lib/systemd/systemd --switched-root --system --deserialize 17
└─kubepods.slice
├─kubepods-pod69c01f8e_6b74_11e9_ac0f_0a2b62178a22.slice
│ ├─crio-b5437308f1a574c542bdf08563b865c0345c8f8c0b0a655612c.scope
│ └─32706 /pause
----
. Verify that pods of quality of service (QoS) tier `Guaranteed` are placed within the `kubepods.slice` subdirectory by running the following commands:
+
[source,terminal]
----
# cd /sys/fs/cgroup/kubepods.slice/kubepods-pod69c01f8e_6b74_11e9_ac0f_0a2b62178a22.slice/crio-b5437308f1ad1a7db0574c542bdf08563b865c0345c86e9585f8c0b0a655612c.scope
----
+
[source,terminal]
----
# for i in `ls cpuset.cpus cgroup.procs` ; do echo -n "$i "; cat $i ; done
----
+
[NOTE]
====
Pods of other QoS tiers end up in child `cgroups` of the parent `kubepods`.
====
+
.Example output
[source,terminal]
----
cgroup.procs 32706
cpuset.cpus 1
----
. Check the allowed CPU list for the task by running the following command:
+
[source,terminal]
----
# grep ^Cpus_allowed_list /proc/32706/status
----
+
.Example output
[source,terminal]
----
Cpus_allowed_list: 1
----
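+
On nodes with more cores, `Cpus_allowed_list` and `cpuset.cpus` report comma-separated ranges such as `0-2,5` rather than a single number. The following sketch shows one way to expand such a list and to test whether a given CPU is in it. The helper names `expand_cpu_list` and `cpu_in_list` are hypothetical, and the functions assume a non-empty, well-formed list.
+
```shell
# Hypothetical helper: expands a CPU list such as "0-2,5" into "0 1 2 5".
# Assumes a non-empty, well-formed list of numbers and ascending ranges.
expand_cpu_list() {
  printf '%s\n' "$1" | tr ',' '\n' | while IFS=- read -r start end; do
    # A single CPU has no range end, so fall back to the start value.
    seq "$start" "${end:-$start}"
  done | tr '\n' ' ' | sed 's/ $//'
}

# Hypothetical helper: succeeds when CPU $1 appears in CPU list $2.
cpu_in_list() {
  for cpu in $(expand_cpu_list "$2"); do
    [ "$cpu" = "$1" ] && return 0
  done
  return 1
}

expand_cpu_list "0-2,5"
```
+
For example, `cpu_in_list 1 "$(cat cpuset.cpus)"` run inside the cgroup directory of another pod would succeed only if that pod is still allowed to run on CPU 1.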
. Verify that another pod on the system cannot run on the core allocated for the `Guaranteed` pod. For example, to verify the pod in the `besteffort` QoS tier, run the following commands:
+
[source,terminal]
----
# cat /sys/fs/cgroup/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-podc494a073_6b77_11e9_98c0_06bba5c387ea.slice/crio-c56982f57b75a2420947f0afc6cafe7534c5734efc34157525fa9abbf99e3849.scope/cpuset.cpus
----
+
[source,terminal]
----
# oc describe node perf-node.example.com
----
+
.Example output
[source,terminal]
----
...
Capacity:
  attachable-volumes-aws-ebs:  39
  cpu:                         2
  ephemeral-storage:           124768236Ki
  hugepages-1Gi:               0
  hugepages-2Mi:               0
  memory:                      8162900Ki
  pods:                        250
Allocatable:
  attachable-volumes-aws-ebs:  39
  cpu:                         1500m
  ephemeral-storage:           124768236Ki
  hugepages-1Gi:               0
  hugepages-2Mi:               0
  memory:                      7548500Ki
  pods:                        250
------- ---- ------------ ---------- --------------- ------------- ---
default cpumanager-6cqz7 1 (66%) 1 (66%) 1G (12%) 1G (12%) 29m
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 1440m (96%) 1 (66%)
----
+
This VM has two CPU cores. The `system-reserved` setting reserves 500 millicores, so half of one core is subtracted from the total capacity of the node to arrive at the `Node Allocatable` amount of 1500 millicores. Because each CPU Manager pod takes one whole core, which is equivalent to 1000 millicores, you can run only one of these pods on this node. If you try to create a second pod, the system accepts the pod, but the pod is never scheduled:
+
[source,terminal]
----
NAME READY STATUS RESTARTS AGE
cpumanager-6cqz7 1/1 Running 0 33m
cpumanager-7qc2t 0/1 Pending 0 11s
----
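+
The capacity arithmetic for this example node can be sketched as follows. The function name `whole_core_pods` is hypothetical. The 440 millicores for other pods is derived from the example output, where total CPU requests are 1440m and the CPU Manager pod accounts for 1000m. The sketch models only the scheduler's request accounting, not how the kubelet selects exclusive CPUs.
+
```shell
# Hypothetical helper: number of whole-core pods that fit on a node.
# Arguments, all in millicores:
#   $1 = node CPU capacity
#   $2 = system-reserved CPU
#   $3 = CPU already requested by other pods
whole_core_pods() {
  echo $(( ($1 - $2 - $3) / 1000 ))
}

# Example node: 2 cores, 500m reserved, 440m requested by other pods.
# (2000 - 500 - 440) / 1000 = 1 with integer division, so the second
# pod stays in the Pending state.
whole_core_pods 2000 500 440
```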