// Module included in the following assemblies:
//
// * scalability_and_performance/using-cpu-manager.adoc
// * post_installation_configuration/node-tasks.adoc

:_mod-docs-content-type: PROCEDURE
[id="setting_up_cpu_manager_{context}"]
= Setting up CPU Manager

To configure CPU Manager, create a `KubeletConfig` custom resource (CR) and apply it to the desired set of nodes.

.Procedure

. Label a node by running the following command:
+
[source,terminal]
----
# oc label node perf-node.example.com cpumanager=true
----
. To enable CPU Manager for all compute nodes, edit the CR by running the following command:
+
[source,terminal]
----
# oc edit machineconfigpool worker
----

. Add the `custom-kubelet: cpumanager-enabled` label to the `metadata.labels` section:
+
[source,yaml]
----
metadata:
  creationTimestamp: 2020-xx-xxx
  generation: 3
  labels:
    custom-kubelet: cpumanager-enabled
----
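+
As an alternative to editing the machine config pool manually, you can apply the same label with a single command. This is a minimal sketch that assumes the default `worker` pool:
+
[source,terminal]
----
# oc label machineconfigpool worker custom-kubelet=cpumanager-enabled
----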
. Create a `KubeletConfig` custom resource (CR) in a file named `cpumanager-kubeletconfig.yaml`. Refer to the label created in the previous step so that the correct nodes are updated with the new kubelet config. See the `machineConfigPoolSelector` section:
+
[source,yaml]
----
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: cpumanager-enabled
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: cpumanager-enabled
  kubeletConfig:
    cpuManagerPolicy: static <1>
    cpuManagerReconcilePeriod: 5s <2>
----
<1> Specify a policy:
* `none`. This policy explicitly enables the existing default CPU affinity scheme, providing no affinity beyond what the scheduler does automatically. This is the default policy.
* `static`. This policy allows containers in guaranteed pods with integer CPU requests exclusive access to CPUs on the node. You must enter `static` in all lowercase.
<2> Optional. Specify the CPU Manager reconcile frequency. The default is `5s`.

. Create the dynamic kubelet config by running the following command:
+
[source,terminal]
----
# oc create -f cpumanager-kubeletconfig.yaml
----
+
This adds the CPU Manager feature to the kubelet config and, if needed, the Machine Config Operator (MCO) reboots the node. A reboot is not needed to enable CPU Manager.
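+
If you want to follow the rollout, you can watch the machine config pool until it reports that the updated configuration has been applied. This is a hedged example that assumes the default `worker` pool:
+
[source,terminal]
----
# oc get machineconfigpool worker --watch
----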
. Check for the merged kubelet config by running the following command:
+
[source,terminal]
----
# oc get machineconfig 99-worker-XXXXXX-XXXXX-XXXX-XXXXX-kubelet -o json | grep ownerReference -A7
----
+
.Example output
[source,json]
----
"ownerReferences": [
    {
        "apiVersion": "machineconfiguration.openshift.io/v1",
        "kind": "KubeletConfig",
        "name": "cpumanager-enabled",
        "uid": "7ed5616d-6b72-11e9-aae1-021e1ce18878"
    }
]
----
. Check the compute node for the updated `kubelet.conf` file by running the following command:
+
[source,terminal]
----
# oc debug node/perf-node.example.com
sh-4.2# cat /host/etc/kubernetes/kubelet.conf | grep cpuManager
----
+
.Example output
[source,terminal]
----
cpuManagerPolicy: static <1>
cpuManagerReconcilePeriod: 5s <2>
----
<1> `cpuManagerPolicy` is defined when you create the `KubeletConfig` CR.
<2> `cpuManagerReconcilePeriod` is defined when you create the `KubeletConfig` CR.

. Create a project by running the following command:
+
[source,terminal]
----
$ oc new-project <project_name>
----
. Create a pod that requests a core or multiple cores. Both limits and requests must have their CPU value set to a whole integer. That is the number of cores that will be dedicated to this pod:
+
[source,terminal]
----
# cat cpumanager-pod.yaml
----
+
.Example output
[source,yaml]
----
apiVersion: v1
kind: Pod
metadata:
  generateName: cpumanager-
spec:
  securityContext:
    runAsNonRoot: true
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: cpumanager
    image: gcr.io/google_containers/pause:3.2
    resources:
      requests:
        cpu: 1
        memory: "1G"
      limits:
        cpu: 1
        memory: "1G"
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop: [ALL]
  nodeSelector:
    cpumanager: "true"
----
. Create the pod:
+
[source,terminal]
----
# oc create -f cpumanager-pod.yaml
----

.Verification
. Verify that the pod is scheduled to the node that you labeled by running the following command:
+
[source,terminal]
----
# oc describe pod cpumanager
----
+
.Example output
[source,terminal]
----
Name:               cpumanager-6cqz7
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               perf-node.example.com/xxx.xx.xx.xxx
...
    Limits:
      cpu:     1
      memory:  1G
    Requests:
      cpu:     1
      memory:  1G
...
QoS Class:       Guaranteed
Node-Selectors:  cpumanager=true
----
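+
Because the requests and limits are equal whole numbers, the pod is assigned the `Guaranteed` QoS class. To confirm the class directly, you can query the pod status. This is a hedged sketch; the generated pod name will differ in your environment:
+
[source,terminal]
----
# oc get pod cpumanager-6cqz7 -o jsonpath='{.status.qosClass}'
----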
. Verify that a CPU has been exclusively assigned to the pod by running the following command:
+
[source,terminal]
----
# oc describe node --selector='cpumanager=true' | grep -i cpumanager- -B2
----
+
.Example output
[source,terminal]
----
NAMESPACE    NAME                CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
cpuman       cpumanager-mlrrz    1 (28%)       1 (28%)     1G (13%)         1G (13%)       27m
----
. Verify that the `cgroups` are set up correctly. Get the process ID (PID) of the `pause` process by running the following commands:
+
[source,terminal]
----
# oc debug node/perf-node.example.com
----
+
[source,terminal]
----
sh-4.2# systemctl status | grep -B5 pause
----
+
[NOTE]
====
If the output returns multiple pause process entries, you must identify the correct pause process.
====
+
.Example output
[source,terminal]
----
# ├─init.scope
│ └─1 /usr/lib/systemd/systemd --switched-root --system --deserialize 17
└─kubepods.slice
  ├─kubepods-pod69c01f8e_6b74_11e9_ac0f_0a2b62178a22.slice
  │ ├─crio-b5437308f1a574c542bdf08563b865c0345c8f8c0b0a655612c.scope
  │ └─32706 /pause
----
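+
If you prefer not to search the `systemd` output, the CRI-O command-line tools on the node can usually locate the pod sandbox as well. This is a hedged sketch; the exact output depends on the CRI-O version:
+
[source,terminal]
----
sh-4.2# crictl pods --name cpumanager
----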
. Verify that pods of quality of service (QoS) tier `Guaranteed` are placed within the `kubepods.slice` subdirectory by running the following commands:
+
[source,terminal]
----
# cd /sys/fs/cgroup/kubepods.slice/kubepods-pod69c01f8e_6b74_11e9_ac0f_0a2b62178a22.slice/crio-b5437308f1ad1a7db0574c542bdf08563b865c0345c86e9585f8c0b0a655612c.scope
----
+
[source,terminal]
----
# for i in `ls cpuset.cpus cgroup.procs` ; do echo -n "$i "; cat $i ; done
----
+
[NOTE]
====
Pods of other QoS tiers end up in child `cgroups` of the parent `kubepods`.
====
+
.Example output
[source,terminal]
----
cpuset.cpus 1
cgroup.procs 32706
----
. Check the allowed CPU list for the task by running the following command:
+
[source,terminal]
----
# grep ^Cpus_allowed_list /proc/32706/status
----
+
.Example output
[source,terminal]
----
Cpus_allowed_list:    1
----
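+
As an alternative to reading `/proc`, the `taskset` utility reports the same affinity information. This is a hedged example that assumes `taskset` is available in the debug shell; it uses the PID identified earlier:
+
[source,terminal]
----
# taskset -cp 32706
----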
. Verify that another pod on the system cannot run on the core allocated for the `Guaranteed` pod. For example, to verify the pod in the `besteffort` QoS tier, run the following commands:
+
[source,terminal]
----
# cat /sys/fs/cgroup/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-podc494a073_6b77_11e9_98c0_06bba5c387ea.slice/crio-c56982f57b75a2420947f0afc6cafe7534c5734efc34157525fa9abbf99e3849.scope/cpuset.cpus
----
+
[source,terminal]
----
# oc describe node perf-node.example.com
----
+
.Example output
[source,terminal]
----
...
Capacity:
 attachable-volumes-aws-ebs:  39
 cpu:                         2
 ephemeral-storage:           124768236Ki
 hugepages-1Gi:               0
 hugepages-2Mi:               0
 memory:                      8162900Ki
 pods:                        250
Allocatable:
 attachable-volumes-aws-ebs:  39
 cpu:                         1500m
 ephemeral-storage:           124768236Ki
 hugepages-1Gi:               0
 hugepages-2Mi:               0
 memory:                      7548500Ki
 pods:                        250
-------                               ----                           ------------  ----------  ---------------  -------------  ---
default                               cpumanager-6cqz7               1 (66%)       1 (66%)     1G (12%)         1G (12%)       29m

Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource  Requests     Limits
  --------  --------     ------
  cpu       1440m (96%)  1 (66%)
----
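+
If you only need the allocatable CPU value, you can query that field directly instead of reading the full node description. This is a minimal sketch that uses the example node name:
+
[source,terminal]
----
# oc get node perf-node.example.com -o jsonpath='{.status.allocatable.cpu}'
----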
+
This VM has two CPU cores. The `system-reserved` setting reserves 500 millicores, meaning that half of one core is subtracted from the total capacity of the node to arrive at the `Node Allocatable` amount. You can see that `Allocatable CPU` is 1500 millicores. This means you can run only one of the CPU Manager pods, because each pod takes one whole core and a whole core is equivalent to 1000 millicores. If you try to schedule a second pod, the system accepts the pod, but it is never scheduled:
+
[source,terminal]
----
NAME                    READY   STATUS    RESTARTS   AGE
cpumanager-6cqz7        1/1     Running   0          33m
cpumanager-7qc2t        0/1     Pending   0          11s
----
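+
To confirm why the second pod stays in `Pending`, you can inspect its events; the scheduler typically reports that no node has enough allocatable CPU. This is a hedged sketch that uses the pod name from the previous output:
+
[source,terminal]
----
# oc describe pod cpumanager-7qc2t
----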