
Signed-off-by: Stephen Smith <stesmith@redhat.com>

New 4.4 changes from May19

Signed-off-by: Stephen Smith <stesmith@redhat.com>

Editorial changes from ahardin.

Signed-off-by: Stephen Smith <stesmith@redhat.com>
This commit is contained in:
Stephen Smith
2020-05-19 09:07:16 -04:00
committed by openshift-cherrypick-robot
parent a05fdaca25
commit fdbc87b92b
8 changed files with 503 additions and 36 deletions


@@ -1214,7 +1214,7 @@ Topics:
File: routing-optimization
- Name: What huge pages do and how they are consumed by apps
File: what-huge-pages-do-and-how-they-are-consumed-by-apps
- Name: Performance-addon operator for low latency nodes
- Name: Performance Addon Operator for low latency nodes
File: cnf-performance-addon-operator-for-low-latency-nodes
Distros: openshift-webscale
- Name: Using ArgoCD


@@ -0,0 +1,86 @@
// Module included in the following assemblies:
//CNF-78
// * scalability_and_performance/cnf-performance-addon-operator-for-low-latency-nodes.adoc
[id="cnf-configuring-huge-pages_{context}"]
= Configuring huge pages
Nodes must pre-allocate huge pages used in an {product-title} cluster. Use the
Performance Addon Operator to allocate huge pages on a specific node.
{product-title} provides a generic method for creating and allocating huge pages.
The Performance Addon Operator provides an easier way to do this by using the
PerformanceProfile.
For example, in the `hugepages` `pages` section of the PerformanceProfile,
you can specify multiple blocks of `size`, `count`, and, optionally, `node`:
----
hugepages:
  defaultHugepagesSize: "1Gi"
  pages:
  - size: "1Gi"
    count: 4
    node: 0 <1>
----
<1> `node` is the NUMA node in which the huge pages are allocated. If you omit `node`, the pages are evenly spread across all NUMA nodes.
[NOTE]
====
Wait for the relevant machine config pool status to indicate that the update is finished.
====
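For example, assuming the performance profile targets a machine config pool named `worker-cnf` (adjust the name to match your cluster), you can wait for the pool to report that the update is complete:
----
$ oc wait machineconfigpool/worker-cnf --for=condition=Updated --timeout=30m
----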
This is the only configuration step required to allocate huge pages.
.Verification steps
* To verify the configuration, see the `/proc/meminfo` file on the node:
+
----
# grep -i huge /proc/meminfo
----
+
----
AnonHugePages: ###### ##
ShmemHugePages: 0 kB
HugePages_Total: 2
HugePages_Free: 2
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: #### ##
Hugetlb: #### ##
----
* Use `oc debug node` to verify that the kernel arguments were applied by connecting to one of the worker nodes and listing the
kernel command-line arguments (in `/proc/cmdline` on the host):
+
----
$ oc debug node/ip-10-0-141-105.ec2.internal
----
+
----
Starting pod/ip-10-0-141-105ec2internal-debug ...
To use host binaries, run `chroot /host`
sh-4.2# cat /host/proc/cmdline
BOOT_IMAGE=/ostree/rhcos-... console=tty0 console=ttyS0,115200n8
rootflags=defaults,prjquota rw root=UUID=fd0... ostree=/ostree/boot.0/rhcos/16...
coreos.oem.id=qemu coreos.oem.id=ec2 ignition.platform.id=ec2 selinux=0
sh-4.2# exit
----
+
You should see the `selinux=0` argument added to the other kernel arguments.
* Use `oc describe` to report the new size:
+
----
$ oc describe node worker-0.ocp4poc.example.com | grep -i huge
----
+
----
hugepages-1g=true
hugepages-###: ###
hugepages-###: ###
----
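* Optionally, confirm that the kubelet reports the huge pages as allocatable node resources (the node name below is an example):
+
----
$ oc get node worker-0.ocp4poc.example.com -o jsonpath='{.status.allocatable}{"\n"}'
----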


@@ -0,0 +1,149 @@
// Module included in the following assemblies:
//CNF-78
// * networking/multiple_networks/configuring-sr-iov.adoc
[id="installing-the-performance-addon-operator_{context}"]
= Installing the Performance Addon Operator
The Performance Addon Operator provides the ability to enable advanced node performance tuning on a set of nodes.
As a cluster administrator, you can install Performance Addon Operator using the {product-title} CLI or the web console.
[id="install-operator-cli_{context}"]
== Installing the Operator using the CLI
As a cluster administrator, you can install the Operator using the CLI.
.Prerequisites
* A cluster installed on bare-metal hardware.
* The {product-title} Command-line Interface (CLI), commonly known as `oc`.
* You are logged in as a user with `cluster-admin` privileges.
.Procedure
. Create a namespace for the Performance Addon Operator by completing the following actions:
.. Create the following Namespace Custom Resource (CR) that defines the `openshift-performance-addon-operator` namespace,
and then save the YAML in the `pao-namespace.yaml` file:
+
----
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-performance-addon-operator
  labels:
    openshift.io/run-level: "1"
----
.. Create the namespace by running the following command:
+
----
$ oc create -f pao-namespace.yaml
----
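+
Optionally, verify that the namespace was created (a quick check, not a required step):
+
----
$ oc get namespace openshift-performance-addon-operator
----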
. Install the Performance Addon Operator in the namespace you created in the previous step by creating the following objects:
.. Create the following OperatorGroup CR and save the YAML in the
`pao-operatorgroup.yaml` file:
+
[source,yaml]
----
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: performance-addon-operator
  namespace: openshift-performance-addon-operator
spec:
  targetNamespaces:
  - openshift-performance-addon-operator
----
.. Create the OperatorGroup CR by running the following command:
+
----
$ oc create -f pao-operatorgroup.yaml
----
.. Run the following command to get the `channel` value required for the next
step.
+
----
$ oc get packagemanifest performance-addon-operator -n openshift-marketplace -o jsonpath='{.status.defaultChannel}'
4.4
----
.. Create the following Subscription CR and save the YAML in the `pao-sub.yaml` file:
+
.Example Subscription
[source,yaml]
----
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: performance-addon-operator-subscription
  namespace: openshift-performance-addon-operator
spec:
  channel: <channel> <1>
  name: performance-addon-operator
  source: redhat-operators <2>
  sourceNamespace: openshift-marketplace
----
<1> Specify the value you obtained in the previous step for the `.status.defaultChannel` parameter.
<2> You must specify the `redhat-operators` value.
.. Create the Subscription object by running the following command:
+
----
$ oc create -f pao-sub.yaml
----
.. Change to the `openshift-performance-addon-operator` project:
+
----
$ oc project openshift-performance-addon-operator
----
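+
Optionally, verify that the Operator installation completed by listing the ClusterServiceVersion (CSV) in the namespace (a quick check; the CSV name includes the Operator version):
+
----
$ oc get csv -n openshift-performance-addon-operator
----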
[id="install-operator-web-console_{context}"]
== Installing the Performance Addon Operator using the web console
As a cluster administrator, you can install the Performance Addon Operator using the web console.
[NOTE]
====
You must create the Namespace CR and OperatorGroup CR as described
in the previous section.
====
.Procedure
. Install the Performance Addon Operator using the {product-title} web console:
.. In the {product-title} web console, click *Operators* -> *OperatorHub*.
.. Choose *Performance Addon Operator* from the list of available Operators, and then click *Install*.
.. On the *Create Operator Subscription* page, under *A specific namespace on the cluster*,
select *openshift-performance-addon-operator*, and then click *Subscribe*.
. Optional: Verify that the Performance Addon Operator installed successfully:
.. Switch to the *Operators* -> *Installed Operators* page.
.. Ensure that *Performance Addon Operator* is listed in the *openshift-performance-addon-operator* project with a *Status* of *InstallSucceeded*.
+
[NOTE]
====
During installation an Operator might display a *Failed* status.
If the installation later succeeds with an *InstallSucceeded* message, you can ignore the *Failed* message.
====
+
If the Operator does not appear as installed, troubleshoot further:
+
* Go to the *Operators* -> *Installed Operators* page and inspect
the *Operator Subscriptions* and *Install Plans* tabs for any failure or errors
under *Status*.
* Go to the *Workloads* -> *Pods* page and check the logs for Pods in the
`openshift-performance-addon-operator` project.
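* Alternatively, run roughly equivalent checks from the CLI (a sketch; the exact pod names vary by installation):
+
----
$ oc get subscriptions,installplans -n openshift-performance-addon-operator
$ oc get pods -n openshift-performance-addon-operator
----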


@@ -0,0 +1,76 @@
// Module included in the following assemblies:
// Epic CNF-78
// scalability_and_performance/cnf-performance-addon-operator-for-low-latency-nodes.adoc
[IMPORTANT]
====
The feature described in this document is for *Developer Preview* purposes and is *not supported* by Red Hat at this time.
This feature might cause nodes to reboot and become unavailable.
====
[id="cnf-tuning-nodes-for-low-latency-via-performanceprofile_{context}"]
= Tuning nodes for low latency via PerformanceProfile
The PerformanceProfile lets you control latency tuning aspects of nodes that belong to a certain MachineConfigPool.
After you have specified your settings, the `PerformanceProfile` object is compiled into multiple objects that perform the actual node level tuning:
* A `MachineConfig` file that manipulates the nodes.
* A `KubeletConfig` file that configures the Topology Manager, the CPU Manager, and the {product-title} nodes.
* The Tuned profile that configures the Node Tuning Operator.
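For example, after a profile is applied, you can list the generated objects with commands like the following (a sketch; the exact object names are derived from your profile name):
----
$ oc get machineconfigs,kubeletconfigs
$ oc get tuned -n openshift-cluster-node-tuning-operator
----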
.Procedure
. Prepare a cluster.
. Create a Machine Config Pool.
. Install the Performance Addon Operator.
. Create a performance profile that is appropriate for your hardware and topology.
In the PerformanceProfile, you can specify whether to update the kernel to kernel-rt, the huge pages allocation, the CPUs that
are reserved for operating system housekeeping processes, and the CPUs that are used for running the workloads.
+
This is a typical performance profile:
+
[source,yaml]
----
apiVersion: performance.openshift.io/v1alpha1
kind: PerformanceProfile
metadata:
  name: <unique-name>
spec:
  cpu:
    isolated: "1-3"
    reserved: "0"
  hugepages:
    defaultHugepagesSize: "1Gi"
    pages:
    - size: "1Gi"
      count: 4
      node: 0
  realTimeKernel:
    enabled: true <1>
  numa:
    topologyPolicy: "best-effort"
  nodeSelector:
    node-role.kubernetes.io/worker-cnf: ""
----
<1> Valid values are `true` or `false`. Setting the `true` value installs the real-time kernel on the node.
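Assuming you save the profile in a file such as `performance-profile.yaml` (the file name is illustrative), you can then create it with:
----
$ oc create -f performance-profile.yaml
----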
== Partitioning the CPUs
You can reserve cores, or threads, for operating system housekeeping tasks from a single NUMA node and put your workloads on another NUMA node.
The reason for this is that the housekeeping processes might use the CPUs in a way that impacts latency-sensitive processes
running on those same CPUs.
Keeping your workloads on a separate NUMA node prevents the processes from interfering with each other.
Additionally, each NUMA node has its own memory bus that is not shared.
Specify two groups of CPUs in the `spec` section:
* `isolated` - Has the lowest latency. Processes in this group have no interruptions and so can, for example,
reach much higher DPDK zero packet loss bandwidth.
* `reserved` - The housekeeping CPUs. Threads in the reserved group tend to be very busy, so latency-sensitive
applications should be run in the isolated group.
See link:https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod/#create-a-pod-that-gets-assigned-a-qos-class-of-guaranteed[Create a Pod that gets assigned a QoS class of `Guaranteed`].
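For illustration only (not part of this procedure), a pod whose container requests a whole number of CPUs with equal requests and limits is assigned the `Guaranteed` QoS class and can be pinned to the isolated CPUs; the pod name and image below are placeholders:
[source,yaml]
----
apiVersion: v1
kind: Pod
metadata:
  name: qos-guaranteed-example
spec:
  containers:
  - name: app
    image: registry.access.redhat.com/ubi8/ubi
    command: ["sleep", "infinity"]
    resources:
      requests:
        cpu: "2"
        memory: 200Mi
      limits:
        cpu: "2"
        memory: 200Mi
----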


@@ -41,7 +41,7 @@ set values, installing a kernel, and reconfiguring the machine. But this method
requires setting up four different Operators and performing many configurations
that, when done manually, are complex and prone to mistakes.
{product-title} 4.4 provides a performance-addon Operator to implement automatic
{product-title} 4.4 provides a Performance Addon Operator to implement automatic
tuning in order to achieve low latency performance for OpenShift applications.
The cluster administrator uses this performance profile configuration, which makes
it easier to make these changes in a more reliable way. The administrator can


@@ -61,6 +61,8 @@ huge pages to verify:
----
$ oc logs <tuned_pod_on_node_using_hugepages> \
-n openshift-cluster-node-tuning-operator | grep 'applied$' | tail -n1
----
+
----
2019-08-08 07:20:41,286 INFO tuned.daemon.daemon: static tuning from profile 'node-hugepages' applied
----


@@ -1,5 +1,6 @@
// Epic CNF-40
[id="cnf-building-and-deploying-a-dpdk-payload"]
= Building and deploying a DPDK payload using the s2i image
= Building and deploying a DPDK payload using the S2I image
include::modules/common-attributes.adoc[]
:context: building-deploying-DPDK-using-s2i-image
toc::[]
@@ -8,7 +9,7 @@ The Data Plane Development Kit (DPDK) base image is a base image for DPDK
applications. It uses the Source-to-Image (S2I) build tool to automate the
building of application images.
Source-to-Image (S2I) is a tool for building reproducible, Docker-formatted
Source-to-Image (S2I) is a tool for building reproducible and formatted
container images. It produces ready-to-run images by injecting application
source into a container image and assembling a new image. The new image
incorporates the base image (the builder) and built source. For more
@@ -16,23 +17,26 @@ information, see
xref:../builds/build-strategies.adoc#build-strategy-s2i_build-strategies[Source-to-Image
(S2I) build].
The DPDK base image that comes preinstalled with DPDK, and a build tool used to
create a target image with DPDK and a user provided application.
The DPDK base image comes preinstalled with DPDK, and with a build tool that can be used to
create a target image containing the DPDK libraries and the application provided by the user.
.Prerequisites
Before using the S2I tool, ensure that you have the following components installed and configured:
* xref:../registry/configuring-registry-operator.adoc#configuring-registry-operator[Image Registry Operator].
* OCP Image Registry Operator:
See xref:../registry/configuring-registry-operator.adoc#configuring-registry-operator[Image Registry Operator in OpenShift Container Platform].
* xref:../networking/hardware_networks/installing-sriov-operator.adoc#installing-sriov-operator[Installing the SR-IOV Network Operator].
* SR-IOV Operator:
See xref:../networking/hardware_networks/about-sriov.adoc#about-sriov[About SR-IOV hardware on {product-title}].
An example DPDK-based application based on the dpdk-based image is available in
the link:https://github.com/openshift-kni/cnf-features-deploy/tree/master/tools/s2i-dpdk/test/test-app[cnf-features-deploy] repository.
* Performance Addon Operator:
See xref:../scalability_and_performance/cnf-performance-addon-operator-for-low-latency-nodes.adoc#performance-addon-operator-for-low-latency-nodes[About CNF Performance Addon Operator].
This example application is the `test-pmd` application provided by dpdk.org. See the link:https://doc.dpdk.org/guides/testpmd_app_ug/[Testpmd Application User Guide].
This example application is the `test-pmd` application provided by dpdk.org.
For more information, see link:https://doc.dpdk.org/guides/testpmd_app_ug/[Testpmd Application User Guide].
.Procedure
.Building procedure
To build a target image, create a repository containing an application and the following two scripts:
@@ -58,7 +62,8 @@ This is an example of `run.sh`:
export CPU=$(cat /sys/fs/cgroup/cpuset/cpuset.cpus)
echo ${CPU}
echo ${PCIDEVICE_OPENSHIFT_IO_DPDKNIC}
echo ${PCIDEVICE_OPENSHIFT_IO_DPDKNIC} # This is the resource name configured via the SR-IOV Operator.
if [ "$RUN_TYPE" == "testpmd" ]; then
envsubst < test-template.sh > test.sh
@@ -69,17 +74,35 @@ fi
while true; do sleep inf; done;
----
The example `run.sh` runs the commands inside `test-template.sh`.
----
spawn ./customtestpmd -l ${CPU} -w ${PCIDEVICE_OPENSHIFT_IO_DPDKNIC} \
  --iova-mode=va -- -i --portmask=0x1 --nb-cores=2 --forward-mode=mac --port-topology=loop \
  --no-mlockall
set timeout 10000
expect "testpmd>"
send -- "start\r"
sleep 20
expect "testpmd>"
send -- "stop\r"
expect "testpmd>"
send -- "quit\r"
expect eof
----
This file runs the compiled `testpmd` application from the build stage.
It spawns the `testpmd` interactive terminal, starts a test workload, and closes it after 20 seconds.
The DPDK base image and the application repository are both used to build a target application image.
S2I copies the application from the repository to the DPDK base image, which then builds a target image using
DPDK base image resources and the copied application.
You can use the {product-title} BuildConfig to build a target image in a production environment.
A sample manifest file for building the sample DPDK application described above is available in the
link:https://github.com/openshift-kni/cnf-features-deploy/blob/master/feature-configs/demo/dpdk/build-config.yaml[build-config.yaml] file.
The `build-config.yaml` file is the file you use to create your automated build.
It creates a new `dpdk` namespace, configures an `ImageStream` for the image,
and starts a build.
The internal registry must be configured in the cluster.
----
---
@@ -112,13 +135,13 @@ spec:
source: <3>
contextDir: tools/s2i-dpdk/test/test-app
git:
uri: https://github.com/openshift-kni/cnf-features-deploy.git
uri: <repo-uri> <4>
type: Git
strategy: <4>
strategy: <5>
sourceStrategy:
from:
kind: DockerImage
name: quay.io/schseba/dpdk-s2i-base:ds
name: registry.access.redhat.com/openshift4/dpdk-base-rhel8:v4.4
type: Source
successfulBuildsHistoryLimit: 5
triggers:
@@ -131,7 +154,9 @@ spec:
<3> The `source` type contains the git repository and a context directory within the repository.
<4> The `strategy` type contains a DPDK base image.
<4> The URI of the repository that contains the application and both the `build.sh` and `run.sh` files.
<5> The `strategy` type contains a DPDK base image.
It is the `source` type and `strategy` type that build the image.
@@ -139,17 +164,140 @@ A complete guide to using BuildConfigs is available in
xref:../builds/understanding-buildconfigs.adoc#understanding-buildconfigs[Understanding
build configurations].
After the base DPDK image build is ready, create a new directory with a
`kustomization.yaml` file and a `build-config.yaml` patch.
After the base DPDK image build is ready, you should configure the environment to be able to run the DPDK workload on it.
This is an example of `kustomization.yaml`:
.Deployment procedure
. Create a performance profile to allocate Hugepages and isolated CPUs. For more
information, see
xref:../scalability_and_performance/cnf-performance-addon-operator-for-low-latency-nodes.adoc#cnf-understanding-low-latency_{context}[Tuning
nodes for low latency via PerformanceProfile].
. Create the SR-IOV network policy and the SR-IOV network attachment based on your network card type. For more information,
see xref:../networking/hardware_networks/using-dpdk-and-rdma.adoc#using-dpdk-and-rdma[Using Virtual Functions (VFs) with DPDK and RDMA modes].
. Using a deployment config resource instead of a regular deployment allows automatic redeployment of the workload whenever a new image is built.
You must create a special `SecurityContextConstraints` resource that allows the `deployer` service account to create the
DPDK workload deployment, and a deployment config resource that points to the `ImageStream`.
+
`SecurityContextConstraints` example:
+
----
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
bases:
- ../demo/dpdk
patchesStrategicMerge:
- build-config.yaml
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  name: dpdk
allowHostDirVolumePlugin: true
allowHostIPC: false
allowHostNetwork: false
allowHostPID: false
allowHostPorts: false
allowPrivilegeEscalation: false
allowPrivilegedContainer: false
allowedCapabilities:
- "*"
allowedUnsafeSysctls:
- "*"
defaultAddCapabilities: null
fsGroup:
  type: RunAsAny
readOnlyRootFilesystem: false
runAsUser:
  type: RunAsAny
seLinuxContext:
  type: RunAsAny
seccompProfiles:
- "*"
users: <1>
- system:serviceaccount:dpdk:deployer
volumes:
- "*"
----
+
<1> This is a list of all the service accounts that are part of the SCC.
Add an entry for every namespace that deploys the DPDK workload, in the format `system:serviceaccount:<namespace>:deployer`.
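+
Assuming you save the SCC in a file such as `scc-dpdk.yaml` (the file name is illustrative), create it with:
+
----
$ oc create -f scc-dpdk.yaml
----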
. Apply the deployment config resource.
+
`DeploymentConfig` resource example:
+
----
apiVersion: apps.openshift.io/v1
kind: DeploymentConfig
metadata:
  labels:
    app: s2i-dpdk-app
    app.kubernetes.io/component: s2i-dpdk-app
    app.kubernetes.io/instance: s2i-dpdk-app
  name: s2i-dpdk-app
  namespace: dpdk
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    deploymentconfig: s2i-dpdk-app
  strategy:
    rollingParams:
      intervalSeconds: 1
      maxSurge: 25%
      maxUnavailable: 25%
      timeoutSeconds: 600
      updatePeriodSeconds: 1
    type: Rolling
  template:
    metadata:
      labels:
        deploymentconfig: s2i-dpdk-app
      annotations:
        k8s.v1.cni.cncf.io/networks: dpdk/dpdk-network <1>
    spec:
      serviceAccount: deployer
      serviceAccountName: deployer
      securityContext:
        runAsUser: 0
      containers:
      - image: "<internal-registry-url>/<namespace>/<image-stream>:<tag>" <2>
        securityContext:
          runAsUser: 0
          capabilities:
            add: ["IPC_LOCK","SYS_RESOURCE"]
        imagePullPolicy: Always
        name: s2i-dpdk-app
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        resources: <3>
          limits:
            cpu: "4"
            hugepages-1Gi: 4Gi
            memory: 1000Mi
          requests:
            cpu: "4"
            hugepages-1Gi: 4Gi
            memory: 1000Mi
        volumeMounts:
        - mountPath: /mnt/huge
          name: hugepage
      dnsPolicy: ClusterFirst
      volumes:
      - name: hugepage
        emptyDir:
          medium: HugePages
      restartPolicy: Always
  test: false
  triggers:
  - type: ConfigChange
  - imageChangeParams:
      automatic: true
      containerNames:
      - s2i-dpdk-app
      from: <4>
        kind: ImageStreamTag
        name: <image-stream>:<tag>
        namespace: <namespace>
    type: ImageChange
----
+
<1> The network attachment definition name.
<2> The image stream URL.
<3> The requested resources. The limit and request must be the same so that the quality of service (QoS) class is Guaranteed and the CPUs are pinned.
<4> The image stream created to start a redeployment when a newly built image is pushed to the registry.
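Assuming you save the resource in a file such as `deployment-config.yaml` (the file name is illustrative), create it and confirm that the workload pod starts in the `dpdk` namespace:
----
$ oc create -f deployment-config.yaml
$ oc get pods -n dpdk
----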


@@ -1,5 +1,5 @@
[id="performance-addon-operator-for-low-latency-nodes"]
= Performance-addon operator for low latency nodes
= Performance Addon Operator for low latency nodes
include::modules/common-attributes.adoc[]
:context: cnf-master
@@ -7,16 +7,22 @@ toc::[]
include::modules/cnf-understanding-low-latency.adoc[leveloffset=+1]
include::modules/nw-sriov-installing-operator.adoc[leveloffset=+1]
include::modules/cnf-installing-the-performance-addon-operator.adoc[leveloffset=+1]
include::modules/configuring-huge-pages.adoc[leveloffset=+1]
include::modules/cnf-configuring-huge-pages.adoc[leveloffset=+1]
include::modules/cnf-creating-the-performance-profile-object.adoc[leveloffset=+1]
include::modules/cnf-tuning-nodes-for-low-latency-via-performanceprofile.adoc[leveloffset=+1]
.Additional resources
* For more information about Machine Config and KubeletConfig,
* For more information about MachineConfig and KubeletConfig,
see xref:../nodes/nodes/nodes-nodes-managing.adoc#nodes-nodes-managing[Managing nodes].
* For more information about the Node Tuning Operator,
see xref:../scalability_and_performance/using-node-tuning-operator.adoc#using-node-tuning-operator[Using the Node Tuning Operator].
* For more information about the PerformanceProfile,
see xref:../scalability_and_performance/what-huge-pages-do-and-how-they-are-consumed-by-apps.adoc#configuring-huge-pages_huge-pages[Configuring huge pages].
* For more information about consuming huge pages from your containers,
see xref:../scalability_and_performance/what-huge-pages-do-and-how-they-are-consumed-by-apps.adoc#how-huge-pages-are-consumed-by-apps_huge-pages[How huge pages are consumed by apps].