diff --git a/_topic_map.yml b/_topic_map.yml
index 362afe964c..5e03a94989 100644
--- a/_topic_map.yml
+++ b/_topic_map.yml
@@ -378,7 +378,7 @@ Topics:
     File: troubleshooting-crio-issues
   - Name: Troubleshooting Operator issues
     File: troubleshooting-operator-issues
-  - Name: Investigating Pod issues
+  - Name: Investigating pod issues
     File: investigating-pod-issues
   - Name: Troubleshooting the Source-to-Image process
     File: troubleshooting-s2i
diff --git a/modules/about-must-gather.adoc b/modules/about-must-gather.adoc
index b50cc74061..647baafc79 100644
--- a/modules/about-must-gather.adoc
+++ b/modules/about-must-gather.adoc
@@ -17,6 +17,6 @@ The `oc adm must-gather` CLI command collects the information from your cluster
 
 You can specify one or more images when you run the command by including the `--image` argument. When you specify an image, the tool collects data related to that feature or product.
 
-When you run `oc adm must-gather`, a new pod is created on the cluster. The data is collected on that Pod and saved in a new directory that starts with `must-gather.local`. This directory is created in the current working directory.
+When you run `oc adm must-gather`, a new pod is created on the cluster. The data is collected on that pod and saved in a new directory that starts with `must-gather.local`. This directory is created in the current working directory.
 
 // todo: table or ref module listing available images?
diff --git a/modules/copying-files-pods-and-containers.adoc b/modules/copying-files-pods-and-containers.adoc
index bba6fce6db..d65e0bd885 100644
--- a/modules/copying-files-pods-and-containers.adoc
+++ b/modules/copying-files-pods-and-containers.adoc
@@ -5,7 +5,7 @@
 [id="copying-files-pods-and-containers_{context}"]
 = Copying files to and from pods and containers
 
-You can copy files to and from a Pod to test configuration changes or gather diagnostic information.
+You can copy files to and from a pod to test configuration changes or gather diagnostic information.
 
 .Prerequisites
 
@@ -15,21 +15,21 @@ You can copy files to and from a Pod to test configuration changes or gather dia
 
 .Procedure
 
-. Copy a file to a Pod:
+. Copy a file to a pod:
 +
 [source,terminal]
 ----
 $ oc cp <local_path> <pod_name>:/<path> -c <container_name> <1>
 ----
-<1> Note that a Pod's first container will be selected if the `-c` option is not specified.
+<1> The first container in a pod is selected if the `-c` option is not specified.
 
-. Copy a file from a Pod:
+. Copy a file from a pod:
 +
 [source,terminal]
 ----
 $ oc cp <pod_name>:/<path> -c <container_name> <local_path> <1>
 ----
-<1> Note that a Pod's first container will be selected if the `-c` option is not specified.
+<1> The first container in a pod is selected if the `-c` option is not specified.
 +
 [NOTE]
 ====
diff --git a/modules/gathering-data-specific-features.adoc b/modules/gathering-data-specific-features.adoc
index 7247d84a30..095c30b59e 100644
--- a/modules/gathering-data-specific-features.adoc
+++ b/modules/gathering-data-specific-features.adoc
@@ -15,10 +15,7 @@ endif::[]
 [id="gathering-data-specific-features_{context}"]
 = Gathering data about specific features
 
-You can gather debugging information about specific features by using the
-`oc adm must-gather` CLI command with the `--image` or `--image-stream` argument.
-The `must-gather` tool supports multiple images, so you can gather data about
-more than one feature by running a single command.
+You can gather debugging information about specific features by using the `oc adm must-gather` CLI command with the `--image` or `--image-stream` argument. The `must-gather` tool supports multiple images, so you can gather data about more than one feature by running a single command.
 
 ifdef::from-main-support-section[]
 
@@ -79,7 +76,7 @@ endif::from-main-support-section[]
 
 [NOTE]
 ====
-To collect the default must-gather data in addition to specific feature data, add the `--image-stream=openshift/must-gather` argument.
+To collect the default `must-gather` data in addition to specific feature data, add the `--image-stream=openshift/must-gather` argument.
 ====
 
 .Prerequisites
@@ -93,9 +90,7 @@ To collect the default must-gather data in addition to specific feature data, ad
 
 ifndef::openshift-origin[]
 
-. Run the `oc adm must-gather` command with one or more `--image` or `--image-stream`
-arguments. For example, the following command gathers both the default cluster
-data and information specific to {VirtProductName}:
+. Run the `oc adm must-gather` command with one or more `--image` or `--image-stream` arguments. For example, the following command gathers both the default cluster data and information specific to {VirtProductName}:
 +
 [source,terminal,subs="attributes+"]
 ----
@@ -103,16 +98,14 @@ $ oc adm must-gather \
   --image-stream=openshift/must-gather \ <1>
   --image=registry.redhat.io/container-native-virtualization/cnv-must-gather-rhel8:v{HCOVersion} <2>
 ----
-<1> The default {product-title} must-gather image
+<1> The default {product-title} `must-gather` image
 <2> The must-gather image for {VirtProductName}
 
 endif::openshift-origin[]
 
 ifdef::openshift-origin[]
 
-. Run the `oc adm must-gather` command with one or more `--image` or `--image-stream`
-arguments. For example, the following command gathers both the default cluster
-data and information specific to KubeVirt:
+. Run the `oc adm must-gather` command with one or more `--image` or `--image-stream` arguments. For example, the following command gathers both the default cluster data and information specific to KubeVirt:
 +
 [source,terminal]
 ----
@@ -120,13 +113,12 @@ $ oc adm must-gather \
   --image-stream=openshift/must-gather \ <1>
   --image=quay.io/kubevirt/must-gather <2>
 ----
-<1> The default {product-title} must-gather image
+<1> The default {product-title} `must-gather` image
 <2> The must-gather image for KubeVirt
 endif::openshift-origin[]
 
-. Create a compressed file from the `must-gather` directory that was just created
-in your working directory. For example, on a computer that uses a Linux
+. Create a compressed file from the `must-gather` directory that was just created in your working directory. For example, on a computer that uses a Linux
 operating system, run the following command:
 +
 [source,terminal]
 ----
@@ -136,8 +128,7 @@ $ tar cvaf must-gather.tar.gz must-gather.local.5421342344627712289/ <1>
 ----
 <1> Make sure to replace `must-gather-local.5421342344627712289/` with the actual directory name.
 
-. Attach the compressed file to your support case on the
-link:https://access.redhat.com[Red Hat Customer Portal].
+. Attach the compressed file to your support case on the link:https://access.redhat.com[Red Hat Customer Portal].
 
 ifeval::["{context}" == "gathering-cluster-data"]
 :!from-main-support-section:
diff --git a/modules/insights-operator-showing-data-collected-from-the-cluster.adoc b/modules/insights-operator-showing-data-collected-from-the-cluster.adoc
index eb3ebee1b0..2566acbc38 100644
--- a/modules/insights-operator-showing-data-collected-from-the-cluster.adoc
+++ b/modules/insights-operator-showing-data-collected-from-the-cluster.adoc
@@ -13,7 +13,7 @@ You can review the data that is collected by the Insights Operator.
 
 .Procedure
 
-. Find the name of the currently running Pod for the Insights Operator:
+. Find the name of the currently running pod for the Insights Operator:
 +
 [source,terminal]
 ----
diff --git a/modules/investigating-master-node-installation-issues.adoc b/modules/investigating-master-node-installation-issues.adoc
index 1e7c41894e..e1f7daafcb 100644
--- a/modules/investigating-master-node-installation-issues.adoc
+++ b/modules/investigating-master-node-installation-issues.adoc
@@ -5,7 +5,7 @@
 [id="investigating-master-node-installation-issues_{context}"]
 = Investigating master node installation issues
 
-If you experience master node installation issues, determine the master node, {product-title} software defined network (SDN), and network Operator status. Collect `kubelet.service`, `crio.service` journald unit logs, and master node container logs for visibility into master node agent, CRI-O container runtime, and Pod activity.
+If you experience master node installation issues, determine the master node, {product-title} software defined network (SDN), and network Operator status. Collect `kubelet.service`, `crio.service` journald unit logs, and master node container logs for visibility into master node agent, CRI-O container runtime, and pod activity.
 
 .Prerequisites
 
@@ -113,14 +113,14 @@ $ oc get network.config.openshift.io cluster -o yaml
 $ ./openshift-install create manifests
 ----
 +
-.. Review Pod status in the `openshift-network-operator` namespace to determine whether the network Operator is running:
+.. Review the pod status in the `openshift-network-operator` namespace to determine whether the Cluster Network Operator (CNO) is running:
 +
 [source,terminal]
 ----
 $ oc get pods -n openshift-network-operator
 ----
 +
-.. Gather network Operator Pod logs from the `openshift-network-operator` namespace:
+.. Gather network Operator pod logs from the `openshift-network-operator` namespace:
 +
 [source,terminal]
 ----
diff --git a/modules/investigating-worker-node-installation-issues.adoc b/modules/investigating-worker-node-installation-issues.adoc
index cd8e1090ab..2d64b2b10b 100644
--- a/modules/investigating-worker-node-installation-issues.adoc
+++ b/modules/investigating-worker-node-installation-issues.adoc
@@ -5,7 +5,7 @@
 [id="investigating-worker-node-installation-issues_{context}"]
 = Investigating worker node installation issues
 
-If you experience worker node installation issues, you can review the worker node status. Collect `kubelet.service`, `crio.service` journald unit logs and the worker node container logs for visibility into the worker node agent, CRI-O container runtime and Pod activity. Additionally, you can check the Ignition file and Machine API Operator functionality. If worker node post-installation configuration fails, check Machine Config Operator (MCO) and DNS functionality. You can also verify system clock synchronization between the bootstrap, master, and worker nodes, and validate certificates.
+If you experience worker node installation issues, you can review the worker node status. Collect `kubelet.service`, `crio.service` journald unit logs and the worker node container logs for visibility into the worker node agent, CRI-O container runtime and pod activity. Additionally, you can check the Ignition file and Machine API Operator functionality. If worker node post-installation configuration fails, check Machine Config Operator (MCO) and DNS functionality. You can also verify system clock synchronization between the bootstrap, master, and worker nodes, and validate certificates.
 
 .Prerequisites
 
@@ -76,28 +76,28 @@ It is not possible to run `oc` commands if an installation issue prevents the {p
 ====
 +
 . Unlike master nodes, worker nodes are deployed and scaled using the Machine API Operator. Check the status of the Machine API Operator.
-.. Review Machine API Operator Pod status:
+.. Review Machine API Operator pod status:
 +
 [source,terminal]
 ----
 $ oc get pods -n openshift-machine-api
 ----
 +
-.. If the Machine API Operator Pod does not have a `Ready` status, detail the Pod's events:
+.. If the Machine API Operator pod does not have a `Ready` status, detail the pod's events:
 +
 [source,terminal]
 ----
 $ oc describe pod/<machine_api_operator_pod_name> -n openshift-machine-api
 ----
 +
-.. Inspect `machine-api-operator` container logs. The container runs within the `machine-api-operator` Pod:
+.. Inspect `machine-api-operator` container logs. The container runs within the `machine-api-operator` pod:
 +
 [source,terminal]
 ----
 $ oc logs pod/<machine_api_operator_pod_name> -n openshift-machine-api -c machine-api-operator
 ----
 +
-.. Also inspect `kube-rbac-proxy` container logs. The container also runs within the `machine-api-operator` Pod:
+.. Also inspect `kube-rbac-proxy` container logs. The container also runs within the `machine-api-operator` pod:
 +
 [source,terminal]
 ----
diff --git a/modules/monitoring-investigating-why-user-defined-metrics-are-unavailable.adoc b/modules/monitoring-investigating-why-user-defined-metrics-are-unavailable.adoc
index de895be50c..020d2405a6 100644
--- a/modules/monitoring-investigating-why-user-defined-metrics-are-unavailable.adoc
+++ b/modules/monitoring-investigating-why-user-defined-metrics-are-unavailable.adoc
@@ -6,7 +6,7 @@
 [id="investigating-why-user-defined-metrics-are-unavailable_{context}"]
 = Investigating why user-defined metrics are unavailable
 
-ServiceMonitor resources enable you to determine how to use the metrics exposed by a service in user-defined projects. Follow the steps outlined in this procedure if you have created a ServiceMonitor resource but cannot see any corresponding metrics in the Metrics UI.
+`ServiceMonitor` resources enable you to determine how to use the metrics exposed by a service in user-defined projects. Follow the steps outlined in this procedure if you have created a `ServiceMonitor` resource but cannot see any corresponding metrics in the Metrics UI.
 
 .Prerequisites
 
@@ -14,11 +14,11 @@ ServiceMonitor resources enable you to determine how to use the metrics exposed
 * You have installed the OpenShift CLI (`oc`).
 * You have enabled and configured monitoring for user-defined workloads.
 * You have created the `user-workload-monitoring-config` `ConfigMap` object.
-* You have created a ServiceMonitor resource.
+* You have created a `ServiceMonitor` resource.
 
 .Procedure
-. *Check that the corresponding labels match* in the service and ServiceMonitor configurations.
+. *Check that the corresponding labels match* in the service and `ServiceMonitor` configurations.
 .. Obtain the label defined in the service. The following example queries the `prometheus-example-app` service in the `ns1` project:
 +
 [source,terminal]
 ----
@@ -33,7 +33,7 @@ $ oc -n ns1 get service prometheus-example-app -o yaml
     app: prometheus-example-app
 ----
 +
-.. Check that the `matchLabels` `app` label in the ServiceMonitor matches the label output in the preceding step:
+.. Check that the `matchLabels` `app` label in the `ServiceMonitor` configuration matches the label output in the preceding step:
 +
 [source,terminal]
 ----
@@ -54,7 +54,7 @@ spec:
 +
 [NOTE]
 ====
-You can check service and ServiceMonitor labels as a developer with view permissions for the project.
+You can check service and `ServiceMonitor` labels as a developer with view permissions for the project.
 ====
 
 . *Inspect the logs for the Prometheus Operator* in the `openshift-user-workload-monitoring` project.
@@ -83,7 +83,7 @@ thanos-ruler-user-workload-1 3/3 Running 0 132m
 $ oc -n openshift-user-workload-monitoring logs prometheus-operator-776fcbbd56-2nbfm -c prometheus-operator
 ----
 +
-If there is a issue with the ServiceMonitor, the logs might include an error similar to this example:
+If there is an issue with the service monitor, the logs might include an error similar to this example:
 +
 [source,terminal]
 ----
@@ -101,7 +101,7 @@ $ oc port-forward -n openshift-user-workload-monitoring pod/prometheus-user-work
 .. Open http://localhost:9090/targets in a web browser and review the status of the target for your project directly in the Prometheus UI. Check for error messages relating to the target.
 
 . *Configure debug level logging for the Prometheus Operator* in the `openshift-user-workload-monitoring` project.
-.. Edit the `user-workload-monitoring-config` ConfigMap in the `openshift-user-workload-monitoring` project:
+.. Edit the `user-workload-monitoring-config` config map in the `openshift-user-workload-monitoring` project:
 +
 [source,terminal]
 ----
@@ -154,7 +154,7 @@ $ oc -n openshift-user-workload-monitoring get pods
 +
 [NOTE]
 ====
-If an unrecognized Prometheus Operator `loglevel` value is included in the ConfigMap, the `prometheus-operator` pod might not restart successfully.
+If an unrecognized Prometheus Operator `loglevel` value is included in the config map, the `prometheus-operator` pod might not restart successfully.
 ====
 +
-.. Review the debug logs to see if the Prometheus Operator is using the ServiceMonitor resource. Review the logs for other related errors.
+.. Review the debug logs to see if the Prometheus Operator is using the `ServiceMonitor` resource. Review the logs for other related errors.
diff --git a/modules/querying-operator-status-after-installation.adoc b/modules/querying-operator-status-after-installation.adoc
index 1881f4c3af..c884535e01 100644
--- a/modules/querying-operator-status-after-installation.adoc
+++ b/modules/querying-operator-status-after-installation.adoc
@@ -28,7 +28,7 @@ $ oc get clusteroperators
 $ oc describe clusteroperator <operator_name>
 ----
 
-. Review Operator Pod status within the Operator's namespace:
+. Review Operator pod status within the Operator's namespace:
 +
 [source,terminal]
 ----
@@ -42,15 +42,15 @@ $ oc get pods -n <operator_namespace>
 $ oc describe pod/<operator_pod_name> -n <operator_namespace>
 ----
 
-. Inspect Pod logs:
+. Inspect pod logs:
 +
 [source,terminal]
 ----
 $ oc logs pod/<operator_pod_name> -n <operator_namespace>
 ----
 
-. When experiencing Pod base image related issues, review base image status.
-.. Obtain details of the base image used by a problematic Pod:
+. When experiencing pod base image related issues, review base image status.
+.. Obtain details of the base image used by a problematic pod:
 +
 [source,terminal]
 ----
diff --git a/modules/storage-multi-attach-error.adoc b/modules/storage-multi-attach-error.adoc
index 06f7d0f9cc..3327a79bb7 100644
--- a/modules/storage-multi-attach-error.adoc
+++ b/modules/storage-multi-attach-error.adoc
@@ -5,7 +5,7 @@
 [id="storage-multi-attach-error_{context}"]
 = Resolving multi-attach errors
 
-When a node crashes or shuts down abruptly, the attached ReadWriteOnce (RWO) volume is expected to be unmounted from the node so that it can be used by a Pod scheduled on another node.
+When a node crashes or shuts down abruptly, the attached ReadWriteOnce (RWO) volume is expected to be unmounted from the node so that it can be used by a pod scheduled on another node.
 
 However, mounting on a new node is not possible because the failed node is unable to unmount the attached volume.
diff --git a/modules/support-collecting-network-trace.adoc b/modules/support-collecting-network-trace.adoc
index e9b79e5250..9323309f38 100644
--- a/modules/support-collecting-network-trace.adoc
+++ b/modules/support-collecting-network-trace.adoc
@@ -25,7 +25,7 @@ When investigating potential network-related {product-title} issues, Red Hat Sup
 $ oc get nodes
 ----
 
-. Enter into a debug session on the target node. This step instantiates a debug Pod called `<node_name>-debug`:
+. Enter into a debug session on the target node. This step instantiates a debug pod called `<node_name>-debug`:
 +
 [source,terminal]
 ----
@@ -60,7 +60,7 @@ $ oc debug node/my-cluster-node
 +
 [NOTE]
 ====
-If an existing `toolbox` Pod is already running, the `toolbox` command outputs `'toolbox-' already exists. Trying to start...`. To avoid `tcpdump` issues, remove the running toolbox container with `podman rm toolbox-` and spawn a new toolbox container.
+If an existing `toolbox` pod is already running, the `toolbox` command outputs `'toolbox-' already exists. Trying to start...`. To avoid `tcpdump` issues, remove the running toolbox container with `podman rm toolbox-` and spawn a new toolbox container.
 ====
 +
 . Initiate a `tcpdump` session on the cluster node and redirect output to a capture file. This example uses `ens5` as the interface name:
diff --git a/modules/support-gather-data.adoc b/modules/support-gather-data.adoc
index 1d6e1a53b3..dd2a70a608 100644
--- a/modules/support-gather-data.adoc
+++ b/modules/support-gather-data.adoc
@@ -5,8 +5,7 @@
 [id="support_gathering_data_{context}"]
 = Gathering data about your cluster for Red Hat Support
 
-You can gather debugging information about your cluster by using the
-`oc adm must-gather` CLI command.
+You can gather debugging information about your cluster by using the `oc adm must-gather` CLI command.
 
 .Prerequisites
 
@@ -26,32 +25,29 @@ $ oc adm must-gather
 +
 [NOTE]
 ====
-If this command fails, for example if you cannot schedule a Pod on your cluster, then use the `oc adm inspect` command to gather information for particular resources. Contact Red Hat Support for the recommended resources to gather.
+If this command fails, for example if you cannot schedule a pod on your cluster, then use the `oc adm inspect` command to gather information for particular resources. Contact Red Hat Support for the recommended resources to gather.
 ====
 +
 [NOTE]
 ====
-If your cluster is using a restricted network, you must take additional steps. If your mirror registry has a trusted CA, you must first add the trusted CA to the cluster. For all clusters on restricted networks, you must import the default `must-gather` image as an `ImageStream` before you use the `oc adm must-gather` command.
+If your cluster is using a restricted network, you must take additional steps. If your mirror registry has a trusted CA, you must first add the trusted CA to the cluster. For all clusters on restricted networks, you must import the default `must-gather` image as an image stream before you use the `oc adm must-gather` command.
 
 ----
 $ oc import-image is/must-gather -n openshift
 ----
 ====
 
-. Create a compressed file from the `must-gather` directory that was just created
-in your working directory. For example, on a computer that uses a Linux
+. Create a compressed file from the `must-gather` directory that was just created in your working directory. For example, on a computer that uses a Linux
 operating system, run the following command:
 +
 [source,terminal]
 ----
 $ tar cvaf must-gather.tar.gz must-gather.local.5421342344627712289/ <1>
 ----
-<1> Make sure to replace `must-gather-local.5421342344627712289/` with the
-actual directory name.
+<1> Make sure to replace `must-gather-local.5421342344627712289/` with the actual directory name.
 
 ifndef::openshift-origin[]
-. Attach the compressed file to your support case on the
-link:https://access.redhat.com[Red Hat Customer Portal].
+. Attach the compressed file to your support case on the link:https://access.redhat.com[Red Hat Customer Portal].
 endif::[]
 
 ifdef::openshift-origin[]
diff --git a/modules/support-generating-a-sosreport-archive.adoc b/modules/support-generating-a-sosreport-archive.adoc
index b7b670a151..90d4f21314 100644
--- a/modules/support-generating-a-sosreport-archive.adoc
+++ b/modules/support-generating-a-sosreport-archive.adoc
@@ -25,7 +25,7 @@ The recommended way to generate a `sosreport` for an {product-title} {product-ve
 $ oc get nodes
 ----
 
-. Enter into a debug session on the target node. This step instantiates a debug Pod called `<node_name>-debug`:
+. Enter into a debug session on the target node. This step instantiates a debug pod called `<node_name>-debug`:
 +
 [source,terminal]
 ----
@@ -53,7 +53,7 @@ $ oc debug node/my-cluster-node
 +
 [NOTE]
 ====
-If an existing `toolbox` Pod is already running, the `toolbox` command outputs `'toolbox-' already exists. Trying to start...`. Remove the running toolbox container with `podman rm toolbox-` and spawn a new toolbox container, to avoid issues with `sosreport` plugins.
+If an existing `toolbox` pod is already running, the `toolbox` command outputs `'toolbox-' already exists. Trying to start...`. Remove the running toolbox container with `podman rm toolbox-` and spawn a new toolbox container, to avoid issues with `sosreport` plug-ins.
 ====
 +
 . Collect a `sosreport` archive.
diff --git a/modules/support-providing-diagnostic-data-to-red-hat.adoc b/modules/support-providing-diagnostic-data-to-red-hat.adoc
index a2606640a3..9f203cebd5 100644
--- a/modules/support-providing-diagnostic-data-to-red-hat.adoc
+++ b/modules/support-providing-diagnostic-data-to-red-hat.adoc
@@ -44,7 +44,7 @@ $ oc debug node/my-cluster-node -- bash -c 'cat /host/var/tmp/my-diagnostic-data
 $ oc get nodes
 ----
 
-. Enter into a debug session on the target node. This step instantiates a debug Pod called `<node_name>-debug`:
+. Enter into a debug session on the target node. This step instantiates a debug pod called `<node_name>-debug`:
 +
 [source,terminal]
 ----
@@ -72,10 +72,10 @@ $ oc debug node/my-cluster-node
 +
 [NOTE]
 ====
-If an existing `toolbox` Pod is already running, the `toolbox` command outputs `'toolbox-' already exists. Trying to start...`. Remove the running toolbox container with `podman rm toolbox-` and spawn a new toolbox container, to avoid issues.
+If an existing `toolbox` pod is already running, the `toolbox` command outputs `'toolbox-' already exists. Trying to start...`. Remove the running toolbox container with `podman rm toolbox-` and spawn a new toolbox container, to avoid issues.
 ====
 +
-.. Run `redhat-support-tool` to attach a file from the debug Pod directly to an existing Red Hat Support case. This example uses support case ID '01234567' and example file path `/host/var/tmp/my-diagnostic-data.tar.gz`:
+.. Run `redhat-support-tool` to attach a file from the debug pod directly to an existing Red Hat Support case. This example uses support case ID '01234567' and example file path `/host/var/tmp/my-diagnostic-data.tar.gz`:
 +
 [source,terminal]
 ----
diff --git a/modules/troubleshooting-disabling-autoreboot-mco.adoc b/modules/troubleshooting-disabling-autoreboot-mco.adoc
index fb44ab088f..fe464ef5a1 100644
--- a/modules/troubleshooting-disabling-autoreboot-mco.adoc
+++ b/modules/troubleshooting-disabling-autoreboot-mco.adoc
@@ -23,7 +23,7 @@ Pausing a machine config pool stops all system reboot processes and all configur
 .Procedure
 
 . To pause the autoreboot process after machine config changes are applied:
-* As root, update the `spec.paused` field to `true` in the MachineConfigPool CustomResourceDefinition (CRD).
+* As root, update the `spec.paused` field to `true` in the `MachineConfigPool` custom resource.
 +
 .Control plane (master) nodes
 [source,terminal]
diff --git a/modules/understanding-pod-error-states.adoc b/modules/understanding-pod-error-states.adoc
index 1cea891fe9..adad1a9e0c 100644
--- a/modules/understanding-pod-error-states.adoc
+++ b/modules/understanding-pod-error-states.adoc
@@ -57,7 +57,7 @@ The following table provides a list of pod error states along with their descrip
 | Pod sandbox configuration was not obtained.
 
 | `ErrKillPodSandbox`
-| A Pod's sandbox did not stop successfully.
+| A pod sandbox did not stop successfully.
 
 | `ErrSetupNetwork`
 | Network initialization failed.
diff --git a/modules/upi-installation-considerations.adoc b/modules/upi-installation-considerations.adoc
index 5e7bc56a33..3696bf6071 100644
--- a/modules/upi-installation-considerations.adoc
+++ b/modules/upi-installation-considerations.adoc
@@ -20,7 +20,7 @@ You can alternatively install {product-title} {product-version} on infrastructur
 It is not possible to enable cloud provider integration in {product-title} environments that mix resources from different cloud providers, or that span multiple physical or virtual platforms. The node life cycle controller will not allow nodes that are external to the existing provider to be added to a cluster, and it is not possible to specify more than one cloud provider integration.
 ====
 
-* A provider-specific Machine API implementation is required if you want to use MachineSets or autoscaling to automatically provision {product-title} cluster nodes.
+* A provider-specific Machine API implementation is required if you want to use machine sets or autoscaling to automatically provision {product-title} cluster nodes.
 
 * Check whether your chosen cloud provider offers a method to inject Ignition configuration files into hosts as part of their initial deployment. If they do not, you will need to host Ignition configuration files by using an HTTP server. The steps taken to troubleshoot Ignition configuration file issues will differ depending on which of these two methods is deployed.
diff --git a/support/troubleshooting/investigating-monitoring-issues.adoc b/support/troubleshooting/investigating-monitoring-issues.adoc
index 2a4db6319a..3cfda1067b 100644
--- a/support/troubleshooting/investigating-monitoring-issues.adoc
+++ b/support/troubleshooting/investigating-monitoring-issues.adoc
@@ -15,12 +15,12 @@ include::modules/monitoring-investigating-why-user-defined-metrics-are-unavailab
 
 .Additional resources
 
-* xref:../../monitoring/configuring-the-monitoring-stack.adoc#creating-user-defined-workload-monitoring-configmap_configuring-the-monitoring-stack[Creating a user-defined workload monitoring ConfigMap]
-* See xref:../../monitoring/managing-metrics.adoc#specifying-how-a-service-is-monitored_managing-metrics[Specifying how a service is monitored] for details on how to create a ServiceMonitor or PodMonitor
+* xref:../../monitoring/configuring-the-monitoring-stack.adoc#creating-user-defined-workload-monitoring-configmap_configuring-the-monitoring-stack[Creating a user-defined workload monitoring config map]
+* See xref:../../monitoring/managing-metrics.adoc#specifying-how-a-service-is-monitored_managing-metrics[Specifying how a service is monitored] for details on how to create a service monitor or pod monitor
 
 // Determining why Prometheus is consuming a lot of disk space
 include::modules/monitoring-determining-why-prometheus-is-consuming-disk-space.adoc[leveloffset=+1]
 
 .Additional resources
 
-* See xref:../../monitoring/configuring-the-monitoring-stack.adoc#setting-a-scrape-sample-limit-for-user-defined-projects_configuring-the-monitoring-stack[Setting a scrape sample limit for user-defined projects] for details on how to set a scrape sample limit and create related alerting rules
\ No newline at end of file
+* See xref:../../monitoring/configuring-the-monitoring-stack.adoc#setting-a-scrape-sample-limit-for-user-defined-projects_configuring-the-monitoring-stack[Setting a scrape sample limit for user-defined projects] for details on how to set a scrape sample limit and create related alerting rules
diff --git a/support/troubleshooting/investigating-pod-issues.adoc b/support/troubleshooting/investigating-pod-issues.adoc
index e050bd7880..ed38beb6fe 100644
--- a/support/troubleshooting/investigating-pod-issues.adoc
+++ b/support/troubleshooting/investigating-pod-issues.adoc
@@ -1,30 +1,30 @@
 [id="investigating-pod-issues"]
-= Investigating Pod issues
+= Investigating pod issues
 include::modules/common-attributes.adoc[]
 :context: investigating-pod-issues
 
 toc::[]
 
-{product-title} leverages the Kubernetes concept of a Pod, which is one or more containers deployed together on one host. A Pod is the smallest compute unit that can be defined, deployed, and managed on {product-title} {product-version}.
+{product-title} leverages the Kubernetes concept of a pod, which is one or more containers deployed together on one host. A pod is the smallest compute unit that can be defined, deployed, and managed on {product-title} {product-version}.
 
-After a Pod is defined, it is assigned to run on a node until its containers exit, or until it is removed. Depending on policy and exit code, Pods are either removed after exiting or retained so that their logs can be accessed.
+After a pod is defined, it is assigned to run on a node until its containers exit, or until it is removed. Depending on policy and exit code, pods are either removed after exiting or retained so that their logs can be accessed.
 
-The first thing to check when Pod issues arise is the Pod's status. If an explicit Pod failure has occurred, observe the Pod's error state to identify specific image, container, or Pod network issues. Focus diagnostic data collection according to the error state. Review Pod event messages, as well as Pod and container log information. Diagnose issues dynamically by accessing running Pods on the command line, or start a debug Pod with root access based on a problematic Pod's deployment configuration.
+The first thing to check when pod issues arise is the pod's status. If an explicit pod failure has occurred, observe the pod's error state to identify specific image, container, or pod network issues. Focus diagnostic data collection according to the error state. Review pod event messages, as well as pod and container log information. Diagnose issues dynamically by accessing running pods on the command line, or start a debug pod with root access based on a problematic pod's deployment configuration.
 
-// Understanding Pod error states
+// Understanding pod error states
 include::modules/understanding-pod-error-states.adoc[leveloffset=+1]
 
-// Reviewing Pod status
+// Reviewing pod status
 include::modules/reviewing-pod-status.adoc[leveloffset=+1]
 
-// Inspecting Pod and container logs
+// Inspecting pod and container logs
 include::modules/inspecting-pod-and-container-logs.adoc[leveloffset=+1]
 
-// Accessing running Pods
+// Accessing running pods
 include::modules/accessing-running-pods.adoc[leveloffset=+1]
 
-// Starting debug Pods with root access
+// Starting debug pods with root access
 include::modules/starting-debug-pods-with-root-access.adoc[leveloffset=+1]
 
-// Copying files to and from Pods and containers
+// Copying files to and from pods and containers
 include::modules/copying-files-pods-and-containers.adoc[leveloffset=+1]
diff --git a/support/troubleshooting/troubleshooting-operator-issues.adoc b/support/troubleshooting/troubleshooting-operator-issues.adoc
index 335f10d4c6..a5fee9face 100644
--- a/support/troubleshooting/troubleshooting-operator-issues.adoc
+++ b/support/troubleshooting/troubleshooting-operator-issues.adoc
@@ -11,7 +11,7 @@ Operators are a method of packaging, deploying, and managing an {product-title}
 
 As a cluster administrator, you can install application Operators from the OperatorHub using the {product-title} web console or the CLI. You can then subscribe the Operator to one or more namespaces to make it available for developers on your cluster. Application Operators are managed by Operator Lifecycle Manager (OLM).
 
-If you experience Operator issues, verify Operator Subscription status. Check Operator Pod health across the cluster and gather Operator logs for diagnosis.
+If you experience Operator issues, verify Operator subscription status. Check Operator pod health across the cluster and gather Operator logs for diagnosis.
 
 // Operator Subscription condition types
 include::modules/olm-status-conditions.adoc[leveloffset=+1]