Mirror of https://github.com/openshift/openshift-docs.git (synced 2026-02-05 12:46:18 +01:00)
TELCODOCS-2171#Generalize Day2Ops for Telco Conflicts solved
Committed by: openshift-cherrypick-robot
Parent: 1e2c6e0357
Commit: 48061065ca
@@ -3613,23 +3613,23 @@ Topics:
 File: telco-update-completing-the-y-stream-update
 - Name: Completing the z-stream update
 File: telco-update-completing-the-z-stream-update
-- Name: Troubleshooting and maintaining telco core CNF clusters
+- Name: Troubleshooting and maintaining OpenShift Container Platform clusters
 Dir: troubleshooting
 Topics:
-- Name: Troubleshooting and maintaining telco core CNF clusters
-File: telco-troubleshooting-intro
+- Name: Troubleshooting and maintaining OpenShift Container Platform clusters
+File: troubleshooting-intro
 - Name: General troubleshooting
-File: telco-troubleshooting-general-troubleshooting
+File: troubleshooting-general-troubleshooting
 - Name: Cluster maintenance
-File: telco-troubleshooting-cluster-maintenance
+File: troubleshooting-cluster-maintenance
 - Name: Security
-File: telco-troubleshooting-security
+File: troubleshooting-security
 - Name: Certificate maintenance
-File: telco-troubleshooting-cert-maintenance
+File: troubleshooting-cert-maintenance
 - Name: Machine Config Operator
-File: telco-troubleshooting-mco
+File: troubleshooting-mco
 - Name: Bare-metal node maintenance
-File: telco-troubleshooting-bmn-maintenance
+File: troubleshooting-bmn-maintenance
 - Name: Observability
 Dir: observability
 Topics:
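The renames in this topic-map hunk follow a single pattern: the `telco-` prefix is dropped from each troubleshooting assembly file name. A minimal Python sketch of that mapping, assuming the old names are exactly the seven removed `File:` entries in this hunk; the helper name `generalize` is hypothetical and is not part of the repository tooling:

```python
# Hypothetical helper: derive the generalized file name used in this commit
# by dropping the leading "telco-" prefix from a troubleshooting assembly.
OLD_FILES = [
    "telco-troubleshooting-intro",
    "telco-troubleshooting-general-troubleshooting",
    "telco-troubleshooting-cluster-maintenance",
    "telco-troubleshooting-security",
    "telco-troubleshooting-cert-maintenance",
    "telco-troubleshooting-mco",
    "telco-troubleshooting-bmn-maintenance",
]


def generalize(name):
    """Return the name without the "telco-" prefix; other names pass through."""
    prefix = "telco-"
    return name[len(prefix):] if name.startswith(prefix) else name


# Old name -> new name, as an illustration of the rename pattern.
RENAMES = {old: generalize(old) for old in OLD_FILES}

if __name__ == "__main__":
    for old, new in RENAMES.items():
        print(f"{old} -> {new}")
```

The update assemblies (`telco-update-...`) keep their prefix in this hunk, so the list is deliberately limited to the troubleshooting files.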
@@ -11,7 +11,7 @@ You can use the following Day 2 operations to manage telco core CNF clusters.
 Updating a telco core CNF cluster:: Updating your cluster is a critical task that ensures that bugs and potential security vulnerabilities are patched.
 For more information, see xref:../day_2_core_cnf_clusters/updating/telco-update-welcome.adoc#telco-update-welcome[Updating a telco core CNF cluster].

-Troubleshooting and maintaining telco core CNF clusters:: To maintain and troubleshoot a bare-metal environment where high-bandwidth network throughput is required, see xref:../day_2_core_cnf_clusters/troubleshooting/telco-troubleshooting-intro.adoc#telco-troubleshooting-intro[Troubleshooting and maintaining telco core CNF clusters].
+Troubleshooting and maintaining telco core CNF clusters:: To maintain and troubleshoot a bare-metal environment where high-bandwidth network throughput is required, see xref:../day_2_core_cnf_clusters/troubleshooting/troubleshooting-intro.adoc#troubleshooting-intro[Troubleshooting and maintaining {product-title} clusters].

 Observability in telco core CNF clusters:: {product-title} generates a large amount of data, such as performance metrics and logs from the platform and the workloads running on it.
 As an administrator, you can use tools to collect and analyze the available data.
@@ -1,21 +0,0 @@
-:_mod-docs-content-type: ASSEMBLY
-[id="telco-troubleshooting-cluster-maintenance"]
-= Cluster maintenance
-include::_attributes/common-attributes.adoc[]
-:context: telco-troubleshooting-cluster-maintenance
-
-toc::[]
-
-In telco networks, you must pay more attention to certain configurations due the nature of bare-metal deployments.
-You can troubleshoot more effectively by completing these tasks:
-
-* Monitor for failed or failing hardware components
-* Periodically check the status of the cluster Operators
-
-[NOTE]
-====
-For hardware monitoring, contact your hardware vendor to find the appropriate logging tool for your specific hardware.
-====
-
-include::modules/telco-troubleshooting-clusters-check-cluster-operators.adoc[leveloffset=+1]
-include::modules/telco-troubleshooting-clusters-check-for-failed-pods.adoc[leveloffset=+1]
@@ -1,20 +0,0 @@
-:_mod-docs-content-type: ASSEMBLY
-[id="telco-troubleshooting-mco"]
-= Machine Config Operator
-include::_attributes/common-attributes.adoc[]
-:context: telco-troubleshooting-mco
-
-toc::[]
-
-The Machine Config Operator provides useful information to cluster administrators and controls what is running directly on the bare-metal host.
-
-The Machine Config Operator differentiates between different groups of nodes in the cluster, allowing control plane nodes and worker nodes to run with different configurations.
-These groups of nodes run worker or application pods, which are called `MachineConfigPool` (`mcp`) groups.
-The same machine config is applied on all nodes or only on one MCP in the cluster.
-
-For more information about how and why to apply MCPs in a telco core cluster, see xref:../../../edge_computing/day_2_core_cnf_clusters/updating/telco-update-ocp-update-prep.adoc#telco-update-applying-mcp-labels-to-nodes-before-the-update_ocp-update-prep[Applying MachineConfigPool labels to nodes before the update].
-
-For more information about the Machine Config Operator, see xref:../../../operators/operator-reference.adoc#machine-config-operator_operator-reference[Machine Config Operator].
-
-include::modules/telco-troubleshooting-mco-purpose.adoc[leveloffset=+1]
-include::modules/telco-troubleshooting-mco-apply-several-mcs.adoc[leveloffset=+1]
@@ -1,29 +1,29 @@
 :_mod-docs-content-type: ASSEMBLY
-[id="telco-troubleshooting-bmn-maintenance"]
+[id="troubleshooting-bmn-maintenance"]
 = Bare-metal node maintenance
 include::_attributes/common-attributes.adoc[]
-:context: telco-troubleshooting-bmn-maintenance
+:context: troubleshooting-bmn-maintenance

 toc::[]

 You can connect to a node for general troubleshooting.
 However, in some cases, you need to perform troubleshooting or maintenance tasks on certain hardware components.
-This section discusses topics that you need to perform that hardware maintenance.
+This section discusses topics that you need to perform for hardware maintenance.

-include::modules/telco-troubleshooting-bmn-connect-to-node.adoc[leveloffset=+1]
-include::modules/telco-troubleshooting-bmn-move-apps-to-pods.adoc[leveloffset=+1]
+include::modules/troubleshooting-bmn-connect-to-node.adoc[leveloffset=+1]
+include::modules/troubleshooting-bmn-move-apps-to-pods.adoc[leveloffset=+1]

 [role="_additional-resources"]
 .Additional resources

 * xref:../../../nodes/nodes/nodes-nodes-working.adoc#nodes-nodes-working_nodes-nodes-working[Working with nodes]

-include::modules/telco-troubleshooting-bmn-replace-dimm.adoc[leveloffset=+1]
+include::modules/troubleshooting-bmn-replace-dimm.adoc[leveloffset=+1]
+include::modules/troubleshooting-bmn-replace-disk.adoc[leveloffset=+1]

 [role="_additional-resources"]
 .Additional resources

 * xref:../../../storage/index.adoc#storage-overview_storage-overview[{product-title} storage overview]

-include::modules/telco-troubleshooting-bmn-replace-disk.adoc[leveloffset=+1]
-include::modules/telco-troubleshooting-bmn-replace-nw-card.adoc[leveloffset=+1]
+include::modules/troubleshooting-bmn-replace-nw-card.adoc[leveloffset=+1]
@@ -1,8 +1,8 @@
 :_mod-docs-content-type: ASSEMBLY
-[id="telco-troubleshooting-cert-maintenance"]
+[id="troubleshooting-cert-maintenance"]
 = Certificate maintenance
 include::_attributes/common-attributes.adoc[]
-:context: telco-troubleshooting-cert-maintenance
+:context: troubleshooting-cert-maintenance

 toc::[]
@@ -14,22 +14,22 @@ Learn about certificates in {product-title} and how to maintain them by using th
 * link:https://access.redhat.com/solutions/5018231[Which OpenShift certificates do rotate automatically and which do not in Openshift 4.x?]
 * link:https://access.redhat.com/solutions/7000968[Checking etcd certificate expiry in OpenShift 4]

-include::modules/telco-troubleshooting-certs-manual.adoc[leveloffset=+1]
-include::modules/telco-troubleshooting-certs-manual-proxy.adoc[leveloffset=+2]
+include::modules/troubleshooting-certs-manual.adoc[leveloffset=+1]
+include::modules/troubleshooting-certs-manual-proxy.adoc[leveloffset=+2]

 [role="_additional-resources"]
 .Additional resources

 * xref:../../../security/certificate_types_descriptions/proxy-certificates.adoc#cert-types-proxy-certificates[Proxy certificates]

-include::modules/telco-troubleshooting-certs-manual-user-provisioned.adoc[leveloffset=+2]
+include::modules/troubleshooting-certs-manual-user-provisioned.adoc[leveloffset=+2]

 [role="_additional-resources"]
 .Additional resources

 * xref:../../../security/certificate_types_descriptions/user-provided-certificates-for-api-server.adoc#cert-types-user-provided-certificates-for-the-api-server[User-provisioned certificates for the API server]

-include::modules/telco-troubleshooting-certs-auto.adoc[leveloffset=+1]
+include::modules/troubleshooting-certs-auto.adoc[leveloffset=+1]

 [role="_additional-resources"]
 .Additional resources
@@ -44,21 +44,21 @@ include::modules/telco-troubleshooting-certs-auto.adoc[leveloffset=+1]
 * xref:../../../security/certificate_types_descriptions/control-plane-certificates.adoc#cert-types-control-plane-certificates_cert-types-control-plane-certificates[Control plane certificates]
 * xref:../../../security/certificate_types_descriptions/ingress-certificates.adoc#cert-types-ingress-certificates_cert-types-ingress-certificates[Ingress certificates]

-include::modules/telco-troubleshooting-certs-auto-etcd.adoc[leveloffset=+2]
+include::modules/troubleshooting-certs-auto-etcd.adoc[leveloffset=+2]

 [role="_additional-resources"]
 .Additional resources

 * xref:../../../security/certificate_types_descriptions/etcd-certificates.adoc#cert-types-etcd-certificates_cert-types-etcd-certificates[etcd certificates]

-include::modules/telco-troubleshooting-certs-auto-node.adoc[leveloffset=+2]
+include::modules/troubleshooting-certs-auto-node.adoc[leveloffset=+2]

 [role="_additional-resources"]
 .Additional resources

 * xref:../../../security/certificate_types_descriptions/node-certificates.adoc#cert-types-node-certificates_cert-types-node-certificates[Node certificates]

-include::modules/telco-troubleshooting-certs-auto-service-ca.adoc[leveloffset=+2]
+include::modules/troubleshooting-certs-auto-service-ca.adoc[leveloffset=+2]

 [role="_additional-resources"]
 .Additional resources
@@ -0,0 +1,21 @@
+:_mod-docs-content-type: ASSEMBLY
+[id="troubleshooting-cluster-maintenance"]
+= Cluster maintenance
+include::_attributes/common-attributes.adoc[]
+:context: troubleshooting-cluster-maintenance
+
+toc::[]
+
+When deploying {product-title} on bare-metal infrastructure, you must pay more attention to certain configurations which can have a significant impact on cluster stability.
+You can troubleshoot more effectively by completing these tasks:
+
+* Monitor for failed or failing hardware components
+* Periodically check the status of the cluster Operators
+
+[NOTE]
+====
+For hardware monitoring, contact your hardware vendor to find the appropriate logging tool for your specific hardware.
+====
+
+include::modules/troubleshooting-clusters-check-cluster-operators.adoc[leveloffset=+1]
+include::modules/troubleshooting-clusters-check-for-failed-pods.adoc[leveloffset=+1]
@@ -1,20 +1,20 @@
 :_mod-docs-content-type: ASSEMBLY
-[id="telco-troubleshooting-general-troubleshooting"]
+[id="troubleshooting-general-troubleshooting"]
 = General troubleshooting
 include::_attributes/common-attributes.adoc[]
-:context: telco-troubleshooting-general-troubleshooting
+:context: troubleshooting-general-troubleshooting

 toc::[]

 When you encounter a problem, the first step is to find the specific area where the issue is happening.
-To narrow down the potential problematic areas, complete one or more tasks:
+To narrow down the potential problematic areas, complete one or more of the following tasks:

 * Query your cluster
 * Check your pod logs
 * Debug a pod
 * Review events

-include::modules/telco-troubleshooting-general-query-cluster.adoc[leveloffset=+1]
+include::modules/troubleshooting-general-query-cluster.adoc[leveloffset=+1]

 [role="_additional-resources"]
 .Additional resources
@@ -22,7 +22,7 @@ include::modules/telco-troubleshooting-general-query-cluster.adoc[leveloffset=+1
 * xref:../../../cli_reference/openshift_cli/developer-cli-commands.adoc#oc-get[oc get]
 * xref:../../../support/troubleshooting/investigating-pod-issues.adoc#reviewing-pod-status_investigating-pod-issues[Reviewing pod status]

-include::modules/telco-troubleshooting-general-check-logs.adoc[leveloffset=+1]
+include::modules/troubleshooting-general-check-logs.adoc[leveloffset=+1]

 [role="_additional-resources"]
 .Additional resources
@@ -32,21 +32,21 @@ include::modules/telco-troubleshooting-general-check-logs.adoc[leveloffset=+1]
 * xref:../../../support/troubleshooting/investigating-pod-issues.adoc#inspecting-pod-and-container-logs_investigating-pod-issues[Inspecting pod and container logs]


-include::modules/telco-troubleshooting-general-describe-pod.adoc[leveloffset=+1]
+include::modules/troubleshooting-general-describe-pod.adoc[leveloffset=+1]

 [role="_additional-resources"]
 .Additional resources

 * xref:../../../cli_reference/openshift_cli/developer-cli-commands.adoc#oc-describe[oc describe]

-include::modules/telco-troubleshooting-general-review-events.adoc[leveloffset=+1]
+include::modules/troubleshooting-general-review-events.adoc[leveloffset=+1]

 [role="_additional-resources"]
 .Additional resources

 * xref:../../../security/container_security/security-monitoring.adoc#security-monitoring-events_security-monitoring[Watching cluster events]

-include::modules/telco-troubleshooting-general-connect-to-pod.adoc[leveloffset=+1]
+include::modules/troubleshooting-general-connect-to-pod.adoc[leveloffset=+1]

 [role="_additional-resources"]
 .Additional resources
@@ -54,7 +54,7 @@ include::modules/telco-troubleshooting-general-connect-to-pod.adoc[leveloffset=+
 * xref:../../../cli_reference/openshift_cli/developer-cli-commands.adoc#oc-rsh[oc rsh]
 * xref:../../../support/troubleshooting/investigating-pod-issues.adoc#accessing-running-pods_investigating-pod-issues[Accessing running pods]

-include::modules/telco-troubleshooting-general-debug-pod.adoc[leveloffset=+1]
+include::modules/troubleshooting-general-debug-pod.adoc[leveloffset=+1]

 [role="_additional-resources"]
 .Additional resources
@@ -62,7 +62,7 @@ include::modules/telco-troubleshooting-general-debug-pod.adoc[leveloffset=+1]
 * xref:../../../cli_reference/openshift_cli/developer-cli-commands.adoc#oc-debug[oc debug]
 * xref:../../../support/troubleshooting/investigating-pod-issues.adoc#starting-debug-pods-with-root-access_investigating-pod-issues[Starting debug pods with root access]

-include::modules/telco-troubleshooting-general-run-command-on-pod.adoc[leveloffset=+1]
+include::modules/troubleshooting-general-run-command-on-pod.adoc[leveloffset=+1]

 [role="_additional-resources"]
 .Additional resources
@@ -1,24 +1,23 @@
 :_mod-docs-content-type: ASSEMBLY
-[id="telco-troubleshooting-intro"]
-= Troubleshooting and maintaining telco core CNF clusters
+[id="troubleshooting-intro"]
+= Troubleshooting and maintaining {product-title} clusters
 include::_attributes/common-attributes.adoc[]
-:context: telco-troubleshooting-intro
+:context: troubleshooting-intro

 toc::[]

 Troubleshooting and maintenance are weekly tasks that can be a challenge if you do not have the tools to reach your goal, whether you want to update a component or investigate an issue.
 Part of the challenge is knowing where and how to search for tools and answers.

-To maintain and troubleshoot a bare-metal environment where high-bandwidth network throughput is required, see the following procedures.
+To maintain and troubleshoot a bare-metal environment with high performance requirements, see the following procedures.

 [IMPORTANT]
 ====
-This troubleshooting information is not a reference for configuring {product-title} or developing Cloud-native Network Function (CNF) applications.
+This troubleshooting information is not a reference for configuring {product-title} or developing cloud-native applications.

-For information about developing CNF applications for telco, see link:https://redhat-best-practices-for-k8s.github.io/guide/[Red Hat Best Practices for Kubernetes].
+For information about developing cloud-native applications on {product-title}, see link:https://redhat-best-practices-for-k8s.github.io/guide/[Red Hat Best Practices for Kubernetes].
 ====

-include::modules/telco-troubleshooting-cnfs.adoc[leveloffset=+1]
 include::modules/support-getting-support.adoc[leveloffset=+1]
 include::modules/support-knowledgebase-about.adoc[leveloffset=+2]
 include::modules/support-knowledgebase-search.adoc[leveloffset=+2]
@@ -0,0 +1,18 @@
+:_mod-docs-content-type: ASSEMBLY
+[id="troubleshooting-mco"]
+= Machine Config Operator
+include::_attributes/common-attributes.adoc[]
+:context: troubleshooting-mco
+
+toc::[]
+
+The Machine Config Operator provides useful information to cluster administrators and controls what is running directly on the bare-metal host.
+
+The Machine Config Operator differentiates between groups of nodes in the cluster, allowing control plane nodes and worker nodes to run with different configurations.
+These groups of nodes run worker or application pods, which are called `MachineConfigPool` (`mcp`) groups.
+The same machine config is applied to all nodes or only to one MCP in the cluster.
+
+For more information about the Machine Config Operator, see xref:../../../operators/operator-reference.adoc#machine-config-operator_cluster-operators-ref[Machine Config Operator].
+
+include::modules/troubleshooting-mco-purpose.adoc[leveloffset=+1]
+include::modules/troubleshooting-mco-apply-several-mcs.adoc[leveloffset=+1]
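The Machine Config Operator xref in the new assembly uses a different anchor (`machine-config-operator_cluster-operators-ref`) than the deleted telco assembly (`machine-config-operator_operator-reference`). A small Python sketch for pulling the target file and anchor out of an `xref:` macro so that such changes can be compared side by side. The regular expression is an assumption that covers only the simple `xref:path#anchor[label]` form seen in this diff, and `parse_xrefs` is a hypothetical helper name:

```python
import re

# Assumed pattern: matches only the simple form xref:<path>#<anchor>[<label>].
XREF_RE = re.compile(
    r"xref:(?P<path>[^#\[\s]+)#(?P<anchor>[^\[\s]+)\[(?P<label>[^\]]*)\]"
)


def parse_xrefs(text):
    """Return a (path, anchor, label) tuple for every simple xref in text."""
    return [
        (m.group("path"), m.group("anchor"), m.group("label"))
        for m in XREF_RE.finditer(text)
    ]


# The two xref lines as they appear in the deleted and the added assembly.
OLD = ("see xref:../../../operators/operator-reference.adoc"
       "#machine-config-operator_operator-reference[Machine Config Operator].")
NEW = ("see xref:../../../operators/operator-reference.adoc"
       "#machine-config-operator_cluster-operators-ref[Machine Config Operator].")

old_path, old_anchor, _ = parse_xrefs(OLD)[0]
new_path, new_anchor, _ = parse_xrefs(NEW)[0]

if __name__ == "__main__":
    print("same file:", old_path == new_path)
    print("anchor changed:", old_anchor, "->", new_anchor)
```

Whether the new anchor resolves depends on the IDs defined in `operator-reference.adoc`, which this diff does not show.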
@@ -1,14 +1,14 @@
 :_mod-docs-content-type: ASSEMBLY
-[id="telco-troubleshooting-security"]
+[id="troubleshooting-security"]
 = Security
 include::_attributes/common-attributes.adoc[]
-:context: telco-troubleshooting-security
+:context: troubleshooting-security

 toc::[]

-Implementing a robust cluster security profile is important for building resilient telco networks.
+Implementing a robust cluster security profile is important for building resilient environments.

-include::modules/telco-troubleshooting-security-authentication.adoc[leveloffset=+1]
+include::modules/troubleshooting-security-authentication.adoc[leveloffset=+1]

 [role="_additional-resources"]
 .Additional resources
@@ -1,11 +0,0 @@
-// Module included in the following assemblies:
-//
-// * edge_computing/day_2_core_cnf_clusters/troubleshooting/telco-troubleshooting-intro.adoc
-
-:_mod-docs-content-type: CONCEPT
-[id="telco-troubleshooting-cnfs_{context}"]
-= Cloud-native Network Functions
-
-If you are starting to use {product-title} for telecommunications Cloud-native Network Function (CNF) applications, learning about CNFs can help you understand the issues that you might encounter.
-
-To learn more about CNFs and their evolution, see link:https://www.redhat.com/en/topics/cloud-native-apps/vnf-and-cnf-whats-the-difference[VNF and CNF, what’s the difference?].
@@ -1,9 +1,9 @@
 // Module included in the following assemblies:
 //
-// * edge_computing/day_2_core_cnf_clusters/troubleshooting/telco-troubleshooting-bmn-maintenance.adoc
+// * edge_computing/day_2_core_cnf_clusters/troubleshooting/troubleshooting-bmn-maintenance.adoc

 :_mod-docs-content-type: PROCEDURE
-[id="telco-troubleshooting-bmn-connect-to-node_{context}"]
+[id="troubleshooting-bmn-connect-to-node_{context}"]
 = Connecting to a bare-metal node in your cluster

 You can connect to bare-metal cluster nodes for general maintenance tasks.
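Module IDs in these files end in `_{context}`, which is replaced with the `:context:` value set by the including assembly. Changing an assembly's context therefore also changes the anchor of every module it includes. A rough Python sketch of that substitution, assuming plain `{context}` replacement; `resolve_id` is a hypothetical helper for illustration only:

```python
def resolve_id(module_id, context):
    """Substitute the assembly's :context: value into a module ID template."""
    return module_id.replace("{context}", context)


# Anchor before this commit: old module ID with the old assembly context.
old_anchor = resolve_id(
    "telco-troubleshooting-bmn-connect-to-node_{context}",
    "telco-troubleshooting-bmn-maintenance",
)

# Anchor after this commit: new module ID with the new assembly context.
new_anchor = resolve_id(
    "troubleshooting-bmn-connect-to-node_{context}",
    "troubleshooting-bmn-maintenance",
)

if __name__ == "__main__":
    print(old_anchor)
    print(new_anchor)
```

Any xref elsewhere in the documentation that still points at the old anchor would need to be updated to the new one.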
@@ -15,9 +15,9 @@ Configuring the cluster node from the host operating system is not recommended o

 To troubleshoot your nodes, you can do the following tasks:

-* Retrieve logs from node
+* Retrieve logs from a node
 * Use debugging
-* Use SSH to connect to the node
+* Use SSH to connect to a node

 [IMPORTANT]
 ====
@@ -1,9 +1,9 @@
 // Module included in the following assemblies:
 //
-// * edge_computing/day_2_core_cnf_clusters/troubleshooting/telco-troubleshooting-bmn-maintenance.adoc
+// * edge_computing/day_2_core_cnf_clusters/troubleshooting/troubleshooting-bmn-maintenance.adoc

 :_mod-docs-content-type: PROCEDURE
-[id="telco-troubleshooting-bmn-move-apps-to-pods_{context}"]
+[id="troubleshooting-bmn-move-apps-to-pods_{context}"]
 = Moving applications to pods within the cluster

 For scheduled hardware maintenance, you need to consider how to move your application pods to other nodes within the cluster without affecting the pod workload.
@@ -1,9 +1,9 @@
 // Module included in the following assemblies:
 //
-// * edge_computing/day_2_core_cnf_clusters/troubleshooting/telco-troubleshooting-bmn-maintenance.adoc
+// * edge_computing/day_2_core_cnf_clusters/troubleshooting/troubleshooting-bmn-maintenance.adoc

 :_mod-docs-content-type: CONCEPT
-[id="telco-troubleshooting-bmn-replace-dimm_{context}"]
+[id="troubleshooting-bmn-replace-dimm_{context}"]
 = DIMM memory replacement

 Dual in-line memory module (DIMM) problems sometimes only appear after a server reboots.
@@ -1,9 +1,9 @@
 // Module included in the following assemblies:
 //
-// * edge_computing/day_2_core_cnf_clusters/troubleshooting/telco-troubleshooting-bmn-maintenance.adoc
+// * edge_computing/day_2_core_cnf_clusters/troubleshooting/troubleshooting-bmn-maintenance.adoc

 :_mod-docs-content-type: CONCEPT
-[id="telco-troubleshooting-bmn-replace-disk_{context}"]
+[id="troubleshooting-bmn-replace-disk_{context}"]
 = Disk replacement

 If you do not have disk redundancy configured on your node through hardware or software redundant array of independent disks (RAID), you need to check the following:
@@ -1,13 +1,13 @@
 // Module included in the following assemblies:
 //
-// * edge_computing/day_2_core_cnf_clusters/troubleshooting/telco-troubleshooting-bmn-maintenance.adoc
+// * edge_computing/day_2_core_cnf_clusters/troubleshooting/troubleshooting-bmn-maintenance.adoc

 :_mod-docs-content-type: CONCEPT
-[id="telco-troubleshooting-bmn-replace-nw-card_{context}"]
+[id="troubleshooting-bmn-replace-nw-card_{context}"]
 = Cluster network card replacement

 When you replace a network card, the MAC address changes.
-The MAC address can be part of the DHCP or SR-IOV Operator configuration, router configuration, firewall rules, or application Cloud-native Network Function (CNF) configuration.
+The MAC address can be part of the DHCP or SR-IOV Operator configuration, router configuration, firewall rules, or cloud-native application configuration.
 Before you bring back a node online after replacing a network card, you must verify that these configurations are up-to-date.

 [IMPORTANT]
@@ -1,9 +1,9 @@
 // Module included in the following assemblies:
 //
-// * edge_computing/day_2_core_cnf_clusters/troubleshooting/telco-troubleshooting-cert-maintenance.adoc
+// * edge_computing/day_2_core_cnf_clusters/troubleshooting/troubleshooting-cert-maintenance.adoc

 :_mod-docs-content-type: CONCEPT
-[id="telco-troubleshooting-certs-auto-etcd_{context}"]
+[id="troubleshooting-certs-auto-etcd_{context}"]
 = Certificates managed by etcd

 The etcd certificates are used for encrypted communication between etcd member peers as well as encrypted client traffic.
@@ -1,9 +1,9 @@
 // Module included in the following assemblies:
 //
-// * edge_computing/day_2_core_cnf_clusters/troubleshooting/telco-troubleshooting-cert-maintenance.adoc
+// * edge_computing/day_2_core_cnf_clusters/troubleshooting/troubleshooting-cert-maintenance.adoc

 :_mod-docs-content-type: CONCEPT
-[id="telco-troubleshooting-certs-auto-node_{context}"]
+[id="troubleshooting-certs-auto-node_{context}"]
 = Node certificates

 Node certificates are self-signed certificates, which means that they are signed by the cluster and they originate from an internal certificate authority (CA) that is generated by the bootstrap process.
@@ -1,9 +1,9 @@
 // Module included in the following assemblies:
 //
-// * edge_computing/day_2_core_cnf_clusters/troubleshooting/telco-troubleshooting-cert-maintenance.adoc
+// * edge_computing/day_2_core_cnf_clusters/troubleshooting/troubleshooting-cert-maintenance.adoc

 :_mod-docs-content-type: CONCEPT
-[id="telco-troubleshooting-certs-auto-service-ca_{context}"]
+[id="troubleshooting-certs-auto-service-ca_{context}"]
 = Service CA certificates

 The `service-ca` is an Operator that creates a self-signed certificate authority (CA) when an {product-title} cluster is deployed.
@@ -1,9 +1,9 @@
 // Module included in the following assemblies:
 //
-// * edge_computing/day_2_core_cnf_clusters/troubleshooting/telco-troubleshooting-cert-maintenance.adoc
+// * edge_computing/day_2_core_cnf_clusters/troubleshooting/troubleshooting-cert-maintenance.adoc

 :_mod-docs-content-type: CONCEPT
-[id="telco-troubleshooting-certs-auto_{context}"]
+[id="troubleshooting-certs-auto_{context}"]
 = Certificates managed by the cluster

 You only need to check cluster-managed certificates if you detect an issue in the logs.
@@ -1,9 +1,9 @@
 // Module included in the following assemblies:
 //
-// * edge_computing/day_2_core_cnf_clusters/troubleshooting/telco-troubleshooting-cert-maintenance.adoc
+// * edge_computing/day_2_core_cnf_clusters/troubleshooting/troubleshooting-cert-maintenance.adoc

 :_mod-docs-content-type: PROCEDURE
-[id="telco-troubleshooting-certs-manual-proxy_{context}"]
+[id="troubleshooting-certs-manual-proxy_{context}"]
 = Managing proxy certificates

 Proxy certificates allow users to specify one or more custom certificate authority (CA) certificates that are used by platform components when making egress connections.
@@ -1,9 +1,9 @@
 // Module included in the following assemblies:
 //
-// * edge_computing/day_2_core_cnf_clusters/troubleshooting/telco-troubleshooting-cert-maintenance.adoc
+// * edge_computing/day_2_core_cnf_clusters/troubleshooting/troubleshooting-cert-maintenance.adoc

 :_mod-docs-content-type: CONCEPT
-[id="telco-troubleshooting-certs-manual-user-provisioned_{context}"]
+[id="troubleshooting-certs-manual-user-provisioned_{context}"]
 = User-provisioned API server certificates

 The API server is accessible by clients that are external to the cluster at `api.<cluster_name>.<base_domain>`.
@@ -1,9 +1,9 @@
 // Module included in the following assemblies:
 //
-// * edge_computing/day_2_core_cnf_clusters/troubleshooting/telco-troubleshooting-cert-maintenance.adoc
+// * edge_computing/day_2_core_cnf_clusters/troubleshooting/troubleshooting-cert-maintenance.adoc

 :_mod-docs-content-type: CONCEPT
-[id="telco-troubleshooting-certs-manual_{context}"]
+[id="troubleshooting-certs-manual_{context}"]
 = Certificates manually managed by the administrator

 The following certificates must be renewed by a cluster administrator:
@@ -1,9 +1,9 @@
 // Module included in the following assemblies:
 //
-// * edge_computing/day_2_core_cnf_clusters/troubleshooting/telco-troubleshooting-cluster-maintenance.adoc
+// * edge_computing/day_2_core_cnf_clusters/troubleshooting/troubleshooting-cluster-maintenance.adoc

 :_mod-docs-content-type: PROCEDURE
-[id="telco-troubleshooting-clusters-check-cluster-operators_{context}"]
+[id="troubleshooting-clusters-check-cluster-operators_{context}"]
 = Checking cluster Operators

 Periodically check the status of your cluster Operators to find issues early.
@@ -1,9 +1,9 @@
 // Module included in the following assemblies:
 //
-// * edge_computing/day_2_core_cnf_clusters/troubleshooting/telco-troubleshooting-cluster-maintenance.adoc
+// * edge_computing/day_2_core_cnf_clusters/troubleshooting/troubleshooting-cluster-maintenance.adoc

 :_mod-docs-content-type: PROCEDURE
-[id="telco-troubleshooting-clusters-check-for-failed-pods_{context}"]
+[id="troubleshooting-clusters-check-for-failed-pods_{context}"]
 = Watching for failed pods

 To reduce troubleshooting time, regularly monitor for failed pods in your cluster.
@@ -1,9 +1,9 @@
 // Module included in the following assemblies:
 //
-// * edge_computing/day_2_core_cnf_clusters/troubleshooting/telco-troubleshooting-general-troubleshooting.adoc
+// * edge_computing/day_2_core_cnf_clusters/troubleshooting/troubleshooting-general-troubleshooting.adoc

 :_mod-docs-content-type: PROCEDURE
-[id="telco-troubleshooting-general-check-logs_{context}"]
+[id="troubleshooting-general-check-logs_{context}"]
 = Checking pod logs

 Get logs from the pod so that you can review the logs for issues.
@@ -1,9 +1,9 @@
 // Module included in the following assemblies:
 //
-// * edge_computing/day_2_core_cnf_clusters/troubleshooting/telco-troubleshooting-general-troubleshooting.adoc
+// * edge_computing/day_2_core_cnf_clusters/troubleshooting/troubleshooting-general-troubleshooting.adoc

 :_mod-docs-content-type: PROCEDURE
-[id="telco-troubleshooting-general-connect-to-pod_{context}"]
+[id="troubleshooting-general-connect-to-pod_{context}"]
 = Connecting to a pod

 You can directly connect to a currently running pod with the `oc rsh` command, which provides you with a shell on that pod.
@@ -1,9 +1,9 @@
 // Module included in the following assemblies:
 //
-// * edge_computing/day_2_core_cnf_clusters/troubleshooting/telco-troubleshooting-general-troubleshooting.adoc
+// * edge_computing/day_2_core_cnf_clusters/troubleshooting/troubleshooting-general-troubleshooting.adoc
 
 :_mod-docs-content-type: PROCEDURE
-[id="telco-troubleshooting-general-debug-pod_{context}"]
+[id="troubleshooting-general-debug-pod_{context}"]
 = Debugging a pod
 
 In certain cases, you do not want to directly interact with your pod that is in production.
@@ -1,9 +1,9 @@
 // Module included in the following assemblies:
 //
-// * edge_computing/day_2_core_cnf_clusters/troubleshooting/telco-troubleshooting-general-troubleshooting.adoc
+// * edge_computing/day_2_core_cnf_clusters/troubleshooting/troubleshooting-general-troubleshooting.adoc
 
 :_mod-docs-content-type: PROCEDURE
-[id="telco-troubleshooting-general-describe-pod_{context}"]
+[id="troubleshooting-general-describe-pod_{context}"]
 = Describing a pod
 
 Describing a pod gives you information about that pod to help with troubleshooting.
@@ -1,9 +1,9 @@
 // Module included in the following assemblies:
 //
-// * edge_computing/day_2_core_cnf_clusters/troubleshooting/telco-troubleshooting-general-troubleshooting.adoc
+// * edge_computing/day_2_core_cnf_clusters/troubleshooting/troubleshooting-general-troubleshooting.adoc
 
 :_mod-docs-content-type: PROCEDURE
-[id="telco-troubleshooting-general-query-cluster_{context}"]
+[id="troubleshooting-general-query-cluster_{context}"]
 = Querying your cluster
 
 Get information about your cluster so that you can more accurately find potential problems.
@@ -1,9 +1,9 @@
 // Module included in the following assemblies:
 //
-// * edge_computing/day_2_core_cnf_clusters/troubleshooting/telco-troubleshooting-general-troubleshooting.adoc
+// * edge_computing/day_2_core_cnf_clusters/troubleshooting/troubleshooting-general-troubleshooting.adoc
 
 :_mod-docs-content-type: PROCEDURE
-[id="telco-troubleshooting-general-review-events_{context}"]
+[id="troubleshooting-general-review-events_{context}"]
 = Reviewing events
 
 You can review the events in a given namespace to find potential issues.
@@ -1,9 +1,9 @@
 // Module included in the following assemblies:
 //
-// * edge_computing/day_2_core_cnf_clusters/troubleshooting/telco-troubleshooting-general-troubleshooting.adoc
+// * edge_computing/day_2_core_cnf_clusters/troubleshooting/troubleshooting-general-troubleshooting.adoc
 
 :_mod-docs-content-type: PROCEDURE
-[id="telco-troubleshooting-run-command-on-pod_{context}"]
+[id="troubleshooting-run-command-on-pod_{context}"]
 = Running a command on a pod
 
 If you want to run a command or set of commands on a pod without directly logging into it, you can use the `oc exec -it` command.
@@ -1,9 +1,9 @@
 // Module included in the following assemblies:
 //
-// * edge_computing/day_2_core_cnf_clusters/troubleshooting/telco-troubleshooting-mco.adoc
+// * edge_computing/day_2_core_cnf_clusters/troubleshooting/troubleshooting-mco.adoc
 
 :_mod-docs-content-type: PROCEDURE
-[id="telco-troubleshooting-mco-apply-several-mcs_{context}"]
+[id="troubleshooting-mco-apply-several-mcs_{context}"]
 = Applying several machine config files at the same time
 
 When you need to change the machine config for a group of nodes in the cluster, also known as machine config pools (MCPs), sometimes the changes must be applied with several different machine config files.
@@ -1,9 +1,9 @@
 // Module included in the following assemblies:
 //
-// * edge_computing/day_2_core_cnf_clusters/troubleshooting/telco-troubleshooting-mco.adoc
+// * edge_computing/day_2_core_cnf_clusters/troubleshooting/troubleshooting-mco.adoc
 
 :_mod-docs-content-type: CONCEPT
-[id="telco-troubleshooting-mco-purpose_{context}"]
+[id="troubleshooting-mco-purpose_{context}"]
 = Purpose of the Machine Config Operator
 
 The Machine Config Operator (MCO) manages and applies configuration and updates of {op-system-first} and container runtime, including everything between the kernel and kubelet.
@@ -16,4 +16,4 @@ You must consider these minor components and how the MCO can help you manage you
 ====
 You must use the MCO to perform all changes on worker or control plane nodes.
 Do not manually make changes to {op-system} or node files.
-====
+====
@@ -1,8 +1,8 @@
 // Module included in the following assemblies:
 //
-// * edge_computing/day_2_core_cnf_clusters/troubleshooting/telco-troubleshooting-security.adoc
+// * edge_computing/day_2_core_cnf_clusters/troubleshooting/troubleshooting-security.adoc
 :_mod-docs-content-type: PROCEDURE
-[id="telco-troubleshooting-security-authentication_{context}"]
+[id="troubleshooting-security-authentication_{context}"]
 = Authentication
 
 Determine which identity providers are in your cluster.