OSDOCS-9597: update troubleshooting book MicroShift

2026-02-05 12:46:18 +01:00 · 2024-05-28 12:01:15 -04:00
parent f82d2a176c
commit 4539f71a42
12 changed files with 135 additions and 51 deletions
--- a/_topic_maps/_topic_map_ms.yml
+++ b/_topic_maps/_topic_map_ms.yml
@@ -497,15 +497,17 @@ Name: Troubleshooting
 Dir: microshift_troubleshooting
 Distros: microshift
 Topics:
- Name: Checking your version
+- Name: Check your version
  File: microshift-version
- Name: Troubleshooting backup and restore
-  File: microshift-troubleshoot-backup-restore
 - Name: Troubleshoot the cluster
  File: microshift-troubleshoot-cluster
+- Name: Troubleshoot backup and restore
+  File: microshift-troubleshoot-backup-restore
 - Name: Troubleshoot updates
  File: microshift-troubleshoot-updates
- Name: Checking audit logs
+- Name: Check the audit logs
  File: microshift-audit-logs
+- Name: Troubleshoot etcd
+  File: microshift-etcd-troubleshoot
 - Name: Additional information
  File: microshift-things-to-know
--- a/microshift_support/microshift-etcd.adoc
+++ b/microshift_support/microshift-etcd.adoc
@@ -6,11 +6,15 @@ include::_attributes/attributes-microshift.adoc[]

 toc::[]

-[role="_abstract"]
 The etcd service is delivered as part of the {product-title} RPM. The etcd service is run as a separate process and the etcd lifecycle is managed automatically by {microshift-short}.

 include::modules/microshift-observe-debug-etcd-server.adoc[leveloffset=+1]

-include::modules/microshift-config-etcd.adoc[leveloffset=+1]
-
 include::modules/microshift-etcd-version.adoc[leveloffset=+1]
+
+[id="microshift-troubleshooting-etcd_{context}"]
+== Troubleshooting etcd
+
+To troubleshoot etcd and improve performance, configure the memory allowance for the service.
+
+include::modules/microshift-config-etcd.adoc[leveloffset=+1]
--- a/microshift_support/microshift-getting-support.adoc
+++ b/microshift_support/microshift-getting-support.adoc
@@ -8,12 +8,15 @@ toc::[]

 Use the following information to get more help with {op-system-bundle}, including {product-title} or {op-system-ostree-first}.

+//OCP module
 include::modules/support.adoc[leveloffset=+1]

 include::modules/microshift-provide-feedback-jira-link.adoc[leveloffset=+1]

+//OCP module
 include::modules/support-knowledgebase-about.adoc[leveloffset=+1]

+//OCP module
 include::modules/support-knowledgebase-search.adoc[leveloffset=+1]

 include::modules/microshift-submitting-a-case.adoc[leveloffset=+1]
--- a/microshift_troubleshooting/microshift-etcd-troubleshoot.adoc
+++ b/microshift_troubleshooting/microshift-etcd-troubleshoot.adoc
@@ -0,0 +1,11 @@
+:_mod-docs-content-type: ASSEMBLY
+[id="microshift-etcd-troubleshoot"]
+= Troubleshoot etcd
+include::_attributes/attributes-microshift.adoc[]
+:context: microshift-etcd-troubleshoot
+
+toc::[]
+
+To troubleshoot etcd and improve performance, configure the memory allowance for the service.
+
+include::modules/microshift-config-etcd.adoc[leveloffset=+1]
--- a/microshift_troubleshooting/microshift-troubleshoot-cluster.adoc
+++ b/microshift_troubleshooting/microshift-troubleshoot-cluster.adoc
@@ -6,6 +6,6 @@ include::_attributes/attributes-microshift.adoc[]

 toc::[]

-To begin troubleshooting a {product-title} cluster, first access the cluster status.
+To begin troubleshooting a {microshift-short} cluster, first access the cluster status.

 include::modules/microshift-check-cluster-status.adoc[leveloffset=+1]
--- a/microshift_troubleshooting/microshift-troubleshoot-updates.adoc
+++ b/microshift_troubleshooting/microshift-troubleshoot-updates.adoc
@@ -8,10 +8,10 @@ toc::[]

 To troubleshoot {microshift-short} updates, use the following guide.

-[IMPORTANT]
-====
-You can only update {microshift-short} from one minor version to the next in sequence. For example, you must update 4.14 to 4.15.
-====
+//[IMPORTANT]
+//====
+//You can only update {microshift-short} from one minor version to the next in sequence. For example, you must update 4.14 to 4.15.
+//====

 include::modules/microshift-updates-troubleshooting.adoc[leveloffset=+1]

--- a/modules/microshift-check-cluster-status.adoc
+++ b/modules/microshift-check-cluster-status.adoc
@@ -6,33 +6,50 @@
 [id="microshift-check-cluster-status_{context}"]
 = Checking the status of a cluster

-You can check the status of a {microshift-short} cluster or see active pods by running a simple command. Given in the following procedure are three commands you can use to check cluster status. You can choose to run one, two, or all commands to help you retrieve the information you need to troubleshoot the cluster.
+You can check the status of a {microshift-short} cluster or see active pods. The following procedure provides three commands that you can use to check cluster status. Run one, two, or all of them to get the information that you need to troubleshoot the cluster.

 .Procedure
-* You can check the system status, which returns the cluster status, by running the following command:
+* Check the system status, which returns the cluster status, by running the following command:
 +
 [source,terminal]
 ----
 $ sudo systemctl status microshift
 ----
 +
-If {microshift-short} is failing to start, this command returns the logs from the previous run.
+If {microshift-short} fails to start, this command returns the logs from the previous run.
++
+.Example healthy output
+[source,text]
+----
+● microshift.service - MicroShift
+     Loaded: loaded (/usr/lib/systemd/system/microshift.service; enabled; preset: disabled)
+     Active: active (running) since <day> <date> 12:39:06 UTC; 47min ago
+   Main PID: 20926 (microshift)
+      Tasks: 14 (limit: 48063)
+     Memory: 542.9M
+        CPU: 2min 41.185s
+     CGroup: /system.slice/microshift.service
+             └─20926 microshift run

-* Optional: You can view the logs by running the following command:
+<Month-Day> 13:23:06 i-06166fbb376f14a8b.<hostname> microshift[20926]: kube-apiserver I0528 13:23:06.876001   20926 controll>
+<Month-Day> 13:23:06 i-06166fbb376f14a8b.<hostname> microshift[20926]: kube-apiserver I0528 13:23:06.876574   20926 controll>
+# ...
+----
+
+* Optional: Get comprehensive logs by running the following command:
 +
 [source,terminal]
 ----
 $ sudo journalctl -u microshift
 ----
-
+
 [NOTE]
 ====
 The default configuration of the `systemd` journal service stores data in a volatile directory. To persist system logs across system starts and restarts, enable log persistence and set limits on the maximum journal data size.
 ====

-* Optional: If {microshift-short} is running, you can see active pods by entering the following command:
+* Optional: If {microshift-short} is running, check the status of active pods by entering the following command:
 +
-[source,terminal]
----
-$ oc get pods -A
----
+--
+include::snippets/microshift-healthy-pods-snip.adoc[leveloffset=+1]
+--
--- a/modules/microshift-check-journal-logs-updates.adoc
+++ b/modules/microshift-check-journal-logs-updates.adoc
@@ -15,7 +15,7 @@ The default configuration of the `systemd` journal service stores data in a vola

 .Procedure

-* Check the {microshift-short} journal logs by running the following command:
+* Get comprehensive {microshift-short} journal logs by running the following command:
 +
 [source,terminal]
 ----
@@ -29,14 +29,7 @@ $ sudo journalctl -u microshift
 $ sudo journalctl -u greenboot-healthcheck
 ----

-* Check the journal logs for a boot of a specific service by running the following command:
-+
-[source,terminal]
----
-$ sudo journalctl --boot <boot> -u <service-name>
----
-
-* Examining the comprehensive logs of a specific boot uses two steps. First list the boots, then select the one you want from the list you obtained:
+* To examine the comprehensive logs of a specific boot, list the boots, select the one that you want from the list, and then, if needed, narrow the logs to a specific service:

 ** List the boots present in the journal logs by running the following command:
 +
@@ -44,10 +37,28 @@ $ sudo journalctl --boot <boot> -u <service-name>
 ----
 $ sudo journalctl --list-boots
 ----
++
+.Example output
+[source,text]
+----
+IDX  BOOT ID                           FIRST ENTRY                LAST ENTRY
+ 0   681ece6f5c3047e183e9d43268c5527f  <Day> <Date> 12:27:58 UTC  <Day> <Date> 13:39:41 UTC
+# ...
+----

-** Check the journal logs for the boot you want by running the following command:
+** Check the journal logs for the specific boot you want by running the following command:
 +
 [source,terminal]
 ----
-$ sudo journalctl --boot <-my-boot-number>
+$ sudo journalctl --boot <-my_boot_ID> <1>
 ----
+<1> Replace _<-my_boot_ID>_ with the number assigned to the specific boot that you want to check.
+
+** Check the journal logs for the boot of a specific service by running the following command:
++
+[source,terminal]
+----
+$ sudo journalctl --boot <-my_boot_ID> -u <service_name> <1> <2>
+----
+<1> Replace _<-my_boot_ID>_ with the number assigned to the specific boot that you want to check.
+<2> Replace _<service_name>_ with the name of the service that you want to check.
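A minimal sketch of the boot-specific steps above, assuming a POSIX shell. The function name `journal_cmd_for_boot` is hypothetical and is not part of `journalctl` or {microshift-short}. It only validates the boot number and prints the command, so you can review the command before running it with `sudo`.

```shell
# Hypothetical helper (an assumption, not part of any MicroShift tooling):
# print the journalctl command for one boot, optionally narrowed to a unit.
#   $1: boot number from `journalctl --list-boots` (0, -1, -2, ...)
#   $2: optional systemd unit name, for example microshift
journal_cmd_for_boot() {
    boot="$1"
    unit="${2:-}"
    # Accept only digits with an optional leading minus sign.
    case "$boot" in
        ''|-|*[!0-9-]*|?*-*)
            echo "invalid boot number: $boot" >&2
            return 1
            ;;
    esac
    if [ -n "$unit" ]; then
        printf 'journalctl --boot %s -u %s\n' "$boot" "$unit"
    else
        printf 'journalctl --boot %s\n' "$boot"
    fi
}

# Example: show the command for the previous boot of the microshift unit.
journal_cmd_for_boot -1 microshift
```

To run the result directly, something like `sudo $(journal_cmd_for_boot -1 microshift)` should work in the same shell session.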
--- a/modules/microshift-config-etcd.adoc
+++ b/modules/microshift-config-etcd.adoc
@@ -6,7 +6,7 @@
 [id="microshift-config-etcd_{context}"]
 = Configuring the memoryLimitMB value to set parameters for the etcd server

-By default, etcd will use as much memory as necessary to handle the load on the system. In some memory constrained systems, it might be necessary to limit the amount of memory etcd is allowed to use at a given time.
+By default, etcd uses as much memory as necessary to handle the load on the system. In memory-constrained systems, you might need to limit the amount of memory etcd uses.

 .Procedure

@@ -20,7 +20,7 @@ etcd:
 +
 [NOTE]
 ====
-The minimum permissible value for `memoryLimitMB` on {microshift-short} is 128 MB. Values close to the minimum value are more likely to impact etcd performance. The lower the limit, the longer etcd takes to respond to queries. If the limit is too low or the etcd usage is high, queries time out.
+The minimum required value for `memoryLimitMB` on {microshift-short} is 128 MB. Values close to the minimum value are more likely to impact etcd performance. The lower the limit, the longer etcd takes to respond to queries. If the limit is too low or the etcd usage is high, queries time out.
 ====

 .Verification
--- a/modules/microshift-etcd-version.adoc
+++ b/modules/microshift-etcd-version.adoc
@@ -7,7 +7,7 @@
 [id="microshift-version-etcd_{context}"]
 = Checking the etcd version

-You can get the version information for the etcd database included with your {microshift-short}.
+You can get the version information for the etcd database included with your {microshift-short} by using one or both of the following methods, depending on the level of information that you need.

 .Procedure

@@ -21,8 +21,8 @@ $ microshift-etcd version
 .Example output
 [source,terminal,subs="attributes+"]
 ----
-microshift-etcd Version: 4.16.1
-Base etcd Version: 3.5.10
+microshift-etcd Version: 4.16.0
+Base etcd Version: 3.5.13
 ----

 * To display the full database version information, run the following command:
@@ -37,15 +37,15 @@ $ microshift-etcd version -o json
 ----
 {
  "major": "4",
-  "minor": "15",
-  "gitVersion": "4.16.1",
-  "gitCommit": "2e182312718cc9d267ec71f37dc2fbe2eed01ee2",
+  "minor": "16",
+  "gitVersion": "4.16.0~rc.1",
+  "gitCommit": "140777711962eb4e0b765c39dfd325fb0abb3622",
  "gitTreeState": "clean",
-  "buildDate": "2024-01-09T06:51:40Z",
-  "goVersion": "go1.20.10",
+  "buildDate": "2024-05-10T16:37:53Z",
+  "goVersion": "go1.21.9",
  "compiler": "gc",
  "platform": "linux/amd64",
  "patch": "",
-  "etcdVersion": "3.5.10"
+  "etcdVersion": "3.5.13"
 }
 ----
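A minimal sketch of reading a single field from the JSON output above, assuming a POSIX shell with `sed` and the pretty-printed, one-field-per-line layout shown in the example. The function name `json_string_field` is hypothetical, and the pattern is a line-oriented shortcut, not a full JSON parser.

```shell
# Hypothetical helper (an assumption, not part of microshift-etcd):
# read pretty-printed JSON on stdin and print the value of one
# top-level string field, for example etcdVersion.
json_string_field() {
    field="$1"
    sed -n "s/^[[:space:]]*\"$field\":[[:space:]]*\"\([^\"]*\)\".*/\1/p"
}

# Example with a trimmed copy of the output shown above.
printf '%s\n' '{' '  "minor": "16",' '  "etcdVersion": "3.5.13"' '}' \
    | json_string_field etcdVersion
```

On a {microshift-short} host, the same function could be fed from `microshift-etcd version -o json | json_string_field etcdVersion`.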
--- a/modules/microshift-updates-troubleshooting.adoc
+++ b/modules/microshift-updates-troubleshooting.adoc
@@ -8,28 +8,33 @@

 In some cases, {microshift-short} might fail to update. In these events, it is helpful to understand failure types and how to troubleshoot them.

-[id="microshift-update-path-blocked-by-version-sequence_{context}"]
-== Update path is blocked by {microshift-short} version sequence
-{microshift-short} requires serial updates. Attempting to update {microshift-short} by skipping a minor version fails:
+//[id="microshift-update-path-blocked-by-version-sequence_{context}"]
+//== Update path is blocked by {microshift-short} version sequence
+//Certain versions of {microshift-short} require serial updates. Attempting to update {microshift-short} by skipping a minor version fails:

-* For example, if your current version is `4.14.5`, but you try to update from that version to `4.16.0`, the message, `executable (4.16.0) is too recent compared to existing data (4.14.5): version difference is 2, maximum allowed difference is 1` appears and {microshift-short} fails to start.
+//* For example, if your current version is `4.14.5`, but you try to update from that version to `4.16.0`, the message, `executable (4.16.0) is too recent compared to existing data (4.14.5): version difference is 2, maximum allowed difference is 1` appears and {microshift-short} fails to start.

-In this example, you must first update `4.14.5` to a version of `4.15`, and then you can upgrade to `4.16.0`.
+//In this example, you must first update `4.14.5` to a version of `4.15`, and then you can upgrade to `4.16.0`.

 [id="microshift-update-path-blocked-by-version-incompatibility_{context}"]
 == Update path is blocked by version incompatibility
 RPM dependency errors result if a {microshift-short} update is incompatible with the version of {op-system-ostree-first} or {op-system-base-full}.

+[id="microshift-compatibility-table_{context}"]
+=== Compatibility table
 Check the following compatibility table:

 include::snippets/microshift-rhde-compatibility-table-snip.adoc[leveloffset=+2]

+[id="microshift-version-compatibility_{context}"]
+=== Version compatibility
 Check the following update paths:

 *{product-title} update paths*

-* Generally Available Version 4.14.0 to 4.14.z on {op-system-ostree} 9.2
-* Generally Available Version 4.14.0 to 4.14.z on {op-system} 9.2
+* Generally Available Version 4.16.0 to 4.16.z on {op-system-ostree} 9.4
+* Generally Available Version 4.15.0 from {op-system} 9.2 to 4.16.0 on {op-system} 9.4
+* Generally Available Version 4.14.0 from {op-system} 9.2 to 4.16.0 on {op-system} 9.4

 [id="microshift-ostree-update-failed_{context}"]
 == OSTree update failed
--- a/snippets/microshift-healthy-pods-snip.adoc
+++ b/snippets/microshift-healthy-pods-snip.adoc
@@ -0,0 +1,31 @@
+// Snippet for healthy MicroShift output with oc get pods -A
+//
+//*  microshift_troubleshooting/microshift-troubleshoot-cluster
+
+:_mod-docs-content-type: SNIPPET
+
+[source,terminal]
+----
+$ oc get pods -A
+----
+.Example output
+[source,terminal]
+----
+NAMESPACE                   NAME                                                     READY   STATUS   RESTARTS  AGE
+default                     i-06166fbb376f14a8bus-west-2computeinternal-debug-qtwcr  1/1     Running  0         46m
+kube-system                 csi-snapshot-controller-5c6586d546-lprv4                 1/1     Running  0         51m
+kube-system                 csi-snapshot-webhook-6bf8ddc7f5-kz6k9                    1/1     Running  0         51m
+openshift-dns               dns-default-45jl7                                        2/2     Running  0         50m
+openshift-dns               node-resolver-7wmzf                                      1/1     Running  0         51m
+openshift-ingress           router-default-78b86fbf9d-qvj9s                          1/1     Running  0         51m
+openshift-ovn-kubernetes    ovnkube-master-5rfhh                                     4/4     Running  0         51m
+openshift-ovn-kubernetes    ovnkube-node-gcnt6                                       1/1     Running  0         51m
+openshift-service-ca        service-ca-bf5b7c9f8-pn6rk                               1/1     Running  0         51m
+openshift-storage           topolvm-controller-549f7fbdd5-7vrmv                      5/5     Running  0         51m
+openshift-storage           topolvm-node-rht2m                                       3/3     Running  0         50m
+----
+
+[NOTE]
+====
+This example output shows a basic {microshift-short} installation. If you installed optional RPMs, your output also shows the status of the pods that run those services.
+====
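A minimal sketch of how the healthy-pods check above could be automated, assuming a POSIX shell with `awk`. The function name `list_unhealthy_pods` is hypothetical and is not part of `oc` or {microshift-short}. It only filters text that follows the `oc get pods -A` column layout shown in the example output.

```shell
# Hypothetical helper (an assumption, not part of oc or MicroShift):
# read `oc get pods -A` output on stdin and print NAMESPACE/NAME for
# every pod whose STATUS column is not "Running".
# Exit status is 1 if any such pod is found, 0 otherwise.
list_unhealthy_pods() {
    awk 'NR > 1 && NF >= 4 && $4 != "Running" { print $1 "/" $2; bad = 1 }
         END { exit bad }'
}

# Example with two sample rows in the same column layout.
printf '%s\n' \
    'NAMESPACE      NAME               READY  STATUS            RESTARTS  AGE' \
    'openshift-dns  dns-default-45jl7  2/2    Running           0         50m' \
    'openshift-dns  node-resolver-x    0/1    CrashLoopBackOff  3         50m' \
    | list_unhealthy_pods || echo "some pods are not Running"
```

On a running cluster, the input would come from `oc get pods -A | list_unhealthy_pods`. The sketch treats every status other than `Running` as unhealthy, so pods in a `Completed` state are also listed.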