---
weight: 209
toc: true
title: Troubleshooting
menu:
  docs:
    parent: operator
lead: ""
images: []
draft: false
description: Guide on troubleshooting the Prometheus Operator.
---
### RBAC on Google Container Engine (GKE)
When you try to create a `ClusterRole` (`kube-state-metrics`, `prometheus`, `prometheus-operator`, etc.) on a GKE Kubernetes cluster running version 1.6, you will probably run into permission errors:
```
<....>
Error from server (Forbidden): error when creating
"manifests/prometheus-operator/prometheus-operator-cluster-role.yaml":
clusterroles.rbac.authorization.k8s.io "prometheus-operator" is forbidden: attempt to grant extra privileges:
<....>
```
This is due to the way Container Engine checks permissions. From [Google Kubernetes Engine docs](https://cloud.google.com/kubernetes-engine/docs/how-to/role-based-access-control):
> Because of the way Container Engine checks permissions when you create a Role or ClusterRole, you must first create a RoleBinding that grants you all of the permissions included in the role you want to create.
> An example workaround is to create a RoleBinding that gives your Google identity a cluster-admin role before attempting to create additional Role or ClusterRole permissions.
> This is a known issue in the Beta release of Role-Based Access Control in Kubernetes and Container Engine version 1.6.
To overcome this, you must grant your current Google identity the `cluster-admin` `ClusterRole`:
```console
# get current google identity
$ gcloud info | grep Account
Account: [myname@example.org]
# grant cluster-admin to your current identity
$ kubectl create clusterrolebinding myname-cluster-admin-binding --clusterrole=cluster-admin --user=myname@example.org
clusterrolebinding "myname-cluster-admin-binding" created
```
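Once the binding exists, you can verify that your identity is now allowed to create cluster roles. A quick check using the standard `kubectl auth can-i` subcommand:

```console
$ kubectl auth can-i create clusterroles
yes
```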
### Troubleshooting ServiceMonitor changes
When creating, deleting, or modifying `ServiceMonitor` objects, it is sometimes not obvious which piece is not working properly. This section gives a step-by-step guide on how to troubleshoot such actions on a `ServiceMonitor` object.
#### Overview of `ServiceMonitor` tagging and related elements
A common problem with `ServiceMonitor` identification by Prometheus is that the object's labels do not match the `Prometheus` custom resource's selectors, or that the Prometheus `ServiceAccount` lacks permission to *get*, *list*, and *watch* the `Services` and `Endpoints` of the target application being monitored. As a general guideline, consider the diagram below, which gives an example of a `Deployment` and `Service` called `my-app` being monitored by Prometheus based on a `ServiceMonitor` named `my-service-monitor`:
<!-- do not change this link without verifying that the image will display correctly on https://prometheus-operator.dev -->
![flow diagram](/img/custom-metrics-elements.png)
Note: The `ServiceMonitor` references a `Service` (not a `Deployment`, or a `Pod`), by labels *and* by the port name in the `Service`. This *port name* is optional in Kubernetes, but must be specified for the `ServiceMonitor` to work. It is not the same as the port name on the `Pod` or container, although it can be.
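If permissions are the problem (the second cause mentioned above), the Prometheus `ServiceAccount` needs RBAC rules along these lines in the namespace of the monitored application. This is a minimal sketch, with an illustrative `Role` name and namespace; `pods` are included because endpoints-based service discovery also reads them:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  # illustrative name and namespace
  name: prometheus-target-discovery
  namespace: my-app
rules:
- apiGroups: [""]
  resources: ["services", "endpoints", "pods"]
  verbs: ["get", "list", "watch"]
```

Bind it to the Prometheus `ServiceAccount` with a matching `RoleBinding`.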
#### Has my `ServiceMonitor` been picked up by Prometheus?
`ServiceMonitor` objects and the namespaces where they belong are selected by the `serviceMonitorSelector` and `serviceMonitorNamespaceSelector` of a Prometheus object. The name of a `ServiceMonitor` is encoded in the Prometheus configuration, so you can simply grep whether it is present there. The configuration generated by the Prometheus Operator is stored in a Kubernetes `Secret`, named after the Prometheus object with a `prometheus-` prefix, and located in the same namespace as the Prometheus object. For example, for a Prometheus object called `k8s`, one can find out if the `ServiceMonitor` named `my-service-monitor` has been picked up with:
```sh
kubectl -n monitoring get secret prometheus-k8s -ojson | jq -r '.data["prometheus.yaml.gz"]' | base64 -d | gunzip | grep "my-service-monitor"
```
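If the `ServiceMonitor` is missing from the configuration, compare the selectors on the Prometheus object with the labels on the `ServiceMonitor`. A sketch, assuming the same `k8s` object and `monitoring` namespace as above:

```sh
# Which labels does the Prometheus object select ServiceMonitors by?
kubectl -n monitoring get prometheus k8s -o jsonpath='{.spec.serviceMonitorSelector}{"\n"}{.spec.serviceMonitorNamespaceSelector}{"\n"}'

# Do the labels on the ServiceMonitor match?
kubectl -n monitoring get servicemonitor my-service-monitor --show-labels
```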
#### It is in the configuration but not on the Service Discovery page
A `ServiceMonitor` pointing to a `Service` that does not exist (e.g. nothing matches its `.spec.selector`) will not be added to the Service Discovery page. Check if you can find any `Service` with the selector you configured.
If you use `.spec.selector.matchLabels` (instead of e.g. `.spec.selector.matchExpressions`), you can use this command to check for services matching the given label:
```sh
kubectl get services -l "$(kubectl get servicemonitors -n "<namespace of your ServiceMonitor>" "<name of your ServiceMonitor>" -o template='{{ $first := 1 }}{{ range $key, $value := .spec.selector.matchLabels }}{{ if eq $first 0 }},{{end}}{{ $key }}={{ $value }}{{ $first = 0 }}{{end}}')"
```
Note: this command does not take namespaces into account. If your ServiceMonitor selects a single namespace or all namespaces, you can just add that to the `kubectl get services` command (using `-n $namespace` or `-A` for all namespaces).
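For example, to run the same check across all namespaces with an explicit label selector (the label here is illustrative):

```sh
kubectl get services -A -l "app=example-app"
```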
### Prometheus kubelet metrics server returned HTTP status 403 Forbidden
Prometheus is installed and everything looks good, yet all the `Targets` show as down. All permissions seem to be in order, but still no joy: Prometheus pulls metrics from all namespaces except kube-system, even though it has access to all namespaces including kube-system.
#### Did you check the webhooks?
This issue can be resolved by enabling webhook token authentication on the kubelet and binding the control-plane components to `0.0.0.0` instead of `127.0.0.1`. The commands below update the configuration so that connections are accepted from all cluster IPs in all namespaces, not just `127.0.0.1`.
**Update the kubelet service to enable webhook authentication and restart it:**
```sh
KUBEADM_SYSTEMD_CONF=/etc/systemd/system/kubelet.service.d/10-kubeadm.conf
# Drop the deprecated cadvisor-port flag if present.
sed -e "/cadvisor-port=0/d" -i "$KUBEADM_SYSTEMD_CONF"
# Enable token webhook authentication alongside webhook authorization.
if ! grep -q "authentication-token-webhook=true" "$KUBEADM_SYSTEMD_CONF"; then
  sed -e "s/--authorization-mode=Webhook/--authentication-token-webhook=true --authorization-mode=Webhook/" -i "$KUBEADM_SYSTEMD_CONF"
fi
systemctl daemon-reload
systemctl restart kubelet
```
**Modify the kube controller and kube scheduler to allow for reading data:**
```sh
sed -e "s/- --address=127.0.0.1/- --address=0.0.0.0/" -i /etc/kubernetes/manifests/kube-controller-manager.yaml
sed -e "s/- --address=127.0.0.1/- --address=0.0.0.0/" -i /etc/kubernetes/manifests/kube-scheduler.yaml
```
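As a quick sanity check, you can confirm that the manifests now bind to `0.0.0.0`; the kubelet restarts these static pods automatically once the manifest files change:

```sh
grep -- "--address" /etc/kubernetes/manifests/kube-controller-manager.yaml /etc/kubernetes/manifests/kube-scheduler.yaml
```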
### Using textual port number instead of port name
The ServiceMonitor expects to use the port name as defined on the Service. So, using the Service example from the
diagram above, we have this Service definition:
```yaml mdox-exec="cat example/user-guides/getting-started/example-app-service.yaml"
kind: Service
apiVersion: v1
metadata:
  name: example-app
  labels:
    app: example-app
spec:
  selector:
    app: example-app
  ports:
  - name: web
    port: 8080
```
We would then define the `ServiceMonitor` using `web` as the port, not `8080`. E.g.
**CORRECT**
```yaml mdox-exec="cat example/user-guides/getting-started/example-app-service-monitor.yaml"
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  labels:
    team: frontend
spec:
  selector:
    matchLabels:
      app: example-app
  endpoints:
  - port: web
```
**INCORRECT**
```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  labels:
    team: frontend
spec:
  selector:
    matchLabels:
      app: example-app
  endpoints:
  - port: 8080
```
Applying the incorrect example will give an error along these lines: `spec.endpoints.port in body must be of type string: "integer"`.
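To find the right value, you can list the named ports on the `Service` (using the `example-app` Service from above):

```sh
kubectl get service example-app -o jsonpath='{range .spec.ports[*]}{.name}{" -> "}{.port}{"\n"}{end}'
```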
### Prometheus/Alertmanager pods stuck in terminating loop with healthy start up logs
This is usually a sign that more than one operator instance is trying to manage the same resource.
Check if several operators are running on the cluster:
```console
kubectl get pods --all-namespaces | grep 'prom.*operator'
```
Check the logs of the matching pods to see if they manage the same resource.
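For instance, tail the logs of each candidate pod and grep for the name of the affected resource; if two operators both log activity for the same object, they are fighting over it. The pod, namespace, and resource names below are placeholders to substitute from the previous command's output:

```sh
kubectl -n <operator-namespace> logs <prometheus-operator-pod> | grep <prometheus-object-name>
```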