mirror of https://github.com/coreos/prometheus-operator.git synced 2026-02-05 06:45:27 +01:00

Merge pull request #7558 from heliapb/feat/troublshooting

feat: high cpu troubleshooting
Hélia Barroso
2025-07-29 12:58:53 +01:00
committed by GitHub
parent 442d41c2ac
commit c760dc4e5d


@@ -295,3 +295,16 @@ spec:
- regex: prometheus_replica
action: LabelDrop
```
### High CPU usage by the Prometheus Operator
Several scenarios can cause high CPU usage in the Prometheus Operator. To narrow down the cause, start with the metrics below, which show the rate of reconciliations and what triggers them:
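```shell
# Rate of reconciliation triggers, by controller and triggering resource type
sum by(controller,triggered_by) (rate(prometheus_operator_triggered_total[5m]))

# Rate of reconcile operations, by controller
sum by(controller) (rate(prometheus_operator_reconcile_operations_total[5m]))
```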
If most triggers are reported with `triggered_by="Secret"`, one solution is to restrict which Secrets the operator watches by passing a selector with the `--secret-field-selector` argument. You can also use the namespace selectors to limit the number of namespaces watched by the operator.
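As a rough illustration, the fragment below shows how these arguments might be passed to the operator container. It is a sketch, not a recommended configuration: the Secret types excluded by the selector and the namespace names (`monitoring`, `team-a`) are assumptions chosen for the example, and the right values depend on your cluster.

```yaml
# Fragment of the Prometheus Operator Deployment (container args only).
# The selector value and namespace names are illustrative assumptions.
containers:
  - name: prometheus-operator
    args:
      # Skip Secret types the operator is unlikely to need (assumed list).
      - --secret-field-selector=type!=kubernetes.io/dockercfg,type!=kubernetes.io/service-account-token,type!=helm.sh/release.v1
      # Watch only the listed namespaces (hypothetical names).
      - --namespaces=monitoring,team-a
```

After applying a change like this, the `triggered_by="Secret"` rate from the query above should indicate whether the filter had the intended effect.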
Another reported issue involves clusters with a large number of Services, Endpoints, and ServiceMonitors, where both high CPU and high memory usage were encountered. A solution was to reduce the number of ServiceMonitors by having each one target multiple Services/Endpoints.
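One way to consolidate, sketched below, is a single ServiceMonitor that selects many Services through a shared label instead of one ServiceMonitor per Service. The resource name, the `monitoring: enabled` label, and the `metrics` port name are hypothetical and would need to match how your Services are actually labeled and exposed.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: shared-services        # hypothetical name
  namespace: monitoring        # hypothetical namespace
spec:
  # Look for matching Services in every namespace.
  namespaceSelector:
    any: true
  # Match every Service that carries this (assumed) shared label.
  selector:
    matchLabels:
      monitoring: enabled
  endpoints:
    # Assumes all selected Services expose a port named "metrics".
    - port: metrics
```

This only works when the selected Services share the same scrape settings; Services that need a different path, port, or interval still require their own ServiceMonitor.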