mirror of
https://github.com/coreos/prometheus-operator.git
synced 2026-02-05 06:45:27 +01:00
Merge pull request #7558 from heliapb/feat/troublshooting
feat: high cpu troubleshooting
@@ -295,3 +295,16 @@ spec:
- regex: prometheus_replica
action: LabelDrop
```

### High CPU usage by the Prometheus Operator

Several scenarios can cause high CPU usage by the Prometheus Operator. To find the cause, start with the queries below, which show the rate of reconciliations and what triggers them:

```shell
# Rate of reconciliation triggers, per controller and triggering resource kind
sum by(controller,triggered_by) (rate(prometheus_operator_triggered_total[5m]))

# Rate of reconcile operations, per controller
sum by(controller) (rate(prometheus_operator_reconcile_operations_total[5m]))
```

If most reconciliations show `triggered_by="Secret"`, a solution is to restrict which Secrets the operator watches with the `--secret-field-selector` argument, so that only Secrets whose fields match the selector trigger reconciliations. You can also use the namespace selectors to limit the number of namespaces watched by the operator.

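As a rough sketch of the advice above, the fragment below shows how such arguments might be passed to the operator container. The container name, the selector value, and the namespace names are illustrative assumptions, not values taken from this document. The selector shown excludes Helm release Secrets by their `type` field. `--namespaces` is assumed here to be the operator's allow-list flag for watched namespaces, so check the flags supported by your operator version before applying it.

```yaml
# Fragment of the Prometheus Operator Deployment (names and values are hypothetical)
spec:
  template:
    spec:
      containers:
        - name: prometheus-operator
          args:
            # Ignore Helm release Secrets, which change often and are
            # irrelevant to the operator (assumed example selector).
            - --secret-field-selector=type!=helm.sh/release.v1
            # Watch only these namespaces (assumed flag and example names).
            - --namespaces=monitoring,team-a
```

After rolling out the change, re-running the queries above should show whether the rate for `triggered_by="Secret"` has dropped.
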
Another reported cause is a large number of Service, Endpoints, and ServiceMonitor objects, which also led to high CPU and memory usage. A solution was to reduce the number of ServiceMonitors by having each one target multiple Services/Endpoints.
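
One way to consolidate ServiceMonitors, sketched below, is to select Services by a shared label instead of creating one ServiceMonitor per Service. The object name, the `monitoring: team-a` label, and the `metrics` port name are hypothetical. The sketch assumes every selected Service carries that label and exposes a port with that name.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: team-a-apps          # hypothetical name
  namespace: monitoring      # hypothetical namespace
spec:
  # Match every Service carrying the shared label, rather than
  # defining one ServiceMonitor per Service.
  selector:
    matchLabels:
      monitoring: team-a     # hypothetical shared label
  # Look for matching Services in these namespaces.
  namespaceSelector:
    matchNames:
      - team-a
  endpoints:
    - port: metrics          # hypothetical port name shared by the Services
      interval: 30s
```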