1
0
mirror of https://github.com/openshift/openshift-docs.git synced 2026-02-05 12:46:18 +01:00

New assembly to add the new PromQL metrics for virtualization resources. Assembly is also re-using some of the querying metrics modules to provide usability context. Edit: additional queries added to reflect changes for 4.8 (cnv-11661)

This commit is contained in:
Andrew Burden
2021-03-10 18:50:31 +01:00
committed by openshift-cherrypick-robot
parent 63eec06fbe
commit 07d58eb758
6 changed files with 140 additions and 0 deletions

View File

@@ -2853,6 +2853,8 @@ Topics:
File: virt-using-dashboard-to-get-cluster-info
- Name: OpenShift cluster monitoring, logging, and Telemetry
File: virt-openshift-cluster-monitoring
- Name: Prometheus queries for virtual resources
File: virt-prometheus-queries
- Name: Collecting OpenShift Virtualization data for Red Hat Support
File: virt-collecting-virt-data
---

View File

@@ -1,6 +1,7 @@
// Module included in the following assemblies:
//
// * monitoring/managing-metrics.adoc
// * virt/logging_events_monitoring/virt-prometheus-queries.adoc
[id="querying-metrics-for-all-projects-as-an-administrator_{context}"]
= Querying metrics for all projects as a cluster administrator

View File

@@ -1,6 +1,7 @@
// Module included in the following assemblies:
//
// * monitoring/managing-metrics.adoc
// * virt/logging_events_monitoring/virt-prometheus-queries.adoc
[id="querying-metrics-for-user-defined-projects-as-a-developer_{context}"]
= Querying metrics for user-defined projects as a developer

View File

@@ -1,6 +1,7 @@
// Module included in the following assemblies:
//
// * monitoring/managing-metrics.adoc
// * virt/logging_events_monitoring/virt-prometheus-queries.adoc
[id="querying-metrics_{context}"]
= Querying metrics

View File

@@ -0,0 +1,106 @@
// Module included in the following assemblies:
//
// * virt/logging_events_monitoring/virt-prometheus-queries.adoc
[id="virt-querying-metrics_{context}"]
= Virtualization metrics
The following metric descriptions include example Prometheus Query Language (PromQL) queries.
[NOTE]
====
These metrics are not an API and might change between versions.
====
[id="virt-promql-vcpu-metrics_{context}"]
== vCPU metrics
`kubevirt_vmi_vcpu_wait_seconds`::
Returns the wait time (in seconds) for a virtual machine's vCPU.
A value above '0' means that the vCPU wants to run, but the host scheduler cannot run it yet. This indicates that there is an issue with Input/Output.
.Example query
[source,promql]
----
topk(3, sum by (name, namespace) (round(irate(kubevirt_vmi_vcpu_wait_seconds[6m]), 0.1))) > 0
----
The above query returns the top 3 VMs waiting for I/O at every given moment in time over the time period.
[id="virt-promql-network-metrics_{context}"]
== Network metrics
`kubevirt_vmi_network_receive_bytes_total`::
Returns the total amount of traffic received (in bytes) on the virtual machine's network.
`kubevirt_vmi_network_transmit_bytes_total`::
Returns the total amount of traffic transmitted (in bytes) on the virtual machine's network.
These queries can be used to identify virtual machines that are saturating the network.
.Example query
[source,promql]
----
topk(3, sum by (name, namespace) (round(irate(kubevirt_vmi_network_receive_bytes_total[6m]), 0.1)) + sum by (name, namespace) (round(irate(kubevirt_vmi_network_transmit_bytes_total[6m]) , 0.1))) > 0
----
The above query returns the top 3 VMs transmitting the most network traffic at every given moment in time over a six-minute time period.
[id="virt-promql-storage-metrics_{context}"]
== Storage metrics
`kubevirt_vmi_storage_read_traffic_bytes_total`::
Returns the total amount (in bytes) of the virtual machine's storage-related traffic.
`kubevirt_vmi_storage_write_traffic_bytes_total`::
Returns the total amount of storage writes (in bytes) of the virtual machine's storage-related traffic.
These queries can be used to identify virtual machines that are writing large amounts of data.
.Example query
[source,promql]
----
topk(3, sum by (name, namespace) (round(irate(kubevirt_vmi_storage_read_traffic_bytes_total[6m]), 0.1))
+ sum by (name, namespace) (round(irate(kubevirt_vmi_storage_write_traffic_bytes_total[6m]), 0.1))) > 0
----
The above query returns the top 3 VMs performing the most storage traffic at every given moment in time over a six-minute time period.
`kubevirt_vmi_storage_iops_read_total`::
Returns the amount of write I/O operations the virtual machine is performing per second.
`kubevirt_vmi_storage_iops_write_total`::
Returns the amount of read I/O operations the virtual machine is performing per second.
These queries can be used to determine the I/O performance of storage devices.
.Example query
[source,promql]
----
topk(3, sum by (name, namespace) (round(irate(kubevirt_vmi_storage_iops_read_total[6m]), 0.1))
+ sum by (name, namespace) (round(irate(kubevirt_vmi_storage_iops_write_total[6m]) , 0.1))) > 0
----
The above query returns the top 3 VMs performing the most I/O operations per second at every given moment in time over a six-minute time period.
[id="virt-promql-guest-memory-metrics_{context}"]
== Guest memory swapping metrics
`kubevirt_vmi_memory_swap_in_traffic_bytes_total`::
Returns the total amount (in bytes) of memory the virtual guest is swapping in.
`kubevirt_vmi_memory_swap_out_traffic_bytes_total`::
Returns the total amount (in bytes) of memory the virtual guest is swapping out.
Memory swapping indicates that the virtual machine is under memory pressure. Increasing the memory allocation of the virtual machine can mitigate this issue.
[NOTE]
====
These queries only return data for virtual guests that have memory swapping enabled.
====
.Example query
[source,promql]
----
topk(3, sum by (name, namespace) (round(irate(kubevirt_vmi_memory_swap_in_traffic_bytes_total[6m]), 0.1))
+ sum by (name, namespace) (round(irate(kubevirt_vmi_memory_swap_out_traffic_bytes_total[6m]), 0.1))) > 0
----
The above query returns the top 3 VMs where the guest is performing the most memory swapping at every given moment in time over a six-minute time period.

View File

@@ -0,0 +1,29 @@
[id="virt-prometheus-queries"]
= Prometheus queries for virtual resources
include::modules/virt-document-attributes.adoc[]
:context: virt-prometheus-queries
toc::[]
{VirtProductName} provides metrics for monitoring how infrastructure resources are consumed in the cluster. The metrics cover the following resources:
* vCPU
* Network
* Storage
* Guest memory swapping
Use the {product-title} monitoring dashboard to query virtualization metrics.
.Prerequisite
* The vCPU metric requires the `schedstats=enable` kernel argument applied to the `MachineConfig` object before it can be used. This kernel argument enables scheduler statistics used for debugging and performance tuning and adds a minor additional load to the scheduler. See the xref:../../post_installation_configuration/machine-configuration-tasks.adoc#nodes-nodes-kernel-arguments_post-install-machine-configuration-tasks[{product-title} machine configuration tasks] documentation for more information on applying a kernel argument.
include::modules/monitoring-querying-metrics.adoc[leveloffset=+1]
include::modules/monitoring-querying-metrics-for-all-projects-as-an-administrator.adoc[leveloffset=+2]
include::modules/monitoring-querying-metrics-for-user-defined-projects-as-a-developer.adoc[leveloffset=+2]
include::modules/virt-querying-metrics.adoc[leveloffset=+1]
[id="{context}-additional-resources"]
== Additional resources
* xref:../../monitoring/understanding-the-monitoring-stack.adoc#understanding-the-monitoring-stack[Understanding the {product-title} monitoring stack]