// Module included in the following assemblies:
//
// * etcd/etcd-performance.adoc

:_mod-docs-content-type: CONCEPT
[id="etcd-peer-round-trip_{context}"]
= How etcd peer round trip time affects performance

The etcd peer round trip time is an end-to-end measure of how quickly a client request can be replicated among etcd members. It shows the latency for etcd to finish replicating a client request to all of the etcd members. The etcd peer round trip time is not the same as the network round trip time.

You can monitor various etcd metrics on dashboards in the {product-title} console. In the console, click *Observe* -> *Dashboards* and from the dropdown list, select *etcd*.

Near the end of the *etcd* dashboard, you can find a plot that summarizes the etcd peer round trip time.

[NOTE]
====
These etcd metrics are collected by the OpenShift metrics system in Prometheus. You can access them from the CLI by following the Red{nbsp}Hat Knowledgebase solution, link:https://access.redhat.com/solutions/5151831[How to query from the command line Prometheus statistics].
====

The following commands obtain a bearer token and the host name of the Thanos Querier route, which the query example uses:

[source,terminal]
----
# Get a token to connect to Prometheus
SECRET=$(oc get secret -n openshift-user-workload-monitoring | grep prometheus-user-workload-token | head -n 1 | awk '{print $1}')
export TOKEN=$(oc get secret "$SECRET" -n openshift-user-workload-monitoring -o json | jq -r '.data.token' | base64 -d)
# Get the host name of the Thanos Querier route
export THANOS_QUERIER_HOST=$(oc get route thanos-querier -n openshift-monitoring -o json | jq -r '.spec.host')
----

Queries must be URL-encoded. The following example retrieves the 99th percentile of the round trip time, in seconds, for etcd to finish replicating client requests among the members:

[source,terminal]
----
# prometheus query
query="histogram_quantile(0.99, rate(etcd_network_peer_round_trip_time_seconds_bucket[5m]))"

# URL-encode the query; quote the variable so that the shell does not split or glob it
encoded_query=$(printf "%s" "$query" | jq -sRr @uri)

# querying the OpenShift metrics service
curl -s -X GET -k -H "Authorization: Bearer $TOKEN" "https://$THANOS_QUERIER_HOST/api/v1/query?query=$encoded_query" | jq '.data.result[] | .metric.pod,.value[1]'

# example output: each pod name is followed by its value in seconds
"etcd-m2"
"0.09318400000000004"   # example ~93ms
"etcd-m0"
"0.050688"              # example ~51ms
"etcd-m1"
"0.050688"              # example ~51ms
----
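
The query returns values in seconds, and jq prints each pod name and its value on alternating lines. As a minimal sketch, assuming that the jq filter is run with the `-r` option so that it prints unquoted strings, a small helper can pair each pod with its value in milliseconds. The function name `format_rtt` is hypothetical and is not part of any {product-title} tooling.

[source,terminal]
----
# Hypothetical helper, not part of oc or etcd tooling.
# Reads alternating lines from stdin: a pod name, then its value in seconds.
# Prints one line per pod with the value converted to milliseconds.
format_rtt() {
  awk 'NR % 2 == 1 { pod = $0; next }
       { printf "%s %.1f ms\n", pod, $0 * 1000 }'
}
----

For example, piping the raw pod and value lines for the sample values shown earlier into `format_rtt` prints lines such as `etcd-m2 93.2 ms` and `etcd-m0 50.7 ms`.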

The following metrics are also relevant to understanding etcd performance:

`etcd_disk_wal_fsync_duration_seconds_bucket`:: Reports the duration of etcd write-ahead log (WAL) fsync operations.
`etcd_disk_backend_commit_duration_seconds_bucket`:: Reports the latency of etcd backend commits.
`etcd_server_leader_changes_seen_total`:: Reports the number of leader changes that have been seen.
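
These metrics can be queried with the same approach as the peer round trip time. The following is a minimal sketch, not an official procedure. The function names `p99_query` and `run_query` are hypothetical, the 5 minute rate window is copied from the earlier example, and the 1 hour window for leader changes is an arbitrary assumption. The sketch assumes that `TOKEN` and `THANOS_QUERIER_HOST` are set as shown earlier, and it uses the `-G` and `--data-urlencode` options of curl to encode the query instead of jq.

[source,terminal]
----
# Hypothetical helper: builds a 99th percentile query for a histogram metric.
p99_query() {
  printf 'histogram_quantile(0.99, rate(%s[5m]))' "$1"
}

# Hypothetical helper: sends a query to the Thanos Querier route.
# Assumes TOKEN and THANOS_QUERIER_HOST are already exported.
# curl -G with --data-urlencode appends the URL-encoded query to the URL.
run_query() {
  curl -s -k -G -H "Authorization: Bearer $TOKEN" \
    --data-urlencode "query=$1" \
    "https://$THANOS_QUERIER_HOST/api/v1/query" |
    jq -r '.data.result[] | .metric.pod, .value[1]'
}

# Example calls (commented out so that defining the helpers contacts no cluster):
# run_query "$(p99_query etcd_disk_wal_fsync_duration_seconds_bucket)"
# run_query "$(p99_query etcd_disk_backend_commit_duration_seconds_bucket)"
# run_query "increase(etcd_server_leader_changes_seen_total[1h])"
----

The two histogram metrics use the same percentile query as the round trip time. The leader change metric is a counter, so the sketch queries its increase over a time window instead.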