// Module included in the following assemblies:
//
// * logging/efk-logging-deploy.adoc

:_mod-docs-content-type: CONCEPT
[id="efk-logging-deploy-storage-considerations_{context}"]
= Storage considerations for cluster logging and {product-title}

An Elasticsearch index is a collection of primary shards and their corresponding replica
shards. This is how Elasticsearch implements high availability internally, so there
is little need for hardware-based mirroring RAID variants. RAID 0 can still
be used to increase overall disk performance.

//Following paragraph also in nodes/efk-logging-elasticsearch

Elasticsearch is a memory-intensive application. Each Elasticsearch node needs 16G of memory for both memory requests and limits,
unless you specify otherwise in the Cluster Logging Custom Resource. The initial set of {product-title} nodes might not be large enough
to support the Elasticsearch cluster. You must add additional nodes to the {product-title} cluster to run with the recommended
memory or higher. Each Elasticsearch node can operate with a lower memory setting, though this is not recommended for production deployments.

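For example, the memory requests and limits for the Elasticsearch nodes can be overridden in the Cluster Logging Custom Resource. The following is a minimal sketch, assuming the `logging.openshift.io/v1` API; the node count and resource values are illustrative assumptions, so adjust them for your cluster:

[source,yaml]
----
apiVersion: "logging.openshift.io/v1"
kind: "ClusterLogging"
metadata:
  name: "instance"
  namespace: "openshift-logging"
spec:
  logStore:
    type: "elasticsearch"
    elasticsearch:
      nodeCount: 3
      resources: # overrides the default 16G memory request and limit
        limits:
          memory: 32Gi
        requests:
          cpu: "1"
          memory: 32Gi
----
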
////
Each Elasticsearch data node requires its own individual storage, but an {product-title} deployment
can only provide volumes shared by all of its pods, which again means that
Elasticsearch clusters should not be implemented with a single deployment.
////

A persistent volume is required for each Elasticsearch deployment so that each data node has its own data volume. On {product-title}, this is achieved using
Persistent Volume Claims.

The Elasticsearch Operator names the PVCs using the Elasticsearch resource name. Refer to
Persistent Elasticsearch Storage for more details.

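As an illustration, the storage for each data node might be requested through the `storage` stanza of the same Custom Resource, which causes the Elasticsearch Operator to create one PVC per data node. This is a sketch of that stanza only; the storage class name and size are example assumptions, not recommendations:

[source,yaml]
----
# Fragment of the ClusterLogging spec shown above:
# one persistent volume per Elasticsearch data node, provisioned through a PVC.
spec:
  logStore:
    type: "elasticsearch"
    elasticsearch:
      nodeCount: 3
      storage:
        storageClassName: "gp2"
        size: "200G"
----
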
Below are capacity planning guidelines for {product-title} aggregate logging.

*Example scenario*

Assumptions:

. Which application: Apache
. Bytes per line: 256
. Lines per second load on application: 1
. Raw text data -> JSON

Baseline (256 characters per second -> 15KB/min)

[cols="3,4",options="header"]
|
|
|===
|
|
|Logging Pods
|
|
|Storage Throughput
|
|
|
|
|3 es
|
|
1 kibana
|
|
1 curator
|
|
1 fluentd
|
|
| 6 pods total: 90000 x 86400 = 7,7 GB/day
|
|
|
|
|3 es
|
|
1 kibana
|
|
1 curator
|
|
11 fluentd
|
|
| 16 pods total: 225000 x 86400 = 24,0 GB/day
|
|
|
|
|3 es
|
|
1 kibana
|
|
1 curator
|
|
20 fluentd
|
|
|25 pods total: 225000 x 86400 = 32,4 GB/day
|
|
|===
|
|
|
|
|
|
Calculating the total logging throughput and disk space required for your {product-title} cluster requires knowledge of your
applications. For example, if one of your applications on average logs 10 lines per second, each 256 bytes long,
calculate the per-application throughput and disk space as follows:

----
(bytes-per-line) * (lines-per-second) = 256 * 10 = 2560 bytes per app per second
(2560) * (number-of-pods-per-node = 100) = 256,000 bytes per second per node
(256,000) * (number-of-nodes) = total logging throughput per cluster
(total logging throughput per cluster) * (86,400 seconds per day) = total disk space required per day
----

Fluentd ships any logs from the *systemd journal* and */var/log/containers/* to Elasticsearch.

////
Local SSD drives are recommended in order to achieve the best performance. In
Red Hat Enterprise Linux (RHEL) 7, the
link:https://access.redhat.com/articles/425823[deadline] IO scheduler is the
default for all block devices except SATA disks. For SATA disks, the default IO
scheduler is *cfq*.
////

Therefore, consider in advance how much data you need to store and how much application log data you are
aggregating. Some Elasticsearch users have found that it
is necessary to
link:https://signalfx.com/blog/how-we-monitor-and-run-elasticsearch-at-scale/[keep
absolute storage consumption around 50% and below 70% at all times]. This
helps to avoid Elasticsearch becoming unresponsive during large merge
operations.

By default, at 85% disk usage Elasticsearch stops allocating new shards to the node, and at 90% it starts relocating
existing shards from that node to other nodes, if possible. If no node has
disk usage below the 85% threshold, Elasticsearch effectively rejects the creation of new indices
and the cluster status becomes RED.

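These thresholds correspond to Elasticsearch's disk-based shard allocation settings. The following sketch shows the matching defaults, assuming the standard Elasticsearch setting names in `elasticsearch.yml`:

[source,yaml]
----
# Elasticsearch disk watermark defaults (sketch, matching the behavior described above)
cluster.routing.allocation.disk.watermark.low: "85%"  # stop allocating new shards to this node
cluster.routing.allocation.disk.watermark.high: "90%" # begin relocating shards away from this node
----
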
[NOTE]
====
These low and high watermark values are Elasticsearch defaults in the current release. You can modify these values,
but you must then apply the same modifications to the alerts, because the alerts are based
on these defaults.
====