mirror of
https://github.com/openshift/openshift-docs.git
synced 2026-02-06 06:46:26 +01:00
100 lines
3.9 KiB
Plaintext
100 lines
3.9 KiB
Plaintext
:_content-type: ASSEMBLY
|
|
[id="serverless-autoscaling-developer"]
|
|
= Autoscaling
|
|
include::modules/common-attributes.adoc[]
|
|
include::modules/serverless-document-attributes.adoc[]
|
|
:context: serverless-autoscaling-developer
|
|
|
|
toc::[]
|
|
|
|
Knative Serving provides automatic scaling, or _autoscaling_, for applications to match incoming demand. For example, if an application is receiving no traffic, and scale-to-zero is enabled, Knative Serving scales the application down to zero replicas. If scale-to-zero is disabled, the application is scaled down to the xref:../../serverless/develop/serverless-autoscaling-developer.adoc#serverless-autoscaling-developer-minscale[minimum number of replicas specified for applications on the cluster]. Replicas can also be scaled up to meet demand if traffic to the application increases.
|
|
|
|
If Knative autoscaling is enabled for your cluster, you can configure concurrency and scale bounds for your application.
|
|
|
|
[NOTE]
|
|
====
|
|
Any limits or targets set in the revision template are measured against a single instance of your application. For example, setting the `target` annotation to `50` configures the autoscaler to scale the application so that each revision handles 50 requests at a time.
|
|
====
|
|
|
|
[id="serverless-autoscaling-developer-scale-bounds"]
|
|
== Scale bounds
|
|
|
|
Scale bounds determine the minimum and maximum numbers of replicas that can serve an application at any given time.
|
|
|
|
You can set scale bounds for an application to help prevent cold starts or control computing costs.
|
|
|
|
[id="serverless-autoscaling-developer-minscale"]
|
|
=== Minimum scale bounds
|
|
|
|
The minimum number of replicas that can serve an application is determined by the `minScale` annotation.
|
|
|
|
The `minScale` value defaults to `0` replicas if the following conditions are met:
|
|
|
|
* The `minScale` annotation is not set
|
|
* Scaling to zero is enabled
|
|
* The class `KPA` is used
|
|
|
|
If scale to zero is not enabled, the `minScale` value defaults to `1`.
|
|
|
|
// TODO: Document KPA if supported, link to docs about setting class
|
|
|
|
// TO DO:
|
|
// Add info / links about enabling and disabling autoscaling (admin docs)
|
|
// if `enable-scale-to-zero` is set to `false` in the `config-autoscaler` config map.
|
|
|
|
.Example service spec with `minScale` spec
|
|
[source,yaml]
|
|
----
|
|
apiVersion: serving.knative.dev/v1
|
|
kind: Service
|
|
metadata:
|
|
name: example-service
|
|
namespace: default
|
|
spec:
|
|
template:
|
|
metadata:
|
|
annotations:
|
|
autoscaling.knative.dev/minScale: "0"
|
|
...
|
|
----
|
|
|
|
include::modules/serverless-autoscaling-minscale-kn.adoc[leveloffset=+3]
|
|
|
|
[id="serverless-autoscaling-developer-maxscale"]
|
|
=== Maximum scale bounds
|
|
|
|
The maximum number of replicas that can serve an application is determined by the `maxScale` annotation. If the `maxScale` annotation is not set, there is no upper limit for the number of replicas created.
|
|
|
|
.Example service spec with `maxScale` spec
|
|
[source,yaml]
|
|
----
|
|
apiVersion: serving.knative.dev/v1
|
|
kind: Service
|
|
metadata:
|
|
name: example-service
|
|
namespace: default
|
|
spec:
|
|
template:
|
|
metadata:
|
|
annotations:
|
|
autoscaling.knative.dev/maxScale: "10"
|
|
...
|
|
----
|
|
|
|
include::modules/serverless-autoscaling-maxscale-kn.adoc[leveloffset=+3]
|
|
|
|
[id="serverless-autoscaling-developer-concurrency"]
|
|
== Concurrency
|
|
|
|
Concurrency determines the number of simultaneous requests that can be processed by each replica of an application at any given time.
|
|
|
|
include::modules/serverless-concurrency-limits.adoc[leveloffset=+2]
|
|
include::modules/serverless-concurrency-limits-configure-soft.adoc[leveloffset=+2]
|
|
include::modules/serverless-concurrency-limits-configure-hard.adoc[leveloffset=+2]
|
|
include::modules/serverless-target-utilization.adoc[leveloffset=+2]
|
|
|
|
[id="additional-resources_serverless-autoscaling-developer"]
|
|
== Additional resources
|
|
|
|
* Scale-to-zero can be enabled or disabled for the cluster by cluster administrators. For more information, see xref:../../serverless/admin_guide/serverless-admin-autoscaling.adoc#serverless-enable-scale-to-zero_serverless-admin-autoscaling[Enabling scale-to-zero].
|