openshift-docs/modules/rosa-policy-change-management.adoc


// Module included in the following assemblies:
//
// * rosa_architecture/rosa_policy_service_definition/rosa-policy-shared-responsibility.adoc

:_mod-docs-content-type: CONCEPT
[id="rosa-policy-change-management_{context}"]
= Change management

This section describes the policies about how cluster and configuration changes, patches, and releases are managed.

Red{nbsp}Hat is responsible for enabling changes to the cluster infrastructure and services that the customer will control, as well as maintaining versions for the control plane nodes, infrastructure nodes and services, and worker nodes. AWS is responsible for protecting the hardware infrastructure that runs all of the services offered in the
AWS Cloud. The customer is responsible for initiating infrastructure change requests and installing and maintaining optional services and networking configurations on the cluster, as well as all changes to customer data and customer applications.

[id="rosa-policy-customer-initiated-changes_{context}"]
== Customer-initiated changes

You can initiate changes using self-service capabilities such as cluster deployment, worker node scaling, or cluster deletion.

Change history is captured in the *Cluster History* section in the OpenShift Cluster Manager *Overview tab*, and is available for you to view. The change history includes, but is not limited to, logs from the following changes:

* Adding or removing identity providers
* Adding or removing users to or from the `dedicated-admins` group
* Scaling the cluster compute nodes
* Scaling the cluster load balancer
* Scaling the cluster persistent storage
* Upgrading the cluster

You can implement a maintenance exclusion by avoiding changes in {cluster-manager} for the following components:

* Deleting a cluster
* Adding, modifying, or removing identity providers
* Adding, modifying, or removing a user from an elevated group
* Installing or removing add-ons
* Modifying cluster networking configurations
* Adding, modifying, or removing machine pools
* Enabling or disabling user workload monitoring
* Initiating an upgrade

[IMPORTANT]
====
To enforce the maintenance exclusion, ensure machine pool autoscaling or automatic upgrade policies have been disabled. After the maintenance exclusion has been lifted, proceed with enabling machine pool autoscaling or automatic upgrade policies as desired.
====

[id="rosa-policy-red-hat-initiated-changes_{context}"]
== Red{nbsp}Hat-initiated changes

Red{nbsp}Hat site reliability engineering (SRE) manages the infrastructure, code, and configuration of {product-title} using a GitOps workflow and fully automated CI/CD pipelines. This process ensures that Red{nbsp}Hat can safely introduce service improvements on a continuous basis without negatively impacting customers.

Every proposed change undergoes a series of automated verifications immediately upon check-in. Changes are then deployed to a staging environment where they undergo automated integration testing. Finally, changes are deployed to the production environment. Each step is fully automated.

An authorized Red{nbsp}Hat SRE reviewer must approve advancement to each step. The reviewer cannot be the same individual who proposed the change. All changes and approvals are fully auditable as part of the GitOps workflow.

Some changes are released to production incrementally, using feature flags to control availability of new features to specified clusters or customers, such as private or public previews.

[id="rosa-policy-patch-management_{context}"]
== Patch management

OpenShift Container Platform software and the underlying immutable Red{nbsp}Hat CoreOS (RHCOS) operating system image are patched for bugs and vulnerabilities in regular z-stream upgrades. Read more about link:https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html/architecture/architecture-rhcos[RHCOS architecture] in the OpenShift Container Platform documentation.

[id="rosa-policy-release-management_{context}"]
== Release management

Red{nbsp}Hat does not automatically upgrade your clusters. You can schedule to upgrade the clusters at regular intervals (recurring upgrade) or just once (individual upgrade) using the {cluster-manager} web console. Red{nbsp}Hat might forcefully upgrade a cluster to a new z-stream version only if the cluster is affected by a critical impact CVE.

[NOTE]
====
Because the required permissions can change between y-stream releases, the AWS managed policies are automatically updated before an upgrade can be performed.
====

ifndef::openshift-dedicated[]
You can review the history of all cluster upgrade events in the {cluster-manager} web console.
endif::openshift-dedicated[]
ifdef::openshift-dedicated[]
You can review the history of all cluster upgrade events in the {cluster-manager} web console. For more information about releases, see the link:https://access.redhat.com/support/policy/updates/openshift/dedicated[Life Cycle policy].
endif::openshift-dedicated[]

[id="rosa-policy-resource-responsibilities_{context}"]
== Service and Customer resource responsibilities

The following table defines the responsibilities for cluster resources.

[cols="2a,3a,3a",options="header"]
|===

|Resource
|Service responsibilities
|Customer responsibilities

|Logging
|**Red{nbsp}Hat**

- Centrally aggregate and monitor platform audit logs.

- Provide and maintain a logging Operator to enable the customer to deploy a logging stack for default application logging.

- Provide audit logs upon customer request.

|- Install the optional default application logging Operator on the cluster.
- Install, configure, and maintain any optional application logging solutions, such as logging sidecar containers or third-party logging applications.
- Tune size and frequency of application logs being produced by customer applications if they are affecting the stability of the logging stack or the cluster.
- Request platform audit logs through a support case for researching specific incidents.

|Application networking
|**Red{nbsp}Hat**

- Set up public load balancers. Provide the ability to set up private load balancers and up to one additional load balancer when required.

- Set up native OpenShift router service. Provide the ability to set the router as private and add up to one additional router shard.

- Install, configure, and maintain OVN-Kubernetes components for default internal pod traffic.

- Provide the ability for the customer to manage `NetworkPolicy` and `EgressNetworkPolicy` (firewall) objects.

|- Configure non-default pod network permissions for project and pod networks, pod ingress, and pod egress using `NetworkPolicy` objects.
- Use {cluster-manager} to request a private load balancer for default application routes.
- Use {cluster-manager} to configure up to one additional public or private router shard and corresponding load balancer.
- Request and configure any additional service load balancers for specific services.
- Configure any necessary DNS forwarding rules.

|Cluster networking
|**Red{nbsp}Hat**

- Set up cluster management components, such as public or private service endpoints and necessary integration with Amazon VPC components.

- Set up internal networking components required for internal cluster communication between worker
clusters and control planes.

|- Configure your firewall to grant access to the required OpenShift and AWS domains and ports before the cluster is provisioned. For more information, see "AWS firewall prerequisites".
- Provide optional non-default IP address ranges for machine CIDR, service CIDR, and pod CIDR if needed through {cluster-manager} when the cluster is provisioned.
- Request that the API service endpoint be made public or private on cluster creation or after cluster creation through {cluster-manager}.
- Create additional Ingress Controllers to publish additional application routes.
- Install, configure, and upgrade optional CNI plugins if clusters are installed without the default OpenShift CNI plugins.

|Virtual networking management
|**Red{nbsp}Hat**

- Set up and configure Amazon VPC components required to provision the cluster, such as subnets, load balancers, internet gateways, and NAT gateways.

- Provide the ability for the customer to
manage AWS VPN connectivity with on-premise resources, Amazon VPC-to-VPC connectivity, and AWS Direct Connect as required through  {cluster-manager}.

- Enable customers to create and deploy AWS load balancers for use with service load balancers.

|- Set up and maintain optional Amazon VPC components, such as Amazon VPC-to-VPC connection, AWS VPN connection, or AWS Direct Connect.
- Request and configure any additional service load balancers for specific services.

|Virtual compute management
|**Red{nbsp}Hat**

- Set up and configure the ROSA control plane and data plane to use Amazon EC2 instances for cluster compute.

- Monitor and manage the deployment of Amazon EC2 control plane and infrastructure nodes on the cluster.

|- Monitor and manage Amazon EC2 worker nodes by creating a
machine pool using the OpenShift Cluster Manager or the ROSA CLI (`rosa`).
- Manage changes to customer-deployed applications and application data.

|Cluster version
|**Red{nbsp}Hat**

- Enable upgrade scheduling process.

- Monitor upgrade progress and remedy any issues encountered.

- Publish change logs and release notes for patch release upgrades.

|- Either set up automatic upgrades or schedule patch release upgrades immediately or for the future.
- Acknowledge and schedule minor version upgrades.
- Test customer applications on patch releases to ensure compatibility.

|Capacity management
|**Red{nbsp}Hat**

- Monitor the use of the control plane.
ifndef::openshift-rosa-hcp[]
Control planes include control plane nodes and infrastructure nodes.
endif::openshift-rosa-hcp[]
- Scale and resize control plane to maintain quality of service.

| - Monitor worker node utilization and, if appropriate, enables the auto-scaling feature.
- Determine the scaling strategy of the cluster. See the additional resources for more information on machine pools.
- Use the provided {cluster-manager} controls to add or remove additional worker nodes as required.
- Respond to Red{nbsp}Hat notifications regarding cluster resource requirements.

|Virtual storage management
|**Red{nbsp}Hat**

- Set up and configure Amazon EBS to provision local node storage and persistent volume storage for the cluster.

- Set up and configure the built-in image registry to use Amazon S3 bucket storage. ^[1]^

- Regularly prune image registry resources in
Amazon S3 to optimize Amazon S3 usage and cluster performance. ^[2]^

| - Optionally configure the Amazon EBS CSI driver or the Amazon
EFS CSI driver to provision persistent volumes on the cluster.

|AWS software (public AWS services)
|**AWS**

**Compute:** Provide the Amazon EC2 service, used for ROSA relevant resources.

**Storage:** Provide Amazon EBS, used by ROSA to provision local node storage and persistent volume storage for the cluster.

**Storage:** Provide Amazon S3, used for the ROSA
built-in image registry.

**Networking:**
Provide the following AWS Cloud services, used by ROSA
to satisfy virtual networking
infrastructure needs:

** Amazon VPC
** Elastic Load Balancing
** AWS IAM
** AWS STS

**Networking:**
Provide the following AWS services, which customers can optionally integrate with ROSA:

- AWS VPN
- AWS Direct Connect
- AWS PrivateLink
- AWS Transit Gateway

| - Sign requests using an access key ID and secret access key
associated with an IAM principal or STS temporary security
credentials.
- Specify VPC subnets for the cluster to use during cluster
creation.
- Optionally configure a customer-managed VPC for use with ROSA clusters (required for PrivateLink and HCP clusters).

|Hardware/AWS global infrastructure
|**AWS**

- For information regarding  management controls for AWS data centers, see link:https://aws.amazon.com/compliance/data-center/controls[Our Controls] on the AWS Cloud Security page.

- For information regarding change management best practices, see link:https://aws.amazon.com/solutions/guidance/change-management-on-aws/[Guidance for Change Management on AWS] in the AWS Solutions Library.

|- Implement change management best practices for customer
applications and data hosted on the AWS Cloud.

|===

1. For more information on authentication flow for AWS STS, see link:https://docs.redhat.com/en/documentation/openshift_container_platform/latest/html/authentication_and_authorization/managing-cloud-provider-credentials#cco-short-term-creds-auth-flow-aws-diagram_cco-short-term-creds[Authentication flow for AWS STS].

2. For more information on pruning images, see link:https://docs.redhat.com/en/documentation/openshift_container_platform/latest/html/registry/registry-overview-1#pruning-images_registry-overview[Automatically pruning Images].