:_mod-docs-content-type: ASSEMBLY
[id="hosted-control-planes-release-notes"]
= {hcp-capital} release notes
include::_attributes/common-attributes.adoc[]
:context: hosted-control-planes-release-notes

toc::[]
Release notes contain information about new and deprecated features, changes, and known issues.
[id="hcp-4-19-release-notes_{context}"]
== {hcp-capital} release notes for {product-title} 4.19
With this release, {hcp} for {product-title} 4.19 is available. {hcp-capital} for {product-title} 4.19 supports {mce} version 2.9.
[id="hcp-4-19-new-features-and-enhancements_{context}"]
=== New features and enhancements
[id="hcp-4-19-custom-dns_{context}"]
==== Defining a custom DNS name
Cluster administrators can now define a custom DNS name for a hosted cluster, which provides more flexibility in how DNS names are managed and used. For more information, see xref:../hosted_control_planes/hcp-deploy/hcp-deploy-bm.adoc#hcp-custom-dns_hcp-deploy-bm[Defining a custom DNS name].
[id="hcp-4-19-aws-tags_{context}"]
==== Adding or updating {aws-short} tags for hosted clusters
Cluster administrators can add or update {aws-first} tags for several different types of resources. For more information, see xref:../hosted_control_planes/hcp-manage/hcp-manage-aws.adoc#hcp-aws-tags_hcp-managing-aws[Adding or updating {aws-short} tags for a hosted cluster].
[id="hcp-4-19-auto-dr-oadp_{context}"]
==== Automated disaster recovery for a hosted cluster by using {oadp-short}
On bare metal or {aws-first} platforms, you can automate disaster recovery for a hosted cluster by using {oadp-first}. For more information, see xref:../hosted_control_planes/hcp_high_availability/hcp-disaster-recovery-oadp-auto.adoc#hcp-disaster-recovery-oadp-auto[Automated disaster recovery for a hosted cluster by using OADP].
[id="hcp-4-19-dr-agent_{context}"]
==== Disaster recovery for a hosted cluster on a bare-metal platform
For hosted clusters on a bare-metal platform, you can complete disaster recovery tasks by using {oadp-short}, including backing up the data plane and control plane workloads and restoring either to the same management cluster or to a new management cluster. For more information, see xref:../hosted_control_planes/hcp_high_availability/hcp-disaster-recovery-oadp.adoc#hcp-disaster-recovery-oadp[Disaster recovery for a hosted cluster by using OADP].
[id="hcp-4-19-openstack_{context}"]
==== {hcp-capital} on {rh-openstack-first} 17.1 (Technology Preview)
{hcp-capital} on {rh-openstack} 17.1 is now supported as a Technology Preview feature.
For more information, see xref:../hosted_control_planes/hcp-deploy/hcp-deploy-openstack.adoc#hosted-clusters-openstack-prerequisites_hcp-deploy-openstack[Deploying {hcp} on OpenStack].
[id="hcp-4-19-np-capacity-blocks_{context}"]
==== Configuring node pool capacity blocks on {aws-short}
You can now configure node pool capacity blocks for {hcp} on {aws-first}. For more information, see xref:../hosted_control_planes/hcp-manage/hcp-manage-aws.adoc#hcp-np-capacity-blocks_hcp-managing-aws[Configuring node pool capacity blocks on {aws-short}].
[id="bug-fixes-hcp-rn-4-19_{context}"]
=== Bug fixes
//FYI - OCPBUGS-56792 is a duplicate of this bug
* Previously, when an IDMS or ICSP in the management OpenShift cluster defined a source that pointed to registry.redhat.io or registry.redhat.io/redhat, and the mirror registry did not contain the required OLM catalog images, provisioning for the `HostedCluster` resource stalled due to unauthorized image pulls. As a consequence, the `HostedCluster` resource was not deployed, and it remained blocked because it could not pull essential catalog images from the mirrored registry.
+
With this release, if a required image cannot be pulled due to authorization errors, the provisioning now explicitly fails. The logic for registry override is improved to allow matches on the root of the registry, such as registry.redhat.io, for OLM CatalogSource image resolution. A fallback mechanism is also introduced to use the original `ImageReference` if the registry override does not yield a working image.
+
As a result, the `HostedCluster` resource can be deployed successfully, even in scenarios where the mirror registry lacks the required OLM catalog images, as the system correctly falls back to pulling from the original source when appropriate. (link:https://issues.redhat.com/browse/OCPBUGS-56492[OCPBUGS-56492])
* Previously, the control plane controller did not properly select the correct CVO manifests for a feature set. As a consequence, the incorrect CVO manifests for a feature set might have been deployed for hosted clusters. In practice, CVO manifests never differed between feature sets, so this issue had no actual impact. With this release, the control plane controller properly selects the correct CVO manifests for a feature set. As a result, the correct CVO manifests for a feature set are deployed for the hosted cluster. (link:https://issues.redhat.com/browse/OCPBUGS-44438[OCPBUGS-44438])
* Previously, when you set a secure proxy for a `HostedCluster` resource that served a certificate signed by a custom CA, that CA was not included in the initial ignition configuration for the node. As a result, the node did not boot due to failed ignition. This release fixes the issue by including the trusted CA for the proxy in the initial ignition configuration, which results in a successful node boot and ignition. (link:https://issues.redhat.com/browse/OCPBUGS-56896[OCPBUGS-56896])
* Previously, the IDMS or ICSP resources from the management cluster were processed without considering that a user might specify only the root registry name as a mirror or source for image replacement. As a consequence, any IDMS or ICSP entries that used only the root registry name did not work as expected. With this release, the mirror replacement logic now correctly handles cases where only the root registry name is provided. As a result, the issue no longer occurs, and root registry mirror replacements are now supported. For an illustration of a root registry mirror entry, see the first sketch after this list. (link:https://issues.redhat.com/browse/OCPBUGS-55693[OCPBUGS-55693])
* Previously, the OADP plugin looked for the `DataUpload` object in the wrong namespace. As a consequence, the backup process was stalled indefinitely. In this release, the plugin uses the source namespace of the backup object, so this problem no longer occurs. (link:https://issues.redhat.com/browse/OCPBUGS-55469[OCPBUGS-55469])
* Previously, the SAN of the custom certificate that the user added to the `hc.spec.configuration.apiServer.servingCerts.namedCertificates` field conflicted with the hostname that was set in the `hc.spec.services.servicePublishingStrategy` field for the Kubernetes API server (KAS). As a consequence, the KAS certificate was not added to the set of certificates to generate a new payload, and any new nodes that attempted to join the `HostedCluster` resource had issues with certificate validation. This release adds a validation step to fail earlier and warn the user about the issue, so that the problem no longer occurs. For an illustration of the conflicting configuration, see the second sketch after this list. (link:https://issues.redhat.com/browse/OCPBUGS-53261[OCPBUGS-53261])
* Previously, when you created a hosted cluster in a shared VPC, the private link controller sometimes failed to assume the shared VPC role to manage the VPC endpoints in the shared VPC. With this release, a client is created for every reconciliation in the private link controller so that you can recover from invalid clients. As a result, the hosted cluster endpoints and the hosted cluster are created successfully. (link:https://issues.redhat.com/browse/OCPBUGS-45184[OCPBUGS-45184])
* Previously, ARM64 architecture was not allowed in the `NodePool` API on the Agent platform. As a consequence, you could not deploy heterogeneous clusters on the Agent platform. In this release, the API allows ARM64-based `NodePool` resources on the Agent platform. (link:https://issues.redhat.com/browse/OCPBUGS-46342[OCPBUGS-46342])
* Previously, the HyperShift Operator always validated the subject alternative names (SANs) for the Kubernetes API server. With this release, the Operator validates the SANs only if PKI reconciliation is enabled. (link:https://issues.redhat.com/browse/OCPBUGS-56562[OCPBUGS-56562])
* Previously, in a hosted cluster that existed for more than 1 year, when the internal serving certificates were renewed, the control plane workloads did not restart to pick up the renewed certificates. As a consequence, the control plane became degraded. With this release, when certificates are renewed, the control plane workloads are automatically restarted. As a result, the control plane remains stable. (link:https://issues.redhat.com/browse/OCPBUGS-52331[OCPBUGS-52331])
* Previously, when you created a validating webhook on a resource that the OpenShift OAuth API server managed, such as a user or a group, the validating webhook was not executed. This release fixes the communication between the OpenShift OAuth API server and the data plane by adding a `Konnectivity` proxy sidecar. As a result, the process to validate webhooks on users and groups works as expected. (link:https://issues.redhat.com/browse/OCPBUGS-52190[OCPBUGS-52190])
* Previously, when the `HostedCluster` resource was not available, the reason was not propagated correctly from the `HostedControlPlane` resource in the condition. The `Status` and the `Message` information was propagated for the `Available` condition in the `HostedCluster` custom resource, but the `Reason` value was not propagated. In this release, the reason is also propagated, so you have more information to identify the root cause of unavailability. (link:https://issues.redhat.com/browse/OCPBUGS-50907[OCPBUGS-50907])
* Previously, the `managed-trust-bundle` volume mount and the `trusted-ca-bundle-managed` config map were introduced as mandatory components. This requirement caused deployment failures if you used your own Public Key Infrastructure (PKI), because the OpenShift API server expected the presence of the `trusted-ca-bundle-managed` config map. To address this issue, these components are now optional, so that clusters can deploy successfully without the `trusted-ca-bundle-managed` config map when you are using a custom PKI. (link:https://issues.redhat.com/browse/OCPBUGS-52323[OCPBUGS-52323])
* Previously, there was no way to verify that an `IBMPowerVSImage` resource was deleted, which led to unnecessary cluster retrieval attempts. As a consequence, hosted clusters on {ibm-power-server-title} were stuck in the destroy state. In this release, you can retrieve and process a cluster that is associated with an image only if the image is not in the process of being deleted. (link:https://issues.redhat.com/browse/OCPBUGS-46037[OCPBUGS-46037])
* Previously, when you created a cluster with secure proxy enabled and set the certificate configuration in `configuration.proxy.trustedCA`, the cluster installation failed. In addition, the OpenShift OAuth API server could not use the management cluster proxy to reach cloud APIs. This release introduces fixes to prevent these issues. (link:https://issues.redhat.com/browse/OCPBUGS-51050[OCPBUGS-51050])
* Previously, both the `NodePool` controller and the cluster API controller set the `updatingConfig` status condition on the `NodePool` custom resource. As a consequence, the `updatingConfig` status was constantly changing. With this release, the logic to update the `updatingConfig` status is consolidated between the two controllers. As a result, the `updatingConfig` status is correctly set. (link:https://issues.redhat.com/browse/OCPBUGS-45322[OCPBUGS-45322])
* Previously, the process to validate the container image architecture did not pass through the image metadata provider. As a consequence, image overrides did not take effect, and disconnected deployments failed. In this release, the methods for the image metadata provider are modified to allow multi-architecture validations, and are propagated through all components for image validation. As a result, the issue is resolved. (link:https://issues.redhat.com/browse/OCPBUGS-44655[OCPBUGS-44655])
* Previously, the `--goaway-chance` flag for the Kubernetes API Server was not configurable. The default value for the flag was `0`. With this release, you can change the value for the `--goaway-chance` flag by adding an annotation to the `HostedCluster` custom resource. (link:https://issues.redhat.com/browse/OCPBUGS-54863[OCPBUGS-54863])
* Previously, on instances of Red{nbsp}Hat OpenShift on {ibm-cloud-title} that are based on {hcp}, in non-OVN clusters, the Cluster Network Operator could not patch service monitors and Prometheus rules in the `monitoring.coreos.com` API group. As a consequence, the Cluster Network Operator logs showed permissions errors and "could not apply" messages. With this release, permissions for service monitors and Prometheus rules are added in the Cluster Network Operator for non-OVN clusters. As a result, the Cluster Network Operator logs no longer show permissions errors. (link:https://issues.redhat.com/browse/OCPBUGS-54178[OCPBUGS-54178])
* Previously, if you tried to use the {hcp} command-line interface (CLI) to create a disconnected cluster, the creation failed because the CLI could not access the payload. With this release, the release payload architecture check is skipped in disconnected environments because the registry where it is hosted is not usually accessible from the machine where the CLI runs. As a result, you can now use the CLI to create a disconnected cluster. (link:https://issues.redhat.com/browse/OCPBUGS-47715[OCPBUGS-47715])
* Previously, when the Control Plane Operator checked API endpoint availability, the Operator did not honor the `*_PROXY` variables that were set. As a consequence, HTTP requests to validate the Kubernetes API server failed in environments where egress traffic was blocked except through a proxy, and the hosted control plane and hosted cluster were not available. With this release, this issue is resolved. (link:https://issues.redhat.com/browse/OCPBUGS-49913[OCPBUGS-49913])
* Previously, when you used the {hcp} CLI (`hcp`) to create a hosted cluster, you could not configure the etcd storage size. As a consequence, the disk size was not sufficient for some larger clusters. With this release, you can set the etcd storage size by setting a flag when you create your `HostedCluster` resource. The flag was initially added to help the OpenShift performance team with testing higher `NodePool` resources on ROSA with HCP. As a result, you can now set the etcd storage size when you create a hosted cluster by using the `hcp` CLI. For an illustration of the corresponding `HostedCluster` field, see the third sketch after this list. (link:https://issues.redhat.com/browse/OCPBUGS-52655[OCPBUGS-52655])
* Previously, if you tried to update a hosted cluster that used in-place updates, the proxy variables were not honored and the update failed. With this release, the pod that performs in-place upgrades honors the cluster proxy settings. As a result, updates now work for hosted clusters that use in-place updates. (link:https://issues.redhat.com/browse/OCPBUGS-48540[OCPBUGS-48540])
* Previously, the liveness and readiness probes that are associated with the OpenShift API server in {hcp} were misaligned with the probes that are used in installer-provisioned infrastructure. This release updates the liveness and readiness probes to use the `/livez` and `/readyz` endpoints instead of the `/healthz` endpoint. (link:https://issues.redhat.com/browse/OCPBUGS-54819[OCPBUGS-54819])
* Previously, the Konnectivity agent on a hosted control plane did not have a readiness check. This release adds a readiness probe to the Konnectivity agent to indicate pod readiness when the connection to the Konnectivity server drops. (link:https://issues.redhat.com/browse/OCPBUGS-49611[OCPBUGS-49611])
* Previously, when the HyperShift Operator was scoped to a subset of hosted clusters and node pools, the Operator did not properly clean up token and user data secrets in control plane namespaces. As a consequence, secrets accumulated. With this release, the Operator properly cleans up secrets. (link:https://issues.redhat.com/browse/OCPBUGS-54272[OCPBUGS-54272])
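
The following sketches illustrate some of the configurations that are referenced in the preceding bug fixes. They are minimal examples with hypothetical names and values, not complete configurations.

For link:https://issues.redhat.com/browse/OCPBUGS-55693[OCPBUGS-55693], a mirror entry that uses only the root registry name as the source might look like the following `ImageDigestMirrorSet` sketch. The resource name and the mirror registry host name are example values.

[source,yaml]
----
apiVersion: config.openshift.io/v1
kind: ImageDigestMirrorSet
metadata:
  name: root-registry-mirror # example name
spec:
  imageDigestMirrors:
  - source: registry.redhat.io # root registry name only, with no repository path
    mirrors:
    - "mirror.example.com:5000" # replace with your mirror registry
----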
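
For link:https://issues.redhat.com/browse/OCPBUGS-53261[OCPBUGS-53261], the following `HostedCluster` sketch shows, with hypothetical host names and a hypothetical secret name, the kind of overlap between a custom certificate SAN and the KAS host name that the added validation step now reports early.

[source,yaml]
----
spec:
  configuration:
    apiServer:
      servingCerts:
        namedCertificates:
        - names:
          - api.hypershift.example.com # SAN that overlaps with the KAS host name below
          servingCertificate:
            name: custom-api-cert # example secret name
  services:
  - service: APIServer
    servicePublishingStrategy:
      type: Route
      route:
        hostname: api.hypershift.example.com # KAS host name
----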
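
For link:https://issues.redhat.com/browse/OCPBUGS-52655[OCPBUGS-52655], the etcd storage size appears in the `HostedCluster` resource as shown in the following sketch, assuming a managed etcd with persistent volume storage. The size value is an example only.

[source,yaml]
----
spec:
  etcd:
    managementType: Managed
    managed:
      storage:
        type: PersistentVolume
        persistentVolume:
          size: 64Gi # example etcd storage size
----
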
[id="known-issues-hcp-rn-4-19_{context}"]
=== Known issues
* If the annotation and the `ManagedCluster` resource name do not match, the {mce} console displays the cluster as `Pending import`. The cluster cannot be used by the {mce-short}. The same issue happens when there is no annotation and the `ManagedCluster` name does not match the `Infra-ID` value of the `HostedCluster` resource.
* When you use the {mce} console to add a new node pool to an existing hosted cluster, the same version of {product-title} might appear more than once in the list of options. You can select any instance in the list for the version that you want.
* When a node pool is scaled down to 0 workers, the list of hosts in the console still shows nodes in a `Ready` state. You can verify the number of nodes in two ways:
** In the console, go to the node pool and verify that it has 0 nodes.
** On the command-line interface, run the following commands:
*** Verify that 0 nodes are in the node pool by running the following command:
+
[source,terminal]
----
$ oc get nodepool -A
----
*** Verify that 0 nodes are in the cluster by running the following command:
+
[source,terminal]
----
$ oc get nodes --kubeconfig <hosted_cluster_kubeconfig_file>
----
*** Verify that 0 agents are reported as bound to the cluster by running the following command:
+
[source,terminal]
----
$ oc get agents -A
----
* When you create a hosted cluster in an environment that uses a dual-stack network, you might encounter pods that are stuck in the `ContainerCreating` state. This issue occurs because the `openshift-service-ca-operator` resource cannot generate the `metrics-tls` secret that the DNS pods need for DNS resolution. As a result, the pods cannot resolve the Kubernetes API server. To resolve this issue, configure the DNS server settings for a dual-stack network.
* If you created a hosted cluster in the same namespace as its managed cluster, detaching the managed hosted cluster deletes everything in the managed cluster namespace, including the hosted cluster. The following situations can create a hosted cluster in the same namespace as its managed cluster:
** You created a hosted cluster on the Agent platform through the {mce} console by using the default hosted cluster namespace.
** You created a hosted cluster through the command-line interface or API by specifying the hosted cluster namespace to be the same as the hosted cluster name.
* When you use the console or API to specify an IPv6 address for the `spec.services.servicePublishingStrategy.nodePort.address` field of a hosted cluster, a full IPv6 address with 8 hextets is required. For example, instead of specifying `2620:52:0:1306::30`, you must specify `2620:52:0:1306:0:0:0:30`, as shown in the sketch that follows this list.
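
The following sketch shows, with example values, how the fully expanded IPv6 address might appear in the `HostedCluster` services configuration. The service entry and the port number are illustrative only.

[source,yaml]
----
spec:
  services:
  - service: APIServer
    servicePublishingStrategy:
      type: NodePort
      nodePort:
        address: "2620:52:0:1306:0:0:0:30" # full IPv6 address with 8 hextets
        port: 30000 # example port number
----
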
[id="hcp-tech-preview-features_{context}"]
=== General Availability and Technology Preview features
Some features in this release are currently in Technology Preview. These experimental features are not intended for production use. For more information about the scope of support for these features, see link:https://access.redhat.com/support/offerings/techpreview[Technology Preview Features Support Scope] on the Red{nbsp}Hat Customer Portal.
[IMPORTANT]
====
For {ibm-power-title} and {ibm-z-title}, you must run the control plane on machine types based on 64-bit x86 architecture, and node pools on {ibm-power-title} or {ibm-z-title}.
====
.{hcp-capital} GA and TP tracker
[cols="4,1,1,1",options="header"]
|===
|Feature |4.17 |4.18 |4.19
|{hcp-capital} for {product-title} using non-bare-metal agent machines
|Technology Preview
|Technology Preview
|Technology Preview
|{hcp-capital} for an ARM64 {product-title} cluster on {aws-full}
|General Availability
|General Availability
|General Availability
|{hcp-capital} for {product-title} on {ibm-power-title}
|General Availability
|General Availability
|General Availability
|{hcp-capital} for {product-title} on {ibm-z-title}
|General Availability
|General Availability
|General Availability
|{hcp-capital} for {product-title} on {rh-openstack}
|Developer Preview
|Developer Preview
|Technology Preview
|===