
Merge pull request #96015 from openshift-cherrypick-robot/cherry-pick-94958-to-enterprise-4.19

[enterprise-4.19] [OSDOCS-11124]: Add automated backup/restore with OADP docs
This commit is contained in:
Laura Hinson
2025-08-01 16:14:58 -04:00
committed by GitHub
9 changed files with 413 additions and 1 deletions

View File

@@ -2572,6 +2572,8 @@ Topics:
File: hcp-disaster-recovery-aws
- Name: Disaster recovery for a hosted cluster by using OADP
File: hcp-disaster-recovery-oadp
- Name: Automated disaster recovery for a hosted cluster by using OADP
File: hcp-disaster-recovery-oadp-auto
- Name: Authentication and authorization for hosted control planes
File: hcp-authentication-authorization
- Name: Handling machine configuration for hosted control planes

View File

@@ -0,0 +1,58 @@
:_mod-docs-content-type: ASSEMBLY
[id="hcp-disaster-recovery-oadp-auto"]
= Automated disaster recovery for a hosted cluster by using {oadp-short}
include::_attributes/common-attributes.adoc[]
:context: hcp-disaster-recovery-oadp-auto
toc::[]
For hosted clusters on bare-metal or {aws-first} platforms, you can automate some backup and restore steps by using the {oadp-first} Operator.
The process involves the following steps:
. Configuring {oadp-short}
. Defining a Data Protection Application (DPA)
. Backing up the data plane workload
. Backing up the control plane workload
. Restoring a hosted cluster by using {oadp-short}
[id="hcp-auto-dr-prereqs_{context}"]
== Prerequisites
You must meet the following prerequisites on the management cluster:
* You xref:../../backup_and_restore/application_backup_and_restore/installing/oadp-installing-operator.adoc#oadp-installing-operator[installed the {oadp-short} Operator].
* You created a storage class.
* You have access to the cluster with `cluster-admin` privileges.
* You have access to the {oadp-short} subscription through a catalog source.
* You have access to a cloud storage provider that is compatible with {oadp-short}, such as S3, {azure-full}, {gcp-full}, or MinIO.
* In a disconnected environment, you have access to a self-hosted storage provider that is compatible with {oadp-short}, for example link:https://docs.redhat.com/en/documentation/red_hat_openshift_data_foundation/[{odf-full}] or link:https://min.io/[MinIO].
* Your {hcp} pods are up and running.
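To confirm that the {hcp} pods are running, you can list the pods in the hosted control plane namespace. The following check is a minimal sketch; the namespace name, for example `clusters-hosted`, depends on your environment:
[source,terminal]
----
$ oc get pods -n <hosted_control_plane_namespace>
----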
include::modules/hcp-dr-prep-oadp-auto.adoc[leveloffset=+1]
[role="_additional-resources"]
.Additional resources
* xref:../../backup_and_restore/application_backup_and_restore/installing/installing-oadp-mcg.adoc#installing-oadp-mcg[Configuring the {oadp-full} with Multicloud Object Gateway]
* xref:../../backup_and_restore/application_backup_and_restore/installing/installing-oadp-aws.adoc#installing-oadp-aws[Configuring the {oadp-full} with AWS S3 compatible storage]
include::modules/hcp-dr-oadp-dpa.adoc[leveloffset=+1]
[id="backing-up-data-plane-oadp-auto_{context}"]
== Backing up the data plane workload
To back up the data plane workload by using the {oadp-short} Operator, see "Backing up applications". If the data plane workload is not important, you can skip this procedure.
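The following example is a minimal sketch of a data plane `Backup` CR, assuming that your applications run in a namespace named `<application_namespace>`; the placeholder names are illustrative only, and the complete procedure is described in "Backing up applications":
[source,yaml]
----
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: <data_plane_backup_name>
  namespace: openshift-adp
spec:
  includedNamespaces:
  - <application_namespace>
  storageLocation: default
  ttl: 2h0m0s
----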
[role="_additional-resources"]
.Additional resources
* xref:../../backup_and_restore/application_backup_and_restore/backing_up_and_restoring/backing-up-applications.adoc#backing-up-applications[Backing up applications]
include::modules/hcp-dr-oadp-backup-cp-workload-auto.adoc[leveloffset=+1]
include::modules/hcp-dr-oadp-restore-auto.adoc[leveloffset=+1]
include::modules/hcp-dr-oadp-observe.adoc[leveloffset=+1]
include::modules/hcp-dr-oadp-observe-velero.adoc[leveloffset=+1]

View File

@@ -99,4 +99,4 @@ include::modules/hcp-dr-oadp-restore-new-mgmt.adoc[leveloffset=+2]
include::modules/hcp-dr-oadp-observe.adoc[leveloffset=+1]
include::modules/hcp-dr-oadp-observe-velero.adoc[leveloffset=+1]

View File

@@ -0,0 +1,102 @@
// Module included in the following assemblies:
//
// * hosted_control_planes/hcp-disaster-recovery-oadp-auto.adoc
:_mod-docs-content-type: REFERENCE
[id="hcp-dr-oadp-backup-cp-workload-auto_{context}"]
= Backing up the control plane workload
You can back up the control plane workload by creating the `Backup` custom resource (CR).
To monitor and observe the backup process, see "Observing the backup and restore process".
.Procedure
. Create a YAML file that defines the `Backup` CR:
+
.Example `backup-control-plane.yaml` file
[%collapsible]
====
[source,yaml]
----
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: <backup_resource_name> <1>
  namespace: openshift-adp
  labels:
    velero.io/storage-location: default
spec:
  hooks: {}
  includedNamespaces: <2>
  - <hosted_cluster_namespace> <3>
  - <hosted_control_plane_namespace> <4>
  includedResources:
  - sa
  - role
  - rolebinding
  - pod
  - pvc
  - pv
  - bmh
  - configmap
  - infraenv <5>
  - priorityclasses
  - pdb
  - agents
  - hostedcluster
  - nodepool
  - secrets
  - services
  - deployments
  - hostedcontrolplane
  - cluster
  - agentcluster
  - agentmachinetemplate
  - agentmachine
  - machinedeployment
  - machineset
  - machine
  - route
  - clusterdeployment
  excludedResources: []
  storageLocation: default
  ttl: 2h0m0s
  snapshotMoveData: true <6>
  datamover: "velero" <6>
  defaultVolumesToFsBackup: true <7>
----
====
<1> Replace `<backup_resource_name>` with a name for your `Backup` resource.
<2> Specifies the namespaces to back up objects from. You must include your hosted cluster namespace and the hosted control plane namespace.
<3> Replace `<hosted_cluster_namespace>` with the name of the hosted cluster namespace, for example, `clusters`.
<4> Replace `<hosted_control_plane_namespace>` with the name of the hosted control plane namespace, for example, `clusters-hosted`.
<5> You must create the `infraenv` resource in a separate namespace. Do not delete the `infraenv` resource during the backup process.
<6> Enables CSI volume snapshots and automatically uploads the control plane workload to cloud storage.
<7> Sets `fs-backup` as the default backup method for persistent volumes (PVs). This setting is useful when you use a combination of Container Storage Interface (CSI) volume snapshots and the `fs-backup` method.
+
[NOTE]
====
If you want to use CSI volume snapshots, you must add the `backup.velero.io/backup-volumes-excludes=<pv_name>` annotation to your PVs.
====
. Apply the `Backup` CR by running the following command:
+
[source,terminal]
----
$ oc apply -f backup-control-plane.yaml
----
.Verification
* Verify that the value of the `status.phase` field is `Completed` by running the following command:
+
[source,terminal]
----
$ oc get backups.velero.io <backup_resource_name> -n openshift-adp \
-o jsonpath='{.status.phase}'
----
.Next steps
* Restore the hosted cluster by using {oadp-short}.

View File

@@ -0,0 +1,151 @@
// Module included in the following assemblies:
//
// * hosted_control_planes/hcp-disaster-recovery-oadp-auto.adoc
:_mod-docs-content-type: REFERENCE
[id="hcp-dr-oadp-dpa_{context}"]
= Automating the backup and restore process by using a DPA
You can automate parts of the backup and restore process by using a Data Protection Application (DPA). When you use a DPA, the steps to pause and restart the reconciliation of resources are automated. The DPA defines information such as backup locations and Velero pod configurations.
You can create a DPA by defining a `DataProtectionApplication` object.
.Procedure
* If you use a bare-metal platform, you can create a DPA by completing the following steps:
. Create a manifest file similar to the following example:
+
.Example `dpa.yaml` file
[%collapsible]
====
[source,yaml]
----
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: dpa-sample
  namespace: openshift-adp
spec:
  backupLocations:
  - name: default
    velero:
      provider: aws # <1>
      default: true
      objectStorage:
        bucket: <bucket_name> # <2>
        prefix: <bucket_prefix> # <3>
      config:
        region: minio # <4>
        profile: "default"
        s3ForcePathStyle: "true"
        s3Url: "<bucket_url>" # <5>
        insecureSkipTLSVerify: "true"
      credential:
        key: cloud
        name: cloud-credentials
        default: true
  snapshotLocations:
  - velero:
      provider: aws # <1>
      config:
        region: minio # <4>
        profile: "default"
      credential:
        key: cloud
        name: cloud-credentials
  configuration:
    nodeAgent:
      enable: true
      uploaderType: kopia
    velero:
      defaultPlugins:
      - openshift
      - aws
      - csi
      - hypershift
      resourceTimeout: 2h
----
====
<1> Specify the provider for Velero. If you are using bare metal and MinIO, you can use `aws` as the provider.
<2> Specify the bucket name; for example, `oadp-backup`.
<3> Specify the bucket prefix; for example, `hcp`.
<4> The bucket region in this example is `minio`, which is a storage provider that is compatible with the S3 API.
<5> Specify the URL of the S3 endpoint.
. Create the DPA object by running the following command:
+
[source,terminal]
----
$ oc create -f dpa.yaml
----
+
After you create the `DataProtectionApplication` object, a new `velero` deployment and `node-agent` pods are created in the `openshift-adp` namespace.
* If you use {aws-first}, you can create a DPA by completing the following steps:
. Create a manifest file similar to the following example:
+
.Example `dpa.yaml` file
[%collapsible]
====
[source,yaml]
----
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: dpa-sample
  namespace: openshift-adp
spec:
  backupLocations:
  - name: default
    velero:
      provider: aws
      default: true
      objectStorage:
        bucket: <bucket_name> # <1>
        prefix: <bucket_prefix> # <2>
      config:
        region: minio # <3>
        profile: "backupStorage"
      credential:
        key: cloud
        name: cloud-credentials
  snapshotLocations:
  - velero:
      provider: aws
      config:
        region: minio # <3>
        profile: "volumeSnapshot"
      credential:
        key: cloud
        name: cloud-credentials
  configuration:
    nodeAgent:
      enable: true
      uploaderType: kopia
    velero:
      defaultPlugins:
      - openshift
      - aws
      - csi
      - hypershift
      resourceTimeout: 2h
----
====
<1> Specify the bucket name; for example, `oadp-backup`.
<2> Specify the bucket prefix; for example, `hcp`.
<3> The bucket region in this example is `minio`, which is a storage provider that is compatible with the S3 API.
. Create the DPA resource by running the following command:
+
[source,terminal]
----
$ oc create -f dpa.yaml
----
+
After you create the `DataProtectionApplication` object, a new `velero` deployment and `node-agent` pods are created in the `openshift-adp` namespace.
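After you create the DPA on either platform, you can confirm that the {oadp-short} workloads started. The following check is a minimal sketch, assuming the default `openshift-adp` namespace:
[source,terminal]
----
$ oc get pods -n openshift-adp
----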
.Next steps
* Back up the data plane workload.

View File

@@ -1,6 +1,7 @@
// Module included in the following assemblies:
//
// * hosted_control_planes/hcp-disaster-recovery-oadp.adoc
// * hosted_control_planes/hcp-disaster-recovery-oadp-auto.adoc
:_mod-docs-content-type: PROCEDURE
[id="hcp-dr-oadp-observe-velero_{context}"]

View File

@@ -1,6 +1,7 @@
// Module included in the following assemblies:
//
// * hosted_control_planes/hcp-disaster-recovery-oadp.adoc
// * hosted_control_planes/hcp-disaster-recovery-oadp-auto.adoc
:_mod-docs-content-type: PROCEDURE
[id="hcp-dr-oadp-observe_{context}"]

View File

@@ -0,0 +1,86 @@
// Module included in the following assemblies:
//
// * hosted_control_planes/hcp-disaster-recovery-oadp-auto.adoc
:_mod-docs-content-type: PROCEDURE
[id="hcp-dr-oadp-restore-auto_{context}"]
= Restoring a hosted cluster by using {oadp-short}
You can restore the hosted cluster by creating the `Restore` custom resource (CR).
* If you are using an in-place update, the `InfraEnv` resource does not need spare nodes. You need to re-provision the worker nodes from the new management cluster.
* If you are using a replace update, you need some spare nodes for the `InfraEnv` resource to deploy the worker nodes.
[IMPORTANT]
====
After you back up your hosted cluster, you must destroy it to initiate the restore process. To initiate node provisioning, you must back up workloads in the data plane before deleting the hosted cluster.
====
.Prerequisites
* You completed the steps in link:https://docs.redhat.com/en/documentation/red_hat_advanced_cluster_management_for_kubernetes/2.13/html/clusters/cluster_mce_overview#remove-a-cluster-by-using-the-console[Removing a cluster by using the console] ({rh-rhacm} documentation) to delete your hosted cluster.
* You completed the steps in link:https://docs.redhat.com/en/documentation/red_hat_advanced_cluster_management_for_kubernetes/2.13/html/clusters/cluster_mce_overview#removing-a-cluster-from-management-in-special-cases[Removing remaining resources after removing a cluster] ({rh-rhacm} documentation).
To monitor and observe the restore process, see "Observing the backup and restore process".
.Procedure
. Verify that no pods or persistent volume claims (PVCs) are present in the hosted control plane namespace by running the following command:
+
[source,terminal]
----
$ oc get pods,pvc -n <hosted_control_plane_namespace>
----
+
.Expected output
[source,terminal]
----
No resources found
----
. Create a YAML file that defines the `Restore` CR:
+
.Example `restore-hosted-cluster.yaml` file
[source,yaml]
----
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: <restore_resource_name> <1>
  namespace: openshift-adp
spec:
  backupName: <backup_resource_name> <2>
  restorePVs: true <3>
  existingResourcePolicy: update <4>
  excludedResources:
  - nodes
  - events
  - events.events.k8s.io
  - backups.velero.io
  - restores.velero.io
  - resticrepositories.velero.io
----
<1> Replace `<restore_resource_name>` with a name for your `Restore` resource.
<2> Replace `<backup_resource_name>` with the name of your `Backup` resource.
<3> Initiates the recovery of persistent volumes (PVs) and their pods.
<4> Ensures that existing objects are overwritten with the backed-up content.
+
[IMPORTANT]
====
You must create the `InfraEnv` resource in a separate namespace. Do not delete the `InfraEnv` resource during the restore process. The `InfraEnv` resource is mandatory for the new nodes to be reprovisioned.
====
. Apply the `Restore` CR by running the following command:
+
[source,terminal]
----
$ oc apply -f restore-hosted-cluster.yaml
----
. Verify that the value of the `status.phase` field is `Completed` by running the following command:
+
[source,terminal]
----
$ oc get hostedcluster <hosted_cluster_name> -n <hosted_cluster_namespace> \
-o jsonpath='{.status.phase}'
----

View File

@@ -0,0 +1,11 @@
// Module included in the following assemblies:
//
// * hosted_control_planes/hcp-disaster-recovery-oadp-auto.adoc
:_mod-docs-content-type: PROCEDURE
[id="hcp-dr-prep-oadp-auto_{context}"]
= Configuring {oadp-short}
If your hosted cluster is on {aws-short}, follow the steps in "Configuring the {oadp-full} with Multicloud Object Gateway" to configure {oadp-short}.
If your hosted cluster is on a bare-metal platform, follow the steps in "Configuring the {oadp-full} with AWS S3 compatible storage" to configure {oadp-short}.
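Both configuration paths assume that a secret with your object storage credentials exists in the `openshift-adp` namespace, because the DPA examples in this assembly reference `name: cloud-credentials` and `key: cloud`. The following command is a minimal sketch of creating that secret, assuming that your credentials file is named `credentials-velero`:
[source,terminal]
----
$ oc create secret generic cloud-credentials -n openshift-adp --from-file cloud=credentials-velero
----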