diff --git a/_topic_maps/_topic_map.yml b/_topic_maps/_topic_map.yml
index 79f5d119a0..10fa883b07 100644
--- a/_topic_maps/_topic_map.yml
+++ b/_topic_maps/_topic_map.yml
@@ -3814,6 +3814,8 @@ Topics:
       File: backup-and-restore-cr-issues
     - Name: Restic issues
       File: restic-issues
+    - Name: OADP Data protection test
+      File: oadp-data-protection-test
     - Name: Using the must-gather tool
       File: using-the-must-gather-tool
     - Name: OADP monitoring
diff --git a/backup_and_restore/application_backup_and_restore/troubleshooting/oadp-data-protection-test.adoc b/backup_and_restore/application_backup_and_restore/troubleshooting/oadp-data-protection-test.adoc
new file mode 100644
index 0000000000..44a4f21066
--- /dev/null
+++ b/backup_and_restore/application_backup_and_restore/troubleshooting/oadp-data-protection-test.adoc
@@ -0,0 +1,29 @@
+:_mod-docs-content-type: ASSEMBLY
+[id="oadp-data-protection-test"]
+= OADP Data protection test
+:toc:
+
+include::_attributes/common-attributes.adoc[]
+:context: oadp-data-protection-test
+
+
+[role="_abstract"]
+The `DataProtectionTest` (DPT) is a custom resource (CR) that provides a framework to validate your {oadp-short} configuration. The DPT CR checks and reports information for the following parameters:
+
+* The upload performance of the backups to the object storage.
+* The CSI snapshot readiness for persistent volume claims.
+* The storage bucket configuration, such as encryption and versioning.
+
+By using the information that the DPT CR reports, you can verify that your data protection environment is properly configured and performs as expected.
+
+include::modules/oadp-dpt-spec-fields.adoc[leveloffset=+1]
+
+include::modules/oadp-dpt-status-fields.adoc[leveloffset=+1]
+
+include::modules/using-data-protection-test.adoc[leveloffset=+1]
+
+include::modules/oadp-dpt-use-case-bsl-spec.adoc[leveloffset=+1]
+
+include::modules/oadp-dpt-use-case-azure.adoc[leveloffset=+1]
+
+include::modules/oadp-troubleshooting-dpt.adoc[leveloffset=+1]
diff --git a/backup_and_restore/application_backup_and_restore/troubleshooting/troubleshooting.adoc b/backup_and_restore/application_backup_and_restore/troubleshooting/troubleshooting.adoc
index 03e51d7d2d..68dfce6f48 100644
--- a/backup_and_restore/application_backup_and_restore/troubleshooting/troubleshooting.adoc
+++ b/backup_and_restore/application_backup_and_restore/troubleshooting/troubleshooting.adoc
@@ -26,6 +26,8 @@ You can troubleshoot OADP issues by using the following methods:
 
 * Use the available xref:../../../backup_and_restore/application_backup_and_restore/troubleshooting/oadp-timeouts.adoc#oadp-timeouts[OADP timeouts] to reduce errors, retries, or failures.
 
+* Run the xref:../../../backup_and_restore/application_backup_and_restore/troubleshooting/oadp-data-protection-test.adoc#oadp-data-protection-test[`DataProtectionTest` (DPT)] custom resource to verify your backup storage bucket configuration and check the CSI snapshot readiness for persistent volume claims.
+
 * Collect logs and CR information by using the xref:../../../backup_and_restore/application_backup_and_restore/troubleshooting/using-the-must-gather-tool.adoc#using-the-must-gather-tool[`must-gather` tool].
 
 * Monitor and analyze the workload performance with the help of xref:../../../backup_and_restore/application_backup_and_restore/troubleshooting/oadp-monitoring.adoc#oadp-monitoring[OADP monitoring].
diff --git a/modules/oadp-dpt-spec-fields.adoc b/modules/oadp-dpt-spec-fields.adoc
new file mode 100644
index 0000000000..6d27a3c80e
--- /dev/null
+++ b/modules/oadp-dpt-spec-fields.adoc
@@ -0,0 +1,23 @@
+// Module included in the following assemblies:
+//
+// * backup_and_restore/application_backup_and_restore/troubleshooting/oadp-data-protection-test.adoc
+
+:_mod-docs-content-type: REFERENCE
+[id="oadp-dpt-spec_{context}"]
+= OADP DataProtectionTest CR specification fields
+
+[role="_abstract"]
+You can configure the following specification fields in the `DataProtectionTest` (DPT) custom resource (CR).
+
+.DPT CR spec fields
+|===
+|Field |Type |Description
+
+| `backupLocationName` | string | Name of the `BackupStorageLocation` CR configured in the `DataProtectionApplication` (DPA) CR.
+| `backupLocationSpec` | object | Inline specification of the `BackupStorageLocation` CR.
+| `uploadSpeedTestConfig` | object | Configuration to run an upload speed test to the object storage.
+| `csiVolumeSnapshotTestConfigs` | list | List of persistent volume claims to take a snapshot of and to verify the snapshot readiness.
+| `forceRun` | boolean | Reruns the DPT CR even if the status is `Complete` or `Failed`.
+| `skipTLSVerify` | boolean | Bypasses the TLS certificate validation if set to `true`.
+
+|===
\ No newline at end of file
diff --git a/modules/oadp-dpt-status-fields.adoc b/modules/oadp-dpt-status-fields.adoc
new file mode 100644
index 0000000000..4327a20100
--- /dev/null
+++ b/modules/oadp-dpt-status-fields.adoc
@@ -0,0 +1,25 @@
+// Module included in the following assemblies:
+//
+// * backup_and_restore/application_backup_and_restore/troubleshooting/oadp-data-protection-test.adoc
+
+:_mod-docs-content-type: REFERENCE
+[id="oadp-dpt-status_{context}"]
+= OADP DataProtectionTest CR status fields
+
+[role="_abstract"]
+You can review the status of the `DataProtectionTest` (DPT) custom resource (CR) by using the following status fields:
+
+.DPT CR status fields
+|===
+|Field |Type |Description
+
+| `phase` | string | Current phase of the DPT CR. Values are `InProgress`, `Complete`, or `Failed`.
+| `lastTested` | timestamp | The timestamp when the DPT CR was last run.
+| `uploadTest` | object | Results of the upload speed test.
+| `bucketMetadata` | object | Information about the storage bucket encryption and versioning.
+| `snapshotTests` | list | Snapshot test results for each persistent volume claim.
+| `snapshotSummary` | string | Aggregated pass/fail summary for snapshots. For example, `2/2 passed`.
+| `s3Vendor` | string | Vendor of the {aws-short} S3-compatible storage bucket. For example, {aws-short}, MinIO, or Ceph.
+| `errorMessage` | string | Error message if the DPT CR fails.
+
+|===
\ No newline at end of file
diff --git a/modules/oadp-dpt-use-case-azure.adoc b/modules/oadp-dpt-use-case-azure.adoc
new file mode 100644
index 0000000000..a5eee1c0bf
--- /dev/null
+++ b/modules/oadp-dpt-use-case-azure.adoc
@@ -0,0 +1,118 @@
+// Module included in the following assemblies:
+//
+// * backup_and_restore/application_backup_and_restore/troubleshooting/oadp-data-protection-test.adoc
+
+:_mod-docs-content-type: PROCEDURE
+[id="oadp-dpt-use-case-azure_{context}"]
+= Running a data protection test on Azure object storage
+
+[role="_abstract"]
+If you use {oadp-short} with Azure object storage, you must specify the Azure `STORAGE_ACCOUNT_ID` as part of the secret object. Use the following procedure to run a `DataProtectionTest` (DPT) custom resource (CR) on an Azure cluster.
+
+
+.Prerequisites
+
+* You have logged in to the Azure cluster as a user with the `cluster-admin` role.
+* You have installed the OpenShift CLI (`oc`).
+* You have installed the {oadp-short} Operator.
+* You have configured a bucket to store the backups.
+* You have an application with persistent volume claims (PVCs) running in a separate namespace.
+
+
+.Procedure
+
+. To avoid a DPT run failure, add the `Storage Blob Data Contributor` role to the Azure `storageAccount` object by running the following command:
++
+[source,terminal]
+----
+$ az role assignment create \
+--assignee "$AZURE_CLIENT_ID" \
+--role "Storage Blob Data Contributor" \
+--scope "/subscriptions/$AZURE_SUBSCRIPTION_ID/resourceGroups/$AZURE_RESOURCE_GROUP/providers/Microsoft.Storage/storageAccounts/$AZURE_STORAGE_ACCOUNT_ID"
+----
+
+. In your terminal, export the Azure parameters and create a secret credentials file with the parameters as shown in the following example.
++
+To run the DPT CR on Azure, you need to specify the `STORAGE_ACCOUNT_ID` parameter in the secret credentials file.
++
+[source,terminal]
+----
+AZURE_SUBSCRIPTION_ID=
+AZURE_TENANT_ID=
+AZURE_CLIENT_ID=
+AZURE_CLIENT_SECRET=
+AZURE_RESOURCE_GROUP=
+AZURE_STORAGE_ACCOUNT_ID=
+----
+
+. Create the `Secret` CR as shown in the following example:
++
+[source,terminal]
+----
+$ oc create secret generic cloud-credentials-azure -n openshift-adp --from-file cloud=
+----
+
+. Create the `DataProtectionApplication` (DPA) CR by using the configuration shown in the following example:
++
+[source,yaml]
+----
+apiVersion: oadp.openshift.io/v1alpha1
+kind: DataProtectionApplication
+metadata:
+  name: ts-dpa
+  namespace: openshift-adp
+spec:
+  configuration:
+    velero:
+      defaultPlugins:
+      - azure
+      - openshift
+  backupLocations:
+  - velero:
+      config:
+        resourceGroup: oadp-....-b7q4-rg
+        storageAccount: oadp...kb7q4
+        subscriptionId: 53b8f5...fd54c8a
+      credential:
+        key: cloud
+        name: cloud-credentials-azure # <1>
+      provider: azure
+      default: true
+      objectStorage:
+        bucket:
+        prefix: velero
+----
+<1> Specify the name of the `Secret` object. In this example, the name is `cloud-credentials-azure`.
+
+. Create the DPT CR by specifying the name of backup storage location (BSL), `VolumeSnapshotClass` object, and the persistent volume claim details as shown in the following example:
++
+[source,yaml]
+----
+apiVersion: oadp.openshift.io/v1alpha1
+kind: DataProtectionTest
+metadata:
+  name: dpt-sample
+  namespace: openshift-adp
+spec:
+  backupLocationName: # <1>
+  uploadSpeedTestConfig:
+    fileSize: 40MB
+    timeout: 120s
+  csiVolumeSnapshotTestConfigs:
+    - snapshotClassName: csi-azuredisk-vsc # <2>
+      timeout: 90s
+      volumeSnapshotSource:
+        persistentVolumeClaimName: mysql-data # <3>
+        persistentVolumeClaimNamespace: ocp-mysql # <4>
+    - snapshotClassName: csi-azuredisk-vsc
+      timeout: 120s
+      volumeSnapshotSource:
+        persistentVolumeClaimName: mysql-data1
+        persistentVolumeClaimNamespace: ocp-mysql
+----
+<1> Specify the name of the BSL.
+<2> The Azure snapshot class name.
+<3> The name of the persistent volume claim.
+<4> The name of the persistent volume claim namespace.
+
+. Run the DPT CR to verify the snapshot readiness.
diff --git a/modules/oadp-dpt-use-case-bsl-spec.adoc b/modules/oadp-dpt-use-case-bsl-spec.adoc
new file mode 100644
index 0000000000..f352065472
--- /dev/null
+++ b/modules/oadp-dpt-use-case-bsl-spec.adoc
@@ -0,0 +1,93 @@
+// Module included in the following assemblies:
+//
+// * backup_and_restore/application_backup_and_restore/troubleshooting/oadp-data-protection-test.adoc
+
+:_mod-docs-content-type: PROCEDURE
+[id="oadp-dpt-use-case-bsl-spec_{context}"]
+= Running a data protection test by configuring a backup storage location specification
+
+[role="_abstract"]
+You can configure the `DataProtectionTest` (DPT) custom resource (CR) by specifying the backup storage location (BSL) specification instead of a BSL name. You then run the DPT CR to verify the Container Storage Interface (CSI) snapshot readiness and the data upload performance to the storage bucket.
+
+.Prerequisites
+
+* You have logged in to the {product-title} cluster as a user with the `cluster-admin` role.
+* You have installed the OpenShift CLI (`oc`).
+* You have installed the {oadp-short} Operator.
+* You have created the `DataProtectionApplication` (DPA) CR.
+* You have configured a bucket to store the backups.
+* You have created the `Secret` object to access the bucket storage.
+* You have an application with persistent volume claims (PVCs) running in a separate namespace.
+
+
+.Procedure
+
+. Create a manifest file for the DPT CR as shown in the example:
++
+[source,yaml]
+----
+apiVersion: oadp.openshift.io/v1alpha1
+kind: DataProtectionTest
+metadata:
+  name: dpt-sample
+  namespace: openshift-adp
+spec:
+  backupLocationSpec: # <1>
+    provider: aws
+    default: true
+    objectStorage:
+      bucket: sample-bucket # <2>
+      prefix: velero
+    config:
+      region: us-east-1 # <3>
+      profile: "default"
+      insecureSkipTLSVerify: "true"
+      s3Url: "https://s3.amazonaws.com/sample-bucket"
+    credential: # <4>
+      name: cloud-credentials
+      key: cloud
+  uploadSpeedTestConfig: # <5>
+    fileSize: 50MB
+    timeout: 120s
+  csiVolumeSnapshotTestConfigs: # <6>
+    - volumeSnapshotSource:
+        persistentVolumeClaimName: mongo
+        persistentVolumeClaimNamespace: mongo-persistent
+      snapshotClassName: csi-snapclass
+      timeout: 2m
+  forceRun: true
+  skipTLSVerify: true # <7>
+----
+<1> Configure the BSL spec by specifying details such as the cloud provider.
+<2> Specify the bucket name. In this example, the bucket name is `sample-bucket`.
+<3> Specify the cloud provider region.
+<4> Specify the cloud credentials for the storage bucket.
+<5> (Optional) Configure the `uploadSpeedTestConfig` object by setting the `fileSize` and `timeout` fields.
+<6> Configure the `csiVolumeSnapshotTestConfigs` object.
+<7> Set to `true` to skip the TLS certificate validation during the DPT CR run.
+
+. Create the DPT CR by running the following command:
++
+[source,terminal]
+----
+$ oc create -f # <1>
+----
+<1> Specify the file name of the DPT manifest.
+
+
+.Verification
+
+. Verify that the phase of the DPT CR is `Complete` by running the following command:
++
+[source,terminal]
+----
+$ oc get dpt dpt-sample
+----
++
+The example output is as follows:
++
+[source,terminal]
+----
+NAME         PHASE      LASTTESTED   UPLOADSPEED(MBPS)   ENCRYPTION   VERSIONING   SNAPSHOTS    AGE
+dpt-sample   Complete   17m          546                 AES256       Enabled      2/2 passed   17m
+----
diff --git a/modules/oadp-troubleshooting-dpt.adoc b/modules/oadp-troubleshooting-dpt.adoc
new file mode 100644
index 0000000000..f084a41265
--- /dev/null
+++ b/modules/oadp-troubleshooting-dpt.adoc
@@ -0,0 +1,21 @@
+// Module included in the following assemblies:
+//
+// * backup_and_restore/application_backup_and_restore/troubleshooting/oadp-data-protection-test.adoc
+
+:_mod-docs-content-type: CONCEPT
+[id="oadp-troubleshooting-dpt_{context}"]
+= Troubleshooting the DataProtectionTest custom resource
+
+[role="_abstract"]
+Use the following table to troubleshoot common issues when running the `DataProtectionTest` (DPT) custom resource (CR).
+
+.DPT CR troubleshooting
+|===
+|Error |Reason |Solution
+
+| DPT stuck in `InProgress` state | Bucket credentials or bucket access failure | Check the `Secret` object, bucket permissions, and logs.
+| Upload test failed | Incorrect `Secret` object or S3 endpoint | Check the `BackupStorageLocation` object configuration and the access keys.
+| Snapshot tests fail | Incorrect configuration of CSI snapshot controller | Check the `VolumeSnapshotClass` object availability and the CSI driver logs.
+| Bucket encryption or versioning not populated | Cloud provider limitations | Not all object storage providers expose these fields consistently.
+
+|===
\ No newline at end of file
diff --git a/modules/using-data-protection-test.adoc b/modules/using-data-protection-test.adoc
new file mode 100644
index 0000000000..d9406bce2e
--- /dev/null
+++ b/modules/using-data-protection-test.adoc
@@ -0,0 +1,124 @@
+// Module included in the following assemblies:
+//
+// * backup_and_restore/application_backup_and_restore/troubleshooting/oadp-data-protection-test.adoc
+
+:_mod-docs-content-type: PROCEDURE
+[id="using-data-protection-test_{context}"]
+= Using the DataProtectionTest custom resource
+
+[role="_abstract"]
+You can configure the `DataProtectionTest` (DPT) custom resource (CR) and then run the DPT CR to verify the Container Storage Interface (CSI) snapshot readiness and the data upload performance to the storage bucket.
+
+.Prerequisites
+
+* You have logged in to the {product-title} cluster as a user with the `cluster-admin` role.
+* You have installed the OpenShift CLI (`oc`).
+* You have installed the {oadp-short} Operator.
+* You have created the `DataProtectionApplication` (DPA) CR.
+* You have configured a backup storage location (BSL) to store the backups.
+* You have an application with persistent volume claims (PVCs) running in a separate namespace.
+
+
+.Procedure
+
+. Create a manifest file for the DPT CR as shown in the example:
++
+[source,yaml]
+----
+apiVersion: oadp.openshift.io/v1alpha1
+kind: DataProtectionTest
+metadata:
+  name: dpt-sample
+  namespace: openshift-adp
+spec:
+  backupLocationName: # <1>
+  csiVolumeSnapshotTestConfigs: # <2>
+  - snapshotClassName: csi-gce-pd-vsc
+    timeout: 90s
+    volumeSnapshotSource:
+      persistentVolumeClaimName: # <3>
+      persistentVolumeClaimNamespace: # <4>
+  - snapshotClassName: csi-gce-pd-vsc
+    timeout: 120s
+    volumeSnapshotSource:
+      persistentVolumeClaimName: # <5>
+      persistentVolumeClaimNamespace:
+  forceRun: false # <6>
+  uploadSpeedTestConfig: # <7>
+    fileSize: 200MB
+    timeout: 120s
+----
+<1> Specify the name of the BSL.
+<2> Specify a list for `csiVolumeSnapshotTestConfigs`. In this example, two PVCs are being tested.
+<3> Specify the name of the first PVC.
+<4> Specify the namespace of the PVC.
+<5> Specify the name of the second PVC.
+<6> Set the `forceRun` field to `false` if you want the {oadp-short} controller to skip rerunning tests.
+<7> Configure the `uploadSpeedTestConfig` object by setting the `fileSize` and `timeout` fields.
+
+. Create the DPT CR by running the following command:
++
+[source,terminal]
+----
+$ oc create -f # <1>
+----
+<1> Specify the file name of the DPT manifest.
+
+
+.Verification
+
+. Verify that the phase of the DPT CR is `Complete` by running the following command:
++
+[source,terminal]
+----
+$ oc get dpt dpt-sample
+----
++
+The example output is as follows:
++
+[source,terminal]
+----
+NAME         PHASE      LASTTESTED   UPLOADSPEED(MBPS)   ENCRYPTION   VERSIONING   SNAPSHOTS    AGE
+dpt-sample   Complete   17m          546                 AES256       Enabled      2/2 passed   17m
+----
+
+. Verify that the CSI snapshots are ready and the data upload tests are successful by running the following command:
++
+[source,terminal]
+----
+$ oc get dpt dpt-sample -o yaml
+----
++
+The example output is as follows:
++
+[source,yaml]
+----
+apiVersion: oadp.openshift.io/v1alpha1
+kind: DataProtectionTest
+....
+status:
+  bucketMetadata: # <1>
+    encryptionAlgorithm: AES256
+    versioningStatus: Enabled
+  lastTested: "202...:47:51Z"
+  phase: Complete
+  s3Vendor: AWS # <2>
+  snapshotSummary: 2/2 passed # <3>
+  snapshotTests:
+  - persistentVolumeClaimName: mysql-data
+    persistentVolumeClaimNamespace: ocp-mysql
+    readyDuration: 24s
+    status: Ready
+  - persistentVolumeClaimName: mysql-data1
+    persistentVolumeClaimNamespace: ocp-mysql
+    readyDuration: 40s
+    status: Ready
+  uploadTest: # <4>
+    duration: 3.071s
+    speedMbps: 546
+    success: true
+----
+<1> The bucket metadata information.
+<2> The S3 bucket vendor.
+<3> Summary of the CSI snapshot tests.
+<4> The upload test details.
\ No newline at end of file
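
The verification steps in `using-data-protection-test.adoc` check the DPT phase by hand. The following is a minimal shell sketch of the same check as a polling loop, offered as an illustration only and not as part of the patch. It assumes that the phase is readable at `.status.phase` (as in the example YAML output) and that the CR lives in the `openshift-adp` namespace. The function names `get_phase` and `wait_for_dpt`, the retry count, and the interval are hypothetical.

```shell
#!/bin/sh
# Sketch: poll a DataProtectionTest CR until it reaches a terminal phase.
# Assumption: the phase is exposed at .status.phase, as in the example output.

# Read the current phase of the DPT CR named $1 in namespace $2.
# Kept in its own function so that it can be stubbed or replaced.
get_phase() {
  oc get dpt "$1" -n "$2" -o jsonpath='{.status.phase}'
}

# Poll until the phase is Complete (returns 0) or Failed (returns 1),
# or until the attempts run out (returns 2).
# Arguments: name, namespace, attempts (default 30), interval in seconds (default 10).
wait_for_dpt() {
  dpt_name="$1"
  dpt_namespace="$2"
  dpt_attempts="${3:-30}"
  dpt_interval="${4:-10}"
  dpt_phase=""
  dpt_try=0
  while [ "$dpt_try" -lt "$dpt_attempts" ]; do
    # Treat a failed lookup as "no phase yet" and keep polling.
    dpt_phase="$(get_phase "$dpt_name" "$dpt_namespace" 2>/dev/null || true)"
    case "$dpt_phase" in
      Complete) echo "Complete"; return 0 ;;
      Failed)   echo "Failed"; return 1 ;;
    esac
    dpt_try=$((dpt_try + 1))
    sleep "$dpt_interval"
  done
  echo "Timed out in phase: ${dpt_phase:-unknown}"
  return 2
}
```

For example, `wait_for_dpt dpt-sample openshift-adp` would poll the sample CR from the procedure. Separating `get_phase` from the loop keeps the `oc` call in one place, so the timing logic can be exercised without a cluster.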