// Module included in the following assemblies:
//
// * migrating_from_ocp_3_to_4/troubleshooting-3-4.adoc
// * migration_toolkit_for_containers/troubleshooting-mtc.adoc
:_mod-docs-content-type: PROCEDURE
[id="migration-error-messages_{context}"]
= Error messages and resolutions
This section describes common error messages you might encounter with the {mtc-first} and how to resolve their underlying causes.
[id="ca-certificate-error-displayed-when-accessing-console-for-first-time_{context}"]
== CA certificate error displayed when accessing the {mtc-short} console for the first time
If a `CA certificate error` message is displayed the first time you try to access the {mtc-short} console, the likely cause is the use of self-signed CA certificates in one of the clusters.
To resolve this issue temporarily, navigate to the `oauth-authorization-server` URL displayed in the error message and accept the certificate. To resolve it permanently, add the certificate to the trust store of your web browser.
If an `Unauthorized` message is displayed after you have accepted the certificate, navigate to the {mtc-short} console and refresh the web page.
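
If you need to look up the OAuth server URL for a cluster manually, you can query the well-known OAuth metadata endpoint. The following is a minimal sketch that assumes you are logged in to the cluster with `oc`:

[source,terminal]
----
$ oc get --raw /.well-known/oauth-authorization-server
----

The `issuer` field in the returned JSON is the OAuth server URL.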
[id="oauth-timeout-error-in-console_{context}"]
== OAuth timeout error in the {mtc-short} console
If a `connection has timed out` message is displayed in the {mtc-short} console after you have accepted a self-signed certificate, the causes are likely to be the following:
* Interrupted network access to the OAuth server
* Interrupted network access to the {product-title} console
* Proxy configuration that blocks access to the `oauth-authorization-server` URL. See link:https://access.redhat.com/solutions/5514491[MTC console inaccessible because of OAuth timeout error] for details.
To determine the cause of the timeout:
* Inspect the {mtc-short} console web page with a browser web inspector.
* Check the `Migration UI` pod log for errors.
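
For example, to check the `Migration UI` pod log, you can run commands similar to the following. This is a sketch that assumes a default {mtc-short} installation in the `openshift-migration` namespace, where the console pod name starts with `migration-ui`:

[source,terminal]
----
$ oc get pods -n openshift-migration | grep migration-ui
$ oc logs <migration_ui_pod> -n openshift-migration
----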
[id="certificate-signed-by-unknown-authority-error_{context}"]
== Certificate signed by unknown authority error
If you use a self-signed certificate to secure a cluster or a replication repository for the {mtc-short}, certificate verification might fail with the following error message: `Certificate signed by unknown authority`.
You can create a custom CA certificate bundle file and upload it in the {mtc-short} web console when you add a cluster or a replication repository.
.Procedure
Download a CA certificate from a remote endpoint and save it as a CA bundle file:
[source,terminal]
----
$ echo -n | openssl s_client -connect <host_FQDN>:<port> \ <1>
  | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' > <ca_bundle.cert> <2>
----
<1> Specify the host FQDN and port of the endpoint, for example, `api.my-cluster.example.com:6443`.
<2> Specify the name of the CA bundle file.
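
For example, using the values from the callouts, the following command downloads the certificate presented by a cluster API endpoint and saves it as `ca_bundle.cert`:

[source,terminal]
----
$ echo -n | openssl s_client -connect api.my-cluster.example.com:6443 \
  | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' > ca_bundle.cert
----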
[id="backup-storage-location-errors-in-velero-pod-log_{context}"]
== Backup storage location errors in the Velero pod log
If a `Velero` `Backup` custom resource contains a reference to a backup storage location (BSL) that does not exist, the `Velero` pod log might display the following error messages:
[source,terminal]
----
$ oc logs <Velero_Pod> -n openshift-migration
----
.Example output
[source,terminal]
----
level=error msg="Error checking repository for stale locks" error="error getting backup storage location: BackupStorageLocation.velero.io \"ts-dpa-1\" not found" error.file="/remote-source/src/github.com/vmware-tanzu/velero/pkg/restic/repository_manager.go:259"
----
You can ignore these error messages. A missing BSL cannot cause a migration to fail.
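
To confirm which backup storage locations exist, you can list the `BackupStorageLocation` resources. This is a sketch that assumes the Velero custom resources are installed in the `openshift-migration` namespace:

[source,terminal]
----
$ oc get backupstoragelocations.velero.io -n openshift-migration
----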
[id="pod-volume-backup-timeout-error-in-velero-pod-log_{context}"]
== Pod volume backup timeout error in the Velero pod log
If a migration fails because Restic times out, the following error is displayed in the `Velero` pod log.
[source,terminal]
----
level=error msg="Error backing up item" backup=velero/monitoring error="timed out waiting for all PodVolumeBackups to complete" error.file="/go/src/github.com/heptio/velero/pkg/restic/backupper.go:165" error.function="github.com/heptio/velero/pkg/restic.(*backupper).BackupPodVolumes" group=v1
----
The default value of `restic_timeout` is one hour. You can increase this parameter for large migrations, keeping in mind that a higher value may delay the return of error messages.
.Procedure
. In the {product-title} web console, navigate to *Ecosystem* -> *Installed Operators*.
. Click *{mtc-full} Operator*.
. In the *MigrationController* tab, click *migration-controller*.
. In the *YAML* tab, update the following parameter value:
+
[source,yaml]
----
spec:
  restic_timeout: 1h <1>
----
<1> Valid units are `h` (hours), `m` (minutes), and `s` (seconds), for example, `3h30m15s`.
. Click *Save*.
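
Alternatively, you can update the parameter from the command line. The following is a sketch that assumes the default `MigrationController` CR name and namespace:

[source,terminal]
----
$ oc patch migrationcontroller migration-controller -n openshift-migration \
  --type merge -p '{"spec":{"restic_timeout":"3h"}}'
----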
[id="restic-verification-errors-in-migmigration-custom-resource_{context}"]
== Restic verification errors in the MigMigration custom resource
If data verification fails when migrating a persistent volume with the file system data copy method, the following error is displayed in the `MigMigration` CR.
.Example output
[source,yaml]
----
status:
  conditions:
  - category: Warn
    durable: true
    lastTransitionTime: 2020-04-16T20:35:16Z
    message: There were verify errors found in 1 Restic volume restores. See restore `<registry-example-migration-rvwcm>`
      for details <1>
    status: "True"
    type: ResticVerifyErrors <2>
----
<1> The error message identifies the `Restore` CR name.
<2> `ResticVerifyErrors` is a general error warning type that includes verification errors.
[NOTE]
====
A data verification error does not cause the migration process to fail.
====
You can check the `Restore` CR to identify the source of the data verification error.
.Procedure
. Log in to the target cluster.
. View the `Restore` CR:
+
[source,terminal]
----
$ oc describe restore <registry-example-migration-rvwcm> -n openshift-migration
----
+
The output identifies the persistent volume with `PodVolumeRestore` errors.
+
.Example output
[source,yaml]
----
status:
  phase: Completed
  podVolumeRestoreErrors:
  - kind: PodVolumeRestore
    name: <registry-example-migration-rvwcm-98t49>
    namespace: openshift-migration
  podVolumeRestoreResticErrors:
  - kind: PodVolumeRestore
    name: <registry-example-migration-rvwcm-98t49>
    namespace: openshift-migration
----
. View the `PodVolumeRestore` CR:
+
[source,terminal]
----
$ oc describe podvolumerestore <registry-example-migration-rvwcm-98t49> -n openshift-migration
----
+
The output identifies the `Restic` pod that logged the errors.
+
.Example output
[source,yaml]
----
completionTimestamp: 2020-05-01T20:49:12Z
errors: 1
resticErrors: 1
...
resticPod: <restic-nr2v5>
----
. View the `Restic` pod log to locate the errors:
+
[source,terminal]
----
$ oc logs -f <restic-nr2v5> -n openshift-migration
----
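
To narrow down the relevant lines, you can filter the pod log for errors, for example:

[source,terminal]
----
$ oc logs <restic-nr2v5> -n openshift-migration | grep -i error
----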
[id="restic-permission-error-when-migrating-from-nfs-storage-with-root-squash-enabled_{context}"]
== Restic permission error when migrating from NFS storage with root_squash enabled
If you are migrating data from NFS storage and `root_squash` is enabled, `Restic` is mapped to the `nfsnobody` user and does not have permission to perform the migration. The following error is displayed in the `Restic` pod log.
.Example output
[source,terminal]
----
backup=openshift-migration/<backup_id> controller=pod-volume-backup error="fork/exec /usr/bin/restic: permission denied" error.file="/go/src/github.com/vmware-tanzu/velero/pkg/controller/pod_volume_backup_controller.go:280" error.function="github.com/vmware-tanzu/velero/pkg/controller.(*podVolumeBackupController).processBackup" logSource="pkg/controller/pod_volume_backup_controller.go:280" name=<backup_id> namespace=openshift-migration
----
You can resolve this issue by creating a supplemental group for Restic and adding the group ID to the `MigrationController` CR manifest.
.Procedure
. Create a supplemental group for Restic on the NFS storage, as shown in the sketch after this procedure.
. Set the `setgid` bit on the NFS directories so that group ownership is inherited.
. Add the `restic_supplemental_groups` parameter to the `MigrationController` CR manifest on the source and target clusters:
+
[source,yaml]
----
spec:
  restic_supplemental_groups: <group_id> <1>
----
<1> Specify the supplemental group ID.
. Wait for the `Restic` pods to restart so that the changes are applied.
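
The first two steps are performed on the NFS server. The following is a minimal sketch that uses a hypothetical group name, group ID, and export path:

[source,terminal]
----
$ sudo groupadd -g 5555 restic-migration <1>
$ sudo chgrp -R restic-migration /exports/migration-pv <2>
$ sudo find /exports/migration-pv -type d -exec chmod g+s {} \;
----
<1> Hypothetical supplemental group name and group ID. Use the same group ID that you specify in `restic_supplemental_groups`.
<2> Hypothetical NFS export path that backs the migrated persistent volumes.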