openshift-docs/modules/migration-error-messages.adoc

// Module included in the following assemblies:
//
// * migration/migrating_3_4/troubleshooting-3-4.adoc
// * migration/migrating_4_1_4/troubleshooting-4-1-4.adoc
// * migration/migrating_4_2_4/troubleshooting-4-2-4.adoc
[id='migration-error-messages_{context}']
= Error messages and resolutions

This section describes common error messages and how to resolve their underlying causes.

[id='ca-certificate-error-in-console_{context}']
== CA certificate error in the {mtc-short} console

If a `CA certificate error` message is displayed the first time you try to access the {mtc-short} console, the likely cause is the use of self-signed CA certificates in one of the clusters.

To resolve this issue, navigate to the `oauth-authorization-server` URL displayed in the error message and accept the certificate. To resolve this issue permanently, add the certificate to the trust store of your web browser.

If an `Unauthorized` message is displayed after you have accepted the certificate, navigate to the {mtc-short} console and refresh the web page.

[id='oauth-timeout-error-in-console_{context}']
== OAuth timeout error in the {mtc-short} console

If a `connection has timed out` message is displayed in the {mtc-short} console after you have accepted a self-signed certificate, the causes are likely to be the following:

* Interrupted network access to the OAuth server
* Interrupted network access to the {product-title} console
* Proxy configuration that blocks access to the `oauth-authorization-server` URL. See link:https://access.redhat.com/solutions/5514491[MTC console inaccessible because of OAuth timeout error] for details.

You can determine the cause of the timeout.

.Procedure

. Navigate to the {mtc-short} console and inspect the elements with the browser web inspector.
. Check the `migration-ui` pod log:
+
[source,terminal]
----
$ oc logs migration-ui-<86b679ffc7-h6l6v> -n openshift-migration
----

[id='podvolumebackups-timeout-error-in-velero-log_{context}']
== `PodVolumeBackups` timeout error in Velero log

If a migration fails because Restic times out, the following error is displayed in the Velero log.

.Example output
[source,terminal]
----
level=error msg="Error backing up item" backup=velero/monitoring error="timed out waiting for all PodVolumeBackups to complete" error.file="/go/src/github.com/heptio/velero/pkg/restic/backupper.go:165" error.function="github.com/heptio/velero/pkg/restic.(*backupper).BackupPodVolumes" group=v1
----

The default value of `restic_timeout` is one hour. You can increase this parameter for large migrations, keeping in mind that a higher value may delay the return of error messages.

.Procedure

. In the {product-title} web console, navigate to *Operators* -> *Installed Operators*.
. Click *{mtc-short} Operator*.
. In the *MigrationController* tab, click *migration-controller*.
. In the *YAML* tab, update the following parameter value:
+
[source,yaml]
----
spec:
  restic_timeout: 1h <1>
----
<1> Valid units are `h` (hours), `m` (minutes), and `s` (seconds), for example, `3h30m15s`.

. Click *Save*.

[id='restic-verification-error-migmigration_{context}']
== `ResticVerifyErrors` in the MigMigration Custom Resource

If data verification fails when migrating a PV with the file system data copy method, the following error is displayed in the MigMigration Custom Resource (CR).

.Example output
[source,yaml]
----
status:
  conditions:
  - category: Warn
    durable: true
    lastTransitionTime: 2020-04-16T20:35:16Z
    message: There were verify errors found in 1 Restic volume restores. See restore `<registry-example-migration-rvwcm>`
      for details <1>
    status: "True"
    type: ResticVerifyErrors <2>
----
<1> The error message identifies the Restore CR name.
<2> `ResticErrors` is a general error warning that includes verification errors.

[NOTE]
====
A data verification error does not cause the migration process to fail.
====

You can check the Restore CR to identify the source of the data verification error.

.Procedure

. Log in to the target cluster.
. View the Restore CR:
+
[source,terminal]
----
$ oc describe <registry-example-migration-rvwcm> -n openshift-migration
----
+
The output identifies the PV with `PodVolumeRestore` errors.
+
.Example output
[source,yaml]
----
status:
  phase: Completed
  podVolumeRestoreErrors:
  - kind: PodVolumeRestore
    name: <registry-example-migration-rvwcm-98t49>
    namespace: openshift-migration
  podVolumeRestoreResticErrors:
  - kind: PodVolumeRestore
    name: <registry-example-migration-rvwcm-98t49>
    namespace: openshift-migration
----

. View the `PodVolumeRestore` CR:
+
[source,terminal]
----
$ oc describe <migration-example-rvwcm-98t49>
----
+
The output identifies the Restic pod that logged the errors.
+
.Example output
[source,yaml]
----
  completionTimestamp: 2020-05-01T20:49:12Z
  errors: 1
  resticErrors: 1
  ...
  resticPod: <restic-nr2v5>
----

. View the Restic pod log to locate the errors:
+
[source,terminal]
----
$ oc logs -f <restic-nr2v5>
----

ifeval::["{mtc-version}" < "1.3"]
[id='restic-permission-denied-error-for-nfs-storage_{context}']
== `restic: permission denied` error for NFS storage

If you are migrating data from NFS storage and `root_squash` is enabled, Restic maps to `nfsnobody` and does not have permission to perform the migration. The following error is displayed in the Restic pod log.

.Example output
[source,terminal]
----
backup=openshift-migration/<backup_id> controller=pod-volume-backup error="fork/exec /usr/bin/restic: permission denied" error.file="/go/src/github.com/vmware-tanzu/velero/pkg/controller/pod_volume_backup_controller.go:280" error.function="github.com/vmware-tanzu/velero/pkg/controller.(*podVolumeBackupController).processBackup" logSource="pkg/controller/pod_volume_backup_controller.go:280" name=<backup_id> namespace=openshift-migration
----

You can resolve this issue by creating a supplemental group for Restic and adding the group ID to the Migration Controller.

.Procedure

. Create a supplemental group for Restic on the NFS storage.
. Set the `setgid` bit on the NFS directories so that group ownership is inherited.
. Add the `restic_supplemental_groups` parameter to the Migration Controller manifest on the source and target clusters:
+
[source,yaml]
----
spec:
  restic_supplemental_groups: <group_id> <1>
----
<1> Specify the supplemental group ID.

. Wait for the Restic pods to restart so that the changes are applied.
endif::[]