Troubleshooting: Migratable RWX volume stuck in detaching/attaching loop
January 6, 2026
During a VM live migration or a cluster upgrade, a Migratable RWX volume may become stuck in an infinite reconciliation loop. While the volume appears to be unused, it fails to stay in a stable detached state, preventing any new workload from attaching to it.
Observed Behavior:
- The volume state flaps between detached and detaching.
- [...] status.currentNodeID.
- Because Longhorn still considers a migration "in-progress" (due to stale metadata), it immediately tries to transition the volume to detaching to clean up, then back to detached.
- Spec.MigrationNodeID is empty (""), but Status.CurrentMigrationNodeID still holds the ID of a previous migration target node.
- VolumeAttachment objects have been removed, yet the Longhorn Volume object behaves as if a migration finalization is required.

Example volume state: The volume remains stuck in detaching even if no workload is running.
$ kubectl get volume -n longhorn-system pvc-840804d8-6f11-49fd-afae-54bc5be639de
NAME                                       STATE       ROBUSTNESS   NODE
pvc-840804d8-6f11-49fd-afae-54bc5be639de   detaching   unknown      ubuntu-lh-2
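The metadata mismatch described above can be checked directly. The following is a minimal sketch, not an official Longhorn tool: the function name check_stale_migration is hypothetical, and the JSON path spec.migrationNodeID is an assumption derived from the Spec.MigrationNodeID field named above (status.currentMigrationNodeID is the field patched later in this article).

```shell
# Sketch: report whether a Longhorn volume carries stale migration metadata.
# Assumption: the Volume resource exposes spec.migrationNodeID and
# status.currentMigrationNodeID under these JSON names.
check_stale_migration() {
  local volume="$1" spec_node status_node
  spec_node=$(kubectl get volume -n longhorn-system "$volume" \
    -o jsonpath='{.spec.migrationNodeID}') || return 2
  status_node=$(kubectl get volume -n longhorn-system "$volume" \
    -o jsonpath='{.status.currentMigrationNodeID}') || return 2

  # Stale state: no migration requested in spec, but status still names a target.
  if [ -z "$spec_node" ] && [ -n "$status_node" ]; then
    echo "stale: status.currentMigrationNodeID=$status_node"
    return 0
  fi
  echo "no stale migration metadata detected"
  return 1
}
```

Usage: check_stale_migration <VOLUME_NAME>. The function returns 0 only when the spec field is empty and the status field is not, which matches the symptom listed above.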
Longhorn Manager Logs: The logs on the volume owner node will show failures during the migration finalization phase. The controller is unable to find the engine to complete the switch:
level=warning msg="Failed to finalize the migration" controller=longhorn-volume error="cannot find the current engine for the switching after iterating and cleaning up all engines... all engines may be detached or in a transient state"
level=warning msg="Waiting to confirm migration until migration engine is ready" controller=longhorn-volume-attachment
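To locate these warnings in a large log stream, a small filter can help. This is a sketch under assumptions: the function name filter_migration_warnings is hypothetical, and the pod label app=longhorn-manager in the usage comment is assumed rather than taken from this article, so adjust it to your deployment.

```shell
# Sketch: keep only the two migration-related warnings quoted above.
# Reads log lines on stdin and prints the matching lines.
filter_migration_warnings() {
  grep -E 'Failed to finalize the migration|Waiting to confirm migration until migration engine is ready'
}

# Usage (assumes the manager pods carry the label app=longhorn-manager):
#   kubectl logs -n longhorn-system -l app=longhorn-manager --tail=-1 \
#     | filter_migration_warnings
```

The filter exits non-zero when no line matches, which makes it usable as a quick yes/no check in a script.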
VolumeAttachment (LHVA) state: Describing the volumeattachments.longhorn.io (LHVA) resource reveals that Spec.Attachment Tickets and Status.Attachment Ticket Statuses are empty, yet the resource remains stuck because of a finalizer.
Name:         pvc-840804d8-6f11-49fd-afae-54bc5be639de
Namespace:    longhorn-system
Kind:         VolumeAttachment
Metadata:
  Finalizers:
    longhorn.io
Spec:
  Attachment Tickets:  <nil>
  Volume:              pvc-840804d8-6f11-49fd-afae-54bc5be639de
Status:
  Attachment Ticket Statuses:  <nil>
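The same LHVA condition can be checked without reading the full describe output. The sketch below is illustrative only: the function name lhva_is_empty_but_finalized is hypothetical, and the JSON path spec.attachmentTickets is an assumption inferred from the "Attachment Tickets" field shown above.

```shell
# Sketch: detect an LHVA that has no attachment tickets but still has finalizers.
# Assumption: tickets are stored under spec.attachmentTickets.
lhva_is_empty_but_finalized() {
  local name="$1" finalizers tickets
  finalizers=$(kubectl get volumeattachments.longhorn.io -n longhorn-system "$name" \
    -o jsonpath='{.metadata.finalizers}') || return 2
  tickets=$(kubectl get volumeattachments.longhorn.io -n longhorn-system "$name" \
    -o jsonpath='{.spec.attachmentTickets}') || return 2

  # A nil map prints as an empty string, an empty map as "{}".
  if [ -n "$finalizers" ] && { [ -z "$tickets" ] || [ "$tickets" = "{}" ]; }; then
    echo "empty tickets, finalizers=$finalizers"
    return 0
  fi
  echo "LHVA has tickets or no finalizers"
  return 1
}
```

Usage: lhva_is_empty_but_finalized <VOLUME_NAME>. The LHVA shares its name with the volume, as in the output above.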
This issue occurs when a Migratable RWX live migration is interrupted during the engine switching phase, commonly because a VM is powered off, a node fails, or an upgrade event occurs.
During this phase, Longhorn expects to switch the frontend from the source engine to the destination engine. If the workload is stopped while this transition is in progress, both engines may be cleaned up or enter transient states.
As a result:
- Stale migration metadata remains in status.currentMigrationNodeID.

In this scenario, the presence of a VolumeAttachment resource is a symptom rather than the root cause.
If the workload or VM has already been shut down and the volume is stuck flapping, manually clear the stale migration metadata from the Volume status.
Clear volume.status.currentMigrationNodeID:
Force the volume to drop the migration reference in its status subresource. This stops the controller from attempting to finalize a nonexistent migration.
kubectl patch -n longhorn-system volume <VOLUME_NAME> \
--type=merge \
--subresource status \
-p '{"status":{"currentMigrationNodeID":""}}'
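For scripted use, the patch can be wrapped in a guard so it only runs when the stale condition is actually present. This is a sketch under assumptions, not a supported procedure: the function name clear_stale_migration is hypothetical, and the JSON path spec.migrationNodeID is assumed from the Spec.MigrationNodeID field mentioned earlier. The patch command itself is the one shown above.

```shell
# Sketch: clear status.currentMigrationNodeID only when no migration is requested.
# Assumption: spec.migrationNodeID is the JSON name of Spec.MigrationNodeID.
clear_stale_migration() {
  local volume="$1" spec_node status_node
  spec_node=$(kubectl get volume -n longhorn-system "$volume" \
    -o jsonpath='{.spec.migrationNodeID}') || return 2
  status_node=$(kubectl get volume -n longhorn-system "$volume" \
    -o jsonpath='{.status.currentMigrationNodeID}') || return 2

  # A non-empty spec field suggests a migration is still requested: do not touch it.
  if [ -n "$spec_node" ]; then
    echo "refusing: spec.migrationNodeID is set ($spec_node)" >&2
    return 1
  fi
  if [ -z "$status_node" ]; then
    echo "nothing to clear"
    return 0
  fi
  kubectl patch -n longhorn-system volume "$volume" \
    --type=merge \
    --subresource status \
    -p '{"status":{"currentMigrationNodeID":""}}'
}
```

Usage: clear_stale_migration <VOLUME_NAME>. Run it only after the workload or VM has been shut down, as stated above; the guard does not verify that.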
Confirm the volume has transitioned to the detached state.
$ kubectl get volume -n longhorn-system <VOLUME_NAME>
NAME            STATE      ROBUSTNESS   NODE
pvc-840804...   detached   unknown
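Because the original symptom is flapping, a single check can catch the volume in a momentary detached state. The sketch below samples the state several times before declaring it stable. The function name confirm_stable_detached and the default sample count and interval are hypothetical choices, not Longhorn recommendations.

```shell
# Sketch: require the volume to report "detached" for N consecutive samples.
# Arguments: volume name, number of samples (default 5), seconds between samples (default 2).
confirm_stable_detached() {
  local volume="$1" samples="${2:-5}" interval="${3:-2}" i=1 state
  while [ "$i" -le "$samples" ]; do
    state=$(kubectl get volume -n longhorn-system "$volume" \
      -o jsonpath='{.status.state}') || return 2
    if [ "$state" != "detached" ]; then
      echo "sample $i: state=$state (not stable)"
      return 1
    fi
    sleep "$interval"
    i=$((i + 1))
  done
  echo "stable: detached for $samples consecutive samples"
}
```

Usage: confirm_stable_detached <VOLUME_NAME> 10 3 samples the state ten times, three seconds apart, and fails on the first sample that is not detached.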
You can now safely restart the VM or workload.
© 2019-2026 Longhorn Authors | Documentation Distributed under CC-BY-4.0