Volume Recovery
Longhorn provides two mechanisms, Automatic Workload Pod Deletion and Automatic Volume Remounting, for keeping volumes functional in a variety of situations.
The Automatic Workload Pod Deletion mechanism is enabled by the setting Automatically Delete Workload Pod when The Volume Is Detached Unexpectedly.
When a volume is detached unexpectedly, Longhorn automatically attempts to delete workload pods that are managed by a controller (for example, Deployment, StatefulSet, or DaemonSet). After deletion, the controller restarts the workload pod and Kubernetes handles volume reattachment and remounting.
If you want to prevent Longhorn from automatically deleting workload pods, disable the setting Automatically Delete Workload Pod when The Volume Is Detached Unexpectedly on the Longhorn UI.
Longhorn does not delete pods without a controller because such pods cannot be restarted after deletion. To recover volumes that are unexpectedly detached, you must manually delete and restart the pods without a controller.
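The distinction between pods with and without a controller can be illustrated with a minimal Python sketch. It is not Longhorn's implementation. It assumes the pod is available as a parsed manifest (a plain dictionary) and relies on the standard Kubernetes `metadata.ownerReferences` field, where the managing owner is marked with `controller: true`. The function name and the sample pods are hypothetical.

```python
from typing import Any, Mapping


def is_managed_by_controller(pod: Mapping[str, Any]) -> bool:
    """Return True if the pod manifest lists an owner marked as its controller."""
    metadata = pod.get("metadata") or {}
    owners = metadata.get("ownerReferences") or []
    # Only an owner reference with controller set to True counts as the manager.
    return any(owner.get("controller") is True for owner in owners)


# Hypothetical pod created by a StatefulSet: the owner reference marks the controller.
managed_pod = {
    "metadata": {
        "name": "web-0",
        "ownerReferences": [
            {"kind": "StatefulSet", "name": "web", "controller": True}
        ],
    }
}

# Hypothetical standalone pod: it has no owner references at all.
bare_pod = {"metadata": {"name": "debug-shell"}}

print(is_managed_by_controller(managed_pod))
print(is_managed_by_controller(bare_pod))
```

A pod for which this check returns False is not recreated by anything after deletion, which is why such pods must be deleted and restarted manually.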
The Automatic Volume Remounting mechanism is not controlled by any specific setting.
The state of a volume can change to read-only when IO errors occur. IO errors can have a variety of causes.
Longhorn checks the state of the volume’s global mount point every 10 seconds. When the volume’s filesystem changes to read-only, Longhorn updates the condition to the volume’s data engine. Longhorn then automatically attempts to remount the global mount point on the host to change the state back to read-write. Upon successful remounting, the workload pods continue functioning without disruption. However, if the mount point becomes write-protected and Longhorn fails to remount it, you may still need to manually recreate the workload to force it to reattach and remount the volume.
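To make the read-only check concrete, the following Python sketch shows one way to detect a read-only mount by parsing text in the format of `/proc/mounts` on Linux. This is an illustration only and not how Longhorn implements its check. The function name, device paths, and mount points are hypothetical, and the sketch ignores the octal escaping that `/proc/mounts` applies to paths containing spaces.

```python
from typing import Optional

# Interval stated in the text above; a real checker would sleep this long between polls.
CHECK_INTERVAL_SECONDS = 10


def mount_is_read_only(mounts_text: str, mount_point: str) -> Optional[bool]:
    """Report whether mount_point is mounted read-only.

    mounts_text uses the /proc/mounts layout:
    device, mount point, filesystem type, options, dump, pass.
    Returns None when the mount point is not listed.
    """
    result = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[1] != mount_point:
            continue
        options = fields[3].split(",")
        # A later entry for the same path shadows earlier ones, so keep the last match.
        result = "ro" in options
    return result


# Hypothetical sample data; on a real host, read the text from /proc/mounts instead.
sample = (
    "/dev/longhorn/vol-a /mnt/globalmount-a ext4 rw,relatime 0 0\n"
    "/dev/longhorn/vol-b /mnt/globalmount-b ext4 ro,relatime 0 0\n"
)

print(mount_is_read_only(sample, "/mnt/globalmount-a"))
print(mount_is_read_only(sample, "/mnt/globalmount-b"))
```

Matching the option list exactly (rather than searching the whole line for `ro`) avoids false positives from options such as `errors=remount-ro`.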
Note: This mechanism might not work in some situations. For example, when the volume’s data engine crashes, Longhorn automatically detaches and reattaches the volume. The filesystem changes to read-only in this case. Longhorn will detect the read-only mode and update the state, but Automatic Volume Remounting cannot change it back to read-write because the device is now write-protected. In this case, you can only rely on the Automatic Workload Pod Deletion mechanism, which enables volume remounting after the workload pod is recreated.
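The interaction between the two mechanisms can be summarized as a small decision function. The Python sketch below is one reading of the behavior described in this section, not Longhorn's actual control flow. The function name, parameters, and return labels are all hypothetical.

```python
def expected_recovery_path(
    read_only: bool,
    write_protected: bool,
    controller_managed: bool,
    auto_delete_enabled: bool,
) -> str:
    """Return which recovery path this section describes for a given state."""
    if not read_only:
        # Filesystem is still read-write, so nothing needs to be recovered.
        return "none"
    if not write_protected:
        # Automatic Volume Remounting can switch the mount back to read-write.
        return "remount"
    if controller_managed and auto_delete_enabled:
        # Remounting cannot help; the pod is deleted and its controller recreates it.
        return "delete-pod"
    # No controller, or the setting is disabled: the user must recreate the workload.
    return "manual"


# Engine crash scenario from the note: read-only filesystem on a write-protected device.
print(expected_recovery_path(True, True, True, True))
```

The ordering reflects the note above: remounting is the less disruptive path, and pod deletion is the fallback when the device itself is write-protected.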
Automatic Workload Pod Deletion is triggered when unexpected failures happen. The controller deletes and then restarts the workload pod, and Kubernetes handles volume reattachment and remounting. The process may cause interruptions to the workload. If you want to prevent Longhorn from automatically deleting workload pods, disable the setting Automatically Delete Workload Pod when The Volume Is Detached Unexpectedly on the Longhorn UI.
Automatic Volume Remounting is triggered when the volume’s filesystem changes to read-only. Longhorn remounts the global mount point on the host to change the state back to read-write.
© 2019-2024 Longhorn Authors | Documentation Distributed under CC-BY-4.0