Recover Volume after Unexpected Detachment
Longhorn can automatically reattach and then remount volumes after an unexpected detachment, which can happen during a Kubernetes upgrade or a Docker reboot.
This section assumes familiarity with Linux storage concepts such as attaching and mounting volumes, and Kubernetes configuration of persistent volume storage.
To enable Longhorn to restart workloads after automatically reattaching and remounting volumes
After reattachment and remount are complete, you may need to manually restart the related workload containers to restore access to the volume if the following recommended setup is not applied.
The auto remount does not work for the xfs filesystem.
Mounting one more layer with the xfs filesystem is not allowed and will trigger the error XFS (sdb): Filesystem has duplicate UUID <filesystem UUID> - can't mount.
If you use the xfs filesystem, you will need to manually unmount, then mount the xfs filesystem on the host. The device path on the host for the attached volume is /dev/longhorn/<volume name>.
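The manual unmount and mount for an xfs volume can be sketched as a small shell helper. This is a minimal sketch, not a Longhorn tool: the function name remount_xfs_volume and its arguments are hypothetical, and the mount point is an assumption that depends on where the volume was mounted on your host. Only the device path /dev/longhorn/<volume name> comes from the text above.

```shell
#!/bin/sh
# Sketch: unmount, then mount an xfs-formatted Longhorn volume on the host.
# $1 = Longhorn volume name, $2 = host mount point (assumption, site-specific).
# $3 = optional command prefix; pass "echo" to print the commands
#      instead of running them.
remount_xfs_volume() {
    volume="$1"
    mount_point="$2"
    run="${3:-}"
    # Unmount the stale mount first; mount again only if that succeeded.
    $run umount "$mount_point" &&
    $run mount "/dev/longhorn/$volume" "$mount_point"
}
```

Calling remount_xfs_volume <volume name> <mount point> echo prints the two commands so they can be reviewed first; dropping the final echo runs them, which typically requires root.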
To recover unexpectedly detached volumes automatically, set restartPolicy to Always, then add a livenessProbe for the workloads using Longhorn volumes.
Those workloads will then be restarted automatically after reattachment and remount.
Here is an example of the setup:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: longhorn-volv-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: 2Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: volume-test
  namespace: default
spec:
  restartPolicy: Always
  containers:
  - name: volume-test
    image: nginx:stable-alpine
    imagePullPolicy: IfNotPresent
    livenessProbe:
      exec:
        command:
          - ls
          - /data/lost+found
      initialDelaySeconds: 5
      periodSeconds: 5
    volumeMounts:
    - name: volv
      mountPath: /data
    ports:
    - containerPort: 80
  volumes:
  - name: volv
    persistentVolumeClaim:
      claimName: longhorn-volv-pvc
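When the liveness probe fails after a detachment, Kubernetes restarts the container, and this shows up as an increased restart count in the pod status. One way to observe that is sketched below. The helper name container_restart_count is hypothetical, and the sketch assumes the pod has a single container, as in the example above.

```shell
#!/bin/sh
# Sketch: print how many times the first container of a pod has been restarted.
# $1 = namespace, $2 = pod name (for the example above: default, volume-test).
container_restart_count() {
    namespace="$1"
    pod="$2"
    kubectl -n "$namespace" get pod "$pod" \
        -o jsonpath='{.status.containerStatuses[0].restartCount}'
}
```

For the example manifest, container_restart_count default volume-test can be run before and after a reattachment to compare the two values.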
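The directory checked by livenessProbe will be <volumeMount.mountPath>/lost+found.
Don't set a short interval for livenessProbe.periodSeconds, e.g., 1s. The liveness command is CPU consuming.
Manually restarting the workload containers is needed only if:
The Longhorn volume has been reattached and remounted automatically.
The recommended setup above was not applied to the related workload.
The mount propagation is not Bidirectional, and the Longhorn remount operation won't be propagated to the workload containers if the containers are not restarted.
To restart the workload containers: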
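1. Find the node on which the workload's containers are running:
   kubectl -n <namespace of your workload> get pods <workload's pod name> -o wide
2. Connect to that node, e.g., with ssh.
3. Find the containers that belong to the workload:
   docker ps
   By checking the COMMAND and NAMES columns of the output, you can find the corresponding container.
4. Restart the container:
   docker restart <the container ID of the workload>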
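The restart procedure above can be sketched as three small shell helpers. This is a sketch under stated assumptions, not part of Longhorn: the function names are hypothetical, and the name filter assumes that the Docker container names on the node contain the pod name, which you should confirm against the NAMES column before restarting anything.

```shell
#!/bin/sh
# Sketch only. Namespace, pod name, and container ID are placeholders.

# Print the node the pod is scheduled on (run where kubectl is configured).
# $1 = namespace, $2 = pod name.
node_of_pod() {
    kubectl -n "$1" get pod "$2" -o jsonpath='{.spec.nodeName}'
}

# On that node: list running containers whose name contains the pod name,
# showing the ID, COMMAND, and NAMES columns. $1 = pod name.
list_pod_containers() {
    docker ps --filter "name=$1" \
        --format 'table {{.ID}}\t{{.Command}}\t{{.Names}}'
}

# On that node: restart the container picked from the list. $1 = container ID.
restart_container() {
    docker restart "$1"
}
```

node_of_pod runs from a machine with cluster access, while list_pod_containers and restart_container run on the node itself after connecting to it. Listing first and restarting a single chosen ID keeps the choice of container a manual step, as in the procedure above.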
© 2019-2024 Longhorn Authors | Documentation Distributed under CC-BY-4.0