Troubleshooting: Longhorn Manager Stuck in CrashLoopBackOff State Due to Inaccessible Webhook
January 17, 2025
Applies to Longhorn v1.5.0 and later.
The webhook services were merged into Longhorn Manager in v1.5.0. Because of the merge, Longhorn Manager now initializes the admission and conversion webhook services first during startup. To ensure that these services are accessible, Longhorn sends a request to the webhook service URL before starting the Longhorn Manager service.
In certain situations, the webhook service may become inaccessible and cause the Longhorn Manager pod to enter a CrashLoopBackOff state. This failure can lead to repeated attempts to restart the pod.
The following sections outline the most common root causes for this issue and their corresponding solutions.
Incorrect firewall configuration may block communication between pods on different nodes in your Kubernetes cluster. When this happens, Longhorn Manager is unable to reach the webhook service and enters the CrashLoopBackOff state.
Check your firewall rules and ensure that inter-pod communication is not blocked.
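As a quick connectivity check, you can try to reach the webhook health endpoint from a pod running on a different node than the Longhorn Manager pod. The commands below are a sketch; `<pod-name>` is a placeholder for any pod in your cluster that has `curl` available, and the health endpoint URL matches the one used elsewhere in this article.

```shell
# List Longhorn pods together with the nodes they run on,
# so you can pick a pod on a different node than longhorn-manager
kubectl -n longhorn-system get pods -o wide

# From that pod, try to reach the webhook service health endpoint.
# A timeout here (rather than a TLS or HTTP error) often points to
# blocked inter-node pod traffic.
kubectl -n longhorn-system exec -it <pod-name> -- \
  curl https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz
```

If the request times out only when the source and destination pods are on different nodes, inspect the firewall rules on those nodes for blocked pod-network (CNI) traffic.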
DNS resolution is crucial for accessing services via their internal Kubernetes DNS names. When DNS resolution is not functioning as expected, Longhorn Manager may be unable to reach the webhook service via its DNS name.
Open a shell in a running pod, and then check whether the webhook service is reachable via its DNS name by running the following commands:
kubectl exec -it <pod-name> -- /bin/bash
curl https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz
You can also check if either CoreDNS or Kube-DNS is running correctly. For more information, see Debugging DNS Resolution in the Kubernetes documentation.
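One way to verify the cluster DNS service, sketched below: check that the DNS pods are healthy, and resolve the webhook service name from a throwaway debug pod. The `busybox` image and the `k8s-app=kube-dns` label (which CoreDNS deployments also use) are assumptions that hold on most standard clusters.

```shell
# Check that the cluster DNS pods are running
# (CoreDNS deployments typically carry the k8s-app=kube-dns label)
kubectl -n kube-system get pods -l k8s-app=kube-dns

# Resolve the webhook service name from inside the cluster using a
# temporary debug pod that is deleted on exit
kubectl run dns-test --rm -it --image=busybox:1.36 --restart=Never -- \
  nslookup longhorn-conversion-webhook.longhorn-system.svc
```

If `nslookup` fails while the DNS pods appear healthy, follow the Debugging DNS Resolution guide in the Kubernetes documentation to inspect the pod's `/etc/resolv.conf` and the CoreDNS logs.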
Hairpinning allows a pod to access itself via its service IP. In some cases, however, a pod may fail to access a service via the service’s internal DNS name. This issue is common in single-node clusters and may also occur in some multi-node clusters.
Verify that the hairpin-mode flag, which ensures that a pod can access itself via its service IP, is set correctly. For more information, see Edge case: A Pod fails to reach itself via the Service IP in the Kubernetes documentation.
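One way to inspect the setting, assuming you have shell access to the node: check the kubelet's command line for the hairpin mode. Note that how kubelet is launched varies by distribution, so the exact process arguments may differ.

```shell
# Run on the node itself: look for the hairpin mode in the kubelet
# process arguments (absence usually means the default is in effect)
ps aux | grep kubelet | grep -o 'hairpin-mode=[a-z-]*'

# Valid values are "promiscuous-bridge" (the default), "hairpin-veth",
# and "none"; with "none", a pod cannot reach itself via its Service IP.
```

If the mode is set to `none`, reconfigure kubelet to use `promiscuous-bridge` or `hairpin-veth` and restart it.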