Troubleshooting: NoExecute taint prevents workloads from terminating
July 2, 2024
Applies to all Longhorn versions.
Applying a NoExecute taint to a node causes all pods on that node that cannot tolerate the taint to terminate. Users may expect pods using Longhorn volumes and managed by a controller (e.g. Deployment pods) to be able to restart on different nodes after the taint is applied. However, the replacement pods remain ContainerCreating and the old pods remain Terminating indefinitely.
For example, after applying a taint to the node running this Deployment pod, the cluster remains in the following state.
eweber@laptop:~/> kubectl taint node eweber-v126-worker-9c1451b4-kgxdq k=v:NoExecute
node/eweber-v126-worker-9c1451b4-kgxdq tainted
eweber@laptop:~/website> kubectl get pod -owide
mysql-56c8c6775b-8f5gh 0/1 Terminating 0 30m 10.42.2.58 eweber-v126-worker-9c1451b4-kgxdq <none> <none>
mysql-56c8c6775b-rph8k 0/1 ContainerCreating 0 28m <none> eweber-v126-worker-9c1451b4-rw5hf <none> <none>
Describing the replacement pod reveals the immediate cause for the lack of progress.
...
eweber@laptop:~/> kubectl describe pod mysql-56c8c6775b-rph8k
Warning FailedAttachVolume 27m attachdetach-controller Multi-Attach error for volume "pvc-b23fce3b-cada-43a9-89b8-2eb9b97e39c9" Volume is already exclusively attached to one node and can't be attached to another
The above Multi-Attach error is generated by Kubernetes, not Longhorn. Kubernetes will not create a VolumeAttachment requesting that Longhorn attach the volume to a new node until the volume is detached from the old one. Currently, there is only one VolumeAttachment for the volume in the cluster, and it still references the old node.
eweber@laptop:~/> kubectl get volumeattachment
csi-f1a6eae2f691ad48e39aca528454c64f7c46a3e6c446e6f9a8ecd960895bf0b6 driver.longhorn.io pvc-b23fce3b-cada-43a9-89b8-2eb9b97e39c9 eweber-v126-worker-9c1451b4-kgxdq true 31m
In fact, Kubernetes will not attempt to delete this old VolumeAttachment until kubelet on the old node reports that all pods using the volume have been successfully torn down. As long as the old pod is stuck Terminating, no progress can be made.
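To see which node Kubernetes still considers a volume attached to, the VolumeAttachment list can be filtered by PersistentVolume name. The following is a minimal sketch, not an official Longhorn tool: the function name pv_attachments is hypothetical, and it assumes kubectl is configured for the cluster and awk is available.

```shell
# Hypothetical helper: print "<attachment> <node> <attached>" for every
# VolumeAttachment that references the given PersistentVolume.
pv_attachments() {
  pv="$1"
  kubectl get volumeattachment \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.spec.source.persistentVolumeName}{" "}{.spec.nodeName}{" "}{.status.attached}{"\n"}{end}' |
    awk -v pv="$pv" '$2 == pv { print $1, $3, $4 }'
}
```

Calling it as `pv_attachments pvc-b23fce3b-cada-43a9-89b8-2eb9b97e39c9` lists only the attachments for that volume, which makes it easier to spot an attachment that still points at the tainted node.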
The kubelet logs on the old node reveal the reason for the long termination.
Jun 28 20:40:37 eweber-v126-worker-9c1451b4-kgxdq k3s[753]: I0628 20:40:37.391775 753 reconciler_common.go:172] "operationExecutor.UnmountVolume started for volume \"mysql-volume\" (UniqueName: \"kubernetes.io/csi/driver.longhorn.io^pvc-b23fce3b-cada-43a9-89b8-2eb9b97e39c9\") pod \"942ef69b-2e06-47f3-8bac-cc738c8baa00\" (UID: \"942ef69b-2e06-47f3-8bac-cc738c8baa00\") "
Jun 28 20:40:37 eweber-v126-worker-9c1451b4-kgxdq k3s[753]: E0628 20:40:37.391968 753 nestedpendingoperations.go:348] Operation for "{volumeName:kubernetes.io/csi/driver.longhorn.io^pvc-b23fce3b-cada-43a9-89b8-2eb9b97e39c9 podName:942ef69b-2e06-47f3-8bac-cc738c8baa00 nodeName:}" failed. No retries permitted until 2024-06-28 20:42:39.391919153 +0000 UTC m=+10744.364754873 (durationBeforeRetry 2m2s). Error: UnmountVolume.TearDown failed for volume "mysql-volume" (UniqueName: "kubernetes.io/csi/driver.longhorn.io^pvc-b23fce3b-cada-43a9-89b8-2eb9b97e39c9") pod "942ef69b-2e06-47f3-8bac-cc738c8baa00" (UID: "942ef69b-2e06-47f3-8bac-cc738c8baa00") : kubernetes.io/csi: Unmounter.TearDownAt failed to get CSI client: driver name driver.longhorn.io not found in the list of registered CSI drivers
The pod can’t be torn down because the Longhorn CSI plugin is no longer registered with kubelet. It is the plugin’s responsibility to ensure that Longhorn volumes are unmounted safely. But in this cluster, Longhorn has not been configured to tolerate the applied NoExecute taint. This has caused Longhorn pods (including the Longhorn CSI plugin pod) to terminate. No progress can be made without a running Longhorn CSI plugin on the node.
eweber@laptop:~/> kubectl -n longhorn-system get pod -owide | grep eweber-v126-worker-9c1451b4-kgxdq
# Empty...
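Another way to confirm the missing registration without reading kubelet logs is to inspect the node's CSINode object, which kubelet normally keeps in step with its list of registered CSI drivers. This is a sketch under that assumption: the function name longhorn_csi_registered is hypothetical, and only the driver name driver.longhorn.io comes from the logs above.

```shell
# Hypothetical helper: succeed (exit 0) if driver.longhorn.io is listed
# in the CSINode object of the given node, fail otherwise.
longhorn_csi_registered() {
  kubectl get csinode "$1" -o jsonpath='{.spec.drivers[*].name}' |
    tr ' ' '\n' |
    grep -qxF 'driver.longhorn.io'
}
```

For example, `longhorn_csi_registered eweber-v126-worker-9c1451b4-kgxdq || echo "Longhorn CSI driver not registered"` prints the message when the driver is absent from the node's driver list.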
If you plan to apply NoExecute taints to nodes that run Longhorn volumes, configure Longhorn to tolerate them. This is best done at install time, as tolerations cannot be updated while volumes are attached.
If you need to apply a NoExecute taint and cannot change the taint toleration setting beforehand, drain the node first and then apply the intended taint. Draining gives all workloads consuming Longhorn volumes the opportunity to migrate to other nodes. By the time the NoExecute taint is applied, Longhorn components no longer need to run on the tainted node.
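The drain-then-taint order can be sketched as a small shell function. The function name drain_then_taint is hypothetical, and the drain flags shown are an assumption about a typical cluster: --ignore-daemonsets skips DaemonSet-managed pods and --delete-emptydir-data discards emptyDir contents, so adjust or drop them to match your workloads.

```shell
# Hypothetical helper: drain the node, and apply the taint only if the
# drain completed successfully.
drain_then_taint() {
  node="$1"
  taint="$2"   # for example: k=v:NoExecute
  kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data &&
    kubectl taint node "$node" "$taint"
}
```

Chaining the two commands with && means a failed or interrupted drain leaves the node untainted, so Longhorn pods are not evicted while volumes may still be mounted there.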
If you already applied a NoExecute taint that Longhorn can’t tolerate and are stuck in the situation described, there are two options.
Remove the NoExecute taint and wait for the situation to resolve itself. Once the Longhorn CSI plugin restarts on the old node, pod termination will complete successfully. Then, follow the best practices above.
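If the option above is read as removing the taint from the node, it corresponds to running kubectl taint with a trailing hyphen on the taint specification. A minimal sketch, where the function name remove_taint is hypothetical and the taint k=v:NoExecute is the one from the example earlier:

```shell
# Hypothetical helper: remove a taint from a node. kubectl treats a
# trailing "-" on the taint specification as a removal request.
remove_taint() {
  node="$1"
  taint="$2"   # for example: k=v:NoExecute
  kubectl taint node "$node" "${taint}-"
}
```

After `remove_taint eweber-v126-worker-9c1451b4-kgxdq k=v:NoExecute`, rerunning the `kubectl -n longhorn-system get pod -owide` check shown earlier is one way to watch for the Longhorn pods returning to the node.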