Troubleshooting: Instance manager pods are restarted every hour
Phan Le | February 25, 2022
v1.0.1 or newer
Each Longhorn volume has one engine and one or more replicas (see the Longhorn architecture documentation for more detail).
When a Longhorn volume is attached, Longhorn launches a process for each engine/replica object.
- The engine process is launched inside an engine instance manager pod (an instance-manager-e-xxxxxxxx pod in the longhorn-system namespace).
- The replica process is launched inside a replica instance manager pod (an instance-manager-r-xxxxxxxx pod in the longhorn-system namespace).
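To see these pods, assuming a default installation in the longhorn-system namespace, you can list them with kubectl and check their age to spot the hourly restarts:

```shell
# List Longhorn instance manager pods
# (namespace assumed to be longhorn-system, the default installation namespace)
kubectl -n longhorn-system get pods | grep instance-manager

# If the pods are affected by this issue, the AGE column will never
# exceed roughly one hour.
```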
The instance manager pods are restarted every hour. As a consequence, the Longhorn volumes and the workload pods crash every hour.
One potential root cause is that the cluster has a default PriorityClass (i.e., a PriorityClass with the globalDefault field set to true) while the Priority Class setting in Longhorn is empty.
See the Kubernetes documentation for more about PriorityClass.
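For illustration, a default PriorityClass looks like the following sketch (the name and value here are hypothetical):

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: example-default   # hypothetical name
value: 10000              # scheduling priority assigned to matching pods
globalDefault: true       # applied to any pod created without a priorityClassName
description: "Cluster-wide default PriorityClass"
```

You can check whether your cluster has such a class with `kubectl get priorityclass` and look for `true` in the GLOBAL-DEFAULT column.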
When Longhorn creates the instance manager pods, it does not set a PriorityClass for them because the Priority Class setting in Longhorn is empty. Since the cluster has a default PriorityClass, Kubernetes automatically applies it to newly created pods that have no priorityClassName. Later on, Longhorn detects the mismatch between the actual PriorityClass of the instance manager pods and the PriorityClass in the Longhorn setting, so it deletes and recreates the pods. This happens every hour because Longhorn resyncs all settings every hour.
Set the Priority Class setting in Longhorn to the same value as the cluster's default PriorityClass.
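As a rough sketch, the setting can be updated from the command line through Longhorn's Setting custom resource (the setting name priority-class and the class name example-default are assumptions here; you can also change it in the Longhorn UI under Setting > General):

```shell
# Find the cluster's default PriorityClass
# (look for "true" in the GLOBAL-DEFAULT column)
kubectl get priorityclass

# Edit Longhorn's Priority Class setting to match the default class,
# e.g. set the value to example-default (hypothetical name)
kubectl -n longhorn-system edit settings.longhorn.io priority-class
```

After the setting is updated, Longhorn recreates the instance manager pods one final time with the matching PriorityClass, and the hourly restarts stop.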