Troubleshooting: Open-iSCSI on RHEL based systems

Keith Lucas | February 22, 2022

Applicable versions

All Longhorn versions.

Symptoms

The iscsi.service systemd service may add about 2-3 minutes to the boot time of a node if the node is restarted while Longhorn volumes are attached to it.

Background

Longhorn uses open-iscsi to create block devices. The open-iscsi RPM (iscsi-initiator-utils) on Red Hat Enterprise Linux based systems ships several systemd services. Among them, iscsi.service reestablishes iSCSI connections upon reboot by reading the node database stored in /var/lib/iscsi/nodes.

Longhorn uses the iscsiadm command to create an iSCSI block device each time a Longhorn volume is attached. This creates a subdirectory in /var/lib/iscsi/nodes. If Longhorn is able to detach the volume from the node cleanly, it removes that subdirectory. However, if the node crashes or is rebooted while a Longhorn volume is still attached to a pod running on that node, the subdirectory in /var/lib/iscsi/nodes remains behind.
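On an affected node, the leftover records can be inspected directly. A minimal sketch, assuming the node database path described above; the Longhorn IQN shown in the comment is an example of what such a record typically looks like:

```shell
# List leftover iSCSI node records on the node (run as root).
# Falls back to a notice when no node database exists.
NODES_DIR=/var/lib/iscsi/nodes
if [ -d "$NODES_DIR" ]; then
  # One subdirectory per recorded target, e.g. iqn.2019-10.io.longhorn:<volume-name>
  RECORDS=$(find "$NODES_DIR" -mindepth 1 -maxdepth 1 -type d)
  echo "$RECORDS"
else
  RECORDS="no iSCSI node database at $NODES_DIR"
  echo "$RECORDS"
fi
```

Any subdirectories listed here after all Longhorn volumes have been detached are the stale records that iscsi.service will try to act on at the next boot.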

Solution

If iscsi.service is enabled on the node, the service will attempt to reconnect to every target recorded in the leftover /var/lib/iscsi/nodes subdirectories at boot, and the stale Longhorn records cause it to wait until those attempts time out. In most cases, Longhorn is the only user of iSCSI on the node. In that case, it is recommended to disable iscsi.service on the node:

systemctl disable iscsi.service

It may still be possible to use iscsi.service as intended for non-Longhorn iSCSI devices. In that case, do not modify the global /etc/iscsi/iscsid.conf on behalf of the non-Longhorn devices, because Longhorn relies on the default configuration.
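Instead of editing the global file, per-record settings for a non-Longhorn target can be updated with iscsiadm's node-mode update operation. A sketch; the target IQN and portal below are hypothetical placeholders to substitute with your own:

```shell
# Hypothetical non-Longhorn target and portal -- replace with real values.
TARGET="iqn.2003-01.example.com:storage.disk1"
PORTAL="192.168.1.10:3260"

if command -v iscsiadm >/dev/null 2>&1; then
  # Update only this record's startup mode; /etc/iscsi/iscsid.conf stays untouched.
  if iscsiadm -m node -T "$TARGET" -p "$PORTAL" \
       -o update -n node.startup -v automatic; then
    STATUS="record updated"
  else
    STATUS="update failed (no such record?)"
  fi
else
  STATUS="iscsiadm not installed on this machine"
fi
echo "$STATUS"
```

This keeps the defaults Longhorn depends on intact while letting the non-Longhorn record start automatically.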


© 2019-2022 Longhorn Authors | Documentation Distributed under CC-BY-4.0
