SLES 11 SP4 System unresponsive when CIFS mount is down



duwallace
01-Dec-2015, 16:14
I'm having an issue with CIFS mounts defined on a SLES 11 SP4 server: when a mount becomes unavailable, the server becomes extremely slow to respond. In this configuration, the mount AA_NexPress_1 is associated with a folder on a Windows Server. If, for example, the Windows Server is powered down, RDP sessions to the SLES server pretty much lock up. I can ssh into the SLES server, go to any directory, and run a simple ll listing, except in the root directory (/), where these mounts are defined. As soon as the NexPress Windows Server is up and the directory is reachable, the SLES server immediately starts responding normally.

Below are the directory listings for the mount points, as well as the mount definitions in fstab.


drwxr-xr-x 2 root root 0 Jun 2 11:10 AA_NexPress_1
drwxr-xr-x 2 root root 0 Jun 24 13:57 AA_NexPress_2

//10.5.4.51/HotFolder /AA_NexPress_1 cifs auto,domain=ebnt-1,username=dbcuser,password=xxxxxx 0 0
//10.5.4.52/HotFolder /AA_NexPress_2 cifs auto,domain=ebnt-1,username=dbcuser,password=xxxxxx 0 0

Is there a parameter I can set in fstab so that it times out if the mount becomes unavailable?

jmozdzen
01-Dec-2015, 17:10
Hi duwallace,

have you tried soft-mounting (-o soft, or adding "soft" to the options list in /etc/fstab) the shares?
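
For illustration, this is roughly how the first fstab entry from the original post might look with "soft" added; everything except the added option is copied from that post. One caveat: as far as I know, the mount.cifs man page lists "soft" as the default behavior for CIFS, so adding it explicitly may not change much.

```
//10.5.4.51/HotFolder /AA_NexPress_1 cifs auto,soft,domain=ebnt-1,username=dbcuser,password=xxxxxx 0 0
```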

Regards,
Jens

duwallace
07-Dec-2015, 19:25
> Hi duwallace,
>
> have you tried soft-mounting (-o soft, or adding "soft" to the options list in /etc/fstab) the shares?
>
> Regards,
> Jens


Hi Jens,

Sorry for the delay in responding. I did try adding "soft" to fstab, but the mount stays in place until we issue the 'umount -fl /AA_NexPress_1' command, after which the system responds normally. Just hoping to get it to drop when the mount is unavailable.

Thanks,
Duane

jmozdzen
08-Dec-2015, 13:50
Hi Duane,

> Just hoping to get it to drop when the mount is unavailable.

I believe you can only influence whether requests to the mounted FS time out rather than being retried indefinitely. I don't know of a way to implement an "unmount on error" approach other than monitoring system messages (or running some access test) and then issuing "umount -fl" in response.
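
A rough sketch of the "access test, then umount -fl" idea could look like the following. This is only an illustration under some assumptions: the function names (server_reachable, is_mounted, check_share) and the 5-second timeout are made up for this example, it assumes bash with /dev/tcp support and the coreutils "timeout" command are available, and it probes TCP port 445 on the server rather than touching the mount point, so the check itself should not hang on a stale share.

```shell
#!/bin/bash
# Hypothetical watchdog sketch: lazily unmount a CIFS share when its server
# stops answering. Names and the timeout value are illustrative assumptions.

# Seconds to wait for the server before treating it as down (assumed value).
PROBE_TIMEOUT="${PROBE_TIMEOUT:-5}"

# Return 0 if the server accepts a TCP connection on the SMB port (445).
# The probe runs in a child bash so that /dev/tcp is available.
server_reachable() {
    local host="$1"
    timeout "$PROBE_TIMEOUT" bash -c "exec 3<>/dev/tcp/$host/445" 2>/dev/null
}

# Return 0 if the directory is listed as a cifs mount in /proc/mounts.
# Reading /proc/mounts does not access the share itself.
is_mounted() {
    local mountpoint="$1"
    grep -qsF " $mountpoint cifs " /proc/mounts
}

# Check one share: report "ok" if the server answers, force a lazy unmount
# if the server is down and the share is still mounted, otherwise "down".
check_share() {
    local host="$1" mountpoint="$2"
    if server_reachable "$host"; then
        echo "ok $mountpoint"
    elif is_mounted "$mountpoint"; then
        umount -fl "$mountpoint" && echo "unmounted $mountpoint"
    else
        echo "down $mountpoint"
    fi
}
```

Saved to a file and sourced from a small wrapper, this could be run as root from cron every minute or so with calls such as "check_share 10.5.4.51 /AA_NexPress_1" and "check_share 10.5.4.52 /AA_NexPress_2". It does not remount the share once the server comes back; that would still need a "mount /AA_NexPress_1" (or similar) when check_share reports "ok" and the share is not mounted.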

Regards,
Jens