Thread: NFS client freezes when NFS server system loses connection

  1. NFS client freezes when NFS server system loses connection

    I have a few servers on a LAN where I share out the /scratch folder on each via NFS-server,
    and on each server using NFS-client I mount the other's scratch folders.
    Problem arises when any one of the servers goes down, for example a reboot.
    When an NFS-server is down, I cannot log in to any other system which is an NFS-client to that server.
    Specifically:
    - an SSH connection to the NFS-client system can be established successfully,
    - you can enter your username and password,
    - but after entering your password you never get a prompt and the SSH connection is frozen.
    - Once the NFS-server system finally becomes available again, however long that takes, you get the prompt and can use that SSH connection.

    How can I make it so that this does not happen?

    There is this at StackExchange, which further describes the problem in general; however, there does not seem to be a solution:
    http://unix.stackexchange.com/questi...-client-system




    my /etc/fstab file contains: hpc2:/scratch /scratch_hpc2 nfs defaults 0 0

    the /etc/exports file on the nfs-server system contains: /scratch <ip_address>(rw,root_squash,sync,no_subtree_check)

    I set up the NFS-server via YaST; I do NOT use NFSv4, nor do I use GSS security,

    and /proc/mounts from an nfs-client system shows:

    hpc2:/scratch /scratch_hpc2 nfs rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=<ip_address>,mountvers=3,mountport=40551,mountproto=udp,local_lock=none,addr=<ip_address> 0 0

  2. Re: NFS client freezes when NFS server system loses connection

    On 09/19/2016 09:14 AM, ron7000 wrote:
    >
    > I have a few servers on a LAN where I share out the /scratch folder on
    > each via NFS-server,
    > and on each server using NFS-client I mount the other's scratch
    > folders.
    > Problem arises when any one of the servers goes down, for example a
    > reboot.
    > When an NFS-server is down, I cannot log in to any other system which
    > is an NFS-client to that server.
    > Specifically:
    > - an SSH connection to the NFS-client system can be established
    > successfully,
    > - you can enter your username and password,
    > - but after entering your password you never get a prompt and the SSH
    > connection is frozen.
    > - After however long if the NFS-server system finally becomes available,
    > you will get the prompt and be able to use that SSH connection.
    >
    > How can I make it so that this does not happen?
    >
    > there is this at stackexchange which further describes the problem in
    > general, however there does not seem to be a solution:
    > http://unix.stackexchange.com/questi...-client-system


    The biggest difference I see between your post and the one at
    StackExchange is that you are having a problem merely on a login, where
    the StackExchange thread has a problem when somebody tries to access the
    mountpoint explicitly.

    Are you trying, via login somehow (.bashrc, .profile, user's home
    directory assignment), to access that mountpoint? If so, why?

    Do not try to access mountpoints that are not there; that's just silly.

    Does this happen if you just launch a new bash shell via an
    already-logged-in user?

    Code:
    bash
    --
    Good luck.

    If you find this post helpful and are logged into the web interface,
    show your appreciation and click on the star below...

  3. Re: NFS client freezes when NFS server system loses connection

    Hi,

    try mounting it with the "soft" option. From "man 5 nfs":

    soft / hard
    Determines the recovery behavior of the NFS client after an NFS request times out. If neither option is specified (or if the hard option is specified), NFS requests are retried indefinitely. If the soft option is specified, then the NFS client fails an NFS request after retrans retransmissions have been sent, causing the NFS client to return an error to the calling application.
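
    A minimal sketch of what the /etc/fstab entry from the first post might look like with the soft option. The timeo (in tenths of a second) and retrans values are illustrative assumptions, not tuned recommendations:

    ```
    hpc2:/scratch /scratch_hpc2 nfs soft,timeo=100,retrans=3 0 0
    ```

    With settings like these, a request to an unreachable server should eventually return an I/O error to the application instead of blocking indefinitely. "man 5 nfs" also notes that a soft timeout can cause silent data corruption in certain cases, so this is a trade-off rather than a free fix.
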
    Regards,
    J
    Last edited by jmozdzen; 23-Sep-2016 at 15:42.
    From the times when today's "old school" was "new school"

  4. Re: NFS client freezes when NFS server system loses connection

    everything i've read about the soft option says to never use it, and can easily lead to data corruption.

    > Do not try to access mountpoints that are not there; that's just silly.
    i think you are missing my point, otherwise you are effectively saying don't use nfs because there is no way to handle the situation when a system that is an nfs-server goes down.

    When all systems are up and running, everything is fine and the mount points work. The problem is that when one system goes down, for whatever reason and at any point in time, every user with a shell open on any of the other systems gets a frozen shell. Yes, the mount points to the now-offline system are still there. I have no way of automatically knowing exactly when a system goes down so that I can get in quickly and undo all mount points to it.

    This freezing happens in any shell window that any user already has open,
    and also when you try to log in to an nfs-client system where the server behind one of its mount points is down.
    This is completely separate from having a working shell window where at the prompt you try to do: cd /folder_from_nfsserver_thathascrashed/
    What I am saying is that when an nfs-server goes down, the shell windows that are already open become unresponsive until the nfs-server comes back online.
    Even when I am logged in under my user account I cannot switch user with su to become root and then type umount /folder_from_nfsserver_thathascrashed/
    After doing su and typing the root password, I never get a prompt !!!
    Last edited by ron7000; 27-Sep-2016 at 18:51.

  5. Re: NFS client freezes when NFS server system loses connection

    On 09/27/2016 11:54 AM, ron7000 wrote:
    >
    > everything i've read about the soft option says to never use it, and can
    > easily lead to data corruption.
    >
    >> Do not try to access mountpoints that are not there; that's just silly.

    >
    > i think you are missing my point, otherwise you are effectively saying
    > don't use nfs because there is no way to handle the situation when a
    > system that is an nfs-server goes down.


    I definitely do not mean to discourage NFS use that much. I also cannot
    duplicate your issue, at least not entirely, though my system's setup
    (SLES 12) is using NFSv4, which you indicated is not the case with your
    systems. One thing that is similar-ish is that if I am on an NFS client
    system, and I then stop the NFS service on my server (not stopping the
    entire server yet, just its NFS service), and then I go into the parent
    directory of my mountpoint (/import), I cannot do a long-listing of the
    directory contents, which would effectively be the mount points. Where
    you have your mount point, I think, right off the root of your filesystem
    (/), perhaps that is something similar. If you have something in a login
    script trying to do a listing of the filesystem root, or doing something
    equivalent with another command, perhaps that is where it blocks because
    of the inability to look at the mount point fully.

    Even in that case, after I lock up the 'ls' command, hitting Ctrl+c fixes
    it, so perhaps try that during your SSH login that hangs to see if you are
    in a login script. If you are, figure out which one, and where, and maybe
    a fix can be implemented that way too. It would be neat to see what
    happens if you move your mountpoints one level deeper, for example,
    instead of /scratch_hpc2 you put them at /import/scratch_hpc2 or somewhere
    else not as public as the root of the filesystem.

    If you cannot get in even with Ctrl+c after the SSH login takes the
    username and password successfully, the next step is to figure out what is
    blocking, and for that I would use strace. While already logged into the
    box, find a command that locks things up, as loading a new shell
    ('bash') might. Then run it under strace as follows, and be prepared for a
    lot of output, so be sure your terminal scrollback is big:

    Code:
    strace -ttt -ff bash
    Post the output here and let's see what the last line is.
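
    If the output scrolls away too quickly, a sketch like the following writes it to files instead. It relies on strace's -o option, which combined with -ff produces one file per traced process (PREFIX.PID). The /tmp/bash-trace prefix and both function names are made up for this example:

    ```shell
    # Run a login shell under strace, one output file per process.
    # The default prefix /tmp/bash-trace is an arbitrary choice for this example.
    capture_trace() {
        strace -ttt -ff -o "${1:-/tmp/bash-trace}" bash -l
    }

    # Print the final line of every per-process trace file. For a process that
    # is stuck, that last line is the system call it never returned from.
    last_trace_lines() {
        prefix="${1:-/tmp/bash-trace}"
        for f in "$prefix".*; do
            [ -f "$f" ] || continue
            printf '%s: %s\n' "$f" "$(tail -n 1 "$f")"
        done
    }
    ```

    Run capture_trace in one terminal while the NFS server is down, then call last_trace_lines from a second terminal (or after Ctrl+c) to see where each process stopped.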

    --
    Good luck.

  6. Re: NFS client freezes when NFS server system loses connection

    Hi ron7000,
    Originally posted by ron7000:
    > everything i've read about the soft option says to never use it, and can easily lead to data corruption.

    Well, I haven't seen that happen so far, but of course the theoretical risk is there. So YMMV.

    The hanging sessions, including logins, point to hanging NFS mounts somewhere in a search path: either the shell search path ($PATH) or some access path used by one of the commonly started programs. I even had Eclipse traverse a multitude of directories that should have been none of its business... and I only noticed because those accesses hung on a bad NFS mount point.
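
    To check the $PATH theory without touching a possibly dead mount, a sketch like this compares strings only: it reads the NFS mount points from /proc/mounts and reports any $PATH entry located on or below one of them. The function name is made up, and mount points containing spaces are not handled:

    ```shell
    # Print every PATH entry that sits on or below an NFS mount point.
    # Only strings are compared, so nothing here accesses the mount itself.
    # $1: mounts file (default /proc/mounts), $2: path list (default $PATH)
    nfs_path_entries() {
        mounts_file="${1:-/proc/mounts}"
        path_list="${2:-$PATH}"
        # Field 2 is the mount point, field 3 the filesystem type.
        nfs_mounts=$(awk '$3 == "nfs" || $3 == "nfs4" { print $2 }' "$mounts_file")
        printf '%s\n' "$path_list" | tr ':' '\n' | while IFS= read -r dir; do
            for mp in $nfs_mounts; do
                case "$dir/" in
                    "$mp"/*) printf '%s is on NFS mount %s\n' "$dir" "$mp" ;;
                esac
            done
        done
    }
    ```

    Calling nfs_path_entries with no arguments checks the current shell's $PATH. No output means no $PATH entry lives on NFS, and the hang would have to come from some other access path.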

    So from my point of view, you only have two options with NFS:

    Either you have the client wait for any NFS access ad infinitum, which will help to avoid corruption but may cause severe "hangs" from the users' POV.

    Or you will allow the I/O operations to fail when the server is down.

    If you've found a third option, please let me know - we run many systems relying on NFS mounts all over the place, with some of the servers intentionally unavailable over extended periods of time. So we'd suffer from hangs without the "soft" option, if those resources weren't properly unmounted before killing the server. Which can easily happen when the NFS server is a mobile system.
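
    Not a third option, but a sketch of how a client could at least notice an unresponsive server automatically, which was asked for earlier in the thread. It assumes the coreutils timeout command and a kernel on which a process blocked on a hard NFS mount can still be killed with SIGKILL. The function name and the 5-second default are made up for this example:

    ```shell
    # Return 0 if the path answers a stat() within the time limit.
    # Returns non-zero if the path is missing or the call had to be killed.
    # $1: path to probe, $2: time limit in seconds (default 5, arbitrary)
    mount_responds() {
        timeout -s KILL "${2:-5}" stat "$1" >/dev/null 2>&1
    }
    ```

    A cron job or monitoring check could call, for example, mount_responds /scratch_hpc2 and raise an alert on failure, so that the stale mount can be dealt with before users run into it.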

    Regards,
    J
    Last edited by jmozdzen; 04-Oct-2016 at 17:01.
