SLES-Other SLES 9,1 NFS issue



krebstar3
05-Sep-2018, 23:42
Hello everyone,

First let me preface by saying, I know this is very old, and if I did not need the server to continue, I would scrap it and install a fresh new OS. However, that being the case, I am stuck with it for now.

I have everything working as I would expect, except NFS. For some reason, when I run mount -t nfs server:/share/123 /mnt/share, it gives me an error:

mount server reported tcp not available, falling back to udp
mount: RPC: Program not registered

I have other linux servers that can access it just fine. Here is the weird part...it was working before. We needed to reboot the server, and when we did, it just stopped working. /etc/fstab has not changed, but for some reason it will not mount the NFS share any longer.

Things I have tried:

stopping/restarting/rebooting portmap, nfs, nfsboot, nfslock
Messing about with the firewall - currently shut off
creating new local folders to try and mount it there
checked and rechecked /etc/exports on the nfs server
tried mounting via IP rather than name
checked /etc/hosts.allow /hosts.deny
Verified that the NFS server is hosting NFS 4,3,2

I am able to ping by name and IP. I can SSH and SCP.

Is there something I am missing? Normally setting up NFS is super fast and easy...

Appreciate any help you can give.

All the best,

kreb

malcolmlewis
06-Sep-2018, 03:25
Hi and welcome to the Forum :)
Sure it's not a routing problem, gateway changed etc?

jmozdzen
06-Sep-2018, 12:21
Hi kreb,

please give it a try with "mount -t nfs -o proto=udp,mountproto=udp server:/share/123 /mnt/share" and / or specifying "vers=3" if that fails, just for checks.

My guess is that an update of the tirpc package on the client machine might have changed things, but that's just a "pain in the bones, might be that rain's coming up" type of feeling ;)

Regards,
J

krebstar3
06-Sep-2018, 15:33
Thanks for the tips.

@malcolmlewis

I can't be entirely certain it isn't a routing problem; however, no routing should be involved, as both machines are on the same network and plugged into the same switch.

@jmozdzen

Those options don't work either:

# mount -t nfs -o proto=udp,vers=3 server:/share /mnt/share
mount server reported tcp not available, falling back to udp
mount: RPC: Program not registered


Any other ideas?

jmozdzen
06-Sep-2018, 15:47
Had you tested "mountproto=udp" as well?

What does "rpcinfo -p <server>" (<server> as specified in your mount command) return? "RPC: Program not registered" means that the (NFS) client asked the portmapper running on the server for a specific "program, version, protocol" combination (see the output from rpcinfo for a list of all currently registered prog/vers/proto combos) and the server's portmapper replied that no matching program was registered.
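As a rough sketch of that check: the shell function below reads "rpcinfo -p" output on stdin and lists which of the programs an NFSv3 mount needs are missing from it. The function name missing_nfs_programs is made up for illustration; 100003 (nfs) and 100005 (mountd) are the standard RPC program numbers for those services.

```shell
# Hypothetical helper: reads "rpcinfo -p" output on stdin and prints the
# names of the RPC programs an NFSv3 mount needs that are NOT registered.
missing_nfs_programs() {
    awk '
        $1 == "100003" { nfs = 1 }      # nfs
        $1 == "100005" { mountd = 1 }   # mountd
        END {
            if (!nfs)    print "nfs"
            if (!mountd) print "mountd"
        }
    '
}

# Usage (assumes "server" is the name used in the mount command):
#   rpcinfo -p server | missing_nfs_programs
```

Empty output means both programs are registered with the portmapper that answered; any name printed is one the client would get "Program not registered" for.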

Regards,
J

krebstar3
06-Sep-2018, 16:01
@jmozdzen

Thank you for the reply! I just found some pertinent information!

I just noticed this:

On server 1 (the server I am trying to connect to) I get this:

# rpcinfo -p server1
program vers proto port
100000 4 tcp 111 portmapper
100000 3 tcp 111 portmapper
100000 2 tcp 111 portmapper
100000 4 udp 111 portmapper
100000 3 udp 111 portmapper
100000 2 udp 111 portmapper
100024 1 udp 46075 status
100024 1 tcp 50007 status


On another server that we have, I get this (Mapping works fine)

# rpcinfo -p server2
program vers proto port
100000 2 tcp 111 portmapper
100000 2 udp 111 portmapper
100024 1 udp 838 status
100024 1 tcp 841 status
100011 1 udp 773 rquotad
100011 2 udp 773 rquotad
100011 1 tcp 776 rquotad
100011 2 tcp 776 rquotad
100003 2 udp 2049 nfs
100003 3 udp 2049 nfs
100003 4 udp 2049 nfs
100003 2 tcp 2049 nfs
100003 3 tcp 2049 nfs
100003 4 tcp 2049 nfs
100021 1 udp 32768 nlockmgr
100021 3 udp 32768 nlockmgr
100021 4 udp 32768 nlockmgr
100021 1 tcp 32773 nlockmgr
100021 3 tcp 32773 nlockmgr
100021 4 tcp 32773 nlockmgr
100005 1 udp 789 mountd
100005 1 tcp 792 mountd
100005 2 udp 789 mountd
100005 2 tcp 792 mountd
100005 3 udp 789 mountd
100005 3 tcp 792 mountd


I am not seeing nfs on server1 for some reason. However, if I jump on another linux box and point it at server1, they show up just fine:

# rpcinfo -p server1
program vers proto port service
100000 4 tcp 111 portmapper
100000 3 tcp 111 portmapper
100000 2 tcp 111 portmapper
100000 4 udp 111 portmapper
100000 3 udp 111 portmapper
100000 2 udp 111 portmapper
100024 1 udp 662 status
100024 1 tcp 662 status
100011 1 udp 875 rquotad
100011 2 udp 875 rquotad
100011 1 tcp 875 rquotad
100011 2 tcp 875 rquotad
100005 1 udp 892 mountd
100005 1 tcp 892 mountd
100005 2 udp 892 mountd
100005 2 tcp 892 mountd
100005 3 udp 892 mountd
100005 3 tcp 892 mountd
100003 2 tcp 2049 nfs
100003 3 tcp 2049 nfs
100003 4 tcp 2049 nfs
100227 2 tcp 2049 nfs_acl
100227 3 tcp 2049 nfs_acl
100003 2 udp 2049 nfs
100003 3 udp 2049 nfs
100003 4 udp 2049 nfs
100227 2 udp 2049 nfs_acl
100227 3 udp 2049 nfs_acl
100021 1 udp 32769 nlockmgr
100021 3 udp 32769 nlockmgr
100021 4 udp 32769 nlockmgr
100021 1 tcp 32803 nlockmgr
100021 3 tcp 32803 nlockmgr
100021 4 tcp 32803 nlockmgr

I checked and the firewall is turned off, what am I doing wrong?
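One way to compare two such listings more systematically is sketched below. It assumes both dumps were saved to files at roughly the same time; the helper names rpc_signature and same_rpc_host are invented for this example.

```shell
# Hypothetical helper: reduce "rpcinfo -p" output (stdin) to sorted
# "program vers proto port" lines so two dumps can be compared.
rpc_signature() {
    awk '$1 ~ /^[0-9]+$/ { print $1, $2, $3, $4 }' | sort
}

# Hypothetical helper: succeeds (exit 0) only if both dump files list the
# same registrations. Differing ports for the same program (for example
# "status") suggest the two replies came from different portmappers.
same_rpc_host() {
    [ "$(rpc_signature < "$1")" = "$(rpc_signature < "$2")" ]
}

# Usage: on each client, save a dump, then compare the two files:
#   rpcinfo -p server1 > /tmp/rpc.$(hostname)
#   same_rpc_host /tmp/rpc.clientA /tmp/rpc.clientB || echo "replies differ"
```

In the two server1 listings above, even the "status" ports differ (46075/50007 versus 662/662), which this comparison would flag.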

jmozdzen
06-Sep-2018, 16:32
Hm, sounds strange indeed. My first guess would have been a name resolution issue - but you mentioned that you had tried mounts via IP already.

I remember having had difficulties between an old portmapper and a new libtirpc client... iirc it had to do with trying tcp instead of udp (hence my first suggestions). But that's years back. Could you please also try with "-o udp" (as an alternative to "-o proto=udp"), just for completeness' sake?

You'll find someone with the same symptoms here http://misc.openbsd.narkive.com/AsKKavOC/nfs-portmap-rpc-program-not-registered, and that's from 2002... but then, the thread opener resorted to upgrading ;)

Regards,
J

krebstar3
06-Sep-2018, 16:51
# mount -t nfs -o udp server:/share /mnt/share
mount server reported tcp not available, falling back to udp
mount: RPC: Program not registered

I am just not certain if this is a RedHat issue (NFS server) or a SUSE (Client). Other RedHat servers can use the rpcinfo command and see NFS.

jmozdzen
06-Sep-2018, 17:02
> I am just not certain if this is a RedHat issue (NFS server) or a SUSE (Client). Other RedHat servers can use the rpcinfo command and see NFS.

I wouldn't point at either of the two - it's more that they're probably years apart (assuming that the NFS server is an up-to-date machine), implementation-wise - SLES 9 SP1 is ancient and some "interesting" changes took place in the TIRPC stack in recent years.

What I originally didn't recognize is that the *client* is the old machine, I (for whatever reason) assumed that the NFS server was old. It may well be that you'll have to open up things at the NFS server to allow for such old clients (protocol-wise) and that also explains why all those options for the mount did no good - they're to assist new clients accessing old servers ;)

Maybe you could ask RH support what you'd need to do to support old clients - or try to open a service request with SUSE, if you're under support there. SUSE has its own offering to support RHES systems during migration, so they might have come across such version combinations already.

Regards,
J

krebstar3
06-Sep-2018, 18:34
Weird, it just started working. So I snapshotted it, and left it alone. Was just planning on leaving it for now. However, it literally stopped working on its own...

I have created a support case with RHEL to see if they can help. Also, is there a process to do an in place upgrade of SUSE 9?

kreb

thsundel
07-Sep-2018, 08:31
You can do an in-place upgrade to SLES10SP4 using the media: https://www.suse.com/support/kb/doc/?id=7008357
From there you can go on to a more recent version: https://www.suse.com/support/kb/doc/?id=7016711

Thomas

jmozdzen
07-Sep-2018, 15:12
Hi kreb,

> Also, is there a process to do an in place upgrade of SUSE 9?

Adding to Thomas' reply: if you want to go to SLES15 (latest as of today), you'd first need to upgrade to SLES11SP4 as described in the TIDs linked by Thomas, then you can go to SLES 15 directly - see https://www.suse.com/documentation/sles-15/book_sle_upgrade/data/sec_upgrade-paths_supported.html

Only you know what's running on that server - but it'll be a bit of a hassle to bring it to SLES11SP4, i.e. getting your hands on the media and having downtime to do the offline upgrade. I guess that's why the SLES15 guide says about upgrading even from SLES10:

> We recommend a fresh installation in this case.

Regards,
J

krebstar3
07-Sep-2018, 15:58
Well, I downloaded and installed SLES12 just for testing purposes and I am running into the same problem. I do not know what the heck is going on!

I contacted RHEL and they did a tcpdump and found that the request is coming from SUSE and going to RHEL, RHEL is responding but for some reason it is not registering with SUSE. They told me that I need to reach out and have someone at SUSE look at the configuration.

We are out of support at the moment, but this whole fiasco is holding up an upgrade that I need to get resolved. I am thinking about getting some support just to try and get through this issue.

jmozdzen
07-Sep-2018, 17:26
Hi,

one more thing I'd like to check: What kernel is the NFS *server* using?

Regards,
J

krebstar3
07-Sep-2018, 17:49
> one more thing I'd like to check: What kernel is the NFS *server* using?

2.6.32-754.2.1.el6.x86_64 - Sent you an email with the pcap

jmozdzen
07-Sep-2018, 18:24
Just for the record: the pcap file shows the Portmap V2 "dump" request and response, the latter listing only portmap itself and "stat" as running services - just like reported above in #6, first invocation of "rpcinfo".

So the server does not report that NFS is running at all. Why the client should be able to connect anyhow (as the product support response suggests), I don't know.

Regards,
J

malcolmlewis
07-Sep-2018, 19:05
Hi
Still interested in the system routing and maybe arp cache....

--
Cheers Malcolm °¿° SUSE Knowledge Partner (Linux Counter #276890)
SLES 15 | GNOME Shell 3.26.2 | 4.12.14-25.16-default

jmozdzen
10-Sep-2018, 10:28
> Hm, sounds strange indeed. My first guess would have been a name resolution issue - but you mentioned that you had tried mounts via IP already.

For those hitting this thread via search:

The issue was resolved: the root cause was a duplicate server IP on the network, which is why it failed intermittently. The "rpcinfo" output and tcpdump trace mentioned elsewhere in this thread showed the reply from the wrongly configured machine, which didn't run an NFS server.

Murphy says: If you're hunting down "intermittent" network errors like these, make sure to pay attention to the MAC address of the replying machine. If you're sure the (in this case) server has the service running, but the client will not see it in the server's reply, it is indeed likely that the reply isn't from the server in question :)
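A minimal sketch of such a MAC check, assuming an arping whose replies include the sender's MAC in colon-separated hex (as in "Unicast reply from 192.0.2.10 [00:11:22:33:44:55]"); the function name replying_macs, the interface eth0 and the address 192.0.2.10 are placeholders, not taken from the actual setup:

```shell
# Hypothetical helper: reads arping output on stdin and prints each
# distinct MAC address that answered, one per line (upper-cased so the
# same address in different case is only counted once).
replying_macs() {
    grep -oiE '([0-9a-f]{2}:){5}[0-9a-f]{2}' | tr 'a-f' 'A-F' | sort -u
}

# Usage (interface and address are placeholders):
#   arping -c 5 -I eth0 192.0.2.10 | replying_macs
# More than one line of output means more than one machine claims the IP.
```

When reading a packet capture instead, "tcpdump -e" prints the link-level header, so the source MAC of each reply can be compared against the MAC the real server is known to have.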

Regards,
J