SLES 11 SP3: bonding interface bond0 is up but does NOT show an IPv4 address



mosaddek
02-Oct-2015, 02:40
I am in a situation where the master bond0 interface is up but has no IPv4 address assigned. The file /etc/sysconfig/network/bond0 is configured correctly, and the slave interfaces eth4 and eth6 also show as up. But bond0 does not take any IP, even after a network restart.

I rebooted and restarted the interfaces. But after a network restart it says 'no configuration found for ethx' for all interfaces and cannot bring bond0 up, showing 'interface bond0 not up'.

When I check lspci, hwinfo or /proc/net/dev I can see all network cards.

Now I am not sure why I cannot get the IP up. Or how do I check whether there is a hardware issue?

I cannot include much configuration, as that was my O&M interface and I am now accessing the server via the console.

Can anyone help me understand?

thsundel
02-Oct-2015, 08:04
Have you tried setting up the bond through YaST? Does that work?

https://www.novell.com/support/kb/doc.php?id=3815448
https://www.suse.com/documentation/sle_ha/book_sleha/data/sec_ha_netbond_yast.html

Thomas

jmozdzen
02-Oct-2015, 12:33
Hi mosaddek,

have you tried configuring the address manually, i.e. by using "ifconfig bond0 1.2.3.4 netmask 255.255.255.252" (of course replacing address and netmask values according to your needs)?

Also, you have not said how you verified that it's not taking any address - did you check that via "ip" or "ifconfig", or are you only judging by some communications problem?
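A minimal sketch of such a check, assuming the iproute2 "ip" tool is installed. The helper name ipv4_of is made up for illustration, and 192.0.2.10/30 in the comment is a documentation address, not a value from this thread.

```shell
#!/bin/sh
# Print the IPv4 address(es) found in one-line "ip -o -4 addr show" output.
# ipv4_of is a hypothetical helper name, not part of any SLES tool.
ipv4_of() {
    # In -o (oneline) output the fields are:
    # index, interface name, family ("inet"), address/prefix, ...
    awk '$3 == "inet" { print $4 }'
}

# Usage on a live system (bond0 is the interface from this thread):
#   ip -o -4 addr show dev bond0 | ipv4_of
# An address such as 192.0.2.10/30 is printed if one is assigned;
# no output at all means the interface carries no IPv4 address.
```

Checking the interface itself this way separates "no address assigned" from "address assigned but traffic does not flow".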

What catches the eye is that the error message references "ethx" - might it be you only created the ifcfg-bond0 file, but not the corresponding ifcfg-ethx files (to mark them as slave interfaces)? Also, initially you write that bond0 is up - but that you receive the error "bond0 is not up".
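To test that guess quickly, something like the following sketch could list bonding slaves that have no ifcfg file of their own. The function name check_bond_slaves and the CFG_DIR override are assumptions for illustration; only the BONDING_SLAVEn variable names and the directory come from SLES itself.

```shell
#!/bin/sh
# Report bonding slaves that lack their own ifcfg file.
# CFG_DIR is an assumed override so the check can run against a copy.
CFG_DIR="${CFG_DIR:-/etc/sysconfig/network}"

check_bond_slaves() {
    dir="$1"
    for bond_cfg in "$dir"/ifcfg-bond*; do
        [ -f "$bond_cfg" ] || continue
        # Pull the value out of each BONDING_SLAVE0=, BONDING_SLAVE1=, ... line,
        # with or without surrounding quotes.
        sed -n "s/^BONDING_SLAVE[0-9]*=['\"]\{0,1\}\([^'\"]*\)['\"]\{0,1\}\$/\1/p" "$bond_cfg" |
        while read -r slave; do
            [ -n "$slave" ] || continue
            if [ ! -f "$dir/ifcfg-$slave" ]; then
                echo "${bond_cfg##*/}: missing ifcfg-$slave"
            fi
        done
    done
}

check_bond_slaves "$CFG_DIR"
```

No output means every slave named in an ifcfg-bond* file has a matching ifcfg file; it says nothing about whether those files are correct.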

I'd recommend starting from scratch, via YaST, by deleting any configuration for bond0, eth4 and eth6. Then configure eth4 and eth6 as slaves, create a new interface "bond0", configure it according to your needs and add the slaves.

Once that is done and works, have a look at /etc/sysconfig/network/ifcfg-eth[46] and ...-bond0, to compare to what you originally had configured.

At the current stage, I wouldn't spend any more time looking for hardware defects, this looks more like a configuration issue to me.

Regards,
Jens

mosaddek
03-Oct-2015, 03:00
Thanks Jens and Thomas.

I could not reconfigure with YaST, as the server vendor's application is configured entirely through custom scripts; in fact, the scripts build all these bonds and heartbeats [Veritas is also installed]. But your advice to try configuring the address manually really helped. At the moment the IP is assigned for all bonds and I can ssh in as well. Thanks again; and Jens guessed absolutely correctly that at this stage we shouldn't spend any more time looking for hardware defects, as this looks more like a configuration issue.

Moving forward, though, the fix is only temporary: the bond address will be wiped out as soon as we restart. I checked /etc/modprobe.conf but could not find much info about the bonds. Also, the problem affects all bonds. Is there any other place to look?

Information

I check and set up the interfaces with ifconfig.
All files required for the bonds exist in the /etc/sysconfig/network directory [the system is fine right now, but when I tried a restart the IP was lost again].
The system has Veritas installed [a clustered solution; the other cluster is up and good].

Thanks much again.

mosaddek
03-Oct-2015, 03:46
To add info, these messages are from /var/log/messages during boot:


ifup-route: interface lo is not up
ifup: No configuration found for eth0
..
ifup: No configuration found for eth15
kernel: [45.410661] bonding: bond0: Unable to set eth4 as primary slave.
postfix[14926]: fatal: parameter inet_interfaces: no local interface found for 127.0.0.1

jmozdzen
05-Oct-2015, 18:20
Hi mosaddek,

is this about an IP address managed by the clustering agents? Typically, the cluster solution will only manage "moving" IPs, but not loopback. So not being able to get the loopback interface up does indeed sound strange.

For SLES to properly start networking devices, you'll need the corresponding ifcfg files in /etc/sysconfig/network. Could you please share the result of "ls -l /etc/sysconfig/network/ifcfg-*" and the result from "ifstatus lo" after boot?

> kernel: [45.410661] bonding: bond0: Unable to set eth4 as primary slave.

It'd be interesting to see the output of "ifstatus bond0" and "ifstatus eth4" and probably as well the contents of the corresponding ifcfg files (if available).
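Besides "ifstatus", the bonding driver's own status file can show which slaves it actually enslaved. The sketch below assumes the usual /proc/net/bonding/<bond> text layout ("Slave Interface:" followed by "MII Status:"), which may differ between driver versions; bond_slaves is a made-up helper name.

```shell
#!/bin/sh
# List each slave and its MII status from bonding driver status text.
# bond_slaves is a hypothetical helper; the input is assumed to look like
# /proc/net/bonding/<bond>, whose layout can vary with the driver version.
bond_slaves() {
    awk '
        /^Slave Interface:/            { slave = $3 }
        /^MII Status:/ && slave != ""  { print slave, $3; slave = "" }
    '
}

# Usage on a live system:
#   bond_slaves < /proc/net/bonding/bond0
# If eth4 is not listed at all, the driver never enslaved it, which would
# fit the "Unable to set eth4 as primary slave" kernel message.
```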

> ifup: No configuration found for eth0
> [...]

I wouldn't mind these too much: the system tries to bring up network interfaces once they are detected. If there's no corresponding ifcfg file, it reports the messages you quoted. Of course, if you're after a "zero warnings" policy, create config files marking unused interfaces as such.
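Such a placeholder file might look like the sketch below. It assumes the unused port is eth0 and that the file is saved as /etc/sysconfig/network/ifcfg-eth0; STARTMODE='off' is the sysconfig value for an interface that should never be activated. Whether this silences the boot message on a given service pack has not been verified here.

```shell
# /etc/sysconfig/network/ifcfg-eth0  (assumed name; one file per unused port)
# Marks the interface as deliberately unconfigured so it is never started.
BOOTPROTO='none'
STARTMODE='off'
USERCONTROL='no'
```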

> I check/set up interface with ifconfig
> I have all files created required for all bonds in /etc/sysconfig/network dir[now the system is perfect, but I tried restart and IP is again lost ]

Things are "perfect" once you can configure/start the interfaces with "ifup <interface>" :) That would use the same mechanisms as during boot, so if that works, it should work at boot time, too.

Please be a bit more specific on the required configuration - which config is under cluster control, which config is to be under system control, which physical interfaces do you actually *use*, how would you like them to be joined into which bond and how did you set up the corresponding config files (including their names - in the initial post you wrote "I have the file correctly configured in /etc/sysconfig/network/bond0", which is *not* the correct name for that file...)

Regards,
Jens

mosaddek
06-Oct-2015, 02:29
hi jmozdzen,

Thanks for the response. I agree with you and I am trying to share the technical info. But one thing is really not working the way you described and expected, namely that the configuration would also work after boot. No, it's not working!
It only works temporarily [setting the IP from the CLI] and not after boot. I am not sure of the reason, but I also found interface-related error messages in dmesg. Now, after running ifstatus, I find all of these showing down [below]!

Yes, you are correct that the cluster solution only manages "moving" IPs. But I checked the loopback as part of a system check after boot, and as it did NOT come up, it seemed strange to me. I see a message in dmesg, though the host file and the lo cfg file look OK.

# ls -l /etc/sysconfig/network/ifcfg-*
-rw-r--r-- 1 root root /etc/sysconfig/network/ifcfg-bond0
-rw-r--r-- 1 root root /etc/sysconfig/network/ifcfg-bond5
-rw-r--r-- 1 root root /etc/sysconfig/network/ifcfg-eth14
-rw------- 1 root root /etc/sysconfig/network/ifcfg-lo

Yes, /etc/sysconfig/network/bond0 was just shorthand in my earlier post. The actual file name is /etc/sysconfig/network/ifcfg-bond0.

Now you might want to take a look at the output below:

# ifconfig bond0
bond0 Link encap:Ethernet HWaddr XX
inet addr:XX Bcast:XX Mask:XX
inet6 addr: XX Scope:Link
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
RX packets:11976 errors:0 dropped:73130 overruns:0 frame:0
TX packets:3480 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:76379609 (72.8 Mb) TX bytes:4532273 (4.3 Mb)

# ifstatus lo
lo
lo is down


# ifstatus bond0
bond0 name: NPD vlan
bond0 is down
# ifstatus eth4
Eth4 device: Network Corporation X
No configuration found for eth4

# cat /etc/sysconfig/network/ifcfg-bond0
DEVICE='bond0'
BONDING_MASTER='yes'
BONDING_SLAVE0='eth4'
BONDING_SLAVE1='eth6'
BONDING_MODULE_OPTS='mode=1 arp_interval=1100 arp_ip_target=xx arp_validate=3 primary=eth4'
BOOTPROTO='static'
IPADDR=''
NETWORK=''
NETMASK=''
USERCONTROL='no'
STARTMODE='auto'
NAME='NPD vlan'


Information

(1) A new thing I see: one of the interfaces shows only UP BROADCAST but NOT 'RUNNING'. Where should I check? lspci lists all Ethernet devices and there is no other specific error in dmesg.

eth7 Link encap:Ethernet HWaddr XX
UP BROADCAST SLAVE MULTICAST MTU:1500 Metric:1 --> 'ifconfig eth7 up' command does NOT help to bring up
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)

(2) Another interesting message after system boot:

syslog-ng[7406]: syslog-ng starting up --> not sure of syslog-ng location
firmware.sh[7650]: Cannot find firmware file 'intel-ucode/06-2d-07' -->may be as you mentioned wouldn't mind these too much, the system will try to bring up the network interfaces, once detected

jmozdzen
09-Oct-2015, 14:42
Hi mosaddek,

> Thanks for the response. I agree with you and I am trying to share the technical info. But one thing is really not working the way you described and expected, namely that the configuration would also work after boot. No, it's not working!
> It only works temporarily [setting the IP from the CLI] and not after boot. I am not sure of the reason, but I also found interface-related error messages in dmesg. Now, after running ifstatus, I find all of these showing down [below]!

I see that there are a few problems that need to be addressed, see my comments below:


> Yes, you are correct that the cluster solution only manages "moving" IPs. But I checked the loopback as part of a system check after boot, and as it did NOT come up, it seemed strange to me.

Just for reference, here's the default content of /etc/sysconfig/network/ifcfg-lo:


# Loopback (lo) configuration
IPADDR=127.0.0.1
NETMASK=255.0.0.0
NETWORK=127.0.0.0
BROADCAST=127.255.255.255
IPADDR_2=127.0.0.2/8
STARTMODE=auto
USERCONTROL=no
FIREWALL=no

"ifup lo" should then be able to start the interface, and it should come up during boot as well. If you don't see the corresponding messages in /var/log/boot.msg (look for "Setting up loopback interface"), then check whether the boot.localnet service might have been disabled ("chkconfig boot.localnet" - this should report "on").


> I see a message in dmesg, though the host file and the lo cfg file look OK.
>
> # ls -l /etc/sysconfig/network/ifcfg-*
> -rw-r--r-- 1 root root /etc/sysconfig/network/ifcfg-bond0
> -rw-r--r-- 1 root root /etc/sysconfig/network/ifcfg-bond5
> -rw-r--r-- 1 root root /etc/sysconfig/network/ifcfg-eth14
> -rw------- 1 root root /etc/sysconfig/network/ifcfg-lo

This doesn't look like the actual output - I'm missing time stamps and especially size info. Compare:


somehost:~ # ls -l /etc/sysconfig/network/ifcfg-lo
-rw------- 1 root root 172 Jul 17 13:30 /etc/sysconfig/network/ifcfg-lo
somehost:~ #

... but more importantly, I would have expected to see two more files, for eth4 and eth6 (the bonding slaves). These may have content similar to:


BOOTPROTO='none'
BROADCAST=''
ETHTOOL_OPTIONS=''
IPADDR=''
MTU=''
NAME='Ethernet controller'
NETMASK=''
NETWORK=''
PREFIXLEN='32'
REMOTE_IPADDR=''
STARTMODE='hotplug'
USERCONTROL='no'

Your bond0 configuration file looks good to me. Interestingly, it carries no IP address - are you using it as the uplink for a Linux bridge?
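If the empty IPADDR in the posted file is not just redaction, it would by itself explain a bond that comes up without an IPv4 address after every restart. A rough sketch of a check for that combination follows; check_static_ip and the CFG override are hypothetical names, and the file is sourced as shell, so run it only on config files you trust.

```shell
#!/bin/sh
# Flag an ifcfg file that asks for a static address but does not supply one.
# check_static_ip is a hypothetical helper; CFG defaults to the file from this thread.
CFG="${CFG:-/etc/sysconfig/network/ifcfg-bond0}"

check_static_ip() {
    cfg="$1"
    (
        # Subshell: variables set by the sourced file do not leak to the caller.
        BOOTPROTO='' IPADDR=''
        . "$cfg" || exit 2
        if [ "$BOOTPROTO" = "static" ] && [ -z "$IPADDR" ]; then
            echo "${cfg##*/}: BOOTPROTO=static but IPADDR is empty"
            exit 1
        fi
        exit 0
    )
}

if [ -f "$CFG" ]; then
    check_static_ip "$CFG"
fi
```

A bond used purely as a bridge uplink would legitimately trip this check, so treat a hit as a question to answer rather than a proven error.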


> (1) A new thing I see: one of the interfaces shows only UP BROADCAST but NOT 'RUNNING'. Where should I check? lspci lists all Ethernet devices and there is no other specific error in dmesg.
>
> eth7 Link encap:Ethernet HWaddr XX
> UP BROADCAST SLAVE MULTICAST MTU:1500 Metric:1 --> 'ifconfig eth7 up' command does NOT help to bring up
> RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:1000
> RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)

Does "ethtool eth7" report that a link was detected? It could be a cabling problem or the switch port; the latter might, for example, be admin-down.
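For checking several ports in one go, the relevant line can be pulled out of the ethtool output. This is only a sketch: link_state is a made-up helper name, and it assumes ethtool prints its usual "Link detected: yes/no" line.

```shell
#!/bin/sh
# Extract the value of the "Link detected:" line from ethtool output.
# link_state is a hypothetical helper name, not an ethtool feature.
link_state() {
    awk -F': ' '/Link detected/ { print $2 }'
}

# Usage on a live system (run as root so ethtool can query the driver):
#   ethtool eth7 | link_state
# "no" points at cabling or the switch port rather than at the SLES config.
```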


> (2) another interesting msg after system boot
>
> syslog-ng[7406]: syslog-ng starting up --> not sure of syslog-ng location
> firmware.sh[7650]: Cannot find firmware file 'intel-ucode/06-2d-07' -->may be as you mentioned wouldn't mind these too much, the system will try to bring up the network interfaces, once detected

See https://www.suse.com/support/kb/doc.php?id=7010705 - upgrading the microcode package would probably already help.

Regards,
Jens