Quote Originally Posted by jmozdzen View Post
Hi lpphiggp,

> But this is with everything currently working

... which makes a good baseline for comparison when things go bad again

BTW, do you set up bridging via Xen (so you have some script configured in xend's config file)? Even in SLES10 days, I found it more "logical" (and thus more maintainable) to create all bridges via regular SLES, and then only have Xen connect the VIFs to the appropriate bridge.

Best regards,
Jens
We've done both. This last time, though, I left things more or less as they were, allowing Xen to rename eth1 to peth1 and create bridge eth1 in its place.
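For anyone following along, the "regular SLES" approach Jens describes would look roughly like the sketch below: a host-managed bridge defined in an ifcfg file, with the physical NIC enslaved to it. The interface name, address, and file are assumptions for illustration; adjust to your own box.

```
# /etc/sysconfig/network/ifcfg-br0  -- hypothetical example
# Host-managed bridge; Xen's vif-bridge script then only attaches VIFs to it.
STARTMODE='auto'
BOOTPROTO='static'
IPADDR='192.168.1.10/24'        # example address, not mine
BRIDGE='yes'
BRIDGE_PORTS='eth0'             # the physical NIC enslaved to the bridge
BRIDGE_STP='off'
BRIDGE_FORWARDDELAY='0'
```

The corresponding ifcfg-eth0 would then carry no IP of its own (BOOTPROTO='none'), since the address lives on the bridge.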

On other boxes previously, another admin just manually created a br0 and attached the actual interfaces (eth0, etc.) to it, and that has worked all but once. This one was a bit different, though: this time even the physical NIC wasn't communicating, and I couldn't even get to the host box*.
Still, I was originally going to do that here too, but got flustered when I found that, unlike on the other servers with this issue, I couldn't: besides the physical NIC not communicating, the Xen script was preventing me from editing anything network-related in YaST2. Running the command to stop the script didn't help (/etc/xen/scripts/network-bridge stop netdev=eth0).
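For reference, on a SLES-style Xen setup that script is normally wired up in xend's config file, so the cleaner long-term fix (per Jens's suggestion) is to stop xend from running it at all and let the host own the bridges. A sketch of the relevant lines, assuming the stock xend-config.sxp layout:

```
# /etc/xen/xend-config.sxp  -- sketch; verify against your installed file
# Comment out the line that runs the bridge-mangling script at xend start:
# (network-script network-bridge)
# Keep the VIF hotplug script, which only attaches guest VIFs
# to an already-existing bridge:
(vif-script vif-bridge)
```

With the network-script line disabled, xend stops renaming eth1 to peth1, and YaST-managed bridges survive xend restarts.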

By the time I figured out that rebooting into the non-Xen kernel would let me edit the networking config in YaST, I just went with the standard setup first to see if that would at least give me back host access. I also switched the cable to eth1 in case I had a bad NIC port.
Anyway, once I rebooted into the Xen kernel again, I had connectivity; at that point I just wanted to get the two VMs up and running ASAP, as they'd been down for hours. We'd had someone else onsite look at it when it first went down, and when that didn't pan out it took me over an hour to drive there.
So after recreating the virtual NICs and finding that everything did in fact start working again, I just went with that. I didn't edit any scripts or anything; it picked it all up on its own.

I did get a scare though. That was a Monday. That Thursday, one of the VMs went incommunicado again. However, a reboot of the VM fixed that too.
But now, just before the Christmas holiday, I'm worrying every day that the **** thing will go down again. The whole business just seems very unstable to me.


*Normally, this issue would only disconnect the VMs, in which case we now realize a simple "brctl addif <bridge> <vif>" command might have sufficed. If I only lose VM network connectivity again, I'll try that, but this last one manifested itself in a particularly ugly manner.
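For the record, here is roughly what I'd run next time a VM drops off while the host is still fine. The bridge and vif names below are placeholders; the real ones have to come from brctl show on the affected host.

```shell
# Sketch only -- bridge/vif names are examples, not from our boxes.
# First see which interfaces are actually attached to each bridge:
brctl show

# If a guest's vif is missing from its bridge, re-attach it, e.g.:
brctl addif eth1 vif3.0    # <bridge> <vif>; vifN.M varies per domain

# Then confirm it shows up under the bridge:
brctl show eth1
```

If the vif reappears under the bridge but the guest still can't talk, the problem is somewhere else (the physical port, like in this last incident), and brctl won't save you.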