SUSE HAE - standby node disconnect from network



supphakorn
17-Sep-2012, 03:54
My environment has 2 nodes. I run all resources on node 1. When I disconnect all network connections on node 2 (standby), it activates the resources on node 2 (standby) while node 1 still has the resources active. When I plug node 2 back into the network, the cluster status shows the resources running on multiple nodes and the cluster restarts all resources.

Do you have a solution for this environment?

jmozdzen
17-Sep-2012, 14:15
Hi supphakorn,

I'm not sure I fully understand what you're trying to achieve: When you disconnect all networking at node 2, it's not in standby, but offline.

Independent of standby/offline, it's a matter of resource stickiness to avoid redistribution of resources after bringing up a new active cluster node.
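
To illustrate the stickiness point, here is a minimal Python sketch of score-based placement. It is a simplified model, not Pacemaker's actual code: the function name `choose_node`, the node names and the score values are all made up for illustration. The idea is that a stickiness bonus is added to the node currently running the resource, so a newly available node only wins if its preference exceeds that bonus.

```python
def choose_node(base_scores, current_node, stickiness):
    """Pick the node with the highest total placement score.

    base_scores  -- hypothetical per-node preference scores
    current_node -- node currently running the resource
    stickiness   -- bonus added to the current node's score
    """
    totals = dict(base_scores)
    if current_node in totals:
        totals[current_node] += stickiness
    # On equal scores, prefer the node that already runs the resource.
    return max(totals, key=lambda n: (totals[n], n == current_node))


# Illustrative values: node2 is slightly preferred, node1 runs the resource.
scores = {"node1": 0, "node2": 50}
moved = choose_node(scores, "node1", stickiness=0)     # resource moves
stayed = choose_node(scores, "node1", stickiness=100)  # resource stays
```

With no stickiness the resource migrates to the preferred node as soon as it comes up; with a stickiness larger than the preference difference it stays where it is.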

OTOH, you say that you see resources running multiple times after reconnecting the second node... are we maybe talking about a split-brain situation here?

- two nodes active, all resources active on node 1
- disconnect node 2 from networking *without putting node 2 into standby first*
- node 2 no longer sees node 1 ("split brain") and so decides it must activate all resources on node 2. But on node 1, all these resources are active, too, at the same time.
- once you reconnect node 2 to the network, you can see that resources are active on node 2, too...
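
The sequence above can be sketched as a toy model in Python. This is only an illustration under simplifying assumptions (no quorum check, no fencing, each node decides purely from what it can see); the function and variable names are hypothetical and not part of any cluster software.

```python
def nodes_running_after_partition(initial_owner, visible_peers):
    """Return the set of nodes that end up running the resources.

    initial_owner -- node that runs all resources before the event
    visible_peers -- maps each node to the set of peers it can still see
    Assumes no quorum enforcement and no fencing.
    """
    running = {initial_owner}  # the owner is alive and keeps its resources
    for node, peers in visible_peers.items():
        # A node that cannot see the owner assumes it is dead and takes over.
        if node != initial_owner and initial_owner not in peers:
            running.add(node)
    return running


connected = {"node1": {"node2"}, "node2": {"node1"}}
split = {"node1": set(), "node2": set()}

before = nodes_running_after_partition("node1", connected)
after = nodes_running_after_partition("node1", split)
```

In the connected case only node1 runs the resources; once node 2 loses sight of node 1, both nodes run them at the same time, which is what the cluster reports after the network is plugged back in.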

This is a typical problem with two-node clusters and needs to be circumvented. There are many ways to skin that cat, the proper search terms are "two-node clusters" and "split brain"...

Regards,
Jens

LarsMB
25-Sep-2012, 21:53
Hi there,

I think you have set no-quorum-policy=ignore (so that a two-node cluster basically pretends to have quorum even with only one node), but you have explicitly disabled I/O fencing/STONITH. You need to re-enable it and configure it properly for your environment to avoid the split brain and the concurrency violation.
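
As a rough illustration of how these two settings interact, here is a simplified decision function in Python. It is a sketch of the general logic only, not Pacemaker's implementation; the function name and parameters are invented for this example, and only the setting names no-quorum-policy and STONITH come from the discussion above.

```python
def may_take_over(votes_seen, votes_total, no_quorum_policy,
                  stonith_enabled, peer_fenced):
    """Decide whether a partition may start resources of an unseen peer.

    votes_seen / votes_total -- nodes visible in this partition vs. all nodes
    no_quorum_policy         -- "ignore" lets a partition act without quorum
    stonith_enabled          -- whether fencing is required before recovery
    peer_fenced              -- whether the unseen peer was fenced successfully
    """
    has_quorum = votes_seen * 2 > votes_total  # strict majority
    if not has_quorum and no_quorum_policy != "ignore":
        return False
    if stonith_enabled and not peer_fenced:
        # Recovery must wait until the peer is confirmed to be down.
        return False
    return True


# A lone node in a two-node cluster sees 1 of 2 votes, so it has no quorum.
as_reported = may_take_over(1, 2, "ignore", stonith_enabled=False, peer_fenced=False)
fencing_pending = may_take_over(1, 2, "ignore", stonith_enabled=True, peer_fenced=False)
fencing_done = may_take_over(1, 2, "ignore", stonith_enabled=True, peer_fenced=True)
```

With quorum ignored and fencing disabled, nothing stops the isolated node from starting the resources a second time. With fencing enabled, it may only do so after the other node has been confirmed down.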

Best,
Lars