Configured resources disappear



Adam Balakier
18-Oct-2012, 11:24
I have a 2-node cluster running SLES 11 SP2.
My problem is that resources I add often disappear after the servers
restart. I also see resources come back that I had already deleted myself.
I use the Pacemaker GUI to manage resources.
What am I doing wrong?

jmozdzen
18-Oct-2012, 12:12
Hi Adam,

seems like some sort of issue with cluster db sync:

- can you confirm that prior to the reboot, both nodes have the same content?
- does this happen when a single node reboots, or do both nodes need to reboot for this to occur?
- after the reboot in question, is there anything in the logs that hints at a roll-back of the database?
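
For the first check, one way to compare the two nodes is to dump the CIB on each (for example with `cibadmin -Q > node1.xml`) and compare the version attributes on the root `<cib>` element. The sketch below is only an illustration in Python: the function names and the sample XML are made up, and it assumes the usual `admin_epoch`, `epoch` and `num_updates` attributes are present on the dump.

```python
import xml.etree.ElementTree as ET


def cib_version(xml_text):
    """Return (admin_epoch, epoch, num_updates) from a CIB XML dump."""
    root = ET.fromstring(xml_text)
    if root.tag != "cib":
        raise ValueError("not a CIB dump: root element is <%s>" % root.tag)
    # Missing attributes are treated as 0 (an assumption of this sketch).
    return tuple(int(root.get(name, "0"))
                 for name in ("admin_epoch", "epoch", "num_updates"))


def compare_cibs(xml_a, xml_b):
    """Return 'same', 'a_newer' or 'b_newer' based on the version tuples."""
    va, vb = cib_version(xml_a), cib_version(xml_b)
    if va == vb:
        return "same"
    return "a_newer" if va > vb else "b_newer"


# Hypothetical dumps, standing in for the saved output from each node.
node1 = '<cib admin_epoch="0" epoch="42" num_updates="3"><configuration/></cib>'
node2 = '<cib admin_epoch="0" epoch="41" num_updates="9"><configuration/></cib>'
print(compare_cibs(node1, node2))  # prints: a_newer
```

If the two nodes report different versions before the reboot, that would already point to a sync problem rather than a problem at start-up.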

Regards,
Jens

Adam Balakier
18-Oct-2012, 13:02
Both nodes have the same content.
When I take node1 out of the cluster, the changes stay in place; the same
goes when I take only node2 out of the cluster. I can do that many times,
and nothing disappears.
When I stop corosync on both nodes and start it again, I can see that my
latest changes disappear.

jmozdzen
18-Oct-2012, 14:53
Hi Adam,

in addition to checking the logs during cluster start, you may want to check what state the cluster DB is in while both nodes are down. My guess is that, for a currently unknown reason, the cluster disregards the latest DB version and falls back to the previous one. It might be disk space, an MD5 problem or something else.
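
To see what the on-disk copy contains while both nodes are down, one option is to list the resource ids in the saved CIB file and check whether the recently added resources are in it. A rough Python sketch follows; the helper names and the sample XML are made up, and the path `/var/lib/heartbeat/crm/cib.xml` in the usage comment is only my assumption for this Pacemaker version, so it may differ on your system.

```python
import xml.etree.ElementTree as ET


def resource_ids(xml_text):
    """Return the sorted ids of all primitives under <resources>."""
    root = ET.fromstring(xml_text)
    resources = root.find("./configuration/resources")
    if resources is None:
        return []
    # iter() also descends into groups and clones.
    return sorted(el.get("id") for el in resources.iter("primitive")
                  if el.get("id"))


def missing_resources(xml_text, expected):
    """Return the expected ids that are not present in the CIB."""
    return sorted(set(expected) - set(resource_ids(xml_text)))


# Hypothetical CIB content; on a real node you would read the file instead:
#   xml_text = open("/var/lib/heartbeat/crm/cib.xml").read()   # assumed path
sample = """<cib admin_epoch="0" epoch="7" num_updates="0">
  <configuration>
    <resources>
      <primitive id="ip_web" class="ocf" provider="heartbeat" type="IPaddr2"/>
      <group id="g_db">
        <primitive id="fs_db" class="ocf" provider="heartbeat" type="Filesystem"/>
      </group>
    </resources>
  </configuration>
</cib>"""

print(resource_ids(sample))
print(missing_resources(sample, ["ip_web", "new_res"]))
```

If a resource you added shortly before the shutdown is already missing from the file on both nodes, the change never reached the disk; if it is there but gone after the start, the cluster picked an older copy.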

Regards,
Jens

Adam Balakier
22-Oct-2012, 20:08
It is happening on two different cluster installations of mine. One is in
my lab, the other is at a client site. I am using VMware virtual machines as
nodes; the client is using Dell servers with EMS SAN storage. Both have the
same version of SLES and HA.

jmozdzen
23-Oct-2012, 16:05
Hi Adam,

how do you take the cluster nodes out? Maybe the latest change hasn't been persisted yet. Does this also happen if you simply stop and start the cluster services on both nodes?

Regards,
Jens

Adam Balakier
24-Oct-2012, 09:16
When I stop both nodes using rcopenais stop, they stop without
errors.
Then I start them, and the latest changes to the cluster disappear.

jmozdzen
24-Oct-2012, 10:44
Hi Adam,

> When I stop both nodes using rcopenais stop, they stop without errors.
> Then I start them, and the latest changes to the cluster disappear.

Then "lost disk writes" or the like don't seem to be the problem.

When AIS restarts and loads the cluster DB, do you see any indication in the log that it tries and fails to load the latest DB (latest in terms of "prior to stopping AIS") and therefore falls back to an older copy?
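
To narrow down the log around the restart, a simple filter for CIB-related lines may help. This is only a sketch: the keyword list is a guess at what might be relevant, not a list of actual Pacemaker messages, and the function name and the log path in the usage comment are assumptions too.

```python
def cib_related_lines(lines, keywords=("cib", "digest", "checksum")):
    """Return the lines that mention any keyword, case-insensitively."""
    lowered = tuple(k.lower() for k in keywords)
    return [ln.rstrip("\n") for ln in lines
            if any(k in ln.lower() for k in lowered)]


# Possible usage on a node (path assumed, adjust to your syslog setup):
#   with open("/var/log/messages") as fh:
#       for ln in cib_related_lines(fh):
#           print(ln)
```

Anything in that output from the time of the start that mentions a failed read or a fall-back to a backup file would be worth posting here.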

If you have a support contract, this might be a good time to open an incident with Novell/SuSE.

Regards,
Jens