fence_kdump agent isn't working as expected



crazyraven
30-Jun-2017, 16:50
Hi experts,

I'm using SLES 12 SP2 for testing purposes, and I'm also testing SLES High Availability.
To get fence_kdump working on my cluster, I modified /etc/sysconfig/kdump as described in the user documentation (https://www.suse.com/documentation/sle-ha-12/book_sleha/data/sec_ha_fencing_config.html#ex_ha_fencing_kdump).

According to that document, for the fence_kdump resource to work properly, kdump should be configured as follows:
cat /etc/sysconfig/kdump

<..snip>
KDUMP_POSTSCRIPT="/usr/lib64/fence_kdump_send -i 1 -p 7410 node02"

However, the fence_kdump_send script doesn't exist in /usr/lib64. It's in /usr/lib, so I changed the setting as follows:

KDUMP_POSTSCRIPT="/usr/lib/fence_kdump_send -i 1 -p 7410 node02"
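
Since the documented path and the installed path differ, it may help to confirm where the helper actually lives before writing KDUMP_POSTSCRIPT. The following is only a minimal sketch: the function name find_fence_kdump_send and the default list of directories are assumptions made for illustration, not something shipped with any package.

```shell
# Print the full path of the first executable fence_kdump_send found.
# Hypothetical helper; the default candidate directories are assumptions.
find_fence_kdump_send() {
    if [ "$#" -eq 0 ]; then
        set -- /usr/lib /usr/lib64 /usr/libexec
    fi
    for dir in "$@"; do
        if [ -x "$dir/fence_kdump_send" ]; then
            printf '%s\n' "$dir/fence_kdump_send"
            return 0
        fi
    done
    # Nothing found in any candidate directory.
    return 1
}
```

Called without arguments it searches the default directories; called with arguments it searches only the directories given. It returns a non-zero status when nothing is found.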

Then I crashed the kernel with the command "echo c > /proc/sysrq-trigger".
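
Before triggering a crash like this, it can be worth checking that a crash kernel is loaded at all, since otherwise no dump environment will start. A minimal sketch, assuming the standard sysfs flag /sys/kernel/kexec_crash_loaded; the function name crash_kernel_loaded is hypothetical, and the optional path argument exists only so the check can be tried against an ordinary file.

```shell
# Return 0 if the flag file says a crash kernel is loaded, non-zero otherwise.
# Hypothetical helper; $1 optionally overrides the sysfs path for testing.
crash_kernel_loaded() {
    flag_file=${1:-/sys/kernel/kexec_crash_loaded}
    [ -r "$flag_file" ] && [ "$(cat "$flag_file")" = "1" ]
}
```

For example, "crash_kernel_loaded && echo c > /proc/sysrq-trigger" would only crash the system when the flag reads 1.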

The kernel crashed as expected, but kdump didn't send a message to node02 because of the error messages below:

...<snip>
Generating README Finished.
Copying System.map Finished.
Copying Kernel Finished.
Running /usr/lib/fence_kdump_send -i 1 -p 7410 node02
/lib/kdump/save_dump.sh: line 234: /usr/lib/fence_kdump_send: No such file or directory
Last command failed (127)

...<snip>
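
Exit status 127 is the shell's code for a command that could not be found. Because the file does exist on the running system, one possible explanation (an assumption, not something confirmed in this thread) is that save_dump.sh runs inside the kdump initrd and the file is simply not present there. The sketch below checks a file listing, such as the output of lsinitrd, for a given path. The function name listing_has_path is hypothetical.

```shell
# Read a file listing on stdin (one entry per line, path as the last field)
# and succeed if the wanted path appears, with or without a leading slash.
# Hypothetical helper for illustration.
listing_has_path() {
    wanted=${1#/}
    awk -v w="$wanted" '
        { p = $NF; sub(/^\//, "", p); if (p == w) found = 1 }
        END { exit (found ? 0 : 1) }
    '
}
```

A possible use would be "lsinitrd /boot/initrd-$(uname -r)-kdump | listing_has_path /usr/lib/fence_kdump_send", where the initrd file name is itself an assumption about this installation. If the file turns out to be missing there, the kdump(5) man page describes a KDUMP_REQUIRED_PROGRAMS setting for listing extra programs needed by KDUMP_POSTSCRIPT; whether that resolves this particular case is untested here.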

If the system has little memory and the dump finishes quickly, this doesn't matter, because fence_kdump on node02 waits 60 seconds by default. But if the system has a lot of memory and needs more time to finish the dump, fence_kdump must receive the message from node01 so that the vmcore is written completely.
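
The timing concern can be put into rough numbers. The sketch below compares an estimated dump duration with the fence_kdump wait time. The function name dump_outlasts_timeout is hypothetical, and the dump size and write throughput are figures the caller has to supply; nothing here is measured.

```shell
# Succeed if the estimated dump time exceeds the fence_kdump wait time.
# All inputs are hypothetical figures supplied by the caller:
#   $1 dump size in MiB, $2 write throughput in MiB/s, $3 timeout in seconds.
dump_outlasts_timeout() {
    size_mib=$1
    rate_mib_s=$2
    timeout_s=$3
    # Round the estimated duration up to whole seconds.
    est_s=$(( (size_mib + rate_mib_s - 1) / rate_mib_s ))
    [ "$est_s" -gt "$timeout_s" ]
}
```

With assumed figures, "dump_outlasts_timeout 32768 100 60" succeeds (about 328 seconds against a 60 second wait), while "dump_outlasts_timeout 2048 100 60" does not (about 21 seconds). Only in the first case does the missing message from node01 put the vmcore at risk.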

When a kernel panic occurs and fence_kdump fails, the second fence method (usually a fence device based on a power device) is executed. In that case, the vmcore is not written completely.

Any ideas would be appreciated.

Thank you

Automatic Reply
05-Jul-2017, 05:30
crazyraven,

It appears that in the past few days you have not received a response to your
posting. That concerns us, and has triggered this automated reply.

These forums are peer-to-peer, best effort, and volunteer run. If your issue
is urgent or not getting a response, you might try one of the following options:

- Visit http://www.suse.com/support and search the knowledgebase and/or check all
the other support options available.
- Open a service request: https://www.suse.com/support
- You could also try posting your message again. Make sure it is posted in the
correct newsgroup. (http://forums.suse.com)

Be sure to read the forum FAQ about what to expect in the way of responses:
http://forums.suse.com/faq.php

If this is a reply to a duplicate posting or otherwise posted in error, please
ignore and accept our apologies and rest assured we will issue a stern reprimand
to our posting bot.

Good luck!

Your SUSE Forums Team
http://forums.suse.com