CLVMD fails to start when bringing up a single HA node

Dear Expert,

I am trying to build a two-node cluster on SLES 12 SP2 HAE. I am able to build the cluster successfully, but during HA testing, when both nodes (node1 & node2) are down and I try to bring up only node1, "CLVMD" always fails to start. Kindly help.

Error:

2017-07-12T06:16:30.076663-05:00 nfsnode1 dlm_controld[2343]: 107 fence work wait for quorum
2017-07-12T06:16:34.083064-05:00 nfsnode1 dlm_controld[2343]: 111 clvmd wait for quorum
........
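
For context, these messages show dlm_controld holding back fencing and the clvmd lockspace until the cluster regains quorum, which a lone node of a two-node cluster will not achieve on its own. A quick way to confirm the quorum state on the surviving node (assuming the usual corosync stack) is:

# corosync-quorumtool -s

If only one of the two nodes is up and corosync has no two-node quorum settings, "Quorate" will typically be reported as "No" there.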

Cluster configuration:

root@nfsnode1:/root>crm configure show
node 1084808202: nfsnode1
node 1084808203: nfsnode2
primitive admin_addr IPaddr2 \
params ip=192.168.220.13 \
op monitor interval=10 timeout=20
primitive clvm ocf:lvm2:clvmd \
params daemon_timeout=30 \
op start timeout=90s interval=0 \
op stop timeout=100s interval=0
primitive dlm ocf:pacemaker:controld \
op start timeout=90s interval=0 \
op stop timeout=100s interval=0
primitive fs_sap Filesystem \
params device="/dev/vgappdata/lvusrsap" directory="/usr/sap/XE1/" fstype=ext4 \
meta target-role=Started \
op start interval=0 timeout=60s \
op stop interval=0 timeout=60s \
op monitor interval=20s timeout=40s
primitive stonith-sbd stonith:external/sbd \
params pcmk_delay_max=30s
primitive vg1 LVM \
params volgrpname=vgappdata \
op start timeout=60s interval=0 \
op stop timeout=60s interval=0 \
op monitor interval=30s timeout=60s
primitive vip_sap IPaddr2 \
params ip=192.168.220.16 cidr_netmask=24 \
op start interval=0 timeout=20 \
op stop interval=0 timeout=20 \
op monitor interval=10 timeout=20
group g-clvm dlm clvm vg1
group g_sap fs_sap vip_sap \
meta target-role=Started
clone c-clvm g-clvm \
meta interleave=true ordered=true
order g-constraint Optional: c-clvm:start g_sap:start
property cib-bootstrap-options: \
have-watchdog=true \
dc-version=1.1.15-21.1-e174ec8 \
cluster-infrastructure=corosync \
cluster-name=hacluster \
stonith-enabled=true \
placement-strategy=balanced \
no-quorum-policy=ignore \
stonith-action=reboot \
stonith-timeout=150s
rsc_defaults rsc-options: \
resource-stickiness=1000 \
migration-threshold=5000
op_defaults op-options: \
timeout=600 \
record-pending=true
root@nfsnode1:/root>
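
A side note on this configuration: no-quorum-policy=ignore only affects how Pacemaker itself reacts to losing quorum; dlm_controld takes its quorum information straight from corosync. If corosync is not told that this is a two-node cluster, a lone node never becomes quorate, and DLM (and with it clvmd) keeps waiting. A minimal sketch of the corosync.conf quorum section for such a setup (assuming the corosync_votequorum provider; the actual file on the nodes may differ) could look like this:

quorum {
    # assumption: votequorum is the quorum provider in use
    provider: corosync_votequorum
    expected_votes: 2
    # two-node mode: the surviving node keeps quorum when its peer fails
    two_node: 1
    # two_node implicitly turns on wait_for_all, so after BOTH nodes were
    # down a lone node still waits until it has seen its peer once;
    # setting it to 0 allows a cold start of a single node, relying on
    # SBD/STONITH to prevent split brain
    wait_for_all: 0
}

Whether wait_for_all should really be 0 is a trade-off: with working SBD fencing some two-node setups run this way, but it shifts all split-brain protection onto fencing.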

Best regards,
Arunabha

Comments

  • jmozdzen Knowledge Partner
    Hi Arunabha,

    could you please list the DLM config via "dlm_tool dump_config"? Just to verify its quorum setting.

    Regards,
    J
  • arunabha_banerjee New or Quiet Member
    Here it is:


    root@nfsnode1:/root>dlm_tool dump_config
    daemon_debug=0
    foreground=0
    log_debug=0
    timewarn=0
    protocol=detect
    debug_logfile=0
    enable_fscontrol=0
    enable_plock=1
    plock_debug=0
    plock_rate_limit=0
    plock_ownership=0
    drop_resources_time=10000
    drop_resources_count=10
    drop_resources_age=10000
    post_join_delay=30
    enable_fencing=1
    enable_concurrent_fencing=0
    enable_startup_fencing=0
    enable_quorum_fencing=1
    enable_quorum_lockspace=1
    help=-1
    version=-1

    root@nfsnode1:/root>
  • jmozdzen Knowledge Partner
    I don't have a 12 SP2 HAE installation at hand - what catches the eye is
    enable_quorum_fencing=1
    enable_quorum_lockspace=1

    which I would have expected to be "0" in your two-node setup (a way to override these is sketched at the end of the thread).

    Unfortunately, "my hands are bound" (or rather, I'm sort of "blind") until I am back at my office next week - if anyone else would like to chime in here, you're welcome ;)

    Regards,
    J
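
Regarding the two settings flagged above: if they really need to be overridden on this build, one possible place is the dlm_controld configuration file. The following is only a sketch (it assumes dlm_controld on SLES 12 SP2 reads /etc/dlm/dlm.conf, and that relaxing these checks is acceptable given the SBD fencing in place); verify against dlm.conf(5) before using it:

# /etc/dlm/dlm.conf  (sketch only)
# stop DLM from tying fencing and new lockspaces to cluster quorum
enable_quorum_fencing=0
enable_quorum_lockspace=0

After such a change, dlm_controld (started here via the c-clvm clone) would need to be restarted, and the result can be re-checked with dlm_tool dump_config.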