Pacemaker with Apache



Stefanik74
08-Apr-2014, 15:36
Hi all,
I have two nodes running Pacemaker, and I need to add an active/standby Apache check to an existing configuration.
Apache is bundled with JBoss here, so it is not installed under the standard bin path.

I added the following lines to the CRM configuration (using the "crm configure edit" command):
primitive apache ocf:heartbeat:apache \
    params configfile="/opt/jboss/httpd/httpd/conf/httpd.conf" \
    params httpd="/opt/jboss/httpd/sbin" \
    op start interval="0" timeout="40" \
    op stop interval="0" timeout="60" \
    op monitor interval="120s" timeout="60s"

But the resource fails to start:

pengine[12738]: warning: unpack_rsc_op: Processing failed op start for apache on test: unknown error (1)
pengine[12738]: notice: LogActions: Start apache (test)
pengine[12738]: notice: process_pe_message: Calculated Transition 428: /var/lib/pacemaker/pengine/pe-input-166.bz2
crmd[12739]: notice: do_te_invoke: Processing graph 428 (ref=pe_calc-dc-1397330975-614) derived from /var/lib/pacemaker/pengine/pe-input-166.bz2
crmd[12739]: notice: te_rsc_command: Initiating action 120: start apache_start_0 on test
/usr/sbin/cron[26457]: (sogadm) CMD (/opt/ericsson/appl/ProcessController/0/bin/go_SPPM_PL_WSOMapper > /dev/null 2>&1)
crmd[12739]: warning: status_from_rc: Action 120 (apache_start_0) on test failed (target: 0 vs. rc: 1): Error
crmd[12739]: warning: update_failcount: Updating failcount for apache on test after failed start: rc=1 (update=INFINITY, time=1397331015)
crmd[12739]: warning: update_failcount: Updating failcount for apache on test after failed start: rc=1 (update=INFINITY, time=1397331015)
crmd[12739]: notice: run_graph: Transition 428 (Complete=1, Pending=0, Fired=0, Skipped=1, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-166.bz2): Stopped
pengine[12738]: notice: unpack_config: On loss of CCM Quorum: Ignore
pengine[12738]: warning: unpack_rsc_op: Processing failed op start for apache on test: unknown error (1)
pengine[12738]: notice: LogActions: Recover apache (Started test)
pengine[12738]: notice: process_pe_message: Calculated Transition 429: /var/lib/pacemaker/pengine/pe-input-167.bz2
pengine[12738]: notice: unpack_config: On loss of CCM Quorum: Ignore
pengine[12738]: warning: unpack_rsc_op: Processing failed op start for apache on test: unknown error (1)
pengine[12738]: warning: common_apply_stickiness: Forcing apache away from test after 1000000 failures (max=1000000)
pengine[12738]: notice: LogActions: Recover apache (Started test -> toemaprend1)
pengine[12738]: notice: process_pe_message: Calculated Transition 430: /var/lib/pacemaker/pengine/pe-input-168.bz2
crmd[12739]: notice: do_te_invoke: Processing graph 430 (ref=pe_calc-dc-1397331016-617) derived from /var/lib/pacemaker/pengine/pe-input-168.bz2
crmd[12739]: notice: te_rsc_command: Initiating action 17: stop apache_stop_0 on test
crmd[12739]: notice: te_rsc_command: Initiating action 121: start apache_start_0 on toemaprend1 (local)
apache(apache)[27301]: ERROR: /usr/lib/ocf/lib/heartbeat/ocf-shellfuncs: line 381: -DSTATUS: command not found
apache(apache)[27301]: INFO: waiting for apache /opt/jboss/httpd/httpd/conf/httpd.conf to come up
apache(apache)[27301]: INFO: apache not running
apache(apache)[27301]: INFO: waiting for apache /opt/jboss/httpd/httpd/conf/httpd.conf to come up
apache(apache)[27301]: INFO: apache not running
apache(apache)[27301]: INFO: waiting for apache /opt/jboss/httpd/httpd/conf/httpd.conf to come up
lrmd[12736]: warning: child_timeout_callback: apache_start_0 process (PID 27301) timed out
lrmd[12736]: warning: operation_finished: apache_start_0:27301 - timed out after 40000ms
lrmd[12736]: notice: operation_finished: apache_start_0:27301 [ basename: missing operand ]
lrmd[12736]: notice: operation_finished: apache_start_0:27301 [ Try `basename --help' for more information. ]
crmd[12739]: error: process_lrm_event: LRM operation apache_start_0 (306) Timed Out (timeout=40000ms)
crmd[12739]: warning: status_from_rc: Action 121 (apache_start_0) on toemaprend1 failed (target: 0 vs. rc: 1): Error
crmd[12739]: warning: update_failcount: Updating failcount for apache on toemaprend1 after failed start: rc=1 (update=INFINITY, time=1397331056)
attrd[12737]: notice: attrd_trigger_update: Sending flush op to all hosts for: fail-count-apache (INFINITY)
crmd[12739]: warning: update_failcount: Updating failcount for apache on toemaprend1 after failed start: rc=1 (update=INFINITY, time=1397331056)
crmd[12739]: notice: run_graph: Transition 430 (Complete=3, Pending=0, Fired=0, Skipped=1, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-168.bz2): Stopped
attrd[12737]: notice: attrd_perform_update: Sent update 212: fail-count-apache=INFINITY
attrd[12737]: notice: attrd_trigger_update: Sending flush op to all hosts for: last-failure-apache (1397331056)
attrd[12737]: notice: attrd_perform_update: Sent update 214: last-failure-apache=1397331056
attrd[12737]: notice: attrd_trigger_update: Sending flush op to all hosts for: last-failure-apache (1397331056)
pengine[12738]: notice: unpack_config: On loss of CCM Quorum: Ignore
pengine[12738]: warning: unpack_rsc_op: Processing failed op start for apache on toemaprend1: unknown error (1)
pengine[12738]: warning: unpack_rsc_op: Processing failed op start for apache on test: unknown error (1)
pengine[12738]: warning: common_apply_stickiness: Forcing apache away from test after 1000000 failures (max=1000000)
pengine[12738]: notice: LogActions: Recover apache (Started toemaprend1)
pengine[12738]: notice: process_pe_message: Calculated Transition 431: /var/lib/pacemaker/pengine/pe-input-169.bz2
attrd[12737]: notice: attrd_perform_update: Sent update 216: last-failure-apache=1397331056
pengine[12738]: notice: unpack_config: On loss of CCM Quorum: Ignore
pengine[12738]: warning: unpack_rsc_op: Processing failed op start for apache on toemaprend1: unknown error (1)
pengine[12738]: warning: unpack_rsc_op: Processing failed op start for apache on test: unknown error (1)
pengine[12738]: warning: common_apply_stickiness: Forcing apache away from toemaprend1 after 1000000 failures (max=1000000)
pengine[12738]: warning: common_apply_stickiness: Forcing apache away from test after 1000000 failures (max=1000000)
pengine[12738]: notice: LogActions: Stop apache (toemaprend1)
pengine[12738]: notice: process_pe_message: Calculated Transition 432: /var/lib/pacemaker/pengine/pe-input-170.bz2
crmd[12739]: notice: do_te_invoke: Processing graph 432 (ref=pe_calc-dc-1397331056-621) derived from /var/lib/pacemaker/pengine/pe-input-170.bz2
crmd[12739]: notice: te_rsc_command: Initiating action 15: stop apache_stop_0 on toemaprend1 (local)
apache(apache)[29867]: INFO: apache is not running.
lrmd[12736]: notice: operation_finished: apache_stop_0:29867 [ basename: missing operand ]
lrmd[12736]: notice: operation_finished: apache_stop_0:29867 [ Try `basename --help' for more information. ]
crmd[12739]: notice: process_lrm_event: LRM operation apache_stop_0 (call=309, rc=0, cib-update=661, confirmed=true) ok
crmd[12739]: notice: run_graph: Transition 432 (Complete=2, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-170.bz2): Complete
crmd[12739]: notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]


I'm new to Pacemaker, so I'm not sure about the configuration or the right way to edit it.
Can you help me?

Thanks,
Ste

jmozdzen
08-Apr-2014, 17:52
Hi Ste,

I recommend creating an rc-style start/stop script to use with Pacemaker. Maybe there is already a single command you can use to start/stop this specific Apache from the command line?
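
A minimal sketch of such an rc-style wrapper, assuming the apachectl path quoted later in this thread and a hypothetical PID file location (PIDFILE below is a guess and has to match the PidFile directive in httpd.conf). Pacemaker's lsb: resource class expects start, stop and status actions with LSB exit codes, in particular status returning 3 when the service is stopped. The function names are made up for illustration:

```shell
#!/bin/sh
# Sketch only: LSB-style wrapper around the JBoss-bundled apachectl.
# APACHECTL is the path quoted in this thread; PIDFILE is a HYPOTHETICAL
# location and must match the PidFile directive in httpd.conf.
APACHECTL=${APACHECTL:-/opt/jboss/httpd/sbin/apachectl}
PIDFILE=${PIDFILE:-/opt/jboss/httpd/httpd/logs/httpd.pid}

# Return 0 if the PID recorded in PIDFILE is alive, 3 (LSB "not running")
# otherwise. Simplification: a stale PID file is also reported as 3.
rc_status() {
    [ -r "$PIDFILE" ] || return 3
    pid=$(cat "$PIDFILE")
    [ -n "$pid" ] || return 3
    kill -0 "$pid" 2>/dev/null || return 3
    return 0
}

# Dispatch one action. Starting a running service or stopping a stopped
# one counts as success, as LSB requires.
rc_main() {
    case "$1" in
        start)
            if rc_status; then return 0; fi
            "$APACHECTL" start
            ;;
        stop)
            if rc_status; then "$APACHECTL" stop; else return 0; fi
            ;;
        status)
            rc_status
            ;;
        *)
            echo "Usage: $0 {start|stop|status}" >&2
            return 2
            ;;
    esac
}
```

Saved under /etc/init.d with a final line rc_main "$@", a script like this could be referenced from the cluster as an lsb: primitive instead of an ocf: one.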

If you want to continue with your approach (which ought to work, too): what happens when you try to start Apache manually via that resource script? The first line of your syslog hints at a failure starting the daemon. Maybe some additional parameters are needed, or specific libraries need to be available (a proper LD_LIBRARY_PATH), etc.?
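
One way to try this is to call the agent directly, with the resource parameters passed as OCF_RESKEY_* environment variables, which is how the cluster hands parameters to OCF agents. A sketch, assuming the agents live under /usr/lib/ocf (the path that appears in the log above) and reusing the parameter values from the original post unchanged so the failure can be reproduced; the function name run_agent is made up for illustration:

```shell
#!/bin/sh
# Sketch: invoke an OCF resource agent by hand, outside the cluster.
# OCF_ROOT matches the path seen in the thread's log; adjust if needed.
OCF_ROOT=${OCF_ROOT:-/usr/lib/ocf}

# run_agent ACTION: run the heartbeat apache agent with the parameters
# from the original primitive, passed as OCF_RESKEY_* variables.
run_agent() {
    agent="$OCF_ROOT/resource.d/heartbeat/apache"
    if [ ! -x "$agent" ]; then
        echo "resource agent not found: $agent" >&2
        return 5    # OCF_ERR_INSTALLED
    fi
    OCF_ROOT="$OCF_ROOT" \
    OCF_RESOURCE_INSTANCE=apache \
    OCF_RESKEY_configfile="/opt/jboss/httpd/httpd/conf/httpd.conf" \
    OCF_RESKEY_httpd="/opt/jboss/httpd/sbin" \
    "$agent" "$1"
}
```

Running run_agent start as root on one of the nodes, followed by echo "rc=$?", shows the agent's own messages on the terminal together with its exit code, which is usually easier to read than the syslog excerpt.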

Regards,
Jens

Stefanik74
10-Apr-2014, 20:58
Thanks Jens,
I was focused on the wrong approach :)
The resource is not a standalone Apache but just a JBoss module, so I changed the configuration: I deleted the ocf:apache primitive and added the following:
primitive res_web_ha ocf:heartbeat:anything \
    params binfile="/opt/jboss/httpd/sbin/apachectl" cmdline_options="start" \
    operations $id="res_web_ha-operations" \
    op start interval="0" timeout="20" \
    op stop interval="0" timeout="20" \
    op monitor interval="10" timeout="20" start-delay="0"

It appears to be active:
res_web_ha (ocf::heartbeat:anything): Started totestprend1

but now I get the following error in the messages log:
lrmd[13342]: notice: operation_finished: res_web_ha_monitor_10000:43055 [ /usr/lib/ocf/resource.d/heartbeat/anything: line 60: kill: (32314) - No such process ]

I'm really not sure how to manage this resource.
I have to add rules for active/standby failover: should I use the same ocf:anything, or should I add ocf:IPaddr2?

Thanks again,
Ste

Stefanik74
14-Apr-2014, 14:06
I checked, and the monitor operation doesn't work properly in my case. I think I should use the monitor_hook option, but I can't find any documentation on how to use it. Can you help me with any suggestions?
Thanks.

jmozdzen
24-Apr-2014, 14:36
Hi Ste,

> I think I should use the monitor_hook option, but I can't find any documentation on how to use it. Can you help me with any suggestions?

Check the resource agent script to see what it does on the "monitor" action.
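
For a generic agent such as ocf:heartbeat:anything, the monitor action essentially checks whether the PID recorded at start time is still alive. A likely explanation for the "No such process" message (an assumption, not confirmed in this thread) is that "apachectl start" launches httpd and then exits, so the recorded PID belongs to a process that is already gone. The sketch below illustrates that kind of check; it is not the agent's actual code, and the function name pid_monitor is made up:

```shell
#!/bin/sh
# Sketch of a PID-file liveness check, similar in spirit to what a
# generic resource agent does on "monitor". NOT the agent's real code.
OCF_SUCCESS=0
OCF_NOT_RUNNING=7

# pid_monitor PIDFILE: succeed only if the PID stored in PIDFILE is alive.
pid_monitor() {
    pidfile=$1
    [ -r "$pidfile" ] || return $OCF_NOT_RUNNING
    pid=$(cat "$pidfile")
    [ -n "$pid" ] || return $OCF_NOT_RUNNING
    # kill -0 delivers no signal; it only tests whether the process exists.
    if kill -0 "$pid" 2>/dev/null; then
        return $OCF_SUCCESS
    fi
    return $OCF_NOT_RUNNING
}
```

Pointing a check like this at the PID file that httpd itself writes (its location is set by the PidFile directive in httpd.conf) would track the daemon rather than the short-lived apachectl wrapper.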

Regards,
Jens

PS: I'm sorry for the late reply - somehow I managed to drop a number of notifications on updated threads...