Hello people,
I have a heartbeat cluster that manages SER 0.9.3 running on one machine and Asterisk 1.2.3 running on another, in an Active/Active two-IP-address configuration with failover support. I have SUSE 10.1 installed on both machines. My ha.cf files look like this:
###############################################
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 10
warntime 10
initdead 20
udpport 694
baud 19200
bcast eth1
ping xxx.xxx.x.x
auto_failback on
node linux-xczz
node prueba2
respawn hacluster /usr/local/lib/heartbeat/ipfail
#################################################
My haresources files look like this:
########################################
prueba2 xxx.xxx.x.125/24 safe_asterisk
linux-xczz xxx.xxx.x.124/24 serctl
########################################
So "linux-xczz" is the master when running ser and "prueba2" is the master when running asterisk.
The authkeys files are the same on both machines too, with the right permissions (mode 600). The /etc/hosts files look like this:
##########################
10.10.10.1 linux-xczz
10.10.10.2 prueba2
##########################
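A quick way to double-check those preconditions on each node can be sketched in shell. This is only an illustration: the function names `check_mode` and `node_listed` are made up here, and the file paths are passed in as arguments because the logs further down suggest the two machines keep ha.d in different places (/etc/ha.d and /usr/local/etc/ha.d). Heartbeat expects the names on the `node` lines to match what `uname -n` prints on each machine.

```shell
#!/bin/sh
# Hypothetical helpers, not part of heartbeat.

# check_mode FILE MODE: succeed only if FILE's permission bits are exactly MODE.
# find -perm with a bare octal mode matches the permission bits exactly.
check_mode() {
    [ -n "$(find "$1" -prune -perm "$2")" ]
}

# node_listed NAME HA_CF: succeed only if a "node" directive in HA_CF names NAME.
node_listed() {
    awk -v n="$1" '
        $1 == "node" { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }
    ' "$2"
}
```

For example, on each node: `check_mode /etc/ha.d/authkeys 600 && node_listed "$(uname -n)" /etc/ha.d/ha.cf && echo ok`, adjusting the directory to wherever ha.d lives on that machine.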
The first time I run heartbeat on one machine (prueba2) with /etc/init.d/heartbeat start, both my services run fine. But when I start heartbeat on the other machine (linux-xczz) so that it takes over the ser service, the system goes crazy: once linux-xczz takes over the ser service, prueba2 gives up the other resource (asterisk), which it should not do, and it appears on linux-xczz, only to disappear seconds later along with ser. This leaves my HA cluster a complete wreck, with no service running on either machine. The error log I get from prueba2 is this:
heartbeat[12609]: 2007/03/15_11:41:18 info: Link linux-xczz:eth1 up.
heartbeat[12609]: 2007/03/15_11:41:18 info: Status update for node linux-xczz: status init
heartbeat[12609]: 2007/03/15_11:41:18 info: Status update for node linux-xczz: status up
harc[13730]: 2007/03/15_11:41:18 info: Running /etc/ha.d/rc.d/status status
harc[13741]: 2007/03/15_11:41:18 info: Running /etc/ha.d/rc.d/status status
heartbeat[12609]: 2007/03/15_11:41:19 info: Status update for node linux-xczz: status active
harc[13754]: 2007/03/15_11:41:19 info: Running /etc/ha.d/rc.d/status status
heartbeat[12609]: 2007/03/15_11:41:19 info: remote resource transition completed.
heartbeat[12609]: 2007/03/15_11:41:19 info: prueba2 wants to go standby [foreign]
heartbeat[12609]: 2007/03/15_11:41:20 info: standby: linux-xczz can take our foreign resources
heartbeat[13767]: 2007/03/15_11:41:20 info: give up foreign HA resources (standby).
ResourceManager[13777]: 2007/03/15_11:41:20 info: Releasing resource group: linux-xczz xxx.xxx.x.124/24 serctl
ResourceManager[13777]: 2007/03/15_11:41:20 info: Running /etc/init.d/serctl stop
ResourceManager[13777]: 2007/03/15_11:41:20 info: Running /etc/ha.d/resource.d/IPaddr xxx.xxx.x.124/24 stop
IPaddr[13915]: 2007/03/15_11:41:20 INFO: /sbin/route -n del -host xxx.xxx.x.124
IPaddr[13915]: 2007/03/15_11:41:20 INFO: /sbin/ifconfig eth0:0 xxx.xxx.x.124 down
IPaddr[13915]: 2007/03/15_11:41:20 INFO: IP Address xxx.xxx.x.124 released
IPaddr[13836]: 2007/03/15_11:41:20 INFO: IPaddr Success
heartbeat[13767]: 2007/03/15_11:41:20 info: foreign HA resource release completed (standby).
heartbeat[12609]: 2007/03/15_11:41:20 info: Local standby process completed [foreign].
heartbeat[12609]: 2007/03/15_11:41:23 WARN: 1 lost packet(s) for [linux-xczz] [13:15]
heartbeat[12609]: 2007/03/15_11:41:23 info: remote resource transition completed.
heartbeat[12609]: 2007/03/15_11:41:23 info: No pkts missing from linux-xczz!
heartbeat[12609]: 2007/03/15_11:41:23 info: Other node completed standby takeover of foreign resources.
heartbeat[12609]: 2007/03/15_11:41:35 info: linux-xczz wants to go standby [foreign]
heartbeat[12609]: 2007/03/15_11:41:36 info: standby: acquire [foreign] resources from linux-xczz
heartbeat[14011]: 2007/03/15_11:41:36 info: acquire local HA resources (standby).
ResourceManager[14021]: 2007/03/15_11:41:36 info: Acquiring resource group: prueba2 xxx.xxx.x.125/24 asterisk-rosa
IPaddr[14048]: 2007/03/15_11:41:36 INFO: IPaddr Running OK
ResourceManager[14021]: 2007/03/15_11:41:36 info: Running /etc/init.d/safe_asterisk start
ResourceManager[14021]: 2007/03/15_11:41:36 ERROR: Return code 1 from /etc/init.d/safe_asterisk
ResourceManager[14021]: 2007/03/15_11:41:36 CRIT: Giving up resources due to failure of safe_asterisk
ResourceManager[14021]: 2007/03/15_11:41:36 info: Releasing resource group: prueba2 xxx.xxx.x.125/24 asterisk-rosa
ResourceManager[14021]: 2007/03/15_11:41:36 info: Running /etc/init.d/safe_asterisk stop
ResourceManager[14021]: 2007/03/15_11:41:37 info: Running /etc/ha.d/resource.d/IPaddr xxx.xxx.x.125/24 stop
IPaddr[14310]: 2007/03/15_11:41:37 INFO: /sbin/route -n del -host xxx.xxx.x.125
IPaddr[14310]: 2007/03/15_11:41:37 INFO: /sbin/ifconfig eth0:2 xxx.xxx.x.125 down
IPaddr[14310]: 2007/03/15_11:41:37 INFO: IP Address xxx.xxx.x.125 released
IPaddr[14231]: 2007/03/15_11:41:37 INFO: IPaddr Success
heartbeat[14011]: 2007/03/15_11:41:37 info: local HA resource acquisition completed (standby).
heartbeat[12609]: 2007/03/15_11:41:37 info: Standby resource acquisition done [foreign].
heartbeat[12609]: 2007/03/15_11:41:37 info: remote resource transition completed.
heartbeat[12609]: 2007/03/15_11:41:38 WARN: G_CH_dispatch_int: Dispatch function for read child took too long to execute: 520 ms (> 50 ms) (GSource: 0x80fbe00)
hb_standby[14375]: 2007/03/15_11:42:07 Going standby [foreign].
heartbeat[12609]: 2007/03/15_11:42:07 info: prueba2 wants to go standby [foreign]
heartbeat[12609]: 2007/03/15_11:42:08 info: standby: linux-xczz can take our foreign resources
heartbeat[14385]: 2007/03/15_11:42:08 info: give up foreign HA resources (standby).
ResourceManager[14395]: 2007/03/15_11:42:08 info: Releasing resource group: linux-xczz xxx.xxx.x.124/24 serctl
ResourceManager[14395]: 2007/03/15_11:42:08 info: Running /etc/init.d/serctl stop
ResourceManager[14395]: 2007/03/15_11:42:08 ERROR: Return code 1 from /etc/init.d/serctl
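One possible reading of the log above, offered as a guess and not as a diagnosis: heartbeat finds the IP already up (IPaddr Running OK), runs /etc/init.d/safe_asterisk start anyway, gets return code 1, and gives up the whole group. Resource scripts listed in haresources are expected to follow LSB init script conventions, where "start" on an already running service and "stop" on an already stopped one both count as success. A minimal sketch of that behaviour follows; `SERVICE_CHECK`, `SERVICE_START` and `SERVICE_STOP` are hypothetical hooks, not part of heartbeat, SER or Asterisk.

```shell
#!/bin/sh
# Sketch of LSB-style start/stop/status semantics for a resource script.
# The three hooks below are placeholders (assumptions): point them at
# whatever really probes, starts and stops your daemon.
SERVICE_CHECK=${SERVICE_CHECK:-false}   # exits 0 if the daemon is running
SERVICE_START=${SERVICE_START:-true}    # starts the daemon
SERVICE_STOP=${SERVICE_STOP:-true}      # stops the daemon

resource_start() {
    # Already running: report success instead of an error.
    if $SERVICE_CHECK; then
        return 0
    fi
    $SERVICE_START
}

resource_stop() {
    # Already stopped: that is success too.
    if ! $SERVICE_CHECK; then
        return 0
    fi
    $SERVICE_STOP
}

resource_status() {
    # LSB status codes: 0 means running, 3 means not running.
    if $SERVICE_CHECK; then
        echo "running"
        return 0
    fi
    echo "stopped"
    return 3
}
```

With the hooks pointed at the real daemon, a wrapper in /etc/init.d could dispatch start, stop and status to these three functions and exit with their return codes.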
The error log I get from linux-xczz when I run heartbeat is this:
heartbeat[1063]: 2007/03/15_14:26:12 WARN: Core dumps could be lost if multiple dumps occur
heartbeat[1063]: 2007/03/15_14:26:12 WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
heartbeat[1063]: 2007/03/15_14:26:12 WARN: Logging daemon is disabled --enabling logging daemon is recommended
heartbeat[1063]: 2007/03/15_14:26:12 info: **************************
heartbeat[1063]: 2007/03/15_14:26:12 info: Configuration validated. Starting heartbeat 2.0.7
heartbeat[1064]: 2007/03/15_14:26:12 info: heartbeat: version 2.0.7
heartbeat[1064]: 2007/03/15_14:26:12 info: Heartbeat generation: 130
heartbeat[1064]: 2007/03/15_14:26:12 info: G_main_add_TriggerHandler: Added signal manual handler
heartbeat[1064]: 2007/03/15_14:26:12 info: G_main_add_TriggerHandler: Added signal manual handler
heartbeat[1064]: 2007/03/15_14:26:12 info: Removing /usr/local/var/run/heartbeat/rsctmp failed, recreating.
heartbeat[1064]: 2007/03/15_14:26:12 info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth1
heartbeat[1064]: 2007/03/15_14:26:12 info: glib: UDP Broadcast heartbeat closed on port 694 interface eth1 - Status: 1
heartbeat[1064]: 2007/03/15_14:26:12 info: glib: ping heartbeat started.
heartbeat[1064]: 2007/03/15_14:26:12 info: G_main_add_SignalHandler: Added signal handler for signal 17
heartbeat[1064]: 2007/03/15_14:26:12 info: Local status now set to: 'up'
heartbeat[1064]: 2007/03/15_14:26:13 info: Link linux-xczz:eth1 up.
heartbeat[1064]: 2007/03/15_14:26:13 info: Link prueba2:eth1 up.
heartbeat[1064]: 2007/03/15_14:26:13 info: Status update for node prueba2: status active
heartbeat[1064]: 2007/03/15_14:26:13 info: Link xxx.xxx.x.x:xxx.xxx.x.x up.
heartbeat[1064]: 2007/03/15_14:26:13 info: Status update for node xxx.xxx.x.x: status ping
harc[1073]: 2007/03/15_14:26:13 info: Running /usr/local/etc/ha.d/rc.d/status status
heartbeat[1064]: 2007/03/15_14:26:14 info: Comm_now_up(): updating status to active
heartbeat[1064]: 2007/03/15_14:26:14 info: Local status now set to: 'active'
heartbeat[1064]: 2007/03/15_14:26:14 info: Starting child client "/usr/local/lib/heartbeat/ipfail" (1001,100)
heartbeat[1084]: 2007/03/15_14:26:14 info: Starting "/usr/local/lib/heartbeat/ipfail" as uid 1001 gid 100 (pid 1084)
heartbeat[1064]: 2007/03/15_14:26:14 info: remote resource transition completed.
heartbeat[1064]: 2007/03/15_14:26:14 info: remote resource transition completed.
heartbeat[1064]: 2007/03/15_14:26:14 info: Local Resource acquisition completed. (none)
heartbeat[1064]: 2007/03/15_14:26:15 info: prueba2 wants to go standby [foreign]
heartbeat[1064]: 2007/03/15_14:26:15 info: standby: acquire [foreign] resources from prueba2
heartbeat[1088]: 2007/03/15_14:26:15 info: acquire local HA resources (standby).
ResourceManager[1098]: 2007/03/15_14:26:15 info: Acquiring resource group: linux-xczz xxx.xxx.x.124/24 serctl
IPaddr[1122]: 2007/03/15_14:26:16 INFO: IPaddr Resource is stopped
ResourceManager[1098]: 2007/03/15_14:26:16 info: Running /usr/local/etc/ha.d/resource.d/IPaddr 192.168.1.124/24 start
IPaddr[1321]: 2007/03/15_14:26:16 INFO: eval /sbin/ifconfig eth0:0 xxx.xxx.x.124 netmask 255.255.255.0 broadcast xxx.xxx.x.255
IPaddr[1321]: 2007/03/15_14:26:16 INFO: Sending Gratuitous Arp for xxx.xxx.x.124 on eth0:0 [eth0]
IPaddr[1321]: 2007/03/15_14:26:16 INFO: /usr/local/lib/heartbeat/send_arp -i 500 -r 10 -p /usr/local/var/run/heartbeat/rsctmp/send_arp/send_arp-xxx.xxx.x.124 eth0 xxx.xxx.x.124 auto xxx.xxx.x.124 ffffffffffff
IPaddr[1241]: 2007/03/15_14:26:16 INFO: IPaddr Success
ResourceManager[1098]: 2007/03/15_14:26:16 info: Running /etc/init.d/serctl start
heartbeat[1088]: 2007/03/15_14:26:17 info: local HA resource acquisition completed (standby).
heartbeat[1064]: 2007/03/15_14:26:17 info: Standby resource acquisition done [foreign].
heartbeat[1064]: 2007/03/15_14:26:17 info: Initial resource acquisition complete (auto_failback)
heartbeat[1064]: 2007/03/15_14:26:23 info: remote resource transition completed.
heartbeat[1064]: 2007/03/15_14:26:28 info: linux-xczz wants to go standby [foreign]
heartbeat[1064]: 2007/03/15_14:26:28 info: standby: prueba2 can take our foreign resources
heartbeat[1492]: 2007/03/15_14:26:28 info: give up foreign HA resources (standby).
ResourceManager[1502]: 2007/03/15_14:26:28 info: Releasing resource group: prueba2 xxx.xxx.x.125/24 safe_asterisk
ResourceManager[1502]: 2007/03/15_14:26:28 info: Running /etc/init.d/safe_asterisk stop
ResourceManager[1502]: 2007/03/15_14:26:28 info: Running /usr/local/etc/ha.d/resource.d/IPaddr xxx.xxx.x.125/24 stop
IPaddr[1561]: 2007/03/15_14:26:29 INFO: IPaddr Success
heartbeat[1492]: 2007/03/15_14:26:29 info: foreign HA resource release completed (standby).
heartbeat[1064]: 2007/03/15_14:26:29 info: Local standby process completed [foreign].
heartbeat[1064]: 2007/03/15_14:26:30 WARN: 1 lost packet(s) for [prueba2] [68:70]
heartbeat[1064]: 2007/03/15_14:26:30 info: remote resource transition completed.
heartbeat[1064]: 2007/03/15_14:26:30 info: No pkts missing from prueba2!
heartbeat[1064]: 2007/03/15_14:26:30 info: Other node completed standby takeover of foreign resources.
heartbeat[1064]: 2007/03/15_14:27:00 info: prueba2 wants to go standby [foreign]
heartbeat[1064]: 2007/03/15_14:27:11 info: standby: acquire [foreign] resources from prueba2
heartbeat[1784]: 2007/03/15_14:27:11 info: acquire local HA resources (standby).
ResourceManager[1794]: 2007/03/15_14:27:11 info: Acquiring resource group: linux-xczz xxx.xxx.x.124/24 serctl
IPaddr[1818]: 2007/03/15_14:27:11 INFO: IPaddr Running OK
ResourceManager[1794]: 2007/03/15_14:27:11 info: Running /etc/init.d/serctl start
ResourceManager[1794]: 2007/03/15_14:27:11 ERROR: Return code 1 from /etc/init.d/serctl
ResourceManager[1794]: 2007/03/15_14:27:11 CRIT: Giving up resources due to failure of serctl
ResourceManager[1794]: 2007/03/15_14:27:11 info: Releasing resource group: linux-xczz xxx.xxx.x.124/24 serctl
ResourceManager[1794]: 2007/03/15_14:27:11 info: Running /etc/init.d/serctl stop
ResourceManager[1794]: 2007/03/15_14:27:11 info: Running /usr/local/etc/ha.d/resource.d/IPaddr xxx.xxx.x.124/24 stop
IPaddr[2090]: 2007/03/15_14:27:12 INFO: /sbin/route -n del -host xxx.xxx.x.124
IPaddr[2090]: 2007/03/15_14:27:12 INFO: /sbin/ifconfig eth0:0 xxx.xxx.x.124 down
IPaddr[2090]: 2007/03/15_14:27:12 INFO: IP Address xxx.xxx.x.124 released
IPaddr[2006]: 2007/03/15_14:27:12 INFO: IPaddr Success
heartbeat[1784]: 2007/03/15_14:27:12 info: local HA resource acquisition completed (standby).
heartbeat[1064]: 2007/03/15_14:27:12 info: Standby resource acquisition done [foreign].
heartbeat[1064]: 2007/03/15_14:27:12 info: remote resource transition completed.
hb_standby[2228]: 2007/03/15_14:27:42 Going standby [foreign].
heartbeat[1064]: 2007/03/15_14:27:42 info: linux-xczz wants to go standby [foreign]
heartbeat[1064]: 2007/03/15_14:27:42 info: standby: prueba2 can take our foreign resources
heartbeat[2238]: 2007/03/15_14:27:42 info: give up foreign HA resources (standby).
ResourceManager[2248]: 2007/03/15_14:27:43 info: Releasing resource group: prueba2 xxx.xxx.x.125/24 safe_asterisk
ResourceManager[2248]: 2007/03/15_14:27:43 info: Running /etc/init.d/safe_asterisk stop
ResourceManager[2248]: 2007/03/15_14:27:43 info: Running /usr/local/etc/ha.d/resource.d/IPaddr xxx.xxx.x.125/24 stop
IPaddr[2310]: 2007/03/15_14:27:43 INFO: IPaddr Success
heartbeat[2238]: 2007/03/15_14:27:43 info: foreign HA resource release completed (standby).
heartbeat[1064]: 2007/03/15_14:27:43 info: Local standby process completed [foreign].
heartbeat[1064]: 2007/03/15_14:27:44 WARN: 1 lost packet(s) for [prueba2] [114:116]
heartbeat[1064]: 2007/03/15_14:27:44 info: remote resource transition completed.
heartbeat[1064]: 2007/03/15_14:27:44 info: No pkts missing from prueba2!
heartbeat[1064]: 2007/03/15_14:27:44 info: Other node completed standby takeover of foreign resources.
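To pull just the failures out of logs this long, a small filter can help. The sketch below assumes one log entry per line, as heartbeat writes them to the logfile named in ha.cf; `ha_failures` is a made-up name, and the pattern is based only on the entries shown in this message.

```shell
#!/bin/sh
# ha_failures: read a heartbeat log on stdin and print only the ERROR and
# CRIT entries. Hypothetical helper; the pattern expects lines shaped like
#   program[pid]: timestamp LEVEL: message
ha_failures() {
    grep -E '^[A-Za-z_]+\[[0-9]+\]: [^ ]+ (ERROR|CRIT): '
}
```

For example, `ha_failures < /var/log/ha-log` on each node would show the "Return code 1" and "Giving up resources" entries without the surrounding info lines.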
What I am trying to have is ser running on linux-xczz and asterisk running on prueba2, with failover configured on both machines, but apparently the failover crashes and I lose both services whenever both heartbeats are running. Any idea why this happens or what I'm doing wrong?
Can ser and asterisk be run by heartbeat with failover support?
Thanks in advance.
I think you are pretty much on your own here. At least, I'm not able to say anything meaningful about such a complex (and rather unusual) setup. Do you have this working for SER only and for Asterisk only?
g-)
aespinoza@vivophone.com wrote:
[...]
Serusers mailing list Serusers@lists.iptel.org http://lists.iptel.org/mailman/listinfo/serusers