[Serusers] Heartbeat with ser and asterisk

Greger V. Teigre greger at teigre.com
Tue Mar 20 09:57:01 CET 2007


I think you are pretty much on your own here. At least, I'm not able 
to say anything meaningful about such a complex (and rather unusual) setup.
Do you have this working with SER alone and with Asterisk alone?
g-)

aespinoza at vivophone.com wrote:
> Hello people,
> I have a heartbeat cluster that manages ser 0.9.3 running on one 
> machine and asterisk 1.2.3 running on another, in an active/active 
> two-IP-address configuration with failover support. I have SUSE 10.1 
> installed on both machines. My ha.cf files look like this:
> ###############################################
> logfile /var/log/ha-log
> logfacility     local0
> keepalive 2
> deadtime 10
> warntime 10
> initdead 20
> udpport 694
> baud    19200
> bcast   eth1
> ping    xxx.xxx.x.x
> auto_failback on
> node    linux-xczz
> node    prueba2
> respawn hacluster /usr/local/lib/heartbeat/ipfail
> #################################################
>
>
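A minimal sketch for sanity-checking the timing directives in an ha.cf like the one quoted above. The names `parse_ha_cf` and `timing_warnings` are hypothetical, the sample omits the ping and respawn lines, and the ordering it checks (keepalive < warntime < deadtime <= initdead) is only a rule of thumb assumed here, not something Heartbeat itself enforces. It also assumes the values are plain integer seconds.

```python
# Sample taken from the ha.cf quoted above (ping and respawn lines omitted).
SAMPLE_HA_CF = """\
logfile /var/log/ha-log
logfacility     local0
keepalive 2
deadtime 10
warntime 10
initdead 20
udpport 694
baud    19200
bcast   eth1
auto_failback on
node    linux-xczz
node    prueba2
"""


def parse_ha_cf(text):
    """Return {directive: [values, ...]}; repeated directives are kept in order."""
    directives = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        parts = line.split(None, 1)
        value = parts[1] if len(parts) > 1 else ""
        directives.setdefault(parts[0], []).append(value)
    return directives


def timing_warnings(directives):
    """Check the assumed rule of thumb for the four timing directives."""
    def seconds(name):
        return int(directives[name][0])  # assumes plain integer seconds

    keepalive, warntime = seconds("keepalive"), seconds("warntime")
    deadtime, initdead = seconds("deadtime"), seconds("initdead")
    warnings = []
    if not keepalive < warntime:
        warnings.append("warntime should exceed keepalive")
    if not warntime < deadtime:
        warnings.append("warntime should be below deadtime")
    if not deadtime <= initdead:
        warnings.append("initdead should be at least deadtime")
    return warnings


if __name__ == "__main__":
    cfg = parse_ha_cf(SAMPLE_HA_CF)
    print("nodes:", cfg["node"])
    for warning in timing_warnings(cfg):
        print("warning:", warning)
```

With the values quoted above, the only thing this flags is that warntime equals deadtime, so a "late heartbeat" warning would never be logged before the node is declared dead. That is probably unrelated to the failover problem, but it is cheap to check.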
> My haresources files look like this:
> ########################################
> prueba2 xxx.xxx.x.125/24 safe_asterisk
> linux-xczz xxx.xxx.x.124/24 serctl
> ########################################
>
>
> So "linux-xczz" is the master when running ser and "prueba2" is the 
> master when running asterisk.
>
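To make the "who is master for what" mapping explicit, here is a small sketch that reads haresources lines like the two quoted above. The function names are hypothetical, and it assumes one resource group per line with no backslash continuations. It treats a group as "foreign" to a node when that node is not the first field on the line, which matches how the logs further down use the word.

```python
# The two haresources lines quoted above, with the addresses left masked.
SAMPLE_HARESOURCES = """\
prueba2 xxx.xxx.x.125/24 safe_asterisk
linux-xczz xxx.xxx.x.124/24 serctl
"""


def parse_haresources(text):
    """Return one dict per resource group: preferred node plus its resources."""
    groups = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        node, *resources = line.split()
        groups.append({"preferred_node": node, "resources": resources})
    return groups


def foreign_groups(groups, local_node):
    """Groups whose preferred node is some other machine."""
    return [g for g in groups if g["preferred_node"] != local_node]


if __name__ == "__main__":
    groups = parse_haresources(SAMPLE_HARESOURCES)
    for node in ("prueba2", "linux-xczz"):
        for group in foreign_groups(groups, node):
            print(node, "holds as foreign:", " ".join(group["resources"]))
```

Seen from prueba2, the foreign group is the ser one (xxx.xxx.x.124/24 serctl), which is the group it releases in its "give up foreign HA resources" step.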
> The authkeys files are also the same on both machines, with the right 
> permissions (mode 600). The /etc/hosts files look like this:
>
> ##########################
> 10.10.10.1      linux-xczz
> 10.10.10.2      prueba2
> ##########################
>
>
> The first time I run heartbeat on one machine (prueba2) with 
> /etc/init.d/heartbeat start, both services run fine. But when I 
> start heartbeat on the other machine (linux-xczz) so that it takes 
> over the ser service, the system goes crazy: once linux-xczz takes 
> over ser, prueba2 gives up the other resource (asterisk), 
> which it should not do. Asterisk then appears on linux-xczz, only to 
> disappear seconds later along with ser, leaving my HA cluster a 
> complete wreck with no service running on either machine. The 
> log I get from prueba2 is this:
>
> heartbeat[12609]: 2007/03/15_11:41:18 info: Link linux-xczz:eth1 up.
> heartbeat[12609]: 2007/03/15_11:41:18 info: Status update for node 
> linux-xczz: status init
> heartbeat[12609]: 2007/03/15_11:41:18 info: Status update for node 
> linux-xczz: status up
> harc[13730]:    2007/03/15_11:41:18 info: Running 
> /etc/ha.d/rc.d/status status
> harc[13741]:    2007/03/15_11:41:18 info: Running 
> /etc/ha.d/rc.d/status status
> heartbeat[12609]: 2007/03/15_11:41:19 info: Status update for node 
> linux-xczz: status active
> harc[13754]:    2007/03/15_11:41:19 info: Running 
> /etc/ha.d/rc.d/status status
> heartbeat[12609]: 2007/03/15_11:41:19 info: remote resource transition 
> completed.
> heartbeat[12609]: 2007/03/15_11:41:19 info: prueba2 wants to go 
> standby [foreign]
> heartbeat[12609]: 2007/03/15_11:41:20 info: standby: linux-xczz can 
> take our foreign resources
> heartbeat[13767]: 2007/03/15_11:41:20 info: give up foreign HA 
> resources (standby).
> ResourceManager[13777]: 2007/03/15_11:41:20 info: Releasing resource 
> group: linux-xczz xxx.xxx.x.124/24 serctl
> ResourceManager[13777]: 2007/03/15_11:41:20 info: Running 
> /etc/init.d/serctl  stop
> ResourceManager[13777]: 2007/03/15_11:41:20 info: Running 
> /etc/ha.d/resource.d/IPaddr xxx.xxx.x.124/24 stop
> IPaddr[13915]:  2007/03/15_11:41:20 INFO: /sbin/route -n del -host 
> xxx.xxx.x.124
> IPaddr[13915]:  2007/03/15_11:41:20 INFO: /sbin/ifconfig eth0:0 
> xxx.xxx.x.124 down
> IPaddr[13915]:  2007/03/15_11:41:20 INFO: IP Address xxx.xxx.x.124 
> released
> IPaddr[13836]:  2007/03/15_11:41:20 INFO: IPaddr Success
> heartbeat[13767]: 2007/03/15_11:41:20 info: foreign HA resource 
> release completed (standby).
> heartbeat[12609]: 2007/03/15_11:41:20 info: Local standby process 
> completed [foreign].
> heartbeat[12609]: 2007/03/15_11:41:23 WARN: 1 lost packet(s) for 
> [linux-xczz] [13:15]
> heartbeat[12609]: 2007/03/15_11:41:23 info: remote resource transition 
> completed.
> heartbeat[12609]: 2007/03/15_11:41:23 info: No pkts missing from 
> linux-xczz!
> heartbeat[12609]: 2007/03/15_11:41:23 info: Other node completed 
> standby takeover of foreign resources.
> heartbeat[12609]: 2007/03/15_11:41:35 info: linux-xczz wants to go 
> standby [foreign]
> heartbeat[12609]: 2007/03/15_11:41:36 info: standby: acquire [foreign] 
> resources from linux-xczz
> heartbeat[14011]: 2007/03/15_11:41:36 info: acquire local HA resources 
> (standby).
> ResourceManager[14021]: 2007/03/15_11:41:36 info: Acquiring resource 
> group: prueba2 xxx.xxx.x.125/24 asterisk-rosa
> IPaddr[14048]:  2007/03/15_11:41:36 INFO: IPaddr Running OK
> ResourceManager[14021]: 2007/03/15_11:41:36 info: Running 
> /etc/init.d/safe_asterisk  start
> ResourceManager[14021]: 2007/03/15_11:41:36 ERROR: Return code 1 from 
> /etc/init.d/safe_asterisk
> ResourceManager[14021]: 2007/03/15_11:41:36 CRIT: Giving up resources 
> due to failure of safe_asterisk
> ResourceManager[14021]: 2007/03/15_11:41:36 info: Releasing resource 
> group: prueba2 xxx.xxx.x.125/24 asterisk-rosa
> ResourceManager[14021]: 2007/03/15_11:41:36 info: Running 
> /etc/init.d/safe_asterisk  stop
> ResourceManager[14021]: 2007/03/15_11:41:37 info: Running 
> /etc/ha.d/resource.d/IPaddr xxx.xxx.x.125/24 stop
> IPaddr[14310]:  2007/03/15_11:41:37 INFO: /sbin/route -n del -host 
> xxx.xxx.x.125
> IPaddr[14310]:  2007/03/15_11:41:37 INFO: /sbin/ifconfig eth0:2 
> xxx.xxx.x.125 down
> IPaddr[14310]:  2007/03/15_11:41:37 INFO: IP Address xxx.xxx.x.125 
> released
> IPaddr[14231]:  2007/03/15_11:41:37 INFO: IPaddr Success
> heartbeat[14011]: 2007/03/15_11:41:37 info: local HA resource 
> acquisition completed (standby).
> heartbeat[12609]: 2007/03/15_11:41:37 info: Standby resource 
> acquisition done [foreign].
> heartbeat[12609]: 2007/03/15_11:41:37 info: remote resource transition 
> completed.
> heartbeat[12609]: 2007/03/15_11:41:38 WARN: G_CH_dispatch_int: 
> Dispatch function for read child took too long to execute: 520 ms (> 
> 50 ms)
> (GSource: 0x80fbe00)
> hb_standby[14375]:      2007/03/15_11:42:07 Going standby [foreign].
> heartbeat[12609]: 2007/03/15_11:42:07 info: prueba2 wants to go 
> standby [foreign]
> heartbeat[12609]: 2007/03/15_11:42:08 info: standby: linux-xczz can 
> take our foreign resources
> heartbeat[14385]: 2007/03/15_11:42:08 info: give up foreign HA 
> resources (standby).
> ResourceManager[14395]: 2007/03/15_11:42:08 info: Releasing resource 
> group: linux-xczz xxx.xxx.x.124/24 serctl
> ResourceManager[14395]: 2007/03/15_11:42:08 info: Running 
> /etc/init.d/serctl  stop
> ResourceManager[14395]: 2007/03/15_11:42:08 ERROR: Return code 1 from 
> /etc/init.d/serctl
>
>
>
>
>
>
>
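One way to pull the failing resource scripts out of a log like the one above is sketched here. It assumes each log entry sits on a single line (the entries quoted in this mail are wrapped by the mail client), and the names `failed_scripts`, `RUNNING_RE` and `FAILURE_RE` are hypothetical. The sample lines are copied from the prueba2 log with the wrapping undone.

```python
import re

# "info: Running <script>  <action>" lines, as ResourceManager logs them.
RUNNING_RE = re.compile(r"info: Running (?P<script>\S+)\s+(?P<action>\w+)$")
# "ERROR: Return code <n> from <script>" lines.
FAILURE_RE = re.compile(r"ERROR: Return code (?P<code>\d+)\s+from (?P<script>\S+)")

SAMPLE_LOG = [
    "ResourceManager[14021]: 2007/03/15_11:41:36 info: Running /etc/init.d/safe_asterisk  start",
    "ResourceManager[14021]: 2007/03/15_11:41:36 ERROR: Return code 1 from /etc/init.d/safe_asterisk",
    "ResourceManager[14395]: 2007/03/15_11:42:08 info: Running /etc/init.d/serctl  stop",
    "ResourceManager[14395]: 2007/03/15_11:42:08 ERROR: Return code 1 from /etc/init.d/serctl",
]


def failed_scripts(lines):
    """Return (script, action, return_code) for every failed resource script."""
    last_action = {}  # script path -> most recent action requested for it
    failures = []
    for line in lines:
        line = line.rstrip()
        running = RUNNING_RE.search(line)
        if running:
            last_action[running["script"]] = running["action"]
            continue
        failure = FAILURE_RE.search(line)
        if failure:
            script = failure["script"]
            failures.append((script, last_action.get(script), int(failure["code"])))
    return failures


if __name__ == "__main__":
    for script, action, code in failed_scripts(SAMPLE_LOG):
        print(f"{script} {action} returned {code}")
```

On the sample lines this reports a failed `start` for safe_asterisk and a failed `stop` for serctl, both with return code 1. Running each script by hand with the same argument and checking its exit status would show whether the scripts themselves or Heartbeat are at fault.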
> The error log I get from linux-xczz when I run heartbeat is this:
>
> heartbeat[1063]: 2007/03/15_14:26:12 WARN: Core dumps could be lost if 
> multiple dumps occur
> heartbeat[1063]: 2007/03/15_14:26:12 WARN: Consider setting 
> /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum 
> supportability
> heartbeat[1063]: 2007/03/15_14:26:12 WARN: Logging daemon is disabled 
> --enabling logging daemon is recommended
> heartbeat[1063]: 2007/03/15_14:26:12 info: **************************
> heartbeat[1063]: 2007/03/15_14:26:12 info: Configuration validated. 
> Starting heartbeat 2.0.7
> heartbeat[1064]: 2007/03/15_14:26:12 info: heartbeat: version 2.0.7
> heartbeat[1064]: 2007/03/15_14:26:12 info: Heartbeat generation: 130
> heartbeat[1064]: 2007/03/15_14:26:12 info: G_main_add_TriggerHandler: 
> Added signal manual handler
> heartbeat[1064]: 2007/03/15_14:26:12 info: G_main_add_TriggerHandler: 
> Added signal manual handler
> heartbeat[1064]: 2007/03/15_14:26:12 info: Removing 
> /usr/local/var/run/heartbeat/rsctmp failed, recreating.
> heartbeat[1064]: 2007/03/15_14:26:12 info: glib: UDP Broadcast 
> heartbeat started on port 694 (694) interface eth1
> heartbeat[1064]: 2007/03/15_14:26:12 info: glib: UDP Broadcast 
> heartbeat closed on port 694 interface eth1 - Status: 1
> heartbeat[1064]: 2007/03/15_14:26:12 info: glib: ping heartbeat started.
> heartbeat[1064]: 2007/03/15_14:26:12 info: G_main_add_SignalHandler: 
> Added signal handler for signal 17
> heartbeat[1064]: 2007/03/15_14:26:12 info: Local status now set to: 'up'
> heartbeat[1064]: 2007/03/15_14:26:13 info: Link linux-xczz:eth1 up.
> heartbeat[1064]: 2007/03/15_14:26:13 info: Link prueba2:eth1 up.
> heartbeat[1064]: 2007/03/15_14:26:13 info: Status update for node 
> prueba2: status active
> heartbeat[1064]: 2007/03/15_14:26:13 info: Link 
> xxx.xxx.x.x:xxx.xxx.x.x up.
> heartbeat[1064]: 2007/03/15_14:26:13 info: Status update for node 
> xxx.xxx.x.x: status ping
> harc[1073]:    2007/03/15_14:26:13 info: Running 
> /usr/local/etc/ha.d/rc.d/status status
> heartbeat[1064]: 2007/03/15_14:26:14 info: Comm_now_up(): updating 
> status to active
> heartbeat[1064]: 2007/03/15_14:26:14 info: Local status now set to: 
> 'active'
> heartbeat[1064]: 2007/03/15_14:26:14 info: Starting child client 
> "/usr/local/lib/heartbeat/ipfail" (1001,100)
> heartbeat[1084]: 2007/03/15_14:26:14 info: Starting 
> "/usr/local/lib/heartbeat/ipfail" as uid 1001  gid 100 (pid 1084)
> heartbeat[1064]: 2007/03/15_14:26:14 info: remote resource transition 
> completed.
> heartbeat[1064]: 2007/03/15_14:26:14 info: remote resource transition 
> completed.
> heartbeat[1064]: 2007/03/15_14:26:14 info: Local Resource acquisition 
> completed. (none)
> heartbeat[1064]: 2007/03/15_14:26:15 info: prueba2 wants to go standby 
> [foreign]
> heartbeat[1064]: 2007/03/15_14:26:15 info: standby: acquire [foreign] 
> resources from prueba2
> heartbeat[1088]: 2007/03/15_14:26:15 info: acquire local HA resources 
> (standby).
> ResourceManager[1098]:    2007/03/15_14:26:15 info: Acquiring resource 
> group: linux-xczz xxx.xxx.x.124/24 serctl
> IPaddr[1122]:    2007/03/15_14:26:16 INFO: IPaddr Resource is stopped
> ResourceManager[1098]:    2007/03/15_14:26:16 info: Running 
> /usr/local/etc/ha.d/resource.d/IPaddr 192.168.1.124/24 start
> IPaddr[1321]:    2007/03/15_14:26:16 INFO: eval /sbin/ifconfig eth0:0 
> xxx.xxx.x.124 netmask 255.255.255.0 broadcast xxx.xxx.x.255
> IPaddr[1321]:    2007/03/15_14:26:16 INFO: Sending Gratuitous Arp for 
> xxx.xxx.x.124 on eth0:0 [eth0]
> IPaddr[1321]:    2007/03/15_14:26:16 INFO: 
> /usr/local/lib/heartbeat/send_arp -i 500 -r 10 -p 
> /usr/local/var/run/heartbeat/rsctmp/send_arp/send_arp-xxx.xxx.x.124 
> eth0 xxx.xxx.x.124 auto xxx.xxx.x.124 ffffffffffff
> IPaddr[1241]:    2007/03/15_14:26:16 INFO: IPaddr Success
> ResourceManager[1098]:    2007/03/15_14:26:16 info: Running 
> /etc/init.d/serctl  start
> heartbeat[1088]: 2007/03/15_14:26:17 info: local HA resource 
> acquisition completed (standby).
> heartbeat[1064]: 2007/03/15_14:26:17 info: Standby resource 
> acquisition done [foreign].
> heartbeat[1064]: 2007/03/15_14:26:17 info: Initial resource 
> acquisition complete (auto_failback)
> heartbeat[1064]: 2007/03/15_14:26:23 info: remote resource transition 
> completed.
> heartbeat[1064]: 2007/03/15_14:26:28 info: linux-xczz wants to go 
> standby [foreign]
> heartbeat[1064]: 2007/03/15_14:26:28 info: standby: prueba2 can take 
> our foreign resources
> heartbeat[1492]: 2007/03/15_14:26:28 info: give up foreign HA 
> resources (standby).
> ResourceManager[1502]:    2007/03/15_14:26:28 info: Releasing resource 
> group: prueba2 xxx.xxx.x.125/24 safe_asterisk
> ResourceManager[1502]:    2007/03/15_14:26:28 info: Running 
> /etc/init.d/safe_asterisk  stop
> ResourceManager[1502]:    2007/03/15_14:26:28 info: Running 
> /usr/local/etc/ha.d/resource.d/IPaddr xxx.xxx.x.125/24 stop
> IPaddr[1561]:    2007/03/15_14:26:29 INFO: IPaddr Success
> heartbeat[1492]: 2007/03/15_14:26:29 info: foreign HA resource release 
> completed (standby).
> heartbeat[1064]: 2007/03/15_14:26:29 info: Local standby process 
> completed [foreign].
> heartbeat[1064]: 2007/03/15_14:26:30 WARN: 1 lost packet(s) for 
> [prueba2] [68:70]
> heartbeat[1064]: 2007/03/15_14:26:30 info: remote resource transition 
> completed.
> heartbeat[1064]: 2007/03/15_14:26:30 info: No pkts missing from prueba2!
> heartbeat[1064]: 2007/03/15_14:26:30 info: Other node completed 
> standby takeover of foreign resources.
> heartbeat[1064]: 2007/03/15_14:27:00 info: prueba2 wants to go standby 
> [foreign]
> heartbeat[1064]: 2007/03/15_14:27:11 info: standby: acquire [foreign] 
> resources from prueba2
> heartbeat[1784]: 2007/03/15_14:27:11 info: acquire local HA resources 
> (standby).
> ResourceManager[1794]:    2007/03/15_14:27:11 info: Acquiring resource 
> group: linux-xczz xxx.xxx.x.124/24 serctl
> IPaddr[1818]:    2007/03/15_14:27:11 INFO: IPaddr Running OK
> ResourceManager[1794]:    2007/03/15_14:27:11 info: Running 
> /etc/init.d/serctl  start
> ResourceManager[1794]:    2007/03/15_14:27:11 ERROR: Return code 1 
> from /etc/init.d/serctl
> ResourceManager[1794]:    2007/03/15_14:27:11 CRIT: Giving up 
> resources due to failure of serctl
> ResourceManager[1794]:    2007/03/15_14:27:11 info: Releasing resource 
> group: linux-xczz xxx.xxx.x.124/24 serctl
> ResourceManager[1794]:    2007/03/15_14:27:11 info: Running 
> /etc/init.d/serctl  stop
> ResourceManager[1794]:    2007/03/15_14:27:11 info: Running 
> /usr/local/etc/ha.d/resource.d/IPaddr xxx.xxx.x.124/24 stop
> IPaddr[2090]:    2007/03/15_14:27:12 INFO: /sbin/route -n del -host 
> xxx.xxx.x.124
> IPaddr[2090]:    2007/03/15_14:27:12 INFO: /sbin/ifconfig eth0:0 
> xxx.xxx.x.124 down
> IPaddr[2090]:    2007/03/15_14:27:12 INFO: IP Address xxx.xxx.x.124 
> released
> IPaddr[2006]:    2007/03/15_14:27:12 INFO: IPaddr Success
> heartbeat[1784]: 2007/03/15_14:27:12 info: local HA resource 
> acquisition completed (standby).
> heartbeat[1064]: 2007/03/15_14:27:12 info: Standby resource 
> acquisition done [foreign].
> heartbeat[1064]: 2007/03/15_14:27:12 info: remote resource transition 
> completed.
> hb_standby[2228]:    2007/03/15_14:27:42 Going standby [foreign].
> heartbeat[1064]: 2007/03/15_14:27:42 info: linux-xczz wants to go 
> standby [foreign]
> heartbeat[1064]: 2007/03/15_14:27:42 info: standby: prueba2 can take 
> our foreign resources
> heartbeat[2238]: 2007/03/15_14:27:42 info: give up foreign HA 
> resources (standby).
> ResourceManager[2248]:    2007/03/15_14:27:43 info: Releasing resource 
> group: prueba2 xxx.xxx.x.125/24 safe_asterisk
> ResourceManager[2248]:    2007/03/15_14:27:43 info: Running 
> /etc/init.d/safe_asterisk  stop
> ResourceManager[2248]:    2007/03/15_14:27:43 info: Running 
> /usr/local/etc/ha.d/resource.d/IPaddr xxx.xxx.x.125/24 stop
> IPaddr[2310]:    2007/03/15_14:27:43 INFO: IPaddr Success
> heartbeat[2238]: 2007/03/15_14:27:43 info: foreign HA resource release 
> completed (standby).
> heartbeat[1064]: 2007/03/15_14:27:43 info: Local standby process 
> completed [foreign].
> heartbeat[1064]: 2007/03/15_14:27:44 WARN: 1 lost packet(s) for 
> [prueba2] [114:116]
> heartbeat[1064]: 2007/03/15_14:27:44 info: remote resource transition 
> completed.
> heartbeat[1064]: 2007/03/15_14:27:44 info: No pkts missing from prueba2!
> heartbeat[1064]: 2007/03/15_14:27:44 info: Other node completed 
> standby takeover of foreign resources.
>
>
>
> What I am trying to achieve is ser running on linux-xczz and asterisk 
> running on prueba2, with failover configured on both machines. But 
> apparently the failover crashes and I lose both services when both 
> heartbeats are running. Any idea why this happens or what I'm doing 
> wrong?
>
> Can ser and asterisk be run by heartbeat with failover support?
>
> Thanks in advance
>
> _______________________________________________
> Serusers mailing list
> Serusers at lists.iptel.org
> http://lists.iptel.org/mailman/listinfo/serusers
>
>
