[Serusers] Heartbeat with ser and asterisk
Greger V. Teigre
greger at teigre.com
Tue Mar 20 09:57:01 CET 2007
I think you are pretty much on your own here. At least, I'm not able to
say anything meaningful about such a complex (and rather unusual) setup.
Do you have this working with SER only and with Asterisk only?
g-)
aespinoza at vivophone.com wrote:
> Hello people,
> I have a heartbeat cluster that manages ser 0.9.3 running on one
> machine and asterisk 1.2.3 running on another in an Active/Active,
> two-IP-address configuration with failover support. I have SUSE 10.1
> installed on both machines. My ha.cf files look like this:
> ###############################################
> logfile /var/log/ha-log
> logfacility local0
> keepalive 2
> deadtime 10
> warntime 10
> initdead 20
> udpport 694
> baud 19200
> bcast eth1
> ping xxx.xxx.x.x
> auto_failback on
> node linux-xczz
> node prueba2
> respawn hacluster /usr/local/lib/heartbeat/ipfail
> #################################################
>
>
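On the timing values above: a commonly recommended ordering is
keepalive < warntime < deadtime, with initdead at least twice deadtime.
Here warntime equals deadtime, so the late-heartbeat warning cannot fire
before the node is declared dead. This is probably not the cause of the
failover problem, but it is easy to check. A minimal Python sketch;
parse_ha_cf and timing_warnings are hypothetical helper names (not part
of heartbeat), and values are assumed to be plain seconds:

```python
def parse_ha_cf(text):
    """Return {directive: [values]} for ha.cf-style text, skipping comments."""
    directives = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split(None, 1)
        value = parts[1].strip() if len(parts) > 1 else ""
        directives.setdefault(parts[0], []).append(value)
    return directives


def timing_warnings(directives):
    """Check keepalive < warntime < deadtime and initdead >= 2 * deadtime.

    Assumption: every value is a plain number of seconds (no "ms" suffix).
    """
    names = ("keepalive", "warntime", "deadtime", "initdead")
    t = {name: float(directives[name][-1]) for name in names}
    problems = []
    if not t["keepalive"] < t["warntime"]:
        problems.append("warntime should exceed keepalive")
    if not t["warntime"] < t["deadtime"]:
        problems.append("warntime should be lower than deadtime")
    if t["initdead"] < 2 * t["deadtime"]:
        problems.append("initdead should be at least twice deadtime")
    return problems


# Timing directives copied from the ha.cf quoted above.
HA_CF = """\
keepalive 2
deadtime 10
warntime 10
initdead 20
"""

print(timing_warnings(parse_ha_cf(HA_CF)))
```

With the values as posted, only the warntime/deadtime rule is reported.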
> My haresources files look like this:
> ########################################
> prueba2 xxx.xxx.x.125/24 safe_asterisk
> linux-xczz xxx.xxx.x.124/24 serctl
> ########################################
>
>
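heartbeat expects haresources to be identical on both nodes. The log
from prueba2 further down names the second resource "asterisk-rosa",
while the log from linux-xczz names it "safe_asterisk", so it may be
worth comparing the two files. A minimal Python sketch of such a
comparison; the helper names are hypothetical, backslash line
continuation is not handled, and the second file content is only
inferred from the prueba2 log, not taken from the actual file:

```python
def parse_haresources(text):
    """Map preferred node -> list of resources from haresources-style text."""
    groups = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        node, *resources = line.split()
        groups[node] = resources
    return groups


def differing_nodes(a, b):
    """Return the nodes whose resource lists differ between two parsed files."""
    return sorted(n for n in set(a) | set(b) if a.get(n) != b.get(n))


# As posted in the message (addresses are masked in the original).
POSTED = """\
prueba2 xxx.xxx.x.125/24 safe_asterisk
linux-xczz xxx.xxx.x.124/24 serctl
"""

# Assumption: what prueba2's file might contain, judging from its log only.
INFERRED_PRUEBA2 = """\
prueba2 xxx.xxx.x.125/24 asterisk-rosa
linux-xczz xxx.xxx.x.124/24 serctl
"""

print(differing_nodes(parse_haresources(POSTED),
                      parse_haresources(INFERRED_PRUEBA2)))
```

An empty list would mean the two files describe the same resource
groups.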
> so "linux-xczz" is the master when running ser and "prueba2" is the
> master when running asterisk.
>
> The authkeys files are the same on both machines too, with the right
> permissions (mode 600). The /etc/hosts files look like this:
>
> ##########################
> 10.10.10.1 linux-xczz
> 10.10.10.2 prueba2
> ##########################
>
>
> The first time I run heartbeat on one machine (prueba2) with
> /etc/init.d/heartbeat start, both my services run fine. But when I
> then start heartbeat on the other machine (linux-xczz) so that it
> takes over the ser service, things go wrong. Once linux-xczz takes
> over ser, prueba2 also gives up the other resource (asterisk), which
> it should not do. Asterisk then appears on linux-xczz, only to
> disappear seconds later along with ser, leaving the cluster with no
> service running on either machine. The error log I get from prueba2
> is this:
>
> heartbeat[12609]: 2007/03/15_11:41:18 info: Link linux-xczz:eth1 up.
> heartbeat[12609]: 2007/03/15_11:41:18 info: Status update for node
> linux-xczz: status init
> heartbeat[12609]: 2007/03/15_11:41:18 info: Status update for node
> linux-xczz: status up
> harc[13730]: 2007/03/15_11:41:18 info: Running
> /etc/ha.d/rc.d/status status
> harc[13741]: 2007/03/15_11:41:18 info: Running
> /etc/ha.d/rc.d/status status
> heartbeat[12609]: 2007/03/15_11:41:19 info: Status update for node
> linux-xczz: status active
> harc[13754]: 2007/03/15_11:41:19 info: Running
> /etc/ha.d/rc.d/status status
> heartbeat[12609]: 2007/03/15_11:41:19 info: remote resource transition
> completed.
> heartbeat[12609]: 2007/03/15_11:41:19 info: prueba2 wants to go
> standby [foreign]
> heartbeat[12609]: 2007/03/15_11:41:20 info: standby: linux-xczz can
> take our foreign resources
> heartbeat[13767]: 2007/03/15_11:41:20 info: give up foreign HA
> resources (standby).
> ResourceManager[13777]: 2007/03/15_11:41:20 info: Releasing resource
> group: linux-xczz xxx.xxx.x.124/24 serctl
> ResourceManager[13777]: 2007/03/15_11:41:20 info: Running
> /etc/init.d/serctl stop
> ResourceManager[13777]: 2007/03/15_11:41:20 info: Running
> /etc/ha.d/resource.d/IPaddr xxx.xxx.x.124/24 stop
> IPaddr[13915]: 2007/03/15_11:41:20 INFO: /sbin/route -n del -host
> xxx.xxx.x.124
> IPaddr[13915]: 2007/03/15_11:41:20 INFO: /sbin/ifconfig eth0:0
> xxx.xxx.x.124 down
> IPaddr[13915]: 2007/03/15_11:41:20 INFO: IP Address xxx.xxx.x.124
> released
> IPaddr[13836]: 2007/03/15_11:41:20 INFO: IPaddr Success
> heartbeat[13767]: 2007/03/15_11:41:20 info: foreign HA resource
> release completed (standby).
> heartbeat[12609]: 2007/03/15_11:41:20 info: Local standby process
> completed [foreign].
> heartbeat[12609]: 2007/03/15_11:41:23 WARN: 1 lost packet(s) for
> [linux-xczz] [13:15]
> heartbeat[12609]: 2007/03/15_11:41:23 info: remote resource transition
> completed.
> heartbeat[12609]: 2007/03/15_11:41:23 info: No pkts missing from
> linux-xczz!
> heartbeat[12609]: 2007/03/15_11:41:23 info: Other node completed
> standby takeover of foreign resources.
> heartbeat[12609]: 2007/03/15_11:41:35 info: linux-xczz wants to go
> standby [foreign]
> heartbeat[12609]: 2007/03/15_11:41:36 info: standby: acquire [foreign]
> resources from linux-xczz
> heartbeat[14011]: 2007/03/15_11:41:36 info: acquire local HA resources
> (standby).
> ResourceManager[14021]: 2007/03/15_11:41:36 info: Acquiring resource
> group: prueba2 xxx.xxx.x.125/24 asterisk-rosa
> IPaddr[14048]: 2007/03/15_11:41:36 INFO: IPaddr Running OK
> ResourceManager[14021]: 2007/03/15_11:41:36 info: Running
> /etc/init.d/safe_asterisk start
> ResourceManager[14021]: 2007/03/15_11:41:36 ERROR: Return code 1 from
> /etc/init.d/safe_asterisk
> ResourceManager[14021]: 2007/03/15_11:41:36 CRIT: Giving up resources
> due to failure of safe_asterisk
> ResourceManager[14021]: 2007/03/15_11:41:36 info: Releasing resource
> group: prueba2 xxx.xxx.x.125/24 asterisk-rosa
> ResourceManager[14021]: 2007/03/15_11:41:36 info: Running
> /etc/init.d/safe_asterisk stop
> ResourceManager[14021]: 2007/03/15_11:41:37 info: Running
> /etc/ha.d/resource.d/IPaddr xxx.xxx.x.125/24 stop
> IPaddr[14310]: 2007/03/15_11:41:37 INFO: /sbin/route -n del -host
> xxx.xxx.x.125
> IPaddr[14310]: 2007/03/15_11:41:37 INFO: /sbin/ifconfig eth0:2
> xxx.xxx.x.125 down
> IPaddr[14310]: 2007/03/15_11:41:37 INFO: IP Address xxx.xxx.x.125
> released
> IPaddr[14231]: 2007/03/15_11:41:37 INFO: IPaddr Success
> heartbeat[14011]: 2007/03/15_11:41:37 info: local HA resource
> acquisition completed (standby).
> heartbeat[12609]: 2007/03/15_11:41:37 info: Standby resource
> acquisition done [foreign].
> heartbeat[12609]: 2007/03/15_11:41:37 info: remote resource transition
> completed.
> heartbeat[12609]: 2007/03/15_11:41:38 WARN: G_CH_dispatch_int:
> Dispatch function for read child took too long to execute: 520 ms (>
> 50 ms)
> (GSource: 0x80fbe00)
> hb_standby[14375]: 2007/03/15_11:42:07 Going standby [foreign].
> heartbeat[12609]: 2007/03/15_11:42:07 info: prueba2 wants to go
> standby [foreign]
> heartbeat[12609]: 2007/03/15_11:42:08 info: standby: linux-xczz can
> take our foreign resources
> heartbeat[14385]: 2007/03/15_11:42:08 info: give up foreign HA
> resources (standby).
> ResourceManager[14395]: 2007/03/15_11:42:08 info: Releasing resource
> group: linux-xczz xxx.xxx.x.124/24 serctl
> ResourceManager[14395]: 2007/03/15_11:42:08 info: Running
> /etc/init.d/serctl stop
> ResourceManager[14395]: 2007/03/15_11:42:08 ERROR: Return code 1 from
> /etc/init.d/serctl
>
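In the log above, the cascade starts when a start script returns 1:
heartbeat treats the nonzero return code as a failure and gives up the
whole resource group. One possible reading, not confirmed here, is that
the script returns nonzero when the service is already running. To pull
just those failures out of a long, mail-wrapped log, a minimal Python
sketch could look like this; join_wrapped and failures are hypothetical
helper names, and the line format is assumed from the log as quoted:

```python
import re

# One log entry: "proc[pid]: timestamp level: message"
ENTRY = re.compile(
    r"^(?P<proc>[\w-]+)\[(?P<pid>\d+)\]: (?P<stamp>\S+) (?P<level>\w+): (?P<msg>.*)$"
)
ENTRY_START = re.compile(r"^[\w-]+\[\d+\]: ")


def join_wrapped(lines):
    """Strip '> ' quoting and glue mail-wrapped continuation lines back on."""
    entries = []
    for raw in lines:
        line = raw.lstrip("> ").rstrip()
        if not line:
            continue
        if ENTRY_START.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line
    return entries


def failures(lines):
    """Return (timestamp, process, message) for every ERROR or CRIT entry."""
    found = []
    for entry in join_wrapped(lines):
        m = ENTRY.match(entry)
        if m and m.group("level") in ("ERROR", "CRIT"):
            found.append((m.group("stamp"), m.group("proc"), m.group("msg")))
    return found


# Excerpt copied from the prueba2 log quoted above.
LOG = """\
> ResourceManager[14021]: 2007/03/15_11:41:36 info: Running
> /etc/init.d/safe_asterisk start
> ResourceManager[14021]: 2007/03/15_11:41:36 ERROR: Return code 1 from
> /etc/init.d/safe_asterisk
> ResourceManager[14021]: 2007/03/15_11:41:36 CRIT: Giving up resources
> due to failure of safe_asterisk
"""

for stamp, proc, msg in failures(LOG.splitlines()):
    print(stamp, proc, msg)
```

Running each reported script by hand (start while already running, stop
while already stopped) and checking its exit status would show whether
it returns nonzero in those cases.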
>
>
>
>
>
>
> The error log I get from linux-xczz when I run heartbeat is this:
>
> heartbeat[1063]: 2007/03/15_14:26:12 WARN: Core dumps could be lost if
> multiple dumps occur
> heartbeat[1063]: 2007/03/15_14:26:12 WARN: Consider setting
> /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum
> supportability
> heartbeat[1063]: 2007/03/15_14:26:12 WARN: Logging daemon is disabled
> --enabling logging daemon is recommended
> heartbeat[1063]: 2007/03/15_14:26:12 info: **************************
> heartbeat[1063]: 2007/03/15_14:26:12 info: Configuration validated.
> Starting heartbeat 2.0.7
> heartbeat[1064]: 2007/03/15_14:26:12 info: heartbeat: version 2.0.7
> heartbeat[1064]: 2007/03/15_14:26:12 info: Heartbeat generation: 130
> heartbeat[1064]: 2007/03/15_14:26:12 info: G_main_add_TriggerHandler:
> Added signal manual handler
> heartbeat[1064]: 2007/03/15_14:26:12 info: G_main_add_TriggerHandler:
> Added signal manual handler
> heartbeat[1064]: 2007/03/15_14:26:12 info: Removing
> /usr/local/var/run/heartbeat/rsctmp failed, recreating.
> heartbeat[1064]: 2007/03/15_14:26:12 info: glib: UDP Broadcast
> heartbeat started on port 694 (694) interface eth1
> heartbeat[1064]: 2007/03/15_14:26:12 info: glib: UDP Broadcast
> heartbeat closed on port 694 interface eth1 - Status: 1
> heartbeat[1064]: 2007/03/15_14:26:12 info: glib: ping heartbeat started.
> heartbeat[1064]: 2007/03/15_14:26:12 info: G_main_add_SignalHandler:
> Added signal handler for signal 17
> heartbeat[1064]: 2007/03/15_14:26:12 info: Local status now set to: 'up'
> heartbeat[1064]: 2007/03/15_14:26:13 info: Link linux-xczz:eth1 up.
> heartbeat[1064]: 2007/03/15_14:26:13 info: Link prueba2:eth1 up.
> heartbeat[1064]: 2007/03/15_14:26:13 info: Status update for node
> prueba2: status active
> heartbeat[1064]: 2007/03/15_14:26:13 info: Link
> xxx.xxx.x.x:xxx.xxx.x.x up.
> heartbeat[1064]: 2007/03/15_14:26:13 info: Status update for node
> xxx.xxx.x.x: status ping
> harc[1073]: 2007/03/15_14:26:13 info: Running
> /usr/local/etc/ha.d/rc.d/status status
> heartbeat[1064]: 2007/03/15_14:26:14 info: Comm_now_up(): updating
> status to active
> heartbeat[1064]: 2007/03/15_14:26:14 info: Local status now set to:
> 'active'
> heartbeat[1064]: 2007/03/15_14:26:14 info: Starting child client
> "/usr/local/lib/heartbeat/ipfail" (1001,100)
> heartbeat[1084]: 2007/03/15_14:26:14 info: Starting
> "/usr/local/lib/heartbeat/ipfail" as uid 1001 gid 100 (pid 1084)
> heartbeat[1064]: 2007/03/15_14:26:14 info: remote resource transition
> completed.
> heartbeat[1064]: 2007/03/15_14:26:14 info: remote resource transition
> completed.
> heartbeat[1064]: 2007/03/15_14:26:14 info: Local Resource acquisition
> completed. (none)
> heartbeat[1064]: 2007/03/15_14:26:15 info: prueba2 wants to go standby
> [foreign]
> heartbeat[1064]: 2007/03/15_14:26:15 info: standby: acquire [foreign]
> resources from prueba2
> heartbeat[1088]: 2007/03/15_14:26:15 info: acquire local HA resources
> (standby).
> ResourceManager[1098]: 2007/03/15_14:26:15 info: Acquiring resource
> group: linux-xczz xxx.xxx.x.124/24 serctl
> IPaddr[1122]: 2007/03/15_14:26:16 INFO: IPaddr Resource is stopped
> ResourceManager[1098]: 2007/03/15_14:26:16 info: Running
> /usr/local/etc/ha.d/resource.d/IPaddr 192.168.1.124/24 start
> IPaddr[1321]: 2007/03/15_14:26:16 INFO: eval /sbin/ifconfig eth0:0
> xxx.xxx.x.124 netmask 255.255.255.0 broadcast xxx.xxx.x.255
> IPaddr[1321]: 2007/03/15_14:26:16 INFO: Sending Gratuitous Arp for
> xxx.xxx.x.124 on eth0:0 [eth0]
> IPaddr[1321]: 2007/03/15_14:26:16 INFO:
> /usr/local/lib/heartbeat/send_arp -i 500 -r 10 -p
> /usr/local/var/run/heartbeat/rsctmp/send_arp/send_arp-xxx.xxx.x.124
> eth0 xxx.xxx.x.124 auto xxx.xxx.x.124 ffffffffffff
> IPaddr[1241]: 2007/03/15_14:26:16 INFO: IPaddr Success
> ResourceManager[1098]: 2007/03/15_14:26:16 info: Running
> /etc/init.d/serctl start
> heartbeat[1088]: 2007/03/15_14:26:17 info: local HA resource
> acquisition completed (standby).
> heartbeat[1064]: 2007/03/15_14:26:17 info: Standby resource
> acquisition done [foreign].
> heartbeat[1064]: 2007/03/15_14:26:17 info: Initial resource
> acquisition complete (auto_failback)
> heartbeat[1064]: 2007/03/15_14:26:23 info: remote resource transition
> completed.
> heartbeat[1064]: 2007/03/15_14:26:28 info: linux-xczz wants to go
> standby [foreign]
> heartbeat[1064]: 2007/03/15_14:26:28 info: standby: prueba2 can take
> our foreign resources
> heartbeat[1492]: 2007/03/15_14:26:28 info: give up foreign HA
> resources (standby).
> ResourceManager[1502]: 2007/03/15_14:26:28 info: Releasing resource
> group: prueba2 xxx.xxx.x.125/24 safe_asterisk
> ResourceManager[1502]: 2007/03/15_14:26:28 info: Running
> /etc/init.d/safe_asterisk stop
> ResourceManager[1502]: 2007/03/15_14:26:28 info: Running
> /usr/local/etc/ha.d/resource.d/IPaddr xxx.xxx.x.125/24 stop
> IPaddr[1561]: 2007/03/15_14:26:29 INFO: IPaddr Success
> heartbeat[1492]: 2007/03/15_14:26:29 info: foreign HA resource release
> completed (standby).
> heartbeat[1064]: 2007/03/15_14:26:29 info: Local standby process
> completed [foreign].
> heartbeat[1064]: 2007/03/15_14:26:30 WARN: 1 lost packet(s) for
> [prueba2] [68:70]
> heartbeat[1064]: 2007/03/15_14:26:30 info: remote resource transition
> completed.
> heartbeat[1064]: 2007/03/15_14:26:30 info: No pkts missing from prueba2!
> heartbeat[1064]: 2007/03/15_14:26:30 info: Other node completed
> standby takeover of foreign resources.
> heartbeat[1064]: 2007/03/15_14:27:00 info: prueba2 wants to go standby
> [foreign]
> heartbeat[1064]: 2007/03/15_14:27:11 info: standby: acquire [foreign]
> resources from prueba2
> heartbeat[1784]: 2007/03/15_14:27:11 info: acquire local HA resources
> (standby).
> ResourceManager[1794]: 2007/03/15_14:27:11 info: Acquiring resource
> group: linux-xczz xxx.xxx.x.124/24 serctl
> IPaddr[1818]: 2007/03/15_14:27:11 INFO: IPaddr Running OK
> ResourceManager[1794]: 2007/03/15_14:27:11 info: Running
> /etc/init.d/serctl start
> ResourceManager[1794]: 2007/03/15_14:27:11 ERROR: Return code 1
> from /etc/init.d/serctl
> ResourceManager[1794]: 2007/03/15_14:27:11 CRIT: Giving up
> resources due to failure of serctl
> ResourceManager[1794]: 2007/03/15_14:27:11 info: Releasing resource
> group: linux-xczz xxx.xxx.x.124/24 serctl
> ResourceManager[1794]: 2007/03/15_14:27:11 info: Running
> /etc/init.d/serctl stop
> ResourceManager[1794]: 2007/03/15_14:27:11 info: Running
> /usr/local/etc/ha.d/resource.d/IPaddr xxx.xxx.x.124/24 stop
> IPaddr[2090]: 2007/03/15_14:27:12 INFO: /sbin/route -n del -host
> xxx.xxx.x.124
> IPaddr[2090]: 2007/03/15_14:27:12 INFO: /sbin/ifconfig eth0:0
> xxx.xxx.x.124 down
> IPaddr[2090]: 2007/03/15_14:27:12 INFO: IP Address xxx.xxx.x.124
> released
> IPaddr[2006]: 2007/03/15_14:27:12 INFO: IPaddr Success
> heartbeat[1784]: 2007/03/15_14:27:12 info: local HA resource
> acquisition completed (standby).
> heartbeat[1064]: 2007/03/15_14:27:12 info: Standby resource
> acquisition done [foreign].
> heartbeat[1064]: 2007/03/15_14:27:12 info: remote resource transition
> completed.
> hb_standby[2228]: 2007/03/15_14:27:42 Going standby [foreign].
> heartbeat[1064]: 2007/03/15_14:27:42 info: linux-xczz wants to go
> standby [foreign]
> heartbeat[1064]: 2007/03/15_14:27:42 info: standby: prueba2 can take
> our foreign resources
> heartbeat[2238]: 2007/03/15_14:27:42 info: give up foreign HA
> resources (standby).
> ResourceManager[2248]: 2007/03/15_14:27:43 info: Releasing resource
> group: prueba2 xxx.xxx.x.125/24 safe_asterisk
> ResourceManager[2248]: 2007/03/15_14:27:43 info: Running
> /etc/init.d/safe_asterisk stop
> ResourceManager[2248]: 2007/03/15_14:27:43 info: Running
> /usr/local/etc/ha.d/resource.d/IPaddr xxx.xxx.x.125/24 stop
> IPaddr[2310]: 2007/03/15_14:27:43 INFO: IPaddr Success
> heartbeat[2238]: 2007/03/15_14:27:43 info: foreign HA resource release
> completed (standby).
> heartbeat[1064]: 2007/03/15_14:27:43 info: Local standby process
> completed [foreign].
> heartbeat[1064]: 2007/03/15_14:27:44 WARN: 1 lost packet(s) for
> [prueba2] [114:116]
> heartbeat[1064]: 2007/03/15_14:27:44 info: remote resource transition
> completed.
> heartbeat[1064]: 2007/03/15_14:27:44 info: No pkts missing from prueba2!
> heartbeat[1064]: 2007/03/15_14:27:44 info: Other node completed
> standby takeover of foreign resources.
>
>
>
> What I am trying to achieve is ser running on linux-xczz and asterisk
> running on prueba2, with failover configured on both machines. But
> apparently the failover breaks and I lose both services when both
> heartbeats are running. Any idea why this happens or what I'm doing
> wrong?
>
> Can ser and asterisk be run by heartbeat with failover support?
>
> Thanks in advance
>