[SR-Users] problem unreferencing dialog in dialog module

Anton Roman antonroman at gmail.com
Wed Mar 2 16:34:45 CET 2011


Hi all,

we are running Kamailio 3.1.2 in a production environment, using the dialog
module, and it crashed two hours ago.


Here are the logs we got (additional log fragments with the acc records
involved in this call are appended at the end of the mail):

Mar  2 14:43:05 kamailio2 /usr/local/sbin/kamailio[28927]: CRITICAL: dialog
[dlg_hash.c:599]: bogus ref -1 with cnt 1 for dlg 0x7f23f472db30
[2490:1070436595] with clid 'e0a20cb844d211e0acd8001422093865@<CLIENT IP>'
and tags '1577886432-3759264324-335599788-1698171170' ''
Mar  2 14:43:05 kamailio2 /usr/local/sbin/kamailio[28927]: : <core>
[mem/q_malloc.c:446]: BUG: qm_free: freeing already freed pointer, first
free: dialog: dlg_cb.c: destroy_dlg_callbacks_list(80) - aborting
Mar  2 14:43:05 kamailio2 /usr/local/sbin/kamailio[28896]: ALERT: <core>
[main.c:741]: child process 28927 exited by a signal 6
Mar  2 14:43:05 kamailio2 /usr/local/sbin/kamailio[28896]: ALERT: <core>
[main.c:744]: core was not generated
Mar  2 14:43:05 kamailio2 /usr/local/sbin/kamailio[28896]: INFO: <core>
[main.c:756]: INFO: terminating due to SIGCHLD
Mar  2 14:43:05 kamailio2 /usr/local/sbin/kamailio[28948]: INFO: <core>
[main.c:807]: INFO: signal 15 received
Mar  2 14:43:05 kamailio2 /usr/local/sbin/kamailio[28942]: INFO: <core>
[main.c:807]: INFO: signal 15 received

We got the Kamailio code from git last week:

sercmd> core.info
{
    version: kamailio 3.1.2
    id: 4ace86
    compiler: gcc 4.3.2
    compiled: 09:12:36 Feb 23 2011
    flags: STATS: Off, USE_IPV6, USE_TCP, USE_TLS, TLS_HOOKS, USE_RAW_SOCKS,
DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MEM, SHM_MMAP, PKG_MALLOC,
DBG_QM_MALLOC, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE,
USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES
}

The problem looks like this other one already fixed:
http://lists.sip-router.org/pipermail/sr-users/2009-November/027351.html

We have set Kamailio to debug level in case it happens again.

On the other hand, I need to know why the core is not being generated. I have
already checked the points mentioned in
http://www.kamailio.org/dokuwiki/doku.php/troubleshooting:corefiles

1. disable_core_dump is not set in the config file.

2. From /etc/default/kamailio:
...
DUMP_CORE=yes
...

3. From /etc/init.d/kamailio:
...
if test "$DUMP_CORE" = "yes" ; then
    # set proper ulimit
    ulimit -c unlimited

    # directory for the core dump files
     COREDIR=/home/corefiles
     [ -d $COREDIR ] || mkdir $COREDIR
     chmod 777 $COREDIR
     echo "$COREDIR/core.%e.sig%s.%p" > /proc/sys/kernel/core_pattern
fi
...

4. Write permissions on $COREDIR

ls -hall /home
...
drwxrwxrwx  2 root   root   4.0K 2010-12-21 09:15 corefiles
...
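
The points above describe what the init script asks for, not what the running processes actually ended up with. A minimal sketch of how the effective values could be read back, assuming a Linux system with /proc mounted (the function names core_limit_of and report_core_settings are only illustrative, not part of Kamailio):

```shell
#!/bin/sh
# Sketch: report the settings that decide whether a process may dump core.
# Assumes Linux with /proc mounted; both function names are illustrative.

# Print the soft core-size limit of a running process.
# /proc/<pid>/limits has the columns: name, soft limit, hard limit, units.
core_limit_of() {
    awk '/^Max core file size/ { print $5 }' "/proc/$1/limits"
}

# Print the per-process limit plus the two relevant kernel-wide settings.
report_core_settings() {
    pid=$1
    echo "soft core limit of $pid: $(core_limit_of "$pid")"
    echo "core_pattern: $(cat /proc/sys/kernel/core_pattern)"
    # 0 means a process that changed its user ID does not dump core
    echo "suid_dumpable: $(cat /proc/sys/fs/suid_dumpable)"
}

# Example call for one Kamailio process (run as root):
#   report_core_settings "$(pidof -s kamailio)"
```

A soft limit of 0 would mean the "ulimit -c unlimited" from the init script did not reach the process. A suid_dumpable value of 0 may matter if Kamailio is started with -u to switch to an unprivileged user, which is an assumption about this setup.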

What else should I check?

Thanks in advance,
regards

Antón


*Acc records related to the dialog whose destruction causes the problem:*

Mar  2 14:42:44 kamailio2 /usr/local/sbin/kamailio[28902]: NOTICE: acc
[acc.c:275]: ACC: transaction answered:
timestamp=1299073364;method=INVITE;from_tag=1577886432-3759264324-335599788-1698171170;to_tag=5FFAEA34-6A;call_id=e0a20cb844d211e0acd8001422093865@<client
IP>;code=200;reason=OK;src_user=<caller number>;src_domain=<client
IP>;dst_ouser=<called
number>;dst_user=<called number>;dst_domain=10.90.1.251;src_ip=<client IP>

...

Mar  2 14:42:44 kamailio2 /usr/local/sbin/kamailio[28920]: NOTICE: acc
[acc.c:275]: ACC: request acknowledged:
timestamp=1299073364;method=ACK;from_tag=1577886432-3759264324-335599788-1698171170;to_tag=5FFAEA34-6A;call_id=e0a20cb844d211e0acd8001422093865@<client
IP>;code=200;reason=OK;src_user=<caller number>;src_domain=<client
IP>;dst_ouser=<called number>;dst_user=<called
number>;dst_domain=10.90.1.251;src_ip=<client IP>
...


Mar  2 14:43:00 kamailio2 /usr/local/sbin/kamailio[28903]: ERROR: <script>:
ACK WITHOUT MATCHING TRANSACTION in e0a20cb844d211e0acd8001422093865@<client
IP> call... ignore and discard.

...

Mar  2 14:43:00 kamailio2 /usr/local/sbin/kamailio[28904]: NOTICE: acc
[acc.c:275]: ACC: transaction answered:
timestamp=1299073380;method=BYE;from_tag=1577886432-3759264324-335599788-1698171170;to_tag=5FFAEA34-6A;call_id=e0a20cb844d211e0acd8001422093865@<client
IP>;code=200;reason=OK;src_user=<caller number>;src_domain=<client
IP>;dst_ouser=<called number>;dst_user=<called
number>;dst_domain=10.90.1.251;src_ip=<client IP>
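
Records like the ones above can be pulled out of the log by filtering on the Call-ID. A minimal sketch, where lines_for_callid is an illustrative helper name and the log file path in the example is an assumption:

```shell
#!/bin/sh
# Sketch: print every log line that mentions a given Call-ID.
# lines_for_callid is an illustrative helper name.
lines_for_callid() {
    # -F matches the Call-ID as a fixed string, so '.' and similar
    # characters are not interpreted as regular-expression operators
    grep -F -- "$1" "$2"
}

# Example (the log path is an assumption):
#   lines_for_callid e0a20cb844d211e0acd8001422093865 /var/log/syslog
```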