Hi All,

I wonder if someone could help me diagnose a recurring crash?

It happens at random intervals and under varying load, but just before each crash I always see the same critical error in the log:

CRITICAL: dialog [dlg_hash.c:841]: log_next_state_dlg(): bogus event 6 in state 1 for dlg 0xb0632134 [1367:5814] with clid '0695dd7a346188dd24e7520e6c01092c@sip.sipcentric.com' and tags 'as77c89620' ''


Analysing the core dump reveals:

Core was generated by `/usr/sbin/kamailio -P /var/run/kamailio.pid -m 128 -M 4 -u kamailio -g kamailio'.
Program terminated with signal 11, Segmentation fault.
#0  0x081a737c in parse_uri (buf=0x3a70006e <Address 0x3a70006e out of bounds>, len=275, uri=0xbfa2fd2c) at parser/parse_uri.c:389
389 scheme=buf[0]+(buf[1]<<8)+(buf[2]<<16)+(buf[3]<<24);

(gdb) frame 1

#1  0x008fe5dd in dialog_publish (state=0x903f37 "Trying", ruri=0xb0b5fd00, entity=0xb0632188, peer=0xb0632190, callid=0xb0632180, initiator=1, lifetime=7200, localtag=0x0, remotetag=0x0, localtarget=0x0, 
    remotetarget=0x0, do_pubruri_localcheck=1) at dialog_publish.c:275
275 if (parse_uri(ruri->s, ruri->len, &ruri_uri) < 0) {

(gdb) p *ruri

$1 = {s = 0x3a70006e <Address 0x3a70006e out of bounds>, len = 275}

(gdb) up

#2  0x008ff277 in dialog_publish_multi (state=0x903f37 "Trying", ruris=0xb0b5fd00, entity=0xb0632188, peer=0xb0632190, callid=0xb0632180, initiator=1, lifetime=7200, localtag=0x0, remotetag=0x0, 
    localtarget=0x0, remotetarget=0x0, do_pubruri_localcheck=1) at dialog_publish.c:387
387 dialog_publish(state,&(ruris->s),entity,peer,callid,initiator,lifetime,localtag,remotetag,localtarget,remotetarget,do_pubruri_localcheck);

(gdb) p *ruris

$2 = {s = {s = 0x3a70006e <Address 0x3a70006e out of bounds>, len = 275}, next = 0x0}

(gdb) up

#3  0x0090187a in __dialog_created (dlg=0xb0632134, type=2, _params=0x6db064) at pua_dialoginfo.c:470
470 dialog_publish_multi("Trying", dlginfo->pubruris_caller, &(dlg->from_uri), (include_req_uri)?&(dlg->req_uri):&(dlg->to_uri), &(dlg->callid), 1, dlginfo->lifetime, 0, 0, 0, 0, send_publish_flag==-1?1:0);

(gdb) p *dlginfo->pubruris_caller

$3 = {s = {s = 0x31590014 <Address 0x31590014 out of bounds>, len = 275}, next = 0xb0b5fd00}

(gdb) p *dlginfo->pubruris_caller->next

$4 = {s = {s = 0x3a70006e <Address 0x3a70006e out of bounds>, len = 275}, next = 0x0}


In our config, we enable the pua_dialoginfo option "use_pubruri_avps" and set "pubruri_caller_avp" and "pubruri_callee_avp" accordingly.
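
For reference, the relevant settings look roughly like the fragment below. This is only a sketch: the AVP names are placeholders for illustration, not the ones from our production config.

```
# pua_dialoginfo: take the PUBLISH request URIs from AVPs
modparam("pua_dialoginfo", "use_pubruri_avps", 1)
# placeholder AVP names, set per call in the routing script
modparam("pua_dialoginfo", "pubruri_caller_avp", "$avp(pubruri_caller)")
modparam("pua_dialoginfo", "pubruri_callee_avp", "$avp(pubruri_callee)")
```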

Therefore, pua_dialoginfo.c uses the get_str_list() function to populate dlginfo->pubruris_caller from the AVP.

Could this be some race condition or something completely different?

Thanks in advance,

Charles


