[sr-dev] [Redis module] Kamailio crashes in case of connection lost to redis server

Vicente Hernando vhernando at systemonenoc.com
Thu Nov 28 12:36:25 CET 2013


Hello Nguyen,

I have uploaded the patch in devel, 4.0, and 4.1 versions.
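
For readers following the thread, the class of variadic-function bug mentioned below can be sketched as follows. This is a minimal illustration, not the committed patch: `send_command`, `reconnect`, `exec_with_retry`, and `fail_next_calls` are hypothetical stand-ins, and it assumes the crash came from passing the same `va_list` to a v*-style function twice, as the quoted snippet does with `ap`. In C, once a `va_list` has been handed to a function that reads arguments from it, its value is indeterminate, so a retry needs its own copy made with `va_copy`.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical knob: how many upcoming calls should report failure,
 * simulating a lost connection. */
int fail_next_calls = 0;

/* Hypothetical stand-in for a v*-style library call such as
 * redisvCommand(): it reads the arguments out of `ap`. */
static int send_command(char *out, size_t outlen, const char *fmt, va_list ap)
{
    int n = vsnprintf(out, outlen, fmt, ap); /* consumes ap */
    if (fail_next_calls > 0) {
        fail_next_calls--;
        return -1;
    }
    return n;
}

/* Hypothetical reconnect hook; always succeeds in this sketch. */
static int reconnect(void)
{
    return 0;
}

/* Run a command and retry once after a reconnect, giving each
 * attempt its own va_list. */
int exec_with_retry(char *out, size_t outlen, const char *fmt, ...)
{
    va_list ap, ap_retry;
    int rc;

    assert(fmt != NULL);
    va_start(ap, fmt);
    va_copy(ap_retry, ap); /* copy BEFORE the first attempt uses ap */

    rc = send_command(out, outlen, fmt, ap);
    if (rc < 0 && reconnect() == 0) {
        /* ap is indeterminate here; use the untouched copy instead */
        rc = send_command(out, outlen, fmt, ap_retry);
    }

    va_end(ap_retry);
    va_end(ap);
    return rc;
}
```

Setting `fail_next_calls` to 1 and calling `exec_with_retry(buf, sizeof buf, "PUBLISH %s %s", "notification", "hello")` exercises the retry path with a fresh argument list.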


Regards,
Vicente.

On 11/28/2013 12:07 PM, Tuan Viet Nguyen wrote:
> Hi Vicente,
>
> It works now. Thank you for the patch. In which version will it be
> integrated?
>
>
> Regards,
>
>
> On Thu, Nov 28, 2013 at 11:36 AM, Vicente Hernando 
> <vhernando at systemonenoc.com <mailto:vhernando at systemonenoc.com>> wrote:
>
>     Hello,
>
>     could you test this patch and confirm the bug has disappeared?
>
>     Thanks,
>     Vicente.
>
>
>     On 11/28/2013 11:10 AM, Tuan Viet Nguyen wrote:
>>     Hi Vicente,
>>
>>     Thank you for your quick reply.
>>
>>     I'm ready to retest the patch.
>>
>>     Regards,
>>
>>
>>     On Thu, Nov 28, 2013 at 11:07 AM, Vicente Hernando
>>     <vhernando at systemonenoc.com <mailto:vhernando at systemonenoc.com>>
>>     wrote:
>>
>>         Hello,
>>
>>         I think you have discovered a bug I made using variadic
>>         functions.
>>
>>         I will send a patch to correct it very soon.
>>
>>
>>         Thanks,
>>         Vicente.
>>
>>
>>         On 11/28/2013 10:14 AM, Tuan Viet Nguyen wrote:
>>>         Hello Vicente,
>>>
>>>         Thank you for your reply. You'll find my answers below.
>>>
>>>         On Thu, Nov 28, 2013 at 12:03 AM, Vicente Hernando
>>>         <vhernando at systemonenoc.com
>>>         <mailto:vhernando at systemonenoc.com>> wrote:
>>>
>>>             Hello,
>>>
>>>             also full steps to crash kamailio and reproduce the
>>>             error would be good.
>>>
>>>
>>>         Here is the architecture
>>>
>>>         A <--> Asterisk <--> Kamailio 1 <---> kamailio2 <--- ISP--->
>>>         mobile
>>>
>>>         Kamailio 1 & 2 are connected to a local redis server
>>>         1/ I restarted the redis server
>>>         2/ From the mobile I made a call to A, then cancelled it. In
>>>         the script of kamailio1, if a call is missed or fails, it
>>>         sends a message to redis. In this case, it crashes.
>>>
>>>
>>>
>>>
>>>
>>>             On 11/27/2013 11:35 PM, Daniel-Constantin Mierla wrote:
>>>>             Hello,
>>>>
>>>>             can you give the full output of 'bt full' from gdb on
>>>>             the core file? You gave only a partial list of the
>>>>             frames, which is not enough to see the execution trace.
>>>>
>>>>             Cheers,
>>>>             Daniel
>>>>
>>>>             On 11/27/13 6:52 PM, Tuan Viet Nguyen wrote:
>>>>>             Hello,
>>>>>
>>>>>             I tried shutting down the redis server to test the
>>>>>             behavior of kamailio, and it crashed when a call was
>>>>>             received and then cancelled.
>>>>>
>>>>>             *1/The kamailio version is 4.0.4*
>>>>>
>>>>>             *2/ Kamailio log *
>>>>>             /usr/local/sbin/kamailio[25333]: ERROR: ndb_redis
>>>>>             [redis_client.c:364]: redisc_exec(): Redis error:
>>>>>             Server closed the connection
>>>>>             /usr/local/sbin/kamailio[25361]: : <core>
>>>>>             [pass_fd.c:293]: receive_fd(): ERROR: receive_fd: EOF
>>>>>             on 13
>>>>>             /usr/local/sbin/kamailio[25328]: ALERT: <core>
>>>>>             [main.c:788]: handle_sigs(): child process 25333
>>>>>             exited by a signal 11
>>>>>             /usr/local/sbin/kamailio[25328]: ALERT: <core>
>>>>>             [main.c:791]: handle_sigs(): core was generated
>>>>>
>>>             I assume you disconnect the redis server and don't
>>>             reconnect it. Is that correct?
>>>
>>>             Then this line reports an error, but it should recover
>>>             from that. I should probably log this as a warning
>>>             instead of an error.
>>>
>>>             /usr/local/sbin/kamailio[25333]: ERROR: ndb_redis
>>>             [redis_client.c:364]: redisc_exec(): Redis error: Server
>>>             closed the connection
>>>
>>>
>>>         Yes, it has been restarted
>>>
>>>
>>>>>             _*3/ Interesting information in the core*_
>>>>>             #3 0x00007fc79412893d in redisvCommand (c=0x64657461,
>>>>>             format=0x9 <Address 0x9 out of bounds>, ap=0x30,
>>>>>             ap@entry=0x7fff0ff56aa8) at hiredis.c:1304
>>>>>             No locals.
>>>>>             #4 0x00007fc794341713 in redisc_exec
>>>>>             (srv=srv@entry=0x7fff0ff56be0,
>>>>>             res=res@entry=0x7fff0ff56c00,
>>>>>             cmd=cmd@entry=0x7fff0ff56bf0) at redis_client.c:368
>>>>>                     rsrv = 0x7fc794565150
>>>>>                     rpl = 0x7fc7946fab70
>>>>>                     c = 0 '\000'
>>>>>                     ap = {{gp_offset = 48, fp_offset = 48,
>>>>>             overflow_arg_area = 0x7fff0ff56bb0, reg_save_area =
>>>>>             0x7fff0ff56ac0}}
>>>>>             __FUNCTION__ = "redisc_exec"
>>>>>             #5 0x00007fc79433b781 in w_redis_cmd5 (msg=<optimized
>>>>>             out>, ssrv=<optimized out>, scmd=<optimized out>,
>>>>>             sargv1=<optimized out>, sargv2=0x7fc7946f7bf0
>>>>>             "p\243_\224\307\177", sres=0x7fc7946f7c50 "
>>>>>             \253_\224\307\177") at ndb_redis_mod.c:250
>>>>>                     s = {{s = 0x7fc7945fb300 "kamailio_redis", len
>>>>>             = 14}, {s = 0x7fc7945f5f50 "PUBLISH %s %s", len = 13},
>>>>>             {s = 0x7fc7945fab20 "r", len = 1}}
>>>>>                     arg1 = {s = 0x7fc7945f5f80 "notification", len
>>>>>             = 12}
>>>>>                     arg2 = {
>>>>>                       s = 0x7fc794551c60 "info XXX"...,
>>>>>                       len = 212}
>>>>>                     c1 = 0 '\000'
>>>>>                     c2 = 0 '\000'
>>>>>             __FUNCTION__ = "w_redis_cmd5"
>>>>>
>>>>>
>>>             In the source code:
>>>
>>>                 rpl->rplRedis = redisvCommand(rsrv->ctxRedis, cmd->s, ap);
>>>                 if(rpl->rplRedis == NULL)
>>>                 {
>>>                     /* null reply, reconnect and try again */
>>>                     if(rsrv->ctxRedis->err)
>>>                     {
>>>                         LM_ERR("Redis error: %s\n", rsrv->ctxRedis->errstr);
>>>                     }
>>>                     if(redisc_reconnect_server(rsrv)==0)
>>>                     {
>>>                         rpl->rplRedis = redisvCommand(rsrv->ctxRedis, cmd->s, ap);
>>>                     }
>>>                 }
>>>
>>>             First, redisvCommand executes but returns NULL. Then
>>>             it logs a redis error.
>>>
>>>             It tries to reconnect and apparently manages to connect
>>>             (??), because it logs no further errors.
>>>
>>>             Then it executes redisvCommand again and crashes.
>>>
>>>             If the server is down, it should not be able to connect,
>>>             and so it should not execute redisvCommand again.
>>>
>>>
>>>         According to the core, we MUST be in this case:
>>>         *if(redisc_reconnect_server(rsrv)==0)*
>>>
>>>         But I am wondering how the first redisvCommand can succeed
>>>         before the reconnection (the connection kamailio1 <->
>>>         redis has already been taken down). Is the whole redis
>>>         context still there when we first call redisvCommand?
>>>
>>>
>>>
>>>             Maybe I would get more clues with more information.
>>>
>>>             Regards,
>>>             Vicente.
>>>
>>>
>>>         Thank you
>>>         Regards,
>>>
>>>
>>>
>>>>>             I found a post saying that this issue had been fixed,
>>>>>             but it seems that it still occurs:
>>>>>             http://www.mail-archive.com/search?l=sr-users@lists.sip-router.org&q=subject:%22Re%3A+%5BSR-Users%5D+ndb_redis+module+fails+after+a+while%22
>>>>>
>>>>>             Do you have any idea?
>>>>>             Thank you
>>>>>
>>>>>
>>>>>             _______________________________________________
>>>>>             sr-dev mailing list
>>>>>             sr-dev at lists.sip-router.org  <mailto:sr-dev at lists.sip-router.org>
>>>>>             http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev
>>>>
>>>>             -- 
>>>>             Daniel-Constantin Mierla -http://www.asipto.com
>>>>             http://twitter.com/#!/miconda  <http://twitter.com/#%21/miconda>  -http://www.linkedin.com/in/miconda
>>>>
>>>>
>>>
>>>
>>
>>
>
>
