[SR-Users] Kam 4.1 Crash: HELP PLZ!!!!

Daniel-Constantin Mierla miconda at gmail.com
Thu Apr 10 09:31:27 CEST 2014


Hello,

have you found the corefiles? The backtrace would be useful not to let 
the issue unresolved and be sure there are not other side effects.

Another question, in the logs I see that the reason for failing to 
forward the branch is a very long sip message:

     Apr  3 13:14:05 tel-vc-fs03 /usr/local/sbin/kamailio[7579]: ERROR: 
<core> [udp_server.c:591]: udp_send(): ERROR: udp_send: 
sendto(sock,0x7feac6836310,65572,0,205.251.172.14:5060,16): Message too 
long(90)
     Apr  3 13:14:05 tel-vc-fs03 /usr/local/sbin/kamailio[7579]: ERROR: 
tm [../../forward.h:199]: msg_send(): msg_send: ERROR: udp_send failed
     Apr  3 13:14:05 tel-vc-fs03 /usr/local/sbin/kamailio[7579]: ERROR: 
tm [t_fwd.c:1609]: t_send_branch(): ERROR: t_send_branch: sending 
request on branch 0 failed
     Apr  3 13:14:05 tel-vc-fs03 /usr/local/sbin/kamailio[7579]: ERROR: 
sl [sl_funcs.c:387]: sl_reply_error(): ERROR: sl_reply_error used: 
Unfortunately error on sending to next hop occurred (477/SL)

So the branch has 65572 bytes, which exceed 64k, making the kernel 
reject sending the udp packet. Do you have such large SIP requests to 
send out?

Cheers,
Daniel

On 09/04/14 19:34, Daniel-Constantin Mierla wrote:
> Hello,
>
> the logs show that the core file was generated:
>
> Apr  3 13:14:55 tel-vc-fs03 /usr/local/sbin/kamailio[7553]: ALERT: 
> <core> [main.c:778]: handle_sigs(): core was generated
>
> Unless you deleted them, then you should have them somewhere on the 
> file system.
>
> mem_safety won't affect any kind of processing - it is just a 
> protection for a double call of free() function. Actually, it is 
> turned on with f_malloc memory manager which was the default in the 
> past. Now we use q_malloc as default which is kind of f_malloc with 
> defrag option. We should make this back on by default, but catching 
> double free() is also good to solve anyhow.
>
> Cheers,
> Daniel
>
> On 09/04/14 17:15, Samuel Ware wrote:
>> Daniel,
>>
>> Here is the version information.
>>
>>     root at tel-vc-fs03.telariscom.com> kamailio -v
>>     version: kamailio 4.1.2 (x86_64/linux) 73ea61
>>     flags: STATS: Off, USE_TCP, USE_TLS, TLS_HOOKS, USE_RAW_SOCKS, 
>> DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MEM, SHM_MMAP, PKG_MALLOC, 
>> DBG_QM_MALLOC, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, 
>> USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES
>>     ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 
>> 16, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 4MB
>>     poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
>>     id: 73ea61
>>     compiled on 20:41:21 Apr  1 2014 with gcc 4.4.7
>>
>> We don’t see this happen in the lab environment.  It only happens in 
>> production so we are little concern about it failing.  Unfortunately, 
>> we don’t have any core dumps of this occurring; we didn’t have that 
>> feature enabled when this happened.
>>
>> By enabling “men_safety”, will this affect the processing of any of 
>> other signaling messages when an error occurs or will it only affect 
>> the transaction in which the error occurred.  There isn’t much detail 
>> in the description of this flag in the documentation to layout how it 
>> affects the overall system in an otherwise “abort” situation.  Your 
>> assistance is appreciated and I hope that it doesn’t sound like we 
>> are being difficult but this element is setting in front of our 
>> entire network and failure causes us to block all calls.  We were 
>> running close to 800 CPS through it when it failed.  We don’t see any 
>> load on the machine that would indicate that the issue isn’t software 
>> based.  If the “men_safety” will not affect the rest of the traffic, 
>> we can add this as well as enabling core dumps and try it again when 
>> we can closely monitor the system.  Please provide the additional 
>> details about “men_safety” so we can assess the situation.  Is there 
>> any other information that I can provide to assist your input like 
>> the part of the config file or what else?
>>
>>
>>
>> Sam
>>
>>
>> On Apr 9, 2014, at 2:27 AM, Daniel-Constantin Mierla 
>> <miconda at gmail.com> wrote:
>>
>>> Can you get the output of kamailio -v?
>>>
>>> Also, send the backtrace for the corefiles taken with gdb - the logs 
>>> show that the core were generated (check / or the value of -w 
>>> parameter):
>>>
>>> gdb /usr/local/sbin/kamailio /path/to/core
>>> bt full
>>>
>>> To protect against such cases, you can use in config file:
>>>
>>> mem_safety=1
>>>
>>> Cheers,
>>> Daniel
>>>
>>> On 08/04/14 17:34, Samuel Ware wrote:
>>>> Here are the logs from the first crash
>>>>
>>>>    Apr  3 13:13:59 tel-vc-fs03 /usr/local/sbin/kamailio[7602]: 
>>>> ERROR: tm [../../forward.h:199]: msg_send(): msg_send: ERROR: 
>>>> udp_send failed
>>>>     Apr  3 13:13:59 tel-vc-fs03 /usr/local/sbin/kamailio[7602]: 
>>>> ERROR: tm [t_fwd.c:1609]: t_send_branch(): ERROR: t_send_branch: 
>>>> sending request on branch 0 failed
>>>>     Apr  3 13:13:59 tel-vc-fs03 /usr/local/sbin/kamailio[7602]: 
>>>> ERROR: sl [sl_funcs.c:387]: sl_reply_error(): ERROR: sl_reply_error 
>>>> used: Unfortunately error on sending to next hop occurred (477/SL)
>>>>     Apr  3 13:14:05 tel-vc-fs03 /usr/local/sbin/kamailio[7579]: 
>>>> ERROR: <core> [udp_server.c:591]: udp_send(): ERROR: udp_send: 
>>>> sendto(sock,0x7feac6836310,65572,0,205.251.172.14:5060,16): Message 
>>>> too long(90)
>>>>     Apr  3 13:14:05 tel-vc-fs03 /usr/local/sbin/kamailio[7579]: 
>>>> ERROR: tm [../../forward.h:199]: msg_send(): msg_send: ERROR: 
>>>> udp_send failed
>>>>     Apr  3 13:14:05 tel-vc-fs03 /usr/local/sbin/kamailio[7579]: 
>>>> ERROR: tm [t_fwd.c:1609]: t_send_branch(): ERROR: t_send_branch: 
>>>> sending request on branch 0 failed
>>>>     Apr  3 13:14:05 tel-vc-fs03 /usr/local/sbin/kamailio[7579]: 
>>>> ERROR: sl [sl_funcs.c:387]: sl_reply_error(): ERROR: sl_reply_error 
>>>> used: Unfortunately error on sending to next hop occurred (477/SL)
>>>>     Apr  3 13:14:20 tel-vc-fs03 /usr/local/sbin/kamailio[7594]: 
>>>> ERROR: <core> [udp_server.c:591]: udp_send(): ERROR: udp_send: 
>>>> sendto(sock,0x7feaace02c98,65533,0,205.251.172.14:5060,16): Message 
>>>> too long(90)
>>>>     Apr  3 13:14:20 tel-vc-fs03 /usr/local/sbin/kamailio[7594]: 
>>>> ERROR: tm [../../forward.h:199]: msg_send(): msg_send: ERROR: 
>>>> udp_send failed
>>>>     Apr  3 13:14:20 tel-vc-fs03 /usr/local/sbin/kamailio[7594]: 
>>>> ERROR: tm [t_fwd.c:1609]: t_send_branch(): ERROR: t_send_branch: 
>>>> sending request on branch 0 failed
>>>>     Apr  3 13:14:20 tel-vc-fs03 /usr/local/sbin/kamailio[7594]: 
>>>> ERROR: sl [sl_funcs.c:387]: sl_reply_error(): ERROR: sl_reply_error 
>>>> used: Unfortunately error on sending to next hop occurred (477/SL)
>>>>     Apr  3 13:14:27 tel-vc-fs03 /usr/local/sbin/kamailio[7583]: 
>>>> ERROR: <core> [udp_server.c:591]: udp_send(): ERROR: udp_send: 
>>>> sendto(sock,0x7feac6836310,65569,0,205.251.172.14:5060,16): Message 
>>>> too long(90)
>>>>     Apr  3 13:14:27 tel-vc-fs03 /usr/local/sbin/kamailio[7583]: 
>>>> ERROR: tm [../../forward.h:199]: msg_send(): msg_send: ERROR: 
>>>> udp_send failed
>>>>     Apr  3 13:14:27 tel-vc-fs03 /usr/local/sbin/kamailio[7583]: 
>>>> ERROR: tm [t_fwd.c:1609]: t_send_branch(): ERROR: t_send_branch: 
>>>> sending request on branch 0 failed
>>>>     Apr  3 13:14:27 tel-vc-fs03 /usr/local/sbin/kamailio[7583]: 
>>>> ERROR: sl [sl_funcs.c:387]: sl_reply_error(): ERROR: sl_reply_error 
>>>> used: Unfortunately error on sending to next hop occurred (477/SL)
>>>>     Apr  3 13:14:32 tel-vc-fs03 /usr/local/sbin/kamailio[7589]: : 
>>>> <core> [mem/q_malloc.c:468]: qm_free(): BUG: qm_free: freeing 
>>>> already freed pointer (0x7feaca6c9738), called from <core>: 
>>>> mem/shm_mem.c: sh_realloc(88), first free <core>: mem/shm_mem.c: 
>>>> sh_realloc(88) - aborting
>>>>     Apr  3 13:14:55 tel-vc-fs03 /usr/local/sbin/kamailio[7553]: 
>>>> ALERT: <core> [main.c:775]: handle_sigs(): child process 7589 
>>>> exited by a signal 6
>>>>     Apr  3 13:14:55 tel-vc-fs03 /usr/local/sbin/kamailio[7553]: 
>>>> ALERT: <core> [main.c:778]: handle_sigs(): core was generated
>>>>     Apr  3 13:14:55 tel-vc-fs03 /usr/local/sbin/kamailio[7553]: 
>>>> INFO: <core> [main.c:790]: handle_sigs(): INFO: terminating due to 
>>>> SIGCHLD
>>>>     Apr  3 13:14:55 tel-vc-fs03 /usr/local/sbin/kamailio[7604]: 
>>>> INFO: <core> [main.c:841]: sig_usr(): INFO: signal 15 received
>>>>     Apr  3 13:14:55 tel-vc-fs03 /usr/local/sbin/kamailio[7597]: 
>>>> INFO: <core> [main.c:841]: sig_usr(): INFO: signal 15 received
>>>>     Apr  3 13:14:55 tel-vc-fs03 /usr/local/sbin/kamailio[7579]: 
>>>> INFO: <core> [main.c:841]: sig_usr(): INFO: signal 15 received
>>>>     Apr  3 13:14:55 tel-vc-fs03 /usr/local/sbin/kamailio[7603]: 
>>>> INFO: <core> [main.c:841]: sig_usr(): INFO: signal 15 received
>>>>     Apr  3 13:14:55 tel-vc-fs03 /usr/local/sbin/kamailio[7601]: 
>>>> INFO: <core> [main.c:841]: sig_usr(): INFO: signal 15 received
>>>>
>>>>
>>>>   Logs from second crash
>>>>
>>>> Apr  3 14:38:42 tel-vc-fs03 /usr/local/sbin/kamailio[3979]: ERROR: 
>>>> <core> [udp_server.c:591]: udp_send(): ERROR: udp_send: 
>>>> sendto(sock,0x7f34ec4f6ed8,65569,0,205.251.172.14:5060,16): Message 
>>>> too long(90)
>>>>     Apr  3 14:38:42 tel-vc-fs03 /usr/local/sbin/kamailio[3979]: 
>>>> ERROR: tm [../../forward.h:199]: msg_send(): msg_send: ERROR: 
>>>> udp_send failed
>>>>     Apr  3 14:38:42 tel-vc-fs03 /usr/local/sbin/kamailio[3979]: 
>>>> ERROR: tm [t_fwd.c:1609]: t_send_branch(): ERROR: t_send_branch: 
>>>> sending request on branch 0 failed
>>>>     Apr  3 14:38:42 tel-vc-fs03 /usr/local/sbin/kamailio[3979]: 
>>>> ERROR: sl [sl_funcs.c:387]: sl_reply_error(): ERROR: sl_reply_error 
>>>> used: Unfortunately error on sending to next hop occurred (477/SL)
>>>>     Apr  3 14:38:44 tel-vc-fs03 /usr/local/sbin/kamailio[3965]: 
>>>> ERROR: <core> [udp_server.c:591]: udp_send(): ERROR: udp_send: 
>>>> sendto(sock,0x7f35242e5d28,65535,0,205.251.172.14:5060,16): Message 
>>>> too long(90)
>>>>     Apr  3 14:38:44 tel-vc-fs03 /usr/local/sbin/kamailio[3965]: 
>>>> ERROR: tm [../../forward.h:199]: msg_send(): msg_send: ERROR: 
>>>> udp_send failed
>>>>     Apr  3 14:38:44 tel-vc-fs03 /usr/local/sbin/kamailio[3965]: 
>>>> ERROR: tm [t_fwd.c:1609]: t_send_branch(): ERROR: t_send_branch: 
>>>> sending request on branch 0 failed
>>>>     Apr  3 14:38:44 tel-vc-fs03 /usr/local/sbin/kamailio[3965]: 
>>>> ERROR: sl [sl_funcs.c:387]: sl_reply_error(): ERROR: sl_reply_error 
>>>> used: Unfortunately error on sending to next hop occurred (477/SL)
>>>>     Apr  3 14:38:57 tel-vc-fs03 /usr/local/sbin/kamailio[3968]: 
>>>> ERROR: <core> [udp_server.c:591]: udp_send(): ERROR: udp_send: 
>>>> sendto(sock,0x7f3516fb6720,65578,0,205.251.172.14:5060,16): Message 
>>>> too long(90)
>>>>     Apr  3 14:38:57 tel-vc-fs03 /usr/local/sbin/kamailio[3968]: 
>>>> ERROR: tm [../../forward.h:199]: msg_send(): msg_send: ERROR: 
>>>> udp_send failed
>>>>     Apr  3 14:38:57 tel-vc-fs03 /usr/local/sbin/kamailio[3968]: 
>>>> ERROR: tm [t_fwd.c:1609]: t_send_branch(): ERROR: t_send_branch: 
>>>> sending request on branch 0 failed
>>>>     Apr  3 14:38:57 tel-vc-fs03 /usr/local/sbin/kamailio[3968]: 
>>>> ERROR: <core> [udp_server.c:591]: udp_send(): ERROR: udp_send: 
>>>> sendto(sock,0x7f37d8e91088,65522,0,205.251.172.14:5060,16): Message 
>>>> too long(90)
>>>>     Apr  3 14:38:57 tel-vc-fs03 /usr/local/sbin/kamailio[3968]: 
>>>> ERROR: sl [../../forward.h:199]: msg_send(): msg_send: ERROR: 
>>>> udp_send failed
>>>>     Apr  3 14:38:57 tel-vc-fs03 /usr/local/sbin/kamailio[3968]: 
>>>> ERROR: sl [sl_funcs.c:387]: sl_reply_error(): ERROR: sl_reply_error 
>>>> used: Unfortunately error on sending to next hop occurred (477/SL)
>>>>     Apr  3 14:38:57 tel-vc-fs03 /usr/local/sbin/kamailio[3968]: 
>>>> ERROR: <core> [udp_server.c:591]: udp_send(): ERROR: udp_send: 
>>>> sendto(sock,0x7f37d8e91088,65522,0,205.251.172.14:5060,16): Message 
>>>> too long(90)
>>>>     Apr  3 14:38:57 tel-vc-fs03 /usr/local/sbin/kamailio[3968]: 
>>>> ERROR: tm [../../forward.h:199]: msg_send(): msg_send: ERROR: 
>>>> udp_send failed
>>>>     Apr  3 14:38:58 tel-vc-fs03 /usr/local/sbin/kamailio[3961]: 
>>>> ERROR: <core> [udp_server.c:591]: udp_send(): ERROR: udp_send: 
>>>> sendto(sock,0x7f37d81fa960,65522,0,205.251.172.14:5060,16): Message 
>>>> too long(90)
>>>>     Apr  3 14:38:58 tel-vc-fs03 /usr/local/sbin/kamailio[3961]: 
>>>> ERROR: tm [../../forward.h:199]: msg_send(): msg_send: ERROR: 
>>>> udp_send failed
>>>>     Apr  3 14:38:58 tel-vc-fs03 /usr/local/sbin/kamailio[3974]: : 
>>>> <core> [mem/q_malloc.c:468]: qm_free(): BUG: qm_free: freeing 
>>>> already freed pointer (0x7f34d7d6b720), called from <core>: 
>>>> mem/shm_mem.c: sh_realloc(88), first free <core>: mem/shm_mem.c: 
>>>> sh_realloc(88) - aborting
>>>>     Apr  3 14:39:20 tel-vc-fs03 /usr/local/sbin/kamailio[3935]: 
>>>> ALERT: <core> [main.c:775]: handle_sigs(): child process 3974 
>>>> exited by a signal 6
>>>>     Apr  3 14:39:20 tel-vc-fs03 /usr/local/sbin/kamailio[3935]: 
>>>> ALERT: <core> [main.c:778]: handle_sigs(): core was generated
>>>>     Apr  3 14:39:20 tel-vc-fs03 /usr/local/sbin/kamailio[3935]: 
>>>> INFO: <core> [main.c:790]: handle_sigs(): INFO: terminating due to 
>>>> SIGCHLD
>>>>     Apr  3 14:39:20 tel-vc-fs03 /usr/local/sbin/kamailio[3985]: 
>>>> INFO: <core> [main.c:841]: sig_usr(): INFO: signal 15 received
>>>>     Apr  3 14:39:20 tel-vc-fs03 /usr/local/sbin/kamailio[3983]: 
>>>> INFO: <core> [main.c:841]: sig_usr(): INFO: signal 15 received
>>>>     Apr  3 14:39:20 tel-vc-fs03 /usr/local/sbin/kamailio[3966]: 
>>>> INFO: <core> [main.c:841]: sig_usr(): INFO: signal 15 received
>>>>     Apr  3 14:39:20 tel-vc-fs03 /usr/local/sbin/kamailio[3988]: 
>>>> INFO: <core> [main.c:841]: sig_usr(): INFO: signal 15 received
>>>>     Apr  3 14:39:20 tel-vc-fs03 /usr/local/sbin/kamailio[3972]: 
>>>> INFO: <core> [main.c:841]: sig_usr(): INFO: signal 15 received
>>>>     Apr  3 14:39:20 tel-vc-fs03 /usr/local/sbin/kamailio[3984]: 
>>>> INFO: <core> [main.c:841]: sig_usr(): INFO: signal 15 received
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> SIP Express Router (SER) and Kamailio (OpenSER) - sr-users mailing 
>>>> list
>>>> sr-users at lists.sip-router.org
>>>> http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-users
>>> -- 
>>> Daniel-Constantin Mierla - http://www.asipto.com
>>> http://twitter.com/#!/miconda - http://www.linkedin.com/in/miconda
>>>
>

-- 
Daniel-Constantin Mierla - http://www.asipto.com
http://twitter.com/#!/miconda - http://www.linkedin.com/in/miconda




More information about the sr-users mailing list