[SR-Users] [sr-dev] kamailio core at qm_status ( patch required important)
Daniel-Constantin Mierla
miconda at gmail.com
Fri Sep 21 09:46:20 CEST 2012
Hello,
yes, I'll apply it, just that I was mainly offline these days to had a
proper chance to look at it.
Thanks for troubleshooting and patching,
Daniel
On 9/20/12 2:36 PM, Jijo wrote:
> Hi Daniel,
>
> This patch needs to be applied to avoid core.
>
> Thanks
> Jijo
>
> On Wed, Sep 19, 2012 at 10:54 AM, Jijo <realjijo at gmail.com
> <mailto:realjijo at gmail.com>> wrote:
>
>
> Hi All,
>
> Finally i found the issue,
>
> Here is one of the bad trace for SUBSCRIBE(722bytes) and
> NOTIFY(1282bytes) which corrupted the memory. The messages came in
> the order NOTIFY and SUBSCRIBE. The core is generated in a
> different place but I believe this could be the reason for memory
> corruption.
>
> Here is the trace UDP Process 27294processing NOTIFY and Process
> 27303processing SUBSCRIBE .
>
> The explanation and implementation is below
>
> 2012-09-19T02:06:17+01:00 [info] sipserver: [27303] INFO:
> <script>: CI=1-3292 at 10.233.20.152 <mailto:CI=1-3292 at 10.233.20.152>
> -R39 - Entry R-URI=1234 FD=10.233.20.152 SI=10.233.20.152
>
> 2012-09-19T02:06:17+01:00 [info] sipserver: [27294] INFO:
> <script>: CI=1-3292 at 10.233.20.152 <mailto:CI=1-3292 at 10.233.20.152>
> -R1 - Force LAN socket: tcp:10.233.20.151:5060
> <http://10.233.20.151:5060> <null>
>
> 2012-09-19T02:06:17+01:00 [info] sipserver: [27303] INFO:
> <script>: CI=1-3292 at 10.233.20.152 <mailto:CI=1-3292 at 10.233.20.152>
> -R1 - Force LAN socket: tcp:10.233.20.151:5060
> <http://10.233.20.151:5060> <null>
>
> 2012-09-19T02:06:17+01:00 [err] sipserver: [27303] ERROR: <core>
> [tcp_main.c:2357]: tcp_conn_send_put : calling wbufq_add
>
> 2012-09-19T02:06:17+01:00 [err] sipserver: [27303] ERROR: <core>
> [tcp_main.c:730]: ERROR: wbufq_add(722 bytes): buf:SUBSCRIBE
> sip:1234 at 10.233.20.141:5063;transport=tcp SIP/2.0
>
> Record-Route: <sip:10.233.20.151;tran
>
> 2012-09-19T02:06:17+01:00 [err] sipserver: [27303] ERROR: <core>
> [tcp_main.c:747]: ERROR: wbufq_add(722 bytes): first:b00519f4
> last:b00519f4
>
> 2012-09-19T02:06:17+01:00 [err] sipserver: [27303] ERROR: <core>
> [tcp_main.c:774]: ERROR: wbufq_add(2 last free crt_size:722):
> first:b00519f4 last:b00519f4
>
> 2012-09-19T02:06:17+01:00 [err] sipserver: [27294] ERROR: <core>
> [tcp_main.c:796]: ERROR: wbufq_insert(1282 bytes): buf:NOTIFY
> sip:1234 at 10.233.20.141:5063;transport=tcp SIP/2.0
>
> Record-Route: <sip:10.233.20.151;transpo
>
> 2012-09-19T02:06:17+01:00 [err] sipserver: [27294] ERROR: <core>
> [tcp_main.c:801]: ERROR: wbufq_insert(2 last free ):
> first:b00519f4 last:b00519f4
>
> 2012-09-19T02:06:17+01:00 [err] sipserver: [27294] ERROR: <core>
> [tcp_main.c:820]: ERROR: wbufq_insert(22 last free ):
> first:b00519f4 last:b00519f4
>
> 2012-09-19T02:06:17+01:00 [err] sipserver: [27359] ERROR: <core>
> [tcp_main.c:887]: ERROR: wbufq_run(3 last free ): first:b00519f4
> last:b00519f4
>
> 2012-09-19T02:06:17+01:00 [crit] sipserver: [27359] : <core>
> [mem/q_malloc.c:157]: BUG: qm_*: fragm. 0xb00519dc (address
> 0xb00519f4) end overwritten(0, 0)!
>
> 2012-09-19T02:06:18+01:00 [alert] sipserver: [27265] ALERT: <core>
> [main.c:755]: child process 27359 exited by a signal 11
>
> 2012-09-19T02:06:18+01:00 [alert] sipserver: [27265] ALERT: <core>
> [main.c:758]: core was generated
>
> 2012-09-19T02:06:18+01:00 [info] sipserver: [27265] INFO: <core>
> [main.c:770]: INFO: terminating due to SIGCHLD
>
> Process 27294(NOTIFY) created the TCP connection structure for
> destination IP and just before calling wbufq_insert(), context
> switch happened and process 27303(SUBSCRIBE) got the cpu. Since
> the connection structure is already available process 27303 add
> the SUBSCRIBE message(722 bytes) to wbufq. Afterwards process
> 27294 got the CPU and invoked wbufq_insert() which basically
> corrupted the memory due to an overflow with the existing
> implementation.
>
> inline static int _wbufq_insert(struct tcp_connection* c, const
> char* data,
>
> unsigned int size)
>
> {
>
> struct tcp_wbuffer_queue* q;
>
> struct tcp_wbuffer* wb;
>
> q=&c->wbuf_q;
>
> if (likely(q->first==0)) /* if empty, use wbufq_add */
>
> return _wbufq_add(c, data, size);
>
> :
>
> :
>
> :
>
> :
>
> if ((q->first==q->last) && ((q->last->b_size-q->last_used)>=size)){
>
> /* one block with enough space in it for size bytes */
>
> memmove(q->first->buf+size, q->first->buf, size);
>
> memcpy(q->first->buf, data, size);
>
> q->last_used+=size;
>
> }
>
> The above condition shall be true in this case and memmove was
> moving the pointer which was causing the overflow.
>
> memmove(void *dest, const void *src, size_t n);
>
> As per the memmove man page, the src shall be copied with size ‘n’
> to a temporary buffer and then temporary buffer to dest.
>
> dest is q->first->buf+size: which is basically (q->first->buf +
> NOTIFY MSG SIZE). so the dst will move my 1282 bytes, so we have
> remaining space of only 818 bytes.
>
> src is q->first->buf: which is basically copied to temp buffer
> with NOTIFY SIZE(1282bytes).
>
> Finally we are moving the buffer from temporary buffer of size
> 1282 bytes to buffer which we left with 818 bytes, This basically
> corrupt the memory and on wbufq_run we see the memory corruption
>
> 2012-09-19T02:06:17+01:00 [crit] sipserver: [27359] : <core>
> [mem/q_malloc.c:157]: BUG: qm_*: fragm. 0xb00519dc (address
> 0xb00519f4) end overwritten(0, 0)!
>
> I think we don’t need memove, so we can change the code as below
>
> if ((q->first==q->last) &&
> ((q->last->b_size-q->last_used)>=size)){
>
> /* one block with enough space in it for size bytes */
>
> //memmove(q->first->buf+size, q->first->buf, size);
>
> memcpy(q->first->buf+q->last_used, data, size);
>
> q->last_used+=size;
>
> }
>
>
> OR
> we need only the else part as it always add the block to the first.
>
> Thanks
> Jijo
>
> On Wed, Jul 18, 2012 at 12:42 PM, Daniel-Constantin Mierla
> <miconda at gmail.com <mailto:miconda at gmail.com>> wrote:
>
> Hello,
>
> sorry, I just read the last messages in the thread and I
> didn't noticed it is about a crash, but thought it is about an
> annoying log message.
>
> The backtrace is no longer matching the latest sources of the
> 3.1 branch, but I expect it is due to a double free issue, so
> you have to update to latest 3.1.6, as suggested by other
> users here. There were many fixes from 3.1.0 to 3.1.6.
>
> Cheers,
> Daniel
>
>
> On 7/17/12 6:03 PM, Jijo wrote:
>> Hi,
>>
>> This is not happening at shutdown or status check. Its
>> aborting when the system is active.
>>
>> Thanks
>> Jijo
>> On Tue, Jul 17, 2012 at 3:06 AM, Daniel-Constantin Mierla
>> <miconda at gmail.com <mailto:miconda at gmail.com>> wrote:
>>
>> Hello,
>>
>> does it keep being or it is one time and that's it?
>>
>> That is printed at shut down or if pkg_status() or
>> shm_status() is executed from some part of code or in
>> config via cfgutils module functions.
>>
>> You can get rid of them by setting memdbg and memlog to a
>> value higher than debug global parameter.
>>
>> Cheers,
>> Daniel
>>
>>
>> On 7/16/12 5:28 PM, Jijo wrote:
>>> Thanks.. It is not easy to upgrade as it is happening at
>>> customer system.
>>> Is there any change occurred for this issue.I looked at
>>> it, but didn't see anything in q_malloc.c/qm_status()
>>>
>>>
>>> On Mon, Jul 16, 2012 at 11:12 AM, Jon Bonilla
>>> <manwe at aholab.ehu.es <mailto:manwe at aholab.ehu.es>> wrote:
>>>
>>> El Mon, 16 Jul 2012 10:27:42 -0400
>>> Jijo <realjijo at gmail.com
>>> <mailto:realjijo at gmail.com>> escribió:
>>>
>>> > Hi All,
>>> >
>>> > I'm observing a core intermittently at "qm_status
>>> (qm=0x786cd000) at
>>> > mem/q_malloc.c:763" for kamailio version 3.1.0
>>> >
>>>
>>> I'd say that you're using a very old version. You
>>> should update your branch to
>>> 3.1.6 or upgrade to a newer branch.
>>>
>>>
>>> _______________________________________________
>>> sr-dev mailing list
>>> sr-dev at lists.sip-router.org
>>> <mailto:sr-dev at lists.sip-router.org>
>>> http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> sr-dev mailing list
>>> sr-dev at lists.sip-router.org <mailto:sr-dev at lists.sip-router.org>
>>> http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev
>>
>> --
>> Daniel-Constantin Mierla -http://www.asipto.com
>> http://twitter.com/#!/miconda <http://twitter.com/#%21/miconda> -http://www.linkedin.com/in/miconda
>> Kamailio Advanced Training, Seattle, USA, Sep 23-26, 2012 -http://asipto.com/u/katu
>> Kamailio Practical Workshop, Netherlands, Sep 10-12, 2012 -http://asipto.com/u/kpw
>>
>>
>
> --
> Daniel-Constantin Mierla -http://www.asipto.com
> http://twitter.com/#!/miconda <http://twitter.com/#%21/miconda> -http://www.linkedin.com/in/miconda
> Kamailio Advanced Training, Seattle, USA, Sep 23-26, 2012 -http://asipto.com/u/katu
> Kamailio Practical Workshop, Netherlands, Sep 10-12, 2012 -http://asipto.com/u/kpw
>
>
>
--
Daniel-Constantin Mierla - http://www.asipto.com
http://twitter.com/#!/miconda - http://www.linkedin.com/in/miconda
Kamailio Advanced Training, Berlin, Nov 5-8, 2012 - http://asipto.com/u/kat
Kamailio Advanced Training, Miami, USA, Nov 12-14, 2012 - http://asipto.com/u/katu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.sip-router.org/pipermail/sr-users/attachments/20120921/6841d5dc/attachment-0001.htm>
More information about the sr-users
mailing list