[SR-Users] [sr-dev] kamailio core at qm_status

Jijo realjijo at gmail.com
Wed Sep 19 16:54:19 CEST 2012


Hi All,

Finally i found the issue,

Here is one of the bad trace for SUBSCRIBE(722bytes) and NOTIFY(1282bytes)
which corrupted the memory. The messages came in the order NOTIFY and
SUBSCRIBE. The core is generated in a different place but I believe this
could be the reason for memory corruption.

Here is the trace UDP Process 27294 processing NOTIFY and Process
27303 processing
SUBSCRIBE .



The explanation and implementation is below



2012-09-19T02:06:17+01:00 [info] sipserver: [27303] INFO: <script>:
CI=1-3292 at 10.233.20.152 -R39 - Entry R-URI=1234 FD=10.233.20.152
SI=10.233.20.152

2012-09-19T02:06:17+01:00 [info] sipserver: [27294] INFO: <script>:
CI=1-3292 at 10.233.20.152 -R1 - Force LAN socket: tcp:10.233.20.151:5060<null>

2012-09-19T02:06:17+01:00 [info] sipserver: [27303] INFO: <script>:
CI=1-3292 at 10.233.20.152 -R1 - Force LAN socket: tcp:10.233.20.151:5060<null>

2012-09-19T02:06:17+01:00 [err] sipserver: [27303] ERROR: <core>
[tcp_main.c:2357]:  tcp_conn_send_put : calling wbufq_add

2012-09-19T02:06:17+01:00 [err] sipserver: [27303] ERROR: <core>
[tcp_main.c:730]: ERROR: wbufq_add(722 bytes): buf:SUBSCRIBE
sip:1234 at 10.233.20.141:5063;transport=tcp SIP/2.0

Record-Route: <sip:10.233.20.151;tran

2012-09-19T02:06:17+01:00 [err] sipserver: [27303] ERROR: <core>
[tcp_main.c:747]: ERROR: wbufq_add(722 bytes): first:b00519f4 last:b00519f4

2012-09-19T02:06:17+01:00 [err] sipserver: [27303] ERROR: <core>
[tcp_main.c:774]: ERROR: wbufq_add(2 last free crt_size:722):
first:b00519f4 last:b00519f4

2012-09-19T02:06:17+01:00 [err] sipserver: [27294] ERROR: <core>
[tcp_main.c:796]: ERROR: wbufq_insert(1282 bytes): buf:NOTIFY
sip:1234 at 10.233.20.141:5063;transport=tcp SIP/2.0

Record-Route: <sip:10.233.20.151;transpo

2012-09-19T02:06:17+01:00 [err] sipserver: [27294] ERROR: <core>
[tcp_main.c:801]: ERROR: wbufq_insert(2 last free ): first:b00519f4
last:b00519f4

2012-09-19T02:06:17+01:00 [err] sipserver: [27294] ERROR: <core>
[tcp_main.c:820]: ERROR: wbufq_insert(22 last free ): first:b00519f4
last:b00519f4

2012-09-19T02:06:17+01:00 [err] sipserver: [27359] ERROR: <core>
[tcp_main.c:887]: ERROR: wbufq_run(3 last free ): first:b00519f4
last:b00519f4

2012-09-19T02:06:17+01:00 [crit] sipserver: [27359] : <core>
[mem/q_malloc.c:157]: BUG: qm_*: fragm. 0xb00519dc (address 0xb00519f4) end
overwritten(0, 0)!

2012-09-19T02:06:18+01:00 [alert] sipserver: [27265] ALERT: <core>
[main.c:755]: child process 27359 exited by a signal 11

2012-09-19T02:06:18+01:00 [alert] sipserver: [27265] ALERT: <core>
[main.c:758]: core was generated

2012-09-19T02:06:18+01:00 [info] sipserver: [27265] INFO: <core>
[main.c:770]: INFO: terminating due to SIGCHLD



Process 27294(NOTIFY) created the TCP connection structure for destination
IP and just before calling wbufq_insert(), context switch happened and
process 27303(SUBSCRIBE) got the cpu. Since the connection structure is
already available process 27303 add the SUBSCRIBE message(722 bytes) to
wbufq. Afterwards process 27294 got the CPU and invoked wbufq_insert()
which basically corrupted the memory due to an overflow with the existing
implementation.





inline static int _wbufq_insert(struct  tcp_connection* c, const char*
data,


unsigned int size)

{

                struct tcp_wbuffer_queue* q;

                struct tcp_wbuffer* wb;



                q=&c->wbuf_q;

                if (likely(q->first==0)) /* if empty, use wbufq_add */

                                return _wbufq_add(c, data, size);

:

:

:

:

                if ((q->first==q->last) &&
((q->last->b_size-q->last_used)>=size)){

                                /* one block with enough space in it for
size bytes */



                                memmove(q->first->buf+size, q->first->buf,
size);

                                memcpy(q->first->buf, data, size);

                                q->last_used+=size;

                }



The above condition shall be true in this case and memmove was moving the
pointer which was causing the overflow.

                        memmove(void *dest, const void *src, size_t n);



As per the memmove man page, the src shall be copied with size ‘n’ to a
temporary buffer and then temporary buffer to dest.



dest is q->first->buf+size: which is basically (q->first->buf  + NOTIFY MSG
SIZE). so the dst will move my 1282 bytes, so we have remaining space of
only 818 bytes.



src is q->first->buf: which is basically copied to temp buffer with NOTIFY
SIZE(1282bytes).



Finally we are moving the buffer from temporary buffer of size 1282 bytes
to buffer which we left with 818 bytes, This basically corrupt the memory
and on wbufq_run we see the memory corruption



2012-09-19T02:06:17+01:00 [crit] sipserver: [27359] : <core>
[mem/q_malloc.c:157]: BUG: qm_*: fragm. 0xb00519dc (address 0xb00519f4) end
overwritten(0, 0)!



I think we don’t need memove, so we can change the code as below



       if ((q->first==q->last) && ((q->last->b_size-q->last_used)>=size)){

                /* one block with enough space in it for size bytes */

                //memmove(q->first->buf+size, q->first->buf, size);

                memcpy(q->first->buf+q->last_used, data, size);

                q->last_used+=size;

        }

OR
we need only the else part as it always add the block to the first.

Thanks
Jijo

On Wed, Jul 18, 2012 at 12:42 PM, Daniel-Constantin Mierla <
miconda at gmail.com> wrote:

>  Hello,
>
> sorry, I just read the last messages in the thread and I didn't noticed it
> is about a crash, but thought it is about an annoying log message.
>
> The backtrace is no longer matching the latest sources of the 3.1 branch,
> but I expect it is due to a double free issue, so you have to update to
> latest 3.1.6, as suggested by other users here. There were many fixes from
> 3.1.0 to 3.1.6.
>
> Cheers,
> Daniel
>
>
> On 7/17/12 6:03 PM, Jijo wrote:
>
> Hi,
>
>  This is not happening at shutdown or status check. Its aborting when the
> system is active.
>
> Thanks
> Jijo
> On Tue, Jul 17, 2012 at 3:06 AM, Daniel-Constantin Mierla <
> miconda at gmail.com> wrote:
>
>>  Hello,
>>
>> does it keep being or it is one time and that's it?
>>
>> That is printed at shut down or if pkg_status() or shm_status() is
>> executed from some part of code or in config via cfgutils module functions.
>>
>> You can get rid of them by setting memdbg and memlog to a value higher
>> than debug global parameter.
>>
>> Cheers,
>> Daniel
>>
>>
>> On 7/16/12 5:28 PM, Jijo wrote:
>>
>> Thanks.. It is not easy to upgrade as it is happening at customer system.
>> Is there any change occurred for this issue.I looked at it, but didn't
>> see anything in q_malloc.c/qm_status()
>>
>>
>> On Mon, Jul 16, 2012 at 11:12 AM, Jon Bonilla <manwe at aholab.ehu.es>wrote:
>>
>>> El Mon, 16 Jul 2012 10:27:42 -0400
>>> Jijo <realjijo at gmail.com> escribió:
>>>
>>> > Hi All,
>>> >
>>> > I'm observing a core intermittently at "qm_status (qm=0x786cd000) at
>>> > mem/q_malloc.c:763" for kamailio version 3.1.0
>>> >
>>>
>>>  I'd say that you're using a very old version. You should update your
>>> branch to
>>> 3.1.6 or upgrade to a newer branch.
>>>
>>>
>>> _______________________________________________
>>> sr-dev mailing list
>>> sr-dev at lists.sip-router.org
>>> http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev
>>>
>>
>>
>>
>> _______________________________________________
>> sr-dev mailing listsr-dev at lists.sip-router.orghttp://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev
>>
>>
>>   --
>> Daniel-Constantin Mierla - http://www.asipto.comhttp://twitter.com/#!/miconda - http://www.linkedin.com/in/miconda
>> Kamailio Advanced Training, Seattle, USA, Sep 23-26, 2012 - http://asipto.com/u/katu
>> Kamailio Practical Workshop, Netherlands, Sep 10-12, 2012 - http://asipto.com/u/kpw
>>
>>
>
> --
> Daniel-Constantin Mierla - http://www.asipto.comhttp://twitter.com/#!/miconda - http://www.linkedin.com/in/miconda
> Kamailio Advanced Training, Seattle, USA, Sep 23-26, 2012 - http://asipto.com/u/katu
> Kamailio Practical Workshop, Netherlands, Sep 10-12, 2012 - http://asipto.com/u/kpw
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.sip-router.org/pipermail/sr-users/attachments/20120919/bea7ebd8/attachment-0001.htm>


More information about the sr-users mailing list