[Serusers] Crash in usrloc module [backtrace]

Maxim Sobolev sobomax at portaone.com
Thu Apr 3 12:20:51 CEST 2003


We are using 0.8.10 (not a CVS snapshot) with a locally fixed radius module.

-Maxim

On Thu, Apr 03, 2003 at 12:43:52AM +0200, Jan Janak wrote:
> On 03-04 01:33, Maxim Sobolev wrote:
> > On Wed, Apr 02, 2003 at 11:15:59PM +0200, Jan Janak wrote:
> > > On 02-04 23:50, Maxim Sobolev wrote:
> > > > On Wed, Apr 02, 2003 at 10:33:36PM +0200, Jan Janak wrote:
> > > > > Hello,
> > > > > 
> > > > > this is really strange, the structure could never be filled with zeroes
> > > > > under normal circumstances. At least domain and aor must be set to
> > > > > non-zero values. These fields are set automatically when the structure
> > > > > is created, and they must be non-zero, otherwise the structure would
> > > > > never be created.
> > > > > 
> > > > > So either gdb doesn't show correct values or the memory has been
> > > > > corrupted somehow.
> > > > 
> > > > Yes, it looks like that, because ptr is a valid pointer in the trace
> > > > and ptr->next shouldn't cause sig11. Maybe there is a problem with
> > > > locking? Is it possible that two or more processes would start
> > > > modifying the _r linked list simultaneously, thereby breaking its
> > > > integrity?
> > > 
> > >   I reviewed the locking and haven't found any problem. We have been
> > >   running ser for a very long time without any problems (I think that
> > >   a locking problem would show up on iptel.org - its registrar is very
> > >   active).
> > > 
> > >   Could you please tar the sources along with the core dump (and log
> > >   files if possible) and send them to me ? I currently have no clue why
> > >   such a mysterious crash happened (it could even be a HW problem) but I'd
> > >   like to review it later.
> > 
> > No problem, I'll do it for you. Could the problem be related to the
> > fact that we are using auth_radius() to authenticate REGISTER requests
> > before allowing them in, while the Radius server talks to a MySQL database
> > and therefore sometimes adds a significant delay to processing (up to
> > several seconds)? In that case I think it is quite possible that the
> > expiration timer fires at the same moment a positive Radius reply
> > arrives and save() is called, causing bad things to happen.
> > 
> 
>   Are you using a CVS snapshot ? I thought you were using 0.8.10
>   (from the info you sent me). The auth_radius module is not available for
>   0.8.10 (it was called radius_auth).
> 
>   The situation you describe is not possible. Radius authentication is
>   done in a different module and save() is executed after the
>   authentication completes successfully. So it doesn't matter how long
>   the radius query takes.
> 
>   And both the save() function and the timer are protected by a mutex, so
>   they cannot modify the contents of the user location database at the
>   same time.
> 
>   What version of ser are you using ?
> 
>      Jan.
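
For what it's worth, here is a minimal sketch of the locking scheme described above, where a save()-like writer and a timer-like expiry pass take the same mutex before touching a shared contact list. Everything in it is hypothetical: contact_t, table_t, table_save() and table_expire() are illustration names, not usrloc's real types or functions, and a pthread mutex stands in for whatever lock primitive ser actually uses.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a contact entry; not usrloc's ucontact_t. */
typedef struct contact {
    long expires;
    struct contact *next;
} contact_t;

/* Hypothetical table: one lock guards the whole list. */
typedef struct {
    pthread_mutex_t lock;
    contact_t *head;
} table_t;

/* save()-like path: link a caller-owned node at the head, under the lock. */
void table_save(table_t *t, contact_t *c, long expires)
{
    assert(t != NULL && c != NULL);
    pthread_mutex_lock(&t->lock);
    c->expires = expires;
    c->next = t->head;
    t->head = c;
    pthread_mutex_unlock(&t->lock);
}

/* timer-like path: unlink every expired node under the same lock.
 * Returns the number of nodes unlinked. */
int table_expire(table_t *t, long now)
{
    int removed = 0;
    contact_t **pp;

    assert(t != NULL);
    pthread_mutex_lock(&t->lock);
    pp = &t->head;
    while (*pp != NULL) {
        contact_t *cur = *pp;
        if (cur->expires < now) {
            *pp = cur->next;   /* read next before letting go of the node */
            cur->next = NULL;
            removed++;
        } else {
            pp = &cur->next;
        }
    }
    pthread_mutex_unlock(&t->lock);
    return removed;
}
```

With both paths serialized like this, a slow authentication step before table_save() only delays the insert; it cannot interleave with the expiry pass.
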
> 
> > -Maxim
> > 
> > > 
> > >     thanks, Jan.
> > > 
> > > > 
> > > > -Maxim
> > > > 
> > > > > 
> > > > > Is there anything suspicious in your log files ?
> > > > > 
> > > > >   regards, Jan.
> > > > > 
> > > > > On 02-04 22:54, Maxim Sobolev wrote:
> > > > > > Hi,
> > > > > > 
> > > > > > I've observed a rather mysterious crash in ser, see attached debug log.
> > > > > > 
> > > > > > Any ideas what gives?
> > > > > > 
> > > > > > -Maxim
> > > > > 
> > > > > > Script started on Wed Apr  2 11:46:12 2003
> > > > > > bash-2.05a$ sudo gdb ~/PortaSIP/ser/work/ser-0.8.10/ser ser.core
> > > > > > GNU gdb 4.18 (FreeBSD)
> > > > > > Copyright 1998 Free Software Foundation, Inc.
> > > > > > GDB is free software, covered by the GNU General Public License, and you are
> > > > > > welcome to change it and/or distribute copies of it under certain conditions.
> > > > > > Type "show copying" to see the conditions.
> > > > > > There is absolutely no warranty for GDB.  Type "show warranty" for details.
> > > > > > This GDB was configured as "i386-unknown-freebsd"...Deprecated bfd_read called at /usr/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line 2627 in elfstab_build_psymtabs
> > > > > > Deprecated bfd_read called at /usr/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line 933 in fill_symbuf
> > > > > > 
> > > > > > Core was generated by `ser'.
> > > > > > Program terminated with signal 11, Segmentation fault.
> > > > > > Reading symbols from /usr/lib/libc.so.4...done.
> > > > > > Reading symbols from /usr/local/lib/ser/modules/sl.so...done.
> > > > > > Reading symbols from /usr/local/lib/ser/modules/tm.so...done.
> > > > > > Reading symbols from /usr/local/lib/ser/modules/rr.so...done.
> > > > > > Reading symbols from /usr/local/lib/ser/modules/maxfwd.so...done.
> > > > > > Reading symbols from /usr/local/lib/ser/modules/usrloc.so...done.
> > > > > > Reading symbols from /usr/local/lib/ser/modules/registrar.so...done.
> > > > > > Reading symbols from /usr/local/lib/ser/modules/nathelper.so...done.
> > > > > > Reading symbols from /usr/local/lib/ser/modules/textops.so...done.
> > > > > > Reading symbols from /usr/local/lib/ser/modules/radius_auth.so...done.
> > > > > > Reading symbols from /usr/local/lib/libradiusclient.so.0...done.
> > > > > > Reading symbols from /usr/lib/libmd.so.2...done.
> > > > > > Reading symbols from /usr/lib/libcrypt.so.2...done.
> > > > > > Reading symbols from /usr/libexec/ld-elf.so.1...done.
> > > > > > #0  0x2a1b0cb3 in nodb_timer (_r=0x282ee1a8) at urecord.c:203
> > > > > > 203				ptr = ptr->next;
> > > > > > (gdb) print ptr
> > > > > > $1 = (ucontact_t *) 0x29647362
> > > > > > (gdb) print *ptr
> > > > > > $2 = {domain = 0x0, aor = 0x0, c = {s = 0x0, len = 0}, expires = 0, q = 0, callid = {s = 0x0, 
> > > > > >     len = 0}, cseq = 0, state = CS_NEW, next = 0x0, prev = 0x0}
> > > > > > (gdb) bt
> > > > > > #0  0x2a1b0cb3 in nodb_timer (_r=0x282ee1a8) at urecord.c:203
> > > > > > #1  0x2a1b02cc in timer_urecord (_r=0x282ee1a8) at urecord.c:333
> > > > > > #2  0x2a1aae28 in timer_udomain (_d=0x282eadc8) at udomain.c:311
> > > > > > #3  0x2a1a76d7 in synchronize_all_udomains () at dlist.c:211
> > > > > > #4  0x2a1af8c9 in timer (ticks=720, param=0x0) at ul_mod.c:234
> > > > > > #5  0x80735c9 in timer_ticker () at timer.c:118
> > > > > > #6  0x805e922 in main_loop () at main.c:654
> > > > > > #7  0x80611b1 in main (argc=1, argv=0xbfbffbe8) at main.c:1383
> > > > > > #8  0x804c5a6 in _start ()
> > > > > > (gdb) up
> > > > > > #1  0x2a1b02cc in timer_urecord (_r=0x282ee1a8) at urecord.c:333
> > > > > > 333		case NO_DB:         return nodb_timer(_r);
> > > > > > (gdb) print _r
> > > > > > $3 = (urecord_t *) 0x282ee1a8
> > > > > > (gdb) print *_r
> > > > > > $4 = {domain = 0x282ead78, aor = {s = 0x282ee1e8 "16045215277aa\"\r\nContent-Length: 0\r\n\r\n", 
> > > > > >     len = 11}, contacts = 0x282eea68, slot = 0x282eb188, d_ll = {prev = 0x282ee088, next = 0x0}, 
> > > > > >   s_ll = {prev = 0x0, next = 0x0}}
> > > > > > (gdb) up
> > > > > > #2  0x2a1aae28 in timer_udomain (_d=0x282eadc8) at udomain.c:311
> > > > > > 311			if (timer_urecord(ptr) < 0) {
> > > > > > (gdb) q
> > > > > > bash-2.05a$ exit
> > > > > > 
> > > > > > Script done on Wed Apr  2 11:47:14 2003
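
One observation on the trace above, offered as a guess and not a diagnosis: ptr is 0x29647362, which is nowhere near the other pointers shown (they all start with 0x282e), and on i386 those four bytes in memory order are printable ASCII. That would be consistent with string data having been written over a list pointer, but the trace alone does not prove it. A small sketch of the check, with ptr_as_ascii() being a hypothetical helper name:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a 32-bit value into its four bytes in little-endian memory order.
 * Non-printable bytes are shown as '.'. Returns 1 only if all four bytes
 * are printable ASCII, which hints that the "pointer" may really be text. */
int ptr_as_ascii(uint32_t value, char out[5])
{
    int all_printable = 1;
    int i;

    assert(out != NULL);
    for (i = 0; i < 4; i++) {
        unsigned char b = (unsigned char)((value >> (8 * i)) & 0xff);
        if (isprint(b)) {
            out[i] = (char)b;
        } else {
            out[i] = '.';
            all_printable = 0;
        }
    }
    out[4] = '\0';
    return all_printable;
}
```

For the value in the trace, ptr_as_ascii(0x29647362, buf) returns 1 and fills buf with "bsd)". For comparison, _r->contacts (0x282eea68) contains the non-printable byte 0xea, so it returns 0.
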
> > > > > 
> > > > 
> > > > 
> > > > _______________________________________________
> > > > Serusers mailing list
> > > > serusers at lists.iptel.org
> > > > http://lists.iptel.org/mailman/listinfo/serusers




