[Serusers] Crash in usrloc module [backtrace]

Jan Janak jan at iptel.org
Thu Apr 3 00:43:52 CEST 2003


On 03-04 01:33, Maxim Sobolev wrote:
> On Wed, Apr 02, 2003 at 11:15:59PM +0200, Jan Janak wrote:
> > On 02-04 23:50, Maxim Sobolev wrote:
> > > On Wed, Apr 02, 2003 at 10:33:36PM +0200, Jan Janak wrote:
> > > > Hello,
> > > > 
> > > > this is really strange, the structure could never be filled with zeroes
> > > > under normal circumstances. At least domain and aor must be set to
> > > > non-zero value. After the structure is created these fields are automatically
> > > > set. And they must be non-zero otherwise the structure would never be
> > > > created.
> > > > 
> > > > So either gdb doesn't show correct values or the memory has been
> > > > corrupted somehow.
> > > 
> > > Yes, it looks like that, because ptr is a valid pointer in the trace
> > > and ptr->next shouldn't cause sig11. Maybe there is a problem with
> > > locking? Is it possible that two or more processes would start
> > > modifying the _r linked list simultaneously, thereby breaking its
> > > integrity?
> > 
> >   I reviewed the locking and haven't found any problem. We have been
> >   running ser for a very long time without any problems (I think that
> >   a locking problem would show up on iptel.org - its registrar is very
> >   active).
> > 
> >   Could you please tar the sources along with the core dump (and log
> >   files if possible) and send it to me ? I currently have no clue why
> >   such a mysterious crash happened (it could even be a HW problem) but I'd
> >   like to review it later.
> 
> No problem, I'll do it for you. Could the problem be related to the
> fact that we are using auth_radius() to authenticate REGISTER requests
> before allowing them in, while the Radius server talks to a MySQL database
> and therefore sometimes adds a significant delay to processing (up to
> several seconds)? In that case I think it is quite possible that the
> expiration timer fires at the same moment a positive Radius reply
> arrives and save() is called, causing bad things to happen.
> 

  Are you using a CVS snapshot ? I thought that you were using 0.8.10
  (from the info you sent me). The auth_radius module is not available for
  0.8.10 (it was called radius_auth).

  The situation you describe is not possible. Radius authentication is
  done in a different module, and save() is executed only after the
  authentication completes successfully. So it doesn't matter how long
  the Radius query takes.

  Both the save() function and the timer are protected by a mutex, so
  they cannot modify the contents of the user location database at the
  same time.
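
  A minimal sketch of the locking scheme described above, for
  illustration only. It is not the usrloc code: record_t, contact_t,
  record_save() and record_timer() are hypothetical names, and a pthread
  mutex is assumed here purely as a stand-in for whatever lock ser
  actually uses. The point is that the insert path and the expiration
  path take the same lock, so neither can see a half-updated list.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical, simplified contact; NOT the real ucontact_t. */
typedef struct contact {
    time_t expires;          /* absolute expiration time */
    struct contact *next;
} contact_t;

/* Hypothetical record holding a singly linked list of contacts. */
typedef struct record {
    pthread_mutex_t lock;    /* assumed stand-in for ser's own lock */
    contact_t *contacts;
} record_t;

void record_init(record_t *r)
{
    assert(r != NULL);
    pthread_mutex_init(&r->lock, NULL);
    r->contacts = NULL;
}

/* save()-like path: insert a contact at the head of the list. */
int record_save(record_t *r, time_t expires)
{
    assert(r != NULL);
    contact_t *c = malloc(sizeof(*c));
    if (c == NULL)
        return -1;
    c->expires = expires;

    pthread_mutex_lock(&r->lock);
    c->next = r->contacts;
    r->contacts = c;
    pthread_mutex_unlock(&r->lock);
    return 0;
}

/* Timer-like path: unlink and free expired contacts.
 * Returns the number of contacts removed. */
int record_timer(record_t *r, time_t now)
{
    assert(r != NULL);
    int removed = 0;

    pthread_mutex_lock(&r->lock);
    contact_t **pp = &r->contacts;
    while (*pp != NULL) {
        contact_t *c = *pp;
        if (c->expires <= now) {
            *pp = c->next;   /* unlink first, so c is never read after free */
            free(c);
            removed++;
        } else {
            pp = &c->next;
        }
    }
    pthread_mutex_unlock(&r->lock);
    return removed;
}
```

  With both paths serialized like this, a slow authentication step
  before record_save() only delays the insert; it cannot make the
  timer walk a list that is in the middle of being modified.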

  What version of ser are you using ?

     Jan.

> -Maxim
> 
> > 
> >     thanks, Jan.
> > 
> > > 
> > > -Maxim
> > > 
> > > > 
> > > > Is there anything suspicious in your log files ?
> > > > 
> > > >   regards, Jan.
> > > > 
> > > > On 02-04 22:54, Maxim Sobolev wrote:
> > > > > Hi,
> > > > > 
> > > > > I've observed a rather mysterious crash in ser, see attached debug log.
> > > > > 
> > > > > Any ideas what gives?
> > > > > 
> > > > > -Maxim
> > > > 
> > > > > Script started on Wed Apr  2 11:46:12 2003
> > > > > bash-2.05a$ sudo gdb ~/PortaSIP/ser/work/ser-0.8.10/ser ser.core
> > > > > GNU gdb 4.18 (FreeBSD)
> > > > > Copyright 1998 Free Software Foundation, Inc.
> > > > > GDB is free software, covered by the GNU General Public License, and you are
> > > > > welcome to change it and/or distribute copies of it under certain conditions.
> > > > > Type "show copying" to see the conditions.
> > > > > There is absolutely no warranty for GDB.  Type "show warranty" for details.
> > > > > This GDB was configured as "i386-unknown-freebsd"...Deprecated bfd_read called at /usr/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line 2627 in elfstab_build_psymtabs
> > > > > Deprecated bfd_read called at /usr/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line 933 in fill_symbuf
> > > > > 
> > > > > Core was generated by `ser'.
> > > > > Program terminated with signal 11, Segmentation fault.
> > > > > Reading symbols from /usr/lib/libc.so.4...done.
> > > > > Reading symbols from /usr/local/lib/ser/modules/sl.so...done.
> > > > > Reading symbols from /usr/local/lib/ser/modules/tm.so...done.
> > > > > Reading symbols from /usr/local/lib/ser/modules/rr.so...done.
> > > > > Reading symbols from /usr/local/lib/ser/modules/maxfwd.so...done.
> > > > > Reading symbols from /usr/local/lib/ser/modules/usrloc.so...done.
> > > > > Reading symbols from /usr/local/lib/ser/modules/registrar.so...done.
> > > > > Reading symbols from /usr/local/lib/ser/modules/nathelper.so...done.
> > > > > Reading symbols from /usr/local/lib/ser/modules/textops.so...done.
> > > > > Reading symbols from /usr/local/lib/ser/modules/radius_auth.so...done.
> > > > > Reading symbols from /usr/local/lib/libradiusclient.so.0...done.
> > > > > Reading symbols from /usr/lib/libmd.so.2...done.
> > > > > Reading symbols from /usr/lib/libcrypt.so.2...done.
> > > > > Reading symbols from /usr/libexec/ld-elf.so.1...done.
> > > > > #0  0x2a1b0cb3 in nodb_timer (_r=0x282ee1a8) at urecord.c:203
> > > > > 203				ptr = ptr->next;
> > > > > (gdb) print ptr
> > > > > $1 = (ucontact_t *) 0x29647362
> > > > > (gdb) print *ptr
> > > > > $2 = {domain = 0x0, aor = 0x0, c = {s = 0x0, len = 0}, expires = 0, q = 0, callid = {s = 0x0, 
> > > > >     len = 0}, cseq = 0, state = CS_NEW, next = 0x0, prev = 0x0}
> > > > > (gdb) bt
> > > > > #0  0x2a1b0cb3 in nodb_timer (_r=0x282ee1a8) at urecord.c:203
> > > > > #1  0x2a1b02cc in timer_urecord (_r=0x282ee1a8) at urecord.c:333
> > > > > #2  0x2a1aae28 in timer_udomain (_d=0x282eadc8) at udomain.c:311
> > > > > #3  0x2a1a76d7 in synchronize_all_udomains () at dlist.c:211
> > > > > #4  0x2a1af8c9 in timer (ticks=720, param=0x0) at ul_mod.c:234
> > > > > #5  0x80735c9 in timer_ticker () at timer.c:118
> > > > > #6  0x805e922 in main_loop () at main.c:654
> > > > > #7  0x80611b1 in main (argc=1, argv=0xbfbffbe8) at main.c:1383
> > > > > #8  0x804c5a6 in _start ()
> > > > > (gdb) up
> > > > > #1  0x2a1b02cc in timer_urecord (_r=0x282ee1a8) at urecord.c:333
> > > > > 333		case NO_DB:         return nodb_timer(_r);
> > > > > (gdb) print _r
> > > > > $3 = (urecord_t *) 0x282ee1a8
> > > > > (gdb) print *_r
> > > > > $4 = {domain = 0x282ead78, aor = {s = 0x282ee1e8 "16045215277aa\"\r\nContent-Length: 0\r\n\r\n", 
> > > > >     len = 11}, contacts = 0x282eea68, slot = 0x282eb188, d_ll = {prev = 0x282ee088, next = 0x0}, 
> > > > >   s_ll = {prev = 0x0, next = 0x0}}
> > > > > (gdb) up
> > > > > #2  0x2a1aae28 in timer_udomain (_d=0x282eadc8) at udomain.c:311
> > > > > 311			if (timer_urecord(ptr) < 0) {
> > > > > (gdb) q
> > > > > bash-2.05a$ exit
> > > > > 
> > > > > Script done on Wed Apr  2 11:47:14 2003
> > > > 
> > > 
> > > 
> > > _______________________________________________
> > > Serusers mailing list
> > > serusers at lists.iptel.org
> > > http://lists.iptel.org/mailman/listinfo/serusers
> 
> 