[Serusers] Crash in usrloc module [backtrace]
Maxim Sobolev
sobomax at portaone.com
Thu Apr 3 12:20:51 CEST 2003
We are using 0.8.10 (not a CVS snapshot) with a locally fixed radius module.
-Maxim
On Thu, Apr 03, 2003 at 12:43:52AM +0200, Jan Janak wrote:
> On 03-04 01:33, Maxim Sobolev wrote:
> > On Wed, Apr 02, 2003 at 11:15:59PM +0200, Jan Janak wrote:
> > > On 02-04 23:50, Maxim Sobolev wrote:
> > > > On Wed, Apr 02, 2003 at 10:33:36PM +0200, Jan Janak wrote:
> > > > > Hello,
> > > > >
> > > > > this is really strange, the structure could never be filled with zeroes
> > > > > under normal circumstances. At least domain and aor must be set to
> > > > > non-zero values. These fields are set automatically when the structure
> > > > > is created, and they must be non-zero, otherwise the structure would
> > > > > never be created.
> > > > >
> > > > > So either gdb doesn't show correct values or the memory has been
> > > > > corrupted somehow.
> > > >
> > > > Yes, it looks like that, because ptr is a valid pointer in the trace
> > > > and ptr->next shouldn't cause sig11. Maybe there is a problem with
> > > > locking? Is it possible that two or more processes would start
> > > > modifying the _r linked list simultaneously, thereby breaking its
> > > > integrity?
> > >
> > > I reviewed the locking and haven't found any problem. We have been
> > > running ser for a very long time without any problems (I think that
> > > a locking problem would show up on iptel.org - its registrar is very
> > > active).
> > >
> > > Could you please tar the sources along with the core dump (and log
> > > files if possible) and send it to me ? I currently have no clue why
> > > such a mysterious crash happened (it could even be a HW problem) but I'd
> > > like to review it later.
> >
> > No problem, I'll do it for you. Could the problem be related to the
> > fact that we are using auth_radius() to authenticate REGISTER requests
> > before allowing them in, while the Radius server talks to a MySQL database
> > and therefore sometimes adds a significant delay to processing (up to
> > several seconds)? In this case I think it is quite possible that the
> > expiration timer fires at the same moment a positive Radius reply
> > arrives and save() is called, causing bad things to happen.
> >
>
> Are you using a CVS snapshot ? I thought that you were using 0.8.10
> (from the info you sent me). The auth_radius module is not available for
> 0.8.10 (it was called radius_auth).
>
> The situation you describe is not possible. radius authentication is
> done in a different module and save() is executed after the
> authentication completes successfully. So it doesn't matter how long
> the radius query takes.
>
> Also, both the save() function and the timer are protected by a mutex so
> they cannot modify the contents of the user location database at the same time.
>
> What version of ser are you using ?
>
> Jan.
>
> > -Maxim
> >
> > >
> > > thanks, Jan.
> > >
> > > >
> > > > -Maxim
> > > >
> > > > >
> > > > > Is there anything suspicious in your log files ?
> > > > >
> > > > > regards, Jan.
> > > > >
> > > > > On 02-04 22:54, Maxim Sobolev wrote:
> > > > > > Hi,
> > > > > >
> > > > > > I've observed a rather mysterious crash in ser, see attached debug log.
> > > > > >
> > > > > > Any ideas what gives?
> > > > > >
> > > > > > -Maxim
> > > > >
> > > > > > Script started on Wed Apr 2 11:46:12 2003
> > > > > > bash-2.05a$ sudo gdb ~/PortaSIP/ser/work/ser-0.8.10/ser ser.core
> > > > > > GNU gdb 4.18 (FreeBSD)
> > > > > > Copyright 1998 Free Software Foundation, Inc.
> > > > > > GDB is free software, covered by the GNU General Public License, and you are
> > > > > > welcome to change it and/or distribute copies of it under certain conditions.
> > > > > > Type "show copying" to see the conditions.
> > > > > > There is absolutely no warranty for GDB. Type "show warranty" for details.
> > > > > > This GDB was configured as "i386-unknown-freebsd"...Deprecated bfd_read called at /usr/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line 2627 in elfstab_build_psymtabs
> > > > > > Deprecated bfd_read called at /usr/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line 933 in fill_symbuf
> > > > > >
> > > > > > Core was generated by `ser'.
> > > > > > Program terminated with signal 11, Segmentation fault.
> > > > > > Reading symbols from /usr/lib/libc.so.4...done.
> > > > > > Reading symbols from /usr/local/lib/ser/modules/sl.so...done.
> > > > > > Reading symbols from /usr/local/lib/ser/modules/tm.so...done.
> > > > > > Reading symbols from /usr/local/lib/ser/modules/rr.so...done.
> > > > > > Reading symbols from /usr/local/lib/ser/modules/maxfwd.so...done.
> > > > > > Reading symbols from /usr/local/lib/ser/modules/usrloc.so...done.
> > > > > > Reading symbols from /usr/local/lib/ser/modules/registrar.so...done.
> > > > > > Reading symbols from /usr/local/lib/ser/modules/nathelper.so...done.
> > > > > > Reading symbols from /usr/local/lib/ser/modules/textops.so...done.
> > > > > > Reading symbols from /usr/local/lib/ser/modules/radius_auth.so...done.
> > > > > > Reading symbols from /usr/local/lib/libradiusclient.so.0...done.
> > > > > > Reading symbols from /usr/lib/libmd.so.2...done.
> > > > > > Reading symbols from /usr/lib/libcrypt.so.2...done.
> > > > > > Reading symbols from /usr/libexec/ld-elf.so.1...done.
> > > > > > #0 0x2a1b0cb3 in nodb_timer (_r=0x282ee1a8) at urecord.c:203
> > > > > > 203 ptr = ptr->next;
> > > > > > (gdb) print ptr
> > > > > > $1 = (ucontact_t *) 0x29647362
> > > > > > (gdb) print *ptr
> > > > > > $2 = {domain = 0x0, aor = 0x0, c = {s = 0x0, len = 0}, expires = 0, q = 0, callid = {s = 0x0,
> > > > > > len = 0}, cseq = 0, state = CS_NEW, next = 0x0, prev = 0x0}
> > > > > > (gdb) bt
> > > > > > #0 0x2a1b0cb3 in nodb_timer (_r=0x282ee1a8) at urecord.c:203
> > > > > > #1 0x2a1b02cc in timer_urecord (_r=0x282ee1a8) at urecord.c:333
> > > > > > #2 0x2a1aae28 in timer_udomain (_d=0x282eadc8) at udomain.c:311
> > > > > > #3 0x2a1a76d7 in synchronize_all_udomains () at dlist.c:211
> > > > > > #4 0x2a1af8c9 in timer (ticks=720, param=0x0) at ul_mod.c:234
> > > > > > #5 0x80735c9 in timer_ticker () at timer.c:118
> > > > > > #6 0x805e922 in main_loop () at main.c:654
> > > > > > #7 0x80611b1 in main (argc=1, argv=0xbfbffbe8) at main.c:1383
> > > > > > #8 0x804c5a6 in _start ()
> > > > > > (gdb) up
> > > > > > #1 0x2a1b02cc in timer_urecord (_r=0x282ee1a8) at urecord.c:333
> > > > > > 333 case NO_DB: return nodb_timer(_r);
> > > > > > (gdb) print _r
> > > > > > $3 = (urecord_t *) 0x282ee1a8
> > > > > > (gdb) print *_r
> > > > > > $4 = {domain = 0x282ead78, aor = {s = 0x282ee1e8 "16045215277aa\"\r\nContent-Length: 0\r\n\r\n",
> > > > > > len = 11}, contacts = 0x282eea68, slot = 0x282eb188, d_ll = {prev = 0x282ee088, next = 0x0},
> > > > > > s_ll = {prev = 0x0, next = 0x0}}
> > > > > > (gdb) up
> > > > > > #2 0x2a1aae28 in timer_udomain (_d=0x282eadc8) at udomain.c:311
> > > > > > 311 if (timer_urecord(ptr) < 0) {
> > > > > > (gdb) q
> > > > > > bash-2.05a$ exit
> > > > > >
> > > > > > Script done on Wed Apr 2 11:47:14 2003
> > > > >
> > > >
> > > >
> > > > _______________________________________________
> > > > Serusers mailing list
> > > > serusers at lists.iptel.org
> > > > http://lists.iptel.org/mailman/listinfo/serusers
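
The memory corruption guess in the thread can be checked a little further using only the values printed in the gdb session. The sketch below is not ser code: the helper names `ptr_as_text` and `counted_copy` are hypothetical, and it assumes the little-endian byte order of the i386 host named in the gdb banner. The first helper renders the four bytes of the crashing `ptr` value (0x29647362) as characters; the second copies only `len` bytes of a counted string, on the assumption (suggested by the `{s = ..., len = 11}` output) that the `aor` field is a pointer plus length rather than a NUL-terminated string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write the four bytes of a 32-bit value into out,
 * lowest byte first (the order they occupy in memory on little-endian
 * i386). Non-printable bytes are shown as '.'. */
void ptr_as_text(uint32_t value, char out[5])
{
    assert(out != NULL);
    for (int i = 0; i < 4; i++) {
        unsigned char b = (unsigned char)((value >> (8 * i)) & 0xff);
        out[i] = (b >= 0x20 && b < 0x7f) ? (char)b : '.';
    }
    out[4] = '\0';
}

/* Hypothetical helper: copy at most len bytes of a counted string into
 * dst and NUL-terminate it, so that bytes stored after the counted part
 * are ignored. Returns the number of bytes copied. */
size_t counted_copy(const char *s, int len, char *dst, size_t dst_size)
{
    assert(s != NULL && dst != NULL && dst_size > 0 && len >= 0);
    size_t n = (size_t)len;
    if (n > dst_size - 1)
        n = dst_size - 1;
    memcpy(dst, s, n);
    dst[n] = '\0';
    return n;
}
```

Calling `ptr_as_text(0x29647362u, buf)` yields the text `bsd)`, so the pointer value consists entirely of printable characters and is not 4-byte aligned. That is consistent with the list pointer having been overwritten by string data, although it does not say which code did the overwriting. Calling `counted_copy` on the `aor` bytes shown by gdb with `len = 11` yields `16045215277`; under the counted-string assumption, the `aa"\r\nContent-Length: 0` tail in the gdb output is simply what follows the counted bytes in memory and is not by itself a sign of corruption.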