Hi folks,
I am having problems with locking in the timer routine (ser 0.8.10), particularly with my recent addition to nathelper, a UDP pinger. This piece of code is invoked periodically by the timer, retrieves the list of all currently registered contacts, and sends a short UDP message to each of them. The routine that retrieves all contacts locks each domain before accessing it, but apparently the locking doesn't work as expected.
Following is the dump of the debugging session:
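For readers unfamiliar with the buffer that get_all_ucontacts() fills, here is a minimal sketch of how a caller such as the pinger might walk it. The layout is an assumption inferred from the listing further down (an int length, then that many bytes of contact, repeated, ending with a zero length); the names walk_contacts, contact_cb and print_contact are hypothetical and not part of usrloc or nathelper.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical callback type: receives one contact (not NUL-terminated). */
typedef void (*contact_cb)(const char *s, int len);

/* Example callback: prints the contact. A real pinger would send a UDP
 * packet to the address here instead. */
static void print_contact(const char *s, int len)
{
    printf("%.*s\n", len, s);
}

/*
 * Walk a buffer assumed to hold [int len][len bytes]...[int 0].
 * Returns the number of contacts visited, or -1 if the buffer is
 * truncated or a length is negative or runs past the end.
 */
static int walk_contacts(const void *buf, size_t size, contact_cb cb)
{
    const char *cp = buf;
    const char *end = cp + size;
    int len, n = 0;

    while ((size_t)(end - cp) >= sizeof(len)) {
        memcpy(&len, cp, sizeof(len));  /* memcpy avoids unaligned reads */
        if (len == 0)
            return n;                   /* terminating zero length */
        cp += sizeof(len);
        if (len < 0 || (size_t)(end - cp) < (size_t)len)
            return -1;                  /* corrupt or truncated entry */
        cb(cp, len);
        cp += len;
        n++;
    }
    return -1;                          /* no terminator found */
}
```

A caller would pass the same buffer and size it handed to get_all_ucontacts(), for example walk_contacts(buf, sizeof(buf), print_contact). The bounds checks are there so that a corrupt length stops the walk instead of reading past the buffer.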
-bash-2.05b$ sudo gdb ~/PortaSIP/ser/work/ser-0.8.10/ser ser.core
GNU gdb 4.18 (FreeBSD)
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-unknown-freebsd"...
Deprecated bfd_read called at /usr/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line 2627 in elfstab_build_psymtabs
Deprecated bfd_read called at /usr/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line 933 in fill_symbuf
Core was generated by `ser'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /usr/lib/libc.so.4...done.
Reading symbols from /usr/local/lib/ser/modules/sl.so...done.
Reading symbols from /usr/local/lib/ser/modules/tm.so...done.
Reading symbols from /usr/local/lib/ser/modules/rr.so...done.
Reading symbols from /usr/local/lib/ser/modules/maxfwd.so...done.
Reading symbols from /usr/local/lib/ser/modules/usrloc.so...done.
Reading symbols from /usr/local/lib/ser/modules/registrar.so...done.
Reading symbols from /usr/local/lib/ser/modules/nathelper.so...done.
Reading symbols from /usr/local/lib/ser/modules/textops.so...done.
Reading symbols from /usr/local/lib/ser/modules/radius_auth.so...done.
Reading symbols from /usr/local/lib/libradiusclient.so.0...done.
Reading symbols from /usr/lib/libmd.so.2...done.
Reading symbols from /usr/lib/libcrypt.so.2...done.
Reading symbols from /usr/libexec/ld-elf.so.1...done.
#0  0x2a1a62df in get_all_ucontacts (buf=0x80d5248, len=1402) at dlist.c:110
110                                     if (c->c.len <= 0)
(gdb) bt
#0  0x2a1a62df in get_all_ucontacts (buf=0x80d5248, len=1402) at dlist.c:110
#1  0x2a1c4b3e in _init () from /usr/local/lib/ser/modules/nathelper.so
#2  0x8073679 in timer_ticker () at timer.c:118
#3  0x805e912 in main_loop () at main.c:654
#4  0x80611a1 in main (argc=1, argv=0xbfbffdb8) at main.c:1383
#5  0x804c5a6 in _start ()
(gdb) print c
$1 = (ucontact_t *) 0x460a0d30
(gdb) print *c
Cannot access memory at address 0x460a0d30.
(gdb) print *r
$2 = {domain = 0x282ec0d8, aor = {s = 0x282f4638 "011801", len = 6},
  contacts = 0x282f5cb8, slot = 0x282ed418,
  d_ll = {prev = 0x282ef0d8, next = 0x0}, s_ll = {prev = 0x0, next = 0x0}}
(gdb) print *r->contacts
$3 = {domain = 0x7a3d6863, aor = 0x34476839,
  c = {s = 0x37364b62 <Address 0x37364b62 out of bounds>, len = 959328819},
  expires = 825243494, q = 2.12359957e+20,
  callid = {s = 0x32663238 <Address 0x32663238 out of bounds>, len = 1714774885},
  cseq = 1631019574, state = 1631020084,
  user_agent = {s = 0x62386438 <Address 0x62386438 out of bounds>, len = 775107636},
  next = 0x460a0d30, prev = 0x3a6d6f72}
(gdb) print *r->contacts->next
Cannot access memory at address 0x460a0d30.
(gdb) l 100
95              void *cp;
96              int shortage;
97
98              cp = buf;
99              shortage = 0;
100             /* Reserve space for terminating 0000 */
101             len -= sizeof(c->c.len);
102             for (p = root; p != NULL; p = p->next) {
103                     lock_udomain(p->d);
104                     if (p->d->d_ll.n <= 0) {
(gdb) l
105                             unlock_udomain(p->d);
106                             continue;
107                     }
108                     for (r = p->d->d_ll.first; r != NULL; r = r->d_ll.next) {
109                             for (c = r->contacts; c != NULL; c = c->next) {
110                                     if (c->c.len <= 0)
111                                             continue;
112                                     if (len >= (int)(sizeof(c->c.len) + c->c.len)) {
113                                             memcpy(cp, &c->c.len, sizeof(c->c.len));
114                                             cp += sizeof(c->c.len);
As you can see, we have locked the domain in question (line 103), but still found one of the records in an inconsistent state (its contacts aren't initialized).
Does anyone have any ideas about what could be wrong here? I am seeing a similar problem in the code that periodically expires contacts.
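One way to look closer at the garbage in $3 is to decode the hex words as the bytes they occupy in little-endian (i386) memory. The sketch below does that for the first three fields of the dump; the helper names word_to_ascii and decode_words are hypothetical. Those three words decode to "ch=z9hG4bK67", which contains "z9hG4bK", the cookie that RFC 3261 puts at the start of a Via branch parameter, and prev (0x3a6d6f72) decodes to "rom:". That might hint that the record was overwritten with SIP message text rather than never initialized, though it is only a reading of the dump, not a confirmed diagnosis.

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/*
 * Hypothetical helper: render a 32-bit word as the four bytes it occupies
 * in little-endian memory. Non-printable bytes are shown as '.'.
 */
static void word_to_ascii(uint32_t w, char out[5])
{
    int i;

    for (i = 0; i < 4; i++) {
        unsigned char b = (unsigned char)((w >> (8 * i)) & 0xff);
        out[i] = (b >= 0x20 && b < 0x7f) ? (char)b : '.';
    }
    out[4] = '\0';
}

/* Decode n consecutive words into dst, which must hold 4 * n + 1 bytes. */
static void decode_words(const uint32_t *w, size_t n, char *dst)
{
    size_t i;

    for (i = 0; i < n; i++)
        word_to_ascii(w[i], dst + 4 * i);
}

/* First three fields of $3, in the order gdb printed them. */
static const uint32_t dump_words[] = {
    0x7a3d6863, /* domain */
    0x34476839, /* aor    */
    0x37364b62, /* c.s    */
};
```

Calling decode_words(dump_words, 3, out) with a 13-byte out buffer and printing out gives ch=z9hG4bK67. The shifts extract bytes by value, so the result does not depend on the byte order of the machine running the decoder.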
-Maxim
Maxim,
Do you experience the same problem without your extensions? Are you able to reproduce it? Could you try a CVS snapshot?
I have never seen such a problem. If there is a way to reproduce it, could you send me a description?
Jan.
On 29-05 00:44, Maxim Sobolev wrote:
[...]
Serusers mailing list serusers@lists.iptel.org http://lists.iptel.org/mailman/listinfo/serusers