[sr-dev] [kamailio/kamailio] dmq_usrloc: fix deadlock (#911)

Julien Chavanton notifications at github.com
Wed Jan 4 16:59:04 CET 2017


While running the CI tests Kamctl / MI fifo was hanging, after further investigation it seems it was trying to acquire a lock while the DMK module was stuck using it.


Processes 
```
Process::  ID=0 PID=6 Type=main process - attendant
Process::  ID=1 PID=168 Type=udp receiver child=0 sock=0.0.0.0:5060 ! (172.19.0.12:5060) !
Process::  ID=2 PID=169 Type=udp receiver child=1 sock=0.0.0.0:5060 (172.19.0.12:5060) !
Process::  ID=3 PID=170 Type=slow timer 
Process::  ID=4 PID=171 Type=timer 
Process::  ID=5 PID=172 Type=secondary timer
Process::  ID=6 PID=173 Type=MI FIFO
Process::  ID=7 PID=174 Type=ctl handler !
Process::  ID=8 PID=175 Type=RTIMER SEC EXEC
Process::  ID=9 PID=176 Type=RTIMER SEC EXEC
Process::  ID=10 PID=177 Type=HTable Timer
Process::  ID=11 PID=178 Type=USRLOC Timer
Process::  ID=12 PID=179 Type=USRLOC Timer
Process::  ID=13 PID=180 Type=TIMER NH
Process::  ID=14 PID=181 Type=TIMER NH
Process::  ID=15 PID=182 Type=DMQ WORKER
Process::  ID=16 PID=183 Type=DMQ WORKER
Process::  ID=17 PID=184 Type=tcp main process
```

gdb /usr/sbin/kamailio 173
```
(gdb) bt
#0  0x0000785ee4f2f042 in __syscall6 (a6=<optimized out>, a5=<optimized out>, a4=<optimized out>, a3=<optimized out>, a2=<optimized out>, a1=<optimized out>, n=202) at ./arch/x86_64/syscall_arch.h:59
#1  syscall (n=n at entry=202) at src/misc/syscall.c:18
#2  0x0000785eb6ec670b in futex_get (lock=0x785eb79215f4) at ../../lib/kcore/../../mem/../futexlock.h:108
#3  0x0000785eb6ecc200 in lock_ulslot (_d=_d at entry=0x785eb7955690, i=i at entry=472189) at udomain.c:1077
#4  0x0000785eb6ec1947 in mi_usrloc_dump (cmd=<optimized out>, param=<optimized out>) at ul_mi.c:451
#5  0x0000785edc4c0fa1 in run_mi_cmd (t=0xa9bbcf24100, cmd=0xa9bbce892e0) at ../../lib/kmi/mi.h:77
#6  mi_fifo_server (fifo_stream=fifo_stream at entry=0xa9bbce8a5a0) at fifo_fnc.c:509
#7  0x0000785edc4c3317 in fifo_process (rank=rank at entry=1) at mi_fifo.c:270
#8  0x0000785edc4c53b4 in mi_child_init (rank=<optimized out>) at mi_fifo.c:234
#9  0x00000a9bb8e9b6e6 in init_mod_child (m=0x785edcf22008, rank=rank at entry=0) at sr_module.c:921
#10 0x00000a9bb8e9b4f0 in init_mod_child (m=0x785edcf22270, rank=rank at entry=0) at sr_module.c:918
#11 0x00000a9bb8e9b4f0 in init_mod_child (m=0x785edcf227f0, rank=rank at entry=0) at sr_module.c:918
#12 0x00000a9bb8e9b4f0 in init_mod_child (m=0x785edcf22d40, rank=rank at entry=0) at sr_module.c:918
#13 0x00000a9bb8e9b4f0 in init_mod_child (m=0x785edcf246d0, rank=rank at entry=0) at sr_module.c:918
#14 0x00000a9bb8e9b4f0 in init_mod_child (m=0x785edcf252f8, rank=rank at entry=0) at sr_module.c:918
```

gdb /usr/sbin/kamailio 182
```
(gdb) bt
#0  0x0000785ee4f2f042 in __syscall6 (a6=<optimized out>, a5=<optimized out>, a4=<optimized out>, a3=<optimized out>, a2=<optimized out>, a1=<optimized out>, n=202) at ./arch/x86_64/syscall_arch.h:59
#1  syscall (n=n at entry=202) at src/misc/syscall.c:18
#2  0x0000785eb6ec674e in futex_get (lock=0x785eb79215f4) at ../../lib/kcore/../../mem/../futexlock.h:121
#3  0x0000785eb6ece5f7 in get_urecord_by_ruid (_d=0x785eb7955690, _aorhash=1819751549, _ruid=_ruid at entry=0x785eb68201c0 <ci>, _r=_r at entry=0x7d7464b7e400, _c=_c at entry=0x7d7464b7e410) at udomain.c:1178
#4  0x0000785eb66187b3 in add_contact (ci=0x785eb68201c0 <ci>, aor=...) at usrloc_sync.c:72
#5  usrloc_dmq_handle_msg (msg=<optimized out>, resp=0x7d7464b7e520, node=0x785eb8d5b418) at usrloc_sync.c:429
#6  0x0000785eb682adcc in worker_loop (id=id at entry=0) at worker.c:105
#7  0x0000785eb682c56b in child_init (rank=<optimized out>) at dmq.c:293
#8  0x00000a9bb8e9b6e6 in init_mod_child (m=0x785edcf9fe78, rank=rank at entry=0) at sr_module.c:921
#9  0x00000a9bb8e9b4f0 in init_mod_child (m=0x785edcfa02e0, rank=0) at sr_module.c:918
#10 0x00000a9bb8eacbde in main_loop () at main.c:1682
#11 0x00000a9bb8e6249b in main (argc=<optimized out>, argv=<optimized out>) at main.c:2616
```

gdb /usr/sbin/kamailio 183
```
(gdb) bt
#0  0x0000785ee4f2f042 in __syscall6 (a6=<optimized out>, a5=<optimized out>, a4=<optimized out>, a3=<optimized out>, a2=<optimized out>, a1=<optimized out>, n=202) at ./arch/x86_64/syscall_arch.h:59
#1  syscall (n=n at entry=202) at src/misc/syscall.c:18
#2  0x0000785eb6ec670b in futex_get (lock=0x785eb79215f4) at ../../lib/kcore/../../mem/../futexlock.h:108
#3  0x0000785eb661807c in add_contact (ci=0x785eb68201c0 <ci>, aor=...) at usrloc_sync.c:64
#4  usrloc_dmq_handle_msg (msg=<optimized out>, resp=0x7d7464b7e520, node=0x785eb8d5b418) at usrloc_sync.c:429
#5  0x0000785eb682adcc in worker_loop (id=id at entry=1) at worker.c:105
#6  0x0000785eb682c56b in child_init (rank=<optimized out>) at dmq.c:293
#7  0x00000a9bb8e9b6e6 in init_mod_child (m=0x785edcf9fe78, rank=rank at entry=0) at sr_module.c:921
#8  0x00000a9bb8e9b4f0 in init_mod_child (m=0x785edcfa02e0, rank=0) at sr_module.c:918
#9  0x00000a9bb8eacbde in main_loop () at main.c:1682
#10 0x00000a9bb8e6249b in main (argc=<optimized out>, argv=<optimized out>) at main.c:2616
```

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/kamailio/kamailio/pull/911#issuecomment-270406869
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.sip-router.org/pipermail/sr-dev/attachments/20170104/d739783c/attachment-0001.html>


More information about the sr-dev mailing list