[sr-dev] [kamailio/kamailio] deadlock in ims_registrar_scscf (#1647)

Sergey Zyrianov notifications at github.com
Tue Sep 18 09:26:39 CEST 2018


### Description

S-CSCF stops processing REGISTER and INVITE requests.

### Troubleshooting
GDB stack traces showed several processes blocked in lock_udomain(), unable to acquire the lock.  
Additional debug logging pointed to lookup() in ims_registrar_scscf (lookup.c:107).  
This function calls lock_udomain() for one "slot" but releases a different one.

lookup() [locks](https://github.com/kamailio/kamailio/blob/48de203fda213749ac1e6fdb081c22dd701f85c4/src/modules/ims_registrar_scscf/lookup.c#L107) the domain using the current value of **aor**, but before the domain is unlocked at lookup.c:209, that aor is [changed](https://github.com/kamailio/kamailio/blob/48de203fda213749ac1e6fdb081c22dd701f85c4/src/modules/ims_registrar_scscf/lookup.c#L134). The aor points to the URI that is rewritten, and therefore the wrong slot is [unlocked](https://github.com/kamailio/kamailio/blob/48de203fda213749ac1e6fdb081c22dd701f85c4/src/modules/ims_registrar_scscf/lookup.c#L209). The originally locked slot is never released, so any later lock_udomain() call that maps to it blocks. In the log below, the locked slot is 461 while the unlocked slot is 427.

9(791) DEBUG: ims_registrar_scscf [lookup.c:90]: lookup(): looking for any type of terminal
9(791) DEBUG: ims_registrar_scscf [lookup.c:103]: lookup(): Looking for tel:+46xxxxxxxxxx
9(791) DEBUG: ims_usrloc_scscf [udomain.c:445]: lock_ulslot(): LOCKING UDOMAIN SLOT [**461**]
9(791) DEBUG: ims_registrar_scscf [lookup.c:119]: lookup(): Found a valid contact [sip:10.110.2.199:5060;alias=10.110.2.19950602]
9(791) DEBUG: [core/parser/parse_rr.c:464]: get_path_dst_uri(): path for branch: 'sip:term@pcscf.yyy.xxx.3gppnetwork.org;lr'
9(791) DEBUG: ims_usrloc_scscf [udomain.c:465]: unlock_ulslot(): UN-LOCKING UDOMAIN SLOT [**427**]

#### Reproduction

<!--
If the issue can be reproduced, describe how it can be done.
-->

#### Debugging Data

```
(gdb) bt
#0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
#1  0x00007fecd3f61dbd in futex_get (lock=0x7feccf6a2214) at ../../core/mem/../futexlock.h:108
#2  0x00007fecd3f6832e in lock_ulslot (_d=0x7feccf70a718, i=461) at udomain.c:450
#3  0x00007fecd3f68273 in lock_udomain (_d=0x7feccf70a718, _aor=0x7feccf72e5a8) at udomain.c:424
#4  0x00007fecd3ccd375 in update_contacts (msg=0x7fecd3f1b6c0 <_pv_treq>, _d=0x7feccf70a718, public_identity=0x7feccf73e0c8, assignment_type=2, s=0x7fff31beac58, ccf1=0x7fff31beac80, ccf2=0x7fff31beac90, ecf1=0x7fff31beaca0, ecf2=0x7fff31beacb0, contact_header=0x7feccf73e0e0) at save.c:886
#5  0x00007fecd3c6cbe5 in async_cdp_callback (is_timeout=0, param=0x7feccf73e0a0, saa=0x7feccf7b9430, elapsed_msecs=59) at cxdx_sar.c:252
#6  0x00007fecd460fb18 in api_callback (p=0x7feccf69cd18, msg=0x7feccf7b9430, ptr=0x0) at api_process.c:120
#7  0x00007fecd46868aa in worker_process (id=0) at worker.c:346
#8  0x00007fecd464636b in diameter_peer_start (blocking=0) at diameter_peer.c:242
#9  0x00007fecd4622666 in cdp_child_init (rank=0) at cdp_mod.c:243
#10 0x00000000005d2fbc in init_mod_child (m=0x7fecd940f8a0, rank=0) at core/sr_module.c:943
#11 0x00000000005d2c5e in init_mod_child (m=0x7fecd9410550, rank=0) at core/sr_module.c:939
#12 0x00000000005d2c5e in init_mod_child (m=0x7fecd9410958, rank=0) at core/sr_module.c:939
#13 0x00000000005d2c5e in init_mod_child (m=0x7fecd9410d68, rank=0) at core/sr_module.c:939
#14 0x00000000005d2c5e in init_mod_child (m=0x7fecd9411560, rank=0) at core/sr_module.c:939
#15 0x00000000005d2c5e in init_mod_child (m=0x7fecd9411aa0, rank=0) at core/sr_module.c:939
#16 0x00000000005d2c5e in init_mod_child (m=0x7fecd9411f00, rank=0) at core/sr_module.c:939
#17 0x00000000005d3390 in init_child (rank=0) at core/sr_module.c:970
#18 0x000000000042539c in main_loop () at main.c:1701
#19 0x000000000042bdd5 in main (argc=6, argv=0x7fff31beb998) at main.c:2638
```

#### Log Messages

```
9(791) DEBUG: ims_registrar_scscf [lookup.c:90]: lookup(): looking for any type of terminal
9(791) DEBUG: ims_registrar_scscf [lookup.c:103]: lookup(): Looking for tel:+46xxxxxxxxxx
9(791) DEBUG: ims_usrloc_scscf [udomain.c:445]: lock_ulslot(): LOCKING UDOMAIN SLOT [461]
9(791) DEBUG: ims_registrar_scscf [lookup.c:119]: lookup(): Found a valid contact [sip:10.110.2.199:5060;alias=10.110.2.19950602]
9(791) DEBUG: [core/parser/parse_rr.c:464]: get_path_dst_uri(): path for branch: 'sip:term@pcscf.yyy.xxx.3gppnetwork.org;lr'
9(791) DEBUG: ims_usrloc_scscf [udomain.c:465]: unlock_ulslot(): UN-LOCKING UDOMAIN SLOT [427]
```

### Possible Solutions

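Make a copy of the aor string before the URI is rewritten, and use that copy for both the lock and the unlock call so that both resolve to the same slot.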
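
The failure mode and the proposed fix can be illustrated with a small stand-alone sketch. It is not the module's code: `toy_msg`, `toy_slot()`, `toy_lock()`, `toy_unlock()`, `toy_rewrite_uri()` and the byte-sum hash are hypothetical stand-ins, and the slot count of 512 is an assumption. Only the shape of the `str` type (pointer plus length) follows Kamailio.

```c
#include <stdlib.h>
#include <string.h>

/* Same shape as Kamailio's str: pointer plus length, not NUL-terminated. */
typedef struct { char *s; int len; } str;

#define SLOTS 512                 /* assumed table size, for illustration only */
static int slot_locked[SLOTS];    /* 1 = slot held, 0 = slot free */

/* Toy hash (byte sum), NOT the hash used by ims_usrloc_scscf. */
static unsigned int toy_slot(const str *aor) {
    unsigned int sum = 0;
    for (int i = 0; i < aor->len; i++) sum += (unsigned char)aor->s[i];
    return sum % SLOTS;
}

static void toy_lock(const str *aor)   { slot_locked[toy_slot(aor)] = 1; }
static void toy_unlock(const str *aor) { slot_locked[toy_slot(aor)] = 0; }

/* Number of slots that are still held. */
static int count_locked(void) {
    int n = 0;
    for (int i = 0; i < SLOTS; i++) n += slot_locked[i];
    return n;
}

/* Hypothetical request whose URI can be rewritten to the found contact. */
struct toy_msg { str uri; };

static void toy_rewrite_uri(struct toy_msg *m, char *contact) {
    m->uri.s = contact;
    m->uri.len = (int)strlen(contact);
}

/* Buggy pattern: aor aliases the URI, so it changes when the URI is rewritten. */
static int lookup_buggy(struct toy_msg *m, char *contact) {
    str *aor = &m->uri;
    toy_lock(aor);                 /* slot derived from the original AOR */
    toy_rewrite_uri(m, contact);   /* aor now refers to the contact URI */
    toy_unlock(aor);               /* slot derived from the contact: wrong one */
    return count_locked();
}

/* Fixed pattern: lock and unlock with a private copy of the AOR. */
static int lookup_fixed(struct toy_msg *m, char *contact) {
    str aor_copy;
    aor_copy.len = m->uri.len;
    aor_copy.s = malloc((size_t)aor_copy.len + 1);
    if (aor_copy.s == NULL) return -1;
    memcpy(aor_copy.s, m->uri.s, (size_t)aor_copy.len);

    toy_lock(&aor_copy);
    toy_rewrite_uri(m, contact);   /* the copy is unaffected by the rewrite */
    toy_unlock(&aor_copy);         /* same bytes, same slot */
    free(aor_copy.s);
    return count_locked();
}
```

Both functions return the number of slots still held after the call. `lookup_fixed()` always leaves none held, while `lookup_buggy()` leaves one held whenever the AOR and the contact map to different slots. A deep copy is used here because a shallow copy of the `str` would only be safe if the original buffer is never overwritten or freed while the lock is held.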

### Additional Information

* **Kamailio Version** (output of `kamailio -v`):

```
version: kamailio 5.1.5 (x86_64/linux) 
flags: STATS: Off, USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MEM, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES
ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144 MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: unknown 
compiled with gcc 4.8.2
```

* **Operating System**:

```
Linux scscf 4.14.67-coreos #1 SMP Mon Sep 10 23:14:26 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
```

