Hi Daniel,
Today it happened again. I have more information:
We are currently running 4 Debian 9 servers with Kamailio v5:
KamailioA & KamailioB
KamailioC & KamailioD
Servers A/B only handle SSL offloading and load balancing/failover between servers C/D
(using dispatcher with a minimal config).
Flow would be:
User <-> KamailioA/B <-> KamailioC/D <-> ...
Today KamailioB stopped replying, and I got the backtraces.
From KamailioA: 0 problems
From KamailioB:
```
root:~/bt2# grep DISPATCHER /var/log/kamailio/kamailio.log
Jul 1 15:03:06 13cn4 sbc[14833]: WARNING: <script>: [DISPATCHER] - Destination down: OPTIONS sip:A.A.A.A:5060 (<null>)
Jul 1 15:03:06 13cn4 sbc[14833]: WARNING: <script>: [DISPATCHER] - Destination down: OPTIONS sip:B.B.B.B:5060 (<null>)
Jul 1 15:39:50 13cn4 sbc[14818]: WARNING: <script>: [DISPATCHER] - Destination up: OPTIONS sip:A.A.A.A:5060
Jul 1 15:39:50 13cn4 sbc[14818]: WARNING: <script>: [DISPATCHER] - Destination up: OPTIONS sip:B.B.B.B:5060
root:~/bt2#
```
A.A.A.A and B.B.B.B are KamailioC/D.
Note that this only happened on KamailioB; KamailioA had 0 problems.
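To put a number on how long dispatcher considered each destination down, the down/up pairs in the log can be matched per destination. This is only a rough sketch: `dispatcher_outages` is a hypothetical helper name, and it assumes each log entry sits on a single line, that the time is the third whitespace-separated field, and that the down and up events fall on the same day.

```shell
#!/bin/sh
# Hypothetical helper: read kamailio.log lines on stdin and print, per
# destination, the seconds between "Destination down" and "Destination up".
# Assumes one log entry per line and no midnight rollover.
dispatcher_outages() {
  awk '
    # Convert HH:MM:SS to seconds since midnight.
    function secs(t,  p) { split(t, p, ":"); return p[1] * 3600 + p[2] * 60 + p[3] }
    /\[DISPATCHER\] - Destination (down|up):/ {
      state = ""; uri = ""
      for (i = 1; i <= NF; i++) {
        if ($i == "down:" || $i == "up:") state = $i
        if ($i == "OPTIONS") uri = $(i + 1)   # the URI follows the method name
      }
      if (state == "down:") {
        down[uri] = secs($3)
      } else if (state == "up:" && (uri in down)) {
        printf "%s down for %d s\n", uri, secs($3) - down[uri]
        delete down[uri]
      }
    }'
}
```

It could be run as `dispatcher_outages < /var/log/kamailio/kamailio.log`. For the entries above, this works out to 2204 seconds (about 37 minutes) for each of the two destinations.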
When I connected to the server, I did the following:
```
# for i in `kamctl ps | grep PID | awk '{print $2}' | tr -d ","`; do gdb /usr/sbin/kamailio -ex "bt full" --batch $i >> $i.txt 2>&1; done
```
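For the next occurrence, the same collection step could be wrapped in a small script so it is quicker to run and the output lands in a timestamped directory. This is a sketch only: the function names and the output directory naming are hypothetical, and the PID extraction assumes `kamctl ps` prints lines of the form `"PID": 14816,`, which is what the pipeline above implies.

```shell
#!/bin/sh
# Extract PIDs from `kamctl ps` output read on stdin.
# Assumes lines such as:   "PID": 14816,
extract_pids() {
  awk '/PID/ { gsub(/[^0-9]/, "", $2); if ($2 != "") print $2 }'
}

# Hypothetical helper: write one "bt full" backtrace file per Kamailio PID.
# The optional first argument is the output directory.
dump_backtraces() {
  outdir=${1:-bt-$(date +%Y%m%d_%H%M%S)}
  mkdir -p "$outdir" || return 1
  for pid in $(kamctl ps | extract_pids); do
    # Same gdb invocation as above, one output file per process.
    gdb /usr/sbin/kamailio -ex "bt full" --batch "$pid" > "$outdir/$pid.txt" 2>&1
  done
}
```

Calling `dump_backtraces` with no arguments would create a directory named after the current time, which also records when the capture was taken.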
I got one backtrace per PID. However, right after I ran that command, traffic started
flowing again (I didn't restart anything), hence the timestamps:
```
root:~/bt2# ls -lh
total 356K
-rw-r--r-- 1 root root 1.8K Jul 1 15:39 14816.txt
-rw-r--r-- 1 root root 15K Jul 1 15:39 14817.txt
-rw-r--r-- 1 root root 15K Jul 1 15:39 14818.txt
-rw-r--r-- 1 root root 15K Jul 1 15:39 14819.txt
-rw-r--r-- 1 root root 48K Jul 1 15:39 14820.txt
-rw-r--r-- 1 root root 47K Jul 1 15:39 14821.txt
-rw-r--r-- 1 root root 15K Jul 1 15:39 14822.txt
-rw-r--r-- 1 root root 15K Jul 1 15:39 14823.txt
-rw-r--r-- 1 root root 15K Jul 1 15:39 14824.txt
-rw-r--r-- 1 root root 3.4K Jul 1 15:39 14825.txt
-rw-r--r-- 1 root root 3.4K Jul 1 15:39 14826.txt
-rw-r--r-- 1 root root 3.4K Jul 1 15:39 14827.txt
-rw-r--r-- 1 root root 3.4K Jul 1 15:39 14828.txt
-rw-r--r-- 1 root root 3.4K Jul 1 15:39 14829.txt
-rw-r--r-- 1 root root 3.4K Jul 1 15:39 14830.txt
-rw-r--r-- 1 root root 3.4K Jul 1 15:39 14831.txt
-rw-r--r-- 1 root root 3.4K Jul 1 15:39 14832.txt
-rw-r--r-- 1 root root 42K Jul 1 15:39 14833.txt
-rw-r--r-- 1 root root 1.8K Jul 1 15:39 14834.txt
-rw-r--r-- 1 root root 2.2K Jul 1 15:39 14835.txt
-rw-r--r-- 1 root root 4.0K Jul 1 15:39 14836.txt
-rw-r--r-- 1 root root 2.9K Jul 1 15:39 14837.txt
-rw-r--r-- 1 root root 2.8K Jul 1 15:39 14838.txt
-rw-r--r-- 1 root root 4.7K Jul 1 15:39 14839.txt
-rw-r--r-- 1 root root 2.7K Jul 1 15:39 14840.txt
-rw-r--r-- 1 root root 2.7K Jul 1 15:39 14841.txt
-rw-r--r-- 1 root root 2.7K Jul 1 15:39 14842.txt
-rw-r--r-- 1 root root 7.5K Jul 1 15:39 14843.txt
-rw-r--r-- 1 root root 2.7K Jul 1 15:39 14844.txt
-rw-r--r-- 1 root root 2.7K Jul 1 15:39 14845.txt
-rw-r--r-- 1 root root 7.5K Jul 1 15:39 14846.txt
-rw-r--r-- 1 root root 2.7K Jul 1 15:39 14847.txt
-rw-r--r-- 1 root root 7.5K Jul 1 15:39 14848.txt
-rw-r--r-- 1 root root 4.2K Jul 1 15:39 14849.txt
root:~/bt2#
```
[backtrace_20170701_1539.tar.gz](https://github.com/kamailio/kamailio/files/…
I have looked at the backtraces, but to me they seem OK (much like the previous
ones, where Kamailio is just waiting for new requests).
So at this point my assumption is that something is causing dispatcher to see both
nodes as down, and it therefore stops processing traffic. When dispatcher sees the nodes
up again, everything starts working.
Let me know what you think Daniel and how I can investigate this further.
I also have tcpdump captures from that time. I will check whether OPTIONS requests were
actually being sent out, to try to narrow this down a little more.
Thanks!
Joel.