I attached strace to the main TCP receiver process and saw this immediately before the crash:

recvmsg(10, {msg_name(0)=NULL, msg_iov(1)=[{"", 16}], msg_controllen=0, msg_flags=0}, MSG_DONTWAIT) = 0
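For reference, a return value of 0 from recvmsg() on a stream socket normally means the peer closed the connection in an orderly way, as opposed to "no data yet", which shows up as EAGAIN under MSG_DONTWAIT. A minimal sketch of the difference follows; it is not Kamailio code, and it assumes an AF_UNIX socketpair as a stand-in for the real TCP connection. The function name is purely illustrative.

```python
import socket


def recv_after_peer_close():
    """Show EAGAIN on an idle socket versus a 0-byte read after the peer closes."""
    # Assumption: a local stream socketpair stands in for the TCP connection.
    a, b = socket.socketpair()
    try:
        try:
            # Nothing has been sent yet, so a non-blocking read cannot succeed.
            b.recvmsg(16, 0, socket.MSG_DONTWAIT)
            idle = "data"
        except BlockingIOError:
            idle = "EAGAIN"

        # The peer closes its end: an orderly shutdown.
        a.close()

        # The same non-blocking read now returns immediately with zero bytes.
        data, _ancdata, _flags, _addr = b.recvmsg(16, 0, socket.MSG_DONTWAIT)
        return idle, len(data)
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(recv_after_peer_close())
```

If that reading applies here, the recvmsg() line above would simply be the remote side hanging up on fd 10.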

That zero return is then followed by this sequence of events:

sendto(5, "<26>Sep 26 13:09:01 /sbin/kamail"..., 99, MSG_NOSIGNAL, NULL, 0) = 99
epoll_ctl(63, EPOLL_CTL_DEL, 10, {EPOLLWRNORM|EPOLLHUP|EPOLLRDHUP|EPOLLET|0x39e9800, {u32=32767, u64=32767}}) = 0
epoll_wait(63, 2b9481eca8c8, 1006, 5000) = -1 EINTR (Interrupted system call)
--- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=628, si_uid=106} ---
exit_group(0)                           = ?
+++ exited with 0 +++

I'm still not sure what exactly is raising SIGTERM, but it thickens the plot: the TCP receiver process appears to be dying because of a SIGTERM sent from elsewhere, rather than because of the 0 return value of recvmsg() per se. I assume this is related to a different child process dying (I think Kamailio kills all the other children upon receipt of a SIGCHLD from one of the workers, right?), but I haven't been able to get to the bottom of which child process is dying and why.
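One lead that is already in the trace: si_code=SI_USER means the signal was sent with kill(2), and si_pid=628 is the PID of the sender, so whatever process had PID 628 at that moment is the one that sent the SIGTERM. A minimal sketch of how those siginfo fields behave follows; it is not Kamailio code, the function name is illustrative, and it assumes Linux and a single-threaded process that signals itself.

```python
import os
import signal


def sender_of_sigterm():
    """Send SIGTERM to ourselves and read back who the kernel says sent it."""
    # Block SIGTERM so it stays pending instead of terminating the process.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGTERM})
    try:
        # Equivalent to another process calling kill(our_pid, SIGTERM).
        os.kill(os.getpid(), signal.SIGTERM)

        # Dequeue the pending signal together with its siginfo structure.
        info = signal.sigwaitinfo({signal.SIGTERM})
        return info.si_signo, info.si_pid, info.si_uid
    finally:
        # Restore the signal mask that was in place before the call.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


if __name__ == "__main__":
    signo, pid, uid = sender_of_sigterm()
    print(f"signal {signo} sent by pid {pid} (uid {uid})")
```

Here the reported PID is our own because the process signals itself. In the trace above it would be PID 628, running as uid 106.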

