I attached an `strace` to the main TCP receiver process, and saw this preceding the crash:
```
recvmsg(10, {msg_name(0)=NULL, msg_iov(1)=[{"", 16}], msg_controllen=0, msg_flags=0}, MSG_DONTWAIT) = 0
```
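For context, a 0 return from a receive call on a stream socket normally means the peer closed the connection in an orderly way. A minimal Python sketch of that behaviour follows. It assumes fd 10 is a stream (TCP) socket, uses a local `socketpair()` as a stand-in for the real connection, and the function name `peer_closed_demo` is made up for illustration; none of this is Kamailio code.

```python
import socket


def peer_closed_demo():
    """Receive from a stream socket whose peer has already closed."""
    # socketpair() returns two connected stream sockets; they stand in
    # for the TCP connection on fd 10 in the trace (an assumption).
    ours, peer = socket.socketpair()
    peer.close()            # orderly shutdown from the remote side
    data = ours.recv(16)    # same 16-byte buffer size as in the trace
    ours.close()
    return data


if __name__ == "__main__":
    # An empty result is the Python-level equivalent of recvmsg() = 0.
    print(len(peer_closed_demo()))
```

If that is what happened here, the `epoll_ctl(..., EPOLL_CTL_DEL, 10, ...)` that follows would simply be the normal cleanup of a closed connection.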
It is then followed by this sequence of events:
```
sendto(5, "<26>Sep 26 13:09:01 /sbin/kamail"..., 99, MSG_NOSIGNAL, NULL, 0) = 99
epoll_ctl(63, EPOLL_CTL_DEL, 10, {EPOLLWRNORM|EPOLLHUP|EPOLLRDHUP|EPOLLET|0x39e9800, {u32=32767, u64=32767}}) = 0
epoll_wait(63, 2b9481eca8c8, 1006, 5000) = -1 EINTR (Interrupted system call)
--- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=628, si_uid=106} ---
exit_group(0) = ?
+++ exited with 0 +++
```
I'm still not sure what exactly is raising `SIGTERM`, but it thickens the plot: it appears that the TCP receiver process is dying because of a `SIGTERM` sent from elsewhere (PID 628, according to `si_pid` in the trace), rather than because of the 0 return value of `recvmsg()` per se. I assume this is related to a different child process dying (I think Kamailio kills all the other children upon receipt of a `SIGCHLD` from one of the workers, right?), but I haven't been able to get to the bottom of which child process is dying and why.
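One way to narrow this down, as a rough sketch: `si_pid=628` in the trace is the PID of the process that sent the signal, so looking that PID up under `/proc` while it is still alive should show its command line and parent. The helper name `describe_pid` is hypothetical, and the sketch assumes Linux with `/proc` mounted.

```python
from pathlib import Path


def describe_pid(pid):
    """Return (parent PID, command line) for a live PID, or None if it is gone."""
    proc = Path("/proc") / str(pid)
    try:
        raw = (proc / "cmdline").read_bytes()
        status = (proc / "status").read_text()
    except (FileNotFoundError, ProcessLookupError, PermissionError):
        # The process has exited, or we are not allowed to inspect it.
        return None
    # Arguments in /proc/<pid>/cmdline are NUL-separated.
    cmdline = raw.replace(b"\0", b" ").decode(errors="replace").strip()
    ppid = None
    for line in status.splitlines():
        if line.startswith("PPid:"):
            ppid = int(line.split()[1])
            break
    return ppid, cmdline


if __name__ == "__main__":
    # 628 is the si_pid value taken from the strace output above.
    print(describe_pid(628))
```

If PID 628 turns out to be the Kamailio main process, that would at least be consistent with the guess that it is shutting the children down after one of them died.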