On 02/06/2021 18:25, Daniel-Constantin Mierla wrote:

Hello,

On 02.06.21 19:00, Trevor Hemsley wrote:
Hi

We've seen a few occurrences of this crash since we implemented kamailio 5.4.5 for our inbound call traffic.

To clarify: did you update to kamailio 5.4.5 from an older version, and did the crashes start after that? If yes, what was the previous version? I quickly checked and there were no major changes to the dialog module in the 5.4 branch.

If you started with this version, is it under heavy load? Have you captured the SIP traffic? It would be useful to see the SIP messages for such a call.


We were running an ancient kamailio before, I think 4.0.3 or something like that. This is a completely fresh install from scratch. As for load, it depends on your definition of heavy, but looking at the stats I think it's about 2.5 million calls in approximately 4 weekdays (plus 3 low-traffic weekend days). The one that crashed this morning at 07:xx has done 425k in the approximately 12 hours since. The box doesn't seem to be under heavy load from a Linux perspective: CPU usage under 20% (two cores), more than 1GB RAM free of the 2GB allocated, load average under 1.0 all the time.


Preceding the crash we always get a message in the logs about "dialog [dlg_hash.c:1182]: next_state_dlg(): bogus event 2 in state 5 for dlg" and whenever we see that message, we get the crash.

Jun  1 10:12:14 thissystem /usr/sbin/kamailio[20001]: CRITICAL: {2 1 INVITE 6787142-3831531134-1330894187@some.telco.domain} dialog [dlg_hash.c:1182]: next_state_dlg(): bogus event 2 in state 5 for dlg 0x7fae153d3cf0 [3973:6059] with clid '82608924-3831089984-1833452161@some.telco.domain' and tags '3831089984-684203260' ''
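
For anyone who wants to pull these lines out of syslog and match them against crash times, here is a minimal Python sketch. The regular expression is derived only from the single sample line above, so it may need adjusting for other log formats. The function name and the field names (`event`, `state`, `dlg`, `entry`, `id`, `callid`) are my own labels for illustration, not names taken from the kamailio source.

```python
import re

# Pattern built from the one sample log line quoted above (an assumption:
# other kamailio versions or log prefixes may format this differently).
BOGUS_EVENT_RE = re.compile(
    r"next_state_dlg\(\): bogus event (?P<event>\d+) in state (?P<state>\d+) "
    r"for dlg (?P<dlg>0x[0-9a-fA-F]+) \[(?P<entry>\d+):(?P<id>\d+)\] "
    r"with clid '(?P<callid>[^']*)'"
)


def parse_bogus_event(line):
    """Return the fields of a 'bogus event' line as a dict, or None if no match."""
    match = BOGUS_EVENT_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the purely numeric fields; the dlg pointer and Call-ID stay strings.
    for key in ("event", "state", "entry", "id"):
        fields[key] = int(fields[key])
    return fields


# The sample line from this thread, split only for readability.
SAMPLE = (
    "Jun  1 10:12:14 thissystem /usr/sbin/kamailio[20001]: CRITICAL: "
    "{2 1 INVITE 6787142-3831531134-1330894187@some.telco.domain} "
    "dialog [dlg_hash.c:1182]: next_state_dlg(): bogus event 2 in state 5 "
    "for dlg 0x7fae153d3cf0 [3973:6059] with clid "
    "'82608924-3831089984-1833452161@some.telco.domain' "
    "and tags '3831089984-684203260' ''"
)

if __name__ == "__main__":
    print(parse_bogus_event(SAMPLE))
```

Running each syslog line through `parse_bogus_event` and keeping the non-`None` results gives a list of Call-IDs to look for in a SIP capture.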

The crash happens on line 879 of https://github.com/kamailio/kamailio/blob/master/src/modules/dialog/dlg_db_handler.c

You are referencing code in the master branch, not the 5.4 branch, so unless you remapped the line numbers, they might not be the same.


The code is the same, at least as far as line 879 is concerned. And the source I've been referencing is the local copy that I built our packages from, I just figured that it would be more visible to use a link to the git copy rather than quoting bits out of the local copy.


You could also open an issue in the bug tracker to collect the details there and assist with troubleshooting.


https://github.com/kamailio/kamailio/issues/2757

P.S. I did try to ask on Freenode #kamailio, but I guess the channel is abandoned due to the recent Freenode kerfuffle. I also looked on libera.chat for an equivalent and could not find one.


Trevor Hemsley



