Hello,
do you have proper retransmission detection in the kamailio.cfg? If you
look at etc/kamailio.cfg with the latest master, it uses
t_precheck_trans() to detect the retransmission in early phase and not
allow parallel processing of the same request.
If routing of BYE doesn't depend on any response from your socket
communication in event route dialog:end, maybe you can redesign a bit
and do that in a dedicated process using mqueue+rtimer, so you avoid
blocking from sip routing workers.
Cheers,
Daniel
On 03.07.17 13:31, Paul Claudiu Boriga wrote:
Hello,
We are experiencing an issue when having some blocking code in the
dialog:end event route. If the code takes too long to execute then CDR
information is not logged.
In our case we have a blocking socket trying to establish a connection
and waiting 500ms before failing. When it timeouts, the accounting
module will never log any CDR logs for that call. It appears that the
callbacks for DLGCB_TERMINATED_CONFIRMED event are never called. That
is because the function dlg_terminated_confirmed is never called (this
function is registered as a callback to the TMCB_RESPONSE_OUT event
from tm module).
After further investigation, the problem seems to be related to a race
condition between two processes when dealing with ending a dialog:
From what we saw the dialog:end event route is treated when a BYE
request is received. But if that takes too long, and that BYE request
will be retransmitted (we are using SIPp to test this, and SIPp will
automatically retransmit the message), it will then be treated by a
different process.
When this happens, the first process has already modified the dialog
state, so the second process will not execute code specific to dialog
ending from the dialog module dlg_onroute function. It will not
execute the event route, nor the DLGCB_TERMINATED event callbacks, and
it will neither register the dlg_terminated_confirmed function to the
TMCB_RESPONSE_OUT event. This means that the second process will
actually finish processing the BYE faster then the first process. By
the time the first process finishes with processing the dialog:end
event route, the transaction corresponding to the BYE request is
already created, and even though dlg_terminated_confirmed function is
eventually registered for the TMCB_RESPONSE_OUT event, it is too late
because this event was already treated, and all callbacks registered
for it were called. This means that DLGCB_TERMINATED_CONFIRMED event
will never be treated, and the registered callbacks for this event
(including the accounting module functions) will not get called.
We managed to work around this behavior by decreasing the timeout of
the socket so that the BYE can be processed before a retransmit
occurs, but maybe there is a different solution to fix this issue. Do
you have any suggestions on how to solve this problem?
Thank you,
Claudiu Boriga.
_______________________________________________
Kamailio (SER) - Development Mailing List
sr-dev(a)lists.kamailio.org
https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-dev
--
Daniel-Constantin Mierla
www.twitter.com/miconda --
www.linkedin.com/in/miconda
Kamailio Advanced Training -
www.asipto.com
Kamailio World Conference -
www.kamailioworld.com