Hello,
I have two issues related to dialog module state keeping/logging that caught my attention today:
1.) The module considers requests routed in early (but not yet confirmed) dialogs to be bogus, as can be seen from a few lines of state machine code in dlg_hash.c:
http://git.sip-router.org/cgi-bin/gitweb.cgi?p=sip-router;a=blob;f=modules_k...
In the logs, it shows up something like this:
CRITICAL:dialog:log_next_state_dlg: bogus event 8 in state 2 for dlg 0x7f9f237ffff8 [2901:1286709577] with clid '<Call-ID>' and tags '<caller tag>' '<callee tag, always empty>'
I believe that such in-early-dialog requests should not trigger bogus-event log messages as they look fine from a standard's point of view.
2.) Various race conditions may generate log messages such as following when DID mode is enabled:
WARNING:dialog:dlg_onroute: unable to find dialog for <REQUEST TYPE> with route param 'faf.f69e7b4' [4015:79161711]
Observed REQUEST TYPEs at our site were "BYE" when BYE requests were transmitted by both caller and callee at roughly the same time due to user behavior; and "PRACK" when a PRACK was received by the UAS after a 486 (Busy Here)/ACK pair of messages was already exchanged between the UAS and the proxy. In both cases, the associated dialog had already been destroyed when another request was received.
My suggestion is to lower the log level for this kind of message because it can simply happen too often but doesn't affect proper dialog handling.
Feedback?
Cheers,
--Timo
2010/6/22 Timo Reimann timo.reimann@1und1.de:
1.) The module considers requests routed in early (but not yet confirmed) dialogs to be bogus, as can be seen from a few lines of state machine code in dlg_hash.c:
http://git.sip-router.org/cgi-bin/gitweb.cgi?p=sip-router;a=blob;f=modules_k...
In the logs, it shows up something like this:
CRITICAL:dialog:log_next_state_dlg: bogus event 8 in state 2 for dlg 0x7f9f237ffff8 [2901:1286709577] with clid '<Call-ID>' and tags '<caller tag>' '<callee tag, always empty>'
I believe that such in-early-dialog requests should not trigger bogus-event log messages as they look fine from a standard's point of view.
Do you mean that an in-dialog request (during early dialog state) is considered a "bug" by dialog module? like a PRACK, INFO, OPTIONS, MESSAGE or whatever valid in-dialog request in early dialog? If so, then it's for sure a bug in the code.
Hey Iñaki,
Iñaki Baz Castillo wrote:
2010/6/22 Timo Reimann timo.reimann@1und1.de:
1.) The module considers requests routed in early (but not yet confirmed) dialogs to be bogus, as can be seen from a few lines of state machine code in dlg_hash.c:
http://git.sip-router.org/cgi-bin/gitweb.cgi?p=sip-router;a=blob;f=modules_k...
In the logs, it shows up something like this:
CRITICAL:dialog:log_next_state_dlg: bogus event 8 in state 2 for dlg 0x7f9f237ffff8 [2901:1286709577] with clid '<Call-ID>' and tags '<caller tag>' '<callee tag, always empty>'
I believe that such in-early-dialog requests should not trigger bogus-event log messages as they look fine from a standard's point of view.
Do you mean that an in-dialog request (during early dialog state) is considered a "bug" by dialog module? like a PRACK, INFO, OPTIONS, MESSAGE or whatever valid in-dialog request in early dialog?
Except for PRACK, all types of in-dialog requests received during early state are considered "bugs" (see the link to dlg_hash.c I posted above).
If so, then it's for sure a bug in the code.
Then I will fix it the other day.
Cheers,
--Timo
On Tue, Jun 22, 2010 at 12:27 PM, Timo Reimann timo.reimann@1und1.de wrote:
Hello,
I have two issues related to dialog module state keeping/logging that caught my attention today:
1.) The module considers requests routed in early (but not yet confirmed) dialogs to be bogus, as can be seen from a few lines of state machine code in dlg_hash.c:
http://git.sip-router.org/cgi-bin/gitweb.cgi?p=sip-router;a=blob;f=modules_k...
In the logs, it shows up something like this:
CRITICAL:dialog:log_next_state_dlg: bogus event 8 in state 2 for dlg 0x7f9f237ffff8 [2901:1286709577] with clid '<Call-ID>' and tags '<caller tag>' '<callee tag, always empty>'
I believe that such in-early-dialog requests should not trigger bogus-event log messages as they look fine from a standard's point of view.
If this means adding the DLG_STATE_EARLY to the switch, then it should be fine.
2.) Various race conditions may generate log messages such as following when DID mode is enabled:
WARNING:dialog:dlg_onroute: unable to find dialog for <REQUEST TYPE> with route param 'faf.f69e7b4' [4015:79161711]
Observed REQUEST TYPEs at our site were "BYE" when BYE requests were transmitted by both caller and callee at roughly the same time due to user behavior; and "PRACK" when a PRACK was received by the UAS after a 486 (Busy Here)/ACK pair of messages was already exchanged between the UAS and the proxy. In both cases, the associated dialog had already been destroyed when another request was received.
My suggestion is to lower the log level for this kind of message because it can simply happen too often but doesn't affect proper dialog handling.
The proper fix here would be to extend the lifetime of a dialog in deleted state and absorb any late in-dialog requests. If we just lower the log level, we don't know if it's a valid warning log or a bogus one.
Feedback?
Cheers,
--Timo
SIP Express Router (SER) and Kamailio (OpenSER) - sr-users mailing list sr-users@lists.sip-router.org http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-users
Hi,
Ovidiu Sas wrote:
On Tue, Jun 22, 2010 at 12:27 PM, Timo Reimann timo.reimann@1und1.de wrote:
2.) Various race conditions may generate log messages such as following when DID mode is enabled:
WARNING:dialog:dlg_onroute: unable to find dialog for <REQUEST TYPE> with route param 'faf.f69e7b4' [4015:79161711]
Observed REQUEST TYPEs at our site were "BYE" when BYE requests were transmitted by both caller and callee at roughly the same time due to user behavior; and "PRACK" when a PRACK was received by the UAS after a 486 (Busy Here)/ACK pair of messages was already exchanged between the UAS and the proxy. In both cases, the associated dialog had already been destroyed when another request was received.
My suggestion is to lower the log level for this kind of message because it can simply happen too often but doesn't affect proper dialog handling.
The proper fix here would be to extend the lifetime of a dialog in deleted state and absorb any late in-dialog requests. If we just lower the log level, we don't know if it's a valid warning log or a bogus one.
I've been wondering how to achieve this.
AFAICS, the current implementation removes dialog entries right after the moment the final request/response pair was seen. In order to prolong a dying dialog's lifetime, I see two possibilities: Use some time-triggered function that delays dialog deletion for a short period of time; or, couple dialog deletion to the deletion of the final transaction, i.e., the one were dialog state transitions to DLG_STATE_DELETED. Personally, I prefer the latter approach since it would quite elegantly re-use the transaction layer's timer.
Cheers,
--Timo
On Wed, Jun 23, 2010 at 9:25 AM, Timo Reimann timo.reimann@1und1.de wrote:
Hi,
Ovidiu Sas wrote:
On Tue, Jun 22, 2010 at 12:27 PM, Timo Reimann timo.reimann@1und1.de wrote:
2.) Various race conditions may generate log messages such as following when DID mode is enabled:
WARNING:dialog:dlg_onroute: unable to find dialog for <REQUEST TYPE> with route param 'faf.f69e7b4' [4015:79161711]
Observed REQUEST TYPEs at our site were "BYE" when BYE requests were transmitted by both caller and callee at roughly the same time due to user behavior; and "PRACK" when a PRACK was received by the UAS after a 486 (Busy Here)/ACK pair of messages was already exchanged between the UAS and the proxy. In both cases, the associated dialog had already been destroyed when another request was received.
My suggestion is to lower the log level for this kind of message because it can simply happen too often but doesn't affect proper dialog handling.
The proper fix here would be to extend the lifetime of a dialog in deleted state and absorb any late in-dialog requests. If we just lower the log level, we don't know if it's a valid warning log or a bogus one.
I've been wondering how to achieve this.
AFAICS, the current implementation removes dialog entries right after the moment the final request/response pair was seen. In order to prolong a dying dialog's lifetime, I see two possibilities: Use some time-triggered function that delays dialog deletion for a short period of time; or, couple dialog deletion to the deletion of the final transaction, i.e., the one were dialog state transitions to DLG_STATE_DELETED. Personally, I prefer the latter approach since it would quite elegantly re-use the transaction layer's timer.
Yes, the latter approach seems to be the right one.
Regards, Ovidiu Sas
2010/6/23 Timo Reimann timo.reimann@1und1.de:
AFAICS, the current implementation removes dialog entries right after the moment the final request/response pair was seen. In order to prolong a dying dialog's lifetime, I see two possibilities: Use some time-triggered function that delays dialog deletion for a short period of time; or, couple dialog deletion to the deletion of the final transaction, i.e., the one were dialog state transitions to DLG_STATE_DELETED. Personally, I prefer the latter approach since it would quite elegantly re-use the transaction layer's timer.
Me too. There is no a "timer for dialog" in RFC 3261 so reusing transaction timers is the best approach IMHO.
Hey,
Iñaki Baz Castillo wrote:
2010/6/23 Timo Reimann timo.reimann@1und1.de:
AFAICS, the current implementation removes dialog entries right after the moment the final request/response pair was seen. In order to prolong a dying dialog's lifetime, I see two possibilities: Use some time-triggered function that delays dialog deletion for a short period of time; or, couple dialog deletion to the deletion of the final transaction, i.e., the one were dialog state transitions to DLG_STATE_DELETED. Personally, I prefer the latter approach since it would quite elegantly re-use the transaction layer's timer.
Me too. There is no a "timer for dialog" in RFC 3261 so reusing transaction timers is the best approach IMHO.
I have tested and committed the proposed re-usage of the transaction timer to sip-router, Kamailio 3.0 and 1.5.
Cheers,
--Timo
2010/7/13 Timo Reimann timo.reimann@1und1.de:
Me too. There is no a "timer for dialog" in RFC 3261 so reusing transaction timers is the best approach IMHO.
I have tested and committed the proposed re-usage of the transaction timer to sip-router, Kamailio 3.0 and 1.5.
Great, already compiled and will use it in production very soon :)
Thanks a lot.