Ovidiu Sas wrote:
If not, I don't think creating two dialogs for each call, where one of
them will definitely be dangling and not cleaned up until the dialog
timeout (which is quite long by default) triggers, is a viable option for
large-scale environments (such as ours). The rate at which new dialogs
are created could possibly outrun the rate at which they terminate,
which isn't desirable resource-wise.
The issue in the current design is the dialog matching algorithm.
If two dialogs are created, both of them will be chained in the same
dialog list.
When an in-dialog request is received, the first dialog that matches
the callid and to/from tags is updated, and the second one is just hanging
around (it will never be touched).
What we need is a better dialog match (if an in-dialog request for
the second dialog is received, then the second dialog should be
selected and updated, not the first one). What I'm proposing here is
that dialog matching should be done based on callid, to/from tags,
branch ID, and the list of Route/Record-Route headers. This should ensure
that the proper dialog is handled for each in-dialog request and no
dialog will be left over.
Agreed that stricter matching logic will allow multiple dialogs
to be resolved properly. Still, multiple dialogs come at a
price, please see further below.
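
To make the difference between the two matching strategies concrete, here
is a minimal Python sketch. It is not taken from the dialog module: the
Dialog class, the function names and the example URIs are all hypothetical,
and the route set alone stands in for the additional criteria (the branch
ID is left out for brevity).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Dialog:
    call_id: str
    from_tag: str
    to_tag: str
    # Assumption: the route set recorded when this dialog was created.
    route_set: Tuple[str, ...]


def match_loose(dialogs: List[Dialog], call_id, from_tag, to_tag) -> Optional[Dialog]:
    """Return the first dialog whose Call-ID and tags match."""
    for d in dialogs:
        if (d.call_id, d.from_tag, d.to_tag) == (call_id, from_tag, to_tag):
            return d
    return None


def match_strict(dialogs: List[Dialog], call_id, from_tag, to_tag, route_set) -> Optional[Dialog]:
    """Like match_loose, but the route set must match as well."""
    wanted = (call_id, from_tag, to_tag, tuple(route_set))
    for d in dialogs:
        if (d.call_id, d.from_tag, d.to_tag, d.route_set) == wanted:
            return d
    return None


# Two dialogs of the same spiralled call, chained in the same list.
first = Dialog("c1", "ft", "tt", ("sip:p1.example",))
second = Dialog("c1", "ft", "tt",
                ("sip:p1.example", "sip:p2.example", "sip:p1.example"))
chain = [first, second]
```

With the loose match, every lookup for this call ends at the first entry
in the chain; the strict variant can reach the second one, provided the
in-dialog request actually carries the distinguishing information.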
Even if we have a spiral and the INVITE is going twice through the
server (and two individual transactions are created), IMHO one dialog
is not a proper representation of the call.
Can you be more specific as to why a single dialog isn't appropriate?
I am thinking that a single dialog could actually be helpful: It abstracts
from numerous transactions that could possibly be established for a
request that is essentially the same (i.e., except for routing information)
at several locations in spiral scenarios. From an end-to-end perspective,
it seems less important to know how many transactions a request
spawns than how the associated dialog's state changes over time. Having as
few dialogs as possible will help with observing these changes.
Let's assume the following scenario:
UA1 --> P1 --> P2 --> P1 --> UA2
Now, UA2 rejects the call, but P2 decides to reroute the call to an IVR:
UA1 --> P1 --> P2 --> IVR
What will be the state of the dialog on P1?
I think this example of yours shows a potential problem of the
multiple-dialogs case while there is none in the single-dialog case.
First, the single-dialog case: When P1 receives the request from UA1, it
establishes a new, yet unconfirmed dialog. When it receives the request
from P2, it continues the dialog and does not create a new one.
When UA2's rejection is received by P1, the dialog will be terminated.
(This is what the dialog module already does now.) When P2 decides to send
out the request to another destination (like the IVR), P1 will establish a
new dialog (since there's none to continue anymore). I cannot see any
dialog handling problems in this approach.
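
The single-dialog walk-through above can be replayed with a toy model. This
is only a sketch under simplifying assumptions: dialogs are keyed by Call-ID
alone, the class and method names are made up for illustration, and the
re-routed request is assumed to pass through P1 again.

```python
class SingleDialogTable:
    """Toy model of P1: one dialog per Call-ID, continued on spiral passes."""

    def __init__(self):
        self.state = {}    # Call-ID -> dialog state
        self.created = 0   # number of dialogs ever created

    def on_initial_invite(self, call_id):
        # A second pass of the same call continues the existing dialog.
        if call_id in self.state:
            return "continued"
        self.state[call_id] = "unconfirmed"
        self.created += 1
        return "created"

    def on_negative_reply(self, call_id):
        # A rejection terminates the (still unconfirmed) dialog.
        self.state.pop(call_id, None)


p1 = SingleDialogTable()
events = [
    p1.on_initial_invite("call-1"),             # INVITE from UA1
    p1.on_initial_invite("call-1"),             # same INVITE, spiralled back from P2
]
p1.on_negative_reply("call-1")                  # UA2 rejects the call
events.append(p1.on_initial_invite("call-1"))   # P2 re-routes towards the IVR
```

In this model P1 never holds more than one dialog for the call at any time.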
Now to the multiple-dialog case. When P1 receives P2's request, instead
of continuing the first dialog, it establishes another one which is
differentiated from the first by means of improved dialog matching
capabilities (covering the additional items branch ID and Route/Record-Route
headers that you mentioned above). Now, if I get you right, when
UA2's rejection is received by P1, it will terminate just one of its
dialogs (the second one I suppose).
The next question that I ask myself is: When P2's re-routed request is
received by P1, will it re-use the first dialog? I'm not sure if that
will definitely happen, especially not if you use the branch ID from the
Via header, which changes between the initial and re-routed request. In that
case, you will end up with another new dialog (a third one), and the very
first one will never have a chance to be destroyed prior to the dialog
timeout trigger.
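
The same call can be replayed for the multiple-dialog case. Again this is
only a sketch: it assumes that dialogs are keyed by Call-ID plus the Via
branch of the creating request, that the re-routed request passes through
P1 again, and that a rejection removes only the dialog of the branch it
belongs to. All names and branch values are invented.

```python
class MultiDialogTable:
    """Toy model of P1: one dialog per (Call-ID, Via branch) pair."""

    def __init__(self):
        self.dialogs = {}  # (call_id, branch) -> dialog state

    def on_initial_invite(self, call_id, branch):
        key = (call_id, branch)
        if key not in self.dialogs:
            self.dialogs[key] = "unconfirmed"
        return key

    def on_negative_reply(self, call_id, branch):
        # Only the dialog created for this branch is terminated.
        self.dialogs.pop((call_id, branch), None)


p1 = MultiDialogTable()
p1.on_initial_invite("call-1", "z9hG4bK-ua1")    # INVITE from UA1
p1.on_initial_invite("call-1", "z9hG4bK-p2-a")   # spiralled INVITE from P2
p1.on_negative_reply("call-1", "z9hG4bK-p2-a")   # UA2 rejects
p1.on_initial_invite("call-1", "z9hG4bK-p2-b")   # re-routed INVITE from P2
remaining = sorted(branch for _, branch in p1.dialogs)
```

Under these assumptions P1 ends up holding the very first dialog next to the
third one, which is the situation described above.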
I think having multiple dialogs for each branch of the spiral keeps
things clean and easy to understand. The key is to perform proper
matching for in-dialog requests to the corresponding dialog.
Unless my comparison of the two approaches w.r.t. the call setup above
is wrong, it seems to me that the continuation method is easier to
grasp (you do not have to think in terms of multiple dialogs), consumes
less memory, and presumably requires fewer code modifications.
Another issue from the dialog users' perspective: If you want to track
the call given above using the dialog module's callbacks, it should be
easier to do so the fewer dialogs of essentially the same call exist.
When there are multiple dialogs, users will need to take care themselves
not to track calls multiple times. For instance, if you want to make a
copy of the SIP messages exchanged for some reason, you'd need special
effort to avoid duplicate copies if several dialogs track the same call
data.
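
As an illustration of the "special effort" meant here, a dialog user that
copies messages would need some form of de-duplication along these lines.
This is a hypothetical sketch, not the dialog module's callback API; the
class, the callback signature and the choice of key (Call-ID, CSeq number,
CSeq method, and the status code for replies) are assumptions.

```python
class MessageArchive:
    """Keeps one copy per message, even if several dialogs report it."""

    def __init__(self):
        self._seen = set()
        self.copies = []

    def on_dialog_message(self, dialog_id, call_id, cseq, method, status, raw):
        # dialog_id is deliberately not part of the key: two dialogs
        # tracking the same call must not produce two copies.
        key = (call_id, cseq, method, status)
        if key in self._seen:
            return False
        self._seen.add(key)
        self.copies.append(raw)
        return True


archive = MessageArchive()
results = [
    archive.on_dialog_message("dlg-1", "call-1", 1, "INVITE", None, "INVITE (pass 1)"),
    archive.on_dialog_message("dlg-2", "call-1", 1, "INVITE", None, "INVITE (pass 2)"),
    archive.on_dialog_message("dlg-2", "call-1", 1, "INVITE", 486, "486 Busy Here"),
]
```

With a single dialog per call, none of this bookkeeping would be needed.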
One issue that I have with one single dialog is related to how dialog
termination is handled on timeout. When a BYE on timeout needs to be
sent, where will it be sent, as the single dialog will have four
endpoints:
- UA1 (the original caller)
- P2 (the routed destination for the initial request)
- P2 (the incoming destination for the forwarded request)
- UA2 (the routed destination for the forwarded request)
I believe it's still just two endpoints no matter how long the route
path is, namely the hosts comprising the end-to-end dialog relationship
at the edge defined by the Contact header addresses.
Based on a look at the dialog code, that's where BYE messages are sent.
It also seems that record-routing is honored, so hosts on the route
that need to see the triggered or "natural" BYE message will
see it.
If you have a single dialog, then where will the BYE messages be sent:
- to UA1 and P2
- to P2 and UA1
- to UA1 and UA2
To UA1 and UA2: They represent the endpoints because they provide the
respective Contact headers used to send the BYE message. The Contact
addresses will never change no matter at which point in the routing path
you set up a dialog.
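
A small sketch of that point, with made-up names and addresses and no claim
to mirror the actual dialog code: the BYE targets are derived from the two
Contact addresses only, while the route set merely travels along. Ordering
the route set per direction is omitted here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Dialog:
    caller_contact: str   # Contact of UA1, taken from the INVITE
    callee_contact: str   # Contact of UA2, taken from its reply
    route_set: List[str] = field(default_factory=list)


def build_byes(d: Dialog):
    """One BYE per endpoint; the Request-URI is always a Contact address."""
    return [
        {"method": "BYE", "ruri": d.caller_contact, "route": list(d.route_set)},
        {"method": "BYE", "ruri": d.callee_contact, "route": list(d.route_set)},
    ]


direct = Dialog("sip:ua1@10.0.0.1", "sip:ua2@10.0.0.2", ["sip:p1.example;lr"])
spiral = Dialog("sip:ua1@10.0.0.1", "sip:ua2@10.0.0.2",
                ["sip:p1.example;lr", "sip:p2.example;lr", "sip:p1.example;lr"])

direct_targets = [bye["ruri"] for bye in build_byes(direct)]
spiral_targets = [bye["ruri"] for bye in build_byes(spiral)]
```

However long the route path gets, the two targets stay UA1 and UA2.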
Cheers,
--Timo