I'm observing the following scenario:

The situation as described is not ideal since it fills up your logs with errors, but it isn't critical per se. Much more problematic is when more than two PUBLISHes are generated for the same dialog simultaneously, as this can cause a (near) infinite race between the various PUBLISH requests, all fighting to update the same etag. For example, 10 PUBLISHes are sent out for etag A; all but one are rejected with a 412; the other 9 then keep bouncing back and forth between pua_dialoginfo and presence_dialoginfo because they do not share the same view of the dialog's latest etag.
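To give a rough feel for the cost of such a burst, here is a toy Python model. It is not the actual pua or presence code: `PresenceServer` and `simulate_burst` are hypothetical names, and the model makes the optimistic assumption that every rejected request learns the current etag and retries in the next round. Even then, each round lets exactly one request through and rejects the rest, so a burst of n requests costs n(n-1)/2 rejections. The real modules do not necessarily resynchronise this cleanly, which is why the race can last far longer in practice.

```python
import itertools


class PresenceServer:
    """Toy stand-in for a server that enforces conditional PUBLISH via etags."""

    def __init__(self):
        self._counter = itertools.count(1)
        self.etag = "etag-0"

    def publish(self, if_match):
        # A stale etag is rejected with 412.
        if if_match != self.etag:
            return 412, self.etag
        # An accepted PUBLISH rotates the etag, invalidating all other
        # requests that were sent with the previous one.
        self.etag = f"etag-{next(self._counter)}"
        return 200, self.etag


def simulate_burst(n):
    """Return (rounds, rejected) for n simultaneous PUBLISHes on one dialog.

    Assumption: all requests in a round carry the etag that was current
    when the round started, and every rejected request retries next round.
    """
    server = PresenceServer()
    pending = n
    rejected = 0
    rounds = 0
    while pending:
        rounds += 1
        snapshot = server.etag  # etag shared by every in-flight request
        accepted = 0
        for _ in range(pending):
            status, _ = server.publish(snapshot)
            if status == 200:
                accepted += 1
            else:
                rejected += 1
        pending -= accepted
    return rounds, rejected


if __name__ == "__main__":
    rounds, rejected = simulate_burst(10)
    print(f"rounds={rounds} rejected_with_412={rejected}")
```

With 10 simultaneous PUBLISHes this model needs 10 rounds and produces 45 rejections before everything settles, and that is the best case under the stated assumption.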

Even worse is when presence_dialoginfo rejects all incoming PUBLISHes with a 412, for example because of a database, memory, or replication problem, or because of a malformed request. A t_reply("412", "Not today") in the presence_dialoginfo server, combined with a single PUBLISH from pua_dialoginfo, is enough to reproducibly brick the pua_dialoginfo server, because it runs into critical memory fragmentation levels.

I think there are multiple ways to fix or alleviate this problem.

pua generic

pua_dialoginfo specific

