Hi Greger,
Thanks for taking some time on my problem ... I appreciate it.
I have to apologize ... I think I sent a log file which does not correspond "exactly" to the ser.cfg file attached :) Also, some of the questions you raise come up because I removed several parts from the ser.cfg (they are "top secret" ;D ). But the differences are minimal ...
On 11/26/05, Greger V. Teigre greger@teigre.com wrote:
I've tried to find some meaning in what you sent, but I don't really know the details of the ACK hack (pun not intended) Jan mentioned. To me it looks like the ACK is absorbed correctly.
In the logs provided, the acks are not absorbed at all (well, yes, due to a hack in my config file ... but otherwise not).
? Hm, I saw some debug lines on ACK absorbed...?!
Do you mean the "3(11987) LOG: ACK intercom transaction DOES NOT exhist ... simply absorbing"? That comes from a log statement that I removed from the config ... it is in the route (SEMS) ... there was an if( method==ACK) { break; }
However, the REGISTER messages confuse me a bit. Also, you don't record_route CANCELs and ACKs, why?
I don't know :) So far it worked fine. Do you think that record-routing the ACKs may solve this problem? ( See the following paragraph for a strange development )
Also, we checked yesterday and ... surprise! :) We discovered this "feature" after upgrading from ser_0.9.0 to ser_0.9.4 ... we still have some pcs with ser_0.9.0 (ser.cfg and sems are the same overall), and they don't show this behavior (OK/ACK being resent for a while). This only happens in the newer setups with ser_0.9.4
Hm. Strange. I'm not enough on-top-of diffs from 0.9.0 to say anything smart about that :-)
I tend to think more and more that this is the key ... I will take a detailed look today ... Jan, or anyone else ... where should I look? Where does the ACK cancel the retransmission timer? Are you aware of any changes from 0.9.0 to 0.9.4 that could cause this?
You had some strange messages in there about message with multicast address looped back.
Ok ... got me ... i use multicast ... the message being loop'd back is just a side effect ... but it is no problem. Those messages are discarded at the top of the ser.cfg (not shown in the one i sent).
You also have this: Warning: sl_send_reply: I won't send a reply for ACK!! Which normally means that an ACK hits an sl_send_reply() somewhere in your script.
There was an sl_send_reply in the route(SEMS) ... which is not there anymore ... i was trying things ...
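For illustration only, here is a minimal sketch of how a block like that can keep ACKs away from the stateless reply. This is not my real route(SEMS): the route number 2 is made up, and I am assuming the tm and sl modules are loaded.

```
# Hypothetical route block; the number 2 is only for illustration.
route[2] {
    if (!t_relay()) {
        # An ACK must never be answered, so leave before any reply is sent.
        if (method=="ACK") {
            break;
        };
        # Only non-ACK requests reach the stateless error reply.
        sl_reply_error();
    };
}
```

With a guard like this, the "I won't send a reply for ACK" warning should not show up for this block, since the ACK leaves the route before the sl call.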
And now that I had another look at your config, I see that you loose_route to ROUTE_RELAY, but then you do lookup before t_relay(). Loose routing is NOT looking up and ACKs with Route should not be looked up, but just t_relayed. That may introduce some problems. I would guess that the Warning for ACK above is due to this: if( ! lookup("location") ) { sl_send_reply("404", "Not Found"); break; } in your ROUTE_RELAY.
Again, I removed some parts from the ser.cfg ... but the loose-routed msgs are simply t_relayed, nothing else. Note that in the if( loose_route() ) block I set a flag, and a part of route(RELAY) that is not shown simply calls t_relay() on those flagged messages. Sorry for the confusion ...
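To make the flag idea concrete, here is a rough sketch under some assumptions, not the actual config: the flag number 7 and the route number 1 (standing in for route(RELAY)) are hypothetical, and everything else in the main route is left out.

```
# Main route (excerpt): mark requests that were loose-routed.
route {
    if (loose_route()) {
        setflag(7);      # 7 is an arbitrary, hypothetical flag number
        route(1);
        break;
    };
    # ... rest of the main route not shown ...
    route(1);
}

# Stands in for route(RELAY); the number 1 is assumed.
route[1] {
    # Loose-routed requests (including in-dialog ACKs) skip lookup()
    # and are relayed as they are.
    if (isflagset(7)) {
        t_relay();
        break;
    };

    # Only requests without the flag are looked up in the location table.
    if (!lookup("location")) {
        sl_send_reply("404", "Not Found");
        break;
    };
    t_relay();
}
```

With this layout a flagged ACK never reaches the lookup("location") branch, so it cannot hit the 404 reply there.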
Cesc