[Serusers] Via: branch tags from SER for CANCELs/BYEs frequently don't match value in the INVITE

Frank Durda IV frank.durda at hypercube-llc.com
Sat Dec 5 05:35:46 CET 2009


Jiri Kuthan wrote:
> While I have a lot of sympathies for your disappointment, I'm not really
> sure you are tying it with the proper causes. Let me explain my viewpoint:
>
> - I think the Email you have received from Andrei explains even using
> references why Via(INVITE)==Via(BYE) requirement violates standards
> and actual functionality as well. Why do you think you have not been
> getting a good advice? What's wrong with the CANCEL suggestion?

The problem here is that the discussion of BYE being involved
went away some time ago.  The branch tag in CANCELs not matching
the branch tag for the INVITEs for the same call was the problem
and what needed a fix.  (I incorrectly assumed initially that BYE
would be wrong as well, but as BYE is not part of a two-method message
sequence, it can't get into trouble with the branch tag computation
going awry.  Only INVITE & CANCEL for a call must always have the same
branch tag.)
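
To illustrate the requirement (RFC 3261 section 9.1 says a CANCEL must
carry a single Via matching the top Via of the request being cancelled),
here is a minimal sketch in C.  It is not SER's code: the struct
client_txn and the function build_cancel_via are hypothetical names made
up for this example.  The idea is simply to store the branch when the
INVITE is sent and reuse it for the CANCEL, rather than compute it a
second time.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical, simplified view of a client transaction: only the
 * fields needed to illustrate the point. Not SER's real struct. */
struct client_txn {
    char via_branch[64];   /* branch sent in the INVITE's top Via */
    char sent_by[64];      /* host:port that was put in that Via */
};

/* Format the Via header for a CANCEL by reusing the branch that was
 * stored when the INVITE went out, instead of recomputing it.
 * Returns the number of characters written, or -1 if buf is too small. */
static int build_cancel_via(const struct client_txn *t, char *buf, size_t len)
{
    int n = snprintf(buf, len, "Via: SIP/2.0/UDP %s;branch=%s\r\n",
                     t->sent_by, t->via_branch);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

Because the CANCEL reads the stored value, it cannot disagree with the
INVITE no matter how the branch was generated in the first place.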

That's the problem with a discussion that bounces back and forth for
weeks or months before you finally just climb under the hood, add
hundreds of printfs/xlogs and figure out what it is doing wrong
yourself and then have to figure out a way to make it right or at
least closer to what the RFC and real-life equipment claim is right.
Once that was done, the muddle of earlier theories went away.

Anyway, I have fixed the problem myself from all appearances,
and the INVITE/CANCEL mismatch issue is closed here.

>
> - the MD5 performance problem you are worried about has been addressed.
> I admit we have answered by classifying this as a non-problem, still this
> is our best-knowledge of the matter based on quite detailed profiling.
> Do you have any numbers supporting this is a real problem?


The root problem was that the branch value that was computed for
the INVITE wasn't the answer that was computed for the CANCEL.
How slowly or how quickly that was done was not the main
problem,  although I do see enormous waste in using md5 for
generating a hash that could have been devised from a hundred
simpler algorithms.  It's just a branch tag!  It doesn't have to
defeat NSA cryptographic analysis.  The time in hex (time_t plus
the usec) would easily exceed the uniqueness requirement and be
a lot smaller, and it is usually right there in an integer, ready
for saving a copy of and/or using.
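
A rough sketch of what such a timestamp-based branch could look like
follows.  The helper name make_time_branch is hypothetical and this is
not what SER does.  The "z9hG4bK" prefix is the magic cookie RFC 3261
requires at the start of every branch value.  The sketch assumes a
single process sending at most one request per microsecond; with several
worker processes, something like a process id or counter would also
have to go into the string.

```c
#include <stdio.h>
#include <stddef.h>
#include <sys/time.h>

/* RFC 3261 requires branch values to start with this magic cookie. */
#define BRANCH_COOKIE "z9hG4bK"

/* Write a branch value built from the current time in hex: seconds
 * followed by microseconds (at most 5 hex digits, zero padded).
 * Returns the length written, or -1 on error or if buf is too small.
 * Hypothetical helper for illustration; not taken from SER. */
static int make_time_branch(char *buf, size_t len)
{
    struct timeval tv;

    if (gettimeofday(&tv, NULL) != 0)
        return -1;

    int n = snprintf(buf, len, "%s%lx%05lx", BRANCH_COOKIE,
                     (unsigned long)tv.tv_sec, (unsigned long)tv.tv_usec);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```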

>
> Generally we don't think that for any given hardware performance is
> a problem with SER. Of course it can degrade for example by use of
> database, or any other expensive operations, but I'm quite confident
> that SER's throughput is excellent and if service logic consumes more
> resources hardly anything can be done. At a point of time, more throughput
> takes more boxes but I think that this threshold is actually very high
> with SER.


For this particular task, MD5 is just a needless waste of
computational power.  Inefficiencies in this and other tasks
took CPU away that could have been used for other things, never
mind what.   I could have run more rtpproxy sessions on the
computer if it wasn't so busy doing MD5 and other unnecessary
math.  (Understand that for years of my career  I had to
count T-states on individual instructions while writing
assembly language drivers so that things could happen in the
allotted time, so unnecessary math == bloat and I point out
such poor practice when I see it.)

And that is one of the things that baffle me about SER as a
whole.  SER goes out of its way in some places to do things
in a way that someone thought would make the code run really
really fast,  like using inline code macros, or building
the 32-bit integer representations of strings you were
expecting to find and comparing those, all clearly done in
the name of speed.  The latter probably doesn't help much
on modern compilers with aggressive optimizers, but such
coding practices make the memory footprint bigger and risk
more paging.  Meanwhile, the second technique created a
hardcoding that broke the ability for SER to handle SIP-T/SIP-I,
something SER could have passed transparently otherwise,
so that was something of a foolish thing to do for the
perceived speed gains.   I even suspected the lack of a way
to pass variables to functions was because of an
obsession with speed, not because someone didn't know
how to add four or five lines of rules to the lex file.
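
For anyone who has not seen the "32-bit integer representation" trick
referred to above, here is a generic sketch of the idea in C.  It is not
copied from SER; pack4_lower and starts_with_via are hypothetical names.
Four bytes of the incoming header name are packed into one integer, the
letters are folded to lower case with a bit mask, and the result is
compared against a precomputed constant in a single operation.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack four bytes into a 32-bit value in a fixed byte order and set
 * bit 0x20 in every byte, which folds ASCII letters to lower case.
 * Side effect: bytes that differ only in bit 0x20 (for example 0x1A
 * and ':') become indistinguishable, so this is a fast pre-check,
 * not a strict comparison. */
static uint32_t pack4_lower(const char *s)
{
    uint32_t v = ((uint32_t)(unsigned char)s[0])
               | ((uint32_t)(unsigned char)s[1] << 8)
               | ((uint32_t)(unsigned char)s[2] << 16)
               | ((uint32_t)(unsigned char)s[3] << 24);
    return v | 0x20202020u;
}

/* Return 1 if buf starts with "via:" in any letter case, using one
 * integer comparison; return 0 if it does not or if len < 4. */
static int starts_with_via(const char *buf, size_t len)
{
    if (len < 4)
        return 0;
    return pack4_lower(buf) == pack4_lower("via:");
}
```

The hardcoding is visible here: the set of recognised names is baked
into constants, so anything the author did not anticipate falls through.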

So all those things were obviously done in the name of speed,
but then what does SER do on every message to generate a
measly branch tag?   Why, it uses an intentionally slow and
complex computational algorithm (MD5) when such intensive
number crunching isn't needed.   Something of a
dual-personality there when it comes to wanting to be
fast or not caring about speed.


>
> With certainty I know that SER has been used in the big and in the
> *biggest* deployments. I'm worried that this may sound a bit unfriendly
> towards the effort you guys developed or purchased as professional
> service, but I think that the presence of such deployments demonstrates
> scale is a non-issue in reasonably dimensioned environment. SER doesn't
> come up with dimensioning plans, one of the reasons being that it is
> non-trivial to provide generally valid assertions (traffic, hardware,
> configuration, dependencies on database, network architecture, all of
> these differences in actual deployments make it hard to provide general
> rules of thumb.)


And I know people at other telecom companies or companies that
have a telecom presence or product, ones with really well
known names and a zillion dollars.  Some of these use SER,
and you know what they tell me?  They say, yeah we had to hire
programmers to fix the problems in SER and write the missing
bits,  but it was the closest starting point to what we wanted
that we could find.

So, yes companies far bigger than mine are using it, but they
are having to hire people to panel-beat it into a usable shape
and document it.    These outfits are on this list or can see
its contents and are seeing my words (I know because they have
commented on my messages I have posted on this list before),
but some of these companies have a rule to not post anything
back to lists like this because then their competition might
think they were doing something in this or that area or their
enemies might know where weaknesses and vulnerabilities exist
that could be exploited.   It seems that this is one of those
things that happen when your company gets big enough or has
people at the top that are paranoid enough.

By the way, with maybe one exception, I plan to post all the
improvements/fixes I have made to SER (most of which are in
and around NAThelper and rtpproxy), and maybe they will be
rolled into the main tree or maybe not, but at least they
will be available to others and might help them avoid some
headaches we have had to endure.

>
>
> - I cannot possibly comment on interoperability of "high-dollar SBCs and
> switches" to the general extent you are implying. I'm worried you can
> be dramatically disappointed if you tie all your expectations to dollars.
> I only know with certainty that in the specific cases we have encountered
> I can impossibly assert that "high-dollar" and "brands" means knowing
> how INVITE shall look like. In fact, we have been using SER to fix
> INVITEs from high-dollar brands to look like they are supposed to look
> like. Which is a double-edged sword, as frequently turning a message into
> something that A likes makes it hard-to-swallow for B. Unless you are in
> a single-vendor environment, the likelihood of necessity to address
> interop issues is, say, higher than noticeable.

My point here is when the SER maintainers or active respondents
get feedback on these lists that these well-respected devices
do not behave well with SER on specific points and that these
devices rebel against coding shortcuts (like INVITE!=CANCEL
or syn_branch=1), the response here should be to address
the problem and devise a fix, not to tell me or some other
messenger how something else, ANYTHING else should change
but not SER.   I think it unlikely that I or anybody else
on this list could get the maker of an SBC that costs $250,000
per box to change their device to overlook a point in the RFC
that uses the word MUST three times.   When you get caught
not being compatible, undertake the work to be compatible.


>
> - I agree that lack of parameter passing is a shortcoming. I agree the
> documentation is suboptimal. I'm very thankful to all participants who
> spend their valuable time and return SER's value by their contributions,
> but there is no "central contribution control" that would allow someone
> to cause the participants to address your particular wishes.


I agree and to those participants who provided constructive
suggestions over the past two years, I thanked them publicly
and privately, and do again now in case they missed it.
To the developers et al., well to be honest I haven't seen
much of them.  I mean, I count 70 times that Jiri has posted
here in this group from Feb 2008 thru Oct 2009.   (I posted
a higher number over the same period.)  Anyway, I know that
each developer put a lot of effort into writing
this piece or that part of SER at some point in the past.
I also realize that people get other jobs, get families, run out
of the spare time needed to do stuff like this, and so software
and documentation fall into the marginally maintained category.
Maybe that is what I'm seeing here, but I don't know.

There are also cases where the continued development is in a
pay-for version and the free version languishes, with the
carrot that if we pay for it, that version will be better.
I expected the latter was the case for SER.  Great, except
that offering to pay for help and fixes didn't work either.

Believe me, two years ago my company tried five different ways
to get someone at the "pay support" entities listed on the SER
web site to pay attention to us and tell us how much it would
cost.  We were prepared to put them on a plane and have them
configure our lab setup and make it work cleanly and
efficiently.  However, we couldn't even get a reply to the
voice-mails and e-mails left.  So even the pay-for support
didn't seem too promising, and after three weeks of being
ignored with cash in hand and deadlines looming, we just
resigned ourselves to the fact that if it didn't work right
or didn't do something we wanted, we would have to fix it or
write it ourselves, and here we are.


Frank Durda IV - send mail to this address and remove the "LOSE"
   and adjust the month/year password accordingly:
<uhclemLOSE.dec09%nemesis.lonestar.org>    http://nemesis.lonestar.org
  "The guy that said that the only stupid question is the one that was
   never asked clearly has never worked a computer center help desk."
Copyright 2009, ask before reprinting.




