[Kamailio-Users] Can openser.cfg lead to pkg memory problem?
Daniel-Constantin Mierla
miconda at gmail.com
Mon Oct 6 17:08:51 CEST 2008
Hello,
On 10/06/08 13:22, mayamatakeshi wrote:
> [...]
>
> Hello,
> we have openser 1.3.3 running in production (current rev.: 4943).
> Three times in 50 days we had to restart openser to correct a pkg
> memory problem.
>
> openser 1.3.3 was released 3 weeks ago, so I guess you were running
> a previous version before, but it happened again since you upgraded
> to 1.3.3, right?
>
>
> After some time logging messages like this:
>
>   /openser.log:Aug 19 10:39:18 ipx022 /usr/local/sbin/openser[16991]: ERROR:core:new_credentials: no pkg memory left,
>
> openser will eventually run out of pkg memory and refuse all
> subsequent requests.
>
> We are trying to recreate this in our lab so that we can follow the
> memory troubleshooting instructions at
> http://kamailio.net/dokuwiki/doku.php/troubleshooting:memory,
> but so far we have been unable to do so, even when generating
> millions of calls and registration transactions (we are using SIPp
> to generate normal call flows and even the abnormal call flows
> found when reading openser.log, such as 'invalid cseq for aor',
> malformed SIP messages, etc.).
>
> We can spot memory leaks even if the "out of memory" message is not
> printed. Just archive the logs (the most important part is the
> shutdown time) and make them available for download so they can be
> investigated.
>
> There could be two reasons:
> - there is a memory leak, but it happens only in cases that you
> don't reproduce in the lab, although they occur in the production
> environment
> - you get memory fragmentation
>
> Let's see first the debug messages...
>
>
> Hello,
> here is the link for the openser.log and cfg files:
> http://www.yousendit.com/download/bVlEV0o4R3NoeWJIRGc9PQ
>
> After compilation with the debug flags for the memory manager, I left
> openser running in production for 24 hours. Then I moved all
> traffic to another host and waited for more than 30 minutes before
> stopping openser.
> In the openser.cfg, I set debug=2. If you need, I can run it again
> with a higher value (but I hope it doesn't have to be too high,
> due to overhead concerns).
>
> Sorry, I forgot to tell one thing: the last revision that
> showed this problem was 4809, so we reverted back to that
> revision before performing the above.
>
> Am I to understand that you couldn't reproduce it with the latest svn
> version, so you had to go back to a previous version?
>
>
> Hi,
> no, the reason for the reversion is that the latest version running in
> production will not show the problem, because we adopted a preventive
> reset to minimize the impact on customer calls. So I don't know yet
> whether it shows this problem or not.
> So I collected the logs using a revision that I was sure could
> recreate the problem.
OK, I understand now. I was looking at the logs and there seems to be a
leak with db operations: something does not free a db result. I will go
over the modules that you are using and try to spot any issue. I will
also check the change log to see whether anything related to such an
issue was changed recently.
>
> But here's some developments on my investigation:
> Up to now, I was trying to recreate the problem using virtual machines
> running the same OS (Fedora 5) as in production. It never happened
> there, even after 30 million calls.
> But we were eventually able to test openser 1.3 on a production
> machine with the same spec as the ones showing the problem, and we were
> able to trigger the pkg memory problem using a simple outgoing SIPp
> scenario. The problem always happens after we reach around 28,000
> calls, and we confirmed that the number of calls needed to cause the
> problem grows linearly with the amount of pkg memory (after increasing
> the pkg memory pool by a factor of 4, the problem started to happen
> only after around 128,000 calls).
> However, we also tried the same tests with kamailio 1.4 (rev. 5017) on
> that machine and we could not recreate the problem after 1.5 million
> calls, so we are thinking of just upgrading to 1.4 once other
> scenarios show that everything else is working.
OK, 1.4 is recommended; it has a lot of new features and many fixes.
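
As a rough sanity check on the figures quoted above (about 28,000 calls
before failure, about 128,000 once the pool was made four times larger),
the memory lost per call can be estimated by dividing the pool size by
the number of calls survived. The sketch below is only an illustration:
the pool size ASSUMED_POOL_BYTES is a guess (the thread does not state
it), the function name is made up, and the result is an upper bound,
since part of the pool is legitimately in use and fragmentation is
ignored.

```python
# Illustrative estimate only. ASSUMED_POOL_BYTES is an assumption,
# not a value reported in this thread.
ASSUMED_POOL_BYTES = 1024 * 1024


def leaked_bytes_per_call(pool_bytes: int, calls_until_failure: int) -> float:
    """Upper bound on bytes lost per call if the whole pool is leaked away."""
    if calls_until_failure <= 0:
        raise ValueError("calls_until_failure must be positive")
    return pool_bytes / calls_until_failure


# Figures reported in the thread: failure after ~28,000 calls with the
# original pool, and after ~128,000 calls with a pool four times larger.
original_pool = leaked_bytes_per_call(ASSUMED_POOL_BYTES, 28_000)
enlarged_pool = leaked_bytes_per_call(4 * ASSUMED_POOL_BYTES, 128_000)

print(f"original pool: {original_pool:.1f} bytes/call")
print(f"enlarged pool: {enlarged_pool:.1f} bytes/call")
```

If both estimates come out in the same range, that is consistent with a
fixed amount being lost per call (a leak) rather than with
fragmentation alone, which would not be expected to scale so evenly
with the pool size.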
>
> But I don't know why the problem cannot be recreated using the VMs:
> the only significant difference is that the production machines have
> 4 NICs that are bonded in 2 pairs (one pair for the private IP and the
> other for the public IP), while the VMs have just one NIC.
I see no relation with the NICs.
>
> I hope upgrading to 1.4 will solve everything. However, since nobody
> else is complaining about openser stopping after 28,000 calls, I
> still believe we have some problem in the openser.cfg itself. I'll
> check it after we put kamailio 1.4 in production.
OK, I will dig further; I might be a bit slow these days, however.
Cheers,
Daniel
--
Daniel-Constantin Mierla
http://www.asipto.com