[Kamailio-Users] Can openser.cfg lead to pkg memory problem?
Ovidiu Sas
osas at voipembedded.com
Mon Oct 6 17:12:13 CEST 2008
I also hit the pkg out-of-memory issue on a 1.3 server.
Just to confirm that this problem really exists. I can't say after
how many calls it occurs ...
Regards,
Ovidiu Sas
On Mon, Oct 6, 2008 at 11:08 AM, Daniel-Constantin Mierla
<miconda at gmail.com> wrote:
> Hello,
>
> On 10/06/08 13:22, mayamatakeshi wrote:
>> [...]
>>
>>
>>
>> Hello,
>> we have openser 1.3.3 running in production (current rev.: 4943).
>> Three times in 50 days we had to restart openser to correct a
>> pkg memory problem.
>>
>> openser 1.3.3 was released 3 weeks ago, so I guess you were
>> running a previous version before, but it has happened again since
>> you upgraded to 1.3.3, right?
>>
>>
>> After some time logging messages like this:
>> /openser.log:Aug 19 10:39:18 ipx022 /usr/local/sbin/openser[16991]: ERROR:core:new_credentials: no pkg memory left,
>> openser will eventually run out of pkg memory and refuse all
>> subsequent requests.
>>
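Editor's sketch: pkg memory is private to each openser process, so it can help to count the "no pkg memory left" errors per PID and per reporting function before digging into the memory-debug dump. The pattern below is modelled only on the single log line quoted above and may need adjusting for other syslog setups. The function name `count_pkg_oom` and the extra sample lines are made up for illustration.

```python
import re
from collections import Counter

# Pattern derived from the one log line quoted in this thread; the syslog
# prefix may differ on other systems (assumption).
PKG_OOM = re.compile(
    r"openser\[(?P<pid>\d+)\]:\s*ERROR:(?P<module>\w+):(?P<func>\w+):"
    r"\s*no pkg memory left"
)

def count_pkg_oom(lines):
    """Count 'no pkg memory left' errors per (pid, module, function)."""
    hits = Counter()
    for line in lines:
        match = PKG_OOM.search(line)
        if match:
            key = (int(match["pid"]), match["module"], match["func"])
            hits[key] += 1
    return hits

# First line is taken from the thread; the other two are invented samples.
sample = [
    "Aug 19 10:39:18 ipx022 /usr/local/sbin/openser[16991]: "
    "ERROR:core:new_credentials: no pkg memory left",
    "Aug 19 10:39:19 ipx022 /usr/local/sbin/openser[16991]: "
    "ERROR:core:new_credentials: no pkg memory left",
    "Aug 19 10:39:20 ipx022 /usr/local/sbin/openser[16992]: "
    "INFO:core:main: unrelated message",
]

for (pid, module, func), count in count_pkg_oom(sample).items():
    print(f"pid={pid} {module}:{func} -> {count} pkg OOM errors")
```

To run it against a real log, pass an open file object instead of `sample`, for example `count_pkg_oom(open("openser.log"))`.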
>> We are trying to recreate this in our lab so that we can
>> follow the memory troubleshooting instructions at
>> http://kamailio.net/dokuwiki/doku.php/troubleshooting:memory,
>> but so far we have been unable to do it, even when generating
>> millions of calls and registration transactions (we are
>> using SIPp to generate normal call flows and even the abnormal
>> call flows detected when reading openser.log, like
>> 'invalid cseq for aor', malformed SIP messages etc).
>>
>> We can spot memory leaks even if the "out of memory" message is
>> not printed. Just archive the logs (the most important part is the
>> shutdown time) and make them available for download so they
>> can be investigated.
>>
>> There could be two reasons:
>> - there is a memory leak, but it happens only in some cases that
>> occur in the production environment and that you don't reproduce
>> in the lab
>> - you get memory fragmentation
>>
>> Let's see first the debug messages...
>>
>>
>> Hello,
>> here is the link for the openser.log and cfg files:
>> http://www.yousendit.com/download/bVlEV0o4R3NoeWJIRGc9PQ
>>
>> After compiling with the debug flags for the memory manager, I left
>> openser running in production for 24 hours. Then I moved all
>> traffic to another host and waited for more than 30 minutes before
>> stopping openser.
>> In openser.cfg, I set debug=2. If you need, I can run it again
>> with a higher value (but I hope it doesn't have to be too high,
>> due to overhead concerns).
>>
>> Sorry, I forgot to mention one thing: the last revision that
>> showed this problem was 4809, so we reverted to that
>> revision before performing the above.
>>
>> Am I to understand that you couldn't reproduce it with the latest svn version?
>> So you had to get a previous version?
>>
>>
>> Hi,
>> no, the reason for reverting is that the latest version running in
>> production will not show the problem, because we adopted preventive
>> resets to minimize the impact on customer calls. So I don't know yet
>> whether it shows this problem or not.
>> So I collected the logs using a revision that I was sure could
>> recreate the problem.
> OK, I understand now. I was looking at the logs and there seems to be a
> leak with db operations - something does not free a db result. I will go
> over the modules that you are using and try to spot any issue. I will also
> check the change log to see if anything related to this was changed
> recently.
>>
>> But here are some developments in my investigation:
>> Up to now, I was trying to recreate the problem using virtual machines
>> running the same OS (Fedora 5) as in production. It never happened
>> there, even after 30 million calls.
>> But we were eventually able to test openser 1.3 on a production
>> machine with the same spec as the ones showing the problem, and we were
>> able to trigger the pkg memory problem using a simple outgoing SIPp
>> scenario. The problem always happens after we reach around 28.000
>> calls, and we confirmed that the number of calls needed to cause the
>> problem grows linearly with the amount of pkg memory (after increasing
>> the pkg memory pool by a factor of 4, the problem started to happen only
>> after around 128.000 calls).
>> However, we also ran the same tests with kamailio 1.4 (rev. 5017) on
>> that machine and could not recreate the problem after 1.5 million
>> calls, so we are thinking of simply upgrading to 1.4 once other
>> scenarios show that everything else is working.
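Editor's sketch: the two figures reported above (about 28.000 calls with the original pool, about 128.000 calls with a pool four times larger) are enough to fit a simple linear model, pool = baseline + leak_per_call * calls. The thread does not state the pkg pool size, so `ASSUMED_POOL` below is a placeholder of 1 MiB. The model also assumes a constant leak per call and a fixed amount of pkg memory in use before the first call, and the call counts are only approximate, so the result is a rough estimate at best.

```python
def fit_linear_leak(pool_a, calls_a, pool_b, calls_b):
    """Solve pool = baseline + leak * calls from two (pool, calls) points."""
    leak = (pool_b - pool_a) / (calls_b - calls_a)   # bytes lost per call
    baseline = pool_a - leak * calls_a               # bytes in use before any call
    return leak, baseline

# Hypothetical pool size in bytes: NOT given in the thread.
ASSUMED_POOL = 1024 * 1024

# Approximate call counts reported in the thread for 1x and 4x the pool.
leak, baseline = fit_linear_leak(ASSUMED_POOL, 28_000,
                                 4 * ASSUMED_POOL, 128_000)

print(f"estimated leak: {leak:.1f} bytes per call")
print(f"estimated baseline: {baseline / ASSUMED_POOL:.0%} of the original pool")
```

A leak estimate of a few tens of bytes per call would be in line with a small structure that is not freed once per transaction, but only the memory-debug dump can confirm where it comes from.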
> OK, 1.4 is recommended; it has a lot of new features and many fixes.
>>
>> But I don't know why the problem cannot be recreated using the VMs:
>> the only significant difference is that the production machines have
>> 4 NICs that are bonded in 2 pairs (one for the private IP and another for
>> the public IP), while the VMs have just one NIC.
> I see no relation with the NICs.
>>
>> I hope upgrading to 1.4 will solve everything. However, since nobody
>> else is complaining about openser stopping after 28.000 calls, I
>> still believe we have some problem in openser.cfg itself. I'll
>> check it after we put kamailio 1.4 in production.
> OK, I will dig further, though I might be a bit slow these days.
>
> Cheers,
> Daniel
>
> --
> Daniel-Constantin Mierla
> http://www.asipto.com
>