Hi Daniel!
Summary:
- Without t_release() (no modifications to the source code) openser leaks
  memory.
- With t_release() openser does not leak, but after some time it shows
  strange behaviour:
  - openser stops responding for some minutes and is then terminated
    with signal 9. When openser stops working the load rises above 40.
    This has happened 3 times now.
  - openser stops responding for some minutes and the Linux PC it runs
    on becomes unresponsive: no login is possible and open SSH sessions
    hang. I had to reboot the PC. This happened once.
Maybe this is not purely an openser problem, but a problem with openser
and Linux together (as I had to reboot the server once).
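One way to start could be to collect every signal exit from syslog and compare the timestamps with the kernel log for the same minute, to see whether something outside openser sent the kill. Below is a minimal Python sketch of that first step only. It assumes each event is a single syslog line of the form quoted further down in this thread ("child process 7648 exited by a signal 9"); the names find_signal_exits and signal_name and the regular expression are illustrative and not part of openser.

```python
import re
import signal

# Pattern modelled on the syslog line quoted in this thread, e.g.
#   Apr 30 15:01:03 ds3000 /usr/sbin/openser[7644]: child process 7648 exited by a signal 9
# Assumption: the raw syslog keeps this on one line (the mail wraps it).
EXIT_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+"
    r"\S*openser\[(?P<parent>\d+)\]:\s+"
    r"child process (?P<child>\d+)\s+exited by a signal (?P<sig>\d+)"
)


def signal_name(number):
    """Return the symbolic name for a signal number, if the platform knows it."""
    try:
        return signal.Signals(number).name
    except ValueError:
        return "signal %d" % number


def find_signal_exits(lines):
    """Collect one record per 'child process N exited by a signal M' line."""
    exits = []
    for line in lines:
        match = EXIT_RE.match(line)
        if match:
            sig = int(match.group("sig"))
            exits.append({
                "timestamp": match.group("ts"),
                "parent_pid": int(match.group("parent")),
                "child_pid": int(match.group("child")),
                "signal": sig,
                "signal_name": signal_name(sig),
            })
    return exits


if __name__ == "__main__":
    sample = [
        "Apr 30 15:01:03 ds3000 /usr/sbin/openser[7644]: child process 7648 exited by a signal 9",
        "Apr 30 15:01:14 ds3000 /usr/sbin/openser[7657]: INFO: signal 15 received",
    ]
    for event in find_signal_exits(sample):
        print(event["timestamp"], event["child_pid"], event["signal_name"])
```

Feeding it the lines of the real syslog file (the path depends on the syslog setup) gives a list of timestamps and child PIDs that can be checked against the kernel messages and the load around the same time.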
Any hints how to debug this?
regards
klaus
Daniel-Constantin Mierla wrote:
Hello Klaus,
I will try to find some ways to investigate the signal 9. As I said,
apart from the case where openser is waiting for the mem log to be
written, I know of no place where signal 9 is issued.
Regarding the tm stuff, I am not sure whether the mem leak you reported
as solved in your last answer was fixed by t_release() alone, or whether
you tried the second option as well (removing line 224 in tm/uac.c). Can
you give a short summary?
Cheers,
Daniel
On 04/30/07 16:34, Klaus Darilion wrote:
> Hi!
>
> I tried again and it happened again:
>
> Apr 30 15:00:54 ds3000 /usr/sbin/openser[7648]:
> 32b24f15e52d603ba890a9729723c4b0.0167///45-6782@83.136.32.132 PUBLISH
> detected, handle_publish ...
> outside t_newtran
> Apr 30 15:00:54 ds3000 /usr/sbin/openser[7655]:
> 32b24f15e52d603ba890a9729723c4b0.7e11///14-6782@83.136.32.132 PUBLISH
> detected, handle_publish ...
> outside t_newtran
> Apr 30 15:00:54 ds3000 /usr/sbin/openser[7648]:
> 32b24f15e52d603ba890a9729723c4b0.0167///45-6782@83.136.32.132 PUBLISH
> detected, handle_publish ...
> inside t_newtran
> Apr 30 15:00:54 ds3000 /usr/sbin/openser[7655]:
> 32b24f15e52d603ba890a9729723c4b0.7e11///14-6782@83.136.32.132 PUBLISH
> detected, handle_publish ...
> inside t_newtran
> Apr 30 15:01:03 ds3000 /usr/sbin/openser[7644]: child process 7648
> exited by a signal 9
> Apr 30 15:01:08 ds3000 /usr/sbin/openser[7644]: core was not generated
> Apr 30 15:01:08 ds3000 /usr/sbin/openser[7644]: INFO: terminating due
> to SIGCHLD
> Apr 30 15:01:14 ds3000 /usr/sbin/openser[7657]: INFO: signal 15 received
> Apr 30 15:01:14 ds3000 /usr/sbin/openser[7657]: Memory status (pkg):
> Apr 30 15:01:14 ds3000 /usr/sbin/openser[7657]: qm_status (0x8145960):
> Apr 30 15:01:14 ds3000 /usr/sbin/openser[7657]: heap size= 1048576
> Apr 30 15:01:14 ds3000 /usr/sbin/openser[7659]: INFO: signal 15 received
> Apr 30 15:01:14 ds3000 /usr/sbin/openser[7653]: INFO: signal 15 received
> Apr 30 15:01:14 ds3000 /usr/sbin/openser[7653]: Memory status (pkg):
> Apr 30 15:01:14 ds3000 /usr/sbin/openser[7650]: INFO: signal 15 received
>
>
> Any hints how to debug this?
>
> regards
> klaus
>
> Daniel-Constantin Mierla wrote:
>> Hello Klaus,
>>
>> On 04/30/07 13:55, Klaus Darilion wrote:
>>>
>>>
>>> Daniel-Constantin Mierla wrote:
>>>> Hello Klaus,
>>>>
>>>> On 04/27/07 09:27, Klaus Darilion wrote:
>>>>> Hi Daniel!
>>>>>
>>>>> I've tried with t_release and it ran fine for several hours
>>>>> without leaking. But then, unfortunately, openser terminated with
>>>>> signal 9. I've never seen this before.
>>>>
>>>> signal 9 is KILL, it is very strange if it was not issued by a user
>>>> or other process.
>>>>
>>>> We discovered the issue (tm/uac.c, line 224): there is a flag that is
>>>> kept to see if there was some operation with the transaction, but
>>>> in the case of handle_publish() that flag is set by the TM API when
>>>> sending NOTIFY. The patch is trivial, removing a line, but we have to
>>>> investigate whether there are other side effects, so it may take a
>>>> while. t_release() should solve it in the meantime.
>>>
>>> That should solve the memory leak, but what about the signal 9?
>> It might be that it took so long to write the mem log at shutdown
>> that openser killed itself. If it was not due to an openser stop, then
>> I am not aware of any case where signal 9 is sent unless issued by a user.
>>
>> Cheers,
>> Daniel
>>
>>>
>>> regards
>>> klaus
>>>
>