[SR-Users] kamailio crash periodically in timer handler

Péter Barabás peter.barabas at arenim.com
Tue May 12 19:12:01 CEST 2015


Hi Daniel,

If the patch works fine, I can push it to the main repository. But in its current state, until it is stable, I would rather not.
To answer your question: interestingly, the patch works as expected on Debian squeeze, but on Ubuntu trusty the NOTIFYs do not reach the clients. The code runs, as can be seen in the logs, but the NOTIFYs are not delivered. It is only a suspicion, but this may also be the cause of the memory crash after some time: on Debian there are no crashes, while on Ubuntu there are many.

It seems the problem is not related to the async patch: I have now tested with Kamailio 4.2.4 and the results are the same.

How can I continue debugging to solve the problem? What further information do you need?

Peter

From: Daniel-Constantin Mierla [mailto:miconda at gmail.com]
Sent: Tuesday, May 12, 2015 2:45 PM
To: Péter Barabás; Kamailio (SER) - Users Mailing List
Subject: Re: [SR-Users] kamailio crash periodically in timer handler

Hi Peter,

Good that you mentioned the patch, I had overlooked it. Is it something that you want pushed to the main repository, or a customization for your own needs? I didn't quite understand whether you think it is the cause of the issue or the solution.

If you want it pushed to the master repository, can you make a pull request on the GitHub project:

  - https://github.com/kamailio/kamailio

It makes it easier to review, comment if needed, and merge when all is ok.

On the other hand, if you were using t_suspend()/t_continue(), I pushed a patch that should avoid a race when removing a suspended transaction from the timer.

Now, for the new issue, do I understand correctly that it happens when running on Ubuntu, and is not visible on Debian, where everything works as expected? Which versions of Ubuntu and Debian are you using?

Cheers,
Daniel

On 12/05/15 13:05, Péter Barabás wrote:
Dear Daniel, community,

A few days ago I uploaded the diff that probably causes a rare crash in Kamailio, but as I mentioned, we have another issue as well.

Our clients are registered to Kamailio via TLS, and we want to change their presence (to offline) when the TLS connection is dropped.
The TLS disconnection is detected properly and the location record is removed, but presence is not updated properly on Ubuntu.

It works perfectly with Kamailio 4.2.4 on Debian, but does not work with 4.2.4 on Ubuntu trusty.

The configuration is the same, network environment is the same.
Kamailio is built from source, with a few modifications in ul_publish.c (diff was sent a few weeks ago).

What we want is that when a TCP socket breaks, the user's contacts get a NOTIFY with closed state. The NOTIFY seems to be generated, but the clients do not receive it. Here are some log excerpts about the NOTIFY (the ERROR-level lines are really DEBUG messages, logged as ERROR only to make them easier to pick out):
---------
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23221]: ERROR: pua_usrloc [ul_publish.c:341]: ul_publish(): #012EXPIRE type
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23221]: ERROR: pua_usrloc [ul_publish.c:350]: ul_publish(): Building pidf...
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23221]: ERROR: pua_usrloc [ul_publish.c:163]: build_pidf(): PUBLISH: found expired #012
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23221]: ERROR: pua_usrloc [ul_publish.c:166]: build_pidf(): Setting open to 0
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23221]: ERROR: pua_usrloc [ul_publish.c:110]: pua_set_publish(): Device: device
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23221]: ERROR: pua_usrloc [ul_publish.c:112]: pua_set_publish(): State: closed
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23221]: ERROR: pua_usrloc [ul_publish.c:114]: pua_set_publish(): Content: <status><state>offline</state><message>normal</message></status>
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23221]: ERROR: pua_usrloc [ul_publish.c:285]: build_pidf(): new_body:#012<?xml version="1.0"?>#012<presence xmlns="urn:ietf:params:xml:ns:pidf" xmlns:dm="urn:ietf:params:xml:ns:pidf:data-model" xmlns:rpid="urn:ietf:params:xml:ns:pidf:rpid" xmlns:c="urn:ietf:params:xml:ns:pidf:cipid" entity="sip:1821043984 at xxx.com">#012  <tuple id="closed">#012    <status>#012      <basic>close</basic>#012    </status>#012    <note><status><state>offline</state><message>normal</message></status></note>#012  </tuple>#012</presence>#012
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23221]: ERROR: pua_usrloc [ul_publish.c:380]: ul_publish(): uri= sip:1821043984 at xxx.com
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23221]: ERROR: pua_usrloc [ul_publish.c:136]: print_publ(): publ:
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23221]: ERROR: pua_usrloc [ul_publish.c:137]: print_publ(): uri= sip:1821043984 at xxx.com
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23221]: ERROR: pua_usrloc [ul_publish.c:138]: print_publ(): id= UL_PUBLISH.lsnTjwVj_QSobXpbRkCExA..
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23221]: ERROR: pua_usrloc [ul_publish.c:139]: print_publ(): expires= 0
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23221]: ERROR: pua_usrloc [ul_publish.c:444]: ul_publish(): Sending PUBLISH...
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23213]: INFO: presence [notify.c:1614]: send_notify_request(): NOTIFY sip:1792450464 at xxx.com via sip:xcaplist at xxx.com on behalf of sip:1821043984 at xxx.com for event presence
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23213]: INFO: presence [notify.c:1614]: send_notify_request(): NOTIFY sip:1792450464 at xxx.com via sip:xcaplist at xxx.com on behalf of sip:1821043984 at xxx.com for event presence
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23213]: INFO: presence [notify.c:1614]: send_notify_request(): NOTIFY sip:1792450464 at xxx.com via sip:xcaplist at xxx.com on behalf of sip:1821043984 at xxx.com for event presence
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23213]: INFO: presence [notify.c:1614]: send_notify_request(): NOTIFY sip:1792450464 at xxx.com via sip:xcaplist at xxx.com on behalf of sip:1821043984 at xxx.com for event presence
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23213]: INFO: presence [notify.c:1614]: send_notify_request(): NOTIFY sip:1549424873 at xxx.com via sip:xcaplist at xxx.com on behalf of sip:1821043984 at xxx.com for event presence
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23218]: INFO: rls [resource_notify.c:123]: get_dialog_from_did(): record not found in hash_table [rlsubs_did]= 2z1FcPpUy4xbqBTKZGdQ4g..;91f8f651;2b499685060cb5fb6380a8dd7f172ad6-f9b9
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23218]: INFO: rls [resource_notify.c:255]: send_notifies(): Dialog is NULL
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23218]: INFO: rls [resource_notify.c:123]: get_dialog_from_did(): record not found in hash_table [rlsubs_did]= 3is85cyMOm4RZhyFzjNllw..;f46c7707;2b499685060cb5fb6380a8dd7f172ad6-a884
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23218]: INFO: rls [resource_notify.c:255]: send_notifies(): Dialog is NULL
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23218]: INFO: rls [resource_notify.c:123]: get_dialog_from_did(): record not found in hash_table [rlsubs_did]= 740k58rpJgK0AWjeesW5Xw..;84887d68;2b499685060cb5fb6380a8dd7f172ad6-31ce
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23218]: INFO: rls [resource_notify.c:255]: send_notifies(): Dialog is NULL
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23218]: INFO: rls [resource_notify.c:123]: get_dialog_from_did(): record not found in hash_table [rlsubs_did]= iPVWP2vQ6k35ovJtmfGzjQ..;a1cc1376;2b499685060cb5fb6380a8dd7f172ad6-36fd
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23218]: INFO: rls [resource_notify.c:255]: send_notifies(): Dialog is NULL
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23218]: INFO: rls [resource_notify.c:123]: get_dialog_from_did(): record not found in hash_table [rlsubs_did]= jXrdUiNy6Ga3dNOe1ZenoA..;4f9f247e;2b499685060cb5fb6380a8dd7f172ad6-5727
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23218]: INFO: rls [resource_notify.c:255]: send_notifies(): Dialog is NULL
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23218]: INFO: rls [resource_notify.c:123]: get_dialog_from_did(): record not found in hash_table [rlsubs_did]= jXrdUiNy6Ga3dNOe1ZenoA..;4f9f247e;2b499685060cb5fb6380a8dd7f172ad6-5727
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23218]: INFO: rls [resource_notify.c:255]: send_notifies(): Dialog is NULL
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23218]: INFO: rls [resource_notify.c:123]: get_dialog_from_did(): record not found in hash_table [rlsubs_did]= mncpVELOtBwV7lacqGmK1Q..;52646447;2b499685060cb5fb6380a8dd7f172ad6-e916
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23218]: INFO: rls [resource_notify.c:255]: send_notifies(): Dialog is NULL
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23218]: INFO: rls [resource_notify.c:123]: get_dialog_from_did(): record not found in hash_table [rlsubs_did]= NEaF1zBPbDja6pEQ_wxhRw..;a037337d;2b499685060cb5fb6380a8dd7f172ad6-87a3
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23218]: INFO: rls [resource_notify.c:255]: send_notifies(): Dialog is NULL
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23218]: INFO: rls [resource_notify.c:123]: get_dialog_from_did(): record not found in hash_table [rlsubs_did]= qinj4ZFgRynvXr2TOtokTg..;9874a845;2b499685060cb5fb6380a8dd7f172ad6-0fdc
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23218]: INFO: rls [resource_notify.c:255]: send_notifies(): Dialog is NULL
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23218]: INFO: rls [resource_notify.c:123]: get_dialog_from_did(): record not found in hash_table [rlsubs_did]= v85ox7ScNNGDK4BWP7T-5Q..;6fe6b538;2b499685060cb5fb6380a8dd7f172ad6-e6ff
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23218]: INFO: rls [resource_notify.c:255]: send_notifies(): Dialog is NULL
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23218]: INFO: rls [resource_notify.c:123]: get_dialog_from_did(): record not found in hash_table [rlsubs_did]= XqXkXhaJlA4r7UOTZOl6qw..;e6908358;2b499685060cb5fb6380a8dd7f172ad6-8e1f
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23218]: INFO: rls [resource_notify.c:255]: send_notifies(): Dialog is NULL
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23218]: INFO: rls [resource_notify.c:123]: get_dialog_from_did(): record not found in hash_table [rlsubs_did]= Xrw7Za_5mqpNuFyGbEj_5w..;1ef4503f;2b499685060cb5fb6380a8dd7f172ad6-e6e3
May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio[23218]: INFO: rls [resource_notify.c:255]: send_notifies(): Dialog is NULL
--------------------
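
An excerpt like the one above is easier to read once it is grouped by module and function. The following is only a minimal sketch in Python, assuming the syslog layout shown in the excerpt; the names `LINE_RE`, `DID_RE` and `summarize` are illustrative helpers and not part of Kamailio. It counts messages per (module, function) and counts how often each rlsubs_did lookup failed.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpt above:
#   "... kamailio[PID]: LEVEL: module [file.c:line]: function(): message"
LINE_RE = re.compile(
    r"kamailio\[(?P<pid>\d+)\]: (?P<level>[A-Z]+): (?P<module>\w+) "
    r"\[(?P<file>[\w.]+):(?P<line>\d+)\]: (?P<func>\w+)\(\): (?P<msg>.*)"
)
DID_RE = re.compile(r"\[rlsubs_did\]= (\S+)")


def summarize(lines):
    """Count messages per (module, function) and failed rlsubs_did lookups."""
    per_func = Counter()
    missing_dids = Counter()
    for raw in lines:
        m = LINE_RE.search(raw)
        if m is None:
            continue  # not a kamailio line in the assumed layout
        per_func[(m["module"], m["func"])] += 1
        did = DID_RE.search(m["msg"])
        if did and "record not found" in m["msg"]:
            missing_dids[did.group(1)] += 1
    return per_func, missing_dids


# A few lines copied from the excerpt above.
P = "May 12 10:44:41 ctdsip3 /usr/local/sbin/kamailio"
NOT_FOUND = "get_dialog_from_did(): record not found in hash_table [rlsubs_did]= "
SAMPLE = [
    f"{P}[23213]: INFO: presence [notify.c:1614]: send_notify_request(): "
    "NOTIFY sip:1792450464 at xxx.com via sip:xcaplist at xxx.com "
    "on behalf of sip:1821043984 at xxx.com for event presence",
    f"{P}[23218]: INFO: rls [resource_notify.c:123]: {NOT_FOUND}"
    "2z1FcPpUy4xbqBTKZGdQ4g..;91f8f651;2b499685060cb5fb6380a8dd7f172ad6-f9b9",
    f"{P}[23218]: INFO: rls [resource_notify.c:255]: send_notifies(): Dialog is NULL",
    f"{P}[23218]: INFO: rls [resource_notify.c:123]: {NOT_FOUND}"
    "jXrdUiNy6Ga3dNOe1ZenoA..;4f9f247e;2b499685060cb5fb6380a8dd7f172ad6-5727",
    f"{P}[23218]: INFO: rls [resource_notify.c:123]: {NOT_FOUND}"
    "jXrdUiNy6Ga3dNOe1ZenoA..;4f9f247e;2b499685060cb5fb6380a8dd7f172ad6-5727",
]

per_func, missing_dids = summarize(SAMPLE)
for (module, func), count in sorted(per_func.items()):
    print(f"{module:10s} {func:25s} {count}")
for did, count in missing_dids.items():
    print(f"lookup failed {count}x for {did}")
```

Running the same function over the full syslog from both the Debian and the Ubuntu host would show whether the rls "record not found" / "Dialog is NULL" lines appear only on Ubuntu, which may help narrow down where the NOTIFYs are lost.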

Could you give any hints on how to start troubleshooting?
Is it possible that the issue comes from a system library that is buggy on Ubuntu but fine on Debian? Or that Kamailio is built with different parameters on the two systems?
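
One way to check the build-parameter question is to compare the compile flags that each binary reports. The following is a rough sketch in Python, assuming the output of `kamailio -V` has been saved on each host and contains a line starting with "flags:" followed by a comma-separated list. The helper names and the two sample strings are made up for illustration and are not real output from either system.

```python
def parse_flags(version_output):
    """Return the set of compile flags from the 'flags:' line, if any."""
    for line in version_output.splitlines():
        if line.startswith("flags:"):
            body = line[len("flags:"):]
            return {flag.strip() for flag in body.split(",") if flag.strip()}
    return set()


def diff_flags(debian_output, ubuntu_output):
    """Return (flags only on Debian, flags only on Ubuntu), both sorted."""
    debian = parse_flags(debian_output)
    ubuntu = parse_flags(ubuntu_output)
    return sorted(debian - ubuntu), sorted(ubuntu - debian)


# Hypothetical inputs for illustration only. Real input would be the saved
# output of `kamailio -V` from each host.
DEBIAN_SAMPLE = "version: kamailio 4.2.4\nflags: USE_TCP, USE_TLS, PKG_MALLOC, F_MALLOC\n"
UBUNTU_SAMPLE = "version: kamailio 4.2.4\nflags: USE_TCP, USE_TLS, PKG_MALLOC, DBG_QM_MALLOC\n"

only_debian, only_ubuntu = diff_flags(DEBIAN_SAMPLE, UBUNTU_SAMPLE)
print("only on Debian:", only_debian)
print("only on Ubuntu:", only_ubuntu)
```

If both flag sets turn out to be identical, the build parameters are less likely to be the cause, and comparing the versions of the linked system libraries on the two hosts would be the next thing to look at.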

Thank you in advance.

Péter

From: Daniel-Constantin Mierla [mailto:miconda at gmail.com]
Sent: Tuesday, April 28, 2015 2:28 PM
To: Péter Barabás; Kamailio (SER) - Users Mailing List
Subject: Re: [SR-Users] kamailio crash periodically in timer handler

Hello,

I haven't noticed any new memory operations, but there could still be some side effects.

Anyhow, since I saw a remark about it in another email from you: are you using t_suspend()/t_continue() in this config? Can you upgrade to the latest version from branch 4.2? There have been some fixes since version 4.2.3, which you mentioned you are running.

That way we can rule out that your instance is affected by the issues fixed in the meantime.

Daniel

On 28/04/15 11:50, Péter Barabás wrote:
Hi,

Attached is the svn diff of ul_publish.c.

Péter


From: Daniel-Constantin Mierla [mailto:miconda at gmail.com]
Sent: Tuesday, April 28, 2015 9:03 AM
To: Péter Barabás; Kamailio (SER) - Users Mailing List
Subject: Re: [SR-Users] kamailio crash periodically in timer handler

Hello,
On 27/04/15 23:59, Péter Barabás wrote:
Hi,

You gave me an idea of where to look for the error.
We modified the ul_publish.c source to achieve the following:

- when a client sends a REGISTER, an implicit PUBLISH is triggered, so every registration also results in a publication
- when the TCP socket is lost, the contacts of the lost user should get a presence NOTIFY with open=0 state, to show the unregistered (lost) state

Here is the ul_publish.c source; maybe the bug is in it, although on our dev system it works like a charm:
[...]

It is not easy to review the full file on its own. Can you post a diff with just your changes (use git diff if you cloned via git, or diff -u otherwise)? That would make it easier to analyze and to see whether you introduced the issue or it comes from somewhere else.

Cheers,
Daniel

--
Daniel-Constantin Mierla
http://twitter.com/#!/miconda - http://www.linkedin.com/in/miconda
Kamailio World Conference, May 27-29, 2015
Berlin, Germany - http://www.kamailioworld.com