[OpenSER-Devel] [ openser-Bugs-1876311 ] OpenSER Message of Death?

SourceForge.net noreply at sourceforge.net
Mon Jan 21 16:08:03 UTC 2008


Bugs item #1876311, was opened at 2008-01-21 01:03
Message generated for change (Comment added) made by nobody
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=743020&aid=1876311&group_id=139143

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Helmut Kuper (hekuper)
Assigned to: Nobody/Anonymous (nobody)
Summary: OpenSER Message of Death?

Initial Comment:
I found that when u send this SIP-Message:

BYE sip:80.228.245.138:5070 SIP/2.0
Via: SIP/2.0/UDP 212.6.95.88:1028;branch=z9hG4bK-7rhl6qnns4ej;rport
Route: <sip:212.6.123.118;lr=on;ftag=7w4sllx8cc>
Route:
<sip:212.6.123.118:5090;lr;ftag=7w4sllx8cc;vsf=ZlRqaD8wOzEbMTUlTiwzCnRNJ2F0eDBzMTZ0HVdlUg-->
Route: <sip:Gk-int at 80.228.245.176:5050;lr>
From: "HK SoftSwitch" <sip:v101 at sip0.en.ewetel.de>;tag=7w4sllx8cc
To: <sip:12345 at sip0.en.ewetel.de;user=phone>;tag=0082-00000054-0103i0
Call-ID: 3c26c9bec396-loo61fn1yu42
CSeq: 3 BYE
Max-Forwards: 70
Contact: <sip:v101 at 212.6.95.88:1028;line=nopsgn9d>;flow-id=1
User-Agent: snom370/7.1.30
RTP-RxStat:
Total_Rx_Pkts=120,Rx_Pkts=120,Rx_Pkts_Lost=0,Remote_Rx_Pkts_Lost=0
RTP-TxStat: Total_Tx_Pkts=2263,Tx_Pkts=2263,Remote_Tx_Pkts=160
Content-Length: 0


to an openser with this config:

#
# $Id: openser.cfg,v 1.6.2.1 2006/07/17 15:51:03 klaus_darilion Exp $
#
# simple quick-start config script
#

# ----------- global configuration parameters ------------------------

debug=4            # debug level (cmd line: -dddddddddd)
fork=no
log_stderror=1    # (cmd line: -E)

check_via=yes   # (cmd. line: -v)
dns=no          # (cmd. line: -r)
rev_dns=no      # (cmd. line: -R)
listen=212.6.123.118
port=5060
children=1

user="ruth"
group="voip"

# ------------------ module loading ----------------------------------
mpath="/home/hk/openser/lib/openser/modules/"

# Uncomment this if you want to use SQL database
loadmodule "tm.so"
loadmodule "sl.so"
loadmodule "rr.so"
loadmodule "uri.so"
# ----------------- AVP aliase (zum einfacheren Lesen) ---------------

# ------------------------- request routing logic -------------------

# main routing logic

route{
~                t_relay();
}



openser will stop working, consuming all CPU-Power.


regards
Helmut


----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2008-01-21 08:08

Message:
Logged In: NO 

The process that consumes 100% CPU and the one that have processed the BYE
message are not the same (look at the PIDs). You should check what is the
role of the process that consumes 100% CPU using openserctl ps. Then you
could see what it's actually doing using strace, ltrace and gdb.

Dan


----------------------------------------------------------------------

Comment By: Henning Westerholt (henningw)
Date: 2008-01-21 07:22

Message:
Logged In: YES 
user_id=337916
Originator: NO

Hm, strange. I would like to get a more detailed cause for this error than
'kernel problem'. Have you tried to analyze a core dump from the process
hanging in the loop? Perhaps this gives some more hints about the
problem..



----------------------------------------------------------------------

Comment By: Helmut Kuper (hekuper)
Date: 2008-01-21 07:10

Message:
Logged In: YES 
user_id=1851349
Originator: YES

No, there is no SIP-Traffic. Messages comes into openser an nothing comes
out.

I installed latest openser on ubuntu 2.6.20-15-server (feisty) with test
config and test message (see below). I replaced all sip0.en. ... as well as
IP-Adresses of my old server with new server's address and the error
disappeard. Obviously ubuntu 2.6.15-29-server (dapper) is not a good choice
for openser ...

grrr have to update my machines now ... or do u think we should run for
that error ?

Thanks for ur help so far.


----------------------------------------------------------------------

Comment By: Henning Westerholt (henningw)
Date: 2008-01-21 07:02

Message:
Logged In: YES 
user_id=337916
Originator: NO

I use (k)ubuntu 7.10, kernel 2.6.22-14. My environment is not identical to
your test setup, because of the network setup, perhaps this prevent the
triggering of this bug on my machine.

The periodical select is produced from the TCP receiver process and is not
problematical. Did you send the SIP msg over TCP or UDP? 

In the log it seems that another message is parsed before the msg you've
attached to this bug (branch z9hG4bK.7b5f690b). Where does this BYE came
from? Any signs of this loop you've mentioned above in the log files?

Cheers,

Henning

----------------------------------------------------------------------

Comment By: Helmut Kuper (hekuper)
Date: 2008-01-21 06:33

Message:
Logged In: YES 
user_id=1851349
Originator: YES

Hi Henning,

what kind of OS do u use ? I use ubuntu 2.6.15-29-server. Meybe there is a
bug in my OS ...  

----------------------------------------------------------------------

Comment By: Helmut Kuper (hekuper)
Date: 2008-01-21 05:59

Message:
Logged In: YES 
user_id=1851349
Originator: YES

Hi again,

I compiled openser trunk (build 3557) and tested it. Same result. Openser
hangs somwhere in a loop or so.

A strace as a wrapper around openser in no fork mode dbug level 4 says
this:

[pid  2829] time([1200923769])          = 1200923769
[pid  2829] write(2, "Jan 21 14:56:09 [2829] DBG:core:"..., 56Jan 21
14:56:09 [2829] DBG:core:parse_headers: flags=78
) = 56
[pid  2829] time([1200923769])          = 1200923769
[pid  2829] write(2, "Jan 21 14:56:09 [2829] DBG:tm:t_"..., 85Jan 21
14:56:09 [2829] DBG:tm:t_lookup_request: start searching: hash=19447,
isACK=0
) = 85
[pid  2829] time([1200923769])          = 1200923769
[pid  2829] write(2, "Jan 21 14:56:09 [2829] DBG:tm:ma"..., 81Jan 21
14:56:09 [2829] DBG:tm:matching_3261: RFC3261 transaction matching failed
) = 81
[pid  2829] time([1200923769])          = 1200923769
[pid  2829] write(2, "Jan 21 14:56:09 [2829] DBG:tm:t_"..., 69Jan 21
14:56:09 [2829] DBG:tm:t_lookup_request: no transaction found
) = 69
[pid  2830] <... select resumed> )      = 0 (Timeout)
[pid  2830] select(0, NULL, NULL, NULL, {0, 100000}) = 0 (Timeout)
[pid  2830] select(0, NULL, NULL, NULL, {0, 100000}) = 0 (Timeout)
[pid  2830] select(0, NULL, NULL, NULL, {0, 100000}) = 0 (Timeout)
[pid  2830] select(0, NULL, NULL, NULL, {0, 100000}) = 0 (Timeout)
[pid  2830] select(0, NULL, NULL, NULL, {0, 100000}) = 0 (Timeout)
[pid  2830] select(0, NULL, NULL, NULL, {0, 100000}) = 0 (Timeout)
[pid  2830] select(0, NULL, NULL, NULL, {0, 100000}) = 0 (Timeout)
[pid  2830] select(0, NULL, NULL, NULL, {0, 100000}) = 0 (Timeout)

the "select" line comes priodically (2-3 times per sec)



----------------------------------------------------------------------

Comment By: Helmut Kuper (hekuper)
Date: 2008-01-21 05:45

Message:
Logged In: YES 
user_id=1851349
Originator: YES

Hi Henning,

I got this log from openser:

Jan 21 14:36:40 [27027] DBG:core:parse_msg: SIP Request:
Jan 21 14:36:40 [27027] DBG:core:parse_msg:  method:  <BYE>
Jan 21 14:36:40 [27027] DBG:core:parse_msg:  uri:    
<sip:80.228.245.138:5070>
Jan 21 14:36:40 [27027] DBG:core:parse_msg:  version: <SIP/2.0>
Jan 21 14:36:40 [27027] DBG:core:parse_headers: flags=2
Jan 21 14:36:40 [27027] DBG:core:parse_via_param: found param type 232,
<branch> = <z9hG4bK.7b5f690b>; state=6
Jan 21 14:36:40 [27027] DBG:core:parse_via_param: found param type 235,
<rport> = <n/a>; state=6
Jan 21 14:36:40 [27027] DBG:core:parse_via_param: found param type 237,
<alias> = <n/a>; state=16
Jan 21 14:36:40 [27027] DBG:core:parse_via: end of header reached,
state=5
Jan 21 14:36:40 [27027] DBG:core:parse_headers: via found, flags=2
Jan 21 14:36:40 [27027] DBG:core:parse_headers: this is the first via
Jan 21 14:36:40 [27027] DBG:core:receive_msg: After parse_msg...
Jan 21 14:36:40 [27027] DBG:core:receive_msg: preparing to run routing
scripts...
Jan 21 14:36:40 [27027] DBG:core:parse_headers: flags=78
Jan 21 14:36:40 [27027] DBG:core:parse_via_param: found param type 232,
<branch> = <z9hG4bK-7rhl6qnns4ej>; state=6
Jan 21 14:36:40 [27027] DBG:core:parse_via_param: found param type 235,
<rport> = <n/a>; state=17
Jan 21 14:36:40 [27027] DBG:core:parse_via: end of header reached,
state=5
Jan 21 14:36:40 [27027] DBG:core:parse_headers: via found, flags=78
Jan 21 14:36:40 [27027] DBG:core:parse_headers: parse_headers: this is the
second via
Jan 21 14:36:40 [27027] DBG:core:parse_to_param: tag=0082-00000054-0103i0
Jan 21 14:36:40 [27027] DBG:core:parse_to: end of header reached,
state=29
Jan 21 14:36:40 [27027] DBG:core:parse_to: display={},
ruri={sip:12345 at sip0.en.ewetel.de;user=phone}
Jan 21 14:36:40 [27027] DBG:core:get_hdr_field: <To> [67];
uri=[sip:12345 at sip0.en.ewetel.de;user=phone]
Jan 21 14:36:40 [27027] DBG:core:get_hdr_field: to body
[<sip:12345 at sip0.en.ewetel.de;user=phone>]
Jan 21 14:36:40 [27027] DBG:core:get_hdr_field: cseq <CSeq>: <3> <BYE>
Jan 21 14:36:40 [27027] DBG:tm:t_lookup_request: start searching:
hash=19447, isACK=0
Jan 21 14:36:40 [27027] DBG:tm:matching_3261: RFC3261 transaction matching
failed
Jan 21 14:36:40 [27027] DBG:tm:t_lookup_request: no transaction found
Jan 21 14:36:40 [27027] DBG:core:parse_headers: flags=200
Jan 21 14:36:40 [27027] DBG:rr:is_preloaded: is_preloaded: No
Jan 21 14:36:40 [27027] DBG:core:grep_sock_info: checking if host==us:
14==13 &&  [80.228.245.138] == [212.6.123.118]
Jan 21 14:36:40 [27027] DBG:core:grep_sock_info: checking if port 5066
matches port 5070
Jan 21 14:36:40 [27027] DBG:core:grep_sock_info: checking if host==us:
14==13 &&  [80.228.245.138] == [212.6.123.118]
Jan 21 14:36:40 [27027] DBG:core:grep_sock_info: checking if port 5066
matches port 5070
Jan 21 14:36:40 [27027] DBG:core:check_self: host != me
Jan 21 14:36:40 [27027] DBG:core:grep_sock_info: checking if host==us:
13==13 &&  [212.6.123.118] == [212.6.123.118]
Jan 21 14:36:40 [27027] DBG:core:grep_sock_info: checking if port 5066
matches port 5060
Jan 21 14:36:40 [27027] DBG:core:grep_sock_info: checking if host==us:
13==13 &&  [212.6.123.118] == [212.6.123.118]
Jan 21 14:36:40 [27027] DBG:core:grep_sock_info: checking if port 5066
matches port 5060
Jan 21 14:36:40 [27027] DBG:core:check_self: host != me
Jan 21 14:36:40 [27027] DBG:rr:after_loose: Topmost URI is NOT myself
Jan 21 14:36:40 [27027] DBG:rr:after_loose: URI to be processed:
'sip:212.6.123.118;lr=on;ftag=7w4sllx8cc'
Jan 21 14:36:40 [27027] DBG:rr:after_loose: Next URI is a loose router
Loose Route
Jan 21 14:36:40 [27027] DBG:tm:t_newtran: transaction on entrance=(nil)
Jan 21 14:36:40 [27027] DBG:core:parse_headers: flags=ffffffffffffffff
Jan 21 14:36:40 [27027] DBG:core:get_hdr_field: content_length=0
Jan 21 14:36:40 [27027] DBG:core:get_hdr_field: found end of header
Jan 21 14:36:40 [27027] DBG:core:parse_headers: flags=78
Jan 21 14:36:40 [27027] DBG:tm:t_lookup_request: start searching:
hash=19447, isACK=0
Jan 21 14:36:40 [27027] DBG:tm:matching_3261: RFC3261 transaction matching
failed
Jan 21 14:36:40 [27027] DBG:tm:t_lookup_request: no transaction found


A strace on the cpu wasting process says nothing:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
27168 ruth      25   0 37328 3304 3076 R 98.8  0.4   0:07.32 openser
    1 root      16   0  1568  532  460 S  0.0  0.1   0:06.75 init
    2 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/0
    3 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/0
    4 root      RT   0     0    0    0 S  0.0  0.0   0:00.02 watchdog/0
    5 root      10  -5     0    0    0 S  0.0  0.0   0:01.71 events/0
    6 root      10  -5     0    0    0 S  0.0  0.0   0:00.01 khelper
    7 root      10  -5     0    0    0 S  0.0  0.0   0:00.02 kthread
    9 root      10  -5     0    0    0 S  0.0  0.0   0:00.37 kblockd/0
   10 root      20  -5     0    0    0 S  0.0  0.0   0:00.00 kacpid
  103 root      13  -5     0    0    0 S  0.0  0.0   0:00.00 aio/0
  102 root      15   0     0    0    0 S  0.0  0.0   0:45.57 kswapd0
  691 root      11  -5     0    0    0 S  0.0  0.0   0:00.01 kseriod
 1640 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 scsi_eh_0
 1725 root      15   0     0    0    0 S  0.0  0.0   0:17.81 kjournald
 1963 root      11  -4  2128  708  468 S  0.0  0.1   0:00.87 udevd
 2801 root      20   0     0    0    0 S  0.0  0.0   0:00.00
shpchpd_event
 2903 daemon    16  -2  1668  476  376 S  0.0  0.1   0:00.12 portmap
 3066 root      15   0     0    0    0 S  0.0  0.0   0:42.14 kjournald
 5063 root      16   0  1684  500  412 S  0.0  0.1   0:00.08 dd
 5065 klog      16   0  2408 1328  388 S  0.0  0.2   0:00.08 klogd
 5097 root      18   0 12552 2412 1364 S  0.0  0.3   0:00.02 slapd
 5171 root      15   0  1688  576  484 S  0.0  0.1   4:38.29
vmware-guestd
 5226 root      16   0  2644 1316 1064 S  0.0  0.2   0:00.03 mysqld_safe
 5299 mysql     16   0  135m  40m 4880 S  0.0  5.4   1:45.73 mysqld
 5300 root      16   0  1552  504  432 S  0.0  0.1   0:00.00 logger
 5417 root      16   0  4768 1056  756 S  0.0  0.1   0:00.12 sshd
 5456 root      16   0  2216  796  664 S  0.0  0.1   0:06.50 xinetd
 5473 root      18   0  1728  716  616 S  0.0  0.1   0:00.00 rpc.statd
 5555 root      16   0  8228 2092  980 S  0.0  0.3   0:15.42 sendmail-mta
ruth at vm-en-test0:~/tmp$ strace -p 27168
Process 27168 attached - interrupt to quit




When I start openser in no fork mode and within strace I got this at last
lines:

write(2, "Jan 21 14:39:05 [27054] DBG:tm:t"..., 72Jan 21 14:39:05 [27054]
DBG:tm:t_newtran: transaction on entrance=(nil)
) = 72
time([1200922745])                      = 1200922745
write(2, "Jan 21 14:39:05 [27054] DBG:core"..., 71Jan 21 14:39:05 [27054]
DBG:core:parse_headers: flags=ffffffffffffffff
) = 71
time([1200922745])                      = 1200922745
write(2, "Jan 21 14:39:05 [27054] DBG:core"..., 65Jan 21 14:39:05 [27054]
DBG:core:get_hdr_field: content_length=0
) = 65
time([1200922745])                      = 1200922745
write(2, "Jan 21 14:39:05 [27054] DBG:core"..., 68Jan 21 14:39:05 [27054]
DBG:core:get_hdr_field: found end of header
) = 68
time([1200922745])                      = 1200922745
write(2, "Jan 21 14:39:05 [27054] DBG:core"..., 57Jan 21 14:39:05 [27054]
DBG:core:parse_headers: flags=78
) = 57
time([1200922745])                      = 1200922745
write(2, "Jan 21 14:39:05 [27054] DBG:tm:t"..., 86Jan 21 14:39:05 [27054]
DBG:tm:t_lookup_request: start searching: hash=19447, isACK=0
) = 86
time([1200922745])                      = 1200922745
write(2, "Jan 21 14:39:05 [27054] DBG:tm:m"..., 82Jan 21 14:39:05 [27054]
DBG:tm:matching_3261: RFC3261 transaction matching failed
) = 82
time([1200922745])                      = 1200922745
write(2, "Jan 21 14:39:05 [27054] DBG:tm:t"..., 70Jan 21 14:39:05 [27054]
DBG:tm:t_lookup_request: no transaction found
) = 70


I tested it against an umodified (I extended avp aliases for using script
flags) openser 1.4  ... same problem.


regards
helmut



----------------------------------------------------------------------

Comment By: Henning Westerholt (henningw)
Date: 2008-01-21 04:05

Message:
Logged In: YES 
user_id=337916
Originator: NO

Hi Helmut,

i've tried to reproduce this bug on my machine. This config and sip
message (of course with adapted IP and user) works for me. I've tried
several permutations of IP and CFGs, unfortunally without any trace to the
bug you described. I get this debug message:

Jan 21 13:02:03 [1172] DBG:core:parse_msg: SIP Request:
Jan 21 13:02:03 [1172] DBG:core:parse_msg:  method:  <BYE>
Jan 21 13:02:03 [1172] DBG:core:parse_msg:  uri:    
<sip:80.228.245.138:5070>
Jan 21 13:02:03 [1172] DBG:core:parse_msg:  version: <SIP/2.0>
Jan 21 13:02:03 [1172] DBG:core:parse_headers: flags=2
Jan 21 13:02:03 [1172] DBG:core:parse_via_param: found param type 232,
<branch> = <z9hG4bK-7rhl6qnns4ej>; state=6
Jan 21 13:02:03 [1172] DBG:core:parse_via_param: found param type 235,
<rport> = <n/a>; state=17
Jan 21 13:02:03 [1172] DBG:core:parse_via: end of header reached, state=5
Jan 21 13:02:03 [1172] DBG:core:parse_headers: via found, flags=2
Jan 21 13:02:03 [1172] DBG:core:parse_headers: this is the first via
Jan 21 13:02:03 [1172] DBG:core:receive_msg: After parse_msg...
Jan 21 13:02:03 [1172] DBG:core:receive_msg: preparing to run routing
scripts...
Jan 21 13:02:03 [1172] DBG:tm:t_newtran: transaction on
entrance=0xffffffff
Jan 21 13:02:03 [1172] DBG:core:parse_headers: flags=ffffffffffffffff
Jan 21 13:02:03 [1172] DBG:core:parse_to_param: tag=0082-00000054-0103i0
Jan 21 13:02:03 [1172] DBG:core:parse_to: end of header reached, state=29
Jan 21 13:02:03 [1172] DBG:core:parse_to: display={},
ruri={sip:12345 at sip0.en.ewetel.de;user=phone}
Jan 21 13:02:03 [1172] DBG:core:get_hdr_field: <To> [66];
uri=[sip:12345 at sip0.en.ewetel.de;user=phone]
Jan 21 13:02:03 [1172] DBG:core:get_hdr_field: to body
[<sip:12345 at sip0.en.ewetel.de;user=phone>]
Jan 21 13:02:03 [1172] DBG:core:get_hdr_field: cseq <CSeq>: <3> <BYE>
Jan 21 13:02:03 [1172] DBG:core:get_hdr_field: content_length=0
Jan 21 13:02:03 [1172] ERROR:tm:t_newtran: EoH not parsed
Jan 21 13:02:03 [1172] DBG:core:destroy_avp_list: destroying list (nil)
Jan 21 13:02:03 [1172] DBG:core:receive_msg: cleaning up

Could you please try again on your machine, and compare the debug output
that is reported? You could also try to attach with 'strace' to the server
process and look where its hangs.

Cheers,

Henning

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=743020&aid=1876311&group_id=139143



More information about the Devel mailing list