[Kamailio-Devel] OpenSer 1.3.0 crash

Henning Westerholt henning.westerholt at 1und1.de
Tue Sep 16 11:04:26 CEST 2008


On Monday 15 September 2008, Robin Vleij wrote:
> Well, that one happened a long time ago and I actually posted our
> backtrace in that bug report you're talking about. It looks different
> now (I'm not an expert with gdb, but still stuff looks different) with
> the new traces. Thing is, after making sure the max while loops is
> higher, it's been running really OK. One more thing to note that we
> think is odd: the only server in our farm that is getting the problems,
> is the one with lowest traffic load. The other ones are fine, running
> the same compile, same hardware and same config. On the other hand,
> we've only HAD 3 crashes, so we're statistically not sure it's only that
> machine. :)

Hi Robin,

ok, that explains why this bug seems so familiar. :-) Yes, you're correct, the 
traces are not really identical. 

> >> #0  free_hostent (dst=0x6e3fe0) at proxy.c:203
> >> 203                     for (r=0; dst->h_addr_list[r];r++) {
> >
> > Can you inspect the value of the 'dst' parameter with gdb, to see what is
> > invalid in there? And then take a look to the previous t_relay
> > statements, where and why it was introduced?
>
> How do I inspect the value of that dst parameter? With my gdb knowledge
> I don't get much further than:
>
> Core was generated by `/usr/local/sbin/openser -P
> /var/run/openser/openser.pid -m 64 -u root -g root'.
> Program terminated with signal 11, Segmentation fault.
> #0  free_hostent (dst=0x6e3fe0) at proxy.c:203
> 203                     for (r=0; dst->h_addr_list[r];r++) {
> (gdb) print dst
> $1 = (struct hostent *) 0x6e3fe0
> (gdb) print dst->h_addr_list
> $2 = (char **) 0x38
> (gdb) print dst->h_addr_list[0]
> Cannot access memory at address 0x38
>
> What do you need more from me and how do I get that with gdb and my cores?

The h_addr_list field should normally point to a NULL-terminated list of
addresses for the host. Here it holds 0x38, which is not a valid pointer, so
it seems the structure got corrupted somehow. You could investigate in the
calling frames why this happens: switch to the caller with e.g. "f 1" [1],
examine the variables there with "print", and so on up the stack. You can
also print the source code around the current line with "list" [2].
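
To illustrate why the loop at proxy.c:203 faults, here is a minimal sketch of
the same kind of walk over h_addr_list, with a crude sanity check in front of
it. The helper names (looks_like_pointer, count_addrs) and the 4096-byte
threshold are assumptions made up for this example, they are not taken from
the OpenSER sources, and the check is only a debugging heuristic.

```c
#include <netdb.h>   /* struct hostent */
#include <stddef.h>  /* NULL */
#include <stdint.h>  /* uintptr_t */

/* Hypothetical heuristic: treat addresses inside the first 4096 bytes as
 * invalid. This assumes the first page is never mapped, which is typical
 * but not guaranteed. A value like 0x38 fails this check. */
static int looks_like_pointer(const void *p)
{
    return (uintptr_t)p >= 4096;
}

/* Hypothetical helper: count the entries of a NULL-terminated address list,
 * the same way the loop in the backtrace walks it. Returns -1 instead of
 * dereferencing a list pointer that is obviously bogus. */
static int count_addrs(const struct hostent *h)
{
    int r;

    if (h == NULL || !looks_like_pointer(h->h_addr_list))
        return -1;

    /* Stops at the terminating NULL entry. */
    for (r = 0; h->h_addr_list[r]; r++)
        ;
    return r;
}
```

With a healthy structure count_addrs() returns the number of addresses; with
h_addr_list set to 0x38, as in the core, it returns -1 where the original
loop would segfault. This only detects the symptom, the corruption itself
still has to be traced in the calling frames.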

> >> Last crash:
> >> #0  free_lump_list (l=0x636d20) at data_lump.c:412
> >> #1  0x000000000048ed02 in free_sip_msg (msg=0x6df7b8) at
> >> parser/msg_parser.c:661
> >
> > And for this crash, the value of the 'l' lump list? Daniel, perhaps this
> > is related to the header cloning issues that you investigate at the
> > moment? Just an idea..
>
> Same lack-of-gdb-knowledge here, what do I do to get these values of the
> lump list?

Just as you did for the first crash: print the values in gdb (e.g. "print l"
and "print *l" in frame 0), and try to investigate the source of the
problem. :-)

Hope that helps,

Henning

[1] http://sourceware.org/gdb/download/onlinedocs/gdb_7.html#SEC49
[2] http://sourceware.org/gdb/download/onlinedocs/gdb_8.html#SEC52


