Hi,
we have got a series of SER core dumps with a failed assertion in
t_retransmit_reply():
assert(t->uas.response.dst.send_sock);
The essential part of the backtrace:
#0 0x48167fc4 in kill () from /usr/lib/libc.so.4
#1 0x481a993e in abort () from /usr/lib/libc.so.4
#2 0x481858d3 in __assert () from /usr/lib/libc.so.4
#3 0x48281da3 in t_retransmit_reply () from /usr/local/lib/ser/modules/tm.so
#4 0x48277433 in t_newtran () from /usr/local/lib/ser/modules/tm.so
The corresponding place in ser.cfg:
    if (!t_newtran()) {
        sl_send_reply("500", "could not create transaction");
        break;
    };
response.dst.send_sock is filled in by init_rb(), which is called
from t_newtran(). t_newtran() contains the following:
=== cut ===
    UNLOCK_HASH(p_msg->hash_index);
    /* now, when the transaction state exists, check if
       there is a meaningful Via and calculate it; better
       do it now than later: state is established so that
       subsequent retransmissions will be absorbed and will
       not possibly block during Via DNS resolution; doing
       it later would only burn more CPU as if there is an
       error, we cannot relay later whatever comes out of the
       the transaction
    */
    if (!init_rb( &T->uas.response, p_msg)) {
        LOG(L_ERR, "ERROR: t_newtran: unresolvable via1\n");
        put_on_wait( T );
        t_unref(p_msg);
        return E_BAD_VIA;
    }
    return 1;
=== end cut ===
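Is there a race condition between UNLOCK_HASH(), init_rb() and a
parallel t_newtran() in another SER process? That is, can another
process that receives a retransmission find the new transaction
after UNLOCK_HASH() but before init_rb() has filled in send_sock,
and so hit the assertion in t_retransmit_reply()?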
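
To make the suspected interleaving concrete, here is a minimal
single-process sketch that simply replays the steps in the order I
suspect they happen. All sketch_* names and the simplified structs
are hypothetical stand-ins, not the real tm code; it models only the
ordering of the steps, not the real locking or shared memory.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the real SER/tm types. */
struct sketch_sock { int fd; };
struct sketch_cell {
    int in_table;                         /* visible to lookups by other processes */
    const struct sketch_sock *send_sock;  /* models uas.response.dst.send_sock */
};

/* Process A, step 1: the cell is in the hash table and the lock is released. */
static void sketch_insert_and_unlock(struct sketch_cell *t)
{
    t->in_table = 1;
}

/* Process A, step 2: init_rb() fills send_sock only after the unlock. */
static void sketch_init_rb(struct sketch_cell *t, const struct sketch_sock *s)
{
    t->send_sock = s;
}

/* Process B, handling a retransmission: returns -1 if the cell is not
   found, 0 if the assert on send_sock would fail, 1 if it would hold. */
static int sketch_retransmit_check(const struct sketch_cell *t)
{
    if (!t->in_table)
        return -1;
    return t->send_sock != NULL;
}

/* Replays the suspected order: A unlocks, B looks up, A runs init_rb().
   Returns what process B observed inside the window. */
static int sketch_replay_race(void)
{
    static const struct sketch_sock sock = { 5 };
    struct sketch_cell t = { 0, NULL };
    int seen_by_b;

    sketch_insert_and_unlock(&t);            /* process A */
    seen_by_b = sketch_retransmit_check(&t); /* process B, inside the window */
    sketch_init_rb(&t, &sock);               /* process A, too late for B */

    /* Once init_rb() has run, the same check holds. */
    assert(sketch_retransmit_check(&t) == 1);
    return seen_by_b;
}
```

In this replay sketch_replay_race() returns 0, i.e. process B finds
the transaction while send_sock is still NULL, which is the state the
assertion in t_retransmit_reply() rejects. Whether the real code can
actually reach this order is exactly my question.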
Details: FreeBSD 4.10, SER 0.9.3 (with PortaOne additions, which
should not matter for this question).
--
Valentin Nechayev
PortaOne Inc., Software Engineer
mailto:netch@portaone.com