[Devel] Re: [Serusers] Re: Fw: [Users] TM : retransmission timers
Andrei Pelinescu-Onciul
andrei at iptel.org
Tue Nov 28 16:08:37 CET 2006
On Nov 28, 2006 at 02:03, Andrei Pelinescu-Onciul <andrei at iptel.org> wrote:
> On Nov 27, 2006 at 23:14, Andrei Pelinescu-Onciul <andrei at iptel.org> wrote:
> > On Nov 27, 2006 at 19:02, Bogdan-Andrei Iancu <bogdan at voice-system.ro> wrote:
> > > small typo below...
> > >
> > > Bogdan-Andrei Iancu wrote:
> > >
> > > >Hi Klaus,
> > > >
> > > >as the discussion about ser's improvements bounces again to openser, I
> > > >had to do a bit of digging to provide a correct answer to openser's
> > > >users.
> > > >
> > > >yes, there were some improvements did by Andrei to TM, mainly in timer
> > > >implementation. As you were wondering, the 0.9.6 SER should be
> > > >relatively close to openser 0.9.4 as TM performance and merging the
> > > >results from Vaclav with Andrei tests before the timer improvement in
> > > >SER 0.9.6, seams to be correct. See:
> > > > http://lists.iptel.org/pipermail/serdev/2005-December/006583.html
> > > >
> > > >After this improvement, SER's 0.9.6 performances dramatically
> > > >increased, but unfortunately, according to our tests it is also
> > > >dramatically wrong. TM timers are not working correctly when variable
> > > >timeouts are used in SER.
> > > >
> > > >With the improvement, the following scenario gets broken - 3 calls
> > > >only, no load, default cfg:
> > > >
> > > >CALL 1: has 60 secs timeout for Final_response_timeout - nobody
> > > >answers, still ringing
> > > >in less than 2 secs ->
> > > >CALL 2: has 70 secs timeout - it is immediately answered.
> > > >in less than 2 secs ->
> > > >CALL 3: has 10 secs timeout for Final_response_timeout - nobody
> > > >answers, ringing.
> > > >
> > > >Of course, everybody expects that CALL3 will timeout before CALL1
> > > >(with more than 40 secs), but in SER 0.9.6 (latest stable for the last
> > > >2 years), this will not happened - both CALL3 and CALL1 will timeout
> > > >simultaneously when the CALL3 timer hits.
> >
> >
> > Thanks a lot for the bug report.
> >
>
> I've made a quick fix attempt (patch attached).
> Note that this is work in progress, very lightly tested, not optimized.
> It's just a "patch" draft, if it proves not to break something else (it
> needs a lot of testing which complicated call scenarios), it can be
> optimized later (e.g. lock only once when deleting the timers for all
> the branches). I've made it available so early so that other people can
> test it, improve it, point obvious problems a.s.o.
> The basic ideea is to immediately remove the fr_timer from the fr timer lists,
> so that no fr_timer with TIMER_DELETED will remain on the FR*LISTs
> (which is what causes problems with the variable timers).
>
>
> Some quick numbers (don't quote them, just for patched/not patched/old stuff
> comparison):
>
> ser 0.9.7 (no patch) ~20000 cps (*)
>
> ser 0.9.7 (patched) about the same (*)
>
> ser 0.9.7 with the 0.9.4 fix removed <4000 cps (**)
> (TIMER_SKIP_DELETED defined )
>
> (*) - unfortunately I cannot generate more then 20000-21000 cps right
> now, so this is not the maximum ser can handle, is just how much I could
> generate without too much trouble. Memory becomes also a problem,
> on a 64 bit machine the 20000 cps need over 900Mb of memory for
> ser 0.9.7 (ser.cvs needs ~800Mb).
>
> (**) starts @4000 then drops to <1000
>
> Note also that this are in fact non-INVITE transaction/s tests.
> INVITEs should be slower (more complex & more packets).
> Normal calls per second (INVITE, then BYE) should be a little faster
> then twice as slow (2 transaction per call).
>
> I've left openser.org lists in the cc in case the openser developers are
> also interested in fixing this (which would enable them to use the 0.9.4
> fix).
Another completely different version, this time a new member is added to the
timer_link structure, just to keep an is_deleted flag.
Advantages: - simpler, much easier to verify (doesn't need so much testing)
Disadvantages: a little more memory is used (can be seen after testing),
probably a little slower in the normal case (though both versions maxed
out my testing sipp, I think the other is faster), probably
significantly slower if a lot of different variable timers are used.
Andrei
>
>
> Andrei
> Index: t_cancel.c
> ===================================================================
> RCS file: /cvsroot/ser/sip_router/modules/tm/t_cancel.c,v
> retrieving revision 1.13.2.1
> diff -u -r1.13.2.1 t_cancel.c
> --- t_cancel.c 13 Jun 2006 13:31:48 -0000 1.13.2.1
> +++ t_cancel.c 28 Nov 2006 00:27:42 -0000
> @@ -147,7 +147,7 @@
>
> /* stop_rb_timers(&t->uac[i].request); */
> reset_timer( &t->uac[i].request.retr_timer );
> - reset_timer( &t->uac[i].request.fr_timer );
> + del_fr_timer( &t->uac[i].request.fr_timer );
> }
> else {
>
> Index: t_funcs.c
> ===================================================================
> RCS file: /cvsroot/ser/sip_router/modules/tm/t_funcs.c,v
> retrieving revision 1.175.2.3
> diff -u -r1.175.2.3 t_funcs.c
> --- t_funcs.c 28 Aug 2006 11:21:02 -0000 1.175.2.3
> +++ t_funcs.c 28 Nov 2006 00:27:42 -0000
> @@ -133,7 +133,7 @@
> {
> set_kr(REQ_RLSD);
>
> - reset_timer( & trans->uas.response.fr_timer );
> + del_fr_timer( & trans->uas.response.fr_timer );
> reset_timer( & trans->uas.response.retr_timer );
>
> cleanup_uac_timers( trans );
> Index: t_fwd.c
> ===================================================================
> RCS file: /cvsroot/ser/sip_router/modules/tm/t_fwd.c,v
> retrieving revision 1.61.2.2
> diff -u -r1.61.2.2 t_fwd.c
> --- t_fwd.c 28 Aug 2006 11:21:02 -0000 1.61.2.2
> +++ t_fwd.c 28 Nov 2006 00:27:42 -0000
> @@ -349,7 +349,7 @@
> * retransmission timers
> */
> reset_timer(&t_invite->uac[i].request.retr_timer);
> - reset_timer(&t_invite->uac[i].request.fr_timer);
> + del_fr_timer(&t_invite->uac[i].request.fr_timer);
>
> /* Generate faked reply */
> LOCK_REPLIES(t_invite);
> Index: t_reply.c
> ===================================================================
> RCS file: /cvsroot/ser/sip_router/modules/tm/t_reply.c,v
> retrieving revision 1.98.2.6
> diff -u -r1.98.2.6 t_reply.c
> --- t_reply.c 12 Jun 2006 08:22:48 -0000 1.98.2.6
> +++ t_reply.c 28 Nov 2006 00:27:43 -0000
> @@ -66,6 +66,7 @@
> * the request (bogdan)
> * 2005-09-01 reverted to the old way of checking response.dst.send_sock
> * in t_retransmit_reply & reply_light (andrei)
> + * 2006-12-27 replaced reset(fr_timer) with del_fr_timer(...) (andrei)
> */
>
>
> @@ -954,7 +955,7 @@
> /* reset FR/retransmission timers */
> for (i=0; i<t->nr_of_outgoings; i++ ) {
> reset_timer( &t->uac[i].request.retr_timer );
> - reset_timer( &t->uac[i].request.fr_timer );
> + del_fr_timer( &t->uac[i].request.fr_timer );
> }
> DBG("DEBUG: cleanup_uac_timers: RETR/FR timers reset\n");
> }
> @@ -1282,7 +1283,7 @@
> /* ... then just stop timers */
> reset_timer( &uac->local_cancel.retr_timer);
> if ( msg_status >= 200 ) {
> - reset_timer( &uac->local_cancel.fr_timer);
> + del_fr_timer( &uac->local_cancel.fr_timer);
> }
> DBG("DEBUG: reply to local CANCEL processed\n");
> goto done;
> @@ -1294,7 +1295,7 @@
>
> /* stop final response timer only if I got a final response */
> if ( msg_status >= 200 ) {
> - reset_timer( &uac->request.fr_timer);
> + del_fr_timer(&uac->request.fr_timer);
> }
>
> /* acknowledge negative INVITE replies (do it before detailed
> Index: timer.c
> ===================================================================
> RCS file: /cvsroot/ser/sip_router/modules/tm/timer.c,v
> retrieving revision 1.58.2.3
> diff -u -r1.58.2.3 timer.c
> --- timer.c 6 Dec 2005 13:20:14 -0000 1.58.2.3
> +++ timer.c 28 Nov 2006 00:27:43 -0000
> @@ -98,6 +98,8 @@
> * 2003-06-27 timers are not unlinked if timerlist is 0 (andrei)
> * 2004-02-13 t->is_invite, t->local, t->noisy_ctimer replaced;
> * timer_link.payload removed (bogdan)
> + * 2006-11-27 added del_fr_timer(): fr timers are immediately removed
> + * from the FR* lists (andrei)
> */
>
> #include "defs.h"
> @@ -322,7 +324,7 @@
> "request resending (t=%p, %.9s ... )\n",
> r_buf->my_T, r_buf->buffer);
> if (SEND_BUFFER( r_buf )==-1) {
> - reset_timer( &r_buf->fr_timer );
> + del_fr_timer( &r_buf->fr_timer );
> fake_reply(r_buf->my_T, r_buf->branch, 503 );
> return;
> }
> @@ -436,7 +438,7 @@
> int i;
> for (i=0; i<t->nr_of_outgoings; i++ ) {
> reset_timer( &t->uac[i].local_cancel.retr_timer );
> - reset_timer( &t->uac[i].local_cancel.fr_timer );
> + del_fr_timer( &t->uac[i].local_cancel.fr_timer );
> }
> }
>
> @@ -776,6 +778,33 @@
> #endif
> }
>
> +
> +
> +/* remove a timer from the FR_TIMER_LIST or FR_INV_TIMER_LIST
> + * (it allows immediate delete of a fr timer => solves the race with
> + * variables timers inserted after longer deleted timers)
> + * WARNING: - don't try to use it to "move" a timer from one list
> + * to another, you'll run into races
> + */
> +void del_fr_timer( struct timer_link *tl)
> +{
> + /* the FR lock is common/shared by both FR_INV_TIMER_LIST
> + * and FR_TIMER_LIST, so we must lock only one of them */
> + lock(timertable->timers[FR_TIMER_LIST].mutex);
> + /* check first if we are on the "detached" timer_routine list (the fr
> + * handle is executing or timer_routine prepares to execute it).
> + * if so do nothing, except reseting the timer to TIMER_DELETED
> + * (just to give us a change at racing with timer_routine, if
> + * TIMER_DELETED is set and the fr handle is not already executing =>
> + * it will not be called anymore)
> + */
> + if (tl->timer_list!=DETACHED_LIST){
> + remove_timer_unsafe(tl); /* safe to call for null list */
> + }else{
> + reset_timer(tl);
> + }
> + unlock(timertable->timers[FR_TIMER_LIST].mutex);
> +}
>
>
>
> Index: timer.h
> ===================================================================
> RCS file: /cvsroot/ser/sip_router/modules/tm/timer.h,v
> retrieving revision 1.37
> diff -u -r1.37 timer.h
> --- timer.h 1 Nov 2004 14:09:09 -0000 1.37
> +++ timer.h 28 Nov 2006 00:27:43 -0000
> @@ -109,6 +109,8 @@
> */
>
> void reset_timer( struct timer_link* tl );
> +/* remove a timer from FR_TIMER_LIST or FR_INV_TIMER_LIST */
> +void del_fr_timer( struct timer_link *tl);
> /* determine timer length and put on a correct timer list */
> void set_timer( struct timer_link *new_tl, enum lists list_id, unsigned int* ext_timeout );
> /* similar to set_timer, except it allows only one-time
> _______________________________________________
> Serusers mailing list
> Serusers at lists.iptel.org
> http://lists.iptel.org/mailman/listinfo/serusers
-------------- next part --------------
Index: timer.c
===================================================================
RCS file: /cvsroot/ser/sip_router/modules/tm/timer.c,v
retrieving revision 1.58.2.3
diff -u -r1.58.2.3 timer.c
--- timer.c 6 Dec 2005 13:20:14 -0000 1.58.2.3
+++ timer.c 28 Nov 2006 14:56:28 -0000
@@ -98,6 +98,8 @@
* 2003-06-27 timers are not unlinked if timerlist is 0 (andrei)
* 2004-02-13 t->is_invite, t->local, t->noisy_ctimer replaced;
* timer_link.payload removed (bogdan)
+ * 2006-11-28 switched to using the new timer_link flag for the timer
+ * deleted check (andrei)
*/
#include "defs.h"
@@ -536,7 +538,9 @@
struct timer_table *tm_init_timers()
{
enum lists i;
-
+#ifdef TIMER_DELETED_FLAG
+ LOG(L_CRIT, "timer init, TIMER_DELETED_FLAG suppport\n");
+#endif
timertable=(struct timer_table *) shm_malloc(sizeof(struct timer_table));
if (!timertable) {
LOG(L_ERR, "ERROR: tm_init_timers: no shmem for timer_Table\n");
@@ -660,7 +664,7 @@
ptr = ptr->prev_tl) {
if (
#ifdef TIMER_SKIP_DELETED
- (ptr->time_out != TIMER_DELETED) &&
+ (!timer_is_deleted(ptr)) &&
#endif
(ptr->time_out <= time_out)) break;
}
@@ -770,7 +774,11 @@
but not execute; there is a race condition, though -- see
timer.c for more details
*/
+#ifdef TIMER_DELETED_FLAG
+ tl->is_deleted=1;
+#else
tl->time_out = TIMER_DELETED;
+#endif
#ifdef EXTRA_DEBUG
DBG("DEBUG: reset_timer (group %d, tl=%p)\n", tl->tg, tl );
#endif
@@ -854,7 +862,7 @@
list= &(timertable->timers[ list_id ]);
lock(list->mutex);
- if (!(new_tl->time_out>TIMER_DELETED)) {
+ if (!(timer_is_active(new_tl))) {
/* make sure I'm not already on a list */
/* remove_timer_unsafe( new_tl ); */
/*
@@ -944,7 +952,7 @@
(_tl)->next_tl = (_tl)->prev_tl = 0;\
DBG("DEBUG: timer routine:%d,tl=%p next=%p\n",\
id,(_tl),tmp_tl);\
- if ((_tl)->time_out>TIMER_DELETED) \
+ if (timer_is_active(_tl)) \
(_handler)( _tl );\
(_tl) = tmp_tl;\
}
Index: timer.h
===================================================================
RCS file: /cvsroot/ser/sip_router/modules/tm/timer.h,v
retrieving revision 1.37
diff -u -r1.37 timer.h
--- timer.h 1 Nov 2004 14:09:09 -0000 1.37
+++ timer.h 28 Nov 2006 14:56:28 -0000
@@ -29,6 +29,8 @@
* --------
* 2003-09-12 timer_link.tg exists only if EXTRA_DEBUG (andrei)
* 2004-02-13 timer_link.payload removed (bogdan)
+ * 2006-11-28 added a special is_deleted flag for the timer_link structure,
+ * added timer_is_deleted() and timer_is_active() (andrei)
*/
@@ -42,9 +44,16 @@
/* timer timestamp value indicating a timer has been
deactivated and shall not be executed
*/
-#define TIMER_DELETED 1
-
+#define TIMER_DELETED_FLAG
+#ifdef TIMER_DELETED_FLAG
+ #define timer_is_deleted(tl) (tl->is_deleted)
+ #define timer_is_active(tl) ( ! tl->is_deleted && tl->time_out!=0)
+#else
+ #define TIMER_DELETED 1
+ #define timer_is_deleted(tl) (tl->time_out==TIMER_DELETED)
+ #define timer_is_active(tl) (tl->time_out>TIMER_DELETED)
+#endif
/* identifiers of timer lists;*/
/* fixed-timer retransmission lists (benefit: fixed timer$
@@ -67,6 +76,9 @@
struct timer_link *prev_tl;
volatile unsigned int time_out;
struct timer *timer_list;
+#ifdef TIMER_DELETED_FLAG
+ volatile char is_deleted;
+#endif
#ifdef EXTRA_DEBUG
enum timer_groups tg;
#endif
More information about the Devel
mailing list