[SR-Users] [sr-dev] dialog module with DB Backend

jay binks jaybinks at gmail.com
Thu Mar 6 09:38:11 CET 2014


I have fixed the remaining "return 1" cases, and also added a little more
logging for when a query retries the connection and still fails (retry
count exceeded).

Please find the latest patch file attached (tested to apply against master
head). Hopefully this is good enough for you to commit.
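
For anyone reading along, here is a minimal standalone sketch of the pattern
the patch is aiming for. It is not the db_cassandra code itself: the names
run_with_retries, attempt, reconnect and insert_ok are made up for
illustration, stderr stands in for LM_ERR, and the -1 error value is an
assumption (the guide only says "0 if everything is OK").

```cpp
#include <cassert>
#include <cstdio>
#include <functional>

// Hypothetical stand-in for a DB API entry point.
// Returns 0 on success and a negative value on failure (assumed convention).
int run_with_retries(const std::function<bool()>& attempt,
                     const std::function<void()>& reconnect,
                     bool auto_reconnect, int max_retries)
{
    assert(max_retries >= 0);
    int retr = 0;
    do {
        if (attempt())
            return 0;      // success is 0, not 1
        reconnect();       // attempt failed: reconnect before the next try
    } while (auto_reconnect && retr++ < max_retries);

    // Only reached when every attempt failed.
    std::fprintf(stderr, "Failed to connect, retries exceeded.\n");
    return -1;
}

// A caller in the style described in this thread: anything other than 0
// is treated as a failed operation.
bool insert_ok(int rc)
{
    return rc == 0;
}
```

With this shape, a driver that returned 1 on success would be seen by such a
caller as a failure even though the row was written, which matches the
"could not add another dialog to db" symptom described further down.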




On 5 March 2014 21:42, Daniel-Constantin Mierla <miconda at gmail.com> wrote:

>  Hello,
>
> can you make the patch for master branch? I just backported two patches
> that were in master branch but not yet in 4.1.
>
> With this occasion, can you review if the other 'return 1' expose the same
> issue? I noticed another one in db_cassa_delete and in db_cassa_query.
>
> Thanks,
> Daniel
>
>
> On 05/03/14 03:57, jay binks wrote:
>
> Just noticed the same thing in db_cassa_delete..
> patch updated to fix both
>
>  Jay
>
>
> On 5 March 2014 12:52, jay binks <jaybinks at gmail.com> wrote:
>
>> Hi All,
>>
>> So I've done what Carlos suggested and swapped my dialog DB over to MySQL
>> rather than Cassandra.
>> Everything worked 100% as you would expect.
>>
>> Right, so the issue is db_cassandra.
>>
>> I started testing and going through the code.
>>
>> I found I had these log lines, which was interesting and concerning:
>> update_dialog_dbinfo_unsafe(): could not add another dialog to db
>> I had been ignoring them, because the dialog was in the DB and I figured
>> I would come back and figure that out later.
>>
>> but this seems to have been key to this whole thing.
>>
>> It turns out that in dlg_db_handler.c, dialog_dbf.insert was getting a 1
>> back from db_cassandra on the insert and a 0 back from MySQL... WTF.. ok.
>>
>> so I trace into db_cassa_insert which calls db_cassa_modify ..
>> around line 1210 I can see this ..
>>
>> CON_CASSA(_h)->con->batch_mutate(CFMap, oac::ConsistencyLevel::ONE);
>> return 1;
>>
>> wrapped in a try / catch block..
>> seems db_cassandra wants to return 1 for success but kamailio ( or dialog
>> module at least ) expects 0 for success .
>>
>> So I changed that to return 0 and re-tested.
>> Everything works as expected: "could not add another dialog to db"
>> stops coming up on my console,
>> and dialogs are removed when calls hang up.
>>
>> It seems this one thing is enough to screw up dialogs in Cassandra (and
>> who knows what else).
>> This is the reason for my email though: if we simply change that to 0,
>> what else may break!?
>>
>> However, http://www.asipto.com/pub/kamailio-devel-guide/#c09f_insert
>> clearly states "0 if everything is OK",
>> so this is clearly a bug that needs fixing.
>>
>> Can I get someone with more experience to test this for me and possibly
>> apply the attached patch !?
>>
>>  Jay
>>
>> On 25 February 2014 05:58, Daniel-Constantin Mierla <miconda at gmail.com>
>> wrote:
>> >
>> > Hello,
>> >
>> > I pushed some patches to the master branch in order to remove the
>> > dialog from its associated profiles when it gets in terminated state. I
>> > encountered such issue (not that) recently, but I haven't gotten the time
>> > to get to it before.
>> >
>> > Then, the second patch is to not add dialogs in profiles when loading
>> > from database and the state is terminated (5).
>> >
>> > Here are the links to the patches:
>> >
>> > - http://git.sip-router.org/cgi-bin/gitweb.cgi/sip-router/?a=commit;h=edf61acb57ed5e8ee0ca9ec1f796e43ce993be48
>> > - http://git.sip-router.org/cgi-bin/gitweb.cgi/sip-router/?a=commit;h=9b88eb7ee2d243882383a44f601baa21fd679cd5
>> >
>> > Should be straightforward to cherry pick to 4.1 (even 4.0 I expect). If
>> > you test and all goes fine, I will backport -- here I had no time for real
>> > testing.
>> >
>> > I plan also to not add the dialogs in memory for state terminated, but
>> > destroy them at db load time. But this needs a bit of a review, to be sure
>> > that all necessary callbacks are executed.
>> >
>> > On the other hand, if the dialogs are not removed from db, might be an
>> > issue with the database driver (cassandra in this case, which is rather new
>> > module). Do you get any syslog errors from kamailio or database server? I
>> > expect that people would have reported such issue for other database
>> > engines so far. Still it might be an issue, just that was not noticed...
>> >
>> > Cheers,
>> > Daniel
>> >
>> > On 24/02/14 11:19, jay binks wrote:
>> >
>> > So, poking around the code for the dialog module...
>> > I'm not sure what I'm missing here.
>> >
>> >
>> > get_profile_size doesn't care about the state of a dialog, so you get
>> > ALL dialogs that are in the hash table
>> > (which is interesting if you want to use the dialog module to enforce
>> > channel limits etc.).
>> >
>> > So you go... OK... kamailio only expects to have "ACTIVE" dialogs in
>> > the hash table... cool.
>> > Let's assume that to be the case.
>> >
>> > But then in dlg_db_handler.c, load_dialog_info_from_db loads all
>> > dialogs from the DB, regardless of state.
>> > So all dialogs in the DB (ones that didn't get deleted yet, but were
>> > in state 5) get re-created in kamailio
>> > upon startup.
>> >
>> > What this means is...
>> > (assume starting with an empty DB)
>> >
>> > I start kamailio, make some calls... they get synced to the DB.
>> > I end the calls; kamailio removes them from the dialog module's internal
>> > hash, but the sync to DB hasn't happened yet.
>> >
>> > I kill kamailio (or it crashes, whatever)... restart kamailio and it
>> > re-loads all those dialogs
>> > and thinks they are still active calls.
>> >
>> > I'm SURE I'm missing something here, because it seems to be VERY common
>> > to use dialogs for channel limiting,
>> > maybe not so much using a Cassandra DB behind the scenes, but as of yet
>> > I have not found anything that makes me think this is db_cassandra
>> > misbehaving.
>> >
>> > If I'm wrong, please point me in the right direction.
>> >
>> > Jay
>> >
>> >
>> >
>> >
>> > On 24 February 2014 17:54, jay binks <jaybinks at gmail.com> wrote:
>> >>
>> >> Am I REALLY the only person who has ever run into this !?
>> >>
>> >>
>> >> On 19 February 2014 14:08, jay binks <jaybinks at gmail.com> wrote:
>> >>>
>> >>> Hi all, I'm using the dialog module with the db_cassandra backend.
>> >>> I don't believe this issue is related to Cassandra, but it's worth
>> >>> mentioning anyway.
>> >>>
>> >>> So... I run kamailio, make calls, and see dialogs in the DB,
>> >>> and I can use "kamctl mi dlg_list" and see that dialogs go away when
>> >>> I hang up a call.
>> >>>
>> >>> When I query the DB backend, I still see the dialogs, but they have a
>> >>> state of 5.
>> >>> I initially thought this was a bug, but it seems dialogs in state 5
>> >>> get cleaned up after a period,
>> >>> so I moved on.
>> >>>
>> >>> Now, let's restart kamailio.
>> >>> Kamailio loads all dialogs on startup; after kamailio starts I call
>> >>> "kamctl mi dlg_list" again, and it shows all my dialogs from the DB. They
>> >>> DO show as "State 5",
>> >>> but for some reason these dialogs appear to stick around for a long
>> >>> time, and the bigger issue it causes me is that my channel limiting (using
>> >>> get_profile_size) seems to consider these dialogs (in state 5) as being
>> >>> active calls.
>> >>>
>> >>> Please someone point me in the right direction... :)
>> >>>
>> >>> What am I doing wrong?
>> >>> (Or is this a bug somewhere?)
>> >>>
>> >>> Sincerely
>> >>>
>> >>> Jay
>> >>
>> >>
>> >>
>> >>
>> >> --
>> >> Sincerely
>> >>
>> >> Jay
>> >
>> >
>> >
>> >
>> > --
>> > Sincerely
>> >
>> > Jay
>> >
>> >
>> > --
>> > Daniel-Constantin Mierla - http://www.asipto.com
>> > http://twitter.com/#!/miconda - http://www.linkedin.com/in/miconda
>> >
>> >
>>
>>
>>
>> --
>> Sincerely
>>
>> Jay
>>
>
>
>
>  --
> Sincerely
>
> Jay
>
>
> --
> Daniel-Constantin Mierla - http://www.asipto.com
> http://twitter.com/#!/miconda - http://www.linkedin.com/in/miconda
>
>


-- 
Sincerely

Jay
-------------- next part --------------
diff --git a/modules/db_cassandra/dbcassa_base.cpp b/modules/db_cassandra/dbcassa_base.cpp
index e9d3a32..285fe16 100644
--- a/modules/db_cassandra/dbcassa_base.cpp
+++ b/modules/db_cassandra/dbcassa_base.cpp
@@ -561,7 +561,7 @@ ColumnVecPtr cassa_translate_query(const db1_con_t* _h, const db_key_t* _k,
 			}
 			dbcassa_reconnect(CON_CASSA(_h));
 		} while(cassa_auto_reconnect && retr++ < cassa_retries);
-
+		LM_ERR("Failed to connect, retries exceeded.\n");
 	} catch (const oac::InvalidRequestException ir) {
 		LM_ERR("Failed Invalid query request: %s\n", ir.why.c_str());
 	} catch (const at::TException &tx) {
@@ -914,7 +914,7 @@ int db_cassa_query(const db1_con_t* _h, const db_key_t* _k, const db_op_t* _op,
 done:
 	*_r = db_res;
 	LM_DBG("Exited with success\n");
-	return 1;
+	return 0;
 
 error:
 	if(db_res)
@@ -1060,14 +1060,14 @@ int db_cassa_modify(const db1_con_t* _h, const db_key_t* _k, const db_val_t* _v,
 			if(CON_CASSA(_h)->con) {
 				try{
 					CON_CASSA(_h)->con->batch_mutate(CFMap, oac::ConsistencyLevel::ONE);
-					return 1;
+					return 0;
 				}  catch (const att::TTransportException &tx) {
 					LM_ERR("Failed to query: %s\n", tx.what());
 				}
 			}
 			dbcassa_reconnect(CON_CASSA(_h));
 		} while (cassa_auto_reconnect && retr++ < cassa_retries);
-
+		LM_ERR("Failed to connect, retries exceeded.\n");
 	} catch (const oac::InvalidRequestException ir) {
 		LM_ERR("Failed Invalid query request: %s\n", ir.why.c_str());
 	} catch (const at::TException &tx) {
@@ -1188,13 +1188,14 @@ int db_cassa_delete(const db1_con_t* _h, const db_key_t* _k, const db_op_t* _o,
 				if(CON_CASSA(_h)->con) {
 					try {
 						cassa_client->remove(row_key, cp, (int64_t)time(0), oac::ConsistencyLevel::ONE);
-						return 1;
+						return 0;
 					} catch  (const att::TTransportException &tx) {
 							LM_ERR("Failed to query: %s\n", tx.what());
 					}
 				}
 				dbcassa_reconnect(CON_CASSA(_h));
 			} while(cassa_auto_reconnect && retr++ < cassa_retries);
+			LM_ERR("Failed to connect, retries exceeded.\n");
 		} else {
 
 			if(!seckey_len) {
@@ -1247,7 +1248,7 @@ int db_cassa_delete(const db1_con_t* _h, const db_key_t* _k, const db_op_t* _o,
 				if(CON_CASSA(_h)->con) {
 					try {
 						cassa_client->batch_mutate(CFMap, oac::ConsistencyLevel::ONE);
-						return 1;
+						return 0;
 					} catch  (const att::TTransportException &tx) {
 							LM_ERR("Failed to query: %s\n", tx.what());
 					}
@@ -1255,7 +1256,7 @@ int db_cassa_delete(const db1_con_t* _h, const db_key_t* _k, const db_op_t* _o,
 				dbcassa_reconnect(CON_CASSA(_h));
 			} while(cassa_auto_reconnect && retr++ < cassa_retries);
 		}
-		return 1;
+		LM_ERR("Failed to connect, retries exceeded.\n");
 	} catch (const oac::InvalidRequestException ir) {
 		LM_ERR("Invalid query: %s\n", ir.why.c_str());
 	} catch (const at::TException &tx) {

