[Devel] Postgres reconnect patch

Michael Ulitskiy mdu113 at acedsl.com
Fri Aug 5 17:51:35 CEST 2005


Sure, it's necessary :)
But I suspect that the problem is in the acc module. When a query fails, the postgres module
returns an error to the caller, and I guess this is what the acc module doesn't like.
I don't use acc, but with avpops, uri_db, auth_db and usrloc I haven't managed to crash it.
At least so far :)
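
To show what I mean by a caller that copes with the error, here is a minimal
sketch. It is not taken from acc or from the postgres module; demo_db_insert()
and demo_log_call() are hypothetical names, and the real db API is assumed to
signal failure with a negative return value in the same way.

```c
#include <stdio.h>

/* Hypothetical stand-in for a db-layer insert: returns 0 on success and
 * -1 on failure (for example when the table does not exist). */
static int demo_db_insert(int table_exists)
{
    return table_exists ? 0 : -1;
}

/* Hypothetical caller: reports the failure and carries on, without
 * touching any result or connection state after the error. */
static int demo_log_call(int table_exists)
{
    if (demo_db_insert(table_exists) < 0) {
        fprintf(stderr, "accounting insert failed, record dropped\n");
        return -1;
    }
    return 0;
}
```

A module written this way loses the record when the table is missing but
keeps running.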

Michael

On Friday 05 August 2005 11:25 am, Klaus Darilion wrote:
> Hi!
> 
> The patched openser still crashes if the acc table does not exist:
> 
>   8(14200) PG[367] submit_query query 'insert into acc 
> (from_uri,to_uri,sip_method,i_uri,o_uri,sip_from,sip_callid,sip_to,sip_status,username,totag,fromtag,domain,time 
> ) values 
> ('sip:klaus at enum.at','sip:klaus.darilion at nic.at43.at','INVITE','sip:klaus.darilion at nic.at43.at;transport=udp','sip:10.10.0.76:14528','Klaus 
> Darilion enum 
> eyebeam<sip:klaus at enum.at>;tag=d96e8a16','c34fcc4e3b141d18','<sip:klaus.darilion at nic.at43.at>;tag=bf99c5e5ef87440cadfbabf7b1e47a7c','200','klaus','bf99c5e5ef87440cadfbabf7b1e47a7c','d96e8a16','enum.at','2005-08-05 
> 15:21:24')', result 'ERROR:  Relation "acc" does not exist
> ...
> ...
> 19(14218)  0(14186) child process 14195 exited by a signal 11
>   0(14186) core was generated
>   0(14186) INFO: terminating due to SIGCHLD
>   3(14191) INFO: signal 15 received
> 
> 
> Still some code review is necessary ;-)
> 
> klaus
> 
> 
> Michael Ulitskiy wrote:
> > On Friday 05 August 2005 10:04 am, you wrote:
> > 
> >>Michael Ulitskiy wrote:
> >>
> >>>Hi Klaus,
> >>>
> >>>I've seen the problem described by Klaus too, but first of all I was 
> >>>concerned that postgres module cannot reconnect to db after connection 
> >>>failure. It crashed openser.
> >>
> >>I tried the CVS version and stopped postgres during operation. Then, openser 
> >>failed to update the location table (of course) but did not crash. 
> >>Maybe it depends on the module which tries to access the database. 
> >>Nevertheless, reconnect failed after postgres returned.
> >>
> >>Then I applied your patch. Now, reconnect works.
> >>However, the debug message looks a bit strange when the postgres 
> >>module tries to reconnect while the db is still down:
> >>
> >>   7(32748) PG[142] connect_db could not connect to server: ÿÿÿÿ
> >>
> >>                                                            ^^^^
> > 
> >  
> > Well, it tries to reconnect every time openser wants to do a query. If db is still down
> > then the message looks ok.
> > BTW as far as I see the problem you described is also gone. I did a couple of tests -
> > removing permissions for openser user and dropping a view it tries to select from
> > and it didn't crash. It got back to normal after I restored the changes.
> > 
> > Michael
> > 
> >  
> > 
> >>regards,
> >>klaus
> >>
> >>
> >>>I've done some research on the module source code and I believe
> >>>I've found a logical mistake in it.
> >>>The issue is that the parse_sql_url() function is given the only copy
> >>>of the sql url and corrupts it. So if the connection fails and the module
> >>>tries to reparse the url, it fails because CON_SQLURL(_h) was corrupted
> >>>by the first parsing.
> >>>I've created a simple patch that corrects the problem. It introduces
> >>>a temporary buffer holding the sql url string for parse_sql_url() to work
> >>>on. It also ensures that the original sql url is not deleted from the
> >>>connection structure until db_close() is called.
> >>>Patch is attached.
> >>>Also, if you comment out the following in the db_init() function in dbase.c:
> >>>
> >>>/*
> >>>        if (connect_db(res) < 0)
> >>>        {
> >>>                PLOG("db_init", "Error while trying to open database, FATAL\n")
> >>>                aug_free(res);
> >>>                return((db_con_t *) 0);
> >>>        }
> >>>*/
> >>>
> >>>you'll get a "delay connect until used" feature that can be useful given the
> >>>lack of a connection pool for postgres.
> >>>Could one of the developers please review this patch?
> >>>
> >>>Michael
> >>>
> >>>
> >>>On Thursday 04 August 2005 11:39 am, Klaus Darilion wrote:
> >>>
> >>>
> >>>>Hi!
> >>>>
> >>>>I had several situations in which openser crashed if the DB lookup 
> >>>>fails, e.g:
> >>>>table does not exist (acc module)
> >>>>wrong SQL query (lcr module)
> >>>>
> >>>>Whereas in some cases (wrong table permissions, avpops module) openser 
> >>>>keeps running.
> >>>>
> >>>>I do not know if these problems also occur with mysql. If not, the 
> >>>>postgresql module would really need some review.
> >>>>
> >>>>regards,
> >>>>klaus
> >>>>
> >>
> >>
> > 
> > 
> 
> 
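
For anyone reading along without the attachment, the idea behind the patch
quoted above can be sketched roughly as follows. This is not the patch itself:
struct demo_con, demo_parse_in_place(), demo_connect(), demo_init() and
demo_close() are hypothetical names, the URL format is simplified to
scheme://host/db, and the server_up flag only simulates whether the database
is reachable. The point is that the parser only ever sees a scratch copy, and
the stored URL lives until the connection is closed.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical connection record, not the module's real structure. */
struct demo_con {
    char *sqlurl;    /* pristine URL, kept until demo_close() */
    char  host[64];  /* copied out of the scratch buffer */
    int   connected; /* 1 while the simulated link is up */
};

/* Stand-in for a destructive parser: it overwrites the '/' after the
 * host with '\0', so the buffer it is given is no longer a full URL. */
static int demo_parse_in_place(char *url, char **host, char **db)
{
    char *p = strstr(url, "://");
    if (p == NULL)
        return -1;
    *host = p + 3;
    p = strchr(*host, '/');
    if (p == NULL)
        return -1;
    *p = '\0';
    *db = p + 1;
    return 0;
}

/* Parse a scratch copy on every attempt; c->sqlurl is never modified.
 * server_up simulates whether the database is reachable. */
static int demo_connect(struct demo_con *c, int server_up)
{
    char *host, *db, *tmp;
    size_t len = strlen(c->sqlurl) + 1;
    int rc = -1;

    tmp = malloc(len);
    if (tmp == NULL)
        return -1;
    memcpy(tmp, c->sqlurl, len);
    if (demo_parse_in_place(tmp, &host, &db) == 0 && server_up) {
        strncpy(c->host, host, sizeof(c->host) - 1);
        c->host[sizeof(c->host) - 1] = '\0';
        c->connected = 1;
        rc = 0;
    }
    free(tmp);
    return rc;
}

/* Keep a private copy of the URL for the lifetime of the connection. */
static int demo_init(struct demo_con *c, const char *url)
{
    size_t len = strlen(url) + 1;

    c->sqlurl = malloc(len);
    if (c->sqlurl == NULL)
        return -1;
    memcpy(c->sqlurl, url, len);
    c->host[0] = '\0';
    c->connected = 0;
    return 0;
}

/* Only here is the stored URL released. */
static void demo_close(struct demo_con *c)
{
    free(c->sqlurl);
    c->sqlurl = NULL;
    c->connected = 0;
}
```

With this layout a failed demo_connect() can simply be retried on the next
query, because the stored URL is still intact.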

-- 
See you later,
                    Michael


