[Devel] Postgres reconnect patch

Klaus Darilion klaus.mailinglists at pernau.at
Fri Aug 5 17:25:07 CEST 2005


The patched openser still crashes if the acc table does not exist:

  8(14200) PG[367] submit_query query 'insert into acc 
) values 
('sip:klaus at enum.at','sip:klaus.darilion at nic.at43.at','INVITE','sip:klaus.darilion at nic.at43.at;transport=udp','sip:','Klaus 
Darilion enum 
eyebeam<sip:klaus at enum.at>;tag=d96e8a16','c34fcc4e3b141d18','<sip:klaus.darilion at nic.at43.at>;tag=bf99c5e5ef87440cadfbabf7b1e47a7c','200','klaus','bf99c5e5ef87440cadfbabf7b1e47a7c','d96e8a16','enum.at','2005-08-05 
15:21:24')', result 'ERROR:  Relation "acc" does not exist
19(14218)  0(14186) child process 14195 exited by a signal 11
  0(14186) core was generated
  0(14186) INFO: terminating due to SIGCHLD
  3(14191) INFO: signal 15 received

Some code review is still necessary ;-)


Michael Ulitskiy wrote:
> On Friday 05 August 2005 10:04 am, you wrote:
>>Michael Ulitskiy wrote:
>>>Hi Klaus,
>>>I've seen the problem described by Klaus too, but first of all I was 
>>>concerned that postgres module cannot reconnect to db after connection 
>>>failure. It crashed openser.
>>I tried the CVS version and stopped postgres during operation. Then, openser 
>>failed to update the location table (of course) but did not crash. 
>>Maybe it depends on the module which tries to access the database. 
>>Nevertheless, reconnect failed after postgres returned.
>>Then I applied your patch. Now, reconnect works.
>>However, the debug message looks a little strange when the postgres 
>>module tries to reconnect while the db is still down:
>>   7(32748) PG[142] connect_db could not connect to server: ÿÿÿÿ
>>                                                            ^^^^
> Well, it tries to reconnect every time openser wants to do a query. If db is still down
> then the message looks ok.
> BTW as far as I see the problem you described is also gone. I did a couple of tests -
> removing permissions for openser user and dropping a view it tries to select from
> and it didn't crash. It got back to normal after I restored the changes.
> Michael
>>>I've done some research on the module source code and I  believe
>>>I've found a logical mistake in it.
>>>The issue is that the parse_sql_url() function is supplied with the only copy
>>>of the sql url and corrupts it. So if the connection fails and the module
>>>tries to reparse the url, it fails because CON_SQLURL(_h) was corrupted by the
>>>first parse.
>>>I've created a simple patch that corrects the problem. What it does is
>>>introduce a temporary buffer with the sql url string for parse_sql_url() to work
>>>on. It also makes sure that the original sql url is not deleted from the
>>>connection structure until db_close() is called.
>>>Patch is attached.
>>>Also if in dbase.c in db_init() function you comment out the following:
>>>        if (connect_db(res) < 0)
>>>        {
>>>                PLOG("db_init", "Error while trying to open database, FATAL\n");
>>>                aug_free(res);
>>>                return((db_con_t *) 0);
>>>        }
>>>you'll get a "delay connect until used" feature that can be useful due to the
>>>lack of a connection pool for postgres.
>>>Could one of the developers please review this patch?
>>>On Thursday 04 August 2005 11:39 am, Klaus Darilion wrote:
>>>>I had several situations in which openser crashed when a DB lookup 
>>>>failed, e.g.:
>>>>table does not exist (acc module)
>>>>wrong SQL query (lcr module)
>>>>Whereas in some cases (wrong table permissions, avpops module) openser 
>>>>keeps running.
>>>>I do not know if these problems also occur with mysql. If not, the 
>>>>postgresql module really needs some review.
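
For readers without the attachment: the idea behind the patch described above
(let the parser work on a temporary copy so the stored sql url survives for a
later reconnect) can be sketched roughly as follows. This is not the attached
patch and not the module's code. parse_url_in_place(), parse_url_copy(),
struct url_parts and the simplified "scheme://user@host/db" layout are all
hypothetical names and assumptions made for illustration only.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type: pointers into the buffer that was parsed. */
struct url_parts {
    const char *user;
    const char *host;
    const char *db;
};

/* Hypothetical destructive parser, assuming "scheme://user@host/db".
 * It overwrites '@' and '/' with '\0', so the buffer it is given can
 * no longer be parsed a second time. */
static int parse_url_in_place(char *url, struct url_parts *out)
{
    char *p = strstr(url, "://");
    if (!p) return -1;
    p += 3;

    char *at = strchr(p, '@');
    if (!at) return -1;

    char *slash = strchr(at + 1, '/');
    if (!slash) return -1;

    *at = '\0';      /* terminates the user part */
    *slash = '\0';   /* terminates the host part */

    out->user = p;
    out->host = at + 1;
    out->db = slash + 1;
    return 0;
}

/* Parse a private copy instead of the caller's string. The original url
 * stays intact and can be parsed again on reconnect. On success *scratch
 * owns the copy that the url_parts pointers refer to; the caller frees
 * it once the parts are no longer needed. */
static int parse_url_copy(const char *url, char **scratch,
                          struct url_parts *out)
{
    size_t n = strlen(url) + 1;
    char *copy = malloc(n);
    if (!copy) return -1;
    memcpy(copy, url, n);

    if (parse_url_in_place(copy, out) < 0) {
        free(copy);
        return -1;
    }
    *scratch = copy;
    return 0;
}
```

With this shape, calling parse_url_copy() twice on the same stored url
succeeds both times, whereas a second parse_url_in_place() on the same buffer
fails because the separators are gone.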

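The "delay connect until used" behaviour, together with the reconnect attempt
on every query mentioned in the thread, could look roughly like the sketch
below. It only simulates a database with a flag and does not use libpq or the
module's real functions. struct lazy_con, try_connect() and run_query() are
hypothetical names chosen for this example.

```c
#include <stdio.h>

/* Hypothetical connection handle. server_up points at a flag that
 * stands in for "is the database reachable right now". */
struct lazy_con {
    const int *server_up;   /* simulated server state, owned by caller */
    int connected;          /* 1 while we believe the link is open */
    int connect_attempts;   /* how often a connect was tried */
};

/* Try to open the (simulated) connection. */
static int try_connect(struct lazy_con *c)
{
    c->connect_attempts++;
    c->connected = *c->server_up ? 1 : 0;
    return c->connected ? 0 : -1;
}

/* Run a query. No connection is opened at init time; the first query
 * opens it, and every later query retries if the link is down. A
 * failure is reported to the caller instead of being treated as fatal. */
static int run_query(struct lazy_con *c, const char *sql)
{
    if (!c->connected && try_connect(c) < 0) {
        fprintf(stderr, "database unreachable, query skipped: %s\n", sql);
        return -1;
    }
    if (!*c->server_up) {
        /* The link dropped since the last query: remember that, so the
         * next query attempts a reconnect. */
        c->connected = 0;
        fprintf(stderr, "connection lost during query: %s\n", sql);
        return -1;
    }
    return 0;   /* a real module would submit the query here */
}
```

The point of the design is that a failed connect only fails the query that
needed it, so the server keeps running and recovers once the database is back.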