[OpenSER-Devel] Re: [OpenSER-Users] New db_berkeley module

Fri Oct 12 17:56:49 CEST 2007

On Friday 12 October 2007, Will Quan wrote:
> Henning, Thanks for working through this. I can definitely understand
> consistency across the DB modules is important architecturally.
> I have been thinking about this all day, and I don't think I have an 'easy'
> response to the issue of the row id as a primary key in db_berkeley.

Hello William.

> The berkeley database is not relational and the extra burden of
> maintaining an artificial key (id) for each row will not actually
> improve performance as it would in a relational database.
> I am not an expert in DB internals, so I'll just explain things as I
> understand them. We need to hash this out :)
> The api for querying in berkeley is either:
> 1. get() - where you provide the key, and in our case it must be
> lexicographically equal in order to find a result. I believe this is the
> 'natural join'.

Thank you for the detailed explanation, now I understand the problem in much
more detail.

> 2. cursor() - where you iterate over each row, do the join on any
> columns you want, and create a result set.
> As implemented, without the id columns, the queries are implemented with
> get() which implies a natural join, or exact string equality on the
> 'key', which is in most cases a composite key comprised of the
> METADATA_KEY columns separated by a delimiter.

Is it not possible to use only one key of the set instead of all of them? E.g.
use only the username, or the id?

> Since the underlying 
> access method is db_hash, the query runtime is constant.
> I think if we change things in the bdb schema to use the id column as
> part of the composite key, we will be limiting ourselves to using cursor
> based queries, since we will not know the id until after the first query.

Well, if I understand it correctly, this would be rather slow, since it means
iterating over all the rows. So this is not a good solution.
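
To make the difference between the two access paths concrete, here is a small
Python sketch. It is only a model and not the db_berkeley code: a plain dict
stands in for a hash-based bdb table, and the "|" delimiter, the column names
and the sample rows are all assumptions made up for illustration.

```python
# Illustrative model only: a plain dict stands in for a Berkeley DB hash table.
DELIM = "|"  # assumed delimiter between the key columns


def make_key(*columns):
    """Join the key columns into one composite key string."""
    return DELIM.join(columns)


# Hypothetical table keyed on (username, domain); the rows are made up.
table = {
    make_key("alice", "example.org"): {"id": 1, "password": "secret1"},
    make_key("bob", "example.org"): {"id": 2, "password": "secret2"},
    make_key("alice", "example.net"): {"id": 3, "password": "secret3"},
}


def get(username, domain):
    """get()-style lookup: needs every key column and an exact match."""
    return table.get(make_key(username, domain))


def cursor_query(username):
    """Cursor-style lookup: visit every row and compare a single column."""
    rows = []
    for key, value in table.items():
        key_username, _key_domain = key.split(DELIM)
        if key_username == username:
            rows.append(value)
    return rows


full_match = get("alice", "example.org")   # one hash probe
partial_get = table.get("alice")           # a partial key is just another key
partial_scan = cursor_query("alice")       # work grows with the table size
```

In this model a lookup with only the username finds nothing, because the hash
is computed over the whole composite string. Matching on a single column
requires a walk over every row.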

> Aside, my understanding is that future development would implement
> queries that fetch and store the oid such that subsequent queries would
> perform queries in that table with a 'WHERE id = oid' clause. (Please
> let me know if this assumption is incorrect.)

I don't think any current module uses an id query. Daniel or Bogdan, is
this planned for the future, and in what timeframe?
I remember a discussion some months ago that this was the reason for the
introduction of the id columns.

> As I sit here, I think I 
> would have to create a secondary bdb database for each table that
> requires the id column. The key would be a unique integer id, and the
> value would point to the row of the 'real' table. This would probably
> work but it does add a layer of complexity that we take for granted in
> the relational databases. Today, these secondary databases are not
> implemented, and there are other issues not discussed like the concept
> of uniqueness of the ids, etc. However, to be honest I don't know if I
> can get all this secondary db stuff working in the next 2 months.

As long as no one is using this access method, you don't need to hurry in this
area at the moment, in my opinion. Using DB->associate (from Berkeley DB)
does not sound so difficult, but I'm not a bdb expert. There are probably many
other issues that need to be worked out.
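
As a rough illustration of the secondary database idea, here is a Python
sketch with two dicts, one for the 'real' table and one for an id index that
points back into it. It does not call the Berkeley DB API: with DB->associate
the library keeps the secondary up to date through a callback, whereas here
the bookkeeping is done by hand. The function names, the key format and the
naive id counter are assumptions for illustration only.

```python
# Illustrative model only: two dicts stand in for a primary and a secondary
# Berkeley DB database. Nothing here calls the real DB->associate API.

primary = {}    # composite key -> row
id_index = {}   # secondary: integer id -> composite key of the primary row
next_id = 1     # naive id allocator; ignores concurrency and restarts


def insert(key, row):
    """Store a row and record its id in the secondary index."""
    global next_id
    if key in primary:
        raise KeyError(f"duplicate key: {key}")
    row_id = next_id
    next_id += 1
    primary[key] = {"id": row_id, **row}
    id_index[row_id] = key
    return row_id


def delete(key):
    """Remove a row and its secondary entry together."""
    row = primary.pop(key)
    del id_index[row["id"]]


def get_by_id(row_id):
    """'WHERE id = ...' style lookup: two constant-time probes, no scan."""
    key = id_index.get(row_id)
    return None if key is None else primary[key]


alice_id = insert("alice|example.org", {"password": "secret1"})
bob_id = insert("bob|example.org", {"password": "secret2"})
delete("alice|example.org")
```

Even in this toy version every insert and delete has to touch both structures,
and the uniqueness of the ids depends entirely on how the counter is managed.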

It's not possible to implement this for 1.3 anyway, the code is frozen.

> Please do not take this as me rejecting your ideas, but rather full
> disclosure that making db_berkeley more 'relational' comes at the cost of
> additional complexities that are not implemented yet.

No problem, when I wrote that mail I did not fully understand the implications
of this problem.

> Aside, I started looking at the code for the openserctl cmds today, and
> I think I need to add some fifo cmds to the modules since openser is
> actually running at the time the openserctl util is being invoked. This
> means the DBs are open and some data may not be committed to disk, etc. I
> thought I'd use the carrierroute module as the starting example for
> implementing such fifo commands, but I need a few more days to get all
> those commands implemented/tested. I will continue on this path over the
> next few days, such that there will be parity between the db modules
> from the perspective of the openserctl cmds.

Ok, so you want to implement some kind of "flush data" parameter?

> If you prefer discussions in this working group that is good, but I am
> also available via sip if you want to discuss by voice. Just so you know
> it's an option.

I can give you my company number if you like, but I'm also available on the
openser IRC channel for private chat throughout the day (German time).

Cheers,

Henning