[Serusers] SER data model discussion - developers please read
Jiri Kuthan
jiri at iptel.org
Tue Dec 4 00:33:33 CET 2007
At 19:39 03/12/2007, SIP wrote:
>On the flip side of having to do more modifications for integration, of
>course, is that we get the ABILITY to do more modifications.
>
>SER 0.9.6 wasn't limited, per se, but certain aspects of what we've
>implemented (user-configurable timers, number-based AND uri-based call
>blocking, call forwarding, call return, whitelisting number mode, etc)
>in 0.9.6 has felt like a complete and total hack because of the way it
>has had to be done using a mish mash of AVPs. We're HOPING we can
>forego some of that with the new SER using the new db queries and such,
>as well as being able to better-implement some functionality of other
>RFCs using the more flexible core.
>
>But yes... integration with the new db schema is proving to be a pain in
>the left ventricle if only because it's so totally focused around
>optimisation for SER.
The question to me is whether this pain is really specific to current release
of the data model (which I apparently don't think so) or to do the burden
of migration from the previous. IMO, it is the latter, to be compensated
by extensibility which minimizes future migration burden.
-jiri
>As with all new stuff, there's good and there's bad.
>
>N.
>
>
>X Z wrote:
>> Hello serdev list,
>>
>> Please see the thread below started on serusers list.
>>
>> I do understand the power and design of the new data model and that it
>> does allow the SER Core functionality to be very efficient.
>>
>> In addition, I definitely agree that of all the scenarios below, data
>> duplication is the best of the undesirable options.
>>
>> The undesirable options being:
>> 1. The old data model that didn't support avpairs sufficiently,
>> 2. The new data model that introduces very inefficient joins and ugly
>> SQL queries in order to create a view to integrate with Asterisk or
>> any other third party application
>> 3. The new data model that works very well for SER core functionality
>> PLUS a "subscriber" table that includes user data which may duplicated
>> in the avpairs tables.
>>
>> I wasn't suggesting that the avpairs model was bad, just that the
>> result of the data model as it stands makes it inefficient to
>> integrate with other apps data models (like Asterisk for example.)
>>
>> I might also suggest that the subscriber table comes back into the
>> core code in an expanded form that covers a high percentage of
>> integration use cases, and that triggers get added in to populate the
>> data which would be duplicated. Users can simply opt to not populate
>> the subscriber table at all and the triggers can be disabled as a db
>> installation option.
>>
>> The great advantage to this is that new users don't have to reinvent
>> the wheel when they want to integrate SER with third party apps to
>> create an "overall system" as sip at arcdiv says below.
>>
>> I'd be happy to propose a table structure and triggers if this idea
>> gains traction with the development team. We could begin a discussion
>> in the serusers list as to what people would like to see in this table.
>>
>> I guess I'm suggesting that SER core is great, but coming from the
>> user base, I'd like to see a few more considerations going into how
>> SER will be used. I'm guessing that a vast majority of users will not
>> be using it completely stand alone. Everyone is trying to add
>> value-added services like voicemail, presence, a web interface for
>> users and admin management, etc., etc. Can we come up with a reference
>> model for this and included it in the repository?
>>
>> I think it's very interesting that sip at ardev is commenting that he has
>> to do more modifications to use the new version than he had to do
>> before. I would hope that the development roadmap has improvements in
>> core functionality AND functionality that helps users actually use SER
>> (like simpler integrations and fewer custom mods). Making software
>> simpler to implement always helps encourage adoption and helps
>> increase user base and market share, which supports the success of the
>> project long term.
>>
>> Thanks,
>> Mahatma
>>
>>
>> On Dec 3, 2007 4:17 AM, SIP <sip at arcdiv.com <mailto:sip at arcdiv.com>>
>> wrote:
>>
>> The current data model, while designed to make certain things easier
>> DOES indeed encourage data duplication, as, in order to create a
>> unified
>> system, most of us will opt to duplicate data into more usable tables.
>>
>> When meshing with an overall system, gathering all relevant data for a
>> particular substructure or user by using joins is neither speedy
>> (especially when the tables get HUGE -- and they very much will) nor
>> terribly convenient.
>>
>> Of course, data duplication is hardly a cardinal sin, but there's
>> ALWAYS
>> a trade off between abstraction and actual functionality. I understand
>> the reasons WHY they've chosen the current data model, and to a
>> degree
>> it makes sense for the core SER system, but for meshing SER with other
>> systems, it's god-awful ugly. :)
>>
>> My recommendation, Mahatma, is to AVOID using the new data model for
>> anything other than the most basic of SER functionality, or, if you
>> gather users in the 50-100,000 user range, your user_attrs table
>> is just
>> going to be one ugly, unmanageably large pile of annoyance. Since the
>> user_attrs and domain_attrs are designed in part, from what I can
>> tell,
>> to make selects more rational and to allow basic SIP flexibility,
>> if you
>> don't NEED to keep the data there, don't do it. They could keep
>> modifying the data model to suit everyone's tastes, but right now,
>> it's
>> designed more to ensure that SER works and works well. Want to store
>> user information like a cell phone number and a fax number and a
>> timezone and a flag on whether or not that user has DND enabled or has
>> access to the PSTN, etc, etc? Keep it elsewhere in a more usable
>> table
>> for such purposes.
>>
>> As we've begun our testing with SER 2.0, we've had to modify
>> things more
>> drastically than we modified them in SER 0.9.6, but in ANY
>> environment,
>> you're going to have to modify things to work for you. At least
>> the core
>> system is flexible enough to let you tailor things. :)
>>
>> N.
>>
>>
>> Greger V. Teigre wrote:
>> > Dear Mahatma,
>> > If you want to engage the developers, you will have to subscribe to
>> > serdev, as many of the developers don't follow serusers.
>> >
>> > I have not participated in SER's 2.0 data model, but my initial take
>> > on your suggestion is the following:
>> > With all respect, I believe you may have misunderstood what Tom is
>> > addressing in his discussion. He targets a generic database model
>> > built around attribute value pairs. This is not the case for SER, on
>> > the contrary, SER's new data model is much sounder from a db
>> > perspective (than 0.8 and 0..9) and is built around the uid and
>> did as
>> > unique identifiers. Queries will through joins across the tables
>> > construct the needed data in a very efficient manner, as uid and
>> did
>> > are indexed and where the queries will use uid and did in the where
>> > clause. Without having checked, I assume the tables have been
>> > normalized just as they should (i.e. splitting them up).
>> >
>> > The attribute-value pairs you are referring to are not part of the
>> > core data model (which Tom covers), but rather attributes that
>> may be
>> > loaded and made available in ser.cfg through a query created to
>> > retrieve the attributes-value pairs. Without creating a limited
>> set of
>> > attributes that can be supported in ser.cfg, the generic avpairs
>> > cannot be avoided. However, the queries that retrieve avpairs do
>> not
>> > use the semantic of the avpairs to select which avpairs to load, ALL
>> > avpairs belonging to a specific uid and did are loaded at the
>> same time.
>> >
>> > Also, I'm afraid this statement is wrong: "It's a better idea
>> from a
>> > database architecture and performance perspective to keep adding
>> > columns into that table for data that has a 1 to 1 relationship
>> with a
>> > user."
>> > This is exactly how you should not do it if you have complex data
>> > relationships that need to be represented and retrieved without
>> > duplicating data.
>> >
>> > Does this answer alleviate your fears?
>> > If not, please subscribe to serdev and post your comment there to
>> > engage people closer to the design of the database.
>> > g-)
>> >
>> >
>> >
>> >
>> > X Z wrote:
>> >> Hi All,
>> >> This is specifically for the SER/OpenSER developers, but I'm not a
>> >> serdev list member so I'm posting here.
>> >>
>> >> I've been using SER since version 0.8.X and I'm still running
>> 0.8.14
>> >> production for my company PBX to this day.
>> >>
>> >> I was very excited as version 2 became a release candidate and I
>> >> downloaded it for testing. I was pretty disappointed with one
>> aspect
>> >> of the new data model and I'm requesting that the developers
>> consider
>> >> a further revision on the data model.
>> >>
>> >> Basically, taking all fields out of the subscriber table like
>> >> Last_name, first_name, email, timezone, rpid/asserted identity,
>> etc,
>> >> etc is not the greatest idea. It's a better idea from a database
>> >> architecture and performance perspective to keep adding columns
>> into
>> >> that table for data that has a 1 to 1 relationship with a user,
>> and
>> >> that is common in > 90% of SER's use cases ( i.e. corporate,
>> >> carrier/VSP.) I would suggest adding voicemail_password, and maybe
>> >> every other field that is being added into the default attributes
>> >> script that I saw in CVS recently. If you already know what
>> >> attributes a user has (and they have a 1 to 1 relationship),
>> then its
>> >> far better from a db performance perspective to keep these
>> attributes
>> >> in the user table. I know that the code becomes more
>> complicated, but
>> >> I think it may be a tradeoff worth discussing.
>> >>
>> >> See this discussion (
>> >>
>> http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:10678084117056
>> >> ) between Oracle users and Tom, (an Oracle engineer/architect.)
>> The
>> >> full text of this discussion is very informative and I highly
>> >> recommend people read it through.
>> >>
>> >> Tom's conclusion is that the type of data model being
>> discussed, and
>> >> now being used in SER fails for all but the most trivial of
>> >> applications. Maybe SER *by itself* qualifies as "trivial" from a
>> >> database architect's perspective, but think about things like
>> >> Asterisk integration, which is quite common. You quickly run into
>> >> some very nasty queries . . .
>> >>
>> >> Please note that I am not a software developer nor a database
>> >> engineer, just a user who reads a lot, so I'm open to being the
>> >> ignorant one here, but I thought that this should be discussed
>> among
>> >> users and developers.
>> >>
>> >> Thanks for considering,
>> >> Mahatma
>> >>
>> >>
>> http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:10678084117056
>> <http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:10678084117056>
>> >>
>> <http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:10678084117056>
>> >>
>> >>
>> >> The following is an excerpt from the above link:
>> >>
>> >> Here is a excerpt from my forthcoming book where I talk about
>> this (and show you how
>> >> ugly, hard and inefficient queries against your very flexible
>> model will be)
>> >>
>> >>
>> >>
>> >>
>> >> (2)Do not use Generic Data Models
>> >>
>> >> Frequently I see applications built on a generic data model for
>> "maximum flexibility" or
>> >> applications built in ways that prohibit performance. Many
>> times - these are one in the
>> >>
>> >> same thing! For example, it is well known you can represent any
>> object in a database
>> >> using just four tables:
>> >>
>> >> Create table objects ( oid int primary key, name varchar2(255) );
>> >>
>> >> Create table attributes
>> >>
>> >> ( attrId int primary key, attrName varchar2(255),
>> >> datatype varchar2(25) );
>> >>
>> >> Create table object_Attributes
>> >> ( oid int, attrId int, value varchar2(4000),
>> >> primary key(oid,attrId) );
>> >>
>> >> Create table Links ( oid1 int, oid2 int,
>> >>
>> >> primary key (oid1, oid2) );
>> >>
>> >>
>> >> That's it - no more CREATE TABLE for me! I can fill the
>> attributes table up with rows
>> >> like this:
>> >>
>> >> insert into attributes values ( 1, 'DATE_OF_BIRTH', 'DATE' );
>> >>
>> >> insert into attributes values ( 2, 'FIRST_NAME', 'STRING' );
>> >> insert into attributes values ( 3, 'LAST_NAME', 'STRING' );
>> >> commit;
>> >>
>> >>
>> >> And now I'm ready to create a PERSON record:
>> >>
>> >>
>> >> insert into objects values ( 1, 'PERSON' );
>> >> insert into object_Attributes values( 1, 1, '15-mar-1965' );
>> >> insert into object_Attributes values( 1, 2, 'Thomas' );
>> >> insert into object_Attributes values( 1, 3, 'Kyte' );
>> >>
>> >> commit;
>> >>
>> >> insert into objects values ( 2, 'PERSON' );
>> >> insert into object_Attributes values( 2, 1, '21-oct-1968' );
>> >> insert into object_Attributes values( 2, 2, 'John' );
>> >> insert into object_Attributes values( 2, 3, 'Smith' );
>> >>
>> >> commit;
>> >>
>> >> And since I'm good at SQL, I can even query this record up to
>> get the FIRST_NAME and
>> >> LAST_NAME of all PERSON records:
>> >>
>> >> ops$tkyte at ORA920 > select
>> >> max( decode(attrName, 'FIRST_NAME', value, null )) first_name,
>> >>
>> >> 2 max( decode( attrName, 'LAST_NAME', value, null ) ) last_name
>> >> 3 from objects, object_attributes, attributes
>> >> 4 where attributes.attrName in ( 'FIRST_NAME', 'LAST_NAME' )
>> >>
>> >> 5 and object_attributes.attrId = attributes.attrId
>> >> 6 and object_attributes.oid = objects.oid
>> >> 7 and objects.name <http://objects.name>
>> <http://objects.name> = 'PERSON'
>> >> 8 group by objects.oid
>> >>
>> >> 9 /
>> >>
>> >> FIRST_NAME LAST_NAME
>> >> -------------------- --------------------
>> >> Thomas Kyte
>> >> John Smith
>> >>
>> >>
>> >> Looks great, right? I mean, the developers don't have to create
>> tables anymore, we can
>> >>
>> >> add columns at the drop of a hat (just requires an insert into
>> the ATTRIBUTES table). The
>> >> developers can do whatever they want and the DBA can't stop
>> them. This is ultimate
>> >> "flexibility". I've seen people try to build entire systems on
>> this model.
>> >>
>> >>
>> >> But, how does it perform? Miserably, terribly, horribly. A
>> simple "select first_name,
>> >> last_name from person" query is transformed into a 3-table join
>> with aggregates and all.
>> >> Further, if the attributes are "NULLABLE" - that is, there
>> might not be a row in
>> >>
>> >> OBJECT_ATTRIBUTES for some attributes, you may have to outer
>> join instead of just joining
>> >> which in some cases can remove more optimal query plans from
>> consideration.
>> >>
>> >> Writing queries might look pretty straightforward, but it's
>> impossible to do in a
>> >>
>> >> performant fashion. For example, if we wanted to get everyone
>> that was born in MARCH or
>> >> has a LAST_NAME = 'SMITH', we could simply take the query from
>> above and just wrap an
>> >> inline view around that:
>> >>
>> >>
>> >>
>> >> ops$tkyte at ORA920> select *
>> >> 2 from (
>> >> 3 select
>> >> max(decode(attrName, 'FIRST_NAME', value, null)) first_name,
>> >> 4 max(decode(attrName, 'LAST_NAME', value, null)) last_name,
>> >>
>> >> 5 max(decode(attrName, 'DATE_OF_BIRTH', value, null))
>> >>
>> date_of_birth
>> >> 6 from objects, object_attributes, attributes
>> >> 7 where attributes.attrName
>> >> in ( 'FIRST_NAME',
>> >> 'LAST_NAME',
>> 'DATE_OF_BIRTH' )
>> >> 8 and object_attributes.attrId = attributes.attrId
>> >> 9 and object_attributes.oid = objects.oid
>> >>
>> >> 10 and objects.name <http://objects.name>
>> <http://objects.name> = 'PERSON'
>> >> 11 group by objects.oid
>> >> 12 )
>> >> 13 where last_name = 'Smith'
>> >> 14 or date_of_birth like '%-mar-%'
>> >>
>> >> 15 /
>> >>
>> >> FIRST_NAME LAST_NAME DATE_OF_BIRTH
>> >> -------------------- -------------------- --------------------
>> >> Thomas Kyte 15-mar-1965
>> >> John Smith 21-oct-1968
>> >>
>> >>
>> >> So, it looks "easy" to query, but think about the performance!
>> If you had a couple
>> >> thousand OBJECT records, and a couple tens of thousands of
>> OBJECT_ATTRIBUTES - Oracle
>> >> would have to process the entire inner group by query first and
>> then apply the WHERE
>> >>
>> >> clause.
>> >>
>> >> This is not a made up data model, one that I crafted just to
>> make a point. This is an
>> >> actual data model that I've seen people try to use. Their goal
>> is ultimate flexibility.
>> >> They don't know what OBJECTS they need, they don't know what
>> ATTRIBUTES they will have.
>> >>
>> >> Well - that is what the database was written for in the first
>> place: Oracle implemented
>> >> this thing called SQL to define OBJECTS and ATTRIBUTES and lets
>> you use SQL to query
>> >> them. You are trying to put a generic layer on top of a generic
>> layer - and it fails each
>> >>
>> >> and every time except for the most trivial of applications.
>> >>
>> >>
>> >>
>> ------------------------------------------------------------------------
>> >>
>> >> _______________________________________________
>> >> Serusers mailing list
>> >> Serusers at lists.iptel.org <mailto:Serusers at lists.iptel.org>
>> >> http://lists.iptel.org/mailman/listinfo/serusers
>> <http://lists.iptel.org/mailman/listinfo/serusers>
>> >>
>> >
>> ------------------------------------------------------------------------
>> >
>> > _______________________________________________
>> > Serusers mailing list
>> > Serusers at lists.iptel.org <mailto:Serusers at lists.iptel.org>
>> > http://lists.iptel.org/mailman/listinfo/serusers
>> >
>>
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> Serusers mailing list
>> Serusers at lists.iptel.org
>> http://lists.iptel.org/mailman/listinfo/serusers
>>
>
>_______________________________________________
>Serusers mailing list
>Serusers at lists.iptel.org
>http://lists.iptel.org/mailman/listinfo/serusers
--
Jiri Kuthan http://iptel.org/~jiri/
More information about the sr-users
mailing list