Hello serdev list,
Please see the thread below started on serusers list.
I do understand the power and design of the new data model and that it
does allow the SER Core functionality to be very efficient.
In addition, I definitely agree that of all the scenarios below, data
duplication is the best of the undesirable options.
The undesirable options being:
1. The old data model that didn't support avpairs sufficiently,
2. The new data model that introduces very inefficient joins and ugly
SQL queries in order to create a view to integrate with Asterisk or
any other third party application
3. The new data model that works very well for SER core functionality
PLUS a "subscriber" table that includes user data which may duplicated
in the avpairs tables.
I wasn't suggesting that the avpairs model was bad, just that the
result of the data model as it stands makes it inefficient to
integrate with other apps data models (like Asterisk for example.)
I might also suggest that the subscriber table comes back into the
core code in an expanded form that covers a high percentage of
integration use cases, and that triggers get added in to populate the
data which would be duplicated. Users can simply opt to not populate
the subscriber table at all and the triggers can be disabled as a db
installation option.
The great advantage to this is that new users don't have to reinvent
the wheel when they want to integrate SER with third party apps to
create an "overall system" as sip@arcdiv says below.
I'd be happy to propose a table structure and triggers if this idea
gains traction with the development team. We could begin a discussion
in the serusers list as to what people would like to see in this table.
I guess I'm suggesting that SER core is great, but coming from the
user base, I'd like to see a few more considerations going into how
SER will be used. I'm guessing that a vast majority of users will not
be using it completely stand alone. Everyone is trying to add
value-added services like voicemail, presence, a web interface for
users and admin management, etc., etc. Can we come up with a reference
model for this and included it in the repository?
I think it's very interesting that sip@ardev is commenting that he has
to do more modifications to use the new version than he had to do
before. I would hope that the development roadmap has improvements in
core functionality AND functionality that helps users actually use SER
(like simpler integrations and fewer custom mods). Making software
simpler to implement always helps encourage adoption and helps
increase user base and market share, which supports the success of the
project long term.
Thanks,
Mahatma
On Dec 3, 2007 4:17 AM, SIP <sip(a)arcdiv.com <mailto:sip@arcdiv.com>>
wrote:
The current data model, while designed to make certain things easier
DOES indeed encourage data duplication, as, in order to create a
unified
system, most of us will opt to duplicate data into more usable tables.
When meshing with an overall system, gathering all relevant data for a
particular substructure or user by using joins is neither speedy
(especially when the tables get HUGE -- and they very much will) nor
terribly convenient.
Of course, data duplication is hardly a cardinal sin, but there's
ALWAYS
a trade off between abstraction and actual functionality. I understand
the reasons WHY they've chosen the current data model, and to a
degree
it makes sense for the core SER system, but for meshing SER with other
systems, it's god-awful ugly. :)
My recommendation, Mahatma, is to AVOID using the new data model for
anything other than the most basic of SER functionality, or, if you
gather users in the 50-100,000 user range, your user_attrs table
is just
going to be one ugly, unmanageably large pile of annoyance. Since the
user_attrs and domain_attrs are designed in part, from what I can
tell,
to make selects more rational and to allow basic SIP flexibility,
if you
don't NEED to keep the data there, don't do it. They could keep
modifying the data model to suit everyone's tastes, but right now,
it's
designed more to ensure that SER works and works well. Want to store
user information like a cell phone number and a fax number and a
timezone and a flag on whether or not that user has DND enabled or has
access to the PSTN, etc, etc? Keep it elsewhere in a more usable
table
for such purposes.
As we've begun our testing with SER 2.0, we've had to modify
things more
drastically than we modified them in SER 0.9.6, but in ANY
environment,
you're going to have to modify things to work for you. At least
the core
system is flexible enough to let you tailor things. :)
N.
Greger V. Teigre wrote:
Dear Mahatma,
If you want to engage the developers, you will have to subscribe to
serdev, as many of the developers don't follow serusers.
I have not participated in SER's 2.0 data model, but my initial take
on your suggestion is the following:
With all respect, I believe you may have misunderstood what Tom is
addressing in his discussion. He targets a generic database model
built around attribute value pairs. This is not the case for SER, on
the contrary, SER's new data model is much sounder from a db
perspective (than 0.8 and 0..9) and is built around the uid and
did as
unique identifiers. Queries will through joins
across the tables
construct the needed data in a very efficient manner, as uid and
did
are indexed and where the queries will use uid
and did in the where
clause. Without having checked, I assume the tables have been
normalized just as they should (i.e. splitting them up).
The attribute-value pairs you are referring to are not part of the
core data model (which Tom covers), but rather attributes that
may be
loaded and made available in ser.cfg through a
query created to
retrieve the attributes-value pairs. Without creating a limited
set of
attributes that can be supported in ser.cfg, the
generic avpairs
cannot be avoided. However, the queries that retrieve avpairs do
not
use the semantic of the avpairs to select which
avpairs to load, ALL
avpairs belonging to a specific uid and did are loaded at the
same time.
Also, I'm afraid this statement is wrong: "It's a better idea
from a
database architecture and performance perspective
to keep adding
columns into that table for data that has a 1 to 1 relationship
with a
user."
This is exactly how you should not do it if you have complex data
relationships that need to be represented and retrieved without
duplicating data.
Does this answer alleviate your fears?
If not, please subscribe to serdev and post your comment there to
engage people closer to the design of the database.
g-)
X Z wrote:
> Hi All,
> This is specifically for the SER/OpenSER developers, but I'm not a
> serdev list member so I'm posting here.
>
> I've been using SER since version 0.8.X and I'm still running
0.8.14
> production for my company PBX to this day.
>
> I was very excited as version 2 became a release candidate and I
> downloaded it for testing. I was pretty disappointed with one
aspect
> of the new data model and I'm requesting
that the developers
consider
> a further revision on the data model.
>
> Basically, taking all fields out of the subscriber table like
> Last_name, first_name, email, timezone, rpid/asserted identity,
etc,
> etc is not the greatest idea. It's a
better idea from a database
> architecture and performance perspective to keep adding columns
into
> that table for data that has a 1 to 1
relationship with a user,
and
> that is common in > 90% of SER's use
cases ( i.e. corporate,
> carrier/VSP.) I would suggest adding voicemail_password, and maybe
> every other field that is being added into the default attributes
> script that I saw in CVS recently. If you already know what
> attributes a user has (and they have a 1 to 1 relationship),
then its
> far better from a db performance perspective
to keep these
attributes
> in the user table. I know that the code
becomes more
complicated, but
> I think it may be a tradeoff worth
discussing.
>
> See this discussion (
>
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:1067808…
> ) between Oracle users and Tom, (an Oracle
engineer/architect.)
The
> full text of this discussion is very
informative and I highly
> recommend people read it through.
>
> Tom's conclusion is that the type of data model being
discussed, and
> now being used in SER fails for all but the
most trivial of
> applications. Maybe SER *by itself* qualifies as "trivial" from a
> database architect's perspective, but think about things like
> Asterisk integration, which is quite common. You quickly run into
> some very nasty queries . . .
>
> Please note that I am not a software developer nor a database
> engineer, just a user who reads a lot, so I'm open to being the
> ignorant one here, but I thought that this should be discussed
among
> users and developers.
>
> Thanks for considering,
> Mahatma
>
>
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:1067808…
<http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:10678084117056>
>
<http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:10678084117056>
>
>
>> The following is an excerpt from the above link:
>
>> Here is a excerpt from my
forthcoming book where I talk about
this (and show you how
> ugly, hard and inefficient queries against
your very flexible
model will be)
>
>
>
>
>> (2)Do not use Generic Data Models
>
>> Frequently I see
applications built on a generic data model for
"maximum flexibility" or
> applications built in ways that prohibit
performance. Many
times - these are one in the
>
>> same thing! For example,
it is well known you can represent any
object in a database
>> using just four tables:
>
>> Create table objects (
oid int primary key, name varchar2(255) );
>
>> Create table attributes
>
>> ( attrId int primary key,
attrName varchar2(255),
>> datatype varchar2(25) );
>
>> Create table
object_Attributes
>> ( oid int, attrId int, value varchar2(4000),
>> primary key(oid,attrId) );
>
>> Create table Links ( oid1
int, oid2 int,
>
>> primary key (oid1, oid2)
);
>
>
>> That's it - no more CREATE TABLE for me! I can fill the
attributes table up with rows
>> like this:
>
>> insert into attributes
values ( 1, 'DATE_OF_BIRTH', 'DATE' );
>
>> insert into attributes
values ( 2, 'FIRST_NAME', 'STRING' );
>> insert into attributes values ( 3, 'LAST_NAME', 'STRING'
);
>> commit;
>
>
>> And now I'm ready to create a PERSON record:
>
>
>> insert into objects values ( 1, 'PERSON' );
>> insert into object_Attributes values( 1, 1, '15-mar-1965' );
>> insert into object_Attributes values( 1, 2, 'Thomas' );
>> insert into object_Attributes values( 1, 3, 'Kyte' );
>
>> commit;
>
>> insert into objects
values ( 2, 'PERSON' );
>> insert into object_Attributes values( 2, 1, '21-oct-1968' );
>> insert into object_Attributes values( 2, 2, 'John' );
>> insert into object_Attributes values( 2, 3, 'Smith' );
>
>> commit;
>
>> And since I'm good at
SQL, I can even query this record up to
get the FIRST_NAME and
>> LAST_NAME of all PERSON records:
>
>> ops$tkyte@ORA920 >
select
>> max( decode(attrName, 'FIRST_NAME', value, null )) first_name,
>
>> 2 max( decode(
attrName, 'LAST_NAME', value, null ) ) last_name
>> 3 from objects, object_attributes, attributes
>> 4 where attributes.attrName in ( 'FIRST_NAME',
'LAST_NAME' )
>
>> 5 and
object_attributes.attrId = attributes.attrId
>> 6 and object_attributes.oid = objects.oid
>> 7 and objects.name <http://objects.name>
<http://objects.name> = 'PERSON'
>> 8 group by objects.oid
>
>> 9 /
>
>> FIRST_NAME
LAST_NAME
>> -------------------- --------------------
>> Thomas Kyte
>> John Smith
>
>
>> Looks great, right? I mean, the developers don't have to
create
tables anymore, we can
>
>> add columns at the drop
of a hat (just requires an insert into
the ATTRIBUTES table). The
> developers can do whatever they want and the
DBA can't stop
them. This is ultimate
> "flexibility". I've seen people
try to build entire systems on
this model.
>
>
>> But, how does it perform? Miserably, terribly, horribly. A
simple "select first_name,
> last_name from person" query is
transformed into a 3-table join
with aggregates and all.
> Further, if the attributes are
"NULLABLE" - that is, there
might not be a row in
>
>> OBJECT_ATTRIBUTES for
some attributes, you may have to outer
join instead of just joining
> which in some cases can remove more optimal
query plans from
consideration.
>
>> Writing queries might
look pretty straightforward, but it's
impossible to do in a
>
>> performant fashion. For
example, if we wanted to get everyone
that was born in MARCH or
> has a LAST_NAME = 'SMITH', we could
simply take the query from
above and just wrap an
>> inline view around that:
>
>
>
>>
ops$tkyte@ORA920> select *
>> 2 from (
>> 3 select
>> max(decode(attrName, 'FIRST_NAME', value, null)) first_name,
>> 4 max(decode(attrName, 'LAST_NAME', value, null)) last_name,
>
>> 5 max(decode(attrName,
'DATE_OF_BIRTH', value, null))
>>
date_of_birth
> 6 from objects, object_attributes,
attributes
> 7 where attributes.attrName
> in ( 'FIRST_NAME',
> 'LAST_NAME',
'DATE_OF_BIRTH' )
>> 8 and object_attributes.attrId = attributes.attrId
>> 9 and object_attributes.oid = objects.oid
>
>> 10 and objects.name
<http://objects.name>
<http://objects.name> = 'PERSON'
>> 11 group by objects.oid
>> 12 )
>> 13 where last_name = 'Smith'
>> 14 or date_of_birth like '%-mar-%'
>
>> 15 /
>
>> FIRST_NAME
LAST_NAME DATE_OF_BIRTH
>> -------------------- -------------------- --------------------
>> Thomas Kyte 15-mar-1965
>> John Smith 21-oct-1968
>
>
>> So, it looks "easy" to query, but think about the
performance!
If you had a couple
> thousand OBJECT records, and a couple tens of
thousands of
OBJECT_ATTRIBUTES - Oracle
> would have to process the entire inner group
by query first and
then apply the WHERE
>
>> clause.
>
>> This is not a made up
data model, one that I crafted just to
make a point. This is an
> actual data model that I've seen people
try to use. Their goal
is ultimate flexibility.
> They don't know what OBJECTS they need,
they don't know what
ATTRIBUTES they will have.
>
>> Well - that is what the
database was written for in the first
place: Oracle implemented
> this thing called SQL to define OBJECTS and
ATTRIBUTES and lets
you use SQL to query
> them. You are trying to put a generic layer
on top of a generic
layer - and it fails each
>
>> and every time except for
the most trivial of applications.
>
>
>
------------------------------------------------------------------------
>
>>
_______________________________________________
>> Serusers mailing list
>> Serusers(a)lists.iptel.org <mailto:Serusers@lists.iptel.org>
>>
http://lists.iptel.org/mailman/listinfo/serusers
<http://lists.iptel.org/mailman/listinfo/serusers>
>
>
------------------------------------------------------------------------
_______________________________________________
Serusers mailing list
Serusers(a)lists.iptel.org <mailto:Serusers@lists.iptel.org>
http://lists.iptel.org/mailman/listinfo/serusers
------------------------------------------------------------------------
_______________________________________________
Serusers mailing list
Serusers(a)lists.iptel.org
http://lists.iptel.org/mailman/listinfo/serusers
_______________________________________________
Serusers mailing list
Serusers(a)lists.iptel.org