[Serusers] SER data model discussion - developers please read
SIP
sip at arcdiv.com
Tue Dec 4 02:34:25 CET 2007
Jiri Kuthan wrote:
> At 19:39 03/12/2007, SIP wrote:
>
>> On the flip side of having to do more modifications for integration, of
>> course, is that we get the ABILITY to do more modifications.
>>
>> SER 0.9.6 wasn't limited, per se, but certain aspects of what we've
>> implemented (user-configurable timers, number-based AND uri-based call
>> blocking, call forwarding, call return, whitelisting number mode, etc)
>> in 0.9.6 has felt like a complete and total hack because of the way it
>> has had to be done using a mish mash of AVPs. We're HOPING we can
>> forego some of that with the new SER using the new db queries and such,
>> as well as being able to better-implement some functionality of other
>> RFCs using the more flexible core.
>>
>> But yes... integration with the new db schema is proving to be a pain in
>> the left ventricle if only because it's so totally focused around
>> optimisation for SER.
>>
>
> The question to me is whether this pain is really specific to current release
> of the data model (which I apparently don't think so) or to do the burden
> of migration from the previous. IMO, it is the latter, to be compensated
> by extensibility which minimizes future migration burden.
>
>
I disagree entirely that it's strictly based on migration.
Let's look at the simple act of adding information to the system:
With a flat subscriber table with all the fields I desire, adding
information to the system is 1 insert.
Let's say that my subscriber table looks like this (because in the dev
ser 0.9.6 box, it does):
+-------------------+--------------+------+-----+---------------------+-------+
| Field | Type | Null | Key | Default |
Extra |
+-------------------+--------------+------+-----+---------------------+-------+
| phplib_id | varchar(32) | NO | UNI |
| |
| username | varchar(64) | NO | PRI |
| |
| domain | varchar(128) | NO | PRI |
| |
| password | varchar(25) | NO | |
| |
| first_name | varchar(25) | NO | |
| |
| last_name | varchar(45) | NO | |
| |
| phone | varchar(15) | NO | |
| |
| email_address | varchar(50) | NO | |
| |
| datetime_created | datetime | NO | | 0000-00-00 00:00:00
| |
| datetime_modified | timestamp | NO | | CURRENT_TIMESTAMP
| |
| confirmation | varchar(64) | NO | |
| |
| flag | char(1) | NO | | o
| |
| sendnotification | varchar(50) | NO | |
| |
| greeting | varchar(50) | NO | |
| |
| ha1 | varchar(128) | NO | |
| |
| ha1b | varchar(128) | NO | |
| |
| allow_find | char(1) | NO | | 0
| |
| timezone | varchar(128) | YES | | NULL
| |
| rpid | varchar(128) | YES | | NULL
| |
| domn | int(10) | YES | | NULL
| |
| uuid | varchar(64) | YES | | NULL
| |
| signup_ip | varchar(15) | NO | |
| |
+-------------------+--------------+------+-----+---------------------+-------+
Now... ignoring the fact that we have 1 possibly 2 unused fields in that
table, let's assume we have to add a user in the new schema.
That's 22 inserts right there (one for each attribute, especially since
we can't have defaults).
Searching through data? Let's say I want to know the
first_name,last_name,username,domain,confirmation, and the
datetime_modified of a user. Simple enough. That's 1 query. If any of
those data points were null or their default, grabbing that is still
easy as can be.
Now... in the new schema, that's a join (which IS less efficient than a
query, I'm afraid, though I can't quote exact meaningful metrics on how
much). If any of those data points are not SET (assuming we were going
to minimise our impact by NOT doing a full 22-insert creation for each
user), it causes yet another series of headaches. When you want to
reference data in other relations, it becomes even MORE of a hassle. The
queries themselves become these unwieldy beasts that increase the chance
of an error in syntax or logic.
One of the simplest queries we have is with our admin interface to query
who's logged in in such a way that the data will be used in the web
interface:
select username, first_name, last_name, signup_ip from subscriber where
username=ANY (select username from location);
Simple. Succinct. Clear.We grab the fields we need and can format them.
One short line.
Now I'll leave the new schema version as an exercise for the reader, as
I don't recall the syntax right off (which is the first clue as to its
added complexity). It's lengthy and utterly non-relational -- which is
why we simply don't USE it for these sorts of things.
This is not just a migration headache. This is a schema problem. Now, I
know you love your schema, and you're allowed, and I fully understand
that it's flexible and good for the SER service. But for integration
with any sort of system beyond the most basic, it's a right pain in the
ass. You can tout its joys and wonders all you'd like, but we're
developing with it, and I can assure you, it's not as easy and as
straightforward as you like to believe.
I'm not saying there's anything wrong with the schema overall. There
have been some great forward leaps in this one, such as the riddance of
the incredibly tiresome domain column in every table (that was sometimes
checked and sometimes not). But it DOES make for additional complexity
when building a system around it.
N.
More information about the sr-users
mailing list