[SR-Users] HTable 'size' parameter

Sat Oct 16 15:12:39 CEST 2010

Iñaki,

Just as an addition from experience on hash tables, the first
"sweet-spot" on the design is to have a fairly good algorithm so that
different keys are mapped on different hash values. You cannot pick
this (as far as I read) but we'll think it as fair enough. What you
can do to "help" the algorithm is to pick a field already random or
large. If you have redundant fields (like username and table ID from a
database) try with both "unique" fields as a key and check which gives
you better results.

Then you should use a table big enough to map a big number of
different hash values. As computing is discrete you will find a
maximum size, above that size you'll find collisions on some slots and
empty slots (because the algorithm cannot generate those hash values).
Keep in mind that sizing the hash table will consume memory resources.

Last you know that statistically you will bump into collisions, they
are inevitable and no real-life practice will guarantee collision-free
operation. When you have collisions, what you can do is have a
powerful processor to operate searches and inserts quickly.

What would be best to optimize the system is to have some statistic
information, save every now and then the status of the htable and act
accordingly to:
- If the table is pretty much or completely used and you have
collisions, make the size parameter bigger
- If the table is big enough so you'll have some slots free and still
have collisions you can try picking some key (if you have redundant
ones) that is more "random" so the hash result will be more "random".
- If all above is not producing any result, add processor power to the
system so searches and inserts are faster

Hope these guidelines help, they are some conclusions from hashing and
queuing theory books.

Cheers,
Uriel

On Sat, Oct 16, 2010 at 7:47 AM, Juha Heinanen <jh at tutpro.com> wrote:
> Iñaki Baz Castillo writes:
>
>> And those collisions mean that I could look for a key name and
>> retrieve another key (as both key names produce the same hash), am I
>> right?
>
> no, the values whose keys hash to the same index of hash table, are
> added into a linked list and then list list will be searched until the
> correct key is found.
>
>> So, if I want to handle ~ 300 concurrent entries, with *no* collision,
>> which "size" value is good enough? how to determine it?
>
> you cannot unless you know the hash algorithm and your keys in advance.
>
> -- juha
>
> _______________________________________________
> SIP Express Router (SER) and Kamailio (OpenSER) - sr-users mailing list
> sr-users at lists.sip-router.org
> http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-users
>