Iñaki,
Just as an addition from experience on hash tables, the first "sweet-spot" on the design is to have a fairly good algorithm so that different keys are mapped on different hash values. You cannot pick this (as far as I read) but we'll think it as fair enough. What you can do to "help" the algorithm is to pick a field already random or large. If you have redundant fields (like username and table ID from a database) try with both "unique" fields as a key and check which gives you better results.
Then you should use a table big enough to map a big number of different hash values. As computing is discrete you will find a maximum size, above that size you'll find collisions on some slots and empty slots (because the algorithm cannot generate those hash values). Keep in mind that sizing the hash table will consume memory resources.
Last you know that statistically you will bump into collisions, they are inevitable and no real-life practice will guarantee collision-free operation. When you have collisions, what you can do is have a powerful processor to operate searches and inserts quickly.
What would be best to optimize the system is to have some statistic information, save every now and then the status of the htable and act accordingly to: - If the table is pretty much or completely used and you have collisions, make the size parameter bigger - If the table is big enough so you'll have some slots free and still have collisions you can try picking some key (if you have redundant ones) that is more "random" so the hash result will be more "random". - If all above is not producing any result, add processor power to the system so searches and inserts are faster
Hope these guidelines help, they are some conclusions from hashing and queuing theory books.
Cheers, Uriel
On Sat, Oct 16, 2010 at 7:47 AM, Juha Heinanen jh@tutpro.com wrote:
Iñaki Baz Castillo writes:
And those collisions mean that I could look for a key name and retrieve another key (as both key names produce the same hash), am I right?
no, the values whose keys hash to the same index of hash table, are added into a linked list and then list list will be searched until the correct key is found.
So, if I want to handle ~ 300 concurrent entries, with *no* collision, which "size" value is good enough? how to determine it?
you cannot unless you know the hash algorithm and your keys in advance.
-- juha
SIP Express Router (SER) and Kamailio (OpenSER) - sr-users mailing list sr-users@lists.sip-router.org http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-users