On 26.08.2009 17:14 Uhr, Alex Balashov wrote:
Daniel-Constantin Mierla wrote:
just to be sure everyone got it right regarding
the size attribute in
htable parameter ... the number of slots for hash table is 2^size (2
power size). So, size=4 means that the hash table has 16 slots.
Reason for that is the speed in getting the slot number - modulo
function is very fast in such cases, the slot number being actually a
bitwise operation: hash_value & (2^size - 1)
--- 2^size is cached at startup.
Thanks, that was a much-needed clarification omitted from my explanation.
I have always been told that table sizes that are prime numbers result
in better distribution of factors due to the way modulo works,
although I am not a mathematician and don't have a ready understanding
of why. Is this true? Is there some reason you did not take this
approach? Is hash_value & 2(^size - 1) equally fast?
not a mathematician myself and havent tried the approach with prime
numbers. The fair distribution is (attempted to be) ensured by the hash
functions. They exist for long time in ser/openser, iirc, Andrei
Pelinescu implemented them (with some improvements in SR) and he tested
to get a fair distribution for strings, SIP addresses (being used in
location table as well).
I did tests as well, mainly for location and under heavy load the (max -
min) was in average less than 10% of max, which is good.
Cheers,
Daniel
--
Daniel-Constantin Mierla
*
http://www.asipto.com/