by accident i sent this message to devel list when i intended to send it to users list.
i'm thinking to speed up lcr load_gws() implementation by storing prefixes into hash table. it would allow load_gws() to execute (if necessary) in constant time independent of the number of prefixes. also, i'm planing to make the size of hash table controllable by module parameter.
the side effect of this is that prefix_mode=1, where prefixes are regular expressions, would have to be dropped, because hashing on regular expressions makes no sense.
something close to prefix_mode=1 could be achieved using dialplan module to store the regular expressions and return as attribute a gateway grp_id. that could then be used by new load_gws_from_group(grp_id) to load gws in the given group.
any comments on this? is someone depending on prefix_mode=1?
by the way, before starting this project, i took a look at opensips drouting module, but unfortunately it does not cover many features of lcr module that i use in my configurations and thus cannot be used to replace lcr module.
-- juha
What about some sort of hashing algorithm that operates on regular expression sums or is otherwise tuned by regex elements?
I have no idea what such an algorithm is, but I imagine there exist ways to evaluate large lists of regular expressions in a way that is at least to some extent non-linear, somewhat in the way that a spell checker works?
Juha Heinanen wrote:
by accident i sent this message to devel list when i intended to send it to users list.
i'm thinking to speed up lcr load_gws() implementation by storing prefixes into hash table. it would allow load_gws() to execute (if necessary) in constant time independent of the number of prefixes. also, i'm planing to make the size of hash table controllable by module parameter.
the side effect of this is that prefix_mode=1, where prefixes are regular expressions, would have to be dropped, because hashing on regular expressions makes no sense.
something close to prefix_mode=1 could be achieved using dialplan module to store the regular expressions and return as attribute a gateway grp_id. that could then be used by new load_gws_from_group(grp_id) to load gws in the given group.
any comments on this? is someone depending on prefix_mode=1?
by the way, before starting this project, i took a look at opensips drouting module, but unfortunately it does not cover many features of lcr module that i use in my configurations and thus cannot be used to replace lcr module.
-- juha
Users mailing list Users@lists.kamailio.org http://lists.kamailio.org/cgi-bin/mailman/listinfo/users
Alex Balashov writes:
What about some sort of hashing algorithm that operates on regular expression sums or is otherwise tuned by regex elements?
the problem is: given a string, find from set of prefixes the longest one that matches the string. if prefixes are known strings, they can be organized in a data structure (tree, hash table, etc), where matching is fast to execute.
if prefixes are regular expressions, i don't know any other solution than linearly match them one by one to string.
-- juha
Juha Heinanen wrote:
Alex Balashov writes:
What about some sort of hashing algorithm that operates on regular expression sums or is otherwise tuned by regex elements?
the problem is: given a string, find from set of prefixes the longest one that matches the string. if prefixes are known strings, they can be organized in a data structure (tree, hash table, etc), where matching is fast to execute.
if prefixes are regular expressions, i don't know any other solution than linearly match them one by one to string.
The solution, I imagine, would be to devise a tree or hash structure wherein certain generalisable and atomic elements of regular expression parsing are used to place the keys.
That said, I do not know what such an algorithm would be, cannot point to any concrete implementations as examples, etc. and probably ought to shut up. :)
Besides, I frequently make the erroneous assumption that such data structures and optimisations exist where they don't, or at least are not implemented. For instance, I was recently party to a thread on a list for network service providers using Cisco gear in which it was revealed to me, contrary to what I have characteristically and habitually assumed, that ordinary access lists are actually evaluated in a linear way; thus, it makes sense to place the most frequently-matched rules near the top of the list if it is long.
I had no idea. I just took for granted that of course the matching is done using some sort of compiled tree or hash structure that gives matching performance somewhere between O(1) and O(N).
So, don't listen to me. :)
Cheers,