[sr-dev] str_str function
Vicente Hernando
vhernando at systemonenoc.com
Wed Jun 13 15:34:37 CEST 2012
Hello Henning,
On 06/13/2012 03:10 PM, Henning Westerholt wrote:
> Am Dienstag, 12. Juni 2012, 13:05:18 schrieb Vicente Hernando:
>> str_str function in lib/cds/sstr.c file has a comment: /* FIXME:
>> reimplement using better algorithm */
>>
>> Attached patch reimplements that function using
>> Boyer-Moore-Horspool-Raita algorithm, which is the one most text editors
>> use to search for substrings.
>>
>> Boyer-Moore-Horspool-Raita algorithm is a simplification from
>> Boyer-Moore one, with less preprocessing overhead.
>>
>>
>> If n is the string length, and m is substring length, naive brute force
>> algorithm has a cost of O(n*m).
>>
>> Raita algoritm has same cost in worst case but Ω(n/m) a better average
>> cost.
>> As a drawback it adds a preprocessing cost of O(m + charset_length)
>> where charset_length is 256 for us, because we use char.
>>
>>
>> Another way to approach this substring search issue would be separate it
>> into two functions, a preprocessing one, and a searching one. Then in
>> module initializacion where the function is used, calculate the needed
>> preprocess and when the function is called do the search only.
>>
>>
>> This Raita algorithm could also be useful for str_search function in
>> ut.c. Its usefulness depends on n, and m lengths, the bigger they are
>> the better the algoritm works.
>>
>> http://en.wikipedia.org/wiki/String_searching_algorithm
>> http://en.wikipedia.org/wiki/Boyer%E2%80%93Moore%E2%80%93Horspool_algorithm
>> http://www-igm.univ-mlv.fr/~lecroq/string/node22.html
> Hallo Vincente,
>
> thanks for your patch. One question, would it not make sense to use the strstr
> function of<string.h>, with a small wrapper, which is probably quite
> optimised instead of implementing it our self?
actually, I do not know. Kamailio provides some functions that are
already available in some libraries, like isdigit, etc.
Is there any kamailio policy about what functions to use?
str_str function is used in presence_b2b SER module.
We would manage further optimization by separating substring
preprocessing and searching in the string, because we know we are always
going to search for "accepted", "terminated" and "pending".
Best regards,
Vicente
> Best regards,
>
> Henning
>
More information about the sr-dev
mailing list