[sr-dev] str_str function

Vicente Hernando vhernando at systemonenoc.com
Wed Jun 13 15:47:59 CEST 2012


Looking into strstr man page: char *strstr(const char *haystack, const 
char *needle);

the patch advantage would be it already knows string lengths and has not 
to calculate them. Otherwise I expect strstr use a similar to Raita 
algoritm.

Regards,
Vicente.

On 06/13/2012 03:34 PM, Vicente Hernando wrote:
> Hello Henning,
>
> On 06/13/2012 03:10 PM, Henning Westerholt wrote:
>> Am Dienstag, 12. Juni 2012, 13:05:18 schrieb Vicente Hernando:
>>> str_str function in lib/cds/sstr.c file has a comment: /* FIXME:
>>> reimplement using better algorithm */
>>>
>>> Attached patch reimplements that function using
>>> Boyer-Moore-Horspool-Raita algorithm, which is the one most text 
>>> editors
>>> use to search for substrings.
>>>
>>> Boyer-Moore-Horspool-Raita algorithm is a simplification from
>>> Boyer-Moore one, with less preprocessing overhead.
>>>
>>>
>>> If n is the string length, and m is substring length, naive brute force
>>> algorithm has a cost of O(n*m).
>>>
>>> Raita algoritm has same cost in worst case but Ω(n/m) a better average
>>> cost.
>>>   As a drawback it adds a preprocessing cost of O(m + charset_length)
>>> where charset_length is 256 for us, because we use char.
>>>
>>>
>>> Another way to approach this substring search issue would be 
>>> separate it
>>> into two functions, a preprocessing one, and a searching one. Then in
>>> module initializacion where the function is used, calculate the needed
>>> preprocess and when the function is called do the search only.
>>>
>>>
>>> This Raita algorithm could also be useful for str_search function in
>>> ut.c. Its usefulness depends on n, and m lengths, the bigger they are
>>> the better the algoritm works.
>>>
>>> http://en.wikipedia.org/wiki/String_searching_algorithm
>>> http://en.wikipedia.org/wiki/Boyer%E2%80%93Moore%E2%80%93Horspool_algorithm 
>>>
>>> http://www-igm.univ-mlv.fr/~lecroq/string/node22.html
>> Hallo Vincente,
>>
>> thanks for your patch. One question, would it not make sense to use 
>> the strstr
>> function of<string.h>, with a small wrapper, which is probably quite
>> optimised instead of implementing it our self?
> actually, I do not know. Kamailio provides some functions that are 
> already available in some libraries, like isdigit, etc.
>
> Is there any kamailio policy about what functions to use?
>
> str_str function is used in presence_b2b SER module.
>
> We would manage further optimization by separating substring 
> preprocessing and searching in the string, because we know we are 
> always going to search for "accepted", "terminated" and "pending".
>
> Best regards,
> Vicente
>
>
>> Best regards,
>>
>> Henning
>>
>
>
> _______________________________________________
> sr-dev mailing list
> sr-dev at lists.sip-router.org
> http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev




More information about the sr-dev mailing list