[sr-dev] str_str function

Henning Westerholt hw at kamailio.org
Wed Jun 13 15:10:19 CEST 2012


Am Dienstag, 12. Juni 2012, 13:05:18 schrieb Vicente Hernando:
> str_str function in lib/cds/sstr.c file has a comment: /* FIXME: 
> reimplement using better algorithm */
> 
> Attached patch reimplements that function using 
> Boyer-Moore-Horspool-Raita algorithm, which is the one most text editors 
> use to search for substrings.
> 
> Boyer-Moore-Horspool-Raita algorithm is a simplification from 
> Boyer-Moore one, with less preprocessing overhead.
> 
> 
> If n is the string length, and m is substring length, naive brute force 
> algorithm has a cost of O(n*m).
> 
> Raita algoritm has same cost in worst case but Ω(n/m) a better average
> cost.
>  As a drawback it adds a preprocessing cost of O(m + charset_length)
> where charset_length is 256 for us, because we use char.
> 
> 
> Another way to approach this substring search issue would be separate it 
> into two functions, a preprocessing one, and a searching one. Then in 
> module initializacion where the function is used, calculate the needed 
> preprocess and when the function is called do the search only.
> 
> 
> This Raita algorithm could also be useful for str_search function in 
> ut.c. Its usefulness depends on n, and m lengths, the bigger they are 
> the better the algoritm works.
> 
> http://en.wikipedia.org/wiki/String_searching_algorithm
> http://en.wikipedia.org/wiki/Boyer%E2%80%93Moore%E2%80%93Horspool_algorithm
> http://www-igm.univ-mlv.fr/~lecroq/string/node22.html

Hallo Vincente,

thanks for your patch. One question, would it not make sense to use the strstr 
function of <string.h>, with a small wrapper, which is probably quite 
optimised instead of implementing it our self? 

Best regards,

Henning



More information about the sr-dev mailing list