[OpenSER-Devel] [ openser-Bugs-1819248 ] replace_all, replace_body_all, subst anchor bug

SourceForge.net noreply at sourceforge.net
Tue Nov 13 09:09:44 UTC 2007


Bugs item #1819248, was opened at 2007-10-24 14:54
Message generated for change (Comment added) made by bogdan_iancu
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=743020&aid=1819248&group_id=139143

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: core
Group: ver 1.2.x
Status: Closed
Resolution: Accepted
Priority: 5
Private: No
Submitted By: Alexander Bergolth (bergolth)
Assigned to: Bogdan (bogdan_iancu)
Summary: replace_all, replace_body_all, subst anchor bug

Initial Comment:
Hi!

The replace_all, replace_body_all and subst functions all fail to match the beginning of the next line, when the last match ended with a newline.

e.g. when trying to remove the three lines
-------------------- 8< --------------------
a=alt:1 3 : xmv2v5mo ix7lpuPK 137.208.90.150 13726.
a=alt:2 2 : lPFc4bEV udepPIjW 192.168.217.1 13726.
a=alt:3 1 : 4o1caMp2 9P6XBBmd 192.168.172.1 13726.
-------------------- 8< --------------------

... replace_body_all("^a=alt:.*\r\n","") will remove only the first and the third line, it won't match the second line.

The reason is that after the first match, eflags|=REG_NOTBOL is set, so that in all subsequent matches, "^" won't match the beginning of the string. Unfortunately, since the last match ended with a newline, "^" _should_ match at the beginning of the string in this case.

Maybe the REG_NOTBOL flag should be set depending on the last character before the next match starts?

This bug is also present in subst_run() of re.c in the openser core.

I've attached a patch that fixes the problem for me.
(Not very well tested.)

Cheers,
--leo


----------------------------------------------------------------------

>Comment By: Bogdan (bogdan_iancu)
Date: 2007-11-13 11:09

Message:
Logged In: YES 
user_id=1275325
Originator: NO

OK - implemented ! :)

Thanks and regards,
Bogdan

----------------------------------------------------------------------

Comment By: Alexander Bergolth (bergolth)
Date: 2007-11-13 00:17

Message:
Logged In: YES 
user_id=24695
Originator: YES

Yes, this sounds like a good idea. :)
Cheers,
--leo


----------------------------------------------------------------------

Comment By: Bogdan (bogdan_iancu)
Date: 2007-11-12 18:39

Message:
Logged In: YES 
user_id=1275325
Originator: NO

Hi Alex,

Thanks for pointing out the issue with empty matches. I took a closer look
at the code and indeed, in re.c, such cases are reported and reported as
errors:
            if (pmatch[0].rm_so==pmatch[0].rm_eo){
                LM_ERR("matched string is empty... invalid regexp?\n");
                goto error;
            }

My sugestion will be to do the same in textops module. And in this case,
we can drop the "begining of string" test.
What do you think?

Regards,
Bogdan

----------------------------------------------------------------------

Comment By: Bogdan (bogdan_iancu)
Date: 2007-11-12 18:03

Message:
Logged In: YES 
user_id=1275325
Originator: NO

OK - I exteneded the test to include the case of a zero-len matched
string.

Thanks and regards,
Bogdan

----------------------------------------------------------------------

Comment By: Alexander Bergolth (bergolth)
Date: 2007-11-12 18:01

Message:
Logged In: YES 
user_id=24695
Originator: YES

Please note that my test for empty matches will only avoid accessing a
character outside the buffer in case of an empty first match.

If null-matches are possible, there is a general problem with the code,
since it will continue with the next regexec where the last match ended,
which, after a null-match, is the still the same position as the previous
run.
There is already a check for null-matches in re.c but in replace_all_f()
and replace_body_all_f() in textops.c, this potential loop isn't detected.

Cheers,
--leo


----------------------------------------------------------------------

Comment By: Bogdan (bogdan_iancu)
Date: 2007-11-12 17:03

Message:
Logged In: YES 
user_id=1275325
Originator: NO

Ok - I see your point. I will take care and propagate the empty test
also.

Thanks and regards,
Bogdan

----------------------------------------------------------------------

Comment By: Alexander Bergolth (bergolth)
Date: 2007-11-12 16:47

Message:
Logged In: YES 
user_id=24695
Originator: YES

Hi!

Sorry for the late reply.
I did that initial position check, because I do think it's possible to
have null-string matches.
How about a regex like "x?" or "(x|)"?

At least with pcre, you have to explicitly disallow null-matches, see
PCRE_NOTEMPTY.

But if null-matches are really possible, all regexp-functions that will
continue where the last match ended, will loop. So maybe we should
explicitly handle that case and check if a match is empty?

Cheers,
--leo


----------------------------------------------------------------------

Comment By: Bogdan (bogdan_iancu)
Date: 2007-11-12 16:17

Message:
Logged In: YES 
user_id=1275325
Originator: NO

Hi Alexander,

I pushed your patch on SVN with some small changes - see previous
comment.

Thanks and regards,
Bogdan

----------------------------------------------------------------------

Comment By: Bogdan (bogdan_iancu)
Date: 2007-11-09 12:53

Message:
Logged In: YES 
user_id=1275325
Originator: NO

Hi Alexander,

I was reviewing the patch and it looks correct for me excepting one thing:
when setting/resetting the flag you test to see if the position is not the
start one: "if (p==input || *(p-1)=='\n')" . But I think p cannot equal
input anytime as this test is done after the first run of regexec and p
will point at the end of the matched string (which cannot be empty). So
p!=input all the time.....But maybe I'm missing something...:)

Thanks and regards,
Bogdan

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=743020&aid=1819248&group_id=139143



More information about the Devel mailing list