[SOLVED] Why does this regexp eat the first character??


carlos1234

I have been going round and round on this for a couple of days and just can't figure it out.  Would appreciate any insight anyone might have as to why this is happening. 

 

What I am trying to do...

 

I want to replace the value of a setting found on the right hand side of the equal sign in the $text variable below.  Instead of the value being 6 spaces followed by "16em" I want it to end up being 6 spaces followed by "24em". 

 

Instead I end up with a space followed by "4em".  Why?

 

The preg_match seems to work just fine. 

 

Here is my test code...

 

<?php
$text = "LeftWidth =      16em\n";
preg_match('/([a-zA-Z]*)\s*=(\s*?)([ a-zA-Z0-9]*)\n/U' , $text, $matches);
echo "\$matches[0] = [$matches[0]]<br>";
echo "\$matches[1] = [$matches[1]]<br>";
echo "\$matches[2] = [$matches[2]]<br>";
echo "\$matches[3] = [$matches[3]]<br>";
$newvalue = "24em";
$text = preg_replace('/([a-zA-Z]*)\s*=(\s*?)([ a-zA-Z0-9]*)\n/U', "$1=$2$newvalue\n", $text);
echo "[$text]<br>";
?>

 

Any input would be greatly appreciated. 

 

Thanks.

 

Carlos


using

$text = "LeftWidth =      16em\n";
preg_match_all('/([a-zA-Z]*)\s*=(\s*?)([ a-zA-Z0-9]*)\n/i', $text, $result);
$newvalue = "24em";

$text = preg_replace('/([a-zA-Z]*)\s*=(\s*?)([ a-zA-Z0-9]*)\n/i', "$1=$2_REMOVE_$newvalue\n", $text);
$text = str_replace("_REMOVE_",'',$text);
echo "[$text]<br />\n";

I can get the result you want.

This happens because PHP substitutes the $newvalue variable into the string before preg_replace ever sees it, so the replacement that gets passed in is "$1=$224em\n".

preg_replace then has to parse that string and gets confused by the digits that follow $2. You could use a space instead of _REMOVE_, but then str_replace would take out every space in the whole text.


Thanks very much for the input.  Not sure if preg_replace really gets confused as you say, though I did figure out that if I put a space or any other character between the $2 and $newvalue, it works as it should. 

 

Is there no other way to just use a regexp in preg_replace without having to work around preg_replace's "confusion" as you mentioned? 

 

Carlos


I think it does get confused, as Kloplop321 says. It seemingly looks for the 22nd capture group. With normal strings/variables you can avoid the confusion by using curly brackets to close off the variable...

 

"$1={$2}$newvalue\n" 

 

...but after a quick look at the preg_replace manual, the syntax is slightly different...

 

"$1 = \${2}$newvalue\n",


Not sure if preg_replace really gets confused as you say…

 

The problem is preg_replace getting "confused", in a sense.  Consider the following:

 

$subject  = "foobar";
$baz      = "9";
$replace  = "$2$1$baz";
$replaced = preg_replace("/(foo)(bar)/", $replace, $subject);

 

$replace will contain the string "$2$19".  This is fed into preg_replace, which looks for capturing group #2 (which contains bar) and #19! Group #19 does not exist, so it is substituted with an empty string, leaving $replaced to contain just bar.  Bear in mind that these dollar replacements can only go up to 99; this is important for your particular problem.
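
For anyone who wants to see this for themselves, here is a small sketch of the same foobar example with the intermediate values dumped. The variable names $ambiguous, $explicit and the two $result... names are just made up for the illustration.

```php
<?php
// Shows that preg_replace reads "$19" as group 19, not group 1 followed by "9".
$subject = "foobar";
$baz     = "9";

// PHP only interpolates $baz here. "$2" and "$1" are not valid PHP variable
// names, so they stay literal and the string becomes '$2$19'.
$ambiguous       = "$2$1$baz";
$resultAmbiguous = preg_replace('/(foo)(bar)/', $ambiguous, $subject);

// Braces end the group number explicitly. Single quotes are used so PHP
// leaves '${2}${1}' alone; $baz is appended by concatenation.
$explicit       = '${2}${1}' . $baz;
$resultExplicit = preg_replace('/(foo)(bar)/', $explicit, $subject);

var_dump($ambiguous);       // string(5) "$2$19"
var_dump($resultAmbiguous); // string(3) "bar"
var_dump($resultExplicit);  // string(7) "barfoo9"
```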

 

With the replacement string of "$1=$2$newvalue\n", the value "$1=$224em\n" will be passed in to be replaced. It will then try to substitute $1, which exists with the value "LeftWidth", and $22, which doesn't exist and so gets a value of "" (empty string), leaving us with "LeftWidth=4em\n".
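
A quick way to check this is to dump the replacement string before it goes into preg_replace. This is only a sketch built from the code in the first post; $pattern, $replacement and $broken are names I picked for it.

```php
<?php
// Reproduces the original problem and shows the string preg_replace receives.
$text     = "LeftWidth =      16em\n";
$newvalue = "24em";
$pattern  = '/([a-zA-Z]*)\s*=(\s*?)([ a-zA-Z0-9]*)\n/U';

// PHP interpolates $newvalue first, so the "2" of "24em" ends up
// glued onto "$2" and the reference now reads "$22".
$replacement = "$1=$2$newvalue\n";
$broken      = preg_replace($pattern, $replacement, $text);

var_dump($replacement); // the literal text $1=$224em followed by a newline
var_dump($broken);      // "LeftWidth=4em" followed by a newline
```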

 

To resolve this issue, you need to change the way you reference the dollar replacements. First, wrap the number in curly braces to make it clear which number we are referring to ($2 becomes ${2}).  Next, because we're using double quotes we need to escape the dollar else the parser will look for an actual variable called $2 (which does not exist): ${2} becomes \${2}.  To put this into your example, the replacement string should be "$1=\${2}$newvalue\n".
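
Here is a rough sketch of that fix applied to the code from the first post, plus two variations that should give the same result. The single-quote and preg_replace_callback versions are my own additions, not something from the posts above, and the variable names are made up.

```php
<?php
$text     = "LeftWidth =      16em\n";
$newvalue = "24em";
$pattern  = '/([a-zA-Z]*)\s*=(\s*?)([ a-zA-Z0-9]*)\n/U';

// Option 1: double quotes, with the dollar escaped so PHP passes the
// literal text ${2} through to preg_replace.
$fixedDouble = preg_replace($pattern, "$1=\${2}$newvalue\n", $text);

// Option 2: single quotes for the references (nothing to escape),
// then concatenate the new value and the newline.
$fixedSingle = preg_replace($pattern, '$1=${2}' . $newvalue . "\n", $text);

// Option 3: build the replacement in a callback, so the new value is
// never parsed for $n references at all.
$fixedCallback = preg_replace_callback(
    $pattern,
    function (array $m) use ($newvalue) {
        // $m[1] is the setting name, $m[2] the whitespace after "=".
        return $m[1] . '=' . $m[2] . $newvalue . "\n";
    },
    $text
);

var_dump($fixedDouble);   // "LeftWidth=      24em" followed by a newline
var_dump($fixedSingle);   // same
var_dump($fixedCallback); // same
```

Note that the space before the equal sign is dropped in all three, because the pattern matches it outside of any capturing group. The callback version is worth considering if the new value could ever contain a dollar sign or backslash followed by digits, since options 1 and 2 would treat those as references too.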

 

P.S. cags beat me to it, really need to learn to get around to reading and answering more quickly.


I haven't tried the solution proposed yet but it makes sense and I am excited to implement it!  You all rock! 

 

I wouldn't have figured out what you all said in a million years of Googling everything I could get my hands on regarding regular expressions. 

 

Thanks so much again!! 

 

Carlos


Hmm...interesting.  Yes I can see that now but the documentation mentions "\\1" (the old style backreferencing)...not $1 (the new style). 

 

I don't think I would have made the connection between that old style and new style and figured out that the problem was related to that on my own without your all's help. 

 

But thanks for pointing that out though.  Appreciate it. 

 

Carlos
