Regex help

haku · January 11, 2013

I'm at a bit of a loss with this. Here's the code:

$text = preg_replace($patterns, $replacements, $text);

Here's $patterns:

Array
(
   [0] => /(?<!">)\$2010.sa/i
   [1] => /(?<!">)\$2010/i
)

And here's $replacements:

Array
(
   [0] => <a href="http://www.example.com/2010/sa">$2010.sa</a>
   [1] => <a href="http://www.example.com/2010">$2010</a>
)

Here's $text before the line of code at the top:

$2010 $2010.sa

And here's $text after the replacement:

<a href="http://www.example.com/2010">10</a> <a href="http://www.example.com/2010/sa">10.sa</a>

As you can see, before the replacement, $replacements contains two HTML anchors. The text of these anchors is $2010.sa and $2010 respectively. Yet after the replacement, the text has changed from $2010.sa to 10.sa and from $2010 to 10.

Can someone see how/why that would happen?

Actual code:

print_r($patterns);
print_r($replacements);
echo $text . PHP_EOL;
$text = preg_replace($patterns, $replacements, $text);
echo $text;

Christian F. · January 11, 2013

You need to escape the dot in the first pattern, otherwise it'll be counted as a wildcard.

As for why the second replacement happens, it's because you didn't properly escape the dollar sign. You need to add a double-escaped slash in front of it, to send an escaped dollar sign to the engine.

Like this:

$patterns = array ('/(?<!>)\\\\$2010\\.sa/i', '/(?<!>)\\\\$2010/i');

haku · January 11, 2013

Interesting. That didn't actually work for me, but I found that the problem was in my replacement - I wasn't escaping the $, so it was acting as a back reference. Once I escaped it in my replacements, it all works fine.

Array
(
   [0] => <a href="http://en.ba/company/2010/sa">\$2010.sa</a>
   [1] => <a href="http://en.ba/company/2010">\$2010</a>
)

But thanks for your help. I hadn't actually realized that I hadn't escaped the . in my patterns, so I've escaped it now. The regex was working on all the other cases I'm testing; this pattern, with a $ followed by numbers, was the last one that was failing (we've been testing for months). It's very possible the unescaped dot would have caught me out somewhere in the future, so I'm glad you caught that.
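
For anyone landing here later, the behaviour described above can be sketched as follows. This is a minimal reproduction using the text and patterns from the first post (with the dot escaped), not the exact production code; the variable names $unescaped, $escaped, $broken and $fixed are made up for illustration. In a preg_replace() replacement string, $20 is read as a reference to capture group 20. That group doesn't exist, so it is replaced with nothing and only "10" is left over.

```php
<?php
// Input text from the first post.
$text = '$2010 $2010.sa';

// Patterns from the first post, with the dot escaped as suggested.
// Single-quoted strings pass \$ and \. through to the regex engine unchanged.
$patterns = array(
    '/(?<!">)\$2010\.sa/i',
    '/(?<!">)\$2010/i',
);

// Unescaped replacements: "$20" is parsed as a reference to group 20,
// which does not exist, so it expands to an empty string.
$unescaped = array(
    '<a href="http://www.example.com/2010/sa">$2010.sa</a>',
    '<a href="http://www.example.com/2010">$2010</a>',
);

// Escaped replacements: "\$" is emitted as a literal dollar sign.
$escaped = array(
    '<a href="http://www.example.com/2010/sa">\$2010.sa</a>',
    '<a href="http://www.example.com/2010">\$2010</a>',
);

$broken = preg_replace($patterns, $unescaped, $text);
$fixed  = preg_replace($patterns, $escaped, $text);

echo $broken . PHP_EOL; // link text loses "$20"
echo $fixed . PHP_EOL;  // link text keeps "$2010" and "$2010.sa"
```

The (?<!">) lookbehind is what stops the second pattern from re-matching the $2010 that the first replacement has just put inside the anchor.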

Christian F. · January 11, 2013

D'oh! Turns out I should have been looking at the source, as I broke the RegExp with my double-escaping. >< Sorry about that.

Anyway, glad to see that you found the proper culprit; in hindsight I should have spotted that too. You're welcome btw, for the thing that I did catch. Those unescaped periods can be quite sneaky indeed, as they match themselves in addition to every other character (except newlines). That makes for some exceptionally sneaky bugs down the road.
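
A small sketch of the unescaped-dot problem, assuming a made-up lookalike input ($2010xsa) that is not from the thread. The variable names are illustrative only. preg_quote() is the standard PHP function for escaping regex metacharacters, and could be one way to build patterns like these from raw strings.

```php
<?php
// Unescaped dot: "." matches any single character except a newline.
$loose = '/\$2010.sa/i';

// Escaped dot: "\." matches only a literal period.
$strict = '/\$2010\.sa/i';

// Hypothetical input with an "x" where the period should be.
$lookalike = '$2010xsa';

$looseHit  = preg_match($loose, $lookalike);   // the wildcard accepts the "x"
$strictHit = preg_match($strict, $lookalike);  // the literal dot rejects it
$realHit   = preg_match($strict, '$2010.sa');  // the intended text still matches

// preg_quote() escapes "$" and "." (and the "/" delimiter if one is present).
$quoted = preg_quote('$2010.sa', '/');

var_dump($looseHit, $strictHit, $realHit, $quoted);
```

Because the loose pattern also matches the intended text, tests that only feed it valid input will never reveal the difference.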

haku · January 12, 2013

It probably would have too, because this isn't a regex I threw together in a few minutes. It's been refined over a few months, and I STILL didn't catch it.
