haku

Regex help

I'm at a bit of a loss with this. Here's the code:

$text = preg_replace($patterns, $replacements, $text);

 

Here's $patterns:

Array
(
   [0] => /(?<!">)\$2010.sa/i
   [1] => /(?<!">)\$2010/i
)

 

And here's $replacements:

Array
(
   [0] => <a href="http://www.example.com/2010/sa">$2010.sa</a>
   [1] => <a href="http://www.example.com/2010">$2010</a>
)

 

Here's $text before the line of code at the top:

$2010 $2010.sa

 

And here's $text after the replacement:

<a href="http://www.example.com/2010">10</a> <a href="http://www.example.com/2010/sa">10.sa</a>

 

As you can see, before the replacement, $replacements contains two HTML anchors whose text is $2010.sa and $2010 respectively. Yet after the replacement, the anchor text has changed from $2010.sa to 10.sa and from $2010 to 10.

 

Can someone see how/why that would happen?

 

 

Actual code:

print_r($patterns);
print_r($replacements);
echo $text . PHP_EOL;
$text = preg_replace($patterns, $replacements, $text);
echo $text;


You need to escape the dot in the first pattern, otherwise it'll be counted as a wildcard.

 

As for why the second replacement happens, it's because you didn't properly escape the dollar sign. You need to add a double-escaped slash in front of it, to send an escaped dollar sign to the engine.

Like this:

$patterns = array ('/(?<!>)\\\\$2010\\.sa/i', '/(?<!>)\\\\$2010/i');


Interesting. That didn't actually work for me, but I found that the problem was in my replacement - I wasn't escaping the $, so it was acting as a back reference. Once I escaped it in my replacements, it all works fine.

 

Array
(
   [0] => <a href="http://en.ba/company/2010/sa">\$2010.sa</a>
   [1] => <a href="http://en.ba/company/2010">\$2010</a>
)
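For anyone landing here later, a minimal sketch of why the anchor text lost its "$20". In a preg_replace() replacement string, a $ followed by digits is read as a backreference, and PHP takes at most two digits, so $2010 is treated as capture group 20 (which doesn't exist in these patterns, so it comes out empty) followed by the literal 10. The variable names $broken, $fixed, $brokenResult and $fixedResult are made up for illustration, and the first pattern here also escapes the dot, as suggested in the earlier reply.

```php
<?php
$text = '$2010 $2010.sa';

// Single-quoted, so PHP itself leaves \$ and \. untouched for the regex engine.
$patterns = [
    '/(?<!">)\$2010\.sa/i',
    '/(?<!">)\$2010/i',
];

// Unescaped: "$20" is parsed as a reference to capture group 20.
$broken = [
    '<a href="http://www.example.com/2010/sa">$2010.sa</a>',
    '<a href="http://www.example.com/2010">$2010</a>',
];

// Escaped: "\$" makes the dollar sign literal in the replacement.
$fixed = [
    '<a href="http://www.example.com/2010/sa">\$2010.sa</a>',
    '<a href="http://www.example.com/2010">\$2010</a>',
];

$brokenResult = preg_replace($patterns, $broken, $text);
$fixedResult  = preg_replace($patterns, $fixed, $text);

echo $brokenResult . PHP_EOL; // anchor text reads "10" and "10.sa"
echo $fixedResult . PHP_EOL;  // anchor text reads "$2010" and "$2010.sa"
```

The negative lookbehind still does its job in the fixed version: the $2010 that ends up inside the first anchor is preceded by ">, so the second pattern leaves it alone.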

 

Thanks for your help - I hadn't actually realized that I hadn't escaped the . in my patterns, so I've included that. It was working on all the other cases I'm testing; this pattern, a $ followed by numbers, was the last one that was failing (we've been testing for months). But it's very possible the unescaped dot could have tripped me up somewhere in the future, so I'm glad you caught that.


D'oh! Turns out I should have been looking at the source, as I broke the RegExp with my double-escaping. >< Sorry about that.
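A rough sketch of what the double-escaping does, assuming the pattern is written in a single-quoted PHP string exactly as in the earlier suggestion. The names $double, $single, $doubleHit and $singleHit are only for illustration.

```php
<?php
// In a single-quoted PHP string, "\\" collapses to one backslash,
// so four backslashes reach the regex engine as two.
$double = '/\\\\$2010/i';   // engine sees: \\$2010
$single = '/\$2010/i';      // engine sees: \$2010

echo $double . PHP_EOL;
echo $single . PHP_EOL;

// With two backslashes, "\\" is a literal backslash and the bare "$" is an
// end-of-subject anchor, so the literal 2010 after it can never match.
$doubleHit = preg_match($double, '$2010');

// With one backslash, "\$" is a literal dollar sign.
$singleHit = preg_match($single, '$2010');

var_dump($doubleHit, $singleHit);
```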

Anyway, glad to see that you found out the proper culprit, in hindsight I should have spotted that too. You're welcome btw, for the thing that I did catch. Those unescaped periods can be quite sneaky indeed, as they match themselves in addition to every other character (except newlines). Makes for some exceptionally sneaky bugs down the road.
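To make the wildcard point concrete, here is a small sketch. The test string $2010xsa and the variable names are invented for the example; preg_quote() is the standard PHP function for escaping literal text before it goes into a pattern.

```php
<?php
$loose  = '/\$2010.sa/i';   // "." matches any character except a newline
$strict = '/\$2010\.sa/i';  // "\." matches only a literal dot

$looseHit  = preg_match($loose, '$2010xsa');   // "x" satisfies the wildcard
$strictHit = preg_match($strict, '$2010xsa');  // no literal dot, so no match

// preg_quote() escapes regex metacharacters (and the given delimiter),
// so the literal text does not have to be escaped by hand.
$quoted = preg_quote('$2010.sa', '/');
$built  = '/(?<!">)' . $quoted . '/i';

var_dump($looseHit, $strictHit, $quoted, $built);
```

Note that preg_quote() only helps on the pattern side; the replacement string still needs its own \$ in front of a literal dollar sign.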


It probably would have too, because this isn't a regex I threw together in a few minutes. It's been refined over a few months, and I STILL didn't catch it :)
