
Lookbehind not working?


joe92


Hi there, I am baffled by this.

 

I have two preg_replace calls that together should catch almost every type of hyperlink on a page. Each one has a lookbehind to make sure the hyperlink is not prefixed with 'href="', as that would indicate it has already been through the other preg_replace and placed in an <a> tag. Here are the regexes:

$message = preg_replace("/(?<!href=\")((https?:\/\/)(www\.|ftp\.)?[\w\-]+\.[^\s\<]+)(?!\")/ism", "<a href=\"\\1\" target=\"_blank\">\\1</a>", $message);
$message = preg_replace("/(?<!href=\")(https?:\/\/)?((www|ftp)\.[\w\-]+\.[^\s\<]+)(?!\")/ism", "<a href=\"http://\\2\" target=\"_blank\">\\2</a>", $message);

 

They both work OK except that the lookbehind doesn't seem to work:

Input:
http://www.google.com
www.google.com
http://google.com
www.google.co.uk
http://google.co.uk 

Output:
<a href="http://<a href="http://www.google.com"" target="_blank">www.google.com"</a> target="_blank"><a href="http://www.google.com" target="_blank">www.google.com</a></a>
<a href="http://www.google.com" target="_blank">www.google.com</a>
<a href="http://google.com" target="_blank">http://google.com</a>
<a href="http://www.google.co.uk" target="_blank">www.google.co.uk</a>
<a href="http://google.co.uk" target="_blank">http://google.co.uk</a> 

 

It would appear that the first URL is being processed twice, because both patterns can match it. Does anybody know why this happens even with the lookbehind in place? I thought I had it all working, but it seems I was wrong.

 

Thank you for your help

Joe.


Ok, I think I know what is happening. The first regex turned the link into an <a> tag with the URL as its text. So when the second regex ran, it saw the URL in the middle of the tag and converted that too. But on that premise the result should have looked like the following, which wouldn't explain the <a> tag inside the href:

link:
http://www.google.com

First regex into:
<a href="http://www.google.com">http://www.google.com</a>

Second regex into:
<a href="http://www.google.com"><a href="http://www.google.com">http://www.google.com</a>

 

Ahhhh no, I see now. The second regex says the http:// can be there or not. At the http:// inside the href the lookbehind fails, so the regex gives up at that position. But the engine then moves along and reaches the www., which is also a valid place for the pattern to start. At that point the text just before it is http://, not href=", so the lookbehind passes and the match goes ahead. Ha. Sorry about all this dialogue, I just figured it out as I was writing it. Now to get around that...
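
To see this for yourself, here is a small sketch that runs the second pattern from the first post against what the first preg_replace produces for http://www.google.com. The variable names ($linked, $second, $m) are only for this illustration. It should report two matches: one starting at the www. inside the href (offset 16, right after http://) and one on the link text. Note that the first match also picks up the closing quote, because [^\s\<]+ does not exclude quotes, so the (?!\") at the end never gets a chance to stop it.

```php
<?php
// What the first preg_replace produces for the input "http://www.google.com".
$linked = '<a href="http://www.google.com" target="_blank">http://www.google.com</a>';

// Second pattern from the first post, copied unchanged.
$second = "/(?<!href=\")(https?:\/\/)?((www|ftp)\.[\w\-]+\.[^\s\<]+)(?!\")/ism";

// Collect every match together with its byte offset in $linked.
preg_match_all($second, $linked, $m, PREG_OFFSET_CAPTURE);

foreach ($m[0] as [$text, $offset]) {
    // Show the match and up to 7 characters immediately before it.
    $before = substr($linked, max(0, $offset - 7), min(7, $offset));
    echo "matched '$text' at offset $offset, preceded by '$before'\n";
}
```

The lookbehind only checks the six characters directly before wherever the match starts, so once the start has moved past http:// it no longer sees the href=" at all.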


Well, got it solved. Try this, it works:

 

$message = "
<br/>
<br/>
<br/>
I said https://www.google.com<br/>
I said http://www.google.com<br/>
I said www.google.com<br/>
I said http://google.com yeah<br/>
I said www.google.co.uk maybe<br/>
I said ftp.google.co.uk<br/>
I said http://ftp.google.co.uk<br/>
I said http://google.co.uk";

$message1 = preg_replace("/(?<!href=\")((https?:\/\/)(www\.|ftp\.)?[\w\-]+\.[^\s\<]+)(?!\")/ism", "<a href=\"\\1\" target=\"_blank\">\\1</a>", $message);
$message2 = preg_replace("/(?<!http:\/\/)(?<!https:\/\/)((www|ftp)\.[\w\-]+\.[^\s\<]+)(?!\")/ism", "<a href=\"http://\\1\" target=\"_blank\">\\1</a>", $message1);

echo $message2;

 

Obviously those ftp addresses go nowhere though ;)

 

The first preg_replace catches every link that starts with http:// or https://, with or without a www. after it, so it grabs:

http://www and

http://

 

That leaves only one other way to write a link:

www.

 

That means the second regex only needs to look for links starting with www. (or ftp.), and it has to make sure they are not preceded by http:// or https://. Since a lookbehind has to be fixed length, that meant sticking two lookbehinds in the second regex, one for each scheme. Anyway, 'tis solved now. Posting this explanation should anyone chance upon this in the future.
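
On the fixed-length point: according to the PHP manual, the alternatives inside one lookbehind do not all have to be the same length, as long as each alternative is itself fixed length. So, as far as I can tell, the two lookbehinds can also be written as a single (?<!http:\/\/|https:\/\/). The sketch below compares the two forms on a made-up sample string; $sample, $two, $one, $a and $b are names chosen just for this example, and the rest of the pattern is the one from the post.

```php
<?php
// Hypothetical sample text covering the four cases discussed in the thread.
$sample = 'www.google.com http://www.google.com https://www.google.co.uk ftp.google.co.uk';

// Two separate lookbehinds, as in the solution above.
$two = '/(?<!http:\/\/)(?<!https:\/\/)((www|ftp)\.[\w\-]+\.[^\s\<]+)(?!")/ism';

// One lookbehind with two fixed-length alternatives (7 and 8 characters).
$one = '/(?<!http:\/\/|https:\/\/)((www|ftp)\.[\w\-]+\.[^\s\<]+)(?!")/ism';

preg_match_all($two, $sample, $a);
preg_match_all($one, $sample, $b);

// Both forms should skip the URLs that already carry a scheme.
print_r($a[0]);
print_r($b[0]);
```

Both patterns should pick out only www.google.com and ftp.google.co.uk, leaving the two URLs that already have a scheme for the first preg_replace. Which form to use is a matter of taste; the two-lookbehind version in the post does the same job.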

 

Take care,

Joe

