Need help with regular expression for detecting and ignoring url


samij586

I'm trying to write a script that looks at user input and replaces a pair of underscores with <u> tags. On its own this is fairly trivial. The problem comes when I run into a URL that has underscores in it: at minimum it ruins the link, and at worst it ruins the whole page by leaving an unclosed <u>. From what I'm reading, I need a negative lookbehind assertion, but my experiments with that have been less than successful.

The best regex I've come up with so far is:

 

/(?:[[:space:]])(?!http:\/\/)\S+/

 

This does find every word that doesn't begin with http://, but in order to do the replacement, I need to end up with the whole string with the URLs removed.
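To see what that pattern actually captures, here is a small Python sketch. It is only an illustration: the original pattern is PCRE-style, and Python's re module has no POSIX classes, so [[:space:]] is written as \s. The names WORD_NOT_URL and non_url_words are made up for this example.

```python
import re

# Assumption: \s stands in for [[:space:]], which Python's re does not support.
WORD_NOT_URL = re.compile(r"(?:\s)(?!http://)\S+")

def non_url_words(text):
    """Return every match of the pattern, leading whitespace included."""
    return WORD_NOT_URL.findall(text)

sample = "this is an _underlined link to http://google.com/search?q=_ ._"
for piece in non_url_words(sample):
    print(repr(piece))
```

Two things stand out when this runs on the example string. The first word is never matched, because the pattern requires a whitespace character in front of each word, and every match carries that leading whitespace with it. The result is also a list of separate words rather than one string with the URLs taken out, which is the gap described above.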

A few things that may make it easier:

- All URLs, at this point in my script, begin with http://

- There are no newlines in the input

 

An example string that might be processed follows:

Input:

this is an _underlined link to http://google.com/search?q=_ ._

Output:

this is an <u>underlined link to http://google.com/search?q=_ .</u>
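One possible way to get there, sketched in Python rather than in the script's own language: instead of a lookbehind, temporarily swap each URL for a placeholder, replace the underscore pairs in what is left, then put the URLs back. This leans on the two assumptions stated above (every URL starts with http:// and the input has no newlines) and adds one more of its own, namely that the input never contains a NUL character, which is used as the placeholder delimiter. The function name underline and the pattern names are hypothetical.

```python
import re

URL_RE = re.compile(r"http://\S+")          # a URL runs until the next whitespace
PAIR_RE = re.compile(r"_(.*?)_")            # shortest text between two underscores
PLACEHOLDER_RE = re.compile(r"\x00(\d+)\x00")

def underline(text):
    """Wrap _..._ pairs in <u> tags while leaving underscores in URLs alone."""
    urls = []

    def stash(match):
        # Remember the URL and leave a numbered NUL-delimited marker behind.
        urls.append(match.group(0))
        return "\x00%d\x00" % (len(urls) - 1)

    masked = URL_RE.sub(stash, text)
    marked = PAIR_RE.sub(r"<u>\1</u>", masked)
    # Put each stored URL back where its marker sits.
    return PLACEHOLDER_RE.sub(lambda m: urls[int(m.group(1))], marked)

print(underline("this is an _underlined link to http://google.com/search?q=_ ._"))
```

With the example input, the underscore inside the URL is hidden while the pairs are matched, so the opening and closing underscores pair up around the link. A lone, unpaired underscore outside a URL is left untouched, so it cannot produce an unclosed tag.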
