
Need help with regular expression for detecting and ignoring url


samij586


I'm trying to write a script that looks at user input and replaces a pair of underscores with <u> tags. That part is fairly trivial. The problem comes when I run into a URL that has underscores in it: at minimum it breaks the link, and at worst it ruins the whole page by leaving an unclosed <u>. From what I've read, I need a negative lookbehind assertion, but my experiments with that have been less than successful.

The best regex I've come up with so far is:

 

/(?:[[:space:]])(?!http:\/\/)\S+/

 

This does find every word that doesn't begin with http://, but in order to do the replacement I need to end up with the whole string with the URLs removed.
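To make that limitation concrete, here is a rough sketch of what a pattern like this matches. It uses Python's `re` module purely for illustration (the thread does not say which language the script is in), and since `re` does not support the POSIX class `[[:space:]]`, it is swapped for `\s`. The names `WORD_RE` and `sample` are made up for this example.

```python
import re

# Illustrative translation of the pattern above; \s stands in for [[:space:]].
WORD_RE = re.compile(r"(?:\s)(?!http://)\S+")

sample = "this is an _underlined link to http://google.com/search?q=_ ._"

# Each match is a separate word with its leading whitespace, not one string
# with the URL cut out. The first word is skipped because nothing precedes it.
matches = WORD_RE.findall(sample)
print(matches)
```

So the pattern hands back isolated words rather than the surrounding text in one piece, which is why it is awkward to use directly for the replacement.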

A few things that may make it easier:

- All URLs, at this point in my script, begin with http://

- There are no newlines in the input

 

An example string that might be processed follows:

Input:

this is an _underlined link to http://google.com/search?q=_ ._

Output:

this is an <u>underlined link to http://google.com/search?q=_ .</u>
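One way to get this result without a lookbehind is to split the input on URLs first and only touch underscores in the non-URL pieces. The following is a minimal sketch of that idea in Python, not a definitive solution. It assumes, as stated above, that every URL starts with http:// and runs to the next whitespace. The function name `underline` and the choice to leave an unpaired underscore as a literal character are assumptions made for this example.

```python
import re

# Assumption from the post: every URL starts with http:// and ends at whitespace.
URL_RE = re.compile(r"(http://\S+)")

def underline(text):
    """Replace pairs of underscores with <u>...</u>, leaving URLs untouched."""
    # With a capturing group, re.split keeps the URLs at the odd indexes.
    parts = URL_RE.split(text)
    out = []
    pending = None  # position in `out` of a <u> that has no </u> yet
    for i, part in enumerate(parts):
        if i % 2 == 1:
            out.append(part)  # a URL: copy it through verbatim
            continue
        for ch in part:
            if ch != "_":
                out.append(ch)
            elif pending is None:
                pending = len(out)
                out.append("<u>")
            else:
                out.append("</u>")
                pending = None
    if pending is not None:
        # Odd number of underscores: put the literal one back so no tag is left open.
        out[pending] = "_"
    return "".join(out)

print(underline("this is an _underlined link to http://google.com/search?q=_ ._"))
```

Because the open/closed state is carried across the URL pieces, a pair of underscores can span a link, as in the example input, and an odd underscore never produces an unclosed tag.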

Archived

This topic is now archived and is closed to further replies.
