The14thGOD Posted August 18, 2009

Sorry, couldn't resist. I'm looking for a preg_replace that matches all the ways to write a URL and then replaces it with a working link (yep...). Here's what I've got so far.

<?php
$row['body'] = preg_replace('/^(https?:\/\/)|(www.)?([a-z0-9\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/','<a href="\\1\\2" target="_blank">\\2</a>',$row['body']);
?>

I'm not sure how to make it so that the http and www parts can both be there, only one of them, or neither. I'm not sure I even have it written right (probably not). I'm also not sure how to write the 2nd part, since the http(s)/www is optional. I think it could be like this, but it is kind of long; I'm assuming it could be chopped down a bit?

<?php
$row['body'] = preg_replace('/^(https?:\/\/|https?:\/\/www.|www.)?([a-z0-9\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/','<a href="\\1\\2" target="_blank">\\2</a>',$row['body']);
?>

Any help/improvements are greatly appreciated.

Justin

Link to comment https://forums.phpfreaks.com/topic/170892-1-url-regex-to-rule-them-all/
thebadbad Posted August 18, 2009

Aah, the ever-returning URL regex. I once tried to write my own, but gave up when I got to the last components. Got some parts right though, so have a look if you want:

<?php
function is_url($url = false) {
    if ($url === false) {
        return false;
    }
    $filefolderchars = '[a-z0-9+´!"¤%&()=`@£$€^¨\~*\';,.-]';
    $pattern = '~'. #opening pattern delimiter
        '^'. #start of string
        '[a-z][a-z0-9+.-]*://'. #scheme
        ''. #userinfo (optional) (not implemented)
        '(?:'. #hostname/IP
            '((?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,6})'. #hostname (1 or more subdomains, length 1-63; TLD, length 2-6)
            '|'. #or
            '(?:(?:(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.){3}(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9]))'. #IP
        ')'.
        '(?::(?:6553[0-5]|655[0-2][0-9]|65[0-4][0-9]{2}|6[0-4][0-9]{3}|[1-5][0-9]{4}|[1-9][0-9]{0,3}|0))?'. #port (range 0-65535) (optional)
        '/?'. #trailing slash (optional)
        '(?:'. #rest is optional
            '(?<=/)'. #must be preceded by a slash
            '(?:' . $filefolderchars . '+/?)*'. #path and filename (optional)
            #'(?:;(?:[^=;]+(?:=[^;]+)?)+)?'. #parameter
        ')?'.
        '$'. #end of string
        '~'. #closing pattern delimiter
        'iD'; #pattern modifiers (case-insensitivity, $ end-only)
    $test = preg_match($pattern, $url, $matches);
    #group 1 (the hostname) is not set when the IP alternative matched
    if ($test && (!isset($matches[1]) || strlen($matches[1]) <= 253)) {
        return true;
    } else {
        return false;
    }
}
?>
roopurt18 Posted August 18, 2009

I don't have the time to help you with your URL regexp, but I can offer a piece of advice. Whenever I try to write a really nasty regexp, I break it apart into pieces.

$protocol = '(http|https)';
$domain = '(regexp_to_match_domain)';
$url_regexp = "/{$protocol}{$domain}/"; // then combine them together

It will take a bit of time to get it right, but by doing it this way you divide and conquer, and you can easily test any one of the individual parts. Also, read the specification from W3C on URIs.
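A rough sketch of how the divide-and-conquer advice above might look once the placeholders are filled in. The piece names ($scheme, $www, $host, $path) and their patterns are assumptions made for illustration, not roopurt18's code, and the host piece is deliberately loose rather than spec-accurate.

```php
<?php
// Hypothetical building blocks; each one can be tested on its own.
$scheme = '(?:https?://)?';               // optional http:// or https://
$www    = '(?:www\.)?';                   // optional leading www.
$host   = '(?:[a-z0-9-]+\.)+[a-z]{2,6}';  // one or more labels, then a TLD
$path   = '(?:/[^\s<]*)?';                // optional path, up to whitespace

// Combine the pieces. "~" is the delimiter, so "/" needs no escaping.
$url_regexp = "~^{$scheme}{$www}{$host}{$path}$~i";

// Try the combined pattern against a few sample strings.
$samples = ['http://example.com', 'www.example.com/stuff/hi.html', 'example'];
foreach ($samples as $candidate) {
    $result = preg_match($url_regexp, $candidate) ? 'match' : 'no match';
    printf("%-32s %s\n", $candidate, $result);
}
```

Because every piece is a plain string, a single piece can be checked in isolation by wrapping it the same way, for example "~^{$host}$~i".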
Garethp Posted August 19, 2009

I'd use something like

~(https?://)?(www\.)?([a-zA-Z0-9]+?\.)?[a-zA-Z0-9]+\.[a-zA-Z]{2,3}(\.[a-zA-Z]{2,3})?(.+)?$~

This makes the http part (with an optional s) optional and the www. optional. Then comes an optional subdomain, then the main domain followed by a dot, then the .com or .net or whatever, then an optional country code, and finally it matches anything that comes after.
The14thGOD (Author) Posted August 21, 2009

Thank you all for your replies. Sorry, I haven't had internet for the last couple of days, so I was unable to look at these and respond in a reasonable amount of time.

Garethp, this looks pretty good (I'm not amazing at RegEx, and I'll have to look up some things again to fully understand it).

roopurt18, that's a good idea and a lot easier to read, haha.

thebadbad, thank you. When I have a chance I might dip deeper into this, though I don't know if I'll need that much URL validation.

Does anyone have any suggestions on how I could put this together as a hyperlink (HTML)? I'd rather avoid using the full URL as the visible link text, because it can look kind of ugly. I'd like it to be something like:

URL: http://www.somerandomsite.com/stuff/hi.html
<a href="http://www.somerandomsite.com/stuff/hi.html">somerandomsite.com/stuff/hi.html</a>

Slightly better to look at. It would be ideal to just fit it into the text (instead of showing the URL, it's shortened to "this site" or something), but I don't think that's possible... I can think of a way, but I don't think it would be very user friendly, and it would probably be more work than it's worth. CMSs are fun!
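One possible way to get the shortened link text asked about above is preg_replace_callback, which lets the href and the visible label be built separately. This is only a sketch under assumptions: linkify() is a hypothetical helper name, the pattern is a loose one in the spirit of the replies rather than a full URL matcher (it will also pick up things like file names that look like domains), and defaulting to http:// when no scheme is present is a guess at the desired behaviour.

```php
<?php
// Hypothetical helper: wrap URLs found in $text in <a> tags whose visible
// label drops the scheme and a leading "www.".
function linkify($text)
{
    // Group 1: optional scheme, group 2: optional www., group 3: host + path.
    $pattern = '~\b(https?://)?(www\.)?((?:[a-z0-9-]+\.)+[a-z]{2,6}\b(?:/[^\s<]*)?)~i';

    return preg_replace_callback($pattern, function ($m) {
        // Assumption: fall back to http:// when the text had no scheme.
        $scheme = $m[1] !== '' ? $m[1] : 'http://';
        $href   = $scheme . $m[2] . $m[3];

        return '<a href="' . htmlspecialchars($href) . '" target="_blank">'
             . htmlspecialchars($m[3]) . '</a>';
    }, $text);
}

echo linkify('See http://www.somerandomsite.com/stuff/hi.html for details'), "\n";
```

The callback receives the same capture groups a plain preg_replace would expose as \\1, \\2 and \\3, so the label can simply leave out the first two.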