mrbean Posted December 11, 2011

Hi,

I have spent the whole day trying to fix this, but it still doesn't work. I am trying to match only URLs.

This is the pattern I tried:

(((https|http):\/\/)|www\.|)[a-zA-Z1-9-]{0,9}(\.[a-zA-Z1-9-]{1,5}\.[a-zA-Z1-9-]{1,5}|\.[a-zA-Z1-9-]{1,5})

It must match these URLs:

google.com
www.google.com
http://google.com
https://google.com
http://www.google.com
https://www.google.com
google.co.uk
www.google.co.uk
http://google.co.uk
https://google.co.uk
http://www.google.co.uk
https://www.google.co.uk

But it doesn't work. Can someone please help me with this?

Thank you in advance for your support.

Winstons Posted December 11, 2011

Your RegExp is big and bad. Try my code:

$str = 'www.google.com http://google.com https://google.com http://www.google.com https://www.google.com google.co.uk www.google.co.uk http://google.co.uk https://google.co.uk http://www.google.co.uk https://www.google.co.uk';
preg_match_all("#((?:https?://)?(?:www\.)?[-a-z\d]{1,9}\.[-a-z\d]{2,5}(?:\.[-a-z\d]{2,4})?)#is", $str, $match);
echo '<pre>'.htmlspecialchars(print_r($match, 1)).'</pre>';

Result is:

Array
(
    [0] => Array
        (
            [0] => www.google.com
            [1] => http://google.com
            [2] => https://google.com
            [3] => http://www.google.com
            [4] => https://www.google.com
            [5] => google.co.uk
            [6] => www.google.co.uk
            [7] => http://google.co.uk
            [8] => https://google.co.uk
            [9] => http://www.google.co.uk
            [10] => https://www.google.co.uk
        )

    [1] => Array
        (
            [0] => www.google.com
            [1] => http://google.com
            [2] => https://google.com
            [3] => http://www.google.com
            [4] => https://www.google.com
            [5] => google.co.uk
            [6] => www.google.co.uk
            [7] => http://google.co.uk
            [8] => https://google.co.uk
            [9] => http://www.google.co.uk
            [10] => https://www.google.co.uk
        )

)

mrbean Posted December 11, 2011

That doesn't work either. If my string is www.goog, it still matches, but www.goog isn't a complete URL.

Winstons Posted December 11, 2011

"www.goog isn't a complete url"

As far as the pattern is concerned, it is. "www" is a valid domain name, and "goog" also fits the part of the pattern after the dot, so the RegExp treats the whole string as a correct URL. If you want to get only correct URLs, you have to enumerate a list of domains. Try this:

$str = '
www.google.com
http://google.com
https://google.com
http://www.google.com
https://www.google.com
google.co.uk
www.google.co.uk
http://google.co.uk
https://google.co.uk
http://www.google.co.uk
https://www.google.co.uk
www.goo
go.ru
google.lol
';
// The label after the first dot must be one of the listed domains.
// \b keeps "co" from matching only the first two letters of "com".
preg_match_all("#(?:https?://)?(?:www\.)?[-a-z\d]{2,9}\.(?:co|com|uk|us|ru|org|net)\b(?:\.[-a-z\d]{2,4})?#is", $str, $match);
echo '<pre>'.(print_r($match, 1)).'</pre>';
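
The difference between the two approaches can be checked on the problem string itself. This is only a sketch: `$loose` is the pattern from the first reply without its outer capture group, `$strict` is a simplified variant assumed here for illustration (a hard-coded domain list followed by a word boundary), and `first_match()` is a hypothetical helper, not something from the posts above.

```php
<?php
// Pattern from the first reply: any 2-5 character label is accepted after the dot.
$loose = '#(?:https?://)?(?:www\.)?[-a-z\d]{1,9}\.[-a-z\d]{2,5}(?:\.[-a-z\d]{2,4})?#i';

// Illustrative variant (an assumption, not the poster's exact code):
// the label after the first dot must come from a fixed list.
$strict = '#(?:https?://)?(?:www\.)?[-a-z\d]{2,9}\.(?:co|com|uk|us|ru|org|net)\b(?:\.[-a-z\d]{2,4})?#i';

// Hypothetical helper: returns the first match of $pattern in $subject, or null.
function first_match(string $pattern, string $subject): ?string
{
    return preg_match($pattern, $subject, $m) === 1 ? $m[0] : null;
}

// Loose pattern: the optional "www." is skipped on backtracking,
// so "www" is read as the domain and "goog" as the top-level label.
var_dump(first_match($loose, 'www.goog'));

// Strict pattern: "goog" is not in the list, so nothing matches.
var_dump(first_match($strict, 'www.goog'));

// A real address from the original list still matches in full.
var_dump(first_match($strict, 'www.google.co.uk'));
```

The trade-off is that a fixed list has to be maintained by hand; any domain ending that is missing from it is silently rejected.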

ragax Posted December 20, 2011

Hi MrBean,

I made a simple expression to match all your URLs but not www.goog:

(?i)\b(?:https?://)?(?(?=www\.)www\.)(?:[-a-z\d]+\.)+[a-z]{2,4}

There are a million ways to match URLs, so depending on your needs, you may want to tweak it.

Is this what you were looking for? Let me know if I can help further.
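
For anyone who wants to try this expression from PHP, here is a minimal test sketch. It assumes the dots after "www" are escaped, moves the inline `(?i)` into the `i` flag, and uses `~` as the delimiter because the pattern contains slashes. `matched_text()` is a hypothetical helper, and the lists are just the strings from this thread.

```php
<?php
// The expression above, wrapped in PHP delimiters with the case-insensitive flag.
$pattern = '~\b(?:https?://)?(?(?=www\.)www\.)(?:[-a-z\d]+\.)+[a-z]{2,4}~i';

// URLs from the original question that must match.
$shouldMatch = [
    'google.com', 'www.google.com', 'http://google.com', 'https://google.com',
    'http://www.google.com', 'https://www.google.com',
    'google.co.uk', 'www.google.co.uk', 'http://google.co.uk', 'https://google.co.uk',
    'http://www.google.co.uk', 'https://www.google.co.uk',
];

// Incomplete address that must be rejected.
$shouldNotMatch = ['www.goog'];

// Hypothetical helper: returns the matched text, or null when nothing matches.
function matched_text(string $pattern, string $subject): ?string
{
    return preg_match($pattern, $subject, $m) === 1 ? $m[0] : null;
}

foreach (array_merge($shouldMatch, $shouldNotMatch) as $candidate) {
    $found = matched_text($pattern, $candidate);
    printf("%-28s %s\n", $candidate, $found === null ? 'no match' : "matched: $found");
}
```

The conditional `(?(?=www\.)www\.)` is what rejects www.goog: once the text starts with "www.", that prefix must be consumed, so "www" cannot be reused as the domain label. Note that the pattern has no boundary at the end, so a longer final label is cut at four letters (google.community matches as google.comm); appending `\b` is one possible tweak if that matters.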