mrbean, posted December 11, 2011

Hi,

I have worked all day to fix this, but it still doesn't work. I am trying to match only URLs.

What I tried is this pattern:

    (((https|http):\/\/)|www\.|)[a-zA-Z1-9-]{0,9}(\.[a-zA-Z1-9-]{1,5}\.[a-zA-Z1-9-]{1,5}|\.[a-zA-Z1-9-]{1,5})

It must match these URLs:

    google.com
    www.google.com
    http://google.com
    https://google.com
    http://www.google.com
    https://www.google.com
    google.co.uk
    www.google.co.uk
    http://google.co.uk
    https://google.co.uk
    http://www.google.co.uk
    https://www.google.co.uk

But it doesn't work. Can someone please help me with this?

Thank you in advance for your support.

Link to comment: https://forums.phpfreaks.com/topic/252933-php-pcre-url-matching-pattern/

Winstons, posted December 11, 2011

Your RegExp is big and bad. Try my code:

    $str = 'www.google.com http://google.com https://google.com http://www.google.com https://www.google.com google.co.uk www.google.co.uk http://google.co.uk https://google.co.uk http://www.google.co.uk https://www.google.co.uk';
    // Optional scheme, optional "www.", a host label, a 2-5 character TLD,
    // and an optional second-level part such as ".uk".
    preg_match_all('#((?:https?://)?(?:www\.)?[-a-z\d]{1,9}\.[-a-z\d]{2,5}(?:\.[-a-z\d]{2,4})?)#is', $str, $match);
    echo '<pre>'.htmlspecialchars(print_r($match, true)).'</pre>';

Result is:

    Array
    (
        [0] => Array
            (
                [0] => www.google.com
                [1] => http://google.com
                [2] => https://google.com
                [3] => http://www.google.com
                [4] => https://www.google.com
                [5] => google.co.uk
                [6] => www.google.co.uk
                [7] => http://google.co.uk
                [8] => https://google.co.uk
                [9] => http://www.google.co.uk
                [10] => https://www.google.co.uk
            )

        [1] => Array
            (
                [0] => www.google.com
                [1] => http://google.com
                [2] => https://google.com
                [3] => http://www.google.com
                [4] => https://www.google.com
                [5] => google.co.uk
                [6] => www.google.co.uk
                [7] => http://google.co.uk
                [8] => https://google.co.uk
                [9] => http://www.google.co.uk
                [10] => https://www.google.co.uk
            )

    )

mrbean (Author), posted December 11, 2011

That doesn't work either. If my string is www.goog, it matches, but www.goog isn't a complete URL.

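For anyone reproducing this, here is a minimal check. It assumes the pattern from Winstons' reply is used unchanged; only the PHP quoting differs, which does not alter the pattern. It is a sketch for illustration, not part of the original posts.

```php
<?php
// Pattern copied from the reply above (single-quoted so backslashes reach PCRE as written).
$pattern = '#((?:https?://)?(?:www\.)?[-a-z\d]{1,9}\.[-a-z\d]{2,5}(?:\.[-a-z\d]{2,4})?)#is';

$found = preg_match($pattern, 'www.goog', $m);

// With "www." consumed as the prefix, "goog" has no dot after it, so PCRE backtracks,
// skips the optional prefix, and reads "www" as the host label and "goog" as the TLD.
var_dump($found);  // int(1)
echo $m[0], "\n";  // www.goog
```

So the match is legal as far as the pattern is concerned: nothing in it says which strings are real top-level domains.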
Winstons, posted December 11, 2011

Quote: "www.goog isn't a complete url"

www is a valid domain name, and goog fits the pattern too, so the RegExp treats www.goog as a correct match. If you want to get only correct URLs, you must enumerate a list of allowed domains. Try it:

    $str = '
    www.google.com
    http://google.com
    https://google.com
    http://www.google.com
    https://www.google.com
    google.co.uk
    www.google.co.uk
    http://google.co.uk
    https://google.co.uk
    http://www.google.co.uk
    https://www.google.co.uk
    www.goo
    go.ru
    google.lol
    ';
    // The domain after the host label must come from a fixed list.
    // "com" is listed before "co" because PCRE tries alternatives left to right,
    // and \b keeps a match from ending in the middle of a word.
    preg_match_all('#(?:https?://)?(?:www\.)?[-a-z\d]{2,9}\.(?:com|co|uk|us|ru|org|net)(?:\.[-a-z\d]{2,4})?\b#is', $str, $match);
    echo '<pre>'.(print_r($match, true)).'</pre>';

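A sketch of the same allowlist idea with the domain list kept in an array. The helper name `find_urls`, the sample text, and the choice to sort the list longest-first are my assumptions for illustration, not something from the thread. The sorting matters because PCRE tries alternatives left to right: with `co` ahead of `com` and nothing forcing the match to continue, `google.com` can be reported as `google.co`.

```php
<?php
// Hypothetical helper: builds the pattern from a TLD allowlist instead of hard-coding it.
function find_urls(string $text, array $tlds): array
{
    // Longest entries first, so "com" is tried before "co".
    usort($tlds, function ($a, $b) {
        return strlen($b) - strlen($a);
    });
    $alt = implode('|', array_map(function ($t) {
        return preg_quote($t, '#');
    }, $tlds));

    // Optional scheme, optional "www.", host label, allowlisted domain,
    // optional second-level part, then a word boundary.
    $pattern = '#(?:https?://)?(?:www\.)?[-a-z\d]{2,9}\.(?:' . $alt . ')(?:\.[-a-z\d]{2,4})?\b#i';

    preg_match_all($pattern, $text, $m);
    return $m[0];
}

$tlds = ['co', 'com', 'uk', 'us', 'ru', 'org', 'net'];
$urls = find_urls('www.google.com https://google.co.uk www.goo go.ru google.lol', $tlds);
print_r($urls);  // www.google.com, https://google.co.uk, go.ru
```

`www.goo` and `google.lol` are skipped because neither `goo` nor `lol` is in the list. The list here is only the handful of domains used in the thread, so real input would need a fuller one.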
ragax, posted December 20, 2011

Hi MrBean,

I made a simple expression to match all your URLs but not www.goog:

    (?i)\b(?:http[s]?://)?(?(?=www\.)www\.)(?:[-a-z\d]+\.)+[a-z]{2,4}

There are a million ways to match URLs, so depending on your needs, you may want to tweak it.

Is this what you were looking for? Let me know if I can help further.

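The expression above is given without delimiters, so here is a minimal sketch of running it from PHP. The `~` delimiter and the sample string are my assumptions, and the dots in `www\.` are escaped so that they match a literal dot only.

```php
<?php
// ragax's expression wrapped in "~" delimiters; (?i) makes it case-insensitive.
$pattern = '~(?i)\b(?:http[s]?://)?(?(?=www\.)www\.)(?:[-a-z\d]+\.)+[a-z]{2,4}~';

$str = 'google.com http://www.google.co.uk www.goog https://google.com';
preg_match_all($pattern, $str, $m);

// The conditional (?(?=www\.)www\.) consumes "www." whenever it is present, so
// "www" cannot be reused as the host label and "www.goog" is left unmatched.
print_r($m[0]);  // google.com, http://www.google.co.uk, https://google.com
```

Note that the final `[a-z]{2,4}` accepts any two to four letters, so this version does not check the domain against a list of known ones.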