53329 Posted September 4, 2008

Hi, I've found a few PHP scripts that can properly get the hostname from a URL (most of the time...), but they always seem to fail if it's an IP. They keep cutting off the first octet, thinking that it is a subdomain. It sounds simple, but for some reason I can't figure out how to tell if the URL I have is a domain name like:

    http://subdomain.example.com/index.php

or if it's just an IP like:

    http://72.14.207.99/index.php

If I could determine that a URL is an IP it would save me so much time. Either finding out how to do that, or a PHP script that actually parses URLs well, would solve all of my problems.
effigy Posted September 4, 2008

Do you want to match both, or just IPs (see below)?

<pre>
<?php
$data = <<<DATA
http://subdomain.example.com/index.php
or if it's just an IP like:
http://72.14.207.99/index.php
DATA;
// Match URLs whose host starts with digits and dots; the lookbehind
// keeps trailing punctuation out of the match.
preg_match_all('%http://[\d.]+\S+(?<!\p{P})%', $data, $matches);
print_r($matches);
?>
</pre>
nrg_alpha Posted September 4, 2008

You can do a crude format check...

    $str = 'http://72.14.207.99/index.php';
    // $str = 'http://subdomain.example.com/index.php'; // uncomment this line and comment out the line above to test the other way around...
    echo (preg_match('#http://[0-9]+(\.[0-9]+)+/#', $str)) ? 'true' : 'false';

While this pattern doesn't check whether the IP is validly formatted, it does check that the host has an IP-like format (of some sort).
53329 Posted September 5, 2008

Thanks a lot. I'll use the second one simply because a true/false works better in this scenario. I was using it to referer-block the domains of common hotlinkers. If it's an IP I just lose the http and everything after the first forward slash. If it's not, I run it through my other script. Works like a charm.
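The "lose the http and everything after the first forward slash" step might look roughly like the sketch below. The blocklist entries and the function names `referer_host` and `is_blocked_referer` are made up for illustration and do not come from the thread; the pattern also assumes the referer carries no `user:pass@` part.

```php
<?php
// Hypothetical blocklist; these entries are placeholders, not from the thread.
$blocked_hosts = array('72.14.207.99', 'hotlinker.example.com');

// Returns the host part of a referer: the scheme is dropped, and so is
// everything from the first slash (or port, query, fragment) onward.
// Returns null when the string does not look like an http(s) URL.
function referer_host($referer)
{
    if (preg_match('~^https?://([^/:?#]+)~i', $referer, $m)) {
        return strtolower($m[1]);
    }
    return null;
}

// True when the referer's host is on the blocklist.
function is_blocked_referer($referer, array $blocked_hosts)
{
    $host = referer_host($referer);
    return $host !== null && in_array($host, $blocked_hosts, true);
}

var_dump(is_blocked_referer('http://72.14.207.99/index.php', $blocked_hosts));
var_dump(is_blocked_referer('http://subdomain.example.com/index.php', $blocked_hosts));
```

Because the comparison is an exact host match, IP-based and name-based referers can share one list here without deciding first which kind of host it is.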
53329 Posted September 5, 2008

Actually, I may as well see if there is a fix for this because it would save a lot of time. I played around with it and made the IP regex a little more refined. If I get a match for the IP, how do I omit the line for the host, subdomain and domain? As is, I can only get one or the other to work by alternating commented lines.

    $url = 'http://72.14.207.99:80/index.php?test=1#2';
    //$url = 'http://subdomain.example.com/index.php';
    $r = "^(?:(?P<scheme>\w+)://)?";
    $r .= "(?:(?P<login>\w+):(?P<pass>\w+)@)?";
    $r .= "(?:(?P<ip>(([0-9]{1,3}+\.){3}+[0-9]{1,3})))?";
    //$r .= "(?P<host>(?:(?P<subdomain>[-\w\.]+)\.)?" . "(?P<domain>[-\w]+\.(?P<extension>\w+)))";
    $r .= "(?::(?P<port>\d+))?";
    $r .= "(?P<path>[\w/]*/(?P<file>\w+(?:\.\w+)?)?)?";
    $r .= "(?:\?(?P<arg>[\w=&]+))?";
    $r .= "(?:#(?P<anchor>\w+))?";
    $r = "!$r!"; // Delimiters
    preg_match($r, $url, $parsed);
    print_r($parsed);
nrg_alpha Posted September 5, 2008

That's quite the mind-boggling expression you have going there.. As far as being able to determine whether it is an IP-based URL or not, could you not do something a little simpler?

    //$str = 'http://72.14.207.99/index.php';
    $str = 'http://subdomain.example.com/index.php';
    if (preg_match('#http://([0-9]+(\.[0-9]+)+)/#', $str, $match)) {
        // IP address detected...
        echo $match[1]; // your isolated IP address.. do what you will with it..
    } else {
        // Non-IP URL found instead...
        echo 'Non IP URL detected';
    }
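Another way to keep it simple, sketched here as an alternative that nobody in the thread proposed, is to let PHP's built-in `parse_url()` isolate the host and `filter_var()` decide whether it is an IPv4 address. The helper name `host_is_ipv4` is hypothetical. Unlike the crude pattern, this also rejects out-of-range octets such as 256.

```php
<?php
// Hypothetical helper (not from the thread): returns true when the host
// part of $url is a literal, in-range IPv4 address.
function host_is_ipv4($url)
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return false; // no host component, or URL too malformed to parse
    }
    return filter_var($host, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) !== false;
}

var_dump(host_is_ipv4('http://72.14.207.99/index.php'));          // bool(true)
var_dump(host_is_ipv4('http://subdomain.example.com/index.php')); // bool(false)
var_dump(host_is_ipv4('http://72.14.256.99/index.php'));          // bool(false)
```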
53329 Posted September 5, 2008

Well, I can grab the IP no problem now. It's combining it with my URL parsing script that is the problem. Put simply, I need a way for the backreference "host" not to match anything if the backreference "ip" does. If I can find a regex to do that, I can probably work it in.
53329 Posted September 5, 2008

Fixed. Here is the code for anyone who wants it. It should parse any URL conceivable (I stress should).

    //$str = 'http://72.14.256.99:80/index.php?test=1#2';
    $str = 'http://subdomain.example.com/index.php';
    $r = "^(?:(?P<scheme>\w+)://)?";
    $r .= "(?:(?P<login>\w+):(?P<pass>\w+)@)?";
    $ip = "(?:[0-9]{1,3}+\.){3}+[0-9]{1,3}"; // checks for ip
    $not_ip = "(?:(?P<subdomain>[-\w\.]+)\.)?(?P<domain>[-\w]+\.(?P<extension>\w+))"; // always works so we need a conditional
    $r .= "(?P<host>(?(?=" . $ip . ")(?P<ip>" . $ip . ")|" . $not_ip . "))";
    $r .= "(?::(?P<port>\d+))?";
    $r .= "(?P<path>[\w/]*/(?P<file>\w+(?:\.\w+)?)?)?";
    $r .= "(?:\?(?P<arg>[\w=&]+))?";
    $r .= "(?:#(?P<anchor>\w+))?";
    $r = "!$r!"; // Delimiters
    preg_match($r, $str, $parsed);
    print_r($parsed);
    // empty() avoids an undefined-index notice when a group did not take part in the match
    if (!empty($parsed['ip'])) {
        if (long2ip(ip2long($parsed['ip'])) == $parsed['ip']) {
            echo "<br />Valid IP";
        } else {
            echo "<br />Invalid IP";
        }
    }
    if (!empty($parsed['domain'])) {
        // not an ip
    }

Thanks nrg_alpha and effigy.
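The piece doing the work in that fix is the PCRE conditional subpattern `(?(?=X)A|B)`: if the lookahead `X` succeeds at the current position, the pattern continues with `A`, otherwise with `B`. The sketch below isolates just that idea on a bare host string. The pattern pieces are simplified from the code above, the function name `classify_host` is made up, and the extra `(?![-\w.])` guard is an addition that is not in the posted code.

```php
<?php
// Sketch only: pattern pieces are simplified from the thread's code.
$ip   = '(?:[0-9]{1,3}\.){3}[0-9]{1,3}';
$name = '[-\w.]+';

// (?(?=X)A|B): if lookahead X succeeds here, match A, otherwise match B.
// The added (?![-\w.]) keeps "1.2.3.4.example.com" out of the IP branch.
$re = '~^(?(?=' . $ip . '(?![-\w.]))(?P<ip>' . $ip . ')|(?P<name>' . $name . '))~';

// Hypothetical helper: reports which branch of the conditional matched.
function classify_host($host, $re)
{
    if (!preg_match($re, $host, $m)) {
        return 'no match';
    }
    // Only one of the two named groups is non-empty after a match.
    return !empty($m['ip']) ? 'ip' : 'name';
}

echo classify_host('72.14.207.99', $re), "\n";          // ip
echo classify_host('subdomain.example.com', $re), "\n"; // name
echo classify_host('1.2.3.4.example.com', $re), "\n";   // name
```

As in the posted code, the `ip` branch only checks the shape of the address, so a range check such as the `long2ip(ip2long(...))` round trip is still needed afterwards.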
angelfashion Posted September 9, 2008

Phew, finally found this solution, hehe.