doa24uk Posted July 31, 2009

Hi guys,

Here's my code. It strips the URL down to site.com rather than http://www.site.com.

$linkurl = "http://site.com";

// Split link to just get domain name
function parse_url_domain ($url) {
    $parsed = parse_url($url);
    $hostname = $parsed['host'];
    return $hostname;
}

$raw_url = parse_url($linkurl);
$domain_only = str_replace('www.', '', $raw_url);
echo $domain_only['host'];
exit();

The problem is that I'm using this to strip URLs that have various structures, and I need to strip them ALL down to sitename.tld, e.g.

http://www4.site2.com
http://site3.com
http://www15.site4.com

Is there a way to tell the script to knock off everything in these cases so we're left with

site2.com
site3.com
site4.com

?
WolfRage Posted July 31, 2009

This could get very complex very quickly. But if those are the only types of URL, then try this.

<?php
$url=array_reverse(explode('.',$url),TRUE);
$url=$url[1].$url[0];
?>

Now if you need it to handle more difficult URLs, let me know and I will take this further.
doa24uk (Author) Posted July 31, 2009

I'm not entirely sure how to integrate that with my initial script ...
WolfRage Posted July 31, 2009

<?php
$url = 'http://site.com';

// Split link to just get domain name
function parse_url_domain ($url) {
    $parsed = parse_url($url);
    $hostname = $parsed['host'];
    return $hostname;
}

$raw_url = parse_url($url);
$url=array_reverse(explode('.',$raw_url),TRUE);
$domain_only=$url[1].$url[0];
echo $domain_only;
exit();
?>
doa24uk (Author) Posted July 31, 2009

This just outputs: Array
WolfRage Posted July 31, 2009

The problem was, you did not actually call your function.

<?php
$url = 'http://site.com';

// Split link to just get domain name
function parse_url_domain ($url) {
    $parsed = parse_url($url);
    $hostname = $parsed['host'];
    return $hostname;
}

$raw_url = parse_url_domain($url);
$url=array_reverse(explode('.',$raw_url),TRUE);
$domain_only=$url[1].$url[0];
echo $domain_only;
exit();
?>
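A minimal sketch of the difference WolfRage points out, using a made-up example URL: parse_url() on its own returns an array of URL parts, so passing its result to explode() cannot work, while asking for the host component yields the plain string that explode() expects. The PHP_URL_HOST form is an alternative to the thread's parse_url_domain() helper, not something the posters used.

```php
<?php
// Example URL chosen for illustration only.
$url = 'http://www2.site.com/path?x=1';

// Without a component argument, parse_url() returns an associative array.
$parts = parse_url($url);
var_dump(is_array($parts));   // bool(true)
echo $parts['host'], "\n";    // www2.site.com

// With PHP_URL_HOST it returns just the host as a string.
$host = parse_url($url, PHP_URL_HOST);
echo $host, "\n";             // www2.site.com
```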
doa24uk (Author) Posted July 31, 2009

Sorry to be a pain, but this isn't working again.

Your exact code:

<?php
$url = 'http://site.com';

// Split link to just get domain name
function parse_url_domain ($url) {
    $parsed = parse_url($url);
    $hostname = $parsed['host'];
    return $hostname;
}

$raw_url = parse_url_domain($url);
$url=array_reverse(explode('.',$raw_url),TRUE);
$domain_only=$url[1].$url[0];
echo $domain_only;
exit();
?>

gives the following results ($url value first, followed by the output):

http://site.com --> comsite
http://www.site.com --> sitewww
http://www2.site.com --> sitewww2

-------------

So I removed the $url[0] to give the following code, but this still gives the wrong result (although it's closer):

<?php
$url = 'http://www2.site.com';

// Split link to just get domain name
function parse_url_domain ($url) {
    $parsed = parse_url($url);
    $hostname = $parsed['host'];
    return $hostname;
}

$raw_url = parse_url_domain($url);
$url=array_reverse(explode('.',$raw_url),TRUE);
$domain_only=$url[1];
echo $domain_only;
exit();
?>

The above code gives the following results:

http://site.com --> com
http://www.site.com --> site
http://www2.site.com --> site

----

So to my mind, what we need to do is strip the .com or .net (I will only be using TLDs with one dot, so no .co.uk for example) and then parse the URL.

I'm one step away, help please!
WolfRage Posted July 31, 2009

OK, change this line

<?php
$url=array_reverse(explode('.',$raw_url),TRUE);
?>

to this:

<?php
$url=array_reverse(explode('.',$raw_url));
?>

Now we need to take it further: you need to check within the array for each TLD and remove it from the array, and you also need to check for each possible form of www, e.g. www1, www2, etc. I will post back soon when I have taken this further.
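A short sketch of why the TRUE argument produced results like "comsite": the second parameter of array_reverse() is preserve_keys, so the order is reversed but every label keeps its original index. Joining with a dot is an addition here; the thread's code concatenated the two labels directly.

```php
<?php
$labels = explode('.', 'www2.site.com');   // [0 => 'www2', 1 => 'site', 2 => 'com']

// preserve_keys = true: reversed order, but each value keeps its old index.
$kept = array_reverse($labels, true);      // [2 => 'com', 1 => 'site', 0 => 'www2']
echo $kept[1] . $kept[0], "\n";            // sitewww2

// Default (preserve_keys = false): the reversed list is re-indexed from 0.
$reindexed = array_reverse($labels);       // [0 => 'com', 1 => 'site', 2 => 'www2']
echo $reindexed[1] . '.' . $reindexed[0], "\n";   // site.com
```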
roopurt18 Posted July 31, 2009

<?php
$urls = array();
$urls[] = 'http://www4.site2.com';
$urls[] = 'http://site3.com';
$urls[] = 'http://www15.site4.com';
$urls[] = 'http://www15.site4.com/alsjdflsjf/asdflajsfdlajsf?lsdjflj&lsdjflsf&lsjdflsjf=alsdjflasjf&lsfjlsdf';

foreach( $urls as $url ) {
    echo get_domain( $url ) . '<br />';
}

/**
 * Extracts the domain from a URL
 *
 * @param string $url
 * @return string
 */
function get_domain( $url ) {
    if( strpos( $url, '://' ) !== false ) {
        list( $throwAway, $url ) = explode( '://', $url );
    }
    if( strpos( $url, '/' ) !== false ) {
        $url = explode( '/', $url );
        $url = array_shift( $url );
    }
    return $url;
}
?>
roopurt18 Posted July 31, 2009

Updated for what you want:

<?php
$urls = array();
$urls[] = 'http://www4.site2.com';
$urls[] = 'http://site3.com';
$urls[] = 'http://www15.site4.com';
$urls[] = 'http://www15.site4.com/alsjdflsjf/asdflajsfdlajsf?lsdjflj&lsdjflsf&lsjdflsjf=alsdjflasjf&lsfjlsdf';

foreach( $urls as $url ) {
    echo get_domain( $url ) . '<br />';
}

/**
 * Extracts the domain from a URL
 *
 * @param string $url
 * @return string
 */
function get_domain( $url ) {
    if( strpos( $url, '://' ) !== false ) {
        list( $throwAway, $url ) = explode( '://', $url );
    }
    if( strpos( $url, '/' ) !== false ) {
        $url = explode( '/', $url );
        $url = array_shift( $url );
    }
    if( preg_match( '/^www[^.]*\..*/', $url ) ) {
        $url = explode( '.', $url );
        array_shift( $url );
        $url = implode( '.', $url );
    }
    return $url;
}
?>
WolfRage Posted July 31, 2009

OK, so regular expression matching blows my method out of the water. But this is what I had come up with so far, although it is far from perfected.

<?php
$www=array('www','www1','www2');
$tld=array('co','com','uk','net','org','us','biz');
$url = 'http://www2.site.com';

// Split link to just get domain name
function parse_url_domain ($url) {
    $parsed = parse_url($url);
    return $parsed['host'];
}

$url=parse_url_domain($url);
echo $url.'<br />';
$url=array_reverse(explode('.',$url));
var_dump($url);
echo '<br />';

foreach($www as $key=>$value) {
    if(in_array($value,$url)) {
        unset($url[array_search($value,$url)]);
    }
}
foreach($tld as $key=>$value) {
    if(in_array($value,$url)) {
        unset($url[array_search($value,$url)]);
    }
}

var_dump($url);
echo '<br />';
$domain_only=$url[1];
echo $domain_only;
exit();
?>

By the way, roopurt18, that is a genius little script. It uses some of what I started and then accounts for any case by using preg_match().
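Since doa24uk stated earlier that only single-label TLDs (.com, .net) are involved, the label-splitting idea can also be sketched without any lookup lists by keeping just the last two labels of the host. The helper name last_two_labels() is hypothetical, and the approach is not from the thread; under that assumption it would return "co.uk" for "site.co.uk", and it drops every subdomain, not only www variants.

```php
<?php
/**
 * Hypothetical helper: returns the last two dot-separated labels of the host.
 * Assumes a single-label TLD, as stated in the thread.
 */
function last_two_labels($url)
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return false; // no host found (for example, the scheme is missing)
    }
    $labels = explode('.', $host);
    // A negative offset takes elements from the end of the array.
    return implode('.', array_slice($labels, -2));
}

$urls = array('http://www4.site2.com', 'http://site3.com', 'http://www15.site4.com');
foreach ($urls as $url) {
    echo last_two_labels($url), "\n";
}
```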
roopurt18 Posted July 31, 2009

I don't know why I didn't just use regexp on the whole thing.

<?php
$urls = array();
$urls[] = 'http://www4.site2.com';
$urls[] = 'http://site3.com';
$urls[] = 'http://www15.site4.com';
$urls[] = 'http://www15.site4.com/alsjdflsjf/asdflajsfdlajsf?lsdjflj&lsdjflsf&lsjdflsjf=alsdjflasjf&lsfjlsdf';

foreach( $urls as $url ) {
    // echo the result directly; wrapping it in print_r() would append a stray "1"
    echo get_domain( $url ) . '<br />';
}

/**
 * Extracts the domain from a URL
 *
 * @param string $url
 * @return string | boolean
 */
function get_domain( $url ) {
    if( preg_match( '@^http://(www[^.]+\.)?([^/]+)@', $url, $matches ) ) {
        return $matches[2];
    }
    return false;
}
?>
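One thing to watch with the pattern as posted: www[^.]+ needs at least one character after "www", so a plain "http://www.site.com" keeps its "www." prefix, and "https://" URLs do not match at all. A possible variant, offered only as a sketch (the name get_domain_relaxed() and the choices to lowercase the result and stop at a port are assumptions, not part of roopurt18's answer):

```php
<?php
/**
 * Hypothetical variant of get_domain(): accepts http and https,
 * and strips "www", "www4", "www15", ... prefixes.
 *
 * @param string $url
 * @return string | boolean
 */
function get_domain_relaxed($url)
{
    // https?   : either scheme
    // www\d*\. : "www" followed by zero or more digits and a dot (optional)
    // [^/?#:]+ : the rest of the host, stopping at a path, query, fragment or port
    if (preg_match('@^https?://(?:www\d*\.)?([^/?#:]+)@i', $url, $matches)) {
        return strtolower($matches[1]);
    }
    return false;
}

echo get_domain_relaxed('http://www.site.com'), "\n";         // site.com
echo get_domain_relaxed('https://www15.site4.com/a?b=c'), "\n"; // site4.com
```

Unlike the last-two-labels idea, this leaves other subdomains such as "blog.site.com" untouched.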
doa24uk (Author) Posted July 31, 2009

Awesome, guys!

@roopurt - that did the trick once I stopped being an idiot and actually looked at the code you'd provided.

@Wolfrage - thank you for all your help. Pity roopurt pipped you to the prize, but we (by that I mean you) would have got there soon enough!

This is part of a larger script I'm designing and I would like to credit both of you. Where would you like the links to point for the credits?
roopurt18 Posted July 31, 2009

No need to provide credit. But if you feel you must, you can just point back at the URL of this topic. Or my user profile.

Let me know if you need my PayPal to pay royalties and such.