
[SOLVED] str_replace different url structures


doa24uk


Hi guys,

 

Here's my code. It strips the URL down to site.com instead of http://www.site.com

 

$linkurl = "http://site.com";


// Split link to just get domain name

function parse_url_domain ($url) {
$parsed = parse_url($url);
$hostname = $parsed['host'];
return $hostname;
}

$raw_url = parse_url($linkurl);
$domain_only =str_replace ('www.','', $raw_url);
echo $domain_only['host'];
exit();

 

The problem is I'm using this to strip URLs that have various structures & need to strip them ALL down to sitename.tld

 

eg.

 

http://www4.site2.com

http://site3.com

http://www15.site4.com

 

Is there a way to tell the script to knock off everything in these cases so we're left with

 

site2.com

site3.com

site4.com

 

?????????

 

:facewall: :facewall:


This could get very complex very quickly, but if those are the only types of URL then try this.

<?php
$url=array_reverse(explode('.',$url),TRUE);
$url=$url[1].$url[0];
?>

Now if you need it to handle more difficult URLs, let me know and I will take this further.


<?php
$url = 'http://site.com';
// Split link to just get domain name

function parse_url_domain ($url) {
  $parsed = parse_url($url);
  $hostname = $parsed['host'];
  return $hostname;
}

$raw_url = parse_url($url);
$url=array_reverse(explode('.',$raw_url),TRUE);
$domain_only=$url[1].$url[0];
echo $domain_only;
exit();
?>


The problem was, you did not actually call your function.

<?php
$url = 'http://site.com';
// Split link to just get domain name

function parse_url_domain ($url) {
  $parsed = parse_url($url);
  $hostname = $parsed['host'];
  return $hostname;
}

$raw_url = parse_url_domain($url);
$url=array_reverse(explode('.',$raw_url),TRUE);
$domain_only=$url[1].$url[0];
echo $domain_only;
exit();
?>
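
For context on why the earlier attempt failed: `parse_url()` without a second argument returns an associative array, so its result cannot be handed straight to `explode()`. A minimal sketch, assuming only the host is needed, is to ask for the `PHP_URL_HOST` component, which comes back as a string. The variable names and the sample URL here are illustrative only.

```php
<?php
$url = 'http://www2.site.com/page?x=1';

// Without a component argument, parse_url() returns an array of parts.
$parts = parse_url($url);
// $parts['host'] holds the host; $parts itself is not a string.

// With PHP_URL_HOST, parse_url() returns just the host as a string.
$host = parse_url($url, PHP_URL_HOST);

echo $host, "\n"; // www2.site.com
```

Either form gives the same host; the wrapper function in the post above does the array lookup for you.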


Sorry to be a pain but this isn't working again.....

 

Your exact code -

 

<?php
$url = 'http://site.com';
// Split link to just get domain name

function parse_url_domain ($url) {
  $parsed = parse_url($url);
  $hostname = $parsed['host'];
  return $hostname;
}

$raw_url = parse_url_domain($url);
$url=array_reverse(explode('.',$raw_url),TRUE);
$domain_only=$url[1].$url[0];
echo $domain_only;
exit();
?>

 

This gives the following results ($url value first, followed by the output):

 

http://site.com -->  comsite

http://www.site.com --> sitewww

http://www2.site.com --> sitewww2

 

-------------

 

So I removed the $url[0] to give the following code, but this is still erroring (although it's closer)

 

<?php
$url = 'http://www2.site.com';
// Split link to just get domain name

function parse_url_domain ($url) {
  $parsed = parse_url($url);
  $hostname = $parsed['host'];
  return $hostname;
}

$raw_url = parse_url_domain($url);
$url=array_reverse(explode('.',$raw_url),TRUE);
$domain_only=$url[1];
echo $domain_only;
exit();
?>

 

The above code gives the following results

 

http://site.com -->  com

http://www.site.com --> site

http://www2.site.com --> site

 

----

 

 

So to my mind, what we need to do is strip the .com or .net (I will only be using TLDs with one . - so no .co.uk for example) and then parse the URL.....

 

I'm one step away, help please!


OK, change this line

<?php
$url=array_reverse(explode('.',$raw_url),TRUE);
?>

to this:

<?php
$url=array_reverse(explode('.',$raw_url));
?>

Now we need to take it further: you need to check the array for each TLD and remove it, and you also need to check for each possible www prefix, e.g. www1, www2, etc. I will post back soon when I have taken this further.
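
The difference between the two lines comes down to the second argument of `array_reverse()`. Here is a small sketch of what each call produces for the host `www2.site.com` (the variable names are only for illustration):

```php
<?php
$labels = explode('.', 'www2.site.com'); // ['www2', 'site', 'com']

// preserve_keys = true: the order is reversed, but each value keeps its old
// key, so index 1 and index 0 still refer to 'site' and 'www2'.
$kept = array_reverse($labels, true);
echo $kept[1] . $kept[0], "\n"; // sitewww2

// Default (preserve_keys = false): numeric keys are renumbered from 0,
// so index 0 is the TLD and index 1 is the site name.
$renumbered = array_reverse($labels);
echo $renumbered[1] . '.' . $renumbered[0], "\n"; // site.com
```

With the renumbered array, the TLD is always at index 0 and the site name at index 1, however many subdomain labels precede them.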


<?php
$urls = array();
$urls[] = 'http://www4.site2.com';
$urls[] = 'http://site3.com';
$urls[] = 'http://www15.site4.com';
$urls[] = 'http://www15.site4.com/alsjdflsjf/asdflajsfdlajsf?lsdjflj&lsdjflsf&lsjdflsjf=alsdjflasjf&lsfjlsdf';

foreach( $urls as $url ) {
    echo get_domain( $url ) . '<br />';
}

/**
* Extracts the domain from a URL
* 
* @param string $url
* @return string
*/
function get_domain( $url ) {
    if( strpos( $url, '://' ) !== false ) {
        list( $throwAway, $url ) = explode( '://', $url );
    }
    if( strpos( $url, '/' ) !== false ) {
        $url = explode( '/', $url );
        $url = array_shift( $url );
    }
    return $url;
}
?>


Updated for what you want:

 

<?php
$urls = array();
$urls[] = 'http://www4.site2.com';
$urls[] = 'http://site3.com';
$urls[] = 'http://www15.site4.com';
$urls[] = 'http://www15.site4.com/alsjdflsjf/asdflajsfdlajsf?lsdjflj&lsdjflsf&lsjdflsjf=alsdjflasjf&lsfjlsdf';

foreach( $urls as $url ) {
    echo get_domain( $url ) . '<br />';
}

/**
* Extracts the domain from a URL
* 
* @param string $url
* @return string
*/
function get_domain( $url ) {
    if( strpos( $url, '://' ) !== false ) {
        list( $throwAway, $url ) = explode( '://', $url );
    }
    if( strpos( $url, '/' ) !== false ) {
        $url = explode( '/', $url );
        $url = array_shift( $url );
    }
    if( preg_match( '/^www[^.]*\..*/', $url ) ) {
        $url = explode( '.', $url );
        array_shift( $url );
        $url = implode( '.', $url );
    }
    return $url;
}
?>
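
The `www` check in the function above could also be written as a single substitution. This is a sketch, assuming the prefixes to strip are always `www` followed by optional digits, which is a narrower rule than the `[^.]*` used in the post. `strip_www` is a hypothetical helper name, not something from the thread.

```php
<?php
function strip_www($host) {
    // Remove a leading "www", "www2", "www15", ... label, including its dot.
    return preg_replace('/^www\d*\./i', '', $host);
}

echo strip_www('www4.site2.com'), "\n";  // site2.com
echo strip_www('site3.com'), "\n";       // site3.com
echo strip_www('wwwsite.com'), "\n";     // wwwsite.com (left untouched)
```

The narrower pattern avoids eating a real name that merely starts with "www", at the cost of missing unusual prefixes such as `www-a`.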


Ok so regular expression matching blows my method out of the water. But this is what I had come up with so far, although it is far from perfect.

<?php
$www=array('www','www1','www2');
$tld=array('co','com','uk','net','org','us','biz');
$url = 'http://www2.site.com';
// Split link to just get domain name
function parse_url_domain ($url) {
  $parsed = parse_url($url);
  return $parsed['host'];
}
$url=parse_url_domain($url);
echo $url.'<br />';
$url=array_reverse(explode('.',$url));
var_dump($url);
echo '<br />';
foreach($www as $key=>$value) {
  if(in_array($value,$url)) {
    unset($url[array_search($value,$url)]);
  }
}
foreach($tld as $key=>$value) {
  if(in_array($value,$url)) {
    unset($url[array_search($value,$url)]);
  }
}
var_dump($url);
echo '<br />';
$domain_only=$url[1];
echo $domain_only;
exit();
?>

By the way roopurt18, that is a genius little script. It uses some of what I started and then accounts for any case by using preg_match().


I don't know why I didn't just use regexp on the whole thing.

 

<?php
$urls = array();
$urls[] = 'http://www4.site2.com';
$urls[] = 'http://site3.com';
$urls[] = 'http://www15.site4.com';
$urls[] = 'http://www15.site4.com/alsjdflsjf/asdflajsfdlajsf?lsdjflj&lsdjflsf&lsjdflsjf=alsdjflasjf&lsfjlsdf';

foreach( $urls as $url ) {
    echo print_r( get_domain( $url ), true ) . '<br />';
}

/**
* Extracts the domain from a URL
* 
* @param string $url
* @return string | boolean
*/
function get_domain( $url ) {
    if( preg_match( '@^http://(www[^.]*\.)?([^/]+)@', $url, $matches ) ) {
        return $matches[2];
    }
    return false;
}
?>
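
As a variation on the same idea, here is a rough sketch that combines `parse_url()` with keeping only the last two dot-separated labels of the host. It relies on the assumption stated earlier in the thread that every TLD has a single part (so no `.co.uk`), and the function name `get_base_domain` is made up for this example. Unlike the pattern above, it does not depend on the scheme being `http`.

```php
<?php
/**
 * Returns the last two labels of a URL's host, e.g. "site2.com".
 * Assumes single-part TLDs; "example.co.uk" would come back as "co.uk".
 *
 * @param string $url
 * @return string|false false when no host can be parsed
 */
function get_base_domain($url) {
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return false;
    }
    $labels = explode('.', $host);
    // Keep at most the final two labels (name + TLD).
    return implode('.', array_slice($labels, -2));
}

$urls = array(
    'http://www4.site2.com',
    'http://site3.com',
    'https://www15.site4.com/some/path?a=1&b=2',
);
foreach ($urls as $url) {
    echo get_base_domain($url), "\n";
}
// site2.com
// site3.com
// site4.com
```

Note that this drops every subdomain, not just `www` variants, so `blog.site.com` would also become `site.com`. Whether that is wanted depends on the URLs being processed.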


Awesome guys!

 

@roopurt - that did the trick once I stopped being an idiot & actually looked at the code you'd provided

@Wolfrage - thank you for all your help, pity roopurt pipped you to the prize but we (by that I mean you) would have got there soon enough!

 

This is part of a larger script I'm designing & I would like to credit both of you, where would you like the links pointing for the credits??

