
strpos() returning empty


hellonoko


I am using strpos() to compare URLs.

However, inside my function it doesn't seem to return anything. When I copy that bit of code into its own page, or outside of my function, it works.

 

Any ideas?

 

Code is on line 74.

 

Thanks.

 

<?php

//error_reporting(E_ALL);

//echo $site_url = 'http://www.empreintes-digitales.fr/';
$target_url = "http://www.empreintes-digitales.fr";

//$target_url = 'http://redthreat.wordpress.com/';
//$target_url= 'http://www.kissatlanta.com/blog/';
//$target_url= 'http://www.empreintes-digitales.fr/';

$url = "";
$link = "";

$userAgent = 'Googlebot/2.1 (http://www.googlebot.com/bot.html)';

crawl_page( $target_url, $userAgent);

function crawl_page( $target_url, $userAgent)
{
	$ch = curl_init();

	curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
	curl_setopt($ch, CURLOPT_URL,$target_url);
	curl_setopt($ch, CURLOPT_FAILONERROR, true);
	curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
	curl_setopt($ch, CURLOPT_AUTOREFERER, true);
	curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
	curl_setopt($ch, CURLOPT_TIMEOUT, 10);

	$html = curl_exec($ch);

	if (!$html) 
	{
		echo "<br />cURL error number:" .curl_errno($ch);
		echo "<br />cURL error:" . curl_error($ch);
		exit;
	}

	//
	// load scraped data into the DOM
	//

	$dom = new DOMDocument();
	@$dom->loadHTML($html);

	//
	// get only LINKS from the DOM with XPath
	//

	$xpath = new DOMXPath($dom);
	$hrefs = $xpath->evaluate("/html/body//a");

	//
	// go through all the links and store to db or whatever
	//
	for ($i = 0; $i < $hrefs->length; $i++) 
	{
		$href = $hrefs->item($i);
		$url = $href->getAttribute('href');

		$links_1[] = $url; // append; the global $link is not in scope inside this function

		//echo $absolute_links[$link] = relative2absolute($target_url, $url);
		//echo '<br>';

		//if the $url does not contain the web site base address: http://www.thesite.com/ then add it onto the front


		echo gettype($url);
		echo gettype($target_url);

		echo '<b>';
		echo $pos = strpos($url , $target_url);
		echo '</b>';

		if ( $pos === false ) // strict check: a match at offset 0 is not a miss
		{
			echo 'INCOMPLETE: '.$url;
			echo '<br>';
		}
		else
		{
			echo 'COMPLETE: '.$url;
			echo '<br>';
		}

	}
}

https://forums.phpfreaks.com/topic/150087-strpos-returning-empty/

If strpos() returns FALSE, then yes, echoing the result prints an empty string.

<?php
echo "{".strpos('abc', 'a')."}<br>";
echo "{".strpos('abc', 'b')."}<br>";
echo "{".strpos('abc', 'c')."}<br>";
echo "{".strpos('abc', 'd')."}<br>";
?>

That code outputs:

{0}
{1}
{2}
{}

 

I think this explains it pretty well:

http://us2.php.net/manual/en/function.strpos.php
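
The manual's warning about the return value can be sketched as follows. This is a minimal illustration, not code from the thread: `contains()` is a hypothetical helper name, and the type declarations assume PHP 7 or later. It shows why a loose comparison with `false` misreports a match at offset 0, and why `var_dump()` is more useful than `echo` when debugging.

```php
<?php
// Hypothetical helper (not from the thread): true if $needle occurs in $haystack.
function contains(string $haystack, string $needle): bool
{
    // strpos() returns an int offset on success and false on failure.
    // Offset 0 is a valid match, so compare strictly against false.
    return strpos($haystack, $needle) !== false;
}

// var_dump() shows the type as well as the value, unlike echo.
var_dump(strpos('abc', 'a'));           // int(0)
var_dump(strpos('abc', 'd'));           // bool(false)

// Loose vs. strict comparison for a match at offset 0:
var_dump(strpos('abc', 'a') == false);  // bool(true), because 0 == false
var_dump(strpos('abc', 'a') === false); // bool(false)
```

Echoing `false` prints an empty string, which is why the `{}` case above looks like "nothing" was returned.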

 

To be more specific:

 

function checkURL($url, $target_url)
{
	echo $url.'<br>';
	echo $target_url.'<br>';

	echo gettype($url).'<br>';
	echo gettype($target_url).'<br>';

	echo '<b>';
	echo $pos = strpos($url , $target_url);
	echo '</b>';


}

 

Returns:

http://empreintes-digitales.fr/board/register.php
http://www.empreintes-digitales.fr
string
string
http://empreintes-digitales.fr/board/login.php?action=forget
http://www.empreintes-digitales.fr
string
string
#
http://www.empreintes-digitales.fr
string
string
http://66.102.9.104/translate_c?hl=fr&sl=fr&tl=en&u=www.empreintes-digitales.fr/index.php
http://www.empreintes-digitales.fr
string
string

 

And on and on. Nothing is printed for

$pos = strpos()

 

 

Actually, it is comparing correctly.

 

"http://empreintes-digitales.fr/board/register.php"

simply does not contain the string:

"http://www.empreintes-digitales.fr"

 

Perhaps you want to shorten your $target_url a bit?

I don't know what the purpose of this is, but you could strip the target URL down to just its domain name (dropping the scheme and the "www." prefix) and get far better matches,

since

"http://empreintes-digitales.fr/board/register.php"

does contain

"empreintes-digitales.fr"
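
One way to sketch that suggestion is to compare host names rather than raw substrings. The helpers below are hypothetical (`bare_host()` and `is_same_site()` are illustrative names, not from the thread) and assume PHP 7.1 or later for the nullable return type. They use `parse_url()` to extract the host and drop a leading "www." before comparing.

```php
<?php
// Hypothetical helper: lower-cased host of a URL without a leading "www.",
// or null when the URL has no host (e.g. "#" or a relative path).
function bare_host(string $url): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null;
    }
    $host = strtolower($host);
    // Strip a leading "www." so both spellings of the site compare equal.
    return strncmp($host, 'www.', 4) === 0 ? substr($host, 4) : $host;
}

// Hypothetical helper: true when both URLs point at the same bare host.
function is_same_site(string $url, string $target_url): bool
{
    $a = bare_host($url);
    $b = bare_host($target_url);
    return $a !== null && $a === $b;
}

$target_url = 'http://www.empreintes-digitales.fr';
var_dump(is_same_site('http://empreintes-digitales.fr/board/register.php', $target_url));
var_dump(is_same_site('#', $target_url));
```

Comparing hosts is a design choice that differs from a plain substring search: a URL that only mentions the domain in its query string, such as the translate link in the output above, would match a substring search for "empreintes-digitales.fr" but is not treated as the same site here.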

 

