PHP / Sockets / Slow / Bad coding


6 replies to this topic

#1 A584537

A584537
  • New Members
  • Pip
  • Newbie
  • 6 posts

Posted 17 October 2006 - 08:44 PM

Hi, I'm pretty new to PHP and I'm having a lot of trouble with my PHP code.
I was wondering if you could help me :)
<?php
$page2="";
$pages=array();
$page="";
$i=0;
$j=0;
$k=0;
$o=0;
$url="";
$socket;


function getpage($name) {

$indexs=array("index.html", "index.php", "index.htm", "index.shtml", "index.asp");
$matchnum = sizeof($indexs);
$service_port=80;
$address = gethostbyname("$name");

 for($i=0;$i<$matchnum;$i++) 
 {
 
 $socket = socket_create(AF_INET, SOCK_STREAM, SOL_TCP);

if ($socket < 0) {
   echo "socket_create() failed: reason:";
}
 
 $result = socket_connect($socket, $address, $service_port);

 if ($result < 0) {
    echo "socket_connect() failed.\nReason: ($result) " . socket_strerror($result) . "\n";
}

$in = "GET /$indexs[$i] HTTP/1.1\r\n";
$in .= "Host: $name\r\n";
$in .= "Accept: */*\r\nUser-Agent: Mozilla/4.0\r\n\r\n";
$out = '';

socket_write($socket, $in, strlen($in));

 while ($out = socket_read($socket, 300))
 {
    $page2 .= $out;
    }
   
   
   if(!strstr($page2, "404"))
   {
   break;
   return $page2;
   }
    
socket_close($socket);
  }

 }
$url=$_POST['url'];

if($url==NULL)
{

echo "
<html>
<head>
<title>Url Parser for Daniel Markus - clickvalue.nl er</title>
</head>
<body>
<form method=\"post\" action=\"$PHP_SELF\">


<p>
Url:<input type=\"text\" name=\"url\"><BR>

The url can be one url, or a list seperated by commas.</p>
<input name='submit' type='submit' value='Submit'>

</form>


</body></html>
";
}
else if($url!=NULL)
 {
  $url=$_POST['url'];
  $url = str_replace(" ", "", $url);
  $urls = explode(",", $url);

  echo sizeof($urls[0]);
  
for($r=0;$r<sizeof($urls);$r++)
{
echo $urls[$i];
$pages[$r]=getpage($urls[$i]);
checkit($pages[$r]);
}

} 


function checkit($page) {
$arr=array("omniture", "sitestat", "urchintracker", "HEAD"); 
$matchnum = sizeof($indexs);
$matchnum2 = sizeof($arr);

echo "Got to check();<br>";
 
 for($i=0;$i<$matchnum2;$i++);
 {
  echo $currenturl . " " . $i;
  
  if(strstr($pages[$i], $arr[$i]) == FALSE)
  {
echo "Didn't find $urls[$i]: $arr[$i]<br>";
  }
  else
  {
  echo "In $urls[$i]: $currentarr<br>";
  }
  
  }
}

?>
I want to parse the URLs I get from the text input box, connect to each site and find its index file, then look for certain strings and print out any that are found.
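
The parsing and string-checking steps described here can be tried without any network code. Below is a minimal sketch, not the poster's method: `parseUrlList()` and `findTrackers()` are hypothetical helper names, and the tracker list is the one from `checkit()` minus the `HEAD` entry.

```php
<?php
// Hypothetical helper: split a comma-separated list of hosts into a clean array.
function parseUrlList($raw)
{
    $urls = array();
    foreach (explode(",", $raw) as $piece) {
        $piece = trim($piece);
        if ($piece !== "") {
            $urls[] = $piece;   // skip empty entries such as a trailing comma
        }
    }
    return $urls;
}

// Hypothetical helper: return the needles that occur in $page (case-insensitive).
function findTrackers($page, $needles)
{
    $found = array();
    foreach ($needles as $needle) {
        // stripos() returns 0 for a match at the start, so compare with !== false.
        if (stripos($page, $needle) !== false) {
            $found[] = $needle;
        }
    }
    return $found;
}

$trackers = array("omniture", "sitestat", "urchintracker");
$urls = parseUrlList("example.com, example.org ,");
$hits = findTrackers("<script>urchinTracker();</script>", $trackers);
```

Passing the page and the needle list in as arguments avoids relying on variables such as `$pages`, `$urls` or `$indexs` that are not defined inside `checkit()` in the script above.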

This script doesn't even complete on my server; I think I've messed something up big time.
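
One possible reason the socket version never finishes (an assumption, since nobody in the thread diagnoses it): the request is sent as HTTP/1.1 without a `Connection: close` header, so a server may keep the connection open and the `socket_read()` loop waits until the server times out. A sketch of a request that asks the server to close the connection, using a hypothetical `buildRequest()` helper:

```php
<?php
// Hypothetical helper: build a GET request that asks the server to close
// the connection after the response, so a read loop sees end-of-stream.
function buildRequest($host, $path)
{
    $request  = "GET /" . ltrim($path, "/") . " HTTP/1.1\r\n";
    $request .= "Host: $host\r\n";
    $request .= "Accept: */*\r\n";
    $request .= "User-Agent: Mozilla/4.0\r\n";
    $request .= "Connection: close\r\n";
    $request .= "\r\n";   // blank line ends the header block
    return $request;
}

$request = buildRequest("example.com", "index.html");
```

Separately, in the posted loop `break` comes before `return $page2`, so `getpage()` never returns a page, and `socket_create()` and `socket_connect()` report failure by returning `false` rather than a negative number.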

I've tried loads of different things and different ways of handling it, but it's just not happening.

#2 rab

rab
  • Members
  • PipPipPip
  • Advanced Member
  • 155 posts

Posted 17 October 2006 - 08:48 PM

You know cURL is good for that too

#3 A584537

Posted 17 October 2006 - 08:50 PM

You know cURL is good for that too

Everybody keeps saying that lol :)
I'm gonna learn it after I finish this, but I need to get this done within the next few hours lol :) So I'll have to learn it afterwards, if I can find a good tutorial.

#4 A584537

Posted 17 October 2006 - 09:03 PM


Fatal error: Call to undefined function curl_init() in G:\www\daniel.php on line 3
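
This error generally means the cURL extension is not loaded in the PHP install that runs the script. A small check, with `curlAvailable()` as a hypothetical helper name, can confirm that before any `curl_*` call is made:

```php
<?php
// Hypothetical helper: report whether the cURL extension is usable.
function curlAvailable()
{
    return extension_loaded("curl") && function_exists("curl_init");
}

if (curlAvailable()) {
    echo "cURL is available\n";
} else {
    echo "cURL is not loaded; enable the extension in php.ini\n";
}
```

On a Windows install of that era the fix was typically to uncomment the `extension=php_curl.dll` line in `php.ini` and restart the web server; treat that as an assumption about this particular setup.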

Are variables outside of my function 'getpage()' not available to that function?
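
Short answer: no. A PHP function has its own scope and does not see variables defined outside it unless they are imported with `global` (or read through `$GLOBALS`) or passed in as arguments. A minimal sketch with made-up names:

```php
<?php
$prefix = "page: ";   // defined in the global scope

function withoutGlobal()
{
    // $prefix is not visible here, so isset() reports false.
    return isset($prefix);
}

function withGlobal()
{
    global $prefix;   // import the global into this function's scope
    return $prefix . "index.html";
}

function withParameter($prefix)
{
    // Passing the value in is usually the cleaner option.
    return $prefix . "index.php";
}

$a = withoutGlobal();
$b = withGlobal();
$c = withParameter($prefix);
```

This is why `$page2` inside `getpage()` is a fresh local variable rather than the `$page2` declared at the top of the script.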


#5 rab

Posted 17 October 2006 - 09:40 PM

<?php

function _getPage( $pagename, $end ) 
{
	$indexes	= array("index.html", "index.php", "index.htm", "index.shtml", "index.asp");
	$pages		= "";
	
	for( $i = 0; $i < sizeof( $indexes ); $i++ ) 
	{
		$ch = curl_init("http://www.$pagename.$end/{$indexes[$i]}");
		curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
		curl_setopt($ch, CURLOPT_HEADER, 1);
		$pageData = curl_exec($ch);
		curl_close($ch);

		// Treat anything other than an HTTP/1.1 404 status line as a hit.
		if( !preg_match("/HTTP\/1\.1 404 Not Found/", $pageData) ) 
		{
			$pages .= $pageData."\n";
			print "[200] Adding http://www.$pagename.$end/{$indexes[$i]}\n";
		}
		else
		{
			print "[404] http://www.$pagename.$end/{$indexes[$i]} not found\n";
		}
	}
	return $pages;
}

_getPage( "google", "com" );
?> 
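
One caveat with the check above: it matches the literal text `HTTP/1.1 404 Not Found`, so a server that answers with `HTTP/1.0` or a different reason phrase slips through, and any other error status is still counted as a hit. A sketch of reading the numeric status from the response instead, where `statusCode()` is a hypothetical helper and the sample responses are made up:

```php
<?php
// Hypothetical helper: pull the numeric status code out of a raw response
// that starts with a status line such as "HTTP/1.1 404 Not Found".
function statusCode($response)
{
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $response, $m)) {
        return (int) $m[1];
    }
    return 0;   // no recognizable status line
}

$notFound = statusCode("HTTP/1.0 404 Not Found\r\nServer: demo\r\n\r\n");
$ok       = statusCode("HTTP/1.1 200 OK\r\n\r\n<html></html>");
```

With cURL itself, `curl_getinfo($ch, CURLINFO_HTTP_CODE)` called before `curl_close($ch)` gives the same number without parsing the headers by hand.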


#6 A584537

Posted 17 October 2006 - 09:57 PM

Curl is so good lol :)
If I wasn't so stupid this script would have only taken me 30 minutes instead of 4 hours :/
Thanks for your help and the kick up the ass to start using curl :)

#7 rab

Posted 17 October 2006 - 10:10 PM

No problem, don't try to reinvent the wheel  :P



