
PHP / Sockets / Slow / Bad coding


A584537


Hi, I'm pretty new to PHP and I'm having a lot of trouble with my PHP code.
I was wondering if you could help me :)
[code]
<?php
$page2="";
$pages=array();
$page="";
$i=0;
$j=0;
$k=0;
$o=0;
$url="";
$socket;


function getpage($name) {

$indexs=array("index.html", "index.php", "index.htm", "index.shtml", "index.asp");
$matchnum = sizeof($indexs);
$service_port=80;
$address = gethostbyname("$name");

for($i=0;$i<$matchnum;$i++)
{

$socket = socket_create(AF_INET, SOCK_STREAM, SOL_TCP);

if ($socket < 0) {
  echo "socket_create() failed: reason:";
}

$result = socket_connect($socket, $address, $service_port);

if ($result < 0) {
    echo "socket_connect() failed.\nReason: ($result) " . socket_strerror($result) . "\n";
}

$in = "GET /$indexs[$i] HTTP/1.1\r\n";
$in .= "Host: $name\r\n";
$in .= "Accept: */*\r\nUser-Agent: Mozilla/4.0\r\n\r\n";
$out = '';

socket_write($socket, $in, strlen($in));

while ($out = socket_read($socket, 300))
{
    $page2 .= $out;
    }
 
 
  if(!strstr($page2, "404"))
  {
  break;
  return $page2;
  }
   
socket_close($socket);
  }

}
$url=$_POST['url'];

if($url==NULL)
{

echo "
<html>
<head>
<title>Url Parser for Daniel Markus - clickvalue.nl er</title>
</head>
<body>
<form method=\"post\" action=\"$PHP_SELF\">


<p>
Url:<input type=\"text\" name=\"url\"><BR>

The url can be one url, or a list seperated by commas.</p>
<input name='submit' type='submit' value='Submit'>

</form>


</body></html>
";
}
else if($url!=NULL)
{
  $url=$_POST['url'];
  $url = str_replace(" ", "", $url);
  $urls = explode(",", $url);

  echo sizeof($urls[0]);
 
for($r=0;$r<sizeof($urls);$r++)
{
echo $urls[$i];
$pages[$r]=getpage($urls[$i]);
checkit($pages[$r]);
}

}


function checkit($page) {
$arr=array("omniture", "sitestat", "urchintracker", "HEAD");
$matchnum = sizeof($indexs);
$matchnum2 = sizeof($arr);

echo "Got to check();<br>";

for($i=0;$i<$matchnum2;$i++);
{
  echo $currenturl . " " . $i;
 
  if(strstr($pages[$i], $arr[$i]) == FALSE)
  {
echo "Didn't find $urls[$i]: $arr[$i]<br>";
  }
  else
  {
  echo "In $urls[$i]: $currentarr<br>";
  }
 
  }
}

?>
[/code]
I want to parse the URLs I get from the text input box, connect to each site and find its index file, then look for certain strings and print out any that are found.

This script doesn't even complete on my server, so I think I've messed something up big time.

I've tried loads of different things, different ways to handle it but it's just not happening.
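
The goal described above (split the input, then search each fetched page for certain strings) can be tried out without any network code. This is only a minimal sketch under assumptions: `parseUrlList()` and `findMarkers()` are hypothetical helper names that are not in the script above, the marker list reuses the tracking strings from the script's `$arr`, and `$html` is a hard-coded stand-in for a page that would really come from the fetch step.

```php
<?php
// Hypothetical helpers for illustration; these names are not from the original script.

/**
 * Split a comma-separated list of hosts/URLs into a clean array,
 * trimming whitespace and skipping empty entries.
 */
function parseUrlList(string $raw): array
{
    $urls = [];
    foreach (explode(',', $raw) as $part) {
        $part = trim($part);
        if ($part !== '') {
            $urls[] = $part;
        }
    }
    return $urls;
}

/**
 * Return every marker that occurs in $html (case-insensitive).
 * stripos() returns false when nothing is found, so compare with !== false;
 * a match at position 0 would otherwise be mistaken for "not found".
 */
function findMarkers(string $html, array $markers): array
{
    $found = [];
    foreach ($markers as $marker) {
        if (stripos($html, $marker) !== false) {
            $found[] = $marker;
        }
    }
    return $found;
}

$markers = ['omniture', 'sitestat', 'urchintracker'];
$urls    = parseUrlList('example.com, example.org,,');

// Stand-in for a fetched page; the real script would download one page per URL.
$html = '<script>urchinTracker();</script>';

foreach ($urls as $url) {
    $found = findMarkers($html, $markers);
    echo $url . ': ' . ($found ? implode(', ', $found) : 'no markers found') . "\n";
}
```

Keeping the parsing and the string search in small functions that take their input as arguments avoids relying on variables like `$urls` or `$pages` being visible inside the function, which they are not in PHP unless they are passed in or declared `global`.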
https://forums.phpfreaks.com/topic/24264-php-sockets-slow-bad-coding/

[quote author=rab link=topic=111827.msg453352#msg453352 date=1161118098]
You know cURL is good for that too
[/quote]
Everybody keeps saying that lol :)
I'm gonna learn it after I finish this, but I need to get this done within the next few hours lol :) So I'm gonna have to learn it afterwards, if I find a good tutorial.
[code=php:0]
<?php

// Fetch every index file that exists on http://www.$pagename.$end/ and
// return the raw responses (headers included), separated by newlines.
function _getPage( $pagename, $end )
{
    $indexes = array("index.html", "index.php", "index.htm", "index.shtml", "index.asp");
    $pages = "";

    for( $i = 0; $i < sizeof( $indexes ); $i++ )
    {
        $url = "http://www.$pagename.$end/{$indexes[$i]}";

        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the response instead of printing it
        curl_setopt($ch, CURLOPT_HEADER, 1);         // include the headers so the status line can be checked
        $pageData = curl_exec($ch);
        curl_close($ch);

        if( $pageData === false )
        {
            // The request itself failed (DNS, connection, timeout, ...)
            print "[ERR] $url could not be fetched\n";
            continue;
        }

        if( !preg_match("/HTTP\/1\.1 404 Not Found/", $pageData) )
        {
            $pages .= $pageData."\n";
            print "[200] Adding $url\n";
        }
        else
        {
            print "[404] $url not found\n";
        }
    }
    return $pages;
}

_getPage( "google", "com" );
?>
[/code]
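
One thing to watch with the check above: matching the literal text `HTTP/1.1 404 Not Found` misses a 404 sent as `HTTP/1.0`, or with a different reason phrase, and treats every other status (redirects, 500s) as a hit. A rough sketch of reading the numeric status instead follows. `statusCodeFromRaw()` is a hypothetical helper name, and it assumes the raw response starts with a single header block, which is the case with `CURLOPT_HEADER` on and redirects not being followed. Calling `curl_getinfo($ch, CURLINFO_HTTP_CODE)` before `curl_close()` should give the same number without any parsing.

```php
<?php
// Hypothetical helper: pull the numeric status code out of a raw response
// whose first line is the HTTP status line. Returns null if there is none.
function statusCodeFromRaw(string $raw): ?int
{
    // Accepts "HTTP/1.0", "HTTP/1.1" and "HTTP/2" style status lines.
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $raw, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Made-up sample responses, used here only to exercise the helper.
$samples = [
    "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>",
    "HTTP/1.0 404 Not Found\r\n\r\n",
    "not an HTTP response",
];

foreach ($samples as $raw) {
    $code = statusCodeFromRaw($raw);
    echo ($code === null ? 'no status line' : "status $code") . "\n";
}
```

With that in place, the loop could keep a page only when the code is 200 and report anything else along with its actual status.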
Curl is so good lol :)
If I wasn't so stupid this script would have only taken me 30 minutes instead of 4 hours :/
Thanks for your help and the kick up the ass to start using curl :)

Archived

This topic is now archived and is closed to further replies.
