A584537

PHP / Sockets / Slow / Bad coding


Hi, I'm pretty new to PHP and I'm having a lot of trouble with my PHP code.
I was wondering if you could help me :)
[code]
<?php
$page2="";
$pages=array();
$page="";
$i=0;
$j=0;
$k=0;
$o=0;
$url="";
$socket;


function getpage($name) {

$indexs=array("index.html", "index.php", "index.htm", "index.shtml", "index.asp");
$matchnum = sizeof($indexs);
$service_port=80;
$address = gethostbyname("$name");

for($i=0;$i<$matchnum;$i++)
{

$socket = socket_create(AF_INET, SOCK_STREAM, SOL_TCP);

if ($socket < 0) {
  echo "socket_create() failed: reason:";
}

$result = socket_connect($socket, $address, $service_port);

if ($result < 0) {
    echo "socket_connect() failed.\nReason: ($result) " . socket_strerror($result) . "\n";
}

$in = "GET /$indexs[$i] HTTP/1.1\r\n";
$in .= "Host: $name\r\n";
$in .= "Accept: */*\r\nUser-Agent: Mozilla/4.0\r\n\r\n";
$out = '';

socket_write($socket, $in, strlen($in));

while ($out = socket_read($socket, 300))
{
    $page2 .= $out;
    }
 
 
  if(!strstr($page2, "404"))
  {
  break;
  return $page2;
  }
   
socket_close($socket);
  }

}
$url=$_POST['url'];

if($url==NULL)
{

echo "
<html>
<head>
<title>Url Parser for Daniel Markus - clickvalue.nl er</title>
</head>
<body>
<form method=\"post\" action=\"$PHP_SELF\">


<p>
Url:<input type=\"text\" name=\"url\"><BR>

The url can be one url, or a list seperated by commas.</p>
<input name='submit' type='submit' value='Submit'>

</form>


</body></html>
";
}
else if($url!=NULL)
{
  $url=$_POST['url'];
  $url = str_replace(" ", "", $url);
  $urls = explode(",", $url);

  echo sizeof($urls[0]);
 
for($r=0;$r<sizeof($urls);$r++)
{
echo $urls[$i];
$pages[$r]=getpage($urls[$i]);
checkit($pages[$r]);
}

}


function checkit($page) {
$arr=array("omniture", "sitestat", "urchintracker", "HEAD");
$matchnum = sizeof($indexs);
$matchnum2 = sizeof($arr);

echo "Got to check();<br>";

for($i=0;$i<$matchnum2;$i++);
{
  echo $currenturl . " " . $i;
 
  if(strstr($pages[$i], $arr[$i]) == FALSE)
  {
echo "Didn't find $urls[$i]: $arr[$i]<br>";
  }
  else
  {
  echo "In $urls[$i]: $currentarr<br>";
  }
 
  }
}

?>
[/code]
I want to parse the URLs I get from the text input box, connect to each site, find its index file, and look for certain strings. If they're found, I want to print that out.
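The parsing and matching steps described above can be sketched without any network code. This is only a minimal sketch: `parseUrlList()` and `findMarkers()` are hypothetical helper names, not part of the original script, and the sample page string is made up for illustration. It uses `stripos(...) !== false` because a loose `== FALSE` comparison also treats a match at position 0 as "not found".

```php
<?php
// Hypothetical helper: split the comma-separated form input into clean entries.
function parseUrlList(string $input): array
{
    $urls = [];
    foreach (explode(',', $input) as $part) {
        $part = trim($part);
        if ($part !== '') {          // skip empty entries such as a trailing comma
            $urls[] = $part;
        }
    }
    return $urls;
}

// Hypothetical helper: return every marker that occurs in the page (case-insensitive).
function findMarkers(string $page, array $markers): array
{
    $found = [];
    foreach ($markers as $marker) {
        if (stripos($page, $marker) !== false) {
            $found[] = $marker;
        }
    }
    return $found;
}

// Made-up sample data, standing in for $_POST['url'] and a downloaded page.
$urls = parseUrlList('example.com, example.org,,');
$hits = findMarkers('<script src="urchinTracker.js"></script>',
                    ['omniture', 'sitestat', 'urchintracker']);
```

Each entry in `$urls` would then be fetched and its body passed to `findMarkers()`.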

This script doesn't even complete on my server; I think I've messed something up big time.

I've tried loads of different ways to handle it, but it's just not happening.
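One likely reason the socket version never finishes (an assumption, since nobody in the thread diagnoses it): HTTP/1.1 connections are persistent by default, so without a `Connection: close` header the server may keep the socket open and the `socket_read()` loop waits until the server times out. A minimal sketch of a request that asks the server to close the connection after responding; `buildRequest()` is a hypothetical helper name:

```php
<?php
// Hypothetical helper: build a GET request that tells the server to close
// the connection once the response has been sent.
function buildRequest(string $host, string $path): string
{
    return "GET /" . ltrim($path, '/') . " HTTP/1.1\r\n"
         . "Host: $host\r\n"
         . "Accept: */*\r\n"
         . "Connection: close\r\n"
         . "\r\n";               // blank line ends the header block
}

$request = buildRequest('example.com', 'index.html');
```

The resulting string would be passed to `socket_write()` in place of `$in`. Note also that on current PHP versions `socket_create()` and `socket_connect()` signal failure by returning `false`, so a `< 0` check will not catch errors.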

[quote author=rab link=topic=111827.msg453352#msg453352 date=1161118098]
You know cURL is good for that too
[/quote]
Everybody keeps saying that lol :)
I'm gonna learn it after I finish this, but I need to get this done within the next few hours lol :) So I'll have to learn it afterwards, if I find a good tutorial.

[quote]

Fatal error: Call to undefined function curl_init() in G:\www\daniel.php on line 3

[/quote]
Are variables outside of my function 'getpage()' not available to that function?
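On the scope question: in PHP, a variable defined outside a function is not visible inside it unless it is imported with `global` (or read through `$GLOBALS`), or passed in as an argument. A small sketch, reusing the name `$page2` from the script above; the three function names are made up for illustration:

```php
<?php
$page2 = "outer";

function withoutGlobal(): string
{
    // No global $page2 is visible here, so isset() is false.
    return isset($page2) ? $page2 : "not set";
}

function withGlobal(): string
{
    global $page2;               // import the global into this function's scope
    return $page2;
}

function withArgument(string $page): string
{
    return $page;                // usually the cleaner option: pass the value in
}
```

Here `withoutGlobal()` returns "not set", while `withGlobal()` and `withArgument($page2)` both return "outer". Passing values in and returning results tends to be easier to follow than relying on globals.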

[code=php:0]
<?php

// Fetch every candidate index page for www.$pagename.$end and return the
// raw responses (headers included) of those that did not come back as 404.
function _getPage( $pagename, $end )
{
$indexes = array("index.html", "index.php", "index.htm", "index.shtml", "index.asp");
$pages = "";

for( $i = 0; $i < sizeof( $indexes ); $i++ )
{
// Build the URL once so the messages below always show what was requested
$url = "http://www.$pagename.$end/{$indexes[$i]}";
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the response instead of printing it
curl_setopt($ch, CURLOPT_HEADER, 1); // include headers so the status line can be checked
$pageData = curl_exec($ch);
curl_close($ch);

if( !preg_match("/HTTP\/1\.1 404 Not Found/", $pageData) )
{
$pages .= $pageData."\n";
print "[200] Adding $url\n";
}
else
{
print "[404] $url not found\n";
}
}
return $pages;
}

_getPage( "google", "com" );
?>
[/code]
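As a variation on the code above, the status can be read with `curl_getinfo()` and `CURLINFO_HTTP_CODE` instead of matching the header text, which also covers servers that answer with HTTP/1.0 or a different reason phrase. This is a sketch under assumptions, not what was posted: `curlFetch()` and `firstExistingIndex()` are hypothetical names, and the fetcher is passed in as a callable so the loop can be tried without network access.

```php
<?php
// Hypothetical fetcher: returns [HTTP status code, body] for a URL using cURL.
function curlFetch(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return body as a string
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // give up after 10 seconds
    $body = curl_exec($ch);                         // false on failure
    $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    return [$status, $body === false ? '' : $body];
}

// Return the first index URL that answers with status 200, or null if none does.
function firstExistingIndex(string $baseUrl, array $indexes, callable $fetch): ?string
{
    foreach ($indexes as $index) {
        $url = rtrim($baseUrl, '/') . '/' . $index;
        [$status] = $fetch($url);
        if ($status === 200) {
            return $url;
        }
    }
    return null;
}
```

A call would look like `firstExistingIndex('http://www.example.com', $indexes, 'curlFetch')`. Stopping at the first hit also avoids requesting the remaining index names once one has been found.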

Curl is so good lol :)
If I wasn't so stupid this script would have only taken me 30 minutes instead of 4 hours :/
Thanks for your help and the kick up the ass to start using curl :)

No problem, don't try to reinvent the wheel :P
