PHP / Sockets / Slow / Bad coding


A584537

Hi, I'm pretty new to PHP and I'm having a lot of trouble with my PHP code.
I was wondering if you could help me :)
[code]
<?php
$page2="";
$pages=array();
$page="";
$i=0;
$j=0;
$k=0;
$o=0;
$url="";
$socket;


function getpage($name) {

$indexs=array("index.html", "index.php", "index.htm", "index.shtml", "index.asp");
$matchnum = sizeof($indexs);
$service_port=80;
$address = gethostbyname("$name");

for($i=0;$i<$matchnum;$i++)
{

$socket = socket_create(AF_INET, SOCK_STREAM, SOL_TCP);

if ($socket < 0) {
  echo "socket_create() failed: reason:";
}

$result = socket_connect($socket, $address, $service_port);

if ($result < 0) {
    echo "socket_connect() failed.\nReason: ($result) " . socket_strerror($result) . "\n";
}

$in = "GET /$indexs[$i] HTTP/1.1\r\n";
$in .= "Host: $name\r\n";
$in .= "Accept: */*\r\nUser-Agent: Mozilla/4.0\r\n\r\n";
$out = '';

socket_write($socket, $in, strlen($in));

while ($out = socket_read($socket, 300))
{
    $page2 .= $out;
    }
 
 
  if(!strstr($page2, "404"))
  {
  break;
  return $page2;
  }
   
socket_close($socket);
  }

}
$url=$_POST['url'];

if($url==NULL)
{

echo "
<html>
<head>
<title>Url Parser for Daniel Markus - clickvalue.nl er</title>
</head>
<body>
<form method=\"post\" action=\"$PHP_SELF\">


<p>
Url:<input type=\"text\" name=\"url\"><BR>

The url can be one url, or a list seperated by commas.</p>
<input name='submit' type='submit' value='Submit'>

</form>


</body></html>
";
}
else if($url!=NULL)
{
  $url=$_POST['url'];
  $url = str_replace(" ", "", $url);
  $urls = explode(",", $url);

  echo sizeof($urls[0]);
 
for($r=0;$r<sizeof($urls);$r++)
{
echo $urls[$i];
$pages[$r]=getpage($urls[$i]);
checkit($pages[$r]);
}

}


function checkit($page) {
$arr=array("omniture", "sitestat", "urchintracker", "HEAD");
$matchnum = sizeof($indexs);
$matchnum2 = sizeof($arr);

echo "Got to check();<br>";

for($i=0;$i<$matchnum2;$i++);
{
  echo $currenturl . " " . $i;
 
  if(strstr($pages[$i], $arr[$i]) == FALSE)
  {
echo "Didn't find $urls[$i]: $arr[$i]<br>";
  }
  else
  {
  echo "In $urls[$i]: $currentarr<br>";
  }
 
  }
}

?>
[/code]
I want to parse the URLs I get from the text input box, connect to each site, find its index file, and then look for certain strings. If they're found, the script should print that out.

This script doesn't even complete on my server, so I think I've messed something up big time.

I've tried loads of different things and different ways to handle it, but it's just not happening.
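
For reference, the two non-network steps described above (splitting the comma-separated input and looking for the marker strings in a page) can be sketched on their own. This is only a rough sketch under a few assumptions: the helper names `parseUrlList` and `findMarkers` are made up for illustration, the match is case-insensitive (the posted script uses the case-sensitive `strstr`), and the sample page is invented.

```php
<?php
// Hypothetical helper: split the comma-separated form input into clean entries.
function parseUrlList(string $input): array
{
    $parts = array_map('trim', explode(',', $input));
    // Drop empty entries left behind by stray commas, then reindex from 0.
    return array_values(array_filter($parts, function ($p) {
        return $p !== '';
    }));
}

// Hypothetical helper: return the marker strings that occur in a page body.
function findMarkers(string $page, array $markers): array
{
    $found = [];
    foreach ($markers as $marker) {
        // stripos() returns false when there is no match. Compare with
        // !== false, because a match at offset 0 would also be falsy.
        if (stripos($page, $marker) !== false) {
            $found[] = $marker;
        }
    }
    return $found;
}

// Marker names taken from the script above; the page content is a made-up sample.
$markers = ["omniture", "sitestat", "urchintracker"];
$sample  = "<html><script>urchinTracker();</script></html>";

print_r(parseUrlList("example.com, example.org,"));
print_r(findMarkers($sample, $markers));
```

Keeping these two steps as plain functions that take and return values (instead of reading globals such as `$pages` or `$urls`) makes each one easy to check before any network code is involved.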

[quote author=rab link=topic=111827.msg453352#msg453352 date=1161118098]
You know cURL is good for that too
[/quote]
Everybody keeps saying that lol :)
I'm gonna learn it after I finish this, but I need to get this done within the next few hours lol :) So I'm gonna have to learn after if I find a good tutorial

[code=php:0]
<?php

// Try each common index file name on the host and collect every page that exists.
function _getPage( $pagename, $end )
{
    $indexes = array("index.html", "index.php", "index.htm", "index.shtml", "index.asp");
    $pages = "";

    for( $i = 0; $i < sizeof( $indexes ); $i++ )
    {
        $url = "http://www.$pagename.$end/{$indexes[$i]}";

        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the response instead of printing it
        curl_setopt($ch, CURLOPT_HEADER, 1);         // include headers so the status line can be checked
        $pageData = curl_exec($ch);
        curl_close($ch);

        // Treat anything without an "HTTP/1.1 404 Not Found" status line as a hit.
        if( !preg_match("/HTTP\/1\.1 404 Not Found/", $pageData) )
        {
            $pages .= $pageData."\n";
            print "[200] Adding $url\n";
        }
        else
        {
            print "[404] $url not found\n";
        }
    }
    return $pages;
}

_getPage( "google", "com" );
?>
[/code]
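
A possible variation on the function above, offered as a sketch and not as what was originally posted: instead of matching the literal status line with a regex, read the numeric status with `curl_getinfo($ch, CURLINFO_HTTP_CODE)` and stop at the first index file that answers 200. The names `curlFetch` and `firstIndexPage`, the 10-second timeout, and the stand-in fetcher are all assumptions made for illustration. It also takes a full host name instead of adding `www.` itself.

```php
<?php
// Fetch one URL with cURL and return [HTTP status code, response body].
function curlFetch(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // assumed timeout, tune as needed
    $body   = curl_exec($ch);
    $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    // The handle is released when $ch goes out of scope.
    return [$status, $body === false ? '' : $body];
}

// Hypothetical helper: try each index file name and return the first body
// served with a 200 status, or null when none of them exists.
function firstIndexPage(string $host, array $indexes, callable $fetch): ?string
{
    foreach ($indexes as $index) {
        [$status, $body] = $fetch("http://$host/$index");
        if ($status === 200) {
            return $body;
        }
    }
    return null;
}

// Stand-in fetcher so the sketch runs without network access:
// only URLs ending in "index.php" pretend to exist.
$fake = function (string $url): array {
    return substr($url, -9) === 'index.php' ? [200, '<html>ok</html>'] : [404, ''];
};

var_dump(firstIndexPage('example.com', ['index.html', 'index.php'], $fake));
```

For real requests, pass `'curlFetch'` as the third argument in place of `$fake`. Taking the fetcher as a parameter is just a convenience so the loop can be tried out without hitting a live server.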

Curl is so good lol :)
If I wasn't so stupid this script would have only taken me 30 minutes instead of 4 hours :/
Thanks for your help and the kick up the ass to start using curl :)
