PHP5 and cURL: Am I doing this correctly?


patrick24601

I obtained the following script from a friend who runs it locally on his computer. It works fine on his machine but not on mine. The script builds a Google query URL (the same thing you see in your address bar when you search Google) and lets you parse through the results. I cannot get it working on my PHP5/Apache2 server, and I don't have a clue what the issue might be:

 

    <h2>Google Searcher</h2>
<?php

error_reporting (E_ERROR | E_WARNING | E_PARSE | E_NOTICE);
    /* Setting the variables
    A Google query looks like this:
    http://www.google.com/search?q=MYQUERY&start=MYSTART
    ¦-------------------GooglePrefix------¦-----query--¦suffix¦-counter--¦
    so here are the variables I need for my query:
    */
    $GooglePrefix = "http://www.google.com/search?q=";
    $query ="link building";
    $GoogleCountSuffix ="&start=";
    //-----------------------------
    echo "Looking for ".$query."<br>";
    /* Loop to get the Google result pages 
    While going through the loop, we build the query URL out of the parts and the loop counter 
    The results are stored in the $res variable.
    Basically, we get the complete source code for each result page, and store ALL of them in one looong string.
    */
    $loop = 0;
    $res = "";
    while ($loop <= 30)
    {
        $CompleteUrl = $GooglePrefix.$query.$GoogleCountSuffix.$loop;
        echo "<br>$CompleteUrl";
        $res = $res.webFetcher($CompleteUrl); // we use the function webFetcher to get the page
        echo "<br>$loop : $res";
        $loop = $loop+10;
    }
    echo "<hr>";
    /* Now we use regular expressions to filter the URLs out of the result pages
    For this, the function "do_reg" is called, giving it the complete result string and the regular expression.
    The returned value (an array of matches) is stored in $resultURLs
    */
    $resultURLs = do_reg($res, "/h2.class=r.*(http.*)\"/U");
    /* Now we want to fetch all those URLs
    Again, we use a loop for this. Some more explanations in the loop itself.
    */
        for ($i = 0; $i < count($resultURLs); $i++) //we use the length of the returned array to count.
        {
            $text = $resultURLs[$i]; //$text is set to the item in the result we are at
            $comp = webFetcher($text); //we get the page at the URL
            if (preg_match("/google_ad/", $comp, $matches))
            /* again, we use a regular expression function.
            This time, we are looking for "google_ad", a code snippet that tells us that google ads are used in the page.
            If found, this is true.
            */
            {
                echo "Google ad code found! <a href=".$text.">".$text."</a><br>";
            }
        }
    
    function do_reg($text, $regex) //returns all the found matches in an array
    {
        preg_match_all($regex, $text, $regxresult, PREG_PATTERN_ORDER);
        return $regxresult[1]; // index 1 holds the captures of the first parenthesised group
    }
    
    function webFetcher($url)
    {
        /* This does exactly what it is named after - it fetches a page from the web, just give it the URL */
        $crawl = curl_init(); //the curl library is initiated, the following lines set the curl variables
        curl_setopt($crawl, CURLOPT_URL, $url); //The URL is set
        curl_setopt($crawl, CURLOPT_RETURNTRANSFER, 1); //Tells it to return the results in a variable
        $resulting = curl_exec($crawl);  //curl is executed and the results stored in $resulting (false on failure)
        curl_close($crawl);     // closes the curl procedure.
        return $resulting;
    }
?>
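
The URL-extraction step can be tried offline, without fetching anything. The sketch below runs the script's own pattern against a made-up sample string; the markup in `$sample` is an assumption for illustration and is not taken from a real Google result page.

```php
<?php
// Made-up sample markup (an assumption): two results on one line, each
// wrapped in <h2 class=r>, which is what the pattern looks for.
$sample = '<h2 class=r><a href="http://example.com/one" class=l>One</a></h2>'
        . '<h2 class=r><a href="http://example.org/two" class=l>Two</a></h2>';

// Same pattern as in the script. The /U modifier makes .* ungreedy, so the
// capture group stops at the first double quote that follows "http".
preg_match_all("/h2.class=r.*(http.*)\"/U", $sample, $matches, PREG_PATTERN_ORDER);

// With PREG_PATTERN_ORDER, $matches[1] holds every capture of group 1.
$urls = $matches[1];
print_r($urls);
```

Note that `.` does not match a newline by default, so the pattern only finds a URL when it sits on the same line as its `h2 class=r` marker.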

 

The error I get back is:

Bad Request
Your client has issued a malformed or illegal request.
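
To see more than the response body, a variant of `webFetcher` could also report the HTTP status code and any cURL error. This is only a sketch: `fetchWithDiagnostics` is a hypothetical name, and the timeout values are arbitrary assumptions.

```php
<?php
// Hypothetical helper (not part of the original script): fetches a URL and
// returns the body together with the HTTP status and the cURL error text.
function fetchWithDiagnostics($url)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);    // assumed value: give up connecting after 5 s
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // assumed value: give up entirely after 10 s

    $body = curl_exec($ch); // false when the transfer itself fails

    $result = array(
        'body'   => ($body === false) ? null : $body,
        'status' => curl_getinfo($ch, CURLINFO_HTTP_CODE), // 0 if no response was received
        'error'  => curl_error($ch),                       // empty string when cURL saw no error
    );
    curl_close($ch);

    return $result;
}
```

Printing `status` and `error` for each `$CompleteUrl` in the loop would show whether the failure is an HTTP-level response such as 400 or a problem in the transfer itself.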

 

Yet the Google query string looks correct to me, i.e. http://www.google.com/search?q=link building&start=0. It works in a browser but not through PHP/cURL.
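
One likely cause, offered as an assumption and not a confirmed diagnosis: the query contains a literal space. A browser encodes it before sending the request, while cURL sends the URL as given, and a raw space in the request line is malformed. A minimal sketch of building the URL with the query encoded follows; `buildSearchUrl` is a hypothetical helper, not part of the original script.

```php
<?php
// Hypothetical helper: same URL parts as the script, but with the search
// terms passed through urlencode() so that spaces and other reserved
// characters are safe to put in a query string.
function buildSearchUrl($query, $start)
{
    $GooglePrefix      = "http://www.google.com/search?q=";
    $GoogleCountSuffix = "&start=";

    // urlencode() turns a space into "+".
    return $GooglePrefix . urlencode($query) . $GoogleCountSuffix . $start;
}

echo buildSearchUrl("link building", 0), "\n";
// prints http://www.google.com/search?q=link+building&start=0
```

In the script itself, the equivalent change would be to wrap `$query` in `urlencode()` where `$CompleteUrl` is assembled.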

 

Any help is appreciated. Thanks.

 

