Why is a website redirecting when using cURL?

seany123 · January 31, 2017

Hello, I'm trying to use cURL to connect to a website, but when I output the returned page, I get redirected and end up back at "myowndomain.com"/back-soon.

Is there a way to see why the site is redirecting when using cURL? It doesn't redirect when I connect to the website through a normal web browser.

Code:

$url = "https://groceries.asda.com/";

$main_page = curlFunction($url);

echo $main_page;

function curlFunction($url)
{
	$cookie_file = "cookie.txt";
	
    // Assigning cURL options to an array
    $options = Array(
        CURLOPT_RETURNTRANSFER => TRUE,  // Setting cURL's option to return the webpage data
        CURLOPT_FOLLOWLOCATION => TRUE,  // Setting cURL to follow 'location' HTTP headers
        CURLOPT_AUTOREFERER => TRUE, // Automatically set the referer where following 'location' HTTP headers
        CURLOPT_CONNECTTIMEOUT => 120,   // Maximum time (in seconds) to wait while connecting
        CURLOPT_TIMEOUT => 120,  // Maximum time (in seconds) the whole cURL request may take
        CURLOPT_MAXREDIRS => 10, // Setting the maximum number of redirections to follow
        CURLOPT_USERAGENT => "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1a2pre) Gecko/2008073000 Shredder/3.0a2pre ThunderBrowse/3.2.1.8",  // Setting the useragent
        CURLOPT_URL => $url, // Setting cURL's URL option with the $url variable passed into the function
        // This is for the cookie.
        CURLOPT_COOKIESESSION => TRUE,
        CURLOPT_COOKIEFILE => $cookie_file,
        CURLOPT_COOKIEJAR => $cookie_file,
    );
	
    $ch = curl_init();  // Initialising cURL
    curl_setopt_array($ch, $options);   // Setting cURL's options using the previously assigned array data in $options
    $data = curl_exec($ch); // Executing the cURL request and assigning the returned data to the $data variable
    curl_close($ch);    // Closing cURL
    return $data;   // Returning the data from the function
}

Any help would be great.

sean

requinix · January 31, 2017

Your code is just dumping the HTML out to your browser. If there is any redirect then your browser is doing it, so use your browser's tools to find out why. For example, have it track HTTP requests, find the one that performs the redirect, and see what originated it.
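
One way to separate what cURL does from what the browser does is to print the fetched markup as escaped text, so the browser displays it instead of interpreting it. A minimal sketch, assuming the page is already held in a string; `showSource` is a hypothetical helper name, not part of the code above:

```php
<?php
// Hypothetical helper: escape fetched HTML so the browser shows it as text
// instead of running any scripts or meta refreshes it contains.
function showSource(string $html): string
{
    return '<pre>' . htmlspecialchars($html, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8') . '</pre>';
}

// Example input standing in for the string returned by curl_exec().
$fetched = '<script>window.location = "/back-soon";</script>';
echo showSource($fetched);
```

If the redirect stops once the output is escaped, that would suggest it is triggered by the returned markup running in the browser and not by cURL.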

seany123 · January 31, 2017

Thanks for your response.

I have just installed HttpRequester for Firefox and tried it.

It appears to display the website without any further redirects.

Edited January 31, 2017 by seany123

requinix · January 31, 2017

Well, when I tried your code I got a page that was empty except for the logo and black background for the navigation bar, so...

seany123 · January 31, 2017

Thanks for the response.

That's strange, I'm running the exact same code and it's redirecting.

Maybe something in my server's settings is causing the redirect.

requinix · January 31, 2017

I know that you didn't show the literal contents of some file because your post is missing the opening <?php. Is there anything else you didn't include?

dalecosp · January 31, 2017

<?php

//all your curl stuff here

print_r(curl_getinfo($ch));

Might give some kind of clue.
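
Note that `curl_getinfo($ch)` needs the handle to still be open, so in the function above it would have to be called before `curl_close($ch)`. As a rough sketch of which fields are worth reading, the helper below picks out the redirect-related entries; `summarizeTransfer` is a hypothetical name and the sample array is made up for illustration, while `url`, `http_code` and `redirect_count` are keys that `curl_getinfo()` returns:

```php
<?php
// Hypothetical helper: pick the redirect-related fields out of the array
// returned by curl_getinfo($ch). Call curl_getinfo() before curl_close($ch).
function summarizeTransfer(array $info): string
{
    return sprintf(
        "final URL: %s | HTTP status: %d | redirects followed: %d",
        $info['url'] ?? '(unknown)',
        $info['http_code'] ?? 0,
        $info['redirect_count'] ?? 0
    );
}

// Sample array for illustration only; real values come from curl_getinfo($ch).
$sample = ['url' => 'https://example.com/', 'http_code' => 200, 'redirect_count' => 0];
echo summarizeTransfer($sample), PHP_EOL;
```

A redirect count of 0 together with the expected final URL would suggest that cURL itself was not redirected.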

seany123 · February 1, 2017

This is exactly the code I'm running:

<?php
//error_reporting(E_ALL);
//ini_set('max_execution_time', 0);

$url = "https://groceries.asda.com/";

$main_page = Acurl($url);

echo $main_page;

function Acurl($url)
{
	//$cookie_file = "cookie.txt";
	
    // Assigning cURL options to an array
    $options = Array(
        CURLOPT_RETURNTRANSFER => TRUE,  // Setting cURL's option to return the webpage data
        CURLOPT_FOLLOWLOCATION => TRUE,  // Setting cURL to follow 'location' HTTP headers
        CURLOPT_AUTOREFERER => TRUE, // Automatically set the referer where following 'location' HTTP headers
        CURLOPT_CONNECTTIMEOUT => 120,   // Maximum time (in seconds) to wait while connecting
        CURLOPT_TIMEOUT => 120,  // Maximum time (in seconds) the whole cURL request may take
        CURLOPT_MAXREDIRS => 10, // Setting the maximum number of redirections to follow
        CURLOPT_USERAGENT => "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1a2pre) Gecko/2008073000 Shredder/3.0a2pre ThunderBrowse/3.2.1.8",  // Setting the useragent
        CURLOPT_URL => $url, // Setting cURL's URL option with the $url variable passed into the function
        // This is for the cookie.
        // NOTE: $cookie_file is undefined in this version because its assignment above is commented out.
        CURLOPT_COOKIESESSION => TRUE,
        CURLOPT_COOKIEFILE => $cookie_file,
        CURLOPT_COOKIEJAR => $cookie_file,
    );
	
    $ch = curl_init();  // Initialising cURL
    curl_setopt_array($ch, $options);   // Setting cURL's options using the previously assigned array data in $options
    $data = curl_exec($ch); // Executing the cURL request and assigning the returned data to the $data variable
    curl_close($ch);    // Closing cURL
    return $data;   // Returning the data from the function
}

Strange that even in your test it wasn't showing the entire webpage, just parts of the nav bar.

Might the website be doing something to block a cURL connection?

OK, I will give print_r(curl_getinfo($ch)) a try and see what it returns.

requinix · February 1, 2017

Have you looked at the HTML source of the page you're trying to copy? It references stylesheets and images and code. When I run your code from my "website" most of those references will break because the files don't exist, and it happens to break in such a way that most of the page is missing/not visible.

Unless you cloned the assorted other files, I would expect the same when you try it. The redirect is completely random.

Here's another thing to try: running it locally. Start the built-in server

php -S localhost:8000 -t /path/to/where/your/test/file/is

then go to http://localhost:8000/file.php and see what happens.
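
To illustrate the point about broken references: relative URLs in the copied HTML resolve against whichever host serves the copy. One possible workaround, which is an assumption on top of this thread and not something suggested in it, is to add a `<base>` tag pointing at the original site before echoing the page. `addBaseHref` is a hypothetical helper, and scripts on the page may still fail for other reasons:

```php
<?php
// Hypothetical helper: add a <base> tag after the opening <head> so relative
// URLs in the fetched HTML resolve against the original site.
function addBaseHref(string $html, string $baseUrl): string
{
    $base = '<base href="' . htmlspecialchars($baseUrl, ENT_QUOTES, 'UTF-8') . '">';

    // Insert after the first <head ...> tag only; if none is found
    // (or the regex fails), the HTML is returned unchanged.
    $result = preg_replace_callback(
        '/<head\b[^>]*>/i',
        function (array $m) use ($base) {
            return $m[0] . $base;
        },
        $html,
        1
    );

    return $result ?? $html;
}

// Example input standing in for the string returned by curl_exec().
$html = '<html><head><title>t</title></head><body><img src="/logo.png"></body></html>';
echo addBaseHref($html, 'https://example.com/');
```

With the tag in place, a reference such as `/logo.png` would be requested from the base URL and not from the server running the script.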

seany123 · February 2, 2017

To be completely honest, I haven't run into this problem before. Usually I'm able to use the function to display the website, then I use a simple scraping function to pull parts out of the source code.

I thought cURL downloaded the source code after the includes had been processed, so I didn't realize that include paths made a difference.

I will try running the script locally as you suggested.

Thanks

sean

Edited February 2, 2017 by seany123
