
Webscraping problem


Gilzean


I'm relatively new to PHP/MySQL/Apache, although I have a lot of IT experience in various technologies. Recently, I set about giving myself a crash course in PHP by developing a small generic web scraper. It is more or less working, but I have been stumped by one particular website that I am trying to extract information from: www.0044.co.uk. If, rather than navigating around the site using its buttons, you go straight to a page by entering its URL (for example, http://www.0044.co.uk/Tariffs/Global/calculate-ekit.asp?Action=compute), you are taken to that page without any problem.

 

When I try to mimic that using file_get_contents or cURL, I get an "object not found" error. I am baffled!

For simplicity's sake, let's say that the PHP looks like this:

 

 

$url = "http://www.0044.co.uk/Tariffs/Global/calculate-ekit.asp?Action=compute";
$data = file_get_contents($url);
echo "DATA = ";
echo $data;
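
One way to narrow this down, offered as a sketch rather than a confirmed fix: some servers respond differently when a request lacks browser-like headers, so it can help to send a User-Agent and look at the status line that comes back instead of only the body. The helper names `status_code_from_headers` and `fetch_with_headers` and the User-Agent string are made up for illustration, and whether this particular site checks those headers is an assumption.

```php
<?php
// Hypothetical helper: extract the numeric status code from the header
// lines that PHP exposes in $http_response_header after an HTTP request.
function status_code_from_headers(array $headers)
{
    $code = null;
    foreach ($headers as $line) {
        // After redirects there can be several status lines; keep the last one.
        if (preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m)) {
            $code = (int) $m[1];
        }
    }
    return $code;
}

// Hypothetical helper: fetch a URL with browser-like headers and return the
// status code together with the body, even for 4xx/5xx responses.
function fetch_with_headers($url)
{
    $context = stream_context_create(array(
        'http' => array(
            'method'        => 'GET',
            // Assumed header values; adjust to match what a real browser sends.
            'header'        => "User-Agent: Mozilla/5.0 (compatible; test-scraper)\r\n" .
                               "Accept: text/html\r\n",
            'timeout'       => 10,
            'ignore_errors' => true, // return the body of error pages too
        ),
    ));
    $body = @file_get_contents($url, false, $context);
    // The HTTP wrapper fills $http_response_header in the local scope.
    $headers = isset($http_response_header) ? $http_response_header : array();
    return array(
        'status' => status_code_from_headers($headers),
        'body'   => $body,
    );
}
```

Calling `fetch_with_headers($url)` and printing the `status` entry shows whether the server really answered with a 404, a 403, or a redirect, which is easier to reason about than the "object not found" text alone.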

 

 

 

 

Link to comment
https://forums.phpfreaks.com/topic/82190-webscraping-problem/

Are you looking for something like this?

site.com/page.php?link=page1

If so, here you go:

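<?php
// page.php
// Read the "link" query parameter, falling back to an empty string
// so a missing parameter does not raise an undefined-index notice.
$getlink = isset($_GET["link"]) ? $_GET["link"] : "";

if ($getlink == "page1") {
    echo "Hello This Is Page 1";
} elseif ($getlink == "page2") {
    echo "Welcome To Page 2";
} else {
    echo "Ooops! Page Not Found";
}
?>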

 

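Or, if you are trying to pull in the page from a remote site, then I guess you could try an include:

<?php
// Including a remote URL needs allow_url_include=On in php.ini,
// and whatever the remote server returns is evaluated as PHP.
include ("http://www.0044.co.uk/Tariffs/Global/calculate-ekit.asp?Action=compute");
?>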

 

I don't know, I am mostly new to PHP as well.

Link to comment
https://forums.phpfreaks.com/topic/82190-webscraping-problem/#findComment-417651

No. The URL is valid. I just want to know how to get hold of the contents of this particular page, especially using cURL.

My testbed PHP script for this looks like this:

 

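$url = "http://www.0044.co.uk/Tariffs/Global/calculate-ekit.asp?Action=compute";

// $randnum and $reffer are not defined in this excerpt;
// the placeholder values below are assumptions so the snippet runs on its own.
$randnum = mt_rand();
$reffer = "http://www.0044.co.uk/";

$ch = curl_init();
curl_setopt($ch, CURLOPT_COOKIEJAR, "/tmp/cookiejar-$randnum");   // write cookies here on close
curl_setopt($ch, CURLOPT_COOKIEFILE, "/tmp/cookiejar-$randnum");  // read cookies from here
curl_setopt($ch, CURLOPT_REFERER, $reffer);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);  // return the body instead of printing it
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);  // follow redirects
$data2 = curl_exec($ch);
curl_close($ch);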
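
A possible next step with cURL, sketched under the assumption that the server either cares about request headers or is returning a genuine error status behind the "object not found" text: add a User-Agent and record the HTTP status, the final URL after redirects, and any cURL error. The helper names `build_curl_options` and `fetch_with_curl` and the User-Agent value are hypothetical, not something taken from the script above.

```php
<?php
// Hypothetical helper: build the option array for a browser-like GET request.
function build_curl_options($url, $cookieFile)
{
    return array(
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow redirects
        CURLOPT_TIMEOUT        => 10,
        CURLOPT_COOKIEJAR      => $cookieFile,
        CURLOPT_COOKIEFILE     => $cookieFile,
        // Assumed value; some servers reject requests without a User-Agent.
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (compatible; test-scraper)',
    );
}

// Hypothetical helper: run the request and collect diagnostics with the body.
function fetch_with_curl($url, $cookieFile)
{
    $ch = curl_init();
    curl_setopt_array($ch, build_curl_options($url, $cookieFile));
    $body = curl_exec($ch); // false on transport failure
    // The handle is released when $ch goes out of scope at return.
    return array(
        'status'    => curl_getinfo($ch, CURLINFO_HTTP_CODE),
        'final_url' => curl_getinfo($ch, CURLINFO_EFFECTIVE_URL),
        'error'     => curl_error($ch),
        'body'      => $body,
    );
}
```

Printing `status`, `final_url` and `error` from the returned array before looking at `body` separates a transport problem from a redirect to an error page or a plain 404.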

Link to comment
https://forums.phpfreaks.com/topic/82190-webscraping-problem/#findComment-417656

