Gilzean Posted December 18, 2007

I'm relatively new to PHP/MySQL/Apache, although I have a lot of IT experience in various technologies. Recently I set about giving myself a crash course in PHP whilst developing a little generic web scraper. It is more or less there, but I have been stumped by one particular website that I have tried to extract information from.

The site is www.0044.co.uk. If, rather than navigating around the site with its buttons, you use a fast-track option (for example, entering a URL of http://www.0044.co.uk/Tariffs/Global/calculate-ekit.asp?Action=compute), you are taken to that page without any problem. When I try to mimic that using file_get_contents or cURL, I get an "object not found" error. I am baffled!

For simplicity's sake, let's say that the PHP looks like this:

    $url = "http://www.0044.co.uk/Tariffs/Global/calculate-ekit.asp?Action=compute";
    $data = file_get_contents($url);
    echo "DATA = ";
    echo $data;
Lamez Posted December 18, 2007

Are you looking for this? site.com/page.php?link=page1

If so, here you go:

    <?php
    // page.php
    // Read the "link" query parameter, defaulting to an empty string when it is missing.
    $getlink = isset($_GET["link"]) ? $_GET["link"] : "";
    if ($getlink == "page1") {
        echo "Hello This Is Page 1";
    } elseif ($getlink == "page2") {
        echo "Welcome To Page 2";
    } else {
        echo "Ooops! Page Not Found";
    }
    ?>

Or, if you are trying to take the page from a remote site, then I guess you could try an include:

    <?php
    // Only works with allow_url_include enabled, and runs the remote response as PHP code.
    include ("http://www.0044.co.uk/Tariffs/Global/calculate-ekit.asp?Action=compute");
    ?>

I don't know, I am mostly new to PHP as well.
Gilzean Posted December 18, 2007

No. The URL is valid. I just want to know how to get hold of the contents of this particular one, especially using cURL. My testbed PHP script for this looks like so:

    // Note: $randnum and $reffer are not defined anywhere in this excerpt.
    $url = "http://www.0044.co.uk/Tariffs/Global/calculate-ekit.asp?Action=compute";
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_COOKIEJAR, "/tmp/cookiejar-$randnum");
    curl_setopt($ch, CURLOPT_COOKIEFILE, "/tmp/cookiejar-$randnum");
    curl_setopt($ch, CURLOPT_REFERER, $reffer);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    $data2 = curl_exec($ch);
    curl_close($ch);
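The thread never establishes why the server answers "object not found", so the following is only a diagnostic sketch, not a fix. It sets the same cURL options as the testbed script, but defines the cookie jar path explicitly and returns the HTTP status code and cURL error text alongside the body, so a failing request can be inspected. The helper names `build_curl_options` and `fetch_page` are hypothetical, and the User-Agent header is an assumption: some servers respond differently to requests that do not look like a browser, but nothing in the thread confirms that this site does.

```php
<?php
// Hypothetical helper: builds the cURL option array for one request.
// The User-Agent value is an assumption, not something the thread confirms is needed.
function build_curl_options(string $url, string $cookieJar, string $referer = ''): array
{
    $options = [
        CURLOPT_URL            => $url,
        CURLOPT_COOKIEJAR      => $cookieJar,  // cookies are written here when the handle is freed
        CURLOPT_COOKIEFILE     => $cookieJar,  // cookies are read from here
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (compatible; ExampleScraper/1.0)',
        CURLOPT_TIMEOUT        => 10,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
    ];
    // Only send a Referer header when one was actually supplied.
    if ($referer !== '') {
        $options[CURLOPT_REFERER] = $referer;
    }
    return $options;
}

// Hypothetical helper: performs the request and reports status, error text and body.
function fetch_page(string $url, string $referer = ''): array
{
    $cookieJar = tempnam(sys_get_temp_dir(), 'cookiejar-');
    $ch = curl_init();
    curl_setopt_array($ch, build_curl_options($url, $cookieJar, $referer));

    $body   = curl_exec($ch);                          // false on transport failure
    $error  = curl_error($ch);                         // empty string when there was no error
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);   // 0 when no response was received
    unset($ch);                                        // frees the handle

    return ['status' => $status, 'error' => $error, 'body' => $body];
}
```

Calling `fetch_page($url)` with the URL from the post and printing the `status` and `error` entries shows whether the "object not found" page arrives with a 404 status, after a redirect, or as an ordinary 200 response.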
cooldude832 Posted December 18, 2007

Don't even use cURL. I did this for someone else: http://www.phpfreaks.com/forums/index.php/topic,172445.0.html

You find a tag structure that is consistent and work off that.
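Working off a consistent tag structure could look something like the sketch below, which uses PHP's DOMDocument and DOMXPath. This is not the code from the linked topic. The function name `extract_rows`, the `tariffs` class name, and the sample HTML with its country names and rates are all made up for illustration; the real page's markup would have to be inspected first.

```php
<?php
// Hypothetical helper: returns the text of every <td> in each row of a table
// marked class="tariffs". The class name is an assumption about the markup.
function extract_rows(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $rows = [];
    foreach ($xpath->query('//table[@class="tariffs"]//tr') as $tr) {
        $cells = [];
        foreach ($xpath->query('./td', $tr) as $td) {
            $cells[] = trim($td->textContent);
        }
        // Rows without <td> cells (such as a header row of <th>) are skipped.
        if ($cells !== []) {
            $rows[] = $cells;
        }
    }
    return $rows;
}

// Made-up sample markup standing in for a fetched page.
$sample = '<table class="tariffs">'
        . '<tr><th>Country</th><th>Rate</th></tr>'
        . '<tr><td>France</td><td>0.25</td></tr>'
        . '<tr><td>Spain</td><td>0.30</td></tr>'
        . '</table>';

print_r(extract_rows($sample));
```

An XPath query tends to survive whitespace and attribute-order changes in the page better than string or regular-expression matching does, though it still breaks if the site restructures its tables.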
Archived
This topic is now archived and is closed to further replies.