zavin, posted January 4, 2011
Is it possible to write a PHP script that would extract data from an external web page? I've been working with PHP for a few years as a hobby, but I've never seen this done or had a reason to do it until now. Thanks in advance for any responses.
QuickOldCar, posted January 4, 2011
Use curl: http://php.net/manual/en/book.curl.php
johnny86, posted January 4, 2011
It depends on what you need to extract. Do you want something from the page's source code, or do you want to transfer something like a file?
zavin (author), posted January 4, 2011
QuickOldCar - Thanks, I'm going to start studying up on cURL to see if it's what I need. johnny86 - I want to extract information. For example, if a web page had a list of prices, I would want my page to list some of those prices, chosen by item. I hope that makes sense.
QuickOldCar, posted January 4, 2011
You may want to use http://simplehtmldom.sourceforge.net/ for the parsing, with curl to resolve the URLs first. If you look into the DOM parser, I think that, alongside curl, it can do what you need. Or just do everything with curl, up to you.
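To make the DOM suggestion concrete, here is a minimal sketch of pulling item/price pairs out of a page. It is not simplehtmldom: it swaps in PHP's bundled `DOMDocument` and `DOMXPath` classes so it runs without a download. The function name `extract_prices_dom`, the sample markup, and the `item`/`price` class names are all assumptions made up for illustration; a real page will need its own XPath queries.

```php
<?php
// Sketch only: uses PHP's bundled DOM extension, not simplehtmldom.
// The markup and the "item"/"price" class names are invented for illustration.

function extract_prices_dom(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath  = new DOMXPath($doc);
    $prices = [];

    // Assumed row shape: <li><span class="item">..</span><span class="price">..</span></li>
    foreach ($xpath->query('//li[span[@class="item"] and span[@class="price"]]') as $row) {
        $item  = $xpath->query('span[@class="item"]', $row)->item(0)->textContent;
        $price = $xpath->query('span[@class="price"]', $row)->item(0)->textContent;
        $prices[trim($item)] = trim($price);
    }

    return $prices;
}

// Stand-in for HTML that would normally be fetched with curl.
$sampleHtml = '<ul><li><span class="item">Widget</span><span class="price">$4.99</span></li>'
            . '<li><span class="item">Gadget</span><span class="price">$12.50</span></li></ul>';

print_r(extract_prices_dom($sampleHtml));
```

Keeping the parsing in a function that takes a string means the same code works whether the HTML came from curl or from any other source.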
johnny86, posted January 4, 2011
Or you can just fetch the source code of the site with $source = file_get_contents('http://site.to.fetch.from/'); and then use preg_match to extract what you need. That would be a bit faster than using simplehtmldom, but you can also use simplehtmldom with your $source.
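A rough sketch of the regex approach for the price-list case follows. It uses `preg_match_all` rather than `preg_match`, since a price list has many rows to capture. The function name `extract_prices_regex`, the table markup, and the `$wanted` list are hypothetical; the pattern has to be adapted to the real page, and regexes tend to break when the markup changes.

```php
<?php
// Sketch only: the table layout, class names, and item names are made up for illustration.

function extract_prices_regex(string $source): array
{
    // Capture the text of an "item" cell and the "price" cell that follows it.
    $pattern = '~<td class="item">(.*?)</td>\s*<td class="price">(.*?)</td>~s';
    preg_match_all($pattern, $source, $matches, PREG_SET_ORDER);

    $prices = [];
    foreach ($matches as $m) {
        $prices[trim($m[1])] = trim($m[2]);
    }
    return $prices;
}

// In real use: $source = file_get_contents('http://site.to.fetch.from/');
// file_get_contents() returns false on failure, so check the result before parsing.
$source = '<table><tr><td class="item">Widget</td><td class="price">$4.99</td></tr>'
        . '<tr><td class="item">Gadget</td><td class="price">$12.50</td></tr></table>';

// Hypothetical list of the items to show on your own page.
$wanted   = ['Gadget'];
$selected = array_intersect_key(extract_prices_regex($source), array_flip($wanted));

print_r($selected);
```

Note that fetching a URL with `file_get_contents` only works when the `allow_url_fopen` setting is enabled in php.ini.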
zavin (author), posted January 4, 2011
I think $source = file_get_contents('http://site.to.fetch.from/'); will work for me, but I'm also going to look into simplehtmldom a little more. The first problem I have is that the site I'm trying to get info from requires a login, and of course when I try to log in, it looks for the site's login page on my server. Is there a workaround for this?
lastkarrde, posted January 4, 2011
Quoting zavin: "of course when I try to log in it looks for the log in page for the site on my server"
I don't understand. cURL can store cookies. Send a request to the login page, passing the appropriate login parameters via POST, then send another request to the page you want to access.
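The two-request flow described above could look roughly like this. It is a sketch under assumptions: the function name `fetch_behind_login`, the URLs, and the form field names are placeholders, and many real login forms also expect hidden fields or tokens that this does not handle. Posting to the login form's absolute URL on the remote site, rather than a relative path, keeps the request from being resolved against your own server.

```php
<?php
// Sketch only: URLs and form field names are placeholders, not a real site's API.

/**
 * Logs in with a POST, then fetches a second page using the same cookies.
 * Returns the page body as a string, or false if either request fails.
 */
function fetch_behind_login(string $loginUrl, array $credentials, string $targetUrl)
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow the redirect most logins issue
        CURLOPT_COOKIEFILE     => '',    // empty string: keep cookies in memory on this handle
    ]);

    // Step 1: POST the login parameters to the remote login page.
    curl_setopt($ch, CURLOPT_URL, $loginUrl);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($credentials));
    if (curl_exec($ch) === false) {
        return false;
    }

    // Step 2: GET the protected page; the session cookie is sent automatically.
    curl_setopt($ch, CURLOPT_URL, $targetUrl);
    curl_setopt($ch, CURLOPT_HTTPGET, true);

    return curl_exec($ch);
}

// Example call (placeholders, left commented out so nothing is requested by accident):
// $page = fetch_behind_login(
//     'http://site.to.fetch.from/login.php',
//     ['username' => 'me', 'password' => 'secret'],
//     'http://site.to.fetch.from/prices.php'
// );
```

Reusing one handle for both requests is what carries the cookies across. To keep a session between script runs, `CURLOPT_COOKIEJAR` and `CURLOPT_COOKIEFILE` can point at a file on disk instead.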