gigabyt3r Posted December 4, 2009

How can I grab information from a webpage? For example, if I wanted to get the time a search took on Google, where it says "Results 1 - 10 of about 610,000,000 for php. (0.05 seconds)", how could I get the '0.05' bit from the source to use it as a string?

Thanks in advance

Link to comment https://forums.phpfreaks.com/topic/183963-how-to-grab-information-from-a-webpage/

Deoctor Posted December 4, 2009

Try this code:

    <?php
    $ch = curl_init("http://www.example.com/reallybigfile.tar.gz");
    // Make curl_exec() return the response body as a string instead of printing it.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $output = curl_exec($ch);
    if ($output === false) {
        die("Download failed: " . curl_error($ch));
    }
    // Save the downloaded content to a file.
    $fh = fopen("out.txt", 'w');
    fwrite($fh, $output);
    fclose($fh);
    ?>

This will write the cURL result for a webpage into a file.

oni-kun Posted December 4, 2009

You can use regex to find the string "(x.xx seconds)". I'm not good enough with it to tell you the regex, though. To open the web page as a string, you'd essentially use this:

    $query = "php";
    $result = file_get_contents("http://www.google.com/search?q=".urlencode($query));
    preg_match(....) //Here

MadTechie Posted December 4, 2009

Example (expanded from oni-kun):

    <?php
    $query = "php";
    // Download the results page as a string.
    $html = file_get_contents("http://www.google.com/search?q=" . urlencode($query));
    // Capture the bold text just before " seconds)", e.g. 0.05.
    if (preg_match('%\(<b>([^<]*)</b> seconds\)%i', $html, $found)) {
        echo $found[1];
    }

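The pattern above can be tried without hitting Google at all. Here is a minimal sketch, assuming the number may or may not be wrapped in <b> tags. The function name extract_search_seconds and the sample strings are made up for illustration and are not from the thread.

```php
<?php
// Hypothetical helper: pull the "0.05" out of "(<b>0.05</b> seconds)".
// Returns the number as a string, or null when no timing text is found.
function extract_search_seconds(string $html): ?string
{
    // "(" + optional <b> + digits with optional decimals + optional </b> + " seconds)"
    $pattern = '%\((?:<b>)?\s*([0-9]+(?:\.[0-9]+)?)\s*(?:</b>)?\s*seconds\)%i';
    if (preg_match($pattern, $html, $found) === 1) {
        return $found[1];
    }
    return null;
}
```

Checking for null before using the result avoids the undefined index notice you would get from echoing $found[1] when the page layout changes.
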
Deoctor Posted December 4, 2009

Check this code, which I have written:

    <?php
    $url_feed = 'http://chaitu09986025424.blog.co.in/feed/rss/';
    $ch = curl_init();
    // Encode the feed URL so it is passed safely as a query parameter.
    curl_setopt($ch, CURLOPT_URL, "http://validator.w3.org/feed/check.cgi?url=" . urlencode($url_feed));
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $output = curl_exec($ch);
    curl_close($ch);

    // Save the whole response, closing the file before reading it back.
    $fh = fopen("out.txt", 'w');
    fwrite($fh, $output);
    fclose($fh);

    // Read 95 bytes starting at byte offset 1256.
    $section = file_get_contents('./out.txt', false, null, 1256, 95);
    //echo $section;
    $fh1 = fopen("out_1.txt", 'w');
    fwrite($fh1, $section);
    fclose($fh1);
    var_dump($section);
    ?>

This will write the required part to the out_1.txt file and also store it in the string $section, so you can check it and display the result accordingly. You need to change the numbers in the line

    $section = file_get_contents('./out.txt', false, null, 1256, 95);

to match your criteria.

MadTechie Posted December 4, 2009

And how exactly does that get the seconds? My extension of oni-kun's code seems more practical, and while it may be necessary to use cURL, that's a simple change.

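For reference, the cURL change could look roughly like this. It is only a sketch: the function names fetch_html and seconds_from_url and the timeout values are assumptions for illustration, and whether Google's current markup still matches the pattern from the earlier example is not something the thread confirms.

```php
<?php
// Hypothetical helper: download a URL with cURL, or return null on failure.
function fetch_html(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);    // seconds to wait for a connection
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // seconds allowed for the whole request
    $body = curl_exec($ch);
    return is_string($body) ? $body : null;
}

// Hypothetical helper: fetch a page and apply the same pattern as the
// file_get_contents() example. Returns null if the fetch or the match fails.
function seconds_from_url(string $url): ?string
{
    $html = fetch_html($url);
    if ($html === null) {
        return null;
    }
    if (preg_match('%\(<b>([^<]*)</b> seconds\)%i', $html, $found) === 1) {
        return $found[1];
    }
    return null;
}

// Usage (needs network access):
// echo seconds_from_url("http://www.google.com/search?q=" . urlencode("php"));
```

Only the download step differs from the file_get_contents() version; the regex is unchanged.
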
Deoctor Posted December 4, 2009

What I am doing here is checking for a particular piece of text that cURL writes to the txt file:

    <img alt="[Valid RSS]" title="Valid RSS" src="images/valid-rss.png" /> This is a valid RSS feed.

In the same way, this can be used to look for text like this on Google:

    (0.21 seconds)

and then take the value displayed before the word "seconds", which you can print or store somewhere for later use. I don't think this is an impossible task.

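The "find the words, then take the value before them" idea can be sketched with plain string functions instead of fixed byte offsets. This is an illustration only: the function name value_before_marker and the default marker " seconds)" are assumptions, not code from the thread.

```php
<?php
// Hypothetical helper: return the text between the last "(" and the marker,
// e.g. "0.21" from "(0.21 seconds)". Returns null if either piece is missing.
function value_before_marker(string $text, string $marker = ' seconds)'): ?string
{
    $plain = strip_tags($text);          // drop tags such as <b> around the number
    $end = strpos($plain, $marker);      // where the marker text begins
    if ($end === false) {
        return null;
    }
    // Search backwards from the marker for the opening parenthesis.
    $start = strrpos(substr($plain, 0, $end), '(');
    if ($start === false) {
        return null;
    }
    return trim(substr($plain, $start + 1, $end - $start - 1));
}
```

Because the position is found by searching, the result does not break when the page length changes, which is the weak point of hard-coded offsets like 1256 and 95.
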
MadTechie Posted December 4, 2009

Well, the logic you're using seems long-winded and has pointless code! Why you're passing it to w3.org seems a little weird! A simple download of the HTML and an extract is all that's needed.

Deoctor Posted December 4, 2009

Actually, the code I have used here is for validating an RSS feed site, which is why I used this site instead of a regular expression, because regular expressions are not so trustworthy, at least in my case. I don't think there is anything weird in the code, if you had checked it out and seen the result of what I was giving.

MadTechie Posted December 4, 2009

"Actually, the code I have used here is for validating an RSS feed site, which is why I used this site instead of a regular expression,"

In that case, why not just use an XML parser instead of relying on 2 sites to pull out a small section of text?

Deoctor Posted December 4, 2009

I have used that one too, but there was an issue with some of the RSS feed URLs: they were not getting recognised.

    if (!@$xml=simplexml_load_file("$subscr"))

That is the code I used; $subscr is the URL of the RSS feed. As this failed, I am trying to fetch the data by passing it to another site.

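The @ in that line hides the reason the feed was rejected. Here is a rough sketch of how the parser errors could be collected instead, assuming the feed has already been downloaded into a string. The function name parse_feed and the sample XML are made up for illustration.

```php
<?php
// Hypothetical helper: parse feed XML and report why it failed, if it did.
// Returns [SimpleXMLElement|null, list of error messages].
function parse_feed(string $xml): array
{
    // Collect libxml errors instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = simplexml_load_string($xml);
    $messages = [];
    foreach (libxml_get_errors() as $error) {
        $messages[] = trim($error->message) . ' (line ' . $error->line . ')';
    }
    libxml_clear_errors();
    libxml_use_internal_errors($previous); // restore the earlier setting
    return [$doc === false ? null : $doc, $messages];
}

// Usage: list($doc, $errors) = parse_feed($downloadedXml);
```

Printing the collected messages for a feed that is not recognised should show whether the XML is actually malformed or whether the download itself returned something unexpected.
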
Archived
This topic is now archived and is closed to further replies.