gigabyt3r Posted December 4, 2009

How can I grab information from a webpage? For example, if I wanted to get the time a search took on Google, where it says "Results 1 - 10 of about 610,000,000 for php. (0.05 seconds)", how could I get the '0.05' bit from the source to use it as a string?

Thanks in advance
Deoctor Posted December 4, 2009

Try this code:

<?php
// Fetch the URL and return the response as a string instead of printing it.
$ch = curl_init("http://www.example.com/reallybigfile.tar.gz");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// CURLOPT_BINARYTRANSFER is omitted: with CURLOPT_RETURNTRANSFER the raw data is returned anyway.
$output = curl_exec($ch);
if ($output === false) {
    die("cURL error: " . curl_error($ch));
}
// Write the downloaded data to a file.
$fh = fopen("out.txt", 'w');
fwrite($fh, $output);
fclose($fh);
?>

This will write the cURL result of a webpage into a file.
oni-kun Posted December 4, 2009

You can use a regex to find the string "(x.xx seconds)". I'm not good enough with regex to tell you the pattern, though. To open the web page as a string, you'd essentially use this:

$query = "php";
$result = file_get_contents("http://www.google.com/search?q=" . urlencode($query));
preg_match(....) // Here
MadTechie Posted December 4, 2009

Example (expanded from oni-kun's):

<?php
$query = "php";
$html = file_get_contents("http://www.google.com/search?q=" . urlencode($query));
// Capture the text inside <b>...</b> that comes right before " seconds)".
if ($html !== false && preg_match('%\(<b>([^<]*)</b> seconds\)%i', $html, $found)) {
    echo $found[1];
}
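The same idea can be wrapped in a small function that works on HTML you have already downloaded. This is only a sketch: the name extract_search_time is hypothetical, and it assumes the page still prints the time as "(0.05 seconds)", with or without <b> tags around the number, and with a dot as the decimal separator.

```php
<?php
// Hypothetical helper (not from the thread): pull the "N.NN" figure that
// precedes " seconds)" out of a string of HTML. Returns null on no match.
function extract_search_time(string $html): ?string
{
    // The <b> and </b> tags are optional; the number is captured in group 1.
    $pattern = '%\(\s*(?:<b>)?\s*(\d+(?:\.\d+)?)\s*(?:</b>)?\s*seconds\)%i';
    if (preg_match($pattern, $html, $found) === 1) {
        return $found[1];
    }
    return null;
}

// Sample string modelled on the text quoted in the question.
$sample = 'Results 1 - 10 of about 610,000,000 for php. (<b>0.05</b> seconds)';
echo extract_search_time($sample), "\n";
```

Returning null rather than reading $found[1] unconditionally avoids an undefined-index notice when the page layout changes and the pattern stops matching.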
Deoctor Posted December 4, 2009

Check this code, which I have written:

<?php
$url_feed = 'http://chaitu09986025424.blog.co.in/feed/rss/';
$ch = curl_init();
// Encode the feed URL so it survives being passed as a query parameter.
curl_setopt($ch, CURLOPT_URL, "http://validator.w3.org/feed/check.cgi?url=" . urlencode($url_feed));
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$output = curl_exec($ch);
curl_close($ch);

// Save the full response, closing the file before it is read back.
$fh = fopen("out.txt", 'w');
fwrite($fh, $output);
fclose($fh);

// Read 95 bytes starting at offset 1256.
$section = file_get_contents('./out.txt', false, null, 1256, 95);
//echo $section;
$fh1 = fopen("out_1.txt", 'w');
fwrite($fh1, $section);
fclose($fh1);
var_dump($section);
?>

This will write the required section to the out_1.txt file and also store it in the string $section, so you can check it and display the result accordingly. You need to change the numbers in the line

$section = file_get_contents('./out.txt', false, null, 1256, 95);

to match your criteria.
MadTechie Posted December 4, 2009

And how exactly does that get the seconds? My extension of oni-kun's code seems more practical, and while it may be necessary to use cURL, that's a simple change.
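For reference, swapping file_get_contents for cURL might look roughly like the sketch below. The function names build_search_url and fetch_html are hypothetical, and the timeout and redirect settings are assumptions rather than anything the thread specifies.

```php
<?php
// Hypothetical helpers (names are not from the thread).

// Build the search URL the same way the earlier posts do.
function build_search_url(string $query): string
{
    return 'http://www.google.com/search?q=' . urlencode($query);
}

// Download a page with cURL; returns null on any transfer error.
function fetch_html(string $url, int $timeoutSeconds = 10): ?string
{
    $ch = curl_init($url);
    if ($ch === false) {
        return null;
    }
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);     // return the body as a string
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);     // follow redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds); // assumed limit, in seconds
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}

echo build_search_url('php'), "\n";
```

The string returned by fetch_html(build_search_url($query)) can then be passed to preg_match in place of the file_get_contents result.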
Deoctor Posted December 4, 2009

What I am doing here is checking for the particular text that the cURL call writes to the txt file:

<img alt="[Valid RSS]" title="Valid RSS" src="images/valid-rss.png" /> This is a valid RSS feed.

In the same way, this can be used to look for particular words on Google, such as "(0.21 seconds)": check the value displayed before the word "seconds", then print it or store it somewhere for later use. I don't think this is an impossible task.
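One way to do that check without a regex, and without hard-coded byte offsets, is to search for the marker text directly. This is a minimal sketch: contains_marker and text_between are hypothetical names, and the sample strings are the ones quoted in this thread.

```php
<?php
// Hypothetical helper: true if $marker appears anywhere in $page.
function contains_marker(string $page, string $marker): bool
{
    return strpos($page, $marker) !== false;
}

// Hypothetical helper: return the text between $start and the next $end,
// or null if either delimiter is missing.
function text_between(string $haystack, string $start, string $end): ?string
{
    $from = strpos($haystack, $start);
    if ($from === false) {
        return null;
    }
    $from += strlen($start);
    $to = strpos($haystack, $end, $from);
    if ($to === false) {
        return null;
    }
    return substr($haystack, $from, $to - $from);
}

$validator = '<img alt="[Valid RSS]" title="Valid RSS" src="images/valid-rss.png" /> This is a valid RSS feed.';
var_dump(contains_marker($validator, 'This is a valid RSS feed.'));

echo text_between('for php. (0.21 seconds)', '(', ' seconds)'), "\n";
```

Unlike reading 95 bytes from offset 1256, this keeps working when the amount of markup before the marker changes.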
MadTechie Posted December 4, 2009

Well, the logic you're using seems long-winded and has pointless code! Why you're passing it to w3.org seems a little weird! A simple download of the HTML and an extract is all that's needed.
Deoctor Posted December 4, 2009

Actually, the code I used here is for validating an RSS feed site, which is why I used this site instead of a regular expression, because regular expressions are not so trustworthy, at least in my case. I don't think there is anything weird in the code, if you had checked it out and seen the result I was getting.
MadTechie Posted December 4, 2009

Quote: "actually the code which i have used here is for the validation of the rss feed site for which i have used this site instead of using a reg expression,"

In that case, why not just use an XML parser, instead of relying on 2 sites to pull out a small section of text?
Deoctor Posted December 4, 2009

I have used that one too, but there was an issue with some of the RSS feed URLs: they were not getting recognised.

if (!@$xml=simplexml_load_file("$subscr"))

That is the code I used; $subscr is the URL of the RSS feed. As this approach failed, I am trying to fetch the data by passing it to another site.
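When a feed is "not recognised", the @ operator hides the reason. A sketch of one way to see what the parser objected to is below; parse_feed is a hypothetical name, and it assumes the feed XML has already been downloaded into a string.

```php
<?php
// Hypothetical helper: parse feed XML and collect the parser's messages
// instead of suppressing them with @.
// Returns [SimpleXMLElement|null, string[]].
function parse_feed(string $xml): array
{
    // Collect libxml errors internally rather than emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $feed = simplexml_load_string($xml);

    $errors = [];
    foreach (libxml_get_errors() as $error) {
        $errors[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }
    libxml_clear_errors();
    libxml_use_internal_errors($previous); // restore the earlier setting

    // Compare with === false: an element object can itself evaluate as falsy.
    return [$feed === false ? null : $feed, $errors];
}

[$feed, $errors] = parse_feed('<rss><channel><title>Demo</title></channel></rss>');
echo (string) $feed->channel->title, "\n";

[$broken, $why] = parse_feed('<rss><channel>');
print_r($why);
```

The messages in $why usually point to the cause, such as an unclosed tag or invalid characters in the feed.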