
How to grab information from a webpage?


gigabyt3r


How can I grab information from a webpage? For example, if I wanted to get the time a search took on Google, where it says "Results 1 - 10 of about 610,000,000 for php. (0.05 seconds)", how could I get the '0.05' bit from the source to use as a string?

 

Thanks in advance


Try this code:

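<?php
// Fetch the URL and return the response body as a string instead of printing it.
$ch = curl_init("http://www.example.com/reallybigfile.tar.gz");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// CURLOPT_BINARYTRANSFER is left out: it is not needed on current PHP versions.
$output = curl_exec($ch);
if ($output === false) {
    // curl_exec() returns false on failure; stop before writing an empty file.
    die('cURL error: ' . curl_error($ch));
}

// Save the response body to a file.
$fh = fopen("out.txt", 'w');
fwrite($fh, $output);
fclose($fh);
?>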

 

This writes the cURL result for a webpage into a file.



You can use a regex to find the string "(x.xx seconds)". I'm not good enough with regex to give you the pattern, though. To load the web page into a string, you'd use this:

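$query = "php";
// Requires allow_url_fopen to be enabled in php.ini.
$result = file_get_contents("http://www.google.com/search?q=" . urlencode($query));
// preg_match(...) goes here to pull the "(x.xx seconds)" value out of $result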
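
One possible pattern for the preg_match step, as a rough sketch only. It assumes the page contains text of the form "(0.05 seconds)", which may not hold for the markup Google actually serves. The function name extract_search_time and the sample string are made up for illustration; strip_tags() is applied first in case the number is wrapped in tags in the raw HTML.

```php
<?php
// Hypothetical helper: returns the number inside "(N seconds)" as a string,
// or null if the text contains no such fragment.
function extract_search_time(string $html): ?string
{
    // Drop any HTML tags so "(<b>0.05</b> seconds)" becomes "(0.05 seconds)".
    $text = strip_tags($html);

    // \(            literal opening parenthesis
    // (\d+\.\d+)    captured decimal number such as 0.05
    // \s*seconds\)  the word "seconds" followed by a closing parenthesis
    if (preg_match('/\((\d+\.\d+)\s*seconds\)/', $text, $matches) === 1) {
        return $matches[1];
    }
    return null;
}

// Sample text taken from the question; a real page would come from file_get_contents().
$sample = 'Results 1 - 10 of about 610,000,000 for php. (0.05 seconds)';
var_dump(extract_search_time($sample));
```

The captured value is a string, so cast it with (float) if you need to do arithmetic on it.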


Check this code, which I wrote:

<?php
$url_feed = 'http://chaitu09986025424.blog.co.in/feed/rss/';
$ch = curl_init();
// URL-encode the feed address so it survives as a query-string value.
curl_setopt($ch, CURLOPT_URL, "http://validator.w3.org/feed/check.cgi?url=" . urlencode($url_feed));
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$output = curl_exec($ch);
if ($output === false) {
    die('cURL error: ' . curl_error($ch));
}
curl_close($ch);

// Write the full response, and close the file before reading it back
// so that all buffered data is on disk.
$fh = fopen("out.txt", 'w');
fwrite($fh, $output);
fclose($fh);

// Read 95 bytes starting at byte offset 1256 of the saved page.
$section = file_get_contents('./out.txt', false, null, 1256, 95);
//echo $section;
$fh1 = fopen("out_1.txt", 'w');
fwrite($fh1, $section);
fclose($fh1);
var_dump($section);
?>

 

This writes the required part to the out_1.txt file and also stores it in the string $section.

You can then check that string and display the result accordingly.

 

You need to change the two numbers in this call (the byte offset, 1256, and the length, 95)

$section = file_get_contents('./out.txt', false, null, 1256, 95);
to match your criteria.
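
Hard-coded offsets break as soon as the page layout shifts by a byte. A possible alternative, sketched below, is to look up the offset with strpos() and slice from there. The helper name slice_from_marker and the sample page text are hypothetical, not part of the original code.

```php
<?php
// Hypothetical helper: returns up to $length bytes of $text starting at the
// first occurrence of $marker, or null when the marker is absent.
function slice_from_marker(string $text, string $marker, int $length): ?string
{
    $offset = strpos($text, $marker);
    if ($offset === false) {
        return null;
    }
    return substr($text, $offset, $length);
}

// Made-up stand-in for the saved validator page.
$page = 'header ... <img alt="[Valid RSS]" /> This is a valid RSS feed. ... footer';
var_dump(slice_from_marker($page, 'This is a valid', 25));
```

The same offset could also be passed to file_get_contents() if you prefer to keep reading from out.txt.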

What I am doing here is checking for a particular phrase that the cURL call writes to the text file:

<img alt="[Valid RSS]" title="Valid RSS" src="images/valid-rss.png" /> This is a valid RSS feed.
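
If all you need is a yes/no answer, a plain substring test on the fetched page may be enough. This is only a sketch: the function name report_says_valid is made up, and it assumes the validator keeps using the exact phrase shown above.

```php
<?php
// Hypothetical helper: true when the validator page contains the success phrase.
function report_says_valid(string $html): bool
{
    // stripos() is case-insensitive and returns false when nothing is found;
    // compare with !== false because a match at position 0 is also falsy.
    return stripos($html, 'This is a valid RSS feed') !== false;
}

$page = '<img alt="[Valid RSS]" title="Valid RSS" src="images/valid-rss.png" /> This is a valid RSS feed.';
var_dump(report_says_valid($page));
```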

 

In the same way, this can be used to check for particular words like these

for google.  (0.21 seconds)

 

and read the value displayed before the word "seconds", then print that value or store it somewhere for later use.
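
A rough sketch of that idea using only string functions, no regex. The helper name value_before_seconds is hypothetical, and it assumes the value sits between "(" and " seconds)" exactly as in the sample line above.

```php
<?php
// Hypothetical helper: finds " seconds)" and returns the text between it and
// the nearest preceding "(", or null if either part is missing.
function value_before_seconds(string $text): ?string
{
    $end = strpos($text, ' seconds)');
    if ($end === false) {
        return null;
    }
    // Search the text before " seconds)" for the last opening parenthesis.
    $start = strrpos(substr($text, 0, $end), '(');
    if ($start === false) {
        return null;
    }
    return substr($text, $start + 1, $end - $start - 1);
}

var_dump(value_before_seconds('for google.  (0.21 seconds)'));
```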

 

 

I don't think this is an impossible task >:(


Actually, the code I used here is for validating an RSS feed. I pass the feed to the validator site instead of using a regular expression,

because regular expressions have not been very trustworthy in my case.

I don't think there is anything weird in the code, if you had tried it and looked at the result it gives :P


I have used that one too,

but there was an issue with some of the RSS feed URLs:

they were not being recognised.

if (!@$xml=simplexml_load_file("$subscr"))

 

That is the code I used.

$subscr is the URL of the RSS feed I am using.

As this approach failed, I am trying to fetch the data by passing the feed to another site.
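
The @ operator hides the reason a feed fails to load. One way to see it, sketched here under assumptions, is to let libxml collect its errors and print them. The helper name parse_feed is hypothetical, and it parses a string (for example one fetched with cURL as shown earlier in the thread) rather than loading the URL directly.

```php
<?php
// Hypothetical helper: parses feed XML and returns the SimpleXMLElement (or
// null on failure) together with any libxml error messages.
function parse_feed(string $xml): array
{
    // Collect parser errors instead of emitting warnings (replaces the @ operator).
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $feed = simplexml_load_string($xml);

    $errors = [];
    foreach (libxml_get_errors() as $error) {
        $errors[] = trim($error->message) . ' (line ' . $error->line . ')';
    }
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return ['feed' => $feed === false ? null : $feed, 'errors' => $errors];
}

// Made-up minimal feed, used only to show the call.
$result = parse_feed('<rss version="2.0"><channel><title>Demo</title></channel></rss>');
if ($result['feed'] !== null) {
    echo (string) $result['feed']->channel->title, "\n";
} else {
    echo implode("\n", $result['errors']), "\n";
}
```

The error messages usually show whether the feed is malformed XML or whether something other than XML came back from the URL.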

 

