
[SOLVED] getting info from another site


nutt318


Not sure if this is the correct area, but here is my question. There is a website with a section that shows some price quotes and stocks. I want to capture that information and put it on my own website. I am not sure how to do this, or what to look for in the page source.

 

So basically I just want to copy a little section of another website and have that information in my website. Just let me know if that is possible.

 

Thanks,

Jake

Link to comment
Share on other sites

You can do it, but you'd have to rely on them either keeping their website exactly the same or following some sort of standard, which is almost impossible.

 

If you know the format of their site doesn't change then you can parse the information but if they change their page layout even slightly then your script will fail.

 

Would be interesting to see what others make of this.

Link to comment
Share on other sites

Well, basically you should be looking into regular expressions. Parsing data out of websites with regex is relatively "easy" (as long as you know regex). Basically, you just download the page and then capture the data from it using preg_match.

 

As Yesideez said, however, you will have to update the script every time they change the layout, because your parsing solution will fail. This is not a particularly big problem, though. It just means that the script will need to be maintained, so you should try to make any parsing scripts as maintainable as possible (in one project of mine, all website parsers are in separate classes, making them easy to update if needed).
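
A rough sketch of the separate-class idea, in case it helps. The class name `ExampleQuoteParser`, the sample HTML, and the `<td class="price">` markup are all made up for illustration; a real page will need its own pattern.

```php
<?php
// Hypothetical parser for one site. Each site gets its own class, so a
// layout change only means updating one pattern in one place.
class ExampleQuoteParser
{
    // Pattern for the (assumed) markup: <td class="price">12.34</td>
    private $pattern = '#<td class="price">([^<]*)</td>#';

    // Returns the first price found, or null if the layout no longer matches.
    public function parsePrice($html)
    {
        if (preg_match($this->pattern, $html, $match)) {
            return trim($match[1]);
        }
        return null;
    }
}

// Sample input standing in for a downloaded page.
$html = '<tr><td>Crude</td><td class="price"> 71.25 </td></tr>';

$parser = new ExampleQuoteParser();
$price = $parser->parsePrice($html);

echo $price === null ? "Layout changed, update the parser\n" : "Price: $price\n";
```

Returning null when nothing matches gives the calling script a way to notice a layout change instead of silently printing empty values.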

Link to comment
Share on other sites

This is so easy that you should be able to do it in 5 lines. You can make it better with a few more lines using cURL.

 

Here is an example:

 

  $data = file_get_contents("http://www.filefactory.com/upload/upload_flash_begin.php?files=1");

  if($data === FALSE)
   die("Couldn't load url.");
   
  preg_match_all("/<viewhash>([^<]*)/", $data, $viewhashes);
  
  return $viewhashes[1][0];

 

The first line reads in all the data at the site, in 1 line. Go PHP!

 

Lines two and three are error handling.

 

Line 4, that's your parsing. VERY important to learn regex, or pay someone to do your regex for you, like that one guy who is the only person I have seen take so much pride in not knowing something.

 

Last line - your data! If you are not good with regex, or even if you are, you can use print_r on $viewhashes to make sure you get the right thing.
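
For instance, something like this shows what preg_match_all hands back. The sample string is made up so it runs without a download; only the `<viewhash>` pattern comes from the snippet above.

```php
<?php
// Made-up sample data standing in for the downloaded page.
$data = "<viewhash>abc123</viewhash><viewhash>def456</viewhash>";

preg_match_all("/<viewhash>([^<]*)/", $data, $viewhashes);

// $viewhashes[0] holds the full matches, $viewhashes[1] the text captured
// by the first pair of parentheses.
print_r($viewhashes);

// The first captured value
echo $viewhashes[1][0], "\n";
```

If the second-level array comes back empty, the pattern did not match anything and needs adjusting.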

 

And that's it. That's all you need to do.

Link to comment
Share on other sites

tibberous,

OK, well, I do not know a lot about regex, but I am a pretty quick learner. So if I wanted to use the type of code you posted, let me see if this is right. Basically, on the webpage below I want to get the Nymex Crude Future prices for that line, and possibly some others. It is just line 4 of the code that I am having trouble understanding: how do I select just certain parts of the data on that page? Let me know how I would change the code in the viewhash area. Thanks,

 

$data = file_get_contents("http://www.bloomberg.com/markets/commodities/energyprices.html");

  if($data === FALSE)
   die("Couldn't load url.");
   
  preg_match_all("/<viewhash>([^<]*)/", $data, $viewhashes);
  
  return $viewhashes[1][0];

Link to comment
Share on other sites

Warning: file_get_contents(): URL file-access is disabled in the server configuration

 

It means the URL wrappers are disabled for the file handling functions, so they can't access remote locations.

 

You need to either get allow_url_fopen enabled in php.ini (which may not be possible on shared hosting, if that's what you have) or download the page some other way. One other way to download remote web pages is to use the CURL functions. You can find more information about CURL in the PHP manual:

 

http://docs.php.net/manual/en/ref.curl.php

 

However, here is a simple example of how to download a page using CURL:

 

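<?php
$ch = curl_init();

curl_setopt($ch, CURLOPT_URL, "http://rithiur.anthd.com/");
// Return the page as a string instead of printing it directly
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// Leave the HTTP response headers out of the result
curl_setopt($ch, CURLOPT_HEADER, 0);

$contents = curl_exec($ch);

// curl_exec() returns false on failure
if ($contents === false)
    die("Couldn't load url: " . curl_error($ch));

curl_close($ch);

echo $contents;
?>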
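
If the script has to run on servers with either setup, the two approaches can be combined. This is only a sketch: `fetch_page` is a made-up helper name, and it assumes that checking the allow_url_fopen setting with ini_get is enough to decide which method to use.

```php
<?php
// Hypothetical helper: use file_get_contents() when URL wrappers are enabled,
// otherwise fall back to cURL. Returns the page body, or false on failure.
function fetch_page($url)
{
    // ini_get() returns a string, so normalise it to a real boolean first.
    if (filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN)) {
        return @file_get_contents($url);
    }

    if (!function_exists('curl_init')) {
        return false; // neither method is available
    }

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    $contents = curl_exec($ch);
    curl_close($ch);

    return $contents;
}
```

You would call it as `$contents = fetch_page($url);` and check for `false` before trying to parse anything.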

Link to comment
Share on other sites

Rithiur, that worked great for displaying the page information in my web page. But if I want to display just a certain area of that page rather than the whole page, how would I go about doing that?

 

For example, if I want to select just 3 to 4 lines of text in the middle of the page, what else would I need in the code?

 

I have searched php.net and have been looking through the table of contents for the cURL info, but cannot find how to get just a certain area to display. Let me know if you have an idea.

 

thanks,

Link to comment
Share on other sites

Thanks to Rithiur, who helped me with this code. The results are exactly what was needed. Thanks, man.

 


<?php
$ch = curl_init();

curl_setopt($ch, CURLOPT_URL, "http://www.bloomberg.com/markets/commodities/energyprices.html");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, 0);

$contents = curl_exec($ch);
curl_close($ch);

if ($contents === false)
    die("Couldn't load url.");

function find_values($string, $page)
{
    $string = preg_quote($string, '#');

    // Take everything from the given string to the end of its table row
    if (!preg_match("#$string(.*)</tr>#Us", $page, $match))
        return array(); // label not found, e.g. because the layout changed

    // Get the values from the <span> elements in the row we found previously
    preg_match_all("#<span[^>]*>([^<]*)</span>#s", $match[1], $values);

    // Return the values
    return $values[1];
}

$find = find_values('Nymex Crude Future', $contents);
echo "Nymex Crude Future: Price = $find[0], Change = $find[1], % Change = $find[2], Time = $find[3]<br>";

$find = find_values('Nymex Heating Oil Future', $contents);
echo "Nymex Heating Oil Future: Price = $find[0], Change = $find[1], % Change = $find[2], Time = $find[3]<br>";

$find = find_values('Nymex RBOB Gasoline Future', $contents);
echo "Nymex RBOB Gasoline Future: Price = $find[0], Change = $find[1], % Change = $find[2], Time = $find[3]<br>";

?>
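
For anyone who wants to try the row-parsing part without hitting the live page, here is a small self-contained variation. The sample table is made up and the real markup may differ. The empty-array guard, the loop over labels, and the "% Change" wording are my own assumptions rather than part of the code above.

```php
<?php
// Same idea as the helper above, with a guard for a missing row (an addition).
function find_values($string, $page)
{
    $string = preg_quote($string, '#');

    // Take everything from the label to the end of its table row
    if (!preg_match("#$string(.*)</tr>#Us", $page, $match)) {
        return array();
    }

    // Collect the text of each <span> in that row
    preg_match_all("#<span[^>]*>([^<]*)</span>#s", $match[1], $values);
    return $values[1];
}

// Made-up sample rows standing in for the downloaded $contents.
$contents = '<table>'
    . '<tr><td>Nymex Crude Future</td><td><span class="v">71.25</span></td>'
    . '<td><span>0.40</span></td><td><span>0.56</span></td><td><span>10:15</span></td></tr>'
    . '<tr><td>Nymex Heating Oil Future</td><td><span>2.01</span></td>'
    . '<td><span>-0.01</span></td><td><span>-0.49</span></td><td><span>10:14</span></td></tr>'
    . '</table>';

$labels = array('Nymex Crude Future', 'Nymex Heating Oil Future', 'Nymex RBOB Gasoline Future');
$lines = array();

foreach ($labels as $label) {
    $find = find_values($label, $contents);
    if (count($find) < 4) {
        // Row missing or laid out differently than expected
        $lines[] = "$label: not found";
        continue;
    }
    $lines[] = "$label: Price = $find[0], Change = $find[1], % Change = $find[2], Time = $find[3]";
}

echo implode("<br>\n", $lines), "<br>\n";
```

Looping over the labels keeps the output format in one place, and the "not found" line makes it obvious when a row disappears after a layout change.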
