PHP Get URL

Instant87 · October 4, 2008

Hi, I'm making a website for a game I play that queries the game's databases (yes, I do have permission) every day or so and retrieves the data. So far I've got this:

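<?php
// Create the cURL handle; curl_init() returns false on failure.
$ch = curl_init() or die("Could not initialize cURL");
curl_setopt($ch, CURLOPT_URL, "url");
// Return the response as a string instead of printing it directly.
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$data1 = curl_exec($ch);
// curl_error() needs the handle; compare with === so an empty page is not treated as a failure.
if ($data1 === false) {
    die(curl_error($ch));
}
echo $data1;
curl_close($ch);
?>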

(replacing url with the actual URL)

What I need help with is extracting certain parts of the page, for example the text between the words "Ruled by" and "Exports:".

I also need help querying the site only once per day.

xtopolis · October 4, 2008

You need to search through $data1, either with a regular expression or with strpos and substr.

To query the site once per day, set up a cron job.
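A minimal sketch of the strpos/substr approach, assuming both markers ("Ruled by" and "Exports:") appear in the page. The function name extract_between and the sample string are hypothetical, used only for illustration; in the real script the first argument would be $data1.

```php
<?php
// Hypothetical helper: returns the trimmed text between $start and $end,
// or null if either marker is missing.
function extract_between($haystack, $start, $end)
{
    $from = strpos($haystack, $start);
    if ($from === false) {
        return null;
    }
    // Skip past the start marker itself.
    $from += strlen($start);

    // Look for the end marker only after the start marker.
    $to = strpos($haystack, $end, $from);
    if ($to === false) {
        return null;
    }
    return trim(substr($haystack, $from, $to - $from));
}

// Made-up sample standing in for $data1.
$sample = 'Ruled by <a href="/flag">Some Flag</a><br> Exports: wood';
echo extract_between($sample, 'Ruled by', 'Exports:');
// prints: <a href="/flag">Some Flag</a><br>
```

Note the === false checks: strpos can legitimately return 0 when the marker is at the very start of the string, so a loose comparison would misreport that as "not found".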

thebadbad · October 4, 2008

I'll try to help you with the regular expressions (if it's not too complex) if you provide us with the page URL (or simply the HTML source code) and precisely describe what you want to grab.

Instant87 · October 4, 2008

The site is: http://viridian.puzzlepirates.com/yoweb/island/info.wm?islandid=01

Very short and simple.

And the part I would like to extract is:

Ruled by <a href="/yoweb/flag/info.wm?flagid=10001210">The Wrath of Armageddon</a><br>

The part that will change sometimes is "The Wrath of Armageddon" so I guess the start will be before that and the end would be </a><br>

thebadbad · October 4, 2008

If you want to extract the (relative) link and the link text ("The Wrath of Armageddon" in this case), this will work:

<?php
//$data1 contains page source
preg_match('~<a href="(.+?)">(.+?)</a>~is', $data1, $matches);
//$matches[0] contains the full HTML link, e.g. <a href="/yoweb/flag/info.wm?flagid=10001210">The Wrath of Armageddon</a>
//$matches[1] contains the relative URL, e.g. /yoweb/flag/info.wm?flagid=10001210
//$matches[2] contains the link text, e.g. The Wrath of Armageddon
?>

Example with link to the pirate clan (or whatever it is):

<?php
echo "Ruled by <a href=\"http://viridian.puzzlepirates.com{$matches[1]}\">{$matches[2]}</a>";
?>
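Note that preg_match returns the leftmost match, so the pattern above starts at the first link in the page source. If other links could appear before the "Ruled by" line, one option is to anchor the pattern on that text. A sketch, assuming the markup looks like the snippet quoted earlier in the thread; the function name find_ruler and the extra "/other" link are hypothetical:

```php
<?php
// Hypothetical helper: returns array(relative URL, link text) for the link
// that directly follows "Ruled by", or null if no such link is found.
function find_ruler($html)
{
    // [^"]+ and [^<]+ keep each capture inside one attribute value / one link text.
    if (preg_match('~Ruled by\s*<a href="([^"]+)">([^<]+)</a>~i', $html, $m)) {
        return array($m[1], $m[2]);
    }
    return null;
}

// Sample source: an unrelated link first, then the line quoted in the thread.
$html = '<a href="/other">Other</a> '
      . 'Ruled by <a href="/yoweb/flag/info.wm?flagid=10001210">The Wrath of Armageddon</a><br>';

$ruler = find_ruler($html);
if ($ruler !== null) {
    echo $ruler[1];
    // prints: The Wrath of Armageddon
}
```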

Instant87 · October 4, 2008

Awesome! It works.

My last question is about the cron jobs that someone mentioned earlier. I googled that and only found how to do it via the command line on your computer, so is there a way to do it on a page with a PHP script of some sort, to gather the information only once per day?

thebadbad · October 4, 2008

Check if your webhost offers cron jobs. Or if you've got your own server, you should be able to set it up. Else, google "free cron job" or the like, and see if you can find any.

To clarify: a cron job is simply an application of some sort that runs a script (URL) at a specified time/interval. So it can't be done within a script (although it should be possible to simulate it, if you've got a page that's visited at least once a day).
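One way to simulate it inside the page itself, sketched under the assumption that the script can write to a local file: cache the fetched page and refetch only when the cache is at least a day old. The helper names, the cache file, and the 24-hour interval are all illustrative choices, not something from the posts above.

```php
<?php
// Hypothetical helper: true if the cache file is missing or at least $maxAge seconds old.
function cache_is_stale($cacheFile, $maxAge, $now = null)
{
    if ($now === null) {
        $now = time();
    }
    // PHP caches file stat results; clear them so we see the current state.
    clearstatcache();
    if (!file_exists($cacheFile)) {
        return true;
    }
    return ($now - filemtime($cacheFile)) >= $maxAge;
}

// Hypothetical helper: returns the cached page, calling $fetch only when the
// cache is stale. $fetch must return the page source, or false on failure.
function get_daily($cacheFile, $fetch, $maxAge = 86400)
{
    if (cache_is_stale($cacheFile, $maxAge)) {
        $data = call_user_func($fetch);
        if ($data !== false) {
            file_put_contents($cacheFile, $data);
            return $data;
        }
        // Fetch failed: fall through and serve the old copy if there is one.
    }
    return file_exists($cacheFile) ? file_get_contents($cacheFile) : false;
}
```

Here $fetch would be a function that runs the cURL request from the first post and returns its result, so the remote site is contacted at most once per day no matter how often the page is visited. Unlike a real cron job, nothing is fetched on days when nobody visits the page.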
