matfish

URL/Link Extracting

Hi, can anyone point me in the right direction for extracting a list of URLs/links from an external page (for example, a page of Google results for a keyword)? I could then load this data into my database.

I just want to extract the links from a page that I specify and put the URLs into an array that I can then play with.

Many thanks for any help.

OK, let's start again.

Does anyone know how to extract URLs from a specific site in PHP?

Thanks

With regular expressions.

Here's my crappy attempt at a regex :D

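[code]<?php

// Prints every http:// match found in the page at $url.
function GetLinks($url)
{
    $aOut = array();
    // Loose pattern: after "http://" the final [^>]+ runs up to the next ">",
    // so each match can carry a trailing quote and other tag attributes.
    preg_match_all("/http:\/\/?[^ ][^\"][^'][^<][^>]+/i", file_get_contents($url), $aOut, PREG_PATTERN_ORDER);

    // With PREG_PATTERN_ORDER, $aOut[0] holds the list of full matches.
    print_r($aOut);
}

echo "<pre>";
GetLinks("http://google.com.au");
echo "</pre>";

?>[/code]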

Hey, it's a start ;)
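
If the extra tag text in the matches gets in the way, a tighter pattern is one option. This is only a sketch: extractUrls() and the sample markup are made up for illustration, and it works on an HTML string so it can be tried without fetching a page. It stops each match at whitespace, a quote, or an angle bracket, and returns the list instead of printing it.

```php
<?php

// Hypothetical helper (not from the thread): returns the URLs found in an HTML string.
function extractUrls(string $html): array
{
    $matches = array();
    // Match http:// or https://, then stop at whitespace, quotes, or angle brackets.
    preg_match_all('/https?:\/\/[^\s"\'<>]+/i', $html, $matches);

    // $matches[0] holds the full matches; drop duplicates and reindex from 0.
    return array_values(array_unique($matches[0]));
}

// Made-up sample markup; for a live page, pass file_get_contents($url) instead.
$html = '<a href="http://example.com/a">A</a> <a class="x" href=\'https://example.org/b?q=1\'>B</a>';
print_r(extractUrls($html));
```

A regex like this will still pick up URLs that appear outside links (in scripts or plain text, for instance), so it suits quick jobs rather than careful parsing.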

That's brilliant, thank you. It returns the whole a href tag, but from that I can pick out the URLs, which is what I need.

Many thanks!!!!
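
For picking the URL out of each a href tag, one alternative to more regex work is PHP's DOM extension, which is a different technique from the one used in this thread. A minimal sketch, assuming the dom extension is enabled; extractHrefs() and the sample markup are made up for illustration.

```php
<?php

// Hypothetical helper (not from the thread): collects the href value of every <a> tag.
function extractHrefs(string $html): array
{
    // Keep warnings about sloppy real-world markup out of the output.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $hrefs = array();
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        // Skip anchors that have no href attribute at all.
        if ($anchor->hasAttribute('href')) {
            $hrefs[] = $anchor->getAttribute('href');
        }
    }
    return $hrefs;
}

// Made-up sample markup; for a live page, pass file_get_contents($url) instead.
$html = '<p><a href="http://example.com/a" class="x">A</a> <a name="top">no link</a></p>';
print_r(extractHrefs($html));
```

Relative links come back exactly as written in the page, so they would still need resolving against the page URL before going into a database.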

Hey dude, I'm having a bit of trouble reading the array. For example, when I try to pick out a single element, say number 4, it just returns "Array".

random:

$num = rand(0, count([b]ARRAY[/b]) - 1);
echo [b]ARRAY[/b][$num];

entire:

$num2 = count([b]ARRAY[/b]);
$num3 = 0;

while ($num3 < $num2) {
    echo [b]ARRAY[/b][$num3] . "<br>\n";
    $num3++;
}


hope this helped =)
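
One likely reason for the "Array" output: with PREG_PATTERN_ORDER, preg_match_all() fills $aOut with a nested array, and the actual matches sit in $aOut[0]. The sketch below uses a made-up stand-in for $aOut (the example URLs are not from the thread) to show reading one entry, a random entry, and the whole list.

```php
<?php

// Made-up stand-in for the $aOut that preg_match_all() fills in.
$aOut = array(
    array('http://example.com/a', 'http://example.com/b', 'http://example.com/c'),
);

// echo $aOut[0] would print "Array", because index 0 is itself the list of matches.
$links = $aOut[0];

// One specific entry (index 1 is the second match).
echo $links[1] . "<br>\n";

// One random entry: array_rand() returns a valid key, so there is no off-by-one risk.
echo $links[array_rand($links)] . "<br>\n";

// Every entry in order.
foreach ($links as $link) {
    echo $link . "<br>\n";
}
```

Using foreach avoids tracking a counter by hand, and array_rand() sidesteps the need to subtract one from count().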
