
Extracting links from a variable containing HTML?


denhamd2


I have some links stored in a <table>, which is in my variable $html, and I want to get all the links that have the phrase "New York" in their link text, i.e. between the <a href> and </a> tags. Here is a snippet from my $html variable (the real <table> is a lot bigger; I've just shortened it here):

[CODE]
<table>
<tr>
<td><a href="asdsd.wmv">Latest from Los Angeles</a></td>
</tr>
<tr>
<td><a href="rftyrtyr.wmv" style="font-family: Georgia; size: 11px;">New York is a great city</a></td>
</tr>
<tr>
<td><a href="cal.wmv" style="font-family: Arial; size: 10px;">I'm going to California</a></td>
</tr>
<tr>
<td><a href="fhgd.wmv">Texas is huge</a></td>
</tr>
<tr>
<td><a href="esbnjgjhwmv" style="font-family: Times; size: 10px;">The Empire State building in New York city</a></td>
</tr>
</table>
[/CODE]

As you can see, there is unnecessary HTML in there. I want to extract the links containing "New York" in the link text and echo them with a line break separating each one. Any ideas on how to do this?
Something like...

[code]<?php

$html = '<table>
<tr>
<td><a href="asdsd.wmv">Latest from Los Angeles</a></td>
</tr>
<tr>
<td><a href="rftyrtyr.wmv" style="font-family: Georgia; size: 11px;">New York is a great city</a></td>
</tr>
<tr>
<td><a href="cal.wmv" style="font-family: Arial; size: 10px;">I\'m going to California</a></td>
</tr>
<tr>
<td><a href="fhgd.wmv">Texas is huge</a></td>
</tr>
<tr>
<td><a href="esbnjgjhwmv" style="font-family: Times; size: 10px;">The Empire State building in New York city</a></td>
</tr>
</table>';

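// [^>]* and [^<]* keep each match inside a single <a ...>text</a> pair,
// so the captured text stops at the link's own closing </a>.
preg_match_all ( "|href\=\"?'?`?([[:alnum:]:?=&@/#._-]+)\"?'?`?[^>]*>([^<]*new york[^<]*)</a>|i", $html, $url );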

for ( $i = 0; $i < sizeof ( $url[1] ); $i++ )
{
echo "url  = " . $url[1][$i] . "<br />";
echo "text = " . $url[2][$i] . "<br /><br />";
}

?>[/code]


me!
  • 2 weeks later...
Thanks, just a couple of problems I'm having though. Sometimes instead of "New York" it says "Big Apple", and I would like these links to be included too. Also, sometimes the link spans several lines. Is there any way to include these links too?

Many thanks in advance
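
One possible way to cover both cases, offered as a sketch rather than a drop-in answer: instead of stretching the regex further, parse the HTML with PHP's DOM extension and test each link's text against a list of phrases. The helper name findLinksByText() and the shortened sample HTML are made up for illustration and are not from the posts above. It assumes PHP 7 or later with the DOM extension enabled. Whitespace in the link text is collapsed first, so a link whose tag or text wraps onto another line should be treated the same as a one-line link.

```php
<?php
// Hypothetical helper (not from the thread): returns the href and text of every
// <a> whose text contains any of the given phrases, case-insensitively.
function findLinksByText(string $html, array $phrases): array
{
    $doc = new DOMDocument();
    // Keep libxml quiet about markup that is not perfectly valid.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $matches = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        // Collapse newlines and runs of spaces so wrapped link text still matches.
        $text = trim(preg_replace('/\s+/', ' ', $anchor->textContent));
        foreach ($phrases as $phrase) {
            if (stripos($text, $phrase) !== false) {
                $matches[] = ['url' => $anchor->getAttribute('href'), 'text' => $text];
                break; // one hit per link is enough
            }
        }
    }
    return $matches;
}

// Shortened, made-up sample: one link wraps across lines, one says "Big Apple".
$html = '<table>
<tr><td><a href="a.wmv">Latest from Los Angeles</a></td></tr>
<tr><td><a href="b.wmv"
   style="font-family: Georgia;">New York is
   a great city</a></td></tr>
<tr><td><a href="c.wmv">Visiting the Big Apple</a></td></tr>
</table>';

foreach (findLinksByText($html, ['New York', 'Big Apple']) as $link) {
    echo 'url  = ' . $link['url'] . "<br />\n";
    echo 'text = ' . $link['text'] . "<br /><br />\n";
}
```

If you would rather stay with preg_match_all, the usual adjustments are an alternation such as (new york|big apple) in the pattern (pick a delimiter other than | in that case) and negated classes like [^<]* in place of .* so the match can run across line breaks.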
