
Extracting links from a variable containing HTML?



#1 denhamd2


Posted 09 October 2006 - 11:49 AM

I have some links stored in a <table> inside my variable $html, and I want to get only the links that have the phrase "New York" in the text between their <a href> and </a> tags. Here is a snippet from $html (the real <table> is much bigger; I've shortened it here):

<table>
<tr>
<td><a href="asdsd.wmv">Latest from Los Angeles</a></td>
</tr>
<tr>
<td><a href="rftyrtyr.wmv" style="font-family: Georgia; size: 11px;">New York is a great city</a></td>
</tr>
<tr>
<td><a href="cal.wmv" style="font-family: Arial; size: 10px;">I'm going to California</a></td>
</tr>
<tr>
<td><a href="fhgd.wmv">Texas is huge</a></td>
</tr>
<tr>
<td><a href="esbnjgjhwmv" style="font-family: Times; size: 10px;">The Empire State building in New York city</a></td>
</tr>
</table>

As you can see, there is unnecessary HTML in there. I want to extract the links containing "New York" in the link text and echo them with a line break separating each one. Any ideas on how to do this?

#2 printf


Posted 09 October 2006 - 03:57 PM

Something like...

<?php

$html = '<table>
<tr>
<td><a href="asdsd.wmv">Latest from Los Angeles</a></td>
</tr>
<tr>
<td><a href="rftyrtyr.wmv" style="font-family: Georgia; size: 11px;">New York is a great city</a></td>
</tr>
<tr>
<td><a href="cal.wmv" style="font-family: Arial; size: 10px;">I\'m going to California</a></td>
</tr>
<tr>
<td><a href="fhgd.wmv">Texas is huge</a></td>
</tr>
<tr>
<td><a href="esbnjgjhwmv" style="font-family: Times; size: 10px;">The Empire State building in New York city</a></td>
</tr>
</table>';

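	// Capture the href (group 1) and the link text (group 2) of every <a> whose text contains "new york".
	// [^<]* keeps the text capture inside the link, so the closing </a> is not swallowed into group 2.
	preg_match_all ( '~<a\b[^>]*\bhref\s*=\s*["\']?([^"\'\s>]+)["\']?[^>]*>([^<]*new york[^<]*)</a>~i', $html, $url );

	for ( $i = 0; $i < sizeof ( $url[1] ); $i++ )
	{
		echo "url  = " . $url[1][$i] . "<br />";
		echo "text = " . $url[2][$i] . "<br /><br />";
	}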

?>



#3 denhamd2


Posted 24 October 2006 - 10:36 AM

Thanks, just a couple of problems I'm having, though. Sometimes the link text says "Big Apple" instead of "New York", and I would like those links to be included too. Also, a link sometimes spans several lines. Is there any way to match those links as well?

Many thanks in advance
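
One possible way to cover both cases, offered as a rough sketch rather than a definitive answer: put the phrases in a regex alternation, and use negated character classes ([^>]* and [^<]*) in place of .* so the match can run across line breaks. The function name findLinks() and the sample markup below are made up for illustration. It assumes the link text holds no nested tags such as <b>, and anonymous functions need PHP 5.3 or later.

```php
<?php
// Hypothetical helper (not from the thread): returns href/text pairs for every
// <a> whose text contains any of the given phrases, case-insensitively.
function findLinks($html, array $phrases)
{
    $alternatives = array();
    foreach ($phrases as $phrase) {
        // Escape each word, then let any whitespace (including a line break)
        // separate the words of a phrase.
        $words = array_map(
            function ($word) { return preg_quote($word, '~'); },
            preg_split('/\s+/', trim($phrase))
        );
        $alternatives[] = implode('\s+', $words);
    }

    // [^>]* and [^<]* also match newlines, so the tag and its text may be
    // spread over several lines.
    $pattern = '~<a\b[^>]*\bhref\s*=\s*["\']?([^"\'\s>]+)["\']?[^>]*>'
             . '([^<]*(?:' . implode('|', $alternatives) . ')[^<]*)</a>~i';

    preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

    $links = array();
    foreach ($matches as $m) {
        // Collapse line breaks and repeated spaces inside the link text.
        $text = trim(preg_replace('/\s+/', ' ', $m[2]));
        $links[] = array('href' => $m[1], 'text' => $text);
    }
    return $links;
}

// Sample markup, made up for this example.
$html = "<table>\n"
      . "<tr><td><a href=\"a.wmv\">Texas is huge</a></td></tr>\n"
      . "<tr><td><a href=\"b.wmv\"\n style=\"font-family: Arial;\">Visiting the\nBig Apple today</a></td></tr>\n"
      . "<tr><td><a href=\"c.wmv\">New York is a great city</a></td></tr>\n"
      . "</table>";

foreach (findLinks($html, array('New York', 'Big Apple')) as $link) {
    echo $link['href'] . ' : ' . $link['text'] . "<br />\n";
}
```

If the real pages have markup inside the link text, or messier attributes, a regex like this will probably miss some links, and an HTML parser would likely be the safer route.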



