
Need Regex: Getting content within a tag by attribute


vincea


Hey everyone,

First off, I hate regular expressions, so I tried a search first. It came up empty, so I might as well ask for help.

Since MLS.ca doesn't offer its real estate feeds as XML, only as HTML tables (ugh), I need to find certain (sometimes nested) HTML tags that have certain attributes.

An example would be:

<div class="Text">I need the text in here</div>

However, that tag may be nested within a table structure.

My function looks like this, where $content is the HTML source, but again, I have no regex for it yet:


function read_value ($content, $tag, $attr) {
	$found = false; // stays false when no attribute is given
	if ($content == "") {
		echo "Feed contains no content to be parsed";
	}

	if ($attr != "")
		$found = preg_match('//', $content, $matches); // placeholder: the pattern is still missing
	else
		echo "No attribute specified";

	if ($found != false)
		return $matches[0]; // matches found: return them
	else
		return false; // no matches found: return false
}

<pre>
<?php
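function read_value ($content, $tag, $attr) {
	$found = 0;
	if ($content == "") {
		echo "Feed contains no content to be parsed";
	}
	if ($attr != "") {
		// Match <tag ... attr=...>inner</tag>; here $attr is the attribute NAME.
		// Passing '/' to preg_quote() also escapes the pattern delimiter.
		$found = preg_match('/<' .
			preg_quote($tag, '/') .
			'[^>]+' .
			preg_quote($attr, '/') .
			'=[^>]+>(.*?)<\/' .
			preg_quote($tag, '/') .
			'>' .
		'/i', $content, $matches);
	}
	else {
		echo "No attribute specified";
	}
	// $matches[0] is the whole element, tags included; $matches[1] is only the inner text.
	return $found ? $matches[0] : false;
}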

$content = <<<CONTENT
<html><body><h1>Head</h1><p>para before<div class="text">match me</div>para after</p></body></html>
CONTENT;

echo htmlspecialchars(read_value($content, 'div', 'class'));

?>
</pre>
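
The function above returns $matches[0], which is the whole element including its tags. If only the text inside the tag is wanted, capture group 1 holds it. Here is a minimal sketch of that variation. The helper name read_inner_text is hypothetical (not from the thread), and the s modifier is my addition so that the inner text may span several lines:

```php
<?php
// Hypothetical helper: same pattern as read_value(), but returns only
// the text captured between the opening and closing tag (group 1).
function read_inner_text($content, $tag, $attr) {
	$pattern = '/<' . preg_quote($tag, '/') .
		'[^>]+' . preg_quote($attr, '/') .
		'=[^>]+>(.*?)<\/' . preg_quote($tag, '/') . '>/is';
	// preg_match() returns 1 on a match, 0 on no match, false on error.
	return preg_match($pattern, $content, $matches) ? $matches[1] : false;
}

// The target div nested inside a table, as described in the question.
$html = '<table><tr><td><div class="Text">I need the text in here</div></td></tr></table>';
echo read_inner_text($html, 'div', 'class'); // prints: I need the text in here
```

Like the original, this stops at the first closing tag it sees, so a div nested inside the target div would cut the match short.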

<pre>
<?php
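function read_value ($content, $tag, $attr) {
	$found = 0;
	if ($content == "") {
		echo "Feed contains no content to be parsed";
	}
	if ($attr != "") {
		// Match <tag ... name="value...>inner</tag>; here $attr is compared
		// against the start of an attribute VALUE (e.g. 'text'), not its name.
		$found = preg_match('/<' .
			preg_quote($tag, '/') .
			'[^>]+\w+=[\'"]?' .
			preg_quote($attr, '/') .
			'[^>]*>(.*?)<\/' .
			preg_quote($tag, '/') .
			'>' .
		'/i', $content, $matches);
	}
	else {
		echo "No attribute specified";
	}
	// $matches[0] is the whole element, tags included; $matches[1] is only the inner text.
	return $found ? $matches[0] : false;
}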

$content = <<<CONTENT
<html><body><h1>Head</h1><p>para before<div class="text">match me</div>para after</p></body></html>
CONTENT;

echo htmlspecialchars(read_value($content, 'div', 'text'));

?>
</pre>
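
Since the target tag may sit deep inside tables, a DOM parser is a possible alternative to a regex. This is a different technique from the ones posted above, shown only as a rough sketch. It assumes PHP's DOM extension is available; the helper name read_values_dom and its $value parameter are hypothetical, and $value must not contain a single quote because it is pasted straight into the XPath expression:

```php
<?php
// Hypothetical helper: returns the text of every <$tag> whose $attr
// attribute equals $value exactly, however deeply it is nested.
function read_values_dom($content, $tag, $attr, $value) {
	// Scraped pages are rarely valid HTML, so collect parser
	// warnings internally instead of printing them.
	libxml_use_internal_errors(true);
	$doc = new DOMDocument();
	$doc->loadHTML($content);
	libxml_clear_errors();

	$xpath = new DOMXPath($doc);
	$results = array();
	// Assumption: $tag, $attr and $value are trusted, quote-free strings.
	foreach ($xpath->query("//{$tag}[@{$attr}='{$value}']") as $node) {
		$results[] = trim($node->textContent);
	}
	return $results;
}

$html = '<table><tr><td><div class="Text">I need the text in here</div></td></tr></table>';
echo implode("\n", read_values_dom($html, 'div', 'class', 'Text'));
```

Unlike preg_match(), this returns every matching element as an array, and an empty array when nothing matches.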

Archived

This topic is now archived and is closed to further replies.
