
Matching Certain tags


cooldude832


I want to use preg_split to split a page on its <div>, <table>, <tr>, and <td> tags, so I need a pattern that will match <div> or <table> or <tr> or <td>. Any ideas? It also has to handle the fact that a tag could have styling or a class on it, etc. I think I need something like <div*>, but I don't know the rest of it.
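
For reference, a minimal sketch of how the glob-style <div*> idea might translate into a PCRE pattern. The sample markup and variable names are made up for illustration; the only assumption is that a tag's attributes never contain a literal > character.

```php
<?php
// Hypothetical sample markup; not taken from the page in question.
$html = '<div class="box"><table border="0"><tr><td style="color:red">x</td></tr></table></div>';

// [^>]* means "any run of characters that is not >", which is what the
// glob-style <div*> was reaching for. \b keeps <tr from matching <track>,
// and the i flag accepts upper-case tags such as <DIV>.
$pattern = '%<(?:div|table|tr|td)\b[^>]*>%i';

preg_match_all($pattern, $html, $found);
print_r($found[0]);
```

In a regex, <div*> would mean "<di followed by zero or more v characters, then >", which is why it does not match a tag that carries attributes.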

https://forums.phpfreaks.com/topic/77093-matching-certain-tags/

Well, my first issue was that I wanted to remove everything before the body tag. I did it using explode, but not all body tags are lower case, and some had other issues. Again a regex issue: I tried "\<body*>\", but no good. If you have an idea for doing a preg_split on that, I'd love to see it.

That clears up some of it, but <script> tags inside the body also need to be removed. My goal is to strip a page of everything but container elements (<div>, <table>, <tr>, <td>), which that pattern is doing for me, but also to kill all the script/CSS tags, as those are special cases. So I guess I need to find a replacement for

<script*>*</script> and <style*>*</style>
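
One possible way to handle both steps, offered as a sketch and not a definitive answer: split once on the opening body tag regardless of case, then remove script and style elements along with their contents. The sample page and variable names below are invented for illustration, and the approach assumes script and style blocks are properly closed and not nested.

```php
<?php
// Hypothetical sample page; not from the original thread.
$page = <<<PAGE
<html><head><style type="text/css">td { color: red; }</style></head>
<BODY class="main">
<script type="text/javascript">var a = "<div>";</script>
<div>Kept</div>
</body></html>
PAGE;

// Split once on the opening body tag, whatever its case or attributes.
$parts = preg_split('%<body\b[^>]*>%i', $page, 2);
// If no body tag is found, fall back to the whole page.
$body = isset($parts[1]) ? $parts[1] : $parts[0];

// Remove script and style elements together with their contents.
// The \1 backreference makes the closing tag match the opening one;
// .*? is lazy so it stops at the first closing tag, and the s flag
// lets . span newlines.
$body = preg_replace('%<(script|style)\b[^>]*>.*?</\1\s*>%is', '', $body);

echo $body;
```

Removing the script block before splitting on container tags matters here, because the script body contains the text "<div>", which the split pattern would otherwise pick up.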

 


What about something like this?

 

<pre>
<?php
   $data = <<<DATA
<html>
	<head>
		<title>Title</title>
	</head>
	<body>
		<font>Font Tag</font>
		<div>Div Content</div>
		<div id="1">More Div Content</div>
		<b>Bold</b>
		<table>
			<tr>
				<td>A Cell</td>
			</tr>
		</table>
		<hr>
	</body>
</html>
DATA;
### Split on the begin/end tags of what is desired
### and pull some content along.
$matches = preg_split(
	'%(</?(?:div|t(?:able|[rd]))\b[^>]*>[^<]*)%i',
	$data,
	-1,
	PREG_SPLIT_DELIM_CAPTURE
);
### For each match...
$num_matches = count($matches);
for ($i = 0; $i < $num_matches; $i++) {
	### Strip unwanted tags.
	$matches[$i] = strip_tags($matches[$i], '<div><table><tr><td>');
	### If the entry doesn't start with a "<" (tag) it wasn't
	### included in our split; thus, not desired.
	if (strpos($matches[$i], '<') !== 0) {
		unset($matches[$i]);
	}
	### Otherwise, escape it for viewing purposes.
	else {
		$matches[$i] = htmlspecialchars($matches[$i]);
	}
}
### Display.
print_r($matches);
?>
</pre>


That is working. Now I just want to build some sort of multidimensional array of the data based on tag depth, which I can figure out. Note that I substituted < and > back in for the &lt; and &gt; entities, as they are easier to type, but I did it after your code ran. I'll PM you the final result if you're interested in it.
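
A rough sketch of how a depth-based array might be built from the split tokens. This is not the poster's final result, which was never posted in the thread. The function name build_tree and the sample markup are hypothetical, and the code assumes the container tags are well nested, with every opening tag having a matching close.

```php
<?php
// Hypothetical helper, not from the thread: turns a flat token list
// into nested arrays, one level per open container tag.
// Text that follows a closing tag is ignored in this sketch.
function build_tree(array $tokens, &$i = 0)
{
    $nodes = array();
    $count = count($tokens);
    while ($i < $count) {
        $token = $tokens[$i++];
        if (!preg_match('%^<(/?)(div|table|tr|td)\b[^>]*>(.*)$%is', $token, $m)) {
            continue; // not one of the container tags
        }
        if ($m[1] === '/') {
            return $nodes; // closing tag: this level is finished
        }
        $nodes[] = array(
            'tag'      => strtolower($m[2]),
            'text'     => trim($m[3]),
            'children' => build_tree($tokens, $i), // descend one level
        );
    }
    return $nodes;
}

// Hypothetical sample markup.
$html = '<div id="a">Outer<table><tr><td>Cell</td></tr></table></div>';

// Same splitting idea as the earlier reply; NO_EMPTY drops the empty
// strings that sit between adjacent tags.
$tokens = preg_split(
    '%(</?(?:div|table|tr|td)\b[^>]*>[^<]*)%i',
    $html,
    -1,
    PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
);

$tree = build_tree($tokens);
print_r($tree);
```

The recursion depth mirrors the tag depth, so each node's children array holds the containers nested directly inside it.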


Archived

This topic is now archived and is closed to further replies.
