remove html between two tags


anarchoi


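<?php
$string = "abc [TAG]<a href=\"123.com\">123</a>[/TAG] def ghi jkl mno";
$opener = "[TAG]";
$closer = "[/TAG]";
$i = '/'.preg_quote($opener,'/').'(.*?)'.preg_quote($closer,'/').'/i';
// preg_replace_callback() instead of the /e modifier, which was removed in PHP 7
$a = preg_replace_callback($i, function ($m) { return strip_tags($m[1]); }, $string);
if ($a === null) die('ERROR'); // null means the regex itself failed
echo $a;
?>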

 

Although, if you actually want to keep those [TAG]s, then this preg_replace approach is not a great idea: whenever I tried to keep them, preg_replace would throw itself into an infinite loop and PHP would just time out.

One way...

 

<?php

$str = 'abc [TAG]<a href="123.com">123</a>[/TAG] def ghi jkl mno[TAG]<a href="123.com">123</a>[/TAG]';

preg_match_all('/\[TAG\](.*?)\[\/TAG\]/', $str, $matches);

foreach ($matches[1] as $match)
{
    $str = str_replace($match, strip_tags($match), $str);
}

print $str;

?>

 

Probably better ways...

@MrAdam

Your method will also remove HTML from outside the [TAG]s, if it matches what's found inside the [TAG]s. E.g.

 

[TAG]<a href="123.com">123</a>[/TAG] def <a href="123.com">123</a>

 

becomes

 

[TAG]123[/TAG] def 123
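
The side effect is easy to reproduce in isolation. Here is a minimal sketch that reuses the pattern from the post above (the variable names $inner, $occurrences and $result are only for illustration). str_replace() knows nothing about the [TAG]s, so it replaces every occurrence of the matched text, wherever it sits in the string:

```php
<?php
$str = '[TAG]<a href="123.com">123</a>[/TAG] def <a href="123.com">123</a>';

// Grab the text between the first [TAG]...[/TAG] pair.
preg_match_all('/\[TAG\](.*?)\[\/TAG\]/', $str, $matches);
$inner = $matches[1][0];

// The same text occurs twice in $str: once inside the tags, once outside.
$occurrences = substr_count($str, $inner);

// str_replace() is a global search-and-replace, so both copies are stripped.
$result = str_replace($inner, strip_tags($inner), $str);

echo $occurrences, "\n";
echo $result, "\n";
```

Any fix therefore has to do the replacement at the position where the regex matched, rather than searching the whole string again.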

 

@OP

This modification of RussellReal's code will keep the [TAG]s and only remove HTML from inside them. I'm using a positive lookbehind and lookahead in the regex pattern, so the [TAG]s have to be there, but aren't matched and removed. I also added support for newlines inside the [TAG]s (the s modifier). The matched text is handed to strip_tags() through a callback instead of being evaluated as PHP code (the e modifier was removed in PHP 7), so e.g. $abc won't translate to the contents of a variable:

<?php
$string = '[TAG]<a href="123.com">123</a>[/TAG] def <a href="123.com">123</a>';
$open = '[TAG]';
$close = '[/TAG]';
$pattern = '/(?<=' . preg_quote($open, '/') . ')(.*?)(?=' . preg_quote($close, '/') . ')/is';
// The callback receives the match array; $m[1] is the text between the tags
$string = preg_replace_callback($pattern, function ($m) {
    return strip_tags($m[1]);
}, $string);
echo $string;
?>

Output:

[TAG]123[/TAG] def <a href="123.com">123</a>
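
To see why the [TAG]s survive, it may help to look at the lookarounds on their own. This is a minimal sketch with a made-up $subject string: the lookbehind and lookahead only check that the tags are there, so the match itself contains nothing but the text between them.

```php
<?php
$subject = '[TAG]hello[/TAG]';

// (?<=...) and (?=...) assert that the tags surround the match,
// but the tags themselves are not part of the matched text.
preg_match('/(?<=\[TAG\]).*?(?=\[\/TAG\])/s', $subject, $m);

echo $m[0], "\n";
```

Since only the inner text is matched, only the inner text is replaced, and the tags are left where they were.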

I edited the code a bit, so you define the tags in an array. I assume that the opening and closing tags will always match, so each different tag is only defined once. I also dropped the lookbehind, since a lookbehind has to be fixed length and would fail if the tags defined in the array aren't all the same length. Code:

 

<?php
$string = '[TAG]<a href="123.com">123</a>[/TAG] def <a href="123.com">123</a> test [ABC]<em>italic no more</em>[/ABC] bla <strong>bla</strong>';
$tags = array('TAG', 'ABC');
//escape tags with preg_quote (requires PHP 5)
foreach ($tags as &$tag) {
$tag = preg_quote($tag, '~');
}
$pattern = '~(\[(' . implode('|', $tags) . ')\])(.*?)(?=\[/\\2\])~is';
// Named callback instead of the /e modifier, which was removed in PHP 7.
// $m[1] is the opening tag, $m[3] is the text between the tags.
function strip_inner_html($m) {
return $m[1] . strip_tags($m[3]);
}
$string = preg_replace_callback($pattern, 'strip_inner_html', $string);
echo $string;
?>

If for some reason you are using PHP 4, change this:

 

//escape tags with preg_quote (requires PHP 5)
foreach ($tags as &$tag) {
$tag = preg_quote($tag, '~');
}

to this:

 

$esc_tags = array();
foreach ($tags as $tag) {
$esc_tags[] = preg_quote($tag, '~');
}
$tags = $esc_tags;
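
On newer PHP versions the escaping step can also be written without a loop. This is a minimal sketch, assuming PHP 5.3 or later for the closure; the $escape name is only for illustration, and the 'A.B' tag is a made-up example to show a character that actually gets escaped. Unlike the by-reference foreach, it does not leave $tag behind as a reference into the array:

```php
<?php
$tags = array('TAG', 'A.B');

// Illustrative helper: wraps preg_quote() with the '~' delimiter
// used by the pattern in the post above.
$escape = function ($tag) {
    return preg_quote($tag, '~');
};

// array_map() returns a new array with every tag escaped.
$tags = array_map($escape, $tags);

print_r($tags);
```

The resulting array can be passed to implode('|', $tags) exactly as before.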
