
Making a simple bbcode filter, but preg_replace is too greedy! Need some help!



I am writing a simple forum to go on top of an existing forum database, and I've got most of the thing down...  but one thing that is bugging me is that my bbcode functions don't always work as they should, and it's starting to give me a headache.

 

Basically, I add all of my bbcode functions to two arrays, and then all the text from the forum runs through a preg_replace() to make sure that everything is parsed properly.  This mostly works very well, but when I get into things like the quote box that you typically see on a forum, it gives me trouble.

 

Here's the code I use for the quote box array entries, and for the replacement.  $npttxt is the text to be parsed:

 

$tmpbbcode = array(  
'/(\[quote\])(.*)(\[\/quote\])/Uis'
);

$tmphtml = array(  
'<center><table cellpadding=3 cellspacing=1 border=0 width=90% bgcolor=#CBCBCB>
<tr><td><font class="forumqtname">Quote:</font></td></tr>
<tr bgcolor=#FFFFFF><td><font class="forumdesc">${2}</font></td></tr>
</table></center>'
);

$npttxt = preg_replace($tmpbbcode, $tmphtml, $npttxt);

 

This works perfectly when there's only one quote box.  But when there is one inside the other, like this:

 

[quote]
[quote]Hello World[/quote]
Hello to you too[/quote]

 

...it should look something like this:

 

Hello World

Hello to you too

 

...but instead, I get this (sans spaces):

 

[ quote ]Hello World

Hello to you too[ /quote ]

 

So my preg_replace is pairing the first [quote] with the first [/quote] that follows it, instead of with the matching outer one.  Is there any way I can make it work from the outside in on both ends of my string?  Or maybe I am going about this in a completely wrong way, and someone could steer me in the right direction?  I really think that I can make this work though; I just need some help with the regular expression, and making it act the way I need it to.

 

Thanks in advance to anyone who can help me with this...  I know regular expressions are a pain in the butt.  ;D

I was wondering about that as well...  would it be possible to alter my regex to just find the last occurrence of the end quote, and change that one?  Or would that be impossible / too costly to do?

I've been wondering about this myself... I'm pretty sure there's not going to be a way...

 

These two situations would collide:

 

Quote inside
another quote

 

and

 

Two separate
quotes
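
To make the collision concrete, here is a small sketch (assuming PHP 7 or later; the sample strings and the firstCapture() helper are made up for illustration). It runs a lazy and a greedy version of the quote pattern against both situations. Each version gets one case right and the other wrong, so flipping greediness alone doesn't look like it can fix this.

```php
<?php
// Hypothetical sample inputs, chosen only to illustrate the collision.
$nested   = '[quote][quote]Hello World[/quote]Hello to you too[/quote]';
$separate = '[quote]One[/quote] and [quote]Two[/quote]';

$lazy   = '/\[quote\](.*?)\[\/quote\]/is'; // stops at the first [/quote]
$greedy = '/\[quote\](.*)\[\/quote\]/is';  // runs on to the last [/quote]

// Hypothetical helper: returns what the first capture group matched.
function firstCapture(string $pattern, string $text): string
{
    preg_match($pattern, $text, $m);
    return $m[1] ?? '';
}

// Lazy: wrong for nesting, right for separate quotes.
$lazyNested   = firstCapture($lazy, $nested);   // "[quote]Hello World"
$lazySeparate = firstCapture($lazy, $separate); // "One"

// Greedy: right for nesting, wrong for separate quotes.
$greedyNested   = firstCapture($greedy, $nested);   // "[quote]Hello World[/quote]Hello to you too"
$greedySeparate = firstCapture($greedy, $separate); // "One[/quote] and [quote]Two"
```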

 

You might be able to search for quote tags with no other quote tags between them, then use a while loop with preg_match (eregi would have done the same job, but it has since been removed from PHP) until you've replaced them all.
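
The idea of replacing only quote pairs that contain no other quote tag, and repeating until nothing is left, could be sketched roughly like this. It assumes PHP 7 or later; renderQuotes() is a made-up name, and the <blockquote> markup is a shortened stand-in for the table HTML from the first post.

```php
<?php
// Hypothetical helper: converts [quote] pairs to HTML from the inside out.
function renderQuotes(string $text): string
{
    // A [quote]...[/quote] pair whose body contains no further [quote] tag.
    // The negative lookahead rejects any body character that starts "[quote]".
    $innermost = '/\[quote\]((?:(?!\[quote\]).)*?)\[\/quote\]/is';
    $html = '<blockquote>$1</blockquote>'; // stand-in for the real table markup

    do {
        $next = preg_replace($innermost, $html, $text, -1, $count);
        if ($next === null) {
            break; // PCRE error: leave the remaining text untouched
        }
        $text = $next;
    } while ($count > 0); // each pass peels off one level of nesting

    return $text;
}

$nestedHtml   = renderQuotes('[quote][quote]Hello World[/quote]Hello to you too[/quote]');
$separateHtml = renderQuotes('[quote]One[/quote] and [quote]Two[/quote]');
```

Because every pass only touches the innermost pairs, nested and side-by-side quotes both come out right. Unmatched tags are simply left in the text as they were.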

<?php
$newhtml = preg_replace(
array(
	// The brackets must be escaped; an unescaped [quote] is a character class
	// that matches any single one of the letters q, u, o, t, e.
	'/\[quote\]/i',
	'/\[\/quote\]/i'),
array(
	'<center><table cellpadding=3 cellspacing=1 border=0 width=90% bgcolor=#CBCBCB><tr><td><font class="forumqtname">Quote:</font></td></tr><tr bgcolor=#FFFFFF><td><font class="forumdesc">',
	'</font></td></tr></table></center>'
),$npttxt
);
?>

 

That's the easiest way, but it doesn't error check unmatched tags.
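
If you want a cheap check for unmatched tags before doing the tag-by-tag replacement, something along these lines might do. It's only a sketch, assuming PHP 7 or later, and quotesAreBalanced() is a made-up name. It walks the tags in order and keeps a depth counter, so it also catches a closing tag that shows up before its opener.

```php
<?php
// Hypothetical helper: true when every [quote] has a matching [/quote], in order.
function quotesAreBalanced(string $text): bool
{
    // Group 1 captures "/" for closing tags and "" for opening tags.
    preg_match_all('/\[(\/?)quote\]/i', $text, $matches);

    $depth = 0;
    foreach ($matches[1] as $slash) {
        $depth += ($slash === '/') ? -1 : 1;
        if ($depth < 0) {
            return false; // a closing tag appeared before its opener
        }
    }
    return $depth === 0; // anything left open means a missing [/quote]
}
```

When it returns false you could skip the replacement and show the raw text, or strip the stray tags first.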

I couldn't find an old post of mine on phpfreaks, but I did find it in my own archives. Here's one for cleaning up nesting (I haven't reviewed it):

 

<?php
echo $data = '[quote]Testing one.[/quote]Inbetween text.[quote]Nested[quote]quote[/quote][/quote][quote]Triple[quote]nested[quote]quote[/quote][/quote][/quote]';

### separate ungreedily
$quotes = preg_split('/(\[quote\].*?\[\/quote\])/i', $data, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
### show separation
echo '<br><br>';
echo '<b>separation:</b>';
echo '<pre>', print_r($quotes, true), '</pre>';

$i=0;
foreach($quotes as $quote) {
	### while there is:
	### 1. a beginning quote with text before it; or
	### 2. an ending quote with text after it...
	while(preg_match('/.+\[quote\]|\[\/quote\].+/i',$quotes[$i])) {
		### ...remove the quote tagging but save the text
		$quotes[$i] = preg_replace('/(.+)\[quote\]/i','\1 ', $quotes[$i]);
		$quotes[$i] = preg_replace('/\[\/quote\](.+)/i','\1 ', $quotes[$i]);
	}
	### remove this item if we ended up with nothing
	if (trim($quotes[$i])=='') {
		unset($quotes[$i]);
	}
	### remove this item if it is only an unmatched quote tag
	else {
		if (preg_match('/^(?:\[quote\]|\[\/quote\])$/i',$quotes[$i])) {
			unset($quotes[$i]);
		}
	}
	$i++;
}

### show the cleaned up results
echo '<b>results:</b>';
echo '<pre>', print_r($quotes, true), '</pre>';
?>

 

<!-- Output

[quote]Testing one.[/quote]Inbetween text.[quote]Nested[quote]quote[/quote][/quote][quote]Triple[quote]nested[quote]quote[/quote][/quote][/quote]

separation:

Array
(
	[0] => [quote]Testing one.[/quote]
	[1] => Inbetween text.
	[2] => [quote]Nested[quote]quote[/quote]
	[3] => [/quote]
	[4] => [quote]Triple[quote]nested[quote]quote[/quote]
	[5] => [/quote][/quote]
)

results:

Array
(
	[0] => [quote]Testing one.[/quote]
	[1] => Inbetween text.
	[2] => [quote]Nested quote[/quote]
	[4] => [quote]Triple nested quote[/quote]
)

-->
