Preserve text formatting NOT using nl2br


SiriusB

Hi there

 

I have decided to store content for my website in a database.  For most of the content formatting, such as headers, code and bold, I am using BB Code.

 

To convert the BB Code to XHTML I use the function below:

 

// The function name is assumed here; the original post showed only the body.
function bbcode_to_xhtml($string)
{
	if (!empty($string))
	{
		// BBCode patterns; the s modifier lets . match across line breaks
		$search = array(
		'#\[b\](.*?)\[/b\]#s',
		'#\[i\](.*?)\[/i\]#s',
		'#\[u\](.*?)\[/u\]#s',
		'#\[img\](.*?)\[/img\]#s',
		'#\[url=(.*?)\](.*?)\[/url\]#s',
		'#\[code\](.*?)\[/code\]#s'
		);

		// XHTML replacements, in the same order as $search
		$replace = array(
		'<b>\\1</b>',
		'<i>\\1</i>',
		'<u>\\1</u>',
		'<img src="\\1" alt="" />',
		'<a href="\\1">\\2</a>',
		'<pre><code>\\1</code></pre>'
		);

		//convert BBCode tags to XHTML
		$xhtml = stripslashes(preg_replace($search, $replace, $string));

		//preserve paragraphs by replacing newlines and returns with <p> tags
		$xhtml = str_replace('<p></p>', '', '<p>' . preg_replace('#\n|\r#', '</p>$0<p>', $xhtml) . '</p>');

		//output formatted string
		return $xhtml;
	}
}

 

The BB Code part itself is working just fine.  However, I would like to preserve my paragraphs, otherwise the text is unbearable to read.  I know about nl2br, but I think it is untidy and not very XHTML compliant.

 

As you can see in the function above there is the following line:

 

$xhtml = str_replace('<p></p>', '', '<p>' . preg_replace('#\n|\r#', '</p>$0<p>', $xhtml) . '</p>');

 

This line replaces line breaks with <p> tags.  The results are good but there are one or two problems.  The line of code matches any line breaks and ends up putting <p></p> around <hx>, <code> and pretty much any other tags.

 

An example of what I am talking about is below:

 

<p><h1>Title</h1></p>

<p>This is some text</p>

<p><code>echo "This is code";</code></p>

 

Is it possible to modify the line of code to only match sections of text that don't already have tags?

 

I would try myself but regex is a massive mystery to me and I am not 100% sure how that line of code works :P
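
For anyone else puzzling over that line, here is a small sketch that unpacks it into its three steps. The sample input and the `$step` variable names are made up for illustration; the regex and replacement strings are the ones from the post.

```php
<?php
// Sample input (assumed for illustration): a heading, one line break, then text.
$xhtml = "<h1>Title</h1>\nThis is some text";

// Step 1: every single \n or \r is replaced by '</p>', the matched
// character itself ($0), and '<p>'. The line break is kept in the output.
$step1 = preg_replace('#\n|\r#', '</p>$0<p>', $xhtml);

// Step 2: the whole string is wrapped in one outer <p>...</p> pair,
// which closes the dangling tags left by step 1.
$step2 = '<p>' . $step1 . '</p>';

// Step 3: empty paragraphs (produced by blank lines) are removed.
$step3 = str_replace('<p></p>', '', $step2);

echo $step3;
// Prints:
// <p><h1>Title</h1></p>
// <p>This is some text</p>
```

Nothing in those steps looks at what a line contains, which is why the heading ends up inside a paragraph just like ordinary text.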

 

Any help is greatly appreciated.


That didn't work either, I'm afraid.

 

True, it only put <p></p> tags around actual paragraphs, but it removed the line breaks themselves, so the page source becomes unreadable.

 

It also didn't solve the problem of <p></p> tags being put around other tags such as <code> and <h2> etc.

 

I know I could probably add [p] tags to my BBCode, but that would mean writing new content would take just as long as writing the XHTML myself.  The whole point of moving my site to PHP was to reduce the time it takes for me to add content :P

 

I really, really want to avoid having br tags everywhere as it flies in the face of good semantic markup.

 

There isn't a clever regular expression that can go "oh look, \n\n - I shall add <p> tags to that.  But if I find \n\n<some other tag> I'll ignore it", is there?  So come on regex pros... :D


Why not just use two HTML line break tags then?

 

How would that help?

 

I need to add <p> tags to paragraphs.  I want to preserve the line breaks [\n] as it allows for readable code.  My original line of code manages this, but adds tags to every block of text, regardless of whether or not it has a tag.

 

I need a regular expression that can match something like "\n\nSome text\n\n" but not "\n\n<tag>Some text</tag>\n\n" where <tag> is any other xhtml tag.

 

There has to be a regular expression that can accomplish this.
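
One possible approach, offered as a rough sketch rather than a tested drop-in: instead of a single replace, split the text on blank lines while keeping the line breaks, then wrap only the chunks that do not already start with a block-level tag. The function name `wrap_paragraphs` and the tag list in `$block` are assumptions made for this example and would need adjusting to the tags the BB Code converter actually emits.

```php
<?php
// Hypothetical helper (not from the original post): wraps blank-line-separated
// chunks in <p> tags unless the chunk already starts with a block-level tag.
function wrap_paragraphs($xhtml)
{
    // Tags that should not end up inside a <p>; extend as needed.
    $block = 'h[1-6]|pre|ul|ol|li|table|blockquote|div|p';

    // Split on two or more consecutive line breaks. The capture group plus
    // PREG_SPLIT_DELIM_CAPTURE keeps the breaks in the result, so the source
    // stays readable. The atomic group (?>...) stops a single \r\n from
    // being counted as two breaks.
    $parts = preg_split('#((?>\r\n|\r|\n){2,})#', $xhtml, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($parts as $i => $part) {
        // Odd indexes hold the captured line breaks; leave them and empty chunks alone.
        if ($i % 2 === 1 || trim($part) === '') {
            continue;
        }
        // Wrap the chunk only if it does not open with one of the block tags.
        if (!preg_match('#^\s*<(?:' . $block . ')\b#i', $part)) {
            $parts[$i] = '<p>' . $part . '</p>';
        }
    }

    return implode('', $parts);
}

$input = "<h1>Title</h1>\n\nThis is some text\n\n<pre><code>echo \"This is code\";</code></pre>";
echo wrap_paragraphs($input);
// Prints:
// <h1>Title</h1>
//
// <p>This is some text</p>
//
// <pre><code>echo "This is code";</code></pre>
```

In the function from the first post this would replace the str_replace/preg_replace line. One known limitation of the sketch: a [code] block that itself contains blank lines is split into several chunks, and the later ones would still be wrapped, so code blocks may need to be set aside before splitting.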

 

If it was just for forum posts or comments/blog entries, using nl2br wouldn't be an issue, as the content is usually fairly short.  However, my site has quite a large amount of content and I would prefer this to be correctly marked up.  Not only does it follow standards, it also makes creating CSS much easier as I have something to target.  As I said, I can't be the only person to want to produce fully semantic code.
