SiriusB Posted October 17, 2007

Hi there,

I have decided to store the content for my website in a database. For most of the content formatting, such as headers, code, bold etc., I am using BB Code. To convert the BB Code to XHTML I use the function below:

    if (!empty($string)) {
        $search = array(
            '#\[b\](.*?)\[/b\]#s',
            '#\[i\](.*?)\[/i\]#s',
            '#\[u\](.*?)\[/u\]#s',
            '#\[img\](.*?)\[/img\]#s',
            '#\[url=(.*?)\](.*?)\[/url\]#s',
            '#\[code\](.*?)\[/code\]#s'
        );
        $replace = array(
            '<b>\\1</b>',
            '<i>\\1</i>',
            '<u>\\1</u>',
            '<img src="\\1">',
            '<a href="\\1">\\2</a>',
            '<pre><code>\\1</code></pre>'
        );

        //convert BBCode tags to XHTML
        $xhtml = stripslashes(preg_replace($search, $replace, $string));

        //preserve paragraphs by replacing newlines and returns with <p> tags
        $xhtml = str_replace('<p></p>', '', '<p>' . preg_replace('#\n|\r#', '</p>$0<p>', $xhtml) . '</p>');

        //output formatted string
        return $xhtml;
    }

The BB Code part itself is working just fine. However, I would like to preserve my paragraphs, otherwise the text is unbearable to read. I know about nl2br, but I think it is untidy and not very XHTML compliant.

As you can see in the function above, there is the following line:

    $xhtml = str_replace('<p></p>', '', '<p>' . preg_replace('#\n|\r#', '</p>$0<p>', $xhtml) . '</p>');

This line replaces line breaks with <p> tags. The results are good, but there are one or two problems. The line of code matches any line break and ends up putting <p></p> around <hx>, <code> and pretty much any other tag. An example of what I am talking about is below:

    <p><h1>Title</h1></p>
    <p>This is some text</p>
    <p><code>echo "This is code";</code></p>

Is it possible to modify the line of code to only match sections of text that don't already have tags? I would try myself, but regex is a massive mystery to me and I am not 100% sure how that line of code works.

Any help is greatly appreciated.
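Since the post asks how that paragraph line works, here is a small sketch that splits it into its three steps. The sample input is made up for illustration and assumes Unix line endings (\n); the variable names $step1 to $step3 are not from the original function.

```php
<?php
// Hypothetical sample: a heading, a blank line, then a line of text.
$xhtml = "<h1>Title</h1>\n\nThis is some text";

// Step 1: every single \n or \r is replaced by '</p>' + the matched
// character + '<p>'. In the replacement string, $0 is the whole match.
$step1 = preg_replace('#\n|\r#', '</p>$0<p>', $xhtml);
// "<h1>Title</h1></p>\n<p></p>\n<p>This is some text"

// Step 2: an opening <p> is added at the start and a closing </p> at the end.
$step2 = '<p>' . $step1 . '</p>';
// "<p><h1>Title</h1></p>\n<p></p>\n<p>This is some text</p>"

// Step 3: the empty <p></p> pairs produced by blank lines are removed.
$step3 = str_replace('<p></p>', '', $step2);
// "<p><h1>Title</h1></p>\n\n<p>This is some text</p>"

echo $step3, "\n";
```

Nothing in these steps looks at what a line contains, which is why the heading ends up inside <p> tags just like the ordinary text.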
BlueSkyIS Posted October 17, 2007

This thread might help: http://www.phpfreaks.com/forums/index.php/topic,163932.0.html
SiriusB Posted October 17, 2007

I am not sure how. I don't want to remove all the tags. I just need to find a way to only put <p> tags around actual paragraphs and leave text with <hx> and <code> tags alone.
SiriusB Posted October 18, 2007

Anyone else got any ideas? Surely I am not the first person to want to do this!
Daniel0 Posted October 18, 2007

How about this?

    $text = '<p>'.str_replace("\n\n", '</p><p>', $text).'</p>';
SiriusB Posted October 18, 2007

That didn't work either, I'm afraid. True, it only put <p></p> tags around actual paragraphs, but it removed the line breaks themselves, so the page source becomes unreadable. It also didn't solve the problem of <p></p> tags being put around other tags such as <code> and <h2>.

I know I could probably add [p] tags to my BBCode, but that would mean writing new content would take just as long as writing the XHTML myself. The whole point of moving my site to PHP was to reduce the time it takes for me to add content.

I really, really want to avoid having br tags everywhere, as it flies in the face of good semantic markup.

There isn't a clever regular expression that can go "oh look, \n\n - I shall add <p> tags to that. But if I find "\n\n<some other tag>" I'll ignore it", is there?

So come on, regex pros...
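One way the "wrap after \n\n unless a tag follows" idea could be written as a single expression is with lookarounds. This is only a sketch, not something posted in the thread: it assumes Unix line endings (\n) and that paragraphs are separated by blank lines, and the sample text and variable names are made up.

```php
<?php
// Hypothetical sample content, already converted from BBCode.
$xhtml = "<h1>Title</h1>\n\nThis is some text\n\n"
       . "<pre><code>echo 1;</code></pre>\n\nMore text";

// A block starts at the beginning of the string or right after a blank line.
// (?!<)  skips blocks that begin with a tag.
// [^\n]  makes sure the block does not begin with another newline.
// .*?    then runs lazily up to the next blank line or the end of the string.
$pattern = '#(?:\A|(?<=\n\n))(?!<)([^\n].*?)(?=\n\n|\z)#s';

// The blank lines are only looked at, never consumed, so they stay in the
// output and the page source remains readable.
$out = preg_replace($pattern, '<p>$1</p>', trim($xhtml));

echo $out, "\n";
```

Two limitations are worth knowing about. A paragraph that begins with an inline tag such as <b> is skipped as well, because the pattern only checks for a leading "<". And a blank line inside a <pre> block would be treated as a paragraph break.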
Daniel0 Posted October 18, 2007

Why not just use two HTML line break tags then?
SiriusB Posted October 18, 2007

Quote (Daniel0): Why not just use two HTML line break tags then?

How would that help? I need to add <p> tags to paragraphs. I want to preserve the line breaks [\n] as they keep the page source readable. My original line of code manages this, but adds tags to every block of text, regardless of whether or not it already has a tag.

I need a regular expression that can match something like "\n\nSome text\n\n" but not "\n\n<tag>Some text</tag>\n\n", where <tag> is any other XHTML tag. There has to be a regular expression that can accomplish this.

If it was just for forum posts or comments/blog entries, using nl2br wouldn't be an issue, as the content is usually fairly short. However, my site has quite a large amount of content and I would prefer it to be correctly marked up. Not only does that follow standards, it also makes creating CSS much easier, as I have something to target.

As I said, I can't be the only person to want to produce fully semantic code.
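An alternative to doing everything in one regular expression is to split the text on blank lines and decide block by block. The sketch below is an assumption about how that could look, not a solution confirmed in the thread: wrap_paragraphs() is a hypothetical helper name, and the list of block-level tags is a guess at what the site uses.

```php
<?php
/**
 * Wrap plain-text blocks in <p> tags and leave blocks that already start
 * with a block-level tag untouched. Hypothetical helper for illustration.
 */
function wrap_paragraphs($xhtml)
{
    // Normalise Windows (\r\n) and old Mac (\r) line endings to \n.
    $xhtml = str_replace(array("\r\n", "\r"), "\n", $xhtml);

    // Split into blocks wherever there are two or more newlines in a row.
    $blocks = preg_split('#\n{2,}#', trim($xhtml));

    // Assumed list of block-level tags that must not end up inside <p>.
    $blockTag = '#^<(?:h[1-6]|pre|ul|ol|dl|table|blockquote|div|p|hr)\b#i';

    foreach ($blocks as $i => $block) {
        // Wrap everything that is not empty and not already a block element.
        if ($block !== '' && !preg_match($blockTag, $block)) {
            $blocks[$i] = '<p>' . $block . '</p>';
        }
    }

    // Rejoin with blank lines so the page source stays readable.
    return implode("\n\n", $blocks);
}

// Example use with made-up content:
$sample = "<h1>Title</h1>\r\n\r\nThis is <b>some</b> text\r\n\r\n"
        . "<b>Bold</b> start\r\n\r\n"
        . "<pre><code>echo \"This is code\";</code></pre>";
echo wrap_paragraphs($sample), "\n";
```

Checking against a list of block-level tags, rather than against any leading "<", means a paragraph that happens to start with <b> or <a> still gets its <p> tags. The sketch does not protect blank lines inside a <pre> block, so code samples containing empty lines would need extra handling.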