peppericious Posted December 6, 2012
I'm using TinyMCE for textareas in a form. Some users of the form are copying and pasting content from Word. I've managed to get fairly clean content using the TinyMCE Paste plugin. However, in the case of tables, cell content is being pasted like this:
<td valign="top" width="148"><p>text</p></td>
How can I strip out those paragraph tags so that I simply get this:
<td valign="top" width="148">text</td>
Thanks in advance.
.josh Posted December 6, 2012
If you are certain it will always be a simple paragraph tag (no attributes) then you can use str_ireplace, as it is technically more efficient.
$content = str_ireplace(array('<p>','</p>'),'',$content);
If it may possibly have attributes, you can use preg_replace instead.
$content = preg_replace('~</?p[^>]*>~i','',$content);
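A minimal sketch of the two calls above on sample input. The sample strings are assumptions modeled on the markup in the question. One caveat: the pattern as posted also matches any tag whose name merely starts with "p" (such as <pre>), so the sketch adds a \b word boundary, which is a change from the posted pattern.

```php
<?php
// Sample input modeled on the cell quoted in the question (an assumption).
$content = '<td valign="top" width="148"><p>text</p></td>';

// Plain <p> and </p> only: case-insensitive literal replacement.
$plain = str_ireplace(array('<p>', '</p>'), '', $content);

// Paragraph tags that may carry attributes. The \b is an addition to the
// posted pattern; it keeps <pre>, <param> and similar tags from matching.
$withAttrs = preg_replace(
    '~</?p\b[^>]*>~i',
    '',
    '<td><p class="MsoNormal">text</p><pre>kept</pre></td>'
);

echo $plain, "\n";     // <td valign="top" width="148">text</td>
echo $withAttrs, "\n"; // <td>text<pre>kept</pre></td>
```

Both calls strip every paragraph tag in the string, inside or outside a table.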
peppericious (Author) Posted December 7, 2012
Quoting .josh: "If you are certain it will always be a simple paragraph tag (no attributes) then you can use str_ireplace, as it is technically more efficient. $content = str_ireplace(array('<p>','</p>'),'',$content); If it may possibly have attributes, you can use preg_replace instead. $content = preg_replace('~</?p[^>]*>~i','',$content);"
Very, very helpful. Thank you.
peppericious (Author) Posted December 7, 2012
Er... hang on. This solution won't work, because I need to retain paragraph tags that are not within the table. I want to strip out paragraph tags, but only the ones within td tags (which may or may not themselves contain attributes). Any ideas?
Christian F. Posted December 7, 2012 (edited December 7, 2012 by Christian F.)
This RegExp should do it:
#(<td[^>]*>)(?:<p>(.*?)</p>)*(</td>)#
That said, what you really should do is use DOMDocument or some other means of properly traversing the DOM. Regular expressions are for regular languages, not markup and other irregular languages.
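A sketch of how a regex along these lines might be applied, under some assumptions. A repeated capture group keeps only its last match, so using the posted pattern with a '$1$2$3' replacement would keep only the final paragraph of a multi-paragraph cell, and it would not match <p> tags that carry attributes. The variant below is a substitute technique, not the posted pattern: it matches each cell and strips the paragraph tags inside it with a callback. The function name strip_cell_paragraphs and the sample markup are hypothetical.

```php
<?php
// Hypothetical helper: removes <p ...> and </p> tags, but only inside <td> cells.
function strip_cell_paragraphs($html)
{
    return preg_replace_callback(
        // Opening <td ...>, lazily matched cell content, closing </td>.
        // "i" = case-insensitive, "s" = let "." span line breaks.
        '~(<td\b[^>]*>)(.*?)(</td>)~is',
        function ($m) {
            // Strip the paragraph tags from the cell content only.
            return $m[1] . preg_replace('~</?p\b[^>]*>~i', '', $m[2]) . $m[3];
        },
        $html
    );
}

$html = '<p>Intro</p><table><tr><td valign="top" width="148"><p>text</p></td></tr></table>';
echo strip_cell_paragraphs($html), "\n";
// <p>Intro</p><table><tr><td valign="top" width="148">text</td></tr></table>
```

Two limits worth noting: several paragraphs in one cell are joined with no separator, and because the lazy match stops at the first </td>, content in an outer cell that follows a nested table is left untouched. Those cases are where a DOM parser is the safer choice.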
peppericious (Author) Posted December 7, 2012
Thank you very much.
Quoting Christian F.: "... or some other manner of properly traversing the DOM object."
... and what would that be, exactly? Do you mean using JavaScript (which I know almost nothing about but which I'm keen to learn)?
Christian F. Posted December 7, 2012
XPath is an option, and there are a couple of other PHP classes for cleaning and traversing HTML pages. They are all in the same vein as DOMDocument, though.
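A rough sketch of the DOMDocument and XPath route mentioned above, not a definitive implementation. It assumes PHP 5.3.6 or later (for saveHTML with a node argument) and that the pasted content arrives as an HTML fragment. The function name stripParagraphsInCells, the wrapper id fragment-root, and the sample markup are hypothetical.

```php
<?php
// Hypothetical helper: unwraps every <p> that sits inside a <td>.
function stripParagraphsInCells($html)
{
    $doc = new DOMDocument();
    // Pasted markup is rarely valid; collect parser warnings instead of emitting them.
    libxml_use_internal_errors(true);
    // Wrap the fragment in a known container so it can be extracted afterwards.
    // The leading XML declaration hints to libxml that the input is UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8"><div id="fragment-root">' . $html . '</div>');
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // Snapshot the matches before changing the tree.
    $paragraphs = iterator_to_array($xpath->query('//td//p'));
    foreach ($paragraphs as $p) {
        // Move the paragraph's children in front of it, then drop the empty <p>.
        while ($p->firstChild) {
            $p->parentNode->insertBefore($p->firstChild, $p);
        }
        $p->parentNode->removeChild($p);
    }

    // Serialize only the children of the wrapper, not the html/body scaffolding.
    $root = $xpath->query('//div[@id="fragment-root"]')->item(0);
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$html = '<p>Intro</p><table><tr><td valign="top" width="148"><p>text</p></td></tr></table>';
echo stripParagraphsInCells($html), "\n";
```

Paragraphs outside the table are left alone, and nested tables are handled because the XPath expression selects by ancestry rather than by text position. The serializer may insert line breaks between block elements, so the output can differ in whitespace from the input.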