peppericious Posted December 6, 2012

I'm using TinyMCE for textareas in a form. Some users of the form are copy/pasting content from Word. I've managed to get fairly clean content using the TinyMCE Paste plugin. However, in the case of tables, cell content is being pasted like this:

<td valign="top" width="148"><p>text</p></td>

How can I strip out those paragraph tags so that I simply get this:

<td valign="top" width="148">text</td>

Thanks in advance.

Link to comment https://forums.phpfreaks.com/topic/271684-regex-to-strip-paragraph-tags-out-of-table-cells/
.josh Posted December 6, 2012

If you are certain it will always be a simple paragraph tag (no attributes) then you can use str_ireplace, as it is technically more efficient.

$content = str_ireplace(array('<p>','</p>'),'',$content);

If it may possibly have attributes, you can use preg_replace instead.

$content = preg_replace('~</?p[^>]*>~i','',$content);
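To make the difference between the two calls concrete, here is a minimal sketch run on made-up sample markup. The third pattern, with `\b` after the tag name, is an assumed variant and not part of the post: it is included because `p[^>]*` also matches tags whose names merely start with "p", such as `<pre>`.

```php
<?php
// Sample markup is invented for illustration only.
$content = '<p class="MsoNormal">one</p><p>two</p><pre>three</pre>';

// Literal replacement: only the exact strings <p> and </p> are removed,
// so the paragraph tag carrying an attribute survives.
$plain = str_ireplace(array('<p>', '</p>'), '', $content);

// Pattern from the post: handles attributes, but "p[^>]*" also matches
// any tag whose name begins with "p", such as <pre>.
$loose = preg_replace('~</?p[^>]*>~i', '', $content);

// Assumed tighter variant: \b requires the tag name to end right after "p".
$strict = preg_replace('~</?p\b[^>]*>~i', '', $content);

echo $plain, PHP_EOL;
echo $loose, PHP_EOL;
echo $strict, PHP_EOL;
```

All three calls act on every paragraph tag in the string, wherever it appears.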
peppericious (Author) Posted December 7, 2012

Very, very helpful. Thank you.
peppericious (Author) Posted December 7, 2012

...er, hang on... this solution won't work because I need to retain paragraph tags that are not within the table. I want to strip out paragraph tags, but only those within td tags (which may or may not themselves have attributes)...?
Christian F. Posted December 7, 2012

This RegExp should do it:

#(<td[^>]*>)(?:<p>(.*?)</p>)*(</td>)#

That said though, what you really should do is use DOMDocument or some other manner of properly traversing the DOM object. Regular Expressions are for regular languages, not markup and other irregular languages.
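The post gives the pattern but not the call that applies it. A minimal sketch, assuming a replacement string of '$1$2$3' (an assumption, not part of the original post) and invented sample markup:

```php
<?php
// Sample markup is invented for illustration only.
$content = '<p>Intro paragraph.</p>'
    . '<table><tr><td valign="top" width="148"><p>text</p></td></tr></table>';

// Pattern exactly as posted above.
$pattern = '#(<td[^>]*>)(?:<p>(.*?)</p>)*(</td>)#';

// $1 = opening <td ...> tag, $2 = text inside the paragraph, $3 = closing </td>.
// The replacement string is an assumption about how the groups are meant to be used.
$cleaned = preg_replace($pattern, '$1$2$3', $content);

echo $cleaned, PHP_EOL;
```

The paragraph outside the table is left alone because the pattern only matches between a td open tag and its close. Some limits to keep in mind: the pattern expects a bare `<p>` with no attributes and no whitespace around it, `.` does not cross line breaks without the `s` modifier, and when a cell holds several paragraphs the repeated group keeps only the last one's text, so the earlier paragraphs would be lost.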
peppericious (Author) Posted December 7, 2012

Thank you very much.

"... or some other manner of properly traversing the DOM object."

... and what would that be, exactly? Do you mean using JavaScript (which I know almost nothing about but which I'm keen to learn)?
Christian F. Posted December 7, 2012

XPath is an option, and there are a couple of other PHP classes for cleaning and traversing HTML pages. All in the same vein as DOMDocument though.
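For reference, the DOMDocument and XPath route can be done entirely in PHP, with no JavaScript involved. The following is a rough sketch under stated assumptions rather than a definitive implementation: the function name `unwrapParagraphsInCells` is hypothetical, the input is assumed to be an HTML fragment, and character encoding handling is left out for brevity.

```php
<?php
// Hypothetical helper: unwraps <p> elements found inside <td> cells
// and leaves every other paragraph untouched.
function unwrapParagraphsInCells(string $html): string
{
    $doc = new DOMDocument();

    // Pasted Word markup is rarely valid, so collect parser warnings
    // internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html); // encoding handling omitted (assumption: ASCII input)
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Copy the result to an array first, because the tree changes in the loop.
    foreach (iterator_to_array($xpath->query('//td//p')) as $p) {
        // Move each child of the paragraph in front of it...
        while ($p->firstChild !== null) {
            $p->parentNode->insertBefore($p->firstChild, $p);
        }
        // ...then drop the now-empty <p> element.
        $p->parentNode->removeChild($p);
    }

    // loadHTML wraps a fragment in <html><body>; return only the body's children.
    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$result = unwrapParagraphsInCells(
    '<p>Intro paragraph.</p>'
    . '<table><tr><td valign="top" width="148"><p>text</p></td></tr></table>'
);
echo $result, PHP_EOL;
```

Because the XPath expression `//td//p` selects by structure, td and p attributes, nested markup and multiple paragraphs per cell are all handled without extra patterns. The serialised output may contain line breaks that were not in the input.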
This topic is now archived and is closed to further replies.