paladin_sh Posted January 24, 2007

Hey there,

I'm having some trouble with preg_replace. I need it to remove everything from a string except the alphanumeric characters and the HTML. Anything between the HTML angle brackets should be left alone, even though everything else that is non-alphanumeric gets removed. That way I can sort it out afterwards with the tags() function to only allow the HTML I want.

I used this to remove the non-alphanumeric characters:

[code]$string = preg_replace("/[^A-Za-z0-9]/", "", $string);[/code]

But I need that statement to ignore anything between HTML tags too.

Thanks for any help that is offered.

- Paladin
effigy Posted January 24, 2007

See [url=http://www.phpfreaks.com/forums/index.php/topic,122857.0.html]this[/url] topic. It covers how to distinguish between HTML and non-HTML.
paladin_sh Posted January 24, 2007

Thanks,

I read through the topic, and while I can remove the HTML code and mess around with the stuff inside it, I can't get my script to remove everything [i]except[/i] the HTML, let alone remove everything except the HTML and the alphanumerics. When I try removing everything but the HTML, it just removes all the spaces and leaves everything else. Hehe.

I admit regex isn't my strong suit, and I am still learning quite a bit about it, especially how it is used in PHP. I learn mostly by example.

I do appreciate the help so far though.

- Paladin
effigy Posted January 24, 2007

[code]<pre>
<?php
$html = <<<HTML
<html>
<head><title>T%i#t*l@e!</title></head>
<body>
abcde 12345
<font color="red">|+_()*!@#$%</font>
</body>
</html>
HTML;

// Match each run of text that directly follows a ">" (i.e. text outside tags)
// and strip the non-word characters from that run only.
$html = preg_replace_callback(
    '/(?<=>)([^<]+)/',
    create_function(
        '$matches',
        'return preg_replace("/\W/", "", $matches[0]);'
    ),
    $html
);
echo htmlspecialchars($html);
?>
</pre>[/code]

If you want to preserve the whitespace, change [tt]/\W/[/tt] to [tt]/(?![\s])\W/[/tt]. Also, note that the [tt]\w[/tt]/[tt]\W[/tt] shorthand includes/excludes the underscore as well, so you may want to change this.
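For anyone landing here on a newer PHP: create_function() was deprecated in PHP 7.2 and removed in PHP 8.0, so the same idea can be sketched with an anonymous function instead. The helper name [tt]strip_outside_tags[/tt] and its [tt]$keepWhitespace[/tt] parameter are made up for this illustration and are not from the posts above. It also spells out the character class rather than relying on [tt]\W[/tt], so underscores are stripped along with the other punctuation. Treat it as a rough sketch: like the original pattern, it only touches text that follows a ">", so text before the first tag is left as-is, and a ">" inside an attribute value would confuse it.

```php
<?php
// Hypothetical helper (not from the thread): cleans only the text that
// follows a ">" and leaves everything inside <...> untouched.
function strip_outside_tags($html, $keepWhitespace = true)
{
    // Explicit classes instead of \W, so "_" is removed as well.
    $pattern = $keepWhitespace ? '/[^A-Za-z0-9\s]/' : '/[^A-Za-z0-9]/';

    return preg_replace_callback(
        '/(?<=>)[^<]+/',                       // a run of text right after a ">"
        function ($matches) use ($pattern) {   // replaces create_function()
            return preg_replace($pattern, '', $matches[0]);
        },
        $html
    );
}

$input = '<p class="a_b">He_llo, w@rld! 123</p>';
echo strip_outside_tags($input), "\n";         // <p class="a_b">Hello wrld 123</p>
echo strip_outside_tags($input, false), "\n";  // <p class="a_b">Hellowrld123</p>
```

The attribute value [tt]a_b[/tt] survives in both calls because it sits inside the tag, which is the behaviour the original question asked for.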