tryingtolearn Posted January 20, 2007

I'm wondering if there is a way to accomplish this. If a user inputs some HTML code, I only want to accept everything that would be in between the body tags.

If they input

[code]
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>test</title>
<link href="style.css" rel="stylesheet" type="text/css">
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body disabled leftmargin="0" topmargin="0" marginwidth="0" marginheight="0">
<table width="100%" border="0" cellspacing="0" cellpadding="0">
  <tr>
    <td align="center">
      <table width="800" border="0" cellspacing="0" cellpadding="0">
        <tr>
          <td bgcolor="FE0000"><img src="img/logo2.gif" width="371" height="128"></td>
        </tr>
        <tr>
          <td bgcolor="F0F0F0">
            <table width="100%" border="0" cellspacing="0" cellpadding="0">
              <tr>
                <td width="1"></td>
                <td width="258" valign="top"><a href="www.birddogsgarage.com">BDG</a></td>
              </tr>
            </table>
          </td>
        </tr>
      </table>
    </td>
  </tr>
</table>
</body>
</html>
[/code]

I need to remove

[code]
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>test</title>
<link href="style.css" rel="stylesheet" type="text/css">
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body disabled leftmargin="0" topmargin="0" marginwidth="0" marginheight="0">
</body>
</html>
[/code]

But keep

[code]
<table width="100%" border="0" cellspacing="0" cellpadding="0">
  <tr>
    <td align="center">
      <table width="800" border="0" cellspacing="0" cellpadding="0">
        <tr>
          <td bgcolor="FE0000"><img src="img/logo2.gif" width="371" height="128"></td>
        </tr>
        <tr>
          <td bgcolor="F0F0F0">
            <table width="100%" border="0" cellspacing="0" cellpadding="0">
              <tr>
                <td width="1"></td>
                <td width="258" valign="top"><a href="www.birddogsgarage.com">BDG</a></td>
              </tr>
            </table>
          </td>
        </tr>
      </table>
    </td>
  </tr>
</table>
[/code]

The problem I am running into is stripping off the top portion, because the body tag can have a lot of variation and the content itself can start with anything.

This is the closest I have come, but it still leaves the closing head tag and the body tag in place:

[code]
<?php
$source = '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>test</title>
<link href="style.css" rel="stylesheet" type="text/css">
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body disabled leftmargin="0" topmargin="0" marginwidth="0" marginheight="0">
<p><table width="100%" border="0" cellspacing="0" cellpadding="0">
  <tr>
    <td align="center">
      <table width="800" border="0" cellspacing="0" cellpadding="0">
        <tr>
          <td bgcolor="FE0000"><img src="img/logo2.gif" width="371" height="128"></td>
        </tr>
        <tr>
          <td bgcolor="F0F0F0">
            <table width="100%" border="0" cellspacing="0" cellpadding="0">
              <tr>
                <td width="1"></td>
                <td width="258" valign="top"><a href="www.birddogsgarage.com">BDG</a></td>
              </tr>
            </table>
          </td>
        </tr>
      </table>
    </td>
  </tr>
</table>
</body>
</html>';

// strstr() returns everything from the first "</head>" onward,
// including "</head>" itself, so the <body ...> tag is still in $output.
$output = strstr($source, "</head>");

// Cut everything from "</body>" to the end.
$output = substr($output, 0, strpos($output, "</body>"));

$file_source = highlight_string($output, true);
echo '<textarea name="150" rows="20" cols="75" >'.$output.'</textarea>';
?>
[/code]

Any ideas would be greatly appreciated.

Link to comment: https://forums.phpfreaks.com/topic/34987-need-help-with-removing-certain-chunks-of-html-code/
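
For later readers, one possible way to cope with a body tag that varies is a case-insensitive regular expression that accepts any attributes on the opening tag. This is only a sketch: the helper name `extract_body_contents` is made up for illustration, the type declarations assume PHP 7.1 or newer, and the pattern assumes no attribute value contains a literal `>`.

```php
<?php
// Hypothetical helper (not from the thread): returns whatever sits between
// the opening <body ...> tag and the first closing </body> tag.
function extract_body_contents(string $html): ?string
{
    // \b stops "<body" from matching longer tag names, [^>]* absorbs any
    // attributes, the "s" flag lets "." cross newlines, "i" ignores case.
    if (preg_match('/<body\b[^>]*>(.*?)<\/body\s*>/is', $html, $m)) {
        return trim($m[1]);
    }
    return null; // no complete <body>...</body> pair found
}

$page = "<html><head><title>test</title></head>\n"
      . "<body leftmargin=\"0\" topmargin=\"0\"><table><tr><td>BDG</td></tr></table></body></html>";

echo extract_body_contents($page);
```

Because the match starts at the body tag itself, the doctype, head, and title never need to be listed or stripped separately.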
scotmcc Posted January 20, 2007

Do you want to strip all of the HTML tags? If you do, there is a PHP function called strip_tags(). You can also allow certain tags using this function, but you will have to read through the manual to figure out how it works :)

http://us2.php.net/strip_tags

Scot

Link to comment: https://forums.phpfreaks.com/topic/34987-need-help-with-removing-certain-chunks-of-html-code/#findComment-165010
tryingtolearn (Author) Posted January 20, 2007

Thanks Scot. If I understand that correctly, it only removes the tags, so I will still be stuck with everything in between the tags (the text). I don't think that will work, unless I am reading it wrong.

Link to comment: https://forums.phpfreaks.com/topic/34987-need-help-with-removing-certain-chunks-of-html-code/#findComment-165012
scotmcc Posted January 20, 2007

As I understand it, the strip_tags() function will remove the entire HTML tag, for instance:

[code]
$text = '<p class="123">some text</p>';
$text = strip_tags($text);
echo $text;
[/code]

This should just echo 'some text'.

However, if you leave some tags in, you will also get the attributes of those tags in the '$text' variable.

You should just try it out and see if it works for you.

Scot

Link to comment: https://forums.phpfreaks.com/topic/34987-need-help-with-removing-certain-chunks-of-html-code/#findComment-165013
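
To make the point about allowed tags concrete, here is a small sketch using the second argument of strip_tags(). The sample string and variable names are invented for illustration.

```php
<?php
$text = '<p class="123">some <b>bold</b> text</p>';

// No second argument: every tag is removed, the text between tags stays.
$plain = strip_tags($text);

// Second argument lists the tags to keep; kept tags retain their attributes.
$kept = strip_tags($text, '<p>');

echo $plain, "\n"; // some bold text
echo $kept, "\n";  // <p class="123">some bold text</p>
```

Since attributes on kept tags pass through untouched, an allow-list alone does not sanitize user input.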
tryingtolearn (Author) Posted January 20, 2007

I stand corrected. You are right on the money; I was reading it wrong.

Now the only thing left behind is whatever text the user has in the title tag. I guess I will have to find a way to strip that and then use strip_tags().

Back to the drawing board.

Link to comment: https://forums.phpfreaks.com/topic/34987-need-help-with-removing-certain-chunks-of-html-code/#findComment-165016
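
The leftover title text can be reproduced, and one way around it sketched, as follows. The sample markup and variable names are assumptions made for this example; the idea is simply to delete the title element together with its content before calling strip_tags().

```php
<?php
$html = '<html><head><title>test</title></head><body><p>Hello</p></body></html>';

// strip_tags() removes the <title> tags but leaves the text inside them.
$withTitleText = strip_tags($html, '<p>');

// Remove the whole <title>...</title> element first, then strip the rest.
$noTitle = preg_replace('/<title\b[^>]*>.*?<\/title\s*>/is', '', $html);
$clean   = strip_tags($noTitle, '<p>');

echo $withTitleText, "\n"; // test<p>Hello</p>
echo $clean, "\n";         // <p>Hello</p>
```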
tryingtolearn (Author) Posted January 20, 2007

This did the trick, but it seemed to make more sense to list the tags that you don't want rather than create a list of all the tags to allow. It also strips the text between opening and closing tags, for example the title.

Thanks for the push in the right direction, Scot! I appreciate it.

[code]
<?php
$source = '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>test</title>
<link href="style.css" rel="stylesheet" type="text/css">
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body disabled leftmargin="0" topmargin="0" marginwidth="0" marginheight="0">
<table width="100%" border="0" cellspacing="0" cellpadding="0">
  <tr>
    <td align="center">
      <table width="800" border="0" cellspacing="0" cellpadding="0">
        <tr>
          <td bgcolor="FE0000"><img src="img/logo2.gif" width="371" height="128"></td>
        </tr>
        <tr>
          <td bgcolor="F0F0F0">
            <table width="100%" border="0" cellspacing="0" cellpadding="0">
              <tr>
                <td width="1"></td>
                <td width="258" valign="top"><a href="www.birddogsgarage.com">BDG</a></td>
              </tr>
            </table>
          </td>
        </tr>
      </table>
    </td>
  </tr>
</table>
</body>
</html>';

// Removes the listed tags; "<!>" stands for declarations such as <!DOCTYPE ...>.
function strip_selected_tags($source, $tags = '<html><head><title><link><meta><body><!>', $stripContent = true)
{
    // Pull the bare tag names out of the "<a><b>..." list.
    preg_match_all("/<([^>]+)>/i", $tags, $allTags, PREG_PATTERN_ORDER);
    foreach ($allTags[1] as $tag) {
        if ($stripContent) {
            // "." does not match newlines here, so content is only removed
            // when the opening and closing tag sit on the same line
            // (e.g. <title>test</title>). A <body> that opened and closed
            // on one line would lose its content as well.
            $source = preg_replace("/<".$tag."[^>]*>.*<\/".$tag.">/iU", "", $source);
        }
        // Remove any remaining opening or closing tag of this name.
        $source = preg_replace("/<\/?".$tag."[^>]*>/iU", "", $source);
    }
    return $source;
}

$clean = strip_selected_tags($source);
echo '<textarea name="150" rows="20" cols="75" >'.$clean.'</textarea>';
?>
[/code]

Link to comment: https://forums.phpfreaks.com/topic/34987-need-help-with-removing-certain-chunks-of-html-code/#findComment-165044
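
The deny-list idea above could also be written so that it does not depend on where the line breaks fall. The following is only a sketch, not the poster's function: `strip_document_wrapper` and its two parameters are hypothetical names, and it assumes PHP 7 or newer. It keeps two separate lists, one for tags removed together with their content and one for tags that are merely unwrapped.

```php
<?php
// Hypothetical variant: $dropWithContent lists tags removed together with
// their content, $unwrap lists tags whose markup is removed while the
// content between them stays.
function strip_document_wrapper(
    string $source,
    array $dropWithContent = ['head'],
    array $unwrap = ['html', 'body']
): string {
    // Remove the DOCTYPE declaration, if any.
    $source = preg_replace('/<!DOCTYPE[^>]*>/i', '', $source);

    foreach ($dropWithContent as $tag) {
        $t = preg_quote($tag, '/');
        // "s" lets "." span newlines; ".*?" stops at the first closing tag.
        $source = preg_replace("/<{$t}\\b[^>]*>.*?<\\/{$t}\\s*>/is", '', $source);
    }
    foreach ($unwrap as $tag) {
        $t = preg_quote($tag, '/');
        // \b keeps "head" from also matching "<header>".
        $source = preg_replace("/<\\/?{$t}\\b[^>]*>/i", '', $source);
    }
    return trim($source);
}

$page = "<!DOCTYPE html>\n<html>\n<head>\n<title>test</title>\n</head>\n"
      . "<body leftmargin=\"0\">\n<table><tr><td>BDG</td></tr></table>\n</body>\n</html>";

echo strip_document_wrapper($page);
```

Splitting the lists means the body tag can be unwrapped without any risk of its content being deleted, whether the markup arrives on one line or many.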
scotmcc Posted January 20, 2007

Very nice, glad that helped!

Scot

Link to comment: https://forums.phpfreaks.com/topic/34987-need-help-with-removing-certain-chunks-of-html-code/#findComment-165045