I'm wondering if there is a way to accomplish this.

If a user inputs some HTML code, I only want to accept everything that is in between the body tags.

If they input
[code]<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>test</title>
<link href="style.css" rel="stylesheet" type="text/css">
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body disabled leftmargin="0" topmargin="0" marginwidth="0" marginheight="0">
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr>
<td align="center">
<table width="800" border="0" cellspacing="0" cellpadding="0">
<tr>
<td bgcolor="FE0000"><img src="img/logo2.gif" width="371" height="128"></td>
</tr>
<tr>
<td bgcolor="F0F0F0">
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr>
<td width="1"></td>
<td width="258" valign="top"><a href="www.birddogsgarage.com">BDG</a></td>
</tr>
</table>
</td>
</tr>
</table>
</td>
</tr>
</table>
</body>
</html>[/code]

I need to remove
[code]<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>test</title>
<link href="style.css" rel="stylesheet" type="text/css">
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body disabled leftmargin="0" topmargin="0" marginwidth="0" marginheight="0">



</body>
</html>[/code]

But keep
[code]
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr>
<td align="center">
<table width="800" border="0" cellspacing="0" cellpadding="0">
<tr>
<td bgcolor="FE0000"><img src="img/logo2.gif" width="371" height="128"></td>
</tr>
<tr>
<td bgcolor="F0F0F0">
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr>
<td width="1"></td>
<td width="258" valign="top"><a href="www.birddogsgarage.com">BDG</a></td>
</tr>
</table>
</td>
</tr>
</table>
</td>
</tr>
</table>
[/code]

The problem I am running into is stripping off the top portion, because the body tag can have a lot of variation and the content itself can start with anything.

This is the closest I have come, but it still leaves the closing head tag and the body tag in place:

[code]<?php
$source = '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>test</title>
<link href="style.css" rel="stylesheet" type="text/css">
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body disabled leftmargin="0" topmargin="0" marginwidth="0" marginheight="0">
<p><table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr>
<td align="center">
<table width="800" border="0" cellspacing="0" cellpadding="0">
<tr>
<td bgcolor="FE0000"><img src="img/logo2.gif" width="371" height="128"></td>
</tr>
<tr>
<td bgcolor="F0F0F0">
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr>
<td width="1"></td>
<td width="258" valign="top"><a href="www.birddogsgarage.com">BDG</a></td>
</tr>
</table>
</td>
</tr>
</table>
</td>
</tr>
</table>
</body>
</html>';
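// strstr() returns everything from "</head>" onward, including "</head>" itself
// and the whole <body ...> tag, which is why both are still in the output.
$output = strstr ($source,"</head>");
// Cut the result off just before the closing body tag.
$output = substr ($output, 0, strpos ($output,"</body>"));
// Escape the markup so it is displayed as text inside the textarea.
echo '<textarea name="150" rows="20" cols="75" >'.htmlspecialchars($output).'</textarea>';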
?>[/code]


Any ideas would be greatly appreciated.
Do you want to strip all of the HTML tags? If you do, there is a PHP function called strip_tags(). You can also keep certain tags by listing them in its second argument; the manual explains how it works :)

[url=http://us2.php.net/strip_tags]http://us2.php.net/strip_tags[/url]
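
As a rough sketch of the allowed-tags argument (the shortened $html below is made up for illustration, not the poster's full document):

```php
<?php
// Hypothetical, shortened input for illustration only.
$html = '<html><head><title>test</title></head>'
      . '<body><table><tr><td><a href="page.html">BDG</a></td></tr></table></body></html>';

// The second argument lists the tags to keep; every other tag is removed.
$kept = strip_tags($html, '<table><tr><td><a>');

echo $kept;
```

Only the tags themselves are removed, so text that sat inside a stripped tag (the title text here) stays in the result.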

Scot
As I understand it, the strip_tags() function will remove the entire HTML tag, for instance:

[code]
$text = '<p class="123">some text</p>';

$text = strip_tags($text);

echo $text;
[/code]

This should just echo 'some text'.

However, if you allow some tags, strip_tags() leaves them exactly as written, so their attributes will still be in the '$text' variable.

You should just try it out and see if it works for you.
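
Something like this should show the difference (the sample string and variable names are just an example):

```php
<?php
$text = '<p class="123">some <b style="color:red">bold</b> text</p>';

// No allowed tags: every tag goes, only the text remains.
$plain = strip_tags($text);

// Allow <b>: the tag is kept together with its attributes.
$withBold = strip_tags($text, '<b>');

echo $plain, "\n", $withBold, "\n";
```

The first line printed has no markup at all, while the second still carries the style attribute on the allowed tag.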

Scot
I stand corrected. You are right on the money; I was reading it wrong.

Now the only thing left behind is whatever the user has in the title tag.

Guess I will have to find a way to strip that, and then use strip_tags().

Back to the drawing board.
This did the trick. It seemed to make more sense to list the tags you don't want rather than create a list of all the tags to allow.

It also strips the text between the opening and closing tags, for example the title.

Thanks for the push in the right direction, Scot! I appreciate it.

[code]<?php
  $source ='<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>test</title>
<link href="style.css" rel="stylesheet" type="text/css">
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body disabled leftmargin="0" topmargin="0" marginwidth="0" marginheight="0">
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr>
<td align="center">
<table width="800" border="0" cellspacing="0" cellpadding="0">
<tr>
<td bgcolor="FE0000"><img src="img/logo2.gif" width="371" height="128"></td>
</tr>
<tr>
<td bgcolor="F0F0F0">
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr>
<td width="1"></td>
<td width="258" valign="top"><a href="www.birddogsgarage.com">BDG</a></td>
</tr>
</table>
</td>
</tr>
</table>
</td>
</tr>
</table>
</body>
</html>';
 
 
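  // Remove the tags listed in $tags from $source. When $stripContent is true, the
  // text between an opening and closing tag is removed as well, but only when it
  // contains no line break ("." does not match newlines without the "s" modifier).
  // That is why <title>test</title> disappears while the multi-line <body> content
  // survives; input written on a single line would lose its body content too.
  function strip_selected_tags($source, $tags = '<html><head><title><link><meta><body><!>', $stripContent = true)
  {
      // Collect the tag names listed in $tags, e.g. "html", "head", "!".
      preg_match_all("/<([^>]+)>/i",$tags,$allTags,PREG_PATTERN_ORDER);
      foreach ($allTags[1] as $tag){
          $tag = preg_quote($tag, '/'); // escape regex metacharacters in the tag name
          if ($stripContent) {
              $source = preg_replace("/<".$tag."[^>]*>.*<\/".$tag.">/iU","",$source);
          }
          // Strip any remaining opening or closing tag on its own.
          $source = preg_replace("/<\/?".$tag."[^>]*>/iU","",$source);
      }
      return $source;
  }
  $clean = strip_selected_tags($source);
  // Escape the result so the markup is shown as text inside the textarea.
  echo '<textarea name="150" rows="20" cols="75" >'.htmlspecialchars($clean).'</textarea>';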


?>[/code]
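
A more direct route to the original goal might be to capture whatever sits between the body tags with a single regular expression. This is only a sketch: extract_body() and the $page sample are made-up names, and it assumes no attribute value on the body tag contains a ">" character.

```php
<?php
// Sketch: return whatever sits between <body ...> and </body>, or the input
// unchanged when no body element is found. extract_body() is a hypothetical name.
function extract_body($html)
{
    // [^>]* absorbs any attributes on the body tag; the "s" modifier lets "."
    // span line breaks, and "i" ignores case (<BODY> also matches).
    if (preg_match('/<body\b[^>]*>(.*)<\/body\s*>/is', $html, $m)) {
        return trim($m[1]);
    }
    return $html;
}

// Hypothetical, shortened input for illustration only.
$page = "<html>\n<head><title>test</title></head>\n"
      . "<body leftmargin=\"0\" topmargin=\"0\">\n<table><tr><td>BDG</td></tr></table>\n</body>\n</html>";

echo extract_body($page);
```

Because the pattern does not care what comes before the body tag, the doctype, head and title never need to be listed or stripped separately.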