Remove <script> tags and everything in them


2 replies to this topic

#1 Kingskin

Kingskin
  • Members
  • PipPip
  • Member
  • 18 posts

Posted 04 March 2006 - 01:15 PM

This should be a simple one. I need to know not just how to remove script tags, but also all of the text between them, i.e. I need a script that, upon encountering <script>, removes everything until it encounters </script>.

At the moment I can remove the script tags, which obviously takes care of the security aspect, but it leaves the content of the tags in my DB, which obviously I don't want for neatness reasons.
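To illustrate the difference being described, here is a minimal sketch. `strip_tags` is PHP's built-in function; `removeFirstScriptBlock` is a hypothetical helper made up for this example, and it only handles a bare, lower-case `<script>` tag with no attributes.

```php
<?php
// Hypothetical helper for illustration: removes the first <script>...</script> block.
function removeFirstScriptBlock($text)
{
    $open = '<script>';
    $close = '</script>';
    $start = strpos($text, $open);
    if ($start === false) {
        return $text; // no opening tag, nothing to remove
    }
    $end = strpos($text, $close, $start);
    if ($end === false) {
        return substr($text, 0, $start); // unclosed tag: drop the rest of the string
    }
    // Keep what comes before the opening tag and what comes after the closing tag
    return substr($text, 0, $start) . substr($text, $end + strlen($close));
}

$input = 'Hello <script>alert("oops")</script> world';
echo strip_tags($input), "\n";             // tags gone, script body still there
echo removeFirstScriptBlock($input), "\n"; // tags and script body both gone
```

The first call leaves `alert("oops")` in the text, which is exactly the leftover content the question is about.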

Cheers!

#2 Kingskin


Posted 04 March 2006 - 01:55 PM

OK, sorted. I'll post the solution for anyone who needs it:
I found this on php.net, but it wasn't working and it also used PHP5-only functions, so I modified it and it's now working with PHP4.

// Usage:
//    $nasty_tags = array('script', 'body');  #etc
//    $to_clean = 'Here is some script: <script>alert("oops")</script> - it will be cleaned!';
//    $clean_str = removeTags($to_clean, $nasty_tags);
function removeTags($text, $tags_array)
{
    $length = strlen($text);
    $pos = 0;
    // Flip the array so tag names become keys and isset() can do the lookup
    $tags_array = array_flip($tags_array);
    while ($pos < $length && ($pos = strpos($text, '<', $pos)) !== false) {
        // The tag name ends at the first space or '>', whichever comes first.
        // strpos() returns false when nothing is found, so check for that explicitly.
        $dlm_pos = strpos($text, ' ', $pos);
        $dlm2_pos = strpos($text, '>', $pos);
        if ($dlm2_pos === false) $dlm2_pos = $length;
        if ($dlm_pos === false || $dlm_pos > $dlm2_pos) $dlm_pos = $dlm2_pos;
        $which_tag = strtolower(substr($text, $pos + 1, $dlm_pos - ($pos + 1)));
        $tag_length = strlen($which_tag);
        if (!isset($tags_array[$which_tag])) {
            // not one of the tags we want to remove
            ++$pos;
            continue;
        }
        // find the end (note: the closing tag is only matched in lower case)
        $sec_tag = '</' . $which_tag . '>';
        $sec_pos = strpos($text, $sec_tag, $pos + $tag_length);
        // remove everything after if end of the tag not found
        if ($sec_pos === false) $sec_pos = $length - strlen($sec_tag);
        $rmv_length = $sec_pos - $pos + strlen($sec_tag);
        $text = substr_replace($text, '', $pos, $rmv_length);
        // update length; $pos stays where it is so a tag that follows directly is not skipped
        $length = $length - $rmv_length;
    }
    return $text;
}
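One thing worth checking in code like this is how the tag name is located. `strpos` returns `false`, not a position, when the character is not found, so a tag with no space anywhere after it needs explicit handling. A small sketch of that lookup on its own follows; `tagNameAt` is a hypothetical helper, not part of the function above.

```php
<?php
// Hypothetical helper for illustration: returns the lower-cased name of the tag starting at $pos.
function tagNameAt($text, $pos)
{
    $space = strpos($text, ' ', $pos);
    $close = strpos($text, '>', $pos);
    if ($close === false) {
        $close = strlen($text); // no '>' at all: treat the end of the string as the limit
    }
    if ($space === false || $space > $close) {
        $space = $close;        // no space before '>': the name runs up to '>'
    }
    return strtolower(substr($text, $pos + 1, $space - ($pos + 1)));
}

$text = 'text <script>alert(1)</script>';
var_dump(strpos($text, ' ', 5)); // bool(false): there is no space after the tag
echo tagNameAt($text, 5), "\n";  // script
```

Without the `=== false` checks, the `false` result ends up in the length arithmetic and the wrong substring is compared against the tag list.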


#3 Kingskin


Posted 05 March 2006 - 10:58 AM

OK, for some reason the script above only worked for me if the script tags were not the first or last item in a string. This (much neater) solution works much better:

function evilTags($text)
{
    // Part 1
    // This array is for single tags and their closing counterparts
    // (the tags are removed, anything between them is kept)

    $tags_to_strip = array("html", "body", "meta", "link", "head");

    foreach ($tags_to_strip as $tag)
    {
        // \b stops "head" from also matching <header>; the i modifier ignores case
        $text = preg_replace('/<\/?' . $tag . '\b[^>]*>/i', '', $text);
    }

    // Part 2
    // This array is for stripping opening and closing tags AND what's in between

    $tags_and_content_to_strip = array("title", "script");

    foreach ($tags_and_content_to_strip as $tag) {
        // [^>]* allows attributes on the opening tag; the s modifier lets . match newlines
        $text = preg_replace('/<' . $tag . '\b[^>]*>.*?<\/' . $tag . '\s*>/is', '', $text);
    }

    return $text;
}
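For anyone wanting to check how a pattern like this behaves on awkward input, here is a small sketch that tests one tag at a time. `stripTagWithContent` is a hypothetical helper name, and the pattern is a variant that assumes you want to ignore case and allow attributes and line breaks.

```php
<?php
// Hypothetical helper for illustration: strips one tag together with its content.
function stripTagWithContent($text, $tag)
{
    $name = preg_quote($tag, '/');
    // i: ignore case, s: let . match newlines, [^>]*: allow attributes on the opening tag
    $pattern = '/<' . $name . '\b[^>]*>.*?<\/' . $name . '\s*>/is';
    return preg_replace($pattern, '', $text);
}

// Upper-case opening tag, an attribute, and a script body spread over several lines
$input = "before<SCRIPT type=\"text/javascript\">\nalert('x');\n</script>after";
echo stripTagWithContent($input, 'script'), "\n"; // beforeafter
```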




