Archived

This topic is now archived and is closed to further replies.

Kingskin

Remove <script> tags and everything in them

Should be a simple one: I need to know not just how to remove script tags, but also all of the text between them, i.e. I need a script that, upon encountering <script>, removes everything until it encounters </script>.

At the moment I can remove the script tags, which obviously takes care of the security aspect, but it leaves the content of the tags in my DB, which obviously I don't want for neatness reasons.

Cheers!

OK, sorted. I'll post the solution for anyone who needs it:
I found this on php.net, but it wasn't working and also used PHP5-only functions, so I modified it and it's now working with PHP4.

[code]
// Usage:
//    $nasty_tags=array('script','body');  #etc
//    $to_clean='Here is some script: <script>alert("oops")</script> - it will be cleaned!';
//    $clean_str = removeTags($to_clean,$nasty_tags);
function removeTags($text,$tags_array)
{
       $length = strlen($text);
       $pos =0;
       $tags_array = array_flip($tags_array);
       while ($pos < $length && ($pos = strpos($text,'<',$pos)) !== false){
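           // The tag name ends at the first space or '>', whichever comes first.
           // strpos() returns false when nothing is found, so handle that explicitly.
           $dlm_pos = strpos($text,' ',$pos);
           $dlm2_pos = strpos($text,'>',$pos);
           if ($dlm2_pos === false) $dlm2_pos = $length;
           if ($dlm_pos === false || $dlm_pos > $dlm2_pos) $dlm_pos=$dlm2_pos;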
           $which_tag = strtolower(substr($text,$pos+1,$dlm_pos-($pos+1)));
           $tag_length = strlen($which_tag);
           if (!isset($tags_array[$which_tag])){
               //if no tag matches found
               ++$pos;
               continue;
           }
           //find the end
           $sec_tag = '</'.$which_tag.'>';
           $sec_pos = strpos($text,$sec_tag,$pos+$tag_length);
           //remove everything after if end of the tag not found
           if ($sec_pos === false) $sec_pos = $length-strlen($sec_tag);
           $rmv_length = $sec_pos-$pos+strlen($sec_tag);
           $text = substr_replace($text,'',$pos,$rmv_length);
           //update length
           $length = $length - $rmv_length;
           //do not advance $pos: the text after the removed block now starts at $pos
       }
       return $text;
}
[/code]
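
For anyone comparing approaches, the scan-and-cut idea above can be boiled down to a small loop for a single tag. This is only a minimal sketch, assuming PHP 5 or later (stripos() is not available in PHP 4); the function name cutTagBlocks is made up for illustration and is not from php.net.

```php
<?php
// Hypothetical helper: remove every <tag ...>...</tag> block, case-insensitively.
function cutTagBlocks($text, $tag)
{
    $open  = '<' . $tag;
    $close = '</' . $tag . '>';
    $pos   = 0;
    while (($pos = stripos($text, $open, $pos)) !== false) {
        // Check that the whole tag name matched, so <script does not match <scripts>
        $next = (string) substr($text, $pos + strlen($open), 1);
        if ($next !== '>' && $next !== '/' && !ctype_space($next)) {
            $pos++;
            continue;
        }
        $end = stripos($text, $close, $pos);
        if ($end === false) {
            // No closing tag: drop everything from the opening tag onwards
            $cutLength = strlen($text) - $pos;
        } else {
            $cutLength = $end + strlen($close) - $pos;
        }
        $text = substr_replace($text, '', $pos, $cutLength);
        // $pos is not advanced: the remaining text now starts at $pos
    }
    return $text;
}
```

Not advancing the position after a cut is what lets it catch two blocks that sit directly next to each other, and a block at the very start or end of the string needs no special handling.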

OK, for some reason the script above would only work if the script tags were not the first or last item in a string. This (much neater) solution works much better:

[code]function evilTags($text)
{
    // Part 1
    // This array is for single tags and their closing counterparts
    
    $tags_to_strip = Array("html","body","meta","link","head");
    
    foreach ($tags_to_strip as $tag)
    {
           // \b stops "head" matching <header>; [^>]* skips attributes; /i ignores case
           $text = preg_replace("/<\/?" . $tag . "\b[^>]*>/i","",$text);
    }
    
    // Part 2
    // This array is for stripping opening and closing tags AND what's in between
    
    $tags_and_content_to_strip = Array("title","script");
    
    foreach ($tags_and_content_to_strip as $tag) {
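           // [^>]* allows attributes such as <script type="...">; /s lets . span newlines; /i ignores case
           $text = preg_replace("/<" . $tag . "\b[^>]*>.*?<\/" . $tag . "\s*>/is","",$text);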
    }
    
    return $text;
}[/code]
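
The "tags AND what's in between" part can also be done in a single preg_replace() call for the whole list. A rough sketch, assuming PHP 5.3+ for the anonymous function; the name stripTagsAndContent is just a placeholder, and it only aims at reasonably well-formed markup.

```php
<?php
// Hypothetical helper: strip the listed tags together with everything between them.
function stripTagsAndContent($text, array $tags)
{
    // Escape each tag name and join them into one alternation: title|script
    $names = implode('|', array_map(function ($t) {
        return preg_quote($t, '/');
    }, $tags));

    // <(name)\b[^>]*>  opening tag with optional attributes
    // .*?              shortest possible body (the s modifier lets . match newlines)
    // <\/\1\s*>        closing tag with the same name as the opening one
    $pattern = '/<(' . $names . ')\b[^>]*>.*?<\/\1\s*>/is';

    return preg_replace($pattern, '', $text);
}

$dirty = "Hi <script type=\"text/javascript\">\nalert('x');\n</script>there <TITLE>t</TITLE>!";
echo stripTagsAndContent($dirty, array('title', 'script'));
```

The \1 backreference means an opening <script> is only paired with </script>, never with </title>, and the i modifier covers upper-case tags.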
