How to strip all HTML tags except a few from an HTML file using regex?


I have retrieved HTML content using cURL and successfully extracted everything between the form tags I need.

 

The content contains table tags (<tr>, <td>, <th>) and many others. I only want to keep the <input>, <select>, and <textarea> tags.

 

What regex should I use to strip all the unneeded tags?

Example, removing all <tr> tags:

 

$content = preg_replace('~</?tr\b[^>]*>~i','',$content);

 

Note: this removes only the tags themselves, not the content between them.
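
To make that note concrete, here is a minimal sketch run against a made-up snippet (the sample markup and the $stripped variable are assumptions for illustration, not the original poster's data). It shows that only the <tr> tags disappear while the cells and the input inside them stay:

```php
<?php
// Hypothetical sample markup standing in for the fetched form contents.
$content = '<table><tr class="row"><td>Name</td><td><input type="text" name="n"></td></tr></table>';

// Strip opening and closing <tr> tags (with any attributes), case-insensitively.
// The \b stops the pattern from also matching longer tag names such as <track>.
$stripped = preg_replace('~</?tr\b[^>]*>~i', '', $content);

echo $stripped, PHP_EOL;
// <table><td>Name</td><td><input type="text" name="n"></td></table>
```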

Thanks, I will try it. Can't we just remove all tags except a few? I know some regex, but nothing that complex.

I know we can add ^ which means "NOT".

Can't we do something like that?

Have you tried strip_tags()? Its second parameter takes the allowable tags.
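
A minimal sketch of the strip_tags() approach, assuming a small made-up fragment in place of the real form contents. The $allowed list is an assumption: it adds <option> to the three tags asked for, since a <select> is of little use without its options. Note that strip_tags() keeps the text between removed tags, so "Name" survives here:

```php
<?php
// Hypothetical sample markup standing in for the fetched form contents.
$content = '<tr><td><b>Name</b></td><td><input type="text" name="n"></td></tr>'
         . '<tr><td><select name="s"><option value="1">One</option></select></td></tr>'
         . '<tr><td><textarea name="t">Hi</textarea></td></tr>';

// Tags to keep, written as one string of opening tags.
// <option> is an assumed addition so the <select> keeps its choices.
$allowed = '<input><select><option><textarea>';

// Every tag not on the list is removed; text content is left in place.
$clean = strip_tags($content, $allowed);

echo $clean, PHP_EOL;
```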

Anyway, if you want to do it with a regex, try this one (I've modified Crayon Violent's regex by adding a negative lookahead assertion):

 

$content = preg_replace('/<(?!\/?(?:input|textarea|select)\b)[^>]*>/i', '', $content);

 

It can still misbehave in some cases, for example when HTML markup sits inside HTML comments (I happened to test it against a page that had this).
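
One possible way around the comment problem, sketched under assumptions: strip HTML comments in a separate pass before removing tags. The sample markup, the variable names, and the extra <option> entry in the keep-list are all illustrative rather than part of the original suggestion, and a regex like this still cannot cope with every malformed page.

```php
<?php
// Hypothetical sample markup with a commented-out input.
$content = '<tr><td><!-- <input name="old"> --><input name="a"></td></tr>'
         . '<tr><td><select name="s"><option>1</option></select></td></tr>';

// 1. Drop HTML comments first so markup inside them is not treated as real tags.
//    The s modifier lets . match newlines, so multi-line comments are covered.
$noComments = preg_replace('/<!--.*?-->/s', '', $content);

// 2. Remove every tag whose name is not on the keep-list.
//    The lookahead covers the optional slash, so closing tags such as
//    </select> are kept as well; \b stops partial matches on longer names.
$clean = preg_replace(
    '~<(?!/?(?:input|select|option|textarea)\b)[^>]*>~i',
    '',
    $noComments
);

echo $clean, PHP_EOL;
// <input name="a"><select name="s"><option>1</option></select>
```

Without the first pass, the tag-stripping regex would consume the comment only up to the first ">" and leave a stray " -->" in the output.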

 

 

 
