Remove specific tag attributes from HTML


groupbrand

Hi, I am trying to remove specific tag attributes from my HTML whilst leaving some intact. What I really need is a regular expression that keeps a list of 'accepted' tag attributes (e.g. href, target) and removes all others (e.g. style, class), as in the following example:

 

<a href="http://mysite.com" target="_blank" class="myclass"> would become  <a href="http://mysite.com" target="_blank">

 

However, I won't just be matching <a> tags but all tags, including <p>, <ul>, <li> etc.

 

Hope someone can help!

 

Cheers.

 

 

Hi there.  Give this a shot:

 

// Some sample data to work with
$data = '<p class="someclass">This is a <a href="http://www.google.com" target="_blank" style="color:#cc0000">hyperlink</a> in a paragraph!</p>';

// Array of attributes to keep (in lowercase), all others will be removed
$keep = array('href', 'target');

// Get an array of all the name="value" attribute pairs in the data string.
// [^"]* also matches empty values, and names may contain digits, hyphens and colons.
// Note: only double-quoted values are handled.
preg_match_all('/[a-z][a-z0-9:-]*="[^"]*"/i', $data, $attributes);

// Loop through the attribute pairs, match them against the keep array and remove
// them from $data if they don't exist in the array
foreach ($attributes[0] as $attribute) {
    // Everything before the first "=" is the attribute name (needs PHP 5.3+)
    $attributeName = strtolower(strstr($attribute, '=', true));
    if (!in_array($attributeName, $keep)) {
        $data = str_replace(' ' . $attribute, '', $data);
    }
}

echo $data;

 

That should output $data as something like:

 

<p>This is a <a href="http://www.google.com" target="_blank">hyperlink</a> in a paragraph!</p>
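
One caveat with the regex route: it works on the raw string, so it can also remove text that merely looks like an attribute (say, class="x" written inside a paragraph), and it only sees double-quoted values. If that matters, letting a parser do the work is usually safer. Here is a minimal sketch using PHP's bundled DOMDocument. The function name stripAttributes is made up for illustration, and the sketch assumes the input is a UTF-8 HTML fragment:

```php
<?php
// Hypothetical helper (not part of the snippet above): removes every attribute
// whose name is not listed in $keep, using PHP's DOM extension.
function stripAttributes($html, array $keep)
{
    $keep = array_map('strtolower', $keep);

    $doc = new DOMDocument();
    // Collect parser warnings internally so sloppy markup doesn't spam the output
    $previous = libxml_use_internal_errors(true);
    // The XML declaration prefix is a common trick to make libxml treat the input as UTF-8
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('*') as $element) {
        // Walk backwards because removing an attribute shrinks the list
        for ($i = $element->attributes->length - 1; $i >= 0; $i--) {
            $name = $element->attributes->item($i)->nodeName;
            if (!in_array(strtolower($name), $keep, true)) {
                $element->removeAttribute($name);
            }
        }
    }

    // loadHTML() wraps a fragment in <html><body>, so return only the body's children
    $body = $doc->getElementsByTagName('body')->item(0);
    $result = '';
    if ($body !== null) {
        foreach ($body->childNodes as $child) {
            $result .= $doc->saveHTML($child);
        }
    }
    return $result;
}

// Usage with the sample string from above
$data = '<p class="someclass">This is a <a href="http://www.google.com" target="_blank" style="color:#cc0000">hyperlink</a> in a paragraph!</p>';
echo stripAttributes($data, array('href', 'target'));
```

For the sample string this should give the same markup as shown above. Bear in mind that the parser may normalise the HTML along the way (for example lower-casing attribute names or wrapping stray text in a paragraph), so treat it as a starting point rather than a drop-in replacement.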

 

Hope this helps!  :)
