
Whitespace Compression with Preformat Awareness


aunquarra


My end goal: take in a string of HTML and return a whitespace-compressed version of it. I've got some code I've used on many occasions over the years, and it's been tweaked a fair bit along the way. I'm happy with it except for one thing: it won't leave <pre> blocks alone.

 

I can match <pre> blocks just fine.

 

'/<[\s]*pre([^>]*)>([^<]*)<\/pre[\s]*>/Ui'

 

...seems to work as I'd like (though I can't warrant that it's perfect by a long shot).
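To make the "not perfect" part concrete, here is a small check of that pattern. The helper name find_pre_blocks, the sample strings, and the looser pattern are assumptions added for illustration. As far as I can tell, the pattern above matches plain <pre> blocks, but the [^<]* part cannot cross a nested tag, so a block containing something like <b> is missed.

```php
<?php
// The pattern from the post, unchanged.
const POST_PRE_PATTERN = '/<[\s]*pre([^>]*)>([^<]*)<\/pre[\s]*>/Ui';

// Assumed alternative (not from the post): lazy dot with the "s" flag,
// so the body may contain newlines and nested tags.
const LOOSE_PRE_PATTERN = '~<pre\b[^>]*>.*?</pre\s*>~is';

// Hypothetical helper: returns every full <pre>...</pre> match in $html.
function find_pre_blocks(string $pattern, string $html): array
{
    preg_match_all($pattern, $html, $m);
    return $m[0];
}

$plain  = "<pre class=\"code\">a\n  b</pre>";
$nested = "<pre><b>a</b>\n  b</pre>";

var_dump(find_pre_blocks(POST_PRE_PATTERN, $plain));   // one match
var_dump(find_pre_blocks(POST_PRE_PATTERN, $nested));  // no match
var_dump(find_pre_blocks(LOOSE_PRE_PATTERN, $nested)); // one match
```

Neither pattern handles <pre> nested inside <pre>, which should be rare in practice.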

 

What I don't know is how to apply that kind of detection in the manner I want. I'd like a neat regex solution, but I can accept a clever workaround. I just haven't thought of anything.

 

Here's what I typically pass as the search and replace arrays to preg_replace()...

 

		$search = array(
			'/(\s+)?(.+):(.+);(\s+)?/',          // css each item
			'/(\s+)?(.+)(\s+)?\{(.+)\}(\s+)?/',  // css between items
			'/\n/',                              // replace end of line by a space
			'/\>[^\S ]+/s',                      // strip whitespaces after tags, except space
			'/[^\S ]+\</s',                      // strip whitespaces before tags, except space
			'/(\s)+/s',                          // shorten multiple whitespace sequences
		);

		$replace = array(
			'\\2:\\3;',
			'\\2 {\\4}',
			' ',
			'>',
			'<',
			'\\1',
		);
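For anyone who wants to reproduce the problem, here is a minimal sketch that wraps those arrays in a function. The name compress_whitespace is made up, and the first pattern is assumed to read (.+):(.+); since the \\2:\\3; replacement needs a third group. On a small sample it shows the line break and indentation inside <pre> being collapsed to a single space.

```php
<?php
// Hypothetical wrapper around the search/replace arrays from the post.
function compress_whitespace(string $html): string
{
    $search = array(
        '/(\s+)?(.+):(.+);(\s+)?/',          // css each item (assumed reconstruction)
        '/(\s+)?(.+)(\s+)?\{(.+)\}(\s+)?/',  // css between items
        '/\n/',                              // replace end of line by a space
        '/\>[^\S ]+/s',                      // strip whitespace after tags, except space
        '/[^\S ]+\</s',                      // strip whitespace before tags, except space
        '/(\s)+/s',                          // shorten multiple whitespace sequences
    );
    $replace = array('\\2:\\3;', '\\2 {\\4}', ' ', '>', '<', '\\1');

    // preg_replace() applies each pattern in order to the result of the previous one.
    return preg_replace($search, $replace, $html);
}

// The preformatted line break and indentation do not survive.
echo compress_whitespace("<p>a</p>\n<pre>x\n  y</pre>");
```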

 

So the question is how I can either:

 

  1. Modify the patterns to ignore anything inside <pre> blocks (if that's even possible); or
  2. Add some kind of tokenization to remove <pre> blocks first, then run my preg_replace(), then put the <pre> blocks back.
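Both options can be sketched roughly as follows. This is only one possible approach, not a tested drop-in: the function names, the <!--PRE_BLOCK_n--> token format, and the <pre> pattern are all assumptions, and the sketch takes for granted that the input never already contains such a token. The second function relies on the PCRE backtracking verbs (*SKIP)(*FAIL), which PHP's preg functions accept.

```php
<?php
// Option 2: swap <pre> blocks for tokens, compress, then put the blocks back.
// $compress is any callable that takes and returns a string.
function compress_preserving_pre(string $html, callable $compress): string
{
    $blocks = array();
    $tokenized = preg_replace_callback(
        '~<pre\b[^>]*>.*?</pre\s*>~is',
        function (array $m) use (&$blocks) {
            // Comment-style token with no whitespace in it, so whitespace
            // patterns leave it intact.
            $token = '<!--PRE_BLOCK_' . count($blocks) . '-->';
            $blocks[$token] = $m[0];
            return $token;
        },
        $html
    );
    if ($tokenized === null) {
        return $html; // PCRE error (e.g. backtrack limit): leave input untouched
    }
    $compressed = $compress($tokenized);
    return $blocks ? strtr($compressed, $blocks) : $compressed;
}

// Option 1: one pattern that consumes a <pre> block, then discards that match
// and resumes scanning after it, so only whitespace outside <pre> is replaced.
function collapse_outside_pre(string $html): string
{
    $pattern = '~<pre\b[^>]*>.*?</pre\s*>(*SKIP)(*FAIL)|\s+~is';
    return preg_replace($pattern, ' ', $html) ?? $html;
}

// Usage with a stand-in compressor; the real search/replace arrays would go here.
$squash = function ($s) {
    return preg_replace('/\s+/', ' ', $s);
};
echo compress_preserving_pre("<p>a   b</p>\n<pre>x\n  y</pre>\n<p>c</p>", $squash);
```

The token approach leaves the existing patterns untouched. The (*SKIP)(*FAIL) approach would need the <pre> alternative prepended to each pattern that should ignore preformatted text.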

