
Regex help


loquela


Hi there,

I have about 8,000 HTML files. From each one I want to strip everything from the top of the file up to and including the </head> tag, and then strip the </html> tag as well.

 

Does anybody have any advice?

 

Cheers!


Can't really give you a suggestion about how to get your list of files into the $files array below unless you provide more details, but this should do the trick (I suggest you make backups first):

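$files = array('file1.html','file2.html','file3.html');
foreach($files as $file) {
  // read into a separate variable so $file still holds the filename
  $contents = file_get_contents($file);
  // drop everything up to and including </head>, then drop </html>
  $contents = preg_replace('~^.*?</head>(.*?)</html>(.*)$~is','$1$2',$contents);
  // write the stripped contents back to the same file
  file_put_contents($file, $contents);
}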
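
To see what a pattern of this shape does before running it on real files, here is a small sketch that applies it to an in-memory string, written with `~` delimiters on both ends. The helper name strip_head_and_html_close and the sample markup are made up for illustration and are not from the thread.

```php
<?php
// Hypothetical helper (name is an assumption, not from the thread):
// drops everything up to and including the first </head>, then removes
// the first </html> that follows it.
function strip_head_and_html_close(string $html): string
{
    // s: "." also matches newlines; i: tag names match case-insensitively.
    $result = preg_replace('~^.*?</head>(.*?)</html>(.*)$~is', '$1$2', $html);

    // preg_replace() returns null on a PCRE error; fall back to the input.
    return $result === null ? $html : $result;
}

// Made-up sample document for the demonstration.
$sample = "<html>\n<head><title>t</title></head>\n<body><p>Hi</p></body>\n</html>\n";
echo strip_head_and_html_close($sample);
```

If a file contains no </head> or no </html>, the pattern does not match and the text comes back unchanged, so such files would need separate handling.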

 

You might (probably will) also need to use set_time_limit or something to keep your script from timing out.
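
For what it's worth, here is a minimal sketch of how the pieces could fit together, assuming all of the files sit in a single directory and end in .html. The function name strip_directory, the .bak backup suffix, and the directory layout are assumptions for illustration; glob(), copy() and set_time_limit() are standard PHP functions.

```php
<?php
// Hypothetical batch runner (name and layout are assumptions).
// Returns the number of files that were rewritten.
function strip_directory(string $dir): int
{
    set_time_limit(0); // 0 removes the execution time limit for this script

    $changed = 0;
    // glob() can return false on error, so fall back to an empty list.
    $paths = glob(rtrim($dir, '/') . '/*.html') ?: [];

    foreach ($paths as $path) {
        $html = file_get_contents($path);
        if ($html === false) {
            continue; // unreadable file: skip it
        }

        $stripped = preg_replace('~^.*?</head>(.*?)</html>(.*)$~is', '$1$2', $html);
        if ($stripped === null || $stripped === $html) {
            continue; // PCRE error or no match: leave the file alone
        }

        copy($path, $path . '.bak');         // keep a backup next to the original
        file_put_contents($path, $stripped); // overwrite with the stripped text
        $changed++;
    }

    return $changed;
}
```

Calling strip_directory() with the path to the folder would process every .html file in it; files in subfolders are not touched by this sketch.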
