
Zlib (GZip) Questions - Archives within archives - how to read them iteratively?


thepip3r


PHP 5.3.1

Windows 2003/IIS 6

MySQL ??

 

I'm building a page that will import some information into a database. The source archives are generated programmatically, so I don't have any control over how they are presented.

 

Basically, I'm trying to use the Zlib reading functions to parse these archives, which contain both plain files and further archives, so I can extract the data I need and write it to a database.  The problem is the nesting: when I parse the parent file with gzfile() or gzdecode(), the text files inside come out readable, but once it hits the embedded gz files, it shows garbled/encoded information.  I know it's not much, but this is what I've tried so far:

 

<?php
include("func.php");
?>
<html>
<body>
<pre>
<?php 
$target = "tmp_dbg/"; 
$target = $target . basename( $_FILES['debug']['name']) ; 
$ok=1; 
if(move_uploaded_file($_FILES['debug']['tmp_name'], $target)) 
{
	echo "The file ". basename( $_FILES['debug']['name']). " has been uploaded<br />";
} 
else {
	echo "Sorry, there was a problem uploading your file.";
}

$file_name = $target;

$lines = gzfile($file_name);

//$lines = gzdecode($lines);
//print_r($lines);

foreach ($lines as $line) {
	echo $line;
}	

?> 
</pre>
</form>
</body>
</html>

 

Can anyone offer any guidance on how I would recursively parse a parent GZ file and read all files (including embedded GZ files and their associated content) so I can start parsing this data?

 

TIA!


Zlib can only compress a single stream of data. It has no ability to pack or unpack multiple files. It sounds like the archives you are working with were tarred first and then gzipped, which works because .tar bundles multiple uncompressed files into one archive. Without seeing the file, though, that is just a guess. Either way, Zlib itself cannot manage or manipulate multi-file archives.
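
One way to test that guess is to decompress the outer file and look at the first bytes of the result. The sketch below is only an illustration: `gunzip_file()` and `sniff_payload()` are hypothetical helper names, not part of Zlib. It assumes the inner archive, if it is a tar, uses the POSIX "ustar" header (very old tar files lack that marker and would be reported as 'other'). It reads through gzopen()/gzread() because gzdecode() is not available before PHP 5.4.

```php
<?php
// Hypothetical helper: read a whole .gz file into a string using the
// zlib stream functions. Returns false if the file cannot be opened.
function gunzip_file($path)
{
    $in = gzopen($path, 'rb');
    if ($in === false) {
        return false;
    }
    $out = '';
    while (!gzeof($in)) {
        $chunk = gzread($in, 8192);
        if ($chunk === false || $chunk === '') {
            break; // stop on a read error or an empty read
        }
        $out .= $chunk;
    }
    gzclose($in);
    return $out;
}

// Hypothetical helper: guess what a decompressed payload is from its
// magic bytes. Returns 'gzip', 'tar' or 'other'.
function sniff_payload($bytes)
{
    // Every gzip stream starts with the two bytes 0x1f 0x8b.
    if (strlen($bytes) >= 2 && substr($bytes, 0, 2) === "\x1f\x8b") {
        return 'gzip';
    }
    // POSIX tar headers carry the string "ustar" at offset 257.
    if (strlen($bytes) >= 262 && substr($bytes, 257, 5) === 'ustar') {
        return 'tar';
    }
    return 'other';
}
```

If `sniff_payload(gunzip_file('upload_file.gz'))` reports 'tar', the file is a gzipped tar archive and needs a tar reader; if it reports 'gzip', the payload is simply another gzip stream and can be decompressed again.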


Use the Archive_Tar class from the PEAR repository, then do something like this:

 

 

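$upload_file = 'upload_file.gz';

$temp = 'temp.tar';

// decompress the gzip file and save the resulting tar archive
// (gzinflate() expects a raw DEFLATE string, not a file name, so read through gzopen() instead)

$in = gzopen ( $upload_file, 'rb' ) or die ( 'Could not open the uploaded file!' );

$out = fopen ( $temp, 'wb' );

while ( ! gzeof ( $in ) )
{
	fwrite ( $out, gzread ( $in, 8192 ) );
}

gzclose ( $in );

fclose ( $out );

// require the tar archive class (a PEAR install provides it as Archive/Tar.php; adjust the path for a local copy)

require ( 'Archive/Tar.php' );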

// open the tar archive

$tar = new Archive_Tar ( $temp );


// array to hold all the file names in the tar archive (for extracting)

$list = array ();

// array to hold all the files attributes in the tar archive (for whatever purpose)

$data = array ();


// temp directory where to put all the files that are in the tar archive


$tmp = '/tmp';


// remove part of the memorized file names (read tar_archive docs to understand this!)

$remove = '';


// read tar file contents

if ( ( $files = $tar->listContent () ) != 0 )
{
	foreach ( $files as $file )
	{
		$list[] = $file['filename'];

		$data[] = array ( $tmp . '/' . $file['filename'], $file['size'] . ' bytes', date ( 'm-d-Y', $file['mtime'] ) );
	}

	// extract all files in the tar archive

	$tar->extractList ( $list, $tmp, $remove ) or die ( 'Could not extract files!' );

	// now do stuff with all your files

	foreach ( $data AS $item )
	{
		$out = fopen ( $item[0], 'rb' );

		// do whatever

		fclose ( $out );
	}
}
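
For the embedded GZ files the original question asks about, one possible follow-up step is to walk the extraction directory and decompress anything that still looks like gzip, repeating until nothing is left. The sketch below is an assumption about how that could be done, not part of Archive_Tar: `gunzip_to()`, `is_gzip_file()` and `expand_nested_gzip()` are hypothetical helpers, files are recognised by the gzip magic bytes rather than by extension, and the pass limit of 5 is an arbitrary safety cap. It sticks to functions available in PHP 5.3.

```php
<?php
// Hypothetical helper: stream-decompress the gzip file $src into $dst.
function gunzip_to($src, $dst)
{
    $in = gzopen($src, 'rb');
    if ($in === false) {
        return false;
    }
    $out = fopen($dst, 'wb');
    if ($out === false) {
        gzclose($in);
        return false;
    }
    while (!gzeof($in)) {
        $chunk = gzread($in, 8192);
        if ($chunk === false || $chunk === '') {
            break; // stop on a read error or an empty read
        }
        fwrite($out, $chunk);
    }
    gzclose($in);
    fclose($out);
    return true;
}

// Hypothetical helper: true if the file starts with the gzip magic bytes.
function is_gzip_file($path)
{
    $fh = fopen($path, 'rb');
    if ($fh === false) {
        return false;
    }
    $magic = fread($fh, 2);
    fclose($fh);
    return $magic === "\x1f\x8b";
}

// Hypothetical helper: expand every gzip file under $dir, pass after pass,
// so that a .gz inside a .gz is also unpacked. Returns the files written.
function expand_nested_gzip($dir, $maxPasses = 5)
{
    $expanded = array();
    for ($pass = 0; $pass < $maxPasses; $pass++) {
        // Collect first, then modify, so the directory is not changed mid-walk.
        $found = array();
        $walker = new RecursiveIteratorIterator(
            new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
        );
        foreach ($walker as $file) {
            if ($file->isFile() && is_gzip_file($file->getPathname())) {
                $found[] = $file->getPathname();
            }
        }
        if (count($found) === 0) {
            break; // nothing compressed is left
        }
        foreach ($found as $path) {
            // Drop a trailing ".gz"; otherwise append ".out" to avoid clobbering the source.
            $dst = preg_match('/\.gz$/i', $path) ? substr($path, 0, -3) : $path . '.out';
            if (gunzip_to($path, $dst)) {
                unlink($path);
                $expanded[] = $dst;
            }
        }
    }
    return $expanded;
}
```

Calling `expand_nested_gzip ( $tmp )` after the extraction step would leave only uncompressed files in the directory. An embedded .tar.gz comes out as a .tar, which would still need another Archive_Tar pass.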
