
Reading binary (data structures) files help


ardenphp

Hi,

I'm trying to read a binary file based on its defined format. I'm having some success, but I don't understand how to do it correctly.

The file format is described below (see the end of http://www.watto.org/specs.html?specs=Archive_U_Generic and also https://wiki.beyondunreal.com/Legacy:Package_File_Format):

// ARCHIVE HEADER
 4 - Unreal Header (193,131,42,158)
 2 - Version (100)
 2 - License Mode (57)
 2 - Package Flags (1)
 2 - Package Flags (0)
 4 - Number Of Names
 4 - Name Directory Offset
 4 - Number Of Files
 4 - File Directory Offset
 4 - Number Of Types
 4 - Type Directory Offset
16 - GUID Hash

I've been reading in some data from a small test file with the below code (https://arden.ie/readfile.php)

<?php
$filename = "test.utx";
$handle   = fopen($filename, "r");
echo "<pre>\n";

$contents = fread($handle, 4); // 4 bytes
echo bin2hex($contents)." - File Signature (".hexdec(bin2hex($contents[0])).",".hexdec(bin2hex($contents[1])).",".hexdec(bin2hex($contents[2])).",".hexdec(bin2hex($contents[3])).")<br>\n";

$contents = fread($handle, 2); // 2 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - Version<BR>\n";

$contents = fread($handle, 2); // 2 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - License Mode<BR>\n";

$contents = fread($handle, 2); // 2 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - Package Flags<BR>\n";

$contents = fread($handle, 2); // 2 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - Package Flags<BR>\n";

$contents = fread($handle, 4); // 4 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - Number Of Names<br>\n";

$contents = fread($handle, 4); // 4 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - Name Directory Offset<br>\n";

$contents = fread($handle, 4); // 4 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - Number Of Files<br>\n";

$contents = fread($handle, 4); // 4 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - File Directory Offset<br>\n";

$contents = fread($handle, 4); // 4 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - Number Of Types<br>\n";

$contents = fread($handle, 4); // 4 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - Type Directory Offset<br>\n";


$contents = fread($handle, 1); // 1 bytes
$d = bin2hex($contents);

$contents = fread($handle, 1); // 1 bytes
$c = bin2hex($contents);

$contents = fread($handle, 1); // 1 bytes
$b = bin2hex($contents);

$contents = fread($handle, 1); // 1 bytes
$a = bin2hex($contents);
//echo $a.$b.$c.$d."\n";

$contents = fread($handle, 1); // 1 bytes
$f = bin2hex($contents);

$contents = fread($handle, 1); // 1 bytes
$e = bin2hex($contents);
//echo $a.$b.$c.$d."-".$e.$f."\n";

$contents = fread($handle, 1); // 1 bytes
$h = bin2hex($contents);

$contents = fread($handle, 1); // 1 bytes
$g = bin2hex($contents);
//echo $a.$b.$c.$d."-".$e.$f."-".$g.$h."\n";

$contents = fread($handle, 2); // 2 bytes
$i = bin2hex($contents);
//echo $a.$b.$c.$d."-".$e.$f."-".$g.$h."-".$i."\n";

$contents = fread($handle, 6); // 6 bytes
$j = bin2hex($contents);

echo $a.$b.$c.$d."-".$e.$f."-".$g.$h."-".$i."-".$j."\n";







fseek($handle, 0); // go back to start of file

$data = fread ($handle, filesize($filename))
    or die ("Could not read data from file $filename");

# Create the format for unpacking the header data
$header_format = 
    'IUnreal Header/' .               
    'IVersion/' .               
    'ILicense Mode/' .               
    'IPackage Flags(1)/' .       
    'LPackage Flags(0)/' .      
    'INumber Of Names/' .       
    'IName Directory Offset/' . 
	'INumber Of Files/' .       
	'IFile Directory Offset/' . 
	'INumber Of Types/' .       
	'IType Directory Offset/' . 
	'IGUID Hash' ;              

# Unpack the header data
$header = unpack($header_format, $data);
//$targetRecord =$header_format['L# of Records'];

print_r($header);

echo "<BR><BR>\n".bin2hex($header['Unreal Header'])."<br>\n";
echo $header['Unreal Header'];

echo "</pre>\n";
fclose($handle);
?>

which outputs the following

c1832a9e - File Signature (193,131,42,158)

7f00 - 127 - Version

1d00 - 29 - License Mode

2100 - 33 - Package Flags

0000 - 0 - Package Flags

1f000000 - 31 - Number Of Names

48000000 - 72 - Name Directory Offset

07000000 - 7 - Number Of Files

f1020000 - 241 - File Directory Offset

06000000 - 6 - Number Of Types

c7020000 - 199 - Type Directory Offset

e484d857-00b7-4107-a58a-36ff29f6a3a5
Array
(
    [Unreal Header] => 2653586369
    [Version] => 1900671
    [License Mode] => 33
    [Package Flags(1)] => 31
    [Package Flags(0)] => 72
    [Number Of Names] => 7
    [Name Directory Offset] => 753
    [Number Of Files] => 6
    [File Directory Offset] => 711
    [Number Of Types] => 3833911383
    [Type Directory Offset] => 1090977975
    [GUID Hash] => 4281764517
)



32363533353836333639

2653586369

 

I know the way I'm doing it is incredibly inefficient, and probably wrong, and the byte order of the GUID doesn't seem correct either. Could someone please help me understand how to read this kind of data from a file and how to work with it to get the correct values?

The file I'm reading is a texture file (from the game unreal tournament), you can grab a copy here - https://arden.ie/test.utx

Below is the start of the file contents, for comparison with the output above.

file.png

 

Thanks for all your help!

arden

Edited by ardenphp

Updated the code somewhat and getting a little closer, but still struggling.

$filename = "test.utx";
$handle   = fopen($filename, "r");
echo "<pre>\n";

$contents = fread($handle, 4); // 4 bytes
echo bin2hex($contents)." - File Signature (".hexdec(bin2hex($contents[0])).",".hexdec(bin2hex($contents[1])).",".hexdec(bin2hex($contents[2])).",".hexdec(bin2hex($contents[3])).")<br>\n";

$contents = fread($handle, 2); // 2 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - Version<BR>\n";

$contents = fread($handle, 2); // 2 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - License Mode<BR>\n";

$contents = fread($handle, 2); // 2 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - Package Flags<BR>\n";

$contents = fread($handle, 2); // 2 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - Package Flags<BR>\n";

$contents = fread($handle, 4); // 4 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - Number Of Names<br>\n";

$contents = fread($handle, 4); // 4 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - Name Directory Offset<br>\n";

$contents = fread($handle, 4); // 4 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - Number Of Files<br>\n";

$contents = fread($handle, 4); // 4 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - File Directory Offset<br>\n";

$contents = fread($handle, 4); // 4 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - Number Of Types<br>\n";

$contents = fread($handle, 4); // 4 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - Type Directory Offset<br>\n";


$contents = fread($handle, 1); // 1 bytes
$d = bin2hex($contents);

$contents = fread($handle, 1); // 1 bytes
$c = bin2hex($contents);

$contents = fread($handle, 1); // 1 bytes
$b = bin2hex($contents);

$contents = fread($handle, 1); // 1 bytes
$a = bin2hex($contents);
//echo $a.$b.$c.$d."\n";

$contents = fread($handle, 1); // 1 bytes
$f = bin2hex($contents);

$contents = fread($handle, 1); // 1 bytes
$e = bin2hex($contents);
//echo $a.$b.$c.$d."-".$e.$f."\n";

$contents = fread($handle, 1); // 1 bytes
$h = bin2hex($contents);

$contents = fread($handle, 1); // 1 bytes
$g = bin2hex($contents);
//echo $a.$b.$c.$d."-".$e.$f."-".$g.$h."\n";

$contents = fread($handle, 2); // 2 bytes
$i = bin2hex($contents);
//echo $a.$b.$c.$d."-".$e.$f."-".$g.$h."-".$i."\n";

$contents = fread($handle, 6); // 6 bytes
$j = bin2hex($contents);

echo $a.$b.$c.$d."-".$e.$f."-".$g.$h."-".$i."-".$j."<BR><HR>\n";





fseek($handle, 0); // go back to start of file

$data = fread ($handle, filesize($filename))
    or die ("Could not read data from file $filename");

# Create the format for unpacking the header data
$header_format = 
    'NUnreal Header/' .               
    'sVersion/' .               
    'sLicense Mode/' .               
    'sPackage Flags(1)/' .       
    'sPackage Flags(0)/' .      
    'iNumber Of Names/' .       
    'iName Directory Offset/' . 
	'iNumber Of Files/' .       
	'IFile Directory Offset/' . 
	'iNumber Of Types/' .       
	'IType Directory Offset/' . 
	'H*GUID Hash' ;              

# Unpack the header data
$header = unpack($header_format, $data);

print_r($header);
echo "<BR>\n";

echo bin_to_uuid($header['GUID Hash']);

function uuid_to_bin($uuid){
  return pack("H*", str_replace('-', '', $uuid));
}

function bin_to_uuid($bin){
  return join("-", unpack("H8time_low/H4time_mid/H4time_hi/H4clock_seq_hi/H12clock_seq_low", $bin));
}

echo "</pre>\n";
fclose($handle);

output...

c1832a9e - File Signature (193,131,42,158)

7f00 - 127 - Version

1d00 - 29 - License Mode

2100 - 33 - Package Flags

0000 - 0 - Package Flags

1f000000 - 31 - Number Of Names

48000000 - 72 - Name Directory Offset

07000000 - 7 - Number Of Files

f1020000 - 241 - File Directory Offset

06000000 - 6 - Number Of Types

c7020000 - 199 - Type Directory Offset

e484d857-00b7-4107-a58a-36ff29f6a3a5

Array
(
    [Unreal Header] => 3246598814
    [Version] => 127
    [License Mode] => 29
    [Package Flags(1)] => 33
    [Package Flags(0)] => 0
    [Number Of Names] => 31
    [Name Directory Offset] => 72
    [Number Of Files] => 7
    [File Directory Offset] => 753
    [Number Of Types] => 6
    [Type Directory Offset] => 711
    [GUID Hash] => 57d884e4b7000741a58a36ff29f6a3a502000000070000001e000000070000001f000000054e6f6e6500100407040d496e7465726e616c54696d650010000700084d69705a65726f001000070006436f6c6f720010040704065642697473001000070007466f726d617400100007000655426974730010000700065553697a650010000700065653697a6500100007000756436c616d7000100007000755436c616d7000100007000f546578436f6f7264536f757263650010000700094d6174657269616c0010000700084469666675736500100007000953706563756c61720010000700075368616465720010000700085061636b616765001004070407456e67696e65001000070405436f726500100007040a546578456e764d6170001000070006436c6173730010040704085465787475726500100007040d324b3454726f70687954455800100007000f74726f706879544558545552455300100007000c54726f706879317368616400100007000e54726f7068793152454674657800100007001454726f706879454e567265666c656374696f6e00100007000c54726f706879316261736500100007001454726f7068793152454673696c76657254455800100007001174726f7068793152454673696c7665720010000700054e6f6e650010040704022a03101b1f3f01229de0399c01a20181b010410501030601080401080722000100000822000100000a22000100000922000100001e00022a031818183f012260ce692d01a201c2b110410501030601080401080722000100000822000100000a22000100000922000100001e000b010b0c05011e022a03000000130122e423cc4c01a201d2a1104105010506010a04010a0722000400000822000400000a22000400000922000400001e000d05040e05031e0b010b0c05021e1e121000000000111214ffffffff151214ffffffff131214faffffff101214ffffffff0f121000000000128200070000001904000f00374c088200070000001c04000f003743098300070000001a04000f00077a098200070000001b04000f0037410a8500070000001804000f0007780a8300070000001d04000f00077f0a840000000000170400070001460b
)


35376438-3834-6534-6237-303030373431

 

Edited by ardenphp

  • Solution

The one helpful part missing from your post is stating what the correct values are supposed to be. Because e484d857-00b7-4107-a58a-36ff29f6a3a5 looks like a correct GUID to me.

Typically, though, one deals with binary file data by reading an entire "block" of stuff at once and then unpacking it.

$data = fread($handle, 52);
print_r(unpack("C4header/vversion/vlicense/v2flags/Vnames/Vnameoffset/Vfiles/Vfileoffset/Vtypes/Vtypeoffset/C16guid", $data));
Array
(
    [header1] => 193
    [header2] => 131
    [header3] => 42
    [header4] => 158
    [version] => 127
    [license] => 29
    [flags1] => 33
    [flags2] => 0
    [names] => 31
    [nameoffset] => 72
    [files] => 7
    [fileoffset] => 753
    [types] => 6
    [typeoffset] => 711
    [guid1] => 228
    [guid2] => 132
    [guid3] => 216
    [guid4] => 87
    [guid5] => 0
    [guid6] => 183
    [guid7] => 65
    [guid8] => 7
    [guid9] => 165
    [guid10] => 138
    [guid11] => 54
    [guid12] => 255
    [guid13] => 41
    [guid14] => 246
    [guid15] => 163
    [guid16] => 165
)

 


Thanks requinix!

Yes, having the original values would help, but I only have some of them.

I've updated my code to use yours. I never knew about the C16 option for reading the GUID. Handy!

I've noticed though that the byte order is different to your output, I'm guessing that's because there is an endian issue somewhere?

Might you know how I can correct this?

I verified the values by opening the same file in another app.

fileinfo.png

I'm now getting the below output

[Unreal Header1] => 193
    [Unreal Header2] => 131
    [Unreal Header3] => 42
    [Unreal Header4] => 158
    [Version] => 127
    [License Mode] => 29
    [Package Flags1] => 33
    [Package Flags2] => 0
    [Number Of Names] => 31
    [Name Directory Offset] => 72
    [Number Of Files] => 7
    [File Directory Offset] => 753
    [Number Of Types] => 6
    [Type Directory Offset] => 711
    [GUID Hash1] => 87
    [GUID Hash2] => 216
    [GUID Hash3] => 132
    [GUID Hash4] => 228
    [GUID Hash5] => 183
    [GUID Hash6] => 0
    [GUID Hash7] => 7
    [GUID Hash8] => 65
    [GUID Hash9] => 165
    [GUID Hash10] => 138
    [GUID Hash11] => 54
    [GUID Hash12] => 255
    [GUID Hash13] => 41
    [GUID Hash14] => 246
    [GUID Hash15] => 163
    [GUID Hash16] => 165

 


6 hours ago, ardenphp said:

I've noticed though that the byte order is different to your output, I'm guessing that's because there is an endian issue somewhere?

Take a look at what's happening with the data. Match up the expected values with the actual values...

You can keep the C16 but you'll have to piece together the hashes in a different order than 1-16.
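
As a rough sketch of what "a different order than 1-16" could look like: the helper below (a hypothetical name, not from the thread) takes the 16 values produced by a `C16` unpack and reverses the bytes of the first three GUID fields while leaving the last eight bytes in file order. The sample bytes are the 16 GUID bytes from test.utx as shown in the output above; the field layout is assumed to be the usual 4-2-2-2-6 grouping.

```php
<?php
// Hypothetical helper: formats the 16 values from unpack('C16...', ...)
// as a GUID string. The first three fields are assumed to be stored
// little-endian in the file, so their bytes are reversed; the last
// eight bytes are kept in file order.
function guidFromBytes(array $b): string
{
    // 1-based positions, matching the 1..16 keys produced by unpack().
    $order = [4, 3, 2, 1, 6, 5, 8, 7, 9, 10, 11, 12, 13, 14, 15, 16];
    $hex = '';
    foreach ($order as $pos) {
        $hex .= sprintf('%02x', $b[$pos]);
    }
    return sprintf(
        '%s-%s-%s-%s-%s',
        substr($hex, 0, 8),
        substr($hex, 8, 4),
        substr($hex, 12, 4),
        substr($hex, 16, 4),
        substr($hex, 20, 12)
    );
}

// The 16 GUID bytes in the order they appear in the file.
$raw   = hex2bin('57d884e4b7000741a58a36ff29f6a3a5');
$bytes = unpack('C16', $raw); // keys 1..16
echo guidFromBytes($bytes), PHP_EOL; // e484d857-00b7-4107-a58a-36ff29f6a3a5
```

Because `C` reads one byte at a time, this reordering gives the same result on any host; nothing in it depends on the machine's byte order.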

Edited by requinix
derp

Yes, my only concern with rearranging the data is that the result won't match on some other systems (like yours).

e.g.

In my first code example I rearranged the data to get the correct hash, but your output didn't need to be rearranged at all; the numbers were already in the correct order. Even in a hex editor, the data in the file is in a different order from the verified hash.

filehash.png

I came across the following post (https://github.com/webpatser/laravel-uuid/issues/11; note the link in the last post too), which confirms that rearranging the data, as I was already doing, gives the correct hash. It's described there as an endian issue, though, and since your output didn't show it, I would presumably get a different GUID on your system and the verification would fail. I could do a second check against the original order just to be sure, but that's more of a band-aid than a fix, and doing this thousands of times in one go would not be very efficient.

All in all though, I'm much closer to a cleaner function and really appreciate your guidance.

arden


Slight update to script to address padding issue now.

$filename = "test.utx";
$handle   = fopen($filename, "r");
echo "<pre>\n";

$data = fread($handle, 52) or die ("Could not read data from file $filename");
$header = unpack("C4Unreal Header/vVersion/vLicense Mode/v2Package Flags/VNumber Of Names/VName Directory Offset/VNumber Of Files/VFile Directory Offset/VNumber Of Types/VType Directory Offset/C16GUID Hash", $data);
print_r($header);
echo "<HR>\n";

if(isLittleEndian()!=1) // rearrange to get correct hash
{ // order of file
echo strtoupper(str_pad(dechex($header['GUID Hash1']),  2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash2']),  2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash3']),  2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash4']),  2, "0", STR_PAD_LEFT)."-".
	            str_pad(dechex($header['GUID Hash5']),  2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash6']),  2, "0", STR_PAD_LEFT)."-".
	            str_pad(dechex($header['GUID Hash7']),  2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash8']),  2, "0", STR_PAD_LEFT)."-".
	            str_pad(dechex($header['GUID Hash9']),  2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash10']), 2, "0", STR_PAD_LEFT)."-".
	            str_pad(dechex($header['GUID Hash11']), 2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash12']), 2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash13']), 2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash14']), 2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash15']), 2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash16']), 2, "0", STR_PAD_LEFT))."\n";	
}
else
{ // rearrange
echo strtoupper(str_pad(dechex($header['GUID Hash4']),  2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash3']),  2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash2']),  2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash1']),  2, "0", STR_PAD_LEFT)."-".
	            str_pad(dechex($header['GUID Hash6']),  2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash5']),  2, "0", STR_PAD_LEFT)."-".
	            str_pad(dechex($header['GUID Hash8']),  2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash7']),  2, "0", STR_PAD_LEFT)."-".
	            str_pad(dechex($header['GUID Hash9']),  2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash10']), 2, "0", STR_PAD_LEFT)."-".
	            str_pad(dechex($header['GUID Hash11']), 2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash12']), 2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash13']), 2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash14']), 2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash15']), 2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash16']), 2, "0", STR_PAD_LEFT))."\n";	
}

echo "</pre>\n";
fclose($handle);

function isLittleEndian() { return unpack('S',"\x01\x00")[1] === 1; }

function swapEndianness($hex) { return implode('', array_reverse(str_split($hex, 2))); }

output..

Array
(
    [Unreal Header1] => 193
    [Unreal Header2] => 131
    [Unreal Header3] => 42
    [Unreal Header4] => 158
    [Version] => 127
    [License Mode] => 29
    [Package Flags1] => 33
    [Package Flags2] => 0
    [Number Of Names] => 31
    [Name Directory Offset] => 72
    [Number Of Files] => 7
    [File Directory Offset] => 753
    [Number Of Types] => 6
    [Type Directory Offset] => 711
    [GUID Hash1] => 87
    [GUID Hash2] => 216
    [GUID Hash3] => 132
    [GUID Hash4] => 228
    [GUID Hash5] => 183
    [GUID Hash6] => 0
    [GUID Hash7] => 7
    [GUID Hash8] => 65
    [GUID Hash9] => 165
    [GUID Hash10] => 138
    [GUID Hash11] => 54
    [GUID Hash12] => 255
    [GUID Hash13] => 41
    [GUID Hash14] => 246
    [GUID Hash15] => 163
    [GUID Hash16] => 165
)

E484D857-00B7-4107-A58A-36FF29F6A3A5

Hopefully this will address the issues that have presented themselves trying to get to this point.

Thanks again for your help!


I usually define functions that correspond to the types/blocks in the defined file format and use those to read it piece by piece.  For this, I created some readDWORD, readWORD, and readGUID functions.  The GUID took a bit to figure out, and I'm still not 100% sure it's correct but it makes some sense and matches your example.

<?php

$file = 'test.utx';
$fp = fopen($file, 'rb');

$sig = readDWORD($fp);
if ($sig !== 0x9E2A83C1){
    die('Invalid file format');
} else {
    echo 'Valid file.' . PHP_EOL;
}

echo 'Version: ' . ($version = readWORD($fp)) . PHP_EOL;
echo 'License mode: ' . readWORD($fp) . PHP_EOL;
echo 'Package flags: ' . readDWORD($fp) . PHP_EOL;
echo 'Name count: ' . readDWORD($fp) . PHP_EOL;
echo 'Name offset: ' . readDWORD($fp) . PHP_EOL;
echo 'Export count: ' . readDWORD($fp) . PHP_EOL;
echo 'Export offset: ' . readDWORD($fp) . PHP_EOL;
echo 'Import count: ' . readDWORD($fp) . PHP_EOL;
echo 'Import offset: ' . readDWORD($fp) . PHP_EOL;
if ($version >= 68){
    echo 'GUID: ' . readGUID($fp) . PHP_EOL;
}

function readDWORD($fp) : int{
    return read($fp, 4, 'V');
}

function readWORD($fp) : int{
    return read($fp, 2, 'v');
}

function readGUID($fp) : string{
    $time_low=readDWORD($fp);
    $time_mid=readWORD($fp);
    $time_high_and_version=readWORD($fp);
    $clk_seq_hi_res=read($fp, 1, 'C');
    $clk_seq_low=read($fp, 1, 'C');
    $node=fread($fp, 6);

    return sprintf('%s-%s-%s-%s%s-%s'
        , bin2hex(pack('N', $time_low))
        , bin2hex(pack('n', $time_mid))
        , bin2hex(pack('n', $time_high_and_version))
        , bin2hex(pack('C', $clk_seq_hi_res))
        , bin2hex(pack('C', $clk_seq_low))
        , bin2hex($node)
    );
}

function read($fp, int $length, string $code){
    $bytes = fread($fp, $length);
    $parsed = unpack($code . 'parsed', $bytes);

    return $parsed['parsed'];
}

The file specification you linked says the file is encoded in Little Endian.  The UUID RFC says the representation is in Big Endian so we read the multi-byte components as little endian then convert them to big endian for display.

output

Valid file.
Version: 127
License mode: 29
Package flags: 33
Name count: 31
Name offset: 72
Export count: 7
Export offset: 753
Import count: 6
Import offset: 711
GUID: e484d857-00b7-4107-a58a-36ff29f6a3a5
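
One way to sanity-check the little-endian/big-endian explanation above is to run it in reverse: start from the displayed GUID and pack it back into the 16 bytes as they would sit in the file. This is only a sketch; `guidToFileBytes` is a hypothetical helper, and it assumes the same field layout as `readGUID` (one 32-bit field, two 16-bit fields, then eight bytes stored as written).

```php
<?php
// Hypothetical helper (not part of the posted code): converts a GUID string
// back into the 16 bytes as they would be stored in the file.
function guidToFileBytes(string $guid): string
{
    [$d1, $d2, $d3, $d4, $d5] = explode('-', $guid);
    return pack('V', hexdec($d1))   // 32-bit field, little-endian
         . pack('v', hexdec($d2))   // 16-bit field, little-endian
         . pack('v', hexdec($d3))   // 16-bit field, little-endian
         . hex2bin($d4 . $d5);      // remaining 8 bytes, kept as written
}

echo bin2hex(guidToFileBytes('e484d857-00b7-4107-a58a-36ff29f6a3a5')), PHP_EOL;
// 57d884e4b7000741a58a36ff29f6a3a5
```

The result matches the first 16 bytes of the `H*` dump posted earlier in the thread, which suggests the field layout assumed here is consistent with the file.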

 

Edited by kicken

Unless you're saying that the endianness in the file varies then all you have to do (for my code) is make sure you use the format codes which specifically say little/big endian. If it does vary in the file, which is possible, then it's probably according to your system, in which case you very likely just use the system-dependent formatters instead.
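
To illustrate the point about format codes, here is a small sketch using the four bytes of the File Directory Offset field from test.utx. `V` and `N` always decode as little-endian and big-endian respectively, whatever machine runs the script, while `C` reads single bytes and so has no byte order at all. It also shows why the first script printed 241 for this field: it only looked at the first byte.

```php
<?php
$bytes = "\xf1\x02\x00\x00"; // File Directory Offset as stored in test.utx

$le = unpack('V', $bytes)[1]; // 32-bit, always little-endian
$be = unpack('N', $bytes)[1]; // 32-bit, always big-endian
$c  = unpack('C4', $bytes);   // four separate bytes, keys 1..4

echo $le, PHP_EOL;            // 753
echo dechex($be), PHP_EOL;    // f1020000
echo $c[1], PHP_EOL;          // 241 (first byte only)
```

The machine-dependent codes (`S`, `L`, `I` and friends) are the only ones whose result can change between systems, so avoiding them for file data keeps the output the same everywhere.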


Thanks kicken!

Yes, this seems like a much cleaner approach than mine, and object oriented! I was able to adjust the code for the complete header format, and after testing with a few different files the data seems to be clean!

I might move it to a memory stream or buffer to save on constant reads going back and forth, but I'll do some speed tests first to justify the work.

Really appreciate all the guidance!

$fp = fopen($file, 'rb');
echo "<pre>\n";

$sig = readDWORD($fp);
if ($sig !== 0x9E2A83C1){
    die('Not an Unreal File.');
} else {
    echo 'Unreal File found. (0x' . StrToUpper(dechex($sig)) . ')' . PHP_EOL;
}
echo "\n";

echo 'Version:          ' . ($version = readWORD($fp)) . PHP_EOL;
echo 'License mode:     ' . readWORD($fp)  . PHP_EOL;
echo 'Package flags:    ' . readDWORD($fp) . PHP_EOL;
echo 'Name count:       ' . readDWORD($fp) . PHP_EOL;
echo 'Name offset:      ' . readDWORD($fp) . PHP_EOL;
echo 'Export count:     ' . readDWORD($fp) . PHP_EOL;
echo 'Export offset:    ' . readDWORD($fp) . PHP_EOL;
echo 'Import count:     ' . readDWORD($fp) . PHP_EOL;
echo 'Import offset:    ' . readDWORD($fp) . PHP_EOL;
echo "\n";

if ($version < 68) // old format
{
	echo 'Heritage count:   '  . readDWORD($fp) . PHP_EOL;
	echo 'Heritage offset:  '  . readDWORD($fp) . PHP_EOL;
}
else { // newer format
    echo 'GUID:             ' . readGUID($fp) . PHP_EOL;
	echo "\n";
	echo 'Generation count: ' . ($generations = readDWORD($fp)) . PHP_EOL;
	echo "\n";
	
	for($i=0;$i<$generations;$i++)
	{
		echo 'Export count:     ' . readDWORD($fp) . PHP_EOL;
		echo 'Name count:       ' . readDWORD($fp) . PHP_EOL;
		echo "\n";
	}
}
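
The buffer idea mentioned above could be sketched roughly as follows: read the data into a string once, then decode from it with a moving position instead of calling fread() for every field. The class and method names are hypothetical and not from the thread; the sample bytes are the first 12 bytes of test.utx taken from the hex output earlier. The third argument to unpack() (a byte offset) needs PHP 7.1 or later, and the typed properties need PHP 7.4.

```php
<?php
// Sketch of a cursor over an in-memory string (hypothetical class).
final class BufferReader
{
    private string $data;
    private int $pos = 0;

    public function __construct(string $data)
    {
        $this->data = $data;
    }

    private function read(string $code, int $length)
    {
        if ($this->pos + $length > strlen($this->data)) {
            throw new RuntimeException('Read past end of buffer');
        }
        // Decode at the current position, then advance the cursor.
        $value = unpack($code, $this->data, $this->pos)[1];
        $this->pos += $length;
        return $value;
    }

    public function readWORD(): int  { return $this->read('v', 2); }
    public function readDWORD(): int { return $this->read('V', 4); }
}

// First 12 bytes of test.utx: signature, version, license mode, package flags.
$reader = new BufferReader(hex2bin('c1832a9e7f001d0021000000'));
printf("0x%X\n", $reader->readDWORD()); // signature
echo $reader->readWORD(), PHP_EOL;       // version
echo $reader->readWORD(), PHP_EOL;       // license mode
echo $reader->readDWORD(), PHP_EOL;      // package flags
```

With a real file, the constructor argument would be something like `file_get_contents('test.utx')`. Whether this is actually faster than repeated fread() calls is worth measuring, since PHP's file streams are already buffered.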

 


17 hours ago, requinix said:

Unless you're saying that the endianness in the file varies then all you have to do (for my code) is make sure you use the format codes which specifically say little/big endian. If it does vary in the file, which is possible, then it's probably according to your system, in which case you very likely just use the system-dependent formatters instead.

Sorry for the confusion. When you posted the code and output, I saw that the order was different from mine (yours was correct, in readable format). As far as I can tell, the file format will remain the same. As our outputs were different but the code was the same (once I started using your code), I assumed the difference was between our systems. Looking at the PHP pack/unpack functions, there are endian options for numbers but not for characters/strings, so again I figured it was system specific.

I've yet to do some more testing, but as in kicken's comment regarding the GUID and his code example, hopefully this is all that's needed.

Thanks again!

 

 

