ardenphp Posted October 30, 2022 (edited)

Hi,

I'm trying to read a binary file based on its defined format. I'm having some success, but I don't understand how it should be done correctly. The file format is below (from http://www.watto.org/specs.html?specs=Archive_U_Generic, at the end of the page, and https://wiki.beyondunreal.com/Legacy:Package_File_Format):

// ARCHIVE HEADER
4 - Unreal Header (193,131,42,158)
2 - Version (100)
2 - License Mode (57)
2 - Package Flags (1)
2 - Package Flags (0)
4 - Number Of Names
4 - Name Directory Offset
4 - Number Of Files
4 - File Directory Offset
4 - Number Of Types
4 - Type Directory Offset
16 - GUID Hash

I've been reading some data from a small test file with the code below (https://arden.ie/readfile.php):

<?php
$filename = "test.utx";
$handle = fopen($filename, "r");
echo "<pre>\n";
$contents = fread($handle, 4); // 4 bytes
echo bin2hex($contents)." - File Signature (".hexdec(bin2hex($contents[0])).",".hexdec(bin2hex($contents[1])).",".hexdec(bin2hex($contents[2])).",".hexdec(bin2hex($contents[3])).")<br>\n";
$contents = fread($handle, 2); // 2 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - Version<BR>\n";
$contents = fread($handle, 2); // 2 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - License Mode<BR>\n";
$contents = fread($handle, 2); // 2 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - Package Flags<BR>\n";
$contents = fread($handle, 2); // 2 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - Package Flags<BR>\n";
$contents = fread($handle, 4); // 4 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - Number Of Names<br>\n";
$contents = fread($handle, 4); // 4 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - Name Directory Offset<br>\n";
$contents = fread($handle, 4); // 4 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - Number Of Files<br>\n";
$contents = fread($handle, 4); // 4 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - File Directory Offset<br>\n";
$contents = fread($handle, 4); // 4 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - Number Of Types<br>\n";
$contents = fread($handle, 4); // 4 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - Type Directory Offset<br>\n";
$contents = fread($handle, 1); // 1 bytes
$d = bin2hex($contents);
$contents = fread($handle, 1); // 1 bytes
$c = bin2hex($contents);
$contents = fread($handle, 1); // 1 bytes
$b = bin2hex($contents);
$contents = fread($handle, 1); // 1 bytes
$a = bin2hex($contents);
//echo $a.$b.$c.$d."\n";
$contents = fread($handle, 1); // 1 bytes
$f = bin2hex($contents);
$contents = fread($handle, 1); // 1 bytes
$e = bin2hex($contents);
//echo $a.$b.$c.$d."-".$e.$f."\n";
$contents = fread($handle, 1); // 1 bytes
$h = bin2hex($contents);
$contents = fread($handle, 1); // 1 bytes
$g = bin2hex($contents);
//echo $a.$b.$c.$d."-".$e.$f."-".$g.$h."\n";
$contents = fread($handle, 2); // 2 bytes
$i = bin2hex($contents);
//echo $a.$b.$c.$d."-".$e.$f."-".$g.$h."-".$i."\n";
$contents = fread($handle, 6); // 6 bytes
$j = bin2hex($contents);
echo $a.$b.$c.$d."-".$e.$f."-".$g.$h."-".$i."-".$j."\n";
fseek($handle, 0); // go back to start of file
$data = fread ($handle, filesize($filename)) or die ("Could not read data from file $filename");
# Create the format for unpacking the header data
$header_format =
    'IUnreal Header/' .
    'IVersion/' .
    'ILicense Mode/' .
    'IPackage Flags(1)/' .
    'LPackage Flags(0)/' .
    'INumber Of Names/' .
    'IName Directory Offset/' .
    'INumber Of Files/' .
    'IFile Directory Offset/' .
    'INumber Of Types/' .
    'IType Directory Offset/' .
    'IGUID Hash' ;
# Unpack the header data
$header = unpack($header_format, $data);
//$targetRecord =$header_format['L# of Records'];
print_r($header);
echo "<BR><BR>\n".bin2hex($header['Unreal Header'])."<br>\n";
echo $header['Unreal Header'];
echo "</pre>\n";
fclose($handle);
?>

which outputs the following:

c1832a9e - File Signature (193,131,42,158)
7f00 - 127 - Version
1d00 - 29 - License Mode
2100 - 33 - Package Flags
0000 - 0 - Package Flags
1f000000 - 31 - Number Of Names
48000000 - 72 - Name Directory Offset
07000000 - 7 - Number Of Files
f1020000 - 241 - File Directory Offset
06000000 - 6 - Number Of Types
c7020000 - 199 - Type Directory Offset
e484d857-00b7-4107-a58a-36ff29f6a3a5
Array
(
    [Unreal Header] => 2653586369
    [Version] => 1900671
    [License Mode] => 33
    [Package Flags(1)] => 31
    [Package Flags(0)] => 72
    [Number Of Names] => 7
    [Name Directory Offset] => 753
    [Number Of Files] => 6
    [File Directory Offset] => 711
    [Number Of Types] => 3833911383
    [Type Directory Offset] => 1090977975
    [GUID Hash] => 4281764517
)
32363533353836333639
2653586369

I know the way I'm doing it is incredibly inefficient and probably wrong, and the byte order of the GUID doesn't seem correct either. Could someone please help me understand how to read this kind of data from files, and how to work with it to get the correct values?

The file I'm reading is a texture file (from the game Unreal Tournament). You can grab a copy here: https://arden.ie/test.utx

Below is the start of the file's contents, for comparison with the output above.

Thanks for all your help!
arden

Edited October 30, 2022 by ardenphp
Link: https://forums.phpfreaks.com/topic/315475-reading-binary-data-structures-files-help/
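The per-field output above shows 241 for the bytes f1020000 because `$contents[0]` is only the first byte of the field. The following is a minimal sketch of the difference, assuming the fields are little-endian as the linked spec states. The sample bytes are copied from the output above, and the variable names are illustrative only. The `I` code used in the first attempt is machine-dependent, so the sketch uses `V` (which gives the same result on a little-endian host) to keep the example deterministic.

```php
<?php
// Bytes of the "File Directory Offset" field as printed above: f1 02 00 00.
$field = hex2bin('f1020000');

// What the original echo does: convert only the first byte of the field.
$firstByteOnly = hexdec(bin2hex($field[0]));

// 'V' decodes all four bytes as an unsigned 32-bit little-endian integer.
$wholeField = unpack('V', $field)[1];

echo $firstByteOnly, PHP_EOL; // 241
echo $wholeField, PHP_EOL;    // 753

// A 4-byte code applied where two 2-byte fields sit (Version 7f00 and
// License Mode 1d00) swallows both fields into a single value.
$versionAndLicense = hex2bin('7f001d00');
$merged = unpack('V', $versionAndLicense)[1];
echo $merged, PHP_EOL;        // 1900671, the bogus "Version" in the array above

// Reading them with 2-byte codes separates the fields again.
$split = unpack('vversion/vlicense', $versionAndLicense);
echo $split['version'], ' ', $split['license'], PHP_EOL; // 127 29
```

This is why every value after the signature in the `print_r` output is shifted: each field width in the format string has to match the width in the spec.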
ardenphp (Author) Posted October 30, 2022 (edited)

I've updated the code somewhat and I'm getting a little closer, but I'm still struggling.

$filename = "test.utx";
$handle = fopen($filename, "r");
echo "<pre>\n";
$contents = fread($handle, 4); // 4 bytes
echo bin2hex($contents)." - File Signature (".hexdec(bin2hex($contents[0])).",".hexdec(bin2hex($contents[1])).",".hexdec(bin2hex($contents[2])).",".hexdec(bin2hex($contents[3])).")<br>\n";
$contents = fread($handle, 2); // 2 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - Version<BR>\n";
$contents = fread($handle, 2); // 2 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - License Mode<BR>\n";
$contents = fread($handle, 2); // 2 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - Package Flags<BR>\n";
$contents = fread($handle, 2); // 2 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - Package Flags<BR>\n";
$contents = fread($handle, 4); // 4 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - Number Of Names<br>\n";
$contents = fread($handle, 4); // 4 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - Name Directory Offset<br>\n";
$contents = fread($handle, 4); // 4 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - Number Of Files<br>\n";
$contents = fread($handle, 4); // 4 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - File Directory Offset<br>\n";
$contents = fread($handle, 4); // 4 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - Number Of Types<br>\n";
$contents = fread($handle, 4); // 4 bytes
echo bin2hex($contents)." - ".hexdec(bin2hex($contents[0]))." - Type Directory Offset<br>\n";
$contents = fread($handle, 1); // 1 bytes
$d = bin2hex($contents);
$contents = fread($handle, 1); // 1 bytes
$c = bin2hex($contents);
$contents = fread($handle, 1); // 1 bytes
$b = bin2hex($contents);
$contents = fread($handle, 1); // 1 bytes
$a = bin2hex($contents);
//echo $a.$b.$c.$d."\n";
$contents = fread($handle, 1); // 1 bytes
$f = bin2hex($contents);
$contents = fread($handle, 1); // 1 bytes
$e = bin2hex($contents);
//echo $a.$b.$c.$d."-".$e.$f."\n";
$contents = fread($handle, 1); // 1 bytes
$h = bin2hex($contents);
$contents = fread($handle, 1); // 1 bytes
$g = bin2hex($contents);
//echo $a.$b.$c.$d."-".$e.$f."-".$g.$h."\n";
$contents = fread($handle, 2); // 2 bytes
$i = bin2hex($contents);
//echo $a.$b.$c.$d."-".$e.$f."-".$g.$h."-".$i."\n";
$contents = fread($handle, 6); // 6 bytes
$j = bin2hex($contents);
echo $a.$b.$c.$d."-".$e.$f."-".$g.$h."-".$i."-".$j."<BR><HR>\n";
fseek($handle, 0); // go back to start of file
$data = fread ($handle, filesize($filename)) or die ("Could not read data from file $filename");
# Create the format for unpacking the header data
$header_format =
    'NUnreal Header/' .
    'sVersion/' .
    'sLicense Mode/' .
    'sPackage Flags(1)/' .
    'sPackage Flags(0)/' .
    'iNumber Of Names/' .
    'iName Directory Offset/' .
    'iNumber Of Files/' .
    'IFile Directory Offset/' .
    'iNumber Of Types/' .
    'IType Directory Offset/' .
    'H*GUID Hash' ;
# Unpack the header data
$header = unpack($header_format, $data);
print_r($header);
echo "<BR>\n";
echo bin_to_uuid($header['GUID Hash']);
function uuid_to_bin($uuid){
    return pack("H*", str_replace('-', '', $uuid));
}
function bin_to_uuid($bin){
    return join("-", unpack("H8time_low/H4time_mid/H4time_hi/H4clock_seq_hi/H12clock_seq_low", $bin));
}
echo "</pre>\n";
fclose($handle);

Output:
c1832a9e - File Signature (193,131,42,158) 7f00 - 127 - Version 1d00 - 29 - License Mode 2100 - 33 - Package Flags 0000 - 0 - Package Flags 1f000000 - 31 - Number Of Names 48000000 - 72 - Name Directory Offset 07000000 - 7 - Number Of Files f1020000 - 241 - File Directory Offset 06000000 - 6 - Number Of Types c7020000 - 199 - Type Directory Offset e484d857-00b7-4107-a58a-36ff29f6a3a5 Array ( [Unreal Header] => 3246598814 [Version] => 127 [License Mode] => 29 [Package Flags(1)] => 33 [Package Flags(0)] => 0 [Number Of Names] => 31 [Name Directory Offset] => 72 [Number Of Files] => 7 [File Directory Offset] => 753 [Number Of Types] => 6 [Type Directory Offset] => 711 [GUID Hash] => 57d884e4b7000741a58a36ff29f6a3a502000000070000001e000000070000001f000000054e6f6e6500100407040d496e7465726e616c54696d650010000700084d69705a65726f001000070006436f6c6f720010040704065642697473001000070007466f726d617400100007000655426974730010000700065553697a650010000700065653697a6500100007000756436c616d7000100007000755436c616d7000100007000f546578436f6f7264536f757263650010000700094d6174657269616c0010000700084469666675736500100007000953706563756c61720010000700075368616465720010000700085061636b616765001004070407456e67696e65001000070405436f726500100007040a546578456e764d6170001000070006436c6173730010040704085465787475726500100007040d324b3454726f70687954455800100007000f74726f706879544558545552455300100007000c54726f706879317368616400100007000e54726f7068793152454674657800100007001454726f706879454e567265666c656374696f6e00100007000c54726f706879316261736500100007001454726f7068793152454673696c76657254455800100007001174726f7068793152454673696c7665720010000700054e6f6e650010040704022a03101b1f3f01229de0399c01a20181b010410501030601080401080722000100000822000100000a22000100000922000100001e00022a031818183f012260ce692d01a201c2b110410501030601080401080722000100000822000100000a22000100000922000100001e000b010b0c05011e022a03000000130122e423cc4c01a201d2a1104105010506010a04010a0722000400000822000400000a22000400000922000
400001e000d05040e05031e0b010b0c05021e1e121000000000111214ffffffff151214ffffffff131214faffffff101214ffffffff0f121000000000128200070000001904000f00374c088200070000001c04000f003743098300070000001a04000f00077a098200070000001b04000f0037410a8500070000001804000f0007780a8300070000001d04000f00077f0a840000000000170400070001460b )
35376438-3834-6534-6237-303030373431
Edited October 30, 2022 by ardenphp
requinix Posted October 30, 2022 (Solution)

The one helpful part missing from your post is a statement of what the correct values are supposed to be, because e484d857-00b7-4107-a58a-36ff29f6a3a5 looks like a correct GUID to me.

Typically, though, one deals with binary file data by reading an entire "block" of stuff at once and then unpacking it.

$data = fread($handle, 52);
print_r(unpack("C4header/vversion/vlicense/v2flags/Vnames/Vnameoffset/Vfiles/Vfileoffset/Vtypes/Vtypeoffset/C16guid", $data));

Array
(
    [header1] => 193
    [header2] => 131
    [header3] => 42
    [header4] => 158
    [version] => 127
    [license] => 29
    [flags1] => 33
    [flags2] => 0
    [names] => 31
    [nameoffset] => 72
    [files] => 7
    [fileoffset] => 753
    [types] => 6
    [typeoffset] => 711
    [guid1] => 228
    [guid2] => 132
    [guid3] => 216
    [guid4] => 87
    [guid5] => 0
    [guid6] => 183
    [guid7] => 65
    [guid8] => 7
    [guid9] => 165
    [guid10] => 138
    [guid11] => 54
    [guid12] => 255
    [guid13] => 41
    [guid14] => 246
    [guid15] => 163
    [guid16] => 165
)
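To try the block-read approach above without the test file, the header can be rebuilt in memory first. This is only a sketch: `parseHeader` is a hypothetical helper name, the sample values are the ones reported in this thread, and the GUID bytes are taken in file order from the hex dump in the earlier post. The length check is an addition, on the assumption that a short read should fail loudly rather than produce partial data.

```php
<?php
// Hypothetical helper: unpack the fixed 52-byte header described in the spec.
function parseHeader(string $data): array
{
    if (strlen($data) < 52) {
        throw new InvalidArgumentException('Header needs 52 bytes, got ' . strlen($data));
    }
    // C = unsigned byte, v = 16-bit little-endian, V = 32-bit little-endian.
    return unpack(
        'C4header/vversion/vlicense/v2flags/Vnames/Vnameoffset/'
        . 'Vfiles/Vfileoffset/Vtypes/Vtypeoffset/C16guid',
        $data
    );
}

// Sample header built from the values reported in the thread.
$guidBytes = hex2bin('57d884e4b7000741a58a36ff29f6a3a5'); // 16 bytes, file order
$sample = pack('C4v4V6', 193, 131, 42, 158, 127, 29, 33, 0, 31, 72, 7, 753, 6, 711)
    . $guidBytes;

$header = parseHeader($sample);

echo strlen($sample), PHP_EOL;       // 52
echo $header['version'], PHP_EOL;    // 127
echo $header['fileoffset'], PHP_EOL; // 753
echo $header['guid1'], PHP_EOL;      // 87 (0x57, the first GUID byte in the file)
```

With a real file, `$sample` would be replaced by the result of `fread($handle, 52)`.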
ardenphp (Author) Posted October 31, 2022

Thanks requinix! Yes, having the original values would help, but I only have some of them. I've updated my code and used yours. I never knew about the C16 option for reading the GUID. Handy!

I've noticed, though, that my byte order is different from your output. I'm guessing that's because there is an endian issue somewhere? Might you know how I can correct this? I verified the values by opening the same file in another app.

I'm now getting the output below:

[Unreal Header1] => 193
[Unreal Header2] => 131
[Unreal Header3] => 42
[Unreal Header4] => 158
[Version] => 127
[License Mode] => 29
[Package Flags1] => 33
[Package Flags2] => 0
[Number Of Names] => 31
[Name Directory Offset] => 72
[Number Of Files] => 7
[File Directory Offset] => 753
[Number Of Types] => 6
[Type Directory Offset] => 711
[GUID Hash1] => 87
[GUID Hash2] => 216
[GUID Hash3] => 132
[GUID Hash4] => 228
[GUID Hash5] => 183
[GUID Hash6] => 0
[GUID Hash7] => 7
[GUID Hash8] => 65
[GUID Hash9] => 165
[GUID Hash10] => 138
[GUID Hash11] => 54
[GUID Hash12] => 255
[GUID Hash13] => 41
[GUID Hash14] => 246
[GUID Hash15] => 163
[GUID Hash16] => 165
requinix Posted October 31, 2022 (edited)

6 hours ago, ardenphp said: I've noticed though that the byte order is different to your output, I'm guessing that's because there is an endian issue somewhere?

Take a look at what's happening with the data. Match up the expected values with the actual values... You can keep the C16 but you'll have to piece together the hashes in a different order than 1-16.

Edited October 31, 2022 by requinix (derp)
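The "different order than 1-16" hint can be sketched as an index table. The byte values below are the file-order `C16` values posted earlier in the thread; the `$order` table and variable names are illustrative assumptions, chosen so that the result matches the GUID the thread treats as verified.

```php
<?php
// GUID bytes in file order, as in the C16 output posted above (keys start at 1).
$guid = [1 => 87, 216, 132, 228, 183, 0, 7, 65, 165, 138, 54, 255, 41, 246, 163, 165];

// Display order: reverse the 4-byte group and the two 2-byte groups,
// and keep the last 8 bytes exactly as stored.
$order = [4, 3, 2, 1, 6, 5, 8, 7, 9, 10, 11, 12, 13, 14, 15, 16];

$hex = '';
foreach ($order as $i) {
    $hex .= sprintf('%02x', $guid[$i]); // always two hex digits per byte
}

// Insert dashes in the usual 8-4-4-4-12 layout.
$text = implode('-', [
    substr($hex, 0, 8),
    substr($hex, 8, 4),
    substr($hex, 12, 4),
    substr($hex, 16, 4),
    substr($hex, 20, 12),
]);

echo $text, PHP_EOL; // e484d857-00b7-4107-a58a-36ff29f6a3a5
```

Because `C` reads single bytes, this reordering gives the same result on any machine; only the order stored in the file matters.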
ardenphp (Author) Posted October 31, 2022

Yes, my only concern with regard to rearranging the data is that it won't match on some other systems (like yours). For example, in my first code example I was rearranging the positions of the bytes to get the correct hash, but your output didn't need to be rearranged at all; the numbers were already in the correct order. Even when looking at the file in a hex editor, the data is arranged differently from the verified hash.

I came across the following post (https://github.com/webpatser/laravel-uuid/issues/11; note the link in the last post too), which confirms the rearranging I was already doing to get the correct hash. It's described there as an endian issue, though, and as your output didn't show it, I would end up getting a different GUID on your system and the verification would fail, unless I also checked the original order just to be sure. That's more of a band-aid than a fix, and doing this thousands of times at a go would not be very efficient.

All in all, though, I'm much closer to a cleaner function and really appreciate your guidance.

arden
ardenphp (Author) Posted October 31, 2022

Slight update to the script to address the padding issue now.

$filename = "test.utx";
$handle = fopen($filename, "r");
echo "<pre>\n";
$data = fread($handle, 52) or die ("Could not read data from file $filename");
$header = unpack("C4Unreal Header/vVersion/vLicense Mode/v2Package Flags/VNumber Of Names/VName Directory Offset/VNumber Of Files/VFile Directory Offset/VNumber Of Types/VType Directory Offset/C16GUID Hash", $data);
print_r($header);
echo "<HR>\n";
if(isLittleEndian()!=1) // rearrange to get correct hash
{
    // order of file
    echo strtoupper(str_pad(dechex($header['GUID Hash1']), 2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash2']), 2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash3']), 2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash4']), 2, "0", STR_PAD_LEFT)."-".
    str_pad(dechex($header['GUID Hash5']), 2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash6']), 2, "0", STR_PAD_LEFT)."-".
    str_pad(dechex($header['GUID Hash7']), 2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash8']), 2, "0", STR_PAD_LEFT)."-".
    str_pad(dechex($header['GUID Hash9']), 2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash10']), 2, "0", STR_PAD_LEFT)."-".
    str_pad(dechex($header['GUID Hash11']), 2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash12']), 2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash13']), 2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash14']), 2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash15']), 2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash16']), 2, "0", STR_PAD_LEFT))."\n";
}
else
{
    // rearrange
    echo strtoupper(str_pad(dechex($header['GUID Hash4']), 2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash3']), 2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash2']), 2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash1']), 2, "0", STR_PAD_LEFT)."-".
    str_pad(dechex($header['GUID Hash6']), 2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash5']), 2, "0", STR_PAD_LEFT)."-".
    str_pad(dechex($header['GUID Hash8']), 2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash7']), 2, "0", STR_PAD_LEFT)."-".
    str_pad(dechex($header['GUID Hash9']), 2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash10']), 2, "0", STR_PAD_LEFT)."-".
    str_pad(dechex($header['GUID Hash11']), 2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash12']), 2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash13']), 2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash14']), 2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash15']), 2, "0", STR_PAD_LEFT).str_pad(dechex($header['GUID Hash16']), 2, "0", STR_PAD_LEFT))."\n";
}
echo "</pre>\n";
fclose($handle);
function isLittleEndian() {
    return unpack('S',"\x01\x00")[1] === 1;
}
function swapEndianness($hex) {
    return implode('', array_reverse(str_split($hex, 2)));
}

Output:

Array
(
    [Unreal Header1] => 193
    [Unreal Header2] => 131
    [Unreal Header3] => 42
    [Unreal Header4] => 158
    [Version] => 127
    [License Mode] => 29
    [Package Flags1] => 33
    [Package Flags2] => 0
    [Number Of Names] => 31
    [Name Directory Offset] => 72
    [Number Of Files] => 7
    [File Directory Offset] => 753
    [Number Of Types] => 6
    [Type Directory Offset] => 711
    [GUID Hash1] => 87
    [GUID Hash2] => 216
    [GUID Hash3] => 132
    [GUID Hash4] => 228
    [GUID Hash5] => 183
    [GUID Hash6] => 0
    [GUID Hash7] => 7
    [GUID Hash8] => 65
    [GUID Hash9] => 165
    [GUID Hash10] => 138
    [GUID Hash11] => 54
    [GUID Hash12] => 255
    [GUID Hash13] => 41
    [GUID Hash14] => 246
    [GUID Hash15] => 163
    [GUID Hash16] => 165
)
E484D857-00B7-4107-A58A-36FF29F6A3A5

Hopefully this will address the issues that have presented themselves in getting to this point. Thanks again for your help!
kicken Posted October 31, 2022 (edited)

I usually define functions that correspond to the types/blocks in the defined file format and use those to read it piece by piece. For this, I created some readDWORD, readWORD, and readGUID functions. The GUID took a bit to figure out, and I'm still not 100% sure it's correct, but it makes some sense and matches your example.

<?php
$file = 'test.utx';
$fp = fopen($file, 'rb');
$sig = readDWORD($fp);
if ($sig !== 0x9E2A83C1){
    die('Invalid file format');
} else {
    echo 'Valid file.' . PHP_EOL;
}
echo 'Version: ' . ($version = readWORD($fp)) . PHP_EOL;
echo 'License mode: ' . readWORD($fp) . PHP_EOL;
echo 'Package flags: ' . readDWORD($fp) . PHP_EOL;
echo 'Name count: ' . readDWORD($fp) . PHP_EOL;
echo 'Name offset: ' . readDWORD($fp) . PHP_EOL;
echo 'Export count: ' . readDWORD($fp) . PHP_EOL;
echo 'Export offset: ' . readDWORD($fp) . PHP_EOL;
echo 'Import count: ' . readDWORD($fp) . PHP_EOL;
echo 'Import offset: ' . readDWORD($fp) . PHP_EOL;
if ($version >= 68){
    echo 'GUID: ' . readGUID($fp) . PHP_EOL;
}

function readDWORD($fp) : int{
    return read($fp, 4, 'V');
}

function readWORD($fp) : int{
    return read($fp, 2, 'v');
}

function readGUID($fp) : string{
    $time_low=readDWORD($fp);
    $time_mid=readWORD($fp);
    $time_high_and_version=readWORD($fp);
    $clk_seq_hi_res=read($fp, 1, 'C');
    $clk_seq_low=read($fp, 1, 'C');
    $node=fread($fp, 6);
    return sprintf('%s-%s-%s-%s%s-%s'
        , bin2hex(pack('N', $time_low))
        , bin2hex(pack('n', $time_mid))
        , bin2hex(pack('n', $time_high_and_version))
        , bin2hex(pack('C', $clk_seq_hi_res))
        , bin2hex(pack('C', $clk_seq_low))
        , bin2hex($node)
    );
}

function read($fp, int $length, string $code){
    $bytes = fread($fp, $length);
    $parsed = unpack($code . 'parsed', $bytes);
    return $parsed['parsed'];
}

The file specification you linked says the file is encoded in Little Endian. The UUID RFC says the representation is in Big Endian, so we read the multi-byte components as little endian and then convert them to big endian for display.

Output:

Valid file.
Version: 127
License mode: 29
Package flags: 33
Name count: 31
Name offset: 72
Export count: 7
Export offset: 753
Import count: 6
Import offset: 711
GUID: e484d857-00b7-4107-a58a-36ff29f6a3a5

Edited October 31, 2022 by kicken
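The same little-endian-in, big-endian-out idea can also be written against a 16-byte string rather than a file handle. This is a sketch, not kicken's code: `guidToString` is a hypothetical name, the input is the GUID as stored in the test file according to the hex dump earlier in the thread, and a 64-bit PHP build is assumed so that `V` always yields a non-negative integer.

```php
<?php
// Hypothetical helper: format 16 raw GUID bytes (file order) as text.
function guidToString(string $raw): string
{
    if (strlen($raw) !== 16) {
        throw new InvalidArgumentException('A GUID is exactly 16 bytes');
    }
    // V and v decode the first three fields as little-endian integers;
    // C8 returns the remaining eight bytes one at a time (keys d1..d8).
    $p = unpack('Va/vb/vc/C8d', $raw);

    // Printing an integer as hex writes its most significant digits first,
    // which is the big-endian text form the UUID layout expects.
    return sprintf(
        '%08x-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x',
        $p['a'], $p['b'], $p['c'],
        $p['d1'], $p['d2'], $p['d3'], $p['d4'],
        $p['d5'], $p['d6'], $p['d7'], $p['d8']
    );
}

$fileBytes = hex2bin('57d884e4b7000741a58a36ff29f6a3a5');
$guidText = guidToString($fileBytes);
echo $guidText, PHP_EOL; // e484d857-00b7-4107-a58a-36ff29f6a3a5
```

Taking a string instead of a stream makes the helper easy to reuse with a header that has already been read in one block.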
requinix Posted October 31, 2022

Unless you're saying that the endianness in the file varies then all you have to do (for my code) is make sure you use the format codes which specifically say little/big endian. If it does vary in the file, which is possible, then it's probably according to your system, in which case you very likely just use the system-dependent formatters instead.
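The difference between the explicit and the system-dependent format codes can be seen with two bytes. A small sketch, using only documented `unpack()` codes; the variable names are illustrative:

```php
<?php
$bytes = "\x01\x00";

// 'v' is always little-endian and 'n' is always big-endian,
// whatever machine PHP runs on.
$little = unpack('v', $bytes)[1];
$big    = unpack('n', $bytes)[1];
echo $little, PHP_EOL; // 1
echo $big, PHP_EOL;    // 256

// 'S' follows the byte order of the machine running PHP, so its result
// can differ between hosts for the same input.
$native = unpack('S', $bytes)[1];
echo $native === 1 ? 'little-endian host' : 'big-endian host', PHP_EOL;

// 'C' reads one byte at a time, so byte order never applies to it.
$single = unpack('C2', $bytes);
print_r($single);
```

For a file format that is defined as little-endian, `v` and `V` therefore give the same values on every system, and no host check is needed.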
ardenphp (Author) Posted October 31, 2022

Thanks kicken! Yes, this seems like a much cleaner approach than mine, and object oriented! I was able to adjust the code for the complete header format, and I tested it with a few different files; the data seems to be clean! I might move it to a memory stream or buffer to save on constant reads going back and forth, but I'll do some speed tests first to justify the work. Really appreciate all the guidance!

$fp = fopen($file, 'rb');
echo "<pre>\n";
$sig = readDWORD($fp);
if ($sig !== 0x9E2A83C1){
    die('Not an Unreal File.');
} else {
    echo 'Unreal File found. (0x' . strtoupper(dechex($sig)) . ')' . PHP_EOL;
}
echo "\n";
echo 'Version: ' . ($version = readWORD($fp)) . PHP_EOL;
echo 'License mode: ' . readWORD($fp) . PHP_EOL;
echo 'Package flags: ' . readDWORD($fp) . PHP_EOL;
echo 'Name count: ' . readDWORD($fp) . PHP_EOL;
echo 'Name offset: ' . readDWORD($fp) . PHP_EOL;
echo 'Export count: ' . readDWORD($fp) . PHP_EOL;
echo 'Export offset: ' . readDWORD($fp) . PHP_EOL;
echo 'Import count: ' . readDWORD($fp) . PHP_EOL;
echo 'Import offset: ' . readDWORD($fp) . PHP_EOL;
echo "\n";
if ($version < 68) // old format
{
    echo 'Heritage count: ' . readDWORD($fp) . PHP_EOL;
    echo 'Heritage offset: ' . readDWORD($fp) . PHP_EOL;
}
else
{ // newer format
    echo 'GUID: ' . readGUID($fp) . PHP_EOL;
    echo "\n";
    echo 'Generation count: ' . ($generations = readDWORD($fp)) . PHP_EOL;
    echo "\n";
    for($i=0;$i<$generations;$i++)
    {
        // Per the linked wiki spec, each generation entry holds an export count and a name count
        echo 'Export count: ' . readDWORD($fp) . PHP_EOL;
        echo 'Name count: ' . readDWORD($fp) . PHP_EOL;
        echo "\n";
    }
}
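The "memory stream or buffer" idea mentioned above could look roughly like this. It is a sketch under assumptions: `BufferReader` is a hypothetical class, not part of any library, it needs PHP 7.4 or later for typed properties, and it relies on the optional offset argument of `unpack()` (available since PHP 7.1) so that no substring is copied per read.

```php
<?php
// Hypothetical buffer-based reader: keeps the bytes in memory and advances a cursor.
final class BufferReader
{
    private string $data;
    private int $pos = 0;

    public function __construct(string $data)
    {
        $this->data = $data;
    }

    public function readWORD(): int
    {
        return $this->read('v', 2); // 16-bit little-endian
    }

    public function readDWORD(): int
    {
        return $this->read('V', 4); // 32-bit little-endian
    }

    public function seek(int $offset): void
    {
        $this->pos = $offset; // e.g. jump to a name or export table offset
    }

    private function read(string $code, int $length): int
    {
        if ($this->pos + $length > strlen($this->data)) {
            throw new UnderflowException('Read past end of buffer');
        }
        // The third argument of unpack() is the byte offset to start at.
        $value = unpack($code, $this->data, $this->pos)[1];
        $this->pos += $length;
        return $value;
    }
}

// In real use the buffer might come from file_get_contents('test.utx').
$reader = new BufferReader(pack('C4v2V', 193, 131, 42, 158, 127, 29, 33));
printf("0x%X\n", $reader->readDWORD()); // signature, 0x9E2A83C1 on 64-bit PHP
echo $reader->readWORD(), PHP_EOL;      // version: 127
echo $reader->readWORD(), PHP_EOL;      // license mode: 29
echo $reader->readDWORD(), PHP_EOL;     // package flags: 33
```

Whether this is actually faster than repeated `fread()` calls on a buffered stream would need to be measured, as the post says; the main gain is cheap seeking to the directory offsets.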
ardenphp (Author) Posted November 1, 2022

17 hours ago, requinix said: Unless you're saying that the endianness in the file varies then all you have to do (for my code) is make sure you use the format codes which specifically say little/big endian. If it does vary in the file, which is possible, then it's probably according to your system, in which case you very likely just use the system-dependent formatters instead.

Sorry for the confusion. When you posted the code and output, I saw that the order was different from mine (yours was in the correct, readable order). As far as I can tell, the file format itself will remain the same. As our outputs were different but the code was the same (once I started using your code), I assumed the difference was between our systems. Looking at PHP's pack/unpack functions, there are endian options for numbers but not for characters/strings, so again I figured it was system specific.

I've yet to do some more testing, but going by kicken's comment about the GUID and his code example, hopefully this is all that's needed.

Thanks again!