bsmither Posted July 12, 2021
PHP 7.4.2 is given 256M as memory_limit. I am giving a 370MB file to md5_file(). I get a hash with no errors. Is PHP loading the entire file into memory, all at once, to process it? If so, is there then a practical limit to the size of the file without eventually causing an out-of-memory situation?
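One way to probe the question above is to compare PHP's peak memory before and after the call. The following is a minimal sketch, not a definitive test: `make_temp_file()` and `measure_md5_file()` are hypothetical helper names invented for this illustration, and the 16 MiB file size is an arbitrary choice that keeps the experiment quick.

```php
<?php
// Hypothetical helper: writes $sizeMiB mebibytes of filler to a temp file,
// 1 MiB at a time, so the script never holds the whole file in memory itself.
function make_temp_file(int $sizeMiB): string
{
    $path = tempnam(sys_get_temp_dir(), 'md5');
    $handle = fopen($path, 'wb');
    $block = str_repeat('x', 1024 * 1024);
    for ($i = 0; $i < $sizeMiB; $i++) {
        fwrite($handle, $block);
    }
    fclose($handle);
    return $path;
}

// Hypothetical helper: returns the hash and how far PHP's peak memory
// (as reported by memory_get_peak_usage) grew while md5_file() ran.
function measure_md5_file(string $path): array
{
    $peakBefore = memory_get_peak_usage(true);
    $hash = md5_file($path);
    $peakAfter = memory_get_peak_usage(true);
    return [
        'hash' => $hash,
        'peak_growth_bytes' => $peakAfter - $peakBefore,
    ];
}

$sizeMiB = 16; // assumption: small enough to run quickly, large enough to notice
$path = make_temp_file($sizeMiB);
$result = measure_md5_file($path);
unlink($path);

printf("hash: %s\n", $result['hash']);
printf("peak memory growth: %d bytes for a %d MiB file\n",
    $result['peak_growth_bytes'], $sizeMiB);
```

If the reported growth stays well below the file size, that suggests the file is not being held in memory all at once. Note that `memory_get_peak_usage()` only sees memory claimed through PHP's own allocator, so this is evidence rather than proof.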
requinix Posted July 12, 2021
Sounds like you've already given it a situation where it would have run out of memory and it did not...
bsmither Posted July 12, 2021 (Author)
Another experiment with an 8 GiB file has these results:
- PHP did not time out. Even though max_execution_time is set to 60 seconds, it took 66 seconds for the entire process to finish (this function and a few milliseconds of additional work elsewhere).
- The web server did time out waiting for PHP. Using NGINX, where the default request timeout is 60 seconds, the result was a 504 Gateway Time-out.
- PHP's current memory allocation jumped from 3 MiB to 12 MiB at the point where md5_file() executed.
- Watching the drive activity light on the server box showed continuous activity for approximately 65 seconds, followed by a brief blip for housekeeping.
So, I can surmise that PHP's md5_file() reads the file in chunks, perhaps 8-9 MiB per chunk. What remains unknown is whether a file that fits completely in memory is loaded all at once.
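The chunked reading surmised above can be sketched in userland with PHP's incremental hashing functions (`hash_init()`, `hash_update()`, `hash_final()`). This is only an illustration of the technique, not a description of what md5_file() does internally: `md5_file_chunked()` is a hypothetical helper, and the 8 MiB default chunk size simply mirrors the guess in the post.

```php
<?php
// Hypothetical helper, NOT PHP's internal implementation: hashes a file by
// feeding it to an incremental MD5 context one chunk at a time, so memory
// use is bounded by the chunk size rather than the file size.
function md5_file_chunked(string $path, int $chunkBytes = 8 * 1024 * 1024): string
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $context = hash_init('md5');
    while (!feof($handle)) {
        $chunk = fread($handle, $chunkBytes);
        if ($chunk === false) {
            fclose($handle);
            throw new RuntimeException("Read error on $path");
        }
        hash_update($context, $chunk); // an empty chunk at EOF is harmless
    }
    fclose($handle);

    return hash_final($context); // lowercase hex digest, same format as md5_file()
}

// Usage: the chunked digest should match md5_file() for the same file.
$content = str_repeat("forum thread sample data\n", 5000);
$path = tempnam(sys_get_temp_dir(), 'chk');
file_put_contents($path, $content);

$chunked = md5_file_chunked($path, 4096); // small chunk to force many iterations
$builtin = md5_file($path);
unlink($path);

var_dump($chunked === $builtin);
```

Because the digest depends only on the bytes fed in, the chunk size affects memory use and speed but not the resulting hash.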
requinix Posted July 12, 2021 (Solution)
Here are some relevant facts:
1. max_execution_time has some nuances in exactly what counts towards the limit. For instance, on Linux systems it counts only the time PHP itself is working, and therefore will not count time for disk reads, which are performed by the system.
2. The average hard drive can stream data at about 125 MB/s. An 8 GiB file (roughly 8,000 MB) would take about 8000/125 = 64 seconds to read in full.
3. PHP's memory usage does not correlate perfectly with the underlying source, especially given that md5_file() will be performing some amount of hashing work in addition to the literal file reads.
4. PHP claims memory in blocks. 8 MB = some number of blocks taken * the memory used per block.
Do you want to know the exact truth of what PHP is doing, or would you like to continue investigating to see if you can discover it for yourself?
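Points 1 and 2 above can be explored with a short experiment. This is a rough sketch under stated assumptions: `cpu_seconds()` and `estimate_read_seconds()` are hypothetical helpers, `usleep()` stands in for time spent waiting on something outside the script, and CPU time from `getrusage()` is used here only as an approximation of "time PHP itself is working", not as the exact quantity max_execution_time measures.

```php
<?php
// Hypothetical helper: total CPU time (user + system) consumed by this
// process so far, in seconds, as reported by getrusage().
function cpu_seconds(): float
{
    $usage = getrusage();
    return $usage['ru_utime.tv_sec'] + $usage['ru_utime.tv_usec'] / 1e6
         + $usage['ru_stime.tv_sec'] + $usage['ru_stime.tv_usec'] / 1e6;
}

// Hypothetical helper: seconds needed to stream $bytes at $megabytesPerSecond
// (decimal megabytes, i.e. 1 MB = 1,000,000 bytes).
function estimate_read_seconds(int $bytes, float $megabytesPerSecond): float
{
    return $bytes / ($megabytesPerSecond * 1e6);
}

// Point 1: wall-clock time passes while the script waits, but CPU time barely moves.
$wallStart = microtime(true);
$cpuStart = cpu_seconds();
usleep(200000); // assumption: a 0.2 s sleep stands in for waiting on the disk
$wall = microtime(true) - $wallStart;
$cpu = cpu_seconds() - $cpuStart;
printf("wall: %.3f s, cpu: %.3f s\n", $wall, $cpu);

// Point 2: the same estimate using an exact 8 GiB instead of a round 8,000 MB.
$estimate = estimate_read_seconds(8 * 1024 ** 3, 125.0);
printf("estimated read time for 8 GiB at 125 MB/s: %.1f s\n", $estimate);
```

Using the exact byte count, 8 GiB at 125 MB/s comes to roughly 69 seconds rather than 64. Either figure is in the same range as the 65 seconds of drive activity reported earlier in the thread, and the 125 MB/s rate is itself only an average.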
bsmither Posted July 13, 2021 (Author)
This is very interesting information and I am grateful for the time you've taken to explain the situation. Thank you!