
Problems getting mime type of files


cbreemer
Solved by cbreemer


I am traversing a directory tree and printing out the name and mime type of each file. Getting the mime type fails for about 600 of my 800+ files.

Initially I used mime_content_type, but I read suggestions that it was deprecated, so I switched to finfo_open(FILEINFO_MIME_TYPE), only to find that I get exactly the same errors on the same files. See this snippet of output:

Belasting\CBR\2010-05-22 Inkomstenbelasting 2009 Voorlopige aanslag.pdf  ->  application/pdf<br>
Belasting\CBR\2011-05-27 Inkomstenbelasting 2010 Voorlopige aanslag.pdf  ->  <br />
<b>Warning</b>:  finfo_file(C:\Users\cbree\OneDrive\docs\Belasting\CBR\2011-05-27 Inkomstenbelasting 2010 Voorlopige aanslag.pdf): Failed to open stream: Invalid argument in <b>C:\Users\cbree\OneDrive\wwwroot\doc\find.php</b> on line <b>19</b><br />

For the first file it works correctly, and the MIME type is printed. For the second file, which is a very similar PDF with a very similar name, I get this error.

The PHP code I use is

    echo str_replace($base . "\\", "", $file);
    echo "  ->  ";
    $finfo = finfo_open(FILEINFO_MIME_TYPE);
    echo finfo_file($finfo, $file);
    finfo_close($finfo);
    echo "<br>\n";

My PHP version is 8.3.3 on Windows 11. I am really stumped by this, so any ideas are appreciated!
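For reference, here is a more defensive variant of the snippet above, as a rough sketch only. The warning in the output says the stream could not be opened, so this version treats that as "could not read the file" and returns null instead of printing a warning. The function name `detectMimeType` is made up for illustration. It also reuses one finfo handle rather than opening and closing one per file.

```php
<?php
// Hypothetical helper (not from the original post): returns the detected
// MIME type, or null when the file cannot be opened or classified.
function detectMimeType(finfo $finfo, string $file): ?string
{
    if (!is_file($file) || !is_readable($file)) {
        return null;
    }
    // @ hides the "Failed to open stream" warning; the false return
    // value is checked explicitly instead.
    $type = @$finfo->file($file);
    return $type === false ? null : $type;
}

// One handle is enough for the whole directory traversal.
$finfo = new finfo(FILEINFO_MIME_TYPE);
```

Inside the loop, something like `echo detectMimeType($finfo, $file) ?? '(could not read)';` would then print a placeholder for the files that fail.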


 

 


16 minutes ago, cbreemer said:

For the second file, which is a very similar PDF with a very similar name, I get this error.

finfo doesn't care about the file's name - only its contents. And it seems that your 2010 file reads a little different from the 2009 version, enough so that finfo can't tell what's in the file.

The unfortunate truth about MIME detection is that it doesn't work very reliably in many cases. Generally, you're better off examining the extension and then trying (when possible) to verify that the file is valid for that extension.
In the case of PDFs that's actually kinda hard to do. Is there a problem with just trusting that your *.pdf files are PDF files? What other kinds of files do you need to handle?
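The "extension first, then verify" idea could look roughly like the sketch below. Both function names and the extension table are invented for illustration, and the PDF check only looks for the `%PDF-` signature at the start of the file, so it is a cheap sanity check rather than real validation.

```php
<?php
// Hypothetical helper: look up a MIME type from the file extension.
// The table is illustrative; extend it for the file types you actually have.
function guessTypeByExtension(string $path): ?string
{
    $types = [
        'pdf'  => 'application/pdf',
        'txt'  => 'text/plain',
        'jpg'  => 'image/jpeg',
        'jpeg' => 'image/jpeg',
        'png'  => 'image/png',
    ];
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    return $types[$ext] ?? null;
}

// Hypothetical helper: true if the file starts with the PDF signature.
function looksLikePdf(string $path): bool
{
    $handle = @fopen($path, 'rb');
    if ($handle === false) {
        return false; // unreadable file, so nothing can be verified
    }
    $header = fread($handle, 5);
    fclose($handle);
    return $header === '%PDF-';
}
```

A caller would trust the extension for most types and only call `looksLikePdf()` when the extension says `pdf` and a mismatch would actually matter.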


14 minutes ago, Barand said:

Try something like this

$dir = '../test';

// List every file in $dir that has an extension, with its detected MIME type
echo '<pre>';
foreach (glob("$dir/*.*") as $f)  {
    printf( "%-80s  %-30s<br>", $f, mime_content_type($f) );
}
echo '</pre>';

 

Thanks! That is much like what I had to start with, except that I use scandir instead of glob (but I'm pretty sure it does not matter exactly how the filenames are obtained).

Edited by cbreemer

  • Solution
7 minutes ago, requinix said:

finfo doesn't care about the file's name - only its contents. And it seems that your 2010 file reads a little different from the 2009 version, enough so that finfo can't tell what's in the file.

Yes I thought so too. I was just ruling out the distant possibility there was something anomalous in the path name. 

7 minutes ago, requinix said:

The unfortunate truth about MIME detection is that it doesn't work very reliably in many cases. Generally, you're better off examining the extension and then trying (when possible) to verify that the file is valid for that extension.
In the case of PDFs that's actually kinda hard to do. Is there a problem with just trusting that your *.pdf files are PDF files? What other kinds of files do you need to handle?

Indeed I am going off mime type detection, seeing how easily it can trip up. It's not a big deal, they are all my own files and there are only a handful of file types to deal with. So checking the extensions will be fine, as you suggest. I'm already implementing it.
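For anyone landing here later, an extension-based listing over a directory tree might look something like this sketch. It is not the actual code from this thread. The function name `listTypesByExtension` and the shape of the `$types` map are assumptions made for the example.

```php
<?php
// Hypothetical helper: walks $base recursively and labels each file by its
// extension. $types maps lowercase extensions to MIME types (caller-supplied).
function listTypesByExtension(string $base, array $types): array
{
    $base = rtrim($base, '/\\'); // normalise so the relative path is cut correctly
    $result = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($base, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        if (!$file->isFile()) {
            continue;
        }
        $ext = strtolower($file->getExtension());
        // Path relative to $base, similar to the output in the first post.
        $relative = substr($file->getPathname(), strlen($base) + 1);
        $result[$relative] = $types[$ext] ?? 'unknown';
    }
    return $result;
}
```

Calling it with a map such as `['pdf' => 'application/pdf', 'txt' => 'text/plain']` returns an array of relative path to type, with `'unknown'` for anything not in the map, and no file contents are read at all.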

Thanks for your help!

