
My server is Linux/Apache/PHP.

 

When a file is uploaded, I use PHP's finfo_open to confirm that the file's content matches its file extension, and I delete the file if it doesn't.  I also restrict which MIME types and file sizes can be uploaded.

 

Things I do with the files include:

  1. Upload user's files and store them in some public directory (/var/www/html/users_public_directory/), and allow other users to directly download them.
  2. Upload user's files and store them in some private directory (/var/www/users_private_directory/), and allow other users to download them using X-Sendfile.
  3. Upload user's ZIP files and convert them to PDF files (unzip the ZIP file, and use LibreOffice and ImageMagick's convert to convert the contents to PDFs).

From the server's perspective, what are the risks of allowing users to upload files?  Are there some file types which are more dangerous to the server?  Could they be executed on the server, and if so, how could this be prevented?

Checking the MIME type doesn't help at all. For example, even a perfectly valid JPEG image can contain malicious code. What matters is how the files are treated.

 

There are three problems:

  • Your server may misinterpret the uploaded files and execute embedded code.
  • The browsers of your users may erroneously execute the files.
  • Some file types are inherently dangerous (HTML, SVG, Flash etc.)

Unfortunately, there's no definite solution. In fact, some webservers (like Apache) and some browsers (like Internet Explorer) will actively work against you and sabotage your efforts with all kinds of weird “features”. The only way to deal with that is to use defense in depth and protect the application with multiple security layers:

  • Never accept the user-provided filename. Generate a random name and pick an extension from a pre-defined whitelist.
  • Avoid direct downloads whenever possible. Instead, serve the files through a script. If this is not an option, make sure that the webserver will not execute any scripts in the upload folder (and make sure this configuration cannot be overridden with a custom .htaccess file).
  • Serve the files via a separate (sub)domain to isolate them from the actual application.
  • Tell the client to not execute scripts. Use Content Security Policy, for example. Not all browsers support this, but for the ones that do, it's a very powerful feature.
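To make the first point concrete, here is a minimal sketch of how a stored filename could be generated. The whitelist contents, the constant name ALLOWED_TYPES and the function name generateStoredName() are assumptions for illustration only, and the MIME type passed in is assumed to have been detected on the server.

```php
<?php
declare(strict_types=1);

// Hypothetical whitelist: detected MIME type => extension we are willing to store.
const ALLOWED_TYPES = [
    'image/jpeg'      => 'jpg',
    'image/png'       => 'png',
    'application/pdf' => 'pdf',
    'application/zip' => 'zip',
];

/**
 * Build the name under which an upload is stored.
 * The user-provided filename is never used; returns null for types not on the whitelist.
 */
function generateStoredName(string $detectedMime): ?string
{
    if (!isset(ALLOWED_TYPES[$detectedMime])) {
        return null;
    }

    // 16 random bytes => 32 hex characters: hard to guess and free of special characters.
    return bin2hex(random_bytes(16)) . '.' . ALLOWED_TYPES[$detectedMime];
}
```

The original filename can still be kept in the database and handed back to the user at download time; it just never touches the filesystem.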

Of course, processing the files with LibreOffice, ImageMagick and whatnot creates additional risks, because attackers may be able to exploit specific bugs in those applications. But if you need this feature, you'll have to live with that.

 

I hope you use a proper PHP extension for the processing? If you do it on the shell, you also have to worry about shell injections.

Checking the MIME type doesn't help at all.

Just because it's not enough doesn't mean it's useless. Assuming we're talking about determining the MIME type server-side and not getting it from $_FILES, of course.

- Best way to determine the proper file extension when it's not known ahead of time

- Good indication as to how an operating system (and to a lesser degree, your server) will interpret the file, if combined with using the correct file extension

- Provides validation for the good users

- Quick way to deny lazy attackers who aren't forging their own upload requests
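For anyone unsure about the difference between the two sources of a MIME type, a rough sketch of server-side detection follows. It assumes the fileinfo extension is installed; the function name detectMimeType() and the form field name 'upload' are made up for the example.

```php
<?php
declare(strict_types=1);

/**
 * Detect the MIME type from the file's contents (via libmagic), not from
 * $_FILES[...]['type'], which is simply whatever the client chose to send.
 * Requires the fileinfo extension.
 */
function detectMimeType(string $path): ?string
{
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    $mime = $finfo->file($path);

    return $mime === false ? null : $mime;
}

// Usage inside an upload handler ('upload' is a hypothetical form field name):
// $mime = detectMimeType($_FILES['upload']['tmp_name']);
```

The result is still a guess based on bytes the uploader controls, so it is best treated as a validation aid rather than a security boundary.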

Assuming we're talking about determining the MIME type server-side and not getting it from $_FILES, of course.

 

*lol*

 

So when the client explicitly declares a file type, that's not trustworthy, but when the same client puts a bunch of magic bytes into the file and makes the server guess the type, then we can suddenly rely on it? That doesn't make a terrible lot of sense.

 

It also seems we're going off on a tangent here. NotionCommotion asked about security, not data validation for well-meaning users. Using MIME type detection as a security feature is fundamentally wrong, because malicious file uploads are not an input problem. A perfectly valid image can blow up the server, and a super-shady PHP script can be a harmless piece of source code. The question is how the file is interpreted. And, no, you do not want to rely on the OS or webserver to figure it out for you. You tell the server and client how to interpret the data and prevent any guesswork from the beginning.
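As an illustration of telling the client how to interpret the data, a download script could send headers along these lines. This is only a sketch: the function name buildDownloadHeaders() is invented here, and the MIME type is assumed to come from the application's own records, never from the request.

```php
<?php
declare(strict_types=1);

/**
 * Build response headers that state explicitly how a stored file is to be treated.
 * $mime is assumed to come from our own records, not from the client.
 */
function buildDownloadHeaders(string $mime, string $downloadName): array
{
    // Keep only a conservative character set so the name cannot break the header.
    $safeName = preg_replace('/[^A-Za-z0-9._-]/', '_', $downloadName);

    return [
        'Content-Type: ' . $mime,
        // "attachment" asks the browser to save the file instead of rendering it.
        'Content-Disposition: attachment; filename="' . $safeName . '"',
        // Ask the browser not to second-guess the declared type.
        'X-Content-Type-Options: nosniff',
        // Restrictive policy for browsers that support Content Security Policy.
        "Content-Security-Policy: default-src 'none'; sandbox",
    ];
}

// In the download script:
// foreach (buildDownloadHeaders('application/pdf', 'report.pdf') as $line) {
//     header($line);
// }
```

The file body itself can then be delivered with readfile() or handed off via X-Sendfile, as in the original question.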

 

If we leave aside security and get into data validation, sure, you can check the MIME type. Whether you use the declaration of the client or the server's guess is irrelevant. Either way, the validation won't be very useful. The average user has Windows, and Windows is all about file extensions. Even if the client somehow ends up with a wrong extension on their PC, they'll quickly notice, because they won't be able to open the file properly. So if you get a “.jpg” file, it's safe to assume that the user actually means a JPEG image and not a Word document.

So when the client explicitly declares a file type, that's not trustworthy, but when the same client puts a bunch of magic bytes into the file and makes the server guess the type, then we can suddenly rely on it? That doesn't make a terrible lot of sense.

Rely on it in the sense that what you see is going to be what the server sees. Sure, I can stick a %PNG into whatever I want and make my .zip look like an image, however the server is going to use the same logic to come to the same conclusion. Even if I ignore the MIME type, the rest of the system may not, so I'd rather derive the same results as it will now than leave the question unanswered.

 

Using MIME type detection as a security feature is fundamentally wrong, because malicious file uploads are not an input problem.

Strictly speaking yes, data is data and harm only comes from using it incorrectly. But there are more ways to harm a system than waiting for a user to try to download the file to their computer or otherwise execute it.

If the file is malicious I don't want it on my server. Period. I don't care if the web server won't try to interpret it, I don't care if the file sits unused for eternity, it should not be there at all.

 

You tell the server and client how to interpret the data and prevent any guesswork from the beginning.

Ideally, yes, but I don't know of an operating system where you can specify the behavior of an individual file without having to rely on things like MIME type detection or file extension mappings.

 

If we leave aside security and get into data validation, sure, you can check the MIME type.

I'm sure we're on the same page regarding validation :)
Your server may misinterpret the uploaded files and execute embedded code.

What might allow this to happen?

 

 

Never accept the user-provided filename. Generate a random name and pick an extension from a pre-defined whitelist.

You mean not to use the user-provided filename to store it on the server, or not to display this name when someone wants to download it?  If the former, why pick an extension at all, and not just use a random name without an extension (and revert to the applicable filename with extension when downloaded via headers)?  If the latter, why?

 

 

Avoid direct downloads whenever possible. Instead, serve the files through a script. If this is not an option, make sure that the webserver will not execute any scripts in the upload folder (and make sure this configuration cannot be overridden with a custom .htaccess file).

Again, what might make the webserver execute one of these file/scripts?

 

I hope you use a proper PHP extension for the processing? If you do it on the shell, you also have to worry about shell injections.

Proper PHP extensions?  I don't, but of course use escapeshellarg() and the like.  Please explain.

What might allow [a misinterpretation of files] to happen?

 

Well, mostly the file extension. Many webservers will execute any file with a “.php” extension, sometimes even if the extension occurs within the filename (like “mypicture.php.jpg”).

 

Another case is a local file inclusion attack: If the user is able to inject a custom path into a dynamic include statement, then the included file will also be executed as a script – regardless of the file extension or the MIME type. For example, PHP will happily execute code embedded into a comment segment of a JPEG image.

 

The protection against that is obvious:

  • Be as restrictive as possible when you set up script execution in the webserver configuration. By default, no files should be executed. Then you create a whitelist of legitimate scripts or script folders.
  • Do not allow arbitrary paths in dynamic include statements. Use a whitelist of valid paths.
  • Restrict script inclusion with the open_basedir setting.
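The second point, a whitelist for dynamic includes, might look roughly like this. The page keys, the paths and the names PAGES and resolvePage() are hypothetical.

```php
<?php
declare(strict_types=1);

// Hypothetical map of page keys to script paths; only these can ever be included.
const PAGES = [
    'home'    => __DIR__ . '/pages/home.php',
    'profile' => __DIR__ . '/pages/profile.php',
];

/**
 * Resolve a user-supplied page key to a script path, or null if it is not whitelisted.
 * The user input is used only as an array key, never as part of a path.
 */
function resolvePage(string $key): ?string
{
    return PAGES[$key] ?? null;
}

// Usage:
// $key = is_string($_GET['page'] ?? null) ? $_GET['page'] : 'home';
// $script = resolvePage($key);
// if ($script === null) { http_response_code(404); exit; }
// require $script;
```

Because the request value never becomes part of a filesystem path, an uploaded file cannot be pulled into an include statement this way, whatever its name or content.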

 

 

 

You mean not to use the user-provided filename to store it on the server, or not to display this name when someone wants to download it?  If the former, why pick an extension at all, and not just use a random name without an extension (and revert to the applicable filename with extension when downloaded via headers)?  If the latter, why?

 

If you store the uploaded files outside of the document root and serve them through a PHP script, then of course you could theoretically store the files with no extension at all. However, the extension is important information for humans. Administrators or supporters may have to access the files directly once in a while, and in that case it's useful to know the type without having to invoke MIME detection or look up the meta information in the database.

 

 

 

Proper PHP extensions?  I don't, but of course use escapeshellarg() and the like.  Please explain.

 

Escaping is fragile and should only be used as the last resort. For example, escapeshellarg() is vulnerable to null byte injections, which can allow users to truncate the shell arguments:

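<?php

header('Content-Type: text/html; charset=utf-8');

// Simulated user input; note the trailing null byte.
$_POST['filename'] = "myfile.php\0";
// The application appends ".jpg" and expects the result to end in that extension.
$intended_filename = $_POST['filename'] . '.jpg';

// On affected PHP versions the argument is cut off at the null byte, so ".jpg" is lost.
// (PHP 8 throws a ValueError here instead of truncating.)
$arg = 'This file should have a .jpg file extension, but the user prevented that: ' . escapeshellarg($intended_filename);

echo htmlspecialchars($arg, ENT_QUOTES, 'UTF-8');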

So avoid dynamic shell commands whenever possible and use PHP extensions instead. Have you tried the ImageMagick extension, for example?
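For the image part, a conversion through the extension could be sketched as follows. It assumes the imagick extension is installed and that the local ImageMagick policy permits writing PDFs; the function name convertImageToPdf() is made up, and both paths are expected to be server-generated rather than user input.

```php
<?php
declare(strict_types=1);

/**
 * Convert an image to PDF through the Imagick extension, without building a shell command.
 * Both paths are expected to be server-generated, not taken from user input.
 */
function convertImageToPdf(string $sourcePath, string $targetPath): void
{
    if (!extension_loaded('imagick')) {
        throw new RuntimeException('The imagick extension is not available.');
    }

    $image = new Imagick();
    try {
        $image->readImage($sourcePath);
        $image->setImageFormat('pdf');
        $image->writeImage($targetPath);
    } finally {
        $image->clear(); // release the resources held by the object
    }
}
```

This removes the shell from the picture, but not the risk of bugs in ImageMagick itself, so the other layers still apply.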

Edited by Jacques1

You know you're getting some good information when you get two experienced developers going back and forth. Soak it up peeps. Programmers, in my experience, rarely agree. However, there is more than one way to approach a problem and come to a suitable solution. Many times, a "right" answer cannot be necessarily determined in programming. It's not a black and white science. You can run tests a thousand times on a piece of code and get the correct testing answer, but still have serious flaws in your logic that will cause it to fail given such and such data. The worst thing a new programmer can do is stick steady to the first way they learned how to accomplish a task. It feels good to see results, but dig deeper and see how others are tackling the problem. Look into best practices at the time. If you post to the board and several people are telling you such and such is the wrong way to do something - this should be a clue that maybe it's the wrong approach. Having a healthy ego is good, but being stubborn will only keep you stuck as a mediocre programmer at best - at worst, a poor programmer who has never been challenged and only thinks they are good. A good programmer never stops progressing and continues to grow and learn.
