Jump to content

Advice on image upload script.


Millertime1

Recommended Posts

I'm new to programming so take it easy on me but I wrote this script for php and it's not complete yet but I could use some advice on file upload security. Besides recreating the image, checking token/request and setting apache settings what else can I do to secure this application. Here's my code:

header('Content-Type: application/json');

$uploaded = array();
	
if(!empty($_FILES['file']['name'][0])) {	
	
	$allowedExts = array("gif", "jpeg", "pjpeg", "x-png", "jpg", "png");
		
	foreach ($_FILES['file']['tmp_name'] as $k => $v) {
		$name = $_FILES['file']['name'][$k];
		$type = $_FILES['file']['type'][$k];
		$error = $_FILES["file"]["error"][$k];
		$size = $_FILES["file"]["size"][$k];
	    $tempDir = $_FILES["file"]["tmp_name"][$k];
		$lowerName = strtolower($name);
		$temp = explode(".", $lowerName);
		$extension = end($temp);
		
		if ((($type == "image/gif")
			|| ($type == "image/jpeg")
			|| ($type == "image/jpg")
			|| ($type == "image/pjpeg")
			|| ($type == "image/x-png")
			|| ($type == "image/png"))
			&& ($size < 20000)
			&& in_array($extension, $allowedExts)) {
			if ($error > 0) {
			echo "Return Code: " . $error . "<br>";
			} else {
				$imageInfo = getimagesize($tempDir);
				if($imageInfo['mime'] != 'image/gif' && $imageInfo['mime'] != 'image/jpeg' && $imageInfo['mime'] != 'image/jpg' &&
				$imageInfo['mime'] != 'image/pjpeg' && $imageInfo['mime'] != 'image/x-png' && $imageInfo['mime'] != 'image/png') {
					echo 'Sorry we only accept GIF, JPEG, JPG, PJPEG, X-PNG, PNG image files!';
				}else {
				switch($type) {
					case 'image/gif':
						$newName = md5(uniqid()). '.' .time(). '.gif';
					break;
					case 'image/jpeg':
						$newName = md5(uniqid()). '.' .time(). '.jpeg';
					break;
					case 'image/jpg':
						$newName = md5(uniqid()). '.' .time(). '.jpg';
					break;
					case 'image/pjpeg':
						$newName = md5(uniqid()). '.' .time(). '.pjpeg';
					break;
					case 'image/x-png':
						$newName = md5(uniqid()). '.' .time(). '.x-png';
					break;
					case 'image/png':
						$newName = md5(uniqid()). '.' .time(). '.png';
					break;
				}
				
				
				if (file_exists('images/' .Session::get(Config::get('session/session_name')). '/' .$newName)) {
				   echo escape($name) . " already exists. ";
				} else {
				if(move_uploaded_file($tempDir, 'images/' .Session::get(Config::get('session/session_name')). '/' .$newName)) {
					$uploaded[] = array(
						'name' => $name,
						'file' => 'images/' .Session::get(Config::get('session/session_name')). '/' .$newName
				    );
				}
				}
				}
				}
			} else {
			   echo 'Sorry we only accept GIF, JPEG, JPG, PJPEG, X-PNG, PNG image files!';
			}
		}
	}

 I've been researching apache Mod_mime but can't find enough info to wrap my mind around it.

Edited by Millertime1
Link to comment
Share on other sites

don't trust $_FILES['file']['type'][$k]; to give you an accurate mime type.

 

It is set by the browser and shouldn't be trusted.

 

Use this: http://php.net/manual/en/function.finfo-file.php

 

That's why I had getimagesize() too but I'm not sure if you can trust that one either from what I've been reading.

 

I've changed a section of my code to this:

header('Content-Type: application/json');

$uploaded = array();
	
if(!empty($_FILES['file']['name'][0])) {	
	
	$allowedExts = array("gif", "jpeg", "pjpeg", "x-png", "jpg", "png");
		
	foreach ($_FILES['file']['tmp_name'] as $k => $v) {
		$name = $_FILES['file']['name'][$k];
		$type = $_FILES['file']['type'][$k];
		$error = $_FILES["file"]["error"][$k];
		$size = $_FILES["file"]["size"][$k];
	    $tempDir = $_FILES["file"]["tmp_name"][$k];
		$lowerName = strtolower($name);
		$temp = explode(".", $lowerName);
		$extension = end($temp);
		
		$finfo = finfo_open(FILEINFO_MIME_TYPE);
		$fileType = finfo_file($finfo, $tempDir);
		finfo_close($finfo);
		
		
		
		if ((($fileType == "image/gif")
			|| ($fileType == "image/jpeg")
			|| ($fileType == "image/jpg")
			|| ($fileType == "image/pjpeg")
			|| ($fileType == "image/x-png")
			|| ($fileType == "image/png"))
			&& ($size < 20000)
			&& in_array($extension, $allowedExts)) {
			if ($error > 0) {
			echo "Return Code: " . $error . "<br>";
			} else {
				$imageInfo = getimagesize($tempDir);
				if($imageInfo['mime'] != 'image/gif' && $imageInfo['mime'] != 'image/jpeg' && $imageInfo['mime'] != 'image/jpg' &&
				$imageInfo['mime'] != 'image/pjpeg' && $imageInfo['mime'] != 'image/x-png' && $imageInfo['mime'] != 'image/png') {
					echo 'Sorry we only accept GIF, JPEG, JPG, PJPEG, X-PNG, PNG image files!';
				}else {
				switch($fileType) {
					case 'image/gif':
						$newName = md5(uniqid()). '.' .time(). '.gif';
					break;
					case 'image/jpeg':
						$newName = md5(uniqid()). '.' .time(). '.jpeg';
					break;
					case 'image/jpg':
						$newName = md5(uniqid()). '.' .time(). '.jpg';
					break;
					case 'image/pjpeg':
						$newName = md5(uniqid()). '.' .time(). '.pjpeg';
					break;
					case 'image/x-png':
						$newName = md5(uniqid()). '.' .time(). '.x-png';
					break;
					case 'image/png':
						$newName = md5(uniqid()). '.' .time(). '.png';
					break;
				}
				
Link to comment
Share on other sites

Neither getimagesize() nor finfo_file() have any relevance for security, because even a perfectly valid image can contain arbitrary code. The only thing that matters is how the file is interpreted.

 

Note that you have to worry both about your server and your users. Enforcing a whitelist of file extensions is a good start, but it's by no means sufficient. If see the following problems:

  • The code is damn complex for such a simple task, and it's very difficult to read due to the structure, the formatting and a lot of duplicate checks. This greatly increases the risk of bugs.
  • There's no backup plan. Either this code, the webserver and the clients all behave exactly like they should, or you're screwed. That's not a good idea. Internet Explorer in particular will actively work against you, so it's very important to have a second line of defense.
  • What's with all those strange MIME types and extensions? Why do you want to allow the “x-png” extension? When do you expect to get the nonexistent “image/jpg” type? Do you really need people to upload progressive JPEGs? The problem is that the more exotic the data, the bigger the risk that somebody misunderstands it.
  • This md5(uniqid()) stuff is a bit silly. It may be “good enough” for a low-traffic, low-value website, but why fumble with workarounds when every serious operating system is capable of generating proper random numbers?
  • The file_exists() check doesn't work, but if you use proper random numbers, you don't need it anyway.
  • What is Session::get(Config::get('session/session_name'))? Are you sure you can safely insert it into the file path? Why do you need the subfolder in the first place?

I'd suggest the following:

  • Throw away all duplicate checks. You don't really need any type checks at all: This is, at best, a usability feature for the odd case that the client accidentally takes a non-image and gives it an image extension.
  • Refactor the code: Use proper indentation, make the structure clearer (e. g., put the standard case before the error case).
  • Use Content Security Policy to tell the browser that no scripts should ever be executed. A good policy would be: default-src 'none'; sandbox. Note, however, that only modern browsers respect this header.
  • Serve the images from a separate domain, preferable not a subdomain. Due to the Same-Origin Policy, malicious scripts will at least be isolated on this domain and cannot access the main site. This is actually the only way to deal with Internet Explorer.
  • Allow only the types and extensions you actually need.
  • Use the random number generator of your operating system. You can access it through several PHP interfaces, depending on which extensions you have installed: Mcrypt has mcrypt_create_iv(), OpenSSL has openssl_random_pseudo_bytes(). Otherwise, you can read directly from the system device (e. g. /dev/urandom).
Link to comment
Share on other sites

 

 

Neither getimagesize() nor finfo_file() have any relevance for security, because even a perfectly valid image can contain arbitrary code. The only thing that matters is how the file is interpreted.

 

Note that you have to worry both about your server and your users. Enforcing a whitelist of file extensions is a good start, but it's by no means sufficient. If see the following problems:

  • The code is damn complex for such a simple task, and it's very difficult to read due to the structure, the formatting and a lot of duplicate checks. This greatly increases the risk of bugs.

I thought quickly checking the content type before a more in depth check would be a good ideal and as for the format it was edited a bunch so it got a bit screwed up but I had plans to fix the structure.

  • There's no backup plan. Either this code, the webserver and the clients all behave exactly like they should, or you're screwed. That's not a good idea. Internet Explorer in particular will actively work against you, so it's very important to have a second line of defense.

I never knew about that so now I'll have to research that more. Thank you.

  • What's with all those strange MIME types and extensions? Why do you want to allow the “x-png” extension? When do you expect to get the nonexistent “image/jpg” type? Do you really need people to upload progressive JPEGs? The problem is that the more exotic the data, the bigger the risk that somebody misunderstands it.

I just got the extension names off of w3schools file upload script and x-png was recommended in a RFC document I read.

  • This md5(uniqid()) stuff is a bit silly. It may be “good enough” for a low-traffic, low-value website, but why fumble with workarounds when every serious operating system is capable of generating proper random numbers?

I didn't think I would run into a collision problem when I was also using a timestamp but I will use mcrypt from now on.

  • The file_exists() check doesn't work, but if you use proper random numbers, you don't need it anyway.

I figure it would never find a file with the same name but I thought having a backup would be helpful.

  • What is Session::get(Config::get('session/session_name'))? Are you sure you can safely insert it into the file path? Why do you need the subfolder in the first place?

It's a method I have to read session data and session_name is the users table id and I was going to add a numeric check to that also . I was reading that storing them that way would help if I corrupted the image location table and they were all stored in a single folder.

 

I'd suggest the following:

  • Throw away all duplicate checks. You don't really need any type checks at all: This is, at best, a usability feature for the odd case that the client accidentally takes a non-image and gives it an image extension.
  • Refactor the code: Use proper indentation, make the structure clearer (e. g., put the standard case before the error case).
  • Use Content Security Policy to tell the browser that no scripts should ever be executed. A good policy would be: default-src 'none'; sandbox. Note, however, that only modern browsers respect this header.

I'm looking into it.

  • Serve the images from a separate domain, preferable not a subdomain. Due to the Same-Origin Policy, malicious scripts will at least be isolated on this domain and cannot access the main site. This is actually the only way to deal with Internet Explorer.

Here's where I'm lost...let's say I store them in a folder sub root how do you call it when it has no registered address...sorry i'm real new!

  • Allow only the types and extensions you actually need.
  • Use the random number generator of your operating system. You can access it through several PHP interfaces, depending on which extensions you have installed: Mcrypt has mcrypt_create_iv(), OpenSSL has openssl_random_pseudo_bytes(). Otherwise, you can read directly from the system device (e. g. /dev/urandom).

 

Edited by Millertime1
Link to comment
Share on other sites

I just got the extension names off of w3schools file upload script and x-png was recommended in a RFC document I read.

 

w3schools is infamous for wrong and inaccurate information. Note that they are not associated with the W3C. It's a private company in Norway which appearently uses the name to sell certificates.

 

Nevertheless, it's OK to accept the “image/x-png” MIME type (since the type is irrelevant anyway), but you shouldn't allow an “x-png” extension.

 

 

 

I figure it would never find a file with the same name but I thought having a backup would be helpful.

 

The problem of the file_exists() method is that there's a gap between the check and the action. If two users try to upload the same new file at the same time, file_exists() returns false in both cases, because the file indeed doesn't exist at that point in time. However, the situation changes when the files are actually written. Now one upload will overwrite the other.

 

Instead of a weak random number and a weak backup, it's better to have a strong random number which doesn't need a backup.

 

 

 

It's a method I have to read session data and session_name is the users table id and I was going to add a numeric check to that also . I was reading that storing them that way would help if I corrupted the image location table and they were all stored in a single folder.

 

Yeah, well, as long as you make sure the ID is indeed a number, it's OK. But I'd rather backup my database regularly.

 

 

 

Here's where I'm lost...let's say I store them in a folder sub root how do you call it when it has no registered address...sorry i'm real new!

 

What I mean is that you should have a separate domain for the uploads. Let's say your main domain is www.millertime.com, and the document root of this site is /var/www/millertime/www. Then you should have a separate domain like uploads.millertime.io with a separate document root like /var/www/millertime/uploads. That's where the uploaded images go. The point is that even if you do have malicious JavaScript code on uploads.millertime.io, it cannot access the main domain www.millertime.com. All browsers prevent that.

 

Of course an extra domain is expensive, so it might not be an option. Maybe you can at least get a subdomain like uploads.millertime.com. This isn't quite as secure, because some data like cookies is shared between the domains. But it's better than nothing.

Link to comment
Share on other sites

w3schools is infamous for wrong and inaccurate information. Note that they are not associated with the W3C. It's a private company in Norway which appearently uses the name to sell certificates.

 

Nevertheless, it's OK to accept the “image/x-png” MIME type (since the type is irrelevant anyway), but you shouldn't allow an “x-png” extension.

 

 I did a little research and it looks like I could just accept GIF, JPG, PNG and be fine so I'm probably going to just go with those.

 

 

The problem of the file_exists() method is that there's a gap between the check and the action. If two users try to upload the same new file at the same time, file_exists() returns false in both cases, because the file indeed doesn't exist at that point in time. However, the situation changes when the files are actually written. Now one upload will overwrite the other.

 

Instead of a weak random number and a weak backup, it's better to have a strong random number which doesn't need a backup.

 

 Makes since. What size of mcrypt iv should I use that won't cause too much of a load on the server and at the same time be sufficient enough for naming?

 

 

Yeah, well, as long as you make sure the ID is indeed a number, it's OK. But I'd rather backup my database regularly.

 

 

 If it doesn't create performance problems I think storing this way helps with keeping relation to the users if something was to happen.

 

 

What I mean is that you should have a separate domain for the uploads. Let's say your main domain is www.millertime.com, and the document root of this site is /var/www/millertime/www. Then you should have a separate domain like uploads.millertime.io with a separate document root like /var/www/millertime/uploads. That's where the uploaded images go. The point is that even if you do have malicious JavaScript code on uploads.millertime.io, it cannot access the main domain www.millertime.com. All browsers prevent that.

 

Of course an extra domain is expensive, so it might not be an option. Maybe you can at least get a subdomain like uploads.millertime.com. This isn't quite as secure, because some data like cookies is shared between the domains. But it's better than nothing.

 

okay I understand now. Is there any dependable image hosting options available for websites?

Link to comment
Share on other sites

Makes since. What size of mcrypt iv should I use that won't cause too much of a load on the server and at the same time be sufficient enough for naming?

 

16 bytes is just fine. Note that you have to encode the bytes to turn them into sensible characters. If you use simple hexadecimal encoding, you get 32 characters:

$new_filename = bin2hex(mcrypt_create_iv(16, MCRYPT_DEV_URANDOM));
okay I understand now. Is there any dependable image hosting options available for websites?

 

What do you mean?

Link to comment
Share on other sites

Here's a updated code, it's not done yet as it doesn't check the session or the request before starting and also it doesn't move the file to another domain as I haven't set it up yet but here it is:  

if(!empty($_FILES['file']['name'][0])) {	
	if(sizeof($_FILES['file']['name'], 0) < 11 ) {
		foreach ($_FILES['file']['tmp_name'] as $k => $v) {
			$name = $_FILES['file']['name'][$k];
			$type = $_FILES['file']['type'][$k];
			$error = $_FILES["file"]["error"][$k];
			$size = $_FILES["file"]["size"][$k];
			$tempDir = $_FILES["file"]["tmp_name"][$k];
			
			if ($error > 0) {
					echo "Return Code: " . $error . "<br>";
			} else {
				$finfo = finfo_open(FILEINFO_MIME_TYPE);
				$fileType = finfo_file($finfo, $tempDir);
				finfo_close($finfo);
						
				if ((($fileType == "image/gif")	|| ($fileType == "image/jpg") || ($fileType == "image/png")) && ($size < 20000))	 {
						$iv_size = mcrypt_get_iv_size(MCRYPT_RIJNDAEL_256, MCRYPT_MODE_ECB);
						switch($fileType) {
							case 'image/gif':
								$newName = bin2hex(mcrypt_create_iv($iv_size, MCRYPT_RAND)). '.' .time(). '.gif';
							break;
							case 'image/jpg':
								$newName = bin2hex(mcrypt_create_iv($iv_size, MCRYPT_RAND)). '.' .time(). '.jpg';
							break;
							case 'image/png':
								$newName = bin2hex(mcrypt_create_iv($iv_size, MCRYPT_RAND)). '.' .time(). '.png';
							break;
						}
						$user = Session::get(Config::get('session/session_name'));
						if(ctype_digit($user)) {
							if(move_uploaded_file($tempDir, 'images/' .escape($user). '/' .$newName)) {
									//insert into database							}
							else {
								echo "There was an error uploading your files";
							}
						}else {
							exit();
						}
						
				}else {
					echo "Invalid file";
				}
			}
		}
	}else {
		echo "Only 10 images can be uploaded at a time!";	
	}
}	
Edited by Millertime1
Link to comment
Share on other sites

16 bytes is just fine. Note that you have to encode the bytes to turn them into sensible characters. If you use simple hexadecimal encoding, you get 32 characters:

$new_filename = bin2hex(mcrypt_create_iv(16, MCRYPT_DEV_URANDOM));
Sorry i didn't see your post...as you can see I already converted the iv.

 

What do you mean?

Like a file hosting website that's meant just for websites to host images on that scan them and allow you to call them to your page. I've looked at imageshack, photobucket, and Amazon file storage but none of them seem to match what I'm looking for.

Link to comment
Share on other sites

Regarding the previous code: This indeed looks a lot better and clearer. There are still some issues and oddities, though.

  • What is the client supposed to do with the internal error code? In fact, some of the codes are clearly not meant for the end user: If you're having trouble with the temporary folder or a PHP extension, that's none of their business. You should have proper error handling with a message that's actually useful and doesn't expose internal information (e. g., “The file is too big. We only allow file sizes up to ... KB.”). Also consider sending an appropriate HTTP response code like 422 Unprocessable Entity.
  • Wherever you've copied the Mcrypt code from, that's not what I meant. MCRYPT_RIJNDAEL_256? We're not doing any encryption here. You just call mcrypt_create_iv() with a fixed size of 16 bytes and MCRYPT_DEV_URANDOM as the source. Do not use MCRYPT_RAND. This is even worse than the uniqid() stuff from before.
  • The MIME type for JPEG images is image/jpeg. There is no such thing as “image/jpg” (like I already said above).
  • What does the ominous escape() function do? Escaping for what? This should get a proper name, and I'm pretty sure it will turn out it's not appropriate for file paths. The ctype_digit() check is good, though.
  • The “invalid file” message should be a bit more descriptive: List the allowed types again.
  • There are still issues with the indentation. It doesn't matter if you use tabs or spaces, but be consistent. Always use the same amount of whitespace for one indentation level.
  • The lonely exit statement in line 38 should be replaced with proper error handling.

 

 

 

Like a file hosting website that's meant just for websites to host images on that scan them and allow you to call them to your page. I've looked at imageshack, photobucket, and Amazon file storage but none of them seem to match what I'm looking for.

 

I think this is overkill. Unless you expect large amounts of traffic, just store the images on your server.

Link to comment
Share on other sites

Regarding the previous code: This indeed looks a lot better and clearer. There are still some issues and oddities, though.

  • What is the client supposed to do with the internal error code? In fact, some of the codes are clearly not meant for the end user: If you're having trouble with the temporary folder or a PHP extension, that's none of their business. You should have proper error handling with a message that's actually useful and doesn't expose internal information (e. g., “The file is too big. We only allow file sizes up to ... KB.”). Also consider sending an appropriate HTTP response code like 422 Unprocessable Entity.
  • Wherever you've copied the Mcrypt code from, that's not what I meant. MCRYPT_RIJNDAEL_256? We're not doing any encryption here. You just call mcrypt_create_iv() with a fixed size of 16 bytes and MCRYPT_DEV_URANDOM as the source. Do not use MCRYPT_RAND. This is even worse than the uniqid() stuff from before.
  • The MIME type for JPEG images is image/jpeg. There is no such thing as “image/jpg” (like I already said above).
  • What does the ominous escape() function do? Escaping for what? This should get a proper name, and I'm pretty sure it will turn out it's not appropriate for file paths. The ctype_digit() check is good, though.
  • The “invalid file” message should be a bit more descriptive: List the allowed types again.
  • There are still issues with the indentation. It doesn't matter if you use tabs or spaces, but be consistent. Always use the same amount of whitespace for one indentation level.
  • The lonely exit statement in line 38 should be replaced with proper error handling.

 

 

 

I think this is overkill. Unless you expect large amounts of traffic, just store the images on your server.

I went with the 256 encryption iv size because i read that it also works as a token. The errors weren't set up yet as i should have mentioned as I'm having trouble with changing a value onload from a reponseText right now so i just left them as an echo for the time being, i should have mentioned that. I'll change the iv. I don't know why I put the escape on the user id.

 

And for images I'm going to have to work on storing the images when i get home...I'm running an xampp server so I'll have to research how to properly set it up since i read I'll have to set some setting to be able to cross communicate with the other domain.

Link to comment
Share on other sites

I went with the 256 encryption iv size because i read that it also works as a token.

 

That code makes absolutely no sense. You're asking for the IV size of some very exotic encryption algorithm and a mode which doesn't even support an IV (ECB). This should actually return 0, but for some reason it returns 32 instead. That's far too much.

 

You just need to choose appropriate length for the random string. The fact that this function was originally meant to generate initialization vectors is irrelevant here. It might as well be called mcrypt_random_bytes(). That would actually be a better name.

 

 

 

And for images I'm going to have to work on storing the images when i get home...I'm running an xampp server so I'll have to research how to properly set it up since i read I'll have to set some setting to be able to cross communicate with the other domain.

 

You don't need to set up an image subdomain on your local test server. That would be a bit silly. You only need this when the site is actually online.

 

I don't know what you're read about cross-domain communication, but this is only a problem if you want to load the images via JavaScript. Is that the plan? If you just want to include the images in your HTML document, then of course you can load them from any domain you want.

Link to comment
Share on other sites

That code makes absolutely no sense. You're asking for the IV size of some very exotic encryption algorithm and a mode which doesn't even support an IV (ECB). This should actually return 0, but for some reason it returns 32 instead. That's far too much.

 

You just need to choose appropriate length for the random string. The fact that this function was originally meant to generate initialization vectors is irrelevant here. It might as well be called mcrypt_random_bytes(). That would actually be a better name.

 

 

 

 

You don't need to set up an image subdomain on your local test server. That would be a bit silly. You only need this when the site is actually online.

 

I don't know what you're read about cross-domain communication, but this is only a problem if you want to load the images via JavaScript. Is that the plan? If you just want to include the images in your HTML document, then of course you can load them from any domain you want.

 

I should have checked into mcrypt more before i typed it since its been 2 months since i called that module and instead i just googled it and using it as a token and that's what came up. There's 30x the bad advice compared to good advice out there and about cross communicating i was also thinking that later i might add .pdf files and that's why i was thinking that i would need to have cross domain set up. Im on my phone and its hard to write on.

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.