Store uploaded file as blob or in folder?


Accipiter

Hi,

I'm creating a community site while being new to PHP, MySQL, and who knows what else, so I constantly bump into decisions that I feel I know too little about to make good choices.

This time:

When uploading files to the image galleries (or just a personal avatar), where best to store them?
My two options are:
1) As normal files, linking to them from the database by folder and name.
2) As blobs in a separate table, linking to them from the database by id.

In any case, the files are not directly accessible by a user, so the folders would be "out of reach", above the public_html folder.

What is best? What is fastest? What is recommended?

/Accipiter

Generally (and anybody out there correct me if I'm wrong), it's best to store links to the files in the database. And yes, you usually store them outside the document tree of your site. I usually just make a 'data' folder under apache, something like /usr/local/apache/data, and store them in there.

Also, letting people upload files to your server is a [b]huge[/b] security concern. Make sure you set the MAX_FILE_SIZE hidden field in your form to fend off oversized uploads. I usually rename all the files that come in to something like '13.dat', where '13' is the id in the DB. That makes them easy to look up, and helps make sure that anything malicious will have a harder time being run via extension associations (.exe files, for example; definitely not foolproof).

Generally, you get better performance storing links to the files. Databases weren't really designed to hold binary data in their tables (they can do it, yes, but this wasn't the original intent).
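
To make the "store it outside the document tree under its DB id" idea concrete, here is a minimal sketch. The names storedPath(), storeUpload() and the DATA_DIR constant are made up for illustration, and the id is assumed to come from a row you have already inserted into your database.

```php
<?php
// Hypothetical folder outside the document root; adjust to your server.
const DATA_DIR = '/usr/local/apache/data';

/** Build the on-disk path for the file whose database id is $id. */
function storedPath(int $id, string $dataDir = DATA_DIR): string
{
    return rtrim($dataDir, '/') . '/' . $id . '.dat';
}

/**
 * Move one entry of $_FILES into the data folder under its database id.
 * move_uploaded_file() refuses files that did not arrive via an HTTP upload.
 */
function storeUpload(array $file, int $id, string $dataDir = DATA_DIR): bool
{
    if ($file['error'] !== UPLOAD_ERR_OK) {
        return false; // the upload itself failed or was rejected by PHP
    }
    return move_uploaded_file($file['tmp_name'], storedPath($id, $dataDir));
}
```

In the upload handler this would be called as something like storeUpload($_FILES['file'], $newId), where $newId is the id of the freshly inserted row.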

[quote author=c4onastick link=topic=121549.msg499938#msg499938 date=1168283590]
Also, letting people upload files to your server is a [b]huge[/b] security concern. Make sure you set the MAX_FILE_SIZE hidden field in your form to fend off oversized uploads.
[/quote]

Thanks for the quick reply.

Yes, I have MAX_FILE_SIZE set, and I also check that [color=blue]$_FILES['file']['size'] < MAX_FILE_SIZE[/color] (a constant value) as well as [color=blue]is_uploaded_file($_FILES['file']['tmp_name'])[/color]. Well, then I use
[color=blue]list($width, $height, $filetype) = @getimagesize($_FILES['file']['tmp_name'])[/color] in order to determine whether it is a valid jpg/gif/png image. Though I have no idea how reliable getimagesize() is.

I am more concerned about the filename, as you mentioned, c4onastick.
If I store the file in a folder as suggested, but still wish to keep the original name, how can I be sure it is a valid filename? For now I try to remove any invalid characters:
[color=blue]trim(preg_replace('/[\/\n\r\t\*:<>\|\?\\\\"]/', '', $filename))[/color]
But I am not sure if I am missing something. Could any other non-printable character slip through, or is that enough?
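
One way to sidestep the "did I miss a character?" worry is to list what is allowed instead of what is forbidden. The sketch below is only an illustration of that whitelist approach; sanitizeFilename() and the 'unnamed' fallback are hypothetical choices, not something from this thread.

```php
<?php
/**
 * Reduce a client-supplied filename to letters, digits, dot, underscore
 * and hyphen. Everything else, including control characters, becomes "_".
 */
function sanitizeFilename(string $name): string
{
    $name = basename($name);                               // drop any path component
    $name = preg_replace('/[^A-Za-z0-9._-]/', '_', $name); // whitelist of allowed characters
    $name = ltrim($name, '.');                             // no hidden files or dot-only names
    return $name === '' ? 'unnamed' : $name;
}
```

Note that this also replaces non-ASCII letters, which may or may not be acceptable for your users.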

Though, from your reply, I've decided to drop the BLOB idea and go for the normal file storage way  :)

But when the image is viewed later, if for some reason the "image" actually is evil executable code that slipped through the suffix check and the getimagesize() check, won't sending the header [i]Content-Type: image/xxx[/i] prevent it from being treated as anything but an image in the client's browser?

Sorry if I slipped from my original question somewhat  :-[

[quote author=Daniel0 link=topic=121549.msg499940#msg499940 date=1168283767]
You shouldn't rely on client-side validation (i.e. the maxfilesize field is not secure).
[/quote]
Very true. It's still better to be paranoid when it comes to security.
[quote author=Accipiter]But when the image is viewed later, if for some reason the "image" actually is evil executable code that slipped through the suffix check and the getimagesize() check, won't sending the header Content-Type: image/xxx prevent it from being treated as anything but an image in the client's browser?
[/quote]
It should, but like Daniel0 said, better safe than sorry!
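
Since the form field can be bypassed, the size check has to be repeated on the server. A minimal sketch of that, assuming a hypothetical 2 MiB limit and a made-up helper name checkUploadSize():

```php
<?php
// Hypothetical limit: 2 MiB. Pick whatever suits the site.
const MAX_UPLOAD_BYTES = 2 * 1024 * 1024;

/**
 * Return an error message, or null if the upload passes the size checks.
 * $file is one entry of $_FILES, e.g. $_FILES['file'].
 */
function checkUploadSize(array $file, int $maxBytes = MAX_UPLOAD_BYTES): ?string
{
    // PHP reports files over upload_max_filesize as UPLOAD_ERR_INI_SIZE and a
    // violated MAX_FILE_SIZE form field as UPLOAD_ERR_FORM_SIZE.
    if ($file['error'] === UPLOAD_ERR_INI_SIZE || $file['error'] === UPLOAD_ERR_FORM_SIZE) {
        return 'File is too large.';
    }
    if ($file['error'] !== UPLOAD_ERR_OK) {
        return 'Upload failed.';
    }
    // Measure the temporary file on disk instead of trusting the reported size.
    $size = filesize($file['tmp_name']);
    if ($size === false || $size > $maxBytes) {
        return 'File is too large.';
    }
    return null;
}
```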

[quote author=Accipiter link=topic=121549.msg499964#msg499964 date=1168286928]Well, then I use
[color=blue]list($width, $height, $filetype) = @getimagesize($_FILES['file']['tmp_name'])[/color] in order to determine whether it is a valid jpg/gif/png image. Though I have no idea how reliable getimagesize() is.[/quote]
I think it's reliable for detecting the image type. It reads the file's header, and if a valid image header is there, it reports the type and dimensions. Keep in mind that it only looks at the header, so it says nothing about what the rest of the file contains.

[quote author=Accipiter link=topic=121549.msg499964#msg499964 date=1168286928]I am more concerned about the filename, as you mentioned, c4onastick.
If I store the file in a folder as suggested, but still wish to keep the original name, how can I be sure it is a valid filename? For now I try to remove any invalid characters:
[color=blue]trim(preg_replace('/[\/\n\r\t\*:<>\|\?\\\\"]/', '', $filename))[/color]
But I am not sure if I am missing something. Could any other non-printable character slip through, or is that enough?[/quote]
Store the original filename in your database, and give the file on disk a name you generate yourself, like: [code]md5(uniqid(rand().microtime(),true));[/code]
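
On PHP 7 or later, a variation on the same idea is to take the name from random_bytes() instead of hashing rand() and microtime(). This is a swapped-in technique, not what was posted above; randomStoredName() and the array keys are hypothetical.

```php
<?php
/**
 * Generate an unguessable on-disk name: 32 hex characters plus ".dat".
 * random_bytes() draws from the operating system's secure random source.
 */
function randomStoredName(): string
{
    return bin2hex(random_bytes(16)) . '.dat';
}

// Hypothetical record to insert into the database.
$record = [
    'original_name' => 'holiday photo.jpg', // shown to users, never used as a path
    'stored_name'   => randomStoredName(),  // the only name that touches the disk
];
```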

[quote author=Accipiter link=topic=121549.msg499964#msg499964 date=1168286928]But when the image is viewed later, if for some reason the "image" actually is evil executable code that slipped through the suffix check and the getimagesize() check, won't sending the header [i]Content-Type: image/xxx[/i] prevent it from being treated as anything but an image in the client's browser?[/quote]
Yes it would. Sending e.g. [tt]Content-type: image/png[/tt] will make the browser treat it like a PNG image, but when the file has been downloaded to the user's computer it is the computer's file extension associations that will decide what to do with it when executed/opened.
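
As a rough sketch of the viewing side, the headers can be derived from the type constant that getimagesize() returns at index 2. The helper name imageHeaders() is made up, and the nosniff header is an extra precaution that was not mentioned in this thread.

```php
<?php
/**
 * Map an IMAGETYPE_* constant to response headers, or return null for
 * anything that is not JPEG, GIF, or PNG.
 */
function imageHeaders(int $imageType): ?array
{
    $allowed = [IMAGETYPE_JPEG, IMAGETYPE_GIF, IMAGETYPE_PNG];
    if (!in_array($imageType, $allowed, true)) {
        return null;
    }
    return [
        'Content-Type: ' . image_type_to_mime_type($imageType),
        // Asks the browser not to guess a different type from the content.
        'X-Content-Type-Options: nosniff',
    ];
}

// Usage inside a (hypothetical) viewing script:
//   foreach (imageHeaders($type) as $h) { header($h); }
//   readfile($path);
```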

You could always do something like:

[code=php:0]
$error = '';
$allowed = array("jpg", "png", "gif"); // file extensions you wish to allow
$filename = $_FILES['file']['name']; // original name as sent by the client
$extension = strtolower(pathinfo($filename, PATHINFO_EXTENSION));
if ($extension === '') {
    $error .= "Please upload an image file (.jpg, .png, .gif). ";
} elseif (!in_array($extension, $allowed)) {
    $error .= "Invalid file extension. ";
}
$imgsize = @getimagesize($_FILES['file']['tmp_name']); // taken from Daniel0 :p
// getimagesize() returns false when it cannot find an image header
if ($imgsize === false || $imgsize[0] < 1) {
    $error .= "Please upload an image file. Your file may be corrupted or invalid.";
}
// check the file size and all that fun stuff
if ($error === '') {
    // move file to the folder and do the DB stuff and whatever else needs to be done
    // output success message, or redirect or whatever you do.
}
[/code]

I've never worked with upload scripts, but if I ever do, that is basically what I plan on using, so it could be completely wrong lol.

Thanks for the security input!

But, as I've decided to put the files in a folder (with an ID_XXXX.DAT name as suggested), there is one follow-up question:

Let's say there will be over 10000, maybe millions (a guy can dream  ;D) of images in the folder. Will this cause any problems? Should I sort them into subfolders to make fetching the files quicker? Or is it nothing to worry about?
What would the limit be, other than hard drive space?

Simply put: is there anything I ought to think about when it comes to storing lots of files on a Linux-based server?

You shouldn't have to worry about it. Unless you're running a [b]huge[/b] site, I don't think you'll even get close to the limit. Categorizing them in folders would be fine if you envision it helping (it usually doesn't help me; I like to store that kind of data in the DB). If you want it, I'd set it up now, as it will be more difficult to add later.

From reading a little about it elsewhere on the net, it seems there is no actual limit (at least not one I would ever reach). It might make things slower, but probably only when you do an [i]ls[/i] or similar.

But to be on the safe side, I will make it so that the files can be stored in multiple folders.

Thanks for all your input!
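
For what it's worth, spreading id-named files over several folders can be as small as the sketch below. shardedPath() and the figure of 1000 files per folder are arbitrary assumptions for illustration.

```php
<?php
/**
 * Spread files over subfolders so no single directory grows too large.
 * With $perFolder = 1000, ids 0-999 go to folder "0", 1000-1999 to "1", etc.
 */
function shardedPath(int $id, int $perFolder = 1000): string
{
    $folder = intdiv($id, $perFolder); // integer division picks the folder
    return $folder . '/' . $id . '.dat';
}
```

Because the folder follows directly from the id, nothing extra has to be stored in the database to find a file again.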

You could store them in sub-folders based on the user that uploaded the file or a sub-folder indicating which filters will apply in displaying the image.

For instance, if someone is displaying an image attached to Phase Z of Project Y within the "Plans" portion of your site, you could store them in:

data/Plans/Images/Y/Z/<db_id>.<random_short_key>.<extension>

Using the db_id as part, or all, of the filename will help you later if you want to write a clean up script.  It makes it easier to check that every file on the disk is referenced in the DB and that every entry in the DB is on the disk.
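
A clean-up check along those lines might look like this sketch. It assumes filenames start with the numeric db_id followed by a dot, as in the scheme above; compareDiskWithDb() is a hypothetical name, and the two input lists would come from your own query and from something like scandir().

```php
<?php
/**
 * Compare the ids known to the database with the filenames found on disk.
 * Returns files nobody references ("orphans") and ids with no file ("missing").
 */
function compareDiskWithDb(array $dbIds, array $fileNames): array
{
    $onDisk = [];
    $orphans = [];
    foreach ($fileNames as $name) {
        // The leading digits before the first dot are taken to be the db_id.
        if (preg_match('/^(\d+)\./', $name, $m) && in_array((int) $m[1], $dbIds, true)) {
            $onDisk[] = (int) $m[1];
        } else {
            $orphans[] = $name; // on disk but not referenced in the DB
        }
    }
    $missing = array_values(array_diff($dbIds, $onDisk)); // in the DB but not on disk
    return ['orphans' => $orphans, 'missing' => $missing];
}
```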

Additionally, you should store the [i]original[/i] filename, maybe even the filesize, in the DB.  This way you can serve it later and have it appear to have the same name, thus hiding your renaming scheme (security, security, security!) and you can also check that it's likely the same file later.
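
Serving the file under its original name usually comes down to a Content-Disposition header. The following is a minimal sketch with a made-up helper name, dispositionHeader(); it only handles plain ASCII names safely and replaces characters that could break the header.

```php
<?php
/**
 * Build a Content-Disposition header that presents the stored file under
 * its original name, taken from the database.
 */
function dispositionHeader(string $originalName): string
{
    // Drop any path component, then replace control characters,
    // double quotes and backslashes with an underscore.
    $safe = preg_replace('/[\x00-\x1F\x7F"\\\\]/', '_', basename($originalName));
    return 'Content-Disposition: inline; filename="' . $safe . '"';
}

// Usage in a (hypothetical) viewing script:
//   header(dispositionHeader($row['original_name']));
```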
