Directory System File Load Performance


fry2010

I have always wondered whether having hundreds or even thousands of files in a single directory affects performance, specifically when using functions such as fopen() and fwrite().

 

Say I have a directory called '/temp/' and in that directory I have 10 files.

I have another directory called '/temp2/' and in that directory I have 10000 files.

 

Is there any significant difference in trying to open and/or edit a file in either situation?
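One way to get a feel for the answer is simply to measure it. The following is only a rough sketch, assuming PHP 7 or later on the command line; the helper names (make_dir_with_files, time_opens, remove_dir) and the scratch paths are made up for illustration. It opens the same file repeatedly, so it mostly measures cached lookups and says little about a cold disk.

```php
<?php
// Hypothetical helper: create a directory holding $count small files.
function make_dir_with_files(string $dir, int $count): void
{
    if (!is_dir($dir)) {
        mkdir($dir, 0777, true);
    }
    for ($i = 0; $i < $count; $i++) {
        file_put_contents("$dir/file_$i.txt", "data $i");
    }
}

// Time $rounds open/read/close cycles on one file; returns seconds.
function time_opens(string $path, int $rounds): float
{
    $start = microtime(true);
    for ($i = 0; $i < $rounds; $i++) {
        $handle = fopen($path, 'r');
        fread($handle, 64);
        fclose($handle);
    }
    return microtime(true) - $start;
}

// Hypothetical helper: delete a flat directory and the files in it.
function remove_dir(string $dir): void
{
    foreach (glob("$dir/*") as $file) {
        unlink($file);
    }
    rmdir($dir);
}

// Scratch locations standing in for '/temp/' and '/temp2/'.
$base  = sys_get_temp_dir() . '/dirbench_' . getmypid();
$small = "$base/temp";
$large = "$base/temp2";
make_dir_with_files($small, 10);
make_dir_with_files($large, 10000);

$smallTime = time_opens("$small/file_5.txt", 1000);
$largeTime = time_opens("$large/file_5000.txt", 1000);

printf("10 files:    %.4f s for 1000 opens\n", $smallTime);
printf("10000 files: %.4f s for 1000 opens\n", $largeTime);

// Clean up the scratch directories.
remove_dir($small);
remove_dir($large);
rmdir($base);
```

The numbers will depend heavily on the filesystem and on what the OS already has cached, so treat them as a starting point rather than an answer.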


Basically, what I'm looking to do is create a directory for each new member of my site, containing files relevant to them. Of course this could be done with a MySQL database instead, but I want to reduce the load on the database, so I am considering putting certain information (for example, unique profile view clicks) into their own files.

Good idea or is it pointless?
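For reference, this is roughly what a per-member counter file might look like. It is a minimal sketch, assuming PHP 7 or later; increment_counter, the member id 42 and the file name profile_views.txt are all hypothetical, and it counts every call rather than unique visitors. The main thing it shows is that the file needs locking, otherwise concurrent requests can lose updates.

```php
<?php
// Hypothetical helper: increment a per-member counter file under a lock.
function increment_counter(string $path): int
{
    // 'c+' opens for read/write, creates the file if missing, never truncates.
    $handle = fopen($path, 'c+');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        // Exclusive lock so two requests cannot read the same old value.
        if (!flock($handle, LOCK_EX)) {
            throw new RuntimeException("Cannot lock $path");
        }
        // An empty (new) file casts to 0.
        $count = (int) stream_get_contents($handle) + 1;
        ftruncate($handle, 0);
        rewind($handle);
        fwrite($handle, (string) $count);
        fflush($handle);
        flock($handle, LOCK_UN);
        return $count;
    } finally {
        fclose($handle);
    }
}

// Made-up layout: one directory per member id under a scratch location.
$memberDir = sys_get_temp_dir() . '/members_' . getmypid() . '/42';
mkdir($memberDir, 0777, true);
$file = "$memberDir/profile_views.txt";

increment_counter($file);
increment_counter($file);
$views = increment_counter($file);
echo "Profile views: $views\n"; // prints "Profile views: 3"
```

Every increment here is an open, lock, read, write and close, which is worth comparing against a single indexed UPDATE before deciding the files are cheaper.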


With filesystem performance you really need to break it down to its core: the filesystem operations, which vary wildly from one filesystem format to another.  ZFS, UFS, EXT3, NTFS and FAT all have different core operations, but those all live in kernel-land.  Most of the time the operating system provides a common file IO API for file and directory operations in userland.  Then there is the file stream API that most of us are used to, built atop the IO API.  So there are multiple layers you're dealing with in regard to performance.  What I'm trying to get at is that if you wish to truly optimize filesystem performance, it'll take more than an application-level solution.

 

If you want to worry about performance, you need to understand what the operating system is doing when you open a file or browse the filesystem.  If you're a Windows user, I'm giving you some homework.  Download Sysinternals Process Monitor (MS bought them and has their utils on its site now).  Monitor only filesystem activity, and filter down to the php process or the web server that may be loading PHP as a module, whatever is executing your PHP code.

 

Write PHP scripts to:

Open and close a single file using an absolute path.

Open and close a single file using a relative path.

Get the filesize of a file.

Get a directory listing.

Search a few directories recursively.
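The five operations above could be sketched in one throwaway script like this. It assumes PHP 7 or later; the scratch directory and file names are invented purely so the script has something to work on, and in practice you might split each step into its own script to keep the monitor output easy to read.

```php
<?php
// Scratch tree under the system temp dir; all names here are made up.
$root = sys_get_temp_dir() . '/fsprobe_' . getmypid();
mkdir("$root/a/b", 0777, true);
file_put_contents("$root/one.txt", 'hello');
file_put_contents("$root/a/two.txt", 'hi');
file_put_contents("$root/a/b/three.log", 'x');

// 1. Open and close a single file using an absolute path.
$handle = fopen("$root/one.txt", 'r');
fclose($handle);

// 2. Open and close a single file using a relative path.
chdir($root);
$handle = fopen('a/two.txt', 'r');
fclose($handle);

// 3. Get the filesize of a file (clear PHP's stat cache first).
clearstatcache();
$size = filesize("$root/one.txt");

// 4. Get a directory listing, minus the '.' and '..' entries.
$listing = array_values(array_diff(scandir($root), ['.', '..']));

// 5. Search the tree recursively for *.txt files.
$found = [];
$iterator = new RecursiveIteratorIterator(
    new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
);
foreach ($iterator as $file) {
    if ($file->isFile() && $file->getExtension() === 'txt') {
        $found[] = $file->getFilename();
    }
}
sort($found);

echo "Size: $size bytes\n";
echo 'Listing: ' . implode(', ', $listing) . "\n";
echo 'Found: ' . implode(', ', $found) . "\n";
```

Run it once with the monitor capturing and compare how many filesystem events each numbered step produces.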

 

You'll find there is a great deal more activity going on than you think.  And now you can put the filesystem performance into some context. Uncached file reads vs Uncached DB SELECTs should be a little more apparent.

 

You also mentioned that you wanted to reduce the load on the database.  This to me is a bad sign; it's far too general a statement.

 

Benchmark - take a measure of your queries. What's slow? What's fast? Use mysqlslap for direct MySQL benchmarking and ab for web server benchmarking. mtop is also useful.

 

Profile - where, what, when, why, how - yay profiling! Dissect what is going on and identify bottlenecks in particular.  MySQL's Slow Query Log, EXPLAIN, SHOW STATUS, SHOW PROFILE, and Section 7 (Optimization) of the MySQL manual are great sources of information.  mysqlsla and myprofi are tools that are worth a look as well.

 

Proper Indexes - know them, use them, update them.  Read the manual, give it a week to sink in and read it again.  Print off a copy and keep it near the porcelain throne; indexes are fundamental to decent database schemas.

 

Other recommended tidbits

mysqltuner perl script

Percona Server, Percona Toolkit (Percona Server is a custom/patched build of MySQL, built with added "instrumentation" and other stuff).  It's great to put on a dev or testing server so you can optimize there and get profiling data early.

 

Commercial Tools that could help you

Jet Profiler for MySQL

Webyog's MONyog/SQLyog

MySQL Enterprise Monitor - had to add it to the list for a laugh

 

All too often, new (and not so new) web developers without DB knowledge of some kind blame slow performance on the DB when they haven't utilized the built-in mechanics of the DB itself.  If a DBA ever comes onto the project at some point, he's gonna be frowning at you a lot, rolling his eyes and possibly laughing.

 

Education is the means to answer our own ignorance, let it be the fire of our mind.

 

Such a long reply to a short question.. *sigh*


No, that's excellent, thehippy. Plenty of great info there, cheers.

 

By reducing the load on the database, I specifically meant reducing the number of queries. I know that when you construct a database effectively and write efficient queries there is a lot of performance to be had. Since I am not a professional database architect, however, I thought perhaps I could use directories to store files that act as a cache, rather than performing the same requests over and over when they give the same results.

 

In effect, what I'm getting at is: perform the MySQL query once -> store the data in a cache file -> read that cache file whenever it's needed.
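That query once, cache, reuse flow might look something like the sketch below. It assumes PHP 7 or later; cached() is a hypothetical helper, the 300-second lifetime is an arbitrary choice, and the closure is only a stand-in for a real MySQL query so the example runs without a database.

```php
<?php
// Hypothetical cache helper: return cached data if fresh, else rebuild it.
function cached(string $cacheFile, int $ttl, callable $producer): array
{
    clearstatcache(true, $cacheFile);
    if (is_file($cacheFile) && time() - filemtime($cacheFile) < $ttl) {
        $data = json_decode(file_get_contents($cacheFile), true);
        if (is_array($data)) {
            return $data; // cache hit: the producer is not called
        }
    }

    // Cache miss or expired: run the expensive work once.
    $data = $producer();

    // Write to a temp file and rename it into place, so other requests
    // never read a half-written cache file.
    $tmp = $cacheFile . '.' . uniqid('', true) . '.tmp';
    file_put_contents($tmp, json_encode($data));
    rename($tmp, $cacheFile);

    return $data;
}

// Stand-in for a real MySQL query; it counts how often it gets run.
$queryCount = 0;
$runQuery = function () use (&$queryCount): array {
    $queryCount++;
    return ['member_id' => 42, 'profile_views' => 1337];
};

// Made-up cache location under the system temp dir.
$cacheFile = sys_get_temp_dir() . '/member_42_' . getmypid() . '.json';
@unlink($cacheFile);

$first  = cached($cacheFile, 300, $runQuery); // miss: runs the "query"
$second = cached($cacheFile, 300, $runQuery); // hit: served from the file

echo "Queries run: $queryCount\n"; // prints "Queries run: 1"
```

Whether this beats the database depends on how expensive the original query is; for a cheap indexed lookup the file read may not be any faster.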

 

That led me to wonder: since I would be creating a lot of cache files, would that in fact lead to worse performance than just sticking with the database in the first place?

 

I know that many popular platforms, such as WordPress, OpenCart and Drupal, use cached files. I suppose I should really just take a good look at how they have done it.

 

Awesome answer too. I wish I had the time to learn everything you just posted, as well as loads of other programming stuff. I find I always try to do things as quickly as possible. It doesn't seem to get me very far though, lol. Maybe if I make it someday I'll have the time to truly appreciate this answer.
