
Precompressed gzip files rather than on the fly


Orakio


I tried posting this on the Apache forums, but they appear to be dead (no experts, and an armada of spammers), so perhaps I'll have better luck here :)

 

 

I'm new here so please excuse me if this is the wrong subforum to post this in.

 

What I need is to have files pre-compressed for browsers that support gzip/zlib compression. That is not all: rather than compressing the file when the user supports gzip, I also need the system to decompress it when the user doesn't support it.

 

To help people understand what I am trying to achieve I will explain the cases that have led me to desire this.

 

Case 1:

 

This case comes from a small, simple gallery I made. I wanted to put some of my pictures on the web so that friends would be able to see them.

 

I started by making two simple Perl CGI scripts: one to display the gallery, accessible to everyone, and another to add pictures (from a network drive), comments, etc., which was only accessible to me.

 

Once these two were in working order, being paranoid and performance conscious, I thought about how to improve it. I realised that if you viewed any page the Perl script produces twice, nothing changed between views. So I took the Perl script out of the CGI directory, put it in root, and removed CGI support. I then made it generate every possible page, directing the output into files rather than to the user and changing the hyperlinks accordingly.

 

I also deleted the admin script and now manage the simple flat database file by hand. As I don't often make changes to the gallery this is perfect: whenever I make a change I run the script to regenerate the gallery. It means that no script needs to be executed when a web request comes in. This increased performance and made things more secure, as there are no scripts to run which might contain bugs. There is also the added bonus that Apache can cache the files in memory. Funnily enough, I think this is probably how things were often done years ago, and it might still be done for the odd heavily used site here or there, and for things like basic guestbooks.

 

The result can be seen here (if my PC and internet connection happen to be on):

http://orakio.net:81/gal/

 

I'm not trying to spam; there are no adverts there. If you would like a good example, though, none of the folders has a default index, so you can get a good view of the structure and an understanding of what I mean. Judging by the post at the bottom of this forum which hasn't been killed yet, I don't think I have to worry about admins pouncing on me for posting a link. Well, at least this should knock it over a page. The script creates a fairly complex site (I'd say more of a graph). I haven't gone overboard with presentation, I usually don't, although the organisation and technical skill should be appreciable. If anyone actually finds this approach useful and would like to find out more, get some advice, or perhaps view some code, then I will be happy to help.

 

Case 2:

 

I hate the word blog. I want to make a site where I can post articles, which most people would call a blog, so if it suits you then call it that. Being the eccentric person I am, I have a rather different approach to this than others might.

 

A while ago I created a site for a friend with a forum, the ability to post his poems and articles, profiles for users, etc. It uses PHP and has no static files other than that, the user images and the CSS. I plan on adapting this system in a similar way to the gallery, and more for my needs than my friend's.

 

The main difference is that it will still use PHP, but intermixed with the static HTML files. For example, when a user reads a forum thread they will be looking at pregenerated HTML; when a user posts, they will be posting to a PHP script which will generate the new HTML as required.

 

The issue with this is that there will be far more generated files than there are with my gallery, and even though most will be text, in the long run I could start to see a lot of space being consumed. This is not the only reason, however, why I want precompressed files.

 

End of cases.

 

So back to what I was trying to achieve. The idea is that more and more people now have gzip support, and I would not be surprised if they are already the majority; if not, they probably will be soon. When that is the case it will be less efficient to compress each file before sending it than to send it precompressed. If files were precompressed and most users had gzip support, it would reduce bandwidth all round, reduce the CPU load of compressing, save the memory the server has to allocate for both the uncompressed copy from disk and the compressed copy to send, and save hard drive space. It would add a tiny bit of load on the user's side, but that would be negligible, especially as it is spread out over many machines rather than one. I would not be surprised if, in most cases, the overall time of a request would drop thanks to the reduced server response time and bandwidth, and the more users there are, the more this would be the case.

 

I know that you might be able to serve precompressed files with asis, using a file that starts with the appropriate headers (expressing the content encoding), or with some apache.conf trickery; a rough sketch of the asis idea is below. Either way, this doesn't solve the problem of users who don't have gzip support.
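
The asis sketch I have in mind is something like the following. It is untested, and the file name page.html.asis is just an illustration:

# Tell Apache to send *.asis files exactly as stored, headers and all.
AddHandler send-as-is asis

# A page.html.asis file would then start with its own headers, e.g.
#     Content-Type: text/html
#     Content-Encoding: gzip
# followed by a blank line and then the raw gzip bytes of the page body.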

 

The non-gzip problem might be solved by using one of Apache's modules and some sort of rewrite rule or handler, or some other magic, to pick one of two files, but that would not save hard drive space (it would use more).

 

Then again, if I can spare the extra hard drive space, keeping both copies would actually be better, as it would increase performance further by not having to decompress anything at request time. Any advice on achieving that would also be appreciated.

 

I haven't used mod_rewrite, but would something like this work:

RewriteCond %{HTTP:Accept-Encoding} !gzip
RewriteRule ^(.*)\.gzip$ $1.html [L]

 

I'm not sure this is right: I don't know whether %{HTTP:Accept-Encoding} is the proper way to test the Accept-Encoding header, and I'm probably getting the syntax wrong, but it gives an example of what I am trying to accomplish. I have had a look at mod_negotiation, mod_mime and mod_setenvif as well, but I am still somewhat confused as to how to do this in a simple manner.
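
For the two-copy variant, the fullest recipe I have been able to piece together from the docs looks roughly like this. It is untested and makes some assumptions: the compressed copies sit next to the originals as *.html.gz, the directives live in a .htaccess or <Directory> block, and mod_rewrite and mod_headers are loaded.

# If the client accepts gzip and a compressed copy exists, serve that instead.
RewriteEngine On
RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteCond %{REQUEST_FILENAME}.gz -f
RewriteRule ^(.*)\.html$ $1.html.gz [L]

# Make sure the compressed copies go out with the right type and encoding.
<FilesMatch "\.html\.gz$">
    ForceType text/html
    Header set Content-Encoding gzip
</FilesMatch>

I also have the impression that mod_negotiation (MultiViews or a type map) can choose between foo.html and foo.html.gz based on Accept-Encoding, which might make this simpler still, but I haven't tried it.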

 

Thinking about handlers, could one be assigned selectively to compressed files (with an extension that marks them as such) but only if the user doesn't support gzip?

 

What I would like (in pseudo code mostly) is something to this effect:

when inside some specified directory
    if environment[accept-encoding] does not contain "gzip"
        add handler decompress.exe
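
The closest real equivalent I can see in the docs is mod_ext_filter, which can pipe a response through an external program. My untested reading is that something along these lines might express the pseudocode above; it assumes a gunzip binary on the server, takes the enableenv= option on faith from the mod_ext_filter docs, and the directory path is just an illustration:

# Define an output filter that pipes the response through gunzip, but only
# for requests where the no_gzip_support variable has been set.
ExtFilterDefine gunzip mode=output cmd="/bin/gunzip -c" enableenv=no_gzip_support

<Directory "/www/precompressed">
    # Flag every request, then clear the flag for clients that announce gzip.
    SetEnvIf Request_URI . no_gzip_support
    SetEnvIfNoCase Accept-Encoding gzip !no_gzip_support
    SetOutputFilter gunzip
    # The Content-Encoding header would still need sorting out, so non-gzip
    # clients aren't told the body is gzip once it has been inflated.
</Directory>

Apache 2's mod_deflate also documents an INFLATE output filter for decompressing gzipped content; if that can be enabled selectively in the same way, it would avoid the external gunzip process entirely, but again I haven't tried it.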

 

I hope that explains the kind of advice/help I need. I wouldn't mind making some casual suggestions to the Apache developers if they have a feature request list somewhere, or some way of formally submitting requests for features. If anyone can inform me of the proper way to go about that I would be grateful. That would of course only be useful if I can't find advice here.

 

Thanks!

 

Sidenote 1:

 

Above I mentioned using asis with my own header files. Being a security freak, I don't want Apache to report even that it is Apache. I have tried doing this with Header in Apache's conf file and with asis header files, but to no avail. Does anyone reading this by chance know how to achieve this without changing the Apache source and recompiling it?
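
For what it's worth, the only built-in knobs I have come across so far only trim the banner rather than hide it, so this may well need a source change after all:

# These only shorten what Apache reports; the Server header itself remains.
ServerTokens Prod
ServerSignature Off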


 

Sidenote 2:

 

Some users might access some files much more than others. Does anyone know of a way to do these two things:

 

1. Tell Apache to use a different file extension depending on whether the client supports gzip or not. This isn't that important though, as I think I might be able to find it myself.

 

2. Have Apache keep a record of how frequently each file is accessed, to work out which files would be best to have two versions of, one for each kind of browser. The idea would be to have the compressed version of everything, and a decompressed version as well for the most popular pages. (Stupid me: just as I started writing this I realised it is easy enough by making a simple script that filters access.log and produces stats, but I have left it in to give a better idea of the bigger picture.)

 
