
Saving site statistics (hits) into a database: best format/strategy?


Recommended Posts

Hello.

What is the best way to save visitor "hits" into database so that I can then show stats like "hits in month x, hits in month y"?

 

If I save each visitor hit as a record in a table, I think the database will grow too large.

Is the best solution to create a record for each month and just increase a value in this record? For example, one record would be "August 2008", and its "total hits" field would be incremented for each visitor during August.

 

Any best practices/ideas/strategies/techniques for this?

Thank you.

I suppose you could have it check the database table to see if the current month/year exists in the month column,

 

if it does, then just add 1 to the value in the hits column

 

if it doesn't, then add another row with the current month/year in the month column and set the hits column to 1
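
In MySQL that check-then-write can be collapsed into a single atomic statement with INSERT ... ON DUPLICATE KEY UPDATE, which also avoids the race between the SELECT and the write. A minimal sketch, assuming a hypothetical monthly_hits table (the table and column names are illustrative, not from the thread):

```sql
-- Hypothetical table: one row per month, keyed on the period
-- so the upsert below can find it.
CREATE TABLE monthly_hits (
    period CHAR(7)      NOT NULL,  -- e.g. '2008-08'
    hits   INT UNSIGNED NOT NULL DEFAULT 0,
    PRIMARY KEY (period)
);

-- "Add 1 if the month exists, otherwise insert it with hits = 1":
INSERT INTO monthly_hits (period, hits)
VALUES (DATE_FORMAT(NOW(), '%Y-%m'), 1)
ON DUPLICATE KEY UPDATE hits = hits + 1;
```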

Hi

 

Record-locking wise, I suspect inserts might be better. However, whether it is enough of an improvement probably depends on the total number of reads compared to writes.

 

Another advantage of a plain insert is that you can store other info (time of day, IP address, whether the visitor came from an external site or linked from another of your pages, etc.).
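
For instance, a raw per-hit log table might look like this. A rough sketch only; the table and column names (hits_raw etc.) are assumptions for illustration:

```sql
-- Hypothetical raw log: one row per hit, carrying the extra
-- detail mentioned above.
CREATE TABLE hits_raw (
    hit_time DATETIME     NOT NULL,
    ip       VARCHAR(15)  NOT NULL,  -- dotted-quad IPv4
    page     VARCHAR(255) NOT NULL,
    referrer VARCHAR(255) NULL       -- NULL for direct visits
);

-- One plain insert per page view:
INSERT INTO hits_raw (hit_time, ip, page, referrer)
VALUES (NOW(), '123.223.123.223', '/index.php', 'http://example.com/');
```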

 

If you only need limited info in the long run, you can always run a batch job to sum it up at the end of the month. So during July you might insert 100 rows for IP 123.223.123.223 browsing 10 different pages, but at the start of August just sum that into 10 rows, one for each page.
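
A sketch of that end-of-month batch job, reusing the hypothetical hits_raw table above and assuming a hits_summary table with (period, ip, page, hits) columns:

```sql
-- Collapse July's raw rows into one summary row per IP and page.
INSERT INTO hits_summary (period, ip, page, hits)
SELECT '2008-07', ip, page, COUNT(*)
FROM hits_raw
WHERE hit_time >= '2008-07-01' AND hit_time < '2008-08-01'
GROUP BY ip, page;

-- Then drop the raw rows that were just summarised.
DELETE FROM hits_raw
WHERE hit_time >= '2008-07-01' AND hit_time < '2008-08-01';
```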

 

All the best

 

Keith

I've been working on something like this for a while (see: the crawler catcher link in my signature). It depends on how much data you wish to store.

 

For example, showing how many hits each month for x users is great, but if you are also including the current month (reading rows that are being updated) you may get longer query times due to lock waits (more a MyISAM problem than an InnoDB problem). This can be a problem if LOTS of people are accessing the table.

 

One solution would be to use two tables: a fill table and a log table. The page hits go into the fill table; then every hour or every day you aggregate those rows (GROUP BY ip/userID) and update the "log" table via a cron job.
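
A rough sketch of that cron job, assuming hypothetical hits_fill and hits_log tables where hits_log has a unique key on (ip, page):

```sql
-- Fold the fill table into the running totals in the log table.
INSERT INTO hits_log (ip, page, hits)
SELECT ip, page, COUNT(*)
FROM hits_fill
GROUP BY ip, page
ON DUPLICATE KEY UPDATE hits_log.hits = hits_log.hits + VALUES(hits);

-- A real job would delete only the rows it just summed (e.g. bounded
-- by timestamp) so hits arriving mid-run aren't lost.
DELETE FROM hits_fill;
```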

 

This would mean you are only performing SELECTs on the log table (MyISAM is best for this), and any aggregate functions will be much faster.

 

Just make sure that in the "main" tables you have a fixed row length, and ideally only store ID numbers. So if you store user agents, store them in a separate table, check whether the user agent already exists, and store the ID of that user agent.
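
For instance, a lookup table along these lines (a hypothetical useragents table; names are illustrative):

```sql
-- The variable-length string lives here once; the main log rows
-- store only the fixed-width id.
CREATE TABLE useragents (
    id    INT UNSIGNED NOT NULL AUTO_INCREMENT,
    agent VARCHAR(255) NOT NULL,
    PRIMARY KEY (id),
    UNIQUE KEY (agent)
);

-- Insert the agent only if it's new, then look up its id:
INSERT IGNORE INTO useragents (agent)
VALUES ('Mozilla/4.0 (compatible; MSIE 6.0)');
SELECT id FROM useragents
WHERE agent = 'Mozilla/4.0 (compatible; MSIE 6.0)';
```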

 

My "test" log table (slightly different to above example) stores: IP,proxy,useragent,domain,page,timestamp,hit count - each row is 123 bytes, 1Mil (ish) rows is 100Mb.

 

 
