Jump to content

Looking for an method to be able to efficiently handle low priority sql inserts like star rankings, etc.


Recommended Posts

What I'm trying to do is implement features like a star rating system, favorites, etc and I'd like to find a way to do this to where I'm not writing to a DB after each action.  I was thinking of making use of memory to store actions and then apply them after a select number of statements, is this recommended or should I maybe store to a text file or something?  Any other ideas?

Have you identified that doing them as they are selected is causing performance or balancing issues? No need to over-engineer something for a perceived problem. I can't imagine that so many users are going to be ranking items in such volume that it would be a problem. Well, not unless you had something such as YouTube or Facebook - in which case there will be developers and architects that would be designing the process.

 

But, I will answer your question nonetheless with my suggestions:

 

It is certainly possible to "save up" these selections and then submit them at one time. But, it adds a lot more complexity and potential for errors. I would not store them in a flat file as that would be worse from a performance impact. Instead, you could store the selections in a cookie or in the session data. So, every time the user clicks to make a rating make an AJAX call to a PHP page and pass the ID of the records being rated and the rating. That PHP page will take that data and add to an array in either the cookie or session data. Then it can check to see how many records are in the array. If more than a certain number the page would additionally use that data to insert the necessary data into the database.

 

But, now comes the problems with such an approach.

 

The user will close the browser at some point. Any unsaved ratings could be lost. If they are in the session, they will definitely be lost. If they are in a cookie they will likely still be in the cookie when the user returns. But, what would happen when the user views one of the records that they previously rated? In addition to pulling the user's rating from the database you would have to also check the cookie to see if any of the items being displayed on the page have already been rated. The same applies as the user navigates from page to page. While showing records and the user's rating you have to pull from the database and the cookie/session - otherwise it would seem as if the user's rating was not saves (which it hasn't been yet).

 

Lastly, many/most applications with a rating system show the cumulative rating of items. So, if 10 people rated an item at 5 and another 10 rated it at 4, a cumulative rating of 4.5 would be displayed to users viewing that record. By not saving the ratings as people make them, this would rarely be accurate. You could include a single user's unsaved ratings in the calculation when displaying a page (again, more complexity) but it would be very, very difficult to take into account other users' unsaved rankings for that record.

 

Let's take YouTube as an example. Say you post a new video. As people view your video and rate it those rating would not get updated until the user's happen to hit whatever magical number of ratings you have defined. They may not even hit that number in that day. So, there could be hundreds or thousands of ratings that go unsaved for a long period. Of course, you could implement a timer within the JavaScript code to save all unsaved ratings after X minutes. But, that again adds more complexity and room for errors.

The star rating is an example.  It also applies to marking items as favorites and such.  The system I'm writing has the potential to be rather big and I'm trying to find the best method early on.  Plus I want it to be as peppy as I can make it.  Using batch functionality for so many write statements is supposedly quite a bit faster than doing them one at a time.  From what I've read though I can append to a text file (people say it's faster than DB writes) and mysql has an import function from text that's supposedly pretty peppy.  My goal is to have it where the system automatically applies the action after x number of write statements (or when the file size becomes too big), or after so many minutes (whatever comes first).  This system can obviously be adjusted depending on the need.  Plus with a text file I have some form of integrity incase the system goes down before the write statements are done. 

The star rating is an example.  It also applies to marking items as favorites and such.  The system I'm writing has the potential to be rather big and I'm trying to find the best method early on.  Plus I want it to be as peppy as I can make it.  Using batch functionality for so many write statements is supposedly quite a bit faster than doing them one at a time.  From what I've read though I can append to a text file (people say it's faster than DB writes) and mysql has an import function from text that's supposedly pretty peppy.  My goal is to have it where the system automatically applies the action after x number of write statements (or when the file size becomes too big), or after so many minutes (whatever comes first).  This system can obviously be adjusted depending on the need.  Plus with a text file I have some form of integrity incase the system goes down before the write statements are done. 

 

Doing inserts in a batch is much faster than processing them individually. But, that is typically the solution to use when you already have a number of records up-front. With what you are proposing, you don't know how many or how quickly the user will be making selections. Users aren't going to be clicking multiple ratings each second such that it should be an issue. You say this has the "potential" to be big. Well, if it does become big you should then have the capital and the resources to hire experienced developers and architects to recode the application to scale for the number of users. It's fine to try and look for potential problems, but trying to solve all the potential problems before you even see that there is one will only increase the time of getting something to production and, worse, can cause you to implement so much complexity that the application is buggy. It is better to identify these potential problems and identify how you will measure the activity to determine when you would need to react (e.g. monitor DB activity). In fact, the best solution might not to be creating some workaround in the code, but might be to scale the hardware to support the load.

 

But, since you seem to have already made up your mind, I'm not sure why you asked. Anyway, let's say you go with the method of saving the data in text files. You still have the problems to overcome that I raised previously. If a user rates an item, leaves the page and comes back how do you plan to show the user that they have already rated the item? You are now going to have to do the DB operation to get the item data along with any saved ratings by the user - then you need to go see if there is a text file for the user and parse that to see if the user has already rated the item. Only then can you output the page and show the correct details. Also, if you are only going to process the file after X number of ratings, what happens if the user only rated X-1 items and then leaves? Are you going to store that file indefinitely? If the user's ratings don't get saved till they come back then they won't be calculated in the cumulative ratings for those items. You will have lots of records where the cumulative ratings are not accurate since user left with some records not being saved.

 

Plus, you have to contend with potential data corruption since doing this through flat-files is much more likely to have problems than using the database. If a user was to click two items too close together one write operation may not complete before the other starts. You would have to implement some sort of process to 'lock' the files while they are being written to so that if another request is made that will edit the same file it will wait until the first one completed.

Don't have my mind really made up as much as while I was typing the original post I started to think of something (which sometimes happens).  Really wasn't married to the concept. I thought about some of the stuff you were talking about but I agree with you it's more headache than it's worth.  Thanks!

IF you wanted to somehow queue the inserts in order to have a batch insert what I would likely do is use something like MemcacheD as the temporary storage. The only place you could store something that would possibly be quicker than just sticking into the DB directly is in RAM, memcache provides tools for this.

 

The code for saving a rating would be something like:

$ratingQueue=$mc->get('ratingQueue', null, $cas);
if ($mc->getResultCode()!==Memcached::RES_SUCCESS){
   //Error handling
}

$ratingQueue[] = 'blah'; //Add something to the queue
if ($mc->cas($cas, 'ratingQueue', $ratingQueue) !== true){
   //Error handling
}
Then you'd have a background process or cron job run periodically to flush the queue to the database

while (true){
   $ratingQueue = $mc->get('ratingQueue', null, $cas);
   if ($mc->getResultCode() !== Memcached::RES_SUCCESS){
       //Error handling
   }
   $mc->cas($ca, 'ratingQueue', array());

   //Bulk insert $ratingQueue

   //Wait a bit.
   sleep(60);
}
As you can see above with my comments, you still need to consider how to handle errors such as a client modifying the queue between the read/write of another client, or connection failures.

 

The database will already be giving you pretty fast inserts so the extra overhead of something like the above is most likely not worth the hassle of creating it and making sure it works without issues. Just submit your data to the database immediately and don't worry about a queue system unless real-world testing seems to show a need for it.

Thanks for the reply.  I had considered memcache and I plan to use it for other activities.  However, I am probably overthinking things here.  I tend to do that.  It's probably best for me just to get the thing running and then I can always play around with concepts later on if necessary. 

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.