Jump to content

Preventing near simultaneous queries from returning duplicate same results


Recommended Posts

Ahoi Freaks,

 

MySQL 5.0.51a-3

I have a rather large "job" table (1.8+ million entries) that has a primary auto_incremented key (id) and a unique key (name).  This table is accessed via a web-service from mulitple nodes requesting jobs in rapid fire succession.  Ideally it would work like this:

1. web-service makes the SELECT for node

2. web-service updates the status of the row from "ready" to "inprocess"

3. node crunches away

4. node returns results to web-service

5. web-service stores results, updates the job status back to "ready"

 

As the table grows, I notice that the web-service can take long enough to select and update an assigned job from ready to improcess to allow multiple nodes to select the same "next available" job...

 

I realize there is no way to execute a simultaneous SELECT and UPDATE, so what would be the best practice to prevent duplicate results from near simultaneous queries?

 

Try to update a row as improcess that matches my select criteria before it is selected, then use the id of the affected row to select?  Any comments, ideas... tasty brain food recipes?

It would take a specific example of your queries to directly help, but two possibilities are -

 

1) If the SELECT/UPDATE queries are on the same table, you can usually just perform an UPDATE query with an appropriately crafted WHERE clause and then test to make sure that the row was actually updated, or

2) You would need to lock the table(s) to insure that the values did not change between queries.

Thanks for the reply!

Yes, this what the direction I was leaning as well since it is the same table.  I would first update the row, setting the "status" to the id of the node, then just select the row that was updated.  I will post if this solved the issue.

Yes, this seems to be working as expected, thanks again!  To recap:

first selecting a row, then update the selected row to prevent it being selected again causes problems with the same row being selected with large tables and/or high traffic requests.  To prevent this:

 

First:  UPDATE a single row that matches the desired WHERE clause (LIMIT 1).  In my case I updated the status from "ready" to unique node_id that made the request

Second: use mysql_affected_rows() to verify that a row was updated

Third: SELECT the row that was UPDATED, WHERE status = node_id

 

BAM, works like a charm.

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.