Jump to content

MySql Performance - Static Counts vs Dynamic Counts


jokerofacoder

Recommended Posts

I'm designing a large scale application and I'm wondering about the performance of it.

 

In the code, I return several result sets.

For an example:

The car result set has carimages count, car comments count, car buyers count which are subqueries.

 

Here is a sample query:

SELECT *, (SELECT COUNT(1) FROM car_image WHERE carid=id) as imagecount, (SELECT COUNT(1) FROM car_comments where carid=id) as carcommentscount, (SELECT COUNT(1) FROM car_buyers WHERE carid = id) FROM cars

 

If everything is index properly and it's on a fast server (dual opteron, mysql 5, 4gigs ram), how long would it take to run on 20 million rows with 100-500-concurrent users.

 

I wanted to keep static counts but it was too much work and overhead.

Link to comment
Share on other sites

But is this efficient vs static counts?

 

Static counts are columns in each table that contain the counts so you don't have to run subqueries to get the counts. The problem is, sometimes..these counts can be off etc.

 

Dynamic counts is when you run subqueries to get the counts etc.

 

On a heavy user site, how would the performance be?

 

I talked to several DBA gurus and they all said that running 3 subqueries for each record in the car table is NOTHING for mysql because if INDEXED properly, MySql can run the query in 0.00003 seconds on a table with 10 million rows. I benchmarked it 10 million times (using Benchmark), and it took 0.005 seconds. When I got to 100 Million times (mysql-benchmark), it took 2 seconds. This is not accurate because mysql-benchmark executes the query 10 million times...NOT AGAINST 10 million rows.

Link to comment
Share on other sites

That's a very nice idea HOWEVER..in the application I'm designing, the counts are going to be permission based. In fact, for each row in the subquery, a function is executed which filters out the count number. Also, the maintenance of this idea maybe high and the application itself contains a forum as well.

 

In your experience against using indexes, how do they fair off against multiple queries against large tables?

Link to comment
Share on other sites

Permission-based? Your query didn't seem to indicate this... I'm confused.  As for index performance, provided that the index is selective enough, large tables shouldn't be the rate-limiting step.

Link to comment
Share on other sites

I posted this topic yesterday but I felt that I did not clarify it properly. So I'll repost the clarified one as a new post (hope you don't mind).

 

I'm building a large scale application and was wondering the difference between static counts vs dynamic counts.

 

Right now, I'm using dynamic counts. For an example:

 

SELECT *,

(SELECT COUNT(1) FROM CarImages WHERE carid = carid) as ImageCount,

(SELECT COUNT(1) FROM CarBuyers WHERE carid = id) as BuyerCount,

(SELECT COUNT(1) FROM CarSellers WHERE carid = id) as SellerCount,

FROM Cars WHERE CarCondition > 2 AND IsCarHasGoodRecord(carid);

 

Inside IsCarHasGoodRecord function:

function IsCarHasGoodRecrd(carid INT) RETURNS BOOLEAN

BEGIN

  DECLARE GoodCondition INT; DECLARE CarMaxConditionId INT;

  SELECT 1 INTO CarMaxConditionId  FROM CarMaxConditions WHERE carid = carid;

  SELECT 1 INTO GoodCondition FROM KellyBlueBookConfirm WHERE carid = carid AND carmaxid = CarMaxConditionId;

 

--If the current "car row" is good condition, return true so that the car will be included in the result set

    if GoodCondition  = 1 THEN

      RETURN TRUE;

    ELSE

    RETURN FALSE;

    END IF;

END$;

 

What happens is that for each car row, carimage count, car buyer count, car seller count is retrieved. Then the car id is passed to the IsCarHasGoodRecrd to find out if the car has a good record and whether to include it in the result set. As you can tell from the query above, about 3 correlated subqueries are executed in the main query, and 2 other correlate subqueries are returned in the function...totaling 5 correlated  subqueries for each row in car table.

 

 

Now, if everything is properly INDEXED, on a high-end server (2x dual opteron, 4gigs ram, 2x 250 gig)...with a large concurrent userbase and 20 million rows in all tables......what is performance going to be like? I never dealt with such databases or userbase so I don't know what to expect.

 

I was thinking of using static counts (where the there is a column for each count in the parent table) but it's too much maintenance and i would have to use transactions everytime i update it to make sure the counts are not off.

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.