Jump to content

Recommended Posts

Im having some trouble with a big table I have

 

It has 4.7 million records and is heavily accessed

 

It has the following fields

 

id = PRIMARY

user = varchar25 INDEX

friend = varchar25 INDEX

timestamp = int12

gender = varchar6

x = tinyint

real = enum 0,1

 

I cant prune redundant records because.. well basically there is none

 

Running SELECT count(id) FROM table GROUP BY user takes 5 minutes+

I deleted an index on the x colum and that took 28 minutes to complete (ALTER TABLE)[/b]

 

What options do I have in circumstances of naturally large tables

 

 

The example of SELECT count(id) FROM table GROUP BY user above, is there anything I can do to get something

like that complete after 5 seconds?

Link to comment
https://forums.phpfreaks.com/topic/119285-large-table-and-query-time/
Share on other sites

on my website I never use the id field

 

however, in the example I gave you above of a bad query I used the id field leading you to believe i need a covering index on it

 

Why are we talking about imaginary queries?  I still don't understand.

Why are you using a group by?

 

That makes no sense.

 

SELECT COUNT(*) FROM my_table; will return the same total number.

 

How are you defining which group a given count belongs to?

 

I'd suggest creating an index:

 

ALTER TABLE `my_table` ADD INDEX ( `user` ( 25 ) ) ;

 

Reduced the count query time from 1m 32s to ~20sec. I'm guessing your db server isn't a terribly beefy machine.

 

As a side note, I forced in double the amount of records you have, all with random data.

You could run optimize periodically:

 

http://dev.mysql.com/doc/refman/5.0/en/optimize-table.html

 

and:

 

http://dev.mysql.com/doc/refman/5.0/en/analyze-table.html

 

 

COUNT(*) is optimized to return very quickly if the SELECT retrieves from one table, no other columns are retrieved, and there is no WHERE clause.

 

This optimization applies only to MyISAM tables only, because an exact row count is stored for this storage engine and can be accessed very quickly. For transactional storage engines such as InnoDB and BDB, storing an exact row count is more problematic because multiple transactions may be occurring, each of which may affect the count.

I dont think optimize will help as they are INNODB tables thus zero/little overhead. I've only experienced fregmented tables with MyIsam

 

If ran optimize on the 4.7 million rows table it would take about 10-20 minutes

 

I already have an index on user

 

SELECT count(*) FROM table GROUP BY user

This is an example of a query that is taking 400+ seconds to complete. Ignore the relivence of it, as its only an example and not something I ever use, but if I can find out why a simple query like this takes so long..

His next best option is to switch to MySQL 5.1 and start making use of partitions.

 

That's HUGE.

 

I tried his table layout on my own server with partitions and ~20M rows.

 

Queries that took 2-3 minutes without partitions took ~5-10 seconds tops with.

 

Lots of solutions to this issue. count() on an innodb table is just bad news.

 

A summary table wouldn't be a bad idea at all.. just make a single field table with a value you add/subtract from based upon the addition or subtraction of rows.

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.