Please help me with a MySQL query..

mailjol · February 27, 2008

I have MySQL 4.0.x and a 2 GB table called "people" with approx 1200000 rows in it having columns:

id (PRIMARY AUTO INCREMENT)

name

email

group_id VARCHAR( (INDEXED)

address

data

reg_time (INDEXED)

ip

status

phone

There are approximately 14000 different group ids. I am facing a problem with a particular group which has approximately 25000 members. To search 300 members of that group starting from offset 15000 i do this query:

SELECT `name` FROM `people` WHERE `group_id` = 'b67k219f' LIMIT 15000, 300

This takes approximately 35 seconds before MySQL returns the records even though the column group_id has a INDEX on it.

To check further I created a separate table "people_idx" with just two columns "id" & "group_id" from the main table. Now I use following PHP to fetch names of 300 people starting offset 15000 from that group:

$result1 = mysql_query("SELECT id FROM people_idx WHERE group_id = 'b67k219f' LIMIT 15000, 300");
while(list($id) = mysql_fetch_row($result1))
{
$result2 = mysql_query("SELECT `name` FROM people WHERE id = $id LIMIT 1");
echo mysql_result($result2, 0)."\n";
}

Even though this performs 300+1 different queries it returns back in less than a second. Now I tried this join:

SELECT people.name FROM people, people_idx WHERE people_idx.group_id = 'b67k219f' AND people.id = people_idx.id LIMIT 15000, 300

This also takes more than 30 seconds. Now I am confused about what I am doing wrong and which approach should I use. Please anyone help me out.

Thanks in advance.

Regards,

Akash

aschk · February 27, 2008

Use EXPLAIN in front of those queries and post the output here so we can analyse the process that the pathing algorithm it choosing.

mailjol · February 27, 2008

SORRY!! THERE WERE FEW ERRORS IN MY PREVIOUS REPLY. HERE IS THE CORRECT OUTPUT:

For querying the main table:

SELECT `name` FROM `people` WHERE `group_id` = 'b67k219f' LIMIT 15000, 300

the output of EXPLAIN is:

table  	 type  	 possible_keys  	 key  	 key_len  	 ref  	 rows  	 Extra
people  	ref  	id  	id  	10  	const  	21132  	Using where

For querying the test table:

SELECT `id` FROM `people_idx` WHERE `group_id` = 'b67k219f' LIMIT 15000, 300

the output of EXPLAIN is:

table  	 type  	 possible_keys  	 key  	 key_len  	 ref  	 rows  	 Extra
people_idx  	ref  	group_id  	group_id  	10  	const  	19010  	Using where

And for the JOIN query:

SELECT people.name FROM people, people_idx WHERE people_idx.group_id = 'b67k219f' AND people.id = people_idx.id LIMIT 15000, 300

the output of EXPLAIN is:

table  	 type  	 possible_keys  	 key  	 key_len  	 ref  	 rows  	 Extra
people_idx 	ref 	group_id 	group_id 	10 	const 	19010 	Using where
people 	eq_ref 	PRIMARY 	PRIMARY 	3 	people_idx.id 	1 	Using where

Thanks..

aschk · February 27, 2008

Looking at the 1st and 2nd results from the above, it appears odd that the numbers of rows differs from 21132 to 19010 as they should be producing the same results.

However, looking at your 1st query it appears that it's doing a select based on the id and NOT the group_id, implying there is no INDEX on group_id. Can you verify this?

mailjol · February 27, 2008

Looking at the 1st and 2nd results from the above, it appears odd that the numbers of rows differs from 21132 to 19010 as they should be producing the same results.

However, looking at your 1st query it appears that it's doing a select based on the id and NOT the group_id, implying there is no INDEX on group_id. Can you verify this?

Don't go by the number of records as the main table "people" is constantly growing while "people_idx" was generated just to do these tests. Both tables have a index on group_id and the select is done based on group_id only. You can see all the WHERE clauses. "people_idx" is just a smaller version of "people" with just two columns.

mailjol · February 27, 2008

Looking at the 1st and 2nd results from the above, it appears odd that the numbers of rows differs from 21132 to 19010 as they should be producing the same results.

However, looking at your 1st query it appears that it's doing a select based on the id and NOT the group_id, implying there is no INDEX on group_id. Can you verify this?

The explain output still has few errors as I had edited it a little bit to make it viewable and by mistake changed "group_id" to "id".

luca200 · February 28, 2008

It's not the answer you need, but I don't believe it makes a great sense to use LIMIT without ORDER BY....

aschk · February 28, 2008

Good point well made there Luca. Indeed if you don't use ORDER BY there is no guarantee about which order the results will come back in. So you might do 15000, 300 , and then 15300, 300 and find it gives you some of the same result

mailjol · February 29, 2008

Even with ORDER BY its so slow. I am now tearing my hairs apart. I don't know why Indexes work when I use them from a separate table having just two columns taken from the main table using JOIN and don't work when doing a straightforward query on the main table which has these two columns INDEXED.

fenway · February 29, 2008

I just looked at the query.. and you're using a HUGE offset.

mailjol · February 29, 2008

Yes thats right. 15000 is a big offset but it can be even larger that. But don't you think a 30+ second delay is still not justified if MySQL is using indexes. The same query returns instantly if a create a similar table with same number of rows but only these two columns (one which is queried upon and the other which is being fetched).

fenway · February 29, 2008

Have you ANALYZE / OPTIMIZE tables recently?

mailjol · February 29, 2008

Yes I did "OPTIMIZE" it quite a few times during these tests. Sometimes it responded "Table is already up to date".

fenway · February 29, 2008

And ANALYZE too (since this deals with indexes directly)?

Since you can recreate these data in another table with the same indexes (accordingly to your earlier post), it's hard to see what else is the issue here.

mailjol · February 29, 2008

I have somehow reduced the delay to 1-3 seconds by using a covered index on columns which are frequently queried for. This returns the data from index itself without touching 1 GB table.

Sign In

Please help me with a MySQL query..

Recommended Posts

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Archived

Important Information