[SOLVED] optimization with EXPLAIN SELECT

misc00500 · May 17, 2007

greenhorn database designer with newbie question. In the query returned by "explain select" I received an TYPE:"ALL" for the 1st table. As I understand it, this is something that needs to be corrected. But I'm having trouble getting rid of it even by creating extra indexes to help with the JOIN. here are simplified versions of tables in question.

/********************** table property *************************/

property (

id INT NOT NULL AUTO_INCREMENT,

address char(50)

)

/********************** table deals *************************/

deals (

property_id INT,

deal_type_id INT,

sales_pitch TEXT,

price INT,

INDEX (sales_pitch(10), price),

PRIMARY KEY (property_id, deal_type_id)

)

/********************** query *************************/

EXPLAIN

SELECT

deals.property_id,

deals.price,

deals.sales_pitch,

property.address

FROM

deals LEFT JOIN property

ON

deals.property_id = property.id

/********************************************************/

What baffles me the most is that if I leave out (deals.sales_pitch) in the SELECT statement, then EXPLAIN returns a TYPE:"INDEX". But if i have deals.sales_pitch under SELECT statement EXPLAIN returns a TYPE:"ALL". Can someone please help me understand why this happens, when i have already created a multi-column index containing both (sales_pitch(10), price). THanks very much in advance. Also i would like to know if TYPE:"INDEX" is the best thing one can hope for in this type of query. THank you

btherl · May 17, 2007

Can you show us your actual explain output?

Your query only has one condition, and that is the join condition. In this case, I would expect the left table to be scanned entirely, followed by joining with the right table. I would not expect an index scan on the left table, because there is no point fetching an index when you want every row.

If you added a WHERE condition affecting the left table (deals) , then it's possible that an index would be the best plan.

misc00500 · May 17, 2007

Hey thanks alot for taking the time to read my post.

I see your logic, btherl. And that is what I thought as well. But I read in a e-book by mysql press that states the following :

TYPE - ALL

A full table scan is done for each combination of rows from the previous tables. This is normally not good if the table is the first table not marked const, and usually very bad in all other cases. Normally, you can avoid ALL by adding indexes that allow row retrieval from the table based on constant values or column values from earlier tables.

The following are links to actual EXPLAIN output

http://www.polaris4homes.com/query1.jpg

http://www.polaris4homes.com/query2.jpg

misc00500 · May 17, 2007

here's an excerpt from Apress the definitive guide from mysql 5 -

"ALL" means that for each record in the next-higher table in the hierarchy, all records of this table must be read.

So does that mean in my case - for each row in table 2 (property) a full scan will have to be done on table 1(deals). ie if i have 5 rows in table property, and 3 rows in deals, then the query will have to scan deals table 5 times going through 3 rows each time ?

sorry if i can't explain it well... . but yeah, I really appreciate your help.

bubblegum.anarchy · May 17, 2007

I get conflicting results.

The following table definitions and record inserts:

DROP TABLE IF EXISTS property;
CREATE TABLE property (
     idINT NOT NULL AUTO_INCREMENT primary key,
     address CHAR(50)
);

INSERT INTO property (address) VALUES ('123 Fake Street, Faketown');

DROP TABLE IF EXISTS deals;
CREATE TABLE deals (
     property_id INT,
     deal_type_id INT,      
     sales_pitch TEXT,   
     price INT,
     INDEX (sales_pitch(10), price),
     PRIMARY KEY (property_id, deal_type_id)
);

INSERT INTO deals (property_id, deal_type_id, sales_pitch, price) VALUES
     (1, 1, 'buy buy buy', 100000),
     (1, 2, 'hurry hurry hurry', 100000),
     (1, 3, 'make an offer', 100000),
     (1, 4, 'hurry last days', 100000),
     (1, 5, 'once in a lifetime', 100000),
     (1, 6, 'golf course nearby', 100000),
     (1, 7, 'beach side property', 100000);

The following explain:

EXPLAIN 
SELECT deals.property_id
, deals.price
#	, deals.sales_pitch
#	, property.address
FROM deals 
LEFT JOIN property ON deals.property_id = property.id

results:

id  select_type  table     type    possible_keys  key      key_len  ref                     rows  Extra
1   SIMPLE       deals     ALL     NULL           NULL     NULL     NULL                    7
1   SIMPLE       property  eq_ref  PRIMARY        PRIMARY  4        test.deals.property_id  1     Using index

With only the index property_id in the SELECT values resulted in the type `index`:

EXPLAIN
SELECT deals.property_id
#    , deals.price
#    , deals.sales_pitch
#    , property.address
FROM deals 
     LEFT JOIN property ON deals.property_id = property.id;

resulted in:

id  select_type  table     type    possible_keys  key      key_len  ref                     rows  Extra
1   SIMPLE       deals     index   NULL           PRIMARY  8        NULL                    7     Using index
1   SIMPLE       property  eq_ref  PRIMARY        PRIMARY  4        test.deals.property_id  1     Using index

btherl · May 17, 2007

What about this:

"It is possible that key will name an index that is not present in the possible_keys value. This can happen if none of the possible_keys indexes are suitable for looking up rows, but all the columns selected by the query are columns of some other index. That is, the named index covers the selected columns, so although it is not used to determine which rows to retrieve, an index scan is more efficient than a data row scan."

http://dev.mysql.com/doc/refman/5.1/en/explain.html

It may be deciding that since all data is available in an index, it's finding the data with a "full index scan" instead.

misc00500 · May 17, 2007

thanks alot bubblegum for taking the time to replicate the table structure. really do appreciate it. But yes, my results are the same as yours. I'm sorry if I didn't explain myself right in previous posts. My question is why whenever I SELECT ( deals.sales_pitch ) do I get TYPE:ALL. But when I SELECT ( deals.price ) only, I get TYPE:INDEX. What's confusing me is that table deals' INDEX(sales_pitch(10), price) should allow both to be SELECT with TYPE:INDEX shouldn't it ? I also deleted the INDEX(sales_pitch, price) and did SELECT on deals.price, and sure enough I get TYPE:ALL. So I was suspecting that maybe INDEXES works differently for TEXT data types ( which is what deals.sales_pitch is ). THanks in advance.

btherl · May 17, 2007

An index on sales_pitch(10) means only the first 10 characters are in the index. So that's no use for fetching the data from the index.. it will have to scan the table itself.

Thanks misc, I learnt something today

misc00500 · May 17, 2007

I think you are right btherl, I found this example here http://dev.mysql.com/doc/refman/5.0/en/explain.html that supports what you said in your first post.

mysql> ALTER TABLE tt MODIFY ActualPC VARCHAR(15);

Now tt.ActualPC and et.EMPLOYID are both VARCHAR(15). Executing the EXPLAIN statement again produces this result:

table type possible_keys key key_len ref rows Extra

tt ALL AssignedPC, NULL NULL NULL 3872 Using

ClientID, where

ActualPC

do ALL PRIMARY NULL NULL NULL 2135

range checked for each record (key map: 1)

et_1 ALL PRIMARY NULL NULL NULL 74

range checked for each record (key map: 1)

et eq_ref PRIMARY PRIMARY 15 tt.ActualPC 1

This is not perfect, but is much better: The product of the rows values is less by a factor of 74. This version executes in a couple of seconds.

A second alteration can be made to eliminate the column length mismatches for the tt.AssignedPC = et_1.EMPLOYID and tt.ClientID = do.CUSTNMBR comparisons:

mysql> ALTER TABLE tt MODIFY AssignedPC VARCHAR(15),

-> MODIFY ClientID VARCHAR(15);

After that modification, EXPLAIN produces the output shown here:

table type possible_keys key key_len ref rows Extra

et ALL PRIMARY NULL NULL NULL 74

tt ref AssignedPC, ActualPC 15 et.EMPLOYID 52 Using

ClientID, where

ActualPC

et_1 eq_ref PRIMARY PRIMARY 15 tt.AssignedPC 1

do eq_ref PRIMARY PRIMARY 15 tt.ClientID 1

At this point, the query is optimized almost as well as possible. The remaining problem is that, by default, MySQL assumes that values in the tt.ActualPC column are evenly distributed, and that is not the case for the tt table. Fortunately, it is easy to tell MySQL to analyze the key distribution:

btherl · May 17, 2007

I think you are right btherl, I found this example here http://dev.mysql.com/doc/refman/5.0/en/explain.html that supports what you said in your first post.

In my first post or last post? It looks somewhat relevant to both

My first post wasn't quite right because I didn't realize that mysql could fetch all the column data from an index, if a suitable index was found. So yes, it must fetch all rows, but it can choose either full table scan or "full index scan", if a suitable index can be found. Restricted indexes like sales_pitch(10) are no good because the index only contains partial data.

Regarding what the book said about ALL usually being bad.. it did say usually. In most queries, it's bad. But for your particular query, ALL is what you would expect, unless there happens to be a matching index for the columns you are fetching.

misc00500 · May 17, 2007

ah.. thanks for clearing that up I hated when I get stuck on something and can't work past it. So do you think if I expand the restrictive index

INDEX (sales_pitch(10) ) to maximum allowed for TEXT data types of INDEX (sales_pitch(255)) it would change the TYPE:ALL into TYPE:INDEX ? If not, could you possibly suggest any other solution ? Thanks alot for helping out

misc00500 · May 17, 2007

I just changed INDEX to sales_pitch(255) and it didn't work. But it's cool, I'm happy with it. thanks again.

btherl · May 21, 2007

Hmm.. Based on the example you posted, you will need to restrict your data type in the table definition as well as in the index to the same value. Only then will mysql know that the index contains all needed data.

Text actually has a much greater range than 255 characters.. it's only a text index that's limited to 255 characters. So you must limit the table data to be no larger than the index length if you want the index to contain all the data.

I wouldn't recommend doing that though. "ALL" is not so bad, when you're only reading 3 integers and a text column in each row. Putting the entire ext column into an index won't help all that much, since the index will be nearly as big as the table itself.

Sign In

[SOLVED] optimization with EXPLAIN SELECT

Recommended Posts

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Archived

Important Information