Jump to content

[SOLVED] optimization with EXPLAIN SELECT


misc00500

Recommended Posts

greenhorn database designer with newbie question.  In the query returned by "explain select" I received an TYPE:"ALL" for the 1st table.  As I understand it, this is something that needs to be corrected.  But I'm having trouble getting rid of it even by creating extra indexes to help with the JOIN.  here are simplified versions of tables in question.

 

/**********************  table property *************************/

 

property (

 

  id         INT NOT NULL AUTO_INCREMENT,

  address    char(50)

)

 

 

/**********************  table deals *************************/

 

deals (

 

  property_id INT,

  deal_type_id INT,

 

  sales_pitch TEXT,

  price INT,

 

  INDEX (sales_pitch(10), price),

  PRIMARY KEY (property_id, deal_type_id)

)

 

/**********************  query *************************/

 

EXPLAIN

 

SELECT

 

deals.property_id,

deals.price,

deals.sales_pitch,

 

property.address

 

FROM

 

deals LEFT JOIN property

 

ON

 

deals.property_id = property.id

 

 

/********************************************************/

 

What baffles me the most is that if I leave out (deals.sales_pitch) in the SELECT statement, then EXPLAIN returns a TYPE:"INDEX".  But if i have deals.sales_pitch under SELECT statement EXPLAIN returns a TYPE:"ALL".  Can someone please help me understand why this happens, when i have already created a multi-column index containing both (sales_pitch(10), price).  THanks very much in advance.  Also i would like to know if TYPE:"INDEX" is the best thing one can hope for in this type of query.  THank you

Link to comment
Share on other sites

Can you show us your actual explain output?

 

Your query only has one condition, and that is the join condition.  In this case, I would expect the left table to be scanned entirely, followed by joining with the right table.  I would not expect an index scan on the left table, because there is no point fetching an index when you want every row.

 

If you added a WHERE condition affecting the left table (deals) , then it's possible that an index would be the best plan.

Link to comment
Share on other sites

Hey thanks alot for taking the time to read my post.

 

I see your logic, btherl.  And that is what I thought as well.  But I read in a e-book by mysql press that states the following :

 

 

TYPE - ALL

 

A full table scan is done for each combination of rows from the previous tables. This is normally not good if the table is the first table not marked const, and usually very bad in all other cases. Normally, you can avoid ALL by adding indexes that allow row retrieval from the table based on constant values or column values from earlier tables.

 

 

The following are links to actual EXPLAIN output

 

http://www.polaris4homes.com/query1.jpg

http://www.polaris4homes.com/query2.jpg

 

 

 

Link to comment
Share on other sites

here's an excerpt from Apress the definitive guide from mysql 5 -

 

"ALL" means that for each record in the next-higher table in the hierarchy, all records of this table must be read.

 

So does that mean in my case - for each row in table 2 (property) a full scan will have to be done on table 1(deals).  ie if i have 5 rows in table property, and 3 rows in deals, then the query will have to scan deals table 5 times going through 3 rows each time ?

 

sorry if i can't explain it well... . but yeah, I really appreciate your help.

Link to comment
Share on other sites

I get conflicting results.

 

The following table definitions and record inserts:

 

DROP TABLE IF EXISTS property;
CREATE TABLE property (
     idINT NOT NULL AUTO_INCREMENT primary key,
     address CHAR(50)
);

INSERT INTO property (address) VALUES ('123 Fake Street, Faketown');

DROP TABLE IF EXISTS deals;
CREATE TABLE deals (
     property_id INT,
     deal_type_id INT,      
     sales_pitch TEXT,   
     price INT,
     INDEX (sales_pitch(10), price),
     PRIMARY KEY (property_id, deal_type_id)
);

INSERT INTO deals (property_id, deal_type_id, sales_pitch, price) VALUES
     (1, 1, 'buy buy buy', 100000),
     (1, 2, 'hurry hurry hurry', 100000),
     (1, 3, 'make an offer', 100000),
     (1, 4, 'hurry last days', 100000),
     (1, 5, 'once in a lifetime', 100000),
     (1, 6, 'golf course nearby', 100000),
     (1, 7, 'beach side property', 100000);

 

The following explain:

 

EXPLAIN 
SELECT deals.property_id
, deals.price
#	, deals.sales_pitch
#	, property.address
FROM deals 
LEFT JOIN property ON deals.property_id = property.id

 

results:

id  select_type  table     type    possible_keys  key      key_len  ref                     rows  Extra
1   SIMPLE       deals     ALL     NULL           NULL     NULL     NULL                    7
1   SIMPLE       property  eq_ref  PRIMARY        PRIMARY  4        test.deals.property_id  1     Using index   

 

With only the index property_id in the SELECT values resulted in the type `index`:

 

EXPLAIN
SELECT deals.property_id
#    , deals.price
#    , deals.sales_pitch
#    , property.address
FROM deals 
     LEFT JOIN property ON deals.property_id = property.id;

 

 

resulted in:

 

id  select_type  table     type    possible_keys  key      key_len  ref                     rows  Extra
1   SIMPLE       deals     index   NULL           PRIMARY  8        NULL                    7     Using index
1   SIMPLE       property  eq_ref  PRIMARY        PRIMARY  4        test.deals.property_id  1     Using index 

Link to comment
Share on other sites

What about this:

 

"It is possible that key will name an index that is not present in the possible_keys value. This can happen if none of the possible_keys indexes are suitable for looking up rows, but all the columns selected by the query are columns of some other index. That is, the named index covers the selected columns, so although it is not used to determine which rows to retrieve, an index scan is more efficient than a data row scan."

 

http://dev.mysql.com/doc/refman/5.1/en/explain.html

 

It may be deciding that since all data is available in an index, it's finding the data with a "full index scan" instead.

Link to comment
Share on other sites

thanks alot bubblegum for taking the time to replicate the table structure.  really do appreciate it.  But yes, my results are the same as yours.  I'm sorry if I didn't explain myself right in previous posts.  My question is why whenever I SELECT ( deals.sales_pitch ) do I get TYPE:ALL.  But when I SELECT ( deals.price ) only, I get TYPE:INDEX.  What's confusing me is that table deals' INDEX(sales_pitch(10), price) should allow both to be SELECT with TYPE:INDEX shouldn't it ?  I also deleted the INDEX(sales_pitch, price) and did SELECT on deals.price, and sure enough I get TYPE:ALL.  So I was suspecting that maybe INDEXES works differently for TEXT data types ( which is what deals.sales_pitch is ).   THanks in advance.

Link to comment
Share on other sites

An index on sales_pitch(10) means only the first 10 characters are in the index.  So that's no use for fetching the data from the index.. it will have to scan the table itself.

 

Thanks misc, I learnt something today :)

Link to comment
Share on other sites

I think you are right btherl, I found this example here http://dev.mysql.com/doc/refman/5.0/en/explain.html that supports what you said in your first post.

 

 

 

mysql> ALTER TABLE tt MODIFY ActualPC VARCHAR(15);

 

Now tt.ActualPC and et.EMPLOYID are both VARCHAR(15). Executing the EXPLAIN statement again produces this result:

 

table type   possible_keys key     key_len ref         rows    Extra

tt    ALL    AssignedPC,   NULL    NULL    NULL        3872    Using

            ClientID,                                         where

            ActualPC

do    ALL    PRIMARY       NULL    NULL    NULL        2135

     range checked for each record (key map: 1)

et_1  ALL    PRIMARY       NULL    NULL    NULL        74

     range checked for each record (key map: 1)

et    eq_ref PRIMARY       PRIMARY 15      tt.ActualPC 1

 

 

This is not perfect, but is much better: The product of the rows values is less by a factor of 74. This version executes in a couple of seconds.

A second alteration can be made to eliminate the column length mismatches for the tt.AssignedPC = et_1.EMPLOYID and tt.ClientID = do.CUSTNMBR comparisons:

 

 

mysql> ALTER TABLE tt MODIFY AssignedPC VARCHAR(15),

   ->                MODIFY ClientID   VARCHAR(15);

 

 

After that modification, EXPLAIN produces the output shown here:

table type   possible_keys key      key_len ref           rows Extra

et    ALL    PRIMARY       NULL     NULL    NULL          74

tt    ref    AssignedPC,   ActualPC 15      et.EMPLOYID   52   Using

            ClientID,                                         where

            ActualPC

et_1  eq_ref PRIMARY       PRIMARY  15      tt.AssignedPC 1

do    eq_ref PRIMARY       PRIMARY  15      tt.ClientID   1

 

 

At this point, the query is optimized almost as well as possible. The remaining problem is that, by default, MySQL assumes that values in the tt.ActualPC column are evenly distributed, and that is not the case for the tt table. Fortunately, it is easy to tell MySQL to analyze the key distribution:

 

 

 

Link to comment
Share on other sites

I think you are right btherl, I found this example here http://dev.mysql.com/doc/refman/5.0/en/explain.html that supports what you said in your first post.

 

 

In my first post or last post?  It looks somewhat relevant to both :)

 

My first post wasn't quite right because I didn't realize that mysql could fetch all the column data from an index, if a suitable index was found.  So yes, it must fetch all rows, but it can choose either full table scan or "full index scan", if a suitable index can be found.  Restricted indexes like sales_pitch(10) are no good because the index only contains partial data.

 

Regarding what the book said about ALL usually being bad.. it did say usually.  In most queries, it's bad.  But for your particular query, ALL is what you would expect, unless there happens to be a matching index for the columns you are fetching.

Link to comment
Share on other sites

ah..  thanks for clearing that up :)  I hated when I get stuck on something and can't work past it.  So do you think if I expand the restrictive index

INDEX (sales_pitch(10) ) to maximum allowed for TEXT data types of INDEX (sales_pitch(255)) it would change the TYPE:ALL into TYPE:INDEX ?  If not, could you possibly suggest any other solution ?  Thanks alot for helping out

Link to comment
Share on other sites

Hmm.. Based on the example you posted, you will need to restrict your data type in the table definition as well as in the index to the same value.  Only then will mysql know that the index contains all needed data.

 

Text actually has a much greater range than 255 characters.. it's only a text index that's limited to 255 characters.  So you must limit the table data to be no larger than the index length if you want the index to contain all the data.

 

I wouldn't recommend doing that though.  "ALL" is not so bad, when you're only reading 3 integers and a text column in each row.  Putting the entire ext column into an index won't help all that much, since the index will be nearly as big as the table itself.

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.