js_280 Posted August 17, 2007

I'm new to query optimization and seem to have a problem. I have three tables that I am joining:

customer table - 300,000 rows
location table - 305,000 rows
tank table - 170,000 rows

I am joining the tables on two fields each. There are indexes on the joined fields and on every field in the WHERE clause. When I use EXPLAIN I get the information shown on the attached image. Here's the query:

SELECT DISTINCT sqlar001.customerfile.customernumber,
                sqlar001.customerfile.customercompanynumber,
                sqlar001.customerfile.CustomerStatementName,
                sqlar001.locationfile.locationnumber
FROM sqlar001.customerfile
INNER JOIN sqlar001.locationfile
        ON sqlar001.customerfile.customernumber = sqlar001.locationfile.locationcustomernumber
       AND sqlar001.customerfile.customercompanynumber = sqlar001.locationfile.locationcompanynumber
INNER JOIN sqlar001.tankfile
        ON sqlar001.customerfile.customernumber = sqlar001.tankfile.tankchargetocustomer
       AND sqlar001.customerfile.customercompanynumber = sqlar001.tankfile.tankcompanynumber
WHERE sqlar001.tankfile.tankrentaltank = "Y"
  AND sqlar001.locationfile.locationnumber = ""
  AND sqlar001.locationfile.LocationAutoFill != "Y"
  AND sqlar001.customerfile.customercompanynumber NOT IN ("355", "104");

The fields in the WHERE clause are either Y/N fields or simple smallint(3) fields. The query returns approximately 70,000 rows and takes 17 seconds on an Intel server with a 3.2GHz Xeon processor and 4GB RAM. MySQL, PHP and IIS are the only applications running on this server. This query time seems excessive to me. Is there something I can change in the query to optimize it?

[attachment deleted by admin]
js_280 (Author) Posted August 19, 2007

Anyone?
fenway Posted August 21, 2007

I'm confused about the index usage based on the join conditions and the where clause... but != and NOT IN() are really bad for performance.
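For example, the != on a yes/no flag can usually be turned into a positive match. A minimal sketch of the relevant WHERE lines, assuming LocationAutoFill only ever holds "Y", "N", or NULL (an assumption about the data, not something shown in this thread):

WHERE sqlar001.tankfile.tankrentaltank = "Y"
  AND sqlar001.locationfile.locationnumber = ""
  -- != "Y" already skips NULL rows (NULL != "Y" evaluates to NULL, not TRUE),
  -- so if the column only ever holds "Y", "N", or NULL this positive test is
  -- equivalent, and it is something the optimizer can match against an index.
  AND sqlar001.locationfile.LocationAutoFill = "N"
  AND sqlar001.customerfile.customercompanynumber NOT IN ("355", "104");

Whether MySQL will actually use an index on a column with only two or three distinct values is a separate question; comparing EXPLAIN before and after is the only way to know.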
js_280 (Author) Posted August 22, 2007

There is a non-unique index on customerfile.customernumber, locationfile.locationcustomernumber, customerfile.customercompanynumber, locationfile.locationcompanynumber, tankfile.tankcompanynumber, and each column specified in the WHERE clause. If != and NOT IN() are bad for performance, what are the other options besides listing everything I do need? I have 355 stores and simply need to exclude two of them for reporting purposes. Would a sub-select perform better than != ?
fenway Posted August 23, 2007

You could easily exclude those two on the PHP side...
teng84 Posted August 23, 2007

There's no way to make the query fast if you're loading that much data. I've done something similar that loads almost a million records, and whatever I do it still loads slowly.
fenway Posted August 23, 2007

Post the EXPLAIN output for the query without any where clauses.
js_280 (Author) Posted August 23, 2007

15.075 seconds without the where clauses...

SELECT DISTINCT sqlar001.customerfile.customernumber,
                sqlar001.customerfile.customercompanynumber,
                sqlar001.customerfile.CustomerStatementName,
                sqlar001.locationfile.locationnumber
FROM sqlar001.customerfile
INNER JOIN sqlar001.locationfile
        ON sqlar001.customerfile.customernumber = sqlar001.locationfile.locationcustomernumber
       AND sqlar001.customerfile.customercompanynumber = sqlar001.locationfile.locationcompanynumber
INNER JOIN sqlar001.tankfile
        ON sqlar001.customerfile.customernumber = sqlar001.tankfile.tankchargetocustomer
       AND sqlar001.customerfile.customercompanynumber = sqlar001.tankfile.tankcompanynumber;

[attachment deleted by admin]
fenway Posted August 27, 2007

Why DISTINCT? That's causing 150K records to be sorted! Won't all those IDs be unique by nature? Also, what columns are covered by those keys? They look like named indexes... run SHOW INDEXES.
js_280 (Author) Posted August 28, 2007

IDs are unique; however, there can be multiple tanks tied to each location, which adds duplicates.

Indexes (from SHOW INDEXES; all are single-column, non-unique BTREE indexes, collation A, no sub_part or packing):

customerfile:
  comp_index               CustomerCompanyNumber     cardinality 90
  account_index            CustomerNumber            cardinality 296830

locationfile:
  locationIDX              LocationCustomerNumber    cardinality 301336
  productcodeIDX           LocationProductCategory   cardinality 56       (nullable)
  location_company         LocationCompanyNumber     cardinality 91
  location_locationnumber  LocationNumber            cardinality 1969     (nullable)

tankfile:
  tank_cust_index          TankChargeToCustomer      cardinality 169650
  serial_index             TankSerialNo              cardinality 169650   (nullable)
  tank_company             TankCompanyNumber         cardinality 91
  rental_tank              TankRentalTank            cardinality 3        (nullable)
fenway Posted August 28, 2007

"IDs are unique; however, there can be multiple tanks tied to each location, which adds duplicates."

Then which one do you want? Also, you may get some improvement from covering indexes across both columns in your ON clauses.
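A sketch of what those two-column indexes might look like, one per ON-clause column pair (the index names are made up here, and whether they actually help should be confirmed against EXPLAIN and weighed against the extra disk and write cost):

-- Hypothetical index names; the column pairs simply mirror the ON clauses above.
-- With both columns used as equalities in the join, either column order can work;
-- check EXPLAIN after adding them to see which the optimizer prefers.
ALTER TABLE sqlar001.customerfile
  ADD INDEX idx_company_customer (CustomerCompanyNumber, CustomerNumber);

ALTER TABLE sqlar001.locationfile
  ADD INDEX idx_company_customer (LocationCompanyNumber, LocationCustomerNumber);

ALTER TABLE sqlar001.tankfile
  ADD INDEX idx_company_customer (TankCompanyNumber, TankChargeToCustomer);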
js_280 (Author) Posted August 29, 2007

I want to be able to get all customers matching the specified criteria without having a customer listed multiple times simply because they have multiple tanks tied to their location. Without DISTINCT, I would get approximately 30,000 duplicates.
fenway Posted August 29, 2007

I understand why you don't want duplicates... but if you don't care which one, you could use a subquery / derived-table query to avoid the extra overhead of sorting 150K rows. And covering indexes would still probably improve things, but the DISTINCT is what's killing you.
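One way to read that suggestion, sketched against the original query: move the tank check into an EXISTS subquery, so each customer/location combination comes back at most once and no 150K-row de-duplication pass is needed. This keeps the original table and column names but is only an illustration of the idea; depending on the MySQL version, its EXPLAIN output should be compared against the DISTINCT version before trusting it.

SELECT c.customernumber,
       c.customercompanynumber,
       c.CustomerStatementName,
       l.locationnumber
FROM sqlar001.customerfile AS c
INNER JOIN sqlar001.locationfile AS l
        ON c.customernumber = l.locationcustomernumber
       AND c.customercompanynumber = l.locationcompanynumber
WHERE l.locationnumber = ""
  AND l.LocationAutoFill != "Y"
  AND c.customercompanynumber NOT IN ("355", "104")
  -- Semi-join: only check that at least one rental tank exists for the
  -- customer, instead of joining every tank row and de-duplicating afterwards.
  AND EXISTS (SELECT 1
              FROM sqlar001.tankfile AS t
              WHERE t.tankchargetocustomer = c.customernumber
                AND t.tankcompanynumber = c.customercompanynumber
                AND t.tankrentaltank = "Y");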