Bok Posted February 6, 2008 Share Posted February 6, 2008 Can anyone explain why this takes so long..?? The aim is to add ranks to Table B based on the score contained within it Given two tables A) rank int(11) PRI auto_increment id int(11) score double B) id int(11) PRI score double rank int(7) + 30 more fields.. Table B contains around 1.7M rows, A is initially empty I run insert into A (id,score) select id,score from B order by score desc,id; all well and good - takes about 1min, at this point table A contains the same number of rows ranked accordingly then update A,B set B.rank=A.rank where A.id = B.id; This takes over an hour... I'm kind of confused on this as it should be a straight match on the index right ? Thanks Bok Quote Link to comment Share on other sites More sharing options...
fenway Posted February 6, 2008 Share Posted February 6, 2008 Missing indexes? Quote Link to comment Share on other sites More sharing options...
Bok Posted February 6, 2008 Author Share Posted February 6, 2008 I don't believe so.. table A has two indexes 1 is on rank 2 is on id table B is indexed on id only Very puzzling.. Quote Link to comment Share on other sites More sharing options...
fenway Posted February 6, 2008 Share Posted February 6, 2008 Run the EXPLAIN on the equivalent selects. Quote Link to comment Share on other sites More sharing options...
Bok Posted February 6, 2008 Author Share Posted February 6, 2008 Explains show this mysql> explain select A.id,B.rank from B ,A where B.id = A.id; +----+-------------+--------------+--------+---------------+------+---------+------------------------+---------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+--------------+--------+---------------+------+---------+------------------------+---------+-------------+ | 1 | SIMPLE | A | ALL | name | NULL | NULL | NULL | 1786054 | | | 1 | SIMPLE | B | eq_ref | name | name | 4 | dcfree.A.id | 1 | Using index | +----+-------------+--------------+--------+---------------+------+---------+------------------------+---------+-------------+ 2 rows in set (0.00 sec) using mk_query_profiler gives +----------------------------------------------------------+ | 1 (4090.6125 sec) | +----------------------------------------------------------+ __ Overall stats _______________________ Value _____________ Total elapsed time 4090.613 Questions 1 COMMIT 0 DELETE 0 DELETE MULTI 0 INSERT 0 INSERT SELECT 0 REPLACE 0 REPLACE SELECT 0 SELECT 0 UPDATE 0 UPDATE MULTI 1 Data into server 114 Data out of server 68 Optimizer cost 2150679.621 __ Table and index accesses ____________ Value _____________ Table locks acquired 2 Table scans 1 Join 0 Index range scans 0 Join without check 0 Join with check 0 Rows sorted 0 Range sorts 0 Merge passes 0 Table scans 0 Potential filesorts 0 Sorry, couldn't figure out how to change the font to make this look neater.. Quote Link to comment Share on other sites More sharing options...
fenway Posted February 6, 2008 Share Posted February 6, 2008 Hmm... I guess you are updating 1.7M rows... best suggestion I have is to set the rank directly in the insert... saves doing the update later/at all! Quote Link to comment Share on other sites More sharing options...
Bok Posted February 6, 2008 Author Share Posted February 6, 2008 Can't, it's a changing target as scores change through the day..that's why I use a seperate table with the auto-increment first. I can do it one other way which is pretty fast, I just can't understand why this one is so slow !! Quote Link to comment Share on other sites More sharing options...
fenway Posted February 6, 2008 Share Posted February 6, 2008 OK... well, the issue is likely update the index on rank since you're changing it 1.7M times! Do you use this index? Can you live without it during the update? There's a way to disable this. Quote Link to comment Share on other sites More sharing options...
Bok Posted February 6, 2008 Author Share Posted February 6, 2008 I don't think so, table B which is being updated does not have an index on rank... Quote Link to comment Share on other sites More sharing options...
fenway Posted February 7, 2008 Share Posted February 7, 2008 Hmmm... with 30+ fields, and perhaps not a fixed-width record, the server has to do a seek on each one to grab the value from one record then set it on the other one. Have you tried a covering index on ( id, rank )? Then it should be able to grab the value directly from the index without ever reading the table! Quote Link to comment Share on other sites More sharing options...
powerspike Posted February 7, 2008 Share Posted February 7, 2008 if your doing match up's on tables, it might be creating a temporary table for the query (which would explain the time) index rank and id in both tables and see if that helps alot, also remove any unneed indexs on both tables, as that'll slow things down as well A.rank = B.rank (in the slow query) ... if they are both indexed you might find it gets alot quicker for you. Quote Link to comment Share on other sites More sharing options...
toplay Posted February 7, 2008 Share Posted February 7, 2008 Why do you need to have B.rank at all? and why two score columns too? You have B.rank as only INT(7) trying to hold A.rank INT(11). The B.id can point (join) to A.id whenever you need to get the rank (A.rank). Your explain doesn't make sense to me...do you have indexes called "name"? Post the actual table structures (create table SQL with indexes). Tables don't seem normalized. FYI - Another way to write the update: UPDATE B JOIN A ON A.id = B.id SET B.rank = A.rank; Quote Link to comment Share on other sites More sharing options...
Bok Posted February 7, 2008 Author Share Posted February 7, 2008 Table A is a temporary table used only to generate the ranks themselves via the auto_increment field..,sorry the int(7) was a mistake, it is int(11). I now believe the problem is on the particular box, though I've yet to figure out exactly what it is. I tried the same thing on another similar box and the update ran in 5mins (first time it was changing all 1.7M rows), second time it only updated ~ 300,000 and took 1.2mins. I'll run more tests today to see if there is a better way to generate these rankings. Quote Link to comment Share on other sites More sharing options...
Bok Posted February 7, 2008 Author Share Posted February 7, 2008 Update as I think this must be memory settings somehow, though I've tweaked them without success.. I ran some tests on a smaller subset, and then tried it with varying numbers of rows to update. 435632 rows in both A and B, A is already ranked accordingly with it's auto_increment field. update B set rank=0 where id < 100000; (id is not necessarily sequential) Query OK, 51873 rows affected (0.68 sec) Rows matched: 51873 Changed: 51873 Warnings: 0 update A,B set B.rank=A.rank where B.id=A.id Query OK, 51873 rows affected (8.07 sec) Rows matched: 435632 Changed: 51873 Warnings: 0 update B set rank=0 where id < 200000; Query OK, 60829 rows affected (0.81 sec) Rows matched: 60829 Changed: 60829 Warnings: 0 update A,B set B.rank=A.rank where B.id=A.id Query OK, 60829 rows affected (43.47 sec) Rows matched: 435632 Changed: 60829 Warnings: 0 update B set rank=0 where id < 300000; Query OK, 74680 rows affected (2.85 sec) Rows matched: 74680 Changed: 74680 Warnings: 0 update A,B set B.rank=A.rank where B.id=A.id Query OK, 74680 rows affected (33.50 sec) Rows matched: 435632 Changed: 74680 Warnings: 0 update B set rank=0; Query OK, 435632 rows affected (16.46 sec) Rows matched: 435632 Changed: 435632 Warnings: 0 update A,B set B.rank=A.rank where B.id=A.id Query OK, 435632 rows affected (11 min 12.23 sec) Rows matched: 435632 Changed: 435632 Warnings: 0 So at some point, it's hitting some limit, anyone care to suggest which limit it is or how to find it ? Quote Link to comment Share on other sites More sharing options...
fenway Posted February 8, 2008 Share Posted February 8, 2008 Might be the max_join_size. Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.