gin Posted March 5, 2008 Share Posted March 5, 2008 MySQL client version: 5.0.37 I have a table of items that sometimes references a table of rates. There may be multiples of a rate, depending on the date which the rate takes effect. CREATE TABLE `test_item` ( `id` varchar(20) NOT NULL, `rate` varchar(15) NOT NULL ); -- Dumping data for table `test_item` INSERT INTO `test_item` (`id`, `rate`) VALUES ('item1', 'rate1'), ('item2', ''), ('item3', 'rate2'); +-------+-------+ | id | rate | +-------+-------+ | item1 | rate1 | | item2 | | | item3 | rate2 | +-------+-------+ CREATE TABLE `test_rate` ( `id` varchar(15) NOT NULL, `date` date NOT NULL, `rate` mediumint(7) NOT NULL ); -- Dumping data for table `test_rate` INSERT INTO `test_rate` (`id`, `date`, `rate`) VALUES ('rate1', '2008-03-01', 100), ('rate1', '2008-03-05', 200), ('rate2', '2008-03-05', 300); +-------+------------+------+ | id | date | rate | +-------+------------+------+ | rate1 | 2008-03-01 | 100 | | rate1 | 2008-03-05 | 200 | | rate2 | 2008-03-05 | 300 | +-------+------------+------+ I want a list of all items and its most recent rate, like so: +-------+-------+-------+------------+------+ | id | rate | id | date | rate | +-------+-------+-------+------------+------+ | item1 | rate1 | rate1 | 2008-03-05 | 200 | | item2 | | NULL | NULL | NULL | | item3 | rate2 | rate2 | 2008-03-05 | 300 | +-------+-------+-------+------------+------+ The closest I can get: SELECT * FROM test_item LEFT JOIN test_rate ON test_item.rate = test_rate.id GROUP BY test_item.id ORDER BY test_item.id, test_rate.date DESC; +-------+-------+-------+------------+------+ | id | rate | id | date | rate | +-------+-------+-------+------------+------+ | item1 | rate1 | rate1 | 2008-03-01 | 100 | <-- Note the rate | item2 | | NULL | NULL | NULL | | item3 | rate2 | rate2 | 2008-03-05 | 300 | +-------+-------+-------+------------+------+ Removing the GROUP BY shows that the results are correctly sorted by date in the order I want, but the GROUP BY seems to take the first entry regardless of sorting rules, so I'm stumped. Any help much much appreciated! Quote Link to comment Share on other sites More sharing options...
aschk Posted March 5, 2008 Share Posted March 5, 2008 Also note the date You're not choosing the max rate and date from the JOIN'd table, and how do you expect MySQL to guess that this is what you want in your GROUP BY if you don't tell it ... Quote Link to comment Share on other sites More sharing options...
gin Posted March 6, 2008 Author Share Posted March 6, 2008 The trouble is, I don't want the max rate. I want the rate where the date is max. Quote Link to comment Share on other sites More sharing options...
gin Posted March 6, 2008 Author Share Posted March 6, 2008 Well, here's my next attempt with subqueries, but it doesn't really solve the problem since I get that error below. I understand why I get it, but I'm not so big on understanding how to get around it. SELECT * FROM test_item LEFT JOIN test_rate ON test_item.rate = test_rate.id WHERE date = (SELECT MAX(date) from test_rate GROUP BY test_rate.id) OR date IS NULL ORDER BY test_item.id, test_rate.date DESC; >> ERROR 1242 (21000): Subquery returns more than 1 row Quote Link to comment Share on other sites More sharing options...
fenway Posted March 6, 2008 Share Posted March 6, 2008 Replace date = ( your sequery ) with date IN Quote Link to comment Share on other sites More sharing options...
gin Posted March 6, 2008 Author Share Posted March 6, 2008 Still not entirely working. The second row shouldn't appear ;_; SELECT * FROM test_item LEFT JOIN test_rate ON test_item.rate = test_rate.id WHERE date IN (SELECT MAX(date) from test_rate GROUP BY test_rate.id) OR date IS NULL ORDER BY test_item.id, test_rate.date DESC; +-------+-------+-------+------------+------+ | id | rate | id | date | rate | +-------+-------+-------+------------+------+ | item1 | rate1 | rate1 | 2008-03-06 | 150 | | item1 | rate1 | rate1 | 2008-03-05 | 200 | | item2 | | NULL | NULL | NULL | | item3 | rate2 | rate2 | 2008-03-05 | 300 | +-------+-------+-------+------------+------+ Quote Link to comment Share on other sites More sharing options...
aschk Posted March 6, 2008 Share Posted March 6, 2008 use a JOIN not a subquery and let us know how you get on Quote Link to comment Share on other sites More sharing options...
luca200 Posted March 6, 2008 Share Posted March 6, 2008 I don't think a join can solve it.... Try this SELECT * FROM test_item LEFT JOIN test_rate ON test_item.rate = test_rate.id WHERE date = (SELECT MAX(date) from test_rate WHERE test_rate.id = test_item.rate) OR date IS NULL ORDER BY test_item.id, test_rate.date DESC; Quote Link to comment Share on other sites More sharing options...
fenway Posted March 6, 2008 Share Posted March 6, 2008 That's still with a subquery.... you need to join on the derveid table. Quote Link to comment Share on other sites More sharing options...
gin Posted March 7, 2008 Author Share Posted March 7, 2008 Well if I understand it correctly, a WHERE constitutes an implicit JOIN when two tables are involved. That said, I'm afraid I'm not getting what you (aschk & fenway) are trying to hint at. I'm guessing, though, that you mean me to first generate a table of rates that I want, but I'm not sure how, even ignoring the item table. luca200's code does work (thanks very much!) but I'd very much like to understand why. Could you explain? Quote Link to comment Share on other sites More sharing options...
gin Posted March 7, 2008 Author Share Posted March 7, 2008 Okay, still trying to figure this out. Given this: SELECT * FROM test_rate; +-------+------------+------+ | id | date | rate | +-------+------------+------+ | rate1 | 2008-03-01 | 500 | | rate1 | 2008-03-05 | 200 | | rate2 | 2008-03-03 | 300 | | rate2 | 2008-03-19 | 50 | +-------+------------+------+ How do I get this? +-------+------------+------+ | id | date | rate | +-------+------------+------+ | rate1 | 2008-03-05 | 200 | | rate2 | 2008-03-19 | 50 | +-------+------------+------+ Quote Link to comment Share on other sites More sharing options...
aschk Posted March 7, 2008 Share Posted March 7, 2008 Look, listen and learn SELECT * FROM test_rate t JOIN ( SELECT id ,MAX(date) as 'max_date' FROM test_rate GROUP BY id ) x ON x.max_date = t.`date` AND x.id = t.id Quote Link to comment Share on other sites More sharing options...
luca200 Posted March 7, 2008 Share Posted March 7, 2008 Well if I understand it correctly, a WHERE constitutes an implicit JOIN when two tables are involved. You're right luca200's code does work (thanks very much!) but I'd very much like to understand why. Could you explain? It's not that easy to explain it... the code first does a join between the two tables and then chooses only the rows in which the date field has the maximum value for that precise rate id. For doing this, however, it needs to repeat the internal query for every row of the joined table (it's a so called "correlated subquery"). That's why it's possible to do it better, and I think this could be the best solution: SELECT * FROM test_item i LEFT JOIN ( SELECT id, MAX(date) as 'max_date' FROM test_rate GROUP BY id ) x ON i.rate = x.id This one constructs first a table (the internal query) with just the maximum date for each rate, and then joins it to the items table. I think that's what fenway was hinting at. @aschk: what was that strange kind of joke? Quote Link to comment Share on other sites More sharing options...
aschk Posted March 7, 2008 Share Posted March 7, 2008 Nope, no joke. Guess the listening part is hard though... :-\ Quote Link to comment Share on other sites More sharing options...
luca200 Posted March 7, 2008 Share Posted March 7, 2008 Nope, no joke. Guess the listening part is hard though... :-\ Could be. But I don't see the point in joining rates table to a subquery of itself... Apart from not using items table, that could hardly solve gin's problem, what do you have after your join that you did not have with just the subquery? Quote Link to comment Share on other sites More sharing options...
fenway Posted March 7, 2008 Share Posted March 7, 2008 Because correlated subqueries are expensive -- you need to run it once PER ROW. Quote Link to comment Share on other sites More sharing options...
aschk Posted March 7, 2008 Share Posted March 7, 2008 Here's the full answer without subqueries. SELECT i.id ,i.rate ,r.max_date as 'date' ,r.rate FROM test_item i LEFT JOIN ( SELECT t.id ,x.max_date ,t.rate FROM test_rate t JOIN ( SELECT id ,MAX(date) as 'max_date' FROM test_rate GROUP BY id ) x ON x.max_date = t.`date` AND x.id = t.id ) r ON r.id = i.rate Not sure whether to use subqueries or joins? Best google it Bump for fenway's comment Quote Link to comment Share on other sites More sharing options...
luca200 Posted March 7, 2008 Share Posted March 7, 2008 Here's the full answer without subqueries. WITHOUT subqueries ??? ??? I see TWO subqueries in there... and I don't think you need more than one, as in my last solution. @fenway: who are you answering to? Quote Link to comment Share on other sites More sharing options...
fenway Posted March 7, 2008 Share Posted March 7, 2008 A derived table IS NOT a subquery in the traditional sense, since it's uncorrelated. Quote Link to comment Share on other sites More sharing options...
luca200 Posted March 7, 2008 Share Posted March 7, 2008 A derived table IS NOT a subquery in the traditional sense, since it's uncorrelated. Ok, so he meant "without correlated subquery". Thanks for specifying. But I still see one join more than needed. Do you still my last solution is missing something? Quote Link to comment Share on other sites More sharing options...
gin Posted March 8, 2008 Author Share Posted March 8, 2008 But I still see one join more than needed. Do you still my last solution is missing something? That solution isn't complete because I needed the rate that corresponded to the max_date as well. Anyway, correct me if I'm wrong (since most of your discussion went over my head, lol)... This solution from luca200, while it works, is bad because it involves a correlated subquery which eats resources SELECT * FROM test_item LEFT JOIN test_rate ON test_item.rate = test_rate.id WHERE date = (SELECT MAX(date) from test_rate WHERE test_rate.id = test_item.rate) OR date IS NULL ORDER BY test_item.id, test_rate.date DESC; But this one by aschk is better, because it uses a derived table(s?) instead. SELECT i.id, i.rate, r.max_date as 'date', r.rate FROM test_item i LEFT JOIN ( SELECT t.id, x.max_date, t.rate FROM test_rate t JOIN ( SELECT id, MAX(date) as 'max_date' FROM test_rate GROUP BY id ) x ON x.max_date = t.`date` AND x.id = t.id ) r ON r.id = i.rate; Thanks you guys for all your help! Note to self: go read up on subqueries vs joins and derived tables. Quote Link to comment Share on other sites More sharing options...
luca200 Posted March 9, 2008 Share Posted March 9, 2008 That solution isn't complete because I needed the rate that corresponded to the max_date as well. I forgot it... so that's why you need one more join. aschk's solution is the best Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.