dadamssg87 Posted May 25, 2011 Share Posted May 25, 2011 i have table in my db that has rows of dates, for example... id | date | created | display | group | 1 | 2011-05-24 00:00:00 | 2011-05-24 20:00:00 | Y | 20 | 2 | 2011-05-24 00:00:00 | 2011-05-24 22:00:00 | N | 20 | 3 | 2011-05-26 00:00:00 | 2011-05-26 22:00:00 | N | 20 | 4 | 2011-05-27 00:00:00 | 2011-05-26 22:00:00 | N | 20 | 5 | 2011-05-27 00:00:00 | 2011-05-30 22:00:00 | N | 20 | Now i'm trying to make sure that each group has no duplicate dates. If they do delete all but the newest. So in this case, after the code ran, it would have deleted row 2 and row 5 because they both have duplicate dates and they're the oldest of those particular duplicates. I posted in the PHP board and not the MySQL board because it's the logic i can't figure out. I believe array_unique() would come in handy but i can't wrap my head around how to use it to do what i want. Quote Link to comment Share on other sites More sharing options...
btherl Posted May 25, 2011 Share Posted May 25, 2011 Step 1 - Find which records to keep SELECT date, group, MAX(created) FROM table GROUP BY date, group Step 2 - Delete all records which are not being kept DELETE FROM table WHERE (date, group, created) NOT IN (SELECT date, group, MAX(created) FROM table GROUP BY date, group) This is untested - first run the query as a select instead of a delete to make sure it's getting the right rows. Sorry that this isn't a php solution - for me it's easier to express in SQL. Quote Link to comment Share on other sites More sharing options...
dadamssg87 Posted May 26, 2011 Author Share Posted May 26, 2011 I'm getting an SQL error #1093 - You can't specify target table 'Exceptions' for update in FROM clause. I think this will do what i want if i can get the syntax right. DELETE FROM Exceptions WHERE (date, cal_id, created) NOT IN (SELECT DATE, cal_id, MAX( created ) FROM Exceptions GROUP BY DATE, cal_id) Quote Link to comment Share on other sites More sharing options...
btherl Posted May 26, 2011 Share Posted May 26, 2011 Ok, instead you can use a temporary table: CREATE TABLE keep_me AS SELECT DATE, cal_id, MAX( created ) FROM Exceptions GROUP BY DATE, cal_id DELETE FROM Exceptions WHERE (date, cal_id, created) NOT IN (SELECT date, cal_id, max FROM keep_me) There's a race condition here - if new entries are created between these two queries then those new entries could get deleted. So if that's a possibility you should lock the table first. Quote Link to comment Share on other sites More sharing options...
dadamssg87 Posted May 26, 2011 Author Share Posted May 26, 2011 so in my script that will run and delete duplicates. I should always create this temporary table? or just create it once? Sorry, i'm not at all familiar with temporary tables. Quote Link to comment Share on other sites More sharing options...
dadamssg87 Posted May 26, 2011 Author Share Posted May 26, 2011 nope...that didn't do what i wanted. I had this in my table ID cal_id date created 115 67 2011-05-25 00:00:00 2011-05-25 14:17:53 111 67 2011-05-25 00:00:00 2011-05-25 14:13:21 116 67 2011-05-26 00:00:00 2011-05-25 14:17:55 112 67 2011-05-26 00:00:00 2011-05-25 14:13:21 117 67 2011-05-27 00:00:00 2011-05-25 14:17:56 118 67 2011-05-27 00:00:00 2011-05-25 14:19:24 119 67 2011-05-27 00:00:00 2011-05-25 14:19:26 120 67 2011-05-28 00:00:00 2011-05-25 14:24:24 109 67 2011-05-30 00:00:00 2011-05-25 14:12:25 It deleted everything except these rows 115 67 2011-05-25 00:00:00 2011-05-25 14:17:53 111 67 2011-05-25 00:00:00 2011-05-25 14:13:21 And stuck the correct keep dates in the keep_me table 115 67 2011-05-25 00:00:00 2011-05-25 14:17:53 116 67 2011-05-26 00:00:00 2011-05-25 14:17:55 117 67 2011-05-27 00:00:00 2011-05-25 14:19:26 120 67 2011-05-28 00:00:00 2011-05-25 14:24:24 109 67 2011-05-30 00:00:00 2011-05-25 14:12:25 *edit just noticed that it didn't get all the correct/keep_me dates right. #117 should be replaced with #119 because #119 is newer... Quote Link to comment Share on other sites More sharing options...
btherl Posted May 26, 2011 Share Posted May 26, 2011 How did you end up with an id number on the keep_me table? The keep_me table only has date, cal_id and created. If you try to modify the query to store the id number as well it won't work. THere's a few options for the temporary table. You can create it when you need it and delete it afterwards. Or you can leave it around, and clear the contents (DELETE FROM keep_me) when you are finished with it. Quote Link to comment Share on other sites More sharing options...
dadamssg87 Posted May 26, 2011 Author Share Posted May 26, 2011 i thought i could slip that in there without a problem, i'm jotting down on paper which rows i have currently in the database. And then writing down the id's the should be kept and which should be dropped. I can't tell which rows are in the keep_me table without their id. Is there any way to keep that in there? Quote Link to comment Share on other sites More sharing options...
dadamssg87 Posted May 26, 2011 Author Share Posted May 26, 2011 i think the keep_me table is working but i need all the data that's associated with each row. The keep_me table only has the 3 columns (date, cal_id, MAX(created). Several columns get lost. *edit these are the names of all the columns for each row id | cal_id | price | date | name | type | created | hash | username | grab Quote Link to comment Share on other sites More sharing options...
btherl Posted May 26, 2011 Share Posted May 26, 2011 You can do this: SELECT id, cal_id, date, created FROM Exceptions WHERE (cal_id, date, created) IN (SELECT cal_id, date, MAX(created) FROM Exceptions GROUP BY cal_id, date) The problem with adding the id directly is that MySQL will give you a randomly selected id instead of the one with the most recent created time. This is a "feature" of MySQL that often causes even more trouble. But this query will do the grouping first and THEN fetch the matching id numbers, and that will work. But none of this will work if you expect to have rows with the same cal_id, same date and same created time. Is that possible? And in that case would you choose the entry with the highest id number? Quote Link to comment Share on other sites More sharing options...
dadamssg87 Posted May 26, 2011 Author Share Posted May 26, 2011 No that's not possible. It's a form where the user inputs price, date, and name but the i have a jquery function that dynamically adds additional dates if necessary for the situation. The php code then grabs the timestamp, uses that for the first date's created field. Then if there are additional days, i've added one second to the timestamp so every row is at least one second different from the others. Ok we're definitely getting closer. Thanks for all the help. Now that i have the keep rows selected with that query. How do i delete all other rows in the Exceptions table? I know i could use php and cycle through each row and if that row is found in the exception table, do nothing, if it isn't found, delete it. I'm sure there's a more efficient/SQL way to do this though. something like this(i would change the first part to DELETE FROM) SELECT * FROM Exceptions WHERE id NOT IN (SELECT id, cal_id, date, created FROM Exceptions WHERE (cal_id, date, created) IN (SELECT cal_id, date, MAX(created) FROM Exceptions GROUP BY cal_id, date)) But i get this error = #1241 - Operand should contain 1 column(s) Quote Link to comment Share on other sites More sharing options...
btherl Posted May 26, 2011 Share Posted May 26, 2011 Yep you're very close there. The problem now is the middle query returns 4 columns, but the outer query asks for only one. SELECT * FROM Exceptions WHERE id NOT IN ( SELECT id FROM Exceptions WHERE (cal_id, date, created) IN ( SELECT cal_id, date, MAX(created) FROM Exceptions GROUP BY cal_id, date)) The first select wants to see 1 column come from the second select, and the second wants to see 3 columns from the third select. Algorithmically, this query is doing this: 1. Find the highest (most recent) created column value for each cal_id and date combination 2. Look up the unique ids for each cal_id, date and MAX(created) combination 3. Select all data except for those unique ids (this step can then be changed to delete, once you've verified the results are correct). Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.