JChilds Posted July 5, 2008 Share Posted July 5, 2008 I have 22 tables with 3 million rows in a database. They are as follows: Timestamp, ID, Price And heres an example of the data: 2008-07-6 01:00:00 1 1.1 2008-07-6 02:00:00 1 1.1 2008-07-6 03:00:00 1 1.2 2008-07-6 01:00:00 2 1.0 2008-07-6 02:00:00 2 1.1 2008-07-6 03:00:00 2 1.2 I want to ignore the time stamp, and delete all duplicates, but have no idea how to even start on this. Ideally, I don't want to have to create new tables to do this, as that'd take a bloody long time. Thanks in advance Link to comment https://forums.phpfreaks.com/topic/113338-delete-duplicate-rows-ignoring-one-column/ Share on other sites More sharing options...
ohdang888 Posted July 5, 2008 Share Posted July 5, 2008 for that many rows, this is a mysql intensive query... only run this when you have time to let it sit and operate... <?php $group = mysql_query("SELECT `id` FROM `tablename` GROUP BY `id` ORDER BY `id` ASC ")or die(mysql_error()); while($row = mysql_fetch_array($group)){ $id = mysql_real_escape_string($row['id']); $each = mysql_query("SELECT `price` FROM `tablename` WHERE `id`='$id' ")or die(mysql_error()); foreach($row2 as mysql_fetch_array($each){ $price = mysql_real_escape_string($row2['price']); mysql_query("DELETE FROM `tablename` WHERE `id`='$id' AND `price`='$price' LIMIT 5")or die(mysql_error()); echo "Duplicates of id of $id and price of $price have been deleted<br>"; }// end foreach }// end while ?> that is the main work page... that would search through the entire table... but because you are dealing with huge project, run this first: <?php $group = mysql_query("SELECT `id` FROM `tablename` GROUP BY `id` ORDER BY `id` ASC LIMIT 1")or die(mysql_error());// this is the difference between //the first one and now // this is only will select the first `id` of '1'... do this first to make sure it works... while($row = mysql_fetch_array($group)){ $id = mysql_real_escape_string($row['id']); $each = mysql_query("SELECT `price` FROM `tablename` WHERE `id`='$id' ")or die(mysql_error()); foreach($row2 as mysql_fetch_array($each){ $price = mysql_real_escape_string($row2['price']); mysql_query("DELETE FROM `tablename` WHERE `id`='$id' AND `price`='$price' LIMIT 5")or die(mysql_error()); echo "Duplicates of id of $id and price of $price have been deleted<br>"; }// end foreach }// end while ?> Link to comment https://forums.phpfreaks.com/topic/113338-delete-duplicate-rows-ignoring-one-column/#findComment-582373 Share on other sites More sharing options...
fenway Posted July 6, 2008 Share Posted July 6, 2008 If you want to ignore the timestamp, then presumably it doesn't matter at all... so why not just drop it? Link to comment https://forums.phpfreaks.com/topic/113338-delete-duplicate-rows-ignoring-one-column/#findComment-583056 Share on other sites More sharing options...
Recommended Posts
Archived
This topic is now archived and is closed to further replies.