JChilds Posted July 5, 2008 Share Posted July 5, 2008 I have 22 tables with 3 million rows in a database. They are as follows: Timestamp, ID, Price And heres an example of the data: 2008-07-6 01:00:00 1 1.1 2008-07-6 02:00:00 1 1.1 2008-07-6 03:00:00 1 1.2 2008-07-6 01:00:00 2 1.0 2008-07-6 02:00:00 2 1.1 2008-07-6 03:00:00 2 1.2 I want to ignore the time stamp, and delete all duplicates, but have no idea how to even start on this. Ideally, I don't want to have to create new tables to do this, as that'd take a bloody long time. Thanks in advance Quote Link to comment Share on other sites More sharing options...
ohdang888 Posted July 5, 2008 Share Posted July 5, 2008 for that many rows, this is a mysql intensive query... only run this when you have time to let it sit and operate... <?php $group = mysql_query("SELECT `id` FROM `tablename` GROUP BY `id` ORDER BY `id` ASC ")or die(mysql_error()); while($row = mysql_fetch_array($group)){ $id = mysql_real_escape_string($row['id']); $each = mysql_query("SELECT `price` FROM `tablename` WHERE `id`='$id' ")or die(mysql_error()); foreach($row2 as mysql_fetch_array($each){ $price = mysql_real_escape_string($row2['price']); mysql_query("DELETE FROM `tablename` WHERE `id`='$id' AND `price`='$price' LIMIT 5")or die(mysql_error()); echo "Duplicates of id of $id and price of $price have been deleted<br>"; }// end foreach }// end while ?> that is the main work page... that would search through the entire table... but because you are dealing with huge project, run this first: <?php $group = mysql_query("SELECT `id` FROM `tablename` GROUP BY `id` ORDER BY `id` ASC LIMIT 1")or die(mysql_error());// this is the difference between //the first one and now // this is only will select the first `id` of '1'... do this first to make sure it works... while($row = mysql_fetch_array($group)){ $id = mysql_real_escape_string($row['id']); $each = mysql_query("SELECT `price` FROM `tablename` WHERE `id`='$id' ")or die(mysql_error()); foreach($row2 as mysql_fetch_array($each){ $price = mysql_real_escape_string($row2['price']); mysql_query("DELETE FROM `tablename` WHERE `id`='$id' AND `price`='$price' LIMIT 5")or die(mysql_error()); echo "Duplicates of id of $id and price of $price have been deleted<br>"; }// end foreach }// end while ?> Quote Link to comment Share on other sites More sharing options...
fenway Posted July 6, 2008 Share Posted July 6, 2008 If you want to ignore the timestamp, then presumably it doesn't matter at all... so why not just drop it? Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.