Jump to content

Finding Dupes in Multiple MySQL Fields


bienville

Recommended Posts

I have a little widget that allows users to input local meetings for a regional organization. The problem is sometimes the same meeting will get entered twice in slightly different ways so the records will not be exact dupes but the meetings are the same. (The head programmer on the project was arrested (no joke) so I am left to figure this out.)

 

After 2 hours of goolging, I figured out how to make a report that shows 'possible dupes' for human review. It works great but it only looks at the meeting names.

 

 <?
$sql = "SELECT t.* FROM meeting t\n"
   . "INNER JOIN\n"
   . " (SELECT meetname, COUNT(*) FROM meeting\n"
   . " GROUP BY meetname HAVING COUNT(*) > 1) as X\n"
    . " ON t.meetname = X.meetname"
    . " ORDER BY meetname ASC";
$result2 = mysql_query($sql);

while($row = mysql_fetch_assoc($result2))

 

Considering I'm not a php geek I was pretty happy with the report but in using it, I have to add another criteria.  Several legit meetings have the same name but are on different days.

 

What I really need is 'Possible dupes = all records with dupe meeting name (meetname) AND dupe meeting day (meetday) .'  Then a human can confirm they are dupes and delete them.

 

2 more hours of goolging and I'm stumped.  Code, help or links appreciated, but talk down to me I'm slow. ;-)

 

thanks

Link to comment
Share on other sites

Hi

 

Mikosikos idea would work but it would only show you the meetname and meetday data, whereas I presume there are other details on the meeting table that you want to see to make a human decision if it is a duplicate. Getting round that is why your original code used a JOIN.

 

Expanding your original code:-

 

SELECT t.*
FROM meeting t
INNER JOIN (SELECT meetname, meetday, COUNT(*)
FROM meeting
GROUP BY meetname, meetday 
HAVING COUNT(*) > 1) x
ON t.meetname = x.meetname AND t.meetday = x.meetday
ORDER BY t.meetname, t.meetday

 

All the best

 

Keith

Link to comment
Share on other sites

Thanks for the reply guys. I tried Keith's as it was most like my existing code.

 

Now I get "Warning: mysql_fetch_assoc(): supplied argument is not a valid MySQL result resource " when I hit that while statement.

 

I have to run out for an hour but I'll beat on it when I get back in and keep you posted.  thanks again.

Link to comment
Share on other sites

Here ya go...

 

Note: I thought the variable was meetname (above) but I forgot it is 'date' (but it is not a date field) that name was legacy. We changed it from date to 'day of week' but I did not rename the field in the db.  It is varchar(20).

 

When a human reviews it, they can delete it if it's a dupe.

 

I'm going to beat on it but if you see the problem, lemme know.

 

thanks again Keith.

 

 

if($param['mode'] == 'dupe_find')
{
?>

<form name="meeting_edit" id="meeting_edit" action="#" method="post">
<table width="90%" rules="ALL" id="meeting_t1" style="border: 1" cellpadding="6">

	<tbody>

		<tr>

			<th>ID</th>
			<th>Day</th>
			<th>Time</th>
			<th>Name</th>
			<th>Address</th>
			<th>Location</th>
			<th></th>
			<th></th>

		</tr>

		<?
		$sql = "SELECT t.* FROM meeting t\n"
			   . "INNER JOIN\n"
			   . " (SELECT meetname, date, COUNT(*) FROM meeting\n"
			   . " GROUP BY meetname, date HAVING COUNT(*) > 1) as X\n"
			   . " ON t.meetname = X.meetname AND t.date = x.date"
			   . " ORDER BY t.meetname, t.date ASC";
		$result2 = mysql_query($sql);

		while($row = mysql_fetch_assoc($result2))
		{

		?>

		<tr>
			<td><?=$row['id']?></td>

			<td><?=$row['date']?></td>
			<td><?=date("g:i a", strtotime($row['time']))?></td>
			<td><?=$row['meetname']?></td>
			<td><?=$row['address']?></td>
			<td><?=$row['location']?></td>
			<td><a href="?id=<?=$row['id']?>&mode=edit">E</a></td>
			<td><a href="?id=<?=$row['id']?>&mode=delete">X</a></td>
		</tr>

		<?

		}

		?>

	</tbody>

</table>

</form>	

Link to comment
Share on other sites

Hi

 

Can't see anything obvious wrong. But you do seem to be using an old version of php. Also you seem to have put carraige returns into the SQL which are not really useful.

 

<form name="meeting_edit" id="meeting_edit" action="#" method="post">
<table width="90%" rules="ALL" id="meeting_t1" style="border: 1" cellpadding="6">
	<tbody>
		<tr>
			<th>ID</th>
			<th>Day</th>
			<th>Time</th>
			<th>Name</th>
			<th>Address</th>
			<th>Location</th>
			<th></th>
			<th></th>
		</tr>
		<?php

		$sql = "SELECT t.* FROM meeting t"
			   . " INNER JOIN"
			   . " (SELECT meetname, date, COUNT(*) FROM meeting"
			   . " GROUP BY meetname, date HAVING COUNT(*) > 1) as X"
			   . " ON t.meetname = X.meetname AND t.date = x.date"
			   . " ORDER BY t.meetname, t.date ASC";
		$result2 = mysql_query($sql) or die(mysql_error());

		while($row = mysql_fetch_assoc($result2))
		{
		?>
		<tr>
			<td><?php echo $row['id'];?></td>
			<td><?php echo $row['date'];?></td>
			<td><?php echo date("g:i a", strtotime($row['time']));?></td>
			<td><?php echo $row['meetname'];?></td>
			<td><?php echo $row['address'];?></td>
			<td><?php echo $row['location'];?></td>
			<td><a href="?id=<?php echo $row['id'];?>&mode=edit">E</a></td>
			<td><a href="?id=<?php echo $row['id'];?>&mode=delete">X</a></td>
		</tr>
		<?php
		}
		?>
	</tbody>
</table>
</form>	 

 

Give that a try. Should at least give you the error message if it fails.

 

All the best

 

Keith

Link to comment
Share on other sites

Unknown column 'x.date' in 'on clause'

 

I know what it's telling me but I dunno why ;)

 

EDIT: Sheeeezaam... looks like it was case sensitive.  looks like I got something, will update. thanks

 

Edit 2: Keith it looks like you rock. With a hundreds of rows in the db I'm not 100% it is doing what I think it is but it I'm pretty sure we got it. (just have to check for obvious reasons)

 

Looks like it, many thinks man.

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.