Can I remove duplicates from an array based on criteria?


Jonesmior

Hi all, hope someone can help. I've got duplicate results from a SQL query and I can't figure out whether it's possible to get rid of them using an array and, if so, how!

 

If a duplicate 'id' occurs, I want to output only the record with the latest date, regardless of which date column it comes from. So below, I would keep rows 2, 4 and 5.

MSSQL query returns something like...

    id     date_1        date_2
1)  801    04/09/2007    04/09/2007
2)  801    16/01/2008    04/09/2007
3)  801    04/09/2007    01/01/2008
4)  543    04/09/2007    04/09/2007
5)  654    04/09/2007    04/09/2007

 

Many thanks if someone can help.

 

--

JB

 

You can't guarantee that max date1 and date2 come from the same row if you do that

SELECT a.id, a.date1, a.date2
FROM the_table a
    INNER JOIN (SELECT id, max(date1) as latest FROM the_table GROUP BY id) as b
        ON a.id = b.id AND a.date1 = b.latest
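
To see what the join above returns, here is a minimal sketch that runs it from PHP against an in-memory SQLite table. The SQLite connection, the yyyy-mm-dd date format and the added ORDER BY are assumptions made for this illustration; the table and column names are the ones used in the query above, and the rows are the sample data from the question.

```php
<?php
// In-memory SQLite stands in for the real MSSQL server (assumption for this sketch).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE the_table (id INTEGER, date1 TEXT, date2 TEXT)');

// Sample rows from the question, with dates rewritten as yyyy-mm-dd so that
// plain string comparison orders them chronologically.
$rows = array(
    array(801, '2007-09-04', '2007-09-04'),
    array(801, '2008-01-16', '2007-09-04'),
    array(801, '2007-09-04', '2008-01-01'),
    array(543, '2007-09-04', '2007-09-04'),
    array(654, '2007-09-04', '2007-09-04'),
);
$insert = $pdo->prepare('INSERT INTO the_table (id, date1, date2) VALUES (?, ?, ?)');
foreach ($rows as $r) {
    $insert->execute($r);
}

// The subquery finds the latest date1 per id; the join keeps only the rows
// whose date1 matches that value. ORDER BY is added here for stable output.
$sql = 'SELECT a.id, a.date1, a.date2
        FROM the_table a
            INNER JOIN (SELECT id, max(date1) AS latest FROM the_table GROUP BY id) AS b
                ON a.id = b.id AND a.date1 = b.latest
        ORDER BY a.id';
$kept = $pdo->query($sql)->fetchAll(PDO::FETCH_ASSOC);

foreach ($kept as $row) {
    echo $row['id'], ' ', $row['date1'], ' ', $row['date2'], "\n";
}
```

Note that this only looks at date1, so a row whose date2 is the latest would not be picked, and two rows sharing the same latest date1 for an id would both come back.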

Thanks for the replies. I'll give it a go on the server tomorrow. In the meantime I found the following which seems to work but I haven't tested it on proper data.

 

Thanks again,

 

--

JB

 


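// Collect every row from the result set.
$info = array();
while (($row = mysql_fetch_assoc($result)))
{
    $info[] = $row;
}

// Keep one row per id: the one with the later 'date'.
// Assumes rows with the same id are adjacent in the result set and that
// 'date' values compare correctly with > (e.g. timestamps or yyyy-mm-dd).
$unique = array();
foreach ($info as $row)
{
    $last = count($unique) - 1;
    if ($last >= 0 && $unique[$last]['id'] == $row['id'])
    {
        // Same id as the previous kept row: replace it unless it is later.
        if ($row['date'] >= $unique[$last]['date'])
        {
            $unique[$last] = $row;
        }
    }
    else
    {
        $unique[] = $row;
    }
}
$info = $unique;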
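
The loop above compares a single 'date' key and relies on duplicate ids sitting next to each other. A minimal sketch of an alternative that keys the rows by id and compares the later of the two date columns might look like this. The function names latest_timestamp() and keep_latest_per_id() are made up for the example, and it assumes the columns are called date_1 and date_2 and hold dd/mm/yyyy strings as in the sample data.

```php
<?php
// Hypothetical helper: the later of a row's two dates, as a Unix timestamp.
// Assumes both columns are dd/mm/yyyy strings.
function latest_timestamp(array $row)
{
    $stamps = array();
    foreach (array('date_1', 'date_2') as $col) {
        // The leading "!" resets the time of day to midnight, so only the date counts.
        $d = DateTime::createFromFormat('!d/m/Y', $row[$col]);
        if ($d === false) {
            throw new InvalidArgumentException("Bad date in $col: " . $row[$col]);
        }
        $stamps[] = $d->getTimestamp();
    }
    return max($stamps);
}

// Hypothetical helper: keep, for each id, the row with the latest date in either column.
function keep_latest_per_id(array $rows)
{
    $best = array();
    foreach ($rows as $row) {
        $id = $row['id'];
        if (!isset($best[$id]) || latest_timestamp($row) > latest_timestamp($best[$id])) {
            $best[$id] = $row;
        }
    }
    return array_values($best); // re-index 0..n-1
}

// Sample rows from the question.
$rows = array(
    array('id' => 801, 'date_1' => '04/09/2007', 'date_2' => '04/09/2007'),
    array('id' => 801, 'date_1' => '16/01/2008', 'date_2' => '04/09/2007'),
    array('id' => 801, 'date_1' => '04/09/2007', 'date_2' => '01/01/2008'),
    array('id' => 543, 'date_1' => '04/09/2007', 'date_2' => '04/09/2007'),
    array('id' => 654, 'date_1' => '04/09/2007', 'date_2' => '04/09/2007'),
);

$kept = keep_latest_per_id($rows);
foreach ($kept as $row) {
    echo $row['id'], ' ', $row['date_1'], ' ', $row['date_2'], "\n";
}
```

Because the rows are keyed by id, the duplicates do not need to be adjacent or sorted, and no unset() or re-sorting inside the loop is required. On a tie the first row seen for that id is kept.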

Barand,

 

I was trying PHP to work around the gaps in my knowledge of advanced (for me!) SQL SELECT statements.

 

The query involves 12 tables, many of them joined in only to look up keys from related tables. It returns 8 columns and always at least 5 rows.

 

I am unfamiliar with some of the syntax you posted, (FROM the_table a) for instance. Is table 'a' a temporary table? I obviously need to read further into the SQL side of this project.

 

Thanks,

 

--

JB

 

The following are equivalent; see if you can spot the pattern:

 

SELECT
  `users`.`id`,
  `users`.`password`
FROM `users`

 

SELECT
  u.`id`,
  u.`password`
FROM `users` u

 

(No, it does not create a temporary table.)

A stupid example, but...

SELECT
  `users`.`id`
FROM `users`
INNER JOIN `users` ON `users`.`id`=`users`.`id`

 

Given the SELECT above, which `users` table do we mean in the following parts?

SELECT
  `users`.`id`

ON `users`.`id`=`users`.`id`

 

The answer is we don't know. Since we are joining the table to itself there is an ambiguity.

 

To solve this we alias the table.

 

SELECT
  a.`id`
FROM `users` a
INNER JOIN `users` b ON a.`id`=b.`id`

 

Going back to Barand's query above:

INNER JOIN (SELECT id, max(date1) as latest FROM the_table GROUP BY id)

 

He is using a nested query (a derived table) to create what behaves as a temporary table. But how can you refer to the rows returned from the nested SELECT? You can't until you alias the nested result, which he did with "as b".

 

To avoid confusion: it is not the alias that creates the temporary table. The nested query creates it; the alias only gives it a name you can refer to.
