Jump to content

Selecting from duplicate values but wanting the max


Go to solution Solved by Barand,

Recommended Posts

I'm trying to figure out how to select job numbers and their line items by searching for a job number but only their latest entry in the table. This table is an archive table. As changes are made to rows in the main production_data table, the row's current values are backed up in this archive table. Eventually an order falls off completely and only exists in this archive table.

Here is the table schema

--
-- Table structure for table `production_data_archive`
--

DROP TABLE IF EXISTS `production_data_archive`;
CREATE TABLE IF NOT EXISTS `production_data_archive` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `order_id` int(11) NOT NULL,
  `job_number` int(8) NOT NULL,
  `enterprise` tinytext NOT NULL,
  `part_number` text NOT NULL,
  `description` text NOT NULL,
  `psm` tinytext NOT NULL,
  `qty` int(11) NOT NULL,
  `line_item` varchar(11) NOT NULL,
  `as400_ship_date` date DEFAULT NULL,
  `hold_reason` text NOT NULL DEFAULT '0',
  `hold_date` date DEFAULT NULL,
  `insert_time` timestamp NOT NULL DEFAULT current_timestamp() COMMENT 'time order was inserted',
  `archive_time` timestamp NULL DEFAULT NULL COMMENT 'this column was added 4-19-21',
  PRIMARY KEY (`id`),
  KEY `job_line` (`job_number`,`line_item`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
COMMIT;

Here is a sample of data when I do a search for a specific job_number:

image.thumb.png.e08838dcf3ad5dec66036740cdbb60f6.png

In the image above I search for a job_number and then highlighted the most recent entries for that job_number. Those are the records I would have wanted to come out of the query.

This is one thing I have tried among others...it's closer but still not working as expected.

SELECT
	pda.*
FROM
	production_data_archive pda
INNER JOIN(
	SELECT MAX(id) AS max_id, order_id FROM production_data_archive GROUP BY order_id
) maxtbl
ON pda.id = maxtbl.order_id  
ORDER BY `pda`.`job_number` ASC

 

  • Solution

Try

TABLE: production_data_archive;
+----+----------+------------+-----------+---------------------+
| id | order_id | job_number | line_item | insert_time         |
+----+----------+------------+-----------+---------------------+
|  1 |    16824 |   22000412 | A         | 2021-03-26 00:00:00 |
|  2 |    16824 |   22000412 | A         | 2021-03-30 00:00:00 |
|  3 |    16824 |   22000412 | A         | 2021-04-09 00:00:00 |
|  4 |    16825 |   22000412 | B         | 2021-03-26 00:00:00 |
|  5 |    16825 |   22000412 | B         | 2021-03-29 00:00:00 |
|  6 |    16825 |   22000412 | B         | 2021-04-06 00:00:00 |
+----+----------+------------+-----------+---------------------+


SELECT a.id
     , a.order_id
     , a.job_number
     , a.line_item
     , a.insert_time
FROM production_data_archive a 
     LEFT JOIN
     production_data_archive b
         ON a.job_number = b.job_number
         AND a.line_item = b.line_item   
         AND a.insert_time < b.insert_time
WHERE b.job_number IS NULL;

+----+----------+------------+-----------+---------------------+
| id | order_id | job_number | line_item | insert_time         |
+----+----------+------------+-----------+---------------------+
|  3 |    16824 |   22000412 | A         | 2021-04-09 00:00:00 |
|  6 |    16825 |   22000412 | B         | 2021-04-06 00:00:00 |
+----+----------+------------+-----------+---------------------+

 

That does work but only one problem 😕 the column insert_time was only added a few months ago. So there are lots of records where they don't have an insert_time....when I designed the table I didn't think I was going to care about the insert_time....buuuttt later on...I found I did need it

The main difference that I can see is that mine gave the 2 records you were looking for and yours gave none

mysql> SELECT
    ->     pda.*
    -> FROM
    ->     production_data_archive pda
    -> INNER JOIN(
    ->     SELECT MAX(id) AS max_id, order_id FROM production_data_archive GROUP BY order_id
    -> ) maxtbl
    -> ON pda.id = maxtbl.order_id
    -> ORDER BY `pda`.`job_number` ASC;

Empty set (0.00 sec)

Better use dates/times to determine "latest" rather than rely on ids. The approach is broadly similar though. I think yours is matching the wrong id values in the join.

Okay so I've modified the query to get the additional data I needed and it works fine until I add the 'AND a.job_number LIKE :search'

It doesn't return any records no matter what value I put in for the :search. Am I doing something wrong here?

SELECT
    lr.ref_request_id,
    a.job_number,
    a.enterprise,
    a.line_item,
    a.insert_time,
    a.description,
    l.make,
    l.description,
    lr.qty,
    log.name AS issued_to,
    lr.submit_time AS issue_date
FROM
    production_data_archive a
LEFT JOIN production_data_archive b ON
    a.job_number = b.job_number AND a.line_item = b.line_item AND a.insert_time < b.insert_time
LEFT JOIN leds_requests lr ON
    lr.order_id = a.order_id AND lr.status_id = 2
LEFT JOIN leds l ON
    l.id = lr.product_id
LEFT JOIN login log ON
    lr.issued_to = LOG.user_id
WHERE
    b.job_number IS NULL AND a.order_id IN(
    SELECT
        order_id
    FROM
        leds_requests
    WHERE
        status_id = 2
) AND a.job_number LIKE :search

 

Edited by mongoose00318
PHPMyAdmin MySQL Formatter changed some stuff in my query. I've updated it.
2 hours ago, mongoose00318 said:
LEFT JOIN leds_requests lr ON
    lr.order_id = a.order_id AND lr.status_id = 2

If you make that a (INNER) JOIN instead of a LEFT JOIN, you can remove...

 

2 hours ago, mongoose00318 said:
AND a.order_id IN(
    SELECT
        order_id
    FROM
        leds_requests
    WHERE
        status_id = 2
)

The join will do the filtering for you.

  • 1 year later...
On 7/8/2021 at 3:32 PM, Barand said:

Try

TABLE: production_data_archive;
+----+----------+------------+-----------+---------------------+
| id | order_id | job_number | line_item | insert_time         |
+----+----------+------------+-----------+---------------------+
|  1 |    16824 |   22000412 | A         | 2021-03-26 00:00:00 |
|  2 |    16824 |   22000412 | A         | 2021-03-30 00:00:00 |
|  3 |    16824 |   22000412 | A         | 2021-04-09 00:00:00 |
|  4 |    16825 |   22000412 | B         | 2021-03-26 00:00:00 |
|  5 |    16825 |   22000412 | B         | 2021-03-29 00:00:00 |
|  6 |    16825 |   22000412 | B         | 2021-04-06 00:00:00 |
+----+----------+------------+-----------+---------------------+


SELECT a.id
     , a.order_id
     , a.job_number
     , a.line_item
     , a.insert_time
FROM production_data_archive a 
     LEFT JOIN
     production_data_archive b
         ON a.job_number = b.job_number
         AND a.line_item = b.line_item   
         AND a.insert_time < b.insert_time
WHERE b.job_number IS NULL;

+----+----------+------------+-----------+---------------------+
| id | order_id | job_number | line_item | insert_time         |
+----+----------+------------+-----------+---------------------+
|  3 |    16824 |   22000412 | A         | 2021-04-09 00:00:00 |
|  6 |    16825 |   22000412 | B         | 2021-04-06 00:00:00 |
+----+----------+------------+-----------+---------------------+

 

@BarandHey Barand, I was curious is there a way I can use this query but exclude any jobs that have more than one line item? For example, above you returned line_item A and B for job 22000412. So because that job has more than one line item I would not want it returned in the query. Hopefully I worded this well...

I've trying something like this so far: 

$sql = '
		SELECT a.id
		     , a.order_id
		     , a.job_number
		     , a.line_item
		     , a.insert_time
		     , a.as400_ship_date
		FROM production_data_archive a 
		     LEFT JOIN
		     production_data_archive b
		         ON a.job_number = b.job_number
		         AND a.line_item = b.line_item   
		         AND a.insert_time < b.insert_time
		WHERE b.job_number IS NULL AND a.as400_ship_date >= now()-interval 2 month AND a.as400_ship_date <= now()
		ORDER BY a.job_number, a.line_item 
	';
	$stmt = $pdo->query($sql);
	$data = $stmt->fetchAll();

	$i = 0;
	$j = '';
	$l = [];
	foreach ( $data as $k => $v ) {
		if( $v['job_number'] == $j ) {
			$l[] = $v['line_item'];
			
		} else {
			
			echo $j . ' : ' . count($l).'<br>';
			$j = $v['job_number'];
			$l = Array();
		}

	}

Its not going great though.

mysql> select * from production_data_archive;
+----+----------+------------+-----------+---------------------+
| id | order_id | job_number | line_item | insert_time         |
+----+----------+------------+-----------+---------------------+
|  1 |    16824 | 22000412   | A         | 2021-03-26 00:00:00 |
|  2 |    16824 | 22000412   | A         | 2021-03-30 00:00:00 |
|  3 |    16824 | 22000412   | A         | 2021-04-09 00:00:00 |
|  4 |    16825 | 22000412   | B         | 2021-03-26 00:00:00 |
|  5 |    16825 | 22000412   | B         | 2021-03-29 00:00:00 |
|  6 |    16825 | 22000412   | B         | 2021-04-06 00:00:00 |
|  7 |    16825 | 22000413   | C         | 2021-03-26 00:00:00 |
|  8 |    16825 | 22000413   | C         | 2021-03-29 00:00:00 |
|  9 |    16825 | 22000413   | C         | 2021-05-06 00:00:00 |
| 10 |    16825 | 22000414   | C         | 2021-03-26 00:00:00 |
| 11 |    16825 | 22000414   | C         | 2021-03-29 00:00:00 |
| 12 |    16825 | 22000414   | C         | 2021-05-06 00:00:00 |
+----+----------+------------+-----------+---------------------+
12 rows in set (0.00 sec)

mysql> SELECT a.id
    ->      , a.order_id
    ->      , a.job_number
    ->      , a.line_item
    ->      , a.insert_time
    -> FROM production_data_archive a
    ->      JOIN (
    ->             SELECT job_number
    ->             FROM production_data_archive
    ->             GROUP BY job_number
    ->             HAVING count(distinct line_item) = 1
    ->           ) nodupe USING (job_number)
    ->      LEFT JOIN
    ->      production_data_archive b
    ->          ON a.job_number = b.job_number
    ->          AND a.line_item = b.line_item
    ->          AND a.insert_time < b.insert_time
    -> WHERE b.job_number IS NULL;
+----+----------+------------+-----------+---------------------+
| id | order_id | job_number | line_item | insert_time         |
+----+----------+------------+-----------+---------------------+
|  9 |    16825 | 22000413   | C         | 2021-05-06 00:00:00 |
| 12 |    16825 | 22000414   | C         | 2021-05-06 00:00:00 |
+----+----------+------------+-----------+---------------------+
2 rows in set (0.03 sec)

 

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.