Jump to content

Employing user variables to select x records for each value of a given column


bitt3n

Recommended Posts

I have a table: products (product_id, shop_id, date)

 

For each shop_id in the table, the following query retrieves the 10 rows with the most recent dates

 

set @num := 0, @shop_id := NULL;


select shop_id, date
from (
   select shop_id, date,
      @num := if(@shop_id = shop_id, @num + 1, 1) as row_number,
      @shop_id := shop_id as dummy
  from products
  order by shop_id, date DESC
) as x where x.row_number <= 10;

 

For example, if there are 10 distinct shop_ids, and each of these shop_ids appears in at least 10 rows, the query will return 100 rows, 10 per shop_id. This query works decently, taking about 0.35 seconds on 80k rows of dummy data. (Method taken from http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/ -- search the page for user variables to see the example I used).

 

In addition to the date and shop_id, I would also like to select the product_id for each row. This query appears to work fine with a test table of about 55k rows. However, on a 65k+ row table, this query causes MySQL to hang indefinitely (using phpMyAdmin, the results never appear):

 

set @num := 0, @shop_id := NULL;


select product_id, shop_id, date
from (
   select product_id, shop_id, date,
      @num := if(@shop_id = shop_id, @num + 1, 1) as row_number,
      @shop_id := shop_id as dummy
  from products
  order by shop_id, date DESC
) as x where row_number <= 10;

 

I cannot figure out what I am doing wrong. I noticed that if I leave product_id in the subquery, but remove it from the main query, the query does not hang (although obviously it does not give me the product_id data I want).

 

I have also tried

 

set @num := 0, @shop_id := 0;

select p.product_id,x.shop_id, x.date
from (
   select shop_id, date,
      @num := if(@shop_id = shop_id, @num + 1, 1) as row_number,
      @shop_id := shop_id as dummy
  from products
  order by shop_id, date DESC
) as x 
join products p on x.shop_id = p.shop_id and s.date = p.date where x.row_number <= 10;

 

This executes about as fast as the original query, but this won't work because dates are not unique, so that a row with a given product_id might match some other row with a different product_id but the same shop_id and date.

 

I'd like to try to the query using a PHP script just to see if this issue is localized to phpMyAdmin, but the original query, which runs fine in phpMyAdmin, gives me a syntax error running in PHP.

 

Query: SET @num :=0, @shop_id := NULL ; SELECT shop_id, date FROM ( SELECT shop_id, date, @num := IF( @shop_id = shop_id, @num +1, 1 ) AS row_number, @shop_id := shop_id AS dummy FROM products ORDER BY shop_id, DATE DESC ) AS x WHERE x.row_number <=10; 
MySQL Error: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '; SELECT shop_id, date FROM ( SELECT shop_id, date, @num := IF( @shop_id = shop_' at line 2

 

I don't get the error when I remove the initialization of the user variables, but then the query doesn't run correctly. If I can figure out why that is, I will test the query there also.

 

MySQL version: 4.1.22-standard-log

 

complete table structure:

 

CREATE TABLE IF NOT EXISTS `products` (
  `product_id` int(10) NOT NULL auto_increment,
  `shop_id` int(10) NOT NULL default '0',
  `title` varchar(120) NOT NULL default '',
  `brand` varchar(80) NOT NULL default '',
  `price` float NOT NULL default '0',
  `image` varchar(255) NOT NULL default '',
  `link` varchar(255) NOT NULL default '',
  `date` datetime NOT NULL default '0000-00-00 00:00:00',
  PRIMARY KEY  (`product_id`),
  UNIQUE KEY `shop_id` (`shop_id`,`link`),
  UNIQUE KEY `title_2` (`title`,`image`),
  KEY `brand` (`brand`),
  KEY `title` (`title`),
  KEY `date` (`date`),
  KEY `shop_id_2` (`shop_id`)
) ENGINE=MyISAM  DEFAULT CHARSET=latin1 AUTO_INCREMENT=129402 ;

figured I'd note that this problem is solved, for some reason the query hangs in phpMyAdmin run against a table with more than about 65K rows, but works fine when run through mysql_query() as two separate queries for SET and SELECT. so the lesson seems to be to try running the query in the command-line or through a script if it doesn't work in phpMyAdmin

Archived

This topic is now archived and is closed to further replies.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.