Jump to content

Employing user variables to select x records for each value of a given column


Recommended Posts

I have a table: products (product_id, shop_id, date)

 

For each shop_id in the table, the following query retrieves the 10 rows with the most recent dates

 

set @num := 0, @shop_id := NULL;


select shop_id, date
from (
   select shop_id, date,
      @num := if(@shop_id = shop_id, @num + 1, 1) as row_number,
      @shop_id := shop_id as dummy
  from products
  order by shop_id, date DESC
) as x where x.row_number <= 10;

 

For example, if there are 10 distinct shop_ids, and each of these shop_ids appears in at least 10 rows, the query will return 100 rows, 10 per shop_id. This query works decently, taking about 0.35 seconds on 80k rows of dummy data. (Method taken from http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/ -- search the page for user variables to see the example I used).

 

In addition to the date and shop_id, I would also like to select the product_id for each row. This query appears to work fine with a test table of about 55k rows. However, on a 65k+ row table, this query causes MySQL to hang indefinitely (using phpMyAdmin, the results never appear):

 

set @num := 0, @shop_id := NULL;


select product_id, shop_id, date
from (
   select product_id, shop_id, date,
      @num := if(@shop_id = shop_id, @num + 1, 1) as row_number,
      @shop_id := shop_id as dummy
  from products
  order by shop_id, date DESC
) as x where row_number <= 10;

 

I cannot figure out what I am doing wrong. I noticed that if I leave product_id in the subquery, but remove it from the main query, the query does not hang (although obviously it does not give me the product_id data I want).

 

I have also tried

 

set @num := 0, @shop_id := 0;

select p.product_id,x.shop_id, x.date
from (
   select shop_id, date,
      @num := if(@shop_id = shop_id, @num + 1, 1) as row_number,
      @shop_id := shop_id as dummy
  from products
  order by shop_id, date DESC
) as x 
join products p on x.shop_id = p.shop_id and s.date = p.date where x.row_number <= 10;

 

This executes about as fast as the original query, but this won't work because dates are not unique, so that a row with a given product_id might match some other row with a different product_id but the same shop_id and date.

 

I'd like to try to the query using a PHP script just to see if this issue is localized to phpMyAdmin, but the original query, which runs fine in phpMyAdmin, gives me a syntax error running in PHP.

 

Query: SET @num :=0, @shop_id := NULL ; SELECT shop_id, date FROM ( SELECT shop_id, date, @num := IF( @shop_id = shop_id, @num +1, 1 ) AS row_number, @shop_id := shop_id AS dummy FROM products ORDER BY shop_id, DATE DESC ) AS x WHERE x.row_number <=10; 
MySQL Error: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '; SELECT shop_id, date FROM ( SELECT shop_id, date, @num := IF( @shop_id = shop_' at line 2

 

I don't get the error when I remove the initialization of the user variables, but then the query doesn't run correctly. If I can figure out why that is, I will test the query there also.

 

MySQL version: 4.1.22-standard-log

 

complete table structure:

 

CREATE TABLE IF NOT EXISTS `products` (
  `product_id` int(10) NOT NULL auto_increment,
  `shop_id` int(10) NOT NULL default '0',
  `title` varchar(120) NOT NULL default '',
  `brand` varchar(80) NOT NULL default '',
  `price` float NOT NULL default '0',
  `image` varchar(255) NOT NULL default '',
  `link` varchar(255) NOT NULL default '',
  `date` datetime NOT NULL default '0000-00-00 00:00:00',
  PRIMARY KEY  (`product_id`),
  UNIQUE KEY `shop_id` (`shop_id`,`link`),
  UNIQUE KEY `title_2` (`title`,`image`),
  KEY `brand` (`brand`),
  KEY `title` (`title`),
  KEY `date` (`date`),
  KEY `shop_id_2` (`shop_id`)
) ENGINE=MyISAM  DEFAULT CHARSET=latin1 AUTO_INCREMENT=129402 ;

figured I'd note that this problem is solved, for some reason the query hangs in phpMyAdmin run against a table with more than about 65K rows, but works fine when run through mysql_query() as two separate queries for SET and SELECT. so the lesson seems to be to try running the query in the command-line or through a script if it doesn't work in phpMyAdmin

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.