Jump to content

SELECT query - apply LIMIT condition to unique items only


stubarny

Recommended Posts

Hello,

 

Please could you show me how to find the 7th unqiue website_page_name from the table website_page_names?

 

The code below finds the 8th website_page_name but includes duplicates, I'd like to only count through unique website page names - please could you point me in the right direction?

 

$query ="SELECT website_page_name FROM website_page_names ORDER BY website_page_name ASC LIMIT 7, 1";

 

Many Thanks,

 

Stu

Link to comment
Share on other sites

Thanks guys,

 

Just to clarify - I only want to return the Nth record (only one record) but I want to operate the LIMIT function by counting only unique rows (therefore completely ignoring duplicates).

 

Therefore if I had a list of colours below the 3rd unique row is "pink".

 

blue

green

green

pink

 

you maybe want to GROUP BY and select those elements with count() = 1 and then LIMIT your results

 

thanks mikosiko - please could you tell me if this would work quickly for a large database (>10 million records)?

Link to comment
Share on other sites

Thanks guys.

 

OK I perhaps i need to rethink my database design if this is going to get slow over time.

 

I have a database made up of website addresses and company names. Everytime a user submits a company name in a form they are also asked to submit the website address. (this is to make sure that the websites are automatically kept up to date).

 

So whenever I need to find the latest website address for a company I just refer to the match with the latest timestamp.

 

Please could you tell me the best way of going about this? How about using a cronjob to delete the duplicates (but I don't like deleting data in case of errors)? Or is there something like a 'DISTINCT' index type in MYSQL?

 

Thanks for your help,

 

Stu

Link to comment
Share on other sites

So whenever I need to find the latest website address for a company I just refer to the match with the latest timestamp.

 

How the objectives that you just posted are relate to your first one?

 

... show me how to find the 7th unqiue website_page_name from the table website_page_names?

 

both seems very different to me.

 

what problem do you have using the timestamp?, and moreover is any reason why you are storing more than the latest website for a company?

 

for better help please post your tables and the code that you have currently.

 

BTW: To clarify... As fenway said DISTINCT is very slow on large datasets but, when used with LIMIT it is not critical (obviously assuming that your LIMIT's values are small), as an example: I do have a table with around 21 millions records and a SELECT DISTINCT <any_fieldname> FROM <the_table> LIMIT 50 only took around 0.0011s (0.0001s) ... that is no slow to me at all.

Link to comment
Share on other sites

I have a database made up of website addresses and company names. Everytime a user submits a company name in a form they are also asked to submit the website address. (this is to make sure that the websites are automatically kept up to date).

 

So whenever I need to find the latest website address for a company I just refer to the match with the latest timestamp.

 

In that case I would have a select for company and the max date for that name. Something like this (made up tables and columns):-

 

SELECT Company, MAX(Date_Added) FROM CompanyWebSites

 

Then JOIN that back against the company web sites table to get the rest of the details

 

SELECT Sub1.Company, Sub1.LatestDateAdded, WebSiteName
FROM SELECT Company, MAX(Date_Added) AS LatestDateAdded FROM CompanyWebSites) Sub1
INNER JOIN CompanyWebSites
ON Sub1.Company = CompanyWebSites.Company AND sub1.LatestDateAdded = CompanyWebSites.Date_Added

 

Assuming that date added is unique for any companies web sites then that should give you what you want.

 

It would probably be more efficient to have a table of companies as well and use a unique numeric company id rather than the name.

 

All the best

 

Keith

Link to comment
Share on other sites

Thanks guys, much appreciated.

 

BTW: To clarify... As fenway said DISTINCT is very slow on large datasets but, when used with LIMIT it is not critical (obviously assuming that your LIMIT's values are small), as an example: I do have a table with around 21 millions records and a SELECT DISTINCT <any_fieldname> FROM <the_table> LIMIT 50 only took around 0.0011s (0.0001s) ... that is no slow to me at all.

 

Many thanks for that - I'll keep the design as it is.

 

All the best,

 

Stu

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.