stubarny Posted January 29, 2012 Share Posted January 29, 2012 Hello, Please could you show me how to find the 7th unqiue website_page_name from the table website_page_names? The code below finds the 8th website_page_name but includes duplicates, I'd like to only count through unique website page names - please could you point me in the right direction? $query ="SELECT website_page_name FROM website_page_names ORDER BY website_page_name ASC LIMIT 7, 1"; Many Thanks, Stu Quote Link to comment Share on other sites More sharing options...
AyKay47 Posted January 29, 2012 Share Posted January 29, 2012 the query provided specifies that only 1 row should be returned, so I am confused as to how you are receiving multiple rows? Quote Link to comment Share on other sites More sharing options...
mikosiko Posted January 29, 2012 Share Posted January 29, 2012 you maybe want to GROUP BY and select those elements with count() = 1 and then LIMIT your results starting in 6 Quote Link to comment Share on other sites More sharing options...
stubarny Posted January 30, 2012 Author Share Posted January 30, 2012 Thanks guys, Just to clarify - I only want to return the Nth record (only one record) but I want to operate the LIMIT function by counting only unique rows (therefore completely ignoring duplicates). Therefore if I had a list of colours below the 3rd unique row is "pink". blue green green pink you maybe want to GROUP BY and select those elements with count() = 1 and then LIMIT your results thanks mikosiko - please could you tell me if this would work quickly for a large database (>10 million records)? Quote Link to comment Share on other sites More sharing options...
mikosiko Posted January 30, 2012 Share Posted January 30, 2012 having in mind your data example, you don't need to use GROUP BY nor count() (unique has not the same meaning as distinct) for your objectives just use DISTINCT and LIMIT (starting at 6 or whatever you need) Quote Link to comment Share on other sites More sharing options...
fenway Posted January 30, 2012 Share Posted January 30, 2012 DISTINCT will be very slow on large datasets. Quote Link to comment Share on other sites More sharing options...
mikosiko Posted January 30, 2012 Share Posted January 30, 2012 DISTINCT will be very slow on large datasets. yes, I agree Quote Link to comment Share on other sites More sharing options...
AyKay47 Posted January 30, 2012 Share Posted January 30, 2012 for your objectives just use DISTINCT and LIMIT (starting at 6 or whatever you need) Quote Link to comment Share on other sites More sharing options...
mikosiko Posted January 30, 2012 Share Posted January 30, 2012 for your objectives just use DISTINCT and LIMIT (starting at 6 or whatever you need) and your point is? Quote Link to comment Share on other sites More sharing options...
stubarny Posted January 30, 2012 Author Share Posted January 30, 2012 Thanks guys. OK I perhaps i need to rethink my database design if this is going to get slow over time. I have a database made up of website addresses and company names. Everytime a user submits a company name in a form they are also asked to submit the website address. (this is to make sure that the websites are automatically kept up to date). So whenever I need to find the latest website address for a company I just refer to the match with the latest timestamp. Please could you tell me the best way of going about this? How about using a cronjob to delete the duplicates (but I don't like deleting data in case of errors)? Or is there something like a 'DISTINCT' index type in MYSQL? Thanks for your help, Stu Quote Link to comment Share on other sites More sharing options...
mikosiko Posted January 30, 2012 Share Posted January 30, 2012 So whenever I need to find the latest website address for a company I just refer to the match with the latest timestamp. How the objectives that you just posted are relate to your first one? ... show me how to find the 7th unqiue website_page_name from the table website_page_names? both seems very different to me. what problem do you have using the timestamp?, and moreover is any reason why you are storing more than the latest website for a company? for better help please post your tables and the code that you have currently. BTW: To clarify... As fenway said DISTINCT is very slow on large datasets but, when used with LIMIT it is not critical (obviously assuming that your LIMIT's values are small), as an example: I do have a table with around 21 millions records and a SELECT DISTINCT <any_fieldname> FROM <the_table> LIMIT 50 only took around 0.0011s (0.0001s) ... that is no slow to me at all. Quote Link to comment Share on other sites More sharing options...
kickstart Posted January 31, 2012 Share Posted January 31, 2012 I have a database made up of website addresses and company names. Everytime a user submits a company name in a form they are also asked to submit the website address. (this is to make sure that the websites are automatically kept up to date). So whenever I need to find the latest website address for a company I just refer to the match with the latest timestamp. In that case I would have a select for company and the max date for that name. Something like this (made up tables and columns):- SELECT Company, MAX(Date_Added) FROM CompanyWebSites Then JOIN that back against the company web sites table to get the rest of the details SELECT Sub1.Company, Sub1.LatestDateAdded, WebSiteName FROM SELECT Company, MAX(Date_Added) AS LatestDateAdded FROM CompanyWebSites) Sub1 INNER JOIN CompanyWebSites ON Sub1.Company = CompanyWebSites.Company AND sub1.LatestDateAdded = CompanyWebSites.Date_Added Assuming that date added is unique for any companies web sites then that should give you what you want. It would probably be more efficient to have a table of companies as well and use a unique numeric company id rather than the name. All the best Keith Quote Link to comment Share on other sites More sharing options...
stubarny Posted January 31, 2012 Author Share Posted January 31, 2012 Thanks guys, much appreciated. BTW: To clarify... As fenway said DISTINCT is very slow on large datasets but, when used with LIMIT it is not critical (obviously assuming that your LIMIT's values are small), as an example: I do have a table with around 21 millions records and a SELECT DISTINCT <any_fieldname> FROM <the_table> LIMIT 50 only took around 0.0011s (0.0001s) ... that is no slow to me at all. Many thanks for that - I'll keep the design as it is. All the best, Stu Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.