benanamen Posted March 20, 2016 Share Posted March 20, 2016 A few times I have seen in the case of SELECT username WHERE user_id=? there is LIMIT 1. My immediate thought was, why add the LIMIT 1 since there should only be one result anyways until I saw an explanation somewhere else which said it will stop MySQL from continuing searching the DB for more matches. A light bulb went off because this made sense to me. My question is, is that really the case? And if so, are we talking enough of a performance gain for it to even matter? (Lets consider using something as big as Facebook). If that is the best practice, I am surprised I almost never see it. Quote Link to comment Share on other sites More sharing options...
Barand Posted March 20, 2016 Share Posted March 20, 2016 If user_id is indexed I suspect it makes very little difference. OTOH, if it isn't indexed and the whole table would require scanning, then there may be an advantage in the LIMIT 1 depending on the target record's position in the table. 1 Quote Link to comment Share on other sites More sharing options...
QuickOldCar Posted March 20, 2016 Share Posted March 20, 2016 I'll agree with you. If have many records should have an index and would be using that. Since you mentioned Facebook I'll tell you what I discovered. Is many big sites like Facebook using big tables and NoSQL like cassandra. Facebook created that and open sourced it, I read they stopped using it in 2014 and not sure what are using now. They use a combination of methods to get data. Structured important data such as users info, wall posts, timeline and so on are with mysql and replicated across various data centers. Memcached is used to cache a lot of data and objects into ram. Haystack to handle a http based photo server. Info from facebook. Scribe to handle logging. Varnish http accelerator that acts like a load balancer and cache. Hiphop php transpiler but I believe using virtual machines...HHVM. Quote Link to comment Share on other sites More sharing options...
benanamen Posted March 20, 2016 Author Share Posted March 20, 2016 (edited) Facebook now uses HACK for coding. Not sure of the DB. Edited March 20, 2016 by benanamen Quote Link to comment Share on other sites More sharing options...
Psycho Posted March 21, 2016 Share Posted March 21, 2016 A few times I have seen in the case of SELECT username WHERE user_id=? there is LIMIT 1. If user_id is indexed I suspect it makes very little difference. OTOH, if it isn't indexed and the whole table would require scanning, then there may be an advantage in the LIMIT 1 depending on the target record's position in the table. The query in question is selecting "username" based on a condition for the "user_id". "user_id" should, in most cases. be indexed. But, a more relevant use case would be a logon script that is going to search based on the "username" which, other than logon, would probably not normally be used for any conditional matching. So, that would be a valid case to use LIMIT 1. Unless I am missing something, I would look at it this way: Using it where not needed (i.e. criteria field is indexed) would have no negative affects, but using it where it is needed would have positive performance effects. So, it could be a good practice to use it whenever it is applicable. But, be sure to account for scenarios where you only expect one record, but where there are edge case scenarios where multiple records could be created. Quote Link to comment Share on other sites More sharing options...
benanamen Posted March 21, 2016 Author Share Posted March 21, 2016 (edited) My bad, I meant WHERE username and the use case being a login and only one possible username match. user_id would have been the indexed auto increment which is not the column the WHERE would be checking. Edited March 21, 2016 by benanamen Quote Link to comment Share on other sites More sharing options...
Psycho Posted March 21, 2016 Share Posted March 21, 2016 My bad, I meant WHERE username and the use case being a login and only one possible username match. user_id would have been the indexed auto increment which is not the column the WHERE would be checking. Then a LIMIT 1 would make sense. The username field would likely not be indexed if there is a user_id field used for associations. Quote Link to comment Share on other sites More sharing options...
Jacques1 Posted March 21, 2016 Share Posted March 21, 2016 The username should be indexed. Otherwise the database system has to perform a full table scan whenever somebody logs in. In fact, it sounds rather silly to omit the index and then use LIMIT 1 to “improve performance”. If you care about performance, use an index. This will be much, much more effective than any LIMIT hack. Quote Link to comment Share on other sites More sharing options...
Psycho Posted March 21, 2016 Share Posted March 21, 2016 The username should be indexed. Otherwise the database system has to perform a full table scan whenever somebody logs in. In fact, it sounds rather silly to omit the index and then use LIMIT 1 to “improve performance”. If you care about performance, use an index. This will be much, much more effective than any LIMIT hack. I don't necessarily disagree with you. But, indexing has it's own overhead (more memory consumption, slower write operations, etc.). Depending on the nature of the application, a user may log in once and then do many, many other operations. The login call may be so infrequent and so low in performance impact that adding an index for that one use case may do more harm than good. In any case, there should be proper logging to capture the frequency and duration of particular processes to determine where performance improvements should be made. I agree using "LIMIT 1" for the purpose of performance is a "hackish" solution. I just don't see any downside to using it. Quote Link to comment Share on other sites More sharing options...
benanamen Posted March 21, 2016 Author Share Posted March 21, 2016 So Jaques1, are you saying that by indexing username there is no reason to use the Limit 1? I suppose its time I whip out some code and see what actually happens. Quote Link to comment Share on other sites More sharing options...
benanamen Posted March 21, 2016 Author Share Posted March 21, 2016 (edited) Ok, I did some testing on 300,000 employee records. Based on my test results, the fastest option is set an index and no limit 1 No Index Average .97 .0554 1.142 1.155 1.094 .700 1.153 1.143 No Index Limit 1 Average.38 .423 .354 .388 .348 .412 .405 .333 Indexed Average.026 .046 .026 .022 .020 .019 Indexed Limit 1 . Average 031 .020 .029 .042 .038 .042 .021 .029 Edited March 21, 2016 by benanamen Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.