tomdelonge Posted October 12, 2008
Let's say I have a huge database table (about 1,000 to 10,000 entries) and I need to randomly select 50 rows and display them. What's the best way, meaning the fastest and using the least computing power? Should I put each row into an array and then use PHP's shuffle() function? Or should I come up with random numbers first and then look up each of those rows? Or how would you do it? So basically, I need 50 randomly selected rows from a database; what's the best way to do it? (It seems to me that an array of that size would take a while to shuffle, but maybe I'm wrong.) Just let me know how you would accomplish this. Thanks!
R0bb0b Posted October 12, 2008
I would pull the 50 random rows straight from the database; I wouldn't even consider building an array of 1,000 to 10,000 entries just to get 50 random ones. Even if you wanted more than 50, the database's random function is going to be much faster than PHP.
tomdelonge (Author) Posted October 12, 2008
So exactly what do you do? Is there a function in MySQL to randomly select rows?
genericnumber1 Posted October 12, 2008
Here's a post I've seen with alternatives to ORDER BY RAND() for large tables such as yours: http://www.phpfreaks.com/forums/index.php?topic=125759.0
ibechane Posted October 13, 2008
One function you can try using is mysql_data_seek(): http://us3.php.net/mysql_data_seek
This function lets you choose which row of a result set you'll read from next. For example:

<?php
$connection = mysql_connect('localhost', 'username', 'password');
mysql_select_db('your_database', $connection); // placeholder database name

$query = 'SELECT * FROM mytable';
$resource = mysql_query($query, $connection);

mysql_data_seek($resource, 23);
$twenty_fourth_row_data = mysql_fetch_array($resource);

mysql_data_seek($resource, 0);
$first_row_data = mysql_fetch_array($resource); // row numbers start at 0!
// etc.
?>

So you see, mysql_data_seek allows you to jump to any row number of your query results. Now, to solve your problem, figure out a way to get 50 distinct random integers between 0 and the number of rows in your result, then call mysql_data_seek with each of those numbers and extract that row's data to use however you like. Hope that helps.
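Putting the pieces together, a rough sketch of that idea might look like this (credentials, database and table names are placeholders, not the original poster's schema):

<?php
// One query, then seek to 50 distinct random offsets of the result set.
$connection = mysql_connect('localhost', 'username', 'password');
mysql_select_db('your_database', $connection);

$resource = mysql_query('SELECT * FROM mytable', $connection);
$total    = mysql_num_rows($resource);
$want     = min(50, $total);

// array_rand() over a 0..$total-1 range returns $want distinct offsets.
$offsets = (array) array_rand(range(0, $total - 1), $want);

$rows = array();
foreach ($offsets as $offset) {
    mysql_data_seek($resource, $offset);   // jump to that row of the result
    $rows[] = mysql_fetch_assoc($resource);
}
?>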
ibechane Posted October 13, 2008
I should add some reasons to use my method:
1. You ask MySQL to do as little querying as possible (you only need one query).
2. You do not put any unused data into memory, as you would if you dumped all of the results into an array.
3. You have more control over how you randomize your numbers, since you are doing it through PHP.
Lamez Posted October 13, 2008
What I would do is find how many entries are in the table and use that as your range, then use the random function 50 times to get 50 different numbers, then build a query from those and echo the results out.
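Roughly like this sketch (it assumes an auto_increment id primary key with no gaps; with gaps you may get back fewer than 50 rows, and the table and column names are made up):

<?php
// Find the range, pick 50 distinct random ids, then query for just those rows.
$result = mysql_query('SELECT MAX(id) AS max_id FROM mytable');
$row    = mysql_fetch_assoc($result);
$max_id = (int) $row['max_id'];

$ids = array();
while (count($ids) < 50) {
    $ids[mt_rand(1, $max_id)] = true;   // using array keys keeps the picks distinct
}

$sql    = 'SELECT * FROM mytable WHERE id IN (' . implode(',', array_keys($ids)) . ')';
$result = mysql_query($sql);
while ($row = mysql_fetch_assoc($result)) {
    echo $row['id'] . "<br />\n";       // echo it out
}
?>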
bluejay002 Posted October 13, 2008
@ibechane: sorry, but I don't really recommend that one. First, your method pulls every record in the database, which I don't find necessary: why would I pull all 10,000 records if I can pull just 50 at random directly from the database? Secondly, you do not need to put them into an array, since repeatedly calling mysql_fetch_assoc() or a related function is enough if you only want them displayed. Lastly, you can do a pretty good random with RAND() in MySQL, and it is faster.
@Lamez: I suggest doing it through the query. It is faster and uses less memory.
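For instance, a minimal sketch of doing it through the query (table and column names are placeholders):

<?php
// Let MySQL pick the 50 random rows; display each one as it is fetched,
// with no intermediate array.
$result = mysql_query('SELECT * FROM mytable ORDER BY RAND() LIMIT 50');
while ($row = mysql_fetch_assoc($result)) {
    echo $row['id'] . ': ' . $row['title'] . "<br />\n";
}
?>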
Lamez Posted October 13, 2008
@bluejay002 lol, I thought that was understood.
bluejay002 Posted October 13, 2008
lolz. yeah but since I came in late, I just gave my opinion.
Lamez Posted October 13, 2008
It's all good!
genericnumber1 Posted October 13, 2008
I'd really advise against ORDER BY RAND() LIMIT 50 with 10k rows, because RAND() isn't very efficient... it creates a temporary table with all of the rows (all 10k) shuffled, and then you pull off 50 of them. There are much more efficient ways; it's just unfortunate that they aren't all as simple.
R0bb0b Posted October 13, 2008
Quote: "I'd really advise against ORDER BY RAND() LIMIT 50 with 10k rows because RAND() isn't very efficient... it creates a temporary table with all of the rows (all 10k) shuffled and then you pull off 50 of them. There are much more efficient ways, it's just unfortunate that they aren't all as simple."
Agreed, but that just depends on how much speed you absolutely must have and how dedicated you are to getting it. It is easy to get around this issue when pulling one random record, but it becomes complicated when pulling multiple random records. I can think of one way, but it wouldn't be pretty.
Barand Posted October 13, 2008
I just happened to have a table with exactly 10,000 rows, so to give some idea of the time taken:
[pre]mysql> SELECT * FROM cells c
    -> ORDER BY RAND() LIMIT 50;
+---------+--------+
| idcells | colour |
+---------+--------+
|    9035 | BF4432 |
|    1450 | 98EBEE |
|    9574 | BCB385 |
|    6475 | 6C6F68 |
|    2508 | EC8466 |
|     958 | F10D0E |
|    5093 | B27687 |
|    4138 | 81207F |
|    8964 | 9E1CB5 |
|    4403 | AE9E51 |
|    6727 | 53201F |
|    2747 | 221FC1 |
|    2952 | F308F0 |
|    8567 | 889882 |
|    6131 | 7769F7 |
|    1559 | 268EC2 |
|    1001 | E84DE6 |
|     238 | 826AA2 |
|    1785 | CC2EB7 |
|    8699 | 8F3D69 |
|    7695 | BF0311 |
|    7750 | 244639 |
|    8158 | D98A29 |
|    1596 | C29EAA |
|    4439 | D88EF9 |
|     675 | 28C20E |
|    1083 | ED8E04 |
|    6716 | 4A8672 |
|    5328 | 993847 |
|    3572 | 50FF10 |
|    6687 | AADFC2 |
|    5176 | D47455 |
|     224 | 57B54D |
|    2538 | 4384DC |
|     363 | 8F84F4 |
|    3660 | B16533 |
|    6936 | 8F786D |
|    3338 | F2A23E |
|    5345 | CC7AC4 |
|    9730 | 3A4E47 |
|    2075 | F381DF |
|    8750 | 9E3843 |
|    8510 | 324079 |
|    1827 | 195FE7 |
|    3522 | 70FCDF |
|     177 | 7EDBDE |
|    6806 | 65C130 |
|    4590 | 0C2C3C |
|    6692 | 3B4026 |
|    1889 | A12B21 |
+---------+--------+
50 rows in set (0.02 sec)
[/pre]
genericnumber1 Posted October 13, 2008
Hey, I'm just going off of what I've always been told/read. It would be interesting to benchmark ORDER BY RAND() against many of the other methods I've seen.
R0bb0b Posted October 13, 2008
But as we all know, things aren't always quite so simple in the real world. I know at my job I average at least 5 joins per query; I just built one that joins 11 tables and a switch. So it works in this case and probably yours, but I can see where it would be useful to find a way to go in with a comma-separated list of PKs. And BTW, one of those tables stores 5 years of medical insurance claims, so you can imagine how large that table is. Of course it's on Postgres, but same concept.
Barand Posted October 13, 2008
Quote: "Hey, I'm just going off of what I've always been told/read. It would be interesting to benchmark ORDER BY RAND() against many of the other methods I've seen."
Method suggested in the earlier post:
[pre]mysql> SELECT
    ->     *
    -> FROM
    ->     cells AS t1
    -> INNER JOIN
    ->     (
    ->     SELECT
    ->         ROUND(RAND() * (SELECT MAX(idcells) FROM cells)) AS id
    ->     ) AS t2
    -> ON
    ->     t1.idcells >= t2.id
    -> ORDER BY
    ->     t1.idcells ASC
    -> LIMIT 50;
+---------+--------+------+
| idcells | colour | id   |
+---------+--------+------+
|    3810 | B31FF4 | 3810 |
|    3811 | 14750F | 3810 |
|    3812 | 7FFED6 | 3810 |
|    3813 | E874AD | 3810 |
|    3814 | 4EC40E | 3810 |
|    3815 | FAD638 | 3810 |
|    3816 | 023CE6 | 3810 |
|    3817 | 5E8B4C | 3810 |
|    3818 | 75CE5E | 3810 |
|    3819 | 82FB9F | 3810 |
|    3820 | B6B571 | 3810 |
|    3821 | 69483D | 3810 |
|    3822 | 4E464E | 3810 |
|    3823 | F38BCB | 3810 |
|    3824 | 802EA0 | 3810 |
|    3825 | 1215C7 | 3810 |
|    3826 | 84B3C6 | 3810 |
|    3827 | 15AFC5 | 3810 |
|    3828 | CAF11A | 3810 |
|    3829 | E1DAB2 | 3810 |
|    3830 | 3D1E2D | 3810 |
|    3831 | 310F14 | 3810 |
|    3832 | 7CC608 | 3810 |
|    3833 | DCFE47 | 3810 |
|    3834 | 210E6C | 3810 |
|    3835 | 0FD2C1 | 3810 |
|    3836 | FDF412 | 3810 |
|    3837 | 8E6A4E | 3810 |
|    3838 | 598CEA | 3810 |
|    3839 | F6A053 | 3810 |
|    3840 | 36435E | 3810 |
|    3841 | FC850E | 3810 |
|    3842 | 8C1F90 | 3810 |
|    3843 | AEA2D3 | 3810 |
|    3844 | 8EFC97 | 3810 |
|    3845 | B13850 | 3810 |
|    3846 | E3D0C5 | 3810 |
|    3847 | 807FC8 | 3810 |
|    3848 | EEE7E3 | 3810 |
|    3849 | B2EA5B | 3810 |
|    3850 | 062672 | 3810 |
|    3851 | 34603B | 3810 |
|    3852 | BD4BEA | 3810 |
|    3853 | 8A83F7 | 3810 |
|    3854 | 1C2AFE | 3810 |
|    3855 | 11EDB3 | 3810 |
|    3856 | E3F0D5 | 3810 |
|    3857 | 3F6D6D | 3810 |
|    3858 | CD6351 | 3810 |
|    3859 | DF4D39 | 3810 |
+---------+--------+------+
50 rows in set (0.05 sec)
[/pre]
Throw those text books away!
EDIT: I think this only works for LIMIT 1.
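One way around that limitation (just a rough sketch, not something from the linked thread) is to repeat the single-row pick from PHP until you have 50 distinct rows, accepting the extra round trips; note that the RAND() * MAX(id) pick is slightly biased toward ids that follow gaps:

<?php
// Rough sketch: collect 50 distinct rows by repeating the single-row trick.
// Uses the cells/idcells names from the example above and assumes an open
// mysql_* connection; trades 50 small indexed lookups for not sorting the
// whole table.
$rows = array();
while (count($rows) < 50) {
    $sql = 'SELECT t1.*
            FROM cells AS t1
            INNER JOIN (
                SELECT ROUND(RAND() * (SELECT MAX(idcells) FROM cells)) AS id
            ) AS t2 ON t1.idcells >= t2.id
            ORDER BY t1.idcells ASC
            LIMIT 1';
    $row = mysql_fetch_assoc(mysql_query($sql));
    if ($row) {
        $rows[$row['idcells']] = $row;   // keyed by id, so repeat picks are dropped
    }
}
?>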
genericnumber1 Posted October 13, 2008
Yeah, forget those ol' methods. Not that I use random rows on a regular basis, but if I do, I'll be sure to benchmark instead of just sticking with what I've read.