Jump to content

[SOLVED] How to randomize mysql_fetch_array($result) ?


Recommended Posts

I have a database of images--currently about 100 rows.  Because it's so small, I'm getting away with this mysql query:

 

SELECT id, imagefile, imagemeta FROM image WHERE categoryID != 0 ORDER BY RAND() LIMIT 25;

 

The problem is the database is growing and I need a better solution... ORDER BY RAND() is notoriously slow.

 

I was considering this:

 

SELECT id, imagefile, imagemeta FROM image WHERE categoryID != 0 LIMIT 25;

 

...and put the results in an array, use array_rand() or shuffle() to randomize the array, then loop through the array...

 

$sql ='SELECT id, imagefile, imagemeta FROM image WHERE categoryID != 0 LIMIT 25;';
$result = mysql_query($sql);
$keys = array_keys(mysql_fetch_array($result, MYSQL_BOTH));
shuffle($keys);
while (mysql_fetch_array($result, MYSQL_BOTH)
{
   $k = $keys['id']
   echo $result[$k]['imagefile'];
   next($keys);
}

 

Can someone help me turn this pseudo code into working code?

 

* will shuffle() randomize the id returned by the query?

 

* how do I echo the imagefile and imagemeta in a loop?

 

Thanks in advance.

tried... but can't get that code to work...

 

I need a set of 25 random rows.  if I can subquery or join to (SELECT id from image ORDER BY RAND() LIMIT 25) that would optimize things a lot since id is the PK.

 

in pseudo sql, it might look like this:

 

SELECT * FROM

image AS t1

INNER JOIN

(SELECT id from image ORDER BY RAND() LIMIT 25 AS t2) ON t1.id = t2.id;

 

Can this be rewritten to work?

The database will not grow beyond 500 rows... so I think using two queries will be optimization enough:

 

QUERY 1:

SELECT id FROM image ORDER BY RAND() LIMIT 25;

this is a lot quicker than SELECT *

 

QUERY 2:

SELECT * FROM image WHERE id IN ([query 1 results]) AND ([other criteria here]);

 

For some reason I've been unsuccessful combining these queries with a subquery.

 

Even a very simple subquery fails from the phpMySql SQL box.

 

I think my hosting provider has disallowed subqueries on the database.

 

Is it possible to write a INNER JOIN query using (something similar) to these two queries?

 

 

I don't know if your method is more efficient or not, but this is how I think the query should be written

 

SELECT *
FROM image
WHERE id IN (SELECT id FROM image ORDER BY RAND() LIMIT 25)

 

Additional WHERE clauses should go in the subquery. Otherwise the sub query would get 25 records and the WHERE clause for the main query might exclude some of those because of additional criteria

Thanks for the reply.

 

I tried that exact query and received the same error I've received with other subqueries:

 

Error

SELECT *
FROM image
WHERE id
IN (
SELECT id
FROM image
ORDER BY RAND( )
LIMIT 25
)

MySQL said:
#1064 - You have an error in your SQL syntax. Check the manual
that corresponds to your MySQL server version for the right syntax
to use near 'SELECT id FROM image ORDER BY RAND() LIMIT 25)' at line 3

 

I run the query from the phpMyAdmin web interface that my web host provides.

 

I think the problem is the server version of mysql: Server version: 4.0.27-standard-log

I think the Client version is: MySQL client version: 4.1.22 (unless that pertains to phpMyAdmin).

 

From what I've read (and discovered by trial and error) subqueries are not supported below Server version of 4.1.

 

In any case, that's why I think I need some way to JOIN the table to itself.

 

I don't know if creation of a temporary table is necessary for that or not.

 

Here's what I think it looks like in pseudo sql:

 

SELECT * FROM image as t1 INNER JOIN (SELECT id FROM image as t2 ORDER BY RAND LIMIT 25) ON t1.id = t2.id;

 

But I think this will also fail due to subqueries being unsupported in 4.0.

 

The only other solution I can think of is to use two separate queries:

 

STEP 1:

SELECT id FROM image as ORDER BY RAND() LIMIT 25;

get 25 random PKs from table image

 

STEP 2:

$keys = array_keys(mysql_fetch_array($result, MYSQL_BOTH)

put the PKs in an array

 

STEP 3:

SELECT * FROM image WHERE id LIKE (34, 58, 98, 120, 27...);

build another sql statement with the contents of $keys in LIKE ($keys[0], keys[1] ...)

 

In terms of milliseconds... not sure what the overall gain would be.

 

I wonder if using php to randomize an array of numbers might be better in this case and just 'SELECT id FROM image', dump it into an array, shuffle the array, splice out 25, and loop over it.  I tried this but had trouble working with the array retrieved from mysql... could not loop over it and echo the proper element.

 

Do you think this would work?  Other suggestions?

 

 

As a follow up, this is where I'm having trouble echoing the proper keys:

 

$sql ='SELECT id FROM image WHERE categoryID = 0;';
$result = mysql_query($sql);
$keys = array_keys(mysql_fetch_array($result, MYSQL_BOTH));
//shuffle($keys); ??
//array_splice($keys, 25); ??
foreach ($keys as $key)
{
$ids = $ids.$key.', ';
}
$s = (strlen($ids)-2);
$ids = substr($ids,0, $s);
$sql = "SELECT * FROM quote WHERE id IN (".$ids.");";
echo $sql;

 

Once I have the sql statement with the random ids...

 

$result = mysql_query($sql);
while (mysql_fetch_array($result, MYSQL_BOTH))
{
echo into html;
}

 

How do I echo the elements I need from $result array?

Thanks for the reply, but I think I should have asked "How do I echo the elements of $keys?"

 

$result = mysql_query($sql);
$keys = array_keys(mysql_fetch_array($result, MYSQL_BOTH));
//shuffle($keys);
//array_splice($keys, 25);
foreach ($keys as $key)
{
echo $key[][]; ???
}

 

I know array_keys() returns only the keys, so this example might be better:

 

$result = mysql_query($sql);
$keys = mysql_fetch_array($result, MYSQL_BOTH));

 

If the $sql statement selects 5 fields and 50 rows, I assume I'm getting a multidimensional array (?) ... this is where I'm lost.  Is there a way to print out the array (in phpMyAdmin) so I can see all elements of the array so I know how to reference them?  How do I know how to reference the elements in an array created from $keys = mysql_fetch_array($result, MYSQL_BOTH)); ?

mysql_query returns a resource object, in order to extract a row from the resource object you use mysql_fetch_assoc or mysql_fetch_array. This array can then be accessed like a normal array/associative array.

 

<?php
while($row = mysql_fetch_assoc($result)) {
   echo $row['id'];
   echo $row['field_1'];
   echo $row['field_2'];
   // etc
}
?>

I'm not sure what the most efficient method would be. I'll give you three and you can see which is fastest by doing some tests:

 

Option 1: Using a JOIN query.

Actually I'm not sure if this would be any faster than doing a RAND on the original query, but it's worth trying

SELECT image1.id AS randID, image2.* 
FROM image m1
LEFT JOIN image image2 ON image1.id = image2.id
ORDER BY RAND() 
LIMIT 25

 

Option 2: Do one query to get the IDs and a second to get the data

$query = "SELECT id FROM image ORDER BY RAND() LIMIT 25";
$result = mysql_query($query);
while ($record = mysql_fetch_assoc($result))
{
    $randIDs[] = $record['id'];
}

$query = "SELECT * FROM image WHERE id IN (" . implode('', $randIDs) . ")";
$result = mysql_query($query);

 

Option 3: Get ALL ids and use PHP to get 25 random ones

$query = "SELECT id FROM image";
$result = mysql_query($query);
while ($record = mysql_fetch_assoc($result))
{
    $allIDs[] = $record['id'];
}

//Randomize the array
shuffle($allIDs);
//Get the first 25 records
$randIDs = array_slice($allIDs, 0, 25);

$query = "SELECT * FROM image WHERE id IN (" . implode('', $randIDs) . ")";
$result = mysql_query($query);

ah, I see...

 

I'll have to read up on those functions.  My goal is to randomize an array of rows returned by $result = mysql_query($sql);

 

So I have to get the everything into an array, randomize and splice the array, then loop over the array to cherrypick elements.  After nurping the array a few times I lose track of what element I need... very confusing...

So I have to get the everything into an array, randomize and splice the array, then loop over the array to cherrypick elements.  After nurping the array a few times I lose track of what element I need... very confusing...

 

Once you have the array of all elements, randomized the array, and spliced the array, why do you need to loop over them again? I thought the whole idea was to get a random set of records. Once you have that array of 25 elements, just do another query to get the records for those 25 using implode on the array.

 

I noticed I did have an error in creating those queries. When using implode you need to have the array elements separated by a comma. If you go with option 3, this code will result in $result contain the 25 random records:

 

//Get 25 random IDs from the DB
$query = "SELECT id FROM image";
$result = mysql_query($query);
//Populate IDs into an array
while ($record = mysql_fetch_assoc($result)) { $allIDs[] = $record['id']; }

//Randomize the array
shuffle($allIDs);
//Get the first 25 records/IDs
$randIDs = array_slice($allIDs, 0, 25);

//Query for the full records to the 25 random IDs
$query = "SELECT * FROM image WHERE id IN (" . implode(',', $randIDs) . ")";
$result = mysql_query($query);
//$result will contain a record set of the full records for the 25 IDs

Not tested so there may be other syntax errors.

Thanks mjdamato!  I will try each of those.

 

I did find this to be slightly (a few milliseconds) faster:

 

SELECT image.id AS randID, image.*
FROM image m1
LEFT JOIN image image ON image.id = image.id
ORDER BY RAND( )
LIMIT 25;

;

 

I am still kind of green on mysql syntax... what is m1 ?

note: there is only one image table ... no image1 and image2

 

Is this the way JOINs are done... JOIN table1 table2 ON table1.id = table2.id ?

 

Why a LEFT JOIN rather than an INNER JOIN ?

thanks mjdamato - I will be experimenting with those options.

 

I did notice this was faster (by a few milliseconds):

 

SELECT image1.id AS randID, image2.*
FROM image m1
LEFT JOIN image image2 ON image1.id = image2.id
ORDER BY RAND()
LIMIT 25

 

but I had to rewrite it as:

 

SELECT image.id AS randID, image . *
FROM image m1
LEFT JOIN image image ON image.id = image.id
ORDER BY RAND( )
LIMIT 25 ;

 

there is only one image table (no image1 and image2)

 

what is m1 ?

 

(still kind of green on mysql syntax)

 

As for your question: "why do you need to loop over them again? I thought the whole idea was to get a random set of records. "

 

Yes, that is correct.  But I also wanted to test the option of randomizing an array with php after getting the results from mysql with only one query.

 

I appreciate all the help... it helps a lot!

I forget the correct terminology, but in a query you can give a table or field an alternate name. This is a necessity when trying to join the same table on itself. So, in the query with 'image1' and 'image2' I referenced the image table two times and gave each an alternate name or reference to differentiate betweent he two.

 

I use table references a lot. I typically like to give my table descriptive names so I know what they are, but then in my query I will use short references for ease of implementation

 

I forget the correct terminology, but in a query you can give a table or field an alternate name. This is a necessity when trying to join the same table on itself. So, in the query with 'image1' and 'image2' I referenced the image table two times and gave each an alternate name or reference to differentiate betweent he two.

 

I use table references a lot. I typically like to give my table descriptive names so I know what they are, but then in my query I will use short references for ease of implementation

 

 

An alias, using the AS keyword?

thanks, mjdamato - I understand now that you aliased the table to JOIN it to itself.

 

after playing around with this more, I ended up modifying the database schema... created a separate table for random SELECTs... eeked out a few more milliseconds... in the end, I think that ORDER BY RAND() function is a dog anyway you slice it...  need to keep my table very small (under 100 rows) or I'm going to pay for it in latency.  perhaps I could have several tables of 50 rows each and round-robin or randomly hit each table?

I can't see how having multiple tables is worthwhile. Yes, using RAND() is not a speedy process, but what is the performance (or lack of) using a number of records that you expect to have? How often will you run that process? Does it need to be randomized each time or can it only be randomized say once per day? All of these should play into your decision.

 

Good luck

currently, it runs every page hit.  so it's something that adds to page load time.  your question about frequency made me think that perhaps I could run it every x page hits, or 2 or 3 times a day or something...  just periodically dump 25 random images into another table.  that might be the best work-around.

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.