Jump to content

Withdrawing Sorted Values - Too Slow


Some Poster55

Recommended Posts

Hi,

 

Currently have an issue which I'd appreciate some advice on.

 

I have a table structured as shown below, with a "name" column that holds a user's name, and a "data" column which holds numerous comma separated values (with a new line \n after each 3rd value):

|    NAME  |    DATA  |

------------------------------

|  User 1  | 554,231,879

                  56,546,55

                  987,649,1

------------------------------

|  User 2  | 88,6631,15879

                456,987,45668

                87,564444,65

------------------------------

|  User 3  | 88,6631,15879

                  456,987,868

                  99,165,67468

             

...And so on, for thousands of users.

 

I need to have a page on my site which displays the top 3 users who have the highest 6th data value (the bolded value above). For instance, using the above example data, the page would display:

 

1) User 2 - 4568

2) User 3 - 868

3) User 1 - 55

 

My first attempt was to select every row in the table, parse each row, then sort the parsed values in descending order, then only display the top 3:

$result = mysql_query("SELECT * FROM myTable");

// Cycle through each row in myTable to grab its data values
while ($row = mysql_fetch_array($result)) {
                // Explode the data row to easily be able to retrieve its 6th data value
	$data_explode = explode("\n",$row['data']);
	foreach ($data_explode as $x=>$y) {
		$data[$x] = explode(",",$y);
	}
                // Place each user's name and 6th data value into a large "storage" array
	$storage[] = array('name' => $row['name'], '6th_data_value' => $data[1][2]);
}

// Sort the large "storage" array in descending order
foreach ($storage as $key => $row) {
$sorter[$key]  = $row['6th_data_value']; 
}
array_multisort($sorter, SORT_DESC, $storage);

// Echo the top 3 results in the sorted "storage" array
for ($x = 0; $x < 3; $x++) {
         echo $x.") ".$storage[$x]['name']." - ".$storage[$x]['6th_data_value'];
}

This worked - however, as I was fetching thousands of rows, the script takes too long (10+ seconds) to effectively use on a page.

 

I was wondering if there was any way in which I wouldn't have to fetch the thousands of user rows (which takes a long time), just to get the top 3 "6th data values"?

 

I know that if the specified data value was in a column of its own, then I could easily change the SQL statement to SORT DESC and LIMIT 3 before fetching, so that it would only fetch the top 3 values. But since the 6th data value I need is buried within other values, I was hoping that someone would have a way to still only fetch from the rows I need (and not the entire table), or at least some other way to improve the scripts timing?

 

Many thanks in advance!

 

 

Link to comment
Share on other sites

You should be able to use two nested mysql SUBSTRING_INDEX() function calls to get the value you are interested in. You would then just need to do an ORDER BY the_value DESC LIMIT 3

 

If you fix your table design, so that each value is stored as a separate row, you can then easily construct queries to efficiently find any value(s) you want.

Link to comment
Share on other sites

Which would look like this -

SELECT name, SUBSTRING_INDEX(SUBSTRING_INDEX(data, '\n', 2),',',-1) + 0 as data FROM tablename ORDER BY data DESC LIMIT 3

Thanks, exactly what I was looking for!

 

I do have one final question though: would there be a way to ignore the 6th data value if it is "999"? I tried the following, but it didn't work (nor did it really seem like something that would work - but just to give you an idea of what I'm looking for):

SELECT name, SUBSTRING_INDEX(SUBSTRING_INDEX(data, '\n', 2),',',-1) + 0 as data FROM tablename WHERE data != '999' ORDER BY data DESC LIMIT 3
Link to comment
Share on other sites

Good stuff, thanks again.

 

Really thought I'd be able to figure this next part out - but it doesn't seem to be working. Now I'm trying to reach the 4th data value instead of the 6th. Since it was still in the second row I kept the innermost SUBSTRING_INDEX as is, but modified the count on the outer SUBSTRING_INDEX from "-1" to "1" (according to SUBSTRING_INDEX's documentation, I believe this should've fetched only the leftmost value, just as "-1" fetches the rightmost value):

SELECT name, SUBSTRING_INDEX(SUBSTRING_INDEX(data, '\n', 2),',',1) + 0 as data FROM tablename HAVING data != 999 ORDER BY data DESC LIMIT 3

 

However, oddly all this actually did was sort it by the very first value - so going along with my above example, I got "554, 88, 88" instead of "456, 456, 56".

 

Any help would be appreciated. :)

 

Link to comment
Share on other sites

That above query actually worked so well I'm attempting to refine another one of my slow pages with something similar.

 

This time the setup is a bit different - There's about 1000 tables in the database. Each table is named the name of a specific user ("Bob", "Dan", etc.). Each table contains 3 columns: "id" (primary key), "date", and "data".  There is exactly 1 row for each day since the user created their account. Example:

 

------------ TABLE: Bob ------------

|    id    |      date    |    data    |

---------------------------------------

| 1  |  Feb 11, 2010  |  89,651631,79

                                526,55646,5

                                9567,847,981

---------------------------------------

|  2  |  Feb 12, 2010  | 86,8973,18979

                                  445,587,4978

                                  66,574,85

---------------------------------------

|  3  |  Feb 13, 2010  | 56,6181,16879

                                  4596,97,8968

                                  10,985,4868

 

----------- TABLE: Dan -------------

|    id    |      date    |    data    |

---------------------------------------

| 1  |  Feb 11, 2010  | 554,231,879

                                56,546,55

                                987,649,1

---------------------------------------

|  2  |  Feb 12, 2010  | 88,6631,15879

                                456,987,45668

                                87,564444,65

---------------------------------------

|  3  |  Feb 13, 2010  | 91,6432,1689

                                  456,987,868

                                  99,165,67468

...And so on, for about a thousand tables.

 

My script takes a user-inputted day (like "Feb 12, 2010") and needs to display the top 3 users who had the highest 6th data value for that day. Again, this is similar to my previous script above, but what's puzzling me is how this could be accomplished across multiple tables.

 

Here is my current inefficient script which completes the task. It selects the user-specified row from every table in the database, parses each row, then sorts the parsed values in descending order, then only display the top 3:

for ($x = 0; $x < 1000; $x++) {
     // Fetch the row in every table for the date the user specifies
     // $table_name_array is a large array which holds the name of every table in the database
     $row = mysql_fetch_array(mysql_query("SELECT * FROM $table_name_array[$x] WHERE date='$user_input'"));

     // Explode the data row to easily be able to retrieve its 6th data value
     $data_explode = explode("\n",$row['data']);
     foreach ($data_explode as $x=>$y) {
	$data[$x] = explode(",",$y);
     }
   
     // Place each user's name (same as the table name) and 6th data value into a large "storage" array
     $storage[] = array('name' => $table_name_array[$x], '6th_data_value' => $data[1][2]);
}

// Sort the large "storage" array in descending order
foreach ($storage as $key => $row) {
     $sorter[$key]  = $row['6th_data_value']; 
}
array_multisort($sorter, SORT_DESC, $storage);

// Echo the top 3 results in the sorted "storage" array
for ($x = 0; $x < 3; $x++) {
     echo $x.") ".$storage[$x]['name']." - ".$storage[$x]['6th_data_value'];
}

Though this again takes too long to effectively use.

 

Essentially what I'd need would a query which does:

SELECT [table Name], SUBSTRING_INDEX(SUBSTRING_INDEX(data, '\n', 2), ',', -1) + 0 as data FROM [Every Table In The Database] WHERE date='$user_input' ORDER BY data DESC LIMIT 3

(Fetch an array with the both the table's name (same as the user's name) and 6th data value for only the top 3 "6th data values" for the user-specified day.)

 

The script would then allow me to display (for example, using "Feb 12, 2010" with the above data):

1) Dan - 45668

2) Bob - 4978

 

So instead of fetching the user-specified row from every table, then sorting every 6th data value to get the Top 3 - I'm looking for a query which simply fetches the top 3 "6th data values" for the user-specified row in every table.

 

Sorry if this is something easy - I've searched around quite a lot and nothing has worked, so I'd again be very appreciative of any help. Thanks! :)

Link to comment
Share on other sites

Ah, made some good progress on this. The below code accomplishes what I need:

// $table_name_array is a large array which holds the name of every table in the database
for ($x = 0; $x < sizeof($table_name_array); $x++) {
        // Cycle through each table and preform a giant union between the 6th data values on the specified date
$sql .= "(SELECT '$table_name_array[$x]' as name, SUBSTRING_INDEX(SUBSTRING_INDEX(data, '\n', 2),',',-1) + 0 as data FROM $table_name_array[$x] WHERE date='$user_input')";

        // Preform a UNION every time in the loop except the very last
        if ($x < sizeof($table_name_array)-1) {
	$sql .= " UNION " ;
}
}

// Finally, after the giant string of UNIONs, sort it to get the top 3 values
$sql .= "ORDER BY data DESC LIMIT 3";

$result = mysql_query($sql);
// Display the top 3 "6th data values" in array form
while ($row = mysql_fetch_assoc($result)) {
     print_r($row);
}

 

Although this script is considerably faster and only takes a few seconds, I'd be curious as to if anyone would be able to improve it even further, as getting it down to just 1 second would be preferable for a live page. Or, if I actually accomplished it in the best way, let me know that too. Thanks! :)

 

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.