Returning a large result set from mysql


mattbridges

Hi all,

This is a problem I've been struggling with for ages.

We've got a site that is really busy and requires a large and rather expensive MySQL search.
The problem I have is how to store the results so they persist from page to page for our users.
The array of IDs can grow to over 200KB, which starts to cause PHP/Apache problems.

I've tried all sorts.

Help :-S please :-S

I'm pulling my hair out.

Matt

Using mysql_num_rows() you can determine how many rows are returned by your query. Knowing this, you can choose a number of records to display per page and generate some links at the base of the page using those calculations.

Add something like this to the top of the page:

[code]
if (!isset($_GET['start'])) {
  $start = 0;
} else {
  $start = (int) $_GET['start'];  // cast, since it comes from the query string
}
[/code]


Table data would go here:

then:

[code]
// Assumes $pages (total page count) and $limit (rows per page)
// have been calculated earlier, e.g. from mysql_num_rows().
for ($i = 0; $i < $pages; $i++) {
  $h = $i + 1;              // human-readable page number
  $start = $i * $limit;     // offset for this page's link
  $page_link = "<a href=\"" . $_SERVER['PHP_SELF'] . "?start=$start\">$h</a>";
  $results .= " $page_link";
  if ($h < $pages) {
    $results .= ' |';       // separator between page links
  }
}
[/code]


It's not complete, but I hope it gets you started...
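
For reference, the pieces that snippet assumes might be set up something like this (a rough sketch; my_table is a placeholder table name):

[code]
// Rough sketch of the assumed setup; my_table is a placeholder.
$limit = 10;                                   // records per page
$count = mysql_num_rows(mysql_query("SELECT id FROM my_table"));
$pages = ceil($count / $limit);                // total number of pages

// Fetch just the current page's rows, using $start from above.
$result = mysql_query("SELECT * FROM my_table ORDER BY id LIMIT $start, $limit");
[/code]

(A separate SELECT COUNT(*) would be cheaper than pulling every id just to count it, but this mirrors the mysql_num_rows approach described above.)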

:)

Hey matto

Thanks for the reply.  I probably should have been clearer; sorry, it's been a long week already and I've been banging my head against the wall!

We've got about 102,000 rows that the search returns, using a rather expensive MySQL query.

We currently return about 60,000 of those rows, ordered by a last-updated datetime, but the amount of data we return keeps increasing. Each row is rather small, consisting of just a user ID.

Obviously we have to get the total number of rows so we can paginate the result set, which we currently do using the mysql_num_rows function.

Then we have to do a select x, y, z from a, b, c where a.id = b.id and c.id = a.id order by a.id desc limit $offset,10 to return the individual results (this is the expensive bit).

The $offset is supplied by PHP; we currently return 10 results per page, so the offset increases by 10 on each subsequent page.

The problem we are experiencing is that the result order can, and frequently does, change between pages, and we want to deliver consistent results. So we must either fetch all the results and store them in a massive PHP array (see below), or page through them from the database, which means the results and their order change from page to page and we have to run that expensive query on every search page hit.

The array looks something like

array
(
  [1] => 10
  [2] => 11
  [3] => 14
...... on to
  [60000] => 102940
)

Putting that into PHP makes PHP go burp, or we hit a limit on the maximum array size, and increasing that limit obviously isn't scalable. So I'm looking for a sensible way to store the results, whether in memory, files, a cache or whatever, so that each search page can just grab the next 10 IDs and the order is retained.

I've tried using memcached but that has also run into problems with the size of the array.
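
One workaround might be to split the ID list across several keys so each item stays under memcached's default 1MB item cap. A rough sketch (key names and chunk size are made up; $memcache is assumed to be a connected Memcache instance):

[code]
// Store the ordered ID list in 10,000-ID chunks so each memcached item
// stays under the default 1MB limit. The chunk size is a multiple of the
// page size, so a page never straddles two chunks.
$chunks = array_chunk($ids, 10000);
foreach ($chunks as $n => $chunk) {
    $memcache->set('search_' . session_id() . "_chunk_$n", $chunk, 0, 3600);
}

// Each search page then fetches only the chunk holding its slice.
$offset   = ($page - 1) * 10;
$chunk    = $memcache->get('search_' . session_id() . '_chunk_' . floor($offset / 10000));
$page_ids = array_slice($chunk, $offset % 10000, 10);
[/code]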

I'm not sure if I'm making myself completely clear - please forgive me; I've been fighting this for over a week now and my brain is truly fried!


Cheers Matt

Cheers Huggiebear, any help anyone can offer is great!

We currently get a maximum of about 3K-4K users at peak time.
As far as persisting the search result data goes, it really only needs to persist per user, between one hit of the search button and the next.

I've just been going round and round in circles :-)

Cheers

Matt

Why don't you set a session variable the first time someone hits the page, effectively giving you an entrance timestamp? Use a standard pagination script, including an additional AND in the SQL statement, so that it only returns topics that already existed the first time they hit the page. I just don't see the logic in storing that much information in an array, or the mess of file creation/cleanup...
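
Something along these lines (a sketch; it assumes session_start() has been called, and reuses the placeholder table/column names from the query above):

[code]
// On the first hit, freeze the user's view of the data with a timestamp.
// Assumes session_start() has already been called.
if (!isset($_SESSION['search_ts'])) {
    $_SESSION['search_ts'] = date('Y-m-d H:i:s');
}
$ts = $_SESSION['search_ts'];

// The extra AND pins the result set to rows that existed when the user
// first searched, so the order stays stable from page to page.
// a.last_update is a placeholder column name.
$sql = "SELECT x, y, z FROM a, b, c
        WHERE a.id = b.id AND c.id = a.id
          AND a.last_update <= '$ts'
        ORDER BY a.id DESC
        LIMIT $offset, 10";
[/code]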

Hi there dgiberson,

The problem is the continued running of the query (which is very expensive in terms of MySQL CPU time).  The session timestamp is the way we're doing it at the moment; it results in the database query getting hit on every search page, which is where the bottleneck is occurring :-(

Cheers

Matt

SELECT * FROM table1 WHERE timestamp <= '$user_timestamp' ORDER BY id LIMIT 0,10   -- returns records 1-10
SELECT * FROM table1 WHERE timestamp <= '$user_timestamp' ORDER BY id LIMIT 10,10  -- returns 11-20
SELECT * FROM table1 WHERE timestamp <= '$user_timestamp' ORDER BY id LIMIT 20,10  -- returns 21-30
etc....

These queries should return extremely fast, no matter the table size.
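
Wiring the offset up from a page parameter would look something like this (a sketch; $_GET['page'] is a hypothetical parameter name):

[code]
// Hypothetical 1-based page parameter from the query string.
$page   = isset($_GET['page']) ? max(1, (int) $_GET['page']) : 1;
$offset = ($page - 1) * 10;   // page 1 -> LIMIT 0,10; page 2 -> LIMIT 10,10; ...
$sql    = "SELECT * FROM table1 WHERE timestamp <= '$user_timestamp' ORDER BY id LIMIT $offset,10";
[/code]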