Pagination Efficiency


4 replies to this topic

#1 jbreits

  • Member
  • 12 posts

Posted 12 May 2006 - 09:53 PM

I have looked around and haven't found any answers that have satisfied me.

When doing pagination on a result set from a MySQL query, most places I have looked suggest using the LIMIT option in the query. I understand how this works, but it doesn't seem like this would necessarily be the fastest (and easiest on the server) way to do it.

Most of the examples I looked at also used a second query to find the total number of records returned by the query. Therefore, for each page load, 2 queries are executed. This can easily be cured by performing the record count query on the first page load and then passing it forward.
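
For reference, the two-query version I mean looks something like this (a minimal sketch in Python with the MySQLdb module; the table and column names are made up):

    import MySQLdb

    PER_PAGE = 20

    def fetch_page(db, page):
        cur = db.cursor()

        # Query 1: the total row count, used only to draw the page links.
        cur.execute("SELECT COUNT(*) FROM articles")
        total = cur.fetchone()[0]

        # Query 2: just the rows for the requested page.
        cur.execute("SELECT id, title, posted FROM articles "
                    "ORDER BY posted DESC LIMIT %s OFFSET %s",
                    (PER_PAGE, (page - 1) * PER_PAGE))
        rows = cur.fetchall()

        num_pages = (total + PER_PAGE - 1) // PER_PAGE  # ceiling division
        return rows, num_pages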

I can see how LIMIT would speed things up if a very simple query were executed, but what if you are doing a full-text search with an ORDER BY on another field (say a date field)? Doesn't MySQL have to read the whole result set to order the rows? If so, doesn't that add overhead?

Would it be faster to run the complete query once and cache the results to a file of some sort? Then when loading the next page, the results are read from the file. Perhaps you could even cache one file per page so no in-file processing has to be done. Does this sound like a more efficient method?
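
Something like this is what I have in mind (a rough sketch; the paths and names are made up, and it assumes the rows pickle cleanly):

    import os
    import pickle

    CACHE_DIR = "/tmp/search_cache"   # hypothetical location
    PER_PAGE = 20

    def cache_results(db, search_id, query, params=()):
        # Run the expensive query once and write one file per page,
        # so later page loads never touch MySQL.
        cur = db.cursor()
        cur.execute(query, params)
        rows = cur.fetchall()
        if not os.path.isdir(CACHE_DIR):
            os.makedirs(CACHE_DIR)
        num_pages = (len(rows) + PER_PAGE - 1) // PER_PAGE
        for page in range(num_pages):
            path = os.path.join(CACHE_DIR, "%s.page%d" % (search_id, page + 1))
            with open(path, "wb") as f:
                pickle.dump(rows[page * PER_PAGE:(page + 1) * PER_PAGE], f)
        return num_pages

    def read_page(search_id, page):
        # Serving page N is then just one small file read.
        path = os.path.join(CACHE_DIR, "%s.page%d" % (search_id, page))
        with open(path, "rb") as f:
            return pickle.load(f)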

Secondly, if you use the LIMIT method, couldn't the user end up with inaccurate pagination? For example, say I am on page 2 of a dataset ordered by date (descending), and while I am looking at page 2, 5 new records are inserted into the table. When I go to page 3 and run my new LIMIT query, I would assume that 5 records that used to be on page 2 would now show up on page 3. I haven't tested this yet, but it only makes sense.

Can anyone comment?

Thanks,
JBreits

#2 Zubaz

  • Member
  • 12 posts

Posted 13 May 2006 - 01:01 AM

QUOTE(jbreits @ May 12 2006, 02:53 PM):
Would it be faster to run the complete query once and cache the results to a file of some sort? Then when loading the next page, the results are read from the file. Perhaps you could even cache one file per page so no in-file processing has to be done. Does this sound like a more efficient method?
YES. This is easily the most efficient way to do it if you expect high traffic or the COUNT(*) is on a table with a lot of rows. Counts are pretty slow as SQL queries go, so you can put the COUNT(*) into a cron job (you can decide how often, depending on how often the number actually changes and how that would affect your page count), OR you can have the count update whenever a new item is added, assuming that happens less often than it's read.

I cache stuff like that into a small MyISAM table in MySQL, and on larger (>1,000-row) tables you'll see an improvement immediately. You can certainly cache to a file if you wish, but there's no need to. You know that pagecount table will only get SELECTs except for, say, once an hour on cron, so you don't have to worry about locks or anything.
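
Roughly like this, if it helps (a sketch; the table layout and names are mine, not anything standard):

    import MySQLdb

    def refresh_count(db):
        # Run this from an hourly cron job (or whenever a row is added);
        # everything else only ever SELECTs from the tiny MyISAM table.
        cur = db.cursor()
        cur.execute("""CREATE TABLE IF NOT EXISTS pagecount (
                           name     VARCHAR(64) NOT NULL PRIMARY KEY,
                           rowcount INT UNSIGNED NOT NULL
                       ) ENGINE=MyISAM""")
        cur.execute("REPLACE INTO pagecount "
                    "SELECT 'articles', COUNT(*) FROM articles")

    def cached_count(db):
        # What the page itself runs: a primary-key lookup on a tiny table.
        cur = db.cursor()
        cur.execute("SELECT rowcount FROM pagecount WHERE name = 'articles'")
        return cur.fetchone()[0]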

#3 hvle

  • Advanced Member
  • 667 posts
  • Location: Melbourne, Australia

Posted 13 May 2006 - 01:41 AM

QUOTE(Zubaz @ May 13 2006, 11:01 AM):
YES. This is easily the most efficient way to do it if you expect high traffic or the COUNT(*) is on a table with a lot of rows. [...]


Actually, according to the manual, COUNT(*) returns a pre-stored number for each table (that applies to MyISAM tables, and only when there is no WHERE clause), so it's practically instant.
I think caching results is only worth it when you have a huge number of records, say 10 million. I have tested full-text search on 3 million records, and the response time averaged 5.0 seconds (on a 1.7 GHz P4, 223 MB RAM, Windows). Granted, that's a slow server; I'm sure it's much faster on real hardware. Also, indexing a column makes a huge speed difference.
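
For full-text search specifically, that means a FULLTEXT index, roughly like this (table and column names are made up; FULLTEXT requires a MyISAM table in current MySQL):

    import MySQLdb

    db = MySQLdb.connect(db="test")   # hypothetical connection
    cur = db.cursor()

    # One-time: build the full-text index on the searched columns.
    cur.execute("CREATE FULLTEXT INDEX ft_articles ON articles (title, body)")

    # Searches can then use MATCH ... AGAINST instead of LIKE table scans.
    cur.execute("SELECT id, title FROM articles "
                "WHERE MATCH(title, body) AGAINST (%s) "
                "ORDER BY posted DESC LIMIT 20",
                ("pagination",))
    print(cur.fetchall())
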
Life's too short for arguing.

#4 Zubaz

  • Member
  • 12 posts

Posted 13 May 2006 - 02:58 AM

It depends on your WHERE clause. I can guarantee you my method is fast. If you're lucky enough to be using an index, great; if not...

#5 jbreits

  • Member
  • 12 posts

Posted 16 May 2006 - 09:34 PM

Thanks for your responses. It sounds like I would be wise to cache the search results, not only for efficiency on the server, but also to keep the result set consistently paged for the user.

Now I just need to decide whether I should cache to a file or to a MySQL table as was mentioned. I wonder which would be easier on the server.

Thanks,
JBreits



