Jump to content


Photo

I am noticing major load time in retrieving results from large tables

load mysql table result large

  • Please log in to reply
9 replies to this topic

#1 imgrooot

imgrooot
  • Members
  • PipPipPip
  • Advanced Member
  • 221 posts

Posted 03 May 2018 - 09:46 AM

I am retrieving data from multiple joined tables that contain around 3,500 rows each. I am only showing limited results but the load time is the same. Like it takes 5 seconds to load the page. I don't want this same issue to continue if I am predicting over 1,000,000 rows from these tables.  So what's the best way to fix this?

 

Here is my query.

$get_members = $db->prepare("SELECT users.*, approvals.*, sponsors.* FROM approvals
LEFT JOIN users ON approvals.user_id = users.user_id
LEFT JOIN sponsors ON approvals.user_id = sponsors.sponsored_user
WHERE user_id > :user_id ORDER BY user_id DESC LIMIT 20");
$get_members->bindValue(':user_id', 0);
$get_members->execute();
$result_members = $get_members->fetchAll(PDO::FETCH_ASSOC);
if(count($result_members) > 0) {
  foreach($result_members as $row) {
    // show results here
  }
} else {
   echo 'none';
}
if(count($result_members) < 20) {} else {
  // show-more button here
}


#2 Barand

Barand
  • Moderators
  • Sen . ( ile || sei )
  • 18,169 posts

Posted 03 May 2018 - 10:06 AM

  • Don't use user.* etc. Specify just the columns you actually need.
  • Left JOINS are slow. Unless you really need them, use INNER JOIN instead.
  • Make sure you have indexes specified on approvals.user_id and sponsors.sponsored_user

If you are still using mysql_ functions, STOP! Use mysqli_ or PDO. The longer you leave it the more you will have to rewrite.

Donations gratefully received






moon.png

|baaGrid| easy data tables - and more
|baaChart| easy line, column and pie charts

#3 imgrooot

imgrooot
  • Members
  • PipPipPip
  • Advanced Member
  • 221 posts

Posted 03 May 2018 - 10:49 AM

 

  • Don't use user.* etc. Specify just the columns you actually need.
  • Left JOINS are slow. Unless you really need them, use INNER JOIN instead.
  • Make sure you have indexes specified on approvals.user_id and sponsors.sponsored_user

 

 

Should * never be used even if I am retrieving all the columns in that table?

 

So I edited the query using all three of your points above. The load time has decreased from 5 seconds down to 1-2 seconds. It's not instantaneous but much better.



#4 Barand

Barand
  • Moderators
  • Sen . ( ile || sei )
  • 18,169 posts

Posted 03 May 2018 - 11:07 AM

 

Should * never be used even if I am retrieving all the columns in that table?

Only use * when testing when needing to see/show what is in a table, not in production.

 

If, in future, you add more columns to the table (perhaps a large text field) then all queries using * will pick up the new columns even though not required. The more data your query returns the slower it will run.

 

When joining tables you will get duplicated values (in your query you get the user_id three times.

 

Plus, using * obfuscates the query process. When you come back to it in a few months, or someone else does, the presence of actual column names helps to document what the query is doing.


If you are still using mysql_ functions, STOP! Use mysqli_ or PDO. The longer you leave it the more you will have to rewrite.

Donations gratefully received






moon.png

|baaGrid| easy data tables - and more
|baaChart| easy line, column and pie charts

#5 mac_gyver

mac_gyver
  • Staff Alumni
  • Staff Alumni
  • 4,148 posts

Posted 03 May 2018 - 12:08 PM

WHERE user_id > :user_id

$get_members->bindValue(':user_id', 0);

 

 

i'm wondering why you are doing this comparison? you should not have any data with a zero or negative user_id stored in any table, so the above is causing unnecessary work examining the values and will produce the same result if that comparison is not in the query.


multi-purpose programming fool. well written source-code should be self-documenting. well written code should be self-troubleshooting.

#6 gizmola

gizmola
  • Administrators
  • Advanced Member
  • 4,722 posts
  • LocationLos Angeles, CA USA

Posted 03 May 2018 - 03:35 PM

When optimizing queries, you need to be running EXPLAIN on them. If you provide the output of the Explain here, then people might be able to pinpoint ways you can make the query faster.

#7 dalecosp

dalecosp
  • Members
  • PipPipPip
  • Advanced Member
  • 450 posts
  • LocationMissouri

Posted 03 May 2018 - 06:42 PM

 

When joining tables you will get duplicated values (in your query you get the user_id three times.

 

 

Right.  3500 x 3500 = 12.25M rows in/of itself, and let's not think about the possibility of multiplying the result set size again.


"God doesn't play dice" --- Albert Einstein
"Perl is hardly a paragon of beautiful syntax." --- Weedpacket

#8 dalecosp

dalecosp
  • Members
  • PipPipPip
  • Advanced Member
  • 450 posts
  • LocationMissouri

Posted 03 May 2018 - 06:43 PM

i'm wondering why you are doing this comparison? you should not have any data with a zero or negative user_id stored in any table, so the above is causing unnecessary work examining the values and will produce the same result if that comparison is not in the query.

 

Use case (statistics) I can think of: you're doing tracking of "logged-in" as well as "not-logged-in" users.  We have some tables like that.


"God doesn't play dice" --- Albert Einstein
"Perl is hardly a paragon of beautiful syntax." --- Weedpacket

#9 Phi11W

Phi11W
  • Members
  • PipPip
  • Member
  • 15 posts

Posted 15 May 2018 - 03:56 PM

Should * never be used even if I am retrieving all the columns in that table?

You can [manually] use "select *" for testing and diagnostics. 

Applications should never do so. 

 

Databases are intrinsically shared entities and other people might be changing the table structures, adding all sorts of gigantic "text" fields that your application doesn't [know or] care about but using "select *" will retrieve them all anyway. 

 

Regards, 
   Phill  W.



#10 Psycho

Psycho
  • Moderators
  • Move along, nothing to see here
  • 11,937 posts
  • LocationCanada

Posted 15 May 2018 - 04:21 PM

You can [manually] use "select *" for testing and diagnostics. 

Applications should never do so. 

 

Databases are intrinsically shared entities and other people might be changing the table structures, adding all sorts of gigantic "text" fields that your application doesn't [know or] care about but using "select *" will retrieve them all anyway.

 

Another reason is that the data should be "agnostic" to how it is being used. When working in PHP, and many languages, you can reference the data via the column names. However, what if there was a need to reference the data via numerical index (there's a reason why there are explicit options to only retrieve data via numerical indexes). You might be passing the results to another process that can't use the column names and will simply reference the data by the order it is presented.

 

In that case, if "SELECT *" is used and field order is changed or fields are added, the functionality could break. By selecting just the fields that are needed in the order that they are needed, the functionality will not break (unless someone was to remove a field).


The quality of the responses received is directly proportional to the quality of the question asked.

I do not always test the code I provide, so there may be some syntax errors. In 99% of all cases I found the solution to your problem here: http://www.php.net




0 user(s) are reading this topic

0 members, 0 guests, 0 anonymous users