Jump to content


Photo

mySQL & PHP Efficiency


  • Please log in to reply
11 replies to this topic

#1 lead2gold

lead2gold
  • Members
  • PipPipPip
  • Advanced Member
  • 164 posts
  • LocationOttawa, On

Posted 01 May 2006 - 06:39 PM

Ok, First i will say "fenway" is the man, thank you for all your help in the past (especially with respect to [a href=\"http://www.phpfreaks.com/forums/index.php?act=post&do=new_post&f=3)\" target=\"_blank\"]http://www.phpfreaks.com/forums/index.php?...o=new_post&f=3)[/a]

Ok, now i'll get to my question:

Is it more efficient to do one gigantic select statement crossing multiple tables to get all the information you require all at once? Or is it less of a load to select the main data, then do multiple selects within a loop for the remainin information?

Now... in my head, i would think that the one gigantic select is preforming everything is the most efficient, thus all the cpu load is used once and only once rather then in many small bursts.

However, maybe it makes better sense to do the small bursts to prevent slowdowns for concurrent users?

I work with an INFORMIX server where i work and we do alot of database related stuff, and a coleage was trying to tell me it's faster to do multi selects rather then including to many tables at once.


So... my guru friends..... what is the answer?! :)

#2 fenway

fenway
  • Staff Alumni
  • MySQL Si-Fu / PHP Resident Alien
  • 16,199 posts
  • LocationToronto, ON

Posted 02 May 2006 - 06:39 PM

Good question. There are really three seperate issues here -- the speed of the query itself (MySQL), the overhead of returning a result set (DB layer), and the cost of running PHP code (PHP).

As far as MySQL is concerned, there are many factors that can affect performance. The first few that come to mind are the table cache, key buffers and index setup. Generally, the fewer tables involved in a query, the more likely that these will be utilized correctly -- the query cache can play a significant role as well. However, the total number of tables will be the same if you query 4 tables at once or 1 at a time. Assuming that the SELECT query is optimized (not an easy task -- and not always possible), and you're not quering 15 tables, I usually find it best to query all tables together. But this a not a rule-of-thumb -- it critically depends on the query itself.

For the DB layer, obviously returning more records and more columns require more time to actually send the data itself, and then there is the overhead of creating PHP objects to represent this information. Having MySQL "combine" some queries such that fewer records/fields are returning can often improve performance as well.

And then there's PHP itself -- I would argue that this is probably the most expensive step if not written properly, since you can lose a lot of time in a tight for loop that hasn't been optimized.

I'm sure I've confused you by now -- but this is how it is. Feel free to post back with comments.
Seriously... if people don't start reading this before posting, I'm going to consider not answering at all.

#3 lead2gold

lead2gold
  • Members
  • PipPipPip
  • Advanced Member
  • 164 posts
  • LocationOttawa, On

Posted 03 May 2006 - 01:00 PM

I'm not confused "YET" hehe :)

But, what sounds like the most ideal method of retrieving a substancial amount of information that crosses tables would be to aquire your constants first (through there own selects) elminating the overhead (extra tables that retrieve the same info anyways) from your main select. Then preform your main select afterwards containing an additional WHERE clause using your constants.

I like the idea of multiple joins for the ability to have a dynamic "ORDER BY" at the bottom. You can make all your columns sortable in the end.


#4 fenway

fenway
  • Staff Alumni
  • MySQL Si-Fu / PHP Resident Alien
  • 16,199 posts
  • LocationToronto, ON

Posted 04 May 2006 - 03:05 AM

I'm all you're getting back from your initial queries are constants, the MySQL optimizer will handle this for you. As far as dynamic order by clauses, be careful when using joins, since the optimizer will generally not use an index for any joined-in tables.
Seriously... if people don't start reading this before posting, I'm going to consider not answering at all.

#5 wickning1

wickning1
  • Members
  • PipPipPip
  • Advanced Member
  • 405 posts

Posted 04 May 2006 - 10:02 PM

If all the extra table joins and whatnot are used to limit the data returned, they are definitely worth doing in one query. In my experience, you want to avoid returning more data to PHP than it absolutely needs.

Now, if the joins are only adding more data, then it can be beneficial to break it up, but this depends on whether the database is local or remote, because initiating lots of new communications over a network can be very costly.

#6 lead2gold

lead2gold
  • Members
  • PipPipPip
  • Advanced Member
  • 164 posts
  • LocationOttawa, On

Posted 05 May 2006 - 01:07 PM

[!--quoteo(post=371389:date=May 4 2006, 06:02 PM:name=wickning1)--][div class=\'quotetop\']QUOTE(wickning1 @ May 4 2006, 06:02 PM) View Post[/div][div class=\'quotemain\'][!--quotec--]
If all the extra table joins and whatnot are used to limit the data returned, they are definitely worth doing in one query. In my experience, you want to avoid returning more data to PHP than it absolutely needs.

Now, if the joins are only adding more data, then it can be beneficial to break it up, but this depends on whether the database is local or remote, because initiating lots of new communications over a network can be very costly.
[/quote]

Interesting...
sorry to keep this topic going, but you guys have been great :)

So with what you said in mind... are you saving anything by adding the "LIMIT XX" keyword? I'm guessing no right? It's still gotta fetch everything but then only return "XX" amount?

Is there a good book/site i can get to read up on mySQL and how to get the most efficent uses out of it? fenway mentioned that multiple joins (with respect that you limit them as much as possible) are ok as long as you optimize the select statement. I'm curious to see the difference between 2 identical select statements; one that is optimized, and one that isn't. Maybe i should check out one of those O'Reilly books.



#7 Darkness Soul

Darkness Soul
  • Members
  • PipPipPip
  • Advanced Member
  • 133 posts
  • LocationBrazil; São Paulo.

Posted 05 May 2006 - 04:32 PM

Yo,

Nice topic! ^^ This give me a question.. o-o When I have to store the user information, and only login fields are required.. whats better? one or two table?

I think, with one table, records all user data into the row (many col possibily blank). And with two, one with id, user, pass, mail.. and other with the other data..

I used to keep all into one table.. but don't know if its better or not..

D.Soul
(If something is wrong, please tell me. I'm learning this language. Thank you)

#8 fenway

fenway
  • Staff Alumni
  • MySQL Si-Fu / PHP Resident Alien
  • 16,199 posts
  • LocationToronto, ON

Posted 05 May 2006 - 04:52 PM

[!--quoteo(post=371518:date=May 5 2006, 08:07 AM:name=lead2gold)--][div class=\'quotetop\']QUOTE(lead2gold @ May 5 2006, 08:07 AM) View Post[/div][div class=\'quotemain\'][!--quotec--]
Interesting...
sorry to keep this topic going, but you guys have been great :)

So with what you said in mind... are you saving anything by adding the "LIMIT XX" keyword? I'm guessing no right? It's still gotta fetch everything but then only return "XX" amount?

Is there a good book/site i can get to read up on mySQL and how to get the most efficent uses out of it? fenway mentioned that multiple joins (with respect that you limit them as much as possible) are ok as long as you optimize the select statement. I'm curious to see the difference between 2 identical select statements; one that is optimized, and one that isn't. Maybe i should check out one of those O'Reilly books.
[/quote]
It's just about making sure you don't waste a lot of time passing useless information back and forth -- i.e. do the LIMIT 1 on the DB side, not the PHP side. That's an extreme example, but I've seen it happen.

As far as optimization is concerned, a lot has to do with indexing... there are surprisingly very few "good" resources on the web (that is, ones that can actually be understood). I'll dig around my bookmarks and see what I can come up with. Otherwise, staying about from subqueries, especially correlated ones, helps too -- JOINs are definitely the way to go for just about everything; so try and get comfortable with derived tables.

[!--quoteo(post=371598:date=May 5 2006, 11:32 AM:name=Darkness Soul)--][div class=\'quotetop\']QUOTE(Darkness Soul @ May 5 2006, 11:32 AM) View Post[/div][div class=\'quotemain\'][!--quotec--]
Yo,

Nice topic! ^^ This give me a question.. o-o When I have to store the user information, and only login fields are required.. whats better? one or two table?

I think, with one table, records all user data into the row (many col possibily blank). And with two, one with id, user, pass, mail.. and other with the other data..

I used to keep all into one table.. but don't know if its better or not..

D.Soul
[/quote]
It's possible that you should have two tables, but there's always a trade-off; if the other "empty" columns aren't NULLable, there are a reasonable number of them, and they're fixed width, you're probably fine. This is partly a normalization question and partly an DB design issue.
Seriously... if people don't start reading this before posting, I'm going to consider not answering at all.

#9 Darkness Soul

Darkness Soul
  • Members
  • PipPipPip
  • Advanced Member
  • 133 posts
  • LocationBrazil; São Paulo.

Posted 05 May 2006 - 05:11 PM

[!--quoteo(post=371604:date=May 5 2006, 01:52 PM:name=fenway)--][div class=\'quotetop\']QUOTE(fenway @ May 5 2006, 01:52 PM) View Post[/div][div class=\'quotemain\'][!--quotec--]
It's possible that you should have two tables, but there's always a trade-off; if the other "empty" columns aren't NULLable, there are a reasonable number of them, and they're fixed width, you're probably fine. This is partly a normalization question and partly an DB design issue.
[/quote]
I see, but I was thinking about performance.. so, the performance don't change in this case? About normalization and db design is better two tables.. if I understand right.. =) (never see 'reasonable' word before)

edit: I forget ask about security, I think is naturaly increase when keep the data in two tables, is it true?

D.Soul
(If something is wrong, please tell me. I'm learning this language. Thank you)

#10 fenway

fenway
  • Staff Alumni
  • MySQL Si-Fu / PHP Resident Alien
  • 16,199 posts
  • LocationToronto, ON

Posted 05 May 2006 - 07:54 PM

[!--quoteo(post=371609:date=May 5 2006, 12:11 PM:name=Darkness Soul)--][div class=\'quotetop\']QUOTE(Darkness Soul @ May 5 2006, 12:11 PM) View Post[/div][div class=\'quotemain\'][!--quotec--]
I see, but I was thinking about performance.. so, the performance don't change in this case? About normalization and db design is better two tables.. if I understand right.. =) (never see 'reasonable' word before)

edit: I forget ask about security, I think is naturaly increase when keep the data in two tables, is it true?

D.Soul
[/quote]
Performance will always vary -- it depends critically on column choices and table layouts. And no, normalization does not always equate with more tables, and can be over-done easily. As far as security, there's no direct relationship with # of tables, and without referential integrity, allows for more holes.
Seriously... if people don't start reading this before posting, I'm going to consider not answering at all.

#11 Darkness Soul

Darkness Soul
  • Members
  • PipPipPip
  • Advanced Member
  • 133 posts
  • LocationBrazil; São Paulo.

Posted 10 May 2006 - 12:59 PM

Hmmm.. so my performance will be affected less if I broke one in two tables then I make the tabble fields with wrong values? Like... name varchar (128) when i just need about 40..

This give me another question (I think this one is design), like mail field, if I want to limit the mail fild at 45 varchar... whats better? varchar(45) or varchar(64) ?

I think about the 64 because of bits values... 1, 2, 4, 8, 16, 32, 64...

D.Soul
(If something is wrong, please tell me. I'm learning this language. Thank you)

#12 fenway

fenway
  • Staff Alumni
  • MySQL Si-Fu / PHP Resident Alien
  • 16,199 posts
  • LocationToronto, ON

Posted 10 May 2006 - 04:02 PM

[!--quoteo(post=372910:date=May 10 2006, 07:59 AM:name=Darkness Soul)--][div class=\'quotetop\']QUOTE(Darkness Soul @ May 10 2006, 07:59 AM) View Post[/div][div class=\'quotemain\'][!--quotec--]
Hmmm.. so my performance will be affected less if I broke one in two tables then I make the tabble fields with wrong values? Like... name varchar (128) when i just need about 40..
[/quote]
I don't know what you mean by "wrong values".

[!--quoteo(post=372910:date=May 10 2006, 07:59 AM:name=Darkness Soul)--][div class=\'quotetop\']QUOTE(Darkness Soul @ May 10 2006, 07:59 AM) View Post[/div][div class=\'quotemain\'][!--quotec--]
This give me another question (I think this one is design), like mail field, if I want to limit the mail fild at 45 varchar... whats better? varchar(45) or varchar(64) ?

I think about the 64 because of bits values... 1, 2, 4, 8, 16, 32, 64...
[/quote]
There's no difference... but consider using a proper CHAR column. And be careful about truncation problems.
Seriously... if people don't start reading this before posting, I'm going to consider not answering at all.




0 user(s) are reading this topic

0 members, 0 guests, 0 anonymous users