Which is faster, MySql INNER JOIN or PHP Loop?

SLSCoder · May 19, 2022

I have a table1 with an ID and a Name where the ID is found in a sql query on different table2.

I have access to an existing PHP array containing all the IDs and Names of table1.

For each of the records I find n the sql query on table2 I have a choice.
I can either INNER JOIN table1 on table2 in the sql query and then include the Name in the query OR
while looping through the query recordset, for each record found I can loop through the array of table1 and match the ID to the ID in the query record to get the Name.

Which method is generally faster?

Barand · May 19, 2022

You have the db data and the array, time it.

$t1  = microtime(1);

for($i=0; $i<1000; $i++) {
   // method 1 code
}

%t2 = microtime(1);

for($i=0; $i<1000; $i++) {
   // method 2 code
}

%t3 = microtime(1);

printf("Method 1 took %0.4f seconds<br>", $t2 - $t1);
printf("Method 2 took %0.4f seconds<br>", $t3 - $t2);

SLSCoder · May 19, 2022

11 minutes ago, Barand said:

There are different queries in different parts of the app, each returning different data. I run into this scenario often.
I just wondered if a general practice would be to use one method or the other.

Thanks:)

Barand · May 19, 2022

My preference would be fetch all the data in a single query but I cannot categorically say it would always be faster.

SLSCoder · May 20, 2022

OK thanks. I'll make that a preference as well.

gizmola · May 20, 2022

What is the real question you are asking?

PHP code in a PHP application is one component, loop or no loop.

So it seems your real question is:

1 Query (with a join) vs. Multiple queries.

1 Query will always be faster and more efficient than multiple, unless there is something very wrong/unoptimized with the query.

With that said, when you have static related tables, then having those tables cached can save you having to join the tables or do queries in a loop. As the result set will have to be "hydrated/expanded" within your PHP script to make use of those, or at least resolve the values you need from the cached lookup table, the complexity of adding all that code isn't often worth the pain, and might not even be more efficient in the long run, given a PHP script's short lifespan.

What is certainly worth looking at, is the addition of a relational result cache like memcached or redis.

SLSCoder · June 6, 2022

gizmola thanks for your response.

If I understand memcached correctly it won't help me. The data is constantly changing, it's not static at all.

gizmola · June 7, 2022

13 hours ago, SLSCoder said:

gizmola thanks for your response.

If I understand memcached correctly it won't help me. The data is constantly changing, it's not static at all.

Actually that is rarely the case. If I understand you, part of the data you are looking up is "user" data from a user table. That data certainly is not always changing, and in most system designs, you absolutely know/control when it is changing. Let's say for example, there is user data + profile data perhaps in 2 related tables.

What a memcache based cache would provide is --- your code checks for the existence of the data in the cache. You have to figure out what the cache key is going to be to be able to resolve it, and usually that is some sort of string like: '/user/{userid}'.

If the key exists, you just read the data from memcache. This eliminates the need to query the database. If the key doesn't exist you perform the query, and save the result in a new memcache key. You have the option of specifying a ttl value for the key. Now this query could also be a query that joins to the user profile table.

In your routines that change/update the user or profile data, you invalidate the memcache data for that user when you save it to the database. The next read/select of the data will create a new cache entry. The main trick that memcache does is to allow for it to be clustered, and this is why facebook originally used it, given their enormous graph. There are very few systems that have that level of scalability issues, but most applications get a significant performance and scalability boost out of using some sort of in-memory cache. A lot of projects I've worked on have used Redis.

You do need to do some thinking about the type of data in your system, and whether or not portions of it are relatively static. It also tends to show you if you've made significant design mistakes. An example might be putting some sort of de-normalized counter in a table, such that you made something that is relatively static, like a user table, non-static. Putting a "last login" column, or summary columns like "friend_count" or "topics" all reduce database concurrency, and then cause people to move away from caching because they put these types of fields that require frequent updates into the main user table.

So to conclude, let's say you have something that does get updated with some frequency but then requires a join to another table for context/data availability. An example might be a message system, where you need to join to the user table to get the to/from usernames associated with messages. Again, depending on your design, even though messages could be added with some rapidity, that doesn't mean there aren't users in the system who could have a cached version of the message data. It also doesn't mean that you can't use the cached user data to decorate the messages. You can also store a query that includes a join as in the example of a query meant to return "all private messages sent to user 5" That query can certainly be cached, and in doing so you will reduce load on your database. You just need to understand which queries like this need to be invalidated from the cache. So long as you have cache name schemes that make sense, it's not that hard to understand what caches you need to remove if data was added or changed. Most of those schemes are similar to way you might design a rest api.

SLSCoder · June 7, 2022

gizmola, again thanks for your response.

I am looking for ways to speed the app up so this is clearly something I'm going to have to study and implement.

I'll look into Redis and learn what it will take to use it and where in my app it will help.

Thanks again.

Sign In

Which is faster, MySql INNER JOIN or PHP Loop?

Recommended Posts

SLSCoder

Link to comment

Share on other sites

Barand

Link to comment

Share on other sites

SLSCoder

Link to comment

Share on other sites

Barand

Link to comment

Share on other sites

SLSCoder

Link to comment

Share on other sites

gizmola

Link to comment

Share on other sites

SLSCoder

Link to comment

Share on other sites

gizmola

Link to comment

Share on other sites

SLSCoder

Link to comment

Share on other sites

Join the conversation

Browse

Activity

Important Information