Jump to content

Archived

This topic is now archived and is closed to further replies.

neoform

Slow Inner Join

Recommended Posts

I wrote this (obviously badly designed) query that matches IP addresses to their associated country (located in my IP2Location table)

"SELECT ip, user_id, used_on, country_code2, country_code3, country_name FROM stats_ips INNER JOIN ip_2_country ON ip_to >= ip AND ip_from <= ip ORDER BY ".$_order_by." ".$_order_direction." LIMIT ".(($_page - 1) * PER_PAGE_IPS).", ".PER_PAGE_IPS

this is stats_ips:
ip (INT UNSIGNED - Primary)
user_id (MEDIUMINT - Indexed)
used_on (DATETIME - Indexed)

this is ip_2_country:
ip_from (INT UNSIGNED - Primary)
ip_to (INT UNSIGNED - Primary)
country_name (CHAR 50 - Indexed)
country_code2 (CHAR 2)
country_code3 (CHAR 3)

One of the reasons i bother with the joining then is becuse i want to be able to order by any of the fields: ip, user_id, used_on, country_name

this query worked ok when there was a few rows in stats_ips, but as soon as i added any normal number of rows to it (100,000) it became insanely slow (1000+ seconds).

my guess is it's an index problem, (i've always been terrible with index optmizing).

Share this post


Link to post
Share on other sites
Well, it doesn't help that your JOIN condition has inequalities... post the interpolated query, and the EXPLAIN output.

Share this post


Link to post
Share on other sites
Well, as much as i'd like to have something like "stats_ips.ip = ip_2_location.ip", that would mean i'd have a record for every possible IP address..  that's like what..  4.2 Billion.. ? :P I'm pretty much stuck using ranges.

here's an example of it in use..

SELECT ip, user_id, used_on, country_code2, country_code3, country_name FROM stats_ips INNER JOIN ip_2_country ON ip_from <= ip AND ip_to >= ip ORDER BY used_on DESC LIMIT 0, 20

The output is simply supposed to be a list of ip addresses from the stats_ips table along with related info, like who used it, what country the ip is from and when the ip was last used.

i'm looking to make this out be able to be sorted by all 4 fields, which it currently does, but mysql crumbles if i add any amount of ips to the stats_ips table.

(btw, my ip_2_location table has about 75,000 rows)..

Share this post


Link to post
Share on other sites
Still, post the EXPLAIN output.

Share this post


Link to post
Share on other sites
my bad, i wasn't even aware there was an explain command..  :(

+----+-------------+--------------+------+---------------+-----+---------+-----+-------+------------------------------------------------+
| id | select_type | table        | type | possible_keys | key | key_len | ref | rows  | Extra                                          |
+----+-------------+--------------+------+---------------+-----+---------+-----+-------+------------------------------------------------+
|  1 | SIMPLE      | stats_ips    | ALL  | PRIMARY,ip    | NULL| NULL    | NULL|    6 | Using temporary; Using filesort                |
|  1 | SIMPLE      | ip_2_country | ALL  | PRIMARY      | NULL| NULL    | NULL| 73952 | Range checked for each record (index map: 0x1) |
+----+-------------+--------------+------+---------------+-----+---------+-----+-------+------------------------------------------------+

Share this post


Link to post
Share on other sites
Well, type ALL is bad... no key used is also bad; as is the filesort.  BTW, depending on what your "range" is, you can probably get away with wildcards, especially for subnet ranges (which is what I assume that you're using this for).

Share this post


Link to post
Share on other sites
i don't suppose you could throw me an example.. ?  i've been stuck on this query for 3 days now.. :(

Share this post


Link to post
Share on other sites
If you're using MYSQL 4.1 or higher the following should be faster.

This uses a SUBSELECT to do the initial ordering and limit. The ordered column has to be in the stats_ips table however. Meaning ordering on country_name is out. Although you can modify the query to still order by a stats_ips column and then a ip_2_country column.
[code]
SELECT
ip, used_on, user_id, country_code
FROM
    (
    SELECT ip, used_on, user_id
    FROM stats_ips
    ORDER BY
    used_on
    LIMIT 0, 20
    )
  AS
stats_ips
INNER JOIN
ip_2_country
ON
ip_from <= ip AND ip_to >= ip
ORDER BY
used_on
DESC;
[/code]
You'll just have to limit the number of ips shown each time to keep the speed.

Share this post


Link to post
Share on other sites
You probably caught this, but the order by in the subselect is missing the DESC in case you missed it.

Share this post


Link to post
Share on other sites
That's definitely an improvement... again, what type of IP range is being used here?

Share this post


Link to post
Share on other sites

×

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.