neoform Posted October 17, 2006 Share Posted October 17, 2006 So i'm working with a table with 8.1 million rows (business listings), and i have a second table containing a list of all the categories those businesses are found in which has about 15.6 million rows.From what i can tell i've indexed everything properly, yet i'm still running into long load times for seemingly simple queries..[code]SELECT COUNT(*) FROM listings WHERE longitude != 0[/code] (i've got longitude indexed) yet this query takes a minimum of 5 seconds to load (uncached).. any ideas as to why this might be happening?I've also got a number of other queries i'm running (much more complex) that take up to 10 seconds to load but i'm guessing it's because of a number of inner joins taking place..question is, is there a way of speeding this up without splitting the table up into smaller ones? (i hate doing that)[code]# Table "listings" DDLCREATE TABLE `listings` ( `id` mediumint(8) unsigned NOT NULL auto_increment, `url_name` char(50) default NULL, `longitude` double NOT NULL, `latitude` double NOT NULL, `name` char(255) default NULL, `address` char(255) default NULL, `state` smallint(5) unsigned default NULL, `city` mediumint(8) unsigned default NULL, `zip` char(6) default NULL, `phone` char(10) default NULL, `fax` char(10) default NULL, `email` char(128) default NULL, `website` char(255) default NULL, `description` text, `hours` text, `payment_options` text, `locations` smallint(5) unsigned default NULL, `established_on` date default NULL, `image` enum('yes','no') default 'no', `keywords` text, `added_on` datetime default NULL, `rating` tinyint(3) unsigned default NULL, PRIMARY KEY (`id`), UNIQUE KEY `url` (`url_name`), KEY `locate2` (`longitude`,`latitude`), KEY `long` (`longitude`), KEY `city` (`city`), KEY `state` (`state`)) ENGINE=MyISAM AUTO_INCREMENT=8164229 DEFAULT CHARSET=latin1# Table "listings_link" DDLCREATE TABLE `listings_link` ( `listing_id` mediumint(8) unsigned NOT NULL default '0', `category_id` smallint(5) unsigned NOT NULL default '0', PRIMARY KEY (`listing_id`,`category_id`), KEY `category_id` (`category_id`), KEY `listing_id` (`listing_id`)) ENGINE=MyISAM DEFAULT CHARSET=latin1[/code] Quote Link to comment https://forums.phpfreaks.com/topic/24208-big-tables-and-slowness/ Share on other sites More sharing options...
obsidian Posted October 17, 2006 Share Posted October 17, 2006 just for kicks, when you are counting rows, try just counting on the id column instead of *. i don't know for sure if mysql handles aggregates the same as selects or not, but i know with selects, your query is usually substantially faster if you only select those items you need:[code]SELECT COUNT(id) FROM listings WHERE longitude != 0[/code]again, don't know if it will help, but typically optimizing even the smallest things when you're searching through millions of records can help.good luck Quote Link to comment https://forums.phpfreaks.com/topic/24208-big-tables-and-slowness/#findComment-110051 Share on other sites More sharing options...
neoform Posted October 17, 2006 Author Share Posted October 17, 2006 just ran it with COUNT(id)*waits*..*minutes go by*..yeah, that's not working too good.. ;) Quote Link to comment https://forums.phpfreaks.com/topic/24208-big-tables-and-slowness/#findComment-110057 Share on other sites More sharing options...
shoz Posted October 17, 2006 Share Posted October 17, 2006 If you do an "EXPLAIN" on the query ("EXPLAIN SELECT ... ") you should see that the query uses no indexes. That's because MYSQL can't use indexes for !=.The following should be faster[code]SELECT ((SELECT COUNT(*) FROM listings) - (SELECT COUNT(*) FROM listings WHERE longitude = 0)) AS num[/code]As far as why the other queries take 10 seconds or more, you should post the queries and the EXPLAIN output.If the database does not change frequently and the result is being used as a search result perhaps for users to browse through, you may want to store the result of the query in a table for quick retrieval. You can recreate the table every 1hr or so depending on how often you think is best. Quote Link to comment https://forums.phpfreaks.com/topic/24208-big-tables-and-slowness/#findComment-110080 Share on other sites More sharing options...
neoform Posted October 17, 2006 Author Share Posted October 17, 2006 Ooo, that's good to know.. i didn't realize that indexes didn't work for != operators.. (any reason why?)as for the big query.. caching the results would be nice, but it's not possible in this case since the resulting cache table would be far too large since it's calculating the distance from a point a (you) to point b (business) and ordering then by distance..[code]$query = " SELECT listings.id, listings.url_name, listings.name, listings.address, listings.zip, listings.website, listings.phone, listings.fax, listings.rating, listings.city, listings.state, ( 3958 * 3.1415926 * sqrt( (listings.latitude - ".$city['latitude'].") * (listings.latitude - ".$city['latitude'].") + cos(listings.latitude / 57.29578) * cos(".$city['latitude']." / 57.29578) * (listings.longitude - ".$city['longitude'].") * (listings.longitude - ".$city['longitude'].") ) / 180 ) distance FROM listings_link INNER JOIN listings ON listings.id = listings_link.listing_id WHERE (listings_link.category_id = '".$category['id']."' ".$where.") ORDER BY url_name ASC LIMIT ".(($_page - 1) * LOCAL_PER_PAGE_SEARCH).", ".LOCAL_PER_PAGE_SEARCH." ";[/code]Resulting query..[code]SELECT listings.id, listings.url_name, listings.name, listings.address, listings.zip, listings.website, listings.phone, listings.fax, listings.rating, listings.city, listings.state, ( 3958 * 3.1415926 * sqrt( (listings.latitude - 40.394700) * (listings.latitude - 40.394700) + cos(listings.latitude / 57.29578) * cos(40.394700 / 57.29578) * (listings.longitude - -99.121000) * (listings.longitude - -99.121000) ) / 180 ) distance FROM listings_link INNER JOIN listings ON listings.id = listings_link.listing_id WHERE (listings_link.category_id = '1360' ) ORDER BY distance ASC LIMIT 0, 15[/code]Explained..[code]| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |+----+-------------+---------------+--------+--------------------------------+-------------+---------+-----------------------------------------+------+---------------------------------+| 1 | SIMPLE | listings_link | ref | PRIMARY,category_id,listing_id | category_id | 2 | const | 200 | Using temporary; Using filesort || 1 | SIMPLE | listings | eq_ref | PRIMARY | PRIMARY | 3 | localdirectory.listings_link.listing_id | 1 | |+----+-------------+---------------+--------+--------------------------------+-------------+---------+-----------------------------------------+------+---------------------------------+[/code]I understand filesort is slower, but i can't figure any way of indexing that.. Quote Link to comment https://forums.phpfreaks.com/topic/24208-big-tables-and-slowness/#findComment-110086 Share on other sites More sharing options...
fenway Posted October 17, 2006 Share Posted October 17, 2006 Well, you have some redundant indexes, but that would just make the inserts slower. As for !=, just think about it... if I give a sorted list of 1000 numbers, and say find 543, that's easy. But if I say show me all that aren't 543, you have to actually check them. You can only use an index to find out what is there, not what isn't.I'm not sure that you could make that last query much faster, the EXPLAIN doesn't really give you much room for improvement if you sort on a expression. Quote Link to comment https://forums.phpfreaks.com/topic/24208-big-tables-and-slowness/#findComment-110117 Share on other sites More sharing options...
neoform Posted October 17, 2006 Author Share Posted October 17, 2006 *shakes fist at the mysql developpers*Make me an index that indexes numbers that don't exist!! ;) Quote Link to comment https://forums.phpfreaks.com/topic/24208-big-tables-and-slowness/#findComment-110134 Share on other sites More sharing options...
shoz Posted October 17, 2006 Share Posted October 17, 2006 I did a test with a large table using a similar JOIN but didn't have anywhere near a 10 sec query time.The EXPLAIN shows[code]Using temporary; Using filesort[/code]There should at least be a "using where" in the output. I'm guessing that probably has something to do with it. How long does the query take to do the JOIN without the LIMIT and the ORDER BY?[code]SELECTlistings.id, listings.url_name, listings.name, listings.address, listings.zip, listings.website,listings.phone, listings.fax, listings.rating, listings.city, listings.state,( 3958 * 3.1415926 * sqrt( (listings.latitude - 40.394700) * (listings.latitude - 40.394700) + cos(listings.latitude / 57.29578) * cos(40.394700 / 57.29578) * (listings.longitude - -99.121000) * (listings.longitude - -99.121000) ) / 180 ) distanceFROMlistings_linkINNER JOINlistingsONlistings.id = listings_link.listing_idWHERE(listings_link.category_id = '1360' )[/code]You might try an "ANALYZE TABLE listings" and "ANALYZE TABLE listings_link" and see if that helps.If you create a table that holds the result of the JOIN without the WHERE you may be able to speed things up by searching on that table. Again, this would be a table that you'd recreate periodically.[code]CREATE TABLE list_and_link ....INSERT INTO TABLE list_and_link SELECT ..... FROM table1 INNER JOIN table2 ON table1.id = table2.idALTER TABLE ADD INDEX ...SELECT l.*, COS * 1111/222 AS distance FROM list_and_link AS l WHERE category = ... ORDER BY distance[/code]You should be able to shorten the table creation process by using CREATE TABLE SELECT ... syntax, but I rather create the table definition by hand because MYSQL may choose column types when using the "create table select syntax" that would be problematic.EDIT: Btw, make sure the EXPLAIN output is from the query posted. I'm mentioning this because the code snippet you posted although almost the same as the "Resulting query', is not the same. Quote Link to comment https://forums.phpfreaks.com/topic/24208-big-tables-and-slowness/#findComment-110156 Share on other sites More sharing options...
neoform Posted October 17, 2006 Author Share Posted October 17, 2006 If i do the query without ORDER BY it takes 0.16 seconds.Without ORDER BY and LIMIT it takes 0.01 seconds.. (weird that it'd be faster to call more info....)here's an EXPLAIN on the query you reposted..[code]+----+-------------+---------------+--------+--------------------------------+-------------+---------+-----------------------------------------+------+-------+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |+----+-------------+---------------+--------+--------------------------------+-------------+---------+-----------------------------------------+------+-------+| 1 | SIMPLE | listings_link | ref | PRIMARY,category_id,listing_id | category_id | 2 | const | 200 | || 1 | SIMPLE | listings | eq_ref | PRIMARY | PRIMARY | 3 | localdirectory.listings_link.listing_id | 1 | |+----+-------------+---------------+--------+--------------------------------+-------------+---------+-----------------------------------------+------+-------+[/code]I'm doing the analyze on the tables.. (never done that before... took a bloody long time)Once it's done that i'll try doing to listings_and_link idea, but.. wont that eat up a ton more ram in the process since it's going to basically be double the size of the listings table.. ?[code]mysql> ANALYZE TABLE listings;+-------------------------+---------+----------+----------+| Table | Op | Msg_type | Msg_text |+-------------------------+---------+----------+----------+| localdirectory.listings | analyze | status | OK |+-------------------------+---------+----------+----------+1 row in set (6 min 35.16 sec)[/code][code]mysql> ANALYZE TABLE listings_link;+------------------------------+---------+----------+----------+| Table | Op | Msg_type | Msg_text |+------------------------------+---------+----------+----------+| localdirectory.listings_link | analyze | status | OK |+------------------------------+---------+----------+----------+1 row in set (35.63 sec)[/code] Quote Link to comment https://forums.phpfreaks.com/topic/24208-big-tables-and-slowness/#findComment-110185 Share on other sites More sharing options...
shoz Posted October 17, 2006 Share Posted October 17, 2006 If the join is that fast without the ORDER BY then it seems to be taking far too long to sort 200 rows. The result is 200 rows yes? Quote Link to comment https://forums.phpfreaks.com/topic/24208-big-tables-and-slowness/#findComment-110202 Share on other sites More sharing options...
neoform Posted October 17, 2006 Author Share Posted October 17, 2006 [quote author=shoz link=topic=111770.msg453292#msg453292 date=1161111518]If the join is that fast without the ORDER BY then it seems to be taking far too long to sort 200 rows. The result is 200 rows yes?[/quote]yeah, it's for the 200 rows, I felt it was taking too long for that amount as well.. :Pbtw, if you wanna see the site, it's here: http://local.netdeals.ca/listings/ and example listing page is: http://local.netdeals.ca/listings/new_york/manhattan/computer_communications_consultants/if you view the source for the page i stuck the SQL queries (3 of them, scroll to see them all) as an <!-- html comment -->Maybe it's my imagination, but everything is loading a LOT faster after having run "ANALYZE TABLE".. is it just me? Quote Link to comment https://forums.phpfreaks.com/topic/24208-big-tables-and-slowness/#findComment-110210 Share on other sites More sharing options...
shoz Posted October 17, 2006 Share Posted October 17, 2006 [quote=neoform]Maybe it's my imagination, but everything is loading a LOT faster after having run "ANALYZE TABLE".. is it just me?[/quote]ANALYZE TABLE may have helped. As far as whether or not it's your imagination, I'd say that you should run the queries manually and see if they're faster. Quote Link to comment https://forums.phpfreaks.com/topic/24208-big-tables-and-slowness/#findComment-110234 Share on other sites More sharing options...
fenway Posted October 17, 2006 Share Posted October 17, 2006 Well, yes, if you indexes weren't nice distributed, then the optimizer wouldn't optimize very well! You should be running this on all of your tables regularly, especially after any large updates. Quote Link to comment https://forums.phpfreaks.com/topic/24208-big-tables-and-slowness/#findComment-110297 Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.