scottmelb Posted May 17, 2008 Share Posted May 17, 2008 Hi Guys, I have a website primarily written in PHP that connects to a very large SQL database, which contains about 100 tables, some of which contain about 600,000 records. The website is designed to allow users to search the database, and all tables of the database, to retrieve resulting records from the posted search criteria. The search script works fine, and expected results are returned. However, as the size of the database grows, the access speed slows dramatically. Currently, after Submitting the search criteria a user has to wait about 2 minutes for the results to be returned. In order to retrieve relevant records from each database table, I use an SQL SELECT statement with "UNION" to create an SQL statement that inspects every table in the database for matching records. I have spoken with my ISP who has recommended that I move to parallel processing in order to improve access times, using a series of servers to improve performance overall. However, before moving to that more costly solution, I am wondering if I could improve the performance by changing the way I query the database. One thought I had was to merge all of the records into a single database table, however that would become very large indeed, and may exceed the limits possible for table size (I am not sure what this is, but I would expect there would be an upper limit). Would anyone know of some alternatives here, or perhaps some methods I could employ to improve performance? Many thanks, Scott. Quote Link to comment Share on other sites More sharing options...
jvrothjr Posted May 17, 2008 Share Posted May 17, 2008 cach or indexing? Quote Link to comment Share on other sites More sharing options...
scottmelb Posted May 17, 2008 Author Share Posted May 17, 2008 Hi, Just to follow up on my original post... I have now done the following things to attempt to improve performance, based on what I have read in the MySQL Performance documents on this forum: 1. Added an INDEX to all record ID's on all tables (although automatically this was set as a PRIMARY INDEX) 2. Changed my BIGINT (20)s to INT (8)s. Further to this, should I index other columns? Users do not search for records by ID but rather all other columns, such as Name, Location, etc.. Can all of these columns be assigned an INDEX? Would this improve performance? Also, all of my columns (asside from ID) are set to TEXT. Should I use VARCHAR instead? I believe this may also improve performance. Thanks guys, Scott. Quote Link to comment Share on other sites More sharing options...
scottmelb Posted May 18, 2008 Author Share Posted May 18, 2008 Anyone? Quote Link to comment Share on other sites More sharing options...
fenway Posted May 18, 2008 Share Posted May 18, 2008 Hi, Just to follow up on my original post... I have now done the following things to attempt to improve performance, based on what I have read in the MySQL Performance documents on this forum: 1. Added an INDEX to all record ID's on all tables (although automatically this was set as a PRIMARY INDEX) 2. Changed my BIGINT (20)s to INT (8)s. Further to this, should I index other columns? Users do not search for records by ID but rather all other columns, such as Name, Location, etc.. Can all of these columns be assigned an INDEX? Would this improve performance? Also, all of my columns (asside from ID) are set to TEXT. Should I use VARCHAR instead? I believe this may also improve performance. Thanks guys, Scott. 1. Hopefully you didn't index the columns twice 2. You should use UNSIGNED INT, drop the display width. You can use a fulltext index across multiple columns for MyISAM tables. VARCHAR is much better than TEXT, especially in 5.0+, where you can use up to 65K. Lastly, have you used the slow query log to locate the guilty queries? If so, post their eXPLAIN output. Quote Link to comment Share on other sites More sharing options...
scottmelb Posted May 22, 2008 Author Share Posted May 22, 2008 Hello fenway, Thankyou for you feedback. I have ensured that I have not doubled up on any indexing in the tables. Below is an example of the way I have a table setup: Field Type Collation Attributes Null Default Extra ---------- --------- ---------- ---------- ----- -------- -------- ID int( No auto_increment FirstName varchar(50) latin1_swedish_ci No LastName varchar(50) latin1_swedish_ci No RecordLink varchar(100) latin1_swedish_ci No Address1 varchar(50) latin1_swedish_ci No Address2 varchar(50) latin1_swedish_ci No City varchar(50) latin1_swedish_ci No State varchar(50) latin1_swedish_ci No Country varchar(50) latin1_swedish_ci No Region varchar(50) latin1_swedish_ci No Sex varchar(50) latin1_swedish_ci No Age varchar(50) latin1_swedish_ci No Status varchar(50) latin1_swedish_ci No ContactTel varchar(50) latin1_swedish_ci No Email varchar(50) latin1_swedish_ci No Occupation varchar(50) latin1_swedish_ci No GeneralNotes text latin1_swedish_ci No Note that the database data is made up from records that are available on a series of website pages, and a PHP series of PHP scripts that I have written scrape this data and place it in the database tables. Because some of the data is not "clean" I have allowed in most cases for 50 characters in each column, including telephone as there are some instances where this is not simply numeric data, but also includes a short comment. Additionally, there are a number of database tables that are setup slightly differently to capture different data that is available on some of the pages scraped. This means that on some pages there is more information about a person, such as a date of birth, a company name, a position title, etc.. When a user performs a search through a simple form on a PHP page, my PHP search script assembles an SQL query (albeit an ugly one) that looks something like the following (in this example the user is searching for all people located in California between the ages of 21 and 29, who are Male): SELECT `FirstName`, `LastName`, `RecordLink` FROM people_table1 WHERE `Sex` = 'Male' AND (`Age` >= '21' AND `Age` <= '29') AND `City` LIKE '%California%' UNION SELECT `FirstName`, `LastName`, `RecordLink` FROM people_table2 WHERE `Sex` = 'Male' AND (`Age` >= '21' AND `Age` <= '29') AND `City` LIKE '%California%' UNION SELECT `FirstName`, `LastName`, `RecordLink` FROM people_table3 WHERE `Sex` = 'Male' AND (`Age` >= '21' AND `Age` <= '29') AND `City` LIKE '%California%' UNION SELECT `FirstName`, `LastName`, `RecordLink` FROM people_table4 WHERE `Sex` = 'Male' AND (`Age` >= '21' AND `Age` <= '29') AND `City` LIKE '%California%' UNION SELECT `FirstName`, `LastName`, `RecordLink` FROM people_table5 WHERE `Sex` = 'Male' AND (`Age` >= '21' AND `Age` <= '29') AND `City` LIKE '%California%' UNION SELECT `FirstName`, `LastName`, `RecordLink` FROM people_table6 WHERE `Sex` = 'Male' AND (`Age` >= '21' AND `Age` <= '29') AND `City` LIKE '%California%' UNION SELECT `FirstName`, `LastName`, `RecordLink` FROM people_table7 WHERE `Sex` = 'Male' AND (`Age` >= '21' AND `Age` <= '29') AND `City` LIKE '%California%' UNION SELECT `FirstName`, `LastName`, `RecordLink` FROM people_table8 WHERE `Sex` = 'Male' AND (`Age` >= '21' AND `Age` <= '29') AND `City` LIKE '%California%' UNION SELECT `FirstName`, `LastName`, `RecordLink` FROM people_table9 WHERE `Sex` = 'Male' AND (`Age` >= '21' AND `Age` <= '29') AND `City` LIKE '%California%' UNION SELECT `FirstName`, NULL AS `LastName`, `RecordLink` FROM people_table10 WHERE `Sex` = 'Male' AND (`Age` >= '21' AND `Age` <= '29') AND `City` LIKE '%California%' ORDER BY `FirstName` ASC LIMIT 1, 31 Note that this query has been cut down a fair but. This would usually show over 50 tables. To explain the above query, I have setup my script this way to check each table in sequence for the required records. You will note that there is an "NULL AS `LastName` in the last SELECT statement as people_table10 does not have a column `LastName`, and this prevents an SLQ error. Also, it is necessary to hold this data in separate tables like this because of the differing data available for each set of website pages scraped. Also note that I use a LIKE operator for `City` because sometimes this includes extraneous data, like a country name as well, or a city name. Please note that this SQL query works perfectly with a small number of records in the tables. However, now as I have over 4 million records, and this will grow alot more, the time it is taking to return results is very long indeed (> 5 minutes). I have taken your earlier advise and changed my INT(8)s to UNSIGNED INT columns, however this has not made alot of difference that I can tell. Currently, my indexes look something like this on most of the tables: Keyname Type Cardinality Field ------- ----- ---------- ------ PRIMARY PRIMARY 0 ID Address1 INDEX 32245 Address1 Address2 INDEX 234 Address2 City INDEX 325 City State INDEX 45 State Country INDEX 12 Country Region INDEX 7 Region Sex INDEX 2 Sex Age INDEX 76 Age Status INDEX 6 Status Occupation INDEX 3785 Occupation Note that I have not indexed columns that are not searchable by users (FirstName, LastName, etc..) So, looking at all of this, is there anything here that is glaringly and obviously wrong? All tables are MyISAM tables. I am unsure what you mean by "You can use a fulltext index across multiple columns for MyISAM tables". I have basically just used MyPHPAdmin to add individual indexes to each column independently. Is this correct? Also, I am unsure how to use the "slow query log to locate the guilty queries". Unfortunately I am still finding my feet with all of this, and I do greatly appreciate your help. Regards, Scott. Quote Link to comment Share on other sites More sharing options...
scottmelb Posted May 22, 2008 Author Share Posted May 22, 2008 Fenway, Apologies, I have just read your posting guidelines, so I would like to add the following information: 1. SQL version is 5.0 2. The sample Raw SQL statement in code block: SELECT `FirstName`, `LastName`, `RecordLink` FROM people_table1 WHERE `Sex` = 'Male' AND (`Age` >= '21' AND `Age` <= '29') AND `City` LIKE '%California%' UNION SELECT `FirstName`, `LastName`, `RecordLink` FROM people_table2 WHERE `Sex` = 'Male' AND (`Age` >= '21' AND `Age` <= '29') AND `City` LIKE '%California%' UNION SELECT `FirstName`, `LastName`, `RecordLink` FROM people_table3 WHERE `Sex` = 'Male' AND (`Age` >= '21' AND `Age` <= '29') AND `City` LIKE '%California%' UNION SELECT `FirstName`, `LastName`, `RecordLink` FROM people_table4 WHERE `Sex` = 'Male' AND (`Age` >= '21' AND `Age` <= '29') AND `City` LIKE '%California%' UNION SELECT `FirstName`, `LastName`, `RecordLink` FROM people_table5 WHERE `Sex` = 'Male' AND (`Age` >= '21' AND `Age` <= '29') AND `City` LIKE '%California%' UNION SELECT `FirstName`, `LastName`, `RecordLink` FROM people_table6 WHERE `Sex` = 'Male' AND (`Age` >= '21' AND `Age` <= '29') AND `City` LIKE '%California%' UNION SELECT `FirstName`, `LastName`, `RecordLink` FROM people_table7 WHERE `Sex` = 'Male' AND (`Age` >= '21' AND `Age` <= '29') AND `City` LIKE '%California%' UNION SELECT `FirstName`, `LastName`, `RecordLink` FROM people_table8 WHERE `Sex` = 'Male' AND (`Age` >= '21' AND `Age` <= '29') AND `City` LIKE '%California%' UNION SELECT `FirstName`, `LastName`, `RecordLink` FROM people_table9 WHERE `Sex` = 'Male' AND (`Age` >= '21' AND `Age` <= '29') AND `City` LIKE '%California%' UNION SELECT `FirstName`, NULL AS `LastName`, `RecordLink` FROM people_table10 WHERE `Sex` = 'Male' AND (`Age` >= '21' AND `Age` <= '29') AND `City` LIKE '%California%' ORDER BY `FirstName` ASC LIMIT 1, 31 3. The SHOW CREATE TABLE METHOD gives the following for my example table structure: CREATE TABLE `people_table1` (\n `ID` int(8 ) NOT NULL auto_increment,\n `FirstName` varchar(50) NOT NULL default '',\n `LastName` varchar(50) NOT NULL default '',\n `RecordLink` varchar(100) NOT NULL default '',\n `Address1` varchar(50) NOT NULL default '',\n `Address2` varchar(50) NOT NULL default '',\n `City` varchar(50) NOT NULL default '',\n `State` varchar(50) NOT NULL default '',\n `Country` varchar(50) NOT NULL default '',\n `Region` varchar(50) NOT NULL default '',\n `Sex` varchar(50) NOT NULL default '',\n `Age` varchar(50) NOT NULL default '',\n `Status` varchar(50) NOT NULL default '',\n `ContactTel` varchar(50) NOT NULL default '',\n `Email` varchar(50) NOT NULL default '',\n `Occupation` varchar(50) NOT NULL default '',\n `GeneralNotes` text NOT NULL default '',\n) ENGINE=MyISAM AUTO_INCREMENT=7 DEFAULT CHARSET=latin1 Many thanks, Scott. Quote Link to comment Share on other sites More sharing options...
fenway Posted May 22, 2008 Share Posted May 22, 2008 OK... now show me the EXPLAIN output that I originally asked for... Quote Link to comment Share on other sites More sharing options...
scottmelb Posted May 23, 2008 Author Share Posted May 23, 2008 Fenway, This is the output produced using my SQL query (43 tables in total are included in the SQL statement, whereas my example shows only 10): id select_type table type possible_keys key key_len ref rows Extra 1 PRIMARY people_table1 ref Sex,Age Sex 50 const 392745 Using where 2 UNION people_table2 ref Sex,Age Sex 50 const 354451 Using where 3 UNION people_table3 ref Sex,Age Sex 50 const 39084 Using where 4 UNION people_table4 range Sex,Age Age 50 NULL 1102 Using where 5 UNION people_table5 range Sex,Age Age 50 NULL 786 Using where 6 UNION people_table6 ref Sex,Age Sex 50 const 339 Using where 7 UNION people_table7 range Sex,Age Age 50 NULL 60 Using where 8 UNION people_table8 ref Sex,Age Sex 50 const 104274 Using where 9 UNION people_table9 ref Sex,Age Sex 50 const 131437 Using where 10 UNION people_table10 ref Sex,Age Sex 50 const 29301 Using where 11 UNION people_table11 ALL NULL NULL NULL NULL 270756 Using where 12 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables 13 UNION people_table13 ref Sex,Age Sex 50 const 37295 Using where 14 UNION people_table14 ref Sex,Age Sex 50 const 244851 Using where 15 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables 16 UNION people_table16 ref Sex,Age Sex 50 const 56158 Using where 17 UNION people_table17 ref Age,Sex Sex 50 const 68 Using where 18 UNION people_table18 ref Sex,Age Sex 50 const 1 Using where 19 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables 20 UNION people_table20 range Sex,Age Age 50 NULL 320 Using where 21 UNION people_table21 ref Sex,Age Sex 50 const 8521 Using where 22 UNION people_table22 ref Sex,Age Sex 50 const 81136 Using where 23 UNION people_table23 ref Sex,Age Sex 50 const 12941 Using where 24 UNION people_table24 ref Sex,Age Sex 50 const 135699 Using where 25 UNION people_table25 ref Sex,Age Sex 50 const 75187 Using where 26 UNION people_table26 ref Sex,Age Sex 50 const 129273 Using where 27 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables 28 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables 29 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables 30 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables 31 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables 32 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables 33 UNION people_table33 range Sex,Age Age 50 NULL 241 Using where 34 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables 35 UNION people_table35 ALL NULL NULL NULL NULL 9 Using where 36 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables 37 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables 38 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables 39 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables 40 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables 41 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables 42 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables 43 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables NULL UNION RESULT <union1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,...> ALL NULL NULL NULL NULL NULL Using filesort Regards, Scott. Quote Link to comment Share on other sites More sharing options...
scottmelb Posted May 23, 2008 Author Share Posted May 23, 2008 Fenway, I have tried to line up the items in this table to assist in viewing: id select_type table type possible_keys key key_len ref rows Extra -- ----------- ----- ---- ------------- --- ------- --- ---- ----- 1 PRIMARY people_table1 ref Sex,Age Sex 50 const 392745 Using where 2 UNION people_table2 ref Sex,Age Sex 50 const 354451 Using where 3 UNION people_table3 ref Sex,Age Sex 50 const 39084 Using where 4 UNION people_table4 range Sex,Age Age 50 NULL 1102 Using where 5 UNION people_table5 range Sex,Age Age 50 NULL 786 Using where 6 UNION people_table6 ref Sex,Age Sex 50 const 339 Using where 7 UNION people_table7 range Sex,Age Age 50 NULL 60 Using where 8 UNION people_table8 ref Sex,Age Sex 50 const 104274 Using where 9 UNION people_table9 ref Sex,Age Sex 50 const 131437 Using where 10 UNION people_table10 ref Sex,Age Sex 50 const 29301 Using where 11 UNION people_table11 ALL NULL NULL NULL NULL 270756 Using where 12 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables 13 UNION people_table13 ref Sex,Age Sex 50 const 37295 Using where 14 UNION people_table14 ref Sex,Age Sex 50 const 244851 Using where 15 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables 16 UNION people_table16 ref Sex,Age Sex 50 const 56158 Using where 17 UNION people_table17 ref Age,Sex Sex 50 const 68 Using where 18 UNION people_table18 ref Sex,Age Sex 50 const 1 Using where 19 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables 20 UNION people_table20 range Sex,Age Age 50 NULL 320 Using where 21 UNION people_table21 ref Sex,Age Sex 50 const 8521 Using where 22 UNION people_table22 ref Sex,Age Sex 50 const 81136 Using where 23 UNION people_table23 ref Sex,Age Sex 50 const 12941 Using where 24 UNION people_table24 ref Sex,Age Sex 50 const 135699 Using where 25 UNION people_table25 ref Sex,Age Sex 50 const 75187 Using where 26 UNION people_table26 ref Sex,Age Sex 50 const 129273 Using where 27 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables 28 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables 29 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables 30 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables 31 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables 32 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables 33 UNION people_table33 range Sex,Age Age 50 NULL 241 Using where 34 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables 35 UNION people_table35 ALL NULL NULL NULL NULL 9 Using where 36 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables 37 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables 38 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables 39 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables 40 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables 41 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables 42 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables 43 UNION NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const tables NULL UNION RESULT <union1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,...> ALL NULL NULL NULL NULL NULL Using filesort Regards, Scott. Quote Link to comment Share on other sites More sharing options...
fenway Posted May 23, 2008 Share Posted May 23, 2008 Seriously? 43 tables? At the very least, a covering index on (sex,age) would be helpful... but what's with all the impossibles? Quote Link to comment Share on other sites More sharing options...
scottmelb Posted May 24, 2008 Author Share Posted May 24, 2008 Fenway, Yes 43 tables currently, however as the number of websites that my site indexes (scrapes) continues to grow, so do the number of tables. The "Impossibles" seem to be each table that is in the database that currently have no records. These are yet to have any data placed in them. With regard to your suggestion of a covering index for Sex,Age - Can you please tell me how a covering index is setup, different from my index that I have assigned to each column? Note also that the SQL query will differ depending on the search criteria the user selects when performing the search. In the example given, the user has selected Age >= 21 and Age <= 29, Sex=Male, and State LIKE California. I assume that if a user searches by other criteria, such as City, Country, Region, Status, etc... that covering indexes may also help to cover these combinations also. This leads me to this question: Can I setup a number of covering indexes to cover most common search combination possibilities? Many thanks, Scott. Quote Link to comment Share on other sites More sharing options...
jordanwb Posted May 25, 2008 Share Posted May 25, 2008 I'm no expert but would using mysql_unbuffered_query work? Quote Link to comment Share on other sites More sharing options...
scottmelb Posted May 25, 2008 Author Share Posted May 25, 2008 Thanks jordanwb, I will give mysql_unbuffered_query a go. So far in my tests this appears to speed up the process somewhat. Another thought I had was to combine all of the tables into one table. Would a single SELECT statement working on one table (with say 25,000,000 rows) be more efficient than 50 SELECT statements joined together using UNION on 50 separate tables (with say 500,000 rows in each)? Many thanks, Scott. Quote Link to comment Share on other sites More sharing options...
fenway Posted May 29, 2008 Share Posted May 29, 2008 Would a single SELECT statement working on one table (with say 25,000,000 rows) be more efficient than 50 SELECT statements joined together using UNION on 50 separate tables (with say 500,000 rows in each)? Well, that's 50 times more overhead (client/server, result set object creation/deletion, etc.) -- but it really depends on how many rows match, how much data are returned, etc. Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.