arthur1984 Posted April 3, 2012 Share Posted April 3, 2012 Hello, I have a DB containing old ship records with these 4 tables: passages(id_passage, year, skipper_firstname, skipper_lastname, skipper_residence, departure, arrival) cargo(id_cargo, toll_daalders, toll_skilling, images) places_source(code, place) places_standard(code, place, st_place, region_code) I use the following query to get everything from the first two tables: SELECT DISTINCT passages.skipper_firstname AS Firstname, passages.skipper_lastname AS Lastname, passages.skipper_residence AS Residence, passages.departure AS "Departure port", passages.arrival AS "Arrival port", cargo.toll_daalders AS "Total in Daalders" FROM passages JOIN cargo ON passages.id_passage = id_cargo WHERE passages.skipper_lastname LIKE "Jansen" AND passages.departure LIKE "Amsterdam" The problem however is that names for residences and departure/arrival ports in the passage table are often spelled in different ways. The table places_source stores every possible name variation which can be found in the passage table along with a code. This code can then be used to look up the standardized name in the places_standard table. To get these standardized names I came up with the following: SELECT DISTINCT passage.skipper_firstname AS Firstname, passage.skipper_lastname AS Lastname, R2.st_name AS Residence, D2.st_name AS "Departure port", A2.st_name AS "Arrival port", cargo.toll_daalders AS "Total in Daalders" FROM passages JOIN cargo ON passages.id_passage = id_cargo JOIN places_source AS R1 ON passages.skipper_residence = R1.place JOIN places_source AS D1 ON passages.departure = D1.place JOIN places_source AS A1 ON passages.arrival = A1.place JOIN places_standard AS R2 ON R1.code = R2.code JOIN places_standard AS D2 ON D1.code = D2.code JOIN places_standard AS A2 ON A1.code = A2.code WHERE passages.skipper_lastname LIKE "Johnson" AND R2.code LIKE "501AMS" AND D2.region_code LIKE "14" There are indexes on all fields in the base table. As long as I use skipper_lastname in the WHERE clause all works reasonably well. As soon as I remove it the query becomes extremely slow. SQL now has to do a full scan on the passages table (1.2 million rows) because the WHERE clause no longer contains any fields found in the passages table...only the aliases. Unfortunately the people who set up the DB won't allow me to alter/merge any of the tables so i'm affraid this is what I have to work with. Is there any other (faster) way to get the standard names? Any help is appreciated! Quote Link to comment Share on other sites More sharing options...
Muddy_Funster Posted April 3, 2012 Share Posted April 3, 2012 Don't use LIKE for exact values, use = Like will work through the contents of each record of each field bit by bit to build matches, = compares the record entry for the field as a whole. This should speed things up a bit. Also, why are you joining the same table so many times? Quote Link to comment Share on other sites More sharing options...
arthur1984 Posted April 3, 2012 Author Share Posted April 3, 2012 Hi Muddy_Funster, Thanks, i'll change the LIKE statements where possible and see how that works out. About the joins: I need the standardized names from the places_standard table for 3 different fields in the passages table, hence the 3 different joins to the same places_standard table. Quote Link to comment Share on other sites More sharing options...
arthur1984 Posted April 3, 2012 Author Share Posted April 3, 2012 I could use the following code to avoid the joins but I run into the same problem. SELECT DISTINCT p.skipper_firstname AS Firstname, p.skipper_lastname AS Lastname, s1.st_name AS Residence, s2.st_name AS "Departure port", s3.st_name AS "Arrival port", c.toll_daalders AS "Total in Daalders" FROM passages p, cargo c, places_source p1, places_source p2, places_source p3, places_standard s1, places_standard s2, places_standard s3 WHERE p.id_passage = c.id_cargo AND p.skipper_residence = p1.place AND p1.code = s1.code AND p.departure = p2.place AND p2.code = s2.code AND p.arrival = p3.place AND p3.code = s3.code AND s1.code = "501AMS" AND p.skipper_lastname = "Jansen" As soon as I ditch p.skipper_lastname from the WHERE clause it has to scan the whole passages table because the first & lastname fields are the only ones I retrieve directly from the passages table. The standardized names for residences and ports come from the places_standard table instead. The non-standardized skipper_residence, departure and arrival fields from the passages table are only used in the WHERE clause to decide what values to get from the places_standard table. I guess with the way the DB is setup, scanning the whole passage table when there is no firstname or lastname field in the WHERE clause is inevitable Quote Link to comment Share on other sites More sharing options...
Muddy_Funster Posted April 3, 2012 Share Posted April 3, 2012 it's a sour pill, but sometimes when your working with a 3rd party system you need to make concessions for mistakes/inflexability in design. Just run it on a 64GB dedicated server and youll be fine Quote Link to comment Share on other sites More sharing options...
arthur1984 Posted April 3, 2012 Author Share Posted April 3, 2012 I'll see if I can convince them to upgrade to such a server Just to make sure before this thread can be considered closed...is there any other way to deal with this problem using MySQL? If anyone has any suggestions please let me know. Quote Link to comment Share on other sites More sharing options...
fenway Posted April 9, 2012 Share Posted April 9, 2012 Sorry, what's the problem? Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.