Jump to content

Left join not working correctly


DWilliams

Recommended Posts

What I need to do is query our database for customers who have never had a letter sent to them. The relevant tables here are dbase (holds customer account information) and letters (has one row for each letter sent by our system, and has a maindatabaseid column linking it with the dbase row).

 

From a technical perspective, to accomplish this, I need to find all rows in dbase that don't have an entry in letters. I thought I could accomplish this with a left join but apparently I'm doing something wrong. Here's what I tried:

 

SELECT dbase.id, dbase.fullname 
FROM dbase LEFT JOIN letters ON dbase.id = letters.maindatabaseid 
WHERE letters.maindatabaseid IS NULL;

 

When I execute that, the whole server locks up until I terminate the query so I assume it's making way to many matches. I'm expecting less than 50 rows returned.

 

What am I doing wrong and how can I accomplish my goal here?

Link to comment
Share on other sites

The whole server locks up?  Have you tried letting it finish, or running an EXPLAIN against that query?

 

Have you indexes these columns properly and set up a foreign key for this relationship?

 

-Dan

 

I haven't tried letting it finish since this server is being used by about 5 people in the office. Generally I don't hesitate to run select statements untested on the live server due to their typically benign nature. If I need to let the query run fully then it'll need to wait since I don't want to lock those users up for an unknown amount of time (that tends to make them grumpy  :D ).

 

As for indexes and foreign key setups, no I haven't touched that. This database is not my design, it was created and it maintained by a separate software package that I have no control over. I'm just running queries against it.

 

- How many records do you have in your dbase and letters tables?

- There are indexes defined for both tables? (dbase.id and letters.maindatabaseid)

- Did you run an EXPLAIN over your select to detect what is happening?

 

The EXPLAIN statement shows how many rows are in both tables:

mysql> EXPLAIN SELECT dbase.id, dbase.fullname FROM dbase LEFT JOIN letters ON dbase.id = letters.maindatabaseid WHERE letters.maindatabaseid IS NULL;
+----+-------------+---------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table   | type | possible_keys | key  | key_len | ref  | rows  | Extra       |
+----+-------------+---------+------+---------------+------+---------+------+-------+-------------+
|  1 | SIMPLE      | dbase   | ALL  | NULL          | NULL | NULL    | NULL |  4823 |             | 
|  1 | SIMPLE      | letters | ALL  | NULL          | NULL | NULL    | NULL | 12554 | Using where | 
+----+-------------+---------+------+---------------+------+---------+------+-------+-------------+
2 rows in set (0.04 sec)

 

From what I've seen, when I have joins lock up and go out of control they seem to be comparing every row in table A to every row in table B, which in this case is 60,547,942 returned rows so that could very well explain the lockup.

Link to comment
Share on other sites

I could have sworn I replied to this before I left on Friday.

 

Fenway is right, there are no keys.  The database isn't using the indexes to filter the rows in the tables, so it has to take EVERY row of table1 and compare it to EVERY row of table2, row-by-row, one at a time.  This is called a cross-product and it takes forever.

 

If you set up a foreign key or an index on the fields that you're using to join the tables, your query time will be cut down to a few milliseconds.  Yes, it matters that much.

 

-Dan

Link to comment
Share on other sites

Hmm I must admit that I'm not really familiar with indexing.

 

As I said before, this database is used by another piece of software that I did not code. Would indexing mess with the database structure and potentially disrupt the program that primarily uses this database in any way?

 

Also if somebody could provide a link to tell me what I need to know here that would be awesome.

Link to comment
Share on other sites

The MySQL manual page on Indexes is a good start.

 

Indexes should do nothing but speed up your queries, but too many indexes will slow down inserts.  If there aren't a lot of inserts, indexing these columns won't have any adverse effects.

 

Put a "normal" index on dbase.id and letters.maindatabaseid. Indexing those columns will lock the tables while the index is formed, so you may want to create a copy of the tables to play with while you work on your index knowledge.

 

Or you could work with the person who created the application, or a DBA if one exists for your organization. 

 

-Dan

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.