Jump to content

[SOLVED] Oddly slow query


daviddth

Recommended Posts

I have a query that is searching a 2500 record mySQL database on one (indexed) field, but the query is running slow bcause of one particular search term. The query is:

 

$sql = "SELECT * FROM graves WHERE ((LName='???') OR (LName like 'A%') OR (LName like 'B%') or (LName like 'C%')) ORDER BY LName, FName";

 

Yes one of the LName searches is ? ? ? (without the spaces - had to do that to stop it showing as ??? ), as that is the exact entry used in the LName field. That entry is used for unlknown surnames, usually because I know the married name of a female, but not the maiden name.

 

The query takes 1.1 seconds to build the page, whereas a similar query takes less than 0.1 seconds for a similar number of returned names. I have tried removing the LName='? ? ?' and the query returns to 0.08 seconds, so why is it taking so long?

 

I assume it's because of the ?'s but is there any way around it? As it is, it returns the right records, but just takes much longer than the other queries. I tried LName like '?%' and thats the same - long search time.

 

The query can be seen in action at http://www.davidrawsthorne.com/familytree/people.php?alphindex=ABC and if you select http://www.davidrawsthorne.com/familytree/people.php?alphindex=DEF you will see a faster query that has no ? ? ?'s to search for.

 

 

 

Link to comment
Share on other sites

My bet is because alphabetical items and numbers can be indexed. ? marks really cannot. I would change that to be different, either just null or another item or add like an "A" infront of them. However you will have to remember to take that into effect each time you run a query against just "A" to not include that.

 

 

Link to comment
Share on other sites

You are probably right, but there is no easy way to change the names as they marry into a lot of other queries, both shown and hidden, but this is the only one that searches just on LName, and thus has the speed issue.  Without a lot of complex on the fly adding and then removing extra characters, I was hoping for an "easier" way.

 

It's not a huge issue, but I was sitting here at work at 3:00am wondering why, so thought I'd ask the experts here :)

Link to comment
Share on other sites

Are you sure ? can't be indexed?  I don't see why not.

 

 

 

Try doing:

 

EXPLAIN SELECT * FROM graves WHERE ((LName='???') OR (LName like 'A%') OR (LName like 'B%') or (LName like 'C%')) ORDER BY LName, FName

 

And see what indexes it is using.

Link to comment
Share on other sites

If I remove the LName='? ? ?' then it says possible key LName and Key is LName, (which is the Last Name index, and thus is what is expected), but adding it back in there it says possible key is LName, but key is NULL

 

 

Link to comment
Share on other sites

Use NULL for entries you have no data about. Then use IS NULL to select them

 

OT: I'm kind of disturbed knowing, that somewhere in your application there is a query

INSERT INTO graves...

Link to comment
Share on other sites

Strange.... I guess ??? really can't be indexed.  I would think that b-trees in MySQL work based on binary content, therefore letter content wouldn't matter.

 

(Edit:  Obviously letters are binary, but I mean I would think it would be based entirely on just what ever....  I would think MySQL wouldn't care if it were indexing ? or A.)

 

 

Just of curiosity, if you change the query to just:

 

EXPLAIN SELECT * FROM graves WHERE LName = '???';

 

 

Then does it say an index is used?

Link to comment
Share on other sites

Yes it uses the LName index. Odd. Some more testing - if I use ? ? ? and A% and B% then it works, Adding C% stops it using an index. If I try ? ? ? and B & C then it works too.

 

It seems as though doing all 4 together causes it not to use an index. It returns 470 items (Correct) out of 2359 items, but will not use the index when all 4 items are used. I have searches using 5 items (the letters V through Z) and they work fine, but they dont return as many hits.

 

I wonder if, once you get to a certain number of items, it searches through all rather than using an index. A through D does the same - no index....

 

I can fix it by making the queries to smaller groups of letters, but that does not explain why...

Link to comment
Share on other sites

Maybe the extra parenthesis are confusing MySQL for some weird reason.

 

 

$sql = "SELECT * FROM graves WHERE (LName='???') OR (LName like 'A%') OR (LName like 'B%') or (LName like 'C%') ORDER BY LName, FName";

 

 

Or maybe it doesn't like the = with the LIKEs....  Maybe:

 

$sql = "SELECT * FROM graves WHERE (LName='???') OR ((LName like 'A%') OR (LName like 'B%') OR (LName like 'C%')) ORDER BY LName, FName";

 

 

The easy solution would just be a UNION.

 

Example:

 

SELECT * FROM graves WHERE LName = '???'
UNION
SELECT * FROM graves WHERE LName LIKE 'A%' OR LName LIKE 'B%' OR LName like 'C%';

 

As people have already stated, though, you really should use NULL or atleast an empty string when LName isn't known.

Link to comment
Share on other sites

The union worked, thanks.

 

Because you need to display something for a surname, rather than no value (Null), most sources suggest (Unknown) or ? ? ?, as this makes understanding (by humans) much easier.

 

When starting the family tree webpage some 12 years ago I chose ? ? ?, and changing now is simply not an option as the ? ? ?'s are spread throughout static, and php pages, plus the family tree software, and it would be just too big to change now (Close to 500 entries are like this, but only a small fraction are in the database I am currently working with).

 

Link to comment
Share on other sites

Out of interest, in the end I found a few more pages that had issues, not just using the ? ? ? surnames, but with queries that ended up returning about 15 to 20% of the records - they defaulted to using no index. The end fix was to look at the page and realise "Darn thats a LOT of info on one page....." and split the page up into smaller (200 records) results

 

The "SELECT * FROM graves WHERE ((LName='???') OR (LName like 'A%') OR (LName like 'B%') or (LName like 'C%')) ORDER BY LName, FName" ended up having ' LIMIT '.$perpage.' OFFSET '.$offsetnum appended to it, where $perpage is a defineable limit, currently 200, per page, and offset number is the current page number multiplied by $perpage and minus 1 (as the offset starts at 0, but I prefer record 1 lol)

 

This will eventually filter to all pages, so thanks heaps for all your input. It works really well in the test page (which is used to add/view/edit the live data) so moving it to the public pages should be real easy to do. I still have to "tidy" the code, but I blame my 'newness' to PHP for its horrid design ;)

 

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.