Jump to content

Advanced MySQL Search


bendurber

Recommended Posts

I am trying to query a table of people based on there name, then I wish to return the closest match and also the 10 records either side of the match based on a sort of surname, forename.

 

e.g. Search for forename=“Joe” and surname=”Bloggs” would return 11 records as below:

 

ForenameSurname

JaneBarnes

VickyBates

SarahBirt

AdamBloggs

AnneBloggs

JoeBloggs

RuthBloggs

BeckyBowman

PeteBrown

GraemeButt

PaulCahill

 

My overall aim is to allow users to loop through the table using a name as a reference so for instance in this case 'Joe Bloggs' is the reference and they could then go back 100 records and see something like 'Ben Adams'.

 

 

Any ideas how this can be done?

 

Thanks

 

Ben

Link to comment
Share on other sites

If you don't want to add a new field to your table, of the top of my head I can see myself running two queries with the help of LIMIT, ORDER BY,  >=, <, and INSERT INTO

 

One query takes cares of receiving Joe Bloogs at the top plus the next 10 records (>= Joe Bloogs, LIMIT 10, ORDER BY asc)

 

The other no Joe Blogs at the bottom and 10 records (< Joe Bloogs, LIMIT 10, ORDER BY desc)

 

You use INSERT INTO a temporary table and then read from it.

 

Naturally all this if performance tests prove you won't get into trouble. If you do... well, I'm pretty new to mysql myself, but if stored procedures are cached as they are on other systems, you can do the same with the help of one.

 

Link to comment
Share on other sites

You're probably not going to like this, but here it is in ONE (1) query:

 

SELECT d1.forename
      ,d2.lastname
FROM
(SELECT forename
        ,lastname
        ,(@rnk1 := @rnk1+1) as 'rank'
  FROM people p
  JOIN (SELECT @rnk1 := 0) x
  ORDER BY p.lastname ASC, p.forename ASC
) d1
JOIN (SELECT * FROM (SELECT forename
        ,lastname
        ,(@rnk2 := @rnk2+1) as 'rank'
  FROM people p
  JOIN (SELECT @rnk2 := 0) x
  ORDER BY p.lastname ASC, p.forename ASC
)x WHERE forename = 'joe' AND lastname = 'bloggs') d2
WHERE d1.rank BETWEEN (d2.rank - 3) AND (d2.rank + 3)
ORDER BY d2.rank;

 

I don't expect you to understand it, nor do I fully (ranking functionality using some MySQL quirks is dangerous), and thus this is provided with NO warrantees whatsoever. Use at your own discretion...

Link to comment
Share on other sites

That works well thanks.

 

The first part of the JOIN (d1) gets all of the records in order of surname, forename (note you need to change the first line to d1.surname not d2.surname)

 

Then the second part (d2) finds the rank value of the 'joe' 'bloggs' (it is worth while putting a 'LIMIT 1' at the end in case you have duplicate records)

 

Finally it searches for records between the ranked value in d2.

 

However it only works if the person 'joe' 'bloggs' is in the database exactly.

 

I suspect I will need to run a separate query to find the closest match if the exact match is not there then run it through the query again.

 

Is there a way to query for the closest match to 'joe' 'bloggs' so it would return 'jo' 'blogg's' for instance?

 

Thanks for your help.

 

Ben

Link to comment
Share on other sites

Just decided to do an EXPLAIN on my query to see what indexes it could/might use. I placed indexes on forename, lastname and forename AND lastname (as primary key as I assumed joe bloggs was unique), however I now believe this query to be horribly inefficient.

 

With regards to "guessing" names, you're going down the soundex() road, however I DON'T recommend this as soundex is rather old on MySQL now and hasn't been updated in a while ( to my knowledge ).

 

However, having in fact looked at your original request, it seems like you're merely paginating your dataset...mmm... might be a better way.

Link to comment
Share on other sites

On a table of 15000 it took around 0.6 seconds without any indexing.

 

I like the idea of using surname <= 'bloggs' LIMIT 10 however this would be difficult when we are a few pages back as the only reference point I would have is 'bloggs'.

 

Your solutions solves that because I can do:

 

WHERE d1.rank BETWEEN (d2.rank - 300) AND (d2.rank - 260).

 

In fact I could do:

SELECT d1.cand_name_fo, d1.cand_name_ln, d1.cand_id, d1.rank
FROM (
SELECT cand_id, cand_name_fo,cand_name_ln,(@rnk1 := @rnk1+1) as 'rank'
  	FROM tbl_cand p
  	JOIN (SELECT @rnk1 := 0) x
  	ORDER BY p.cand_name_ln ASC, p.cand_name_fo ASC
) d1
JOIN (
SELECT * FROM (
    	SELECT cand_name_fo,cand_name_ln,(@rnk2 := @rnk2+1) as 'rank'
  		FROM tbl_cand p
  		JOIN (SELECT @rnk2 := 0) x
  		ORDER BY p.cand_name_ln ASC, p.cand_name_fo ASC
) x 
WHERE cand_name_ln >= 'bloggs' LIMIT 1
) d2
WHERE d1.rank BETWEEN (d2.rank - 3) AND (d2.rank + 3)
ORDER BY d1.rank;

 

Changing the WHERE statement to just look at the surname but the problem is if we have 100 'Smith' and we are looking for 'zoe' 'smith' it wouldn't show her on the first page where we want it.

 

 

 

 

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.