Jump to content

finding duplicate records using tidbits


richei

Recommended Posts

Its become apparent to me that people are rather stupid when it comes to following the most basic of directions when registering on a website.  I'm redoing our user view code, transitioning from straight php to jquery.  But that's not where I need help. 

 

For an example, I have a user with three accounts (I purposely left off the domain)

 

acct 1 - priesttheugk4729@

acct 2 - priestthe4729@

acct 3 - priesttheugk@

 

I know they're all the same person, he used different emails because it was coming up already in use (like its supposed to), but instead of following the instructions to get the password reset or change the login name, he choose to create 2 more accounts.  I have a duplicates script, but this guy isn't showing up because its 3 different emails and 3 different names.

 

I've looked through quite a few scripts that will find duplicate records, but they're all based off of a single term, not looking at a tiny of the word - which is why its not showing up now.  If I wanted to find this guy, it would need to search for priest.  I don't know if there's a way I can do it automatically or if it has to be done through a search box (if that's the case, then its not worth it right now).

Link to comment
Share on other sites

If you want to find all users with 'priestthe' in their email, you could do this;

 

SELECT * FROM table WHERE email  LIKE '%priestthe%';

 

But ofcourse that only works if you already know that you're looking for 'priestthe'.

Databases generally have fuzzy string functions which you can use to calculate how different two strings are. Look at SOUNDEX and LEVENSHTEIN. They're not fast but speed isn't an issue here I think.

 

For example, in PostgreSQL, this:

 

SELECT levenshtein('acct 1 - priesttheugk4729@' , 'acct 2 - priestthe4729@');

 

Returns '4', indicating that the two strings only differ by four characters, which is odd for two strings that are more than 20 characters long.

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.