I need to check the existing database entries before I add a new one, so that near-duplicates don't get added.

For example, I already have "Dell" in the database, and someone wants to enter "Dell Inc".
I tried MySQL's LIKE and REGEXP operators, but they didn't help.

My goal is to compare "Dell Inc" to "Dell" and detect whether they are similar.

Thanks

This is challenging in a few ways. If "Dell Inc" were already in the database, it would be easy to match "Dell" against it. The reverse is not: a search for "Dell Inc" will never match a stored "Dell". The next approach would be to split the string on whitespace and look for each piece, but how do you know which parts belong to the company name? We know "Inc" can be dropped from "Dell Inc", but what about longer company names such as "Johnson and Johnson"? My only thought at the moment is to make a list of known suffixes and their variations (e.g., Corporation, Corp., Inc., Incorporated) and strip these from the end of the string before running the match.
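
A rough sketch of the suffix-stripping idea, in Python rather than PHP just to keep it short. The suffix list, the function names `normalize` and `is_duplicate`, and the punctuation handling are all assumptions for illustration, not something from a real library; a usable list would need many more variations.

```python
import re

# Hypothetical suffix list for illustration only; a real one would be longer.
SUFFIXES = {"inc", "incorporated", "corp", "corporation", "co", "company", "ltd", "llc"}

def normalize(name):
    """Lowercase, drop punctuation, and strip known suffixes from the end."""
    words = re.sub(r"[^\w\s]", "", name.lower()).split()
    # Remove trailing suffix words, but always keep at least one word.
    while len(words) > 1 and words[-1] in SUFFIXES:
        words.pop()
    return " ".join(words)

def is_duplicate(new_name, existing_names):
    """True if the new name normalizes to the same string as an existing one."""
    existing = {normalize(n) for n in existing_names}
    return normalize(new_name) in existing
```

With this, `is_duplicate("Dell Inc", ["Dell"])` is true because both sides reduce to "dell", while "Johnson and Johnson" is left alone because none of its words are in the suffix list. It only catches names that differ by a listed suffix, so anything like "Dell Financial" still gets through.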

OK, thanks. I might just split the name on spaces and then compare each piece.
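
For what it's worth, splitting on spaces and comparing pieces could look something like the Python sketch below. The names `shared_words` and `word_overlap` and the scoring rule (shared words divided by the word count of the shorter name) are assumptions made up for this example.

```python
def shared_words(a, b):
    """Return the set of lowercase words that appear in both names."""
    return set(a.lower().split()) & set(b.lower().split())

def word_overlap(a, b):
    """Fraction of the shorter name's distinct words that also occur in the other."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / min(len(words_a), len(words_b))
```

"Dell Inc" against "Dell" scores 1.0, which is what we want. The catch is filler words: "Johnson and Johnson" and "Procter and Gamble" share "and", so common words would probably have to be ignored before comparing.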

I have successfully implemented something quite like this.

You need to compare values of the same length.

If DELL is already in the database and you are entering DELL INC, you only want to match the first 4 letters and see if they're the same.
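
A minimal sketch of that same-length comparison, written in Python for brevity. The function name `shares_prefix` and the choice to ignore case and surrounding whitespace are assumptions, not part of the original implementation.

```python
def shares_prefix(new_name, existing_name):
    """Compare the two names over the length of the shorter one, ignoring case."""
    a, b = new_name.strip().lower(), existing_name.strip().lower()
    n = min(len(a), len(b))
    # An empty name never counts as a match.
    return n > 0 and a[:n] == b[:n]
```

`shares_prefix("DELL INC", "DELL")` compares only the first 4 letters and reports a match. Note that it also matches any longer name that merely starts with a stored one, so it is best treated as a warning to the user rather than a hard rejection.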

You can also use the similar_text function, which takes two strings and can report the percentage they have in common, and reject new entries that score above a certain threshold.
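
Here is a sketch of the threshold idea in Python. It uses `difflib.SequenceMatcher` from the standard library as a stand-in for PHP's similar_text; the two use different matching algorithms, so the scores will not always agree. The names `similarity_percent` and `too_similar` and the 60% threshold are assumptions picked for the example.

```python
from difflib import SequenceMatcher

def similarity_percent(a, b):
    """Similarity of two names as a percentage (0 to 100), ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() * 100

def too_similar(new_name, existing_names, threshold=60.0):
    """True if the new name scores at or above the threshold against any existing name.

    The default threshold is an arbitrary value for illustration.
    """
    return any(similarity_percent(new_name, n) >= threshold for n in existing_names)
```

With these settings `too_similar("Dell Inc", ["Dell"])` is true and `too_similar("Oracle", ["Dell"])` is false. The threshold would need tuning against real data, because names that share a long common start may also land above it.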

While DELL and DELL INC are the same, DELL INC and DELL FINANCIAL are not, so simply throwing away the last part will not help.

Thanks.

I can see a few problems with this, so I don't think I'll go this route.

I have "Sun" listed, but not "Sun Microsystems".
