
Similar Tags in a Tags Sorting System



I am asking out of curiosity: when a tag system is used to sort content and users enter similar tags, how does this problem usually get solved?


For example:


flower and flowers...


that is, the difference between singular and plural. Judging by other sites out there, this problem is not solved at all: they simply let you add as many tags as you want and then rank the most-used tags. That is a solution of sorts, but it seems more like a workaround to me.


Any ideas on how somebody could approach solving this one?




My suggestion would be an analysis function that examines the entered tag, tells the user that similar tags have already been entered, and offers those for selection instead. This does work, though on a high-traffic website the following can happen:


php and php5


You can end up with two similar tags that are both widely used but have different meanings to the user base.


This one could be solved by simply prohibiting tags that differ from an existing one only by small additions or changes, so to speak "forcing" the user to choose something that is already there. That may be a way, though it could turn out to be a bad solution as well.
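The suggestion step described above could be sketched roughly like this. It is only an illustration in Python using the standard library's `difflib`; the function name `suggest_similar_tags`, the `cutoff` of 0.8, and the sample tag list are assumptions made up for the example, not something taken from an existing tag system.

```python
import difflib


def suggest_similar_tags(new_tag, existing_tags, cutoff=0.8, limit=3):
    """Return existing tags that look similar to the tag being entered.

    Hypothetical helper: the name, the cutoff and the limit are
    illustrative choices, not values from a real system.
    """
    normalized = new_tag.strip().lower()
    # An exact match is not a "similar" tag, so leave it out.
    candidates = [tag for tag in existing_tags if tag != normalized]
    # get_close_matches ranks candidates by SequenceMatcher similarity
    # and keeps only those scoring at least `cutoff`.
    return difflib.get_close_matches(normalized, candidates, n=limit, cutoff=cutoff)


existing = ["flower", "php", "garden"]
print(suggest_similar_tags("Flowers", existing))
print(suggest_similar_tags("php5", existing))
```

With these assumed settings both "flowers" and "php5" trigger a suggestion, which is exactly the ambiguity described above: the function cannot tell a redundant plural from a tag that means something different.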


The two conflicting examples you showed above are probably the reason that, according to you, "this problem is not solved".


There are just too many nuances in English (and other languages), which makes this an exercise in futility. You could probably implement some rudimentary logic to suggest similar-sounding tags, but go any further than that and you would probably be implementing limitations that create more problems than they solve.


For example, what if you tried to limit plural entries, as in the suggestion above? How do you know that "flowers" is supposed to be the plural of the word "flower" and not something else? One scenario that comes to mind is when "Gennifer Flowers" was all over the news back when Bill Clinton was running for president. What if there were a lot of articles relating to Gennifer Flowers and users attempted to enter her last name as a tag?


Yes, that is a manufactured scenario, but there would be many like that. At least that is my two cents.


For actual English words, you can look for a stemming algorithm, which will let you find plurals and other variants of the same word.


However, this problem cannot be solved accurately. There are just too many variations, too many synonyms, and too many words with multiple meanings.
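To make the stemming idea a bit more concrete, here is a deliberately crude sketch in Python. It is not a real stemming algorithm such as the ones found in language-processing libraries; `naive_stem` and `group_by_stem` are made-up names, and the handful of suffix rules is an assumption chosen only to show the principle.

```python
def naive_stem(tag):
    """Reduce a tag to a rough singular form (illustrative rules only)."""
    word = tag.strip().lower()
    if len(word) > 4 and word.endswith("ies"):
        return word[:-3] + "y"   # categories -> category
    if len(word) > 3 and word.endswith("es") and word[-3] in "sxz":
        return word[:-2]         # boxes -> box
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]         # flowers -> flower
    return word                  # anything else is left alone


def group_by_stem(tags):
    """Group tags that share the same naive stem."""
    groups = {}
    for tag in tags:
        groups.setdefault(naive_stem(tag), []).append(tag)
    return groups


print(group_by_stem(["flower", "flowers", "php", "php5"]))
```

Here "flower" and "flowers" end up in one group while "php" and "php5" stay apart. The same rules would also fold the surname "Flowers" into "flower", and they get plenty of ordinary English words wrong, which is the accuracy problem in a nutshell.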

