ale1981 Posted September 14, 2006

Here is my problem: I need to compare RSS titles taken from 3 different feeds.

At the moment I am taking RSS feeds and inserting them into the database. I need to make sure that the same title, or a similar one, is not inserted more than once, so we don't get lots of repeated stories. The same story could be covered by all 3 feeds but with slightly different titles.

As it stands, I check whether the exact title already exists and, if it does, I do not insert it into the database. What I need is some kind of comparison between the new title and the ones already in the database.

I thought of using something like:

[code]SELECT * FROM stories WHERE title LIKE '%$title%'[/code]

...but wouldn't that compare every word in the title, including words like "and", "a", "the", etc.?

Any help would be greatly appreciated.

fenway Posted September 14, 2006

Well, the wildcards won't help much if $title is multi-word: LIKE '%$title%' only matches when the whole title appears as one contiguous substring. In fact, any exact match is probably useless here. To do this properly, you'll have to come up with a scoring algorithm and rank the matches.
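
A minimal sketch of the kind of scoring described above, written in PHP. It is only one possible approach and is not something proposed in the thread: it strips a small, hypothetical stop-word list (the "and", "a", "the" words the original post worries about), then scores two titles by the share of distinct words they have in common (Jaccard similarity). The names STOP_WORDS, title_words, title_score and rank_matches are made up for illustration.

```php
<?php
// Hypothetical stop-word list; extend it to suit the feeds.
const STOP_WORDS = ['a', 'an', 'and', 'the', 'of', 'in', 'on', 'to', 'for'];

// Lowercase a title, split it into words, and drop stop words and repeats.
function title_words(string $title): array
{
    $words = preg_split('/[^a-z0-9]+/', strtolower($title), -1, PREG_SPLIT_NO_EMPTY);
    return array_values(array_unique(array_diff($words, STOP_WORDS)));
}

// Jaccard score: shared words / all distinct words, from 0.0 to 1.0.
function title_score(string $a, string $b): float
{
    $wa = title_words($a);
    $wb = title_words($b);
    $union = count(array_unique(array_merge($wa, $wb)));
    if ($union === 0) {
        return 0.0; // neither title has any significant words
    }
    return count(array_intersect($wa, $wb)) / $union;
}

// Score every existing title against a new one, best match first.
function rank_matches(string $new, array $existing): array
{
    $scores = [];
    foreach ($existing as $title) {
        $scores[$title] = title_score($new, $title);
    }
    arsort($scores); // sort by score, descending, keeping titles as keys
    return $scores;
}
```

A new title would then be skipped when the top score from rank_matches exceeds some cut-off; where to put that cut-off is a judgment call that depends on the feeds.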

ale1981 Posted September 15, 2006

Thanks for your reply. That's a bit beyond me; do you know of any sites with tutorials on this kind of thing?

ale1981 Posted September 15, 2006

It seems PHP's similar_text function works well for what I needed.
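
For reference, a minimal sketch of how similar_text might be used for this check. The helper name is_duplicate_title and the 70% threshold are assumptions, not from the thread; the threshold would need tuning against real feed titles.

```php
<?php
// Return true when $title is at least $threshold percent similar to any
// title already stored. The 70% default is an assumed starting point.
function is_duplicate_title(string $title, array $existingTitles, float $threshold = 70.0): bool
{
    $needle = strtolower(trim($title));
    foreach ($existingTitles as $existing) {
        $percent = 0.0;
        // similar_text() writes the similarity percentage into its third argument.
        similar_text($needle, strtolower(trim($existing)), $percent);
        if ($percent >= $threshold) {
            return true;
        }
    }
    return false;
}
```

The stored titles would be fetched from the stories table first and passed in as $existingTitles. Because similar_text compares the strings character by character, this can get slow as the number and length of stored titles grows.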

fenway Posted September 15, 2006

Guess so... I thought you were looking for a MySQL-only solution.