elgoog Posted September 17, 2008

I am trying to do the following and have no idea where to start. If anyone has tackled something like this or could set me on the right path, it would be massively appreciated.

Info

Say I have the following table called 'items', with three columns: itemID (int), userID (int), Description (text). Any row could have thousands of words in the description, and there could be hundreds of entries.

Problem

If I wanted to extract a list of the top 50 most common words in a particular user's descriptions, where should I begin? I will also want to exclude words under a certain number and ignore words such as "a", "the", "and", etc.

I'm not even sure what to search for on Google to help with this one, or what sort of methodology I should be using to approach this problem.

Thanks in advance.

Link to comment: https://forums.phpfreaks.com/topic/124726-headache-problem-with-searching-and-common-words/
BlueSkyIS Posted September 17, 2008

Off the top of my head: I would create another table, descWords, with columns

id - INT unsigned, auto-increment, primary index
descWord - varchar(64), indexed
occurrences - INT unsigned

Then occasionally run a script that explodes each user's description into an array of words. The script would first make the array of words unique and ensure there aren't any 'non-words', like commas, spaces, question marks, etc. After the array is unique, I'd begin updating the descWords table for each word.

I wouldn't make this happen every time someone does a search, because it might take too long for the script to run over all the records and update descWords for each use.
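
For anyone landing on this thread later, here is a rough sketch of the approach described above. It is written in Python with the standard-library sqlite3 module so that it runs on its own. It is not code from either poster, and the same steps should map onto PHP and MySQL. Several things here are assumptions rather than part of the thread: the userID column added to descWords (so the counts can be filtered per user), the STOP_WORDS list, the MIN_LENGTH cutoff (reading "words under a certain number" as a minimum word length), and all the function names. One deliberate difference: the sketch counts every occurrence of a word instead of making the word array unique first, because de-duplicating would count how many descriptions contain a word rather than how often it appears.

```python
import re
import sqlite3
from collections import Counter

# Hypothetical stop-word list and length cutoff; tune both for real data.
STOP_WORDS = {"a", "an", "the", "and", "or", "of", "to", "in", "is", "it"}
MIN_LENGTH = 3


def create_schema(conn):
    """Create the 'items' table from the question and a descWords summary table."""
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS items (
            itemID INTEGER PRIMARY KEY,
            userID INTEGER NOT NULL,
            Description TEXT
        );
        -- userID is an added assumption so counts can be looked up per user.
        CREATE TABLE IF NOT EXISTS descWords (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            userID INTEGER NOT NULL,
            descWord VARCHAR(64) NOT NULL,
            occurrences INTEGER NOT NULL
        );
        CREATE INDEX IF NOT EXISTS idx_descWords_user
            ON descWords (userID, occurrences);
    """)


def tokenize(text):
    """Lowercase the text and return its words, minus short words and stop words."""
    # The pattern keeps letters (and simple contractions), so commas,
    # spaces, question marks and digits never become 'words'.
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return [w for w in words if len(w) >= MIN_LENGTH and w not in STOP_WORDS]


def rebuild_word_counts(conn):
    """Recount every description from scratch; meant to run occasionally, not per search."""
    counts = Counter()
    for user_id, description in conn.execute("SELECT userID, Description FROM items"):
        for word in tokenize(description or ""):
            counts[(user_id, word)] += 1
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM descWords")
        conn.executemany(
            "INSERT INTO descWords (userID, descWord, occurrences) VALUES (?, ?, ?)",
            [(user_id, word, n) for (user_id, word), n in counts.items()],
        )


def top_words(conn, user_id, limit=50):
    """Return up to `limit` (word, occurrences) pairs for one user, most common first."""
    rows = conn.execute(
        "SELECT descWord, occurrences FROM descWords WHERE userID = ? "
        "ORDER BY occurrences DESC, descWord ASC LIMIT ?",
        (user_id, limit),
    )
    return rows.fetchall()
```

The idea would be to call rebuild_word_counts from a scheduled job and have the page that shows the list call only top_words, which is a cheap indexed lookup. With only a few hundred rows, counting on the fly for a single user may well be fast enough that the summary table is unnecessary.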