funkybeat Posted February 8, 2013 Share Posted February 8, 2013 (edited) Hi phpFreaks! To improve my php skills I do some private coding for my own. I did some small php scripts in the past, however never worked with the "search/DB" topic. This is where someone may help me a bit to give me a good start for my next project: "A scrabble search engine based on a dictionary with php & mysql" I'll try to do this as most efficient (fast) as possible. Requirements: > "field to enter random 1-14 letters" to search for words in the DB > "Dictionary in Database" or in file but I dont think that may be faster... My plan: > Put every single word into a database table called "words" (EDIT: maybe I should add a second row and save the "word length" of each word as well) > split up the 1-14 letters into a letter array > build a loop which groups the 1-14 letterArray together in all possible combinations: > 1st run: two-party groups -> check with "like" if its in the database for every single entry and put the result in a resultArray (EDIT: only check the words with the "word length" = 2) > 2nd run: three-party groups -> check with "like" if its in the database for every single entry and attach the result to the resultArray (EDIT: only check the words with the "word length" = 3) .....>14th run... > output resultArray I'm not looking for code parts, I'm just looking for a good concept to start with. Does someone have any recommendations on how I could do it better than the above? In my eyes there are too many DB search requests and maybe there is a better way to do it? Cheers Mike Edited February 8, 2013 by funkybeat Quote Link to comment https://forums.phpfreaks.com/topic/274202-most-eficient-way-to-do-a-dictionary-search/ Share on other sites More sharing options...
Christian F. Posted February 8, 2013 Share Posted February 8, 2013 (edited) Assuming you're going to save each form of a word as a new word in itself, which would be easiest, instead of saving the root form and then having another table for the other forms/suffixes. You don't need to use like at all, and this challenge becomes rather simple: Add an index on the word, preferably a unique, and you're pretty much set. Seeing as you want words of that specific length, and not words containing a certain letter combination. Additionally, for the low number of words in the above database, and the high number of combinations possible, I'd probably just retrieve all of them first. Stuff them into an array as keys, and then do an check for isset () in the array that generates/checks the letter combinations. Always query with and for the least amount of data, using the minimum number of queries possible. If there had only been a few words to check against the database, I'd build the query using IN() to retrieved all of the legit words. PS: Moved this to the "Application design" section, as this is about application design and not actual code assistance. Edited February 8, 2013 by Christian F. Quote Link to comment https://forums.phpfreaks.com/topic/274202-most-eficient-way-to-do-a-dictionary-search/#findComment-1410975 Share on other sites More sharing options...
kicken Posted February 8, 2013 Share Posted February 8, 2013 > build a loop which groups the 1-14 letterArray together in all possible combinations: Know that such a task is not really practical, especially with an upper limit of 14 letters. With 14 letters you're looking at 14-factorial combinations (roughly 87 billion). Aside from the time it would take to generate that many combinations, in order to store all of them if you wanted to do a query would cost you 1.1 TB of space. Querying all the dictionary words and checking against them as you generate each combination would cut down on the memory requirement, but you'd still be looking at a fairly substantial time investment just to generate the combinations. Something that would probably help to find possible words is to first check for any letter combinations that create common letter groupings, such as 'er', 'ing', 'tion', 'ed', etc and then search for dictionary words containing those combinations. Then you can filter the dictionary words based on whether or not all the letters for the word are present in the available letters list. Quote Link to comment https://forums.phpfreaks.com/topic/274202-most-eficient-way-to-do-a-dictionary-search/#findComment-1410990 Share on other sites More sharing options...
funkybeat Posted February 11, 2013 Author Share Posted February 11, 2013 Hi Chris & Kicken, thx for your feedback! In this case I'd go for a more array based solution instead of bashing the server with too many SQL requests. The dictionary is around 7MB so it should be doable to put everything into the array and check it with isset(). Cheers any many thanks for your help! Mike Quote Link to comment https://forums.phpfreaks.com/topic/274202-most-eficient-way-to-do-a-dictionary-search/#findComment-1411701 Share on other sites More sharing options...
xylex Posted February 12, 2013 Share Posted February 12, 2013 This sounds like a problem for map-reduce across a few shards. Any reason you're picking MySQL? Quote Link to comment https://forums.phpfreaks.com/topic/274202-most-eficient-way-to-do-a-dictionary-search/#findComment-1411902 Share on other sites More sharing options...
funkybeat Posted February 12, 2013 Author Share Posted February 12, 2013 Hi xylex, so you would recommend MongoDB ? The only reason for me to use MySql is because I know how to use it and it's already running on my server. However I'm more than happy to try something new. I just went through their website and it seems I'll need some more time to understand all the details of mapreduce & sharding (never heard this before...) thanks for throwing this in! Michael Quote Link to comment https://forums.phpfreaks.com/topic/274202-most-eficient-way-to-do-a-dictionary-search/#findComment-1411946 Share on other sites More sharing options...
xylex Posted February 12, 2013 Share Posted February 12, 2013 Yeah, MongoDB would work for this. Reason I'm suggesting it is that you're basically trying to do that with the earlier suggestions - get everything that looks like it could be a match (Map) - and then filter out just the ones that you want (Reduce). Doing it with just MySQL and PHP means that this process in single threaded/single instance orientated and won't scale well to huge datasets - might not be an issue in this specific use case, but if it's a learning project, might as well go big. Mongo runs a single thread per instance in its map/reduce handling, hence the idea to shard it across multiple instances. Quote Link to comment https://forums.phpfreaks.com/topic/274202-most-eficient-way-to-do-a-dictionary-search/#findComment-1412066 Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.