razta Posted May 18, 2010

Hi, I have a search engine in my PHP application and I want to store the top 50 searches. Now, if I store all queries then the table will get huge. If I limit the number of stored queries to 50 and then count the recurring searches, any new queries that weren't in the original 50 won't be stored. Any ideas on how I might solve this problem? Thank you in advance.

autonumber | Search | Count
1          | wine   | 5
2          | eggs   | 1
3          | bacon  | 7
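For illustration, a minimal sketch of the kind of table described above, assuming MySQL; the table name and column types are assumptions, only the column names come from the example.

```sql
-- Sketch of the search table from the example above (name and types are assumptions)
CREATE TABLE search_log (
    id       INT UNSIGNED NOT NULL AUTO_INCREMENT,   -- the "autonumber" column
    search   VARCHAR(255) NOT NULL,                  -- the search term
    `count`  INT UNSIGNED NOT NULL DEFAULT 1,        -- how many times it has been searched
    PRIMARY KEY (id)
);
```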
DavidAM Posted May 18, 2010

Insert all of the search words into the table, but add a column for a LastSearch date. Then run a cron job daily or weekly (or whatever) that deletes any entries that have a low count and whose LastSearch date is more than a week or a month (or whatever) old. This way, you are only tossing out words that have grown "stale". Another option would be to create a second table to "archive" these deleted words to. Then, when you get ready to delete, you can check whether the word is in the archive and, if it is, add the archive count to the live count before deciding what to delete. If you delete a word, insert or replace it in the archive table with the updated count. You can have a cron job scan this table periodically to see if any words have a high enough count to be moved back into the live table.
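A minimal sketch of the cleanup query such a cron job could run, assuming the table from the earlier sketch plus a last_search column; the thresholds (count below 3, older than 30 days) are arbitrary assumptions.

```sql
-- Possible cron-job cleanup: drop low-count entries that have gone stale
-- (table/column names and thresholds are assumptions).
DELETE FROM search_log
WHERE  `count` < 3
  AND  last_search < NOW() - INTERVAL 30 DAY;
```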
ignace Posted May 19, 2010

"Now, if I store all queries then the table will get huge." I don't see how that's going to be a problem.

"Insert all of the search words into the table, but add a column for a LastSearch date. Then run a cron job daily or weekly (or whatever) that deletes any entries that have a low count and whose LastSearch date is more than a week or a month (or whatever) old." That is not a good idea, because the first 50 will remain the first 50: all the other would-be popular keywords keep getting reset, while the first 50 keep incrementing.
razta Posted May 19, 2010 (Author)

"Insert all of the search words into the table, but add a column for a LastSearch date. Then run a cron job daily or weekly (or whatever) that deletes any entries that have a low count and whose LastSearch date is more than a week or a month (or whatever) old. This way, you are only tossing out words that have grown "stale"." Thanks, I think this may be a viable solution.

"I don't see how that's going to be a problem." I don't know much about how much space or processing power MySQL needs to store or query data, but if I store every query:

alpha = 26
numeric = 10
total = 36 characters (not including special characters, and converting everything to lowercase)
36 to the power of 36 = 1.06387359 × 10^56

That's one massive table. Does it really not matter that it could possibly get that big?
ignace Posted May 19, 2010 Share Posted May 19, 2010 Euhm that's not how you calculate row size, something similar to this would be more correct (although I'm no expert in this): id + search + count = int + varchar + int = 4 bytes + 255 bytes + 4 bytes = 263 bytes (assuming you use varchar(255)) 263 bytes * 1M records = 263 000 000 = 263 M 263 bytes * 1G records = 263 000 000 000 = 263 G If your website is popular you may reach 1M records (which I doubt you'll ever get to). A more realistic number would be: id + search + count = mediumint + varchar(32) + mediumint = 3 bytes + 32 bytes + 3 bytes = 40 bytes 40 bytes * 1M = 40M 40 bytes * 1G = 40G Quote Link to comment https://forums.phpfreaks.com/topic/202208-storing-search-querys-limiting-table-rows/#findComment-1060509 Share on other sites More sharing options...
razta Posted May 19, 2010 (Author)

Sorry, I was calculating the number of possible entries; I should have made that clearer. I think I will just let the table grow and see how it goes; if it becomes unmanageable then I will delete it, start fresh, and then look for a solution. Thank you very much for your thoughts and insight.
ignace Posted May 19, 2010 Share Posted May 19, 2010 Sorry, I was calculating the amount of entries, should have made that clearer. And what do you think I calculated? Quote Link to comment https://forums.phpfreaks.com/topic/202208-storing-search-querys-limiting-table-rows/#findComment-1060530 Share on other sites More sharing options...
razta Posted May 19, 2010 (Author)

Hey, thanks for the help and all, but is there any need to be patronising?
ignace Posted May 19, 2010 Share Posted May 19, 2010 I ain't patronizing. I am just saying that I was calculating the same thing you did. Quote Link to comment https://forums.phpfreaks.com/topic/202208-storing-search-querys-limiting-table-rows/#findComment-1060564 Share on other sites More sharing options...
DavidAM Posted May 19, 2010

ignace: "that's not how you calculate row size". Looks like you were calculating disk space, not the number of entries. And you did not include any overhead for the indexes; of course, you assumed all of the keywords are going to be 255 characters long (varchar only uses the space needed for the contents, plus a little overhead for the length of the content), so I guess that makes up for it.

razta: You are storing each keyword only once, right? I mean, if I search for "wine" you INSERT it into the database with a count of 1. When the next guy searches for "wine", you UPDATE that entry and increment the count to 2. You are not doing a second INSERT, right?

You should probably just let the table grow until it becomes a problem. But if you develop the archive process before the problem comes up, you will be ahead of the game. One thing to make note of: that auto_increment key is not going to do you much good at all, and you need to have the keyword indexed so you can locate it and update it. You might want to consider making the keyword your primary key and throwing away the autonumber. Also, since you will be querying for the top 50, you might consider putting an index on the count column so that query will be more efficient.
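A minimal PHP sketch of the insert-or-increment pattern DavidAM describes, assuming the keyword (search) is the primary key so MySQL's INSERT ... ON DUPLICATE KEY UPDATE can do the work; the table name, column names, and connection details are assumptions.

```php
<?php
// Sketch of the "store each keyword once and increment its count" approach
// (table, column, and connection details are assumptions).
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$keyword = isset($_GET['q']) ? strtolower(trim($_GET['q'])) : '';

if ($keyword !== '') {
    // With `search` as the primary key, this inserts a new keyword with a
    // count of 1, or increments the count if the keyword already exists.
    $stmt = $pdo->prepare(
        'INSERT INTO search_log (search, `count`) VALUES (:kw, 1)
         ON DUPLICATE KEY UPDATE `count` = `count` + 1'
    );
    $stmt->execute(array(':kw' => $keyword));
}

// Top 50 searches; an index on `count` helps this query.
$top50 = $pdo->query(
    'SELECT search, `count` FROM search_log ORDER BY `count` DESC LIMIT 50'
)->fetchAll(PDO::FETCH_ASSOC);
```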
ignace Posted May 19, 2010 Share Posted May 19, 2010 of course, you assumed all of the keywords are going to be 255 characters long I wanted to show the worst-case scenario. And I indeed did not include indexes and stuff which would account for another 20%? I just wanted to point out that his current calculation was incorrect. Of course mine wasn't either correct as you pointed out but closer to the truth. Quote Link to comment https://forums.phpfreaks.com/topic/202208-storing-search-querys-limiting-table-rows/#findComment-1060578 Share on other sites More sharing options...