Jump to content

best way to search large tables? -need feedback on an idea


nmeyer47

Recommended Posts

I have a couple large tables (77,055 records-79 megs & 111,240 records -69 megs & 20,542 records) that need to be searched by site visitors.

 

I was doing a (like \'%term%\' ) on the one table i\'m searching now but when I have to add the other tables in the mix I need something faster.

 

I was going to index all the words in the fields I want to search in one table(tableA (wordid,word)) then have tableB (id,wordid,recordid, appearences,tableid). i\'ll search tableA for matches to search terms, get the wordid, then search tableB for records matching wordid. Do some sort of rating, like number of search words that appear in a particular record or number of times a word appear as indicated by the appearances. not sure, gotta figure that part out exactly.

 

one of my concerns is that the index tables get really big, lots of records anyway. but i have to imagine that searching a field that only contains one word, as in: (word = \"$term\"); is much more efficient than searching a large text field, as in: (description like \"%$term%\").

 

anyway, i just wanted some feedback on this method. I haven\'t been able to find my sites that discuss search log as it would apply to my problem. figured this was my best source for info.

 

thanks, peace... :arrow:

Link to comment
Share on other sites

The LIKE operator is quite slow, especially if you\'re saying

 

LIKE \"%word%\"...

 

LIKE \"WORD%\" is MUCH faster!

 

Anyway, your idea of an index table with words and a table with reference to the index and the records is ok...

 

BUT table B would be big... (It would basically mean that you\'re just assigning a number of keywords to the record, and you could skip the large text field (then you would loose the word order, though). I\'m NOT sure of the performance.

 

It a little much to ask for massive speed when working on 100k+ records... Please post what you find (I think it an interesting problem).

 

P.

Link to comment
Share on other sites

  • 2 weeks later...

first off thanks biopv for the big help. Otherwise I probably would have given up and become a farmer by now.

 

I am using the Full-Text indexing that MySQL has available that biopv pointed out. Only I have to do some sorting with arrays to get only the relevent results when i search by more than one word.

 

so what i\'m doing is searching by each word like \"+keyword\". if my search term is 3 words long i do 3 searchs \"+keyword1\" \"+keyword2\" \"+keyword3\". I load the ids that are returned by each search into there own array, containing id,points and tableid, then i compare the arrays and only return the records that match all the terms. I add the points they receive along the way, sorting them by points just before I print the results to the screen. this is the only way i can get only the most relevent results. it doesn\'t work well enough to just search by \"+keyword1 +keyword2 +keyword3\". mysql treats keyword1 as more important than the other words so you may not get the results you\'re looking for.

 

Now the ONLY problem I have is that mysql doesn\'t index words that are 3 chars or less. so I can\'t search for usb or hub or WWF just to name a few. what do I do about that?

 

if anybody has any questions post them here and I\'ll do my best to respond.

 

or if you have any suggestions i\'d love to hear them.

 

thanks again BIOPV, these forums are the best!@!!!

 

nate

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.