stubarny

indexing text in databases

Hi,

I'm trying to code a search query for a database of about 500,000 job adverts (similar advert length to monster, careerbuilder etc).

I have found this search code (http://iamcal.com/publish/articles/php/search/), which is suitable for 'small' tables. However, at the bottom of the article it mentions that using a FULLTEXT index in MySQL would be effective for medium-sized databases. I'm assuming that 500,000 adverts counts as a 'medium' database, so I'm thinking this could be a good solution without having to get to grips with specialist indexing software (e.g. Zend Search Lucene - http://framework.zend.com/manual/en/zend.search.html).

Please can you give me your opinions on whether simply using a FULLTEXT index is a good solution? (I'm only a 'keen amateur' PHP/SQL programmer, so I don't have much experience with the practicalities of running largish databases.)

Thanks a lot for your time!

Stewart

500K is a good size for FULLTEXT -- it really depends on how "english-like" your search keywords are. 
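
For reference, here is a minimal sketch of what a FULLTEXT setup might look like in MySQL. The table and column names (`job_adverts`, `title`, `description`) and the search terms are made up for illustration, and it assumes a storage engine that supports FULLTEXT (MyISAM, or InnoDB from MySQL 5.6 onwards).

```sql
-- Hypothetical table: the names here are assumptions, not from the original post.
CREATE TABLE job_adverts (
    id          INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    title       VARCHAR(255) NOT NULL,
    description TEXT NOT NULL,
    -- One FULLTEXT index covering both searchable columns.
    FULLTEXT KEY ft_title_description (title, description)
) ENGINE=MyISAM;

-- Natural-language search, most relevant adverts first.
-- The column list in MATCH() must be the same as the one in the index.
SELECT id,
       title,
       MATCH (title, description) AGAINST ('senior accountant london') AS score
FROM job_adverts
WHERE MATCH (title, description) AGAINST ('senior accountant london')
ORDER BY score DESC
LIMIT 20;
```

Natural-language mode is the default when no modifier is given, which is why it tends to work best when the keywords look like ordinary English words.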

Thanks, that's good news. I'm planning to include non-English text a year or two in the future, so I'll keep an eye on the CPU usage. Hopefully the processors will have sped up enough by then that there won't be a problem.

Cheers,

Stu

It actually has nothing to do with memory/CPU usage -- just that FULLTEXT is designed for English prose, so if you're searching anything else, it's pretty much useless. Also, it has some funny quirks that only make sense on large datasets.
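
Two quirks that are probably relevant here, as far as I'm aware: in natural-language mode on MyISAM, a word that appears in 50% or more of the rows is treated as too common to match, so searches on a tiny test table can come back empty; and words shorter than `ft_min_word_len` (4 characters by default on MyISAM) are not indexed at all. A rough sketch of how to check for and work around these, assuming a hypothetical `job_adverts` table with a FULLTEXT index on `(title, description)`:

```sql
-- Show the minimum word length the MyISAM FULLTEXT indexer will accept.
SHOW VARIABLES LIKE 'ft_min_word_len';

-- Boolean mode does not apply the 50% threshold, so it is handy
-- when testing against a small sample of rows.
-- '+' means the word must be present, '-' means it must be absent.
SELECT id, title
FROM job_adverts  -- hypothetical table name
WHERE MATCH (title, description)
      AGAINST ('+accountant -trainee' IN BOOLEAN MODE);
```

Changing `ft_min_word_len` means editing the server configuration, restarting, and rebuilding the FULLTEXT indexes, so it is worth deciding on before loading all 500K rows.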
