Jump to content

punctuation and search function


turpentyne

Recommended Posts

I was showing a friend what I've been working on, and as I watched him type in an entry to search - then not get a result that I know was indeed in the database I realized I needed to accommodate "user laziness."  He didn't use punctuation. So I'm trying to figure out how to make that a nonfactor in searches.

 

In other words, my database may have an entry called "St. Patrick's Cathedral"

If somebody types "st patricks cathedral" I want the result to still show up.

 

Here's what I'm using right now:

 

if(isset($_POST[submitted])) {
$item = $_REQUEST['find'];
$filter1 = trim($_POST['filter1']);
if ($filter1 == 'sci-names') {

$data = mysql_query("SELECT * FROM table WHERE CONCAT(taxonomic_genus,' ',scientific_name) LIKE '%$item%'");

Link to comment
Share on other sites

hmmm.. ok. I've changed the query like below, and left the rest from fetch_array down the same. Is that not something I can do??? I'm no getting:

Warning: mysql_fetch_array(): supplied argument is not a valid MySQL result resource

 

        $item = $_REQUEST['find'];
$query = "SELECT *,  MATCH(scientific_name, common_name_english) AGAINST ($item) AS score from table WHERE MATCH (scientific_name, common_name_english) AGAINST($item) order by score desc";

//$data = mysql_query("SELECT * FROM plantae WHERE common_name_english LIKE '%$item%'");

$data = mysql_query($query);


while($row = mysql_fetch_array($data)){
echo ' etcetera ';
}

Link to comment
Share on other sites

Try checking if you have created the FULLTEXT indexes on the columns you want to be searchable. If not, have a look at this article, which gives a lot more details on FULLTEXT and how to alter your tables to add indexes.

 

Basically, you have to run a SQL code like this:

ALTER TABLE table ADD FULLTEXT(scientific_name, common_name_english);

 

BTW, "table" is a reserved word in MySQL. Don't know if that's your real table name, but if it is, changing would be a good idea :)

Link to comment
Share on other sites

ah! it was apparently just a lack of apostrophes around $item

 

now I have to figure out a more difficult issue. There's a limit of 4 characters to the search. I need to figure out how to allow 3 and even a selected few 2-word terms. the reason is because there are some very common terms that are only three words which are applicable to my database of plants.

 

elm, oak for example. They would not be searchable if there was no allowance for 3 letters.

 

Link to comment
Share on other sites

That could be a problem actually. The limit is controlled by a MySQL system variable and if you aren't running it on a system where you have full access, chances are low. In a shared host or other limited environments, you can't edit such variables and I doubt the company will change it just for you.

 

Anyway, here is an article that describes FULLTEXT limits. Basically, you have to add "ft_min_word=3" (or 2) in my.cnf, restart the server and rebuild the FULLTEXT indexes. Hope it's something you have access to.

Link to comment
Share on other sites

As there aren't many options involved, it looks more than feasible to me!

 

However, if you want it to be at least decent (speaking of 3 or less characters search), you'll have to build an index table. Basically, what I'm talking about is a table that holds all your pages content, but stripped from punctuation and stopwords (and, or, a, an, stuff like that). FULLTEXT itself uses a list of stopwords, which can be a good starting point. Wrapping it up: write some code that takes the content of your pages, strips it from punctuation and stopwords, adds it to the index table together with that page ID and finally modify that code to be run by a cron job so it updates the indexes periodically (for changes) and adds new pages. The "WHERE LIKE" search will be run on that table once you have it set up and if you like, you can run the FULLTEXT search on it too. It will surely benefit from it!

 

It isn't that complicated, really. Sure, sticking just to FULLTEXT would be easier, but when options are scarce, why not use what you have to the full potential? :)

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.