Gaining access to how MySql parses text for Full Text


HowdeeDoodee

I want to gain access to the function or process MySql uses to parse words and phrases for Full Text searching. Here is an example.

 

If the user inputs...

 

Milan in history

 

MySql will search for milan, history, and milan history. Is there a way to extract just the combination of terms MySql uses to search the db without the stop words? Stop words are automatically eliminated from the search request unless the user encloses a phrase in quotes.

 

What I am trying to do is develop a script to highlight found search terms and phrases. I can explode a phrase into single words but if I do that the stop words would be included in the array. If there is some way of getting into the parsed words or phrases MySql Full Text actually uses to search, I can use each of those combinations as a keyword in my highlighting script.

 

I am looking for code, or pointers to any code, that addresses the questions above.

 

Thank you in advance for any replies.

You can remove stopwords. You can remove words outside of the length range. At this point you've replicated the first two things MySQL does to pare down the list. Unfortunately, it also removes any words that appear in over 50% of the table's rows, and I believe that it is non-trivial to replicate this portion. (I've assumed you're not using BOOLEAN MODE.)
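The first two steps described above can be approximated in PHP roughly as follows. This is only a sketch under stated assumptions, not MySQL's actual parser: the function name `effective_terms`, the sample stopword array, and the ASCII-only tokenizing regex are all made up for illustration. The length limits default to 4 and 84, the stock values of `ft_min_word_len` and `ft_max_word_len` for MyISAM, and should be adjusted if the server is configured differently. The 50% threshold is not attempted.

```php
<?php
// Hypothetical helper: approximates which words a natural-language
// FULLTEXT search would keep. It is NOT MySQL's real parser.
function effective_terms($input, array $stopwords, $minLen = 4, $maxLen = 84)
{
    // Assumption: words are runs of ASCII letters, digits, underscores
    // and apostrophes; everything else is treated as a separator.
    $words = preg_split("/[^A-Za-z0-9_']+/", strtolower($input), -1, PREG_SPLIT_NO_EMPTY);

    // Flip the stopword list so lookups are by key.
    $stop = array_flip(array_map('strtolower', $stopwords));

    $terms = array();
    foreach ($words as $word) {
        $len = strlen($word);
        if ($len < $minLen || $len > $maxLen) {
            continue; // outside the assumed ft_min_word_len / ft_max_word_len range
        }
        if (isset($stop[$word])) {
            continue; // stopword
        }
        $terms[] = $word;
    }

    // Drop duplicates and reindex from 0.
    return array_values(array_unique($terms));
}

// Sample stopword list (illustrative only; use the list your server uses).
$stopwords = array('in', 'the', 'and', 'of');
print_r(effective_terms('Milan in history', $stopwords));
```

With the sample input, the function returns an array containing `milan` and `history`; "in" is dropped both as a stopword and for being shorter than the minimum length.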

Thank you Wildbug.

 

You can remove stopwords. You can remove words outside of the length range. At this point you've replicated the first two things MySQL does to pare down the list. Unfortunately, it also removes any words that appear in over 50% of the table's rows, and I believe that it is non-trivial to replicate this portion. (I've assumed you're not using BOOLEAN MODE.)

 

Yes, I want to do what you have described: what MySql inherently does when it removes the stop words and any words outside of the length range. However, just removing the stopwords, etc. is not my ultimate objective. My ultimate objective is to highlight the search terms that were not removed, that is, the words MySql actually uses to find text. I need to put those parsed words into an array or a function after the parsing takes place. So here is the initial question rephrased: HOW do I do what you describe? Thank you again for your reply.

I have resolved this issue with the following function. To run it, you need to create a stopword file named stopwords.php; a sample of the file contents is below. In my example, if the user inputs any or all of the words dog, red, or max, those words are stripped from the user input and replaced by a blank space.

 

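function stopwordfilter($userinput) {
    // This function filters stopwords from the user's input.
    // Each line of stopwords.php has the form: stopword|^|replacement|^|
    $FileName = "stopwords.php";
    $list = file($FileName, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    foreach ($list as $value) {
        list($stopword, $filter) = explode("|^|", $value);
        // eregi_replace() was removed in PHP 7; preg_replace() with the "i"
        // flag is the case-insensitive replacement. The \b anchors match
        // whole words only, so "red" no longer alters "hundred".
        $pattern = '/\b' . preg_quote($stopword, '/') . '\b/i';
        $userinput = preg_replace($pattern, $filter, $userinput);
    }
    return $userinput;
}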

 

//see if the stopwords filter works

 

$userinput = stopwordfilter($userinput);

 

//these lines go in a file called stopwords.php

//if the user types in or inputs any of these words, these words will be replaced by a blank space.

dog|^| |^|

red|^| |^|

max|^| |^|

 

Thank you for the followup posts.
