Need help with my search query


rondog

I've been searching since this morning for some good search methods. I found one that works with multiple words:

<?php
// Not all code is shown: pagination is used here, which is why
// $limit and $limitvalue appear without being defined.
$searchtxt = $_GET['q'];

$arr = explode(' ', $searchtxt);

// Escape user input before it goes into the SQL string
$safe = mysql_real_escape_string($searchtxt);
$query = "SELECT * FROM video WHERE keywords LIKE '%$safe%'";
foreach ($arr as $v) {
    $v = mysql_real_escape_string($v);
    // '%$v%' already covers the '%$v' and '$v%' cases
    $query .= " OR keywords LIKE '%$v%'";
}
$query .= " LIMIT $limitvalue, $limit";

$result = mysql_query($query) or die(mysql_error());
?>

 

OK, that's good for more than one word, I guess, but let's say the user searches for "african".

How could I return results containing anything like "afric" in the string, so that it would return results for both "africa" and "african"?

 

I have my keywords field set to 'text' if that makes any difference.

 

thanks for any tips.
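
A minimal sketch of the same multi-word search built with placeholders instead of string interpolation, assuming the `video` table and `keywords` column from the post. The function name `buildSearchQuery` and its parameters are hypothetical; it only builds the SQL and the parameter list and does not run anything.

```php
<?php
// Hypothetical helper: returns [sql, params] for a multi-word keyword search.
function buildSearchQuery($searchtxt, $offset, $limit)
{
    // Split on whitespace and drop empty pieces so "a  b" never yields LIKE '%%'.
    $words = preg_split('/\s+/', trim($searchtxt), -1, PREG_SPLIT_NO_EMPTY);

    $clauses = [];
    $params = [];
    foreach ($words as $word) {
        // Escape LIKE wildcards typed by the user (backslash first).
        $escaped = str_replace(['\\', '%', '_'], ['\\\\', '\\%', '\\_'], $word);
        $clauses[] = 'keywords LIKE ?';
        $params[] = '%' . $escaped . '%';
    }

    // With no usable words, match nothing rather than everything.
    $where = $clauses ? implode(' OR ', $clauses) : '1 = 0';
    $sql = "SELECT * FROM video WHERE $where LIMIT " . (int) $offset . ', ' . (int) $limit;

    return [$sql, $params];
}
```

The result could be handed to a prepared statement, for example PDO's `prepare($sql)` followed by `execute($params)`, so the search words never become part of the SQL text.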


What you are talking about is a derivative-based table for finding similar words.

It would be a daunting task, and the good ones out there are costly,

but I think you want something like this:

 

if the search is for "cute dogs", the matches are

dog, dogs, puppies, puppy, etc.

 

You would need to build a MySQL table that joins words into "groups", and then you can use the derivations as additional full-text search criteria.
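
A rough sketch of the "groups" idea, using a plain PHP array as a stand-in for the MySQL table so it runs on its own. The array `$wordToGroup`, its contents, and the function `expandTerms` are all hypothetical; in a real setup the lookup would be a query against the word-group table.

```php
<?php
// Hypothetical data: each word maps to a group id, as rows in a
// word-group table might. Words sharing an id count as related.
$wordToGroup = [
    'dog' => 1, 'dogs' => 1, 'puppy' => 1, 'puppies' => 1,
    'cute' => 2, 'adorable' => 2,
];

// Expand each search term to every word in the same group.
function expandTerms(array $terms, array $wordToGroup)
{
    $expanded = [];
    foreach ($terms as $term) {
        $term = strtolower($term);
        $expanded[$term] = true; // always keep the word the user typed
        if (!isset($wordToGroup[$term])) {
            continue; // unknown word: no group to expand from
        }
        foreach ($wordToGroup as $word => $group) {
            if ($group === $wordToGroup[$term]) {
                $expanded[$word] = true;
            }
        }
    }
    return array_keys($expanded);
}
```

Calling `expandTerms(['cute', 'dogs'], $wordToGroup)` returns the typed words plus their group members, which can then be fed into the search as extra terms.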

 

 

 

There is no simple computer logic to bridge "African" and "Africa" or "afric", simply because the word African is a proper noun and other words like "running" can be nouns, verbs, etc. You need a derivative system.
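
For the narrow "africa"/"african" case, a crude suffix-stripping sketch can get part of the way, though it is not a real stemmer and will mangle plenty of words. The function names `crudeStem` and `stemToLikePattern` and the suffix list are assumptions made up for illustration.

```php
<?php
// Hypothetical helper: strips one common English suffix, keeping at
// least three letters. This is a crude approximation, not a stemmer.
function crudeStem($word)
{
    $word = strtolower($word);
    // Longest suffixes first so "ans" is tried before "s".
    $suffixes = ['ans', 'ing', 'an', 'es', 'ed', 's', 'a'];
    foreach ($suffixes as $suffix) {
        $stemLength = strlen($word) - strlen($suffix);
        if ($stemLength >= 3 && substr($word, -strlen($suffix)) === $suffix) {
            return substr($word, 0, $stemLength);
        }
    }
    return $word;
}

// Turn the stem into a LIKE pattern, escaping wildcards in the input.
function stemToLikePattern($word)
{
    $stem = crudeStem($word);
    $stem = str_replace(['\\', '%', '_'], ['\\\\', '\\%', '\\_'], $stem);
    return '%' . $stem . '%';
}
```

Both "africa" and "african" reduce to the same stem here, so a `keywords LIKE` comparison against the resulting pattern would match either form. Words such as "running" come out badly, which is the limitation described above.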


thanks for the replies. I will check out what has been provided to me so far. I guess this is going to be a tougher task than I thought!  :D

 

If you can find a dictionary of some sort in MySQL form you can probably do it fairly easily, but writing your own is a much bigger job unless you use dictionary.com via cURL to get derivatives.

So are you saying you use dictionary.com's searching functions? How is that possible? Wouldn't you be kind of stealing their bandwidth?


No, you simply start a logical build system to slowly build a MySQL table of data.

You build a function to discover every 3- to 9-letter word in the dictionary using permutation combinations.

 

http://dictionary.reference.com/browse/Fishasgf

returns a "no word" page

http://dictionary.reference.com/browse/Fish

returns a valid word page

 

If it's a valid page (use regex to test for case A/B; a better idea is thesaurus.com for synonyms), return all synonyms of the "word" and insert them into MySQL, stating they are "relative" to the initial searched word "fish".

 

You now have a new list of synonyms that are valid words and can in turn be "searched". So run a MySQL query for all rows that have not been "indexed" (carry a bool column in MySQL that says indexed or not), and run each unindexed word through thesaurus.com for its own synonym list, hopefully returning a greater number of words to be indexed. The script repeats this word indexing until it runs out of synonyms, at which point it resumes the 3- to 9-letter permutation combinations where it left off, in search of additional words.
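
The indexing loop described above can be sketched roughly as follows. A PHP array stands in for the MySQL table, and `lookupSynonyms` is a hypothetical stub with made-up data in place of any real dictionary or thesaurus source, so nothing here touches the network.

```php
<?php
// Hypothetical stub for a synonym source; the data is invented.
function lookupSynonyms($word)
{
    $fake = [
        'fish'   => ['trout', 'salmon'],
        'trout'  => ['fish'],
        'salmon' => ['fish', 'lox'],
    ];
    return isset($fake[$word]) ? $fake[$word] : [];
}

// Mimics the table: word => ['relative_to' => parent, 'indexed' => bool].
function buildWordTable($seed)
{
    $table = [$seed => ['relative_to' => null, 'indexed' => false]];

    while (true) {
        // Equivalent of "SELECT ... WHERE indexed = 0 LIMIT 1".
        $pending = null;
        foreach ($table as $word => $row) {
            if (!$row['indexed']) {
                $pending = $word;
                break;
            }
        }
        if ($pending === null) {
            break; // every known word has been indexed
        }

        // Insert unseen synonyms as new, unindexed rows.
        foreach (lookupSynonyms($pending) as $synonym) {
            if (!isset($table[$synonym])) {
                $table[$synonym] = ['relative_to' => $pending, 'indexed' => false];
            }
        }
        $table[$pending]['indexed'] = true;
    }

    return $table;
}
```

Starting from "fish", the loop keeps pulling unindexed words until none remain, which is the point where the post's approach would fall back to generating new letter combinations.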

 

The script isn't too difficult; the processing time is probably 5-10 weeks of solid computing, though.
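
For a rough sense of scale, the number of candidate strings in the brute-force step can be counted directly. This sketch assumes lowercase a-z only and a 64-bit PHP build; the function name `countCandidates` is hypothetical.

```php
<?php
// Counts every string of $minLength..$maxLength letters over an
// alphabet of $alphabetSize symbols (assumed: 26 lowercase letters).
function countCandidates($minLength, $maxLength, $alphabetSize = 26)
{
    $total = 0;
    for ($length = $minLength; $length <= $maxLength; $length++) {
        $count = 1;
        for ($i = 0; $i < $length; $i++) {
            $count *= $alphabetSize; // alphabetSize ^ length
        }
        $total += $count;
    }
    return $total;
}
```

For lengths 3 through 9 this comes to roughly 5.6 trillion candidates, so the time estimate depends heavily on how many of them actually have to be checked one page request at a time.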

 

Edit:

Section 3 of their ToU seems to say it's not allowed; however, I am sure there is a dictionary/thesaurus source out there that can provide you with unrestricted access to its content for non-commercial use.

 


I don't believe so, their site is open for public access.

Your script would simply be a client requesting a page. Whether they choose to block your script is their option.

 

It's never smart to rely on content from an external site, though. If the format changes or the external site does not respond, your script will probably break.

The time you spend using their format is so minimal that it doesn't matter if it changes.

It's not uncommon to reverse engineer formatted content using regex.

Is there a product you can buy to integrate searching into the site? I don't think I am nearly skilled enough to figure all this out. And I don't think using an external source (i.e. dictionary.com) would be a good idea. This is a fairly large site/company.


I understand, I use it all the time.

 

I was more thinking on a per-event basis, rather than building your own giant database of results.

 

And a site like thesaurus.com is generally not going to change its formatting, but if the text for a bad result suddenly changed, the script would have trouble determining whether the word existed or not.
