Jump to content

Recommended Posts

I have written a search function which I am moderately happy with, but I feel I can improve on 2 things.  One there has to be a better way.  Now I am not asking anyone to tackle that for me so I would for now hope to just improve the code so it does that last part in what I need it to do.  The below code is for a general search box.  It currently does 3 things.  Allows for OR, Exact phrace match, and AND.  Additionally I have it set up to allow a user to type 2 exact phrases BUT I would like the exact phrases to be OR or AND.  Right now it only does OR because I explode on a space.  So the search "Dog man" "Cat woman" searches all records that have "Dog man" OR "Car Woman" I want to set it up to also do AND.  Now here is the tricky part.  My AND is done through a multiterm variable linked by a +.  So where the search sees a + it goes through an IF statement to read the + sign how I need it to be read to come up as an AND.  Now when you look at the code I would love suggestions how to make it easier, but most importantly would love to know how to use the + in the exact phrase case mentioned above.  Well thanks and I wish you all luck, as I am  ::) (stumped) and need your help!  :P

function search($arr) {

$arr['searchterms'] = trim($arr['searchterms']); 
         if (strstr($arr['searchterms'],'"'))
         $terms = explode('" "',substr($arr['searchterms'],1,-1));
else
$terms = explode(" ",$arr['searchterms']);

$from = 0;
$limit = 20;

if (isset($arr['from']) && $arr['from'] > 0) {
	$from = $arr['from'];
}

// enforce hard-coded max limit
if (isset($arr['limit']) && $arr['limit'] > 0 && $arr['limit'] <= 100) {
	$limit = $arr['limit'];
}

$query = " FROM `".TABLE_PREFIX."pictures` p LEFT JOIN `".TABLE_PREFIX."comments` c
	ON p.`id` = c.`parent_id` ";

if ((count($terms) != 1) || ($terms[0] != '')){
	$query .= " WHERE ( ";
	foreach ((array)$terms as $term) {
$term = mysql_real_escape_string($term);
$multi_term = explode("+",$term);
if (count($multi_term)>1) {
$path = implode("%' AND `path` LIKE '%",$multi_term);
$description = implode("%' AND `description` LIKE '%",$multi_term);
$comment = implode("%' AND `comment` LIKE '%",$multi_term);
$caption = implode("%' AND `caption` LIKE '%",$multi_term);
$EXIF_subj_ref = implode("%' AND `EXIF_subj_ref` LIKE '%",$multi_term);
$EXIF_cont_loc_name = implode("%' AND `EXIF_cont_loc_name` LIKE '%",$multi_term);
$EXIF_city = implode("%' AND `EXIF_city` LIKE '%",$multi_term);
$EXIF_state = implode("%' AND `EXIF_state` LIKE '%",$multi_term);
$EXIF_country = implode("%' AND `EXIF_country` LIKE '%",$multi_term);
$EXIF_credit = implode("%' AND `EXIF_credit` LIKE '%",$multi_term);
$EXIF_source = implode("%' AND `EXIF_source` LIKE '%",$multi_term);
$EXIF_contact = implode("%' AND `EXIF_contact` LIKE '%",$multi_term);
$EXIF_caption = implode("%' AND `EXIF_caption` LIKE '%",$multi_term);
$EXIF_make = implode("%' AND `EXIF_make` LIKE '%",$multi_term);
$EXIF_model = implode("%' AND `EXIF_model` LIKE '%",$multi_term);
$EXIF_copyright = implode("%' AND `EXIF_copyright` LIKE '%",$multi_term);
$EXIF_artist = implode("%' AND `EXIF_artist` LIKE '%",$multi_term);
$EXIF_lat_ref = implode("%' AND `EXIF_lat_ref` LIKE '%",$multi_term);
$EXIF_long_ref = implode("%' AND `EXIF_long_ref` LIKE '%",$multi_term);
$EXIF_lat = implode("%' AND `EXIF_lat` LIKE '%",$multi_term);
$EXIF_long = implode("%' AND `EXIF_long` LIKE '%",$multi_term);
$EXIF_date_time_dig = implode("%' AND `EXIF_date_time_dig` LIKE '%",$multi_term);
$EXIF_date_time_orig = implode("%' AND `EXIF_date_time_orig` LIKE '%",$multi_term);
$EXIF_img_hist = implode("%' AND `EXIF_img_hist` LIKE '%",$multi_term);
$EXIF_usr_comment = implode("%' AND `EXIF_usr_comment` LIKE '%",$multi_term);
$EXIF_comment = implode("%' AND `EXIF_comment` LIKE '%",$multi_term);
$EXIF_keyword = implode("%' AND `EXIF_keyword` LIKE '%",$multi_term);
} else {
$path = $description = $comment = $caption = $EXIF_subj_ref = $EXIF_cont_loc_name = $EXIF_city = $EXIF_state = $EXIF_country = $EXIF_credit = $EXIF_source = $EXIF_contact = $EXIF_caption = $EXIF_make = $EXIF_model = $EXIF_copyright = $EXIF_artist = $EXIF_lat_ref = $EXIF_long_ref = $EXIF_lat = $EXIF_long = $EXIF_date_time_dig = $EXIF_date_time_orig = $EXIF_img_hist = $EXIF_usr_comment = $EXIF_comment =$EXIF_keyword = $term;
}  $query .= "    
`path` LIKE '%$path%' OR
`description` LIKE '%$description%' OR
`comment` LIKE '%$comment%' OR
`caption` LIKE '%$caption%' OR
`EXIF_subj_ref` LIKE '%$EXIF_subj_ref%' OR
`EXIF_cont_loc_name` LIKE '%$EXIF_cont_loc_name%' OR
`EXIF_city` LIKE '%$EXIF_city%' OR
`EXIF_state` LIKE '%$EXIF_state%' OR
`EXIF_country` LIKE '%$EXIF_country%' OR
`EXIF_credit` LIKE '%$EXIF_credit%' OR
`EXIF_source` LIKE '%$EXIF_source%' OR
`EXIF_contact` LIKE '%$EXIF_contact%' OR
`EXIF_caption` LIKE '%$EXIF_caption%' OR
`EXIF_make` LIKE '%$EXIF_make%' OR
`EXIF_model` LIKE '%$EXIF_model%' OR
`EXIF_copyright` LIKE '%$EXIF_copyright%' OR
`EXIF_artist` LIKE '%$EXIF_artist%' OR
`EXIF_lat_ref` LIKE '%$EXIF_lat_ref%' OR
`EXIF_long_ref` LIKE '%$EXIF_long_ref%' OR
`EXIF_lat` LIKE '%$EXIF_lat%' OR
`EXIF_long` LIKE '%$EXIF_long%' OR
`EXIF_date_time_dig` LIKE '%$EXIF_date_time_dig%' OR
`EXIF_date_time_orig` LIKE '%$EXIF_date_time_orig%' OR
`EXIF_img_hist` LIKE '%$EXIF_img_hist%' OR
`EXIF_usr_comment` LIKE '%$EXIF_usr_comment%' OR
`EXIF_comment` LIKE '%$EXIF_comment%' OR
`EXIF_keyword` LIKE '%$EXIF_keyword%' OR ";
}		
	$query = substr($query, 0, strlen($query) - 3) .") ";
} else {
	// no search terms? no results either
	$query .= " WHERE 1 = 0";
}

$sort_fields = array('date_submitted','id');
$sortby = 'date_submitted';

if (isset($arr['sortby']) && in_array($arr['sortby'],$sort_fields)) {
	$sortby = $arr['sortby'];
}

$sortdir = ' ASC';

if (isset($arr['sortdir']) && 'desc' == $arr['sortdir']) {
	$sortdir = ' DESC';
}

$result = run_query("SELECT COUNT(DISTINCT p.`id`) AS cnt " . $query);
$row = mysql_fetch_assoc($result);

$GLOBALS["total_pictures"] = $row["cnt"];
// and I need sort order here as well
// from and limit too
$result = run_query("SELECT p.*,c.`comment`,
			UNIX_TIMESTAMP(`date_submitted`) AS `unix_date_submitted` ".$query .
			" GROUP BY p.`id` ORDER BY `$sortby` $sortdir LIMIT $from,$limit");

$GLOBALS["available_pictures"] = mysql_num_rows($result);
$GLOBALS["picture_counter"] = 0;
$GLOBALS["picture_dbh"] = $result;
}

 

 

I don't see any reason to use the + or - for included and excluded terms in an exact phrase search.

 

For my site I made a dropdown that they can select which type of search and from what areas they would like to look from. It then uses a multiple if/else for the mysql queries and I use a variety of match,like and boolean mode to get the specific type of results.

 

Pretty much any + or - in any phrase or search word still works regardless what they selected.

 

I'm sure doing it your way in a certain circumstance you can just do a replace.

 

Possibly if the first character is not a number or letter then do something.

$search_words = str_replace(array('+','-'), "", $search_words);  

 

Anyway, If this helps here's a snippet of my code and how I do it.

I integrated my pagination, so it's like a search/navigation rolled into one.

It uses the get values from the form and then is also paged by same form.

I have some old queries and such, I fiddle with this more sometimes.

 

//search get variables from search-navigation 
if ($search == "Date") {
//    $result = mysql_query("SELECT * FROM posts $post_status AND post_date LIKE //'".$search_words."%' ORDER BY $display $order LIMIT $startrow,$posts_per_page" );
//$total_count = mysql_query("SELECT * FROM posts $post_status AND post_date LIKE //'".$search_words."%'");

$result = mysql_query("SELECT * FROM posts $post_status AND MATCH (post_date) AGAINST ('\"$search_words\"' IN BOOLEAN MODE) ORDER BY $display $order LIMIT $startrow,$posts_per_page" );
$total_count = mysql_query("SELECT * FROM posts $post_status AND MATCH (post_date) AGAINST ('\"$search_words\"' IN BOOLEAN MODE)"); 
} elseif ($search == "ID") {
    $result = mysql_query("SELECT * FROM posts $post_status AND ID LIKE '".$search_words."%' ORDER BY $display $order LIMIT $startrow,$posts_per_page" );
$total_count = mysql_query("SELECT * FROM posts $post_status AND ID LIKE '".$search_words."%'");
} elseif ($search == "url_begins_characters") {
    $result = mysql_query("SELECT * FROM posts $post_status AND post_title LIKE '".$search_words."%' ORDER BY $display $order LIMIT $startrow,$posts_per_page" );
$total_count = mysql_query("SELECT * FROM posts $post_status AND post_title LIKE '".$search_words."%'");
} elseif ($search == "url_contains_characters") {
    $result = mysql_query("SELECT * FROM posts $post_status AND post_title LIKE '%"."$search_words"."%' ORDER BY $display $order LIMIT $startrow,$posts_per_page" );
$total_count = mysql_query("SELECT * FROM posts $post_status AND post_title LIKE '%"."$search_words"."%'");
} elseif ($search == "feed_single_word") {
    $result = mysql_query("SELECT * FROM posts $post_status AND link_rss LIKE '%"."$search_words"."%' ORDER BY $display $order LIMIT $startrow,$posts_per_page" );
$total_count = mysql_query("SELECT * FROM posts $post_status AND link_rss LIKE '%"."$search_words"."%'");
} elseif ($search == "one_word") {
    $result = mysql_query("SELECT * FROM posts $post_status AND MATCH (post_title,post_content) AGAINST ('$search_words' IN BOOLEAN MODE) ORDER BY $display $order LIMIT $startrow,$posts_per_page" );
$total_count = mysql_query("SELECT * FROM posts $post_status AND MATCH (post_title,post_content) AGAINST ('$search_words' IN BOOLEAN MODE)");
} elseif ($search == "exact_words") {
    $result = mysql_query("SELECT * FROM posts $post_status AND MATCH (post_title,post_content) AGAINST ('\"$search_words\"' IN BOOLEAN MODE) ORDER BY $display $order LIMIT $startrow,$posts_per_page" );
$total_count = mysql_query("SELECT * FROM posts $post_status AND MATCH (post_title,post_content) AGAINST ('\"$search_words\"' IN BOOLEAN MODE)");   
} elseif ($search == "least_one_word") {
    $result = mysql_query("SELECT * FROM posts $post_status AND MATCH (post_title,post_content) AGAINST ('$search_words' IN BOOLEAN MODE) ORDER BY $display $order LIMIT $startrow,$posts_per_page" );
    $total_count = mysql_query("SELECT * FROM posts $post_status AND MATCH (post_title,post_content) AGAINST ('$search_words' IN BOOLEAN MODE)");
} elseif ($search == "exclude_word") {
    $result = mysql_query("SELECT * FROM posts $post_status AND MATCH (post_title,post_content) AGAINST ('+$search_words -$search_words' IN BOOLEAN MODE) ORDER BY $display $order LIMIT $startrow,$posts_per_page" );
    $total_count = mysql_query("SELECT * FROM posts $post_status AND MATCH (post_title,post_content) AGAINST ('+$search_words -$search_words' IN BOOLEAN MODE)");
} elseif ($search == "title_one_word") {
    $result = mysql_query("SELECT * FROM posts $post_status AND MATCH (title_2) AGAINST ('$search_words' IN BOOLEAN MODE) ORDER BY $display $order LIMIT $startrow,$posts_per_page" );
$total_count = mysql_query("SELECT * FROM posts $post_status AND MATCH (title_2) AGAINST ('$search_words' IN BOOLEAN MODE)");
} elseif ($search == "title_exact_words") {
    $result = mysql_query("SELECT * FROM posts $post_status AND MATCH (title_2) AGAINST ('\"$search_words\"' IN BOOLEAN MODE) ORDER BY $display $order LIMIT $startrow,$posts_per_page" );
$total_count = mysql_query("SELECT * FROM posts $post_status AND MATCH (title_2) AGAINST ('\"$search_words\"' IN BOOLEAN MODE)");   
} elseif ($search == "title_least_one_word") {
    $result = mysql_query("SELECT * FROM posts $post_status AND MATCH (title_2) AGAINST ('$search_words' IN BOOLEAN MODE) ORDER BY $display $order LIMIT $startrow,$posts_per_page" );
    $total_count = mysql_query("SELECT * FROM posts $post_status AND MATCH (title_2) AGAINST ('$search_words' IN BOOLEAN MODE)");
} elseif ($search == "title_exclude_word") {
    $result = mysql_query("SELECT * FROM posts $post_status AND MATCH (title_2) AGAINST ('+$search_words -$search_words' IN BOOLEAN MODE) ORDER BY $display $order LIMIT $startrow,$posts_per_page" );
    $total_count = mysql_query("SELECT * FROM posts $post_status AND MATCH (title_2) AGAINST ('+$search_words -$search_words' IN BOOLEAN MODE)");
} elseif ($search == "description_one_word") {
    $result = mysql_query("SELECT * FROM posts $post_status AND MATCH (post_description) AGAINST ('$search_words' IN BOOLEAN MODE) ORDER BY $display $order LIMIT $startrow,$posts_per_page" );
$total_count = mysql_query("SELECT * FROM posts $post_status AND MATCH (post_description) AGAINST ('$search_words' IN BOOLEAN MODE)");
} elseif ($search == "description_exact_words") {
    $result = mysql_query("SELECT * FROM posts $post_status AND MATCH (post_description) AGAINST ('\"$search_words\"' IN BOOLEAN MODE) ORDER BY $display $order LIMIT $startrow,$posts_per_page" );
$total_count = mysql_query("SELECT * FROM posts $post_status AND MATCH (post_description) AGAINST ('\"$search_words\"' IN BOOLEAN MODE)");   
} elseif ($search == "description_least_one_word") {
    $result = mysql_query("SELECT * FROM posts $post_status AND MATCH (post_description) AGAINST ('$search_words' IN BOOLEAN MODE) ORDER BY $display $order LIMIT $startrow,$posts_per_page" );
    $total_count = mysql_query("SELECT * FROM posts $post_status AND MATCH (post_description) AGAINST ('$search_words' IN BOOLEAN MODE)");
} elseif ($search == "description_exclude_word") {
    $result = mysql_query("SELECT * FROM posts $post_status AND MATCH (post_description) AGAINST ('+$search_words -$search_words' IN BOOLEAN MODE) ORDER BY $display $order LIMIT $startrow,$posts_per_page" );
    $total_count = mysql_query("SELECT * FROM posts $post_status AND MATCH (post_description) AGAINST ('+$search_words -$search_words' IN BOOLEAN MODE)");
} elseif ($search == "keyword_one_word") {
    $result = mysql_query("SELECT * FROM posts $post_status AND MATCH (post_keywords) AGAINST ('$search_words' IN BOOLEAN MODE) ORDER BY $display $order LIMIT $startrow,$posts_per_page" );
$total_count = mysql_query("SELECT * FROM posts $post_status AND MATCH (post_keywords) AGAINST ('$search_words' IN BOOLEAN MODE)");
} elseif ($search == "keyword_exact_words") {
    $result = mysql_query("SELECT * FROM posts $post_status AND MATCH (post_keywords) AGAINST ('\"$search_words\"' IN BOOLEAN MODE) ORDER BY $display $order LIMIT $startrow,$posts_per_page" );
$total_count = mysql_query("SELECT * FROM posts $post_status AND MATCH (post_keywords) AGAINST ('\"$search_words\"' IN BOOLEAN MODE)");   
} elseif ($search == "keyword_least_one_word") {
    $result = mysql_query("SELECT * FROM posts $post_status AND MATCH (post_keywords) AGAINST ('$search_words' IN BOOLEAN MODE) ORDER BY $display $order LIMIT $startrow,$posts_per_page" );
    $total_count = mysql_query("SELECT * FROM posts $post_status AND MATCH (post_keywords) AGAINST ('$search_words' IN BOOLEAN MODE)");
} elseif ($search == "keyword_exclude_word") {
    $result = mysql_query("SELECT * FROM posts $post_status AND MATCH (post_keywords) AGAINST ('+$search_words -$search_words' IN BOOLEAN MODE) ORDER BY $display $order LIMIT $startrow,$posts_per_page" );
    $total_count = mysql_query("SELECT * FROM posts $post_status AND MATCH (post_keywords) AGAINST ('+$search_words -$search_words' IN BOOLEAN MODE)");
} else {

//if anything goes wrong above or nothing selected, this will be used as the default query instead

/*
all posts
//$result = mysql_query("SELECT * FROM posts $post_status ORDER BY $display $order //LIMIT $startrow,$posts_per_page" );
$total_count = mysql_query("SELECT * FROM posts $post_status");
*/

//just show last 10 by id main page
if ($url == "http://get.blogdns.com/dynaindex/index.php") {
$result = mysql_query("SELECT * FROM posts ORDER BY ID DESC LIMIT 0,10");
$total_count = $result;
} else {
//todays results new and updated
$result = mysql_query("SELECT * FROM posts $post_status AND post_date LIKE '".$today_date."%' ORDER BY $display $order LIMIT $startrow,$posts_per_page" );
$total_count = mysql_query("SELECT * FROM posts $post_status AND post_date LIKE '".$today_date."%'");
}

Thank you for the post, but do you not feel there is anyway to include my idea of having the exact phrase match include the AND usage via the + sign which is how I have it set up now?

 

Meaning that "Hello"+"world" would be a exact phrase match on BOTH hello and world?  Currently that fails as you know due to the explode.  If that is not possible I will start reworking the search based around your suggestion.

Looking over your code again and thinking.

 

I thought that if you place the string into an array, and make each word unique, you could then maybe do a foreach loop and modify each as you like, checking for patterns using preg_match for each word if contains spaces or the +. Then would need to trim and remove the plus...run whatever query you wanted with the words.

 

I did come across this, it may help or just confuse the situation more.

http://www.joedolson.com/Search-Engine-in-PHP-MySQL.php

 

I would use preg_match with a pattern like /and|or/\i if needed it to work for either one

 

Pretty much you can do anything you want here, most things can be done in php if are slick enough about it.

 

if wanna display OR plus also AND, then run a double query, if checking for exact matches...look for double whitespace, if is an include check for the +

 

To me I'd rather just let the user select which type of search they want, after all, they are the only ones that know what they want.

Once my server gets done recovering some lost files I can fiddle more with an idea I have.(been 3 days thus far)

 

It would be them typing in either special expressions or just using the simple words of and,or,not

 

Highlighting the word, or groups of words to show them their search terms better.

 

Doing a preg_match on the string and seperating these into groups, query the database with all the desired results.

 

Maybe you don't know this but +,- are default characters to include and exclude in a boolean mode search.

Didn't know that about the = and - or the boolean search mode.  I learn as I go so I will be looking this up : P

 

I have looked into preg_match but really couldn't find a snippet to code from as I failed trying to start from scratch with that idea, hence why I use the above.  It gets the job done, but probably could be way more efficient.

Looking over your code again and thinking.

 

I thought that if you place the string into an array, and make each word unique, you could then maybe do a foreach loop and modify each as you like, checking for patterns using preg_match for each word if contains spaces or the +. Then would need to trim and remove the plus...run whatever query you wanted with the words.

 

I did come across this, it may help or just confuse the situation more.

http://www.joedolson.com/Search-Engine-in-PHP-MySQL.php

 

I would use preg_match with a pattern like /and|or/\i if needed it to work for either one

 

Pretty much you can do anything you want here, most things can be done in php if are slick enough about it.

 

if wanna display OR plus also AND, then run a double query, if checking for exact matches...look for double whitespace, if is an include check for the +

 

To me I'd rather just let the user select which type of search they want, after all, they are the only ones that know what they want.

 

I have looked over that link previous to posting here and was considering it if I needed to do a complete rewrite of the expression, but really my expression works in all cases that I need it based on user feedback and in house testing.  The only case it failed is where a user wanted to look for photos along a specific lat and long.  When they looked for say "33.2" and "78.2" they were getting images at "33.2" or 78.2"  because I am not EXACT MATCH totally 33.2 could be 33.23432 which I wanted.  This allows them to naturally wildcard without thinking about it. 

You do have a very complex code there, must have taken you a while and also with problems along the way.

 

I do think this entire process could be made simpler, that's for sure.

 

LIKE searches are kinda slow, I believe doing full text indexing on your database would remedy a lot of your issues, as can see I only used LIKE for when checking the first characters and not the rest.

 

Anyway here's another snippet of code for you to try out.

 

$startrow ="0";
$posts_per_page ="20";
$search_query = "('";
  for($i=0;$i < $word_count;$i++){
    $search_query .= '+'.$search_words[$i].'* ';
  }
$search_query .= "' IN BOOLEAN MODE)";
  
$sql="SELECT * FROM table WHERE MATCH (title,description) AGAINST" . $search_query .  "ORDER BY id DESC LIMIT $startrow,$posts_per_page";

As per http://dev.mysql.com/doc/refman/5.5/en/fulltext-boolean.html

 

In implementing this feature, MySQL uses what is sometimes referred to as implied Boolean logic, in which

 

    *

 

      + stands for AND

    *

 

      - stands for NOT

    *

 

      [no operator] implies OR

 

Why in the below could I not add the ability to also have the code look for areas with qoutes and an addition sign?  This would add that last bit that I need without really changing the overall scheme.  I do realize I need to clean this up but for demoing purposes isn't there a quick and dirty fix?

 

$arr['searchterms'] = trim($arr['searchterms']); 
         if (strstr($arr['searchterms'],'"'))
         $terms = explode('" "',substr($arr['searchterms'],1,-1));
else
$terms = explode(" ",$arr['searchterms']);

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.