Searching an array for multiple keywords


3 replies to this topic

#1 litebearer

litebearer
  • Members
  • Advanced Member
  • 2,357 posts
  • Location: white lake michigan

Posted 18 July 2006 - 03:45 AM

Evening, all...

I can't seem to get my thoughts properly wrapped around this problem.

Scenario:

Searching an array of data for matches to keywords. Create a new array from the matches.

01. From 1 to 3 keywords possible. Match must have ALL keywords.

02. Case insensitive

03. $needle1, $needle2 and $needle3 represent the possible keywords

04. $old_haystack represents an element from the original array

05. $haystack represents lowercase version of $old_haystack

06. must be compatible with PHP versions 3, 4 and 5

I came up with this sledgehammer approach: looping through all the elements of the original array and running this function on each element.

However, I am sure there is a more elegant, refined and effective approach, but what is it?

<?PHP
function validate_words($needle1, $needle2, $needle3, $old_haystack) {
	$haystack = strtolower($old_haystack);
	$new_element = "";
	// length of each keyword after trimming; 0 means that keyword slot is unused
	$kwl[0] = strlen(trim($needle1));
	$kwl[1] = strlen(trim($needle2));
	$kwl[2] = strlen(trim($needle3));
	$add_element = 0;

	if(($kwl[0]>0) AND ($kwl[1]>0) AND ($kwl[2]>0) AND (substr_count($haystack, $needle1)>0) AND (substr_count($haystack, $needle2)>0) AND (substr_count($haystack, $needle3)>0)) {
	// add element to new array
	return 1;
	}

	if(($kwl[0]>0) AND ($kwl[1]>0) AND ($kwl[2]<1) AND (substr_count($haystack, $needle1)>0)  AND (substr_count($haystack, $needle2)>0)){
	// add element to new array
	return 1;
	}

	if(($kwl[0]>0) AND ($kwl[1]<1) AND ($kwl[2]>0) AND (substr_count($haystack, $needle1)>0) AND (substr_count($haystack, $needle3)>0)) {
	// add element to new array
	return 1;
	}

	if(($kwl[0]>0) AND ($kwl[1]<1) AND ($kwl[2]<1) AND (substr_count($haystack, $needle1)>0)) {
	// add element to new array
	return 1;
	}

	if(($kwl[0]<1) AND ($kwl[1]>0) AND ($kwl[2]>0) AND (substr_count($haystack, $needle2)>0) AND (substr_count($haystack, $needle3)>0)) {
	// add element to new array
	return 1;
	}

	if(($kwl[0]<1) AND ($kwl[1]>0) AND ($kwl[2]<1) AND (substr_count($haystack, $needle2)>0)) {
	// add element to new array
	return 1;
	}

	if(($kwl[0]<1)  AND ($kwl[1]<1) AND ($kwl[2]>0) AND (substr_count($haystack, $needle3)>0)) {
	// add element to new array
	return 1;
	}
	return 0;
}

?>

Thanks,

Lite...

all the brothers were valiant!

The truly intelligent people are not those who create the dots; rather, they are the ones with the ability to connect the dots into a coherent picture

#2 emehrkay

emehrkay
  • Staff Alumni
  • Advanced Member
  • 1,214 posts

Posted 18 July 2006 - 04:02 AM

I'm having trouble understanding exactly what you're trying to do.

You use substr_count, but I thought $haystack was an array.

It might just be easier to place your needles in one array and your haystack in another and loop through both. Again, I don't understand what you are doing. Can you explain a little bit more?
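
The "needles in one array, haystack in another" idea could be sketched roughly as follows. This is only an illustration: the function name matches_all_keywords() and the sample data are made up, and it assumes PHP 4 or later (foreach and === are not available in PHP 3). Blank keywords are skipped, so the same loop handles 1, 2 or 3 keywords.

```php
<?php
// Hypothetical helper: returns 1 when every non-empty keyword occurs in
// $haystack (case-insensitive), 0 otherwise.
function matches_all_keywords($needles, $haystack) {
	$haystack = strtolower($haystack);
	$checked = 0;
	foreach ($needles as $needle) {
		$needle = strtolower(trim($needle));
		if ($needle === '') {
			continue; // blank keyword slot: ignore it
		}
		$checked++;
		if (strpos($haystack, $needle) === false) {
			return 0; // one keyword missing means no match
		}
	}
	// with no keywords at all there is nothing to match
	return ($checked > 0) ? 1 : 0;
}

// Sample data, made up for illustration.
$records = array('Red Apple pie', 'Green apple tart', 'Red cherry pie');
$needles = array('red', 'PIE', '');

// Outer loop over the haystack array, inner loop (in the helper) over the needles.
$matches = array();
foreach ($records as $record) {
	if (matches_all_keywords($needles, $record)) {
		$matches[] = $record;
	}
}
```

With the sample data above, $matches should end up holding the first and third records.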

#3 litebearer

litebearer
  • Members
  • Advanced Member
  • 2,357 posts
  • Location: white lake michigan

Posted 18 July 2006 - 04:21 AM

Hmmm, ok, a little more definition...

First, I know that using mysql would simplify things; however, for various reasons, I am using a flatfile for this particular project.

each line in the file represents a 'record'

each record has several 'fields'

user desires to search the file for all records that contain certain keywords

user can use either 1, 2 or 3 keywords.

user can select that either ALL or ANY of the keywords are in a record

Matching using ANY is a no-brainer.

The code I previously posted will, in the end, accomplish the task.

My basic question was/is: Is there a more effective, simpler method of reaching the same end result?

Not sure if that makes it any clearer.

Thanks,

Lite...


#4 akitchin

akitchin
  • Staff Alumni
  • Advanced Member
  • 2,516 posts
  • Location: Calgary, AB, Canada

Posted 18 July 2006 - 04:25 AM

try something like this:

<?php
// define the needles
$needle1 = 'one';
$needle2 = 'two';
$needle3 = 'three';

// map a function to all levels of an array
function deep_map($function, $val)
{
	if (is_array($val))
	{
		$return_arr = array();
		foreach ($val AS $k => $v)
			$return_arr["$k"] = deep_map($function, $val["$k"]);
		return $return_arr;
	}
	else
	{
		$command = "return (".$function."(\$val));";
		return eval($command);
	}
}

// the function to see if all the set needles match the element
function check_for_needles($haystack)
{
  // one TRUE/FALSE entry is collected per set needle
  $results = array();
  // check for needle1, but only if it's set
  if (isset($GLOBALS['needle1']))
  {
    $results[] = (stristr($haystack, $GLOBALS['needle1']) !== FALSE) ? TRUE : FALSE;
  }

  // check for needle2, but only if it's set
  if (isset($GLOBALS['needle2']))
  {
    $results[] = (stristr($haystack, $GLOBALS['needle2']) !== FALSE) ? TRUE : FALSE;
  }

  // check for needle3, but only if it's set
  if (isset($GLOBALS['needle3']))
  {
    $results[] = (stristr($haystack, $GLOBALS['needle3']) !== FALSE) ? TRUE : FALSE;
  }

  // check if all set needles matched - return FALSE if they didn't, the element if they did
  if (in_array(FALSE, $results))
    return FALSE;
  else
    return $haystack;
}

// run the function on your array haystack (can be multi-leveled)
$matches = deep_map('check_for_needles', $haystack_array);

// strip out the un-matching ones (array_diff() is not recursive, so this only cleans the top level)
$matches = array_diff($matches, array(FALSE));

// $matches will now contain the matching elements, keys intact
?>

may not be any more elegant, but it might be easier to debug.  deep_map() is a function i made based on php.net entries. it runs any defined function on every element and maps the results to an array with the same structure as the original (key => returned value for that element).  keep this one handy, it helps in a LOT of applications.
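
For comparison, the same recursive mapping can be sketched without eval() by handing the function name to call_user_func(). This is a hypothetical variant, not the function posted above: the name deep_map_cb() and the sample array are made up, and it assumes PHP 4 or later.

```php
<?php
// Hypothetical variant of deep_map(): same recursion, but the callback is
// invoked with call_user_func() instead of being built into an eval() string.
function deep_map_cb($function, $val) {
	if (is_array($val)) {
		$return_arr = array();
		foreach ($val as $k => $v) {
			// recurse into each element, keeping its key
			$return_arr[$k] = deep_map_cb($function, $v);
		}
		return $return_arr;
	}
	// leaf value: apply the callback directly
	return call_user_func($function, $val);
}

// Made-up nested array for illustration.
$nested = array('a' => 'One', 'b' => array('c' => 'Two', 'd' => 'Three'));
$lower = deep_map_cb('strtolower', $nested);
```

Here $lower keeps the structure and keys of $nested, with every string lowercased.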

hth.

EDIT:  just noticed you said they should be able to select whether ANY or ALL are matched.  this can be adjusted by changing the if() condition in check_for_needles() to suit: for ALL, reject the element if there are any FALSEs; for ANY, reject it only if every result is FALSE.
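
One way the ANY/ALL switch might look is sketched below. It is an assumption-laden variant rather than the code above: check_for_needles_mode() is a made-up name, and the needles and the mode are passed in as arguments instead of being read from $GLOBALS. The strict third argument to in_array() needs PHP 4.2 or later.

```php
<?php
// Hypothetical variant: $needles is an array of keywords (blank ones are
// skipped), $match_all is TRUE for ALL and FALSE for ANY.
function check_for_needles_mode($haystack, $needles, $match_all) {
	$results = array();
	foreach ($needles as $needle) {
		if ($needle !== '') {
			// stristr() is case-insensitive, so no strtolower() is needed
			$results[] = (stristr($haystack, $needle) !== FALSE);
		}
	}
	if (count($results) == 0) {
		return FALSE; // no keywords were set
	}
	if ($match_all) {
		// ALL: a single missing needle rejects the element
		return in_array(FALSE, $results, TRUE) ? FALSE : $haystack;
	}
	// ANY: a single found needle accepts the element
	return in_array(TRUE, $results, TRUE) ? $haystack : FALSE;
}
```

Called with the haystack 'One Two' and the needles 'one' and 'three', this should return FALSE in ALL mode and the haystack itself in ANY mode.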



