
Regarding search without a database


Wuhtzu


Hey

 

I would like some advice and ideas on how to perform a search without a database, i.e. without functionality such as MySQL's LIKE.

 

Basically, I have an array of filenames and I would like to find the entries in it that look like a given filename. Let me give an example:

Array containing file names:
Array
(
    [0] => pictures-of-something_1.ext
    [1] => picturesofsomething2.ext
    [2] => myfile.ext
    [3] => WeIrDcAsE.eXt
    [4] => stupid file name.ext

)

 

Search and result:

pictures_of_something.ext will find pictures-of-something_1.ext and picturesofsomething2.ext

weirdcase.ext will find WeIrDcAsE.eXt

stupid_file.ext will find stupid file name.ext

myfile.gif will find myfile.ext

 

So I would like to be able to find entries which differ in case, use of separators (space, _ and -), numbering and extension.

 

Any ideas on how to accomplish this / approach it?

 

One idea would be to take the filename, e.g. pictures_of_something_1.ext, and replace underscores, numbering and .ext with a wildcard in a regex.
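
Roughly, that idea could look like the sketch below. It is only a sketch under assumptions: the helper name buildLoosePattern is made up for illustration, and "separators and numbering" is taken to mean whitespace, underscores, hyphens and digits.

```php
<?php
// Hypothetical helper: turn a file name into a loose, case-insensitive regex.
function buildLoosePattern(string $file): string
{
    // Drop the extension, then split the base name on separators and digits.
    $base  = pathinfo($file, PATHINFO_FILENAME);
    $words = preg_split('/[\s_\-0-9]+/', $base, -1, PREG_SPLIT_NO_EMPTY);

    // Escape each remaining word so it is matched literally.
    $words = array_map(function ($word) {
        return preg_quote($word, '/');
    }, $words);

    // Allow any run of separators or digits between and after the words,
    // followed by any single extension.
    $gap = '[\s_\-0-9]*';
    return '/^' . implode($gap, $words) . $gap . '\.[^.]+$/i';
}

$files = array(
    'pictures-of-something_1.ext',
    'picturesofsomething2.ext',
    'myfile.ext',
    'WeIrDcAsE.eXt',
    'stupid file name.ext',
);

// preg_grep() keeps the entries that match the pattern.
$pattern = buildLoosePattern('pictures_of_something_1.ext');
$matches = array_values(preg_grep($pattern, $files));
print_r($matches);
```

With the array from my example this keeps the first two entries and nothing else.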

 

But I would like to hear if you have some ideas regarding this task/problem :)

 

Best regards

Wuhtzu

Hmm... maybe I chose the wrong board. I thought of posting in PHP Help, but since it's not concrete coding help I chose this one. Thanks to those who have stopped by so far. For those who stop by in the future, I can widen the type of advice I would like:

 

If you know of any articles about search in general, or explanations of specific search mechanisms (for example, how MySQL's LIKE works), I would like to hear about them.

 

Thanks again

Wuhtzu


as a search for 'myfile.gif' should also find 'myfile.ext', it's probably fair to say the extension is negligible.

 

let's say that $search contains what you're looking for, and the array $haystack contains the data. first we'd get rid of the extension, then we'd make the $search and $haystack elements "common" - i.e. take out all the characters that don't matter and put everything in the same case:

 

<?php
$search = 'pictures_of_something.ext';

$haystack = array (
   'pictures-of-something_1.ext',
   'picturesofsomething2.ext',
   'myfile.ext',
   'WeIrDcAsE.eXt',
   'stupid file name.ext'
);

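// simple function to convert filename to lowercase alpha, without extension
function getCommon($file)
{
   // get filename part (without the last extension) in lowercase;
   // pathinfo() also copes with names that contain no dot or several dots
   $filename = strtolower(pathinfo($file, PATHINFO_FILENAME));

   // remove everything that isn't a-z (separators, digits, punctuation)
   $filename = preg_replace('/[^a-z]/', '', $filename);

   return $filename;
}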

$needle = getCommon($search);

$results = array(); // our results will go here

// now check each array element
foreach($haystack as $item)
{
   if ($needle === getCommon($item))
   {
      $results[] = $item;
   }
}

// output results!
echo '<pre>';
print_r($results);
echo '</pre>';
?>

 

as it's used a fair few times, i've put the "common" converter into a little function called getCommon. it removes the extension (we don't need it for the search) and lowercases the filename, then strips out every character that isn't a letter - so if you search for "pictures_of_something.ext", it gets converted into "picturesofsomething". running the first two elements of the $haystack array through it also returns "picturesofsomething" and voila - 2 matches.
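
one thing to watch: an exact comparison won't find "stupid file name.ext" for "stupid_file.ext", because "stupidfile" and "stupidfilename" are different strings. a possible variation, assuming a prefix match is what you want, is sketched below. the names normalizeName and findByPrefix are made up for this example, and normalizeName just repeats the same "lowercase letters only, no extension" idea.

```php
<?php
// Same normalisation idea as above: lowercase letters only, no extension.
function normalizeName(string $file): string
{
    $base = strtolower(pathinfo($file, PATHINFO_FILENAME));
    return preg_replace('/[^a-z]/', '', $base);
}

// Hypothetical variant: an entry matches when its normalised name
// starts with the normalised search term.
function findByPrefix(string $search, array $haystack): array
{
    $needle = normalizeName($search);
    if ($needle === '') {
        return array(); // nothing meaningful to search for
    }

    $results = array();
    foreach ($haystack as $item) {
        // strpos() returns 0 when $needle sits at the very start.
        if (strpos(normalizeName($item), $needle) === 0) {
            $results[] = $item;
        }
    }
    return $results;
}

$haystack = array(
    'pictures-of-something_1.ext',
    'picturesofsomething2.ext',
    'myfile.ext',
    'WeIrDcAsE.eXt',
    'stupid file name.ext',
);

print_r(findByPrefix('stupid_file.ext', $haystack));
```

the trade-off is that a prefix match is looser: a short search term such as "my.ext" would match every file whose name starts with "my".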

 

Hope that helps


You are a genius, man. I appreciate the code, but the following was more than enough to paint the picture.

 

 

let's say that $search contains what you're looking for, and the array $haystack contains the data. first we'd get rid of the extension, then we'd make the $search and $haystack elements "common" - i.e. take out all the characters that don't matter and put everything in the same case:

 

 

I think it's a good solution to the problem!

 

Thanks a lot

Wuhtzu


If you're looking to search through the contents of an array with permutations of natural language or anything like that you'll have to write some regex. It looks intimidating at first but it's fun once you figure it out. Sorry for the vague answer, just trying to provide a next-step jumping off point if you find out that you mean something more ambitious by "search."
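
As one possible starting point, here is a small sketch that searches an array with a regex via preg_grep(). It assumes "search" means "the entry contains all of the given words, in any order"; the helper name grepAllWords is made up for this example.

```php
<?php
// Hypothetical helper: keep the entries that contain every word, in any order.
function grepAllWords(array $words, array $haystack): array
{
    $pattern = '/^';
    foreach ($words as $word) {
        // One lookahead per word; each must succeed somewhere in the entry.
        $pattern .= '(?=.*' . preg_quote($word, '/') . ')';
    }
    $pattern .= '/i'; // case-insensitive

    // preg_grep() preserves keys, so re-index the result.
    return array_values(preg_grep($pattern, $haystack));
}

$entries = array(
    'pictures-of-something_1.ext',
    'picturesofsomething2.ext',
    'myfile.ext',
    'WeIrDcAsE.eXt',
    'stupid file name.ext',
);

print_r(grepAllWords(array('name', 'stupid'), $entries));
```

The word order in the search does not matter here, which is the part a plain substring check cannot do on its own.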
