
Finding similar strings


gwolgamott


OK, I have a script here for searching a list of file names. I'll give pseudocode so as not to take up a huge amount of space:

// $variable_key = search term passed to the function
$string = explode("_", $filename);
// $string[0] = form number, $string[1] = name
if ($variable_key == $string[0] || $variable_key == $string[1])
{
   // link to that file
   echo '<a href="' . htmlspecialchars($filename) . '">' . htmlspecialchars($filename) . '</a>';
}
// repeat through the file iterations for all possible matches

 

This gets me exact matches on name or form number if I use a string compare (even a case-insensitive one), and with strchr(string, search) I can find a string within a string. But what if the search term is "engineering" and the file is named "engineer", or someone searches for "enginering", misspelled? What would I need to investigate for this?
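For reference, the exact, case-insensitive and substring checks described above can be sketched roughly as follows. The function name findExactMatches and the sample file names are hypothetical, and the "number_name.ext" layout is an assumption based on the snippet above. It is only meant to show why these checks miss near matches.

```php
<?php
// Hypothetical helper: returns the file names whose number or name part
// equals $key (ignoring case) or whose name part contains $key.
function findExactMatches(string $key, array $filenames): array
{
    $matches = [];
    if ($key === '') {
        return $matches;                      // nothing to search for
    }
    foreach ($filenames as $filename) {
        // Assumed layout: "<number>_<name>.<ext>"
        $base   = pathinfo($filename, PATHINFO_FILENAME);
        $parts  = explode('_', $base, 2);
        $number = $parts[0];
        $name   = $parts[1] ?? '';

        if (strcasecmp($key, $number) === 0               // exact form number
            || strcasecmp($key, $name) === 0              // exact name, any case
            || ($name !== '' && stripos($name, $key) !== false)) { // substring
            $matches[] = $filename;
        }
    }
    return $matches;
}

// Hypothetical sample data
$files = ['101_engineer.pdf', '102_accounting.pdf'];
print_r(findExactMatches('eng', $files));          // substring hit
print_r(findExactMatches('engineering', $files));  // longer term: no hit
```

A search for "engineering" or "enginering" returns nothing here, because neither is equal to, nor contained in, "engineer".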


Look at these PHP functions:

soundex()

levenshtein()

similar_text()

metaphone()
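A quick way to get a feel for these four functions is to run them side by side on the kind of input from the question. The sketch below uses made-up sample words. It is only an illustration of what each call returns, not a recommendation of one function over another.

```php
<?php
// Compare a misspelled query against a few candidates with each function.
$query      = 'enginering';                          // misspelled search term
$candidates = ['engineering', 'engineer', 'accounting'];

foreach ($candidates as $candidate) {
    // Edit distance: number of single-character edits, lower is closer
    $distance = levenshtein($query, $candidate);
    // similar_text() fills $percent with a 0..100 similarity score
    similar_text($query, $candidate, $percent);
    // Phonetic keys: words that sound alike tend to share a key
    $sameSoundex   = soundex($query) === soundex($candidate);
    $sameMetaphone = metaphone($query) === metaphone($candidate);

    printf(
        "%-12s levenshtein=%d similar_text=%.1f%% soundex=%s metaphone=%s\n",
        $candidate,
        $distance,
        $percent,
        $sameSoundex ? 'same' : 'differs',
        $sameMetaphone ? 'same' : 'differs'
    );
}
```

For example, levenshtein('enginering', 'engineering') is 1, since only one letter is missing, and the two words share the same soundex() key.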

 

But MySQL has its own soundex:

http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_soundex

Here is a MySQL UDF for levenshtein

http://joshdrew.com/

 

Atomboy has double metaphone for MySQL here:

http://atomboy.isa-geek.com/plone/Members/acoil/programing/double-metaphone/metaphone.sql/view

 

HTH

Teamatomic

EDIT: almost forgot, what could I use to test alpha-numeric results?

 

Do you think something like this would be sufficient? I combined a few functions using a test page I found and adjusted it to some test data, just to see whether it gets the sort of results I want. I don't like this code, though. Before I polish it up to be used as a function, what can I improve? Right now it's a sloppy mess, I know, which is why I'm asking for suggestions.

 

<html>
<body>
<form action="searchtest.php" name="test" method="get">
Name1:<input type="text" name="name1" /><br />
Name2:<input type="text" name="name2" /><br />
<input type="submit" name="submit" value="compare" />
</form>

<!--php code begins here -->

<?php
if (isset($_GET['submit'], $_GET['name1'], $_GET['name2']))
{
    $str1 = trim($_GET['name1']);
    $str2 = trim($_GET['name2']);
    // Escape user input before echoing it back into the page
    $safe1 = htmlspecialchars($str1);
    $safe2 = htmlspecialchars($str2);

    $meta_one = metaphone($str1);
    $meta_two = metaphone($str2);
    $lev = levenshtein($meta_one, $meta_two);
    echo "metaphone code for ".$safe1." is ".$meta_one."<br />";
    echo "metaphone code for ".$safe2." is ".$meta_two."<br />";
    echo "<br />";
    echo "levenshtein of the two metaphones is: ".$lev.".<br />";

    // Edit distance relative to the length of the first string;
    // max(1, ...) avoids a division by zero on empty input
    $length = max(1, strlen($str1));
    $match = $lev / $length;
    echo "ratio of distance to length is: ".$match.".<br />";

    if ($match < .2)
    {
        echo "metaphone codes are matching";
    }
    elseif ($match < .61)
    {
        // Borderline case: fall back to similar_text() on the raw strings
        similar_text($str1, $str2, $percent);
        echo "Percentage of similar text is: ".$percent.".<br />";
        if ($percent > 50)
        {
            echo "Match is: ".$percent." by the similar_text test.";
        }
        else
        {
            echo "metaphone codes are not matching";
        }
    }
    else
    {
        echo "metaphone codes are not matching";
    }
}
?>

</body>
</html>
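One possible way to turn the test page into a reusable function is sketched below. The name isFuzzyMatch is hypothetical, the thresholds (0.2, 0.61 and 50) are simply carried over from the test page, and normalising by the longer metaphone code is an assumption of this sketch, so treat it as a starting point to tune against real file names rather than a finished answer.

```php
<?php
// Hypothetical wrapper around the same idea: compare metaphone codes first,
// then fall back to similar_text() on the raw strings for borderline cases.
function isFuzzyMatch(string $search, string $candidate): bool
{
    $search    = strtolower(trim($search));
    $candidate = strtolower(trim($candidate));
    if ($search === '' || $candidate === '') {
        return false;                   // nothing to compare
    }
    if ($search === $candidate) {
        return true;                    // exact match, skip the phonetic work
    }

    $metaSearch    = metaphone($search);
    $metaCandidate = metaphone($candidate);
    $distance      = levenshtein($metaSearch, $metaCandidate);

    // Assumption of this sketch: normalise by the longer metaphone code.
    // The test page divides by the length of the first input string instead.
    $length = max(strlen($metaSearch), strlen($metaCandidate), 1);
    $ratio  = $distance / $length;

    if ($ratio < 0.2) {
        return true;                    // phonetic codes are close enough
    }
    if ($ratio < 0.61) {
        // Borderline: let similar_text() on the raw strings decide
        similar_text($search, $candidate, $percent);
        return $percent > 50;
    }
    return false;
}

var_dump(isFuzzyMatch('enginering', 'engineering'));
```

With a function like this, the search loop from the first post could call isFuzzyMatch($variable_key, $string[1]) in place of the == comparison on the name part.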

