
Need Help Creating Text Variation Function


eldan88


Hey,

 

I am trying to create a function that will take a word and create a variation. For example, if I type in the word "ave" it will generate the word "avenue". If I type in "st", it will generate the word "street". So if I type in "1st st", "1st ave", it will echo back "1st st", "1st street", "1st avenue".

 

I have created a simple form to do that below with str_replace, but didn't find that useful... Any ideas on how I can accomplish this?

 

<?php
if(isset($_POST['submit'])) {
$variation = $_POST['variation'];
$delivery_range = str_replace("ave", "avenue", $variation);
}
?>

<form action="variation_keyword.php" method="post">
<textarea cols="10" rows="10" name="variation"><?php echo $delivery_range ?> </textarea>
<input type="submit" name="submit" value="submit" />
</form>


I was going to agree with Stooney until I noticed a problem: 'st'. In your example you state the input could be something like "1st st" and you want that converted to "1st street". But str_replace would actually change that to "1street street".
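A quick check makes the problem concrete. This is only a sketch using the input from the question; the variable names are made up for illustration.

```php
<?php
// str_replace() replaces every occurrence of the substring,
// including the "st" at the end of "1st".
$input  = "1st st";
$output = str_replace("st", "street", $input);
echo $output; // prints "1street street"
```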

 

Now, a regular expression *might* be a better solution. But the thing is that it is nearly impossible to programmatically modify "text" as a human would do it. As in the example above, you really only want to replace 'st' with 'street' based upon the context in which it is used. Humans are great at deciphering the meaning of something from its context. To have a machine do it requires programming an understanding of all those possible contexts. For example, you and I would understand the following inputs as 'logically' the same thing: "1st st", "First St.", "first street", etc.

 

So, your best bet is to create a programmatic solution that will meet a large number of possibilities, but it will never be foolproof. For something like this I would think it is better to leave a value unmodified that perhaps should have been changed than to erroneously modify a value that shouldn't be.


^

 

good catch. perhaps this then:

$delivery_range = str_replace(array(' ave', ' st'), array(' avenue', ' street'), $variation);

 

Using the space, one could minimize the number of unintentional replaces, seeing as '1st st' would be more common than '1 st st'. This would be more of a quick/temp solution, nothing serious or long term.
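To show both what the leading space buys and where it still falls short, here is a small sketch. The second input string is an assumption picked to trigger the failure; it is not from the original question.

```php
<?php
$search  = array(' ave', ' st');
$replace = array(' avenue', ' street');

// Works for the case from the question: the "st" in "1st" has no space before it.
$ok = str_replace($search, $replace, '1st st');
echo $ok . "\n";  // prints "1st street"

// Still misfires when a later word merely starts with "st".
$bad = str_replace($search, $replace, '12 stone ave');
echo $bad . "\n"; // prints "12 streetone avenue"
```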


I would use a regular expression, which can look for word boundaries, make the replacement case-insensitive, and add an optional period at the end. Here is an example script. Note: I couldn't get the optional period to work. It normally has a special meaning in regular expressions and I couldn't seem to properly escape it - and I'm too busy to research at the moment.

 

//Master array of replacements
$replacements = array
(
   'st' => 'street',
   'av' => 'avenue',
   'ave' => 'avenue'
);

//Array of test values to modify
$test_values = array
(
   "1st st",
   "1st St",
   "1st St.",
   "1st ave",
   "Averton av",
   "Averton ave",
   "Averton Ave."
);

//Create separate array for search patterns
$patterns = array_keys($replacements);
//Modify patterns to:
// 1 - include word boundaries
// 2 - make case insensitive
// 3 - add optional period NOT IMPLEMENTED
foreach($patterns as &$pattern)
{
   $pattern = '#\b' . $pattern . '\b#i';
}

//Perform replacement for all test values
$results = preg_replace($patterns, $replacements, $test_values);

//Output the before and after results
echo "<br><br>Before:<pre>" . print_r($test_values, 1) . "</pre>";
echo "After:<pre>" .  print_r($results, 1) . "</pre>";

 

Results

Before:
Array
(
   [0] => 1st st
   [1] => 1st St
   [2] => 1st St.
   [3] => 1st ave
   [4] => Averton av
   [5] => Averton ave
   [6] => Averton Ave.
)

After:
Array
(
   [0] => 1st street
   [1] => 1st street
   [2] => 1st street.
   [3] => 1st avenue
   [4] => Averton avenue
   [5] => Averton avenue
   [6] => Averton avenue.
)
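For the optional period that was left unimplemented, one possible approach is sketched below. It assumes the difficulty was the position of the period rather than the escaping: a period is matched with `\.`, but a `\b` placed after it fails at the end of a string, because neither side of that position is a word character. Putting `\.?` after the closing `\b` avoids this. The use of preg_quote() and array_values() is an addition for safety and clarity, not part of the script above.

```php
<?php
$replacements = array('st' => 'street', 'av' => 'avenue', 'ave' => 'avenue');

$patterns = array();
foreach (array_keys($replacements) as $abbreviation) {
    // preg_quote() escapes any regex metacharacters in the key;
    // \.? after the closing \b optionally consumes a trailing period.
    $patterns[] = '#\b' . preg_quote($abbreviation, '#') . '\b\.?#i';
}

$test_values = array('1st St.', 'Averton Ave.', '1st st');
$results = preg_replace($patterns, array_values($replacements), $test_values);
print_r($results); // "1st street", "Averton avenue", "1st street"
```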

Edited by Psycho

Psycho,

 

Thanks for this great tutorial. But my question is, how would I incorporate this to work with a text area, so that when I hit submit, all these keywords are generated? Thanks


Seriously? That wasn't a "tutorial", it was a working example. I even included comments at each step to state what that code was doing. So, let's go through it.

 

First section was to define an array for all of the replacements. Is there something you don't understand about this or how you would need to create the array for your needs?

 

Next section was an array of test values - since that is only used for testing you don't need it.

 

Then there is a section to convert the master array of replacements into a usable array for the regular expression. You would want to do the same thing with the master array you create.

 

Lastly you run the regular expression and get the results. In the script above I used the test array. Instead you will want to use whatever value you want to run the replacements on. I.e. the text-area input.
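Put together, those steps could be wrapped in a single function, roughly as sketched here. The function name expandAbbreviations() is hypothetical, and the literal input stands in for the text-area value ($_POST['variation'] in the original form).

```php
<?php
// expandAbbreviations() is a hypothetical helper name, not from the thread.
function expandAbbreviations($text, array $replacements)
{
    $patterns = array();
    foreach (array_keys($replacements) as $abbreviation) {
        // Word boundaries on both sides, case-insensitive match.
        $patterns[] = '#\b' . preg_quote($abbreviation, '#') . '\b#i';
    }
    return preg_replace($patterns, array_values($replacements), $text);
}

$replacements = array('st' => 'street', 'av' => 'avenue', 'ave' => 'avenue');

// In the form handler this would be $_POST['variation']; a literal stands in here.
$input  = "1st st\n2nd ave";
$output = expandAbbreviations($input, $replacements);
echo $output; // prints "1st street" and "2nd avenue" on separate lines
```

Because preg_replace() accepts a plain string as its subject, multi-line text-area input can be passed in as-is.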


Sorry. What I meant to say is working example. What you did was fantastic. I want to thank you for pointing me in the right direction. As you can see I am fairly new to PHP, and I'm still learning.

 

I have implemented your code (below) and it works great!

 

I just wanted to know how you used the foreach statement below..

foreach($patterns as &$pattern)
{
$pattern = '#\b' . $pattern . '\b#i';
}

 

and what '#\b' and '\b#i' are supposed to mean.

 

Below is how I implemented the code.

 

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>Plural Keyword Tool</title>
</head>
<body>
<?php
$delivery_range = ''; //Default so the textarea is empty on first load
if(isset($_POST['submit'])) {
//Master array of replacements
$replacements = array
(
'st' => 'street',
'av' => 'avenue',
'ave' => 'avenue',
'E' => 'east',
'W' => 'west'
);
$delivery_range = $_POST['variation'];
//Create separate array for search patterns
$patterns = array_keys($replacements);
//Modify patterns to:
// 1 - include word boundaries
// 2 - make case insensitive
// 3 - add optional period NOT IMPLEMENTED
foreach($patterns as &$pattern)
{
$pattern = '#\b' . $pattern . '\b#i';
}
unset($pattern); //Break the reference left over from the loop
//Perform replacement on the submitted text
$results = preg_replace($patterns, $replacements, $delivery_range);
//Output the results, escaped so user input is not rendered as HTML
echo "Plural Keywords:<pre>" . htmlspecialchars($results) . "</pre>";
}
?>

<form action="variation_keyword.php" method="post">
<textarea cols="10" rows="10" name="variation"><?php echo htmlspecialchars($delivery_range) ?> </textarea>
<input type="submit" name="submit" value="submit" /><br />

</form>
</body>
</html>

 

Thanks for everything.

Edited by eldan88

Now that is a good follow up question! Sometimes I put in more advanced code than I think the person asking the question is familiar with. The intent is for that person to use it as a learning opportunity.

 

In the foreach loop you may have noticed that I used an ampersand before the variable $pattern. Without the ampersand, PHP will assign the value of each array element to the variable in the foreach statement. You can then use that variable within the loop as needed. But if you modify that variable in the loop, it doesn't affect the value in the array. However, if you put the ampersand there, it passes the value "by reference". This means that it is not simply assigning the value to the variable - the variable is a reference to the actual value in the array. So, if you modify the value in the loop, you are modifying the value in the array. This can be confusing to understand at first.
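A side-by-side comparison may help here. This is a minimal sketch; the arrays and the strtoupper() call are just illustrative choices, and the unset() after the second loop is a common precaution rather than something the script above did.

```php
<?php
$byValue = array('st', 'ave');
foreach ($byValue as $item) {
    $item = strtoupper($item); // changes only the loop's copy
}
print_r($byValue);     // still contains 'st' and 'ave'

$byReference = array('st', 'ave');
foreach ($byReference as &$item) {
    $item = strtoupper($item); // changes the array element itself
}
unset($item); // break the reference so later writes to $item don't touch the array
print_r($byReference); // now contains 'ST' and 'AVE'
```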

 

As Christian stated, \b is a word boundary in regular expressions (another thing that can be confusing at first). So, if you have a regular expression such as \bave\b, the expression is looking for the string 'ave' that starts at a word boundary and ends at a word boundary. What does that mean? In basic terms it means that it can't be within another word. So, a word boundary is where a word touches the absolute beginning or end of the string, a space, a tab, a period or other punctuation mark. Some examples might help. Using the regular expression \bave\b:

"125 fifth ave." would be a match

"125 fifth ave suite #5" would be a match

"125 fifth avenue" would NOT be a match

"talk to dave" would NOT be a match
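The four examples above can be checked directly with preg_match(). This is just a sketch; the $matches array and the output format are made up for illustration.

```php
<?php
$tests = array(
    '125 fifth ave.',
    '125 fifth ave suite #5',
    '125 fifth avenue',
    'talk to dave',
);

$matches = array();
foreach ($tests as $text) {
    // preg_match() returns 1 when the pattern is found, 0 when it is not.
    $matches[$text] = preg_match('#\bave\b#', $text) === 1;
    echo $text . ' => ' . ($matches[$text] ? 'match' : 'no match') . "\n";
}
```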


Thanks for that useful information. I am trying to digest what you just explained.. so I have 3 questions

 

1) Why did you assign $patterns using the array_keys function? Why not just use the array function?

2) What is the difference between array and array_keys?

3) Is '#\b' and '\b#i' the format I should use for all boundaries?

 

Thank you


Let's see if we can't clear up some more confusions. ;)

 

  1. "Array" is not a function, it's a data type. A function is a group of source code lines which have been grouped logically, to accomplish a single logical task. Such as translating from lower to upper case characters, for example. Everything followed by a () in programming is (usually) a function.
  2. That's why he used array_keys (), which does what it says and fetches the keys from the key-value mappings in the array data type. Kind of the same as substr () fetches a part of a string value, if looking broadly at it.
  3. Only \b is the actual word boundary escape sequence. The hash signs (#) are the Regular Expression delimiters, which tell the RegExp engine where the actual pattern starts and stops, while the "i" is a RegExp modifier, telling the engine that the text should be matched in a case-insensitive manner.
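The three points above can be seen in a few lines. This is only a sketch; the 'Main ST' input and the variable names are assumptions chosen for illustration.

```php
<?php
// array(...) builds the data; array_keys()/array_values() are functions that read it.
$replacements = array('st' => 'street', 'ave' => 'avenue');

$keys   = array_keys($replacements);   // array('st', 'ave')
$values = array_values($replacements); // array('street', 'avenue')

// '#' marks the start and end of the pattern, \b is the word boundary,
// and the trailing 'i' makes the match case-insensitive.
$pattern = '#\b' . $keys[0] . '\b#i';  // '#\bst\b#i'
$withHash = preg_replace($pattern, 'street', 'Main ST');

// Another delimiter, such as '/', works the same way.
$withSlash = preg_replace('/\bst\b/i', 'street', 'Main ST');

echo $withHash . "\n";  // prints "Main street"
echo $withSlash . "\n"; // prints "Main street"
```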

 

Hopefully that helps explain some things. I also would recommend sitting down with the manual, a good cup of tea, and reading everything up to the "Features" chapter. There's a LOT of really good and useful information there, and most of it very easy to read with good examples provided.

Well... You could skip the "Installation" chapter, unless you plan on installing PHP manually (which I don't recommend for anything other than the exercise). :)

 

Good luck!

Edited by Christian F.

Thanks, that helped a lot. I appreciate it. I guess I will need to go to Barnes & Noble with a nice cup of Earl Grey tea to go over the manual.
