one way would be to split by the space characters, and then run an array_intersect against the results:

 

$first_sentence = 'Does the dog jump over the lazy sheep or the spotted cow?';
$second_sentence = 'The dog jumps over the lazy sheep.';

$first_components = explode(' ', $first_sentence);
$second_components = explode(' ', $second_sentence);

$duplicate_words = array_intersect($first_components, $second_components);
print_r($duplicate_words);

 

note that this is case-sensitive and doesn't strip punctuation, so "sheep." won't match "sheep". to get around that, run strtolower on the original strings, and use str_replace to remove any characters you don't want interfering, such as punctuation.
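A minimal sketch of that clean-up step, assuming only a handful of punctuation marks need removing. The helper name normalized_words and the punctuation list are made up for illustration, not something from the thread:

```php
<?php
// Hypothetical helper: lowercase the sentence and drop a fixed list of
// punctuation marks before splitting on spaces.
function normalized_words($sentence) {
    $punctuation = array('.', ',', '?', '!', ';', ':');
    $clean = str_replace($punctuation, '', strtolower($sentence));
    return explode(' ', $clean);
}

$first_sentence = 'Does the dog jump over the lazy sheep or the spotted cow?';
$second_sentence = 'The dog jumps over the lazy sheep.';

// Keys in the result are the word positions in the first sentence.
$duplicate_words = array_intersect(
    normalized_words($first_sentence),
    normalized_words($second_sentence)
);
print_r($duplicate_words);
```

With this input, "the" now matches regardless of case and "sheep" matches despite the trailing period. The punctuation list would need extending for quotes, brackets and so on.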

 

have a look in the manual at the other intersect functions if you want to play around with key preservation.
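To give a rough idea of what key preservation means here, this is a small sketch with made-up word lists. array_intersect, array_values and array_intersect_assoc are standard PHP functions; the sample arrays are only for illustration:

```php
<?php
$first = array('the', 'dog', 'jumps', 'over');
$second = array('the', 'cat', 'over', 'dog');

// array_intersect compares values only and keeps the keys of the first array.
$by_value = array_intersect($first, $second);
print_r($by_value); // keys 0, 1 and 3 survive: the, dog, over

// array_values re-indexes from 0 when the original positions don't matter.
$reindexed = array_values($by_value);
print_r($reindexed);

// array_intersect_assoc also requires the key to match, so a word only
// counts when it sits at the same position in both arrays.
$same_position = array_intersect_assoc($first, $second);
print_r($same_position); // only key 0, "the"
```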

$first_components = explode(' ', $first_sentence);

$second_components = explode(' ', $second_sentence);

 

Can also be written as (and, depending on the PHP implementation, may even be better):

 

$first_components = str_word_count($first_sentence, 1);
$second_components = str_word_count($second_sentence, 1);
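A quick comparison of the two approaches on one of the sentences, as a sketch. One practical difference is that str_word_count with format 1 returns only the words, so trailing punctuation is dropped, while explode keeps it attached:

```php
<?php
$sentence = 'The dog jumps over the lazy sheep.';

$by_space = explode(' ', $sentence);      // splits on spaces only
$by_word = str_word_count($sentence, 1);  // format 1: array of the words found

echo end($by_space), "\n"; // sheep.
echo end($by_word), "\n";  // sheep
```

Keep in mind that str_word_count is locale dependent and does not treat digits as part of a word by default, so it may not be a drop-in replacement for every input.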

That's a pretty complex task. "The" also appears in both of them, and so does "the dog".

 

Anyway, try to experiment with something like this:

 

<?php
$string1 = 'Does the dog jump over the lazy sheep or the spotted cow?';
$string2 = 'The dog jumps over the lazy sheep.';

$similarities = array_intersect(
   explode(' ', preg_replace('#[^a-z ]#', '', strtolower($string1))),
   explode(' ', preg_replace('#[^a-z ]#', '', strtolower($string2)))
);

print_r($similarities);

 

Edit: Someone beat me to it.

 

Edit 2: Why would this not be a "similarity"?

"Does the dog jump over the lazy sheep or the spotted cow?"

"The dog jumps over the lazy sheep."

 

Well, using array_intersect and explode will give you the words both strings have in common. But maybe the OP wants the substrings both strings have in common, and wants to do something else when a phrase isn't shared:

 

"Does the dog jump over the lazy sheep or the spotted cow?"

"The lazy dog jumps over the sheep."

 

"over the lazy sheep" is only present in the first sentence and not in the second.
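For checking a single phrase like that, a case-insensitive substring test is enough. A small sketch using stripos on the two sentences above (the variable names are just for illustration):

```php
<?php
$phrase = 'over the lazy sheep';
$first = 'Does the dog jump over the lazy sheep or the spotted cow?';
$second = 'The lazy dog jumps over the sheep.';

// stripos returns the match offset or false, so compare with !== false
// (an offset of 0 would otherwise be mistaken for "not found").
$in_first = stripos($first, $phrase) !== false;
$in_second = stripos($second, $phrase) !== false;

var_dump($in_first);  // bool(true)
var_dump($in_second); // bool(false)
```

Note that this is a plain substring match, so a phrase like "the" would also be found inside a word such as "other". Matching on whole words needs the word lists instead.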

Now I'm working on breaking up a string into all possible parts. For example "the dog jumps over the lazy sheep" would break into:

 

 

  • the
  • the dog
  • the dog jumps
  • the dog jumps over
  • the dog jumps over the
  • the dog jumps over the lazy
  • the dog jumps over the lazy sheep
  • dog
  • dog jumps
  • dog jumps over
  • dog jumps over the
  • dog jumps over the lazy
  • dog jumps over the lazy sheep
  • jumps
  • jumps over
  • jumps over the
  • jumps over the lazy
  • jumps over the lazy sheep
  • over
  • over the
  • over the lazy
  • over the lazy sheep
  • the
  • the lazy
  • the lazy sheep
  • lazy
  • lazy sheep
  • sheep

 

 

Long list :P And yes, I want similar substrings, but I think I can achieve that by modifying your methods. Then, when I get the similarities, I can find the one I want by simply checking which one is the longest. Does anyone have a way to generate all possible runs of consecutive words?

$words = str_word_count('the dog jumps over the lazy sheep', 1);
$sizeof = sizeof($words);
for ($k = 0; $k < $sizeof; ++$k) { // $k: index of the first word
    for ($i = $k; $i < $sizeof; ++$i) { // $i: index of the last word
        for ($j = $k; $j <= $i; ++$j) {
            // put a space between words, but not before the first one
            echo ($j > $k ? ' ' : ''), $words[$j];
        }
        echo '<br>';
    }
}

 

Outputs:

 

the
the dog
the dog jumps
the dog jumps over
the dog jumps over the
the dog jumps over the lazy
the dog jumps over the lazy sheep
dog
dog jumps
dog jumps over
dog jumps over the
dog jumps over the lazy
dog jumps over the lazy sheep
jumps
jumps over
jumps over the
jumps over the lazy
jumps over the lazy sheep
over
over the
over the lazy
over the lazy sheep
the
the lazy
the lazy sheep
lazy
lazy sheep
sheep

Like this?

 

<?php
$string1 = 'Does the dog jump over the lazy sheep or the spotted cow?';
$string2 = 'The dog jumps over the lazy sheep.';

$similarities = array_intersect(
   explode(' ', preg_replace('#[^a-z ]#', '', strtolower($string1))),
   explode(' ', preg_replace('#[^a-z ]#', '', strtolower($string2)))
);

$similarities = array_values($similarities);

$count = count($similarities);
$matches = array();

for ($i = 0; $i < $count; ++$i) {
   for ($x = 0, $xMax = $count - $i; $x < $xMax; ++$x) {
      $m = array();
      for ($j = $i, $jMax = $i + $x; $j <= $jMax; ++$j) {
         $m[] = $similarities[$j];
      }

      $m = join(' ', $m);
      if (!in_array($m, $matches)) {
         $matches[] = $m;
      }
   }
}

print_r($matches);

Alright, here's what I've come up with:

 

<?php
$words1 = str_word_count('does the dog jump over the lazy sheep or the spotted cow', 1);
$sizeof = sizeof($words1);
$combinations1 = array();
for ($k = 0; $k < $sizeof; ++$k) { // $k: index of the first word
    for ($i = $k; $i < $sizeof; ++$i) { // $i: index of the last word
        $combination = array();
        for ($j = $k; $j <= $i; ++$j) {
            $combination[] = $words1[$j];
        }
        $combinations1[] = implode(' ', $combination);
    }
}

$words2 = str_word_count('the dog jumps over the lazy sheep', 1);
$sizeof = sizeof($words2);
$combinations2 = array();
for ($k = 0; $k < $sizeof; ++$k) { // $k: index of the first word
    for ($i = $k; $i < $sizeof; ++$i) { // $i: index of the last word
        $combination = array();
        for ($j = $k; $j <= $i; ++$j) {
            $combination[] = $words2[$j];
        }
        $combinations2[] = implode(' ', $combination);
    }
}


$similarities = array_intersect(
   $combinations1,
   $combinations2
);

print_r($similarities);
?>

I still need to add some functions and tidy it up a bit and get the longest sentence, but it gets the job done.
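For the "get the longest" step, something along these lines might work. This is only a sketch: the $similarities array below is a hand-written stand-in for the array_intersect result, and "longest" is assumed to mean the most characters, with the first one winning on a tie:

```php
<?php
// Stand-in for the array_intersect result (sample values for illustration).
$similarities = array(
    'the',
    'the dog',
    'dog',
    'over the',
    'over the lazy sheep',
    'the lazy sheep',
    'sheep',
);

// Walk the list once and keep the longest phrase seen so far.
$longest = '';
foreach ($similarities as $phrase) {
    if (strlen($phrase) > strlen($longest)) {
        $longest = $phrase;
    }
}

echo $longest; // over the lazy sheep
```

If the longest match should be judged by word count instead, comparing str_word_count($phrase) in place of strlen($phrase) would be one way to do it.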
